If your Mac is running low on disk space, duplicate files are often a silent culprit. Knowing how to find duplicate files on Mac using Terminal gives you precise control over what gets compared, where it looks, and how results are presented — without installing anything that costs money or obscures its logic. This guide covers four practical Terminal approaches: the built-in md5 command, the Homebrew tool fdupes, shell-native find pipelines, and a quick Python one-liner. Every command shown here runs on macOS Sequoia and Tahoe on both Apple Silicon and Intel Macs.
Why Duplicate Files Accumulate on macOS
Duplicates don't appear from nowhere. They build up through predictable patterns:
- Repeated downloads of the same PDF or installer from Safari or Chrome to ~/Downloads
- Photo libraries synced from iCloud that overlap with a local Photos export
- Xcode simulator runtimes and derived data stored in ~/Library/Developer/Xcode/DerivedData
- Node projects whose node_modules folders pull identical packages across dozens of repos
- Cargo build artifacts duplicated across workspaces under ~/.cargo/registry/src
- Maven dependencies cached redundantly in ~/.m2/repository
Understanding where duplicates hide helps you run Terminal commands against the right directories rather than scanning your entire drive unnecessarily.
Command 1: md5 — Compare Files by Hash Without Any Install
Every macOS install ships with md5, a command that produces a 128-bit fingerprint for any file. Two files with identical MD5 hashes almost certainly have identical content, regardless of their names.
Hash a single file
Open Terminal and run:
md5 ~/Downloads/installer.dmg
You will see output like MD5 (/Users/you/Downloads/installer.dmg) = d41d8cd98f00b204e9800998ecf8427e. Copy that hash and compare it against another file:
md5 ~/Downloads/installer_copy.dmg
Matching hashes mean identical content.
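The same comparison can be scripted. The following is a minimal sketch, not a replacement for md5: the helper names md5_of and same_content are invented for illustration, and the two file names simply mirror the hypothetical ones used in the commands above.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first, second):
    """True when both files hash to the same MD5 value."""
    return md5_of(first) == md5_of(second)


if __name__ == "__main__":
    # Hypothetical file names, matching the md5 commands above.
    downloads = Path.home() / "Downloads"
    a, b = downloads / "installer.dmg", downloads / "installer_copy.dmg"
    if a.is_file() and b.is_file():
        print("identical" if same_content(a, b) else "different")
```

Reading in chunks keeps memory use flat, which matters for disk images that run to several gigabytes.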
Hash an entire folder and sort to expose duplicates
The real power comes from piping results through sort. Run this in Terminal:
find ~/Downloads -type f -exec md5 {} \; 2>/dev/null | sort > /tmp/hashes.txt && awk -F' = ' 'seen[$NF]++' /tmp/hashes.txt
This prints every file whose hash appeared previously in the output. It won't delete anything — it only reports. Review the list, then remove entries you recognise as safe to delete.
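If you would rather see the duplicates grouped than listed one by one, the saved hash file can be post-processed. This is a sketch under two assumptions: the file holds the default md5 output format shown earlier (MD5 (path) = hash), and the function name group_md5_lines is made up for this example.

```python
import os
import re
from collections import defaultdict

# BSD md5 prints lines shaped like: MD5 (/path/to/file) = <32 hex chars>
MD5_LINE = re.compile(r"^MD5 \((?P<path>.*)\) = (?P<digest>[0-9a-f]{32})$")


def group_md5_lines(lines):
    """Map each digest to the paths that produced it, keeping only repeats."""
    groups = defaultdict(list)
    for line in lines:
        match = MD5_LINE.match(line.rstrip("\n"))
        if match:  # ignore anything that is not an md5 result line
            groups[match["digest"]].append(match["path"])
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    hash_file = "/tmp/hashes.txt"  # the file written by the pipeline above
    if os.path.isfile(hash_file):
        with open(hash_file, encoding="utf-8", errors="replace") as handle:
            for digest, paths in group_md5_lines(handle).items():
                print(digest)
                for path in paths:
                    print("  " + path)
```

The pattern is anchored on the trailing 32-character digest, so paths that themselves contain parentheses or an equals sign are still parsed correctly.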
Command 2: fdupes — Purpose-Built Duplicate Scanner
fdupes is a small, focused C program you install via Homebrew. It scans recursively, comparing file sizes first, then MD5 signatures, and finally running a byte-for-byte check on the remaining candidates, which makes it both fast and reliable.
Install and run fdupes
- Install Homebrew if you haven't: follow the instructions at brew.sh.
- Install fdupes: brew install fdupes
- Scan a folder: fdupes -r ~/Documents
- Show sizes alongside results: fdupes -rS ~/Documents
- Summarise only (no file listing): fdupes -rm ~/Documents
To scan your Downloads folder and see which groups are duplicates:
fdupes -rS ~/Downloads
fdupes groups duplicates together, separated by blank lines. Files in the same group are byte-identical. The -S flag prepends the size so you can prioritise the largest groups for deletion.
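Because the groups are separated by blank lines, the output is easy to post-process. The sketch below assumes plain fdupes -r output (one path per line, no -S size headers) saved to a file of your choosing; the function names are hypothetical.

```python
import os


def parse_fdupes_groups(text):
    """Split fdupes output into groups; blank lines separate groups."""
    groups, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:  # output may not end with a blank line
        groups.append(current)
    return groups


def reclaimable_bytes(group):
    """Bytes freed by keeping one file of the group and removing the rest."""
    existing = [path for path in group if os.path.isfile(path)]
    if len(existing) < 2:
        return 0
    # Files in a group are byte-identical, so they all share one size.
    return os.path.getsize(existing[0]) * (len(existing) - 1)


def largest_first(groups):
    """Order groups so the biggest potential savings come first."""
    return sorted(groups, key=reclaimable_bytes, reverse=True)
```

One way to use it: redirect fdupes -r ~/Downloads into a text file, read that file in Python, and pass its contents through parse_fdupes_groups and largest_first to decide which groups to review first.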
Delete interactively with fdupes -d
Running fdupes -rd ~/Downloads prompts you for each duplicate group and asks which copy to keep. This is slower but surgical: you decide file by file. Avoid adding the -N flag (which, combined with -d, keeps the first file in each group and deletes the rest without prompting) on a folder you haven't reviewed, because fdupes has no way to tell an original from a copy.
Command 3: find Pipelines — No Extra Tools Required
If you would rather not install Homebrew, a combination of find, sort, md5, and a text filter such as uniq or awk covers the same ground using only what macOS ships with.
Size-first filter (fast pre-screen)
Most unique files differ in size. Grouping by size before hashing saves significant time on large folders:
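find ~/Library/Caches -type f -print0 | xargs -0 stat -f "%z %N" | sort -n | awk 'NR > 1 && $1 == size { if (!shown) print first; print; shown = 1; next } { size = $1; first = $0; shown = 0 }'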
This lists files that share the same size — a strong (though not certain) signal of duplication. From that shortlist you can run md5 on individual candidates to confirm.
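The size pre-screen can also be written in a few lines of Python, which sidesteps quoting problems with unusual file names. This is an illustrative sketch; group_by_size is a name invented here, and symlinks are skipped by assumption so that a link and its target are not reported as a pair.

```python
import os
from collections import defaultdict


def group_by_size(root):
    """Return {size: [paths]} for every size shared by two or more files."""
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue  # a symlink is not a second copy of the data
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # file vanished or is unreadable
    return {size: sorted(paths) for size, paths in by_size.items() if len(paths) > 1}


if __name__ == "__main__":
    candidates = group_by_size(os.path.expanduser("~/Library/Caches"))
    for size, paths in sorted(candidates.items(), reverse=True):
        print(f"{size} bytes: {len(paths)} files")
```

Only the files in the returned groups need hashing; everything else has a unique size and cannot have a byte-identical twin.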
Hash-based pipeline for a specific subtree
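find ~/Library/Developer/Xcode/DerivedData -type f | while IFS= read -r f; do
  echo "$(md5 -q "$f") $f"
done | sort | awk 'NR > 1 && $1 == hash { if (!shown) print first; print; shown = 1; next } { hash = $1; first = $0; shown = 0 }'
The -q flag tells md5 to print only the hash, so each line starts with the 32-character digest followed by the path. After sort brings equal hashes together, the awk filter compares the first field of adjacent lines and prints every line whose hash repeats (the BSD uniq that ships with macOS has no -w option for comparing only a prefix). This pipeline surfaces every file in DerivedData that has an identical twin somewhere in the same tree.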
Command 4: Python One-Liner for Readable Output
Python 3 is available on macOS once the Xcode Command Line Tools are installed (running python3 for the first time prompts you to install them). This script groups duplicates into a readable report without any third-party dependency:
python3 - <<'EOF'
import os, hashlib, collections

# Map each MD5 digest to every path that produced it.
hashes = collections.defaultdict(list)
for root, dirs, files in os.walk(os.path.expanduser('~/Downloads')):
    for name in files:
        path = os.path.join(root, name)
        try:
            with open(path, 'rb') as fh:
                h = hashlib.md5(fh.read()).hexdigest()
            hashes[h].append(path)
        except OSError:
            pass  # skip files that cannot be read

# Print each group of identical files, separated by a blank line.
for h, paths in hashes.items():
    if len(paths) > 1:
        print('\n'.join(paths))
        print()
EOF
Swap ~/Downloads for any directory. The script reads each file and computes an MD5 hash, then prints groups of identical files. For very large files (multi-GB videos or disk images), reading the entire file is slow — in those cases the find pipelines above are faster because they can filter by size first.
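If you want to stay in Python for large files, the two ideas can be combined: group by size first, then hash only the files whose size collides, reading them in chunks. The following is a sketch of that approach rather than a tuned tool; the function names and the 1 MiB chunk size are arbitrary choices made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths with identical content, hashing only size collisions."""
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_md5(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return groups


if __name__ == "__main__":
    for group in find_duplicates(os.path.expanduser("~/Downloads")):
        print("\n".join(group))
        print()
```

A multi-gigabyte video with a unique size is never opened at all, which is where most of the time savings come from.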
Where Duplicates Hide: Common macOS Locations
Not all folders are equally risky to scan or clean. The table below maps common duplicate hotspots to their typical sizes and whether it is safe to delete from them without app-level knowledge.
| Location | Typical Duplicate Type | Typical Size Impact | Safe to Delete Manually? |
|---|---|---|---|
| ~/Downloads | Re-downloaded installers, PDFs | 1–20 GB | Yes, after review |
| ~/Library/Developer/Xcode/DerivedData | Build artefacts | 5–50 GB | Yes: Xcode rebuilds on demand |
| ~/.m2/repository | Duplicate Maven JARs across versions | 500 MB–5 GB | Caution: use mvn dependency:purge-local-repository |
| ~/.cargo/registry/src | Rust crate source copies | 200 MB–3 GB | Yes: Cargo re-fetches as needed |
| ~/Library/Caches | App cache blobs | 1–10 GB | Yes: apps regenerate caches |
| ~/Pictures/Photos Library.photoslibrary | Duplicate originals and edited copies | Varies widely | No: use Photos app duplicate detection |
For a broader look at what else is consuming your disk, this breakdown of what takes up space on a Mac maps the largest categories you will encounter.
How to Safely Delete Duplicates After Finding Them
Finding duplicates is the easy part. Deleting safely requires a short checklist:
- Identify which copy is the canonical one. Sort by modification date with ls -lt to find the newest version.
- Check for symlinks. Run ls -la on the directory. Deleting the target of a symlink breaks apps that point to it.
- Move to Trash before permanent deletion. Use mv ~/path/to/duplicate ~/.Trash/ so you have a recovery window.
- Confirm the remaining copy opens correctly before emptying Trash.
- Empty Trash. Only after verification, right-click the Trash icon and choose Empty Trash, or run rm -rf ~/.Trash/*.
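The move-then-verify part of the checklist can be sketched in Python. This is an assumption-laden illustration, not a vetted cleanup tool: stage_for_deletion is a hypothetical helper, and the holding folder is any directory you pick (it could be ~/.Trash, or a scratch folder on the same volume).

```python
import os
import shutil


def stage_for_deletion(duplicate, keep, holding_dir):
    """Move `duplicate` into `holding_dir` after basic safety checks.

    Returns the new location so the move can be undone by hand.
    """
    if os.path.islink(duplicate) or os.path.islink(keep):
        raise ValueError("refusing to touch symlinks; inspect them by hand")
    if os.path.samefile(duplicate, keep):
        raise ValueError("both paths point at the same file (hard link or same path)")
    os.makedirs(holding_dir, exist_ok=True)
    stem, ext = os.path.splitext(os.path.basename(duplicate))
    target = os.path.join(holding_dir, stem + ext)
    counter = 1
    while os.path.exists(target):  # never overwrite an earlier staged file
        target = os.path.join(holding_dir, f"{stem}-{counter}{ext}")
        counter += 1
    shutil.move(duplicate, target)
    return target
```

Nothing is removed permanently here: the duplicate only changes location, so you can open the kept copy, confirm it works, and empty the holding folder afterwards.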
If the sheer number of duplicates makes manual triage impractical, a tool like Crumb can audit all of these locations at once and show what's safe before you delete — useful when you need both Terminal-level visibility and a visual summary side by side.
Limitations of Terminal Duplicate Detection
Terminal commands find byte-identical files well. They do not catch:
- Near-duplicate photos: a JPEG and its edited HEIF copy differ at the byte level even if they look identical on screen.
- Renamed copies: these are detected, since fdupes and md5 compare content rather than names, but the results can be surprising when the file names differ.
- Duplicates inside app bundles: scanning inside .app packages can flag internal resource sharing as duplication; be careful running fdupes against /Applications.
- Files behind sandboxed containers: many app containers under ~/Library/Containers are permission-restricted; commands that discard errors with 2>/dev/null silently skip the unreadable files.
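Two of these limitations can be handled explicitly when you script the scan yourself. The sketch below is one possible approach, with a made-up function name: it avoids descending into .app bundles and reports whether each file is readable instead of skipping it silently.

```python
import os


def scannable_files(root):
    """Yield (path, readable) for regular files, without entering .app bundles."""
    for folder, dirs, names in os.walk(root):
        # Editing dirs in place stops os.walk from descending into bundles.
        dirs[:] = [d for d in dirs if not d.endswith(".app")]
        for name in names:
            path = os.path.join(folder, name)
            if os.path.islink(path):
                continue
            # os.access reports whether the current user may read the file.
            yield path, os.access(path, os.R_OK)


if __name__ == "__main__":
    unreadable = [p for p, ok in scannable_files(os.path.expanduser("~/Library/Containers")) if not ok]
    print(f"{len(unreadable)} files could not be read and were left out of the comparison")
```

Note that os.walk itself ignores directories it cannot list unless an onerror callback is supplied, so the count above covers unreadable files only, not whole folders that were off limits.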
For a wider approach that includes photos and sandboxed apps, see the guide to finding duplicate files on Mac for free, which covers GUI tools alongside Terminal methods.