How to Find Duplicate Files on Mac Using Terminal: 4 Commands (md5, fdupes, find)

If your Mac is running low on disk space, duplicate files are often a silent culprit. Knowing how to find duplicate files on Mac using Terminal gives you precise control over what gets compared, where it looks, and how results are presented, without installing anything that costs money or obscures its logic. This guide covers four practical Terminal approaches: the built-in md5 command, the Homebrew tool fdupes, shell-native find pipelines, and a short Python script. Every command shown here runs on macOS Sequoia and Tahoe on both Apple Silicon and Intel Macs.

Why Duplicate Files Accumulate on macOS

Duplicates don't appear from nowhere. They build up through predictable patterns:

  • Repeated downloads of the same PDF or installer from Safari or Chrome to ~/Downloads
  • Photo libraries synced from iCloud that overlap with a local Photos export
  • Xcode build products and indexes stored in ~/Library/Developer/Xcode/DerivedData
  • Node projects whose node_modules folders pull identical packages across dozens of repos
  • Rust crate sources unpacked under ~/.cargo/registry/src, often in several versions
  • Maven dependencies cached redundantly in ~/.m2/repository

Understanding where duplicates hide helps you run Terminal commands against the right directories rather than scanning your entire drive unnecessarily.

Command 1: md5 — Compare Files by Hash Without Any Install

Every macOS install ships with md5, a command that produces a 128-bit fingerprint for any file. Two files with identical MD5 hashes almost certainly have identical content, regardless of their names.

Hash a single file

Open Terminal and run:

md5 ~/Downloads/installer.dmg

You will see output like MD5 (/Users/you/Downloads/installer.dmg) = d41d8cd98f00b204e9800998ecf8427e. Copy that hash and compare it against another file:

md5 ~/Downloads/installer_copy.dmg

Matching hashes mean identical content.
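
The same comparison can be scripted. The sketch below is one way to do it in Python, assuming both paths exist and are readable; the helper names md5_of and same_content are illustrative and not part of any macOS tool.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files produce the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)
```

The string returned by md5_of is the same lowercase hex digest that md5 -q prints for that file, so results from the script and from Terminal can be compared directly.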

Hash an entire folder and sort to expose duplicates

The real power comes from piping results through sort. Run this in Terminal:

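find ~/Downloads -type f -exec md5 -r {} + 2>/dev/null | sort > /tmp/hashes.txt && awk 'seen[$1]++' /tmp/hashes.txt

The -r flag makes md5 print the hash first and the path second, so sort groups identical hashes together and awk can key on the first field. The awk program must be in single quotes; inside double quotes the shell would expand $1 before awk ever sees it. The result is every file whose hash appeared previously in the output, which means the first copy of each set is not listed and is the one you would keep. It won't delete anything; it only reports. Review the list, then remove entries you recognise as safe to delete.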

Command 2: fdupes — Purpose-Built Duplicate Scanner

fdupes is a small, focused C program you install via Homebrew. It does recursive comparison using file size first, then a byte-for-byte check, which makes it both fast and reliable.
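
To make the size-then-bytes idea concrete, here is a minimal Python sketch of that strategy. It is not fdupes's actual implementation; find_identical and same_bytes are hypothetical helpers that group files by size and then confirm matches with filecmp.cmp(shallow=False) from the standard library.

```python
import collections
import filecmp
import os


def same_bytes(path_a, path_b):
    """Byte-by-byte comparison; unreadable files count as 'not equal'."""
    try:
        # shallow=False forces filecmp to compare contents, not just stat data.
        return filecmp.cmp(path_a, path_b, shallow=False)
    except OSError:
        return False


def find_identical(top):
    """Return lists of byte-identical regular files found under top."""
    by_size = collections.defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so a link and its target are not reported as a pair.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # file vanished or cannot be inspected

    groups = []
    for candidates in by_size.values():
        if len(candidates) < 2:
            continue  # a unique size cannot have a duplicate
        clusters = []  # each cluster holds files with identical bytes
        for path in sorted(candidates):
            for cluster in clusters:
                if same_bytes(cluster[0], path):
                    cluster.append(path)
                    break
            else:
                clusters.append([path])
        groups.extend(c for c in clusters if len(c) > 1)
    return groups
```

The size pass is cheap because it only reads metadata; file contents are opened only for files that already share a size with at least one other file.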

Install and run fdupes

  1. Install Homebrew if you haven't: follow the instructions at brew.sh.
  2. Install fdupes: brew install fdupes
  3. Scan a folder: fdupes -r ~/Documents
  4. Show sizes alongside results: fdupes -rS ~/Documents
  5. Summarise only (no file listing): fdupes -rm ~/Documents

To scan your Downloads folder and see which groups are duplicates:

fdupes -rS ~/Downloads

fdupes groups duplicates together, separated by blank lines. Files in the same group are byte-identical. The -S flag prepends the size so you can prioritise the largest groups for deletion.
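
Because the output format is so regular, it is easy to post-process. The following Python sketch assumes the plain fdupes -r output described above (one path per line, a blank line between groups, no -S size headers) is available as text; parse_groups, reclaimable_bytes, and rank_groups are illustrative names.

```python
import os


def parse_groups(text):
    """Split fdupes-style output into lists of paths, one list per group."""
    groups, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the group collected so far.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def reclaimable_bytes(group):
    """Bytes freed by keeping one copy and removing the rest of a group."""
    try:
        return os.path.getsize(group[0]) * (len(group) - 1)
    except OSError:
        return 0  # file vanished or unreadable; nothing to count


def rank_groups(text):
    """Return groups sorted so the biggest potential saving comes first."""
    return sorted(parse_groups(text), key=reclaimable_bytes, reverse=True)
```

One way to use it is to redirect the scan to a file of your choosing, read that file in Python, and pass its contents to rank_groups to see which groups are worth reviewing first.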

Delete interactively with fdupes -d

Running fdupes -rd ~/Downloads prompts you for each duplicate group and asks which copy to keep. This is slower but surgical: you decide file by file. Avoid adding the -N flag on a folder you haven't reviewed. Combined with -d, it keeps the first file in each group and deletes the rest without prompting, and "first" reflects the order in which fdupes lists the files, not which copy is the original.

Command 3: find Pipelines — No Extra Tools Required

If you would rather not install Homebrew, a pipeline built from find, sort, md5, and standard text filters covers the same ground using only what macOS ships with.

Size-first filter (fast pre-screen)

Most unique files differ in size. Grouping by size before hashing saves significant time on large folders:

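find ~/Library/Caches -type f -print0 | xargs -0 stat -f "%z %N" | sort -n | awk '{ n[$1]++; key[NR] = $1; line[NR] = $0 } END { for (i = 1; i <= NR; i++) if (n[key[i]] > 1) print line[i] }'

sort -n orders the files by size, and the awk stage keeps only the lines whose size occurs more than once (it stands in for GNU-style uniq -w, which the macOS uniq does not provide). The result is a list of files that share the same size: a strong (though not certain) signal of duplication. From that shortlist you can run md5 on individual candidates to confirm.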

Hash-based pipeline for a specific subtree

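find ~/Library/Developer/Xcode/DerivedData -type f | while IFS= read -r f; do
  echo "$(md5 -q "$f") $f"
done | sort | awk '{ n[$1]++; key[NR] = $1; line[NR] = $0 } END { for (i = 1; i <= NR; i++) if (n[key[i]] > 1) print line[i] }'

The -q flag tells md5 to print only the hash, so each line starts with the 32-character hash followed by the path. sort brings equal hashes together, and the awk stage prints every line whose hash occurs more than once; the macOS uniq has no -w option to limit the comparison to the hash, so awk does that job. IFS= read -r keeps paths with leading spaces or backslashes intact. This pipeline surfaces every file in DerivedData that has an identical twin somewhere in the same tree.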

Command 4: Python Script for Readable Output

python3 becomes available on macOS once the Xcode Command Line Tools are installed (run xcode-select --install if the command is missing). This script groups duplicates into a readable report without any third-party dependency:

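python3 - <<'EOF'
import collections, hashlib, os

def md5_of(path, chunk_size=1024 * 1024):
    # Hash in 1 MiB chunks so a large file is never held in memory at once.
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

hashes = collections.defaultdict(list)  # hash -> paths with that content
for root, dirs, files in os.walk(os.path.expanduser('~/Downloads')):
    for name in files:
        path = os.path.join(root, name)
        if not os.path.isfile(path):
            continue  # skip sockets, FIFOs, and broken symlinks
        try:
            hashes[md5_of(path)].append(path)
        except OSError:
            pass  # unreadable file: leave it out of the report
for paths in hashes.values():
    if len(paths) > 1:
        # One group per block, separated by a blank line.
        print('\n'.join(paths))
        print()
EOF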

Swap ~/Downloads for any directory. The script reads each file and computes an MD5 hash, then prints groups of identical files. For very large files (multi-GB videos or disk images), reading every byte is slow. In those cases the size-first filter above is a faster first pass, because it compares file sizes without opening any file.
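
One way to avoid hashing large files unnecessarily is to combine both ideas: group by size first and hash only the files whose size is shared. The sketch below is an illustrative variant of the script above, not a drop-in replacement; file_md5 and duplicates_by_size_then_hash are hypothetical names.

```python
import collections
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicates_by_size_then_hash(top):
    """Return sorted groups of files under top that share size and MD5."""
    by_size = collections.defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # only regular files are candidates
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                pass  # cannot inspect the file; skip it

    by_hash = collections.defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # unique size: this file is never opened or hashed
        for path in paths:
            try:
                by_hash[(size, file_md5(path))].append(path)
            except OSError:
                pass  # unreadable file; skip it
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

With this ordering, a multi-GB disk image that has no same-sized sibling costs a single metadata lookup instead of a full read.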

Where Duplicates Hide: Common macOS Locations

Not all folders are equally risky to scan or clean. The table below maps common duplicate hotspots to their typical sizes and whether it is safe to delete from them without app-level knowledge.

Location | Typical Duplicate Type | Typical Size Impact | Safe to Delete Manually?
~/Downloads | Re-downloaded installers, PDFs | 1–20 GB | Yes, review first
~/Library/Developer/Xcode/DerivedData | Build artefacts | 5–50 GB | Yes, Xcode rebuilds on demand
~/.m2/repository | Duplicate Maven JARs across versions | 500 MB–5 GB | Caution, use mvn dependency:purge-local-repository
~/.cargo/registry/src | Rust crate source copies | 200 MB–3 GB | Yes, Cargo re-fetches as needed
~/Library/Caches | App cache blobs | 1–10 GB | Yes, apps regenerate caches
~/Pictures/Photos Library.photoslibrary | Duplicate originals and edited copies | Varies widely | No, use Photos app duplicate detection

For a broader look at what else is consuming your disk, this breakdown of what takes up space on a Mac maps the largest categories you will encounter.

How to Safely Delete Duplicates After Finding Them

Finding duplicates is the easy part. Deleting safely requires a short checklist:

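  1. Identify which copy is the canonical one. Byte-identical files have the same content, so choose by location: keep the copy your apps or projects actually reference. ls -lt lists a folder newest-first if modification dates help you tell the original from a later copy.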
  2. Check for symlinks. Run ls -la on the directory. Deleting the target of a symlink breaks apps that point to it.
  3. Move to Trash before permanent deletion. Use mv -n ~/path/to/duplicate ~/.Trash/ so you have a recovery window; -n stops mv from overwriting a same-named file already in the Trash.
  4. Confirm the remaining copy opens correctly before emptying Trash.
  5. Empty Trash. Only after verification, right-click the Trash icon and choose Empty Trash, or run rm -rf ~/.Trash/*.
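
Steps 2 and 3 of the checklist can be partly automated. The sketch below moves files into a holding folder instead of deleting them, leaves symlinks alone, and never overwrites an existing name. The function name quarantine and the idea of a separate holding folder are assumptions made for illustration; they are not part of macOS or of any tool mentioned here.

```python
import os
import shutil


def quarantine(paths, holding_dir):
    """Move regular files into holding_dir; return (source, destination) pairs."""
    os.makedirs(holding_dir, exist_ok=True)
    moved = []
    for path in paths:
        # Leave symlinks and non-files alone: they need a manual check first.
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        base = os.path.basename(path)
        stem, ext = os.path.splitext(base)
        dest = os.path.join(holding_dir, base)
        counter = 1
        # Add a numeric suffix rather than overwrite a same-named file.
        while os.path.exists(dest):
            dest = os.path.join(holding_dir, f"{stem}-{counter}{ext}")
            counter += 1
        shutil.move(path, dest)
        moved.append((path, dest))
    return moved
```

The returned pairs record where each file came from, so a file can be moved back if the remaining copy fails the check in step 4.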

If the sheer number of duplicates makes manual triage impractical, a tool like Crumb can audit all of these locations at once and show what's safe before you delete — useful when you need both Terminal-level visibility and a visual summary side by side.

Limitations of Terminal Duplicate Detection

Terminal commands find byte-identical files well. Keep these limits and quirks in mind:

  • Near-duplicate photos: a JPEG and its edited HEIF copy differ at the byte level even if they look identical on screen, so hash comparison will not match them.
  • Renamed copies: fdupes and md5 do catch these, because hash comparison ignores file names, but the results can be surprising when the names in a group look unrelated.
  • Duplicates inside app bundles: scanning inside .app packages can report resources that an app intentionally ships more than once; be careful running fdupes against /Applications.
  • Files behind sandboxed containers: many app containers under ~/Library/Containers are permission-restricted, and unreadable files are skipped. Where errors are suppressed (2>/dev/null, or the except OSError branch in the Python script), they are skipped without any warning.
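
Since skipped files are easy to miss, it can help to list them explicitly. This Python sketch walks a tree and records which directories could not be listed and which files could not be opened; find_unreadable is an illustrative name, and the check is just a one-byte read attempt.

```python
import os


def find_unreadable(top):
    """Return paths under top that could not be listed or opened."""
    blocked = []

    def on_walk_error(error):
        # os.walk passes the OSError raised while listing a directory.
        blocked.append(error.filename)

    for root, _dirs, files in os.walk(top, onerror=on_walk_error):
        for name in files:
            path = os.path.join(root, name)
            # Only test regular files; opening a FIFO could block forever.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    handle.read(1)
            except OSError:
                blocked.append(path)
    return blocked
```

Running it against the folder you just scanned shows how much of the tree the duplicate report could not see.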

For a wider approach that includes photos and sandboxed apps, see the guide to finding duplicate files on Mac for free, which covers GUI tools alongside Terminal methods.

Frequently asked questions

Is it safe to delete duplicate files found by md5 or fdupes on a Mac?
Generally yes, as long as you verify which copy is the one in active use before deleting. The safest approach is to move duplicates to the Trash rather than permanently deleting them immediately, so you can restore if something breaks. Never delete both copies of a pair without confirming one survives.
Does fdupes work on Apple Silicon Macs running macOS Sequoia?
Yes. Homebrew's fdupes formula is compiled as a native ARM64 binary on Apple Silicon and runs without Rosetta on M1, M2, M3, and M4 Macs. Install it with brew install fdupes in Terminal.
Will scanning ~/Library/Developer/Xcode/DerivedData for duplicates break Xcode?
No. DerivedData holds build artefacts that Xcode regenerates on the next build. Deleting the entire DerivedData folder or duplicate files within it is safe and commonly recommended to reclaim space. Your source code is unaffected because it lives in a separate project folder.
How much space can I expect to recover by removing duplicate files on a Mac?
Results vary widely. Developers with multiple Xcode projects can recover 10-50 GB from DerivedData alone. Typical users who download a lot of files often find 1-5 GB of duplicates in ~/Downloads. Photo library duplicates depend on how many imports have been done over the years.
Can Terminal duplicate commands find duplicate photos inside a Photos Library?
They can find byte-identical image files, but macOS Photos Library packages store originals and modified versions in a proprietary bundle structure under ~/Pictures/Photos Library.photoslibrary/originals. Terminal commands can enter this bundle, but results require careful interpretation. For photo duplicates, the built-in Photos app duplicate detection (available from macOS Ventura onward) is more reliable.