
Finding duplicate files

Posted: 16 Nov 2013, 09:58
by viking60
When cleaning up my computer, I want to find and delete duplicate files. This one-liner (it needs GNU find and uniq) finds them:

Code: Select all

find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate


Easy to remember command, right? :-D If not :shock: - maybe it is time to make an alias:

Code: Select all

alias duplicates='find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate'


Now you can navigate to your directory of choice and check it for duplicate files by typing:

Code: Select all

duplicates

This will check all subdirectories too.
This is good for finding duplicate pictures, MP3s and so on. It is not a good idea to delete duplicate files from themes, icon sets and the like.
Only delete files that you put there yourself.
I always find a lot of duplicates in my Downloads directory.
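
An alias cannot take a directory argument, so here is the same pipeline written out as a shell function with one comment per stage. Treat it as a sketch: the name find_dupes is made up for this example, and it assumes GNU find, xargs and uniq just like the one-liner.

```shell
# find_dupes [DIR]: print groups of files with identical content.
# Hypothetical helper name; DIR defaults to the current directory.
find_dupes() {
    local dir="${1:-.}"
    # 1. Print the size in bytes of every non-empty regular file.
    find "$dir" -not -empty -type f -printf '%s\n' |
        # 2. Keep only the sizes that occur more than once.
        sort -rn | uniq -d |
        # 3. For each of those sizes, list the files with exactly that size.
        xargs -I{} find "$dir" -type f -size {}c -print0 |
        # 4. Hash only these candidates (-r: do nothing if there are none).
        xargs -0 -r md5sum |
        # 5. Sort by hash and print the groups whose first 32 characters match.
        sort | uniq -w32 --all-repeated=separate
}
```

Call it as find_dupes ~/Downloads. Files are only hashed when another file has the same size, which is why the size steps come first.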

Now if you want to find and delete duplicate files in one operation you could enter this command:

Code: Select all

find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | cut -f3- -d ' ' | tr '\n' '\t' | sed 's/\t\t/\n/g' | cut -f2- | tr '\t' '\n' | perl -pe 's/([ (){}-])/\\$1/g' | perl -pe 's/'\''/\\'\''/g' | xargs -pr rm -v

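xargs -p shows the complete rm command first; enter y to confirm that all findings should be deleted. The first file of each group is kept and every other copy is removed. This is dangerous and probably not a smart thing to do.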
No risk - no fun though :berserkf
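
If you would rather look before you delete, here is a sketch that only prints the extra copies (every file of a group except the first one). The name extra_copies is made up for this example. To keep it short it hashes every file instead of filtering by size first, and it assumes file names without newlines or backslashes, because md5sum escapes those.

```shell
# extra_copies [DIR]: list every duplicate except the first file of each group.
# Hypothetical helper name; nothing is deleted here.
extra_copies() {
    local dir="${1:-.}"
    find "$dir" -not -empty -type f -print0 |
        xargs -0 -r md5sum |
        sort |
        # Field 1 is the MD5 hash. Print the file name (it starts at column 35)
        # only when this hash has been seen before.
        awk 'seen[$1]++ { print substr($0, 35) }'
}
```

Check the list first. If it looks right, it can be handed to rm with extra_copies ~/Downloads | tr '\n' '\0' | xargs -0 -r rm -v --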

Re: Finding duplicate files

Posted: 29 Oct 2017, 10:07
by viking60
To find and delete duplicate files, you can also install fdupes.

You need to give it a directory:

Code: Select all

fdupes ~/Downloads

This will find all duplicates in your Downloads directory.

To delete them, add the -d switch:

Code: Select all

fdupes -d ~/Downloads

For each set of duplicates, fdupes lists the files with a number and asks which ones you want to preserve. Typing 1 (one) keeps the first file in the list and deletes the others.

You will have to do this for every set of duplicates found.

If you want to scan and remove recursively you can use the -r switch:

Code: Select all

fdupes -rd ~/Downloads
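
Answering the prompt for every set gets tedious. As far as I know, fdupes also has a -N (--noprompt) switch that, combined with -d, keeps the first file of each set and deletes the rest without asking. A small sketch follows; the function name dedupe_keep_first is made up, and you should check man fdupes on your system before trusting it with real data.

```shell
# dedupe_keep_first DIR: hypothetical wrapper around fdupes.
dedupe_keep_first() {
    local dir="$1"
    if ! command -v fdupes >/dev/null 2>&1; then
        echo "fdupes is not installed" >&2
        return 127
    fi
    # -r recurse, -d delete, -N do not prompt (keeps the first file of each set).
    fdupes -rdN "$dir"
}
```

Use it as dedupe_keep_first ~/Downloads. There is no confirmation at all, so try it on a copy of the directory first.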


To see the size of the duplicates you can use the -S switch:

Code: Select all

fdupes -S ~/Downloads