Finding duplicates of photos with different filenames

#1
My daughter is running Win10 on a Dell laptop. Because she has four kids, she has thousands of photos and the list keeps growing. Thanks to some difficulties with downloading photos from her iPhone in the recent past, she has many duplicate photos and, unfortunately, many descriptive filenames have changed.

As an old programmer, I can’t stop thinking that I should be able to help her detect and remove the duplicates. The basis for this is that although the filenames may be different, the actual photos should be identical, bit for bit, even including the supporting data on lens settings, orientation, etc. I don’t know "C" or its derivatives. The last language I used that might allow me to manipulate the layout of a record is BASIC. (I would be happiest if I could go back to my old mainframe days when I could work down at the bit level if necessary.)

There are packages that remove duplicates. I’ve worked with one from the CleanMyPC people. I’ve also thought about using iTunes to find the duplicates. The only idea I’ve had involves setting the actual filename aside as a comment, converting the photo bits to hex (ASCII characters should work just as well while cutting the filename length in half), and using the result as a "filename" to check for duplicates. Actually, I would only need to convert a relatively small initial part of the photo; this would give me some false positives but nothing more serious than that. That’s as far as I’ve gotten.
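
A minimal sketch of the idea above, written in Python rather than BASIC (my assumption, since Python is free on Win10 and reads raw bytes easily). It takes the first few thousand bytes of each file, turns them into a hex string, and groups files whose strings match. The names `PREFIX_BYTES`, `prefix_key` and `group_by_prefix` are made up for illustration, and the 4096-byte sample size is a guess, not a tested value. Note that hex uses two characters per byte, so the key is twice as long as the bytes it represents.

```python
from collections import defaultdict
from pathlib import Path

# Assumed sample size: how many leading bytes of each file to compare.
PREFIX_BYTES = 4096


def prefix_key(path, n=PREFIX_BYTES):
    """Return the first n bytes of the file as a hex string."""
    with open(path, "rb") as f:
        return f.read(n).hex()


def group_by_prefix(folder, n=PREFIX_BYTES):
    """Group files under folder (recursively) that share the same leading bytes.

    Returns a list of lists of paths. Every list holds two or more files that
    are *candidate* duplicates: files that start the same but differ later
    will also land in the same group (the false positives mentioned above).
    """
    groups = defaultdict(list)
    for p in Path(folder).rglob("*"):
        if p.is_file():
            groups[prefix_key(p, n)].append(p)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `group_by_prefix` with the path to the photo folder would return the candidate groups. Each group would still need a full comparison before anything is deleted.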

I’d appreciate any helpful comments and suggestions on this venture.
 


Josephur

Windows Forum Admin
Staff member
Premium Supporter
#2
What you are looking for is a hashing function such as MD5 or SHA-256. These condense the entire contents of a file into a short, fixed-length string. Identical files always produce the same string, whatever their filenames.

Every major programming language will have a library or function to do this already, so all you would have to do is write a recursive program to look for image files and hash them, then compare the results when finished. That being said, there are already tons of free programs out there to find duplicate images, so why reinvent the wheel? :)
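
For anyone who does want to roll their own, here is a rough sketch of that recursive hash-and-compare approach in Python, using the standard `hashlib` and `os` modules. The function names and the `IMAGE_EXTENSIONS` list are my own placeholders, so adjust the extensions to whatever the phone actually produces.

```python
import hashlib
import os
from collections import defaultdict

# Assumed list of photo extensions; edit to suit.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".tif", ".tiff"}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the whole file in 1 MiB chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_images(root):
    """Walk root recursively and return {hash: [paths]} for hashes seen 2+ times."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                full = os.path.join(dirpath, name)
                by_hash[sha256_of_file(full)].append(full)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Each entry in the result of `find_duplicate_images` is a set of files with identical contents. The sketch only reports them and leaves the decision about which copy to delete to a human.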
 


MikeHawthorne

Essential Member
Microsoft Community Contributor
#3
Hi

You should try the duplicate file finder in CCleaner. It has a number of parameters for finding duplicate files.
If you uncheck the box to look for duplicate file names, it will search by file size, type and content.

I just tried it with and without using names and it appeared to find the same files either way.

At least it will give you some place to start.

The only real problem is that while it will let you ignore many types of files, like system files, it won't let you tell it what file types to look for.
It would be nice if you could look just for .jpg files or .tif files, for example.

But even if you don't include file names in the search, it will show the file names along with the file type when it displays the results, so you can screen out the extensions that you aren't interested in.

Just look at the duplicate files with the .jpg extension, for instance. They will be listed together and it will show their locations.

Mike

I got curious and looked around, and I found a program called Duplicate Sweeper that may be more useful because it allows you to be more targeted in your search. I downloaded it and no alarms went off, and after a brief look it seems like it may be what you are looking for.

It appears that if it finds two image files that are dups it will display them side by side. You can download the sample version but you have to pay to activate it.

Download Duplicate Sweeper for free | Wide Angle Software
 

