KickStart to Find and Delete True Duplicate Files

Status
Not open for further replies.

asgarcymed

Technical User
Nov 22, 2007
14
PT
I want to create a VBScript that searches for true duplicate files (same SIZE + same MD5 HASH/CHECKSUM) and then deletes the clones.

Main features: filters to include and/or exclude files and/or folders; total recursive search; automatic deletion of duplicate files according to previously defined rules.

Getting each file's size is relatively easy, unlike getting its MD5. Maybe an external ActiveX/COM component could help with the MD5?

The script can write a CSV file with:
Full Path,Size,MD5

But how do I read such a CSV file in order to find the entries with the same SIZE and the same MD5?

Do you have any general guidelines/tips how to do this?

Thanks in advance.

Best regards.
 
>relatively easy; unlike getting MD5

You want to look at CAPICOM
 
CAPICOM is very good, and I really did not know it existed. Thanks for pointing it out! ;)


However, I still have a question:

The script can write a CSV file with:
Full Path,Size,MD5

But how do I read such a CSV file in order to find the entries with the same SIZE and the same MD5?


Thanks.

Regards.
 
Read the CSV item by item into a dictionary, using the size & MD5 as the key and the full path as the item (leave the path out of the key, otherwise two copies in different folders would never collide). When you get a "This key is already associated with an element of this collection" error (runtime error 457), that means you have found a duplicate.
 
(or use the dictionary's Exists method to check whether the key is already present before adding it, which gives much the same result without raising an error)
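
A minimal sketch of the Exists variant, for illustration only. It assumes the CSV is called files.csv (a hypothetical name), has no header row, and holds one Full Path,Size,MD5 record per line. The key is built from the size and MD5 only, with the path stored as the item, so that copies in different folders map to the same key. The line is split at its last two commas so that a comma inside a path does no harm.

```vbscript
Option Explicit

' Assumed input: one "Full Path,Size,MD5" record per line, no header row.
Const CSV_FILE = "files.csv"   ' hypothetical file name
Const ForReading = 1

Dim fso, csv, seen, line, p1, p2, path, key

Set fso = CreateObject("Scripting.FileSystemObject")
Set seen = CreateObject("Scripting.Dictionary")
seen.CompareMode = vbTextCompare   ' compare MD5 hex digits case-insensitively
Set csv = fso.OpenTextFile(CSV_FILE, ForReading)

Do Until csv.AtEndOfStream
    line = Trim(csv.ReadLine)
    ' Find the last two commas; everything before them is the path.
    p2 = InStrRev(line, ",")
    If p2 > 1 Then p1 = InStrRev(line, ",", p2 - 1) Else p1 = 0
    If p1 > 1 Then
        path = Left(line, p1 - 1)
        key = Mid(line, p1 + 1)    ' "Size,MD5"
        If seen.Exists(key) Then
            WScript.Echo "Duplicate: " & path & "  (same as " & seen(key) & ")"
        Else
            seen.Add key, path     ' first file seen with this size and hash
        End If
    End If
Loop
csv.Close
```

Run it with cscript so the output goes to the console instead of message boxes. The sketch only reports duplicates; which copy to delete is left to whatever rules you define.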
 
You are talking about "Dictionary Object", right?

«
Set dict = CreateObject("Scripting.Dictionary")
»


Could you please post a code "fragment"/ "snippet"/ "sample"/ "example"?

Thanks.

Regards.
 
strongm is exactly right. The dictionary object is the way to go. Here is a thread to help you out. Also, read up on the link I posted in this thread as it will help you to understand the dictionary object better.

thread222-1313989

Swi
 
If you're interested in calculating an MD5 checksum, you can find a free COM component here: It will compute the MD5 checksum of a file or a string.


--------------------------------------------------------------------------------
dm4ever
My philosophy: K.I.S.S - Keep It Simple Stupid
 
OK, thanks. If I have to do it myself on Windows, I was going to use md5sum.exe from UnxUtils.
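
For what it's worth, a rough sketch of calling md5sum.exe from a script. It assumes md5sum.exe is on the PATH and prints the usual "hash, then file name" line; FileMD5 and the example path are made-up names for illustration. GNU md5sum may put a leading backslash on the line when the file name itself contains backslashes, so the sketch strips one if present.

```vbscript
Option Explicit

' Returns the MD5 of a file as 32 lowercase hex digits, or "" if md5sum
' printed nothing usable. Assumes md5sum.exe can be found on the PATH.
Function FileMD5(filePath)
    Dim shell, proc, out
    FileMD5 = ""
    Set shell = CreateObject("WScript.Shell")
    Set proc = shell.Exec("md5sum.exe """ & filePath & """")
    out = proc.StdOut.ReadAll          ' blocks until md5sum closes its output
    ' Strip the escape marker that may precede the hash.
    If Left(out, 1) = "\" Then out = Mid(out, 2)
    If Len(out) >= 32 Then FileMD5 = LCase(Left(out, 32))
End Function

WScript.Echo FileMD5("C:\Temp\example.txt")   ' hypothetical path
```

Starting one process per file is slow on a large tree, so it may be worth comparing sizes first and hashing only the files whose size occurs more than once.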
 