
Find duplicate files from different directories

Status
Not open for further replies.

tpham1002002

Technical User
Jun 20, 2012
US
Hi All,
I have about 200 folders, each containing 6 to 10 sub-folders. About 100 different files have been duplicated and stored across these folders, and I must find their file names and path locations.
I'd like your help writing a Tcl or Perl script that finds the duplicated file names and their path locations.
I really appreciate your help.
Regards
Tony
 
I personally no longer have a platform that runs Tcl BUT... it's not a hard problem. Look at the "file" and "glob" functions. Start with just the directories, not the sub-directories. Remember that in Tcl, arrays are hashed so you can have something like "filenames(xyz/abc/qrl.ext)=n" where n is the number of occurrences.

_________________
Bob Rashkin
 
hi,

As Bong already pointed out, you need to create a procedure that searches for files using 'glob'. See following Link
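
For readers who want a starting point, a search procedure of the kind described here might look roughly like the sketch below. It is not from the original posters: the name findFiles is hypothetical, it assumes Tcl 8.5 or later (for the {*} expansion), and it makes no attempt to handle hidden files or symbolic-link loops.

```tcl
# Hypothetical helper (not from the thread): recursively collect the
# paths of all regular files found under directory $dir.
proc findFiles {dir} {
    set results {}
    # Regular files directly inside $dir
    foreach f [glob -nocomplain -types f -directory $dir -- *] {
        lappend results $f
    }
    # Recurse into each sub-directory and append whatever it contains
    foreach d [glob -nocomplain -types d -directory $dir -- *] {
        lappend results {*}[findFiles $d]
    }
    return $results
}

# Example call (dir1 is a placeholder directory name):
#   set files [findFiles dir1]
```

The -nocomplain flag makes glob return an empty list instead of raising an error when a directory has no matches, which keeps the recursion simple.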

Furthermore, let's assume you have created your search procedure (or somebody else on this forum with more free time than me has given you the script) and that it returns its results as a list in the following format:
Code:
set searchResults {
  dir1/dir2/dir3/file1.tcl
  dir1/dir3/file1.tcl
  dir1/dir2/file1.tcl
  dir1/dir2/dir3/file2.tcl
  dir1/dir2/dir3/file3.tcl
  dir1/dir3/file2.tcl
  dir1/file3.tcl
  dir1/file4.tcl
}

Then you can use the following script to parse the list for duplicates:
Code:
foreach file $searchResults {
  set name [file tail $file]
  # incr fails on an unset array element before Tcl 8.5, so initialize it on error
  if {[catch {incr filenames($name,cnt)}]} {set filenames($name,cnt) 1}
  # remember every directory in which this file name was seen
  lappend filenames($name,paths) [file dirname $file]
}

Running 'parray filenames' then prints the array; every file name with a count greater than 1 is a duplicate:
Code:
filenames(file1.tcl,cnt)   = 3
filenames(file1.tcl,paths) = dir1/dir2/dir3 dir1/dir3 dir1/dir2
filenames(file2.tcl,cnt)   = 2
filenames(file2.tcl,paths) = dir1/dir2/dir3 dir1/dir3
filenames(file3.tcl,cnt)   = 2
filenames(file3.tcl,paths) = dir1/dir2/dir3 dir1
filenames(file4.tcl,cnt)   = 1
filenames(file4.tcl,paths) = dir1
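
To list only the duplicated names rather than every entry, the counting step can be followed by a filter. The sketch below is one possible way to do that, not the posters' code: the proc name findDuplicates is hypothetical, it uses a dict instead of the array above, and it assumes Tcl 8.5 or later. Note that it matches on file name only, so two files with the same name but different contents are still reported.

```tcl
# Hypothetical helper (not from the thread): given a list of file paths,
# return a dict mapping each duplicated file name to the list of
# directories in which it was found.
proc findDuplicates {paths} {
    # Group directories by file name
    set byName [dict create]
    foreach path $paths {
        dict lappend byName [file tail $path] [file dirname $path]
    }
    # Keep only the names seen in more than one place
    set dups [dict create]
    dict for {name dirs} $byName {
        if {[llength $dirs] > 1} {
            dict set dups $name $dirs
        }
    }
    return $dups
}

# Small usage example with made-up paths
dict for {name dirs} [findDuplicates {dir1/a/x.tcl dir1/b/x.tcl dir1/y.tcl}] {
    puts "$name: $dirs"
}
# prints: x.tcl: dir1/a dir1/b
```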

thacoda
 