Printing duplicates in a hash

Status
Not open for further replies.

HughbertD

Technical User
Apr 30, 2007
42
GB
I have a hash with paths as the keys and filenames as the values. I want to print the path and filename of every element whose filename appears more than once in the hash.

Thank you if you can help at all
 
Well, I had a situation where I needed to count how many duplicates there were in my data, and I used this:
Code:
        $count = scalar grep {exists $_->{'Status'} && $_->{'Status'} eq "REFUND"} @trk;

So $count is the number of elements (i.e. records) in @trk where the key 'Status' exists and equals 'REFUND'.

So maybe you can use the same principle.

You would have to loop over the hash and then grep the hash for keys matching the current key, and if the count is > 1 then you have duplicates.

I'm not 100% on the syntax, so post some code and I'll try to help.
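
For reference, here is a minimal self-contained sketch of that counting idiom. The sample @trk records are made up for illustration; only the grep line comes from the post itself.

```perl
use strict;
use warnings;

# Hypothetical sample data: an array of hash references, one per record.
my @trk = (
    { Status => 'REFUND', Amount => 10 },
    { Status => 'PAID',   Amount => 25 },
    { Status => 'REFUND', Amount => 5  },
    { Amount => 7 },                      # record with no Status key
);

# In scalar context grep returns the number of elements for which
# the block is true, rather than the matching elements themselves.
my $count = scalar grep { exists $_->{'Status'} && $_->{'Status'} eq "REFUND" } @trk;

print "$count\n";   # prints 2
```

The exists check is there so that records without a 'Status' key do not trigger "uninitialized value" warnings in the eq comparison.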

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you.
 
You can't have duplicate key/value pairs in a hash, because keys are unique. You can have duplicate values that have different keys.
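
A quick sketch of what that means in practice, using made-up keys and values:

```perl
use strict;
use warnings;

# Assigning to an existing key overwrites the old value;
# it never adds a second entry for that key.
my %hash = ( path_1 => "file_a" );
$hash{path_1} = "file_b";
my $key_count = keys %hash;          # still 1

# Different keys may share the same value.
$hash{path_2} = "file_b";
my @values = sort values %hash;      # ("file_b", "file_b")

print "$key_count key(s) after overwrite; values: @values\n";
```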

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I think he just wants to print the path and filename of any duplicate filenames. So no duplicate keys, just duplicate values.
 
I think you're right..... so post the code [wink]

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
well.. in my limited knowledge..

Code:
%hash = ( 1 => "a", 2 => "a", 3 => "b", 4=> "b", 5=> "c");

for $key (keys %hash) {
	$count = grep {/$hash{$key}/} values %hash;
	if ($count > 1) {
		print "$key $hash{$key}\n";
	}
}
 
One better
Code:
%hash = ( 1 => "a", 2 => "a", 3 => "b", 4=> "b", 5=> "c");

for $key (keys %hash) {
	if ((grep {/$hash{$key}/} values %hash) > 1) {
		print "$key $hash{$key}\n";
	}
}
 
if only my wife and kids were that attentive to what I say.... [smile]

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I can change my name to MrsKevinADC if it makes you feel better [gorgeous]
 
I think we'd better nip this in the bud. But I have to admit, that sexy looking smiley had me considering it for a minute. [wink]

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Fantastic - thanks everyone, always nice to see a relationship bloom as well ;-)
 
Hi -

I don't think the above code works. In one instance it only finds 1 duplicate when there are more, yet in another case it prints out virtually all of the file it was processing, regardless of duplicates.

Could anyone be kind enough to explain what the code does so I might understand a little better?

Thanks
 
this might make things a little more clear - this version uses a second hash to store the duplicate filenames and counts. (I'm just using Data::Dumper to display the contents of the hash)
Code:
use Data::Dumper;
my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes; # filename => count;
for my $path (keys %hash) {
    # compare with eq for an exact match; a regex would also match substrings
    my $count = grep { $_ eq $hash{$path} } values %hash;
    if ($count > 1) {
        $dupes{$hash{$path}} = $count;
    }
}
print Dumper(\%dupes);

For the first pass, assuming it starts with the first key/value pair in the hash:

$path is set to "path_1"
using grep, the values of the hash are searched for $hash{$path} (which is "file_a")
grep returns the number of matches, which is stored in $count
if $count > 1, the value of $dupes{$hash{$path}} (or $dupes{"file_a"}) is set to $count
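
One possible explanation for the odd results reported earlier in the thread, offered as a guess since the real data was not posted: the grep {/$hash{$key}/} form does a regex match, not an exact comparison. A value such as "a" then also matches "ab" as a substring, and characters like . in a filename are treated as regex metacharacters. The sketch below uses made-up filenames to compare the regex test with an exact eq test.

```perl
use strict;
use warnings;

# Hypothetical data: no two values are identical, so there are no real duplicates.
my %hash = (
    "path_1" => "a",
    "path_2" => "ab",
    "path_3" => "a.txt",
    "path_4" => "abtxt",
);

my (@regex_hits, @exact_hits);
for my $key (sort keys %hash) {
    my $value = $hash{$key};

    # Regex match: counts every value that contains the pattern anywhere.
    push @regex_hits, $key if (grep { /$value/ } values %hash) > 1;

    # Exact match: counts only values identical to this one.
    push @exact_hits, $key if (grep { $_ eq $value } values %hash) > 1;
}

print "regex: @regex_hits\n";   # regex: path_1 path_2 path_3
print "exact: @exact_hits\n";   # nothing after "exact:", as there are no duplicates
```

With the regex test, "a" matches all four values, "ab" also matches "abtxt", and the dot in "a.txt" matches the "b" in "abtxt", so three paths are wrongly reported. With eq, nothing is reported.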

 
just a different way to do it:

Code:
my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes = ();
foreach my $paths (keys %hash) {
   push @{$dupes{$hash{$paths}}},$paths;
}
{
   local $" = "\n\t";
   foreach my $paths (keys %dupes) {
      print "$paths:\n\t@{$dupes{$paths}}\n" if @{$dupes{$paths}} > 1;
   }
}



------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
A slight variation on Kevin's variant:

If order does not matter (i.e. you're not sorting), then consider using the while and each functions for iterating over a hash instead of for and (keys or values).

Code:
my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes = ();
while (my ($path, $file) = each %hash) {
	push @{$dupes{$file}}, $path;
}
{
	local $" = "\n\t";
	while (my ($file, $paths) = each %dupes) {
		print "$file:\n\t@$paths\n" if @$paths > 1;
	}
}
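
If the output order does matter, one option is to build the same filename => paths structure and sort at print time. A rough sketch, where find_dupes is a hypothetical helper name and not something from the posts above:

```perl
use strict;
use warnings;

# find_dupes is a hypothetical helper name used only for this sketch.
# It takes a reference to a path => filename hash and returns a reference
# to a filename => [sorted paths] hash, keeping only filenames that
# appear under more than one path.
sub find_dupes {
    my ($files) = @_;
    my %by_file;
    while (my ($path, $file) = each %$files) {
        push @{ $by_file{$file} }, $path;
    }
    my %dupes;
    for my $file (keys %by_file) {
        my @paths = @{ $by_file{$file} };
        $dupes{$file} = [ sort @paths ] if @paths > 1;
    }
    return \%dupes;
}

my %hash = (
    "path_1" => "file_a",
    "path_2" => "file_a",
    "path_3" => "file_b",
    "path_4" => "file_b",
    "path_5" => "file_c",
);

# Sorting the filenames here makes the output order repeatable.
my $dupes = find_dupes(\%hash);
for my $file (sort keys %$dupes) {
    print "$file:\n";
    print "\t$_\n" for @{ $dupes->{$file} };
}
```

This lists file_a with path_1 and path_2, then file_b with path_3 and path_4, in the same order on every run.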

- Miller
 
Nice, Miller.. I didn't know about the each function.

Kevin - I just discovered your syntax highlighter.. pretty slick! [thumbsup2]
 
Cool, a solution to the problem and a courting couple all in one thread. Is that a TT first?

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you.
 
oh and stars all round :)

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you.
 