Printing duplicates in a hash

HughbertD · May 3, 2007

I have a hash with paths as the keys and filenames as the values. I want to print the path and filename of every entry whose filename appears more than once in the hash.

Thanks in advance for any help.

1DMF · May 3, 2007

Well, I had a situation where I needed to count how many records in my data matched a condition, and I used this:

Code:

        $count = scalar grep {exists $_->{'Status'} && $_->{'Status'} eq "REFUND"} @trk;

So $count is the number of records in @trk whose 'Status' key exists and equals 'REFUND'.
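As a minimal sketch of that counting idiom: the thread never shows @trk, so the records below are made-up sample data, assumed to be an array of hash references.

```perl
use strict;
use warnings;

# Hypothetical sample data: one hash reference per record.
my @trk = (
    { Id => 1, Status => 'REFUND' },
    { Id => 2, Status => 'PAID'   },
    { Id => 3, Status => 'REFUND' },
    { Id => 4 },                      # record with no Status key at all
);

# grep in scalar context returns how many elements passed the test.
my $count = grep { exists $_->{Status} && $_->{Status} eq 'REFUND' } @trk;

print "$count\n";    # prints 2
```

The exists check keeps records without a Status key from triggering an "uninitialized value" warning in the eq comparison.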

So maybe you can use the same principle.

You would have to loop over the hash and, for each entry, grep the hash's values for ones matching the current value. If the count is > 1, you have duplicates.

I'm not 100% sure on the syntax. Post some code and I'll try to help.

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

KevinADC · May 3, 2007

You can't have duplicate keys in a hash, so you can't have duplicate key/value pairs. You can have duplicate values stored under different keys.
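A small sketch of that point, using made-up paths and filenames: assigning to an existing key overwrites its value, while the same value under a different key is stored as a separate entry.

```perl
use strict;
use warnings;

my %hash;
$hash{'/tmp/a'} = 'report.txt';
$hash{'/tmp/a'} = 'notes.txt';     # same key: overwrites, does not add
$hash{'/tmp/b'} = 'notes.txt';     # different key, same value: allowed

my $entries = keys %hash;          # keys in scalar context gives the count
print "$entries\n";                # prints 2
print "$hash{'/tmp/a'}\n";         # prints notes.txt
```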

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]

travs69 · May 3, 2007

I think he just wants to print the path and filename for any duplicate filenames. So no duplicate keys, just duplicate values.

KevinADC · May 3, 2007

I think you're right..... so post the code [wink]

travs69 · May 3, 2007

well.. in my limited knowledge..

Code:

%hash = ( 1 => "a", 2 => "a", 3 => "b", 4=> "b", 5=> "c");

for $key (keys %hash) {
	$count = grep {/$hash{$key}/} values %hash;
	if ($count > 1) {
		print "$key $hash{$key}\n";
	}
}

travs69 · May 3, 2007

One better

Code:

%hash = ( 1 => "a", 2 => "a", 3 => "b", 4=> "b", 5=> "c");

for $key (keys %hash) {
	if ((grep {/$hash{$key}/} values %hash) > 1) {
		print "$key $hash{$key}\n";
	}
}

KevinADC · May 3, 2007

if only my wife and kids were that attentive to what I say.... [smile]

travs69 · May 3, 2007

I can change my name to MrsKevinADC if it makes you feel better [gorgeous]

KevinADC · May 3, 2007

I think we'd better nip this in the bud. But I have to admit, that sexy-looking smiley had me considering it for a minute. [wink]

HughbertD · May 4, 2007

Fantastic - thanks everyone, always nice to see a relationship bloom as well ;-)

HughbertD · May 4, 2007

Hi -

I don't think the above code works. In one instance it finds only one duplicate when there are more, yet in another it prints virtually the whole file it was processing, regardless of duplicates.

Could anyone be kind enough to explain what the code does so I might understand a little better?

Thanks
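
One plausible explanation, offered as a sketch because the thread never shows the real data: /$hash{$key}/ is a pattern match, not an equality test. A short filename then also matches every longer name that contains it, and a filename with regex metacharacters such as parentheses may fail to match even itself. The paths and filenames below are made up to show both effects next to a plain eq comparison.

```perl
use strict;
use warnings;

# Made-up data: only 'notes(1).txt' is genuinely duplicated.
my %hash = (
    '/home/a' => 'log',
    '/home/b' => 'catalog.bak',
    '/home/c' => 'notes(1).txt',
    '/home/d' => 'notes(1).txt',
);

my (%by_regex, %by_eq);
for my $path (sort keys %hash) {
    my $file = $hash{$path};

    # Pattern match: counts every value the pattern $file matches anywhere.
    $by_regex{$path} = grep { /$file/ } values %hash;

    # String equality: counts only values identical to $file.
    $by_eq{$path} = grep { $_ eq $file } values %hash;

    print "$path $file regex=$by_regex{$path} eq=$by_eq{$path}\n";
}
```

Here 'log' is counted twice by the regex because it also occurs inside 'catalog.bak', while 'notes(1).txt' is counted zero times because the parentheses act as a capture group rather than literal characters. The eq counts are 1, 1, 2 and 2, which is what the original question asks for.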

chazoid · May 4, 2007

This might make things a little more clear. This version uses a second hash to store the duplicate filenames and counts. (I'm just using Data::Dumper to display the contents of the hash.)

Code:

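use strict;
use warnings;
use Data::Dumper;

my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes; # filename => count
for my $path (keys %hash) {
    # Count the values equal to this path's filename. eq is used rather
    # than a regex match so that substrings and regex metacharacters in
    # a filename cannot produce false hits or misses.
    my $count = grep { $_ eq $hash{$path} } values %hash;
    if ($count > 1) {
        $dupes{$hash{$path}} = $count;
    }
}
print Dumper(\%dupes);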

For the first pass, assuming it starts with the first key/value pair in the hash:

$path is set to "path_1"
grep searches the values of the hash for $hash{$path} (which is "file_a")
in scalar context grep returns the number of matches, which is stored in $count
if $count > 1, the value of $dupes{$hash{$path}} (or $dupes{"file_a"}) is set to $count
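
The steps above rescan every value once per key. As an alternative sketch (a different technique, not the code posted above), each filename can be tallied once in a count hash, and a second pass can then print every path whose filename was seen more than once. The names %seen and @report are illustrative, and the sort is only there to make the output order predictable.

```perl
use strict;
use warnings;

my %hash = (
    path_1 => 'file_a',
    path_2 => 'file_a',
    path_3 => 'file_b',
    path_4 => 'file_b',
    path_5 => 'file_c',
);

# First pass: tally how many times each filename occurs.
my %seen;
$seen{$_}++ for values %hash;

# Second pass: collect every path whose filename occurs more than once.
my @report;
for my $path (sort keys %hash) {
    my $file = $hash{$path};
    push @report, "$path $file" if $seen{$file} > 1;
}

print "$_\n" for @report;
```

With this sample data the report lists path_1 through path_4 and leaves out path_5, whose filename is unique.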

KevinADC · May 4, 2007

just a different way to do it:

Code:

my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes = ();
foreach my $paths (keys %hash) {
   push @{$dupes{$hash{$paths}}},$paths;
}
{
   local $" = "\n\t";
   foreach my $paths (keys %dupes) {
      print "$paths:\n\t@{$dupes{$paths}}\n" if @{$dupes{$paths}} > 1;
   }
}

MillerH · May 4, 2007

A slight variation on Kevin's variant:

If order does not matter (i.e. you're not sorting), then consider using the while and each functions for iterating over a hash instead of for with keys or values.

Code:

my %hash = ( "path_1" => "file_a",
             "path_2" => "file_a",
             "path_3" => "file_b",
             "path_4" => "file_b",
             "path_5" => "file_c"
            );

my %dupes = ();
while (my ($path, $file) = each %hash) {
	push @{$dupes{$file}}, $path;
}
{
	local $" = "\n\t";
	while (my ($file, $paths) = each %dupes) {
		print "$file:\n\t@$paths\n" if @$paths > 1;
	}
}

- Miller
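
If the same check is needed in more than one place, the inverted-hash idea above could be wrapped in a subroutine. The sketch below is one way to do that. The name find_duplicates and its return shape are hypothetical, not something from this thread. Because hash iteration order is not predictable, the paths and filenames are sorted before printing so the output is stable from run to run.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a hash ref of path => filename and returns a
# hash ref of filename => [sorted paths], keeping only filenames that are
# stored under more than one path.
sub find_duplicates {
    my ($files_by_path) = @_;

    # Invert the hash: group the paths under each filename.
    my %paths_for;
    while (my ($path, $file) = each %$files_by_path) {
        push @{ $paths_for{$file} }, $path;
    }

    # Keep only the filenames that have more than one path.
    my %dupes;
    for my $file (keys %paths_for) {
        my @paths = @{ $paths_for{$file} };
        $dupes{$file} = [ sort @paths ] if @paths > 1;
    }
    return \%dupes;
}

my %hash = (
    path_1 => 'file_a',
    path_2 => 'file_a',
    path_3 => 'file_b',
    path_4 => 'file_b',
    path_5 => 'file_c',
);

my $dupes = find_duplicates(\%hash);
for my $file (sort keys %$dupes) {
    print "$file:\n";
    print "\t$_\n" for @{ $dupes->{$file} };
}
```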

chazoid · May 4, 2007

Nice, Miller.. I didn't know about the each function.

Kevin - I just discovered your syntax highlighter.. pretty slick! [thumbsup2]

1DMF · May 9, 2007

Cool, a solution to the problem and a courting couple all in one thread. Is that a TT first?

1DMF · May 9, 2007

Oh, and stars all round.

chazoid · May 9, 2007

right back at ya! ;-)
