Tek-Tips is the largest IT community on the Internet today!


Trigger unique array element to display line contents in table?

Status
Not open for further replies.

wauger

Technical User
Jul 3, 2002
12
US
Hi All,
I haven't played with Perl/CGI in a while and I'm finding I am a little rusty.

I have a data file that looks like this:
CORP,101,90210-sf101,aggregate2.402:1,10.1.1.100/27,402
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403
BUS2,103,90210-sf103,aggregate4.402:1,10.1.3.100/27,404
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.100/27,405
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.101/27,403
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.105/27,405

I want to:
a. Load this into an array
b. Drop duplicates
c. Key off of the first field
d. Display all line contents common to the unique first field into a table.

Something that looks like this:

CORP
----
101 | 90210-sf101 | aggregate2.402:1 | 10.1.1.100/27 | 402

BUS1
----
102 | 90210-sf102 | aggregate3.402:1 | 10.1.2.100/27 | 403
102 | 90210-sf102 | aggregate3.402:1 | 10.1.2.101/27 | 403

BUS2
----
103 | 90210-sf103 | aggregate4.402:1 | 10.1.3.100/27 | 404

BUS3
----
104 | 90210-sf104 | aggregate5.402:1 | 10.1.4.100/27 | 405
104 | 90210-sf104 | aggregate5.402:1 | 10.1.4.105/27 | 405


I am currently loading the data from the file into an array and assigning values with a 'foreach' loop, but now I need to grab the first field and print every field from each line that shares that same first field. Make sense?

Any help is appreciated.
 
how big is the file?

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
It will only be about 1000 lines or so total.
 
Based on the data above, what exactly counts as a duplicate?

The 3rd field seems to be a unique key for building a hash table, to which the rest of the values can be tied and displayed.
 
There might not be any duplicates; I'm just saying there could be.

What is the most efficient way of doing this? Thanks
 
You need to explain what duplicates means. Is it the whole line or just part of the line that constitutes a duplicate? I took it to mean the entire line must match to make them a duplicate.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Here's my version of the solution, which uses awk to display the output as you have listed above. You could use File::xxx functions to attain the same. If you take out awk, the results are still the same, just displayed a little differently.

Code:
#!/usr/bin/perl

open FILE, "< ./file" or die "can't open ./file: $!";
while (<FILE>) {
    ($ff, $junk) = split /,/, $_;
    $val{$ff}++;
}
close FILE;

foreach $source (sort keys %val) {
    $data = `cat ./file | grep '^$source' | awk ' { FS = \",\" } {print \$2,\"|\",\$3,\"|\"\$4,\"|\",\$5}'`;
    print "$source\n$data\n";
}

OUTPUT:

BUS1
102 | 90210-sf102 |aggregate3.402:1 | 10.1.2.100/27
102 | 90210-sf102 |aggregate3.402:1 | 10.1.2.101/27

BUS2
103 | 90210-sf103 |aggregate4.402:1 | 10.1.3.100/27

BUS3
104 | 90210-sf104 |aggregate5.402:1 | 10.1.4.100/27
104 | 90210-sf104 |aggregate5.402:1 | 10.1.4.105/27

CORP
101 | 90210-sf101 |aggregate2.402:1 | 10.1.1.100/27
 
sorry, forgot to display the last field.

Code:
$data = `cat ./file | grep '^$source' | awk ' { FS = \",\" } {print \$2,\"|\",\$3,\"|\",\$4,\"|\",\$5,\"|\",\$6}'`;
 
Thanks for the quick response everyone!

The code you provided would work perfectly, but I am running this under CGI, so I need to limit myself to Perl.

I found a way (not optimal): run a foreach loop with an IF condition that triggers on each unique first field and pulls out all of the data for lines matching that first field. Unfortunately, to get it to display the way I like, I have to run this about 7 times, as there are 7 different unique first fields.
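If shelling out or re-reading the file per field is the sticking point, the grouping can be done in a single pass of pure Perl with a hash of arrays. A minimal sketch (reading sample records from DATA for illustration; a real script would open the data file instead; no duplicate removal here):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group the lines by their first field in a single pass --
# no external grep/awk processes and no re-reading of the file.
my %groups;
while (my $line = <DATA>) {
    chomp $line;
    my ($key, @rest) = split /,/, $line;
    push @{ $groups{$key} }, join ' | ', @rest;
}

foreach my $key (sort keys %groups) {
    print "$key\n----\n";
    print "$_\n" for @{ $groups{$key} };
    print "\n";
}

__DATA__
CORP,101,90210-sf101,aggregate2.402:1,10.1.1.100/27,402
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.101/27,403
```

Adding a new first-field value then requires no code change at all; a new hash key simply appears.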
 
Well that's not really a solution at all, is it? If you've got to read the file seven times to get an answer, performance is going to be appalling. What happens if you get a new location - will you change the program to read it eight times?

The simplest way to get rid of duplicates is to use a hash. Here, I've assumed that a duplicate means that the whole line is the same as one of the others. If your definition of duplicate is different, then you'll have to change it.

Because we are aggregating at two levels, this uses a hash of hashes.
Code:
use strict;
use warnings;

my %keys;

while (<DATA>) {
   chomp;
   my @cols = split(/,/);
   $keys{$cols[0]}->{join('|', @cols[1 .. $#cols])}++;
}

foreach (sort keys %keys) {
   print "$_\n----\n";
   print "$_\n" foreach (sort keys %{$keys{$_}});
   print "\n";
}
      

__DATA__
CORP,101,90210-sf101,aggregate2.402:1,10.1.1.100/27,402
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403
BUS2,103,90210-sf103,aggregate4.402:1,10.1.3.100/27,404
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.100/27,405
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.101/27,403
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.105/27,405

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Borrowing from Steve's code, this should produce the output the OP posted at the beginning of the thread:

Code:
use strict;
use warnings;

my %keys;

while (<DATA>) {
   my ($k,@v) = split ",";
   push @{$keys{$k}}, join ' | ',@v;
}

foreach (sort keys %keys) {
   print "$_\n----\n";
   print "$_" for @{$keys{$_}};
   print "\n";
}

__DATA__
CORP,101,90210-sf101,aggregate2.402:1,10.1.1.100/27,402
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403
BUS2,103,90210-sf103,aggregate4.402:1,10.1.3.100/27,404
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.100/27,405
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.101/27,403
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.105/27,405

output:

Code:
BUS1
----
102 | 90210-sf102 | aggregate3.402:1 | 10.1.2.100/27 | 403
102 | 90210-sf102 | aggregate3.402:1 | 10.1.2.101/27 | 403

BUS2
----
103 | 90210-sf103 | aggregate4.402:1 | 10.1.3.100/27 | 404

BUS3
----
104 | 90210-sf104 | aggregate5.402:1 | 10.1.4.100/27 | 405
104 | 90210-sf104 | aggregate5.402:1 | 10.1.4.105/27 | 405

CORP
----
101 | 90210-sf101 | aggregate2.402:1 | 10.1.1.100/27 | 402

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Kevin
wauger said:
Something that looks like this:
He didn't say it had to look exactly like that [smile]

But I'll allow that your split with the scalar and the array is neater than mine...

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Also, I'd originally used a hash of arrays, but discarded it in favour of the hash of hashes when I realised it wouldn't remove the duplicates. So
Code:
use strict;
use warnings;

my %keys;

while (<DATA>) {
   chomp;
   my ($k,@v) = split /,/;
   $keys{$k}->{join(' | ', @v)}++;
}

foreach my $k (sort keys %keys) {
   print "$k\n----\n";
   print "$_\n" foreach (sort keys %{$keys{$k}});
   print "\n";
}
 
__DATA__
CORP,101,90210-sf101,aggregate2.402:1,10.1.1.100/27,402
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403
BUS2,103,90210-sf103,aggregate4.402:1,10.1.3.100/27,404
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.100/27,405
BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.101/27,403
BUS3,104,90210-sf104,aggregate5.402:1,10.1.4.105/27,405
[red]BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403[/red]
should be the best of both solutions.
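As a quick sanity check of the duplicate handling: because hash keys are unique, incrementing the same joined-line key twice still leaves a single entry, which is exactly what drops the repeated red line above. A minimal sketch of just that mechanism:

```perl
use strict;
use warnings;

# Feed the same line in twice: the inner hash ends up with one key,
# so the duplicate vanishes from the output.
my %keys;
for my $line ('BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403',
              'BUS1,102,90210-sf102,aggregate3.402:1,10.1.2.100/27,403') {
    my ($k, @v) = split /,/, $line;
    $keys{$k}{ join ' | ', @v }++;
}
printf "BUS1 unique lines: %d\n", scalar keys %{ $keys{BUS1} };
```

The count on the inner hash value also tells you how many times each line occurred, should that ever be useful.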

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Hopefully the OP comes back and reads these suggestions :)

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Hey All,
WOW - thanks so much for the help! You all have been great. I am going to try to implement these right now. I'll keep you posted.
 