
Using ActiveState Perl For Windows to Parse 3

Status
Not open for further replies.

pmking

IS-IT--Management
Mar 1, 2006
59
0
0
US
Hello All,

I have a parsing question. I have a formatted report, stored as a text file, that I need to parse to extract certain pieces of text.

First of all, I am new to Perl, and I am not sure where to begin.

What I would like to do is:

1. Create a GUI so users can specify where the text file that needs parsing is located on their PC.

2. Have a Perl script (created with ActiveState Perl for Windows) run against the text file, extract only what is needed, and output it to an Excel spreadsheet. I am not sure whether output to an Excel spreadsheet is possible.

I am not sure where to begin with the perl script, since the text file that I need to parse is formatted already. There are tabs, spaces, colons, dashes throughout this text file.

Am I on the right track by using perl? And if so, can somebody please, please help me get a start?

Thanks.


 
Stevexff,

Thanks so much for your response.

Okay, I think I am getting this now. In your post you used a dictionary as an analogy for hashing, which made it very clear to me. Great input from your end regarding hashing.

How can I hash an element if the end result is always different? I mean, I can't define it or put it in the dictionary, since the numerical value that I am matching with the regex is always different.

Wait, I think the light just went on in my head. So what I can do is hash

print "Enter the file you want to parse: ";
chomp(my $filename = <STDIN>);

my ($location, $subsystem, $fromtime, $totime,
$primauth, $planname, $average, $applcl1,
$db2cl2, $cputime, $occurrences, $commits,
$rollbbacks, $deadlocks);

open(FH,"<$filename") or die "Can't open $filename: $!";
while(<FH>){
chomp;
if (/^\s*LOCATION:\s*(\w+)/) {
$location = "LOCATION: $1"; # [red]I would hash $location = LOCATION: $1[/red]
}
elsif (/^\s*SUBSYSTEM:\s*(\w+)/) {
$location = "SUBSYSTEM: $1"; # [red]I would hash $location = SUBSYSTEM: $1[/red]
}
elsif (/^\s*PRIMAUTH:\s*(\S+)\s*PLANNAME:\s*(\S+)/) {
$location = "PRIMAUTH: $1 PLANNAME: $2"; # [red]I would hash PRIMAUTH: $1 PLANNAME: $2 ---- and so on below[/red]
}
elsif (/\s*INTERVAL:\sFROM:\s*(\w+)/) {
$location = "FROM: $1";
}
elsif (/^\s*TO:\s*([\w\s\/:]+)/) {
$location = "TO: $1";
}
elsif (/\s*#OCCURRENCES\s*:\s*(\S+)/) {
$location = "OCCURRENCES: $1";
}
elsif (/^\s*ELAPSED\sTIME\s*([\d.]+)/) {
$location = "ELAPSED TIME: $1";
}
elsif (/^\s*CPU\sTIME\s*(\S+)/) {
$location = "CPU TIME: $1";
}
elsif (/\s*DEADLOCKS\s*(\d+\s\d+)/) {
$location = "DEADLOCKS: $1";
}
else {
next;
}
print "$location\n"; #how come location isn't printing?
}
close(FH);


-------END OF CODE SNIPPET-----------------

Okay, maybe the light turned off again :) So what I have in red above is what I would hash, and then I would sort the hashes by value, right? I hope I am on the right track.


Thanks so much Stevexff, your dictionary example was good..
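
On the question of hashing values that are different every time: a minimal sketch, assuming the labels in the report are fixed and only the values after them change. The label is the hash key and whatever the regex captures is stored as the value. The sample lines and the values in them (DB2PROD, DSN1, 0.0421) are made up for illustration and are not taken from the real report.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the real report.
my @lines = (
    '  LOCATION: DB2PROD',
    '  SUBSYSTEM: DSN1',
    '  CPU TIME 0.0421',
);

# The keys are the fixed labels; the values are whatever was captured.
my %record;
for my $line (@lines) {
    if    ($line =~ /^\s*LOCATION:\s*(\w+)/)  { $record{LOCATION}   = $1 }
    elsif ($line =~ /^\s*SUBSYSTEM:\s*(\w+)/) { $record{SUBSYSTEM}  = $1 }
    elsif ($line =~ /^\s*CPU\sTIME\s*(\S+)/)  { $record{'CPU TIME'} = $1 }
}

# Print the collected fields in key order.
print "$_: $record{$_}\n" for sort keys %record;
```

Because the key never changes, it does not matter that the value is different in every report.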
 
pmking

There is something (possibly a composite key) in your report that identifies each set of values uniquely. I'm guessing it's the PRIMAUTH: [red]DNCS[/red] PLANNAME: [red]PUOJLK[/red].
Code:
use strict;
use warnings;
use Data::Dumper;

my %locations; # top-level hash, filled in as the report is read
my ($location, $plan);

while (<>) {
   chomp;
   if (/PRIMAUTH:\s+(\w+)\s+PLANNAME:\s+(\w+)/) {
      ($location, $plan) = ($1, $2);
      $locations{$location} = {} unless (exists $locations{$location});
      $locations{$location}->{$plan} = {} unless (exists $locations{$location}->{$plan});
      next;
   }
   if (/DEADLOCKS\s+(\d+)\s+(\d+)/) {
      push @{$locations{$location}->{$plan}->{'deadlocks'}}, ($1, $2);
      next;
   }
}

print Dumper(\%locations); # pass a reference so the nesting is shown
Untested; in the example above, the %locations hash uses the LOCATION as a key. The lookup value of the key is a reference to another hash, which uses the plan name as the key. So this hash contains all the plan names for that location. Looking up the plan name in this hash returns a reference to yet another hash, which contains values for things like CPU times, deadlocks, etcetera for this specific plan name. If we look up the 'deadlocks' key in this hash, we get a reference to an array (as there are two numeric values for deadlocks). So the overall structure looks like
Code:
loc-hash{a} --> plan-hash{b} --> attrib-hash{c} --> values-array
So what you end up with in memory is a tree structure, with the %locations hash as the trunk, and the actual data values as the leaves. As you read through the report, you are building up the tree from the values you collect on the way.
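
To make the tree idea concrete, here is a small sketch that builds the structure from a few in-memory lines and then walks it from trunk to leaves. It assumes the PRIMAUTH/PLANNAME line always comes before the values that belong to it. The plan name PLANB and the DEADLOCKS numbers are hypothetical, and the real report layout may differ.

```perl
use strict;
use warnings;

# Hypothetical sample lines; PLANB and the numbers are invented.
my @lines = (
    'PRIMAUTH: DNCS PLANNAME: PUOJLK',
    'DEADLOCKS 0.00 0',
    'PRIMAUTH: DNCS PLANNAME: PLANB',
    'DEADLOCKS 1.50 3',
);

my %tree;
my ($primauth, $plan);

for my $line (@lines) {
    if ($line =~ /PRIMAUTH:\s+(\w+)\s+PLANNAME:\s+(\w+)/) {
        ($primauth, $plan) = ($1, $2);
        next;
    }
    # Ignore value lines seen before any PRIMAUTH/PLANNAME header.
    next unless defined $primauth;
    if ($line =~ /DEADLOCKS\s+(\d+\.\d+)\s+(\d+)/) {
        # Autovivification creates the inner hashes and the array as needed.
        push @{ $tree{$primauth}{$plan}{deadlocks} }, $1, $2;
    }
}

# Walk the tree: outer hash -> plan hash -> attribute hash -> value array.
my @report;
for my $auth (sort keys %tree) {
    for my $pl (sort keys %{ $tree{$auth} }) {
        for my $attr (sort keys %{ $tree{$auth}{$pl} }) {
            my @values = @{ $tree{$auth}{$pl}{$attr} };
            push @report, "$auth / $pl / $attr: @values";
        }
    }
}
print "$_\n" for @report;
```

Each level of the nested loops corresponds to one arrow in the diagram above.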

HTH

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Seem to have 'location' on the brain from your previous posts. Suggest you substitute $primauth for all instances of $location in the previous code...[blush]

Steve

 
Stevexff,

Thanks so much for your assistance.

Do you replace "all" $location with $primauth?

Thanks.
 
It *should* work as is. It just isn't very intuitive. Looking at the code above, a replace all of 'location' with 'primauth' ought to do the trick. If I get time tonight, I'll post a tested version...

Steve

 
Code:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my %primauths;
my ($primauth, $plan);

while (<>) {
   chomp;
   if (/PRIMAUTH:\s+(\w+)\s+PLANNAME:\s+(\w+)/) {
      ($primauth, $plan) = ($1, $2);
      next;
   }
   if (/DEADLOCKS\s+(\d\.\d+)\s+(\d+)/) {
      push @{$primauths{$primauth}->{$plan}->{'deadlocks'}}, ($1, $2);
      next;
   }
   if (/ELAPSED\sTIME\s+(\d+:\d+\.\d+)\s+(\d+:\d+\.\d+)/) {
      push @{$primauths{$primauth}->{$plan}->{'elapsed'}}, ($1, $2);
   }
   if (/\#OCCURRENCES\s+:\s+(\d+)/) {
      push @{$primauths{$primauth}->{$plan}->{'occurred'}}, $1;
   }
}

print Dumper(\%primauths); # pass a reference so the nesting is shown
This one seems to work. Just expand it a bit, for the other values you want.
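
One possible way to expand it, sketched under some assumptions: keep the field patterns in a list so that adding a value is one extra line. The names parse_lines and @patterns are hypothetical helpers, the CPU TIME pattern is borrowed from the earlier post in this thread, and the sample values are invented.

```perl
use strict;
use warnings;

# Field name and the pattern that captures its value(s).
# The cpu entry is an assumption about the report layout.
my @patterns = (
    [ deadlocks => qr/DEADLOCKS\s+(\d\.\d+)\s+(\d+)/ ],
    [ elapsed   => qr/ELAPSED\sTIME\s+(\d+:\d+\.\d+)\s+(\d+:\d+\.\d+)/ ],
    [ occurred  => qr/\#OCCURRENCES\s+:\s+(\d+)/ ],
    [ cpu       => qr/CPU\sTIME\s+(\S+)/ ],
);

# Hypothetical helper: builds the nested hash from a list of report lines.
sub parse_lines {
    my @lines = @_;
    my (%primauths, $primauth, $plan);
    for my $line (@lines) {
        if ($line =~ /PRIMAUTH:\s+(\w+)\s+PLANNAME:\s+(\w+)/) {
            ($primauth, $plan) = ($1, $2);
            next;
        }
        # Ignore value lines seen before any PRIMAUTH/PLANNAME header.
        next unless defined $primauth;
        for my $p (@patterns) {
            my ($name, $re) = @$p;
            # In list context the match returns the captured values.
            if (my @captures = $line =~ $re) {
                push @{ $primauths{$primauth}{$plan}{$name} }, @captures;
                last;
            }
        }
    }
    return \%primauths;
}

# Usage with invented sample values.
my $data = parse_lines(
    'PRIMAUTH: DNCS PLANNAME: PUOJLK',
    '#OCCURRENCES : 12',
    'ELAPSED TIME 1:02.345 0:30.100',
    'CPU TIME 0.123456',
);
print "occurrences: $data->{DNCS}{PUOJLK}{occurred}[0]\n";
```

To read a real file instead, the lines could come from the filehandle (for example `parse_lines(<>)`).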

Steve

 
I totally agree with you KevinADC, Stevexff gets one more star from me..

Thanks so much..

 