file parsing/pattern matching/one big headache...

Status
Not open for further replies.

digm

Technical User
Nov 14, 2004
US
I run a few game servers and I'm trying to write a script to parse a log file into something more readable. Unfortunately it's turning into one big headache...

The format of the log is...
Code:
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim

The only useful fields are the first and the last. They represent the player's GUID and player name.
So player jim logged on three times with the same GUID, and joe changed his name to joe2 while keeping the same GUID. What I'd like to end up with is output like this...

Code:
f17f39ddd593c46f9ce255eb20daac83 jim
a1beb7c4ecd8cba7e7dddade70ecf445 sam
12eca1f19798ab7587255e682016003f joe
12eca1f19798ab7587255e682016003f joe2

Basically, rearranging the log into something useful - eliminating all extra instances where the guid and handle match, and rearranging by guid.

Every time I make a breakthrough it seems to get more complicated. If someone could look over this code it would be greatly appreciated. The first few full lines of the log above can be cut/paste into a pbalias.dat file if you'd like to test.

Code:
#/usr/bin/perl -w
#
use strict;

open LOG, "pbalias.dat" || die;
my @data=();  #initial data array for name/handle from log
my @guidarray=(); #used for checking duplicate guids
my @namearray=(); #used for checking duplicate names
my @finalarray=(); #used for final array to be output to file
my $hcheck=&handlecheck;
my $counter;
my @line2=();

while (<LOG>) {
        my @line=split(/ /, $_); #take each line, split via whitespace
        chomp $line[4];
        my $hi=$line[0] . " " . $line[4] . "\n";  #take only guid and name, combine
        push (@data, $hi); #push that into first array
}

close LOG;

for ($counter=0; $counter < scalar(@data); $counter++) {
        my $lvar;
        my $hvar;
        @line2=split(/ /, @data[$counter]); #first split array element into guid/handle
        push (@guidarray, $line2[0]); #put the guid into one array
        push (@namearray, $line2[1]); #handle into another
        &guidcheck($line2[0]);  #send guid to subroutine to check if its already there
        my $gcheck=guidcheck($ARGV[0]);
        if ($gcheck) {  #if the guid was found
                &handlecheck($line2[1]);  #send the handle to check if that is there
                my $hcheck=handlecheck($ARGV[0]);
                if ($hcheck) {  #if handle is found
                        #do nothing
                }
                else {
                        push (@finalarray, @data[$counter]); #same guid,unique handle. throw it into final
                }
        }
        else
        {
                push (@finalarray, @data[$counter]); #otherwise throw it into final
        }
}

@finalarray=sort(@finalarray);
print @finalarray;

sub guidcheck {
        foreach (0..$#guidarray) {
                if ($_[0] eq $guidarray[$_]) {
		return (1);
                }
                else {
                return undef;
                }
        }
}

sub handlecheck {
        foreach (0..$#namearray) {
                if ($_[1] eq $namearray[$_]) {
                return (1);
                }
                else {
                return undef;
                }
        }
}
 
Why not store into a hash so you don't have to check to see if the GUID already exists?
 
That's how I initially tackled the problem, but ran into a wall...


If I used the GUID as the hash key, it could only have one value/player name. For the GUIDs that have more than one name associated with them, I could get around it by making the value a reference to an array containing all the names, but it was getting a bit too complicated for me, so I thought this would be a simpler approach.



And I realized I didn't say what my initial problem was. I think it may have something to do with the subroutines - as of now, it just pushes all the values into @finalarray whether they're in there or not.

And I'm not against rewriting the whole thing if using a hash would be easier - but if someone could point me in the direction of getting around the hash problem I had it'd be great.
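
On the symptom described above, a few things in the posted script would explain it, as far as can be told from reading it: each check sub returns during the first pass of its loop, so only the first array element is ever compared; the checks are called with $ARGV[0] (and handlecheck reads $_[1]) rather than the value just split out; and the GUID and name are pushed onto the lookup arrays before they are checked. A minimal sketch of an array-only version with those points changed follows. The names in_list and unique_pairs are made up for illustration, and the shortened GUIDs are placeholders, not real log data.

```perl
use strict;
use warnings;

# True if $wanted is already in the list. It only gives up after
# looking at every element, not after the first mismatch.
sub in_list {
    my ($wanted, @list) = @_;
    for my $item (@list) {
        return 1 if $item eq $wanted;
    }
    return 0;
}

# Keeps the first occurrence of each "guid name" string and returns
# the survivors sorted.
sub unique_pairs {
    my @pairs = @_;
    my @final;
    for my $pair (@pairs) {
        # Check before pushing; otherwise the pair always "exists".
        push @final, $pair unless in_list($pair, @final);
    }
    my @sorted = sort @final;
    return @sorted;
}

# Placeholder data in "guid name" form (GUIDs shortened for readability).
my @pairs = ("12eca1 joe2", "f17f39 jim", "f17f39 jim", "a1beb7 sam", "12eca1 joe");
print "$_\n" for unique_pairs(@pairs);
```

Scanning an array for every line gets slow on a large log, which is one reason the hash suggestions in this thread are usually the easier route.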
 
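Sticking with the hash idea, let the player names be the hash keys; then several player names can share the same GUID. This assumes no two GUIDs ever use the same name, since a repeated name would overwrite the earlier entry. E.g.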
Code:
#!perl
use strict;
use warnings;

my %data = map {chomp; my ($name, $guid) = (split)[-1, 0]; $name => $guid} <DATA>;
# Sort by GUID, then by name, so names sharing a GUID print in a fixed order.
for (sort {$data{$a} cmp $data{$b} or $a cmp $b} keys %data) {
    print "$data{$_} $_\n";
}

__DATA__
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim

Output
Code:
12eca1f19798ab7587255e682016003f joe
12eca1f19798ab7587255e682016003f joe2
a1beb7c4ecd8cba7e7dddade70ecf445 sam
f17f39ddd593c46f9ce255eb20daac83 jim



 
Actually, that could be even shorter:
Code:
my %data = map {chomp; (split)[-1, 0]} <DATA>;
for (sort {$data{$a} cmp $data{$b} or $a cmp $b} keys %data) {
    print "$data{$_} $_\n";
}

 
Or... lazier still. You could have both the GUID and the player's name form the key of the hash. The hash would just be a way of uniquifying the list.

$GuidAndPlayerHash{"$GUID,$player"} ++;

Mike
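
Spelled out a little more, the combined-key idea might look like the sketch below. It assumes the GUID is the first whitespace-separated field and the name is the last, that names contain no spaces, and that GUIDs are all the same length, so a plain string sort on the combined key orders by GUID first. The sub name unique_guid_names and the shortened sample GUIDs are invented for illustration. A space is used as the separator instead of the comma so the keys can be printed as they are.

```perl
use strict;
use warnings;

# Collapses log lines to unique "guid name" strings, sorted as plain strings.
sub unique_guid_names {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        my @fields = split ' ', $line;
        next unless @fields >= 5;      # skip blank or short lines
        my ($guid, $player) = @fields[0, -1];
        $seen{"$guid $player"}++;      # value counts how often the pair appeared
    }
    my @sorted = sort keys %seen;
    return @sorted;
}

# Placeholder lines in the same five-column layout as the log.
my @lines = (
    "12eca1 1 20040705 20040705 joe2",
    "f17f39 1 20040705 20040705 jim",
    "f17f39 1 20040705 20040705 jim",
    "12eca1 1 20040705 20040705 joe",
);
print "$_\n" for unique_guid_names(@lines);
```

Unlike the name-keyed hash, this keeps both entries if two different GUIDs ever turn up with the same name.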


 
Just an alternate:

Code:
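use strict;
use warnings;

open(my $log, '<', 'pbalias.dat') or die "Cannot open pbalias.dat: $!";
my %hash;   # $hash{GUID}{NAME} = 1 for every pair seen
while (<$log>) {
     chomp;
     my ($guid, $username) = (split(/\s+/, $_))[0,4];
     next unless defined $username;   # skip blank or short lines
     $hash{$guid}{$username} = 1;
}
close $log;

foreach my $GUID (sort keys(%hash)) {
     foreach my $USERNAME (sort keys (%{$hash{$GUID}})) {
          print STDOUT "$GUID\t$USERNAME\n";
     }
}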


Michael Libeson
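
For the array-reference layout the original poster mentioned, a rough sketch could look like this. names_by_guid is a made-up name, and the field positions are assumed to match the sample log (GUID first, name last). A second hash tracks which pairs have been seen, so each name is stored once per GUID, in the order it first appears in the log.

```perl
use strict;
use warnings;

# Builds { guid => [name, name, ...] } from log lines, with each name
# listed once per GUID in first-seen order.
sub names_by_guid {
    my @lines = @_;
    my (%names, %seen);
    for my $line (@lines) {
        my ($guid, $name) = (split ' ', $line)[0, -1];
        next unless defined $name;     # skip blank lines
        push @{ $names{$guid} }, $name unless $seen{$guid}{$name}++;
    }
    return \%names;
}

# Placeholder lines in the same five-column layout as the log.
my $names = names_by_guid(
    "12eca1 1 20040705 20040705 joe2",
    "f17f39 1 20040705 20040705 jim",
    "12eca1 1 20040705 20040705 joe",
);
for my $guid (sort keys %$names) {
    print "$guid $_\n" for @{ $names->{$guid} };
}
```

Keeping first-seen order, rather than sorting the names, makes it easier to see in what sequence a player cycled through handles.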
 
If you had to do it with an array ...
Code:
#!perl
use strict;
use warnings;

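# Pull out [guid, name] first, then sort on those two fields only, so that
# duplicate pairs end up next to each other whatever the middle columns hold.
my @data = sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] }
           map  { [(split)[0, 4]] } <DATA>;
my @final;
my $prev;
for my $row (@data) {
    # Keep a row only if its GUID or name differs from the previous row.
    if (!defined($prev) || $row->[0] ne $prev->[0] || $row->[1] ne $prev->[1]) {
        push @final, $row;
    }
    $prev = $row;
}
for (@final) {
    print join(" ", @$_), "\n";
}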
Same output as the hash version.

 