file parsing/pattern matching/one big headache...

digm · Nov 14, 2004

I run a few game servers and I'm trying to write a script to parse a log file into something more readable. Unfortunately it's turning into one big headache...

The format of the log is...

Code:

12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim

The only useful fields are the first and the last. They represent the player's GUID and player name.
So player jim logged on three times with the same GUID, and joe changed his name to joe2 while keeping the same GUID. What I'd like to end up with is output like this...

Code:

f17f39ddd593c46f9ce255eb20daac83 jim
a1beb7c4ecd8cba7e7dddade70ecf445 sam
12eca1f19798ab7587255e682016003f joe
12eca1f19798ab7587255e682016003f joe2

Basically, rearranging the log into something useful - eliminating all extra instances where the guid and handle match, and rearranging by guid.

Every time I make a breakthrough it seems to get more complicated. If someone could look over this code it would be greatly appreciated. The first few full lines of the log above can be cut/paste into a pbalias.dat file if you'd like to test.

Code:

#/usr/bin/perl -w
#
use strict;

open LOG, "pbalias.dat" || die;
my @data=();  #intial data array for name/handle from log
my @guidarray=(); #used for checking duplicate guids
my @namearray=(); #used for checking duplicate names
my @finalarray=(); #used for final array to be output to file
my $hcheck=&handlecheck;
my $counter;
my @line2=();

while (<LOG>) {
        my @line=split(/ /, $_); #take each line, split via whitespace
        chomp $line[4];
        my $hi=$line[0] . " " . $line[4] . "\n";  #take only guid and name, combine
        push (@data, $hi); #push that into first array
}

close LOG;

for ($counter=0; $counter < scalar(@data); $counter++) {
        my $lvar;
        my $hvar;
        @line2=split(/ /, @data[$counter]); #first split array element into guid/handle
        push (@guidarray, $line2[0]); #put the guid into one array
        push (@namearray, $line2[1]); #handle into another
        &guidcheck($line2[0]);  #send guid to subroutine to check if its already there
        my $gcheck=guidcheck($ARGV[0]);
        if ($gcheck) {  #if the guid was found
                &handlecheck($line2[1]);  #send the handle to check if that is there
                my $hcheck=handlecheck($ARGV[0]);
                if ($hcheck) {  #if handle is found
                        #do nothing
                }
                else {
                        push (@finalarray, @data[$counter]); #same guid,unique handle. throw it into final
                }
        }
        else
        {
                push (@finalarray, @data[$counter]); #otherwise throw it into final
        }
}

@finalarray=sort(@finalarray);
print @finalarray;

sub guidcheck {
        foreach (0..$#guidarray) {
                if ($_[0] eq $guidarray[$_]) {
		return (1);
                }
                else {
                return undef;
                }
        }
}

sub handlecheck {
        foreach (0..$#namearray) {
                if ($_[1] eq $namearray[$_]) {
                return (1);
                }
                else {
                return undef;
                }
        }
}

cgilover · Nov 14, 2004

Why not store into a hash so you don't have to check to see if the GUID already exists?

digm · Nov 14, 2004

That's how I initially tackled the problem, but ran into a wall...

If I used the GUID as the hash key, it could only have one value (one player name). For the GUIDs that have more than one name associated with them, I could get around it by making the value a reference to an array containing all the names, but it was getting a bit too complicated for me, so I thought this would be a simpler approach.

And I realized I didn't say what my initial problem was. I think it may have something to do with the subroutines - as of now, it just pushes all the values into @finalarray whether they're in there or not.
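For what it's worth, that symptom looks consistent with how the posted subroutines are written. Each one returns during the first pass of its foreach, so only element 0 is ever compared, and the result that gets tested comes from guidcheck($ARGV[0]) (a command-line argument that is never supplied) rather than from the GUID just read. handlecheck also reads $_[1] although it is passed a single argument. Separately, open LOG, "pbalias.dat" || die can never die, because || binds to the filename string; "or die" is the usual spelling. Below is a minimal array-only sketch of the check. The helper name seen_in is hypothetical, and the sample lines are inlined in place of pbalias.dat.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): true if $wanted is in @$list.
# The "not found" return comes after the loop, so every element is compared.
sub seen_in {
    my ($wanted, $list) = @_;
    for my $item (@$list) {
        return 1 if $item eq $wanted;
    }
    return 0;
}

# Sample lines inlined for illustration; a real script would read pbalias.dat.
my @log = (
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam',
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
);

my @final;
for my $line (@log) {
    my ($guid, $name) = (split ' ', $line)[0, 4];
    my $pair = "$guid $name";
    # Check before pushing; pushing first would make every pair look like a duplicate.
    push @final, $pair unless seen_in($pair, \@final);
}

@final = sort @final;
print "$_\n" for @final;
```

Comparing the combined "guid name" string means one list and one check are enough, instead of separate GUID and name arrays.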

And I'm not against rewriting the whole thing if using a hash would be easier - but if someone could point me in the direction of getting around the hash problem I had it'd be great.
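The array-reference idea mentioned above need not be much code, since Perl creates the inner array on first use (autovivification). Here is a rough sketch, assuming the same five-field layout and with the sample lines inlined; the name %names_for is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines inlined for illustration; a real script would read pbalias.dat.
my @log = (
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam',
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
);

my %names_for;    # GUID => reference to an array of names seen for that GUID
for my $line (@log) {
    my ($guid, $name) = (split ' ', $line)[0, 4];
    # Skip names already recorded for this GUID.
    next if grep { $_ eq $name } @{ $names_for{$guid} || [] };
    push @{ $names_for{$guid} }, $name;    # inner array is created on first push
}

for my $guid (sort keys %names_for) {
    print "$guid $_\n" for sort @{ $names_for{$guid} };
}
```

The grep rescans the names already stored for one GUID, which is fine for a handful of aliases per player.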

mikevh · Nov 14, 2004

Sticking with the hash idea: let the player names be the hash keys. Then you can have multiple player names with the same GUID, e.g.

Code:

#!perl
use strict;
use warnings;

my %data = map {chomp; my ($name, $guid) = (split)[-1, 0]; $name => $guid} <DATA>;
for (sort {$data{$a} cmp $data{$b}} keys %data) {
    print "$data{$_} $_\n";
}

__DATA__
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim
a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam
12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe
f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim

Output

Code:

12eca1f19798ab7587255e682016003f joe
12eca1f19798ab7587255e682016003f joe2
a1beb7c4ecd8cba7e7dddade70ecf445 sam
f17f39ddd593c46f9ce255eb20daac83 jim
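One thing to keep in mind with names as keys: a hash key holds only one value, so if the same name ever shows up under two different GUIDs, only the GUID read last is kept. The sample log has no such case, so the output above is unaffected. Here is a small illustration with made-up GUIDs (aaaa, bbbb). It also adds a name tiebreaker to the sort, since sorting by GUID alone leaves the relative order of names that share a GUID up to the hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up lines: the name "jim" appears under two different GUIDs.
my @log = (
    'aaaa 1 20040705 20040705 jim',
    'bbbb 1 20040705 20040705 jim',
    'bbbb 1 20040705 20040705 bob',
);

# Same construction as above: name => guid
my %data = map { (split)[-1, 0] } @log;

# "jim" now maps only to the GUID read last; the aaaa entry was overwritten.
for (sort { $data{$a} cmp $data{$b} || $a cmp $b } keys %data) {
    print "$data{$_} $_\n";
}
```

If shared names are possible in the real log, keying on the GUID and name together avoids the overwrite.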

mikevh · Nov 14, 2004

Actually, that could be even shorter:

Code:

my %data = map {chomp; (split)[-1, 0]} <DATA>;
for (sort {$data{$a} cmp $data{$b}} keys %data) {
    print "$data{$_} $_\n";
}
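To unpack why the shorter form works: (split)[-1, 0] is a list slice that picks the last and first fields of the line, in that order, and a hash built from a flat list treats its elements as alternating keys and values. Here is a tiny demonstration on one sample line (the variable names are only for illustration).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2';

# Split on whitespace, then slice out the last (-1) and first (0) fields.
my @pair = (split ' ', $line)[-1, 0];
print "@pair\n";

# Assigning the two-element list to a hash makes element 0 the key, element 1 the value.
my %data = @pair;
print "$data{joe2}\n";
```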

MikeLacey · Nov 15, 2004

Or... lazier still. You could have both the GUID and the player's name form the key of the hash. The hash would just be a way of uniquifying the list.

$GuidAndPlayerHash{"$GUID,$player"} ++;

Mike
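Fleshed out, the combined-key idea might look something like this. It is only a sketch: the sample lines are inlined, %seen is an illustrative name, and a space is assumed as the joiner (instead of the comma above) so that the sorted keys can be printed as they are. That relies on neither field containing a space, which holds for the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines inlined for illustration; a real script would read pbalias.dat.
my @log = (
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe2',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
    'a1beb7c4ecd8cba7e7dddade70ecf445 1 20040705 20040705 sam',
    '12eca1f19798ab7587255e682016003f 1 20040705 20040705 joe',
    'f17f39ddd593c46f9ce255eb20daac83 1 20040705 20040705 jim',
);

my %seen;
for my $line (@log) {
    my ($guid, $player) = (split ' ', $line)[0, 4];
    $seen{"$guid $player"}++;    # the key does the de-duplication; the value counts logins
}

# Sorting the keys groups the lines by GUID, because the GUID comes first in each key.
print "$_\n" for sort keys %seen;
```

As a side effect, the hash values record how many times each GUID/name pair appeared in the log.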


mlibeson · Nov 15, 2004

Just an alternate:

Code:

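use strict;
use warnings;

open(my $log, '<', 'pbalias.dat') or die "Cannot open pbalias.dat: $!";

my %hash;   # GUID => { name => 1 }; repeated GUID/name pairs collapse into one entry
while (<$log>) {
     next unless /\S/;                     # skip blank lines
     my ($guid, $name) = (split)[0, 4];    # first and last of the five fields
     $hash{$guid}{$name} = 1;
}
close($log);

foreach my $guid (sort keys %hash) {
     foreach my $name (sort keys %{ $hash{$guid} }) {
          print "$guid\t$name\n";
     }
}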

Michael Libeson

mikevh · Nov 15, 2004

If you had to do it with an array ...

Code:

#!perl
use strict;
use warnings;

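# Reduce each line to "guid name" before sorting, so that identical pairs
# end up next to each other even when the fields in between differ.
my @data = sort map { join ' ', (split)[0, 4] } <DATA>;
my @final;
my $prev;
for my $pair (@data) {
    # Keep a pair only if it differs from the one just before it.
    push @final, $pair if !defined($prev) || $pair ne $prev;
    $prev = $pair;
}
print "$_\n" for @final;

# The __DATA__ section is the same as in the hash version.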

Same output as the hash version.
