stepping through arrays

edpatterson · May 19, 2005

I am processing screen reports that I have redirected to disk (command >diskfile). The reports are laid out below. I use a while() loop with a nested until() loop to do the processing. Basically while(<>){until (/^-/){...}}. This part is working just fine.

I would like to skip the redirected part and do all of the processing in RAM.

I replaced the command >deskfile with @array = `command`. As expected the @array is populated with one line per element. The problem I am having as stepping through the array. I have tried using nested labeled foreach loops with no joy at all.
[tt]
Output of the program is
"aaa","some_a_data","some_b_data","some_c_data","some_d_data"
"bbb","some_a_data","some_b_data","","some_d_data"
"ccc","some_a_data","some_b_data","some_c_data","some_d_data","some_e_data"
[/tt]
Ideas?
Ed

data set
[tt]
----------------------------
record aaa
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
----------------------------
record bbb
item a: some_a_data
item b: some_b_data
item d: some_d_data
----------------------------
record ccc
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
item e: some_e_data
----------------------------
[/tt]

dmazzini · May 19, 2005

Hi mate

You could save the output of the command in a variable like:

$result = `dir`;
print $result;

But you have to split the variable $result by line.

Example:

$output = qq(

----------------------------
record aaa
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
----------------------------
record bbb
item a: some_a_data
item b: some_b_data
item d: some_d_data
----------------------------
record ccc
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
item e: some_e_data
---------------------------- );

my @lines = split "\n",$output;
foreach (@lines) {
$i++;
print "Line $i. $_\n";
# Do your parser here. Reg expression, etc.
}

Cheers

dmazzini
GSM System and Telecomm Consultant

AHinMaine · May 19, 2005

You probably ought to have a look at File::Slurp. It's pretty simple and would probably fit the bill nicely.

--
Andy

http://www.nachoz.com

"Historically speaking, the presence of wheels in Unix has never precluded their reinvention."
Larry Wall

edpatterson · May 19, 2005

Thanks! Your post changed the way I was attacking the problem. I now have the following code which works. Problems occur when I add the -w switch. It does not like the unititalized values which occur if there is a blank element.
I thought that initializing the array with the empty string would get around it....

I am putting it in an array to make the conversion to a CSV easier.

[tt]
#!/usr/bin/perl
use strict;
my @array = "";
my $output = qq(
----------------------------
record aaa
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
----------------------------
record bbb
item a: some_a_data
item b: some_b_data
item d: some_d_data
----------------------------
record ccc
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
item e: some_e_data
----------------------------);

my @lines = split "\n",$output;
foreach (@lines) {
if(/^record\s+(.*)/){ $array[0] = $1;}
if(/\s+item a: (.*)/){ $array[1] = $1;}
if(/\s+item b: (.*)/){ $array[2] = $1;}
if(/\s+item c: (.*)/){ $array[3] = $1;}
if(/\s+item d: (.*)/){ $array[4] = $1;}
if(/\s+item e: (.*)/){ $array[5] = $1;}
if(/^-/){
print "@array\n";
@array = "";
}
}
[/tt]

edpatterson · May 19, 2005

naChoZ: Thanks for the headache. I am sure my answer is somewhere...

Ed

dmazzini · May 19, 2005

What about, skip blank lines and -------------

Here we go:

$output = qq(

----------------------------
record aaa
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
----------------------------
record bbb
item a: some_a_data
item b: some_b_data
item d: some_d_data
----------------------------
record ccc
item a: some_a_data
item b: some_b_data
item c: some_c_data
item d: some_d_data
item e: some_e_data

---------------------------- );

#print @array;

my @lines = split "\n",$output;
foreach (@lines) {
$i++;
next if $_ =~ /^\s*$/; # skip blank lines
next if $_ =~ /^\----/;
print "Line $i. $_\n";
# Do your parser here. Reg expression, etc.
}

dmazzini
GSM System and Telecomm Consultant

dmazzini · May 19, 2005

You can print the results every time you get "record"...thus you will skip the line"-------"

dmazzini
GSM System and Telecomm Consultant

edpatterson · May 19, 2005

I had added a if(/^\n/){next;} and the warning remains.

I believe it because of an uninitialized element in the array.
Not all elements of the array will be populated every time.

It is the dreaded:
'use of uninitialized value in join or string' which is pointing back to my print"@array\n"; line.

I guess I could leave warning off, but...

Ed

AHinMaine · May 19, 2005

Another thing you could do...

Instead of

Code:

command > diskfile

you could do

Code:

command | yourscript.pl

Then this should work:

Code:

#!/usr/local/bin/perl -lnw

for ( <> ) {
    chomp;
    next if m/^$/;
    next if m/^\-+/;         
    print;

    #
    # your code
    #

    ...

}

--
Andy

http://www.nachoz.com

"Historically speaking, the presence of wheels in Unix has never precluded their reinvention."
Larry Wall

KevinADC · May 19, 2005

a slightly different approach using an array of arrays:

Code:

#!perl
use strict;
use warnings;

my @lines = <DATA>; #read data into array
chomp @lines; #remove newlines from array
my @AofA; #array of arrays to store data
my $i = -1; #binary flag for indexing
foreach (@lines) {
   next if (/^\s*$/); #skip blank lines
   next if (/^-+$/); #skip --- lines
   if (index($_,'record')>-1) { #start a new record
      $i++;#increment index count
      /record\s+(\w+)/; #get the record label
      $AofA[$i]=[$1]; #make an annonymous array
      next;    
   }
   tr/ //d; #remove all spaces in line
   s/item\w://; #remove "item X:"
   push @{$AofA[$i]},$_; push into annoymous array
}
#print out to data
print "@{$_}\n" for (@AofA);
	
__DATA__
----------------------------
record    aaa

    item a: some_a_data
    item b: some_b_data
    item c: some_c_data
    item d: some_d_data
----------------------------

record bbb
    item a: some_a_data
    item b: some_b_data
    item d: some_d_data
----------------------------

record    ccc
    item a: some_a_data
    item b: some_b_data
    item c: some_c_data
    item d: some_d_data
    item e: some_e_data   
   
----------------------------

edpatterson · May 20, 2005

re: a slightly different approach using an array of arrays:

This approach ignores any missing elements.

On the other hand, I am increasing my Perl knowledge by leaps and bounds. It never ceases to amaze me how many different and sometimes better solutions there are to a problem.

I am using an array for temporary storage to make tracking empty values easier. In production I am stepping through the array and printing it out as a CSV. I receive the 'uninitialized' warning every time there is an empty element.

The need for the csv output will go away once I get a better grasp of Spreadsheet::WriteExcel. My target audience needs to be able to sort the data by the different fields.

Ed

mikevh · May 20, 2005

edpatterson said:
re: a slightly different approach using an array of arrays:

This approach ignores any missing elements.

I thought that might become an issue.
Here's a hash-of-hashes approach:

Code:

#!perl
use strict;
use warnings;

my $outerkey;
my @innerkeys;
my %HoH;

while (<DATA>) {
    chomp;
    /^\s*$/ && next;
    /^-/ && next;
    if (/^record\s+(.+)/) {
        $outerkey = $1;
    } elsif (/^\s+item\s+(.+):\s+(.*)/) {
        # add $1 to @innerkeys if not already seen
        unless (grep {$_ eq $1} @innerkeys) {
            push @innerkeys, $1;
        }
        $HoH{$outerkey}->{$1} = $2;
    } else {
        warn qq($.: "$_" unexpected.\n);
    }
}

@innerkeys = sort @innerkeys;
print join(",", @innerkeys), "\n";
for my $k (sort keys %HoH) {
    # set $HoH{$k}->'subkey' to "" if not defined
    @{$HoH{$k}}{@innerkeys} = 
        map {defined($_)? $_: qq("")} 
        @{$HoH{$k}}{@innerkeys};

    print join(",", $k, @{$HoH{$k}}{@innerkeys}), "\n";
}

__DATA__
----------------------------
record aaa
    item a: some_a_data
    item b: some_b_data
    item c: some_c_data
    item d: some_d_data
----------------------------
record bbb
    item a: some_a_data
    item b: some_b_data
    item d: some_d_data
----------------------------
record ccc
    item a: some_a_data
    item b: some_b_data
    item c: some_c_data
    item d: some_d_data
    item e: some_e_data
----------------------------

Output:

Code:

a,b,c,d,e
aaa,some_a_data,some_b_data,some_c_data,some_d_data,""
bbb,some_a_data,some_b_data,"",some_d_data,""
ccc,some_a_data,some_b_data,some_c_data,some_d_data,some_e_data

KevinADC · May 20, 2005

edpatterson said:
This approach ignores any missing elements.

Yes, it does. I missed your earlier remark:

edpatterson said:
I am putting it in an array to make the conversion to a CSV easier.

That should have tipped me off that missing elements would have to be taken into account. mikevh has posted a suggestion so give that a try, looks like it will work fine.

mikevh · May 20, 2005

Improvements/corrections (see bolded):

Code:

@innerkeys = sort @innerkeys;
[b]print join(",", 'KEY', @innerkeys), "\n";[/b]
for my $k (sort keys %HoH) {
    # set $HoH{$k}->'innerkey' to "" if not defined
    @{$HoH{$k}}{@innerkeys} = 
        map {defined($_)? $_: qq("")} 
        @{$HoH{$k}}{@innerkeys};

    [b]print join(",", 
        map {quotecol($_)} 
        $k, @{$HoH{$k}}{@innerkeys}), "\n";
}

sub quotecol {
    # Return argument quoted.
    local $_ = shift;
    /^".*"$/? $_: qq("$_");
}[/b]

Rest as before.

Output

Code:

KEY,a,b,c,d,e
"aaa","some_a_data","some_b_data","some_c_data","some_d_data",""
"bbb","some_a_data","some_b_data","","some_d_data",""
"ccc","some_a_data","some_b_data","some_c_data","some_d_data","some_e_data"

rharsh · May 21, 2005

Here's another shot at it:

Code:

my (@fields, %data, $cur_record, @temp_data);

while (<DATA>) {
    next if (/^[\s-]+$/);
    if (/\s*record\s*(\w+)/) {
        $cur_record = $1;
    } else {
        /^\s*(item \w+):\s*(.+)$/;
        $data{$cur_record}{$1} = $2;
    }
}

# Build List of Items
foreach (keys %data) {
    if (@fields < (keys %{$data{$_}})) { @fields = sort keys %{$data{$_}}; }
}
print join(",", "Record", @fields), "\n"; # Print Column Headers

foreach my $record (sort keys %data) {
    my @temp = map {defined($data{$record}{$_}) ? $data{$record}{$_} : '""'} @fields;
    print join(",", quoteList($record,@temp)), "\n";
}

sub quoteList {
    my @temp = @_;
    @temp = map {/^".*"$/? $_: qq("$_")} @temp;
    return wantarray ? @temp : $temp[0];
}

rharsh · May 21, 2005

A little bit of a modification to one of the regexs:

Change:

Code:

[red]/^\s*(item \w+):\s*(.+)$/;[/red]

To:

Code:

[red]/^\s*(item \w+):\s*(.+)[blue][b]\b\s*[/b][/blue]$/;[/red]

Basically, it's the same except it won't grab any extra spaces at the end of a line.

Oh, and thanks to mikevh for some of the code for the quoteList sub.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

stepping through arrays

edpatterson

IS-IT--Management

dmazzini

Programmer

AHinMaine

ISP

edpatterson

IS-IT--Management

edpatterson

IS-IT--Management

dmazzini

Programmer

dmazzini

Programmer

edpatterson

IS-IT--Management

AHinMaine

ISP

KevinADC

Technical User

edpatterson

IS-IT--Management

mikevh

Programmer

KevinADC

Technical User

mikevh

Programmer

rharsh

Technical User

rharsh

Technical User

Similar threads

Part and Inventory Search

Sponsor