How to avoid a foreach loop

user2base · Oct 16, 2002

I want to parse a file and push a line into an array only if it begins with a key that is part of a hash. All other lines will have to go in a different array.

I tried to use a foreach loop:

while ($line = <FILE>) {
    foreach $key (keys %hash) {
        if ($line =~ /^$key/) {
            push(@array1, $line);
        }
        else {
            push(@array2, $line);
        }
    }
}

The problem is that non matching lines will be pushed n times (n being the number of keys in my hash) in @array2 rather than 1 time.
Is there a way around this ?
Basically I would need to include the foreach within the if.

Any idea ??

U.

justice41 · Oct 16, 2002

Could you provide a broader explanation of what you are doing? To me, a foreach loop inside a while loop is a red flag. You are defeating the advantages of a hash by looping through its keys for each line of the file. What does the file you are looping through contain? Just a single key, or a key with further values? What do you want to do with the data once it's sorted into arrays?
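For instance, if every key is a single word at the start of a line, a direct hash lookup replaces the inner loop. This is only a minimal sketch under that assumption; the keys and input lines below are made up, and unlike /^$key/ it requires the whole first word to equal a key:

```perl
use strict;
use warnings;

# Hypothetical keys and input lines, made up for illustration.
my %hash  = ( k1 => 1, k2 => 1 );
my @lines = ( "blabla\n", "k1 info about k1\n", "k2 info about k2\n", "other\n" );

my ( @array1, @array2 );
for my $line (@lines) {
    # Take the first word of the line and look it up directly in the hash.
    my ($word) = $line =~ /^(\w+)/;
    if ( defined $word && exists $hash{$word} ) {
        push @array1, $line;    # line starts with a known key
    }
    else {
        push @array2, $line;    # everything else, pushed exactly once
    }
}

print "matched: ",   scalar(@array1), "\n";
print "unmatched: ", scalar(@array2), "\n";
```

Each line costs one hash lookup however many keys there are, and a non-matching line can only land in @array2 once.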

Based on what you have posted so far I would suggest a different data structure: a hash of arrays. This is a hash where each key points to an array of values rather than a single scalar value.

A good reference for these more complex data structures is in the Data Structures Cookbook in the perldocs

http://www.perldoc.com/perl5.8.0/pod/perldsc.html

This would also solve the issue you described in your other post.

jaa

user2base · Oct 16, 2002

jaa,
Thanks for your help, but I am not sure a hash of arrays would work in my case.
I agree with you that the foreach loop is not good inside a while loop.

What I want to do is define a hash that contains a kind of dictionary of keywords. Each keyword will be associated with a "value", which will be used to name an array containing all the lines in a file that begin with that key.
So, I want to parse a file and, for each line that starts with a keyword, push this line into the corresponding array (which gives me the other problem of naming this array). I also want to be able to update that hash (which could contain about 50 keys), so I have to create it dynamically from a flat file that I will update depending on my needs (this part is easy to do). I can't write if ($line =~ /^key1/ or /^key2/ or ... or /^key50/) and update it each time I add a new key: it would not be scalable!

Let me give you a example:

I've got a file like this:
blabla
k1 info about k1
blabla
k2 info about k2
blabla
k3 info about k3
blabla

I will use a text file to define my hash
k1->K1
k2->K2
k3->K1

I want to parse the file so that at the end I have
@K1:
k1 info about k1
k3 info about k3
@K2:
k2 info about k2

And then create a file from each array.
Note that the name of the array will be used to create a file and it is different from the name of the key as the same array can match several keys (K1 matches lines starting with k1 and k3 in my example)
Note also that I prefer to push the info in a array and then create a file at the end rather than opening and closing the file each time I have a match.
That leads me to my other problem: how to create an array named Ki? In fact, depending on my key, the name of the array should be the corresponding value...
For instance, it's easy to define an array named @red, but I don't know how to name an array with the content of $hash{key}!

Sorry for this long explanation, but any help would be much appreciated.

justice41 · Oct 16, 2002

Well, this doesn't result in exactly what you specified but it should give you a jump start. For your specification of
@K1:
k1 info about k1
k3 info about k3
@K2:
k2 info about k2

I wasn't sure how the k3 line ended up in the @K1 array. So this just puts each line into an array whose name is the value of the key on that line. You can tweak it to get the desired output. This does show an example of the dynamic array naming.

Code:

my %hash = ( k1 => 'K1',
             k2 => 'K2',
             k3 => 'K3',
            );

while ( defined( $line = <DATA> ) ) {
    ## Grab the keyword if it exists
    next unless $line =~ /^(\w+)/;
    if ( exists $hash{ $1 } ) {

        ## push onto array with name of corresponding hash value
        push @{ $hash{ $1 } }, $line;  ## WARNING, SOFT REFERENCE!

    }
}

foreach $aref (values %hash) {

    print "\@$aref:\n";
    print "$_" foreach ( @$aref );
}

Given the input

Code:

blabla
k1 info about k1
blabla

k2 info about k2
blabla
k3 info about k3
blabla
k2 more info about k2
blabla

This outputs

Code:

@K1:
k1 info about k1
@K2:
k2 info about k2
k2 more info about k2
@K3:
k3 info about k3

jaa

justice41 · Oct 16, 2002

Oh, by the way, this is structurally close to a hash of arrays. You have a hash whose values act as references to arrays. The difference is that here each value is the name of a named array, used as a soft (symbolic) reference, which only works when 'use strict' is off, whereas in a true hash of arrays the values are hard references to anonymous arrays.
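If you would rather avoid the soft reference, the anonymous-array version could look roughly like this. It is a sketch, not a drop-in replacement: the names %group_for and %lines_for are mine, the input is inlined instead of read from a file, and it still assumes single-word keys. It uses the k1/k3 => K1 mapping from the example above:

```perl
use strict;
use warnings;

# Keyword => group name (k1 and k3 both feed K1).
my %group_for = ( k1 => 'K1', k2 => 'K2', k3 => 'K1' );

# Group name => reference to an anonymous array of lines (the hash of arrays).
my %lines_for;

my @input = (
    "blabla\n",
    "k1 info about k1\n",
    "blabla\n",
    "k2 info about k2\n",
    "k3 info about k3\n",
);

for my $line (@input) {
    next unless $line =~ /^(\w+)/;          # grab the first word
    next unless exists $group_for{$1};      # skip lines without a known key
    # Autovivification creates the anonymous array on the first push.
    push @{ $lines_for{ $group_for{$1} } }, $line;
}

for my $name ( sort keys %lines_for ) {
    print "\@$name:\n";
    print for @{ $lines_for{$name} };
}
```

Because the group name is only ever a hash key, it runs under 'use strict', and the name is still available later for building a file name.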

jaa

user2base · Oct 17, 2002

Many thanks jaa !
Your message helped me a lot. I am new to PERL and you gave me some hints about very useful syntax !!

Now, in fact my problem is a little bit more difficult as some keys in my hash will be composed of several words. And your program doesn't work if k1 is actually a string composed of several words.

Actually I want to create a program that parses a Cisco configuration. I don't know if you are familiar with Cisco, but a key could be something like "router bgp" (k1 would be "router bgp" in the example). It's even more difficult as another key could be "router ospf", which starts with the same word "router".
On top of that I want to include in my array all the following lines that start with a space.
For instance:

blablabla
router bgp 64512
 network blablabla
 no auto-summary
blablabla

will lead to the creation of an array called BGP containing:
router bgp 64512
 network blablabla
 no auto-summary

And as I said before I want to dynamically create my hash each time I run the script as the associations between keys and array name will be updated on the fly in a text file that will look like:
router bgp / BGP
router ospf / OSPF
snmp / SNMP

etc...

I don't want to impose, so just answer if you wish. I will carry on with my investigation and my Perl learning.
Thanks very much again for your very useful answer.

U.

pellman · Oct 17, 2002

I think this will do exactly what you want:
##########
# BEGIN
##########
# Build one alternation from the keys so each line needs a single
# match (instead of having to run multiple regexes).
# quotemeta() escapes any regex metacharacters in a key.
my $regex = join('|', map { quotemeta($_) } keys %hash);

my (%hash1, %hash2, @array1, @array2);

while (my $line = <FILE>) {
    if ($line =~ /^(?:$regex)/) {
        $hash1{$line} = 1;
    }
    else {
        $hash2{$line} = 1;
    }
}

@array1 = keys %hash1;
@array2 = keys %hash2;

##########
# END
##########

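This will fix your duplicate problem. Using hash keys to collect a unique list is a great trick to know! Just note that the keys come back in no particular order, and identical lines in the file are stored only once.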
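If the file order or repeated lines matter, the same alternation can feed the two arrays directly. A minimal sketch with made-up keys and lines; sorting the keys longest first is my own precaution so that a longer key wins over a shorter key that is its prefix:

```perl
use strict;
use warnings;

# Hypothetical keys, including multi-word ones that share a first word.
my %hash = ( 'router bgp' => 'BGP', 'router ospf' => 'OSPF', 'snmp' => 'SNMP' );

# quotemeta protects keys containing regex metacharacters.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %hash;
my $regex = qr/^(?:$alternation)/;

my @lines = (
    "blabla\n",
    "router bgp 64512\n",
    "blabla\n",
    "snmp 38471\n",
);

my ( @array1, @array2 );
for my $line (@lines) {
    if ( $line =~ $regex ) { push @array1, $line }    # starts with a key
    else                   { push @array2, $line }    # pushed exactly once
}

print "array1 has ", scalar(@array1), " lines\n";
print "array2 has ", scalar(@array2), " lines\n";
```

Here both "blabla" lines stay in @array2, in the order they appeared in the input.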

-Aaron Simmons
asimmons@mitre.org

justice41 · Oct 17, 2002

Here's a slight modification of the code in my previous post. I made the assumption that the key word will be a single word (snmp), or a single word preceded by the word router. Distinguishing a valid router line from a blablabla line (or grabbing the exact key word) becomes more difficult if the format is less rigid than this. You might have to resort back to a loop through all the key words (or an 'or'ed regex as Aaron suggested).

That said, here is the modified code (I left out the foreach loop that prints because it doesn't need to change).

Code:

my $routers = 'routers.dat';
open( RTR, $routers ) or die "Can't open $routers; Reason: $!";  ## parens needed: 'or' binds looser than open's argument list
while (<RTR>) {

    chomp;
    my ($key, $value) = split m!\s*/\s*!;
    $hash{ $key } = $value;
}
close (RTR);

my $key;
while ( defined( $line = <DATA> ) ) {

    push @{ $hash{ $key } }, $line and next if $line =~ /^ +/;
    next unless $line =~ /^((?:router )?\w+)/; ## Grab the keyword if it exists
    if ( exists $hash{ $1 } ) {

        $key = $1; ## Remember key word for next iteration.
        push @{ $hash{ $key } }, $line;

    }
}

Now given the following input

Code:

blablabla
router bgp 64512
 network blablabla
 no auto-summary
blablablao
router ospf 24753
 network blarf
 auto-summary
blablabla

foofoo
snmp 38471
 gateway ithehda
 cracker port
nonesense

the output is

Code:

@BGP:
router bgp 64512
 network blablabla
 no auto-summary
@SNMP:
snmp 38471
 gateway ithehda
 cracker port
@OSPF:
router ospf 24753
 network blarf
 auto-summary

jaa

user2base · Oct 18, 2002

Thanks for your help.
Aaron: you just gave me some good tricks !

Jaa:
Your script is very close to working, but not quite, as "router" is not the only keyword that could be encountered within a key composed of several words. Moreover, in a Cisco configuration there could be something like this:

identified key1
i want this in my array1
i want this in my array1
blablabla
unidentified key
i want this under the misc_array
i want this under the misc_array
identified key2
i want this in my array2
blablabla

your script leads to the creation of:

@key1:
identified key1
i want this in my array1
i want this in my array1
i want this under the misc_array
i want this under the misc_array

Rather than:

@key1
identified key1
i want this in my array1
i want this in my array1

@misc
unidentified key
i want this under the misc_array
i want this under the misc_array

Of course it's not your fault because I didn't tell you about all my requirements (including the misc array where I want everything else to go) in order to simplify my question !
I can probably use your script as a very good basis to my script but I don't understand enough the following matching:

/^((?:router )?\w+)/

Again, your help has been much appreciated !!

justice41 · Oct 18, 2002

The regex

Code:

/^((?:router )?\w+)/

says: match any 'word' (the \w+) and capture it into $1 (this is what the outer parentheses do), unless the 'word' is preceded by 'router ', in which case capture both 'router' and the 'word' into $1. The '?' following the ')' says to match 0 or 1 of the atom that precedes it. The (?: ..) groups the word 'router' into an atom without capturing 'router' by itself into a numbered variable (i.e. $2). I need the parentheses to group 'router' together because router? would match

Code:

route keyword
# or
router keyword
# because 'r' is now the atom that precedes the '?'
# but not
keyword
# nor
snmp
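A quick way to see what the pattern captures is to run it over a few sample lines. A minimal sketch (the sample lines and the @captured array are made up for illustration):

```perl
use strict;
use warnings;

# Sample lines made up for illustration.
my @samples = ( 'router bgp 64512', 'snmp 38471', 'blablabla', ' network blarf' );

my @captured;
for my $line (@samples) {
    if ( $line =~ /^((?:router )?\w+)/ ) {
        push @captured, $1;        # optional 'router ' plus the next word
        print "'$line' => '$1'\n";
    }
    else {
        push @captured, undef;     # leading space: \w+ cannot match right after ^
        print "'$line' => no match\n";
    }
}
```

The first three lines capture 'router bgp', 'snmp' and 'blablabla'; the indented line does not match at all, which is why it is handled by a separate test for a leading space.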
Here's my last set of modifications.

Code:

$hash{'MISC'} = 'MISC';
while ( defined( $line = <DATA> ) ) {

    push @{ $hash{ $key } }, $line and next if $line =~ /^ +/;
    next unless $line =~ /^((?:router )?\w+)/; ## Grab a keyword if it exists
    if ( exists $hash{ $1 } ) {

        $key = $1;
        push @{ $hash{ $key } }, $line;

    } else {

        $key = 'MISC';
        push @{ $hash{ $key } }, $line;
    }
}
This puts everything else, including the blabla's, into the @MISC array. Like I said before, you need to figure out a way to distinguish 'blah blah' from an 'identified key' or 'unidentified key'. This will involve finding a pattern to replace the regex that will only match keys (identified or unidentified). If that isn't possible, then you will have to resort to looping or 'or'ing. I will leave it up to you (or someone else) to sort out how.
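For what it's worth, one possible shape for the 'or'ing route is sketched below. It rests on several assumptions of mine: the mapping uses the "key / NAME" format from the text file described earlier, every non-indented line starts a new section, blank lines are skipped, and the names %name_for, %lines_for and $current are made up. The mapping and config lines are inlined rather than read from files:

```perl
use strict;
use warnings;

# Mapping lines in the "key / NAME" format.
my @mapping = ( "router bgp / BGP\n", "router ospf / OSPF\n", "snmp / SNMP\n" );

my %name_for;
for (@mapping) {
    chomp( my $entry = $_ );
    my ( $key, $name ) = split m{\s*/\s*}, $entry, 2;
    $name_for{$key} = $name;
}

# One alternation built from the keys, longest first, so no per-line loop.
my $keys_re = join '|',
    map { quotemeta($_) } sort { length($b) <=> length($a) } keys %name_for;

my @config = (
    "router bgp 64512\n",
    " network blablabla\n",
    "unidentified key\n",
    " goes to misc\n",
    "snmp 38471\n",
);

my %lines_for;          # array name => anonymous array of lines
my $current = 'MISC';   # where indented continuation lines go
for my $line (@config) {
    next unless $line =~ /\S/;    # skip blank lines
    if ( $line !~ /^ / ) {
        # A non-indented line starts a new section: known key or MISC.
        $current = $line =~ /^($keys_re)(?!\w)/ ? $name_for{$1} : 'MISC';
    }
    push @{ $lines_for{$current} }, $line;
}

for my $name ( sort keys %lines_for ) {
    print "\@$name:\n", @{ $lines_for{$name} };
}
```

New keys only need a new line in the mapping; the regex is rebuilt from the hash on every run, and indented lines under an unidentified key follow it into MISC instead of sticking to the previous key.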

jaa
