loop for regex

eatr · Feb 8, 2007

I'm trying to parse a file that has similar data in different sections; thus I need to set limiting conditions or there will be overmatching.

Problem is I can't figure out the exact syntax.

I've read a file into an array and am looping through each line of the array. I want to use a WHILE or UNTIL to limit my search; in this case for single (or double) digit numbers between two word characters (one of which I know).

so,

foreach my $line (@array) {
while (1st word .. 2nd word) { # need exact syntax here
if ($line =~ /expression here/) {

......

thanks

KevinADC · Feb 8, 2007

post some sample data and explain what you are trying to match and when you don't want to match.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]

eatr · Feb 8, 2007

Kevin

I appreciate your help. All I really need at this point is the precise syntax for setting a range (starting pattern to end pattern) from which I can then test and extract data and stop; be it a WHILE, FOR or IF loop.

This is a file that will, for the most part, be consistent. I need to break it down into sections, however.

thanks

KevinADC · Feb 8, 2007

it'd probably be easier to do this while reading the file:

Code:

open(FH, 'file.txt') or die "$!";
OUTER: while (<FH>) {
   if (/foo/) {                    # start of a section
      INNER: while (<FH>) {
         # do something with the current line ($_) here
         next OUTER if (/bar/);    # end of the section
      }
   }
}
close(FH);
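
For the original question (a "1st word .. 2nd word" style range), Perl's range operator in scalar context can express the same start/stop logic in a single loop. This is only a sketch and not code from the thread: /foo/ and /bar/ are the placeholder patterns used above, and the sample text and the @section array are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample input; the real file would be opened instead.
my $text = "header\nfoo\n12\n7\nbar\ntrailer\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

my @section;
while (my $line = <$fh>) {
    chomp $line;
    # In scalar context, A .. B turns true on the line where A matches
    # and stays true through the line where B matches (inclusive).
    if ($line =~ /foo/ .. $line =~ /bar/) {
        next if $line =~ /foo/ or $line =~ /bar/;   # skip the marker lines
        push @section, $line;
    }
}
close($fh);

print "$_\n" for @section;   # the lines strictly between the two markers
```

Whether this fits depends on the data: it assumes the start and end markers sit on lines of their own.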


eatr · Feb 8, 2007

I should've taken your advice earlier.

Here's an abbreviated sample of the data I want to match. In this part of the file, it's multiple lines. I don't know beforehand how many there are in each set until I parse the first set. The format of these sets is consistent, however. I've indicated spacing, and in each case there's a newline at the end of each line.

2
1A #one set (of 3 here) - nbrs and/or nbrs-words
3 1-2 digits
A (B,C)
G,B (D,E) #another set --- names with parens
E,F,G (T,R)
114
99 #another set --- 2 or 3 digit nbrs
122
2
1 #another set --- 1 or 2 digit nbrs
3
3
1 #another set --- 1 or 2 digit nbrs
2
2 2.5
3 .125 #another set- 2 sets of nbrs -1st digit
1 1 2nd - digit or float
1 2
3 .25 #same as above
2
etc.

I have the last word on the line preceding the 1st set of numbers ("Comments \n") and I know that the set immediately after the last line of this set consists of names (words), of the form shown above, only - no numbers. If I can set these as before and after points in which to parse the initial set of data (numbers), I can then count the number of lines and loop through each succeeding set.

KevinADC · Feb 8, 2007

LOL.... I don't understand a word of your explanation. Well, the words I do understand, but not the explanation. It's like reading double-talk:

Insert your elbow into your ear. Turn repentently until your stomach is full and your toes talk in tongues. You might see a bad odor or hear blue colors while attempting this so walk slowly while in a seated position.

What the hell-o are you trying to match! [bugeyed]


eatr · Feb 8, 2007

I need to pull all that data.

Trying to figure out how to correctly use /s to span newlines. Once I do, I'll be able to pull the first set of data and I'll be on my way.

Help me out, Kevin

The line before the 2 (the first row in the sample) ends with "Comments \n"

so why doesn't the following work? :

foreach my $line (@array) {

if($line =~ m/Comments(.*)A (B,C)/s) {

print "$1\n";
}}

KevinADC · Feb 8, 2007

it's because you are processing one "line" at a time looping through the array. The regexp can only match the current line. Maybe you should read the file into a string instead of an array. Then you can do multi-line matching. I'm off to bed, I'll check back in the morning. [yawn]
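
A minimal sketch of that suggestion, assuming made-up sample text in place of the real file. Two things differ from the loop in the question: the whole input is read into one string, and the literal parentheses in "A (B,C)" are escaped, since unescaped parentheses form a capture group instead of matching themselves. The names $content and $block are illustrative.

```perl
use strict;
use warnings;

# Made-up sample input standing in for the real file.
my $text = "Title Comments \n2\n1A\n3\nA (B,C)\nG,B (D,E)\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

# Localizing $/ (undef inside the block) makes <$fh> return the whole
# file as one string; the old value is restored when the block ends.
my $content = do { local $/; <$fh> };
close($fh);

# /s lets . match newlines, so the capture can span several lines.
my $block;
if ($content =~ /Comments(.*?)A \(B,C\)/s) {
    $block = $1;
    print $block;
}
```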


eatr · Feb 9, 2007

That does the trick.

Thanks

KevinADC · Feb 9, 2007

got it working now?


eatr · Feb 9, 2007

Almost.

This is an incredible savings for me as the following code allows me to just grab all these lines as they are (given that I know the number of columns, once I know the length of the array, I can then move each column to an individual array):

$data_file = "retestfile.txt";

# Open the file for reading.
open DATA, "$data_file" or die "can't open $data_file $!";
undef $/;
$_ = <DATA>;
close DATA;
$/ = "\n";

$count=0;
if(/Comments(.*)Fractional/s) {

push (@pgm, $1);
$count++;
print @pgm;
print "count is $count\n";
$sizepgm=$#pgm;
$lengthpgm=@pgm;
print "size of program is $#pgm\n";
print "length of program is $lengthpgm\n";

}

The problem, however is that this array only has a single element: the size is 0 and the length is 1.
Additionally, it's adding an (empty) row at the top and the bottom, so the initial and final newlines are being read in, which I don't want.

What I need help with is how to load $1 in the array while splitting it on newline (\n). I suspect this would solve the problem but I can't quite figure out the syntax.

Thanks for your help.

KevinADC · Feb 9, 2007

The system variables, $1, etc, are read only so you have to assign the value of $1 to another variable then do whatever you want to the variable.

Code:

$count=0;
if(/Comments(.*)Fractional/s) {
   my $temp = $1;
   $temp =~ s/\n//g;
   if (length($temp)) {
      push (@pgm, $temp);
      # [rest of your code]
   }
}


eatr · Feb 9, 2007

Kevin

I didn't explain the situation clearly enough, sorry.

When I say the array has a single element I mean that all of the data is in a single cell (all in cell zero).

Your code does essentially the same thing. (size is 0; length is 1). What I need is for this data, broken by newline, to be 'put into individual rows', so that I can access each one individually.

I've been trying to take what's returned, clean it up and insert it into another array. I've had some success but I'm getting some extra lines.

Any ideas how, in the original code, to prevent the additional 'slot' (this translates to a row) from being added at the beginning and at the end of the data?

Sorry, just starting to learn Perl seriously and I appreciate your help.

KevinADC · Feb 9, 2007

maybe:

Code:

$count=0;
if(/Comments(.*)Fractional/s) {
   my @temp = split(/\r?\n/,$1);
   push (@pgm, [@temp]);
   # [rest of your code]
}

that will make an array of arrays. Or maybe you don't need an array of arrays, in which case remove the square brackets:

Code:

$count=0;
if(/Comments(.*)Fractional/s) {
   my @temp = split(/\r?\n/,$1);
   push (@pgm, @temp);
   # [rest of your code]
}
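
The extra rows at the top and bottom most likely come from the whitespace and newlines that sit right after "Comments" and right before "Fractional". A hedged sketch of one way to drop them before splitting; the sample text is invented, and $block and @rows are illustrative names, not from the thread.

```perl
use strict;
use warnings;

# Invented sample; "Comments" and "Fractional" are the markers from the thread.
my $content = "Name Comments \n2\n1A\n3\nFractional part\n";

my @rows;
if ($content =~ /Comments(.*?)Fractional/s) {
    my $block = $1;            # copy $1 before modifying it
    $block =~ s/\A\s+//;       # drop the whitespace/newline after "Comments"
    $block =~ s/\s+\z//;       # drop the newline before "Fractional"
    @rows = split /\r?\n/, $block;
}

print scalar(@rows), " rows\n";
print "row $_: $rows[$_]\n" for 0 .. $#rows;
```

Each line of the section then has its own index, so individual columns can be pulled out of $rows[0], $rows[1], and so on.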


eatr · Feb 10, 2007

Thanks

Another issue. I've only been working on individual 'components' of files to this point. One of these files typically has 9 or 10 components, all (essentially) similarly structured.

Say I have a starting and an end point for all the components (start, end) and all the data is between these two points.

So, I can load the file into $_, then

if(/START(.*)FINISH/s) {
....

}

would check and get the data -but only for a single component

If I wrap a WHILE around it

while() #for ---endless loop

or while (<>) ---nothing

nor does it work in the case of

until(/some text/)

How would I loop through the file getting the data between designated points?

thanks

KevinADC · Feb 10, 2007

Code:

while (/START(.*)FINISH/sg) {
    ....

}

you might need the non-greedy ? quantifier too, so that each match stops at the first FINISH after its START:

Code:

while (/START(.*?)FINISH/sg) {
    ....

}
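
To see what the ? changes, here is a small sketch with invented data holding two components. START and FINISH are the placeholder markers from the question; @greedy and @lazy are illustrative names.

```perl
use strict;
use warnings;

# Invented input with two START ... FINISH components.
my $content = "START\na\nb\nFINISH\nother\nSTART\nc\nFINISH\n";

# Greedy: .* runs to the LAST FINISH, so everything lands in one match.
my @greedy;
while ($content =~ /START(.*)FINISH/sg) {
    push @greedy, $1;
}

# Non-greedy: .*? stops at the NEXT FINISH; /g resumes after each match.
my @lazy;
while ($content =~ /START(.*?)FINISH/sg) {
    push @lazy, $1;
}

print scalar(@greedy), " greedy match(es), ", scalar(@lazy), " non-greedy\n";
```

Without /g, the match restarts from the beginning of the string on every pass, so a while loop around it never ends once the pattern matches.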

I recommend you read the regexp chapter of beginning perl:

http://www.perl.org/books/beginning-perl/


eatr · Feb 10, 2007

thanks
