loop for regex

Status
Not open for further replies.

eatr

Technical User
Jan 20, 2006
48
I'm trying to parse a file that has similar data in different sections; thus I need to set limiting conditions or there will be overmatching.

Problem is I can't figure out the exact syntax.

I've read a file into an array and am looping through each line of the array. I want to use a WHILE or UNTIL to limit my search; in this case for single (or double) digit numbers between two word characters (one of which I know).

so,

foreach my $line (@array) {
while (1st word .. 2nd word) { # need exact syntax here
if ($line =~ /expression here/) {

......


thanks

 
post some sample data and explain what you are trying to match and when you don't want to match.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Kevin

I appreciate your help. All I really need at this point is the precise syntax for setting a range (starting pattern to end pattern) from which I can then test and extract data and stop; be it a WHILE, FOR or IF loop.

This is a file that will, for the most part, be consistent. I need to break it down into sections, however.

thanks
 
It'd probably be easier to do this while reading the file:


Code:
open(FH,'file.txt') or die "$!";
OUTTER: while(<FH>){
   if (/foo/) {
      INNER: while(<FH>){
         # do something with $_ here
         next OUTTER if (/bar/);
      }
   }
}
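
An alternative to the nested loops above, when the lines are already in an array, might be Perl's range (flip-flop) operator, which stays true from a line matching the first pattern through a line matching the second. The sketch below is only an illustration: the helper name `lines_between` and the sample lines are made up, and `foo` and `bar` stand in for the real start and end words.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines lying between a line matching
# $start and the next line matching $end (both marker lines excluded).
sub lines_between {
    my ($start, $end, @lines) = @_;
    my @found;
    for my $line (@lines) {
        # In scalar context .. is the flip-flop operator: false until
        # $start matches, then true up to and including the $end line.
        if (my $pos = ($line =~ $start .. $line =~ $end)) {
            next if $pos == 1;        # skip the start marker line
            next if $pos =~ /E0$/;    # skip the end marker line
            push @found, $line;
        }
    }
    return @found;
}

# Made-up sample lines; "foo" and "bar" are placeholder markers.
my @array = ("header\n", "foo\n", "2\n", "1A\n", "3\n", "bar\n", "9\n");
print lines_between(qr/foo/, qr/bar/, @array);   # prints the lines 2, 1A and 3
```

The operator returns a sequence number for each line inside the range and appends "E0" to the last one, which is how the sketch recognises the two marker lines.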

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I should've taken your advice earlier.

Here's an abbreviated sample of the data I want to match --in this part of the file, it's multiple lines. I don't know beforehand how many there are in each set until I parse the first set. The format of these sets is consistent, however. I've indicated spacing and in each case there's a newline at the end of each line.

2
1A #one set (of 3 here) - nbrs and/or nbrs-words
3 1-2 digits
A (B,C)
G,B (D,E) #another set --- names with parens
E,F,G (T,R)
114
99 #another set --- 2 or 3 digit nbrs
122
2
1 #another set --- 1 or 2 digit nbrs
3
3
1 #another set --- 1 or 2 digit nbrs
2
2 2.5
3 .125 #another set- 2 sets of nbrs -1st digit
1 1 2nd - digit or float
1 2
3 .25 #same as above
2
etc.


I have the last word on the line preceding the 1st set of numbers ("Comments \n") and I know that the set immediately after the last line of this set consists of names (words), of the form shown above, only - no numbers. If I can set these as before and after points in which to parse the initial set of data (numbers), I can then count the number of lines and loop through each succeeding set.





 
LOL.... I don't understand a word of your explanation. Well, the words I do understand, but not the explanation. It's like reading double-talk:

Insert your elbow into your ear. Turn repentently until your stomach is full and your toes talk in tongues. You might see a bad odor or hear blue colors while attempting this so walk slowly while in a seated position.

What the hell-o are you trying to match! [bugeyed] [hammer] [hairpull2]


[smile]

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
I need to pull all that data.

Trying to figure out how to correctly use /s to span newlines. Once I do, I'll be able to pull the first set of data and I'll be on my way.

Help me out, Kevin

The line before the 2 (the first row in the sample) ends with "Comments \n"

so why doesn't the following work? :

foreach my $line (@array) {

if($line =~ m/Comments(.*)A (B,C)/s) {

print "$1\n";
}}


 
it's because you are processing one "line" at a time looping through the array. The regexp can only match the current line. Maybe you should read the file into a string instead of an array. Then you can do multi-line matching. I'm off to bed, I'll check back in the morning. [yawn]
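
A minimal sketch of that approach, assuming the whole file fits in memory. The helper names `slurp` and `capture_between` and the sample string are made up for illustration. Note that the parentheses in "A (B,C)" also need quoting (here with \Q...\E); unquoted, they open a capture group instead of matching literal parentheses.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    local $/;                 # no record separator inside this sub only
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Hypothetical helper: return the text between "Comments" and "A (B,C)".
sub capture_between {
    my ($text) = @_;
    # /s lets . match newlines; \Q...\E quotes the parentheses so that
    # "A (B,C)" is matched literally.
    return $text =~ /Comments(.*?)\QA (B,C)\E/s ? $1 : undef;
}

# Made-up sample resembling the data posted above.
my $sample = "Notes Comments \n2\n1A\n3\nA (B,C)\nG,B (D,E)\n";
print capture_between($sample);
```

With a real file the call would look like `capture_between(slurp('file.txt'))`.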





------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
That does the trick.

Thanks
 
got it working now?

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Almost.

This is an incredible savings for me as the following code allows me to just grab all these lines as they are (given that I know the number of columns, once I know the length of the array, I can then move each column to an individual array):


$data_file = "retestfile.txt";


# Open the file for reading.
open DATA, "$data_file" or die "can't open $data_file $!";
undef $/;
$_ = <DATA>;
close DATA;
$/ = "\n";

$count=0;
if(/Comments(.*)Fractional/s) {

push (@pgm, $1);
$count++;
print @pgm;
print "count is $count\n";
$sizepgm=$#pgm;
$lengthpgm=@pgm;
print "size of program is $#pgm\n";
print "length of program is $lengthpgm\n";

}


The problem, however is that this array only has a single element: the size is 0 and the length is 1.
Additionally, it's adding an (empty) row at the top and the bottom --so the initial and final newlines are being read in, which I don't want.

What I need help with is how to load $1 in the array while splitting it on newline (\n). I suspect this would solve the problem but I can't quite figure out the syntax.


Thanks for your help.
 
The system variables, $1, etc, are read only so you have to assign the value of $1 to another variable then do whatever you want to the variable.

Code:
$count=0;
if(/Comments(.*)Fractional/s) {
   my $temp = $1;
   $temp =~ s/\n//g;
   if (length($temp)) {
      push (@pgm, $temp);
   [rest of your code]
   }
}

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Kevin

I didn't explain the situation clearly enough, Sorry.

When I say the array has a single element I mean that all of the data is in a single cell (all in cell zero).

Your code does essentially the same thing. (size is 0; length is 1). What I need is for this data, broken by newline, to be 'put into individual rows', so that I can access each one individually.

I've been trying to take what's returned, clean it up and insert it into another array. I've had some success but I'm getting some extra lines.

Any ideas how, in the original code, to prevent the additional 'slot' (this translates to a row once ) from being added at the beginning and at the end of the data?

Sorry, just starting to learn Perl seriously and I appreciate your help.

 
maybe:

Code:
$count=0;
if(/Comments(.*)Fractional/s) {
   my @temp = split(/\r?\n/,$1);
   push (@pgm, [@temp]);
   [rest of your code]
}

that will make an array of arrays. Or maybe you don't need an array of arrays, in which case remove the square brackets:

Code:
$count=0;
if(/Comments(.*)Fractional/s) {
   my @temp = split(/\r?\n/,$1);
   push (@pgm, @temp);
   [rest of your code]
}
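
The extra rows at the top and bottom can probably be dropped by filtering out blank rows after the split. This is only a sketch: the helper name `block_to_rows` and the sample string are made up, and it assumes that no wanted row is empty or whitespace-only.

```perl
use strict;
use warnings;

# Hypothetical helper: split a captured block into rows, dropping rows that
# are empty or whitespace-only (such as the rest of the "Comments \n" line).
sub block_to_rows {
    my ($block) = @_;
    return grep { /\S/ } split /\r?\n/, $block;
}

my $text = "Comments \n2\n1A\n3\nFractional\n";   # made-up sample
my @pgm;
if ($text =~ /Comments(.*)Fractional/s) {
    @pgm = block_to_rows($1);
}
print scalar(@pgm), " rows, last index $#pgm\n";   # prints "3 rows, last index 2"
```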

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Thanks

Another issue. I've only been working on individual 'components' of files to this point. One of these files typically has 9 or 10 components, all (essentially) similarly structured.

Say I have a starting and an end point for all the components (start, end) and all the data is between these two points.

So, I can load the file into $_, then

if(/START(.*)FINISH/s) {
....

}

would check and get the data -but only for a single component

If I wrap a WHILE around it

while() #for ---endless loop

or while (<>) ---nothing


nor does it work in the case of

until(/some text/)


How would I loop through the file getting the data between designated points?

thanks







 
Code:
while (/START(.*)FINISH/sg) {
    ....

}

You will probably need the non-greedy ? quantifier too; otherwise .* runs from the first START to the last FINISH in a single match:

Code:
while (/START(.*?)FINISH/sg) {
    ....

}
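
As a rough sketch of how that loop could collect every component: in scalar context the /g flag makes each match resume where the previous one ended, whereas a condition without /g would match the same first block on every pass. The helper name `all_sections` and the sample string are made up for illustration; START and FINISH are the placeholder markers from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return every block found between START and FINISH.
sub all_sections {
    my ($text) = @_;
    my @sections;
    # Each pass resumes after the previous match, so the loop ends
    # when no further START...FINISH pair is left.
    while ($text =~ /START(.*?)FINISH/sg) {
        push @sections, $1;
    }
    return @sections;
}

my $text = "START\n1\n2\nFINISH\njunk\nSTART\n3\nFINISH\n";  # made-up sample
my @sections = all_sections($text);
print scalar(@sections), " sections found\n";   # prints "2 sections found"
```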



I recommend you read the regexp chapter of Beginning Perl.




------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 