need regex help again.

Status
Not open for further replies.

StickyBit

Technical User
Jan 4, 2002
Folks,

I'm trying to write an expression to pick up the following line in a data file:

 0     0     0      360     0.2   12       9.1

The following expression works but I’m not sure why:

(/^\s([0-9]{1,}\s{5})/)

I thought this expression would work but it doesn’t :(

(/^\s([0-9]{1,}\s){4}[0-9]{1,}\.([0-9]{1,}\s){2}[0-9]{1,}\.[0-9]{1,}/)

Need help,

Stickybit.
 
It's quite useful to use the 'x' modifier to explain complex regular expressions:
Code:
while (<DATA>) {
    if (/^         # start of line
        \s         # a single whitespace
        (          # start grouping
        [0-9]{1,}  # 1 or more digits
        \s{5}      # exactly 5 spaces
        )          # end grouping
        /x) {
        print "matched 1\n";
    }
    if (/^         # start of line
        \s         # a single whitespace
        (          # start grouping
        [0-9]{1,}  # 1 or more digits
        \s         # a single whitespace
        )          # end grouping
        {4}        # grouping occurs exactly 4 times
        [0-9]{1,}  # 1 or more digits
        \.         # a decimal point
        (          # start grouping
        [0-9]{1,}  # one or more digits
        \s         # a single whitespace
        )          # end grouping
        {2}        # grouping occurs exactly 2 times
        [0-9]{1,}  # 1 or more digits
        \.         # a decimal point
        [0-9]{1,}  # 1 or more digits
        /x) {
        print "matched 2\n";
    }
}
__DATA__
 0     0     0      360     0.2   12       9.1
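This reports "matched 1" only. The first regexp matches because it is not anchored at the end: it only needs the leading whitespace, one number and five whitespace characters, and it ignores the rest of the line. I think there are a few errors in your second regexp: each \s allows only a single whitespace character after a number, while the data has several.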
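To see what the first pattern actually matched, it can help to capture and print it. This is only a sketch: the sample line is copied from the __DATA__ section above, the capture group has been widened to take in the whole prefix, and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample line, spacing copied from the __DATA__ section above.
my $line = " 0     0     0      360     0.2   12       9.1";

# First pattern: one whitespace, digits, then exactly five whitespace.
# Nothing anchors it at the end, so matching a prefix is enough.
# (The parentheses are moved here so the whole prefix is captured.)
my ($prefix) = $line =~ /^(\s[0-9]{1,}\s{5})/;
print "first pattern matched: '$prefix'\n" if defined $prefix;

# Second pattern: allows only ONE whitespace after each number,
# so it gives up when it meets the run of five spaces.
my $second = $line =~ /^\s([0-9]{1,}\s){4}[0-9]{1,}\.([0-9]{1,}\s){2}[0-9]{1,}\.[0-9]{1,}/;
print "second pattern matched: ", ($second ? "yes" : "no"), "\n";
```

The first print shows only the leading space, the first 0 and the five spaces after it, which is all that pattern ever looked at.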
Cheers, Neil
 
Here's a modified version of the regexp which should work:
Code:
#!/usr/bin/perl -w

while (<DATA>) {
    if (/^         # start of line
        \s         # a single whitespace
        (          # start grouping
        [0-9]+     # 1 or more digits
        \s+        # 1 or more whitespace
        )          # end grouping
        {4}        # grouping occurs exactly 4 times
        [0-9]+     # 1 or more digits
        \.         # a decimal point
        (          # start grouping
        [0-9]+     # 1 or more digits
        \s+        # 1 or more whitespace
        )          # end grouping
        {2}        # grouping occurs exactly 2 times
        [0-9]+     # 1 or more digits
        \.         # a decimal point
        [0-9]+     # 1 or more digits
        /x) {
        print "matched\n";
    }

    # same line as above without 'x' modifier
    if (/^\s([0-9]+\s+){4}[0-9]+\.([0-9]+\s+){2}[0-9]+\.[0-9]+/) {
        print "matched\n";
    }
}
__DATA__
 0     0     0      360     0.2   12       9.1
Cheers, Neil
 
This regex will match the line you gave, but to make it more specific, you'll need to post a little more of the info from your file.

Code:
$_ = '0     0     0      360     0.2   12       9.1';
print if (/^([\d.]+\s*){7}$/)
 
I didn't notice the white space at the beginning of the line - this should work better:

Code:
/^\s([\d.]+\s*){7}$/
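One caveat that may be worth checking before relying on this pattern: [\d.]+ accepts any mix of digits and dots, and \s* allows zero whitespace, so it is looser than "seven numbers". A small sketch with made-up test strings (only the first one comes from this thread):

```perl
use strict;
use warnings;

my $re = qr/^\s([\d.]+\s*){7}$/;

# Inputs chosen only to probe the pattern; the last two are hypothetical.
my @tests = (
    " 0     0     0      360     0.2   12       9.1",  # the real line
    " 1234567",        # one number, but it splits into 7 one-digit groups
    " 1 2 3 4 5 6",    # only six fields
);

my @results;
for my $t (@tests) {
    my $ok = $t =~ $re ? 1 : 0;
    push @results, $ok;
    printf "%-50s %s\n", "'$t'", $ok ? "match" : "no match";
}
```

If that second case matters for your data, using \s+ between fields (and dropping it after the last one) would be stricter.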
 
Code:
$_ = '0     0     0      360     0.2   12       9.1';
my @arrayOfValues = split(/\s+/);

# Result:
# $arrayOfValues[0] is 0
# $arrayOfValues[1] is 0
# $arrayOfValues[2] is 0
# $arrayOfValues[3] is 360
# etc.
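One thing to watch with split: if the line starts with whitespace, as the line in the data file does, split /\s+/ returns an empty first field. A minimal sketch of the difference, assuming the spacing from the earlier __DATA__ example (variable names are illustrative only):

```perl
use strict;
use warnings;

my $line = " 0     0     0      360     0.2   12       9.1";

# split /\s+/ keeps an empty first field when the line starts with whitespace.
my @with_pattern = split /\s+/, $line;

# split ' ' (a single-space string) is Perl's special case:
# it skips leading whitespace, the way awk does.
my @awk_style = split ' ', $line;

print scalar(@with_pattern), " fields, first is '$with_pattern[0]'\n";
print scalar(@awk_style),    " fields, first is '$awk_style[0]'\n";
```

With split ' ' the indexes line up with the columns, so the MHz value is at index 3.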

-Nick
 
Assuming you only want lines containing 7 numeric elements:
Code:
#!perl
use strict;
use warnings;

while (<DATA>) {
    chomp;
    if ((my @line = m|(-?\d+(?:\.\d+)?)|g) == 7) {
        print join("\t", @line), "\n";
    }
}

__DATA__
 0     0     0      360     0.2   12       9.1
1 2 3 4 5
7 8.2 5 14.75 -1 0 8.9
Output:
Code:
0       0       0       360     0.2     12      9.1
7       8.2     5       14.75   -1      0       8.9
The re says:
1. Optional leading -
2. One or more digits
3. Optional decimal point and one or more digits. The ?: means don't capture the decimal and following digits as a separate field.
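The "== 7" test works because a list assignment evaluated in scalar context yields the number of elements on its right-hand side. A short sketch of that idiom on its own, using one of the sample lines above (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $text = "7 8.2 5 14.75 -1 0 8.9";

# In list context, m//g returns every captured number.
my @numbers = $text =~ m|(-?\d+(?:\.\d+)?)|g;

# Assigning to an empty list and then to a scalar gives the
# number of matches without keeping them.
my $count = () = $text =~ m|(-?\d+(?:\.\d+)?)|g;

print "numbers: @numbers\n";
print "count:   $count\n";
```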



 
Stickybit,

It would be nice to come back and recognise these peeplz efforts, so they don't get pi$$ed 0ff and dance on other people. We're here to help, not to get 91553d on.

If you had an account with me, I'd charge you 4 times, plus a percentage/margin/whatever/they/call/it/now

B nice
--Paul
 
Paul,

I always make it a point to thank/recognise those who have helped me. I got tied up with a couple emergencies yesterday otherwise I would have responded sooner…I apologize.

Anyways,

Thank you to everyone who helped me yesterday, my understanding of regex is a little better now. Special thanks to Toolkit for breaking things down into a simpler form and pointing out my mistakes.

I’m working on an inventory project where I have to collect the CPU information for several SUN Servers that display their CPU information as follows:

Data file 1 (4 CPU's)

___DATA___

Brd  CPU  Module    MHz     MB    Impl.   Mask
---  ---  -------  -----  ------  ------  ----
SYS    0     0      400     4.0   US-II   10.0
SYS    1     1      400     4.0   US-II   10.0
SYS    2     2      400     4.0   US-II   10.0
SYS    3     3      400     4.0   US-II    9.0

___DATA___

or as in my original example:

Data file 2 (1 CPU)

___DATA___

Brd  CPU  Module    MHz     MB    Impl.   Mask
---  ---  -------  -----  ------  ------  ----
 0     0     0      360     0.2   12       9.1

___DATA___

My immediate requirement was the CPU count, so in the end I used the following regular expressions to count each line in the data file, which worked!

For data file 1

(/^[A-Z]{3}\s{1,}([0-9]{1,}\s{1,}){3}[0-9]{1,}\.[0-9]{1,}\s{1,}[A-Z]{1,}-[A-Z]{1,}\s{1,}[0-9]{1,}\.[0-9]{1,}/)

For data file 2

(/^\s([0-9]{1,}\s{1,}){4}[0-9]{1,}\.[0-9]{1,}\s{1,}[0-9]{1,}\s{1,}[0-9]{1,}\.[0-9]{1,}/)

I’m sure the expression could be more efficient but I thought I would come up with my own work first (based on the examples provided) then look at the answers provided and tweak later…
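If the count is all that is needed, one alternative to keeping a separate regex per layout is to split each row and test the CPU column. This is only a sketch under two assumptions taken from the samples above: every CPU row has 7 columns, and the second column is a plain integer. The helper name count_cpus is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: counts rows whose second whitespace-separated
# field (the CPU column) is a plain integer. The header and the dashed
# separator rows fail that test and are skipped.
sub count_cpus {
    my (@lines) = @_;
    my $count = 0;
    for my $line (@lines) {
        my @fields = split ' ', $line;    # split ' ' skips leading whitespace
        next unless @fields == 7;         # both sample layouts have 7 columns
        $count++ if $fields[1] =~ /^\d+$/;
    }
    return $count;
}

my @file1 = (
    "Brd CPU Module MHz MB Impl. Mask",
    "--- --- ------- ----- ------ ------ ----",
    "SYS 0 0 400 4.0 US-II 10.0",
    "SYS 1 1 400 4.0 US-II 10.0",
    "SYS 2 2 400 4.0 US-II 10.0",
    "SYS 3 3 400 4.0 US-II 9.0",
);
my @file2 = (
    "Brd CPU Module MHz MB Impl. Mask",
    "--- --- ------- ----- ------ ------ ----",
    " 0     0     0      360     0.2   12       9.1",
);

print "file 1: ", count_cpus(@file1), " CPU(s)\n";
print "file 2: ", count_cpus(@file2), " CPU(s)\n";
```

This trades strictness for simplicity: it does not validate the MHz or Mask columns, so the full regexes are still the better choice if stray 7-column lines can appear in the file.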

Thanks again folks!

Stickybit
 
Hey guys,

What does the $ at the end of the expression mean?

$_ = '0 0 0 360 0.2 12 9.1';
print if (/^\s([\d.]+\s*){7}$/)

thanks,

stickybit
 
From perlretut:
The anchor ^ means match at the beginning of the string and the anchor $ means match at the end of the string ...
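In practice the $ stops the pattern from matching just a prefix of the line. One detail to be aware of: in Perl, $ also matches just before a newline at the very end of the string, while \z matches only at the true end. A small sketch with made-up strings:

```perl
use strict;
use warnings;

my $with_newline = "9.1\n";      # as read from a file, before chomp
my $trailing     = "9.1 junk";   # hypothetical line with extra text

# Without $, a match on the leading number is enough.
my $unanchored = $trailing =~ /^[\d.]+/  ? 1 : 0;

# With $, the number must run to the end of the string.
my $anchored   = $trailing =~ /^[\d.]+$/ ? 1 : 0;

# $ still matches before a final newline; \z does not.
my $dollar = $with_newline =~ /^[\d.]+$/  ? 1 : 0;
my $strict = $with_newline =~ /^[\d.]+\z/ ? 1 : 0;

print "unanchored on '9.1 junk': $unanchored\n";
print "anchored on '9.1 junk':   $anchored\n";
print "\$ on \"9.1\\n\":  $dollar\n";
print "\\z on \"9.1\\n\": $strict\n";
```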
 