Parsing a file

Status
Not open for further replies.

encin0man

Programmer
Nov 24, 2006
4
0
0
US
I have a text file with the following format:

1.1.1 50.1 1065.1
1.1.2 51.1 14.1
1.1.5 52.1 101.1 Text...text....text
1.2.1 53.1 100.1 Text...text....text

And I want to get it looking like this and printed into a file:

50.1 1065.1
51.1 14.1
52.1 101.1
53.1 100.1



Any Ideas?

Thanks
 
I saved the text you posted above in a file called encino.txt (because I'm not sure what your text file is actually named) and wrote the output to output.txt. Remember to replace encino.txt with your real filename.

This code is quick and dirty, but it works (only for this specific text file you provided). Let me know if you have any problems with it.

Code:
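open(INFILE, "encino.txt") or die "File cannot be found.";
open(OUTFILE, ">>output.txt") or die "Output file cannot be opened.";

while (<INFILE>) {
     chomp($_);
     # Each alternative matches one known pair; \s+ accepts tabs or spaces.
     if (/(50\.1\s+1065\.1)|(51\.1\s+14\.1)|(52\.1\s+101\.1)|(53\.1\s+100\.1)/) {
          # $& holds whatever the pattern matched on this line.
          print OUTFILE "$&\n";
     }
}

close INFILE;
close OUTFILE;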
 
cyphix's solution only works if the file contains exactly those four pairs (50.1 1065.1 through 53.1 100.1). It looks to me like the 50 is an incrementing number that would go on to 54, 55, 56, 57, and so on, and cyphix's code wouldn't match those lines.

Assuming that the first three numbers on every line are separated by at least one space, here's a slightly better version of the previously posted script.

Code:
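open(INFILE, "encino.txt") or die "File cannot be found.";
open(OUTFILE, ">>output.txt") or die "Output file cannot be opened.";

while (<INFILE>) {
     chomp($_);

     # A limit of 4 keeps any trailing text together in $text.
     my ($first,$second,$third,$text) = split(/\s+/, $_, 4);
     next unless defined $third;   # skip blank or short lines
     print OUTFILE "$second $third\n";
}

close INFILE;
close OUTFILE;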
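
To see what the split is doing without touching any files, here is a minimal sketch that runs it on a few in-memory lines. The helper name second_and_third is made up for illustration, the sample lines are copied from the question, and it assumes lines do not start with whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: return "second third" for one input line,
# or undef when the line has fewer than three fields.
sub second_and_third {
    my ($line) = @_;
    chomp $line;
    # A limit of 4 keeps any trailing text together in the fourth element.
    my ($first, $second, $third, $text) = split /\s+/, $line, 4;
    return unless defined $third;
    return "$second $third";
}

# Sample lines copied from the question, plus one blank line.
my @sample = (
    "1.1.1 50.1 1065.1\n",
    "1.1.5 52.1 101.1 Text...text....text\n",
    "\n",
);

for my $line (@sample) {
    my $pair = second_and_third($line);
    print "$pair\n" if defined $pair;
}
```

The blank line is skipped because split returns nothing for it, so the third field is undefined.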

-------------
Kirsle.net | Kirsle's Programs and Projects
 
I didn't explain what I did...sorry about that.

This:

Code:
if(/(50\.1\s+1065\.1)|(51\.1\s+14\.1)|(52\.1\s+101\.1)|(53\.1\s+100\.1)/)

is just a regular expression that is tested against each line of the file. If one of these pairs of numbers is found:

50.1 1065.1
51.1 14.1
52.1 101.1
53.1 100.1

then it prints the pair to a file. Again, the code only works with the text file contents that you provided above. To get it to work with any other file, you'd have to modify the regular expression.
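
If the values change from file to file, one option is to match the shape of the line instead of the numbers themselves. This is only a rough sketch: match_pair is a name made up for illustration, and the 54.1 line is invented sample data, not something from the original file.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the second and third whitespace-separated
# fields of a line without naming any specific values.
sub match_pair {
    my ($line) = @_;
    # ^\S+   first field (e.g. 1.1.1), not captured
    # (\S+)  second field, captured as $1
    # (\S+)  third field, captured as $2
    if ($line =~ /^\S+\s+(\S+)\s+(\S+)/) {
        return "$1 $2";
    }
    return;
}

# The second line is made up to show that new values still match.
for my $line ("1.1.1 50.1 1065.1", "1.2.2 54.1 7.1 Text...text") {
    my $pair = match_pair($line);
    print "$pair\n" if defined $pair;
}
```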
 
Maybe a one-liner:

Code:
perl -pe "s/^(\S+)\s+(\S+)\s+(\S+).*$/$1 $2/" infile.txt >outfile.txt

- Kevin, perl coder unexceptional!
 
Oops, should be:

Code:
perl -pe "s/^(\S+)\s+(\S+)\s+(\S+).*$/$2 $3/" infile.txt >outfile.txt
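
For anyone new to -p, here is roughly what the one-liner does, written out as a small script. It is a sketch only: rewrite_line is a made-up name, and the in-memory sample lines (copied from the question) stand in for infile.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the same substitution the one-liner uses.
sub rewrite_line {
    my ($line) = @_;
    # Replace everything up to the end of the line with fields two and three.
    # The trailing newline is left alone because .* and $ stop before it.
    $line =~ s/^(\S+)\s+(\S+)\s+(\S+).*$/$2 $3/;
    return $line;
}

# Sample lines copied from the question.
my @lines = (
    "1.1.2 51.1 14.1\n",
    "1.2.1 53.1 100.1 Text...text....text\n",
);

# -p does the equivalent of this loop: read a line, run the code, print it.
for my $line (@lines) {
    print rewrite_line($line);
}
```

One thing to watch: the double quotes around the code suit cmd.exe on Windows. On a Unix-style shell the shell would expand $2 and $3 itself before perl sees them, so the code would need single quotes there.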

- Kevin, perl coder unexceptional!
 
Wow guys, thanks for all the help. I will post my finished code as soon as I finish it. For now I just wanted to say thanks. Basically, I chomped and split it as you showed. I am a bit of a newbie when it comes to regex.

 