problem with parsing 1

kokon · Apr 27, 2006

I need to parse 2 words between <td> tags: "Hello" and "Bye"

$text= "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";

if ($text=~ m/<\/td><td>([^ ]*)<\/./g) {print "$1\n$2\n$3";}

but I only get the first one, "Hello", and cannot parse the word "Bye".

Please help

ishnid · Apr 27, 2006

For parsing HTML documents, you're far better off using a proper tag-aware HTML parser. My personal favourite is HTML::TokeParser::Simple. It makes life a *lot* easier:

Code:

#!/usr/bin/perl -w
use strict;
use HTML::TokeParser::Simple;

my $text= "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";

my $parser = new HTML::TokeParser::Simple( \$text );

my $print = 0;

while( my $t = $parser->get_token ) {
   $print = 1 if $t->is_start_tag( 'td' );       # turn on printing if we find a <td> tag
   $print = 0 if $t->is_end_tag( 'td' );         # turn off printing if we find a </td> tag
   print $t->as_is if ( $print && $t->is_text ); # print if printing is turned on and we've found text
}

spookie · Apr 27, 2006

Try this:

Code:

# in list context, m//g returns every captured group from every match
if (my @str = $text =~ m/<td>(.*?)<\/td>/gi)
 {
  print "$str[0]\n";   # Hello
  print "$str[1]\n";   # Bye
 }
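
For what it's worth, the likely reason the original attempt only gave "Hello": the pattern has a single pair of capturing parentheses, so $2 and $3 are never set, and an "if" evaluates m//g once in scalar context, which finds one match and stops. A minimal sketch of the scalar-context alternative, looping until no more matches are found (the @words array is just a name picked for this illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";

# In scalar context (an "if" or "while" condition), m//g finds one match
# per evaluation and remembers where it stopped, so a loop is needed to
# visit every match.
my @words;
while ( $text =~ m/<td>(.*?)<\/td>/gi ) {
    push @words, $1;    # $1 is the only capture group in this pattern
}

print "$_\n" for @words;
```

This only holds up for simple, predictable markup like the sample string; for real HTML pages a parser such as HTML::TokeParser::Simple is still the safer choice.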

