
problem with parsing 1

Status
Not open for further replies.

kokon

Programmer
Apr 27, 2006
31
PL

I need to extract the two words between the <td> tags: "Hello" and "Bye".

$text= "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";

if ($text=~ m/<\/td><td>([^ ]*)<\/./g) {print "$1\n$2\n$3";}

But I only get the first one, "Hello"; I cannot get the second word, "Bye".

Please help.
 
For parsing HTML documents, you're far better off using a proper tag-aware HTML parser. My personal favourite is HTML::TokeParser::Simple. It makes life a *lot* easier:
Code:
#!/usr/bin/perl -w
use strict;
use HTML::TokeParser::Simple;

my $text= "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";

my $parser = HTML::TokeParser::Simple->new( \$text );

my $print = 0;

while( my $t = $parser->get_token ) {
   $print = 1 if $t->is_start_tag( 'td' );       # turn on printing if we find a <td> tag
   $print = 0 if $t->is_end_tag( 'td' );         # turn off printing if we find a </td> tag
   print $t->as_is if ( $print && $t->is_text ); # print if printing is turned on and we've found text
}
 
Try this:
Code:
if (my @str = $text =~ m/<td>(.*?)<\/td>/gi)   # in list context, /g returns every captured word
 {
  print "$str[0]\n";
  print "$str[1]\n";
 }
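
For what it's worth, the original attempt stops after "Hello" because a /g match inside a plain if runs in scalar context and returns only one match per evaluation, and the pattern has just one capture group, so $2 and $3 are never set. If you would rather step through the matches one at a time than collect them in a list, a loop along these lines should work. This is only a sketch: the helper name td_words is made up for illustration, and it assumes the cells you want are plain <td> tags with no attributes.
Code:
```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect the text of every
# plain <td>...</td> cell by stepping through the matches one at a time.
sub td_words {
    my ($html) = @_;
    my @words;
    # In scalar context, /g resumes where the previous match ended,
    # so each pass through the loop sets $1 to the next cell's text.
    while ( $html =~ m{<td>(.*?)</td>}gi ) {
        push @words, $1;
    }
    return @words;
}

my $text = "<td class=sp></td><td>Hello</td><td class=sp></td><td>Bye</td><td class=sp>";
print "$_\n" for td_words($text);
```
The <td class=sp> cells are skipped simply because the pattern requires a bare <td>. For real pages with nested or attribute-laden tags, the parser approach above is still the safer route.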

 