Regular Expressions - Help me ??

Status
Not open for further replies.

WMitchellCPQ

Programmer
Sep 28, 2001
US
Hi, I need to remove a navigation table (nav table) from a large number of files. How do I construct a regular expression that removes the whole table? The contents of the table vary from file to file.
Code:
<table id="NavTable">
 Varying text here 
<some tags>

</table>

Thanks in advance
 
awk '/^<table id="NavTable">/,/<\/table>/{next}1' infile >outfile
 
Code:
$text =~ s!<table id="NavTable">.*?</table>!!gs;

Here $text is the complete HTML text in a single string. The s modifier lets . match newlines, and .*? stops at the first </table> it finds, so this assumes there are no other tables inside the NavTable table tags. Otherwise it could be very tricky. You may want to look at HTML::Tree (1) if that is the case.

(1) http://search.cpan.org/dist/HTML-Tree/lib/HTML/Tree.pm
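
To illustrate the nested-table caveat, here is a minimal Ruby sketch of the same non-greedy substitution. The helper name strip_nav_table and the sample strings are made up for illustration; they are not from the thread. In Ruby the /m flag plays the role of Perl's s modifier (dot matches newline).

```ruby
# Hypothetical helper: same idea as the Perl one-liner, not a tested tool.
NAV_TABLE = /<table id="NavTable">.*?<\/table>/m

def strip_nav_table(html)
  # .*? is non-greedy, so the match ends at the FIRST </table>
  html.gsub(NAV_TABLE, '')
end

flat   = %(<p>a</p><table id="NavTable">\n<tr><td>x</td></tr>\n</table><p>b</p>)
nested = %(<table id="NavTable"><table><tr></tr></table>tail</table><p>b</p>)

puts strip_nav_table(flat)    # prints <p>a</p><p>b</p>
puts strip_nav_table(nested)  # prints tail</table><p>b</p>  (leftover markup)
```

The second call shows why a nested table breaks this approach: the match stops at the inner </table> and the rest of the NavTable is left behind.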

Barbie
Leader of Birmingham Perl Mongers
 
If the NavTable can contain tables:

Ruby:
Code:
class String
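  # Like split(), but includes matching substrings.
  # "\1" (double-quoted) is the control character \001, used as a delimiter;
  # '\&' (single-quoted) stands for the whole match in the replacement.
  def shatter( re )
    self.gsub( re, "\1"+'\&'+"\1" ).split("\1")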
  end
end

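outtext = ''
depth = 0   # table nesting depth inside the NavTable; 0 means outside it
# text is a string containing the whole file.
text.shatter( /<\/?table[^>]*>/ ).each{ |s|
  if s =~ /<table id="NavTable">/
    depth += 1                  # entering the NavTable
  else
    # count nested tables only while inside the NavTable
    depth += (depth>0 ? 1 : 0)  if s =~ /<table/
  end
  outtext += s  if 0==depth     # keep only text outside the NavTable
  depth -= (depth>0 ? 1 : 0)  if s =~ /<\/table>/
}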
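
As a rough check of the depth-counting idea above, here is a self-contained sketch wrapped in a method. The name remove_nav_table and the sample string are hypothetical. It uses String#split with a capture group, which keeps the matched tags in the result, in place of the shatter helper.

```ruby
# Hypothetical wrapper around the depth-counting approach; a sketch only.
def remove_nav_table(text)
  out = +''
  depth = 0   # 0 means we are outside the NavTable
  # The capture group makes split return the table tags as separate pieces.
  text.split(/(<\/?table[^>]*>)/).each do |piece|
    if piece =~ /\A<table id="NavTable">/
      depth += 1
    elsif depth > 0 && piece =~ /\A<table/
      depth += 1                      # nested table inside the NavTable
    end
    out << piece if depth.zero?       # keep text outside the NavTable
    depth -= 1 if depth > 0 && piece =~ /\A<\/table/
  end
  out
end

sample = '<p>a</p><table id="NavTable"><tr><td><table><tr></tr></table>' \
         'inner</td></tr></table><table><tr><td>keep</td></tr></table>'
puts remove_nav_table(sample)
# prints <p>a</p><table><tr><td>keep</td></tr></table>
```

For a batch of files, something along the lines of File.write(path, remove_nav_table(File.read(path))) inside a loop over the file names should work, though writing to a copy first is safer.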
 