Hello Folks,
I'm trying to parse an HTML file to extract a city name, using bash scripting on Linux. The tags around that city name are always formatted the same way. Here is an example with LOS ANGELES:
<....bunch of HTML code.....>
<tr><td align="right"> Prefix</td><td><b>240</b></td></tr>
<tr><td align=right>City</td><td><b>LOS ANGELES </b></td></tr>
<tr><td align=right>State</td><td><b>California</b></td></tr>
<....bunch of HTML code.....>
Basically, I want to isolate the string between
[highlight]City</td><td><b>[/highlight]
and
[highlight]</b></td></tr>
<tr><td align=right>State[/highlight]
I have banged my head trying to do this as a two-step job:
* remove everything before the city name
* remove everything after the city name
That way I would end up with what I need... but I am sooooo getting lost in sed and regular expressions... :-(
Any help or hints would be greatly appreciated!
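For what it's worth, the two-step idea above (strip everything before the city, then everything after it) could be sketched with sed roughly like this. It is only a sketch under some assumptions: the City row sits on a single line exactly as in the sample, the marker text City</td><td><b> appears only once in the file, and extract_city is a made-up helper name, not something from an existing tool. A real HTML parser would be more robust if the markup ever changes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads HTML on stdin, prints the city name.
# Assumes the whole City row is on one line, as in the sample above.
extract_city() {
  sed -n '/City<\/td><td><b>/{
    # step 1: remove everything up to and including the opening marker
    s/.*City<\/td><td><b>//
    # step 2: remove trailing spaces, the closing </b>, and everything after it
    s/[[:space:]]*<\/b>.*//
    p
  }'
}

# Sample rows copied from the example above
html='<tr><td align="right"> Prefix</td><td><b>240</b></td></tr>
<tr><td align=right>City</td><td><b>LOS ANGELES </b></td></tr>
<tr><td align=right>State</td><td><b>California</b></td></tr>'

city=$(printf '%s\n' "$html" | extract_city)
printf '%s\n' "$city"   # prints: LOS ANGELES
```

On a real file this would be called as extract_city < page.html, where page.html is a placeholder name. The -n flag plus the explicit p means only the matching line is printed, so the rest of the HTML is dropped without a separate grep.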