YOUNGCODGER
Programmer
Hello,
I have written a script, using “wget”, to download a number of websites which feature one or more spreadsheets. An example is:-
What I want to do is extract the first two or three columns of each spreadsheet into a CSV file, with or without the headers.
Part of the HTML is:-
datacelllefttopborder
"><span><P>0551 100</P></span></Td><Td class="datacell content
datacelllefttopborder
"><span><P>g6</P></span></Td><Td class="datacell content
datacelllefttopborder
"><span><P>2</P></span></Td><Td class="datacell content
datacelllefttopborder
"><span><P>N</P></span></Td><Td class="rightborder datacell content
rightborder datacelllefttopborder
"><span><P>N</P></span></Td></TR><TR class="datarow"><Td class="datacell content
datacelllefttopborder
"><span><P>0551 107</P></span></Td><Td class="datacell content
datacelllefttopborder
"><span><P>g21</P></span></Td><Td class="datacell content
And what I want to achieve is:-
‘0500,no fee
‘0551100,g6
‘0551107,g21
etc.
Thanks in anticipation,
YoungCodger
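
One possible approach, sketched here in Python with only the standard library, is to let an HTML parser collect the text of every table cell row by row and then write the first few cells of each row with the csv module. This is a minimal sketch, not a tested solution for the real pages. It assumes each record is a TR element whose cells are Td elements, as in the excerpt above, that the header (if any) is simply the first row, and that spaces should be removed from the first column only (so "0551 100" becomes 0551100). The names TableRows and first_columns_csv are made up for illustration, and the leading quote mark shown in the desired output is not added.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <Td> and <TR> arrive as td/tr.
        if tag == "tr":
            self._flush_row()
            self._row = []
        elif tag in ("td", "th"):
            self._flush_cell()
            if self._row is None:  # cell seen before any <tr> (partial input)
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr":
            self._flush_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _flush_cell(self):
        if self._cell is not None:
            # Collapse runs of whitespace and newlines inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _flush_row(self):
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def close(self):
        super().close()
        self._flush_row()  # keep a final row that was never closed


def first_columns_csv(html, ncols=2, skip_header=False):
    """Return CSV text holding the first `ncols` cells of every row."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    # Assumption: the header, when present, is the first row.
    rows = parser.rows[1:] if skip_header else parser.rows
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        cells = row[:ncols]
        if cells:
            # Assumption: spaces are dropped from the first column only.
            cells[0] = cells[0].replace(" ", "")
            writer.writerow(cells)
    return out.getvalue()
```

With a page already saved by wget, something like first_columns_csv(open("page.html").read(), ncols=2) would return the CSV text, where page.html is a placeholder file name. If a page holds several tables, this sketch returns the rows of all of them in one list, so splitting per table would need extra handling of the table tags.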