OK, thanks for the reply... I checked out both.
To keep it simple, here is what I came up with... but I am still weak at the scraping and comparing part.
#!/usr/bin/perl -w
use strict;
use warnings;
use LWP::Simple;
use HTML::TableExtract;
use HTML::Parser;
my $url = 'http:testsite';
my $content = get $url or die "Couldn't get $url";
my $htex = HTML::TableExtract->new(headers=>['1st','2nd','Date']);
$htex->parse($content);
# Examine all matching tables
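foreach my $table ($htex->table_states) {
    # print "Table (", join(',', $table->coords), "):\n";
    foreach my $row ($table->rows) {
        # print join(',', @$row), "\n";
        # $row->[0] is a single element; @$row[0] is a one-element slice and warns under -w
        print $row->[0], $row->[1], $row->[2], "\n";
        # print Dumper($row->[1]);   # needs "use Data::Dumper;"
    }
}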
On the site the table looks like the following:
Testing...
1st  Location  2nd  Defined  Date        AT  Test
1    IL        2    Yes                      done
2    IL        3    Yes      03/10/2012  3   not done
3    IL        4    Yes                  4   done
4    IL        5    Yes                      done
When I view the page source, it has:
<p align="center">Testing...</p>
<p align="center">
<table align="center" width="60%" border="1">
<tr align="CENTER" class="title">
<td width="10%" class="title">1st</td>
<td width="20%" class="title">Location</td>
<td width="10%" class="title">2nd</td>
<td width="10%" class="title">Defined</td>
<td width="10%" class="title">Date</td>
<td width="30%" class="title">AT</td>
<td width="10%" class="title">Test</td>
</tr>
<tr align="CENTER">
<td width="10%">
1
</td>
<td width="20%">
IL
</td>
<td width="10%">
2
</td>
<td width="10%">
Yes
</td>
<td width="10%">
</td>
<td width="30%">
</td>
<td width="10%">
done
</td>
</tr>
<tr align="CENTER">
<td width="10%">
2
</td>
<td width="20%">
IL
</td>
<td width="10%">
3
</td>
<td width="10%">
Yes
</td>
<td width="10%">
03/10/2012
</td>
<td width="30%">
3
</td>
<td width="10%">
not done
</td>
</tr>
</table>
</p>
And the Perl script output looks like:
Table (1,1):
1
2
Â
and so on, with extra newlines and spacing.
The issues I see are the stray characters, and that where the date is empty the cell holds another character instead of the null I would need to pass to a query. What I need to do is check the first and second elements of each row against the ones in the DB.
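For the stray characters and the empty date, a minimal sketch of a cleanup step might look like the following. It assumes the "Â" comes from a non-breaking space (&nbsp;) in the empty cells, which the pasted source does not show, so treat that as a guess. The helper name clean_cell is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise one scraped cell.
# Returns undef for cells with no visible content, so the value
# can later be bound to a query as NULL.
sub clean_cell {
    my ($raw) = @_;
    return undef unless defined $raw;
    my $text = $raw;
    # Assumption: the odd character is a non-breaking space. This matches
    # U+00A0, and also the raw UTF-8 byte pair C2 A0 if the page was never decoded.
    $text =~ s/\xC2?\xA0/ /g;
    # Trim leading/trailing whitespace, including the newlines around each value.
    $text =~ s/^\s+|\s+$//g;
    return length($text) ? $text : undef;
}

# Sample cells shaped like the ones in the page source above.
my @cells = map { clean_cell($_) } ("\n1\n", "\n2\n", "\n\xA0\n");
print join(',', map { defined $_ ? $_ : 'NULL' } @cells), "\n";
```

In the scraping loop this would be applied to each element of @$row before printing or comparing.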
My thinking is to save all the row[0] values in one array, the row[1] values in another, and so on,
then loop over the arrays one element at a time and pass each to a query to check.
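A rough sketch of that loop, under some assumptions: it keeps each row together as one array reference rather than using one array per column, since the first and second values have to be checked as a pair. The %in_db hash and exists_in_db are hypothetical stand-ins for the real database lookup, and the table and column names in the comment are made up.

```perl
use strict;
use warnings;

# Rows in the shape HTML::TableExtract returns for headers ['1st','2nd','Date'],
# assumed already cleaned; undef stands for an empty Date cell.
my @rows = (
    [1, 2, undef],
    [2, 3, '03/10/2012'],
);

# Hypothetical stand-in for the database: "first|second" pairs that exist.
my %in_db = ('1|2' => 1);

# Hypothetical lookup. With DBI this could instead be a prepared statement, e.g.
#   my $sth = $dbh->prepare('SELECT 1 FROM some_table WHERE first = ? AND second = ?');
#   $sth->execute($first, $second);
#   my ($found) = $sth->fetchrow_array;
# where some_table, first and second are invented names.
sub exists_in_db {
    my ($first, $second) = @_;
    return exists $in_db{"$first|$second"} ? 1 : 0;
}

# Collect the rows whose (1st, 2nd) pair is not in the database.
my @missing;
for my $row (@rows) {
    my ($first, $second, $date) = @$row;
    next unless defined $first && defined $second;   # skip incomplete rows
    push @missing, $row unless exists_in_db($first, $second);
}

for my $row (@missing) {
    my $date = defined $row->[2] ? $row->[2] : 'NULL';
    print "not in DB: $row->[0],$row->[1] (date: $date)\n";
}
```

With a real database handle, binding an undef value through a placeholder sends NULL, which is why empty cells are kept as undef rather than as empty strings.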
Any pointers? If I could get a dummy example or something based on the code above, I would appreciate it. Thanks!