Viewing HTML source of webpages?

Status
Not open for further replies.

c4n

Programmer
Mar 12, 2002
110
SI
Can someone please give me some simple Perl code that will let me do this:

- open a URL of my choice (not a file on my server!)
- read and save the HTML source code into a file

Thanks!
 
Code:
use LWP::Simple;

$url = "www.google.com";
$filename = "output.html";

$html = get($url);
open FILE, ">$filename";
print FILE $html;
close FILE;
----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
 
Oops! $url should equal "http://www.google.com" for the example. The full URL (including http://) is needed for it to work. There's no error checking on opening the local file or retrieving the remote file, so add that if it concerns you.
----------------------------------------------------------------------------------
...but I'm just a C man trying to see the light
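
The error checking mentioned above could look roughly like this. It is only a sketch: the sub name save_page is made up for illustration, and writing the file as UTF-8 is an assumption about what you want on disk. It relies on get() from LWP::Simple returning undef when the request fails.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: fetch $url and write its HTML source to $filename.
# Dies with a message if either the download or the file write fails.
sub save_page {
    my ($url, $filename) = @_;

    # get() returns undef when the page cannot be retrieved
    my $html = get($url);
    die "Could not retrieve $url\n" unless defined $html;

    # Three-argument open with a lexical filehandle, checked for failure.
    # Assumption: the output file should be UTF-8 encoded.
    open(my $fh, '>:encoding(UTF-8)', $filename)
        or die "Could not open $filename: $!\n";
    print {$fh} $html;
    close($fh) or die "Could not close $filename: $!\n";

    # Number of characters written
    return length $html;
}

# Example call, using the URL and filename from the post above:
# save_page("http://www.google.com", "output.html");
```

Wrapping the call in eval { ... } lets the caller report the failure instead of exiting.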
 
Thanks - I already got it working though, like this:

use LWP::Simple qw(getstore);
$file = "file.txt";
$url = "...";
$process = getstore($url, $file);

I do have another question now. I'm trying to parse this file with HTML::Parser (to strip off the HTML tags) and save the text into a new file (file2.txt), but I can't seem to get it working. Any ideas what this code should look like? (There are no examples in perldoc, and I'm not too good at OO programming yet either.)

Thanks!
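
One way the tag stripping asked about above might be done with HTML::Parser is sketched here, using its version 3 handler interface. The sub names strip_tags and strip_file are made up for illustration, and reading and writing the files as UTF-8 is an assumption. Skipping the contents of script and style elements is also a choice of this sketch, not something the question asked for.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return the text of an HTML string with the tags removed.
sub strip_tags {
    my ($html) = @_;
    my $text = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # Called for each run of text; 'dtext' passes it with entities decoded
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Do not report the contents of <script> and <style> as text
    $p->ignore_elements(qw(script style));

    $p->parse($html);
    $p->eof;    # flush anything the parser is still holding
    return $text;
}

# Hypothetical helper: read HTML from $in and write the stripped text to $out.
# Assumption: both files are UTF-8.
sub strip_file {
    my ($in, $out) = @_;

    open(my $ifh, '<:encoding(UTF-8)', $in)
        or die "Could not open $in: $!\n";
    my $html = do { local $/; <$ifh> };    # slurp the whole file
    close($ifh);

    open(my $ofh, '>:encoding(UTF-8)', $out)
        or die "Could not open $out: $!\n";
    print {$ofh} strip_tags($html);
    close($ofh) or die "Could not close $out: $!\n";
}

# Example call, using the filenames from the post above:
# strip_file("file.txt", "file2.txt");
```

The handler is just a closure that appends to a string, so no subclassing is needed.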
 