Tek-Tips is the largest IT community on the Internet today!

How do you read in an HTML page from another site?

Status
Not open for further replies.

fortytwo (Technical User, GB, Apr 18, 2000)
I want to create a script that reads in an HTML page from another site (like [link missing in the original] ... it, it's fun) and replaces certain elements/words with others.

Displaying the file is easy; the question is how do I read a web page that is not on the same server into an array/variable? I can already do this for files on the same server.

Also, when I use:

    $variable =~ s/\#\#(bit to replace)\#\#/$replacement/;

it replaces all of the text, including the two sets of ##.

Using tr/ has the same effect.

Any suggestions?

fortytwo
will@hellacool.co.uk
 
Nifty, thanks. Any more uses of the get command?

Also, any ideas on the following? When I use:

    $variable =~ s/\#\#(bit to replace)\#\#/$replacement/;

it replaces all of the text, including the two sets of ##.

Using tr/ has the same effect.

Any suggestions?

Thanks

fortytwo
 
About the connect error: I dunno. I have only recently started using the LWP::Simple module myself. I do know that LWP::Simple is a limited treatment of libwww-perl (LWP). If you are having problems, you may need to use a more robust portion of LWP than LWP::Simple. Check out 'perldoc LWP::Simple' for more.

About the pattern matching:

    $variable =~ s/\#\#(bit to replace)\#\#/$replacement/;

The s/find_pattern/replacement/ syntax replaces everything matched between the first pair of slashes with the text between the second pair. The '()' around your 'bit to replace' has no effect on this behavior; the parens only capture the 'bit to replace' in $1. The ## markers are part of the match, so they are replaced too. If you want to keep the #'s, you have to put them back in the replacement:

    $variable =~ s/\#\#(bit to replace)\#\#/\#\#$replacement\#\#/;

Not quite as ugly, but a little slower:

    $variable =~ s/(\#\#)bit to replace\#\#/$1$replacement$1/;

This catches the '##' in $1 and includes it in the replacement string.

keep the rudder amid ship and beware the odd typo
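To make the difference concrete, here is a small sketch of the two substitutions described above, plus a third variant using lookaround that is not from the original reply. The sample string and the value of $replacement are made up for illustration. The # characters are left unescaped here because, without the /x modifier, # has no special meaning inside a pattern.

```perl
use strict;
use warnings;

my $replacement = 'new text';    # illustrative value, not from the thread

# 1. Write the ## markers back in on the replacement side.
my $kept = 'before ##bit to replace## after';
$kept =~ s/##bit to replace##/##$replacement##/;

# 2. Capture the marker once and reuse it through $1.
my $captured = 'before ##bit to replace## after';
$captured =~ s/(##)bit to replace##/$1$replacement$1/;

# 3. Lookaround (an alternative, not from the reply above): the markers
#    are required as context but are not part of the matched text, so
#    only the text between them is replaced.
my $lookaround = 'before ##bit to replace## after';
$lookaround =~ s/(?<=##)bit to replace(?=##)/$replacement/;

print "$kept\n$captured\n$lookaround\n";
```

All three leave the ## markers in place. Adding the /g modifier replaces every occurrence rather than only the first. As for tr/, it transliterates individual characters and does not match patterns, so it is not a substitute for s/// here.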
 
I have tried this script, but it only gives the output of the header and footer subroutines. I guess I am doing something stupid. Any suggestions?

    #!/usr/local/bin/perl

    use LWP::Simple;
    $URL = '[URL missing from the original post]';
    $content = get("$URL");

    &html_header;

    print $content;

    &html_trailer;
    #######################################################
    #sub routines

    sub html_header {
       print "Content-type: text/html\n\n";
       print "<html><head><title>Testing...</title></head>\n";
       print "<body>\n";
       print "<hr><br><br>\n";
    }

    sub html_trailer {
       print "<br><br><hr>\n";
       print "</body></html>\n";
    }

Notice the excellent use of the tt tags :)

fortytwo
 
Me again. I figured out why I was getting the connect error. It is because of the proxy server where I'm working. So if anyone else is having trouble with this, here is what you do:

    use LWP::UserAgent;
    $ua = new LWP::UserAgent;
    $ua->proxy(http => '[proxy URL missing from the original post]');
    $req = new HTTP::Request('GET', '[target URL missing from the original post]');
    [variable name missing from the original post] = $ua->request($req)->as_string;
 
I am not getting a connect error; it is just not displaying anything except what is in the subs. The script is on a shared server, not my own. Here is the URL:

[link missing in the original]

fortytwo
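One possible cause worth ruling out for output that shows only the header and footer: get() from LWP::Simple returns undef when the fetch fails, and printing undef prints nothing (with warnings enabled it also reports "Use of uninitialized value in print"). The sketch below checks for that case explicitly. The page_body helper and the command-line URL argument are hypothetical, introduced only for illustration, and this is not a diagnosis of the script above.

```perl
use strict;
use warnings;

# Decide what to print for the page body. LWP::Simple's get() returns
# undef when the fetch fails, so that case gets a visible message
# instead of silently printing nothing. (page_body is a hypothetical helper.)
sub page_body {
    my ($content, $url) = @_;
    return $content if defined $content;
    return "<p>Could not fetch $url</p>\n";
}

# Fetch only when a URL is passed on the command line. This is an
# assumption made so the sketch is easy to try outside a CGI environment.
if (defined(my $url = shift @ARGV)) {
    require LWP::Simple;
    my $content = LWP::Simple::get($url);
    print "Content-type: text/html\n\n";
    print page_body($content, $url);
}
```

Run it with a URL as the only argument. If the fallback message appears, the fetch itself is failing (a proxy or firewall is one possibility), and LWP::UserAgent exposes the response status for a more detailed look.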
 
I am having a similar problem.
I am trying to connect to yahoo.com and submit my user/pass. I found out where I have to submit it, but I get the following error:

Use of uninitialized value in print....

Also, how can I use form_query from behind a firewall?

Thanks a lot
George
 