Retrieve HTML pages

Status
Not open for further replies.

globos

Programmer
Nov 8, 2000
260
FR
Hi,

I want to retrieve the content of an HTML page. I tried the URLConnection, URL, and related classes from the java.net package. The request itself seems to work, but the web site I am fetching the page from detects that I am not using IE or Netscape and denies access.
I think the solution is to set an appropriate user agent for the request, but how can this be done in Java?

Here is the code I used, without the exception handling:

URL urlMySongBook = new URL ("...");
URLConnection connection = urlMySongBook.openConnection();

connection.setDoInput (true);

BufferedInputStream in = new BufferedInputStream (connection.getInputStream());
File fileOutput = new File ("E:\\temp\\test");
BufferedOutputStream out = new BufferedOutputStream (new FileOutputStream (fileOutput));

int dataRead;

// Stop at end of stream so that the -1 marker is not written to the file.
while ((dataRead = in.read ()) != -1)
{
    out.write (dataRead);
}
in.close ();
out.close ();

--
Globos
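
One possible way to do what the post asks is to set the User-Agent request header on the URLConnection before reading from it, using setRequestProperty. The sketch below is only an illustration under assumptions: the class name PageFetcher, the helper methods, and the example user-agent string are hypothetical and do not come from the post, and the value a particular site accepts is not known. The copy loop also stops at end of stream, so the -1 marker is never written to the file.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class, not part of the original post.
final class PageFetcher {

    // Assumption: an example browser-like value; the string the site expects is unknown.
    static final String EXAMPLE_USER_AGENT =
            "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)";

    // Opens a connection and sets the User-Agent header before the request is sent.
    static URLConnection openWithUserAgent(URL url, String userAgent) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    // Copies every byte from in to out, stopping at end of stream.
    static void copy(InputStream in, OutputStream out) throws IOException {
        int dataRead;
        while ((dataRead = in.read()) != -1) {
            out.write(dataRead);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageFetcher <url> <output-file>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        URLConnection connection = openWithUserAgent(url, EXAMPLE_USER_AGENT);
        // try-with-resources closes both streams even if the copy fails.
        try (InputStream in = new BufferedInputStream(connection.getInputStream());
             OutputStream out = new BufferedOutputStream(new FileOutputStream(args[1]))) {
            copy(in, out);
        }
    }
}
```

The header has to be set before getInputStream() is called, because that call sends the request. Whether a given site then allows access depends on what it checks, which this sketch cannot guarantee.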
 