How to read in HTML source code and parse HTML?

Status
Not open for further replies.

dannydanny

IS-IT--Management
Oct 9, 2002
109
DK
Hi,

There is a webpage with a table of data that I would like to retrieve regularly so that I can update my Excel file.

Is there a way in VBS to "open" a URL and read in the HTML source code for parsing?

Thanks for any info,
Danny
 
Hi,
I usually use this:

Window.navigate "URL" or Window.open "URL"
or
Window.open "URL", "targetwindowname"

Then when the document is loaded, you can access the content with

Window.document.body.innerHTML
or
Window.document.body.outerHTML

If you want everything in the page, use

Window.document.all.tags("html").item(0).outerhtml

You can reach and parse every element in the window using the window and document object model (see MSDN).

Take care with what you reference. When your script runs in a given window, "window.document..." and similar expressions apply to the window that contains the script. If you want to reference another window or frame, address it explicitly.
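Since the original question is about a table of data, here is a minimal sketch of walking the document model to pull out the cells. It assumes the data sits in the first table of the page and that the page is already loaded; TableToText is a hypothetical helper name made up for this example, and doc stands for any loaded document object such as window.document.

```vbscript
' Sketch: return the text of the first table in a loaded document,
' one row per line, cells separated by tabs.
' "doc" is any loaded HTML document object, e.g. window.document.
Function TableToText(doc)
    Dim tables, tbl, r, c, line, result
    Set tables = doc.getElementsByTagName("table")
    If tables.length = 0 Then
        TableToText = ""          ' no table found
        Exit Function
    End If
    Set tbl = tables.item(0)      ' assumption: the data is in the first table
    result = ""
    For r = 0 To tbl.rows.length - 1
        line = ""
        For c = 0 To tbl.rows.item(r).cells.length - 1
            If c > 0 Then line = line & vbTab
            line = line & Trim(tbl.rows.item(r).cells.item(c).innerText)
        Next
        result = result & line & vbCrLf
    Next
    TableToText = result
End Function
```

For example, alert TableToText(document) would show the rows of the first table in the current page. Tab-separated text was chosen only because it pastes cleanly into Excel; pick whatever separator suits your file.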

When a script addresses its own window, you can omit the word "window" and write
navigate "URL" or document.body... or document.all

Be aware of cross-domain security as well. If your script is in an HTML page stored on the client's disk, you cannot directly access the window.document object of the remote page to parse it. To do that, use an Internet Explorer ActiveX component, like this:

In your local HTML parsing page, write this script:

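<script language=vbscript>
Dim IE, Myvar
' Start a separate Internet Explorer instance controlled by this script
Set IE = CreateObject("InternetExplorer.Application")
IE.navigate "URL"
' Wait until the document is fully loaded (ReadyState 4 = complete)
Do While IE.Busy
Loop
Do While IE.ReadyState <> 4
Loop

' Read the full source of the loaded page, root element included
Myvar = IE.document.all.tags("html").item(0).outerHTML
alert Myvar
</script>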

The two loops are there to wait until the document is fully downloaded. I haven't yet found how to capture the onload event of the child IE window... if somebody knows, please let me know.
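On the open question of catching the load event: one possibility, if the script can run as a standalone .vbs file under Windows Script Host instead of inside an HTML page, is to let WSH connect the browser's events to the script. The sketch below is untested here and makes several assumptions: it uses WScript.CreateObject with an event prefix ("IE_" is an arbitrary choice), it relies on the browser's DocumentComplete event, and the variable name done is made up for the example. "URL" is the same placeholder as in the script above.

```vbscript
' Run with cscript.exe or wscript.exe. WScript.CreateObject is not
' available inside an HTML page.
Option Explicit
Dim IE, done
done = False

' The second argument is an event prefix: WSH calls Subs named
' IE_<EventName> when the browser raises that event.
Set IE = WScript.CreateObject("InternetExplorer.Application", "IE_")
IE.Navigate "URL"   ' placeholder address

' Sleep instead of spinning, until the event has fired and the
' top-level document reports ReadyState 4 (complete).
Do Until done And IE.ReadyState = 4
    WScript.Sleep 100
Loop

WScript.Echo IE.document.documentElement.outerHTML
IE.Quit

' Raised when a document finishes loading. Pages with frames raise it
' once per frame, which is why the loop also checks ReadyState.
Sub IE_DocumentComplete(pDisp, URL)
    done = True
End Sub
```

Compared with the two empty loops, the Sleep call keeps the script from using a full CPU while it waits. If the event never arrives the loop would wait forever, so a real script would probably want a timeout counter as well.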
 