Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Parse HTML with VBA

Status
Not open for further replies.

TommyIndigo

Programmer
Sep 14, 2000
67
US
I'm trying to parse an HTML file, so I can eventually pull data into my Access database. I've come across DOM and MSXML, but not sure how to tackle this.

I can write my own logic once I can get to the tags. I will need to examine tag attributes, and values for the tags.

For example, one field I need is embedded in a DIV tag:

Code:
<DIV class=bizName>A to Z Appliances</DIV>

Therefore, I need to search for DIV tags that have the class attribute set to bizName, and get the value in between.

I need to do this via VB, and not Javascript or other languages.

Any ideas on a general approach?? Once I'm pointed in the right direction, I can dig deeper on my own. Sample code would be MOST appreciated! Thanks, TOM
 
What have you tried so far and where in your code are you stuck ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thanks for the prompt reply. I was being vague on purpose, because I'm not even sure what my general approach should be.

I was however able to create an object so I have access to the HTML itself. I'm not sure how to parse it from there. This is what I have so far:

Code:
Private Sub Command4_Click()

Dim link As String
Dim rawLocation As String
Dim rawCategory As String
Dim location As String
Dim category As String


'Open Yellowbook webpage, defaulting with selected search from form
rawCategory = Me.Text0.Value
rawLocation = Me.Text2.Value
location = Replace(rawLocation, " ", "+")
category = Replace(rawCategory, " ", "+")
link = "[URL unfurl="true"]http://www.yellowbook.com/search/?what="[/URL] + category + "&who=&where=" + location


Dim objIE: Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = False
objIE.Navigate link
Do Until objIE.ReadyState = 4
Loop



End Sub

I saw in my research DOM methods such as getElementsByTagName, not can't seem to implement them via VB. Not sure if that helps?
 
You may use the objIE.Document object ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
That seems to work. I can turn VISIBLE to TRUE, and I get the webpage and no errors.

I'm happy to use whatever approach works so I can get the data out of the tags. Any recommendations?
 
TommyIndigo,
PHV has you on the right path. You can get the collection of tags in a document by using [tt]document.all.tags("tag")[/tt]. Once you have the collection you can iterate through and check the [tt]className[/tt].
Code:
...
  Dim varTags As Variant, varTag As Variant
...
  Set varTags = objIE.document.all.tags("DIV")
  For Each varTag In varTags
    If varTag.Classname = "bizName" Then
      Debug.Print varTag.innerText
    End If
  Next varTag
...

Hope this helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top