Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations John Tel on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Web Browser Control 2

Status
Not open for further replies.

aw23

Programmer
Nov 26, 2003
544
IL
I need to populate an Excel file with values from a browser. I have the links in an Excel file, I need to go through each one and take all the data and put it into a spreadsheet. I understand this is possible with a Web Browser control. However I know nothing about it. I'd really like to do it from Excel VBA, so I know I'm in the wrong forum. However when I did a search I came up with posts in this forum, and I don't know if I can use it in Excel.
Would anyone be able to give me a starting point?

Thanks so much
 
>>I need to populate an Excel file with values from a browser.

We can do that...

>>I have the links in an Excel file, I need to go through each one and take all the data and put it into a spreadsheet.

Can you expand on this...?
Are they in a list of cells?
If so (I'm guessing they are) I will show you how to select them, Then loop through the Selection

>>I understand this is possible with a Web Browser control. However I know nothing about it.

First, (Assuming you are using the VBA IDE in excel) you'll need to set a reference (Tools>References...) to:
Microsoft Internet Controls
and
Microsoft HTML Object Library
...this will contain your WebBrowser Object that you need...

>>I'd really like to do it from Excel VBA, so I know I'm in the wrong forum. However when I did a search I came up with posts in this forum, and I don't know if I can use it in Excel.

No Problem. Just as long as YOU KNOW that there is a VBA forum...
This particular question leans more towards the VB5/6 side anyway (in other words, I would do the same thing)
And... Yes you CAN use it in excel...

>>Would anyone be able to give me a starting point?

Sure...

after setting the reference that you need (mentioned above)
You need to create a macro Sub and create the Object...

Code:
Sub GetInfoFromWeb()
  Dim wb1 As WebBrowser
  Dim Doc As HTMLDocument
  Dim Cell As Range
  For Each Cell In Selection
    Set wb1 = CreateObject("InternetExplorer.Application")
    wb1.Navigate2 Site
    Do: DoEvents: Loop Until wb1.ReadyState = READYSTATE_COMPLETE
    Set Doc = wb1.Document
    'Get Data from document...
    'Process and write to output...
    wb1.Quit
    Set wb1 = Nothing
  Next
End Sub

Now, for further help, I need to have an example of the target site(s) source code, to know how you need to access the data... (there are several ways, some easier than others)

For a simple Example...
If you want to dump the source to a new sheet you can do this:
Code:
Sub GetInfoFromWeb()
  Dim wb1 As WebBrowser
  Dim Doc As HTMLDocument
  Dim Sheet As Worksheet
  Dim SheetNum As Integer
  Dim Cell As Range
  For Each Cell In Selection
    If Cell.Text <> "" Then
      Set wb1 = CreateObject("InternetExplorer.Application")
      'wb1.Visible = True
      wb1.Navigate2 Cell.Text
      Do: DoEvents: Loop Until wb1.ReadyState = READYSTATE_COMPLETE
      Set Doc = wb1.Document
      Set Sheet = Worksheets.Add(After:=Worksheets.Item(Worksheets.Count))
      On Error Resume Next
      Do
        Err = 0
        SheetNum = SheetNum + 1
        Sheet.Name = "WebSheet" & SheetNum
      Loop Until Err = 0
      ActiveCell = Doc.body.innerHTML
      wb1.Quit
      Set wb1 = Nothing
      On Error GoTo 0
    End If
  Next
End Sub

Since there are seemingly, endless ways to create webpages, you would have to know what the source looks like to be able to work with the file...

The above example is to show that you do have access to the Document Object...


PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
Thank you so much. I am trying to view the source but for some reason, it is not in view. I will try again at work tomorow (I'm working at home today).
I have copied your code, except that I have tweaked it a little.
This is what I need to do. I have a link in the spreadsheet. Part of the string is car=1. When I go to the link there is info on that car that I must collect in the same spreadsheet (once I get the data I know how to enter it into Excel) There may be more than one car. So I need to change the string to car=2, car=3, etc. If the car does not exist than the page will still exist, however the box I am collecting data from will not be populated. So I need to check that, continue looping and adding to the link until the box is blank. The following is my code so far (based on your code)

Code:
Sub GetInfoFromWeb()
Dim wb1 As WebBrowser
Dim L As Range
  Dim Doc As HTMLDocument
  Dim Cell As Range
  Dim Site
  Dim x As Integer
  For Each L In Intersect(Range("L2:L65536"), ActiveSheet.UsedRange)
    x = 1
    Site = L.Value
    'while the website is not blank
    If InStr(Site, "car=" & x) <> 0 Then
        Site = Replace(Site, "car=" & x, "car=" & x)
        Set wb1 = CreateObject("InternetExplorer.Application")
        wb1.Navigate2 Site
        Do: DoEvents: Loop Until wb1.readyState = READYSTATE_COMPLETE
        Set Doc = wb1.document
        'Get Data from document...
        'Process and write to output...
        wb1.Quit
        Set wb1 = Nothing
        
    End If
    x = x + 1
    'before go to next one need to have some other loop going here
  Next
 
to get the source, use this:
Doc.body.innerHTML

As shown above^

Such as if you created a User Form in your excel document, and it has a TextBox (such as Text1) with multiline turned on...
You could say:
Load Form1
Form1.Text1 = Doc.body.innerHTML
Form1.Show



PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
Thanks so much. I apologize, I didn't realize that you had already showed me.
I have now gotten all the source code. I need to get certain data. Do I need to get the source code and then get the data from that or can I get it directly from the webpage by looking at the source code?
Here is a portion of the source code, let me know if that helps.

<TD vAlign=bottom><INPUT size=22 value=Albert name=First_Name></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Last Name:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value=Smith name=Last_Name></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Your Title:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value=Principal name=Your_Title></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Company:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value="Just A Test" name=Company></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Address 1:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value="185 Granville" name=Address_1></TD></TR>

<TR>

<TD vAlign=top align=right><SPAN id=ffields>Address 2:</SPAN></SPAN></TD>

<TD vAlign=top><TEXTAREA name=Address_2 wrap=virtual cols=18></TEXTAREA></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>City:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value=Chicago name=City></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>State/Province:</SPAN></SPAN></TD>

<TD vAlign=bottom>

<TABLE cellSpacing=0 cellPadding=0 border=0>

<TBODY>

<TR>

<TD vAlign=bottom><SELECT name=State> <OPTION value=AB>AB<OPTION value=AK>AK<OPTION value=AL>AL<OPTION value=APO>APO<OPTION value=AR>AR<OPTION value=AZ>AZ<OPTION value=BC>BC<OPTION value=CA>CA<OPTION value=CO>CO<OPTION value=CT>CT<OPTION value=DC>DC<OPTION value=DE>DE<OPTION value=FL>FL<OPTION value=GA>GA<OPTION value=GU>GU<OPTION value=HI>HI<OPTION value=IA>IA<OPTION value=ID>ID<OPTION value=IL>IL<OPTION value=IN selected>IN<OPTION value=KS>KS<OPTION value=KY>KY<OPTION value=LA>LA<OPTION value=LB>LB<OPTION value=MA>MA<OPTION value=MB>MB<OPTION value=MD>MD<OPTION value=ME>ME<OPTION value=MI>MI<OPTION value=MN>MN<OPTION value=MO>MO<OPTION value=MS>MS<OPTION value=MT>MT<OPTION value=NB>NB<OPTION value=NC>NC<OPTION value=ND>ND<OPTION value=NE>NE<OPTION value=NF>NF<OPTION value=NH>NH<OPTION value=NJ>NJ<OPTION value=NM>NM<OPTION value=NS>NS<OPTION value=NT>NT<OPTION value=NV>NV<OPTION value=NY>NY<OPTION value=OH>OH<OPTION value=OK>OK<OPTION value=ON>ON<OPTION value=OR>OR<OPTION value=PA>PA<OPTION value=PE>PE<OPTION value=PR>PR<OPTION value=QC>QC<OPTION value=RI>RI<OPTION value=SC>SC<OPTION value=SD>SD<OPTION value=SK>SK<OPTION value=TN>TN<OPTION value=TX>TX<OPTION value=UT>UT<OPTION value=VA>VA<OPTION value=VI>VI<OPTION value=VT>VT<OPTION value=WA>WA<OPTION value=WI>WI<OPTION value=WV>WV<OPTION value=WY>WY<OPTION value=YK>YK</OPTION></SELECT> </TD>

<TD vAlign=bottom><SPAN id=ffields><SPAN id=red>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Zip:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=10 value=60645 name=Zip></TD></TR></TBODY></TABLE></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Phone:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT maxLength=15 size=22 value=773-770-1299 name=Phone></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields>Fax:</SPAN></TD>

<TD vAlign=bottom><INPUT maxLength=15 size=22 value=773-773-8889 name=Fax></TD></TR>

<TR>

<TD vAlign=bottom align=right><SPAN id=ffields><SPAN id=red>Email:</SPAN></SPAN></TD>

<TD vAlign=bottom><INPUT size=22 value=AlSmith@Testing.com name=Email></TD></TR>

<TR>

<TD vAlign=top align=right><SPAN id=ffields>Comments:</SPAN></TD>

<TD vAlign=top><TEXTAREA name=Comments rows=3 wrap=virtual cols=18></TEXTAREA> </TD></TR>

<TR>

<TD>&nbsp;</TD>

 
check into these:
Code:
  Dim Elem As IHTMLElement
  Set Elem = Doc.getElementById("ID")             ' <TagName id="[b]ID[/b]" name="Name">

  Dim Elems As IHTMLElementCollection
  Set Elems = Doc.getElementsByName("Name")       ' <TagName id="ID" name="[b]Name[/b]">
  Set Elems = Doc.getElementsByTagName("TagName") ' <[b]TagName[/b] id="ID" name="Name">


PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
I tried both. Either I'm doing something wrong or it doesn't work. In both Elem=[object]
I looked through the help files and didn't find anything. I'm really stuck over here.
Thanks
 
I tried the following code to access the input boxes. I don't get any errors but the if statement never returns true. There are several input tags on the page.

For Each Element In Doc.all
If Element.tagName = "input" Then
info = Element.Type
End If
Next
 
>If Element.tagName = "input" Then
Try this.
[tt] If LCase(Element.tagName) = "input" Then[/tt]

- tsuji
 
try:
Code:
For Each Element In Doc.all.tags("input")
  info = Element.Type
Next


PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
Thanks, one more question. Is multi-threading a possibility? I am running this program on 26,000 rows of data. Is there anyway to have 10 or so going at once?

Thanks
 
One more question. We ran this program over the weekend. At some point (after 15,000) rows of data there was a bug. Based on the error description I believe that the bug was because the internet connection must've gone down momentarily and so it couldn't open the page. I'd like to have a msgbox when this occurs so that the user (not myself) can see and stop it and run it again. The following is the code:
Code:
Private Sub OpenSite(site)
Set wb1 = CreateObject("InternetExplorer.Application")
wb1.navigate site
Do: DoEvents: Loop Until wb1.readyState = READYSTATE_COMPLETE
Set Doc = wb1.document
On Error Resume Next
Do
  Err = 0
Loop Until Err = 0
End Sub

Where can I put in the msgbox?

Thanks
 
Well...
Code:
On Error Resume Next
Do
  Err = 0
Loop Until Err = 0

Is a waste of code...

The on error block in my code was to set the Sheet.Name in excel... which could return an error if that sheet number already existed, so it continued to increment until a valid number was found...

PLUS, you ALWAYS want to follow up:
On Error Resume Next
With:
On Error Goto 0
As I did in my code block above, Which will show the Error MsgBox where Resume Next hides it (until Goto 0 is met...)

What was the error description?

I have no idea where your error occured without it, which is required to know where to implement the error handling...

Although, removing the On Error block may do the trick... ;-)

BTW... I don't think you can multithread it, although I have never tried...

Visit My Site
PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
Thanks, I was actually wondering what that piece of code did. Sonce I'm not familiar with the web browser control, I thought I'd leave it in. But now I understand! I will take it out and see. I don't remember what the actual error was, if I get it again, I will post it.

Thanks again!
 
aw23,

In case the control reach readystate=4, usually the doc object will not be likely to provoke an error, only that it might not be the doc you anticipated if the "site" is down or the network traffic congestion...

In this case you might want to check
[1] if your doc is actually being redirected to dnserror.htm like this:
[tt]
if instr(1,doc.location,"dnserror.htm",1) then
msgbox "Site may be not available."
end if
[/tt]
[2] or, you may check the characteristic string like this:
[tt]
if instr(1,doc.body.innerHTML,"The page you are looking for is currently unavailable.",1) then
msgbox "Site may be not available."
end if
[/tt]
[3] or, you may check doc.location against the site variable see if they equal or different only by ending backslash, if you are looking for default page.
[tt]
if strcomp(doc.location,site,1) * strcomp(doc.location, site & "/",1) then
msgbox "Site may be not available."
end if
[/tt](I use * to simply thing.)

[4] or, something for some reasons due to links in document etc, the doc return may not be complete (sometimes doc.body.innerHTML is empty string, but not necessary so), you have to test if the resulting page is what you might anticipated...

One way or the other, if it is mission-critical, you have to check against these possibilities. Sometimes, gaining the patterns of what may go wrong may help you shortening these checks.

- tsuji
 
Ok, here is the error I am getting:
Automation Error
The remote procedure call failed.

I am working on debugging now, I will try yours tsuji.

 
CubeE101,

Thank you so much for your help. I have completed that project. I am now working on another one, the difference is that the source code for this one is a table. Let's say I loop through the table, but there are no names or ID's so how can I identify what I am looking for? Here is a sample row.

Code:
<TR VALIGN=TOP>
	 <td width="130" align="right" valign="top"> <P ALIGN=RIGHT><B><FONT SIZE="-1" FACE="Arial,Helvetica,Univers,Zurich BT">Member Name:&nbsp;</FONT></B></td>
	<td width="10" align="left" valign="top"><P><FONT COLOR="#000000" SIZE="-1" FACE="Arial,Helvetica,Univers,Zurich BT">&nbsp;</FONT></TD>
	<td width="250" align="left" valign="top"><P><font face="Arial,Helvetica,Univers,Zurich BT" size="-1" color="#32519D"><b>Company Name<BR>&nbsp;</b>
</FONT></TD>
</TR>
Let's say I loop through the tables. I want to find the value of the cell. If the value of the cell is 'Member Name:&nbsp;' then I want to retrieve the value of the next cell. Is this possible?

Thanks
 
Anyone know how to gather data from a table (my last post?)
Thanks!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top