Web Crawl Problem - Vb.NET

Status
Not open for further replies.

asibin2000

Programmer
Dec 10, 2007
22
US
I'm using AxInterop.SHDocVw as my browser.

I'm able to run this code:
Code:
objDoc = WebBrowser1.Document

        ' Create a collection of all the "input" elements in the page.
        Dim wbrAll As mshtml.IHTMLElementCollection = objDoc.getElementsByTagName("input")
        ' Create an object that will be a single instance of an input element
        Dim wbrElm As mshtml.IHTMLElement

        ' Create a few variables to hold values from the input element
        Dim strName As String
        Dim strId As String
        Dim strvalue As String
        Dim strType As String

        For Each wbrElm In wbrAll

            ' Copy the element's attributes and text into our variables
            strName = wbrElm.getAttribute("name")
            strId = wbrElm.id
            strvalue = wbrElm.innerText
            strType = wbrElm.getAttribute("type")
            
            If strName = "user" Then
                wbrElm.innerText = "testuser"
            End If
        Next

This works without problems on the first page of the web application we are automating.

The second page opens and has the same kind of markup (lots of embedded JavaScript updating values).

When I run the above code against it and just cycle through the elements, I get zero "input" elements. So apparently something is preventing this page from being scanned this way. If I view the source of that page, I see several "input" elements, but I can't access or reference them like I can on other pages.

So a few questions:
1. Has anyone ever stumbled across this problem before?
2. I'm wondering whether the web app's elements are being displayed through a single overall table, so that scanning the page for "input" elements finds none. I can't figure out whether this is some kind of JavaScript security measure or something new.
3. Could they be putting the input elements in some kind of sub form?
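
On question 3, one way to check whether the inputs live in a sub-document is to walk the top-level document's frames collection and count the "input" elements in each frame. The following is only a sketch: it assumes a project reference to the Microsoft.mshtml interop assembly, and `DescribeFrames` and `FrameScanSketch` are hypothetical names, not part of any library. Frames served from a different domain are expected to refuse access, which the sketch reports instead of failing.

```vbnet
Imports System
Imports System.Text
Imports mshtml

Module FrameScanSketch
    ' Hypothetical helper (not from the original post): lists each frame of
    ' topDoc with the number of <input> elements found in its document.
    Function DescribeFrames(ByVal topDoc As IHTMLDocument2) As String
        Dim report As New StringBuilder()
        Dim frames As IHTMLFramesCollection2 = topDoc.frames

        For i As Integer = 0 To frames.length - 1
            Dim index As Object = i   ' item() takes its index ByRef as Object
            Try
                Dim frameWindow As IHTMLWindow2 = CType(frames.item(index), IHTMLWindow2)
                ' getElementsByTagName is exposed on IHTMLDocument3
                Dim frameDoc As IHTMLDocument3 = CType(frameWindow.document, IHTMLDocument3)
                Dim inputs As IHTMLElementCollection = frameDoc.getElementsByTagName("input")
                report.AppendLine("frame(" & i & "): " & inputs.length & " input element(s)")
            Catch ex As UnauthorizedAccessException
                ' Assumption: the frame comes from another domain, so the
                ' browser denies script-style access to its document.
                report.AppendLine("frame(" & i & "): access denied")
            End Try
        Next

        Return report.ToString()
    End Function
End Module
```

It could be called as `DescribeFrames(CType(WebBrowser1.Document, IHTMLDocument2))` once the page and its frames have finished loading. If the top-level scan finds zero inputs but a frame reports several, the elements are in that frame's own document rather than being protected.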


Thanks in advance for any advice!

Lee
 
I was able to identify the hidden frame where the elements are located but I can't seem to address the elements within the frame.
 
I tried that, but then you run into the session ID problem.

The problem is just being able to programmatically address the "input" element.

So when I drill down into the page - I can address non-framed input and image click elements no problem.

And I can scan the frames on the more complex page.

So for frame(0), I need to be able to scan the elements within it and set the innerText of a specific "input" element.
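
A minimal sketch of what that could look like with the mshtml interfaces, assuming a reference to the Microsoft.mshtml interop assembly and a frame from the same domain as the parent page (a cross-domain frame would be expected to throw an access-denied error instead). `SetFramedInput` and `FrameInputSketch` are hypothetical names. The sketch sets the element's "value" attribute rather than innerText, which is a substitution on my part and not what the code above uses.

```vbnet
Imports mshtml

Module FrameInputSketch
    ' Hypothetical helper: finds the <input> named inputName inside the frame
    ' at frameIndex and sets its value. Returns True if a match was found.
    Function SetFramedInput(ByVal topDoc As IHTMLDocument2, _
                            ByVal frameIndex As Integer, _
                            ByVal inputName As String, _
                            ByVal newValue As String) As Boolean
        Dim index As Object = frameIndex   ' item() takes its index ByRef as Object
        Dim frameWindow As IHTMLWindow2 = CType(topDoc.frames.item(index), IHTMLWindow2)

        ' Each frame has its own document; the top-level document's
        ' getElementsByTagName does not see elements inside it.
        Dim frameDoc As IHTMLDocument3 = CType(frameWindow.document, IHTMLDocument3)

        For Each elm As IHTMLElement In frameDoc.getElementsByTagName("input")
            ' TryCast yields Nothing when the attribute is not a string
            Dim name As String = TryCast(elm.getAttribute("name"), String)
            If name = inputName Then
                elm.setAttribute("value", newValue, 0)
                Return True
            End If
        Next

        Return False
    End Function
End Module
```

For the case described here, the call might be `SetFramedInput(CType(WebBrowser1.Document, IHTMLDocument2), 0, "user", "testuser")`, made only after the frame's document has finished loading.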

There is a 3rd party class that someone is selling that does this for you, but it's expensive and I don't need all the features - just need this one thing lol.

Thanks for your reply though.

Lee
 