Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Read text from a web-site (2)

How To

Read text from a web-site (2)

by  RyanWoodward  Posted    (Edited  )
This FAQ is a revision the following older faq. It simply corrects a capitalization/case error and introduces a newer version of the OpenUrl function:

Read text from a web-site
faq705-4579
Posted: 15 Dec 03 by PeteJohnson
http://www.tek-tips.com/faqs.cfm?fid=4579

The older FAQ uses the incorrect case in C DLL function declaration. C is case sensitive. The problem may also be that tek-tips' formatter is changing the C function name automatically. The function HttpQueryInfoA need to start with a capital letter.

For keywording/search purposes, this FAQ answers the following questions:
[ul]
[li]How to programmatically download the text from a live website in Visual Basic/VBA (Visual Basic for Applications)[/li]
[li]How to read text from a website in Visual Basic/VBA[/li]
[li]How to use wininet.dll to download the HTML source of live web page in Visual Basic/VBA[/li]
[/ul]

Revised FAQ follows:

I had to write some code in VB6 to import currency exchange rate data into a database and I thought I'd post it here in case anyone else needed to do the same thing. The API part of the code was from Ryan Woodward who has many thanks from me! The Text from the website can be retrieved and placed in an text field or string by using a simple function call from within any form event, eg. if you had a command button called btnRetrieve you could put it in btnRetrieve_Click(). I then used Split, Left, Mid and Right to extract the parts of the data which I needed. In the following, txtURL is the web-site address and txtURLSource is the string variable you are placing it into.

txtURLSource = OpenUrl(txtURL)

To implement it, create a module called rwInetXfer and copy the following into it. It worked a treat for me, both in VB6 and when I imported it into Access2K.

Code:
'--rwInetXfer
'::  ::ver 3.1vb::
'::  ::orig. author R. Woodward::
'::  ::Ryan_Woodward@yahoo.com::
'::
':: IMPORTANT!:
'::   If you're downloading this from www.tek-tips.com then you
'::   may have to change the case of a function declaration because
'::   they're using an auto code to html formatter that accidentally
'::   might change the case-sensitive C function call name
'::   because it contains the keyword "http".
'::   1) Find the line that starts with
'::>> Private Declare Function httpQueryInfo ...
'::   2) Make sure that the "H" is capitalized and the "ttp"
'::      is lowercase in BOTH the function name AND the Alias name
'::      after the 'Lib "wininet.dll"' portion.
'::   OR JUST REPLACE IT WITH
'::>>  Private Declare Function H_t_t_pQueryInfo Lib "wininet.dll" Alias "H_t_t_pQueryInfoA" _
'::>>      (ByVal hHttpRequest As Long, ByVal lInfoLevel As Long, ByRef sBuffer As Any, _
'::>>      ByRef lBufferLength As Long, ByRef lIndex As Long) As Integer
'::   AND THEN *REMOVE* THE UNDERSCORES FROM H_t_t_p
'::        i just put the underscores in to "trick" their formatter
'::        that is incorrectly changing the first H to lowercase
'::
'::WHY USE:
'::  Many versions of the Microsoft Internet Transfer Control
'::  that shipped with VB circa ver VB6 are buggy or unreliable
'::  Here's a nice substitute with source code
'::  Currently only implements OpenUrl
'::
'::DESC:
'::  Internet File Transfer Object
'::  Retrieve internet files over HTTP using
'::  Windows system DLLs (wininet) and system network config
'::
'::E.G.:
'::  Dim inet As rwInetXfer
'::  debug.Print "HTML SOURCE-"
'::  debug.Print inet.OpenUrl("http://www.yahoo.com")
'::
Option Explicit
Const ClassName = "rwInetXfer"

Public DontUseCache As Boolean

'--
Private Enum InfoLevelEnum
    HTTP_QUERY_CONTENT_TYPE = 1
    HTTP_QUERY_CONTENT_LENGTH = 5
    HTTP_QUERY_EXPIRES = 10
    HTTP_QUERY_LAST_MODIFIED = 11
    HTTP_QUERY_PRAGMA = 17
    HTTP_QUERY_VERSION = 18
    HTTP_QUERY_STATUS_CODE = 19
    HTTP_QUERY_STATUS_TEXT = 20
    HTTP_QUERY_RAW_HEADERS = 21
    HTTP_QUERY_RAW_HEADERS_CRLF = 22
    HTTP_QUERY_FORWARDED = 30
    HTTP_QUERY_SERVER = 37
    HTTP_QUERY_USER_AGENT = 39
    HTTP_QUERY_SET_COOKIE = 43
    HTTP_QUERY_REQUEST_METHOD = 45
    HTTP_STATUS_DENIED = 401
    HTTP_STATUS_PROXY_AUTH_REQ = 407
End Enum

Private Const INTERNET_OPEN_TYPE_PRECONFIG = 0
Private Const INTERNET_OPEN_TYPE_DIRECT = 1
Private Const INTERNET_OPEN_TYPE_PROXY = 3

Private Const SCUSERAGENT = "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1)"
Private Const INTERNET_FLAG_RELOAD = &H80000000
Private Const INTERNET_FLAG_ASYNC = &H10000000  ' this request is asynchronous (where supported)

Private Const INTERNET_FLAG_FROM_CACHE = &H1000000   ' use offline semantics
Private Const INTERNET_FLAG_OFFLINE = INTERNET_FLAG_FROM_CACHE
Private Const INTERNET_FLAG_EXISTING_CONNECT = &H20000000


'--WinAPI sleep function
'::
':: Standard win API function to pause program execution for
':: the specified amount of time in miliseconds
'::
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)


'--HttpQueryInfo
'::DESC:
'::  Queries for information about an HTTP request.
'::
Private Declare Function HttpQueryInfo Lib "wininet.dll" Alias "HttpQueryInfoA" _
    (ByVal hHttpRequest As Long, ByVal lInfoLevel As Long, ByRef sBuffer As Any, _
    ByRef lBufferLength As Long, ByRef lIndex As Long) As Integer


Private Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" _
    (ByVal sAgent As String, ByVal lAccessType As Long, ByVal sProxyName As String, _
    ByVal sProxyBypass As String, ByVal lFlags As Long) As Long

'--InternetOpenUrl
'::DESC:
'::  Open a handle for retrieving a URL
'::EG:
'::  hOpenUrl = InternetOpenUrl(hOpen, sUrl, vbNullString, 0, INTERNET_FLAG_RELOAD, 0)
'::  hOpenUrl = InternetOpenUrl(hOpen, sUrl, vbNullString, 0, INTERNET_FLAG_FROM_CACHE, 0)
'::  hOpenUrl = InternetOpenUrl(hOpen, sUrl, vbNullString, 0, INTERNET_FLAG_EXISTING_CONNECT, 0)
'::  hOpenUrl = InternetOpenUrl(hOpen, sUrl, vbNullString, 0, 0, 0)
'::
Private Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" _
    (ByVal hOpen As Long, ByVal sUrl As String, ByVal sHeaders As String, _
    ByVal lLength As Long, ByVal lFlags As Long, ByVal lContext As Long) As Long

Private Declare Function InternetReadFile Lib "wininet.dll" _
    (ByVal hFile As Long, ByVal sBuffer As String, ByVal lNumBytesToRead As Long, _
    lNumberOfBytesRead As Long) As Integer

Private Declare Function InternetCloseHandle Lib "wininet.dll" _
    (ByVal hInet As Long) As Integer

Private Declare Function InternetGetLastResponseInfo Lib "wininet.dll" Alias "InternetGetLastResponseInfoA" _
    (lpdwError As Long, ByVal lpszBuffer As String, ByVal lpdwBufferLength As Long) As Integer

'--OpenUrl(url)
'::DESC:
'::  Retrieve the page specified by "url"
'::  Returns string of page source
'::  On error, returns "error #"
'::     e.g. page not found returns "error 404"
'::
Public Function OpenUrl(ByVal sUrl As String) As String
    #If DEVREL < 1 Then
        On Error GoTo exitfunc
    #End If
    Dim s As String
    Dim sReadBuf As String * 2048   'a data buffer for InternetOpen fcns
    Dim bytesRead As Long
    Dim hInet As Long       'wininet handle
    Dim hUrl As Long        'url request handle
    Dim flagMoreData As Boolean
    Dim ret As String
    ' used for callling HttpQueryInfo
    Dim sErrBuf As String * 255
    Dim sErrBufLen As Long
    Dim dwIndex As Long
    ' return codes and err code saves
    Dim lastErr As Long
    Dim bRet As Boolean
    Dim wRet As Integer
    ' http status code
    Dim httpCode As Integer
    ' how many retry
    Dim retrynum_internetopenurl As Integer
    ' grab a handle for using wininet
    hInet = InternetOpen(SCUSERAGENT, INTERNET_OPEN_TYPE_PRECONFIG, vbNullString, vbNullString, 0)
    If Err.LastDllError <> 0 Then
        lastErr = Err.LastDllError
        ret = "error (wininet.dll," & lastErr & " on InternetOpen)"
        GoTo exitfunc
    End If
    ' retrieve the requested URL
    '   we might be retrying... update a counter var
retry_internetopenurl:
    retrynum_internetopenurl = retrynum_internetopenurl + 1
    If retrynum_internetopenurl > 3 Then GoTo exitfunc
    If DontUseCache Then
        hUrl = InternetOpenUrl(hInet, sUrl, vbNullString, 0, INTERNET_FLAG_RELOAD, 0)
    Else
        hUrl = InternetOpenUrl(hInet, sUrl, vbNullString, 0, 0, 0)
    End If
    If Err.LastDllError <> 0 Then
        lastErr = Err.LastDllError
        If lastErr = 2 Then
            'a hack, sometimes this seems to fail the first time but succeed on subsequent
            '   first noted the problem getting http://finance.yahoo.com/?u,  using XP/IE6.0.26
            Sleep (250)
            InternetCloseHandle (hUrl)
            GoTo retry_internetopenurl
        End If
        ret = "error (wininet.dll," & lastErr & " on InternetOpenUrl)"
        GoTo exitfunc
    End If
    ' get query info, this should give us a status code among other things
    sErrBufLen = 255
    bRet = HttpQueryInfo(hUrl, HTTP_QUERY_STATUS_CODE, ByVal sErrBuf, sErrBufLen, dwIndex)
    If Err.LastDllError <> 0 Then
        lastErr = Err.LastDllError
        ret = "error (wininet.dll," & lastErr & " on HttpQueryInfo)"
        GoTo exitfunc
    End If
    ' sErrBuf should now hopefully contain HTTP status code stuff
    ' if the call failed, no status info was returned (i.e. sErrBuf is empty)
    '   then throw error
    If sErrBufLen = 0 Or Not bRet Then
        ret = "error"
        GoTo exitfunc
    Else
        ' retrieve the HTTP status code
        httpCode = CInt(Left(sErrBuf, sErrBufLen))
        If httpCode >= 300 Then
            ret = "error " & httpCode
            GoTo exitfunc
        End If
    End If
    ' if we made it this far, then we can begin retrieving data
    flagMoreData = True
    Do While flagMoreData
        sReadBuf = vbNullString
        wRet = InternetReadFile(hUrl, sReadBuf, Len(sReadBuf), bytesRead)
        If Err.LastDllError <> 0 Then
            lastErr = Err.LastDllError
            ret = "error (wininet.dll," & lastErr & " on InternetReadFile)"
            If lastErr = 32 Then
                'this if/then seems to be necessary for retrieving some ASP pages
                '  they always seem to get an error 32 on the first access
                '  but subsequent accesses are then retrieved OK
                '  first observed on http://www.luckypix.com/
                Sleep (250)
                InternetCloseHandle (hUrl)
                GoTo retry_internetopenurl
            End If
            GoTo exitfunc
        End If
        If wRet <> 1 Then
            ret = "error (wininet.dll, InternetReadFile() returned " & wRet & ")"
            GoTo exitfunc
        End If
        s = s & Left$(sReadBuf, bytesRead)
        If Not CBool(bytesRead) Then flagMoreData = False
    Loop
    ret = s
exitfunc:
    If hUrl <> 0 Then InternetCloseHandle (hUrl)
    If hInet <> 0 Then InternetCloseHandle (hInet)
    OpenUrl = ret
End Function


Private Sub Class_Initialize()
    DontUseCache = False
End Sub


'--VERSION HISTORY
'::
':: Versions 1.0-3.0
'::    Unfortunately, I really didn't keep incremental notes.
'::    After version 1.0, added a retry loop and some code
'::    to catch that "error 32" with a pause and retry
':: Version 3.1
'::    added a prefix note to initial comments to fix the way the code
'::    copies and pastes from the tek-tips site.
'::
Register to rate this FAQ  : BAD 1 2 3 4 5 6 7 8 9 10 GOOD
Please Note: 1 is Bad, 10 is Good :-)

Part and Inventory Search

Back
Top