Grabbing data from a website

Status
Not open for further replies.

valicon

MIS
Sep 26, 2000
125
0
0
US
Hi all,

I need to write a script that will go out to a URL, grab the data from the web page, and then save it as a text file on the local user's PC (where this script will run). What is the best language to write this in, and what commands can I use to accomplish this? Thanks.



 
"Best" language? That's pretty subjective.

I'd say VBScript, because on Windows platforms there are more reference materials covering these types of things in VBScript or VB than there are for JScript, Perl, or other languages.

How to? You might look at the Internet Transfer Control for example. There is some discussion at:


There are fancier and more modern controls available from Microsoft too, but you don't need a lot of multi-connection or XML-handling capabilities for this application, right?

You could also drop down to the Winsock control level and roll your own HTTP conversations, but this isn't necessary with the Inet control and its big brothers around. Inet will even do FTP and Gopher if you need those capabilities!
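
To give a feel for what rolling your own HTTP conversation involves, here is a minimal sketch of just the request text a client would have to send over a socket. The function name BuildGetRequest and the host www.example.com are made up for illustration. Actually sending the text would still need a socket component such as the Winsock control, and the script would also have to read and parse the status line and headers that come back.

```vbscript
Option Explicit

' Hypothetical helper: build the text of a minimal HTTP/1.0 GET request.
' strHost and strPath are illustrative parameters, not part of any control's API.
Function BuildGetRequest(strHost, strPath)
  BuildGetRequest = "GET " & strPath & " HTTP/1.0" & vbCrLf & _
                    "Host: " & strHost & vbCrLf & _
                    vbCrLf   ' blank line ends the request headers
End Function

WScript.Echo BuildGetRequest("www.example.com", "/index.html")
```

A higher-level component takes care of this text, the socket handling, and the response parsing, which is why dropping to this level is rarely worth it for a simple page grab.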
 
Do I need Visual Studio to compile this script, or can it be written in Notepad? The material at the link above leads me to believe that the script must be compiled. I don't need XML or anything fancy, just a script to go out to the page, get the data from the page, and then save it as a text file. Internet Transfer Control seems to be the ticket.
 
To use Inet you'll find you need Visual Studio, Office Developer, or at least VB installed.

If this will be a WSH script you have no choice: the script will only run where the Inet developer license is installed (through installing one of those developer tools).

For a script in an HTML page or an HTA (HTML Application), you can use LPKTOOL to create a License Manager Package (LPK) file to provide Internet Explorer with a developer-issued run-time license.

Aside from Inet, many people now use the XMLHTTP component, which I don't think has these license restrictions. Despite the name, XMLHTTP can fetch plain old HTML from a web site as well as play with XML.
 
Thanks for the great info! I will try both ways hopefully tomorrow and keep you posted! Thanks again!
 
Okay, I managed to get some code from the web:

Dim b() As Byte
Dim intCount As Integer
Dim strData As String

Inet1.Cancel ' Stops any current operations

b() = Inet1.OpenURL(" _
index.html", icByteArray)

For intCount = 0 To UBound(b) - 1

strData = strData & Chr(b(intCount))

Next intCount


This code needs to be compiled. I tried compiling it and I get errors. What am I doing wrong? Help!
 
valicon muttered:-
>> This code needs to be compiled, I tried compiling it and I get errors. What am I doing wrong? Help!


Getting code that doesn't make sense, probably :)




Jay
How can I go forward when I don't know which way I'm facing (John Lennon)
 
That's VB code, not VBScript.

VBScript doesn't have a compiler.
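
One concrete difference, sketched here as an illustration rather than a full port of the posted code: VBScript has no "As <type>" clause, so typed VB declarations are rejected by the script host before anything runs. The variable names below are borrowed from the posted snippet; nothing here touches the Inet control.

```vbscript
Option Explicit

' VB 6 declarations such as "Dim intCount As Integer" or "Dim b() As Byte"
' are syntax errors in VBScript, which has no "As <type>" clause.
' In VBScript every variable is a Variant and takes its subtype from its value.
Dim intCount, strData

intCount = 0
strData = ""

WScript.Echo TypeName(intCount)   ' Integer
WScript.Echo TypeName(strData)    ' String
```

Saved in a plain text file with a .vbs extension, a script like this runs directly under WSH (for example with cscript.exe), with no compile step.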

As I said earlier, if you are going to do this in VBScript or JScript you will need to host it in IE as an HTM or HTA, or in WSH. If you choose WSH you will have to have a developer license installed for Inet on the machine where you run the script. If you choose IE you will be able to run on the developer-license machine, or create an LPK file and reference it in your HTM/HTA making it portable.

So... a WSH script is stuck with running where the developer license is, while an HTM/HTA script will be portable.

But in either case you would need a machine with VB6 or VS6 installed, probably the "Professional" or "Enterprise" version, to do your development. At that point you might as well just write a VB program unless you have another reason not to.
 
It sounds like you aren't getting too far with this yet. Let's back up.

Inet requires a developer license that you apparently don't have, so take a look at XMLHTTP instead. This component does not need a separate license, and any machine with IE 5.0 or later probably has it installed. One of the best resources for Windows development is MSDN Online:


Here you get articles and reference documents rather than just code fragments that might or might not meet your needs.

I found:


This has some samples and intro information on XMLHTTPRequest.

Then there is:


This page gives links to pages on each of the methods and properties exposed by XMLHTTPRequest. Always good to get the real scoop.

Finally, a working WSH script using XMLHTTPRequest:

XMLHTTP.vbs
Code:
Option Explicit
Dim objXMLHTTP, objFSO, objTS
Const FSO_OVERWRITE = True

Set objXMLHTTP = CreateObject("MSXML2.XMLHTTP")
With objXMLHTTP
  ' Third argument False = synchronous: .send blocks until the response arrives
  .open "GET", "http://www.google.com", False
  On Error Resume Next
  .send
  If Err.Number <> 0 Then
    MsgBox "XMLHTTP error " & Hex(Err.Number) & " " & Err.Description
  ElseIf .status <> 200 Then
    MsgBox "HTTP error " & CStr(.status) & " " & .statusText
  Else
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTS = objFSO.CreateTextFile("google.txt", FSO_OVERWRITE)
    ' Convert LF line endings to Windows line breaks before saving
    objTS.Write Replace(.responseText, vbLf, vbNewLine)
    objTS.Close
    Set objTS = Nothing
    Set objFSO = Nothing
    MsgBox "Complete!"
  End If
End With
Set objXMLHTTP = Nothing
Apologies to Google.

Note that since Google is a Unix-based site it uses LFs as line delimiters, so I replace them here with native Windows line breaks. A Windows site's response should not be processed this way. Also, the status property is the standard HTTP client return status value, exposed as a Long Variant by this component.
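
If you don't know in advance which kind of line endings a site sends, one possible approach is to normalize everything to LF first and only then expand to CRLF, so existing Windows line breaks are not doubled up. This is a minimal sketch; NormalizeNewlines is a made-up helper name, not part of XMLHTTP or the FileSystemObject.

```vbscript
Option Explicit

' Hypothetical helper: convert any mix of CRLF, lone CR, or lone LF
' line endings to Windows CRLF without doubling existing CRs.
Function NormalizeNewlines(strText)
  Dim strTemp
  strTemp = Replace(strText, vbCrLf, vbLf)   ' collapse CRLF to LF first
  strTemp = Replace(strTemp, vbCr, vbLf)     ' then any stray CR
  NormalizeNewlines = Replace(strTemp, vbLf, vbCrLf)
End Function
```

In a script that writes .responseText to a file, the write line could then become objTS.Write NormalizeNewlines(.responseText), which should be safe for both Unix and Windows sites.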

Hope this gets you on your way finally.
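
Since the original question was about grabbing an arbitrary URL, here is a sketch of how the same XMLHTTP approach might be wrapped so the URL and output file come from the command line. SaveUrlToFile and the script name fetch.vbs are hypothetical names chosen for this example, and the sketch simply lets any error stop the script instead of showing a message box.

```vbscript
Option Explicit

' Hypothetical helper: download strUrl and write the body to strPath.
' Raises a script error if the request fails or the HTTP status is not 200.
Sub SaveUrlToFile(strUrl, strPath)
  Dim objHTTP, objFSO, objTS

  Set objHTTP = CreateObject("MSXML2.XMLHTTP")
  objHTTP.open "GET", strUrl, False          ' False = wait for the response
  objHTTP.send
  If objHTTP.status <> 200 Then
    Err.Raise vbObjectError + 1, "SaveUrlToFile", _
      "HTTP error " & objHTTP.status & " " & objHTTP.statusText
  End If

  Set objFSO = CreateObject("Scripting.FileSystemObject")
  ' Second argument True = overwrite; the file is written as ANSI text.
  Set objTS = objFSO.CreateTextFile(strPath, True)
  objTS.Write objHTTP.responseText
  objTS.Close
End Sub

' Take the URL and file name from the command line when both are given.
If WScript.Arguments.Count = 2 Then
  SaveUrlToFile WScript.Arguments(0), WScript.Arguments(1)
Else
  WScript.Echo "Usage: cscript fetch.vbs <url> <output file>"
End If
```

Run from a command prompt, this would look something like cscript fetch.vbs http://www.google.com google.txt. Pages containing characters outside the ANSI code page may need CreateTextFile's optional third argument set to True so the file is written as Unicode.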
 