String Manipulation problems!

soundmangav

Programmer
Apr 5, 2002
8
GB

Hi,

I'm developing an interactive lottery syndicate program in VB6 (with an Access 2000 database and SQL) that lets members automatically check their National Lottery numbers against those on the official website.

Having downloaded the HTML source of the results page into a RichTextBox using the Internet Transfer Control (Inet), I am now trying to search through this string, locate the substring that precedes each number, and extract the numbers to a list box or to seven text boxes.

Anyway here's the code I've used, and the problem follows:

Private Sub Command1_Click() 'Get Results Command Button

Dim arr() As String
Dim i As Long

RichTextBox1.Text = Inet1.OpenURL(txtURL.Text, icString) 'Populate RTB with source code

arr = Split(RichTextBox1, vbCrLf) 'Split up each line of the RTB into an Array

For i = 0 To UBound(arr)
If InStr(arr(i), "images/results/numbers") Then 'Test to see if the Line contains a Lotto Number
List1.AddItem Mid$(arr(i), InStr(arr(i), "ALT=") + 5, 2) 'Extract the Lotto Numbers
End If
Next

End Sub

Here's a sample of the source code that appears in the RTB. About 100 lines precede this section, but the lotto numbers themselves are uniquely identified by the preceding string "images/results/numbers ..."

i.e. the information I want to extract lies within the ALT attribute; for the sample below, that is "08", "14", "25", "36", "38", "47" and "15".

<IMG SRC="images/results/onresu02.jpg" ALT="Saturday" WIDTH=152 HEIGHT=37 BORDER=0><br>
The winning balls for draw number 654 on 30 March 2002 were:<BR CLEAR=ALL><IMG SRC="images/results/numbers/num08.gif" ALT="08" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num14.gif" ALT="14" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num25.gif" ALT="25" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num36.gif" ALT="36" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num38.gif" ALT="38" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num47.gif" ALT="47" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/bonus/15.gif" ALT="Bonus Ball 15" WIDTH=83 HEIGHT=68 BORDER=0><BR CLEAR=all>

But for some reason my code only seems to extract the two-character value "Sa" from the line:

<IMG SRC="images/results/onresu02.jpg" ALT="Saturday" WIDTH=152 HEIGHT=37 BORDER=0><br>

even though this line doesn't contain the identifying string:

"images/results/numbers"

Why is this so?

Surely it should read through the source code, locate the string above and extract each instance of the substring (the 2 characters following "ALT=").

If anyone could shed some light on this problem it would be greatly appreciated.

GT.
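
For comparison, an InStr/Mid$-only approach does not have to depend on how the text splits into lines. One plausible explanation for the "Sa" result (an assumption; nothing in the thread confirms it) is that the text did not break at vbCrLf where expected, so InStr(arr(i), "ALT=") picked up the first ALT in a much larger chunk. The sketch below searches for "ALT=" only from the position of each marker onward. ExtractAltValues is a hypothetical helper name, not something from the original post.

```vb
' Sketch only: scan the whole page text instead of relying on line splits.
' ExtractAltValues is a hypothetical helper, not part of the original code.
Private Sub ExtractAltValues(ByVal html As String, ByVal marker As String, lst As ListBox)
    Dim pos As Long
    Dim altPos As Long

    ' Find the first occurrence of the marker (case-insensitive)
    pos = InStr(1, html, marker, vbTextCompare)
    Do While pos > 0
        ' Look for ALT= only after the marker that was just found
        altPos = InStr(pos, html, "ALT=", vbTextCompare)
        If altPos = 0 Then Exit Do

        ' Skip over ALT=" (5 characters) and take the two digits
        lst.AddItem Mid$(html, altPos + 5, 2)

        ' Continue with the next marker after this ALT attribute
        pos = InStr(altPos + 5, html, marker, vbTextCompare)
    Loop
End Sub
```

It could be called as ExtractAltValues RichTextBox1.Text, "images/results/numbers", List1. Note that it still assumes every value is exactly two characters long, so it would not suit the "Bonus Ball 15" ALT text as it stands.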


 
Consider adding a reference to the Microsoft VBScript Regular Expressions library, which would then allow this code:

Private Sub Command1_Click()
Dim re As RegExp
Dim myMatches As MatchCollection
Dim myMatch As Match

RichTextBox1.Text = Inet1.OpenURL(txtURL.Text, icString) 'Populate RTB with source code

Set re = New RegExp
re.Global = True
re.Pattern = ".*images/results/numbers/num([0-9][0-9])\.gif.*"
Set myMatches = re.Execute(RichTextBox1.Text)
For Each myMatch In myMatches
List1.AddItem myMatch.SubMatches(0)
Next
End Sub
 
I should mention that if the page has more than one day's results on it (as the UK National Lottery page does), the example code above will grab the numbers from all of the days.
 

Cheers for the suggestion, it works great! I'm checking the results for both draws (Wed & Sat), so it's fine to extract them all to the same listbox; I can just check each item in turn against each member's selection, as they're all in numerical order anyway.

I've also modified the code to extract the bonus ball numbers for each draw to a separate listbox, as shown below:

Private Sub Command1_Click()

Dim re As RegExp
Dim myMatches As MatchCollection
Dim myMatch As Match

RichTextBox1.Text = Inet1.OpenURL(txtURL.Text, icString) 'Populate RTB with source code

Set re = New RegExp
re.Global = True

'Main balls go to List1
re.Pattern = ".*images/results/numbers/num([0-9][0-9])\.gif.*"

Set myMatches = re.Execute(RichTextBox1.Text)
For Each myMatch In myMatches
List1.AddItem myMatch.SubMatches(0)
Next

'Bonus balls go to List2
re.Pattern = ".*images/results/bonus/([0-9][0-9])\.gif.*"

Set myMatches = re.Execute(RichTextBox1.Text)
For Each myMatch In myMatches
List2.AddItem myMatch.SubMatches(0)
Next

End Sub


Now my final problem is how to extract the draw day information, i.e. "Saturday" in the ALT attribute below. How could I also extract the entire string "The winning balls for draw number 654 on 30 March 2002 were:" from the same source? Any suggestions?

Thanks again in advance: GT.

</td>
</tr>
</table>
<IMG SRC="images/results/onresu02.jpg" ALT="Saturday" WIDTH=152 HEIGHT=37 BORDER=0><br>
The winning balls for draw number 654 on 30 March 2002 were:<BR CLEAR=ALL>
<IMG SRC="images/results/numbers/num08.gif" ALT="08" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num14.gif" ALT="14" WIDTH=53 HEIGHT=54 BORDER=0>
<IMG SRC="images/results/numbers/num25.gif" ALT="25" WIDTH=53 HEIGHT=54 BORDER=0>
 
Private Sub Command1_Click()

Dim re As RegExp
Dim myMatches As MatchCollection
Dim myMatch As Match

RichTextBox1.Text = Inet1.OpenURL(txtURL.Text, icString) 'Populate RTB with source code

Set re = New RegExp
re.Global = True
re.IgnoreCase = True
re.Pattern = """images/results/onresu0[2,3].*""(.*)""|(the winning balls.*:)|.*images/results/numbers/num([0-9][0-9])\.gif.*|.*images/results/bonus/([0-9][0-9])\.gif.*"
Set myMatches = re.Execute(RichTextBox1.Text)
For Each myMatch In myMatches
' What is in each submatch?
' 0 matches day
' 1 matches descriptive date
' 2 matches selected balls
' 3 matches bonus ball
List1.AddItem myMatch.SubMatches(0) & myMatch.SubMatches(1) & myMatch.SubMatches(2) & myMatch.SubMatches(3)
Next

End Sub
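
If the four kinds of value should end up in different controls rather than in one concatenated list entry, the submatches can be tested individually. This is a minimal sketch, assuming a hypothetical third list box List3 for the day and description lines (List1 and List2 are the ones already used in this thread) and relying on unmatched groups coming back empty. ShowMatches is likewise a made-up name.

```vb
' Sketch only. ShowMatches and List3 are hypothetical names.
Private Sub ShowMatches(ByVal myMatches As MatchCollection)
    Dim myMatch As Match

    For Each myMatch In myMatches
        ' Each match comes from exactly one alternative of the pattern,
        ' so at most one of the four submatches is non-empty.
        If Len(myMatch.SubMatches(2)) > 0 Then
            List1.AddItem myMatch.SubMatches(2)      ' main ball
        ElseIf Len(myMatch.SubMatches(3)) > 0 Then
            List2.AddItem myMatch.SubMatches(3)      ' bonus ball
        ElseIf Len(myMatch.SubMatches(0)) > 0 Then
            List3.AddItem myMatch.SubMatches(0)      ' draw day
        ElseIf Len(myMatch.SubMatches(1)) > 0 Then
            List3.AddItem myMatch.SubMatches(1)      ' descriptive sentence
        End If
    Next
End Sub
```

It would be called with the collection returned by re.Execute, for example ShowMatches re.Execute(RichTextBox1.Text).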
 

Nice One, that's exactly what I was looking for.

Originally I was going to approach the problem using the InStr and Mid$ functions to manipulate the string, but this method is far better.

However, I've never used this library before and don't fully understand the structure of the regular expression:

re.Pattern = """images/results/onresu0[2,3].*""(.*)""|(the winning balls.*:)|.*images/results/numbers/num([0-9][0-9])\.gif.*|.*images/results/bonus/([0-9][0-9])\.gif.*"

Anyway, I'll look into it and try to figure out exactly what's happening!

Thanks again: GT.
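
One way to read the pattern is as four alternatives joined by |, each with one capturing group. The sketch below rebuilds the same string piece by piece; BuildLottoPattern and the variable names are hypothetical and only there for illustration. Inside a VB string literal, a doubled quote ("") stands for a single literal quote character.

```vb
' Sketch only: the same pattern assembled from its four alternatives.
' BuildLottoPattern is a hypothetical helper name.
Private Function BuildLottoPattern() As String
    Dim dayPart As String
    Dim datePart As String
    Dim ballPart As String
    Dim bonusPart As String

    ' 1) Draw-day image: a quote, the image path, then the last quoted
    '    value on the line, captured as SubMatches(0), e.g. Saturday
    dayPart = """images/results/onresu0[2,3].*""(.*)"""

    ' 2) Descriptive sentence up to its last colon, captured as SubMatches(1)
    datePart = "(the winning balls.*:)"

    ' 3) Main ball image: two digits after "num", captured as SubMatches(2)
    ballPart = ".*images/results/numbers/num([0-9][0-9])\.gif.*"

    ' 4) Bonus ball image: two digits, captured as SubMatches(3)
    bonusPart = ".*images/results/bonus/([0-9][0-9])\.gif.*"

    ' The | operator means "match any one of these alternatives"
    BuildLottoPattern = dayPart & "|" & datePart & "|" & ballPart & "|" & bonusPart
End Function
```

A few details are worth knowing. The dot does not match a line break, so each .* stays within one line of the page. \. matches a literal full stop. Each match uses exactly one alternative, so only one of the four submatches is filled in, which is why concatenating all four gives a single value. Finally, [2,3] is a character class that matches 2, 3 or a comma; [23] would be the tighter way to say "2 or 3".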
 