
Searching through HTML source...

Status
Not open for further replies.

Sheltered

Programmer
Nov 26, 2002
GB
Hi folks, hoping someone can use their superior knowledge on this one, as I've hit a brick wall and can't get round it.

I have a page which pulls the HTML source from a webpage and assigns it to a variable. That all works OK, no problems there.

What I need to do is loop through the HTML, find each instance of a particular string, and perform a check against it.

For example, the page contains, amongst other normal HTML tags, several of the following images...

<TD align=center>
<img alt='SERVER0201' src='../images/2.gif' height=18 width=18>
<img alt='SERVER0201' src='../images/1.gif' height=18 width=18>
</TD>

I need to search the HTML for the string "SERVER0201" and then check whether each instance has 2.gif or 1.gif.
If all instances have 1.gif, set a variable called "servStatus" to "OUT". If all instances have 2.gif (or a mixture of the two), set it to "IN".

I have managed to find the first instance using "If InStr(sourcetext, "SERVER0201")...", where sourcetext is the HTML source.

Any idea how I can achieve the required result?

Thanks
Pete
 
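'create the regular expression object
Set regex = Server.CreateObject("VBScript.RegExp")
regex.IgnoreCase = True
'Global makes Execute return every match, not just the first
regex.Global = True
'the group (.+?) captures the image file name; the literal dots in ../ are escaped
regex.Pattern = "<img.*?alt='SERVER0201'.*?src='\.\./images/(.+?)'.*?>"
Str1 = "asdsad<img alt='SERVER0201' src='../images/2.gif' height=18 width=18><img alt='SERVER0201' src='../images/1.gif' height=18 width=18>as[asd][/asd]"
Set MatchInt = regex.Execute(Str1)
'write out each matching img tag
For Each IntMatch In MatchInt
    Response.Write(IntMatch.Value)
Next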


This will just pick out the img tags that have SERVER0201 in their alt attribute.

I couldn't refine it any better than this...
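
If only the file name is needed rather than the whole tag, the text captured by the (...) group can be read from each match's SubMatches collection. A rough sketch follows; the function name ImageNamesFor0201 is made up for illustration, and the pattern assumes the alt attribute comes before src and that both use single quotes, as in the sample HTML.

```vbscript
' Hypothetical helper: returns the image file names found for SERVER0201,
' joined with commas, in the order they appear in the HTML.
Function ImageNamesFor0201(html)
    Dim re, m, names
    Set re = New RegExp            ' the same object CreateObject("VBScript.RegExp") returns
    re.IgnoreCase = True
    re.Global = True               ' return every match, not just the first
    ' [^>]*? keeps each match inside a single tag; ([^']+) captures the file name
    re.Pattern = "<img[^>]*?alt='SERVER0201'[^>]*?src='\.\./images/([^']+)'"
    names = ""
    For Each m In re.Execute(html)
        If names <> "" Then names = names & ","
        names = names & m.SubMatches(0)   ' text captured by the first group
    Next
    ImageNamesFor0201 = names
End Function
```

Called with the Str1 test string from the snippet above, this should give back 2.gif,1.gif, which can then be checked however you like.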

Known is handfull, Unknown is worldfull
 
Off the top of my head, I would say run the expression twice: once to find the matches, and a second time to replace sections of the string with easily parsed data to check against. Since all we need in the end is the image name, I would suggest doing the following:
Code:
'create a boolean, true stands for OUT, false stands for IN
Dim isOut
isOut = True

Set regex = Server.CreateObject("VBScript.RegExp")
regex.IgnoreCase = True
regex.Global = True
'the literal dots in ../ are escaped so they only match real dots
regex.Pattern = "<img.*?alt='SERVER0201'.*?src='\.\./images/(.+?)'.*?>"

'get the matches to the pattern
Dim matches, amatch, aval
Set matches = regex.Execute(YourString)

'check each match for "1.gif"
For Each amatch in matches
   'replace the whole match with just the contents of the first group in the pattern
   '   groups are defined by parentheses, in this pattern we only have one group: (.+?)
   aval = regex.Replace(amatch.Value,"$1")

   'now do your if check
   If aval = "1.gif" Then
      isOut = isOut And True
   Else
      'by making it false it will be false in all future loops also since we used an AND in the True case
      isOut = False
   End If
Next

'now set the status based on the isOut boolean
If isOut Then 
   servStatus = "OUT"
Else
   servStatus = "IN"
End If

Hope that helps. There may be a couple of errors, because I misread your post and had the INs and OUTs backwards the first time around. I think I made all the necessary corrections, but I may have missed something when going back and changing those values.
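
For anyone wanting to check more than one server, the same logic could be wrapped in a function along these lines. This is only a sketch: the name GetServStatus and the empty-string result for a server that is not on the page are assumptions, not something from the original question, and the server name is assumed to be plain letters and digits so it can be dropped straight into the pattern.

```vbscript
' Hypothetical helper: returns "OUT" only when every image for serverName is 1.gif,
' "IN" when at least one is anything else, and "" when the server is not found.
Function GetServStatus(html, serverName)
    Dim re, matches, m, allOut
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True
    ' Assumes serverName contains no regex metacharacters.
    re.Pattern = "<img[^>]*?alt='" & serverName & "'[^>]*?src='\.\./images/([^']+)'"
    Set matches = re.Execute(html)

    ' With no matches there is nothing to decide on (assumed behaviour).
    If matches.Count = 0 Then
        GetServStatus = ""
        Exit Function
    End If

    ' A single image other than 1.gif is enough to make the server "IN".
    allOut = True
    For Each m In matches
        If LCase(m.SubMatches(0)) <> "1.gif" Then allOut = False
    Next

    If allOut Then
        GetServStatus = "OUT"
    Else
        GetServStatus = "IN"
    End If
End Function
```

Usage would be something like servStatus = GetServStatus(sourcetext, "SERVER0201"), where sourcetext is the HTML source from the original post.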

-T

01000111 01101111 01110100 00100000 01000011 01101111 01100110 01100110 01100101 01100101 00111111
Need an expensive ASP developer in the North Carolina area? Feel free to let me know.


 
Aah, I thought about that, you got there before I could...

Known is handfull, Unknown is worldfull
 
Thanks guys, I'm going to have a play around, but what you have provided looks like it should do the trick.
Very kind of you both.

I'll let you know how I get on.

Thanks again
Pete
 
I'm pretty shoddy with regex; I tend to use arrays for things. If you're comfortable with arrays, you could do a Split on SERVER0201, then re-split (search the forum for the function) on the gif value and compare the UBounds to count the instances found. Or stick with your original line of thinking: use InStr(text, "SERVER0201"), then use InStr's start argument to look for the next instance of 1.gif after each hit, and only count it if the two positions are close together, say less than 15 characters apart.
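
The InStr route might look roughly like the sketch below. A couple of liberties are taken: instead of a fixed character distance it reads up to the closing > of the tag (in the sample HTML the gap between SERVER0201 and the file name is a little over 15 characters), and it assumes alt comes before src inside each tag. The function name GetServStatusInStr is made up.

```vbscript
' Hypothetical helper: walks every occurrence of serverName with InStr and
' checks the rest of that tag for /1.gif. Returns "OUT", "IN", or "" if not found.
Function GetServStatusInStr(html, serverName)
    Dim pos, tagEnd, chunk, found, allOut
    found = False
    allOut = True
    pos = InStr(1, html, serverName, vbTextCompare)
    Do While pos > 0
        found = True
        ' Look only at the rest of the current tag, up to its closing ">".
        tagEnd = InStr(pos, html, ">")
        If tagEnd = 0 Then tagEnd = Len(html) + 1
        chunk = Mid(html, pos, tagEnd - pos)
        ' Anything other than 1.gif in this tag means the server is not all OUT.
        If InStr(1, chunk, "/1.gif", vbTextCompare) = 0 Then allOut = False
        ' Carry on searching after this occurrence.
        pos = InStr(pos + Len(serverName), html, serverName, vbTextCompare)
    Loop

    If Not found Then
        GetServStatusInStr = ""
    ElseIf allOut Then
        GetServStatusInStr = "OUT"
    Else
        GetServStatusInStr = "IN"
    End If
End Function
```

The key bit is the first argument to InStr: starting each search just past the previous hit is what lets the loop move on from the first instance to all the others.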

DreX
aKa - Robert
 