Regular Expression (return text between body tag)

Status
Not open for further replies.

jsiimml

Programmer
Feb 14, 2006
10
GB
Hi, I want to create a regular expression which will return the content within the <body> tags of a string. I have retrieved the string but I can't get the regular expression to work. I've tried:

Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.MultiLine = True
objRegExp.Pattern = "<body>(.*?)</body>"
vblContent = objRegExp.Replace(vblContent,"$1")
Set objRegExp = nothing

I hoped that it would return the bit in brackets as $1 and replace the original string with everything within the body tags, but it does not work. I'd appreciate it if someone could help. Thanks
 
try this
Code:
Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.MultiLine = True
objRegExp.Pattern = "<body>(.*?)</body>"
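Set objMatches = objRegExp.Execute(vblContent) ' keep the returned Matches collection
For Each objMatch in objMatches
  ' SubMatches(0) is the captured group, i.e. the text between the tags
  Response.Write objMatch.SubMatches(0) & "<BR>"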
Next
Set objRegExp = nothing

}...the bane of my life!
 
I wonder why the OP has not bothered to report back. It must have been just a casual question.

[1]>[tt]vblContent = objRegExp.Replace(vblContent,"$1")[/tt]
This is a misunderstanding of what .Replace does here. Even if $1 correctly captures the content between the body tags, the line replaces each match (the opening tag, the content, and the closing tag) with the content alone. The net result is that the body open and close tags disappear, while everything outside them, such as the head section, stays in the string. That is not what was intended.

[2] pattern (.*?)
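It is insufficient for typical multi-line HTML, because . does not match the newline character, so the match cannot extend across line breaks. Setting MultiLine does not change this; it only affects where ^ and $ match.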
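
To see [1] and [2] on a small input, here is a minimal sketch that can be run with cscript. The sample string and the variable names (sHtml, re, bDotSpans, bAltSpans, sReplaced) are made up for illustration, and WScript.Echo stands in for Response.Write outside ASP.

```vbscript
Dim re, sHtml, bDotSpans, bAltSpans, sReplaced

' Hypothetical sample input with a line break inside the body
sHtml = "<html><head><title>t</title></head>" & vbLf & _
        "<body>line one" & vbLf & "line two</body></html>"

Set re = New RegExp
re.IgnoreCase = True
re.Global = True

' [2] . does not match vbLf, so this pattern cannot span the two lines
re.Pattern = "<body>(.*?)</body>"
bDotSpans = re.Test(sHtml)            ' False for this input

' (.|\n) also accepts the line break, so the match can cross lines
re.Pattern = "<body>((.|\n)*?)</body>"
bAltSpans = re.Test(sHtml)            ' True for this input

' [1] Replace rewrites only the matched part of the string:
' the body tags go away, the head section is still there
sReplaced = re.Replace(sHtml, "$1")

WScript.Echo bDotSpans
WScript.Echo bAltSpans
WScript.Echo sReplaced
Set re = Nothing
```

The last line of output still starts with the html and head tags, which is why .Replace with "$1" is not a way to pull the body content out on its own.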

This is how to make it work in most cases, under the assumption that there is only one pair of body tags, open and close. Otherwise there are simply more matches and submatches to enumerate.
[tt]
Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True
'objRegExp.MultiLine = True 'no need
' [^>]* allows attributes on the body tag; (.|\n) lets the match cross line breaks
objRegExp.Pattern = "<body[^>]*>((.|\n)*?)</body>"
Set objMatches = objRegExp.Execute(vblContent)
If objMatches.Count <> 0 Then
    'vblContent = objRegExp.Replace(vblContent,"$1") 'wrong idea, doing different thing
    vblContent = objMatches.Item(0).SubMatches(0)
Else
    vblContent = "" 'depends on how you want to handle no match
End If
Set objMatches = Nothing
Set objRegExp = Nothing
[/tt]
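
If this is needed in more than one place, the same logic can be wrapped in a function. This is only a sketch: the name ExtractBody and its parameter are my own choice, and [\s\S] is used in place of (.|\n) as another way to match any character including newlines, which should match the same text.

```vbscript
' Returns the text between the first <body ...> and </body> pair,
' or an empty string when there is no match.
' ExtractBody and sHtml are illustrative names, not from the posts above.
Function ExtractBody(sHtml)
    Dim re, matches
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = False                 ' only the first pair is wanted
    re.Pattern = "<body[^>]*>([\s\S]*?)</body>"
    Set matches = re.Execute(sHtml)
    If matches.Count > 0 Then
        ' SubMatches(0) is the captured group between the tags
        ExtractBody = matches(0).SubMatches(0)
    Else
        ExtractBody = ""
    End If
    Set matches = Nothing
    Set re = Nothing
End Function
```

With that in place, the call in the page becomes vblContent = ExtractBody(vblContent).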
 
nice tsuji


General FAQ faq333-2924
5 steps to asking a question faq333-3811
 