A smarter HTMLEncode()

Status
Not open for further replies.

Karl Blessing

Programmer
Feb 25, 2000
2,936
US
I have this code snippet that does some HTML stripping for viewing. It correctly replaces tabs, spaces, and any ampersands (for other HTML-encoded pieces that didn't quite make it).

Code:
Function ToHTML(ByRef pString)
	Dim lObjRegExp
	Dim lLngStart
	Dim lLngEnd
	
	If VarType(pString) = vbNull Then Exit Function
	
	ToHTML = pString
	
	' Parse TAGS
	Set lObjRegExp = New RegExp
	lObjRegExp.Global = True
	lObjRegExp.Pattern = "<[^>]*>"
	ToHTML = lObjRegExp.Replace(pString, "")
	Set lObjRegExp = Nothing

	' HTML Encoding
	ToHTML = Server.HTMLEncode(ToHTML)
	
	' Change Carriage Returns and Line Feeds to HTML
	ToHTML = Replace(ToHTML, vbCrLf, "<BR>")
	ToHTML = Replace(ToHTML, vbCr, "<BR>")
	ToHTML = Replace(ToHTML, vbLf, "<BR>")
	' Undo the ampersand encoding so entities already in the text survive
	ToHTML = Replace(ToHTML, "&amp;", "&")
	
	' Change Tabs to HTML
	ToHTML = Replace(ToHTML, vbTab, "&nbsp;&nbsp;&nbsp; ")
	
	' Change double-space to HTML
	While Not InStr(1, ToHTML, "  ") = 0
		ToHTML = Replace(ToHTML, "  ", "&nbsp; ")
	Wend
End Function

The question, however, is: how do I get it to leave certain tags "as is"?

For example, I would like to leave <img ...>, <a ...>...</a>, and a couple of other tags intact so that they show up appropriately. The reason I have it stripping out all the tags is both to keep the message from looking cluttered with HTML code and to keep the web-based email system more secure from JavaScript and other possible automatic things.

Karl Blessing aka kb244{fastHACK}
 
how many other tags are there that you want to leave intact?

if there are a handful, try doing a regexp search for each one of them (store them in an array), then replace the < > with &lt; and &gt;. after that, strip all the remaining tags using your regexp, then do a replace, turning all the &lt; and &gt; back into their correct counterparts.
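A minimal sketch of the protect, strip, restore idea described above, in VBScript to match the thread. The function name StripTagsExcept, its parameters, and the Chr(1)/Chr(2) placeholder characters are assumptions made for illustration, not code from the thread. Placeholders are used in place of &lt; and &gt; so that entities already present in a message are not turned into live tags on the way back, and the encoding step is done by hand because Server.HTMLEncode also converts double quotes to &quot;, which would damage the attributes of the protected tags.

```vbscript
' Hypothetical helper: strips every tag except those named in pAllowed,
' an array of bare tag names such as Array("a", "img").
Function StripTagsExcept(ByVal pString, ByVal pAllowed)
	Dim lObjRegExp, lStrOpen, lStrClose

	If IsNull(pString) Then Exit Function

	' Placeholder characters; assumed never to occur in real message text
	lStrOpen = Chr(1)
	lStrClose = Chr(2)

	Set lObjRegExp = New RegExp
	lObjRegExp.Global = True
	lObjRegExp.IgnoreCase = True

	' 1. Hide the brackets of allowed tags behind the placeholders
	lObjRegExp.Pattern = "<(/?(" & Join(pAllowed, "|") & ")\b[^>]*)>"
	StripTagsExcept = lObjRegExp.Replace(pString, lStrOpen & "$1" & lStrClose)

	' 2. Strip everything that still looks like a tag
	lObjRegExp.Pattern = "<[^>]*>"
	StripTagsExcept = lObjRegExp.Replace(StripTagsExcept, "")
	Set lObjRegExp = Nothing

	' 3. Encode what is left (quotes are deliberately left alone)
	StripTagsExcept = Replace(StripTagsExcept, "&", "&amp;")
	StripTagsExcept = Replace(StripTagsExcept, "<", "&lt;")
	StripTagsExcept = Replace(StripTagsExcept, ">", "&gt;")

	' 4. Put the real brackets back on the protected tags
	StripTagsExcept = Replace(StripTagsExcept, lStrOpen, "<")
	StripTagsExcept = Replace(StripTagsExcept, lStrClose, ">")
End Function
```

For example, StripTagsExcept("<b>Hi</b> <a href=""x"">y</a>", Array("a", "img")) returns Hi <a href="x">y</a>. Note that this sketch keeps the allowed tags with all of their attributes, so things like onclick or javascript: URLs inside an allowed tag are not filtered and would need a separate pass.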

or -- you can always implement some sort of markup language, like the one present here, and make people adhere to that. that seems to be standard across most bulletin boards now, so that might be an option.

hth
leo
 
That would be very inconvenient for the user (the markup language thing), mainly because this is a web-based email interface. You don't want to tell people sending you email that they have to design it with some sort of markup for it to be seen. But I'll give the other thing a try.

Karl Blessing aka kb244{fastHACK}
 
The other thing (the replace method) should work well, but it is not all that scalable (i.e., if you want to add tags, you need to open up the code and add to the array). One possibility would be to read the tags from a configuration file, or from a table containing the allowed tags. That way, adding a new tag only involves editing the config file or adding a row to a database table.
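A rough sketch of the configuration-file variant, assuming a plain-text file with one tag name per line. The function name LoadAllowedTags and the file layout are hypothetical. In an ASP page the path would typically come from Server.MapPath, and a database table could be read in much the same way with a recordset loop.

```vbscript
' Hypothetical helper: reads one tag name per line from a text file and
' returns the names as an array (empty array if nothing valid is found).
Function LoadAllowedTags(ByVal pPath)
	Dim lObjFSO, lObjFile, lObjRegExp, lStrLine, lStrList

	' Accept only plain tag names, so nothing read from the file can
	' smuggle regular-expression syntax into a pattern built from it
	Set lObjRegExp = New RegExp
	lObjRegExp.Pattern = "^[a-z][a-z0-9]*$"

	Set lObjFSO = CreateObject("Scripting.FileSystemObject")
	Set lObjFile = lObjFSO.OpenTextFile(pPath, 1)   ' 1 = ForReading

	lStrList = ""
	Do Until lObjFile.AtEndOfStream
		lStrLine = LCase(Trim(lObjFile.ReadLine))
		' Blank lines, comments, and malformed entries are skipped
		If lObjRegExp.Test(lStrLine) Then
			If Len(lStrList) > 0 Then lStrList = lStrList & "|"
			lStrList = lStrList & lStrLine
		End If
	Loop

	lObjFile.Close
	Set lObjFile = Nothing
	Set lObjFSO = Nothing
	Set lObjRegExp = Nothing

	LoadAllowedTags = Split(lStrList, "|")
End Function
```

The returned array can be passed to whatever routine protects the allowed tags, or joined with | to form the alternation part of a tag-matching pattern, so adding a tag becomes a one-line change to the file.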

just a thought as you start designing... good luck
leo
 