instr doesn't work for unicode (utf-8)

Status
Not open for further replies.

mickers

Programmer
Dec 21, 2005
7
US
Hi,
I am using instr to search for strings in pages I have read into a variable. When the page uses "charset=UTF-8", the comparison fails. All code is below. Does anyone know how to search for strings in Unicode?

I am trying to detect a URL on a page. A typical call is:
DoesLinkExist(" "
Note that the UnicodeTools object executes Unicode versions of VBScript commands, but I can't get that to find the text either. The RegExp option fails to detect text that is definitely on the page. The instr option also fails to detect the same text.

The bottom line here is that when the sDomain text actually exists in sPage it isn't detected.

Thanks for your input...
--
==============================
Code:
function DoesLinkExist(sPage, sDomain)
	dim xmlcontent
	sErr = ""
	set xmlcontent = CreateObject("MSXML2.ServerXMLHTTP")  

	on error resume next 
	xmlcontent.open "GET", "http://" & sPage, false 
	xmlcontent.send ""
	status = xmlcontent.status
	
	if err.number <> 0 or status <> 200 then 
		sErr = "PAGE NOT FOUND, Status = " & status
	else 
		sContent = xmlcontent.responseText
		sSearch = sDomain

		bRegExp = false
		if bRegExp then		
			'Prepare a regular expression object
			Set myRegExp = New RegExp
			myRegExp.IgnoreCase = True
			myRegExp.Global = True
			myRegExp.Pattern = sSearch
			' write out msg for each match
			Set myMatches = myRegExp.Execute(sContent)
			For Each myMatch in myMatches
			  sMatches =  sMatches & myMatch.Value & ", "
			Next		
			if sMatches = "" then
				sErr = "NO Regex MATCH"
			else
				sErr = ""
			end if
		else
			iPlace = inStr(1, sContent, sSearch, vbTextCompare)
			if isNull(iPlace) or iPlace = 0 then 
				Set u = New UnicodeTools
				sSearchUni = u.CStrU(sSearch)
				sContentUni = u.CStrU(sContent)
				iPlace = u.inStrU(1, sContentUni, sSearchUni, vbBinaryCompare)
				if isNull(iPlace) or iPlace = 0 then 
					sErr = sSearch & " NO Unicode Instr MATCH"
				end if
				Set u = Nothing
			end if
		end if 
	end if

	set xmlcontent = nothing 
	DoesLinkExist = sErr
end function

 
>I am using instr to search for strings from pages I have read into a variable. When the page uses "charset=UTF-8" then the comparison fails.
That is typical: when a script is not exactly right, we start imagining causes for its unexpected behavior.

[0] VBScript supports Unicode all right. Its strings are Unicode internally, and instr() works on them.

[1] By setting bRegExp = false, you direct the script to use instr(). But you never let instr() report its result: when it finds a match, sErr stays empty, so the function returns an empty string.
Code:
		else
			iPlace = inStr(1, sContent, sSearch, vbTextCompare)
			if isNull(iPlace) or iPlace = 0 then
				' commented out: the UnicodeTools fallback
				'Set u = New UnicodeTools
				'sSearchUni = u.CStrU(sSearch)
				'sContentUni = u.CStrU(sContent)
				'iPlace = u.inStrU(1, sContentUni, sSearchUni, vbBinaryCompare)
				'if isNull(iPlace) or iPlace = 0 then
				'sErr = sSearch & " NO Unicode Instr MATCH"
				'end if
				'Set u = Nothing
				' added: report both outcomes of instr()
				sErr = sSearch & " No instr MATCH"
			else
				sErr = sSearch & " instr MATCH at position " & iPlace
			end if
		end if
	end if
[2] The function is called like this:
Code:
dim smsg
smsg = DoesLinkExist(" "
response.write smsg & "<br />"
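
To check point [0] in isolation, here is a minimal sketch that runs instr() against a string containing non-ASCII characters, with no HTTP request involved. FindPlace is a hypothetical helper name and example.com is a placeholder domain; neither comes from the original script. It assumes the script runs under Windows Script Host (cscript); under ASP, replace WScript.Echo with Response.Write.

```vbscript
Option Explicit

' FindPlace is a hypothetical helper, not part of the original script.
' It returns the 1-based position of sSearch in sContent, or 0 if absent.
Function FindPlace(sContent, sSearch)
	FindPlace = InStr(1, sContent, sSearch, vbTextCompare)
End Function

Dim sPage
' ChrW() builds the non-ASCII characters, so the encoding of the
' script file itself does not matter. example.com is a placeholder.
sPage = "<a href=""http://example.com/"">caf" & ChrW(&HE9) & " " & _
        ChrW(&H4E2D) & ChrW(&H6587) & "</a>"

' ASCII search text, matched case-insensitively
WScript.Echo "domain found at position: " & FindPlace(sPage, "EXAMPLE.COM")

' Non-ASCII search text
WScript.Echo "non-ASCII text found at position: " & FindPlace(sPage, ChrW(&H4E2D) & ChrW(&H6587))

' Text that is not in the string gives 0
WScript.Echo "missing text position: " & FindPlace(sPage, "tek-tips.com")
```

If the first two calls report a non-zero position, instr() itself is handling the Unicode string, and the place to look is how the result is reported back to the caller.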
 