
** Replacing Characters: ’ – “ ??? **

Status
Not open for further replies.

jonnywah

Programmer
Feb 7, 2004
US
I am trying to write a VB function to replace these awkward characters in a string:

Replace with Single Quote ('):
’

Replace with DASH (-):
–

Replace with 2 Single Quotes (''):
“ (or) â€

Replace with 3 Periods ( ... ):
…


Function CleanString(strInput)
mySTemp = Replace(Replace(Replace(Replace(Replace(strInput,"–", "-"), "“", "''"), "â€", "''"), "…", " ... "), "'", "''")
CleanString = mySTemp
End Function


I am super tired, and I can't get this function to work. Please help. Any suggestions or information would be appreciated. Thank you in advance.
 
You want the replacement of the bare â€ to be the outermost of the character replacements (the next-to-last Replace, just before the one that doubles single quotes). Where it sits now, it "fires" before the Replace calls outside it, specifically the ellipsis one, so the longer sequence never gets a chance to match. Otherwise it looks fine. What are you getting back?

To verify that it looks fine, I did some temporary indenting and parentheses matching:
Code:
Replace(
  Replace(
    Replace(
      Replace(
        Replace(
          strInput,"–", "-"
        ), "“", "''"
      ), "â€", "''"
    ), "…", " ... "
  ), "'", "''"
)
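
The ordering point can be checked in isolation. The sketch below builds the patterns from character codes (assuming a single-byte Windows-1252 code page, where Chr(226) and Chr(128) are â and €) and compares the two orders. PrefixFirst and LongestFirst are hypothetical names used only for illustration; they are not part of the original function.

```vbscript
Option Explicit

' Patterns built from character codes so the script file's own encoding
' cannot alter them. Assumes a single-byte Windows-1252 code page.
Dim prefix, ellipsis
prefix = Chr(226) & Chr(128)      ' the bare two-character sequence
ellipsis = prefix & Chr(166)      ' the three-character ellipsis sequence

' Replaces the short prefix first: the longer pattern can no longer match.
Function PrefixFirst(s)
    Dim t
    t = Replace(s, prefix, "''")
    PrefixFirst = Replace(t, ellipsis, " ... ")
End Function

' Replaces the longer pattern first, then any leftover bare prefix.
Function LongestFirst(s)
    Dim t
    t = Replace(s, ellipsis, " ... ")
    LongestFirst = Replace(t, prefix, "''")
End Function
```

With "NO2" & ellipsis & "O2" as input, LongestFirst gives NO2 ... O2, while PrefixFirst leaves a stray Chr(166) behind the two single quotes.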
 
I am not getting errors but the replace(s) aren't working.
 
Perchance, try using concatenated ASCII values in the Replace statements instead of the literal text.

It might also help to grab a portion of the source with the undesired values, loop through it, and Response.Write the ASCII value of each character. Sometimes there are hidden, non-visible characters in there, which might be part of why the replaces are failing.

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
Great suggestion. The problem is that I can't get the ASCII values for all of these characters:

’

–

“

â€

…
 
Code:
<%
VarArr = Array("’","–","“","â€","…")

for each value in vararr
  Response.Write Value & "&nbsp;"
  for i=1 to len(value)
    Response.Write Asc(mid(Value,i,1)) & "&nbsp;"
  Next
  Response.Write "<br>" & vbcrlf
Next
%>

output :
Code:
’ 226 128 153 
– 226 128 147 
“ 226 128 156 
â€ 226 128
… 226 128 166

Updated Replace calls using ASCII character codes:

Mind you, this still doesn't account for any extra hidden or undisplayable characters in the string ...
Code:
Input = OriginStr

OutPut = Replace(Input, Chr(226) & Chr(128) & Chr(153) , "'")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(147) , "-")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(166), "...")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(156) , "''")
OutPut = Replace(OutPut, Chr(226) & Chr(128) , "''")

Response.Write Output

Also, as gen noted, I moved the shortest common string replacement to last so that it doesn't interfere with the longer ones.

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
It would also be greatly beneficial if you could supply a line of sample source (the original string) to work with, or the code where this issue is happening, and, if at all possible, the end result or the error you're getting, whichever applies.

kind of like the blind telling the deaf how something looks.
we're getting nowhere fast.

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
Still doesn't work.

’, –, “, â€, …

returns:

’, â€', “, â€, …

Hmmm?
 
as noted in my prior post, some sample info would be greatly beneficial

Code:
<%
Response.Write StripJunk("’,–,“,â€,…") & "<br>"
Response.Write StripJunk("blahblah–testblah…somthingâ€ with a – and some other “") & "<br>"

Function StripJunk(Input)
OutPut = Replace(Input, Chr(226) & Chr(128) & Chr(153) , "'")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(147) , "-")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(166), "...")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(156) , "''")
OutPut = Replace(OutPut, Chr(226) & Chr(128) , "''")
StripJunk = OutPut
End Function
%>

OUTPUT ( functional )
Code:
',-,'','',...
blahblah-testblah...somthing'' with a - and some other ''

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
I am looking for the following chemicals for Jame’s house – NO2…O2…CH3… I need the absolute “highest qualityâ€ chemicals.

returns:

I am looking for the following chemicals for Jame’s house â€' NO2…O2…CH3… I need the absolute “highest qualityâ€ chemicals.

where it should return:

I am looking for the following chemicals for Jame's house - NO2 ... O2 ... CH3 ... I need the absolute ''highest quality'' chemicals.
 
Same as before and updated with sample string you issued.
Code:
<%
Response.Write StripJunk("’,–,“,â€,…") & "<br>"
Response.Write StripJunk("blahblah–testblah…somthingâ€ with a – and some other “") & "<br>"
Response.Write StripJunk("I am looking for the following chemicals for Jame’s house – NO2…O2…CH3… I need the absolute “highest qualityâ€ chemicals.") & "<br>"

Function StripJunk(Input)
OutPut = Replace(Input, Chr(226) & Chr(128) & Chr(153) , "'")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(147) , "-")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(166), "...")
OutPut = Replace(OutPut, Chr(226) & Chr(128) & Chr(156) , "''")
OutPut = Replace(OutPut, Chr(226) & Chr(128) , "''")
StripJunk = OutPut
End Function
%>

Output :
Code:
',-,'','',...
blahblah-testblah...somthing'' with a - and some other ''
I am looking for the following chemicals for Jame's house - NO2...O2...CH3... I need the absolute ''highest quality'' chemicals.

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
perhaps your page is cached, try refreshing it.

[thumbsup2]DreX
aKa - Robert
if all else fails, light it on fire and do the happy dance!
 
Tried refreshing but it didn't work.

I use this function to capture the value in a text area ("txtDesc").

Example:

Desc = StripJunk(Request.Form("txtDesc"))

The function StripJunk is called before inserting the data into the database. I'm not sure why it's not working here.
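
One possibility the thread has not ruled out: if Request.Form hands the script real Unicode punctuation rather than the three-character â€ sequences (which depends on the page's code page settings), the Chr(226) patterns never appear in the string and nothing gets replaced. Below is a minimal sketch under that assumption, matching on Unicode code points with ChrW. The name StripSmartPunctuation is hypothetical, and the replacement strings follow the ones requested in the first post.

```vbscript
' Hypothetical helper: replaces real Unicode "smart" punctuation.
' Only useful if the input holds these code points rather than the
' three-character mis-decoded sequences.
Function StripSmartPunctuation(s)
    Dim t
    t = Replace(s, ChrW(8217), "'")       ' U+2019 right single quote
    t = Replace(t, ChrW(8211), "-")       ' U+2013 en dash
    t = Replace(t, ChrW(8220), "''")      ' U+201C left double quote
    t = Replace(t, ChrW(8221), "''")      ' U+201D right double quote
    t = Replace(t, ChrW(8230), " ... ")   ' U+2026 horizontal ellipsis
    StripSmartPunctuation = t
End Function
```

Running the character-code loop from earlier in the thread on the actual form value would show which of the two forms is arriving, and therefore whether this variant or StripJunk is the one that applies.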
 
Or, as noted, you have hidden characters that haven't been ruled out yet. For example, if you do this:
Code:
Response.Write(Len("NO2…O2…CH3… I"))
If you copy and paste your actual text inside the quotes, rather than what I wrote there, it should print 19. If the number is higher, then you've got some hidden characters.

They could be revealed by something like this:
Code:
strTest = "NO2…O2…CH3… I" 'remember to copy and paste from your original
For i = 1 to Len(strTest)
    Response.Write(Mid(strTest, i, 1) & " = " & Asc(Mid(strTest, i, 1)))
    Response.Write("<br>" & vbCrLf)
Next
You should get something like
Code:
N = 78
O = 79
2 = 50
etc., but hopefully one or more lines like this:
Code:
= 11
= 29
which would tell you what and where the hidden characters are.
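
For longer strings, listing only the suspicious characters is easier to read than the full dump. A small sketch, assuming "hidden" means control characters (codes below 32, or 127); FindHiddenChars is a hypothetical helper name.

```vbscript
' Hypothetical helper: returns "position:code" pairs, separated by spaces,
' for every control character in s (codes below 32, or 127).
Function FindHiddenChars(s)
    Dim i, code, found
    found = ""
    For i = 1 To Len(s)
        ' AscW can return a negative Integer for high code points; mask to 0..65535.
        code = AscW(Mid(s, i, 1)) And &HFFFF&
        If code < 32 Or code = 127 Then
            If found <> "" Then found = found & " "
            found = found & i & ":" & code
        End If
    Next
    FindHiddenChars = found
End Function
```

For example, FindHiddenChars("NO2" & Chr(11) & "O2") returns 4:11. Line breaks typed into a textarea will show up as 13 and 10, which is expected.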
 
Ah, you noted that StripJunk works when headed to the db while I was typing the hidden chars thing.

Maybe if you post some more code.
 
I am not sure how knowing the hidden characters will help solve the problem ...

 
I think this is a problem with the text area. I remember encountering a similar problem where I used Trim on the text area value and it didn't work.

Eg.

Desc = Trim(Request.Form("txtDesc"))

I got around this problem by calling a javascript function TrimDesc:

<TD bgColor=#cccccc><TEXTAREA style="FONT-SIZE: 10px; FONT-FAMILY: Verdana, Arial, Sans-serif" name="txtDesc" id="txtDesc" rows=8 cols=80 onchange="TrimDesc(this.value);"><%=ObjRC1("Description")%></TEXTAREA>


 
Knowing the hidden characters will allow you to replace them and get it to work. For example, let's say there's a vertical tab character hidden between â and € on all of them. If you use DreX's code:
Code:
OutPut = Replace(Input, Chr(226) & Chr(128) & Chr(153) , "'")
it's not going to work, because there is no Chr(226) immediately followed by Chr(128): there's a vertical tab character, Chr(11), in between them. As such the replace will never match. If you knew there was a vertical tab in between them then you'd change that same replace to this:
Code:
OutPut = Replace(Input, Chr(226) & Chr(11) & Chr(128) & Chr(153) , "'")
and the replace would work.
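
If the hidden characters turn out to vary, an alternative to hard-coding each one into the pattern is to remove control characters first and then run the replacements. This is a sketch of that idea, not something proposed in the thread. StripControlChars is a hypothetical name, and keeping tab, line feed and carriage return is an assumption about what a textarea value should retain.

```vbscript
' Hypothetical helper: drops control characters (codes below 32, or 127)
' except tab (9), line feed (10) and carriage return (13).
Function StripControlChars(s)
    Dim i, ch, code, result
    result = ""
    For i = 1 To Len(s)
        ch = Mid(s, i, 1)
        code = AscW(ch) And &HFFFF&   ' mask: AscW may return a negative Integer
        If code >= 32 And code <> 127 Then
            result = result & ch
        ElseIf code = 9 Or code = 10 Or code = 13 Then
            result = result & ch
        End If
    Next
    StripControlChars = result
End Function
```

After StripControlChars, a string such as Chr(226) & Chr(11) & Chr(128) & Chr(153) collapses to the plain three-character sequence, so the unmodified Replace from DreX's code can match it again.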
 