Replace ’ characters from string

Status
Not open for further replies.

stinkybee

Programmer
May 15, 2001
218
GB
I'm reading a CSV file into a C# script and trying to replace certain characters that are causing problems. However, I can't seem to replace ’, which should be an apostrophe.

I have tried stripping non-ASCII characters with a regex, but that leaves a ™, which I then cannot replace either.

Any ideas on how I can replace these unwanted characters?

The source file is accessed remotely, so I cannot change its encoding.

Thanks

Web Development Manager
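
For what it's worth, â€™ is what the three UTF-8 bytes of a right single quotation mark (U+2019, bytes E2 80 99) look like when they are decoded as Windows-1252. So even without changing the remote file, it may be enough to decode the downloaded bytes as UTF-8 on the reading side. A minimal sketch, assuming the file really is UTF-8 and that the raw bytes are available; Utf8Demo and DecodeAndNormalise are made-up names for illustration, not anything from the original script:

```csharp
using System;
using System.Text;

static class Utf8Demo
{
    // Decodes raw bytes as UTF-8, then swaps the typographic
    // apostrophe (U+2019) for a plain ASCII one.
    public static string DecodeAndNormalise(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return text.Replace('\u2019', '\'');
    }

    static void Main()
    {
        // "it" + E2 80 99 (U+2019 in UTF-8) + "s"
        byte[] raw = { 0x69, 0x74, 0xE2, 0x80, 0x99, 0x73 };
        Console.WriteLine(DecodeAndNormalise(raw)); // it's
    }
}
```

If the text is read through a StreamReader or File.ReadAllText, passing Encoding.UTF8 explicitly should have the same effect.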
 
Thanks for the reply; this has perhaps got me on the right track.

I can now replace the first two characters using Unicode escapes, with either a regular expression or a simple replace, but the TM symbol is still not being replaced.

For example, this does nothing:
Code:
new_text = Regex.Replace(new_text, @"\u00E2\u20AC\u2122", string.Empty);

However, this removes the â€ characters:
Code:
new_text = Regex.Replace(new_text, @"\u00E2\u20AC", string.Empty);

This leaves the TM character, but I need to replace the whole ’ string with a single quote. I do not want to replace the TM separately, as there will be occasions where there is a valid TM character. Having said that, I am having no luck replacing the TM on its own anyway.

I have also tried replacing it using other codes for the trademark symbol (each of which renders here as ™), with no luck.

Any ideas?

Web Development Manager
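
When a three-character pattern refuses to match, one way to narrow it down is to print the actual code units that follow â€ in the string, since the character displayed as ™ may not be U+2122 in memory. A small diagnostic sketch; CodePointDump and Dump are hypothetical helper names, and the sample input is only what the sequence would be if it were decoded as Windows-1252:

```csharp
using System;
using System.Text;

static class CodePointDump
{
    // Returns each UTF-16 code unit of the input as "U+XXXX", space separated.
    public static string Dump(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Dump("\u00E2\u20AC\u2122")); // U+00E2 U+20AC U+2122
    }
}
```

Running Dump on a short slice of new_text around the problem characters should show which escape the pattern needs for the third character.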
 
For those who are interested, I have found a workaround: use IndexOf to find the position of the first two characters (â€), then use Remove to get rid of those characters and the next one in the string. Then I use Insert to put the apostrophe in the same place.

For example:
Code:
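	// Needs "using System;" for StringComparison and Math.
	int position;
	// Ordinal search for the two-character "â€" prefix.
	while ((position = new_text.IndexOf("\u00E2\u20AC", StringComparison.Ordinal)) != -1) {
		// Remove the prefix plus the character after it (fewer if the string ends first).
		new_text = new_text.Remove(position, Math.Min(3, new_text.Length - position));
		new_text = new_text.Insert(position, "''");
	}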

A bit clunky, but it works. I would still like to know a solution using Replace, but this will have to do for now.

Web Development Manager
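
One possible Replace-based version of the loop above is a regular expression that matches â€ plus whichever single character follows it, so it does not matter what the third character actually is. This is only a sketch: QuoteFixer and Fix are made-up names, it inserts a single apostrophe rather than the doubled one in the loop, and like the loop it will also turn other â€-prefixed sequences (such as mangled double quotes) into apostrophes.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuoteFixer
{
    // "â€" followed by any one character. Singleline lets '.' match a newline too.
    static readonly Regex Broken =
        new Regex("\u00E2\u20AC.", RegexOptions.Singleline);

    public static string Fix(string text)
    {
        return Broken.Replace(text, "'");
    }

    static void Main()
    {
        Console.WriteLine(Fix("It\u00E2\u20AC\u2122s a test")); // It's a test
    }
}
```

Change the replacement string to "''" if the doubled apostrophe is intentional.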
 