XMLHttpRequest and Charsets

pbb72 · May 15, 2006

Hi,

I'm using XMLHttpRequest in JavaScript to retrieve information from remote websites into Internet Explorer. Everything works great, except for two websites where the server doesn't return any charset information in the headers. The result is that XMLHttp interprets the responses as UTF-8, while I know they are ISO-8859-1, so all extended characters (for example å, ø, æ) are displayed as question marks.

I do not have access to those servers, or to the documents on them. Is there any way I can force XMLHttpRequest to interpret the documents as ISO-8859-1, or to display the UTF-8 results properly in another way?

Thanks, Peter

BillyRayPreachersSon · May 16, 2006

I do not have access to those servers, or to the documents on them.

So you're scraping other people's websites without their permission, presumably?

Is there any way I can force XMLHttpRequest to interpret the documents as ISO-8859-1, or to display the UTF-8 results properly in another way?

Possibly. You might be able to do a regexp replacement on the returned data, switching known characters for HTML entities. Give that a whirl.
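
A minimal sketch of that kind of replacement, assuming the characters are still intact in the string being processed (if the decoder has already turned them into placeholder characters, there is nothing left to match). The function name `toHtmlEntities` is made up for illustration:

```javascript
// Hypothetical helper: replace every non-ASCII character with a numeric HTML entity.
function toHtmlEntities(text) {
  return text.replace(/[\u0080-\uFFFF]/g, function (ch) {
    return "&#" + ch.charCodeAt(0) + ";";
  });
}

// Example: the å in "få" is code point 229.
var escaped = toHtmlEntities("<div>f\u00e5</div>");
```

Numeric entities are used here rather than named ones so that no lookup table is needed.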

Hope this helps,
Dan

Coedit Limited - Delivering standards compliant, accessible web solutions

[tt]Dan's Page [blue]@[/blue] Code Couch

http://www.codecouch.com/dan/

[/tt]

BillyRayPreachersSon · May 19, 2006

Peter,

Did the above information help? Did you resolve your query?

Dan


pbb72 · May 19, 2006

Thanks for the reply, Dan.

It didn't help however. All of the special characters get character code 65535 when I read them... :-(

tsuji · May 20, 2006

To: op
If you can provide more detail of what user environment your page is operating in... If the user environment is such that the least common denominator rules and the users are just mouse-clickers, maybe you have to live with it.

pbb72 · May 20, 2006

Well, apart from half of the text being unreadable (the texts are not English but Norwegian, with lots of mangled å, ø, æ), the main problem is that the mangling also takes the next 2 or 3 characters with it.
For example: "<div>få</div>" gets converted to "<div>f?iv>".
This makes proper interpretation of the results quite hard.

And besides, "they just have to live with it", that can't be a programmer saying that? ;-)

tsuji · May 20, 2006

>"they just have to live with it", that can't be a programmer saying that?
You'd be surprised if you knew the attitudes of the community of client-side page designers...

Here is a cryptic solution for you. (You say you use IE, and this applies to IE only. But you have to make sure ADODB.Stream is not disabled in the browser environment. Usually it is.)
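[tt]
// Assumes oxmlhttp is your xmlhttp object and it has already captured the response.
// Here is what you do after that part.
var ostream = new ActiveXObject("ADODB.Stream");
ostream.Type = 1;                      // 1 = binary: accept the raw bytes
ostream.Open();
ostream.Write(oxmlhttp.responseBody);  // undecoded response bytes
ostream.Position = 0;                  // rewind before reading (added per the follow-up below)
ostream.Type = 2;                      // 2 = text; Type can only be changed at position 0
ostream.Charset = "iso-8859-1";        // decode the bytes as ISO-8859-1
var s = ostream.ReadText(-1);          // -1 = read everything
ostream.Close();
ostream = null;
// From here on, work with s, which holds the response string, in place of .responseText.
[/tt]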
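
Where ADODB.Stream is not available, the same idea (take the raw bytes and decode them yourself) can be sketched in plain JavaScript, because in ISO-8859-1 every byte value is equal to the Unicode code point of its character. The name `decodeLatin1` is hypothetical, and how the bytes are obtained (for instance an array copied from the raw response) is an assumption left outside this sketch:

```javascript
// Hypothetical helper: turn an array of ISO-8859-1 byte values into a string.
// Works because ISO-8859-1 bytes 0x00-0xFF map one-to-one onto U+0000-U+00FF.
function decodeLatin1(bytes) {
  var chars = [];
  for (var i = 0; i < bytes.length; i++) {
    chars.push(String.fromCharCode(bytes[i] & 0xFF));
  }
  return chars.join("");
}

// The bytes of "<div>få</div>" as an ISO-8859-1 server would send them;
// 0xE5 is the single byte for å.
var sampleBytes = [0x3C, 0x64, 0x69, 0x76, 0x3E, 0x66, 0xE5,
                   0x3C, 0x2F, 0x64, 0x69, 0x76, 0x3E];
var sampleText = decodeLatin1(sampleBytes);
```

Unlike a UTF-8 decoder, this never treats 0xE5 as the start of a multi-byte sequence, so the characters that follow it are left alone.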

pbb72 · May 20, 2006

Thanks, that worked like a charm!!

Apparently, you forgot 1 line in the code. After the write command, I needed to add "position = 0;", and then everything worked as required.

tsuji, you are da man! :-D

tsuji · May 20, 2006

>After the write command, I needed to add "position = 0;",
You have to! (I sometimes persist the info, in which case it doesn't need repositioning.) Glad you know your biz, that saves me thousands (I am exaggerating) of words.
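
For reuse, the snippet and the position fix can be folded into one function. This is only a sketch: the name `decodeResponseBody` and the optional `createStream` parameter are assumptions added for illustration (the parameter exists so the function can be exercised without ADODB.Stream); in IE it falls back to the real ActiveX object.

```javascript
// Hypothetical helper: decode the raw bytes of a completed request with the given charset.
// createStream is an illustrative hook; when omitted, ADODB.Stream is used (IE only).
function decodeResponseBody(xhr, charset, createStream) {
  var stream = createStream ? createStream() : new ActiveXObject("ADODB.Stream");
  stream.Type = 1;                 // binary: accept the raw bytes
  stream.Open();
  try {
    stream.Write(xhr.responseBody);
    stream.Position = 0;           // rewind, otherwise nothing is left to read
    stream.Type = 2;               // text
    stream.Charset = charset;
    return stream.ReadText(-1);    // -1 = read everything
  } finally {
    stream.Close();                // always release the stream
  }
}
```

In the page itself the call would then be something like `var s = decodeResponseBody(oxmlhttp, "iso-8859-1");`, with `s` used in place of `.responseText`.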

pbb72 · May 20, 2006

Hehehe, yeah with the help of your code, I was able to Google the final details together.
Thanks again man!
