Printing To The Screen in Chinese

Status
Not open for further replies.

Krus1972

Programmer
Mar 18, 2004
US
I am currently using classic ASP to parse an XML file that contains text in Chinese.

I have no problem parsing the actual XML file and the XML nodes within the file.

The problem I am having is that the text within the XML nodes is Chinese.


When I place the contents of an XML node (the Chinese text) in an ASP variable called "myvariable" and then write it to the screen using Response.Write(myvariable), I get nothing but question marks where the Chinese characters should be.

I want to preserve the Chinese characters and have them written to the screen unaltered.

Does anyone know why the Chinese text is converted to question marks when I store it in "myvariable" and then call Response.Write(myvariable)?

Any help would be well appreciated.



 
Also check how your Session.CodePage is set.
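
A rough illustration of why the code page matters. This is only a simplified model: it assumes a single-byte code page that replaces every character above U+00FF with a question mark. The function name encodeLossy is hypothetical, and in ASP the real conversion is done by Windows according to the active code page, not by script code.

```javascript
// Simplified model of writing a Unicode string through a single-byte code page.
// Assumption: anything above U+00FF cannot be represented and becomes "?".
function encodeLossy(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    out += code <= 0xFF ? text.charAt(i) : "?";
  }
  return out;
}

var sample = "MP3 \u4F60\u597D"; // "MP3" followed by two Chinese characters
console.log(encodeLossy(sample)); // the ASCII part survives, the Chinese does not
```

With a Unicode-capable code page such as 65001 (UTF-8) no such replacement is needed, which is why the setting is worth checking first.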

“Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family.” (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
I am using the following script and it is NOT showing the proper Chinese characters. It is showing a bunch of other special characters, but not the proper Chinese characters.

Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"

Response.CodePage = 65001
Response.CharSet = "utf-8"

SearchEngine = "URL_For_The_Chinese_XML_Feed"
Set XML = Server.CreateObject("Microsoft.XMLHTTP")
XML.Open "GET", SearchEngine, False
XML.Send
THESTRING = XML.responseXML.selectSingleNode("ItemSearchResponse/Description").Text
Response.write(THESTRING)


The XML feed does contain the proper Chinese characters, but when they are printed to the screen using the script above, they are changed to a bunch of other special-looking characters.

I am trying to preserve the original chinese text that is in the XML file and display it properly on the screen.

Any help would be well appreciated.

Thanks

 
I'd try adding Session.CodePage = 65001 and saving the entire ASP page as UTF-8 without a BOM.
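
To check whether a saved page still starts with a byte order mark, something along these lines could help. It is a minimal sketch: the helper name hasUtf8Bom and the sample byte arrays are made up for illustration, and the only fact relied on is that the UTF-8 BOM is the byte sequence EF BB BF.

```javascript
// Returns true when the byte array starts with the UTF-8 byte order mark EF BB BF.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// Hypothetical file contents: the two bytes of "<%" with and without a leading BOM.
var withBom = [0xEF, 0xBB, 0xBF, 0x3C, 0x25];
var withoutBom = [0x3C, 0x25];
console.log(hasUtf8Bom(withBom), hasUtf8Bom(withoutBom));
```

Under Node.js, the Buffer returned by fs.readFileSync can be passed in the same way, since it is indexable like an array.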

 

That doesn't work. Instead of getting the proper, unaltered Chinese characters, I am getting the following type of text:

MP3,但2G的主流容量却使得那些对存储容量有更大需求的用户望而却步。为满足更多用户的需求,飞利浦正式推出了SA28çš„4G版。在产品外观上,4G版和2Gç‰ˆå¹¶æ— åŒºåˆ«ï¼Œä¾æ—§é‡‡ç”¨äº†æœ€å—æ¬¢è¿Žçš„é»‘è‰²æœºèº«é…ä»¥é“¶è‰²è¾¹æ¡†çš„å½¢å¼ï¼Œä½†åœ¨åŒ…è£…ä¸Šå´é‡‡ç”¨äº†æ›´ä¸ºç²¾ç¾Žçš„æ°´æ™¶ç¤¼ç›’å¼åŒ…è£…ã€‚brbr很多男生都不喜欢将MP3æŒ‚åœ¨èƒ¸å‰ï¼Œè§‰å¾—é‚£æ ·ä¸å¤Ÿé…·ï¼Œæœ‰äº†è¿™ä¸ªæŠ¤å¥—å’Œçš®å¸¦æ‰£ï¼Œä½ å°±å¯ä»¥å¾ˆè½»æ¾çš„å°†SA28挂在腰带上,或者轻松地利用那个卡子将SA28随意地别在领口、衣襟,将它作为一件酷酷的装饰。brbrè‡‚å¸¦çš„åˆè¡·è™½ç„¶æ˜¯ä¸“ä¸ºè¿åŠ¨æ—¶å¬éŸ³ä¹è€Œè®¾è®¡ï¼Œä½†å®ƒçš„ä¾¿æ·æ€§å´ä½¿å®ƒè½»æ¾çªç ´äº†è¿åŠ¨çš„èŒƒç•´ï¼Œå¸¦ç»™ä½ æ›´åŠ è‡ªç”±çš„åº”ç”¨ä½“


I am really at a loss here.

 
I have tried so many things in an effort to get the Chinese characters to print to the screen properly using my code listed above.
I am at a complete loss.

Any help would be well appreciated.

 
If you want a foolproof way to print Chinese, send it as numeric character references (&#...; codes). For example:
Code:
<html>
<head><title>Chinese</title></head>
<body>
&#20320;&#22909;<!-- in decimal-->
&#x4F60;&#x597D;<!-- or in hex -->
</body>
</html>
This will print ni hao. UTF-8 isn't easy to handle by hand, as the number of bytes transmitted per character can vary from 1 to 4. You could, as an alternative, try UTF-16. If that still doesn't work, try reading the XML nodes and converting their characters to the relevant decimal or hex numbers for printing.
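
The conversion to decimal references mentioned above could be sketched roughly like this. The function name toNumericEntities is hypothetical, and the sketch only rewrites non-ASCII characters; it does not escape &, < or > in the input, which real page output would also need.

```javascript
// Convert every non-ASCII character to a decimal numeric character reference.
// Surrogate pairs are combined so a character outside the BMP gets one reference.
function toNumericEntities(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code >= 0xD800 && code <= 0xDBFF && i + 1 < text.length) {
      var low = text.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        code = 0x10000 + ((code - 0xD800) << 10) + (low - 0xDC00);
        i++; // the low surrogate has been consumed
      }
    }
    out += code < 128 ? text.charAt(i) : "&#" + code + ";";
  }
  return out;
}

console.log(toNumericEntities("\u4F60\u597D")); // the two characters of "ni hao"
```

The result is plain ASCII, so it survives whatever code page the response ends up using.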
 
Forgot that the forum would translate the entities. The code should read (without the spaces):

& # 20320 ; & # 22909 ; <!-- in decimal -->
& # x4F60 ; & # x597D ; <!-- in hex -->
 
I think it would be a pain in the neck to translate already properly stored Chinese into entities.
What was posted on 11 Nov 12 12:36 was already proper Chinese in UTF-8, just not interpreted correctly.
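
To see the effect outside ASP, here is a small sketch that assumes Node.js (Buffer does not exist in classic ASP). It uses Latin-1 as a stand-in for the wrong single-byte code page; the garbled text posted above looks more like Windows-1252, which differs only for bytes 0x80 to 0x9F.

```javascript
// Node.js only: Buffer is not available in classic ASP.
var original = "\u4F60\u597D";                 // the two characters of "ni hao"
var utf8Bytes = Buffer.from(original, "utf8"); // bytes E4 BD A0 E5 A5 BD

// Wrong: treat each byte as one Latin-1 character (six garbage characters).
var misread = utf8Bytes.toString("latin1");

// Right: decode the same bytes as UTF-8 (two Chinese characters).
var decoded = utf8Bytes.toString("utf8");

console.log(misread.length, decoded.length);
console.log(decoded === original);
```

The bytes are identical in both cases; only the interpretation changes, which is why the fix belongs in the code page settings rather than in the data.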

Since your code now appears to be set up properly, I suspect your IIS is not yet configured correctly.
Go to Start=>Control Panel=>Administration=>IIS Manager=>ASP=>Codepage
Set it to 65001.

Good luck!


 
Could you post a sample of what you are trying to display? Just the first 10 characters will do.
 
3 things you need for the ASP file:
<%@ CodePage=65001 Language="VBScript" %>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Save As Encoding: UTF-8
 
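I've just been playing with this. In my test, IE did not show a string holding raw UTF-8 bytes as Chinese. Script strings are Unicode internally, so the right characters only appear once the bytes have been decoded to Unicode, even though the page encoding is declared as utf-8. You need a bit of scripting to read in the UTF-8 bytes and convert them to Unicode.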
Code:
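<%@ CodePage=65001 Language="JavaScript" %>
<%
// Send the page as UTF-8. ContentType plus CharSet already produce the
// Content-Type header with its charset, so no separate AddHeader call is needed.
Response.CodePage = 65001;
Response.ContentType = "text/html";
Response.CharSet = "UTF-8";
// Build up a string whose characters hold the raw UTF-8 bytes of U+4F60 U+597D
var str = "";
str += "\xE4\xBD\xA0";
str += "\xE5\xA5\xBD";
// This works
Response.Write("Supposed to get &#x4f60;&#x597d;<br/>");
Response.Write("This is utf-8: " + str + "<br/>");
// Convert UTF-8 bytes to Unicode characters (1- to 3-byte sequences only)
var ustr = "";
var ii = 0, imax = str.length;
var utf1, utf2, utf3;
while (ii < imax) {
   utf1 = str.charCodeAt(ii++);
   if ((utf1 & 0x80) == 0) {
      // 1 byte: 0xxxxxxx
      ustr += String.fromCharCode(utf1);
   } else if ((utf1 & 0xE0) == 0xC0) {
      // 2 bytes: 110xxxxx 10xxxxxx
      utf2 = str.charCodeAt(ii++);
      ustr += String.fromCharCode(((utf1 & 31) << 6) | (utf2 & 63));
   } else if ((utf1 & 0xF0) == 0xE0) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      utf2 = str.charCodeAt(ii++);
      utf3 = str.charCodeAt(ii++);
      ustr += String.fromCharCode(((utf1 & 15) << 12) | ((utf2 & 63) << 6) | (utf3 & 63));
   }
   // Stray continuation bytes and 4-byte sequences are skipped in this sketch
}
Response.Write("This is unicode: " + ustr + "<br/>");
%>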
 
You could also do it in VBScript, but the lack of shift operators makes the coding a bit awkward.
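
For what it's worth, the shifts can be replaced by plain arithmetic, which is what a VBScript port would have to do (VBScript has Mod and And but no shift operators). A minimal sketch, shown in JavaScript so it can be compared with the code above; the function name decode3WithoutShifts is made up:

```javascript
// Decode one 3-byte UTF-8 sequence using only arithmetic (no << or &).
// b1 % 16 keeps the low 4 bits of the lead byte; b % 64 keeps the low 6 bits
// of each continuation byte; multiplying by 4096 and 64 replaces << 12 and << 6.
function decode3WithoutShifts(b1, b2, b3) {
  return (b1 % 16) * 4096 + (b2 % 64) * 64 + (b3 % 64);
}

console.log(decode3WithoutShifts(0xE4, 0xBD, 0xA0).toString(16)); // code point of the first character, in hex
```

In VBScript the same expression would use Mod in place of %, with the result passed to ChrW.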
 