Special character (&#160;) becomes a question mark after transform

Status
Not open for further replies.

kanghao

IS-IT--Management
Jul 4, 2004
68
KR

The special character &#160; becomes a question mark after the transform.

=======Server locale ============
[tree]:/home>locale

LANG=C
LC_CTYPE=C
LC_COLLATE="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES="C"
LC_ALL=

==========Test.java=============
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.File;
import java.io.FileInputStream;
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;


public class Test {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory;
        DocumentBuilder builder;
        Document domDoc;
        Element root;

        factory = DocumentBuilderFactory.newInstance();
        builder = factory.newDocumentBuilder();
        File f = new File("/home/aaa.xml");

        // Read the whole file into a byte array.
        FileInputStream fis = new FileInputStream(f);
        byte[] b = new byte[(int) f.length()];
        fis.read(b);
        fis.close();

        // Decode with the platform default charset, then re-encode with it for the parser.
        String xml_str = new String(b);
        domDoc = builder.parse(new ByteArrayInputStream(xml_str.getBytes()));
        root = domDoc.getDocumentElement();
        System.out.println("Euc-kr ==========================");
        System.out.println(GetStringClass.getStringedNodeEucKr(root));
        System.out.println("Replace =========================");
        System.out.println(GetStringClass.getStringedNodeReplace(root));
    }
}
==============================
====GetStringClass.java=======
import java.io.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.transform.*;
import org.w3c.dom.*;

public class GetStringClass {

    // Serialize the node with the output encoding set to EUC-KR.
    public static String getStringedNodeEucKr(Node node) throws TransformerException {
        StringWriter sw = new StringWriter();
        DOMSource source = new DOMSource(node);
        StreamResult result = new StreamResult(sw);
        Transformer form = TransformerFactory.newInstance().newTransformer();
        form.setOutputProperty(OutputKeys.ENCODING, "euc-kr");
        form.setOutputProperty(OutputKeys.INDENT, "no");
        form.transform(source, result);
        String value = sw.toString();
        return value;
    }

    // Serialize with the default (UTF-8) settings, then patch the encoding name
    // in the XML declaration. The characters themselves are not re-encoded.
    public static String getStringedNodeReplace(Node node) throws TransformerException {
        StringWriter sw = new StringWriter();
        DOMSource source = new DOMSource(node);
        StreamResult result = new StreamResult(sw);
        Transformer form = TransformerFactory.newInstance().newTransformer();
        form.transform(source, result);
        String value = sw.toString();
        value = value.replaceFirst("UTF-8", "EUC-KR");
        return value;
    }
}
============aaa.xml==============
<?xml version="1.0" encoding="EUC-KR"?>
<Summary>
<p>(one_Korean_character)&#160;(one_Korean_character)&#160;&#160;(one_Korean_character)</p>
</Summary>

========== output ===============

Euc-kr ==========================
<?xml version="1.0" encoding="EUC-KR"?>
<Summary>
<p>(?)&#160;(?)&#160;&#160;(?)</p>
</Summary>
Replace =========================
<?xml version="1.0" encoding="EUC-KR"?>
<Summary>
<p>(?)??)?(?)</p>
</Summary>

Bottom line: I can't change the OS character set, and this is on JDK 1.4.2.05.
Thanks.
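
One possible source of the question marks is the String step in Test.java: new String(b) and xml_str.getBytes() both use the platform default charset, which follows the locale. The sketch below is only an illustration under assumptions. CharsetRoundTrip is a hypothetical class, and the charset names are passed explicitly to stand in for whatever default the JVM in the question actually picks under LANG=C, which the thread does not state.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the code posted in this thread.
public class CharsetRoundTrip {

    // Decode bytes to a String and encode them back with the same charset,
    // mimicking new String(b) followed by getBytes() under a given default.
    public static byte[] roundTrip(byte[] raw, String charset) {
        try {
            return new String(raw, charset).getBytes(charset);
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    // Encode a String with the named charset.
    public static byte[] encode(String text, String charset) {
        try {
            return text.getBytes(charset);
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    // Format bytes as space-separated two-digit hex values.
    public static String toHex(byte[] bytes) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            String h = Integer.toHexString(bytes[i] & 0xFF);
            if (h.length() == 1) sb.append('0');
            sb.append(h);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xB0 0xA1 is the EUC-KR encoding of the Hangul syllable U+AC00.
        byte[] euc = { (byte) 0xB0, (byte) 0xA1 };

        // A 7-bit charset cannot represent the bytes, so both become '?' (0x3f).
        System.out.println(toHex(roundTrip(euc, "US-ASCII")));   // prints "3f 3f"

        // ISO-8859-1 maps every byte to a char and back, so nothing is lost.
        System.out.println(toHex(roundTrip(euc, "ISO-8859-1"))); // prints "b0 a1"

        // U+00A0 (what &#160; becomes after parsing) has no US-ASCII byte either.
        System.out.println(toHex(encode("\u00A0", "US-ASCII"))); // prints "3f"
    }
}
```

If the default charset turns out to be lossy in this way, handing the File or the raw byte array to builder.parse (for example builder.parse(new ByteArrayInputStream(b))) skips the String step, so the parser can pick the encoding from the XML declaration itself. The same default charset also applies when System.out.println writes the result, so a correct String can still print as question marks.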
 
It looks like you're setting your encoding correctly to Korean. Have you written your output to a file and opened it in a hex editor to see whether the no-break space is still there? (&#160; is decimal 160, i.e. U+00A0, which is the bytes 0xC2 0xA0 in UTF-8.)

If it's in the file correctly, then more than likely it's a font problem.
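
The hex check suggested above can also be done without a hex editor. The following is a minimal sketch, assuming a hypothetical HexCheck class and a small inline test document instead of aaa.xml. It serializes to a byte stream rather than a StringWriter, so neither the default charset nor the console font affects what is inspected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the code posted in this thread.
public class HexCheck {

    // Parse the XML text and serialize it straight to bytes in the requested
    // output encoding, with no intermediate String of serialized output.
    public static byte[] serialize(String xml, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Format bytes as space-separated two-digit hex values.
    public static String toHex(byte[] bytes) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            String h = Integer.toHexString(bytes[i] & 0xFF);
            if (h.length() == 1) sb.append('0');
            sb.append(h);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<p>a&#160;b</p>";

        // UTF-8 can represent U+00A0, so look for "61 c2 a0 62" in this dump.
        System.out.println(toHex(serialize(xml, "UTF-8")));

        // US-ASCII cannot, so the serializer has to fall back to a character
        // reference and every byte in this dump should be below 0x80.
        System.out.println(toHex(serialize(xml, "US-ASCII")));
    }
}
```

If the bytes come out as expected here but the console still shows question marks, the data is probably fine and the loss happens at display time, either in the charset System.out uses or in the terminal.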

Chip H.

