I'm establishing a TCP connection to a remote chat server. Once connected, we exchange data back and forth in a typical fashion.
The server sends data UTF-8 encoded. I read from the socket, convert the bytes into chars, and read them through a BufferedReader. The result is then converted to a String and printed to a JTextArea.
The problem is this: when converting the bytes, if I do nothing (i.e., accept the default charset), then nothing but the lower ASCII characters appears correctly. If I convert the bytes using a UTF-8 decoder, then many more characters show up correctly, but many others still look like boxes in my JTextArea.
So if I have no UTF-8 decoder and simply read bytes and append them to the JTextArea, no special characters show up correctly (they look like unconverted UTF-8 byte sequences)...
in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
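For what it's worth, that "unconverted UTF-8" effect can be reproduced without a socket: decoding UTF-8 bytes with a single-byte charset (here ISO-8859-1, as a stand-in for a non-UTF-8 platform default) produces exactly that kind of mojibake. A minimal sketch (class name is my own):

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        // UTF-8 encodes "é" as the two bytes 0xC3 0xA9; decoding those
        // bytes as Latin-1 turns each byte into its own character.
        byte[] utf8 = "é".getBytes(StandardCharsets.UTF_8);
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong); // prints "Ã©"
    }
}
```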
If I define a UTF-8 decoder, then when I append to the JTextArea, many more characters (the extended ones) now show up correctly, but others (the even more extended ones) show up as boxes.
Charset charset = Charset.forName("UTF-8");
CharsetDecoder myUTF8Decoder = charset.newDecoder();
in = new BufferedReader(new InputStreamReader(socket.getInputStream(), myUTF8Decoder));
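For reference, the same decoding path can be exercised without a live server by feeding UTF-8 bytes through a ByteArrayInputStream; the socket's getInputStream() plugs in the same way. A minimal self-contained sketch (class and method names are my own); note that StandardCharsets.UTF_8 can be passed to InputStreamReader directly, without hand-building a CharsetDecoder:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class Utf8DecodeDemo {
    // Decode a UTF-8 byte stream into a String line, the same way the
    // socket-backed reader above does.
    static String readUtf8Line(InputStream raw) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.UTF_8));
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        String original = "héllo κόσμε 你好";
        String decoded = readUtf8Line(
                new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8)));
        System.out.println(decoded.equals(original)); // prints "true"
    }
}
```

If the decoding round-trips correctly in isolation like this but some characters still render as boxes, the remaining problem is likely the JTextArea's font lacking glyphs for those code points rather than the decoding itself.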
So, am I decoding these wrong? Or is the JTextArea not set up to show these extended characters? Doesn't Java use Unicode by default? Is it a locale problem?
Any help would be greatly appreciated. I've battled with this for days!