
Character set checking

Status
Not open for further replies.

fatcodeguy

Programmer
Feb 25, 2002
281
CA
Hi,

I have a database with CLOBs that I query to write XML files from the fields (UTF-8 encoding), and sometimes the file contains a character that the XML parser rejects as invalid (if I open the generated XML in Notepad, it shows up as a square).

I'm pretty sure this is a character set issue that originates from the way the data was inserted into the CLOB. I can't go back and do it right; I have to work with what's in the DB.

Here's my question: How can I check for an invalid character?

Thanks!!
 
It's probably a newline or CRLF character. Try doing a replace on the CLOB before you parse it as XML, e.g.:

clobString = clobString.replaceAll("\r\n", "").replaceAll("\n","");

or something similar ...

If that fails, you'll need to rip through the CLOB looking at each character to work out what it is ...
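If it does come to that, here is a minimal sketch of one way to do it, assuming the parser enforces the XML 1.0 character rules (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF are legal; everything else is not). The names `XmlCharFilter`, `isLegalXmlChar` and `stripInvalidXmlChars` are made up for illustration and are not from any library.

```java
class XmlCharFilter {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the text with every illegal code point dropped.
    static String stripInvalidXmlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);   // handles surrogate pairs
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);   // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+0001 and U+000B are control characters that XML 1.0 forbids.
        String dirty = "ok\u0001\u000Bdone";
        System.out.println(stripInvalidXmlChars(dirty)); // prints "okdone"
    }
}
```

Dropping the characters silently loses data, so it may be safer to replace them with a visible placeholder or log them first, depending on what the XML is used for.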

 
That's what I figured it would be too, but I checked for it and that's not it.

How would I check each individual character against the character set?
 
There may be a better way, but I guess I would do:

char[] ca = clobString.toCharArray();
for (int i = 0; i < ca.length; i++) {
    // Print each character next to its numeric code so the odd one stands out
    System.out.println(ca[i] + " " + ((int) ca[i]));
}

and try to work out what is going on ...
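For a long CLOB, reading every character by eye gets tedious, so a variation is to report only the suspicious ones. The sketch below assumes the XML 1.0 legal character ranges are the right test for what the parser rejects; `XmlCharReport`, `isLegalXmlChar` and `findInvalidXmlChars` are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

class XmlCharReport {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Lists the position and hex code of every illegal code point.
    static List<String> findInvalidXmlChars(String text) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isLegalXmlChar(cp)) {
                problems.add("index " + i + ": U+" + String.format("%04X", cp));
            }
            i += Character.charCount(cp);
        }
        return problems;
    }

    public static void main(String[] args) {
        for (String problem : findInvalidXmlChars("ab\u0001c")) {
            System.out.println(problem); // prints "index 2: U+0001"
        }
    }
}
```

Knowing the actual code points usually says a lot about the cause: values below U+0020 suggest stray control characters, while values in the U+0080 to U+009F range often point to text from a Windows code page that was stored as if it were ISO-8859-1.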

 