& problems


sedj

Programmer
Aug 6, 2002
5,610
Hi,

As we know, XML parsers will raise an error if an element looks like this:

<myTag>Bodgit & Scarper</myTag>

and the way to fix this is to have it as :


<myTag>Bodgit &amp; Scarper</myTag>

But is there any way to make the first form work? Currently, I am running a String.replace()-type function on the XML before parsing, to convert '&' into '&amp;'. Is this the only real solution?

Cheers
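
To make the question concrete, here is a minimal Python sketch (the thread itself names no language, so Python and the helper name `try_parse` are assumptions). It shows the parser rejecting the bare ampersand, accepting the escaped form, and what a blanket replace does to input that was already escaped.

```python
import xml.etree.ElementTree as ET

bad = "<myTag>Bodgit & Scarper</myTag>"
good = "<myTag>Bodgit &amp; Scarper</myTag>"

def try_parse(text):
    """Return the element text, or None if the document is not well-formed."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError:
        return None

# A blanket replace repairs the bare ampersand, but it also rewrites
# ampersands that already start a valid reference (double escaping).
naive = good.replace("&", "&amp;")

print(try_parse(bad))    # the bare '&' is rejected
print(try_parse(good))   # the escaped form parses
print(try_parse(naive))  # parses, but the text now contains a literal "&amp;"
```

The last case is the catch: a plain replace is only safe when the input is known to contain no references at all.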




--------------------------------------------------
Free Database Connection Pooling Software
 
Hi sedj!

I've been down that route, and it's a mess. Your code needs to do something like this:

1. Check for an ampersand
2. See if there are characters following the ampersand
3. See if the following characters make up a valid predefined entity (&amp; &lt; etc)
4. See if the following characters make up a user-defined entity (from an XSD, DTD, etc)
5. If neither of the above is true, then you can replace it with the &amp; entity
6. Otherwise go onto the next ampersand character
7. Don't forget that a reference has to be contiguous. Something split across lines, like:
[tab][tab]&a
[tab][tab]mp;
is not a valid reference, so that ampersand needs escaping, too.
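
The checks in this list can be sketched roughly as follows. This is only an illustration in Python, not a complete solution: the function name `escape_bare_ampersands` and the `user_entities` parameter are made up for the example, numeric character references such as `&#38;` are kept as they are, and CDATA sections, comments and processing instructions are not handled.

```python
import re

# The five entities predefined by XML itself.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# An ampersand, optionally followed by something shaped like a reference body
# and a terminating semicolon. XML names cannot contain whitespace, so a
# "reference" broken across a line never matches the optional part.
_REF = re.compile(r"&(?:(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*);)?")

def escape_bare_ampersands(text, user_entities=()):
    """Escape every '&' that does not start a known reference.

    user_entities: names declared elsewhere (for example in a DTD); the
    caller has to supply them, this sketch does not read any DTD.
    """
    known = PREDEFINED | set(user_entities)

    def fix(match):
        body = match.group(1)
        if body is not None and (body.startswith("#") or body in known):
            return match.group(0)            # valid reference: leave alone
        return "&amp;" + match.group(0)[1:]  # bare ampersand: escape it

    return _REF.sub(fix, text)

print(escape_bare_ampersands("Bodgit & Scarper &amp; Sons"))
print(escape_bare_ampersands("a&nbsp;b", user_entities=["nbsp"]))
```

Passing the declared entity names in by hand keeps the sketch small; a real implementation would have to get them from wherever the document declares them.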

I would strongly encourage you to get the document supplier to do it correctly in the first place.

Chip H.


____________________________________________________________________
Click here to learn Ways to help with Tsunami Relief
If you want to get the best response to a question, please read FAQ222-2244 first
 
Thanks Chip, I was afraid that was the case ! Oh well :(



 
..And...
8. Watch out for CDATA blocks ;-)

You might be able to use Regular Expressions to do this...
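
As a rough sketch of the regular-expression idea (Python, with a hypothetical function name `escape_outside_cdata`): one pattern matches either a whole CDATA section, which is passed through untouched, or an ampersand that is not followed by something shaped like a reference. It only checks the shape of a reference, not whether the entity name is actually declared, and it ignores comments and processing instructions.

```python
import re

# Alternative 1: a complete CDATA section, captured so it can be kept as-is.
# Alternative 2: an '&' that does not begin 'name;', '#digits;' or '#xhex;'.
_TOKEN = re.compile(
    r"(<!\[CDATA\[.*?\]\]>)"
    r"|&(?!(?:#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*);)",
    re.DOTALL,
)

def escape_outside_cdata(text):
    """Escape bare ampersands, leaving CDATA sections untouched."""
    # group(1) is set only when a CDATA section matched; otherwise the
    # match is a single bare ampersand.
    return _TOKEN.sub(lambda m: m.group(1) or "&amp;", text)

print(escape_outside_cdata("<a><![CDATA[x & y]]> p & q &lt; r</a>"))
```

Because the CDATA alternative is tried first at each position, an ampersand inside a section is consumed along with the section and never reaches the second alternative.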

PROGRAMMER: (n) Red-eyed, mumbling mammal capable of conversing with inanimate objects.
 
Forgot about them. Thanks CubeE101.

Chip H.


 
You _could_ try changing the parser's charset to ISO-8859-1. However, I don't know if this will work.

Worth a try?

Matt

 
flumpy -

Won't make a difference in this case, as the problem is with XML itself.
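
A quick way to check this, sketched in Python (the helpers `doc` and `is_well_formed` are made up for the example): build the same small document with two different encoding declarations and see whether the bare ampersand is accepted under either.

```python
import xml.etree.ElementTree as ET

def is_well_formed(data):
    """True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

def doc(encoding, body):
    """Build a small document as bytes with an explicit encoding declaration."""
    text = f'<?xml version="1.0" encoding="{encoding}"?><myTag>{body}</myTag>'
    return text.encode(encoding)

for enc in ("UTF-8", "ISO-8859-1"):
    print(enc, is_well_formed(doc(enc, "Bodgit & Scarper")),
          is_well_formed(doc(enc, "Bodgit &amp; Scarper")))
```

The encoding only decides how bytes map to characters; once the parser sees the character '&', the same markup rules apply.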

Actually, it's a similar problem to what you have in any data transfer file. If you use a CSV file, the comma becomes a special value, and you have to escape it in some fashion to prevent the recipient from thinking there are a variable number of fields on each row.

If you're using quotes to indicate string values (and to indicate that a comma within the string doesn't count as a field separator), you now have to escape the quote characters. There's just no way to win sometimes...
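
To illustrate the CSV case, a small sketch using Python's standard `csv` module (the helper names `to_csv_line` and `from_csv_line` are assumptions for the example). With its default settings the writer quotes a field that contains a comma and doubles any quote characters inside a quoted field, and the reader undoes both.

```python
import csv
import io

def to_csv_line(fields):
    """Serialise one row, letting the csv module decide what to quote."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()

def from_csv_line(line):
    """Parse one row back into a list of fields."""
    return next(csv.reader(io.StringIO(line)))

row = ["Bodgit, Scarper", 'say "hi"']
line = to_csv_line(row)
print(line, end="")
print(from_csv_line(line) == row)
```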

XML's method of escaping it is to require it to be sent as &amp;.
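
On the producing side this is usually a one-liner. A minimal Python sketch, assuming the supplier builds the XML by hand from strings (the helper `make_element` is hypothetical, and `tag` is assumed to be a valid element name): `xml.sax.saxutils.escape` from the standard library replaces `&`, `<` and `>` in text content.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

def make_element(tag, text):
    """Wrap text in an element, escaping the characters XML reserves."""
    return f"<{tag}>{escape(text)}</{tag}>"

xml_text = make_element("myTag", "Bodgit & Scarper")
print(xml_text)
print(ET.fromstring(xml_text).text)  # the parser hands back the original text
```

Escaping at the point where the text is written means the consumer never has to guess which ampersands were meant literally.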

Chip H.


 