
how to modify XML DTD for Escape characters

Status
Not open for further replies.

vwani

Programmer
May 26, 2004
10
US
Hi,

I have a standard XML parser written in C++, but it doesn't handle special characters such as &, <, >, and '.
I have a DTD that I want to modify so that these characters are handled. Can someone help me?

My DTD is as follows

<!ELEMENT media (objectURI|type|size|name|NotifyURI)*>
<!ELEMENT NotifyURI (#PCDATA)>
<!ELEMENT size (#PCDATA)>
<!ELEMENT type (#PCDATA)>
<!ELEMENT name (#PCDATA)>

The XML document that I'm parsing is as follows
<media>
<name>ABC</name>
<type>audio/mp3</type>
<size>12345</size>
<NotifyURI><description>DESC</description>
</media>

My parser does not escape "&" within <NotifyURI>, so it no longer parses the document.

Can someone suggest what changes I should make to my DTD so that my parser accepts the URI as is, without treating "&" as an escape character?

Thanks,
VW
 
It's not the parser's job to "take care of" special characters. They signal the parser to treat what follows differently from normal characters; what you want would make them useless.

The way to go about this is to prevent the problem when you're writing the XML. Apply a URLEncode(string) method to the content of the <NotifyURI> element before writing the element.
 
The problem is that I don't have control over the format of the URL.
If I want to use URLEncode on my URL in C++, how do I do it?

I understand that I can also do this with a DTD.

Thanks
vw
 
You have it backwards. You can obliterate the meanings of the special entities (& lt; etc.) and create new special entities in a DTD, but you can't change the meanings of the reserved characters.

You might want to look up the URLEncode method on MSDN.
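
On the C++ question: standard C++ has no built-in URL-encoding function, so a hand-written helper is one option. The following is only a minimal sketch, assuming the code that writes the XML can be changed; `percent_encode` is a hypothetical name, not part of any library mentioned in this thread. It percent-encodes every byte that is not an RFC 3986 "unreserved" character, which removes "&", "<", and ">" from the element content. Note that it also encodes ":" and "/", so whoever reads the value has to decode it again.

```cpp
#include <string>

// Hypothetical helper (not from the thread): true for the RFC 3986
// "unreserved" characters, which are left untouched.
static bool is_unreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '_' || c == '.' || c == '~';
}

// Hypothetical helper: percent-encode every byte that is not unreserved.
std::string percent_encode(const std::string& input) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(input.size() * 3);  // worst case: every byte becomes %XX
    for (unsigned char c : input) {
        if (is_unreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];     // high nibble
            out += hex[c & 0x0F];   // low nibble
        }
    }
    return out;
}
```

For example, `percent_encode("a&b=c d")` yields `a%26b%3Dc%20d`.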
 
Those characters are special because they were designated by the W3C as being predefined entities in XML.

Whoever is sending you the data is doing it wrong, and there's no way to make it "right" on your end.

Chip H.


 
Well, Chip, yes there is, but not with XML technologies. He'd have to run it through a stream editor, for example, and edit the URL element, replacing &'s with & amp;'s. But as you state, the data creator is doing it wrong and if vwani has any control over the process at all, he should insist on getting well-formed XML.
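
The stream-editing idea can be sketched in C++ as a pre-processing pass over the raw text before it reaches the parser. This is only an illustration under stated assumptions: `escape_ampersands` and `fix_element` are hypothetical names, the element is located by plain string search rather than by parsing, and every "&" inside it is assumed to be a raw ampersand (input that is already escaped would be escaped twice).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every '&' in text with "&amp;".
std::string escape_ampersands(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '&') out += "&amp;";
        else out += c;
    }
    return out;
}

// Hypothetical helper: escape ampersands only between <tag> and </tag>,
// for each occurrence of the element in doc. Text outside the element
// is copied through unchanged.
std::string fix_element(const std::string& doc, const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    std::string out;
    std::size_t pos = 0;
    while (true) {
        std::size_t start = doc.find(open, pos);
        if (start == std::string::npos) break;
        start += open.size();                      // first byte of the content
        std::size_t end = doc.find(close, start);
        if (end == std::string::npos) break;       // unterminated: leave as is
        out.append(doc, pos, start - pos);         // everything up to the content
        out += escape_ampersands(doc.substr(start, end - start));
        pos = end;                                 // resume at the closing tag
    }
    out.append(doc, pos, std::string::npos);       // remainder of the document
    return out;
}
```

Calling `fix_element(raw, "NotifyURI")` on text containing `<NotifyURI>http://example.com/n?a=1&b=2</NotifyURI>` (a made-up URL) turns the content into `http://example.com/n?a=1&amp;b=2`, which a conforming parser then reads back as the original URL.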
 