Anyone ever seen XML like this?

Status
Not open for further replies.

woolly2

Programmer
Jan 22, 2007
11
US
Hi

I am fairly new to XML, and I'm working with a client who is using a form of XML I have never seen before. Has anyone here seen it? As you can see, they are encoding the angle brackets of the tags, and I have no idea why!

Thanks

<Address>
&lt;Line1></Line1&gt;
&lt;Line2/&gt;
&lt;Line3/&gt;
&lt;City></City&gt;
&lt;State></State&gt;
&lt;PostalCode></PostalCode&gt;
&lt;/Address&gt;
 
XML predefines five entities for special characters in character data: (>,<,&,",') are encoded as (&gt;,&lt;,&amp;,&quot;,&apos;). That could be one possible reading of the data. (But either you posted in such a rush that your sample is only partly encoded, or the data itself is insufficiently encoded.) For instance, this is a well-formed XML document on its own.
[tt]
<root>
&lt;Address&gt;
&lt;Line1[red]&gt;&lt;[/red]/Line1&gt;
&lt;Line2/&gt;
&lt;Line3/&gt;
&lt;City[red]&gt;&lt;[/red]/City&gt;
&lt;State[red]&gt;&lt;[/red]/State&gt;
&lt;PostalCode[red]&gt;&lt;[/red]/PostalCode&gt;
&lt;/Address&gt;
</root>
[/tt]
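
As a rough illustration of the point above, here is a minimal Python sketch using the standard library's ElementTree. The `<root>` wrapper comes from the example above; the shortened element list is an assumption made for brevity. A parser accepts the document, but sees only character data inside `<root>`, not an `Address` element.

```python
import xml.etree.ElementTree as ET

# A shortened version of the document above: the inner markup is fully escaped.
document = (
    "<root>"
    "&lt;Address&gt;"
    "&lt;Line1&gt;&lt;/Line1&gt;"
    "&lt;Line2/&gt;"
    "&lt;/Address&gt;"
    "</root>"
)

root = ET.fromstring(document)

# The parser resolves the entities into plain text; no child elements exist.
child_count = len(root)
text = root.text

# The text happens to be XML itself, so it can be parsed in a second pass.
inner = ET.fromstring(text)
inner_tags = [child.tag for child in inner]
```

In other words, the outer document is well-formed, and the escaped content is just a string that the receiver may or may not choose to parse again.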
 
I'm sorry, I did make a bit of a mess of the XML in my first post. It should have been:

&lt;Address&gt;
&lt;Line1&gt;&lt;/Line1&gt;
&lt;City&gt;&lt;/City&gt;
&lt;State&gt;&lt;/State&gt;
&lt;PostalCode&gt;&lt;/PostalCode&gt;
&lt;/Address&gt;

I am still confused. Are you saying it is legal XML or not? And if it is, why would anyone encode the XML structure rather than just the data?

Thanks
 
>why would anyone encode the XML structure rather than just the data?
XML can be used in many ways. That character data represents some information that the agent/client at the receiving end will understand. Here the data represents an XML element named Address, and that is a piece of information valuable enough for that client. So this is one instance of how it might come about. You can imagine a lot more.
 
I'm sorry, I really don't get this. If XML is supposed to be very strict in the way you use and write it, how can you just encode or not encode parts of it as a personal preference? The client not only wants me to pass the XML in this way, but they return it in that format as well. The Java parser class I am using does not recognize it as valid XML, so I will have to decode their encoding. I can of course do that, but it seems like a long way around when the simple solution is to pass it back as valid XML in the first place.
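
The decode-then-parse workaround described above might look roughly like this. It is a sketch in Python rather than the Java parser mentioned in the post, and the helper names `decode_payload` and `parses_directly` are hypothetical. The payload is the corrected example from this thread, and it is assumed to be escaped exactly once.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

# Payload as the client returns it, with the angle brackets escaped.
encoded = (
    "&lt;Address&gt;"
    "&lt;Line1&gt;&lt;/Line1&gt;"
    "&lt;City&gt;&lt;/City&gt;"
    "&lt;State&gt;&lt;/State&gt;"
    "&lt;PostalCode&gt;&lt;/PostalCode&gt;"
    "&lt;/Address&gt;"
)

def decode_payload(text):
    """Undo one level of entity escaping, then parse the result as XML."""
    # unescape() handles &lt;, &gt; and &amp;; the quote entities are added here.
    extra = {"&quot;": '"', "&apos;": "'"}
    return ET.fromstring(unescape(text, extra))

def parses_directly(text):
    """Report whether the text is accepted as an XML document as-is."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

address = decode_payload(encoded)
child_tags = [child.tag for child in address]
direct = parses_directly(encoded)
```

The escaped string is rejected when parsed directly because it has no root element; after one pass of unescaping it parses normally.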
 
Your client is confused, but that may not be a very helpful response.

I have seen this happen enough to know that the programmer on the client side is probably new to XML and most likely is using VB or C# (but that is just an educated guess).

You can write an XSLT stylesheet to take your well-formed XML and render it in this way for delivery to your client. I would have to see exactly what you are getting from the client to help you understand how to recover the information into a well-formed XML document.

Tom Morrison
 
Tom

Thanks for the response. To be clear, I'm not really looking for any help with the XML; if I have to, I'll work it out, it can't be that hard. What I'm trying to establish is this: first, is what the client is using here valid XML? And second, if it is, why would you encode the structure of the XML rather than just the data?

I'm not sure what they are using but it is a Microsoft based language, C# would be a good guess.

I am sure they are confused because whilst they encode the angle brackets they leave the quotes alone, which to my mind defeats their argument.

Thanks
 
Your OP does not show unescaped quotes. Perhaps you should show a real example (with sanitized data).

Your OP shows something that is invalid XML for several reasons. That is why I suggest showing a real example from this poorly formed application. tsuji is an expert, and is showing a well-formed XML document on a conceptual basis from your original post.

woolly2 said:
why would you encode the structure of the XML rather than just the data?

Because of confusion. Almost certainly the other party is really unaware of what has happened, and is a subscriber to the notion that ignorance is bliss.

So, what to do? Either adapt to the confused, or, if there is a specification governing the data interchange, press for the other party to conform to the agreed specification.

Tom Morrison
 
Tom, here is a line that, to me, proves I am dealing with a confused person (I changed the data to protect the innocent). As you can see, they are not encoding the quotes, just the angle brackets. I am going to try to get them to at least pass back valid XML, but I'm not that hopeful. Believe it or not, this is a very big company I am dealing with, hence my caution here. If all else fails I will just decode their encoding and make the best of a messy situation. As you said, ignorance is bliss, but not so much when you're on the receiving end of it :)

&lt;user emailAddress="user@domain.com" /&gt;

Thanks for all of the help Tom and tsuji

 
The OP has to understand that the scope of use of XML is bigger than one fixed idea. (If the client is confused, your job is to enlighten them. But do not let your own immature idea negate the client's requested functionality and lead you to a wrong judgement in front of your client.) Imagine that the string is to be inserted into a textarea of an HTML page. How do you do it? You encode the string like this.
[tt]
<html>
<body>
<form>
<textarea cols="50" rows="10">
&lt;Address&gt;
&lt;Line1&gt;&lt;/Line1&gt;
&lt;Line2/&gt;
&lt;Line3/&gt;
&lt;City&gt;&lt;/City&gt;
&lt;State&gt;&lt;/State&gt;
&lt;PostalCode&gt;&lt;/PostalCode&gt;
&lt;/Address&gt;
</textarea>
</form>
</body>
</html>
[/tt]
That is how the tutorial sites out there do it. Is it a legitimate use?
 
tsuji,

Your last response came at the same time as the OP, but you can see that the client is using incorrect escaping, and the result cannot be used in a textarea as in your example.

I too have seen this type of problem where companies (including, for example, a very large automaker) have created these confused results. I see this mostly where EDI is being replaced by XML, and the engineers have not fully informed themselves about XML. Another source for this confusion is engineers recently graduated from college or university, who are convinced that they never make mistakes, even though they find themselves in the presence of a guru.

Tom Morrison
 
Hi, Tom.

If my first post gave the impression that you _must_ encode all five entities whenever they appear, that is regrettable.

In fact, [1] in a text node, only (<,&) must be encoded: (<) for the obvious reason, and (&) because it introduces the encoding itself. (>) is customarily encoded as well, and must be when it appears in the sequence (]]>). (",') are not necessary there; you can encode them, but you can also leave them as they are. But [2] in an attribute node, it is another matter, again for an obvious reason. If the attribute value is enclosed in quotes, then (") needs to be encoded, while (') is still optional. Similarly if the attribute value is enclosed in apostrophes. If the above is observed, the document is still well-formed, which means well-formedness is not only about tagging but also involves some encoding requirements.

So if the info is written in a text node, it is perfectly legal to see (",') not encoded. It should not be a problem.
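
The difference between text-node and attribute escaping can be sketched with Python's standard `xml.sax.saxutils` helpers. The sample strings and the `user` element are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Text node: escape() encodes &, < and >, and leaves both quote characters alone.
text = 'He said "5 < 6" & it\'s true'
escaped_text = escape(text)

# Attribute value: quoteattr() picks a delimiter and encodes only what clashes.
plain = quoteattr("user@domain.com")    # no quotes inside: wrapped in "
has_double = quoteattr('say "hi"')      # only " inside: wrapped in '
has_both = quoteattr('it\'s "x"')       # both inside: wrapped in ", " encoded

# Round trip: the parser recovers the original attribute value.
value = 'it\'s "x"'
element = ET.fromstring("<user note=%s/>" % quoteattr(value))
recovered = element.get("note")
```

Unencoded quotes in a text node are therefore not, on their own, evidence of broken escaping.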
 
Tsuji, if I have you right (and to be honest I am far from sure I do), the example you give above assumes that the string is inserted into a textarea of an HTML page. The client I am working with has a "Web Service": I have to pass XML with the angle brackets encoded as per your example, and I get back an XML return, again with the angle brackets encoded.

I just wanted to clear that up.

Thanks
 
I assume nothing on any client's behalf. I just showed how a string of that kind may turn up, in response to the question "Anyone ever seen XML like this?" That's all.
 
Tsuji

I can now see that not only was the XML in my first post incorrect but my question was too loosely worded as well, please forgive me. If I may, I'll try again.

Given the example I have shown above:

&lt;user emailAddress="user@domain.com" /&gt;

Would this be correctly formatted and valid XML if I were posting it to a client's web service?

Thanks for both the advice and the patience. I really am doing my best to learn here. It is way more complex than I thought.
 
You are at the mercy of what the WSDL says for the web service.

I would claim that a web service that requires escaped entities is poorly constructed. However, I have seen many implementors of web services claim that this has to be done so as not to confuse the processing of the 'SOAP stack'. This IMO is utter nonsense. Have a look at W3C's own example which is included in the SOAP standard; you will find no escaped entities. I have implemented SOAP web services on both IIS and Apache without confusing the SOAP protocol elements with the 'payload' elements; likewise I have customers that implement SOAP over simple sockets without problem.

So, is this a publicly published web service? If so, perhaps we can gain some amount of guidance.

Tom Morrison
 
OK, I think things are beginning to make more sense now. This is very close to the excuse I was given. You wrote "so as not to confuse the processing of the 'SOAP stack'"; here is their version: "The String that contains the XML transaction must be encoded so it does not interfere with the XML of the SOAP packet". As you said, that makes no sense, but it looks like I am going to have to work with it all the same.
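
One plausible way this kind of output arises, sketched in Python: if the payload is handed to an XML library as a string rather than as an element, the serializer escapes the angle brackets automatically and leaves the quotes alone. The `request` wrapper element is hypothetical, and this is a guess at the mechanism, not a description of the client's actual service.

```python
import xml.etree.ElementTree as ET

payload_xml = '<user emailAddress="user@domain.com" />'

# Option A: the payload travels as a string parameter (text content).
as_string = ET.Element("request")  # hypothetical wrapper element
as_string.text = payload_xml
wire_a = ET.tostring(as_string, encoding="unicode")

# Option B: the payload travels as a nested element; nothing is escaped.
as_element = ET.Element("request")
as_element.append(ET.fromstring(payload_xml))
wire_b = ET.tostring(as_element, encoding="unicode")

# The receiver of option A gets the original string back as text.
round_trip = ET.fromstring(wire_a).text
```

Option A reproduces the pattern seen in this thread: escaped angle brackets with unescaped quotes. Option B shows that a payload can sit inside a wrapper without interfering with it.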

As for a WSDL they don't have one, or at least they have not given it to me as of yet.

Thanks
 
Try demanding a WSDL. That will help us understand this 'web service' better...
 
I did get a number of XSD documents from them, one for each of the web service transactions. I know they are XML Schemas, but that is about all. Would they use these instead of a WSDL, or can you have both them and a WSDL?

Confused? I should be :)

Thanks

 
Actually, the WSDL should reference the XML Schema documents that they sent you. The WSDL is an XML document that describes the web service. A WSDL document has several main sections, one of which is the schema for the messages that are to be used within the web service. It is typical for the WSDL's schema section to use an <xs:include> element to incorporate by reference the external XML Schema.

Here is a very decent presentation on this topic.

Again, is this a publicly published web service?

Have you tried an HTTP GET (such as with a browser) on the service URL to see if it will return a WSDL?
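
A minimal sketch of that check in Python, assuming the service follows the common (but not universal) convention of serving its WSDL at the service URL with a `?wsdl` query string. The helper names and the example URL in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

def wsdl_url(service_url):
    """Build the conventional WSDL address by replacing the query with 'wsdl'."""
    parts = urlsplit(service_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "wsdl", ""))

def fetch_wsdl(service_url, timeout=10):
    """Issue an HTTP GET for the WSDL and return the raw response bytes."""
    with urlopen(wsdl_url(service_url), timeout=timeout) as response:
        return response.read()
```

For a service at `https://example.com/services/Address.asmx`, `wsdl_url` yields `https://example.com/services/Address.asmx?wsdl`. If the server answers with an error or an HTML page instead of a WSDL document, the service may simply not publish one at that address.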

The WSDL has become quite important, as modern IDEs (Eclipse w/ a plug-in, Visual Studio, etc.) will fetch the WSDL and automatically build a proxy for the services. Not to have a WSDL is to show callous disregard for the client.

Tom Morrison
 