Basic XML question about structuring of an XML document

Status
Not open for further replies.

ttdobj

Technical User
Sep 30, 2002
63
GB
We have a client who has supplied an xml file that I think is not quite right, but I'm not sure how to word my reason why I think this.

Their file is structured in a similar way to this:

<root>
<item_number>1</item_number>
<item_description>a widget</item_number>
<item_number>2</item_number>
<item_description>a foobar</item_description>
<item_number>3</item_number>
<item_description>a third thing</item_description>
</root>

My feeling is that it should be structured like this:

<root>
<item>
<item_number>1</item_number>
<item_description>a widget</item_number>
</item>
<item>
<item_number>2</item_number>
<item_description>a foobar</item_description>
</item>
<item>
<item_number>3</item_number>
<item_description>a third thing</item_description>
</item>
</root>

Am I right?

Ta john
 
The original is a legitimate XML source file, apart from what is probably a typo of yours:
<item_description>a widget</item_number>
should read
<item_description>a widget</item_description>
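
To illustrate the point about the mismatched closing tag, here is a minimal sketch in Python (not the poster's toolset) using the standard library's xml.etree.ElementTree. The helper name is_well_formed and the two test strings are made up for this example; the only claim is that a conforming parser rejects the line as posted and accepts it once the closing tag matches.

```python
import xml.etree.ElementTree as ET

# The line as posted, with the mismatched closing tag (wrapped in a root element).
as_posted = "<root><item_description>a widget</item_number></root>"
# The same line with the closing tag corrected.
corrected = "<root><item_description>a widget</item_description></root>"

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for mismatched tags, among other well-formedness errors.
        return False

print(is_well_formed(as_posted))
print(is_well_formed(corrected))
```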
 
John,

There is nothing wrong per se with the document as supplied. Your urge to hierarchy is quite reasonable, though, since the document as supplied will be somewhat more difficult to manipulate.

So...it remains to be asked: What do you intend to do with this document?

Tom Morrison
 
yes, sorry about the typo.

The only reason that <item_number>2</item_number> is not associated with <item_description>a third thing</item_description> is the position of these elements within the file.

Isn't that a bad way of structuring an xml file? Even though it does not break any specific rules of being well-formed?

John

 
The file from the client is going to be parsed, analysed and imported into a database.

How is that relevant?

John
 
It is neither good nor bad. One should not assign moral judgement too soon, in particular if you are not yet too familiar with the business. It is easy to counter by saying, for instance, that the original consumes less bandwidth when transmitted. Is that good? Or is it bad?
 
John,

It is relevant because it imputes a certain set of tools that will be used to manipulate the document.

From your answer, I would presume that you will use an existing XML parser to create a Document Object Model (DOM) representation of your input document in memory for your analysis phase, that phase to be implemented in some commonly used procedural language. At that point, the positional nature of the sibling elements can be exploited as you traverse the DOM tree.
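
As a rough sketch of what exploiting the positional nature of the siblings could look like, here is a Python example. It uses the standard library's xml.etree.ElementTree rather than a full DOM, and the function name pair_items and the dictionary keys are invented for illustration. It assumes each item_number is followed by exactly one item_description, which is what the sample document suggests but which nothing in the flat format enforces.

```python
import xml.etree.ElementTree as ET

# The flat document from the original post, with the typo corrected.
flat = """<root>
<item_number>1</item_number>
<item_description>a widget</item_description>
<item_number>2</item_number>
<item_description>a foobar</item_description>
<item_number>3</item_number>
<item_description>a third thing</item_description>
</root>"""

def pair_items(xml_text):
    """Walk the children in document order and pair each item_number
    with the item_description that follows it."""
    items = []
    current = None
    for child in ET.fromstring(xml_text):
        if child.tag == "item_number":
            # A new number starts a new item.
            current = {"number": int(child.text), "description": None}
            items.append(current)
        elif child.tag == "item_description":
            # A description is only valid directly after an unpaired number.
            if current is None or current["description"] is not None:
                raise ValueError("item_description without a preceding item_number")
            current["description"] = child.text
    return items

for item in pair_items(flat):
    print(item["number"], item["description"])
```

The explicit check for an orphaned description is the price of the flat layout: the association that an enclosing item element would make structural has to be validated in code instead.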

If this assumption is correct, then it would do no good to assume that this is dynamic data that is being input into an XML pipeline to appear in an HTML presentation in a browser. This use of the input document might benefit from the use of XSLT, for example.

Do you understand the relevance now? :)

Tom Morrison
 
Thank you Tom
Yes, I see what you mean.
The import files are being imported via a VB.NET application, validated, etc., then pushed into MS SQL.

Because the XML files are holding a lot of varied data, at lots of "different levels", my feeling was that I would be able to deal with the files more easily if they were more hierarchically structured. To show another example of the structure:
<parent>
<primary_ref>1</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>5</secondary_ref>
<secondary_desc>blurg blurg</secondary_desc>
<secondary_factor>3.14</secondary_factor>
<primary_ref>1</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>6</secondary_ref>
<secondary_desc>blurg blah</secondary_desc>
<secondary_factor>3.141</secondary_factor>
<primary_ref>1</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>7</secondary_ref>
<secondary_desc>blah blurg</secondary_desc>
<secondary_factor>3.1415</secondary_factor>
<primary_ref>2</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>5</secondary_ref>
<secondary_desc>blurg blurg</secondary_desc>
<secondary_factor>3.14</secondary_factor>
</parent>

Bear in mind that this is purely for importing the information into the database. No analysis takes place on the actual file itself.
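
For what it is worth, a flat file like the one above can still be turned into import-ready rows and regrouped by primary reference. The following is only a sketch in Python (the real import is a VB.NET application, so this is a stand-in), and it rests on an assumption the sample suggests but does not guarantee: every record begins with a primary_ref element and no field repeats within a record. The function names flat_to_rows and group_by_primary are hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened version of the sample above (three records instead of four).
sample = """<parent>
<primary_ref>1</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>5</secondary_ref>
<secondary_desc>blurg blurg</secondary_desc>
<secondary_factor>3.14</secondary_factor>
<primary_ref>1</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>6</secondary_ref>
<secondary_desc>blurg blah</secondary_desc>
<secondary_factor>3.141</secondary_factor>
<primary_ref>2</primary_ref>
<primary_desc>blah blah</primary_desc>
<secondary_ref>5</secondary_ref>
<secondary_desc>blurg blurg</secondary_desc>
<secondary_factor>3.14</secondary_factor>
</parent>"""

def flat_to_rows(xml_text, first_field="primary_ref"):
    """Split the flat sibling list into one dict per record.
    Assumption: each record starts with the first_field element."""
    rows = []
    for child in ET.fromstring(xml_text):
        if child.tag == first_field:
            rows.append({})
        if not rows:
            raise ValueError("data found before the first " + first_field)
        if child.tag in rows[-1]:
            raise ValueError("duplicate field in one record: " + child.tag)
        rows[-1][child.tag] = child.text
    return rows

def group_by_primary(rows):
    """Rebuild the implied hierarchy: primary -> list of secondaries."""
    grouped = {}
    for row in rows:
        key = (row["primary_ref"], row["primary_desc"])
        secondary = (row["secondary_ref"], row["secondary_desc"],
                     row["secondary_factor"])
        grouped.setdefault(key, []).append(secondary)
    return grouped

rows = flat_to_rows(sample)
for primary, secondaries in group_by_primary(rows).items():
    print(primary, len(secondaries))
```

The rows from flat_to_rows map naturally onto a single denormalised table, while the grouped form corresponds to a primary table with a child table of secondaries.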

John
 