Create an XML document from data in a CSV file

Status
Not open for further replies.

mpopnoe

Programmer
Feb 28, 2002
US
I read the first half of "Beginning XML, 2nd Edition" by Wrox about six months ago, so I am not even at newbie status yet. I believe I understand the concepts of creating XML documents based on DTDs and using DOM, etc. to access and manipulate the data within the document, but how do you create the XML document dynamically (i.e., get the data INTO an XML doc)? I have CSV files that contain 105 columns of data, and I need to create an XML document to store this data. I have the DTDs to use, and once the data is in an XML document I feel fairly comfortable using DOM to access it. I just need to know how to transfer the initial data from CSV to an XML document.

 
I'm fairly new to this too, and I've just been researching and working on this very topic. You'll find lots of suggestions on the Web, but little substance:

Third-party products: Java classes, .NET classes... No, thank you.

Some Russian PhD claims to have a str-split-to-whatever function in his FXSL library that he promotes all over the place. Found the library; no such function. Thanks for nothing, comrade.

Other sources show how to recursively break down a delimited string. The problem with that is it's XHTML-oriented: you don't have the opportunity to name your elements, they're separated by <BR/>s or another static tag.

Then there's the "wait for XSLT v2" crowd. Useless.

I finally used sed to wrap each CSV row (the whole row) in begin and end tags so that the row becomes a single XML element. I pass that file to xt and use XSLT string functions to split the row on the commas, wrapping each value in the named tag I want. The problem with this approach is that it gets out of hand very quickly. My rows have a dozen columns, of which I'm interested in four. Here's my XSL for column 8:

<!-- AVAIL, column 8 -->
<on-hand><xsl:value-of select="substring-before(substring-after(substring-after(substring-after(substring-after(substring-after(substring-after(substring-after(.,','),','),','),','),','),','),','),',')"/></on-hand>

There could be a better way to do this, but I couldn't find it.
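For comparison, here is roughly what the same column selection might look like outside XSLT. This is only a minimal Python sketch, not what was used above: the column map, the `item` and `inventory` wrapper names, and every tag other than `on-hand` are hypothetical placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping of 1-based column numbers to element names.
# Only "on-hand" (column 8) comes from the post; the rest are placeholders.
COLUMN_TAGS = {1: "part-no", 2: "description", 8: "on-hand"}


def row_to_element(fields, row_tag="item"):
    """Wrap the selected columns of one CSV row in named child elements."""
    item = ET.Element(row_tag)
    for col, tag in sorted(COLUMN_TAGS.items()):
        child = ET.SubElement(item, tag)
        # Short rows yield an empty element instead of an IndexError.
        child.text = fields[col - 1] if col <= len(fields) else ""
    return item


def csv_to_xml(text, root_tag="inventory"):
    """Convert CSV text to an XML string, keeping only the mapped columns."""
    root = ET.Element(root_tag)
    for fields in csv.reader(io.StringIO(text)):
        if fields:  # csv.reader yields [] for blank lines
            root.append(row_to_element(fields))
    return ET.tostring(root, encoding="unicode")
```

One thing this gets for free is that the `csv` module copes with quoted fields that contain commas, which the nested substring-after approach would split in the wrong place.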

With 105 columns to convert to XML elements, your best bet is to bite the bullet and write a program in your favorite language.
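As a rough idea of what such a program could look like, here is a minimal Python sketch. It assumes the first CSV row holds the column names, which the original question does not say, and the `records`/`record` wrapper names and the `to_tag` helper are made up for illustration. With a real DTD, the element names would have to come from the DTD instead.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def to_tag(name, index):
    """Turn a header cell into an element name (simplified rule, not full XML naming)."""
    tag = re.sub(r"[^A-Za-z0-9_.-]", "_", name.strip())
    if not re.match(r"[A-Za-z_]", tag):
        # Empty header or one starting with a digit, '.' or '-': fall back to colN.
        tag = f"col{index}"
    return tag


def csv_to_xml(text, root_tag="records", row_tag="record"):
    """Convert CSV text (header row first) into an XML string, one element per column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    tags = [to_tag(h, i) for i, h in enumerate(header, start=1)]
    root = ET.Element(root_tag)
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        rec = ET.SubElement(root, row_tag)
        for tag, value in zip(tags, fields):
            ET.SubElement(rec, tag).text = value
    return ET.tostring(root, encoding="unicode")
```

Because the loop is driven by the header, 105 columns cost no more code than 12, and ElementTree escapes characters such as `&` and `<` in the values when it serializes.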

If you figure out something smarter than what I'm doing, please report back!

Good Luck,
harebrain
 
Thanks for your quick response. Most of what I found for converting data to XML was Java and .NET classes, so I'll either use the Java classes or write something on my own. I just wanted to make sure I wasn't missing an easy way of going about this; I have ended up doing things the hard way more than once =)

Again, thanks for your help.
 
I am trying to do the same thing, only I'm using a fixed-length format rather than comma-delimited. The client requires it to be written in VB .NET. I know very little about VB .NET, and on the surface the task looks simple: there is a lot of built-in functionality that should make it nearly effortless and very straightforward. However, I'm learning that some limitations make many of VB .NET's classes useless for my task. It should be as simple as: read a schema into a VB dataset, parse the text file, populate the dataset, write the XML file. Reading the schema and writing the XML file take only a few lines of code in VB .NET because that functionality is built in. My only real overhead is parsing the text file and correctly inserting data into the dataset.

But . . . VB .NET is not completely compliant with the XML 1.0 specs. There are some valid schema configurations that it chokes on, and of course the ones provided for this project fall into that category, and I'm bound by the schema provided by the client. So I too am forced, as far as I can tell, to bite the bullet and write the routines myself to "brute force" the data into the schema. As you say, there is very little practical information out there. I'm sure a lot of other people face this same dilemma. I would love to hear how others have approached similar tasks.
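The parse-and-emit half of a brute-force approach like this might look something like the sketch below. It is written in Python rather than VB .NET purely for illustration, and the record layout (field names, offsets, widths) is entirely hypothetical; a real layout would have to come from the client's file specification and schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: (element name, start offset, width).
LAYOUT = [("part-no", 0, 6), ("description", 6, 12), ("on-hand", 18, 4)]


def parse_line(line, row_tag="record"):
    """Slice one fixed-length line into named child elements, trimming padding."""
    rec = ET.Element(row_tag)
    for tag, start, width in LAYOUT:
        ET.SubElement(rec, tag).text = line[start:start + width].strip()
    return rec


def fixed_to_xml(text, root_tag="records"):
    """Convert fixed-length text, one record per line, into an XML string."""
    root = ET.Element(root_tag)
    for line in text.splitlines():
        if line.strip():  # ignore blank lines
            root.append(parse_line(line))
    return ET.tostring(root, encoding="unicode")
```

Keeping the layout in one table means the slicing code does not change when fields are added; nesting or ordering rules imposed by a schema would still need to be handled by hand.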

Clint
 