Advice Required

Status
Not open for further replies.

nidgep

Programmer
Sep 4, 2001
80
GB
Hi

I have written a small exe in VB6 that reads in a text file one line at a time.
Using two of the fields (accountref and propertyref) from the text file, the program queries an XML document for the presence of an <account> element with an acctref attribute that matches the "accountref" value read in from the text file.

If a match is made then the program will return the <account> node to the calling code.
The program then queries that <account> node to check for the presence of the propertyref value within a <propref> child node. If this exists then the current sequence number from the <propref> sibling <seq>nn</seq> is returned.

If either the accountref or the propertyref does not exist then it is created/inserted at the appropriate position within the XML document.
The idea is that the XML doc is updated following each import run.
However, there are 20,000 accounts within the XML document,
which amounts to 85,000 lines of text and 2.6 MB.

My real question: I am using the XMLDOM (msxml4.dll) to do the work, and I wondered whether there is a better/quicker method of solving this problem.

Any help or advice would be appreciated.
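
For illustration only, the lookup-or-insert flow described above can be sketched in Python rather than VB6/MSXML. The thread does not show the real schema, so the <accounts> root, the <property> wrapper element, and the starting sequence value "01" are all assumptions. The sketch builds a one-off index from acctref to node so each input line costs a dictionary lookup rather than a scan of all 20,000 accounts; new nodes are simply appended, not placed at any particular position.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shape; the real element layout is not given in the thread.
SAMPLE = """<accounts>
  <account acctref="A100">
    <property><propref>P1</propref><seq>03</seq></property>
  </account>
</accounts>"""


def build_index(root):
    """Map acctref -> <account> element so lookups avoid a full scan."""
    return {acct.get("acctref"): acct for acct in root.iter("account")}


def lookup_or_insert(root, index, accountref, propertyref):
    """Return the seq for (accountref, propertyref), creating nodes if missing."""
    account = index.get(accountref)
    if account is None:
        # Unknown account: append a new <account> and remember it in the index.
        account = ET.SubElement(root, "account", {"acctref": accountref})
        index[accountref] = account
    for prop in account.findall("property"):
        if prop.findtext("propref") == propertyref:
            return prop.findtext("seq")
    # Unknown property: create it with an assumed starting sequence number.
    prop = ET.SubElement(account, "property")
    ET.SubElement(prop, "propref").text = propertyref
    ET.SubElement(prop, "seq").text = "01"
    return "01"


root = ET.fromstring(SAMPLE)
index = build_index(root)
```

Calling lookup_or_insert(root, index, "A100", "P1") returns the existing sequence number, while an unknown account or property is created on the fly. The same indexing idea should carry over to a VB6 Dictionary keyed on acctref.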
 
I'd look at using a database to do this. Either a native XML database or storing the accounts as XML blobs in a traditional database.

Perhaps looking at XQuery may be useful too, although I haven't used this myself.

Jon

"There are 10 types of people in the world... those who understand binary and those who don't."
 
Thanks for the reply.

I too have suggested using a database to achieve this, but apparently it has to be done this way!??

When you mention XML blobs - is this something that MS SQL Server supports? I would have thought that storing XML as binary would defeat the object.

Are XML blobs within the realms of XML databases such as Tamino etc.?

Can XPath and XQuery be used within VB6?

Cheers
 
I guess they don't teach this anymore, but this is a simple problem with a simple solution. Pre-process your text and XML files by sorting them on the account/accountref fields. Then you can process them sequentially, matching as you go through each file. Because you've eliminated the need for random access, you can use the SAX parser. Considering the size of your XML document, this will be quicker (I'd guess significantly quicker) and reduce the memory requirements.
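
A minimal sketch of the sort-then-match idea, in Python for illustration (not the poster's code). It assumes the text file has already been reduced to (accountref, propertyref) pairs and the XML to a list of acctref values; the function name merge_match is made up for this example. After sorting, a single cursor moves forward through the XML references, so no random access is needed.

```python
def merge_match(text_rows, xml_refs):
    """Split text rows into those whose account exists in the XML and those missing.

    text_rows: iterable of (accountref, propertyref) tuples from the text file.
    xml_refs:  iterable of acctref strings already present in the XML.
    """
    rows = sorted(text_rows)
    refs = sorted(set(xml_refs))
    matched, missing = [], []
    i = 0  # cursor into the sorted XML references; it only ever moves forward
    for row in rows:
        # Skip XML accounts that sort before the current text row.
        while i < len(refs) and refs[i] < row[0]:
            i += 1
        if i < len(refs) and refs[i] == row[0]:
            matched.append(row)
        else:
            missing.append(row)
    return matched, missing


matched, missing = merge_match(
    [("B", "p1"), ("A", "p2"), ("D", "p3")],
    ["C", "A", "B"],
)
```

Because both inputs are walked once in order, the matching step is linear in the number of rows after the initial sort.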
 
Hi

Thanks for the feedback.

Unfortunately the input source file is not sorted by the application that creates it, so we would have to sort (pre-process) it ourselves beforehand.

However, the SAX parser cannot update the document in the same way that using the XMLDOM can.

I agree that using the SAX parser should be much quicker, but this is not just a matching/grab the reference exercise.
As the XML document needs to reflect any new accounts/properties that are present within the source file, the SAX parser isn't suitable for this.

Cheers
 
I suppose I didn't supply enough detail for you. And the glib response about "a matching/grab the reference exercise" suggests that you didn't apply much thought before dismissing the idea.

What you're missing is that you don't try to update the XML document in place. Use the two inputs to create a new XML document that contains all the new/changed accounts and properties. You *can* do that with SAX.
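
One way to picture "write a new document instead of updating in place", sketched with Python's xml.sax rather than the MSXML SAX interfaces. The <accounts> root name, the MergingWriter class, and the merge_to_string helper are hypothetical, and the old document is assumed to be sorted by acctref already. The handler copies every event from the old document to the output and emits any new accounts at their sorted position as it goes; new accounts are written as empty elements here, without their property children.

```python
import io
import xml.sax
from collections import deque
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class MergingWriter(XMLGenerator):
    """Copy the old document to `out`, inserting new accounts in sorted order."""

    def __init__(self, out, new_refs):
        super().__init__(out, encoding="utf-8")
        self.pending = deque(sorted(set(new_refs)))

    def _emit_new(self, upto=None):
        # Write pending accounts that sort before `upto` (or all of them if None).
        while self.pending and (upto is None or self.pending[0] < upto):
            ref = self.pending.popleft()
            super().startElement("account", AttributesImpl({"acctref": ref}))
            super().endElement("account")

    def startElement(self, name, attrs):
        if name == "account":
            ref = attrs.get("acctref")
            self._emit_new(upto=ref)
            if self.pending and self.pending[0] == ref:
                self.pending.popleft()  # already in the old document
        super().startElement(name, attrs)

    def endElement(self, name):
        if name == "accounts":  # assumed root element name
            self._emit_new()  # anything left sorts after the last old account
        super().endElement(name)


def merge_to_string(xml_bytes, new_refs):
    out = io.StringIO()
    xml.sax.parseString(xml_bytes, MergingWriter(out, new_refs))
    return out.getvalue()


OLD = (b'<accounts><account acctref="A100"></account>'
       b'<account acctref="C300"></account></accounts>')
result = merge_to_string(OLD, ["D400", "B200", "A100"])
```

Neither document is ever held in memory as a tree; the old file is read as a stream and the new file is written as a stream, which is the point of the suggestion above.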
 