Making Multiple XML output files from 1 xml file

groitblat

Technical User
Mar 25, 2009
14
US
I am not a programmer and am hoping someone can point me to an app/program that will help my client out.
I have one great big XML file with 200+ orders in it.
I need each order extracted into its own file so that it can then be processed into an order system.
For example, the orders below are in the file, and ultimately I need each one to be its own file.
So I'd end up with file1.xml, file2.xml, file3.xml, and so on.
Each order is delimited by <Order></Order>...

Given how many records there are, I'm really hoping to avoid having to pull each one out manually.

Any help or pointing to a program that a non-programmer can use would be greatly appreciated.

Thank you
<Order>
<ShipmentServiceLevelCategory>SecondDay</ShipmentServiceLevelCategory>
<OrderTotal>
<Amount>127.35</Amount>
<CurrencyCode>USD</CurrencyCode>
</OrderTotal>
<ShipServiceLevel>SecondDay</ShipServiceLevel>
<LatestShipDate>2014-12-03T03:04:00Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<ShippingAddress>
<Phone>5015555555</Phone>
<PostalCode>72020</PostalCode>
<Name>sample name</Name>
<CountryCode>US</CountryCode>
<StateOrRegion>Arkansas</StateOrRegion>
<AddressLine1>555 main</AddressLine1>
<City>Bradford</City>
</ShippingAddress>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>109-9264954-9814647</SellerOrderId>
<BuyerEmail>p5lbtkmc6x6f5y0@marketplace.amazon.com</BuyerEmail>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Shipped</OrderStatus>
<BuyerName>sample name</BuyerName>
<LastUpdateDate>2014-12-03T03:20:47Z</LastUpdateDate>
<EarliestShipDate>2014-12-03T03:04:00Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:37:42Z</PurchaseDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<AmazonOrderId>109-9264954-9814647</AmazonOrderId>
<NumberOfItemsShipped>1</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
<Order>
<ShipmentServiceLevelCategory>SecondDay</ShipmentServiceLevelCategory>
<OrderTotal>
<Amount>35.88</Amount>
<CurrencyCode>USD</CurrencyCode>
</OrderTotal>
<ShipServiceLevel>SecondDay</ShipServiceLevel>
<LatestShipDate>2014-12-02T14:09:57Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<ShippingAddress>
<Phone>270-555-5555</Phone>
<PostalCode>42129</PostalCode>
<Name>sample2</Name>
<CountryCode>US</CountryCode>
<StateOrRegion>KY</StateOrRegion>
<AddressLine2>106 S. main St.</AddressLine2>
<AddressLine1>P.O. Box 1</AddressLine1>
<City>Edmonton</City>
</ShippingAddress>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>108-3617067-1489801</SellerOrderId>
<BuyerEmail>g3n55s57mf0fn2f@marketplace.amazon.com</BuyerEmail>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Shipped</OrderStatus>
<BuyerName>sample2</BuyerName>
<LastUpdateDate>2014-12-02T14:12:35Z</LastUpdateDate>
<EarliestShipDate>2014-12-02T14:09:57Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:37:57Z</PurchaseDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<AmazonOrderId>108-3617067-1489801</AmazonOrderId>
<NumberOfItemsShipped>1</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
<Order>
<ShipmentServiceLevelCategory>SecondDay</ShipmentServiceLevelCategory>
<OrderTotal>
<Amount>50.78</Amount>
<CurrencyCode>USD</CurrencyCode>
</OrderTotal>
<ShipServiceLevel>SecondDay</ShipServiceLevel>
<LatestShipDate>2014-12-02T18:59:28Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<ShippingAddress>
<Phone>714-555-5555</Phone>
<PostalCode>92506-2469</PostalCode>
<Name>sample3</Name>
<CountryCode>US</CountryCode>
<StateOrRegion>CA</StateOrRegion>
<AddressLine1>555 test address</AddressLine1>
<City>RIVERSIDE</City>
</ShippingAddress>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>002-4671276-0885017</SellerOrderId>
<BuyerEmail>2bg20j5gv8zwmlh@marketplace.amazon.com</BuyerEmail>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Shipped</OrderStatus>
<BuyerName>sample3</BuyerName>
<LastUpdateDate>2014-12-03T00:55:02Z</LastUpdateDate>
<EarliestShipDate>2014-12-02T18:59:28Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:38:07Z</PurchaseDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<AmazonOrderId>002-4671276-0885017</AmazonOrderId>
<NumberOfItemsShipped>1</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
<Order>
<ShipmentServiceLevelCategory>Standard</ShipmentServiceLevelCategory>
<ShipServiceLevel>Standard</ShipServiceLevel>
<LatestShipDate>2014-12-09T07:59:59Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>111-6118963-7054668</SellerOrderId>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Pending</OrderStatus>
<LastUpdateDate>2014-12-02T02:58:32Z</LastUpdateDate>
<EarliestShipDate>2014-12-09T07:59:59Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:41:05Z</PurchaseDate>
<NumberOfItemsUnshipped>1</NumberOfItemsUnshipped>
<AmazonOrderId>111-6118963-7054668</AmazonOrderId>
<NumberOfItemsShipped>0</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
</Orders>
 
XML is a markup format, not a scripting language, so it has no means of writing physical files by itself; an external tool has to do the splitting. For applications for whatever your OS is, I would suggest starting here

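If you are using Linux or Mac, 'awk' should be able to handle that with a one-line command.

awk '/match-pattern/{if (x) close(x); x="F"(++i)".xml"} x{print > x}' bigfile.xml


Replace match-pattern with your criteria to split the file on. Output files will be named F1.xml, F2.xml, and so on. Any lines before the first match are skipped, and each output file is closed before the next one is opened.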
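
As a minimal sketch of how that one-liner behaves, assuming a hypothetical two-order sample saved as bigfile.xml, one element per line, and `<Order>` as the match pattern (the scratch directory and the sample content are illustrative only):

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: two orders, one element per line.
cat > bigfile.xml <<'EOF'
<Order>
<AmazonOrderId>109-9264954-9814647</AmazonOrderId>
</Order>
<Order>
<AmazonOrderId>108-3617067-1489801</AmazonOrderId>
</Order>
EOF

# Start a new output file at every line containing <Order>.
# The previous file is closed first so a long input does not run out of file handles.
awk '/<Order>/{if (x) close(x); x="F"(++i)".xml"} x{print > x}' bigfile.xml

# List the files that were produced.
ls F*.xml
```

Each Fn.xml then holds the lines from one `<Order>` up to, but not including, the next one. Note that this only works while every `<Order>` tag starts on its own line.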



Chris.

Indifference will be the downfall of mankind, but who cares?
Time flies like an arrow, however, fruit flies like a banana.
Webmaster Forum
 
Thanks - someone suggested csplit from the linux world & that seems to actually work pretty well once I figured out the syntax

- gina
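
For anyone else landing here, a rough sketch of what a csplit invocation for this job can look like. It assumes GNU coreutils csplit (the `-z` and `-b` options and the `{*}` repeat count are GNU features) and a hypothetical two-order sample file; the `order-` prefix is just an illustrative choice, not the syntax gina actually used:

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: two orders, one element per line.
cat > bigfile.xml <<'EOF'
<Order>
<AmazonOrderId>109-9264954-9814647</AmazonOrderId>
</Order>
<Order>
<AmazonOrderId>108-3617067-1489801</AmazonOrderId>
</Order>
EOF

# -s      suppress the byte counts csplit normally prints
# -z      drop empty pieces (here, the empty piece before the first <Order>)
# -f, -b  prefix and printf-style suffix for the output file names
# '{*}'   repeat the pattern until the input is exhausted
csplit -s -z -f order- -b '%03d.xml' bigfile.xml '/<Order>/' '{*}'

# List the files that were produced.
ls order-*.xml
```

Like the awk approach, this is line-oriented: it splits just before each line that matches `<Order>`, so anything after the last `</Order>` (such as a closing `</Orders>`) ends up in the last piece.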
 
I would use a simple XSLT, driven by the host OS scripting language. The XSLT is very simple.

Please provide the host OS (i.e. Windows, Linux, etc) so I can provide something that is appropriate to host.

Tom Morrison
Hill Country Software
 
Also, if you are using an XSLT processor, which one? (If you don't know, that's ok.)

Tom Morrison
Hill Country Software
 
No idea on the XSLT processor... Windows 8.1 is the OS that I'm using to deal with this issue.
Csplit is sort of viable, but if there's a better solution I'm all ears.

Thank you
 
First, download MSXSL.exe from the Microsoft site. This is the command line wrapper for the Microsoft XML processor. It will be used in the following example.

Next, we need an XSLT (EnumerateOrders.xsl) that enumerates the Order elements in the Orders document. It outputs one line of text per Order, containing that order's position, in a format that the CMD scripting engine can consume.
Code:
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="utf-8" method="text"/>

<xsl:template match="/">
<xsl:for-each select="/Orders/Order">
<xsl:variable name="crlf"><xsl:choose><xsl:when test="position() = last()"/><xsl:otherwise><xsl:text>
</xsl:text></xsl:otherwise></xsl:choose></xsl:variable>
<xsl:value-of select="concat(position(),$crlf)"/>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>

The output of this XSLT is
Code:
1
2
3
4
If there are 136 orders, there will be 136 lines. Breathtaking stuff, but you will see in a bit how it helps generate all the files...

Now we need an XSLT (SelectOrder.xsl) that will extract a specific order. This is essentially an identity transform, plus a root template that applies the identity transform only to the Order element whose position matches the input parameter OrderNumber.

Code:
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="utf-8"/>

<xsl:param name="OrderNumber"/>

<xsl:template match="/">
	<xsl:apply-templates select="Orders/Order[position() = $OrderNumber]" />
</xsl:template>

<xsl:template match="@*|node()">
<xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

</xsl:stylesheet>

Now, with msxsl.exe, Orders.xml, EnumerateOrders.xsl and SelectOrder.xsl all in one directory (to make it simple to show here), start a Command Prompt, cd to the directory containing these files and execute the following command:
Code:
FOR /F %I IN ('msxsl Orders.xml EnumerateOrders.xsl') DO msxsl Orders.xml SelectOrder.xsl -o Order-%I.xml OrderNumber=%I

If you run this from a batch file instead of typing it at the prompt, write %%I in place of %I.

I get the following (after correcting the provided sample XML to make it well formed):
Code:
<Order>
<ShipmentServiceLevelCategory>SecondDay</ShipmentServiceLevelCategory>
<OrderTotal>
<Amount>127.35</Amount>
<CurrencyCode>USD</CurrencyCode>
</OrderTotal>
<ShipServiceLevel>SecondDay</ShipServiceLevel>
<LatestShipDate>2014-12-03T03:04:00Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<ShippingAddress>
<Phone>5015555555</Phone>
<PostalCode>72020</PostalCode>
<Name>sample name</Name>
<CountryCode>US</CountryCode>
<StateOrRegion>Arkansas</StateOrRegion>
<AddressLine1>555 main</AddressLine1>
<City>Bradford</City>
</ShippingAddress>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>109-9264954-9814647</SellerOrderId>
<BuyerEmail>p5lbtkmc6x6f5y0@marketplace.amazon.com</BuyerEmail>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Shipped</OrderStatus>
<BuyerName>sample name</BuyerName>
<LastUpdateDate>2014-12-03T03:20:47Z</LastUpdateDate>
<EarliestShipDate>2014-12-03T03:04:00Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:37:42Z</PurchaseDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<AmazonOrderId>109-9264954-9814647</AmazonOrderId>
<NumberOfItemsShipped>1</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
Code:
<Order>
<ShipmentServiceLevelCategory>SecondDay</ShipmentServiceLevelCategory>
<OrderTotal>
<Amount>35.88</Amount>
<CurrencyCode>USD</CurrencyCode>
</OrderTotal>
<ShipServiceLevel>SecondDay</ShipServiceLevel>
<LatestShipDate>2014-12-02T14:09:57Z</LatestShipDate>
<MarketplaceId>ATVPDKIKX0DER</MarketplaceId>
<SalesChannel>Amazon.com</SalesChannel>
<ShippingAddress>
<Phone>270-555-5555</Phone>
<PostalCode>42129</PostalCode>
<Name>sample2</Name>
<CountryCode>US</CountryCode>
<StateOrRegion>KY</StateOrRegion>
<AddressLine2>106 S. main St.</AddressLine2>
<AddressLine1>P.O. Box 1</AddressLine1>
<City>Edmonton</City>
</ShippingAddress>
<OrderType>StandardOrder</OrderType>
<SellerOrderId>108-3617067-1489801</SellerOrderId>
<BuyerEmail>g3n55s57mf0fn2f@marketplace.amazon.com</BuyerEmail>
<FulfillmentChannel>AFN</FulfillmentChannel>
<OrderStatus>Shipped</OrderStatus>
<BuyerName>sample2</BuyerName>
<LastUpdateDate>2014-12-02T14:12:35Z</LastUpdateDate>
<EarliestShipDate>2014-12-02T14:09:57Z</EarliestShipDate>
<PurchaseDate>2014-12-02T02:37:57Z</PurchaseDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<AmazonOrderId>108-3617067-1489801</AmazonOrderId>
<NumberOfItemsShipped>1</NumberOfItemsShipped>
<PaymentMethod>Other</PaymentMethod>
</Order>
and so forth for Order-3.xml and Order-4.xml.

Since you are most likely unfamiliar with XSLT, this may all be a mystery to you. Please ask questions...

[Note to those more experienced: I purposely created an example that minimized the amount of 'stuff' necessary to get the job done. I am thoroughly aware that EXSLT and XSLT 2.0 provide additional possibilities, but these would complicate the solution.]

Tom Morrison
Hill Country Software
 
Good old M$,

I like how that sequence of replies started off with

I would use a simple XSLT ...
And then got steadily deeper into "XML geek" territory, no offence to you, Tom. It's just the whole M$ way of making things more complicated and obscure/arcane than they need to be that I increasingly find amusing.

Chris.

 
Hi Chris,

Well, in my world, these are both 'simple XSLT'. I spent more time looking up FOR /F, which I knew existed but do not use often enough to be fluent in.

The problem with the simpler solutions based on text editors or line oriented text files is that sometime in the future Ama$on will trivially change what they deliver, or will stop transmitting the whitespace between elements (which is extraneous, except for human readability). At that point, the ad hoc solutions have a much higher probability of failure than a solution treating the content of the file as what it is, an XML document.
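
To illustrate the whitespace concern, here is a minimal sketch assuming a hypothetical feed in which the same kind of data arrives on a single line (the file name oneline.xml and the order IDs A and B are made up). It runs a line-oriented awk split of the kind suggested earlier in the thread, with a close() added:

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical feed: two orders, no whitespace between elements.
printf '%s\n' '<Orders><Order><AmazonOrderId>A</AmazonOrderId></Order><Order><AmazonOrderId>B</AmazonOrderId></Order></Orders>' > oneline.xml

# Line-oriented split: a new file starts at each LINE containing <Order>.
awk '/<Order>/{if (x) close(x); x="F"(++i)".xml"} x{print > x}' oneline.xml

# Count the files produced and the orders inside the first one.
ls F*.xml | wc -l
grep -o '<Order>' F1.xml | wc -l
```

Both orders land in the same output file because there is only one line to split on, whereas a tool that parses the document as XML does not depend on where the line breaks fall.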

And, since I have delivered a plug-and-play solution at no cost, I hope that Gina can impress the client.

At this point, I simply cannot help being an XSLT Geek™ since I have been immersed in it for a long time. And XSLT is not necessarily a M$ invention; Michael Kay - a fellow countryman of yours, I believe - is considered by many to be the father of XSLT. If I had known the last time I was in Reading that it was really going to be my last time in Reading (due to a job change), I would have tried to gain an audience...

Tom Morrison
Hill Country Software
 
It's not the solution that I find amusing, it's the fact that you have to write the style sheets, then pass them through other processes to get the desired result.

Whereas
Bash:
awk '/<Order>/{if (x) close(x); x="F"(++i)".xml"} x{print > x}' test.xml

Does the same thing in a matter of milliseconds, no additional applications to be installed, no style sheet that needed coding and testing.

Linux CLI may seem arcane and 'geeky' but it is infinitely more flexible and capable than M$ Windows will ever be.

Chris.

 