Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Andrzejek on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Urgent help on importing XML data using T-SQL 1

Status
Not open for further replies.

JCAD1

Programmer
Apr 17, 2003
18
GB
I am testing a script I have just written to import some sample xml data into a SQL table.

The import is done in two stages: the first stage imports the xml file into a sql table and the second, parses out the file into a relational format and then imports it into a second table.

The imported data should look as follows:

orderID CustID OrderDate ProdID Qty
10248 VINET Jul 4 1996 12:00AM 11 12
10248 VINET Jul 4 1996 12:00AM 42 10
10283 LILAS Aug 16 1996 12:00AM 72 3

The script works but fails to import the data in the second row. In other words, the customer with orderID number 10248 ordered two products but the script is not importing the product with ProdID number 42.

How do I modify the script to import this record?

Here is the XML data (E:\Extracts\XMLData\customers.xml):

<ROOT>
<Customer CustomerID="VINET" ContactName="Paul Henriot">
<Order OrderID="10248" CustomerID="VINET" EmployeeID="5"
OrderDate="1996-07-04T00:00:00">
<OrderDetail ProductID="11" Quantity="12"/>
<OrderDetail ProductID="42" Quantity="10"/>
</Order>
</Customer>
<Customer CustomerID="LILAS" ContactName="Carlos Gonzlez">
<Order OrderID="10283" CustomerID="LILAS" EmployeeID="3"
OrderDate="1996-08-16T00:00:00">
<OrderDetail ProductID="72" Quantity="3"/>
</Order>
</Customer>
</ROOT>


Here is the script:

-- Stage 1. Import XML file into a SQL table
CREATE TABLE XmlImportTest
(
xmlFileName VARCHAR(300), -- To store xml file name
xml_data XML -- To store actual xml data
)

DECLARE @xmlFileName VARCHAR(300)
SELECT @xmlFileName = 'E:\Extracts\XMLData\customers.xml'

EXEC('
INSERT INTO XmlImportTest(xmlFileName, xml_data)

SELECT ''' + @xmlFileName + ''', xmlData
FROM
(
SELECT *
FROM OPENROWSET (BULK ''' + @xmlFileName + ''' , SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)
')

SELECT * FROM XmlImportTest

-- Stage 2. Parse out XML file into a relational format and then import it into another table

CREATE TABLE [dbo].[tbl_XML_Customer](
[OrderID] [int] NULL,
[CustomerID] [varchar](10) NULL,
[OrderDate] [datetime] NULL,
[ProdID] [int] NULL,
[Qty] [int] NULL
)

INSERT INTO [tbl_XML_Customer](
[OrderID],
[CustomerID],
[OrderDate],
[ProdID],
[Qty]
)
SELECT tab.col.value('./Order[1]/@OrderID','int') AS 'OrderID',
tab.col.value('./@CustomerID[1]','varchar(10)') AS 'CustomerID',
tab.col.value('./Order[1]/@OrderDate','datetime') AS 'OrderDate',
tab.col.value('./Order[1]/OrderDetail[1]/@ProductID','int') AS 'ProductID',
tab.col.value('./Order[1]/OrderDetail[1]/@Quantity','int') AS 'Quantity'
FROM [XmlImportTest]
CROSS APPLY
XML_Data.nodes('//Customer') AS tab(col)

SELECT *
FROM [tbl_XML_Customer]


Thank you so much in advance for your help.

jcad1
 
try this:

Code:
SELECT tab.col.value('../@OrderID','int') AS 'OrderID',
tab.col.value('../../@CustomerID','varchar(10)') AS 'CustomerID',
tab.col.value('../@OrderDate','datetime') AS 'OrderDate',
tab.col.value('@ProductID','int') AS 'ProductID',
tab.col.value('@Quantity','int') AS 'Quantity'
FROM [XmlImportTest]
CROSS APPLY
XML_Data.nodes('//Customer/Order/OrderDetail') AS tab(col)

Note that the node I start with is the order detail. I then use the .. to jump up a node level to get the data you need.



-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
May I take this opportunity to say thank you once again to gmmastros for his solution to the above problem.

I am now applying it to a real situation in which I receive large XML files and am required to import them into SQL server 2005. I have written a script (similar to the above) to import it into the table. However, I am stuck on what seems to be a minor issue but fixing it is proving be a tug of war.

The file I am working on has 164 columns of data but I have provided only a few columns to make things simple. Here is the sample data:


<?xml version="1.0" encoding="utf-8" ?>
<CDS-XMLInterchange xmlns=" xmlns:dc=" xmlns:dcq=" xmlns:gms=" xmlns:xsi=" xsi:schemaLocation=" SchemaDate="2008-01-14" SchemaVersion="6-0">
<CDSBulkGroup-060-Message CDSType="020" CDSBulkReplacementGroup="060" CDSProtocolIdentifier="020">
<CDSTransactionHeader_BulkUpdate>
<CDSType>020</CDSType>
<CDSProtocolIdentifier>020</CDSProtocolIdentifier> <CDSUniqueIdentifier>BT00000951840052</CDSUniqueIdentifier>
<CDSBulkReplacementGroup>060</CDSBulkReplacementGroup>
<CDSActivityDate>2009-06-04</CDSActivityDate>
</CDSTransactionHeader_BulkUpdate>
<Care_Activity>
<PersonGroupPatient_CareActivity_Structure>
<PatientIdentity_General_Structure>
<LocalPatientIdentifier>TV77383</LocalPatientIdentifier>
<ValidatedNHSNo_Structure ValidatedNHSNumberStatusIndicator="01">
<NHSNumberStatusIndicator>01</NHSNumberStatusIndicator>
<OrganisationCode_PCTOfResidence>7GE</OrganisationCode_PCTOfResidence>
</ValidatedNHSNo_Structure>
</PatientIdentity_General_Structure>
</PersonGroupPatient_CareActivity_Structure>
<CareAttendance>
<ActivityCharacteristics>
<AttendanceIdentifier>951840052</AttendanceIdentifier>
<FirstAttendance>2</FirstAttendance>
<OperationStatus>9</OperationStatus>
</ActivityCharacteristics>
</CareAttendance>
</Care_Activity>
</CDSBulkGroup-060-Message>
<CDSBulkGroup-060-Message CDSType="020" CDSBulkReplacementGroup="060" CDSProtocolIdentifier="020">
<CDSTransactionHeader_BulkUpdate>
<CDSType>020</CDSType>
<CDSProtocolIdentifier>020</CDSProtocolIdentifier>
<CDSUniqueIdentifier>BT00000952281808</CDSUniqueIdentifier>
<CDSBulkReplacementGroup>060</CDSBulkReplacementGroup>
<CDSActivityDate>2009-08-20</CDSActivityDate>
</CDSTransactionHeader_BulkUpdate>
<Care_Activity>
<PersonGroupPatient_CareActivity_Structure>
<PatientIdentity_General_Structure>
<LocalPatientIdentifier>T1032982</LocalPatientIdentifier>
<ValidatedNHSNo_Structure ValidatedNHSNumberStatusIndicator="01"><NHSNumber>4465861931</NHSNumber>
<NHSNumberStatusIndicator>01</NHSNumberStatusIndicator>
<OrganisationCode_PCTOfResidence>9XJ</OrganisationCode_PCTOfResidence>
</ValidatedNHSNo_Structure>
</PatientIdentity_General_Structure>
</PersonGroupPatient_CareActivity_Structure>
<CareAttendance>
<ActivityCharacteristics>
<AttendanceIdentifier>952281808</AttendanceIdentifier>
<FirstAttendance>2</FirstAttendance>
<OperationStatus>9</OperationStatus>
</ActivityCharacteristics>
</CareAttendance>
</Care_Activity>
</CDSBulkGroup-060-Message>
</CDS-XMLInterchange>


Here is the script:

-- Stage 1. Import XML file into a SQL table
CREATE TABLE tbl_XML_ALL_CDS_Import
(
xmlFileName VARCHAR(300),
xml_data XML
)

DECLARE @xmlFileName VARCHAR(300)
SELECT @xmlFileName = 'E:\Extracts\XMLData\op_test.xml'

EXEC('
INSERT INTO tbl_XML_ALL_CDS_Import(xmlFileName, xml_data)

SELECT ''' + @xmlFileName + ''', xmlData
FROM
(
SELECT *
FROM OPENROWSET (BULK ''' + @xmlFileName + ''' , SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)
')

SELECT * FROM tbl_XML_ALL_CDS_Import

DROP TABLE tbl_XML_ALL_CDS_Import

-- Stage 2. Parse out XML file into a relational format and then import it into another table
CREATE TABLE [tbl_XML_PAS_Import]
(
[CDSType] varchar(3) null,
[UniqueID] varchar(30) null,
[BlkReplmntGrp] varchar(3) null,
[ActDate] varchar(10) null,
[LocalID] varchar(10) null,
[NoStatusInd] varchar(2) null,
[OrgCode] varchar(3) null,
[AttendID] varchar(10) null,
[1stAttend] varchar(1) null,
[OPStatus] varchar(1) null
)

INSERT INTO [tbl_XML_PAS_Import]
(
[CDSType],
[UniqueID],
[BlkReplmntGrp],
[ActDate],
[LocalID],
[NoStatusInd],
[OrgCode],
[AttendID],
[1stAttend],
[OPStatus]
)
SELECT tab.col.value('./@CDSType','varchar(3)') AS 'CDSType',
tab.col.value('./CDSTransactionHeader_BulkUpdate[1]/CDSUniqueIdentifier[1]','varchar(30)') AS 'UniqueID',
tab.col.value('./CDSTransactionHeader_BulkUpdate[1]/CDSBulkReplacementGroup[1]','varchar(3)') AS 'BlkReplmntGrp',
tab.col.value('./CDSTransactionHeader_BulkUpdate[1]/CDSActivityDate[1]','varchar(10)') AS 'ActDate',
tab.col.value('./Care_Activity[1]/PersonGroupPatient_CareActivity_Structure[1]/PatientIdentity_General_Structure[1]/LocalPatientIdentifier[1]','varchar(10)') AS 'LocalID',
tab.col.value('./Care_Activity[1]/PersonGroupPatient_CareActivity_Structure[1]/PatientIdentity_General_Structure[1]/ValidatedNHSNo_Structure[1]/NHSNumberStatusIndicator[1]','varchar(2)') AS 'NoStatusInd',
tab.col.value('./Care_Activity[1]/PersonGroupPatient_CareActivity_Structure[1]/PatientIdentity_General_Structure[1]/ValidatedNHSNo_Structure[1]/OrganisationCode_PCTOfResidence[1]','varchar(3)') AS 'OrgCode',
tab.col.value('./Care_Activity[1]/CareAttendance[1]/ActivityCharacteristics[1]/AttendanceIdentifier[1]','varchar(10)') AS 'AttendID',
tab.col.value('./Care_Activity[1]/CareAttendance[1]/ActivityCharacteristics[1]/FirstAttendance[1]','varchar(1)') AS '1stAttend',
tab.col.value('./Care_Activity[1]/CareAttendance[1]/ActivityCharacteristics[1]/OperationStatus[1]','varchar(1)') AS 'OPStatus'
FROM [tbl_XML_ALL_CDS_Import]
CROSS APPLY
XML_Data.nodes('//CDSBulkGroup060Message') AS tab(col)


When I run the script it does't import the data but when I remove all the attributes in the root element 'CDS-XMLInterchange', save the file and run the script, it works fine. This is not how I want it to work. In fact, some files are so huge that it's impossible to open and edit them in an XML editor I am using. The question is:

a) How can I import the data without getting rid of the attributes (xmlns, xmlns:dc, xmlns:dcq, xmlns:gms, xmlns:xsi, xsi:schemaLocation, etc)?

b) Is there any way by which my script can be improved, e.g use a JOIN to join child elements to one another (e.g CDSTransactionHeader_BulkUpdate and Care_Activity) and pull data out of them?

I am new to XML and so your help is greatly appreciated.

Thanks

jcad1
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top