Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Westi on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

XSLT - finding values 1

Status
Not open for further replies.

Sniipe

Programmer
Oct 9, 2006
115
IE
I have an XML document that looks somewhat like this
Code:
<Workbook>
 ...
 <Worksheet ss:Name="x">
  <Table ...>
   <Row>
    ...
   </Row>
   <Row>
    <Cell>
     <Data ss:Type="String">MPRCode</Data>
    </Cell>
    <Cell ss:StyleID="s77" id="[b][u]MPRCodeVal[/u][/b]">
     <Data ss:Type="String">[b][u]MPR001[/u][/b]</Data>
    </Cell>
    <Cell ...>
    </Cell>
   </Row>
  </Table>
 </Worksheet>
 <Worksheet ss:Name="y">
 ...
 </Worksheet>
 <Worksheet ss:Name="z">
 ...
 </Worksheet>
</Workbook>

Essentially a load of Rows and columns... but I have added in the id's.
What I know is the id (eg MPRCodeVal - shown underlined above), but what I need is the value MPR001 (shown underlined above)which is in a sub element of the Cell...
I have a good few id's to search for and am wondering how to build this part for the XSLT sheet.

Thanks for your help.
 
[0] Using xslt as a tool of query/search is admittedly not tapping its central design goal. It is still a useful and handy use of xslt. Eventually xquery would be the tool proper for the job. This is just an aside to keep in mind.

[1] The very direct approach is to treat the workbook xml as a lookup table. Group all the id needed to search for in an independent xml document and leave the lookup data xml alone.

[1.1] Let's say the workbook xml is a file "data.xml" (in the same directory).

[2] The search data xml can be made as simple and as straight forward as possible. Like this.
[tt]
<root>
<searchid>MPRCodeVal</searchid>
<searchid>anotherMPRCodeVal</searchid>
<!-- etc -->
</root>
[/tt]
[3] All the work is done by the xslt running against the search xml above, such as this.
[tt]
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="[ignore][/ignore]"
xmlns:ss="urn:schemas-microsoft-com:eek:ffice:spreadsheet"
>
<xsl:eek:utput method="text" />
<xsl:variable name="crlf" select="'&#x0d;&#x0a;'" />
<xsl:variable name="tab" select="'&#x09;'" />
<xsl:template match="/">
<xsl:apply-templates select="//searchid" />
</xsl:template>
<xsl:template match="searchid">
<xsl:value-of select="concat(.,':',$crlf)" />
<xsl:for-each select="document('[blue]data.xml[/blue]')//ss:Cell[@id=normalize-space(current()/ss:Data/text())]">
<xsl:value-of select="concat($tab,normalize-space(.),$crlf)" />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
[/tt]
The above is a quick sketch for a possible approach.
 
Amendment
A last minute change of mind is done on the wrong lines. These are the proper reading of the two lines.
[tt]
<xsl:for-each select="document('data.xml')//ss:Cell[@id=[red]normalize-space(current())[/red]]">
<xsl:value-of select="concat($tab,[red]normalize-space(./ss:Data)[/red],$crlf)" />
</xsl:for-each>
[/tt]
 
wow, what have I let myself into. Again tsuji, thank you for your valuable post. You are invaluable to tek-tips.com and have never steared me wrong. It will take me a while to disect your post.
I hear what you are saying about this route probably not being ideal.
This is what actually happens...
Users download an excel file (which has been generated from an XML document into an XML spreadsheet.
Typically users make changes to this exact same file then upload it.
Now I have an XML spreadsheet (as per my first post in this thread)
Esentially I want to upload the values to a DB, but I want to make the XML file friendlier... so I'm going to transform the difficult XML into easier to use XML for my DB uploads.

I will re-read more after lunch and repost how I get on.
 
ok; I'm trying to figure out what you did there tsuji. I have 3 files...
data.xml - contains the XML spreadsheet
transform.xsl - contains your searching transformation
search.xml - contains the id's I'm searching for.

I would transform the search.xml against the transform.xsl?

At the moment I'm not getting any results back... I will look into it when I'm free again.
 
[4] >I would transform the search.xml against the transform.xsl?
Yes, that is the idea. A couple of points to make sure.

[4.1] The workbook (data.xml), I suppose is having a default namespace of "urn:schemas-microsoft-com:eek:ffice:spreadsheet" which is prefixed in the transform.xsl as "ss" and is declared in the root xsl:stylesheet as well.

[4.1.1] That means I suppose the workbook (data.xml) starts out something at least with the default namespace declared like this, simplified out all the other namespaces commonly appeared in a ms office excel xml document.
[tt]
<?xml version="1.0" ?>
<?mso-microsoft progid="Excel.Sheet"?>
<Workbook xmlns:ss="urn:schema-microsoft-com:eek:ffice:spreadsheet" xmlns="urn:schema-microsoft-com:eek:ffice:spreadsheet">
<!-- etc etc, meaning Row, Cell and Data tags are in the said namespace -->
</Workbook>
[/tt]
[4.2] The data.xml should be placed in the same folder as current directory of the application, usually case, being the directory of search.xml and transform.xsl.

[4.3] What to expect? The output would be something like this.
[tt]
MPRCodeVal:
MPR001
anotherMPRCodeVal:

[/tt]
Just because there is only one found for MPRCodeVal and zero found for anotherMPRCodeVal.
 
The output that I expect would be another XML file that would be cleaner than the spreadsheet. This XML file will then be worked on by vb.net

I will look into the data.xml and transform.xsl again and get back to you when I get further with the problem. Thanks again
 
I was getting results but my transform program wasn't displaying properly... I'm using XMLPad... is there anything better. Either way I'm getting some of the text out:

Invalid at the top level of the document. Error processing resource 'file:///C:/Projects/Nortel/MarketingPromotions/Applica...

MPRCodeVal:
^
 
[5] Output to different format is just some mechanical transposition of it. The method="text" is here for clean illustration free of additional noise. It should not be a generic difficulty.

[6] As to top level document error, I am not in a position to debug. Time is on your side to correct all typos etc etc.
 
I'm back to this again, its not transforming, because I get back a square character at the beginning of the string. This is a seperate problem to do with .net - I'm sure I'll figure it out.

Otherwise I'm getting the following with my simple search xml.

?MPRCodeVal:
PromotionNameVal:
PromotionGroupCodeVal:
StartDateVal:
EndDateVal:
DescriptionVal:

The start of my data.xml doc looks like this

<?xml version="1.0" encoding="utf-8"?>
<Workbook xmlns="urn:schemas-microsoft-com:eek:ffice:spreadsheet" xmlns:eek:="urn:schemas-microsoft-com:eek:ffice:eek:ffice" xmlns:x="urn:schemas-microsoft-com:eek:ffice:excel" xmlns:ss="urn:schemas-microsoft-com:eek:ffice:spreadsheet" xmlns:html=" <DocumentProperties xmlns="urn:schemas-microsoft-com:eek:ffice:eek:ffice">

I've opened all three documents in ie to ensure that they are all ok - and they are.

changing the xsl output in the XSL document changes the output to:
?<?xml version="1.0" encoding="utf-8"?>MPRCodeVal:
PromotionNameVal:
PromotionGroupCodeVal:
StartDateVal:
EndDateVal:
DescriptionVal:
With the square box before the <?xml...

but with the error this time being:
The path contains illegal characters.

attached is an image of the file structure on IE of data.xml
 
I take it the magic happens here:
Code:
<xsl:for-each select="document('data.xml')//ss:Cell[@id=normalize-space(current())]"><xsl:value-of select="concat($tab,normalize-space(./ss:Data),$crlf)"/></xsl:for-each>
It selects the document data.xml which is in the same directory.

Then does it go looking for ss:Cell ?
The line I want would be in
Code:
...
<Cell ss:StyleID="s77" id="MPRCodeVal">
   <Data ss:Type="String">MPR001</Data>
</Cell>
...

I'm a little puzzled by your work, trying to figure it out.
 
I'm trying to get it to output to XML
 
From the jpg, it does not show Workbook, Row, Cell, Data be in any namespace, and that ss prefix is defined nowhere, so I take it that it is not the complete document (that's fine, don't post the full thing.) You have to determine what namespace are they in and declare the prefix ss in the xslt document accordingly - nothing is more important than that.
 
The ss prefix is defined on the second line of the data.xml as:
<Workbook xmlns="urn:schemas-microsoft-com:eek:ffice:spreadsheet" xmlns:eek:="urn:schemas-microsoft-com:eek:ffice:eek:ffice" xmlns:x="urn:schemas-microsoft-com:eek:ffice:excel" xmlns:ss="urn:schemas-microsoft-com:eek:ffice:spreadsheet" xmlns:html="
You have to determine what namespace are they in and declare the prefix ss in the xslt document accordingly - nothing is more important than that.
The order to find the value I need to search for is:
workbook/worksheet/table/row/cell (its id value)
With the value I need being the next sub element
workbook/worksheet/table/row/cell/data (its element value)

I don't know what you mean by
it does not show Workbook, Row, Cell, Data be in any namespace

I think at this stage I will give up on the XML/XSLT stuff. It will be probably less time consuming for me to try and understand xslt than to go thru an ugly xml spreadsheet document via vb.net
 
I am surprised you have found it unsurmountable. I am not going to be advocate of merits of different approaches. You find a way you feel comfortable, go with it.
 
Sorry tsuji, its just that I'm not familar with XSLT - and I'm trying to use it beacuse I thought it would be the best way forward.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top