Tek-Tips is the largest IT community on the Internet today!

Change Date format 1

Status
Not open for further replies.

sem

Programmer
Jun 3, 2000
4,709
UA
I have an XML document containing:
Code:
<property name="dc" type="timestamp">17 March 2004 05:43:41</property>

I need to convert it to

Code:
<dc>2004-03-17</dc>

Can anybody help me?

Regards, Dima
 
I would look at the substring() function to do most of the work. Converting "March" to 3 is tougher, and I'm not sure how you'd do that.

BTW, in XML terms, that's not really a date value -- it's a string that looks like a date. The 'real' XML date format is the same as ISO-8601: YYYY-MM-DDTHH:MM:SS.NNNN (plus a timezone offset). If you use the ISO-8601 format from end to end, you don't have any conversion problems.
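To make the ISO-8601 shape concrete, here is a minimal Java sketch (Java because Xalan comes up later in the thread). The class name `IsoTimestampDemo` and the +02:00 offset are assumptions chosen for illustration, not something taken from the original document.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; the name is illustrative only.
class IsoTimestampDemo {
    // Formats a timestamp with an explicit UTC offset as ISO-8601.
    static String toIso(OffsetDateTime ts) {
        return ts.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    // Extracts the date part of an ISO-8601 timestamp; no month-name lookup is needed.
    static String dateOnly(String iso) {
        return OffsetDateTime.parse(iso).toLocalDate().toString();
    }

    public static void main(String[] args) {
        // The +02:00 offset is an assumed value for illustration.
        OffsetDateTime ts = OffsetDateTime.of(2004, 3, 17, 5, 43, 41, 0, ZoneOffset.ofHours(2));
        String iso = toIso(ts);
        System.out.println(iso);           // 2004-03-17T05:43:41+02:00
        System.out.println(dateOnly(iso)); // 2004-03-17
    }
}
```

With the timestamp already in this form, getting `2004-03-17` is a plain substring or a single parse, with no locale involved.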

Chip H.


If you want to get the best response to a question, please check out FAQ222-2244 first
 
Chip is (as always) right:
you'd save yourself a lot of trouble if you used the ISO format.
Anyway, you can get it done with the substring functions.
To parse the month, you'll need an extra XML file that maps month names to numbers. I named it 'months.xml':
Code:
<months>
   <month name='January' number='01' />
   <month name='February' number='02' />
   <month name='March' number='03' />
   <month name='April' number='04' />
   <!-- continue the same pattern through December (number='12') -->
</months>
A stylesheet could contain this:
Code:
<xsl:variable name="months" select="document('months.xml')//month" />
...
<xsl:template match="//property[@type='timestamp']">
  <!-- Name the output element after the name attribute, e.g. dc -->
  <xsl:element name="{@name}">
    <!-- Input looks like "17 March 2004 05:43:41" -->
    <xsl:variable name="day" select="substring-before(.,' ')" />
    <xsl:variable name="rest" select="substring-after(.,' ')" />
    <!-- Look the month name up in months.xml -->
    <xsl:variable name="month" select="$months[@name=substring-before($rest,' ')]/@number" />
    <xsl:variable name="year" select="substring-before(substring-after($rest,' '),' ')" />
    <!-- Zero-pad the day so that "3" becomes "03" -->
    <xsl:value-of select="concat($year,'-',$month,'-',format-number($day,'00'))" />
  </xsl:element>
</xsl:template>
 
Thank you, Jel and Chiph. The real problem is that the original format is even worse: dd MMMM yyyy hh:mm:ss. Besides it's locale-aware, thus there's no simple way to obtain 3 from either March or ???? or ???????? etc. Fortunately I use Xalan and can write a Java extension to handle this, though a generic solution would be better.
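A Java extension along those lines might look like the sketch below. It is only an illustration: the class name `TimestampExt`, the method name `toIsoDate`, and the idea of passing the locale as a language tag are all assumptions, and it assumes a 24-hour clock (`HH`) because the `hh` in the pattern above would need an am/pm marker to be unambiguous.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper; class and method names are illustrative only.
// To call it from a stylesheet, the class would need to be public and on the classpath.
class TimestampExt {
    // Parses text such as "17 March 2004 05:43:41" using the month names of the
    // given locale (a language tag such as "en" or "ru") and returns yyyy-MM-dd.
    public static String toIsoDate(String text, String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        // Assumption: 24-hour clock (HH). "d" accepts one- or two-digit days.
        DateTimeFormatter in = DateTimeFormatter.ofPattern("d MMMM yyyy HH:mm:ss", locale);
        return LocalDateTime.parse(text.trim(), in).toLocalDate().toString();
    }

    public static void main(String[] args) {
        System.out.println(toIsoDate("17 March 2004 05:43:41", "en")); // 2004-03-17
    }
}
```

The month names come from the JDK's locale data, so no hand-written lookup table is needed, but the exact spelling accepted for a given locale depends on the JDK version. Xalan-J can call static Java methods from a stylesheet through its Java extension namespace; check the Xalan extensions documentation for the exact namespace declaration and call syntax.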


Regards, Dima
 
sem said:
Besides it's locale-aware, thus there's no simple way to obtain 3 from either March or ???? or ???????? etc.
I was afraid of that -- it's the worst-case scenario for dates. Not only is it not in ISO-8601, but the day/month order can change, so you don't know if it's March 2nd or February 3rd, or even whether the month is in English, Russian, or maybe Turkish!

If you can talk to the person sending you the document and get them to switch formats, that would be best. After that, all you have to worry about is the timezone!

(aren't dates fun?)

jel -
Thanks for the code snippet!

Chip H.

 
sem -
If you can get them to switch to UTF-8 encoding, you won't have any trouble with mixing Cyrillic and Latin characters in your document.
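A small Java sketch of why the encoding matters: text that survives a UTF-8 round trip turns into question marks when pushed through Latin-1. The class name `EncodingRoundTrip` is hypothetical, and the sample word is an assumption chosen for illustration, since the original Cyrillic text did not survive in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is illustrative only.
class EncodingRoundTrip {
    // Encodes text in the given charset and decodes it again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // A five-letter Russian month name, written as Unicode escapes so the
        // source file's own encoding cannot interfere.
        String month = "\u043c\u0430\u0440\u0442\u0430";
        System.out.println(roundTrip(month, StandardCharsets.UTF_8).equals(month)); // true
        System.out.println(roundTrip(month, StandardCharsets.ISO_8859_1));          // ?????
    }
}
```

`String.getBytes(Charset)` replaces characters the charset cannot represent with that charset's replacement byte, which for ISO-8859-1 is `?`.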

A good book for international coding is:
    Developing International Software, by Microsoft Press

It's geared for .NET development, but there's a ton of good reference material in there too (code pages, keyboard layouts, date/time formats, and so forth).

Chip H.

 
Who do you mean by "them"? My document was in Unicode, and those question marks were generated by this site: I typed the characters correctly :)

Regards, Dima
 
I did a "view source" on this page, and the <html> tag doesn't have any encoding specified. Doesn't that default it to ISO-8859-1 (Latin-1)??

Chip H.

 
The default encoding is whatever the web server specifies; it is sent as part of the HTTP Content-Type header, e.g. Content-Type: text/html; charset=UTF-8

In Tek-Tips' case it's UTF-8.
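As a rough illustration of where that charset lives, the Java sketch below pulls the `charset` parameter out of a Content-Type value. The class and method names are hypothetical; in practice the header string would come from something like `URLConnection.getContentType()`.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; class and method names are illustrative only.
class ContentTypeCharset {
    // Returns the charset parameter of a Content-Type value, if one is present.
    static Optional<String> charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the media type
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // Optional[UTF-8]
        System.out.println(charsetOf("text/html"));                // Optional.empty
    }
}
```

When the result is empty, the browser has to fall back on other hints or on its own default.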

IIRC, UTF-8 is the default charset for an Apache install; for IIS, I believe the server's language determines the default charset.

Here's a free tool to allow you to check the HTTP headers:

<marc> i wonder what will happen if i press this...
  • please tell us if our suggestion has helped
  • need some help? faq581-3339
 
Site management has said that they filter out a lot of the Unicode character range. Cyrillic is included in that range.

Chip H.

 