utf-8 encoded strings in RSS feeds

Status
Not open for further replies.

MacTommy

Programmer
Feb 26, 2007
116
NL
Dear all,

I am using XML::RSS::Parser (Perl 5.8.8) to download RSS feeds. Many of these feeds are UTF-8 encoded, and the raw XML looks like this:

Code:
<?xml version="1.0" encoding="UTF-8"?>
...
<item>
 <title>€˜Title with quotesg’</title>
 <link>http://feeds.someRSS.nl/</link>
 <description>This is what an &euml; looks like: ë
 </description> 
</item>
...

I don't really know what is wrong here...
- Is this correct UTF-8, and is it just my way of viewing it that is wrong?
- Do I have to convert every string in some way?
- Is there some option in the XML parser that makes it handle the encoding correctly by itself (I mean, the encoding is declared in the XML header...)?
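
For the first question, a small sketch may help narrow things down. It is written in Python rather than Perl, purely to illustrate the byte-level effect, and the helper names are made up for this example. It assumes the viewer is decoding the bytes as Windows-1252, which is only a guess; under that assumption, correct UTF-8 bytes for a curly quote or an ë turn into multi-character garbage much like the sample above:

```python
# Illustration only: these helper names are hypothetical, not part of any library.
# Assumption: the viewer interprets the bytes as Windows-1252 (cp1252).


def view_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_misreading(garbled: str) -> str:
    """Reverse the misreading: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    # U+2018 is a left curly quote, U+00EB is e with diaeresis.
    for original in ["\u2018", "\u00eb"]:
        raw = original.encode("utf-8")
        garbled = view_utf8_as_cp1252(original)
        # Show the UTF-8 bytes, the garbled characters (as escapes),
        # and whether reversing the misreading recovers the original.
        print(raw.hex(" "), ascii(garbled), undo_misreading(garbled) == original)
```

If the garbled text round-trips back to the intended character like this, the feed bytes themselves are probably valid UTF-8 and the problem lies in how they are decoded or displayed. Note that a few byte values have no Windows-1252 mapping in Python's codec, so this check can raise an error for some characters.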

Thanks a lot!
 