expat and line breaks in tags

Status
Not open for further replies.

postb99

Programmer
Dec 20, 2004
1
FR
Hi,

I use PHP 4.3.9-2 with the shipped expat_1.95.8 on Debian Linux.

My input XML document declares the following encoding: <?xml version="1.0" encoding="iso-8859-15"?>

I parse this document with SAX-style event handlers.

The parser code begins as follows:

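// create the parser
$xp = xml_parser_create();

// register the start/end element handlers and the character data handler
xml_set_element_handler($xp, "elementBegin", "elementEnd");
xml_set_character_data_handler($xp, "characterData");
// keep element names in their original case
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, false);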

I have the following problem: when the text content of an element in the XML input document contains a line break, my character data handler ends up keeping only the part of the content that comes after the line break. This is not what I expect.

The same happens when the character data handler encounters a character reference such as &#339;

Any idea which option I might have forgotten? I tried the iso-8859-1 encoding, but it doesn't help. If this can't be fixed, I'll switch to DOM XML everywhere, since I prefer to stick with PHP 4. I had chosen SAX because I felt it would be more efficient for reading big input files.

Thanks a lot for any feedback!
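
One possible explanation, offered as a sketch rather than a confirmed diagnosis: expat does not promise to deliver the text of an element in a single call. The character data handler may be invoked several times for one element, typically around line breaks and entity or character references. A handler that assigns the incoming chunk to a variable therefore keeps only the last chunk, which matches the symptom described above. The handler bodies below are hypothetical, since the post shows only the parser setup. The names $buffer and $values and the sample XML are assumptions made for illustration, and the code is written for a current PHP rather than checked against 4.3.9. It appends each chunk and reads the result only when the element ends.

```php
<?php
// Hypothetical handler bodies; the original post only shows the parser setup.
$buffer = '';      // text collected for the element currently open
$values = array(); // finished text, keyed by element name

function elementBegin($parser, $name, $attrs)
{
    global $buffer;
    $buffer = ''; // start a fresh buffer for each element
}

function characterData($parser, $data)
{
    global $buffer;
    // Append, never assign: the parser may call this handler several
    // times for one element, e.g. around a line break or an entity.
    $buffer .= $data;
}

function elementEnd($parser, $name)
{
    global $buffer, $values;
    $values[$name] = $buffer; // the text is complete only at this point
    $buffer = '';
}

$xp = xml_parser_create();
xml_set_element_handler($xp, "elementBegin", "elementEnd");
xml_set_character_data_handler($xp, "characterData");
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, false);

// Invented sample input with a line break and an entity inside one element.
$xml = "<doc><title>first line\nsecond line &amp; more</title></doc>";
xml_parse($xp, $xml, true);
```

This sketch only handles elements that contain text and no child elements. Mixed content would need a stack of buffers instead of a single one. It also does not address the separate question of whether a character such as &#339; can be represented in the parser's target encoding.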
 