Hi,
I use PHP 4.3.9-2 with the shipped expat_1.95.8 on Debian Linux.
My input XML document declares the following encoding: <?xml version="1.0" encoding="iso-8859-15"?>
I parse this document with the SAX-style (event-based) XML parser functions.
The beginning of the parser code is:
$xp = xml_parser_create();
// set element handler
xml_set_element_handler($xp, "elementBegin", "elementEnd");
xml_set_character_data_handler($xp, "characterData");
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, false);
I have the following problem: when there is a line break inside the text content of an element in the XML input document, my character data handler only keeps the part of the content that comes after the line break. This is not what I expect.
The same thing happens when the character data handler encounters an entity such as œ.
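One thing I have not ruled out: expat is allowed to deliver the text of a single element in several calls to the character data handler, and line breaks and entities are typical places where it splits. If my handler assigns each chunk instead of appending it, only the last chunk would survive, which matches what I see. Below is a minimal sketch of the appending variant; the handler bodies, the globals $buffer and $collected, and the sample document are made up for illustration and are not my real code.

```php
<?php
// Hypothetical globals for this sketch: $buffer holds the text of the
// element currently being read, $collected maps element names to text.
$buffer = '';
$collected = array();

function elementBegin($parser, $name, $attrs)
{
    global $buffer;
    $buffer = ''; // start with an empty buffer for every element
}

function characterData($parser, $data)
{
    global $buffer;
    // Append rather than assign: the parser may call this handler
    // several times for the content of one element.
    $buffer .= $data;
}

function elementEnd($parser, $name)
{
    global $buffer, $collected;
    $collected[$name] = $buffer; // the element is complete only here
    $buffer = '';
}

// Made-up sample input with a line break and an entity in the content.
$xml = "<doc><title>first line\nsecond line &amp; more</title></doc>";

$xp = xml_parser_create();
xml_set_element_handler($xp, "elementBegin", "elementEnd");
xml_set_character_data_handler($xp, "characterData");
xml_parser_set_option($xp, XML_OPTION_CASE_FOLDING, false);

if (!xml_parse($xp, $xml, true)) {
    die(xml_error_string(xml_get_error_code($xp)));
}
```

With this sketch, $collected['title'] should hold the whole content, however many chunks the parser delivered. The single buffer is only good enough for elements that contain plain text; mixed content would need a stack of buffers.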
Any idea which option I could have forgotten? I tried declaring iso-8859-1 as the encoding instead, but it doesn't help. If it's not possible, I'll switch to DOM XML everywhere, since I prefer sticking to PHP 4, although I felt SAX would be more efficient for reading big input files...
Thanks a lot for any feedback!