xml.etree.ElementTree - xml.parsers.expat.ExpatError: undefined entity

Status
Not open for further replies.

JustinEzequiel

Programmer
Jul 30, 2001
1,192
PH
I thought that just setting the entity dict of the XMLParser instance would be sufficient,
but evidently it's not enough.

What am I missing?

I know that I can replace the named entities with their Unicode equivalents, but that would mess up my output.
I am asked, among other things, to check whether the idxname matches the surname.
If it does not, I am to flag an error giving the line and column numbers
so that they can find and fix the error in the XML file.
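
To make the matching rule concrete, here is a rough sketch of the comparison. It assumes that "matches" means the idxname equals the surname with its accents stripped, which is only a guess based on the Muñoz/Munoz sample; `fold_accents` and `names_match` are made-up helper names, not part of any library.

```python
import unicodedata

def fold_accents(text):
    """Return text with combining marks removed, so n-tilde becomes plain n."""
    decomposed = unicodedata.normalize('NFKD', text)
    return u''.join([ch for ch in decomposed if not unicodedata.combining(ch)])

def names_match(surname, idxname):
    """True when idxname equals the accent-folded surname (assumed rule)."""
    return fold_accents(surname) == idxname

surname = u'Mu\xf1oz'                      # the parsed text of <surname>
print(names_match(surname, u'Munoz'))      # True
print(names_match(surname, u'Munos'))      # False
```

Letters that have no decomposition (for example the Polish l with stroke) pass through unchanged, so a real check would probably need an extra mapping table for those.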

(Oh, and yes, I am stuck with Python 2.5.4.)

Thanks in advance.

Justin

source code
Code:
from xml.etree import ElementTree
from htmlentitydefs import name2codepoint
from StringIO import StringIO
import unicodedata

def getParser():
    xp = ElementTree.XMLParser()
    for k, v in name2codepoint.iteritems():
        xp.entity[k] = unichr(v)

    return xp

test = '''<surnamegrp>
<surname print="yes">Mu&ntilde;oz</surname>
<idxname>Munoz</idxname>
</surnamegrp>'''

if __name__ == '__main__':
    print 'ntilde' in name2codepoint # True
    xp = getParser()
    print 'ntilde' in xp.entity # True
    print unicodedata.name(xp.entity['ntilde']) # LATIN SMALL LETTER N WITH TILDE

##    xp.feed(test)
##    e = xp.close()
    b = StringIO(test)
    t = ElementTree.parse(b, xp)

output
Code:
C:\Users\justin\Desktop>parserTest.py
True
True
LATIN SMALL LETTER N WITH TILDE
Traceback (most recent call last):
  File "C:\Users\justin\Desktop\parserTest.py", line 27, in <module>
    t = ElementTree.parse(b, xp)
  File "C:\Python25\lib\xml\etree\ElementTree.py", line 862, in parse
    tree.parse(source, parser)
  File "C:\Python25\lib\xml\etree\ElementTree.py", line 586, in parse
    parser.feed(data)
  File "C:\Python25\lib\xml\etree\ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: undefined entity: line 2, column 23
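
The position in that message is also available as attributes on the exception, which is handy for reporting line and column numbers. A minimal sketch that uses expat directly on the same test document; `find_parse_error` is a hypothetical helper name, and the `sys.exc_info()` spelling is only there so the same code is accepted by both Python 2.5 and 3.x.

```python
import sys
from xml.parsers import expat

doc = '''<surnamegrp>
<surname print="yes">Mu&ntilde;oz</surname>
<idxname>Munoz</idxname>
</surnamegrp>'''

def find_parse_error(text):
    """Return (message, line, column) for the first expat error, or None."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(text, True)
    except expat.ExpatError:
        err = sys.exc_info()[1]   # avoids the 'except X, e' / 'except X as e' split
        # lineno is 1-based, offset is the 0-based column of the offending token
        return expat.ErrorString(err.code), err.lineno, err.offset
    return None

print(find_parse_error(doc))   # ('undefined entity', 2, 23)
```

Column 23 is the position of the `&` in `&ntilde;`, counting from zero.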

Code:
>>> import sys
>>> sys.version
'2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)]'
 
Apparently, if I add a DOCTYPE, it works:

Code:
<!DOCTYPE nul SYSTEM "nul.dtd">
<surnamegrp>
<surname print="yes">Mu&ntilde;oz</surname>
<idxname>Munoz</idxname>
</surnamegrp>
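
A likely explanation: expat treats an unknown entity as a fatal error unless the document could have declarations it has not seen, for example a DOCTYPE with an external (SYSTEM) subset. Only in that case is the reference handed back to the application, which is where a lookup table such as the entity dict can come into play. If editing the XML file is not an option, expat's `UseForeignDTD(True)` asks the parser to behave as if such an external subset existed. The sketch below uses `xml.parsers.expat` directly rather than ElementTree; `collect_text` is a hypothetical name, and the HTML entity table is the same one used in the code above.

```python
from xml.parsers import expat
try:
    from htmlentitydefs import name2codepoint    # Python 2
except ImportError:
    from html.entities import name2codepoint     # Python 3
try:
    unichr
except NameError:
    unichr = chr                                  # Python 3 has no unichr

doc = '''<surnamegrp>
<surname print="yes">Mu&ntilde;oz</surname>
<idxname>Munoz</idxname>
</surnamegrp>'''

def collect_text(text):
    """Parse text, resolving HTML named entities; return (text, refs)."""
    chunks = []   # character data in document order
    refs = []     # (entity name, line, column) for every entity resolved here
    parser = expat.ParserCreate()
    parser.UseForeignDTD(True)    # must be called before any data is parsed

    def on_skipped_entity(name, is_parameter_entity):
        # Called instead of raising "undefined entity"; the position still
        # refers to the original, unmodified document.
        refs.append((name, parser.CurrentLineNumber, parser.CurrentColumnNumber))
        chunks.append(unichr(name2codepoint[name]))

    parser.CharacterDataHandler = chunks.append
    parser.SkippedEntityHandler = on_skipped_entity
    parser.Parse(text, True)
    return u''.join(chunks), refs

text, refs = collect_text(doc)
print(refs[0][0])   # ntilde
```

With ElementTree on 2.5 the underlying expat parser appears to be the private `_parser` attribute (it shows up in the traceback above as `self._parser.Parse`), so calling `xp._parser.UseForeignDTD(True)` before parsing might have the same effect. That relies on a private attribute and is untested here.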
 