[lxml-dev] lxml with utf-8
Daniel Jirku
nepi at gmx.ch
Wed Jun 18 16:35:05 CEST 2008
Hi,
I'm new to lxml but very interested in using it.
I now have a problem: I want to add an element whose text contains an umlaut (a non-ASCII character), so I'm using UTF-8. But as soon as I run my Python script, I get the following error:
File "lxml.etree.pyx", line 835, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:9595)
File "apihelpers.pxi", line 409, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:28436)
File "apihelpers.pxi", line 951, in lxml.etree._utf8 (src/lxml/lxml.etree.c:32423)
AssertionError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes
My script looks like this...
# -*- coding: utf-8 -*-
....
parser = etree.XMLParser(encoding="utf-8")
etree.set_default_parser(parser)
# bad sign 'ö'
badString = "blöm"
root = etree.Element("neuIns")
for i in range(5):
    tagAd = etree.SubElement(root, "ad", id=str(i))
    foo = etree.SubElement(tagAd, "foo")
    foo.text = badString.encode("utf8")
toStringValue = etree.tostring(root, encoding="utf-8", method="xml")
writeToFile(toStringValue)
--------
I only added the new parser setup and the badString.encode call to be sure everything is UTF-8. Without them it doesn't work either :)
My setup is:
Python 2.5
PyDev (Eclipse)
The default encoding in Eclipse is UTF-8, and the stdout encoding is UTF-8 as well.
I found this in the mailing list (http://article.gmane.org/gmane.comp.python.lxml.devel/2320/match=assertionerror), but I think it should be possible to write UTF-8 strings to an XML document with lxml?! What am I doing wrong?
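For comparison, here is a minimal sketch of the variant the error message seems to point at ("Unicode or ASCII"). In Python 2 a plain "..." literal is a byte string, so the idea is to hand lxml a Unicode string and leave the UTF-8 encoding to tostring. The helper name build_tree and the escaped literal u"bl\u00f6m" are my own choices for illustration, and the ElementTree fallback import is only an assumption so that the sketch also runs where lxml is not installed.

```python
# -*- coding: utf-8 -*-
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib ElementTree API so the sketch
    # still runs without lxml; the calls used below exist in both.
    import xml.etree.ElementTree as etree


def build_tree(text):
    """Build the <neuIns> tree from the script, setting every <foo> text."""
    root = etree.Element("neuIns")
    for i in range(5):
        tag_ad = etree.SubElement(root, "ad", id=str(i))
        foo = etree.SubElement(tag_ad, "foo")
        # Assign the Unicode string directly, without calling .encode() first.
        foo.text = text
    return root


# A Unicode literal; the escape avoids depending on the source file encoding.
bad_string = u"bl\u00f6m"

root = build_tree(bad_string)

# Encoding to UTF-8 happens here, at serialization time.
xml_bytes = etree.tostring(root, encoding="utf-8")
```

If this assumption holds, the .encode("utf8") call on foo.text is not needed at all, and xml_bytes can be written to the file as it is.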
hope you can help..
thanks in advance..
dani.