[lxml-dev] lxml with utf-8

Daniel Jirku nepi at gmx.ch
Wed Jun 18 16:35:05 CEST 2008


hi..

i'm new to lxml but very interested in using it...

i now have a problem. i want to add an element whose text contains an umlaut (a non-ASCII character), so i'm using utf-8. but as soon as i run my python script, i get the following error:

  File "lxml.etree.pyx", line 835, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:9595)
  File "apihelpers.pxi", line 409, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:28436)
  File "apihelpers.pxi", line 951, in lxml.etree._utf8 (src/lxml/lxml.etree.c:32423)
AssertionError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes

My script looks like this...

    # -*- coding: utf-8 -*-
    ....
    parser = etree.XMLParser(encoding="utf-8")
    etree.set_default_parser(parser)
    
    # problematic character: 'ö'
    badString = "blöm"
    
    root = etree.Element("neuIns")
    for i in range(5):
        tagAd = etree.SubElement(root, "ad", id=str(i))
        foo = etree.SubElement(tagAd, "foo")
        foo.text = badString.encode("utf8")

    toStringValue = etree.tostring(root, encoding="utf-8", method="xml")
    writeToFile(toStringValue)

    --------
the new parser setup and the badString.encode call i only added to be sure everything is utf-8... without them it doesn't work either :)

my setup is:
python 2.5
pydev (eclipse)
default encoding in eclipse is utf-8, also stdout encoding is utf-8

i found this on the mailing list (http://article.gmane.org/gmane.comp.python.lxml.devel/2320/match=assertionerror), but i think it should be possible to write utf-8 strings to an xml document with lxml?! what am i doing wrong...
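
going by the wording of the assertion ("Unicode or ASCII"), one variant to try is handing .text a unicode string instead of utf-8 encoded bytes, and leaving the encoding to tostring(). a minimal sketch of that idea, not a confirmed fix: build_tree and good_string are made-up names for illustration, the parser setup and writeToFile are left out, and the fallback import assumes the standard library's ElementTree covers the same calls.

```python
# -*- coding: utf-8 -*-
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree offers the same Element,
    # SubElement and tostring calls used below.
    import xml.etree.ElementTree as etree


def build_tree(text):
    """Build the <neuIns> tree from the script above with `text` in every <foo>."""
    root = etree.Element("neuIns")
    for i in range(5):
        tag_ad = etree.SubElement(root, "ad", id=str(i))
        foo = etree.SubElement(tag_ad, "foo")
        # Assign the Unicode string itself, not text.encode("utf8").
        foo.text = text
    return root


# The u"" prefix makes this a unicode object on Python 2;
# on Python 3 it is an ordinary str.
good_string = u"bl\u00f6m"

root = build_tree(good_string)

# Encoding to UTF-8 happens once, here, when the tree is serialized.
xml_bytes = etree.tostring(root, encoding="utf-8")
```

if the text arrives as utf-8 bytes from somewhere else, the matching step would be to decode it first (some_bytes.decode("utf-8")) before assigning it to .text.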

hope you can help..
thanks in advance..
dani.