[lxml-dev] Struggling with unicode again

Collioud, Olivier Olivier.Collioud at wipo.int
Wed Jun 3 19:16:11 CEST 2009


Hi,

I'm trying to update an etree in a WSGI application with data coming from a posted form.
The data is converted first using urllib.unquote_plus.

I know that the data (text) is then UTF-8 encoded.

lxml raises:

Traceback (most recent call last):
  File "D:/Applications/IPC_Definitions_Editor/defedit/defedit.py", line 130, in application
    elt.text = text
  File "lxml.etree.pyx", line 835, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:9595)
  File "apihelpers.pxi", line 409, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:28436)
  File "apihelpers.pxi", line 951, in lxml.etree._utf8 (src/lxml/lxml.etree.c:32423)
AssertionError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes

What encoding do I need to convert 'text' to, and how?
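
Going by the assertion message, lxml wants either a unicode object or a pure-ASCII byte string, so a UTF-8 byte string containing non-ASCII characters would presumably have to be decoded before it is assigned to elt.text. On Python 2 that would be something like urllib.unquote_plus(value).decode('utf-8'). Below is a minimal sketch of the same idea in Python 3 syntax, using only the standard library. The helper name form_value_to_text and the sample value are made up for illustration, and it assumes the form really is submitted as UTF-8:

```python
from urllib.parse import unquote_to_bytes


def form_value_to_text(raw):
    """Turn a percent-encoded form value into a Unicode string.

    Hypothetical helper: assumes the posted bytes are UTF-8 encoded.
    """
    # In application/x-www-form-urlencoded data, '+' stands for a space.
    data = unquote_to_bytes(raw.replace("+", " "))
    # Decode the UTF-8 bytes into a Unicode string; this raises
    # UnicodeDecodeError if the input is not valid UTF-8.
    return data.decode("utf-8")


# Made-up sample value: "café crème" as a browser would post it in UTF-8.
text = form_value_to_text("caf%C3%A9+cr%C3%A8me")
print(text)
```

The decoded value, rather than the raw byte string, is what would then be assigned to elt.text.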

Thanks,

Olivier.

