[lxml-dev] Weird errors in tostring
Bruno
brunobg at gmail.com
Sat Apr 12 22:38:20 CEST 2008
Hi,
I'm getting a weird error in lxml.html.tostring. It happens on one machine but
not on another. Both use lxml 2.0.2, but the one that always works has Python
2.5 and the one that fails has Python 2.4. Here's the relevant backtrace:
  File "/home/spyder/spyder/core/base.py", line 289, in treetostring
    return tostring(root, method='xml', encoding=unicode)
  File "/usr/lib/python2.4/site-packages/lxml-2.0.2-py2.4-linux-i686.egg/lxml/html/__init__.py", line 1313, in tostring
    encoding=encoding)
  File "lxml.etree.pyx", line 2455, in lxml.etree.tostring
  File "serializer.pxi", line 61, in lxml.etree._tostring
  File "serializer.pxi", line 126, in lxml.etree._tounicode
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 21-24: invalid data
On the other machine all goes well. FYI, the tree (the root variable) is
built with root = lxml.html.fromstring(data). I'm parsing data in UTF-8 and
ISO-8859-1, and this particular backtrace came from an HTML document
correctly labelled with a meta charset=iso-8859-1.
Do you have any ideas on how to trace what is going wrong?
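
One way to start tracing this, sketched with the standard library only: check
whether the raw bytes are valid UTF-8, look at the bytes that are not, and
compare that with the charset the document declares. The helper names
(declared_charset, utf8_problem, to_text) and the sample document are
hypothetical and not part of lxml, and the sketch is written for Python 3
rather than the Python 2.4/2.5 setup above. It does not claim to explain why
the two machines behave differently.

```python
import re

# Hypothetical helpers for diagnosis; none of these names come from lxml.
# Matches the charset value inside the first <meta ... charset=...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.I
)


def declared_charset(raw, default="utf-8"):
    """Return the charset named in the first meta tag, or the default."""
    match = META_CHARSET.search(raw)
    return match.group(1).decode("ascii") if match else default


def utf8_problem(raw):
    """Return None if raw is valid UTF-8, else (start, end, offending bytes)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.end, raw[exc.start:exc.end]
    return None


def to_text(raw):
    """Decode raw HTML bytes using the charset the document itself declares."""
    return raw.decode(declared_charset(raw))


# Made-up sample: an ISO-8859-1 document with one non-ASCII character.
sample = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=iso-8859-1"><p>caf\xe9</p>'
).encode("iso-8859-1")

print(declared_charset(sample))
print(utf8_problem(sample))
print(to_text(sample))
```

If utf8_problem reports bytes near the positions named in the traceback, the
input is probably being treated as UTF-8 somewhere despite its label. A
possible next experiment, untested here, is to pass the already decoded text
from to_text to lxml.html.fromstring on both machines and compare the results.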