[lxml-dev] Help with an error message
Stefan Behnel
stefan_ml at behnel.de
Thu Jan 3 17:30:22 CET 2008
Hi,
Konstantin Ryabitsev wrote:
> I'm having trouble with the following case. One of my automatic import
> scripts takes data from one source and submits it to another as an XML
> feed. Recently, it started failing because one of the entries contains
> a null. The testcase is such:
>
> from lxml.etree import Element
> sourcestr = 'Contains a null: \x00'
> unistr = unicode(sourcestr, 'utf-8')
> elt = Element('foo').text = unistr
>
> Running it will cause the following error:
>
> Traceback (most recent call last):
> File "foo.py", line 6, in <module>
> elt = Element('foo').text = unistr
> File "etree.pyx", line 741, in etree._Element.text.__set__
> File "apihelpers.pxi", line 344, in etree._setNodeText
> File "apihelpers.pxi", line 648, in etree._utf8
> AssertionError: All strings must be XML compatible, either Unicode or ASCII
>
> Can someone suggest the best way to deal with this?
My first question is: why do you need a '\x00' here? A null character is not
allowed anywhere in an XML document, which is why lxml rejects the string. If
you want to pass binary data in XML, the best way is to use a text-safe
encoding such as base64 or uuencode. That should be part of your XML language
spec/schema.
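To illustrate, here is a minimal sketch of the encoding approach, assuming
base64 is the encoding both ends agree on. The helper names and the
"encoding" attribute are made up for this example, and the standard library's
ElementTree stands in for lxml.etree, which provides the same Element/.text
interface.

```python
import base64
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch


def encode_payload(raw: bytes) -> str:
    """Return an ASCII-only base64 string that is safe to use as XML text."""
    return base64.b64encode(raw).decode("ascii")


def decode_payload(text: str) -> bytes:
    """Reverse encode_payload() on the receiving side."""
    return base64.b64decode(text)


raw = b"Contains a null: \x00"

# The "encoding" attribute is a hypothetical convention that tells the
# consumer how to interpret the element text; it is not part of any standard.
elt = ET.Element("foo", {"encoding": "base64"})
elt.text = encode_payload(raw)

# Round trip: serialize, parse again, and recover the original bytes.
serialized = ET.tostring(elt)
restored = decode_payload(ET.fromstring(serialized).text)
```

The receiving side has to know about the convention, which is why it belongs
in the spec or schema of the feed rather than in one script.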
Stefan