[lxml-dev] etree.tostring generates invalid XML?

Qiangning Hong hongqn at gmail.com
Fri May 18 18:32:18 CEST 2007


>>> from lxml import etree
>>> e = etree.Element('root')
>>> e.text = u'\x08'
>>> xml = etree.tostring(e, 'utf8')
>>> xml
'<root>\x08</root>'
>>> etree.XML(xml)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "etree.pyx", line 1749, in etree.XML
  File "parser.pxi", line 934, in etree._parseMemoryDocument
  File "parser.pxi", line 830, in etree._parseDoc
  File "parser.pxi", line 516, in etree._BaseParser._parseDoc
  File "parser.pxi", line 619, in etree._handleParseResult
  File "parser.pxi", line 590, in etree._raiseParseError
etree.XMLSyntaxError: line 1: PCDATA invalid Char value 8

Shouldn't xml be '<root>&#8;</root>'?  Is this a bug in lxml?
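
One note on the expected output: XML 1.0 excludes U+0008 from its Char production, and character references must also match that production, so a conforming parser would reject '&#8;' as well. A possible workaround, sketched under the assumption that it is acceptable to simply drop such characters before assigning them to e.text (the helper name strip_invalid_xml_chars is hypothetical and not part of lxml):

```python
import re

# Matches anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    '[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]'
)


def strip_invalid_xml_chars(text):
    """Return text with characters that XML 1.0 forbids removed.

    Hypothetical helper for illustration; dropping the characters is an
    assumption, replacing them with a placeholder would work the same way.
    """
    return _INVALID_XML_CHARS.sub('', text)


# Intended use with the session above (assumes lxml is installed):
#   e.text = strip_invalid_xml_chars(u'\x08')
```

Tab, newline and carriage return are kept, since they are the only control characters below U+0020 that XML 1.0 allows.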

-- 
Qiangning Hong
http://www.douban.com/people/hongqn/

