[lxml-dev] iterparse comment parsing
kris
kris at cs.ucsb.edu
Wed Oct 17 22:55:03 CEST 2007
Hello,
I noticed a difference in behavior between iterwalk and
iterparse: given the same events, iterwalk reports comment
nodes and iterparse does not (see the output below). It could
simply be that I do not know how to turn off comment parsing.
Any help would be appreciated.
Thanks,
Kris
python ~/xml-test.py
using parse
['start', <Element r at 2aad85d0f3c0>, 'end', <Element r at 2aad85d0f3c0>]
using walk
['start', <Element r at 2aad85d0f460>, 'start', <!-- asjsjs -->, 'end', <!-- asjsjs -->, 'end', <Element r at 2aad85d0f460>]
BAD
lxml.etree: (1, 1, 2, 0)
libxml used: (2, 6, 27)
libxml compiled: (2, 6, 27)
libxslt used: (1, 1, 20)
libxslt compiled: (1, 1, 19)
====================================================================================
from lxml import etree
from StringIO import StringIO

x = "<r> <!-- asjsjs --></r>"

# Collect the events reported while parsing the string incrementally.
print "using parse"
parser = []
for a, e in etree.iterparse(StringIO(x), events=('start', 'end')):
    parser += (a, e)
print parser

# Collect the events reported while walking an already parsed tree.
print "using walk"
walker = []
for a, e in etree.iterwalk(etree.XML(x), events=('start', 'end')):
    walker += (a, e)
print walker

if parser != walker:
    print "BAD"

print "lxml.etree: ", etree.LXML_VERSION
print "libxml used: ", etree.LIBXML_VERSION
print "libxml compiled: ", etree.LIBXML_COMPILED_VERSION
print "libxslt used: ", etree.LIBXSLT_VERSION
print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION
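
For comparison, here is a minimal sketch of two possible ways to keep comments out of the iterwalk results. It assumes Python 3 and a recent lxml release in which XMLParser accepts the remove_comments option; the 1.2 release shown above may behave differently. The names XML, EVENTS, parsed, walked_clean and walked_filtered are only illustrative. The sketch compares tag names rather than the element objects themselves.

```python
from io import BytesIO
from lxml import etree

XML = b"<r> <!-- asjsjs --></r>"
EVENTS = ("start", "end")

# Baseline: events reported by iterparse, recorded as (event, tag) pairs.
parsed = [(event, el.tag)
          for event, el in etree.iterparse(BytesIO(XML), events=EVENTS)]

# Option 1 (assumes a recent lxml): drop comments while building the tree,
# so the walk never encounters them.
no_comments = etree.XMLParser(remove_comments=True)
root = etree.XML(XML, no_comments)
walked_clean = [(event, el.tag)
                for event, el in etree.iterwalk(root, events=EVENTS)]

# Option 2: keep the comments in the tree and skip them during the walk.
# Comment nodes carry the etree.Comment factory as their tag.
root = etree.XML(XML)
walked_filtered = [(event, el.tag)
                   for event, el in etree.iterwalk(root, events=EVENTS)
                   if el.tag is not etree.Comment]

print(parsed)
print(walked_clean)
print(walked_filtered)
```

Option 1 changes the tree itself, while Option 2 leaves the comments available to other code that uses the same tree.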
--
Kristian Kvilekval
kris at cs.ucsb.edu http://www.cs.ucsb.edu/~kris w:805-636-1599 h:504-9756