[lxml-dev] iterparse comment parsing

kris kris at cs.ucsb.edu
Wed Oct 17 22:55:03 CEST 2007


        Hello,
         
        I noticed a difference in behavior between iterwalk and
        iterparse: with events=('start','end'), iterwalk also reports
        comment nodes, while iterparse reports only elements.  It could
        simply be that I do not know how to turn off comment parsing.
        
        Any help would be appreciated.
        
        Thanks,
        Kris
        
        
        python ~/xml-test.py
        using parse
        ['start', <Element r at 2aad85d0f3c0>, 'end', <Element r at
        2aad85d0f3c0>]
        using walk
        ['start', <Element r at 2aad85d0f460>, 'start', <!-- asjsjs -->,
        'end', <!-- asjsjs -->, 'end', <Element r at 2aad85d0f460>]
        BAD
        lxml.etree:        (1, 1, 2, 0)
        libxml used:       (2, 6, 27)
        libxml compiled:   (2, 6, 27)
        libxslt used:      (1, 1, 20)
        libxslt compiled:  (1, 1, 19)
        
        
        ====================================================================================
        
        
        from lxml import etree
        from StringIO import StringIO
        
        x = "<r> <!-- asjsjs --></r>"
        
        
        print "using parse"
        parser = []
        for a,e in etree.iterparse(StringIO(x), events=('start','end')):
            parser += (a,e)
        print parser
        
        
        
        print "using walk"
        walker = []
        for a,e in etree.iterwalk(etree.XML(x), events=('start','end')):
            walker += (a,e)
        
        print walker
        
        if parser != walker:
            print "BAD"
        
        print "lxml.etree:       ", etree.LXML_VERSION
        print "libxml used:      ", etree.LIBXML_VERSION
        print "libxml compiled:  ", etree.LIBXML_COMPILED_VERSION
        print "libxslt used:     ", etree.LIBXSLT_VERSION
        print "libxslt compiled: ", etree.LIBXSLT_COMPILED_VERSION
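
One possible workaround, given here only as a minimal sketch: filter both event streams so that only real elements remain. It assumes that comment and processing-instruction nodes in lxml carry a non-string `tag` (the `etree.Comment` / `etree.ProcessingInstruction` factory), and it is written for Python 3 rather than the Python 2 of the script above. The helper name `element_events` is hypothetical, not part of lxml. The sketch compares `(event, tag)` pairs instead of element objects, because elements from two separately built trees are distinct objects.

```python
from io import BytesIO

from lxml import etree

x = b"<r> <!-- asjsjs --></r>"


def element_events(events):
    """Keep only (event, node) pairs whose node is a real element.

    Assumption: comments and processing instructions have a non-string
    .tag, so a string check is enough to tell them apart from elements.
    """
    return [(event, node) for event, node in events
            if isinstance(node.tag, str)]


# Same two code paths as in the script above, each passed through the filter.
parsed = element_events(
    etree.iterparse(BytesIO(x), events=("start", "end")))
walked = element_events(
    etree.iterwalk(etree.XML(x), events=("start", "end")))

# Compare by (event, tag) so the result does not depend on object identity.
parsed_summary = [(event, node.tag) for event, node in parsed]
walked_summary = [(event, node.tag) for event, node in walked]

if parsed_summary != walked_summary:
    print("BAD")
else:
    print("OK")
```

With the filter applied, both lists reduce to the start and end events for `r`, whether or not the installed lxml version reports comments from iterwalk.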
        
        
        
-- 
Kristian Kvilekval
kris at cs.ucsb.edu  http://www.cs.ucsb.edu/~kris w:805-636-1599 h:504-9756


