<div>There appears to be a bug with lxml.sax's handling of comments, as the following code causes lxml.sax.saxify to fail:</div><div><br class="khtml-block-placeholder"></div><div>"""</div><div>import lxml.etree, lxml.sax, xml.sax.handler</div><div>from cStringIO import StringIO</div><div><br class="khtml-block-placeholder"></div><div>p = lxml.etree.HTMLParser(remove_blank_text=True)</div><div>h = xml.sax.handler.ContentHandler()
</div><div>f = StringIO("<body><!-- foo --><p>bar</p></body>")</div><div>t = lxml.etree.parse(f, p)</div><div>lxml.sax.saxify(t, h)</div><div>"""</div><div><br class="khtml-block-placeholder">
</div><div>"""</div><div>Traceback (most recent call last):</div>
<div> File "saxBug.py", line 11, in <module></div><div> lxml.sax.saxify(t, h)</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 178, in saxify</div><div> return ElementTreeProducer(element_or_tree, content_handler).saxify()</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 130, in saxify</div><div> self._recursive_saxify(self._element, {})</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 160, in _recursive_saxify</div><div> self._recursive_saxify(child, prefixes)</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 160, in _recursive_saxify</div><div> self._recursive_saxify(child, prefixes)</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 149, in _recursive_saxify</div><div> ns_uri, local_name = _getNsTag(element.tag)</div>
<div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/lxml-1.3beta-py2.5-macosx-10.4-i386.egg/lxml/sax.py", line 8, in _getNsTag</div><div> if tag[0] == '{':</div>
<div>TypeError: 'builtin_function_or_method' object is unsubscriptable</div><div>"""
</div><div><br class="khtml-block-placeholder"></div><div>I can reproduce the above error with both the released and svn versions of lxml, and with both the Apple-supplied and up-to-date builds of libxml2/libxslt.</div><div>
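<div>A guess at the cause, not a confirmed diagnosis: the last frame of the traceback fails on tag[0], which suggests that a comment node's tag is a function rather than a string. The sketch below reproduces that with the standard library's xml.etree.ElementTree (an assumption: I am using it as a stand-in for lxml.etree, since its comment nodes also carry the Comment factory as their tag). The names get_ns_tag and iter_element_tags are hypothetical and only for illustration; they are not lxml.sax code.</div>

```python
import xml.etree.ElementTree as ET


def get_ns_tag(tag):
    # Hypothetical stand-in for the _getNsTag helper named in the traceback:
    # split "{uri}local" into (uri, local), assuming tag is a string.
    if tag[0] == '{':
        ns_uri, local_name = tag[1:].split('}', 1)
        return ns_uri, local_name
    return None, tag


def iter_element_tags(element):
    # Yield (ns_uri, local_name) for real elements only. Comments and
    # processing instructions have a non-string tag, so they are skipped.
    if isinstance(element.tag, str):
        yield get_ns_tag(element.tag)
    for child in element:
        yield from iter_element_tags(child)


# Build the same shape as the failing document: a comment, then a paragraph.
body = ET.Element("body")
body.append(ET.Comment(" foo "))
p = ET.SubElement(body, "p")
p.text = "bar"

comment = body[0]

# Unguarded call on the comment's tag: indexing a function raises TypeError.
try:
    get_ns_tag(comment.tag)
    failed = False
except TypeError:
    failed = True

# Guarded traversal: only the element tags come back.
tags = list(iter_element_tags(body))
```

<div>If the same holds in lxml, checking the tag type before the _getNsTag call in _recursive_saxify would be one way to skip comments. Whether saxify should instead forward comments to the handler is a separate design question.</div>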
<br class="khtml-block-placeholder"></div><div>Separately, and probably unrelated, `make test` fails for me on OS X 10.4.9 with MacPython 2.5.1 (the <a href="http://python.org">python.org</a> binary):</div><div><br class="khtml-block-placeholder">
</div><div>"""</div><div>python test.py -p -v </div><div><br class="khtml-block-placeholder"></div><div>TESTED VERSION:</div><div> Python: (2, 5, 1, 'final', 0)</div><div> lxml.etree: (1, 3, -1, 42667)</div><div> libxml used: (2, 6, 28)</div><div> libxml compiled: (2, 6, 28)</div><div> libxslt used: (1, 1, 20)</div><div> libxslt compiled: (1, 1, 20)</div><div><br class="khtml-block-placeholder">
</div><div> 733/733 (100.0%): Doctest: xpathxslt.txt </div><div>======================================================================</div><div>FAIL: test_module_HTML_unicode (lxml.tests.test_htmlparser.HtmlParserTestCaseBase)</div><div>----------------------------------------------------------------------</div><div>Traceback (most recent call last):</div><div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 260, in run
</div><div> testMethod()</div><div> File "/Users/erik/Projects/lxml/src/lxml/tests/test_htmlparser.py", line 33, in test_module_HTML_unicode</div><div> self.uhtml_str)</div><div> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py", line 334, in failUnlessEqual
</div><div> (msg or '%r != %r' % (first, second))</div><div>AssertionError: u'<html><head><title>test \xc3\x83\xc2\xa1\xef\xa3\x92</title></head><body><h1>page \xc3\x83\xc2\xa1\xef\xa3\x92 title</h1></body></html>' != u'<html><head><title>test \xc3\xa1\uf8d2</title></head><body><h1>page \xc3\xa1\uf8d2 title</h1></body></html>'
</div><div><br class="khtml-block-placeholder"></div><div>----------------------------------------------------------------------</div><div>Ran 733 tests in 1.380s</div><div><br class="khtml-block-placeholder"></div><div>FAILED (failures=1)
</div><div>"""</div><br>-- <br>Erik Swanson