[lxml-dev] html5lib tree builder in lxml 2.2
Stefan Behnel
stefan_ml at behnel.de
Wed Mar 25 22:44:36 CET 2009
Geoffrey Sneddon wrote:
> Getting around to actually looking at lxml 2.2's html5lib support, I
> note that it has its own treebuilder: I presume there was some reason
> (bugs?) that html5lib's own wasn't used.
Armin Ronacher wrote that part, so he should know:
http://comments.gmane.org/gmane.comp.python.lxml.devel/3848?set_lines=100000
It uses a subclass of html5lib's TreeBuilder, so it's not a rewrite or
anything of that sort.
> Would it be possible to get a
> patch for html5lib that would fix these issues (this'll need to be under
> the MIT license)?
It's mainly about things that ElementTree (ET) doesn't support, such as the
DOCTYPE or top-level comments. I don't know if the html5lib project is
interested in that, but it shouldn't be too hard to add some conditional
lxml specifics to their code.
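
To illustrate the ET limitation mentioned above, here is a minimal sketch using only the standard library's xml.etree.ElementTree with its default parser. The sample document and the variable names are made up for illustration; this is not code from lxml or html5lib. It parses a document that has a DOCTYPE and a comment in front of the root element, then serializes the result again to show what survives.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: a DOCTYPE and a comment precede the root element.
DOC = "<!DOCTYPE html><!-- top-level comment --><html><body/></html>"

# The default parser hands back the root element only; there is no
# document-level node that could hold the DOCTYPE or the leading comment.
root = ET.fromstring(DOC)

# Serialize the tree again to see what was kept.
serialized = ET.tostring(root, encoding="unicode")

print(root.tag)
print(serialized)
```

The serialized output starts at the root element, so anything before it is lost in the round trip. A tree builder that targets lxml has to carry that information some other way, which is presumably what the subclass is for.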
Stefan