[lxml-dev] Any way to pass encoding to html.html_parser?
Stefan Behnel
stefan_ml at behnel.de
Thu Sep 27 08:32:38 CEST 2007
js wrote:
> A simple question about lxml2.0alpha3's new feature.
>
>> * Parsers accept an 'encoding' keyword argument that overrides the
>> encoding of the parsed documents.
>
> How can I pass the encoding argument to the parser when using html.parse
> instead of etree.parse?

Hmm, true, you can't currently do that: lxml.html.html_parser is a parser
instance, not a class, and the 'encoding' keyword is given when a parser is
created.

It's easy to build an equivalent parser, though. The next release will
duplicate the parser class into lxml.html; until then, you can do this:

import lxml.etree
import lxml.html

class HTMLParser(lxml.etree.HTMLParser):
    def __init__(self, **kwargs):
        super(HTMLParser, self).__init__(**kwargs)
        # make the parser build lxml.html element classes
        self.setElementClassLookup(lxml.html.HtmlElementClassLookup())
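
Here is a rough usage sketch for passing the encoding through to html.parse. It assumes a later lxml release in which lxml.html.HTMLParser exists as a class (the duplication mentioned above); on 2.0alpha3 the HTMLParser subclass defined in this mail would take its place. The sample bytes and the names data, parser, root and text are made up for illustration.

```python
import io

import lxml.html

# Illustrative input: ISO-8859-1 bytes with no charset declaration,
# so the parser has to be told the encoding explicitly.
data = u"<html><body><p>caf\u00e9</p></body></html>".encode("iso-8859-1")

# Assumption: lxml.html.HTMLParser is available as a class in this release.
# The 'encoding' keyword overrides the encoding of the parsed document.
parser = lxml.html.HTMLParser(encoding="iso-8859-1")

# html.parse accepts a filename, URL or file-like object plus a parser.
tree = lxml.html.parse(io.BytesIO(data), parser=parser)
root = tree.getroot()

# Elements come back as lxml.html classes, text decoded with the override.
text = root.xpath("//p")[0].text
```

The parser object can be reused for further html.parse calls on documents that share the same encoding.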
Stefan