[lxml-dev] Any way to pass encoding to html.html_parser?
js
ebgssth at gmail.com
Thu Sep 27 14:51:19 CEST 2007
Thank you for your help!
I'm looking forward to the next release.
On 9/27/07, Stefan Behnel <stefan_ml at behnel.de> wrote:
>
> js wrote:
> > A simple question about a new feature in lxml 2.0alpha3.
> >
> >> * Parsers accept an 'encoding' keyword argument that overrides the
> >> encoding of the parsed documents.
> >
> > How can I pass an encoding argument to the parser when using
> > html.parse instead of etree.parse?
>
> Hmm, true, you can't currently do that, as lxml.html.html_parser is a parser
> instance, not a class.
>
> It's easy to build an equivalent parser, though. The next release will
> duplicate the parser class into lxml.html. Until then, you can do this:
>
> class HTMLParser(lxml.etree.HTMLParser):
>     def __init__(self, **kwargs):
>         super(HTMLParser, self).__init__(**kwargs)
>         self.setElementClassLookup(lxml.html.HtmlElementClassLookup())
>
> Stefan
>
>
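For readers on a later lxml release, here is a minimal sketch of the approach described above, written for Python 3. It assumes the parser class is available as lxml.html.HTMLParser (the duplication the message says is planned) and that lxml.html.parse accepts a parser keyword argument. The sample bytes and variable names are purely illustrative.

```python
import io

import lxml.html

# Illustrative input: UTF-8 bytes with no charset declaration in the markup.
raw = "<html><body><p>caf\u00e9</p></body></html>".encode("utf-8")

# Assumption: lxml.html.HTMLParser exists in the installed release.
# The 'encoding' keyword overrides the encoding of the parsed document.
parser = lxml.html.HTMLParser(encoding="utf-8")

# Pass the configured parser instead of relying on the default html_parser.
tree = lxml.html.parse(io.BytesIO(raw), parser=parser)
paragraph = tree.getroot().find(".//p")
print(paragraph.text)
```

The same parser object can presumably also be handed to lxml.html.fromstring or lxml.html.document_fromstring through their parser argument when the input is already in memory.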