[lxml-dev] lxml parser encodings? What's supported?
John Krukoff
jkrukoff at ltgc.com
Fri Sep 26 01:32:38 CEST 2008
On Wed, 2008-09-17 at 21:31 +0200, Stefan Behnel wrote:
> Hi,
> No, you've found a bug. The way the override input encoding is checked by the
> parser instantiation is simply wrong, it doesn't find any "standard" encoding
> (utf-8 or ASCII), neither does it find iconv encodings.
>
> Here's a fix.
>
> Stefan
After some fumbling before I figured out that I needed Cython installed
to build with the patch, I gave it a try. It works fine here for my use
case:
>>> from lxml import html
>>> html.fromstring('<html></html>', parser=html.HTMLParser())
<Element html at 81b471c>
>>> html.fromstring('<html></html>', parser=html.HTMLParser(encoding='us-ascii'))
<Element html at 81b444c>
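For anyone else wanting to check which encoding names their build accepts, here is a rough sketch rather than anything official. The helper names encoding_is_accepted and parse_with_encoding are my own, not part of lxml, and I am assuming that a parser built with an unknown encoding name raises LookupError, which is what I see with a fixed build:

```python
from lxml import html


def encoding_is_accepted(name):
    """Return True if HTMLParser can be built with this override encoding.

    Assumption: lxml raises LookupError for names it cannot resolve.
    """
    try:
        html.HTMLParser(encoding=name)
    except LookupError:
        return False
    return True


def parse_with_encoding(data, encoding):
    """Parse raw bytes, forcing the given input encoding on the parser."""
    parser = html.HTMLParser(encoding=encoding)
    return html.fromstring(data, parser=parser)


# A byte string with no encoding declaration; 0xE9 is e-acute in Latin-1.
raw = '<html><body><p>caf\u00e9</p></body></html>'.encode('iso-8859-1')

doc = parse_with_encoding(raw, 'iso-8859-1')
text = doc.text_content()
```

With the override in place, text should come back as the decoded string instead of depending on whatever the parser guesses for undeclared input.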
--
John Krukoff <jkrukoff at ltgc.com>
Land Title Guarantee Company