[lxml-dev] Unicode oddness
Adam
dood at zworg.com
Wed Apr 8 12:27:13 CEST 2009
Stefan Behnel <stefan_ml <at> behnel.de> writes:
> Your HTML snippet lacks a <meta> tag, so the HTMLParser has no way of
> knowing what encoding your HTML snippet uses. It therefore falls back to
> assuming Latin-1. If your snippet was encoded in Latin-1, you'd be quite
> happy about this default.
>
> If you know the encoding in advance, you can create your own parser
> instance and pass it the "encoding" keyword option.
Of course! Thank you, I had a feeling I was overlooking something simple.
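
For the archive, here is a minimal sketch of the suggestion above. It assumes the snippet is UTF-8 encoded bytes with no <meta> tag; the variable names and the sample string are made up for illustration and are not from my original code.

```python
from lxml import etree

# Hypothetical input: UTF-8 encoded bytes with no <meta> charset declaration.
html_bytes = u"<p>caf\u00e9</p>".encode("utf-8")

# Create a dedicated parser and state the encoding explicitly,
# so the parser does not have to guess.
parser = etree.HTMLParser(encoding="utf-8")
root = etree.fromstring(html_bytes, parser)

# The text comes back as a proper unicode string.
text = root.findtext(".//p")
```

If the real encoding is something else, pass that name to the "encoding" option instead.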