[lxml-dev] Encoding problems with lxml
Stefan Behnel
stefan_ml at behnel.de
Thu Jun 28 11:35:18 CEST 2007
Bruno Barberi Gnecco wrote:
> a) when reading pages in iso-8859-1, accented characters are converted to HTML
> sequences, such as &agrave; for ` + a. I don't want this to happen, how to avoid it?
I only noticed now that this was referring to parsing. Any reason you don't
want entities resolved here?
lxml 2.0 will allow you to keep entities in the tree, although they are rarely
of any help.
Stefan
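
For illustration, here is a minimal sketch of what keeping entities in the tree
can look like with the resolve_entities parser option that lxml's XMLParser
provides. The sample document and the helper name parse_doc are made up for
this example, and it uses the XML parser with an inline DTD rather than the
HTML pages from the original question, so treat it as an assumption about the
setup and not as the exact fix for that case.

```python
from lxml import etree

# Hypothetical sample document. The internal DTD subset declares the entity
# so that the parser knows what &agrave; stands for.
XML = b'<!DOCTYPE doc [<!ENTITY agrave "&#224;">]><doc>voil&agrave;</doc>'


def parse_doc(resolve):
    """Parse the sample with entity resolution switched on or off."""
    parser = etree.XMLParser(resolve_entities=resolve)
    return etree.fromstring(XML, parser)


# Entities resolved: the reference is replaced by the character itself.
resolved = parse_doc(True)
print(repr(resolved.text))

# Entities kept: the reference stays in the tree as a separate node and is
# written back out unchanged on serialization.
kept = parse_doc(False)
print(etree.tostring(kept))
```

With resolution switched off, the text of the element is split around the
entity node, which is one reason keeping entities is rarely convenient for
further processing.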