[lxml-dev] clean_html

Francesco cattafra at hotmail.com
Wed Jun 24 12:29:30 CEST 2009


I have written the following code:

>>> from lxml.html.clean import clean_html
>>> html = "»"
>>> print clean_html(html)
<p>Â»</p>

I am wondering why I have an extra character (Â) in my output.
What should I do to avoid that?

Thanks,

Francesco
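
A possible explanation for the output above, sketched with the standard library only. It assumes the terminal sent "»" as UTF-8 bytes and that the HTML parser, finding no declared charset, read those bytes as Latin-1. Both points are assumptions about this session, not confirmed behaviour, and the variable names are illustrative.

```python
# "»" is U+00BB; a UTF-8 terminal delivers it as two bytes.
raw = u"\u00bb".encode("utf-8")
assert raw == b"\xc2\xbb"

# Assumption: the parser falls back to Latin-1 for undeclared byte input.
# Each byte then becomes its own character, and 0xC2 is "Â".
misread = raw.decode("latin-1")
assert misread == u"\u00c2\u00bb"   # "Â»", the extra character seen above

# Decoding explicitly yields a single-character unicode string.
text = raw.decode("utf-8")
assert text == u"\u00bb"
assert len(text) == 1
```

If this is indeed the cause, passing a unicode string (such as `text` here) to clean_html instead of a byte string should avoid the stray character, since no encoding then has to be guessed.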


