[lxml-dev] Installing lxml 2.0beta1 via easy_install requires Cython; also, question about lxml.html.clean.clean_html
Jon Rosebaugh
chairos at gmail.com
Sat Jan 12 05:14:32 CET 2008
I tried to install lxml 2.0beta1 with easy_install (easy_install
lxml==2.0beta1), and the installation failed. After some experimentation,
I found that the C files that are supposed to be included in the
download were missing. Once I installed a patched version of
Cython 0.9.6.10b (patched according to the directions I found on this
list), lxml installed successfully. I was very surprised that Cython
was required, though.
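One way to check whether a downloaded source archive actually contains the C files is to list its members. This is only a minimal sketch, assuming the download is a tar archive; the function name list_c_files and the archive filename in the usage comment are hypothetical, not part of lxml or easy_install.

```python
import tarfile


def list_c_files(archive_path):
    """Return the sorted names of all regular .c files in a tar archive."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, none) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        return sorted(
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.endswith(".c")
        )


# Hypothetical usage; the filename is an assumption:
# print(list_c_files("lxml-2.0beta1.tar.gz"))
```

An empty list would suggest the archive was built without the C sources, in which case Cython is needed to regenerate them at install time.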
Also, I'm not sure, but the lxml.html.clean.clean_html() function
may not be working properly. I followed the example at
http://codespeak.net/lxml/dev/lxmlhtml.html#cleaning-up-html but got
different results. I expected this:
<html>
<body>
<div>
<style>/* deleted */</style>
<a href="">a link</a>
<a href="#">another link</a>
<p>a paragraph</p>
<div>secret EVIL!</div>
of EVIL!
Password:
annoying EVIL!
<a href="evil-site">spam spam SPAM!</a>
<img src="evil!">
</div>
</body>
</html>
But got this:
<div><style>/* deleted */</style><body>
<a href="">a link</a>
<a href="#">another link</a>
<p>a paragraph</p>
<div>secret EVIL!</div>
of EVIL!
Password:
annoying EVIL!<a href="evil-site">spam spam SPAM!</a>
<img src="evil!"></body></div>
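For anyone who wants to reproduce this, the call can be exercised in a few lines. This is only a sketch: the helper name clean_fragment and the short input string are hypothetical (it is not the input from the documentation example), and the ImportError guard is an assumption for environments where lxml.html.clean is not available.

```python
def clean_fragment(html_text):
    """Run lxml's clean_html on a string; return None if it is unavailable."""
    try:
        from lxml.html.clean import clean_html
    except ImportError:
        # Assumption: treat a missing lxml.html.clean as "cannot check".
        return None
    # With a string input, clean_html returns the cleaned markup as a string.
    return clean_html(html_text)


# Hypothetical input, much shorter than the documentation example.
sample = "<div><script>alert('evil')</script><p>a paragraph</p></div>"
result = clean_fragment(sample)
if result is not None:
    print(result)
```

Comparing the printed markup against the documented output should show whether the surrounding elements come back in a different order, as they did for me.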