[lxml-dev] Installing lxml 2.0beta1 via easy_install requires Cython; also, question about lxml.html.clean.clean_html
Stefan Behnel
stefan_ml at behnel.de
Sat Jan 12 09:46:36 CET 2008
Hi,
Jon Rosebaugh wrote:
> I attempted to install lxml 2.0beta1 via easy_install (easy_install
> lxml==2.0beta1), and it didn't work. After a bunch of experimentation,
> I discovered that the C files that are supposed to be present in the
> download were not present. After installing a patched version of
> Cython 0.9.6.10b (patched according to the directions I found on this
> list) lxml successfully installed.
Hmm, it shouldn't be that hard. The tgz I downloaded has the .c files, so
installing without Cython should work just fine. I just removed my local
Cython install and did an "easy_install lxml" (which downloaded, built and
installed 2.0beta1) and also an "easy_install lxml-2.0beta1.tar.gz". Both
worked just fine.
Maybe you had an older version of Cython installed? If one is found, the build
will use it instead of the shipped .c files, and an outdated Cython will
obviously fail.
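A quick way to see whether a Cython install is visible to the interpreter that easy_install runs under is sketched below. The helper name find_cython_version is made up for this example, and the __version__ attribute is an assumption that may not hold for very old Cython releases, hence the fallback.

```python
def find_cython_version():
    """Return the visible Cython version string, or None if Cython is not importable."""
    try:
        import Cython
    except ImportError:
        # No Cython on the path, so a build has to rely on the shipped .c files.
        return None
    # Assumption: the package exposes __version__; fall back if it does not.
    return getattr(Cython, "__version__", "unknown")


if __name__ == "__main__":
    version = find_cython_version()
    if version is None:
        print("No Cython found")
    else:
        print("Cython found, version: %s" % version)
```

If this reports a version you did not expect, removing or upgrading that install before retrying is the first thing to try.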
> Also, I'm not sure, but I think the lxml.html.clean.clean_html()
> function might not be working properly? I followed the example at
> http://codespeak.net/lxml/dev/lxmlhtml.html#cleaning-up-html but got
> different results. I expected this:
> <html>
> <body>
> <div>
> <style>/* deleted */</style>
> <a href="">a link</a>
> <a href="#">another link</a>
> <p>a paragraph</p>
> <div>secret EVIL!</div>
> of EVIL!
> Password:
> annoying EVIL!
> <a href="evil-site">spam spam SPAM!</a>
> <img src="evil!">
> </div>
> </body>
> </html>
>
> But got this:
> <div><style>/* deleted */</style><body>
>
> <a href="">a link</a>
> <a href="#">another link</a>
> <p>a paragraph</p>
> <div>secret EVIL!</div>
> of EVIL!
>
>
> Password:
> annoying EVIL!<a href="evil-site">spam spam SPAM!</a>
> <img src="evil!"></body></div>
That one should work, too. I just ran lxmlhtml.txt as a doctest (which
admittedly wasn't included in the test suite before) and it passed. Same
for test_clean.txt.
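To reproduce this outside the doctest, something like the following minimal sketch can help. The helper name clean_sample and the sample fragment are made up for illustration and are not the example from the documentation; the import is guarded because lxml.html.clean may not be importable in every install.

```python
def clean_sample(html):
    """Run clean_html on an HTML string; return None if the cleaner is unavailable."""
    try:
        from lxml.html.clean import clean_html
    except ImportError:
        # lxml (or its cleaner module) is not importable in this environment.
        return None
    return clean_html(html)


if __name__ == "__main__":
    # Hypothetical input fragment, chosen only to show script removal.
    sample = "<div><script>alert(1)</script><p>a paragraph</p></div>"
    print(clean_sample(sample))
```

With the default settings the script element should be gone from the result while the paragraph stays.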
What's the version of libxml2 you are using? Can you try running the test
suite and see if that works for you?
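The version numbers can be read from lxml itself. This is a small sketch; the helper name report_versions is made up for the example, while the constants are the ones lxml.etree exposes.

```python
def report_versions():
    """Return lxml and libxml2 version tuples, or None if lxml is not importable."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return {
        "lxml": etree.LXML_VERSION,
        # libxml2 that is actually loaded at runtime
        "libxml2 (runtime)": etree.LIBXML_VERSION,
        # libxml2 that lxml was compiled against
        "libxml2 (compiled)": etree.LIBXML_COMPILED_VERSION,
    }


if __name__ == "__main__":
    versions = report_versions()
    if versions is None:
        print("lxml is not importable")
    else:
        for name, value in sorted(versions.items()):
            print("%s: %s" % (name, ".".join(str(part) for part in value)))
```

A mismatch between the runtime and compiled libxml2 versions would be worth mentioning in a follow-up.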
Stefan