[lxml-dev] lxml.html, now with ignored namespaces!

Stefan Behnel stefan_ml at behnel.de
Fri Jun 26 08:55:48 CEST 2009


Hi,

Thomas Weigel wrote:
> I am using lxml to parse HTML documents, which include a custom 
> namespace (for example, "<p cs:content='fruit'>FRUIT</p>").
> 
> In lxml 2.2.0, on Windows, this worked just fine, and elements could be 
> processed based on this data.
> 
> In lxml 2.2.2, on Linux, this fails. The above example becomes "<p 
> content='fruit'>FRUIT</p>" as soon as it is parsed by lxml.html (or 
> lxml.etree.HTMLParser()).

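You did not mention which versions of libxml2 are installed on the two
systems. HTML parsing in lxml is done by libxml2, so a difference in the
libxml2 version, rather than the step from lxml 2.2.0 to 2.2.2, is the
likely reason for the change in behaviour.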

http://codespeak.net/lxml/FAQ.html#i-think-i-have-found-a-bug-in-lxml-what-should-i-do
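
A minimal sketch of how the version details could be collected on each
system, together with the parse from the report. The helper names
collect_versions and parsed_attributes are only illustrative; the
constants LXML_VERSION, LIBXML_VERSION and LIBXML_COMPILED_VERSION are
the ones lxml.etree exposes. What the second helper returns depends on
the installed libxml2, so no expected output is shown.

```python
import sys


def collect_versions():
    """Return version details worth including in a bug report."""
    info = {"python": sys.version.split()[0], "platform": sys.platform}
    try:
        from lxml import etree
    except ImportError:
        # lxml is not installed in this environment.
        info["lxml"] = None
        return info
    info["lxml"] = etree.LXML_VERSION
    # Version of the libxml2 library loaded at runtime.
    info["libxml2 (runtime)"] = etree.LIBXML_VERSION
    # Version of libxml2 that lxml was built against.
    info["libxml2 (compiled)"] = etree.LIBXML_COMPILED_VERSION
    return info


def parsed_attributes(snippet):
    """Parse an HTML fragment and return the root element's attributes."""
    from lxml import html
    return dict(html.fromstring(snippet).attrib)


if __name__ == "__main__":
    versions = collect_versions()
    for name, value in versions.items():
        print(name, value)
    if versions["lxml"] is not None:
        # The fragment from the original report.
        print(parsed_attributes("<p cs:content='fruit'>FRUIT</p>"))
```

Running this on both the Windows and the Linux machine and comparing
the output should show whether the two installations differ in libxml2.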

Stefan


