[lxml-dev] lxml 2.0alpha5 released
Stefan Behnel
stefan_ml at behnel.de
Sat Nov 24 12:51:34 CET 2007
Hi all,
lxml 2.0alpha5 made it to PyPI. This is (hopefully) the last alpha in the
pre-2.0 series, so please report any remaining API quirks, weirdnesses and
bugs now to make sure they get fixed before 2.0 gets its API freeze during the
beta cycle. If all works out well, there should not be more than one beta
release before the final version.
This release features a major overhaul of the target parser, including an
internal SAX parser framework and an ET compatible TreeBuilder implementation.
The complete Changelog follows below.
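For readers who have not used the target parser interface, here is a minimal
sketch of how a parser target and a TreeBuilder are typically used. The
EventCollector class and the sample document are made up for illustration.
The import falls back to the standard library's ElementTree, which is an
assumption made only so that the snippet also runs where lxml is not
installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib, which offers the same target API.
    import xml.etree.ElementTree as etree


class EventCollector:
    """Hypothetical parser target that records the events it receives."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, text):
        self.events.append(("data", text))

    def close(self):
        # Whatever close() returns becomes the result of the parse call.
        return self.events


events = etree.fromstring("<root><a>text</a></root>",
                          etree.XMLParser(target=EventCollector()))

# A TreeBuilder used as the target builds an ordinary element tree.
root = etree.fromstring("<root><a>text</a></root>",
                        etree.XMLParser(target=etree.TreeBuilder()))

print(events)
print(root.tag, root[0].text)
```

Note that a parser may deliver character data in more than one data() call,
so a target should not rely on receiving a text node in one piece.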
Note that the API now enforces keyword-only arguments in a couple of places.
This can require some syntactic changes in existing code.
Have fun,
Stefan
2.0alpha5 (2007-11-24)
Features added
* Rich comparison of element.attrib proxies.
* ElementTree compatible TreeBuilder class.
* Use default prefixes for some common XML namespaces.
* lxml.html.clean.Cleaner now allows for a host_whitelist, and two
overridable methods: allow_embedded_url(el, url) and the more general
allow_element(el).
* Extended slicing of Elements as in element[1:-1:2], both in etree and in
objectify.
* Resolvers can now provide a base_url keyword argument when resolving a
document as string data.
* When using lxml.doctestcompare you can give the doctest option
NOPARSE_MARKUP (like # doctest: +NOPARSE_MARKUP) to suppress the special
checking for one test.
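Two of the features listed above, extended slicing and comparison of
element.attrib proxies, can be sketched as follows. The sample documents are
made up for illustration, and the fallback import of the standard library's
ElementTree is an assumption made only so that the snippet runs without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib ElementTree API.
    import xml.etree.ElementTree as etree

root = etree.fromstring("<root><c0/><c1/><c2/><c3/><c4/><c5/></root>")

# Start at the second child, stop before the last one, step by two.
picked = [child.tag for child in root[1:-1:2]]

a = etree.fromstring('<a x="1" y="2"/>')
b = etree.fromstring('<b y="2" x="1"/>')

# Attribute mappings compare by content, independent of attribute order.
same_attribs = a.attrib == b.attrib
matches_dict = a.attrib == {"x": "1", "y": "2"}

print(picked, same_attribs, matches_dict)
```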
Bugs fixed
* Target parser failed to report comments.
* In the lxml.html iter_links() method, links in <object> tags weren't
recognized. (Note: plugin-specific link parameters still aren't
recognized.) Also, the <embed> tag, though not standard, is now included
in lxml.html.defs.special_inline_tags.
* Using custom resolvers on XSLT stylesheets parsed from a string could
request ill-formed URLs.
* With lxml.doctestcompare, if you write <tag xmlns="..."> in your output, it
is now treated as namespace-neutral (previously, the ellipsis was treated
as a real namespace).
Other changes
* The module source files were renamed to "lxml.*.pyx", such as
"lxml.etree.pyx". This was changed for consistency with the way Pyrex
commonly handles package imports. The main effect is that classes now
know about their fully qualified class name, including the package name
of their module.
* Keyword-only arguments in some API functions, especially in the parsers
and serialisers.
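As a rough illustration of what the keyword-only change means for calling
code, consider the sketch below. The serialise() function is a hypothetical
stand-in written in plain Python. It is not lxml's actual signature and only
shows the calling convention: options have to be passed by name, and passing
them positionally raises a TypeError.

```python
def serialise(element, *, pretty_print=False, encoding="utf-8"):
    """Hypothetical stand-in for an API function with keyword-only options."""
    text = "<%s/>" % element
    if pretty_print:
        text += "\n"
    return text.encode(encoding)


# Options are spelled out by name.
ok = serialise("root", pretty_print=True)

# Passing an option positionally is rejected.
try:
    serialise("root", True)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print(ok, positional_rejected)
```

Existing code that passes such options positionally needs to be changed to
name them explicitly.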