[lxml-dev] lxml 2.1 released
Alexander Limi
limi at plone.org
Thu Jul 10 05:59:38 CEST 2008
The download link on the web site reports a 404:
http://codespeak.net/lxml/lxml-2.1.tgz
The same URL is used by the Deliverance buildout, which is how I ran into
the error. Going to the web site didn't help. :)
— Alexander Limi
On Wed, 09 Jul 2008 06:27:21 -0700, Stefan Behnel <stefan_ml at behnel.de>
wrote:
> Hi,
>
> lxml 2.1 finally made it to PyPI!
>
> This is a major new release that follows the 2.0 series with a couple of
> cleanups and tons of new features. The complete changelog follows below.
>
> This is also the first version that officially supports Python 3, as
> released in 3.0beta1.
>
> Have fun,
> Stefan
>
>
> 2.1 (2008-07-09)
> ================
>
> Features added
> --------------
>
> * Smart strings can be switched off in XPath (``smart_strings`` keyword
> option).
>
> * ``lxml.html.rewrite_links()`` strips links to work around documents
> with whitespace in URL attributes.
>
> * Pickling ``ElementTree`` objects in lxml.objectify.
>
> * Major overhaul of ``tools/xpathgrep.py`` script.
>
> * Support for parsing from file-like objects that return unicode
> strings.
>
> * New function ``etree.cleanup_namespaces(el)`` that removes unused
> namespace declarations from a (sub)tree (experimental).
>
> * XSLT results support the buffer protocol in Python 3.
>
> * Polymorphic functions in ``lxml.html`` that accept either a tree or
> a parsable string will return either a UTF-8 encoded byte string, a
> unicode string or a tree, based on the type of the input.
> Previously, the result was always a byte string or a tree.
>
> * Support for Python 2.6 and 3.0 beta.
>
> * File name handling now uses a heuristic to convert between byte
> strings (usually filenames) and unicode strings (usually URLs).
>
> * Parsing from a plain file object frees the GIL under Python 2.x.
>
> * Running ``iterparse()`` on a plain file (or filename) frees the GIL
> on reading under Python 2.x.
>
> * Conversion functions ``html_to_xhtml()`` and ``xhtml_to_html()`` in
> lxml.html (experimental).
>
> * Most features in lxml.html work for XHTML namespaced tag names
> (experimental).
>
> * All parse functions in lxml.html take a ``parser`` keyword argument.
>
> * lxml.html has a new parser class ``XHTMLParser`` and a module
> attribute ``xhtml_parser`` that provide XML parsers that are
> pre-configured for the lxml.html package.
>
> * Error logging in Schematron (requires libxml2 2.6.32 or later).
>
> * Parser option ``strip_cdata`` for normalising or keeping CDATA
> sections. Defaults to ``True`` as before, thus replacing CDATA
> sections by their text content.
>
> * ``CDATA()`` factory to wrap string content as CDATA section.
>
> * New event types 'comment' and 'pi' in ``iterparse()``.
>
> * ``XSLTAccessControl`` instances have a property ``options`` that
> returns a dict of access configuration options.
>
> * Constant instances ``DENY_ALL`` and ``DENY_WRITE`` on
> ``XSLTAccessControl`` class.
>
> * Extension elements for XSLT (experimental!)
>
> * ``Element.base`` property returns the xml:base or HTML base URL of
> an Element.
>
> * ``docinfo.URL`` property is writable.
>
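For anyone skimming the feature list above, here is a minimal sketch of three of the new bits: the ``CDATA()`` factory together with the ``strip_cdata`` parser option, the 'comment' and 'pi' events in ``iterparse()``, and switching off smart strings in XPath. It assumes lxml 2.1 or later is installed; the sample documents and variable names are made up for illustration.

```python
from io import BytesIO
from lxml import etree

# CDATA() wraps text so that it serialises as a CDATA section.
root = etree.Element("root")
root.text = etree.CDATA("a < b")
serialised = etree.tostring(root)

# The default parser (strip_cdata=True) turns CDATA into plain text ...
plain = etree.fromstring(serialised)
# ... while strip_cdata=False keeps the section when serialising again.
kept = etree.fromstring(serialised, etree.XMLParser(strip_cdata=False))

# iterparse() can report comments and processing instructions.
doc = b"<root><!-- note --><?target data?><a/></root>"
events = [(event, node.text)
          for event, node in etree.iterparse(BytesIO(doc),
                                             events=("comment", "pi"))]

# smart_strings=False returns plain strings without a getparent() method.
find_text = etree.XPath("//text()", smart_strings=False)
texts = find_text(etree.fromstring("<p>hello</p>"))
```

Switching smart strings off mainly helps when many string results are kept around, since a plain string does not keep its source tree alive.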
>
> Bugs fixed
> ----------
>
> * Custom resolvers were not used for XMLSchema includes/imports and
> XInclude processing.
>
> * CSS selector parser dropped remaining expression after a function
> with parameters.
>
> * Descending dot-separated classes in CSS selectors were not resolved
> correctly.
>
> * ``ElementTree.parse()`` didn't handle target parser result.
>
> * Potential threading problem in XInclude.
>
> * Crash in Element class lookup classes when the __init__() method of
> the super class is not called from Python subclasses.
>
> * A number of problems related to unicode/byte string conversion of
> filenames and error messages were fixed.
>
> * Building on MacOS-X now passes the "flat_namespace" option to the C
> compiler, which reportedly prevents build quirks and crashes on this
> platform.
>
> * Windows build was broken.
>
> * Rare crash when serialising to a file object with certain encodings.
>
> * Incorrect evaluation of ``el.find("tag[child]")``.
>
> * Moving a subtree from a document created in one thread into a
> document of another thread could crash when the rest of the source
> document is deleted while the subtree is still in use.
>
> * Passing an nsmap when creating an Element will no longer strip
> redundantly defined namespace URIs. This prevented the definition
> of more than one prefix for a namespace on the same Element.
>
> * Resolving to a filename in custom resolvers didn't work.
>
> * lxml did not honour libxslt's second error state "STOPPED", which
> let some XSLT errors pass silently.
>
> * Memory leak in Schematron with libxml2 >= 2.6.31.
>
> * lxml.etree accepted non well-formed namespace prefix names.
>
> * Hanging thread in conjunction with GTK threading.
>
> * Crash bug in iterparse when moving elements into other documents.
>
> * HTML elements' ``.cssselect()`` method was broken.
>
> * ``ElementTree.find*()`` didn't accept QName objects.
>
> * Default encoding for plain text serialisation was different from
> that of XML serialisation (UTF-8 instead of ASCII).
>
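Two of the fixes above are easy to check by hand. The following is only a sketch with made-up sample documents and a placeholder namespace URI, assuming lxml 2.1 or later: a ``find()`` with a ``tag[child]`` predicate, and a ``find()`` that takes a ``QName`` object instead of a string.

```python
from lxml import etree

# find() with a "tag[child]" predicate: only the second <a> has a <b> child.
root = etree.fromstring("<root><a/><a><b/></a></root>")
match = root.find("a[b]")

# find*() accepts QName objects as well as plain strings.
NS = "http://example.com/ns"  # placeholder namespace for this example
ns_root = etree.fromstring('<r xmlns="%s"><item/></r>' % NS)
item = ns_root.find(etree.QName(NS, "item"))
```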
>
> Other changes
> -------------
>
> * ``objectify.enableRecursiveStr()`` was removed, use
> ``objectify.enable_recursive_str()`` instead.
>
> * Speed-up when running XSLTs on documents from other threads.
>
> * Non-ASCII characters in attribute values are no longer escaped on
> serialisation.
>
> * Passing non-ASCII byte strings or invalid unicode strings as .tag,
> namespaces, etc. will result in a ValueError instead of an
> AssertionError (just like the tag well-formedness check).
>
> * Up to several times faster attribute access (i.e. tree traversal) in
> lxml.objectify.
>
> * lxml should now build without problems on MacOS-X.
>
> * If the default namespace is redundantly defined with a prefix on the
> same Element, the prefix will now be preferred for subelements and
> attributes. This allows users to work around a problem in libxml2
> where attributes from the default namespace could serialise without
> a prefix even when they appear on an Element with a different
> namespace (i.e. they would end up in the wrong namespace).
>
> * Major cleanup in internal ``moveNodeToDocument()`` function, which
> takes care of namespace cleanup when moving elements between
> different namespace contexts.
>
> * New Elements created through the ``makeelement()`` method of an HTML
> parser or through lxml.html now end up in a new HTML document
> (doctype HTML 4.01 Transitional) instead of a generic XML document.
> This mostly impacts the serialisation and the availability of a DTD
> context.
>
> * Minor API speed-ups.
>
> * The benchmark suite now uses tail text in the trees, which makes the
> absolute numbers incomparable to previous results.
>
> * Generating the HTML documentation now requires Pygments_, which is
> used to enable syntax highlighting for the doctest examples.
>
> .. _Pygments: http://pygments.org/
>
> Most long-time deprecated functions and methods were removed:
>
> - ``etree.clearErrorLog()``, use ``etree.clear_error_log()``
>
> - ``etree.useGlobalPythonLog()``, use
> ``etree.use_global_python_log()``
>
> - ``etree.ElementClassLookup.setFallback()``, use
> ``etree.ElementClassLookup.set_fallback()``
>
> - ``etree.getDefaultParser()``, use ``etree.get_default_parser()``
>
> - ``etree.setDefaultParser()``, use ``etree.set_default_parser()``
>
> - ``etree.setElementClassLookup()``, use
> ``etree.set_element_class_lookup()``
>
> Note that ``parser.setElementClassLookup()`` has not been removed
> yet, although ``parser.set_element_class_lookup()`` should be used
> instead.
>
> - ``xpath_evaluator.registerNamespace()``, use
> ``xpath_evaluator.register_namespace()``
>
> - ``xpath_evaluator.registerNamespaces()``, use
> ``xpath_evaluator.register_namespaces()``
>
> - ``objectify.setPytypeAttributeTag``, use
> ``objectify.set_pytype_attribute_tag``
>
> - ``objectify.setDefaultParser()``, use
> ``objectify.set_default_parser()``
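
For code that still uses the removed camelCase names, the change is a plain rename. Below is a minimal sketch using a few of the underscore spellings listed above, assuming lxml 2.1 or later; the ``Item`` class, the sample document and the namespace URI are hypothetical and only there to make the example run.

```python
from lxml import etree


class Item(etree.ElementBase):
    """Hypothetical custom element class, used only for this example."""


# Was etree.clearErrorLog()
etree.clear_error_log()

# Was parser.setElementClassLookup()
parser = etree.XMLParser()
parser.set_element_class_lookup(etree.ElementDefaultClassLookup(element=Item))

NS = "http://example.com/ns"  # placeholder namespace for this example
root = etree.fromstring('<r xmlns="%s"><item/><item/></r>' % NS, parser)

# Was xpath_evaluator.registerNamespace()
evaluator = etree.XPathEvaluator(root)
evaluator.register_namespace("ex", NS)
items = evaluator("//ex:item")
```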
--
Alexander Limi · http://limi.net