[lxml-dev] lxml 2.1 released

Alexander Limi limi at plone.org
Thu Jul 10 05:59:38 CEST 2008


The download link on the web site reports a 404:

http://codespeak.net/lxml/lxml-2.1.tgz

The Deliverance buildout references the same URL, which is how I ran
into the error. Going to the web site didn't help. :)

— Alexander Limi

On Wed, 09 Jul 2008 06:27:21 -0700, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Hi,
>
> lxml 2.1 finally made it to PyPI!
>
> This is a major new release that follows the 2.0 series with a couple of
> cleanups and tons of new features. The complete changelog follows below.
>
> This is also the first version that officially supports Python 3, as
> released in 3.0beta1.
>
> Have fun,
> Stefan
>
>
> 2.1 (2008-07-09)
> ================
>
> Features added
> --------------
>
> * Smart strings can be switched off in XPath (``smart_string`` keyword
>   option).
>
> * ``lxml.html.rewrite_links()`` strips links to work around documents
>   with whitespace in URL attributes.
>
> * Pickling ``ElementTree`` objects in lxml.objectify.
>
> * Major overhaul of ``tools/xpathgrep.py`` script.
>
> * Support for parsing from file-like objects that return unicode
>   strings.
>
> * New function ``etree.cleanup_namespaces(el)`` that removes unused
>   namespace declarations from a (sub)tree (experimental).
>
> * XSLT results support the buffer protocol in Python 3.
>
> * Polymorphic functions in ``lxml.html`` that accept either a tree or
>   a parsable string will return either a UTF-8 encoded byte string, a
>   unicode string or a tree, based on the type of the input.
>   Previously, the result was always a byte string or a tree.
>
> * Support for Python 2.6 and 3.0 beta.
>
> * File name handling now uses a heuristic to convert between byte
>   strings (usually filenames) and unicode strings (usually URLs).
>
> * Parsing from a plain file object frees the GIL under Python 2.x.
>
> * Running ``iterparse()`` on a plain file (or filename) frees the GIL
>   on reading under Python 2.x.
>
> * Conversion functions ``html_to_xhtml()`` and ``xhtml_to_html()`` in
>   lxml.html (experimental).
>
> * Most features in lxml.html work for XHTML namespaced tag names
>   (experimental).
>
> * All parse functions in lxml.html take a ``parser`` keyword argument.
>
> * lxml.html has a new parser class ``XHTMLParser`` and a module
>   attribute ``xhtml_parser`` that provide XML parsers that are
>   pre-configured for the lxml.html package.
>
> * Error logging in Schematron (requires libxml2 2.6.32 or later).
>
> * Parser option ``strip_cdata`` for normalising or keeping CDATA
>   sections.  Defaults to ``True`` as before, thus replacing CDATA
>   sections by their text content.
>
> * ``CDATA()`` factory to wrap string content as CDATA section.
>
> * New event types 'comment' and 'pi' in ``iterparse()``.
>
> * ``XSLTAccessControl`` instances have a property ``options`` that
>   returns a dict of access configuration options.
>
> * Constant instances ``DENY_ALL`` and ``DENY_WRITE`` on
>   ``XSLTAccessControl`` class.
>
> * Extension elements for XSLT (experimental!)
>
> * ``Element.base`` property returns the xml:base or HTML base URL of
>   an Element.
>
> * ``docinfo.URL`` property is writable.
>
>
> Bugs fixed
> ----------
>
> * Custom resolvers were not used for XMLSchema includes/imports and
>   XInclude processing.
>
> * CSS selector parser dropped remaining expression after a function
>   with parameters.
>
> * Descending dot-separated classes in CSS selectors were not resolved
>   correctly.
>
> * ``ElementTree.parse()`` didn't handle target parser result.
>
> * Potential threading problem in XInclude.
>
> * Crash in Element class lookup classes when the __init__() method of
>   the super class is not called from Python subclasses.
>
> * A number of problems related to unicode/byte string conversion of
>   filenames and error messages were fixed.
>
> * Building on MacOS-X now passes the "flat_namespace" option to the C
>   compiler, which reportedly prevents build quirks and crashes on this
>   platform.
>
> * Windows build was broken.
>
> * Rare crash when serialising to a file object with certain encodings.
>
> * Incorrect evaluation of ``el.find("tag[child]")``.
>
> * Moving a subtree from a document created in one thread into a
>   document of another thread could crash when the rest of the source
>   document is deleted while the subtree is still in use.
>
> * Passing an nsmap when creating an Element will no longer strip
>   redundantly defined namespace URIs.  This prevented the definition
>   of more than one prefix for a namespace on the same Element.
>
> * Resolving to a filename in custom resolvers didn't work.
>
> * lxml did not honour libxslt's second error state "STOPPED", which
>   let some XSLT errors pass silently.
>
> * Memory leak in Schematron with libxml2 >= 2.6.31.
>
> * lxml.etree accepted non well-formed namespace prefix names.
>
> * Hanging thread in conjunction with GTK threading.
>
> * Crash bug in iterparse when moving elements into other documents.
>
> * HTML elements' ``.cssselect()`` method was broken.
>
> * ``ElementTree.find*()`` didn't accept QName objects.
>
> * Default encoding for plain text serialisation was different from
>   that of XML serialisation (UTF-8 instead of ASCII).
>
>
> Other changes
> -------------
>
> * ``objectify.enableRecursiveStr()`` was removed, use
>   ``objectify.enable_recursive_str()`` instead.
>
> * Speed-up when running XSLTs on documents from other threads.
>
> * Non-ASCII characters in attribute values are no longer escaped on
>   serialisation.
>
> * Passing non-ASCII byte strings or invalid unicode strings as .tag,
>   namespaces, etc. will result in a ValueError instead of an
>   AssertionError (just like the tag well-formedness check).
>
> * Up to several times faster attribute access (i.e. tree traversal) in
>   lxml.objectify.
>
> * lxml should now build without problems on MacOS-X.
>
> * If the default namespace is redundantly defined with a prefix on the
>   same Element, the prefix will now be preferred for subelements and
>   attributes.  This allows users to work around a problem in libxml2
>   where attributes from the default namespace could serialise without
>   a prefix even when they appear on an Element with a different
>   namespace (i.e. they would end up in the wrong namespace).
>
> * Major cleanup in internal ``moveNodeToDocument()`` function, which
>   takes care of namespace cleanup when moving elements between
>   different namespace contexts.
>
> * New Elements created through the ``makeelement()`` method of an HTML
>   parser or through lxml.html now end up in a new HTML document
>   (doctype HTML 4.01 Transitional) instead of a generic XML document.
>   This mostly impacts the serialisation and the availability of a DTD
>   context.
>
> * Minor API speed-ups.
>
> * The benchmark suite now uses tail text in the trees, which makes the
>   absolute numbers incomparable to previous results.
>
> * Generating the HTML documentation now requires Pygments_, which is
>   used to enable syntax highlighting for the doctest examples.
>
> .. _Pygments: http://pygments.org/
>
> Most long-time deprecated functions and methods were removed:
>
> - ``etree.clearErrorLog()``, use ``etree.clear_error_log()``
>
> - ``etree.useGlobalPythonLog()``, use
>   ``etree.use_global_python_log()``
>
> - ``etree.ElementClassLookup.setFallback()``, use
>   ``etree.ElementClassLookup.set_fallback()``
>
> - ``etree.getDefaultParser()``, use ``etree.get_default_parser()``
>
> - ``etree.setDefaultParser()``, use ``etree.set_default_parser()``
>
> - ``etree.setElementClassLookup()``, use
>   ``etree.set_element_class_lookup()``
>
>   Note that ``parser.setElementClassLookup()`` has not been removed
>   yet, although ``parser.set_element_class_lookup()`` should be used
>   instead.
>
> - ``xpath_evaluator.registerNamespace()``, use
>   ``xpath_evaluator.register_namespace()``
>
> - ``xpath_evaluator.registerNamespaces()``, use
>   ``xpath_evaluator.register_namespaces()``
>
> - ``objectify.setPytypeAttributeTag``, use
>   ``objectify.set_pytype_attribute_tag``
>
> - ``objectify.setDefaultParser()``, use
>   ``objectify.set_default_parser()``
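
For code that still uses the removed camelCase names, the renames listed above could be applied mechanically. The following is a minimal sketch, not an official lxml tool: ``RENAMES`` and ``modernise()`` are hypothetical names, and the regular expression rewrites any attribute access with a matching name, so the result should be reviewed by hand in case other libraries use the same method names.

```python
import re

# Old name -> new name, as given in the 2.1 changelog.
RENAMES = {
    "clearErrorLog": "clear_error_log",
    "useGlobalPythonLog": "use_global_python_log",
    "setFallback": "set_fallback",
    "getDefaultParser": "get_default_parser",
    "setDefaultParser": "set_default_parser",
    "setElementClassLookup": "set_element_class_lookup",
    "registerNamespaces": "register_namespaces",
    "registerNamespace": "register_namespace",
    "setPytypeAttributeTag": "set_pytype_attribute_tag",
    "enableRecursiveStr": "enable_recursive_str",
}

# Match ".oldName" only as a whole identifier after an attribute dot.
_PATTERN = re.compile(r"\.(%s)\b" % "|".join(map(re.escape, RENAMES)))


def modernise(source):
    """Return source text with the removed names replaced by the new ones."""
    return _PATTERN.sub(lambda match: "." + RENAMES[match.group(1)], source)
```

For example, ``modernise("etree.clearErrorLog()")`` gives ``"etree.clear_error_log()"``. The sketch also rewrites ``parser.setElementClassLookup()``, which has not been removed yet but should be replaced according to the note above.
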



-- 
Alexander Limi · http://limi.net


