==============
lxml changelog
==============
2.3 (2011-02-06)
================
Features added
--------------
* When looking for children, ``lxml.objectify`` takes '{}tag' as
meaning an empty namespace, as opposed to the parent namespace.
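
The ``'{}tag'`` lookup described above can be sketched as follows. This is
only an illustration, assuming lxml 2.3 or later is installed; the document
and the namespace URI ``http://example.com/ns`` are made up for the example.

.. code-block:: python

    from lxml import objectify

    # Hypothetical document: the root and its first child live in a
    # namespace, the second child explicitly resets to the empty namespace.
    root = objectify.fromstring(
        '<root xmlns="http://example.com/ns">'
        '<item>namespaced</item>'
        '<item xmlns="">plain</item>'
        '</root>'
    )

    # Plain attribute access inherits the namespace of the parent element.
    inherited = root.item.text

    # '{}item' asks for a child in the empty namespace instead.
    empty_ns = getattr(root, '{}item').text

Here ``inherited`` should hold the text of the namespaced child and
``empty_ns`` the text of the child without a namespace.
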
Bugs fixed
----------
* When finished reading from a file-like object, the parser
immediately calls its ``.close()`` method.
* When finished parsing, ``iterparse()`` immediately closes the input
file.
* Work-around for libxml2 bug that can leave the HTML parser in a
non-functional state after parsing a severely broken document (fixed
in libxml2 2.7.8).
* ``marque`` tag in HTML cleanup code is correctly named ``marquee``.
Other changes
--------------
* Some public functions in the Cython-level C-API have more explicit
return types.
2.3beta1 (2010-09-06)
=====================
Features added
--------------
Bugs fixed
----------
* Crash in newer libxml2 versions when moving elements between
documents that had attributes on replaced XInclude nodes.
* ``XMLID()`` function was missing the optional ``parser`` and
``base_url`` parameters.
* Searching for wildcard tags in ``iterparse()`` was broken in Py3.
* ``lxml.html.open_in_browser()`` didn't work in Python 3 due to the
use of os.tempnam. It now takes an optional 'encoding' parameter.
Other changes
--------------
2.3alpha2 (2010-07-24)
======================
Features added
--------------
Bugs fixed
----------
* Crash in XSLT when generating text-only result documents with a
stylesheet created in a different thread.
Other changes
--------------
* ``repr()`` of Element objects shows the hex ID with leading 0x
(following ElementTree 1.3).
2.3alpha1 (2010-06-19)
======================
Features added
--------------
* Keyword argument ``namespaces`` in ``lxml.cssselect.CSSSelector()``
to pass a prefix-to-namespace mapping for the selector.
* New function ``lxml.etree.register_namespace(prefix, uri)`` that
globally registers a namespace prefix for a namespace that newly
created Elements in that namespace will use automatically. Follows
ElementTree 1.3.
* Support 'unicode' string name as encoding parameter in
``tostring()``, following ElementTree 1.3.
* Support 'c14n' serialisation method in ``ElementTree.write()`` and
``tostring()``, following ElementTree 1.3.
* The ElementPath expression syntax (``el.find*()``) was extended to
match the upcoming ElementTree 1.3 that will ship in the standard
library of Python 3.2/2.7. This includes extended support for
predicates as well as namespace prefixes (as known from XPath).
* During regular XPath evaluation, various EXSLT functions are
available within their namespace when using libxslt 1.1.26 or later.
* Support passing a readily configured logger instance into
``PyErrorLog``, instead of a logger name.
* On serialisation, the new ``doctype`` parameter can be used to
override the DOCTYPE (internal subset) of the document.
* New parameter ``output_parent`` to ``XSLTExtension.apply_templates()``
to append the resulting content directly to an output element.
* ``XSLTExtension.process_children()`` to process the content of the
XSLT extension element itself.
* ISO-Schematron support based on the de-facto Schematron reference
'skeleton implementation'.
* XSLT objects now accept XPath objects as stylesheet parameters in
``__call__``.
* Enable path caching in ElementPath (``el.find*()``) to avoid parsing
overhead.
* Setting the value of a namespaced attribute always uses a prefixed
namespace instead of the default namespace even if both declare the
same namespace URI. This avoids serialisation problems when an
attribute from a default namespace is set on an element from a
different namespace.
* XSLT extension elements: support for XSLT context nodes other than
elements: document root, comments, processing instructions.
* Support for strings (in addition to Elements) in node-sets returned
by extension functions.
* Forms that lack an ``action`` attribute default to the base URL of
the document on submit.
* XPath attribute result strings have an ``attrname`` property.
* Namespace URIs get validated against RFC 3986 at the API level
(required by the XML namespace specification).
* Target parsers show their target object in the ``.target`` property
(compatible with ElementTree).
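
Three of the ElementTree 1.3 compatible additions listed above
(``register_namespace()``, the ``'unicode'`` encoding name and the extended
ElementPath syntax) can be tried out together. This is a rough sketch, not
taken from the lxml sources: the namespace URI and the tag names are invented,
and the import falls back to the standard library ElementTree, which offers
the same calls, if lxml is not available.

.. code-block:: python

    try:
        from lxml import etree
    except ImportError:  # fall back to the standard library implementation
        import xml.etree.ElementTree as etree

    NS = 'http://example.com/ns'  # made-up namespace URI

    # Register a global prefix; new Elements in this namespace pick it up.
    etree.register_namespace('ex', NS)
    root = etree.Element('{%s}catalog' % NS)
    for kind, name in [('book', 'A'), ('film', 'B'), ('book', 'C')]:
        item = etree.SubElement(root, '{%s}item' % NS, kind=kind)
        etree.SubElement(item, '{%s}name' % NS).text = name

    # encoding='unicode' returns a text string instead of a byte string.
    text = etree.tostring(root, encoding='unicode')

    # ElementPath with a prefix mapping and an attribute predicate.
    prefixes = {'e': NS}
    book_names = [
        el.text
        for el in root.findall("e:item[@kind='book']/e:name",
                               namespaces=prefixes)
    ]

Note that the prefix used in the path expression (``e``) is independent of
the globally registered serialisation prefix (``ex``); only the namespace URI
has to match.
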
Bugs fixed
----------
* API is hardened against invalid proxy instances to prevent crashes
due to incorrectly instantiated Element instances.
* Prevent crash when instantiating ``CommentBase`` and friends.
* Export ElementTree compatible XML parser class as
``XMLTreeBuilder``, as it is called in ET 1.2.
* ObjectifiedDataElements in lxml.objectify were not hashable. They
now use the hash value of the underlying Python value (string,
number, etc.) to which they compare equal.
* Parsing broken fragments in lxml.html could fail if the fragment
contained an orphaned closing '' tag.
* Using XSLT extension elements around the root of the output document
crashed.
* ``lxml.cssselect`` did not distinguish between ``x[attr="val"]`` and
``x [attr="val"]`` (with a space). The latter now matches the
attribute independent of the element.
* Rewriting multiple links inside of HTML text content could end up
replacing unrelated content as replacements could impact the
reported position of subsequent matches. Modifications are now
simplified by letting the ``iterlinks()`` generator in ``lxml.html``
return links in reversed order if they appear inside the same text
node. Thus, replacements and link-internal modifications no longer
change the position of links reported afterwards.
* The ``.value`` attribute of ``textarea`` elements in lxml.html did
not represent the complete raw value (including child tags etc.). It
now serialises the complete content on read and replaces the
complete content by a string on write.
* Target parser didn't call ``.close()`` on the target object if
parsing failed. Now it is guaranteed that ``.close()`` will be
called after parsing, regardless of the outcome.
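
The ``.close()`` guarantee for target parsers mentioned in the last entry can
be observed with a small recording target. This is a minimal sketch, assuming
lxml 2.3 or later; the ``RecordingTarget`` class is hypothetical and exists
only for this illustration.

.. code-block:: python

    from lxml import etree

    class RecordingTarget(object):
        """Hypothetical parser target that records the tags it sees."""

        def __init__(self):
            self.tags = []
            self.closed = False

        def start(self, tag, attrib):
            self.tags.append(tag)

        def end(self, tag):
            pass

        def data(self, data):
            pass

        def close(self):
            self.closed = True
            return self.tags

    target = RecordingTarget()
    parser = etree.XMLParser(target=target)

    failed = False
    try:
        # Mismatched end tag: the input is not well-formed, parsing fails.
        etree.fromstring('<root><a></root>', parser)
    except etree.XMLSyntaxError:
        failed = True

Even though the parse raises an exception, ``target.closed`` is expected to
be true afterwards, so the target can release its resources in ``close()``.
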
Other changes
-------------
* Official support for Python 3.1.2 and later.
* Static MS Windows builds can now download their dependencies
themselves.
* ``Element.attrib`` no longer uses a cyclic reference back to its
Element object. It therefore no longer requires the garbage
collector to clean up.
* Static builds include libiconv, in addition to libxml2 and libxslt.
2.2.8 (2010-09-02)
==================
Bugs fixed
----------
* Crash in newer libxml2 versions when moving elements between
documents that had attributes on replaced XInclude nodes.
* Import fix for urljoin in Python 3.1+.
2.2.7 (2010-07-24)
==================
Bugs fixed
----------
* Crash in XSLT when generating text-only result documents with a
stylesheet created in a different thread.
2.2.6 (2010-03-02)
==================
Bugs fixed
----------
* Fixed several Python 3 regressions by building with Cython 0.11.3.
2.2.5 (2010-02-28)
==================
Features added
--------------
* Support for running XSLT extension elements on the input root node
(e.g. in a template matching on "/").
Bugs fixed
----------
* Crash in XPath evaluation when reading smart strings from a document
other than the original context document.
* Support recent versions of html5lib by not requiring its
``XHTMLParser`` in ``htmlparser.py`` anymore.
* Manually instantiating the custom element classes in
``lxml.objectify`` could crash.
* Invalid XML text characters were not rejected by the API when they
appeared in unicode strings directly after non-ASCII characters.
* lxml.html.open_http_urllib() did not work in Python 3.
* The functions ``strip_tags()`` and ``strip_elements()`` in
``lxml.etree`` did not remove all occurrences of a tag in all cases.
* Crash in XSLT extension elements when the XSLT context node is not
an element.
2.2.4 (2009-11-11)
==================
Bugs fixed
----------
* Static build of libxml2/libxslt was broken.
2.2.3 (2009-10-30)
==================
Features added
--------------
Bugs fixed
----------
* The ``resolve_entities`` option did not work in the incremental feed
parser.
* Looking up and deleting attributes without a namespace could hit a
namespaced attribute of the same name instead.
* Late errors during calls to ``SubElement()`` (e.g. attribute related
ones) could leave a partially initialised element in the tree.
* Modifying trees that contain parsed entity references could result
in an infinite loop.
* ObjectifiedElement.__setattr__ created an empty-string child element when the
attribute value was rejected as a non-unicode/non-ascii string
* Syntax errors in ``lxml.cssselect`` could result in misleading error
messages.
* Invalid syntax in CSS expressions could lead to an infinite loop in
the parser of ``lxml.cssselect``.
* CSS special character escapes were not properly handled in
``lxml.cssselect``.
* CSS Unicode escapes were not properly decoded in ``lxml.cssselect``.
* Select options in HTML forms that had no explicit ``value``
attribute were not handled correctly. The HTML standard dictates
that their value is defined by their text content. This is now
supported by lxml.html.
* XPath raised a TypeError when finding CDATA sections. This is now
fully supported.
* Calling ``help(lxml.objectify)`` didn't work at the prompt.
* The ``ElementMaker`` in lxml.objectify no longer defines the default
namespaces when annotation is disabled.
* Feed parser failed to honour the 'recover' option on parse errors.
* Diverting the error logging to Python's logging system was broken.
Other changes
-------------
2.2.2 (2009-06-21)
==================
Features added
--------------
* New helper functions ``strip_attributes()``, ``strip_elements()``,
``strip_tags()`` in lxml.etree to remove attributes/subtrees/tags
from a subtree.
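
A short sketch of how the three helper functions differ, assuming lxml 2.2.2
or later. The input document is invented for the example.

.. code-block:: python

    from lxml import etree

    root = etree.fromstring(
        '<root id="1" style="x">'
        '<p>some <b>bold</b> text</p>'
        '<script>ignored()</script>'
        '</root>'
    )

    etree.strip_attributes(root, 'style')   # drop the attribute
    etree.strip_tags(root, 'b')             # drop the tag, keep its text
    etree.strip_elements(root, 'script')    # drop the whole subtree

    result = etree.tostring(root)

``strip_tags()`` merges the text and tail of the removed element into its
parent, whereas ``strip_elements()`` discards the element together with its
content.
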
Bugs fixed
----------
* Namespace cleanup on subtree insertions could result in missing
namespace declarations (and potentially crashes) if the element
defining a namespace was deleted and the namespace was not used by
the top element of the inserted subtree but only in deeper subtrees.
* Raising an exception from a parser target callback didn't always
terminate the parser.
* Only {true, false, 1, 0} are accepted as the lexical representation for
BoolElement; {True, False, T, F, t, f} are no longer accepted. This restores
the lxml <= 2.0 behaviour.
Other changes
-------------
2.2.1 (2009-06-02)
==================
Features added
--------------
* Injecting default attributes into a document during XML Schema
validation (also at parse time).
* Pass ``huge_tree`` parser option to disable parser security
restrictions imposed by libxml2 2.7.
Bugs fixed
----------
* The script for statically building libxml2 and libxslt didn't work
in Py3.
* ``XMLSchema()`` also passes invalid schema documents on to libxml2
for parsing (which could lead to a crash before release 2.6.24).
Other changes
-------------
2.2 (2009-03-21)
================
Features added
--------------
* Support for ``standalone`` flag in XML declaration through
``tree.docinfo.standalone`` and by passing ``standalone=True/False``
on serialisation.
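
The round trip of the ``standalone`` flag might look like this. It is a
minimal sketch, assuming lxml 2.2 or later.

.. code-block:: python

    from lxml import etree

    tree = etree.ElementTree(etree.Element('root'))

    # Ask the serialiser to write the standalone flag into the declaration.
    data = etree.tostring(tree, xml_declaration=True, encoding='UTF-8',
                          standalone=True)

    # After parsing, the flag is available on docinfo.
    declared = etree.fromstring(data).getroottree().docinfo.standalone

    # A document without the flag reports None.
    undeclared = etree.fromstring('<root/>').getroottree().docinfo.standalone
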
Bugs fixed
----------
* Crash when parsing an XML Schema with external imports from a
filename.
2.2beta4 (2009-02-27)
=====================
Features added
--------------
* Support strings and instantiable Element classes as child arguments
to the constructor of custom Element classes.
* GZip compression support for serialisation to files and file-like
objects.
Bugs fixed
----------
* Deep-copying an ElementTree copied neither its sibling PIs and
comments nor its internal/external DTD subsets.
* Soupparser failed on broken attributes without values.
* Crash in XSLT when overwriting an already defined attribute using
``xsl:attribute``.
* Crash bug in exception handling code under Python 3. This was due
to a problem in Cython, not lxml itself.
* ``lxml.html.FormElement._name()`` failed for non top-level forms.
* ``TAG`` special attribute in constructor of custom Element classes
was evaluated incorrectly.
Other changes
-------------
* Official support for Python 3.0.1.
* ``Element.findtext()`` now returns an empty string instead of None
for Elements without text content.
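
The changed ``findtext()`` behaviour distinguishes an empty element from a
missing one. The sketch below uses a made-up document and falls back to the
standard library ElementTree, which behaves the same way here, if lxml is not
installed.

.. code-block:: python

    try:
        from lxml import etree
    except ImportError:  # fall back to the standard library implementation
        import xml.etree.ElementTree as etree

    root = etree.fromstring('<root><empty/><full>text</full></root>')

    empty_text = root.findtext('empty')      # element exists, but has no text
    full_text = root.findtext('full')
    missing_text = root.findtext('missing')  # no such element at all

Only ``missing_text`` is ``None``; ``empty_text`` is the empty string.
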
2.2beta3 (2009-02-17)
=====================
Features added
--------------
* ``XSLT.strparam()`` class method to wrap quoted string parameters
that require escaping.
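
A sketch of why ``XSLT.strparam()`` is useful: a string that contains both
kinds of quotes cannot be written as an XPath string literal, but it can be
passed as a wrapped parameter. This assumes lxml 2.2 or later; the stylesheet
and the parameter name ``a`` are made up for the example.

.. code-block:: python

    from lxml import etree

    transform = etree.XSLT(etree.XML('''
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:param name="a"/>
      <xsl:template match="/"><xsl:value-of select="$a"/></xsl:template>
    </xsl:stylesheet>
    '''))

    value = '''it's "quoted"'''  # contains single and double quotes

    # strparam() wraps the string so that it is passed on literally.
    result = transform(etree.XML('<doc/>'), a=etree.XSLT.strparam(value))
    output = str(result)
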
Bugs fixed
----------
* Memory leak in XPath evaluators.
* Crash when parsing indented XML in one thread and merging it with
other documents parsed in another thread.
* Setting the ``base`` attribute in ``lxml.objectify`` from a unicode
string failed.
* Fixes following changes in Python 3.0.1.
* Minor fixes for Python 3.
Other changes
-------------
* The global error log (which is copied into the exception log) is now
local to a thread, which fixes some race conditions.
* More robust error handling on serialisation.
2.2beta2 (2009-01-25)
=====================
Bugs fixed
----------
* Potential memory leak on exception handling. This was due to a
problem in Cython, not lxml itself.
* ``iter_links`` (and related link-rewriting functions) in
``lxml.html`` would interpret CSS like ``url("link")`` incorrectly
(treating the quotation marks as part of the link).
* Failing import on systems that have an ``io`` module.
2.1.5 (2009-01-06)
==================
Bugs fixed
----------
* Potential memory leak on exception handling. This was due to a
problem in Cython, not lxml itself.
* Failing import on systems that have an ``io`` module.
2.2beta1 (2008-12-12)
=====================
Features added
--------------
* Allow ``lxml.html.diff.htmldiff`` to accept Element objects, not
just HTML strings.
Bugs fixed
----------
* Crash when using an XPath evaluator in multiple threads.
* Fixed missing whitespace before ``Link:...`` in ``lxml.html.diff``.
Other changes
-------------
* Export ``lxml.html.parse``.
2.1.4 (2008-12-12)
==================
Bugs fixed
----------
* Crash when using an XPath evaluator in multiple threads.
2.0.11 (2008-12-12)
===================
Bugs fixed
----------
* Crash when using an XPath evaluator in multiple threads.
2.2alpha1 (2008-11-23)
======================
Features added
--------------
* Support for XSLT result tree fragments in XPath/XSLT extension
functions.
* QName objects have new properties ``namespace`` and ``localname``.
* New options for exclusive C14N and C14N without comments.
* Instantiating a custom Element class creates a new Element.
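
The new ``QName`` properties split a qualified name into its parts. A minimal
sketch, assuming lxml 2.2 or later and using an invented namespace URI:

.. code-block:: python

    from lxml import etree

    qname = etree.QName('{http://example.com/ns}item')

    namespace = qname.namespace   # the namespace URI part
    localname = qname.localname   # the tag name without the namespace

    # QName also accepts the namespace and the tag as separate arguments.
    same = etree.QName('http://example.com/ns', 'item')
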
Bugs fixed
----------
* XSLT didn't inherit the parse options of the input document.
* 0-bytes could slip through the API when used inside of Unicode
strings.
* With ``lxml.html.clean.autolink``, links with balanced parentheses
that end in a parenthesis will be linked in their entirety (typical
with Wikipedia links).
Other changes
-------------
2.1.3 (2008-11-17)
==================
Features added
--------------
Bugs fixed
----------
* Ref-count leaks when lxml enters a try-except statement while an
outside exception lives in sys.exc_*(). This was due to a problem in
Cython, not lxml itself.
* Parser Unicode decoding errors could get swallowed by other
exceptions.
* Name/import errors in some Python modules.
* Internal DTD subsets that did not specify a system or public ID were
not serialised and did not appear in the docinfo property of
ElementTrees.
* Fix a pre-Py3k warning when parsing from a gzip file in Py2.6.
* Test suite fixes for libxml2 2.7.
* Resolver.resolve_string() did not work for non-ASCII byte strings.
* Resolver.resolve_file() was broken.
* Overriding the parser encoding didn't work for many encodings.
Other changes
-------------
2.0.10 (2008-11-17)
===================
Bugs fixed
----------
* Ref-count leaks when lxml enters a try-except statement while an
outside exception lives in sys.exc_*(). This was due to a problem in
Cython, not lxml itself.
2.1.2 (2008-09-05)
==================
Features added
--------------
* lxml.etree now tries to find the absolute path name of files when
parsing from a file-like object. This helps custom resolvers when
resolving relative URLs, as libxml2 can prepend them with the path
of the source document.
Bugs fixed
----------
* Memory problem when passing documents between threads.
* Target parser did not honour the ``recover`` option and raised an
exception instead of calling ``.close()`` on the target.
Other changes
-------------
2.0.9 (2008-09-05)
==================
Bugs fixed
----------
* Memory problem when passing documents between threads.
* Target parser did not honour the ``recover`` option and raised an
exception instead of calling ``.close()`` on the target.
2.1.1 (2008-07-24)
==================
Features added
--------------
Bugs fixed
----------
* Crash when parsing XSLT stylesheets in a thread and using them in
another.
* Encoding problem when including text with ElementInclude under
Python 3.
Other changes
-------------
2.0.8 (2008-07-24)
==================
Features added
--------------
* ``lxml.html.rewrite_links()`` strips links to work around documents
with whitespace in URL attributes.
Bugs fixed
----------
* Crash when parsing XSLT stylesheets in a thread and using them in
another.
* CSS selector parser dropped remaining expression after a function
with parameters.
Other changes
-------------
2.1 (2008-07-09)
================
Features added
--------------
* Smart strings can be switched off in XPath (``smart_strings``
keyword option).
* ``lxml.html.rewrite_links()`` strips links to work around documents
with whitespace in URL attributes.
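
The effect of the ``smart_strings`` option can be seen by comparing the two
kinds of XPath string results. This is a small sketch, assuming lxml 2.1 or
later.

.. code-block:: python

    from lxml import etree

    root = etree.XML('<root><a>text</a></root>')

    # By default, string results remember the Element they came from.
    smart = root.xpath('a/text()')[0]
    parent_tag = smart.getparent().tag

    # With smart_strings=False, plain strings are returned instead.
    plain = root.xpath('a/text()', smart_strings=False)[0]
    has_parent = hasattr(plain, 'getparent')

Smart strings keep a reference to their parent Element, so switching them off
can be worthwhile when only the string values themselves are needed.
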
Bugs fixed
----------
* Custom resolvers were not used for XMLSchema includes/imports and
XInclude processing.
* CSS selector parser dropped remaining expression after a function
with parameters.
Other changes
-------------
* ``objectify.enableRecursiveStr()`` was removed, use
``objectify.enable_recursive_str()`` instead
* Speed-up when running XSLTs on documents from other threads
2.0.7 (2008-06-20)
==================
Features added
--------------
* Pickling ``ElementTree`` objects in lxml.objectify.
Bugs fixed
----------
* Descending dot-separated classes in CSS selectors were not resolved
correctly.
* ``ElementTree.parse()`` didn't handle target parser result.
* Potential threading problem in XInclude.
* Crash in Element class lookup classes when the __init__() method of
the super class is not called from Python subclasses.
Other changes
-------------
* Non-ASCII characters in attribute values are no longer escaped on
serialisation.
2.1beta3 (2008-06-19)
=====================
Features added
--------------
* Major overhaul of ``tools/xpathgrep.py`` script.
* Pickling ``ElementTree`` objects in lxml.objectify.
* Support for parsing from file-like objects that return unicode
strings.
* New function ``etree.cleanup_namespaces(el)`` that removes unused
namespace declarations from a (sub)tree (experimental).
* XSLT results support the buffer protocol in Python 3.
* Polymorphic functions in ``lxml.html`` that accept either a tree or
a parsable string will return either a UTF-8 encoded byte string, a
unicode string or a tree, based on the type of the input.
Previously, the result was always a byte string or a tree.
* Support for Python 2.6 and 3.0 beta.
* File name handling now uses a heuristic to convert between byte
strings (usually filenames) and unicode strings (usually URLs).
* Parsing from a plain file object frees the GIL under Python 2.x.
* Running ``iterparse()`` on a plain file (or filename) frees the GIL
on reading under Python 2.x.
* Conversion functions ``html_to_xhtml()`` and ``xhtml_to_html()`` in
lxml.html (experimental).
* Most features in lxml.html work for XHTML namespaced tag names
(experimental).
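
As an illustration of ``etree.cleanup_namespaces()``, the following sketch
removes a declaration that no element or attribute uses. It assumes lxml 2.1
or later; the namespace URI is made up.

.. code-block:: python

    from lxml import etree

    root = etree.XML(
        '<root xmlns:unused="http://example.com/unused"><child/></root>'
    )

    before = etree.tostring(root)
    etree.cleanup_namespaces(root)   # modifies the tree in place
    after = etree.tostring(root)
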
Bugs fixed
----------
* ``ElementTree.parse()`` didn't handle target parser result.
* Crash in Element class lookup classes when the __init__() method of
the super class is not called from Python subclasses.
* A number of problems related to unicode/byte string conversion of
filenames and error messages were fixed.
* Building on MacOS-X now passes the "flat_namespace" option to the C
compiler, which reportedly prevents build quirks and crashes on this
platform.
* Windows build was broken.
* Rare crash when serialising to a file object with certain encodings.
Other changes
-------------
* Non-ASCII characters in attribute values are no longer escaped on
serialisation.
* Passing non-ASCII byte strings or invalid unicode strings as .tag,
namespaces, etc. will result in a ValueError instead of an
AssertionError (just like the tag well-formedness check).
* Up to several times faster attribute access (i.e. tree traversal) in
lxml.objectify.
2.0.6 (2008-05-31)
==================
Features added
--------------
Bugs fixed
----------
* Incorrect evaluation of ``el.find("tag[child]")``.
* Windows build was broken.
* Moving a subtree from a document created in one thread into a
document of another thread could crash when the rest of the source
document is deleted while the subtree is still in use.
* Rare crash when serialising to a file object with certain encodings.
Other changes
-------------
* lxml should now build without problems on MacOS-X.
2.1beta2 (2008-05-02)
=====================
Features added
--------------
* All parse functions in lxml.html take a ``parser`` keyword argument.
* lxml.html has a new parser class ``XHTMLParser`` and a module
attribute ``xhtml_parser`` that provide XML parsers that are
pre-configured for the lxml.html package.
Bugs fixed
----------
* Moving a subtree from a document created in one thread into a
document of another thread could crash when the rest of the source
document is deleted while the subtree is still in use.
* Passing an nsmap when creating an Element will no longer strip
redundantly defined namespace URIs. This prevented the definition
of more than one prefix for a namespace on the same Element.
Other changes
-------------
* If the default namespace is redundantly defined with a prefix on the
same Element, the prefix will now be preferred for subelements and
attributes. This allows users to work around a problem in libxml2
where attributes from the default namespace could serialise without
a prefix even when they appear on an Element with a different
namespace (i.e. they would end up in the wrong namespace).
2.0.5 (2008-05-01)
==================
Features added
--------------
Bugs fixed
----------
* Resolving to a filename in custom resolvers didn't work.
* lxml did not honour libxslt's second error state "STOPPED", which
let some XSLT errors pass silently.
* Memory leak in Schematron with libxml2 >= 2.6.31.
Other changes
-------------
2.1beta1 (2008-04-15)
=====================
Features added
--------------
* Error logging in Schematron (requires libxml2 2.6.32 or later).
* Parser option ``strip_cdata`` for normalising or keeping CDATA
sections. Defaults to ``True`` as before, thus replacing CDATA
sections by their text content.
* ``CDATA()`` factory to wrap string content as CDATA section.
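
The interplay of ``CDATA()`` and the ``strip_cdata`` parser option can be
sketched like this, assuming lxml 2.1 or later:

.. code-block:: python

    from lxml import etree

    # Wrap text content in a CDATA section when building a tree.
    root = etree.Element('root')
    root.text = etree.CDATA('a < b')
    built = etree.tostring(root)

    source = '<root><![CDATA[a < b]]></root>'

    # Default parser (strip_cdata=True): CDATA becomes plain text content.
    stripped = etree.tostring(etree.fromstring(source))

    # strip_cdata=False keeps the CDATA section in the tree.
    keeping_parser = etree.XMLParser(strip_cdata=False)
    kept = etree.tostring(etree.fromstring(source, keeping_parser))

In both cases the ``.text`` value of the element is the same string; the
option only affects whether the section survives serialisation.
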
Bugs fixed
----------
* Resolving to a filename in custom resolvers didn't work.
* lxml did not honour libxslt's second error state "STOPPED", which
let some XSLT errors pass silently.
* Memory leak in Schematron with libxml2 >= 2.6.31.
* lxml.etree accepted non well-formed namespace prefix names.
Other changes
-------------
* Major cleanup in internal ``moveNodeToDocument()`` function, which
takes care of namespace cleanup when moving elements between
different namespace contexts.
* New Elements created through the ``makeelement()`` method of an HTML
parser or through lxml.html now end up in a new HTML document
(doctype HTML 4.01 Transitional) instead of a generic XML document.
This mostly impacts the serialisation and the availability of a DTD
context.
2.0.4 (2008-04-13)
==================
Features added
--------------
Bugs fixed
----------
* Hanging thread in conjunction with GTK threading.
* Crash bug in iterparse when moving elements into other documents.
* HTML elements' ``.cssselect()`` method was broken.
* ``ElementTree.find*()`` didn't accept QName objects.
Other changes
-------------
2.1alpha1 (2008-03-27)
======================
Features added
--------------
* New event types 'comment' and 'pi' in ``iterparse()``.
* ``XSLTAccessControl`` instances have a property ``options`` that
returns a dict of access configuration options.
* Constant instances ``DENY_ALL`` and ``DENY_WRITE`` on
``XSLTAccessControl`` class.
* Extension elements for XSLT (experimental!)
* ``Element.base`` property returns the xml:base or HTML base URL of
an Element.
* ``docinfo.URL`` property is writable.
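
The new ``iterparse()`` event types can be requested alongside the usual
ones. The sketch below uses an invented document. It falls back to the
standard library ElementTree if lxml is not installed, which is an assumption
of this example: the standard library only supports these event names from
Python 3.8 on.

.. code-block:: python

    from io import BytesIO

    try:
        from lxml import etree
    except ImportError:  # fall back to the standard library implementation
        import xml.etree.ElementTree as etree

    xml = b'<root><!-- note --><?target data?><a/></root>'

    # Collect the event names in document order.
    seen = [
        event
        for event, node in etree.iterparse(
            BytesIO(xml), events=('comment', 'pi', 'end'))
    ]
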
Bugs fixed
----------
* Default encoding for plain text serialisation was different from
that of XML serialisation (UTF-8 instead of ASCII).
Other changes
-------------
* Minor API speed-ups.
* The benchmark suite now uses tail text in the trees, which makes the
absolute numbers incomparable to previous results.
* Generating the HTML documentation now requires Pygments_, which is
used to enable syntax highlighting for the doctest examples.
.. _Pygments: http://pygments.org/
Most long-time deprecated functions and methods were removed:
- ``etree.clearErrorLog()``, use ``etree.clear_error_log()``
- ``etree.useGlobalPythonLog()``, use
``etree.use_global_python_log()``
- ``etree.ElementClassLookup.setFallback()``, use
``etree.ElementClassLookup.set_fallback()``
- ``etree.getDefaultParser()``, use ``etree.get_default_parser()``
- ``etree.setDefaultParser()``, use ``etree.set_default_parser()``
- ``etree.setElementClassLookup()``, use
``etree.set_element_class_lookup()``
Note that ``parser.setElementClassLookup()`` has not been removed
yet, although ``parser.set_element_class_lookup()`` should be used
instead.
- ``xpath_evaluator.registerNamespace()``, use
``xpath_evaluator.register_namespace()``
- ``xpath_evaluator.registerNamespaces()``, use
``xpath_evaluator.register_namespaces()``
- ``objectify.setPytypeAttributeTag``, use
``objectify.set_pytype_attribute_tag``
- ``objectify.setDefaultParser()``, use
``objectify.set_default_parser()``
2.0.3 (2008-03-26)
==================
Features added
--------------
* soupparser.parse() allows passing keyword arguments on to
BeautifulSoup.
* ``fromstring()`` method in ``lxml.html.soupparser``.
Bugs fixed
----------
* ``lxml.html.diff`` didn't treat empty tags properly (e.g.,
`` ``).
* Handle entity replacements correctly in target parser.
* Crash when using ``iterparse()`` with XML Schema validation.
* The BeautifulSoup parser (soupparser.py) did not replace entities,
which made them turn up in text content.
* Attribute assignment of custom PyTypes in objectify could fail to
correctly serialise the value to a string.
Other changes
-------------
* ``lxml.html.ElementSoup`` was replaced by a new module
``lxml.html.soupparser`` with a more consistent API. The old module
remains for compatibility with ElementTree's own ElementSoup module.
* Setting the XSLT_CONFIG and XML2_CONFIG environment variables at
build time will let setup.py pick up the ``xml2-config`` and
``xslt-config`` scripts from the supplied path name.
* Passing ``--with-xml2-config=/path/to/xml2-config`` to setup.py will
override the ``xml2-config`` script that is used to determine the C
compiler options. The same applies for the ``--with-xslt-config``
option.
2.0.2 (2008-02-22)
==================
Features added
--------------
* Support passing ``base_url`` to file parser functions to override
the filename of the file(-like) object.
Bugs fixed
----------
* The prefix for objectify's pytype namespace was missing from the set
of default prefixes.
* Memory leak in Schematron (fixed only for libxml2 2.6.31+).
* Error type names in RelaxNG were reported incorrectly.
* Slice deletion bug fixed in objectify.
Other changes
-------------
* Enabled doctests for some Python modules (especially ``lxml.html``).
* Add a ``method`` argument to ``lxml.html.tostring()``
(``method="xml"`` for XHTML output).
* Make it clearer that methods like ``lxml.html.fromstring()`` take a
``base_url`` argument.
2.0.1 (2008-02-13)
==================
Features added
--------------
* Child iteration in ``lxml.pyclasslookup``.
* Loads of new docstrings reflect the signature of functions and
methods to make them visible in API docs and ``help()``
Bugs fixed
----------
* The module ``lxml.html.builder`` was duplicated as
``lxml.htmlbuilder``
* Form elements would return None for ``form.fields.keys()`` if there
was an unnamed input field. Now unnamed input fields are completely
ignored.
* Setting an element slice in objectify could insert slice-overlapping
elements at the wrong position.
Other changes
-------------
* The generated API documentation was cleaned up and no longer lists
non-public classes etc.
* The previously public module ``lxml.html.setmixin`` was renamed to
``lxml.html._setmixin`` as it is not an official part of lxml. If
you want to use it, feel free to copy it over to your own source
base.
* Passing ``--with-xslt-config=/path/to/xslt-config`` to setup.py will
override the ``xslt-config`` script that is used to determine the C
compiler options.
2.0 (2008-02-01)
================
Features added
--------------
* Passing the ``unicode`` type as ``encoding`` to ``tostring()`` will
serialise to unicode. The ``tounicode()`` function is now
deprecated.
* ``XMLSchema()`` and ``RelaxNG()`` can parse from StringIO.
* ``makeparser()`` function in ``lxml.objectify`` to create a new
parser with the usual objectify setup.
* Plain ASCII XPath string results are no longer forced into unicode
objects as in 2.0beta1, but are returned as plain strings as before.
* All XPath string results are 'smart' objects that have a
``getparent()`` method to retrieve their parent Element.
* ``with_tail`` option in serialiser functions.
* More accurate exception messages in validator creation.
* Parse-time XML schema validation (``schema`` parser keyword).
* XPath string results of the ``text()`` function and attribute
selection make their Element container accessible through a
``getparent()`` method. As a side-effect, they are now always
unicode objects (even ASCII strings).
* ``XSLT`` objects are usable in any thread - at the cost of a deep
copy if they were not created in that thread.
* Invalid entity names and character references will be rejected by
the ``Entity()`` factory.
* ``entity.text`` returns the textual representation of the entity,
e.g. ``&amp;``.
* New properties ``position`` and ``code`` on ParseError exception (as
in ET 1.3)
* Rich comparison of ``element.attrib`` proxies.
* ElementTree compatible TreeBuilder class.
* Use default prefixes for some common XML namespaces.
* ``lxml.html.clean.Cleaner`` now allows for a ``host_whitelist``, and
two overridable methods: ``allow_embedded_url(el, url)`` and the
more general ``allow_element(el)``.
* Extended slicing of Elements as in ``element[1:-1:2]``, both in
etree and in objectify
* Resolvers can now provide a ``base_url`` keyword argument when
resolving a document as string data.
* When using ``lxml.doctestcompare`` you can give the doctest option
``NOPARSE_MARKUP`` (like ``# doctest: +NOPARSE_MARKUP``) to suppress
the special checking for one test.
* Separate ``feed_error_log`` property for the feed parser interface.
The normal parser interface and ``iterparse`` continue to use
``error_log``.
* The normal parsers and the feed parser interface are now separated
and can be used concurrently on the same parser instance.
* ``fromstringlist()`` and ``tostringlist()`` functions as in
ElementTree 1.3
* ``iterparse()`` accepts an ``html`` boolean keyword argument for
parsing with the HTML parser (note that this interface may be
subject to change)
* Parsers accept an ``encoding`` keyword argument that overrides the encoding
of the parsed documents.
* New C-API function ``hasChild()`` to test for children
* ``annotate()`` function in objectify can annotate with Python types and XSI
types in one step. Accompanied by ``xsiannotate()`` and ``pyannotate()``.
* ``ET.write()``, ``tostring()`` and ``tounicode()`` now accept a keyword
argument ``method`` that can be one of 'xml' (or None), 'html' or 'text' to
serialise as XML, HTML or plain text content.
* ``iterfind()`` method on Elements returns an iterator equivalent to
``findall()``
* ``itertext()`` method on Elements
* Setting a QName object as value of the .text property or as an attribute
will resolve its prefix in the respective context
* ElementTree-like parser target interface as described in
http://effbot.org/elementtree/elementtree-xmlparser.htm
* ElementTree-like feed parser interface on XMLParser and HTMLParser
(``feed()`` and ``close()`` methods)
* Reimplemented ``objectify.E`` for better performance and improved
integration with objectify. Provides extended type support based on
registered PyTypes.
* XSLT objects now support deep copying
* New ``makeSubElement()`` C-API function that allows creating a new
subelement straight with text, tail and attributes.
* XPath extension functions can now access the current context node
(``context.context_node``) and use a context dictionary
(``context.eval_context``) from the context provided in their first
parameter
* HTML tag soup parser based on BeautifulSoup in ``lxml.html.ElementSoup``
* New module ``lxml.doctestcompare`` by Ian Bicking for writing simplified
doctests based on XML/HTML output. Use by importing ``lxml.usedoctest`` or
``lxml.html.usedoctest`` from within a doctest.
* New module ``lxml.cssselect`` by Ian Bicking for selecting Elements with CSS
selectors.
* New package ``lxml.html`` written by Ian Bicking for advanced HTML
treatment.
* Namespace class setup is now local to the ``ElementNamespaceClassLookup``
instance and no longer global.
* Schematron validation (incomplete in libxml2)
* Additional ``stringify`` argument to ``objectify.PyType()`` takes a
conversion function to strings to support setting text values from arbitrary
types.
* Entity support through an ``Entity`` factory and element classes. XML
parsers now have a ``resolve_entities`` keyword argument that can be set to
False to keep entities in the document.
* ``column`` field on error log entries to accompany the ``line`` field
* Error specific messages in XPath parsing and evaluation
NOTE: for evaluation errors, you will now get an XPathEvalError instead of
an XPathSyntaxError. To handle both, catch their common base class ``XPathError``.
* The regular expression functions in XPath now support passing a node-set
instead of a string
* Extended type annotation in objectify: new ``xsiannotate()`` function
* EXSLT RegExp support in standard XPath (not only XSLT)
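
Several of the ElementTree-like additions in this list (``itertext()``,
``iterfind()``, extended slicing and the ``feed()``/``close()`` parser
interface) can be exercised in a few lines. This is only a sketch with a
made-up document; it falls back to the standard library ElementTree, which
provides the same calls, if lxml is not installed.

.. code-block:: python

    try:
        from lxml import etree
    except ImportError:  # fall back to the standard library implementation
        import xml.etree.ElementTree as etree

    root = etree.fromstring(
        '<doc><p>Hello <b>brave</b> new</p><p>world</p><p>!</p></doc>'
    )

    # itertext() walks all text and tail content in document order.
    text = ''.join(root.itertext())

    # iterfind() is the lazy counterpart of findall().
    found = [el.tag for el in root.iterfind('p/b')]

    # Extended slicing: every second child of the root element.
    every_other = [el.text for el in root[::2]]

    # Feed parser interface: pass the data in chunks, then close().
    parser = etree.XMLParser()
    for chunk in (b'<doc><p>fed in', b' pieces</p>', b'</doc>'):
        parser.feed(chunk)
    fed_root = parser.close()
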
Bugs fixed
----------
* Missing import in ``lxml.html.clean``.
* Some Python 2.4-isms prevented lxml from building/running under
Python 2.3.
* XPath on ElementTrees could crash when selecting the virtual root
node of the ElementTree.
* Compilation ``--without-threading`` was buggy in alpha5/6.
* Memory leak in the ``parse()`` function.
* Minor bugs in XSLT error message formatting.
* Result document memory leak in target parser.
* Target parser failed to report comments.
* In the ``lxml.html`` ``iter_links`` method, links in ``