[lxml-dev] Behaviour change in findtext

Stefan Behnel stefan_ml at behnel.de
Fri Feb 20 08:26:24 CET 2009


Hi,

Aloys Baillet wrote:
> I was planning on upgrading to a recent version of lxml but found that our
> code was failing in numerous places with None objects found in unexpected
> places.
> In lxml 2+ the findtext method will ignore the default and return None if
> the element is found but the text is empty.

Thanks, this change was introduced in ElementTree 1.3 and lxml 2.0. Note
that _elementpath.py is mostly a copy of ElementPath.py in ET, except for
some minor adaptations and Py3 fixes.


> In elementtree and lxml before 2 the findtext method would never return
> None, if the element is found but empty it would return the default.

This is not quite true. ET 1.2 (and thus lxml <= 1.3) returned an empty
string in that case, which isn't necessarily the default either. So, for
ET 1.2 compatibility, findtext() should return an empty string if the
element is found but its text is empty, and the 'default' value (which is
None if not passed!) only when the element is not found.
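
To make the two cases concrete, here is a minimal sketch. It assumes the
xml.etree.ElementTree module from a current Python standard library, not
the lxml 2.0 code under discussion, and the element names are made up for
illustration. That module follows the ET 1.2 style semantics described
above:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <a> exists but is empty, <b> has text,
# and there is no <missing> element at all.
root = ET.fromstring("<root><a/><b>text</b></root>")

found_with_text = root.findtext("b")              # element found, has text
found_but_empty = root.findtext("a", "fallback")  # element found, no text
not_found = root.findtext("missing")              # no default passed
not_found_default = root.findtext("missing", "fallback")

print(repr(found_with_text))    # 'text'
print(repr(found_but_empty))    # '' (the default is NOT used here)
print(repr(not_found))          # None
print(repr(not_found_default))  # 'fallback'
```

The default only comes into play when no element matches the path.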

I wonder why the default is None, though. If the function is supposed to
save users a check by always returning a string value, the default should
be the empty string as well. Plus, lxml.etree knows the difference between
an empty string text value ('') and no text content (None). So returning
'' for an element without text would blur that distinction in this one
place while it stays transparent in all others.
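
Code that relies on getting the default back for an empty element does not
have to depend on findtext() for that. As a rough sketch, again assuming
the standard library's xml.etree.ElementTree, and with text_or_default
being a hypothetical helper name rather than part of any library, the
check can be made explicit through find():

```python
import xml.etree.ElementTree as ET

def text_or_default(parent, path, default=None):
    """Hypothetical helper: return the text of the first match, or
    'default' if there is no match or the match has no text."""
    child = parent.find(path)
    if child is None:
        return default   # element not found
    if not child.text:
        return default   # element found, text is None or ''
    return child.text

root = ET.fromstring("<root><a/><b>text</b></root>")

empty_result = text_or_default(root, "a", "n/a")
text_result = text_or_default(root, "b", "n/a")
missing_result = text_or_default(root, "missing", "n/a")
```

This deliberately treats "not found" and "found but empty" the same way.
Callers that need to tell them apart can call find() themselves and
inspect .text on the result.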

Fredrik, do you have any comments on this?

Stefan

