[lxml-dev] Possible bug in xpath?

Stefan Behnel stefan_ml at behnel.de
Thu Jan 24 22:02:05 CET 2008


Hi,

Bruno Barberi Gnecco wrote:
>     I think I may have run into a bug. I'm attaching a sample code
> to reproduce it. Instead of getting back just the '<strong>..</strong>',
> I get the entire div content.
> 
>     If OTOH I'm doing something stupid, just tell me :) Thanks a lot,

tostring(el) and tounicode(el) serialise the Element object you pass in, and
the .tail text of an Element is part of that object, so it is serialised too.
Try calling

    tounicode( et.getroot()[0] )

That should give you the same output that you see in your XPath example.

Here's another example that might make it clear why this is so:

    >>> import lxml.etree as et
    >>> root = et.Element("test")

    >>> root.text = "TEXT"
    >>> et.tostring(root)
    <test>TEXT</test>

    >>> root.tail = "TAIL"
    >>> et.tostring(root)
    <test>TEXT</test>TAIL

    >>> root.tail = None
    >>> et.tostring(root)
    <test>TEXT</test>

I also updated the FAQ entry on this topic.

http://codespeak.net/lxml/dev/FAQ.html#what-about-that-trailing-text-on-serialised-elements

Stefan

