[lxml-dev] Possible bug in xpath?
Stefan Behnel
stefan_ml at behnel.de
Thu Jan 24 22:02:05 CET 2008
Hi,
Bruno Barberi Gnecco wrote:
> I think I may have run into a bug. I'm attaching a sample code
> to reproduce it. Instead of getting back just the '<strong>..</strong>',
> I get the entire div content.
>
> If OTOH I'm doing something stupid, just tell me :) Thanks a lot,
tostring(el) and tounicode(el) serialise the Element object you pass, and the
.tail text of an Element is part of the Element object you are serialising.
Try calling
tounicode( et.getroot()[0] )
That should give you the same output that you see in your XPath example.
Here's another example that might make it clear why this is so:
>>> import lxml.etree as et
>>> root = et.Element("test")
>>> root.text = "TEXT"
>>> et.tostring(root)
<test>TEXT</test>
>>> root.tail = "TAIL"
>>> et.tostring(root)
<test>TEXT</test>TAIL
>>> root.tail = None
>>> et.tostring(root)
<test>TEXT</test>
I also updated the FAQ entry on this topic.
http://codespeak.net/lxml/dev/FAQ.html#what-about-that-trailing-text-on-serialised-elements
Stefan