[lxml-dev] Some XPath questions...
Ian Bicking
ianb at colorstudy.com
Tue Jul 3 01:26:06 CEST 2007
Stefan Behnel wrote:
>> So when I use // it works. Huh. I prefer descendant-or-self, because I
>> find it peculiar to do a search from the root when you've called the
>> method on some particular element (that may not be at the root).
>
> There's also ".//*".
That seems to be equivalent to //*, i.e., // goes directly to the root
regardless of context.
>>>>>> div:empty (no children, including text, maybe not including whitespace).
>>>>> Ouch. let me think about that one.
>>>> Yeah, I couldn't figure that one out. I thought this might work:
>>>> >>> xpath('E:empty')
>>>> e[count(./children::*) = 0 and string(.) = '']
>>>> But maybe I don't understand how count() works; this isn't a valid XPath
>>>> expression.
>>> You want "child" not "children". Using normalize-space(.) instead of
>>> string(.) will exclude whitespace. This does assume you are ignoring
>>> comments and PIs; I believe that's the behavior you want.
>> Cool, that seems to work right.
>
> What about "e[not(*) and not(normalize-space())]" ?
Yes, that works too.
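For reference, here is a minimal sketch that runs both forms of the :empty test through lxml's Element.xpath(). The sample document is made up for illustration. Per the discussion above, an element holding only whitespace and a comment is assumed to count as empty.

```python
from lxml import etree

# Made-up sample document: children 0 and 2 of <root> should match :empty
# (the third <e> holds only whitespace and a comment).
root = etree.XML(
    "<root>"
    "<e/>"
    "<e>text</e>"
    "<e>   <!-- comment --></e>"
    "<e><child/></e>"
    "</root>"
)

# The corrected count()-based form ("child", not "children").
with_count = root.xpath("e[count(./child::*) = 0 and normalize-space(.) = '']")

# The shorter form: no element children and no non-whitespace text.
with_not = root.xpath("e[not(*) and not(normalize-space())]")

# Report matches as positions among the children of <root>.
count_positions = [root.index(el) for el in with_count]
not_positions = [root.index(el) for el in with_not]
print(count_positions, not_positions)
```

Both expressions ignore comments and PIs, because `*` only matches elements and the string value of an element is built from its text nodes alone.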
>> One query I'm realizing might be really hard (maybe too hard in XPath)
>> is *:first-of-type, *:last-of-type, and *:only-of-type, since they match
>> in a funny sort of way. You can't really do:
>>
>> *[count(../*[name() = name()]) = 1]
>
> You need two expressions here, one to find the node and one to compare it to
> others (note that name() can also take an argument) - but those are really
> tricky, you're right. They may already touch the borders of what XPath can express.
I could probably do it by adding a new function, I suppose;
css:last-of-type() for instance. It's not that hard to do in Python,
after all.
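A rough sketch of the plain-Python approach follows. It is not an XPath extension function. It only walks the tree and groups siblings by tag. The helper name of_type and its 'first'/'last'/'only' argument are invented for this sketch, and the root element itself is never reported since it has no siblings to compare against. It uses the ElementTree API from the standard library, and the same calls should also work on an lxml.etree tree.

```python
import xml.etree.ElementTree as ET


def of_type(root, which):
    """Return elements matching :first-of-type, :last-of-type or :only-of-type.

    `which` is 'first', 'last' or 'only' (hypothetical names for this sketch).
    """
    matches = []
    for parent in root.iter():
        # Group the element children of this parent by tag name.
        by_tag = {}
        for child in parent:
            if isinstance(child.tag, str):  # skip comments and PIs
                by_tag.setdefault(child.tag, []).append(child)
        for siblings in by_tag.values():
            if which == "first":
                matches.append(siblings[0])
            elif which == "last":
                matches.append(siblings[-1])
            elif which == "only" and len(siblings) == 1:
                matches.append(siblings[0])
    return matches


# Made-up sample document for illustration.
doc = ET.fromstring("<ul><li>a</li><p>x</p><li>b</li><li>c</li></ul>")

first = [el.text for el in of_type(doc, "first")]
last = [el.text for el in of_type(doc, "last")]
only = [el.text for el in of_type(doc, "only")]
print(first, last, only)
```

The grouping step is what a single XPath 1.0 predicate cannot express, since inside the inner predicate there is no way to refer back to the outer node's name.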
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org
| Write code, do good | http://topp.openplans.org/careers