[lxml-dev] xpath comparison

Stefan Behnel stefan_ml at behnel.de
Fri Jun 6 15:55:36 CEST 2008


Olivier Collioud top-posted:
> I would have expected that:
> 
> def isin(context,symbol,start,end):
>     return start <= symbol <= end
> ns = ElementTree.FunctionNamespace('http://wipo.int/isin')
> ns.prefix = 'ii'
> ns['isin'] = isin
> 
> for mref in definitionsTree.xpath('//MREF[ii:isin(%s, @START, @END)]' %
> symbol):

That should spell

  for mref in definitionsTree.xpath(
            '//MREF[ii:isin($symbol, @START, @END)]', symbol=symbol):
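
To spell that out as something runnable, here is a minimal sketch. The sample
document and the symbol value are made up for illustration, and the function
is adapted to take the first item of each argument, since lxml hands node-sets
such as @START and @END to an extension function as lists:

```python
from lxml import etree

def isin(context, symbol, start, end):
    # @START and @END arrive as lists of attribute values (node-sets);
    # an MREF without one of the attributes yields an empty list.
    if not start or not end:
        return False
    return start[0] <= symbol <= end[0]

ns = etree.FunctionNamespace('http://wipo.int/isin')
ns.prefix = 'ii'
ns['isin'] = isin

# Hypothetical sample data, not taken from the original document.
definitions_tree = etree.fromstring(
    '<DEFS>'
    '<MREF START="A01" END="A09"/>'
    '<MREF START="B01" END="B09"/>'
    '</DEFS>')

symbol = 'A05'
# $symbol is bound through the keyword argument, so the value needs no
# quoting or escaping inside the XPath expression itself.
matches = definitions_tree.xpath(
    '//MREF[ii:isin($symbol, @START, @END)]', symbol=symbol)
starts = [mref.get('START') for mref in matches]
print(starts)
```

Passing the value as an XPath variable also keeps the expression text
constant, whatever characters the symbol contains.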


> would be faster than:
> 
> for mref in definitionsTree.xpath('//MREF'):
>     if mref.get('START') <= symbol <= mref.get('END'):
>         ...
> 
> Because the first solution should visit fewer nodes.

No. The XPath engine still has to visit every MREF node and call the Python
function for each one. This can't be optimised away.
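
A quick way to check this is to count the calls. The sketch below uses a
made-up document with five MREF elements and records every invocation of the
extension function; the names `calls` and `tree` are only for illustration:

```python
from lxml import etree

calls = []  # one entry per invocation of the extension function

def isin(context, symbol, start, end):
    calls.append(symbol)
    # Node-set arguments arrive as lists; take the first attribute value.
    return bool(start and end) and start[0] <= symbol <= end[0]

ns = etree.FunctionNamespace('http://wipo.int/isin')
ns.prefix = 'ii'
ns['isin'] = isin

# Hypothetical sample data: five MREF elements, A01-A09 through E01-E09.
tree = etree.fromstring(
    '<DEFS>'
    + ''.join('<MREF START="%s01" END="%s09"/>' % (c, c) for c in 'ABCDE')
    + '</DEFS>')

matches = tree.xpath('//MREF[ii:isin($symbol, @START, @END)]', symbol='C05')
print(len(matches), len(calls))
```

Only one element matches, but the call count should equal the number of MREF
elements, so the predicate saves no work compared to the plain Python loop.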

libxml2 implements XPath 1.0, where "<" and ">" convert both operands to
numbers, so string comparisons don't work in plain XPath and I guess you're
basically out of luck there. What you could try is testing for a common prefix
of your start and end markers and only comparing values that start with it.
Or, if your elements are sorted, you can use el.find() instead, which
short-circuits in lxml 2.0. (BTW, you can sort the children of an element
using the usual tuple pack-sort-unpack scheme and reassign the slice at the
end.)
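
The sorting idea can be sketched roughly like this. The sample data and the
helper name `find_range` are made up, and the early exit is written as a plain
Python loop rather than el.find(), since find() cannot express a range test:

```python
try:
    from lxml import etree
except ImportError:
    # The stdlib ElementTree supports the same calls used below.
    import xml.etree.ElementTree as etree

# Hypothetical sample data, deliberately out of order.
root = etree.fromstring(
    '<DEFS>'
    '<MREF START="C01" END="C09"/>'
    '<MREF START="A01" END="A09"/>'
    '<MREF START="B01" END="B09"/>'
    '</DEFS>')

# Pack: pair each child with its sort key. The index breaks ties so that
# two elements are never compared with each other directly.
packed = [(child.get('START'), index, child)
          for index, child in enumerate(root)]
# Sort, then unpack and reassign the slice to reorder the children in place.
packed.sort()
root[:] = [child for _, _, child in packed]

def find_range(parent, symbol):
    """Return the first child whose START..END range contains symbol."""
    for mref in parent:
        if mref.get('START') > symbol:
            break  # children are sorted by START, so nothing later can match
        if symbol <= mref.get('END'):
            return mref
    return None

order = [child.get('START') for child in root]
hit = find_range(root, 'B05')
print(order, hit.get('START'))
```

Once the children are sorted by START, the loop can stop at the first element
whose START is already greater than the symbol.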

Please report back if any of the above approaches worked better for you.

Stefan

