[lxml-dev] matches()

Stefan Behnel stefan_ml at behnel.de
Tue May 29 22:47:42 CEST 2007


Hi Ian,

Ian Bicking wrote:
>>>    doc.xpath('descendant-or-self::*[starts-with(lower-case(@href), 
>>> "javascript:")]')
>> Well, maybe this one doesn't work either (returns 1/0).  Now I'm just 
>> confused.
> 
> Adding to this, I'm trying to do the rel matching with:
> 
> etree.XPath("descendant-or-self::a[fn:lower-case(@rel)=$rel]")

IIRC, "lower-case()" is XPath 2.0. libxml2 supports XPath 1.0 only, so there
just is no such function.
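
A quick way to check this is sketched below. The sample element is made up for illustration, and the exact exception subclass and message may differ between lxml versions, so the sketch only catches the XPathError base class.

```python
from lxml import etree

# Made-up sample element, for illustration only.
doc = etree.fromstring('<a rel="NoFollow"/>')

# An XPath 1.0 core function is available.
length = doc.xpath('string-length(@rel)')

# The XPath 2.0 function lower-case() is not known to libxml2.
try:
    doc.xpath('lower-case(string(@rel))')
    supported = True
except etree.XPathError:
    supported = False
```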

It's easy to implement that in Python, though:

    from lxml import etree

    def make_lower_case(ctxt, s):
        # ctxt is the XPath evaluation context, s is the string argument
        return s.lower()

    etree.FunctionNamespace("myNs")["lower-case"] = make_lower_case

    find = etree.XPath(
        "descendant-or-self::a[fn:lower-case(string(@rel))=$rel]",
        namespaces={'fn': 'myNs'})

(Note the call to "string(...)" to make sure we get a string value here, not a
node set.)
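
For completeness, here is a minimal end-to-end sketch of calling such a function. The sample document and the names doc, matches and hrefs are assumptions for illustration, and "myNs" is only the placeholder namespace URI used above. The $rel variable is supplied as a keyword argument when the compiled expression is called.

```python
from lxml import etree

def make_lower_case(context, s):
    # lxml passes the evaluation context first, then the XPath arguments.
    return s.lower()

# Register the function under the placeholder namespace URI "myNs".
etree.FunctionNamespace("myNs")["lower-case"] = make_lower_case

find = etree.XPath(
    "descendant-or-self::a[fn:lower-case(string(@rel))=$rel]",
    namespaces={"fn": "myNs"})

# Made-up sample document, for illustration only.
doc = etree.fromstring(
    '<div>'
    '<a rel="NoFollow" href="/x">x</a>'
    '<a rel="next" href="/y">y</a>'
    '<a href="/z">z</a>'
    '</div>')

# XPath variables such as $rel are passed as keyword arguments.
matches = find(doc, rel="nofollow")
hrefs = [a.get("href") for a in matches]
```

Links without a rel attribute are simply skipped here, because string(@rel) evaluates to the empty string for them.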

BTW, I get a reproducible crash with the above under libxml2 2.6.27, but it
works with 2.6.28. Sigh...


> I also tried using XPath(r'...[translate(@class, "\n\t\r", "   ")]) and 
> that didn't work.  The \n etc doesn't seem to be interpreted; only if I 
> include the actual characters does it work.  (I then noticed 
> normalize-whitespace, which is better, but it still seems odd.)

Hmm, you didn't try without the r'', did you?

    XPath('...[translate(@class, "\n\t\r", "   ")]')

That should work as it leaves it to Python to handle the char escapes.
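
The difference can be sketched like this. The element and its class value are made up for illustration, and the attribute is set through the API so that the XML parser does not normalise the whitespace away. With a normal string literal, Python turns the escapes into real characters before the expression is compiled. With r'', the expression contains literal backslashes and letters.

```python
from lxml import etree

# Made-up element whose class value contains a tab and a newline.
el = etree.Element("p")
el.set("class", "a\tb\nc")

# Normal literal: Python interprets \n, \t and \r before XPath sees them.
plain = etree.XPath('translate(string(@class), "\n\t\r", "   ")')

# Raw literal: XPath receives backslash-n, backslash-t, backslash-r as text.
raw = etree.XPath(r'translate(string(@class), "\n\t\r", "   ")')

plain_result = plain(el)
raw_result = raw(el)
```

In the raw variant, translate() only acts on backslashes and the letters n, t and r, so the whitespace in this sample value is left untouched.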

Stefan


