[lxml-dev] Preventing XPath injection

Marius Gedminas marius at pov.lt
Sun Sep 7 20:05:58 CEST 2008


On Sun, Sep 07, 2008 at 12:16:25PM -0500, Ian Bicking wrote:
> Geoffrey Sneddon wrote:
> > On 6 Sep 2008, at 18:52, Alex Klizhentas wrote:
> > 
> >> That's strange, I thought it should be quoted like: '
> > 
> > Nope. A string is "[^"]*" or '[^']*' — it is exactly what is between  
> > the quotes.
> 
> When I was trying to figure out CSS to XPath translation, I tried to 
> figure out how string quoting worked in XPath.  Unfortunately I couldn't 
> find any reference to string quoting in the specs (though of course I 
> might have missed it).  This seemed like a very peculiar omission.

The XPath 2.0 spec rectifies that:

  The value of a string literal is an atomic value whose type is
  xs:string and whose value is the string denoted by the characters
  between the delimiting apostrophes or quotation marks. If the literal
  is delimited by apostrophes, two adjacent apostrophes within the
  literal are interpreted as a single apostrophe. Similarly, if the
  literal is delimited by quotation marks, two adjacent quotation marks
  within the literal are interpreted as one quotation mark.

      -- http://www.w3.org/TR/xpath20/#id-literals
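
As a rough illustration of the doubling rule quoted above, here is a
minimal Python sketch.  The helper name xpath2_literal is hypothetical,
and the result is only meaningful to an XPath 2.0 processor; lxml
evaluates XPath 1.0 through libxml2, where doubled quotes are not an
escape.

```python
def xpath2_literal(value):
    """Return `value` as a double-quoted XPath 2.0 string literal.

    Hypothetical helper: per the XPath 2.0 rule, a quotation mark inside
    a literal delimited by quotation marks is written as two adjacent
    quotation marks.  Apostrophes need no treatment in that case.
    """
    return '"%s"' % value.replace('"', '""')


print(xpath2_literal('say "hi"'))
```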

XPath 1.0 has no such escape: a literal simply cannot contain its own
delimiter.  I suppose you could always concatenate strings, e.g.
concat("Look, it's a ", '"quoted string"!').
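
A minimal Python sketch of that concat() workaround for XPath 1.0
follows.  The function name xpath_literal is an assumption made for
illustration, not part of lxml.  It picks whichever quote style the
value does not contain and falls back to concat() only when the value
contains both.

```python
def xpath_literal(value):
    """Return an XPath 1.0 expression that evaluates to the string `value`.

    Hypothetical helper, not an lxml API.
    """
    if "'" not in value:
        return "'%s'" % value
    if '"' not in value:
        return '"%s"' % value
    # Both quote characters occur: split on apostrophes, quote each
    # non-empty piece with apostrophes, and put a double-quoted
    # apostrophe between neighbouring pieces.
    pieces = []
    for i, part in enumerate(value.split("'")):
        if i:
            pieces.append('"\'"')
        if part:
            pieces.append("'%s'" % part)
    return "concat(%s)" % ", ".join(pieces)


print(xpath_literal('Look, it\'s a "quoted string"!'))
```

With lxml specifically, it may be simpler to avoid quoting altogether
and pass the value as an XPath variable, e.g.
root.xpath("//item[@name=$name]", name=user_input), where root and
the item/name names are placeholders.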

Marius Gedminas
-- 
Hoping the problem  magically goes away  by ignoring it is the "microsoft
approach to programming" and should never be allowed.
                -- Linus Torvalds