[Lxml-checkins] r42705 - lxml/trunk/doc

scoder at codespeak.net
Sat May 5 19:09:00 CEST 2007


Author: scoder
Date: Sat May  5 19:08:59 2007
New Revision: 42705

Modified:
   lxml/trunk/doc/xpathxslt.txt
Log:
rewrite of XPath doc page

Modified: lxml/trunk/doc/xpathxslt.txt
==============================================================================
--- lxml/trunk/doc/xpathxslt.txt	(original)
+++ lxml/trunk/doc/xpathxslt.txt	Sat May  5 19:08:59 2007
@@ -6,10 +6,15 @@
 compliant way.
 
 .. contents::
-.. 
+..
    1  XPath
+     1.1  The ``xpath()`` method
+     1.2  The ``XPath`` class
+     1.3  The ``XPathEvaluator`` classes
+     1.4  ``ETXPath``
    2  XSLT
 
+
 The usual setup procedure::
 
   >>> from lxml import etree
@@ -17,12 +22,17 @@
 
 
 XPath
------
+=====
+
+lxml.etree supports the simple path syntax of the `find, findall and
+findtext`_ methods on ElementTree and Element, as known from the original
+ElementTree library (ElementPath_).  As an lxml-specific extension, these
+classes also provide an ``xpath()`` method that supports expressions in the
+complete XPath syntax, as well as `extension functions`_.
 
-lxml.etree supports the simple path syntax of the ``findall()`` etc.  methods
-on ElementTree and Element, as known from the original ElementTree library.
-As an extension, these classes also provide an ``xpath()`` method that
-supports expressions in the complete XPath syntax.
+.. _ElementPath: http://effbot.org/zone/element-xpath.htm
+.. _`find, findall and findtext`: http://effbot.org/zone/element.htm#searching-for-subelements
+.. _`extension functions`: extensions.html
 
 There are also specialized XPath evaluator classes that are more efficient for
 frequent evaluation: ``XPath`` and ``XPathEvaluator``.  See the `performance
@@ -32,6 +42,10 @@
 
 .. _`performance comparison`: performance.html#xpath
 
+
+The ``xpath()`` method
+----------------------
+
 For ElementTree, the xpath method performs a global XPath query against the
 document (if absolute) or against the root node (if relative)::
 
@@ -48,7 +62,7 @@
   >>> r[0].tag
   'bar'
 
-When ``xpath()`` is used on an element, the XPath expression is evaluated
+When ``xpath()`` is used on an Element, the XPath expression is evaluated
 against the element (if relative) or against the root tree (if absolute)::
 
   >>> root = tree.getroot()
@@ -66,6 +80,19 @@
   >>> r[0].tag
   'bar'
 
+The ``xpath()`` method has support for XPath variables::
+
+  >>> expr = "//*[local-name() = $name]"
+
+  >>> print root.xpath(expr, name = "foo")[0].tag
+  foo
+
+  >>> print root.xpath(expr, name = "bar")[0].tag
+  bar
+
+  >>> print root.xpath("$text", text = "Hello World!")
+  Hello World!
+
 Optionally, you can provide a ``namespaces`` keyword argument, which should be
 a dictionary mapping the namespace prefixes used in the XPath expression to
 namespace URIs::
@@ -102,11 +129,10 @@
 * a (unicode) string, when the XPath expression has a string result.
 
 * a list of items, when the XPath expression has a list as result.  The items
-  may include elements, strings and tuples.  Text nodes and attributes in the
-  result are returned as strings (the text node content or attribute value).
-  Comments are also returned as strings, enclosed by the usual ``<!--`` and
-  ``-->`` markers.  Namespace declarations are returned as tuples of strings:
-  ``(prefix, URI)``.
+  may include elements (also comments and processing instructions), strings
+  and tuples.  Text nodes and attributes in the result are returned as strings
+  (the text node content or attribute value).  Namespace declarations are
+  returned as tuples of strings: ``(prefix, URI)``.
 
 A related convenience method of ElementTree objects is ``getpath(element)``,
 which returns a structural, absolute XPath expression to find that element::
@@ -124,8 +150,111 @@
   True
 
 
+The ``XPath`` class
+-------------------
+
+The ``XPath`` class compiles an XPath expression into a callable function::
+
+  >>> root = etree.XML("<root><a><b/></a><b/></root>")
+
+  >>> find = etree.XPath("//b")
+  >>> print find(root)[0].tag
+  b
+
+The compilation takes as much time as in the ``xpath()`` method, but it is
+done only once per class instantiation.  This makes it especially efficient
+for repeated evaluation of the same XPath expression.
+
+Just like the ``xpath()`` method, the ``XPath`` class supports XPath
+variables::
+
+  >>> count_elements = etree.XPath("count(//*[local-name() = $name])")
+
+  >>> print count_elements(root, name = "a")
+  1.0
+  >>> print count_elements(root, name = "b")
+  2.0
+
+This supports very efficient evaluation of modified versions of an XPath
+expression, as compilation is still only required once.
+
+Prefix-to-namespace mappings can be passed as the second parameter::
+
+  >>> root = etree.XML("<root xmlns='NS'><a><b/></a><b/></root>")
+
+  >>> find = etree.XPath("//n:b", {'n':'NS'})
+  >>> print find(root)[0].tag
+  {NS}b
+
+You can pass the boolean keyword ``regexp`` to enable Python regular
+expressions in the EXSLT_ namespace::
+
+  >>> regexpNS = "http://exslt.org/regular-expressions"
+  >>> find = etree.XPath("//*[r:test(., '^abc$', 'i')]",
+  ...                    {'r':regexpNS}, regexp = True)
+
+  >>> root = etree.XML("<root><a>aB</a><b>aBc</b></root>")
+  >>> print find(root)[0].text
+  aBc
+
+.. _EXSLT: http://www.exslt.org/
+
+
+The ``XPathEvaluator`` classes
+------------------------------
+
+lxml.etree provides two other efficient XPath evaluators that work on
+ElementTrees or Elements respectively: ``XPathDocumentEvaluator`` and
+``XPathElementEvaluator``.  They are automatically selected if you use the
+XPathEvaluator helper for instantiation::
+
+  >>> root = etree.XML("<root><a><b/></a><b/></root>")
+  >>> xpatheval = etree.XPathEvaluator(root)
+
+  >>> print isinstance(xpatheval, etree.XPathElementEvaluator)
+  True
+
+  >>> print xpatheval("//b")[0].tag
+  b
+
+This class provides efficient support for evaluating different XPath
+expressions on the same Element or ElementTree.
+
+
+``ETXPath``
+-----------
+
+ElementTree supports a language named ElementPath_ in its ``find*()`` methods.
+One of the main differences between XPath and ElementPath is that the XPath
+language requires an indirection through prefixes for namespace support,
+whereas ElementTree uses the Clark notation (``{ns}name``) to avoid prefixes
+completely.  The other major difference regards the capabilities of both path
+languages.  Where XPath supports various sophisticated ways of restricting the
+result set through functions and boolean expressions, ElementPath only
+supports pure path traversal without nesting or further conditions.  So, while
+the ElementPath syntax is self-contained and therefore easier to write and
+handle, XPath is much more powerful and expressive.
+
+lxml.etree bridges this gap through the class ``ETXPath``, which accepts XPath
+expressions with namespaces in Clark notation.  It is identical to the
+``XPath`` class, except for the namespace notation.  Normally, you would
+write::
+
+  >>> root = etree.XML("<root xmlns='ns'><a><b/></a><b/></root>")
+
+  >>> find = etree.XPath("//p:b", {'p' : 'ns'})
+  >>> print find(root)[0].tag
+  {ns}b
+
+``ETXPath`` allows you to change this to::
+
+  >>> find = etree.ETXPath("//{ns}b")
+  >>> print find(root)[0].tag
+  {ns}b
+
+
 XSLT
-----
+====
 
 lxml.etree introduces a new class, lxml.etree.XSLT. The class can be
 given an ElementTree object to construct an XSLT transformer::
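
A note on the ``ETXPath`` section added above: the two namespace notations it contrasts can be tried on the ElementPath side with a minimal Python 3 sketch. It assumes only the standard library's ``xml.etree.ElementTree`` (not lxml, so there is no ``xpath()`` or ``ETXPath`` here). The variable names are illustrative and do not come from the committed docs.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the ETXPath section of the diff.
root = ET.fromstring("<root xmlns='ns'><a><b/></a><b/></root>")

# Clark notation: the namespace URI is written inline, no prefix needed.
by_clark = root.findall(".//{ns}b")

# Prefix notation: the prefix 'p' is resolved through an explicit mapping.
by_prefix = root.findall(".//p:b", {"p": "ns"})

# Both queries select the same elements; tags are reported in Clark notation.
print([el.tag for el in by_clark])
print(by_clark == by_prefix)
```

Both calls use ElementPath rather than full XPath, so predicates with functions such as ``local-name()`` are not available in this sketch.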

