[Lxml-checkins] r42647 - in lxml/trunk: benchmark doc

scoder at codespeak.net scoder at codespeak.net
Thu May 3 22:26:17 CEST 2007


Author: scoder
Date: Thu May  3 22:26:17 2007
New Revision: 42647

Modified:
   lxml/trunk/benchmark/bench_etree.py
   lxml/trunk/doc/performance.txt
Log:
benchmark for indexed child access

Modified: lxml/trunk/benchmark/bench_etree.py
==============================================================================
--- lxml/trunk/benchmark/bench_etree.py	(original)
+++ lxml/trunk/benchmark/bench_etree.py	Thu May  3 22:26:17 2007
@@ -18,6 +18,19 @@
         for child in reversed(root):
             pass
 
+    def bench_first_child(self, root):
+        for i in range(1000):
+            child = root[0]
+
+    def bench_last_child(self, root):
+        for i in range(1000):
+            child = root[-1]
+
+    def bench_middle_child(self, root):
+        pos = len(root) / 2
+        for i in range(1000):
+            child = root[pos]
+
     @with_attributes(True, False)
     @with_text(text=True, utext=True)
     def bench_tostring_utf8(self, root):

Modified: lxml/trunk/doc/performance.txt
==============================================================================
--- lxml/trunk/doc/performance.txt	(original)
+++ lxml/trunk/doc/performance.txt	Thu May  3 22:26:17 2007
@@ -14,7 +14,7 @@
 .. _ElementTree:  http://effbot.org/zone/element-index.htm
 .. _cElementTree: http://effbot.org/zone/celementtree.htm
 
-The statements made here are backed by the benchmark scripts
+The statements made here are backed by the (micro-)benchmark scripts
 `bench_etree.py`_, `bench_xpath.py`_ and `bench_objectify.py`_ that come with
 the lxml source distribution.  The timings cited below compare lxml 1.3 (with
 libxml2 2.6.26) to the ElementTree and cElementTree versions shipped with
@@ -166,6 +166,29 @@
   cET: root_getchildren          (--TR T2)    0.0150 msec/pass
   ET : root_getchildren          (--TR T2)    0.0091 msec/pass
 
+When accessing single children, however, e.g. by index, this handicap is
+negligible::
+
+  lxe: first_child               (--TR T2)    0.2499 msec/pass
+  cET: first_child               (--TR T2)    0.2048 msec/pass
+  ET : first_child               (--TR T2)    0.9291 msec/pass
+
+  lxe: last_child                (--TR T1)    0.2511 msec/pass
+  cET: last_child                (--TR T1)    0.2148 msec/pass
+  ET : last_child                (--TR T1)    0.9191 msec/pass
+
+... unless you add the time to find a child index in a bigger list, as ET and
+cET use Python lists here, which are based on arrays.  The data structure used
+by libxml2 is a linked tree, and thus, a linked list of children::
+
+  lxe: middle_child              (--TR T1)    0.2921 msec/pass
+  cET: middle_child              (--TR T1)    0.2069 msec/pass
+  ET : middle_child              (--TR T1)    0.9291 msec/pass
+
+  lxe: middle_child              (--TR T2)    1.9028 msec/pass
+  cET: middle_child              (--TR T2)    0.2089 msec/pass
+  ET : middle_child              (--TR T2)    0.9360 msec/pass
+
 As opposed to ET, libxml2 has a notion of documents that each element must be
 in.  This results in a major performance difference for creating independent
 Elements that end up in independently created documents::


More information about the lxml-checkins mailing list