[Lxml-checkins] r42647 - in lxml/trunk: benchmark doc
scoder at codespeak.net
scoder at codespeak.net
Thu May 3 22:26:17 CEST 2007
Author: scoder
Date: Thu May 3 22:26:17 2007
New Revision: 42647
Modified:
lxml/trunk/benchmark/bench_etree.py
lxml/trunk/doc/performance.txt
Log:
benchmark for indexed child access
Modified: lxml/trunk/benchmark/bench_etree.py
==============================================================================
--- lxml/trunk/benchmark/bench_etree.py (original)
+++ lxml/trunk/benchmark/bench_etree.py Thu May 3 22:26:17 2007
@@ -18,6 +18,19 @@
for child in reversed(root):
pass
+ def bench_first_child(self, root):
+ for i in range(1000):
+ child = root[0]
+
+ def bench_last_child(self, root):
+ for i in range(1000):
+ child = root[-1]
+
+ def bench_middle_child(self, root):
+ pos = len(root) / 2
+ for i in range(1000):
+ child = root[pos]
+
@with_attributes(True, False)
@with_text(text=True, utext=True)
def bench_tostring_utf8(self, root):
Modified: lxml/trunk/doc/performance.txt
==============================================================================
--- lxml/trunk/doc/performance.txt (original)
+++ lxml/trunk/doc/performance.txt Thu May 3 22:26:17 2007
@@ -14,7 +14,7 @@
.. _ElementTree: http://effbot.org/zone/element-index.htm
.. _cElementTree: http://effbot.org/zone/celementtree.htm
-The statements made here are backed by the benchmark scripts
+The statements made here are backed by the (micro-)benchmark scripts
`bench_etree.py`_, `bench_xpath.py`_ and `bench_objectify.py`_ that come with
the lxml source distribution. The timings cited below compare lxml 1.3 (with
libxml2 2.6.26) to the ElementTree and cElementTree versions shipped with
@@ -166,6 +166,29 @@
cET: root_getchildren (--TR T2) 0.0150 msec/pass
ET : root_getchildren (--TR T2) 0.0091 msec/pass
+When accessing single children, however, e.g. by index, this handicap is
+negligible::
+
+ lxe: first_child (--TR T2) 0.2499 msec/pass
+ cET: first_child (--TR T2) 0.2048 msec/pass
+ ET : first_child (--TR T2) 0.9291 msec/pass
+
+ lxe: last_child (--TR T1) 0.2511 msec/pass
+ cET: last_child (--TR T1) 0.2148 msec/pass
+ ET : last_child (--TR T1) 0.9191 msec/pass
+
+... unless you add the time to find a child index in a bigger list, as ET and
+cET use Python lists here, which are based on arrays. The data structure used
+by libxml2 is a linked tree, and thus, a linked list of children::
+
+ lxe: middle_child (--TR T1) 0.2921 msec/pass
+ cET: middle_child (--TR T1) 0.2069 msec/pass
+ ET : middle_child (--TR T1) 0.9291 msec/pass
+
+ lxe: middle_child (--TR T2) 1.9028 msec/pass
+ cET: middle_child (--TR T2) 0.2089 msec/pass
+ ET : middle_child (--TR T2) 0.9360 msec/pass
+
As opposed to ET, libxml2 has a notion of documents that each element must be
in. This results in a major performance difference for creating independent
Elements that end up in independently created documents::
More information about the lxml-checkins
mailing list