[Lxml-checkins] r53695 - in lxml/trunk: . doc

scoder at codespeak.net
Fri Apr 11 19:32:59 CEST 2008


Author: scoder
Date: Fri Apr 11 19:32:55 2008
New Revision: 53695

Modified:
   lxml/trunk/   (props changed)
   lxml/trunk/doc/performance.txt
Log:
 r3932 at delle:  sbehnel | 2008-04-11 15:16:32 +0200
 link to HTML benchmarks


Modified: lxml/trunk/doc/performance.txt
==============================================================================
--- lxml/trunk/doc/performance.txt	(original)
+++ lxml/trunk/doc/performance.txt	Fri Apr 11 19:32:55 2008
@@ -193,6 +193,16 @@
 input documents are not considerably bigger than the output, lxml is
 the clear winner.
 
+Regarding HTML parsing, Ian Bicking has done some `benchmarking on
+lxml's HTML parser`_, comparing it to a number of other well-known HTML
+parsers for Python.  lxml wins this contest by a wide margin.
+To give an idea, the numbers suggest that lxml.html can run a couple
+of parse-serialise cycles in the time that other tools need for
+parsing alone.  The comparison also shows very favourable results
+for lxml regarding memory consumption.
+
+.. _`benchmarking on lxml's HTML parser`: http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/
+
 
 The ElementTree API
 ===================

