[lxml-dev] build & performance issues with 2.0beta2

jholg at gmx.de jholg at gmx.de
Thu Jan 31 16:45:53 CET 2008


Hi there,

sorry to interrupt the testimonial celebrations ;-)
but I'm having some problems with 2.0beta2:

First of all, it no longer builds with gcc 2.95.2. Yes, I know, that is a
mighty old compiler... then again, Cython produces C code, not funky C++ stuff (2.0alpha-r47832 still built without problems). This is the error I get:

$ /apps/pydev/hjoukl/bin/python2.4 setup.py build
Building with Cython 0.9.6.11b.
Building lxml version 2.0.beta2-51091.
running build
[...]
cythoning src/lxml/lxml.etree.pyx to src/lxml/lxml.etree.c
building 'lxml.etree' extension
creating build/temp.solaris-2.8-sun4u-2.4
creating build/temp.solaris-2.8-sun4u-2.4/src
creating build/temp.solaris-2.8-sun4u-2.4/src/lxml
gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I/apps/prod//include -I/apps/prod//include/libxml2 -I/apps/prod/include/libxml2 -I/apps/prod/include -I/apps/pydev/hjoukl/include/python2.4 -c src/lxml/lxml.etree.c -o build/temp.solaris-2.8-sun4u-2.4/src/lxml/lxml.etree.o -w
src/lxml/lxml.etree.c: In function `__pyx_PyInt_AsLongLong':
src/lxml/lxml.etree.c:110165: parse error before `long'
src/lxml/lxml.etree.c:110167: `val' undeclared (first use in this function)
src/lxml/lxml.etree.c:110167: (Each undeclared identifier is reported only once
src/lxml/lxml.etree.c:110167: for each function it appears in.)
src/lxml/lxml.etree.c: In function `__pyx_PyInt_AsUnsignedLongLong':
src/lxml/lxml.etree.c:110185: parse error before `long'
src/lxml/lxml.etree.c:110187: `val' undeclared (first use in this function)
error: command 'gcc' failed with exit status 1
1 pytaf at adevp02 .../lxml-2.0beta2 $


When I switch to gcc 3.4.4, the build succeeds, but the test suite shows failures:

0 lb54320 at adevp02 .../lxml-2.0beta2 $ LD_LIBRARY_PATH=/apps/prod/gcc/3.4.4/lib python2.4 test.py -p -v  '' '!test_schematron_invalid*'
/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/html/__init__.py:22: UserWarning: This version of libxml2 has a known XPath bug. Use it at your own risk.
  _rel_links_xpath = etree.XPath("descendant-or-self::a[@rel]")

TESTED VERSION: 2.0.beta2-51091
    Python:            (2, 4, 4, 'final', 0)
    lxml.etree:        (2, 0, -98, 51091)
    libxml used:       (2, 6, 27)
    libxml compiled:   (2, 6, 27)
    libxslt used:      (1, 1, 20)
    libxslt compiled:  (1, 1, 20)

 855/855 (100.0%): Doctest: xpathxslt.txt
======================================================================
FAIL: Doctest: validation.txt
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod//lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/apps/prod//lib/python2.4/doctest.py", line 2157, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for validation.txt
  File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 0

----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 113, in validation.txt
Failed example:
    dtd = etree.DTD(external_id = docbook) # requires catalog support
Exception raised:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[16]>", line 1, in ?
        dtd = etree.DTD(external_id = docbook) # requires catalog support
      File "dtd.pxi", line 50, in lxml.etree.DTD.__init__
    DTDParseError: failed to load external entity "-//OASIS//DTD DocBook XML V4.2//EN"
----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 116, in validation.txt
Failed example:
    dtd.assertValid(root) # doctest: +ELLIPSIS
Expected:
    Traceback (most recent call last):
    DocumentInvalid: Element article content does not follow the DTD, ...
Got:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[18]>", line 1, in ?
        dtd.assertValid(root) # doctest: +ELLIPSIS
      File "lxml.etree.pyx", line 2375, in lxml.etree._Validator.assertValid
    DocumentInvalid: No declaration for element article, line 1


======================================================================
FAIL: Doctest: validation.txt
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod//lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/apps/prod//lib/python2.4/doctest.py", line 2157, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for validation.txt
  File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 0

----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 113, in validation.txt
Failed example:
    dtd = etree.DTD(external_id = docbook) # requires catalog support
Exception raised:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[16]>", line 1, in ?
        dtd = etree.DTD(external_id = docbook) # requires catalog support
      File "dtd.pxi", line 50, in lxml.etree.DTD.__init__
    DTDParseError: failed to load external entity "-//OASIS//DTD DocBook XML V4.2//EN"
----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 116, in validation.txt
Failed example:
    dtd.assertValid(root) # doctest: +ELLIPSIS
Expected:
    Traceback (most recent call last):
    DocumentInvalid: Element article content does not follow the DTD, ...
Got:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[18]>", line 1, in ?
        dtd.assertValid(root) # doctest: +ELLIPSIS
      File "lxml.etree.pyx", line 2375, in lxml.etree._Validator.assertValid
    DocumentInvalid: No declaration for element article, line 1


======================================================================
FAIL: Doctest: validation.txt
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod//lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/apps/prod//lib/python2.4/doctest.py", line 2157, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for validation.txt
  File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 0

----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 113, in validation.txt
Failed example:
    dtd = etree.DTD(external_id = docbook) # requires catalog support
Exception raised:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[16]>", line 1, in ?
        dtd = etree.DTD(external_id = docbook) # requires catalog support
      File "dtd.pxi", line 50, in lxml.etree.DTD.__init__
    DTDParseError: failed to load external entity "-//OASIS//DTD DocBook XML V4.2//EN"
----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 116, in validation.txt
Failed example:
    dtd.assertValid(root) # doctest: +ELLIPSIS
Expected:
    Traceback (most recent call last):
    DocumentInvalid: Element article content does not follow the DTD, ...
Got:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[18]>", line 1, in ?
        dtd.assertValid(root) # doctest: +ELLIPSIS
      File "lxml.etree.pyx", line 2375, in lxml.etree._Validator.assertValid
    DocumentInvalid: No declaration for element article, line 1


======================================================================
FAIL: Doctest: validation.txt
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod//lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/apps/prod//lib/python2.4/doctest.py", line 2157, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for validation.txt
  File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 0
 
----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 113, in validation.txt
Failed example:
    dtd = etree.DTD(external_id = docbook) # requires catalog support
Exception raised:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[16]>", line 1, in ?
        dtd = etree.DTD(external_id = docbook) # requires catalog support
      File "dtd.pxi", line 50, in lxml.etree.DTD.__init__
    DTDParseError: failed to load external entity "-//OASIS//DTD DocBook XML V4.2//EN"
----------------------------------------------------------------------
File "/data/pydev/DOWNLOADS/LXML/lxml/versions/SVN_CHECKOUTS/TAGS/lxml-2.0beta2/src/lxml/tests/../../../doc/validation.txt", line 116, in validation.txt
Failed example:
    dtd.assertValid(root) # doctest: +ELLIPSIS
Expected:
    Traceback (most recent call last):
    DocumentInvalid: Element article content does not follow the DTD, ...
Got:
    Traceback (most recent call last):
      File "/apps/prod//lib/python2.4/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest validation.txt[18]>", line 1, in ?
        dtd.assertValid(root) # doctest: +ELLIPSIS
      File "lxml.etree.pyx", line 2375, in lxml.etree._Validator.assertValid
    DocumentInvalid: No declaration for element article, line 1
 
 
----------------------------------------------------------------------
Ran 855 tests in 37.860s
 
FAILED (failures=4)
1 lb54320 at adevp02 .../lxml-2.0beta2 $
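
The failing example is marked `# requires catalog support` in validation.txt, so one plausible (unverified) cause is that libxml2 finds no XML catalog that maps the DocBook public ID to a local DTD. The sketch below is only a first diagnostic: it lists the catalog files libxml2 would be expected to consult and reports which of them are missing. The function names are made up for illustration, and the `/etc/xml/catalog` default is an assumption; a libxml2 built under another prefix (such as `/apps/prod` here) may use a different default.

```python
import os


def candidate_catalogs(environ=None, default="/etc/xml/catalog"):
    """Return the catalog files libxml2 is expected to consult.

    libxml2 reads the XML_CATALOG_FILES environment variable as a
    whitespace-separated list. The fallback path is an assumption and
    depends on how libxml2 was configured at build time.
    """
    if environ is None:
        environ = os.environ
    paths = environ.get("XML_CATALOG_FILES", "").split()
    return paths or [default]


def missing_catalogs(paths, exists=os.path.exists):
    """Return the entries of `paths` that do not exist on disk."""
    missing = []
    for path in paths:
        # Entries may be given as file:// URLs; check the plain path.
        local = path[len("file://"):] if path.startswith("file://") else path
        if not exists(local):
            missing.append(path)
    return missing


if __name__ == "__main__":
    catalogs = candidate_catalogs()
    print("catalogs consulted:", catalogs)
    print("missing on disk:   ", missing_catalogs(catalogs))
```

If every listed catalog is missing, the `DTDParseError` for the external ID would be expected regardless of the lxml version; if a catalog exists, the next step would be to check whether it actually contains an entry for the DocBook V4.2 public ID.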


For comparison, here is 2.0alpha (which I also rebuilt with gcc 3.4.4):

2 pytaf at adevp02 .../current $ LD_LIBRARY_PATH=/apps/prod/gcc/3.4.4/lib python2.4 test.py -p -v  '' '!test_schematron_invalid*'
/data/pydev/DOWNLOADS/LXML/lxml/versions/lxml-2.0alpha-r47832/src/lxml/html/__init__.py:22: UserWarning: This version of libxml2 has a known XPath bug. Use it at your own risk.
  _rel_links_xpath = etree.XPath("descendant-or-self::a[@rel]")
 
TESTED VERSION: 2.0.alpha4
    Python:            (2, 4, 4, 'final', 0)
    lxml.etree:        (2, 0, -196, 0)
    libxml used:       (2, 6, 27)
    libxml compiled:   (2, 6, 27)
    libxslt used:      (1, 1, 20)
    libxslt compiled:  (1, 1, 20)
 
 824/824 (100.0%): Doctest: xpathxslt.txt
----------------------------------------------------------------------
Ran 824 tests in 2.698s
 
OK


So performance drops by a factor of more than 10 for me (37.860s for 855 tests versus 2.698s for 824 tests, roughly 14x) on a SPARC Solaris 8 box with python2.4 and gcc 3.4.4.
I haven't looked into the failing tests yet.
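
Because the two runs execute different numbers of tests, it helps to compare both the total time and the time per test. The following is a small sketch that takes the two "Ran N tests in Xs" summary lines from the logs above and computes both ratios; `parse_summary` and `slowdown` are illustrative helper names, not part of lxml or its test runner.

```python
import re

# Matches the unittest summary line, e.g. "Ran 855 tests in 37.860s".
RAN_LINE = re.compile(r"Ran (\d+) tests? in ([\d.]+)s")


def parse_summary(line):
    """Return (test_count, seconds) from a unittest summary line."""
    match = RAN_LINE.search(line)
    if match is None:
        raise ValueError("not a unittest summary line: %r" % line)
    return int(match.group(1)), float(match.group(2))


def slowdown(old_line, new_line):
    """Return (total_ratio, per_test_ratio) of the new run versus the old."""
    old_count, old_seconds = parse_summary(old_line)
    new_count, new_seconds = parse_summary(new_line)
    total_ratio = new_seconds / old_seconds
    per_test_ratio = (new_seconds / new_count) / (old_seconds / old_count)
    return total_ratio, per_test_ratio


total, per_test = slowdown("Ran 824 tests in 2.698s",   # 2.0alpha
                           "Ran 855 tests in 37.860s")  # 2.0beta2
print("total: %.1fx, per test: %.1fx" % (total, per_test))
# prints "total: 14.0x, per test: 13.5x"
```

The per-test figure shows that the 31 additional tests in 2.0beta2 do not account for the difference on their own, although a few very slow new tests could still dominate the total.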

Remark: I currently disable the schematron tests because some of them dump core on my setup.

Holger

