Installing lxml
===============

Requirements
------------

You need libxml2 and libxslt, in particular:

* libxml2 2.6.16 (newer versions should work). It can be found here:
  http://xmlsoft.org/downloads.html

* libxslt 1.1.12 (newer versions should work). It can be found here:
  http://xmlsoft.org/XSLT/downloads.html

You also need Pyrex (0.9.3) to compile the software. The official
homepage can be found here:

* http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/

However, if you have any trouble using it, especially with GCC 4.x,
see "Building lxml with gcc 4.0" below for an updated version.

You also need Python 2.3 or later.

Installation
------------

Type::

  python setup.py install

to compile and install the library.

It's also possible to do this::

  python setup.py build_ext -i

or just::

  make

This will not install lxml, but if you place lxml's "src" directory on
your PYTHONPATH, you can import it and play with it.

Building lxml with gcc 4.0
--------------------------

Pyrex 0.9.3.1 generates C code that gcc 4.0 does not accept. Pending
an official release of a version of Pyrex that does work with gcc 4.0,
the lxml project currently provides an updated version of Pyrex in its
Subversion repository:

  http://codespeak.net/svn/lxml/pyrex/

To install it, you can just download one of the following files:

  http://codespeak.net/svn/lxml/pyrex/dist/Pyrex-0.9.3.1.tar.gz

  http://codespeak.net/svn/lxml/pyrex/dist/Pyrex-0.9.3.1-1.src.rpm

It is based on Pyrex 0.9.3.1 and contains a number of patches that
make lxml compile and appear to work with gcc 4.0. If you use this
version, you can simply skip the rest of this section.

If you would rather apply the patches yourself, the first one is:

  http://codespeak.net/lxml/Pyrex-0.9.3-gcc4.patch

Some Linux distributions such as Fedora Core 4 and Ubuntu Linux may
already have most of this applied.
In that case, this smaller patch may be enough to make lxml compile
properly:

  http://codespeak.net/lxml/Pyrex-0.9.3-gcc4-small.patch

By the time you read this, however, the distributions may have applied
this extra patch as well.

You may still encounter the following problem when building the
extension::

  TypeError: swig_sources() takes exactly 2 arguments (3 given)

To fix this, look for the following line in
Pyrex/Distutils/build_ext.py (around line 35)::

  def swig_sources (self, sources):

and change it to::

  def swig_sources (self, sources, *otherargs):

The downloadable files listed above already have these three changes
applied.

Troubleshooting
---------------

lxml's setup.py uses libxml2's xml2-config to find the installation
path of libxml2. If xml2-config cannot be found or does not work for
some reason, try editing setup.py, changing this::

  # if you want to configure include dir manually, you can do so here,
  # for instance:
  # include_dirs = ['/usr/include/libxml2']
  include_dirs = guess_include_dirs()

into something like this::

  include_dirs = ['/usr/include/libxml2']

If that still doesn't work, try registering the extension in a
different way entirely; there's a commented block of code at the
bottom of setup.py with an example.

If you still have trouble, contact us on the `mailing list`_.

.. _`mailing list`: http://codespeak.net/mailman/listinfo/lxml-dev

Running the tests
-----------------

You can run the main tests by using::

  python test.py

Alternatively, you can use::

  make test

To run the ElementTree and cElementTree compatibility tests, make sure
you have lxml on your PYTHONPATH first, then run::

  python selftest.py

and::

  python selftest2.py

If the tests produce failures, errors, or worse, segmentation faults,
we'd really like to know. Please contact us on the `mailing list`_,
and please specify the versions of libxml2, libxslt and Python you
were using.
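
To collect the version details requested above for a report, something
like the following sketch may help. It is not part of lxml. It assumes
a recent Python 3 interpreter and that the ``xml2-config`` and
``xslt-config`` scripts shipped with libxml2 and libxslt are on your
PATH. The helper names ``tool_version`` and ``collect_versions`` are
made up for this illustration. Note that the config scripts report the
installed development versions, which may differ from the libraries
lxml was actually built against.

```python
import platform
import subprocess


def tool_version(command):
    """Return the output of `<command> --version`, or None if unavailable."""
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # The tool is missing, not executable, or exited with an error.
        return None
    return result.stdout.strip() or None


def collect_versions():
    """Gather the version details to include in a report (hypothetical helper)."""
    return {
        "Python": platform.python_version(),
        "libxml2": tool_version("xml2-config"),
        "libxslt": tool_version("xslt-config"),
    }


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print("%s: %s" % (name, version or "not found"))
```

Run as a script, it prints one line per component. A value of
"not found" means the corresponding config script could not be run,
in which case the version has to be looked up by other means, for
example through your distribution's package manager.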