Installing lxml
===============

Requirements
------------

You need Python 2.3 or later.

You need libxml2 and libxslt, in particular:

* libxml2 2.6.16 (newer versions should work). It can be found here:
  http://xmlsoft.org/downloads.html

* libxslt 1.1.12 (newer versions should work). It can be found here:
  http://xmlsoft.org/XSLT/downloads.html

See below for instructions on how to get these for Windows. On MacOS-X
10.4, you can use the installed system libraries and the binary egg
distribution of lxml.

If you want to build lxml from SVN, you also need Pyrex_. If you are
using a released version of lxml, it should come with the generated C
file in the source distribution, so no Pyrex is needed in that case.

.. _Pyrex: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/

See also the notes on building with gcc 4.0 below if you are having
trouble with Pyrex.

If you have read these instructions and still cannot manage to install
lxml, you can check the archives of the `mailing list`_ to see if your
problem is known or otherwise send a mail to the list.

.. _`mailing list`: http://codespeak.net/mailman/listinfo/lxml-dev

Installation
------------

If you have easy_install_, you can run the following as super-user::

    easy_install lxml

.. _easy_install: http://peak.telecommunity.com/DevCenter/EasyInstall

This has been reported to work on Linux, MacOS-X 10.4 and Windows, as
long as libxml2 and libxslt are installed.

To compile and install lxml without easy_install, download the source
tar-ball, unpack it and type::

    python setup.py install

If you do not want to install lxml right away, but first test it from
the source directory, you can build it in-place like this::

    python setup.py build_ext -i

or just::

    make

If you then place lxml's "src" directory on your PYTHONPATH somehow,
you can import lxml.etree and play with it.

Installation on Windows
-----------------------

As always, installation on Windows is different.
If you do not want to go through the hassle of compiling everything by
hand, you can use the binary distribution of libxml2 and libxslt. It is
available here:

http://www.zlatkovic.com/libxml.en.html

Note that you need both libxml2 and libxslt, as well as iconv and zlib.

You can then download a binary version of lxml 0.9 for Python 2.4 from
the following address:

http://carcass.dhs.org/lxml-0.9.win32-py2.4.exe

or the egg distribution from

http://cheeseshop.python.org/pypi/lxml

The egg can directly be installed using easy_install_. Both builds were
kindly contributed by Steve Howe. If they do not work for you, feel
free to report to the mailing list.

Building lxml with gcc 4.0 or Python 2.4
----------------------------------------

Pyrex 0.9.3.1 generates C code that gcc 4.0 does not accept. Pending an
official release of a version of Pyrex that does work with gcc 4.0, the
lxml project currently provides an updated version of Pyrex in its
Subversion repository:

http://codespeak.net/svn/lxml/pyrex/

To install it, you can just download one of the following files:

* http://codespeak.net/svn/lxml/pyrex/dist/Pyrex-0.9.3.1.tar.gz

* http://codespeak.net/svn/lxml/pyrex/dist/Pyrex-0.9.3.1-1.src.rpm

It is based on Pyrex 0.9.3.1 and contains a number of patches that make
lxml compile and appear to work with gcc 4.0. If you use this version,
you can simply skip the rest of the section.

In case you want to apply them yourself, the first one is:

http://codespeak.net/lxml/Pyrex-0.9.3-gcc4.patch

Some Linux distributions such as Fedora Core 4 and Ubuntu Linux may
already have most of this applied. In that case, this smaller patch may
be applicable to make lxml compile properly:

http://codespeak.net/lxml/Pyrex-0.9.3-gcc4-small.patch

It may however actually be that at the time you read this, this extra
patch has been applied by the distributions as well.

You may still encounter the following problem when building the
extension on Python 2.4::

    TypeError: swig_sources() takes exactly 2 arguments (3 given)

To fix this, look for the following line in
Pyrex/Distutils/build_ext.py (around line 35)::

    def swig_sources (self, sources):

and change it to::

    def swig_sources (self, sources, *otherargs):

The above install files have these changes applied. It should do no
harm if you install them instead of the official Pyrex version.

Running the tests and reporting errors
--------------------------------------

The source distribution (tgz) contains a test suite for lxml. You can
run it from the top-level directory::

    python test.py

Note that the test script only tests the in-place build (see
"Installation" above), as it searches the "src" directory. You can use
the following one-step command to trigger an in-place build and test
it::

    make test

To run the ElementTree and cElementTree compatibility tests, make sure
you have lxml on your PYTHONPATH first, then run::

    python selftest.py

and::

    python selftest2.py

If the tests give failures, errors, or worse, segmentation faults, we'd
really like to know. Please contact us on the `mailing list`_, and
please specify the version of lxml, libxml2, libxslt and Python you
were using, as well as your operating system type (Linux, Windows,
MacOS, ...).

.. _`mailing list`: http://codespeak.net/mailman/listinfo/lxml-dev
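
When preparing a report for the list, the version details asked for
above can be gathered with a short script. The following is only a
sketch, not part of lxml: the helper names ``format_version`` and
``collect_versions`` are made up for illustration, and it assumes that
your lxml build exposes the ``LXML_VERSION``, ``LIBXML_VERSION`` and
``LIBXSLT_VERSION`` tuples on ``lxml.etree``. Older releases may not
provide them, in which case the script reports "unknown".

```python
"""Collect version details worth including in a bug report (sketch)."""
import platform


def format_version(version_tuple):
    """Turn a version tuple such as (2, 6, 16) into the string '2.6.16'."""
    return ".".join(str(part) for part in version_tuple)


def collect_versions():
    """Return a dict mapping component names to version strings."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        from lxml import etree
    except ImportError:
        # lxml is not on the path (or failed to load its C libraries).
        info["lxml"] = "not importable"
        return info
    # Assumption: these attributes exist in the installed lxml build.
    for name, attr in (("lxml", "LXML_VERSION"),
                       ("libxml2", "LIBXML_VERSION"),
                       ("libxslt", "LIBXSLT_VERSION")):
        value = getattr(etree, attr, None)
        info[name] = format_version(value) if value else "unknown"
    return info


if __name__ == "__main__":
    for name, value in sorted(collect_versions().items()):
        print("%s: %s" % (name, value))
```

Run it with the same interpreter and PYTHONPATH that you used for the
tests, so that the versions it prints match the build that failed.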