[Lxml-checkins] r42642 - lxml/trunk/doc
scoder at codespeak.net
Thu May 3 21:10:50 CEST 2007
Author: scoder
Date: Thu May 3 21:10:48 2007
New Revision: 42642
Modified:
lxml/trunk/doc/FAQ.txt
lxml/trunk/doc/build.txt
Log:
contributing and building
Modified: lxml/trunk/doc/FAQ.txt
==============================================================================
--- lxml/trunk/doc/FAQ.txt (original)
+++ lxml/trunk/doc/FAQ.txt Thu May 3 21:10:48 2007
@@ -17,23 +17,26 @@
1.5 What is the difference between lxml.etree and lxml.objectify?
1.6 Why is my application so slow?
1.7 Why do I get errors about missing UCS4 symbols when installing lxml?
- 2 Bugs
- 2.1 My application crashes! Why does lxml.etree do that?
- 2.2 I think I have found a bug in lxml. What should I do?
- 3 Threading
- 3.1 Can I use threads to concurrently access the lxml API?
- 3.2 Does my program run faster if I use threads?
- 3.3 Would my single-threaded program run faster if I turned off threading?
- 4 Parsing and Serialisation
- 4.1 Why doesn't the ``pretty_print`` option reformat my XML output?
- 4.2 Why can't lxml parse my XML from unicode strings?
- 4.3 What is the difference between str(xslt(doc)) and xslt(doc).write() ?
- 4.4 Why can't I just delete parents or clear the root node in iterparse()?
- 5 XPath and Document Traversal
- 5.1 What are the ``findall()`` and ``xpath()`` methods on Element(Tree)?
- 5.2 Why doesn't ``findall()`` support full XPath expressions?
- 5.3 How can I find out which namespace prefixes are used in a document?
- 5.4 How can I specify a default namespace for XPath expressions?
+ 2 Contributing
+ 2.1 Why is lxml not written in Python?
+ 2.2 How can I contribute?
+ 3 Bugs
+ 3.1 My application crashes! Why does lxml.etree do that?
+ 3.2 I think I have found a bug in lxml. What should I do?
+ 4 Threading
+ 4.1 Can I use threads to concurrently access the lxml API?
+ 4.2 Does my program run faster if I use threads?
+ 4.3 Would my single-threaded program run faster if I turned off threading?
+ 5 Parsing and Serialisation
+ 5.1 Why doesn't the ``pretty_print`` option reformat my XML output?
+ 5.2 Why can't lxml parse my XML from unicode strings?
+ 5.3 What is the difference between str(xslt(doc)) and xslt(doc).write() ?
+ 5.4 Why can't I just delete parents or clear the root node in iterparse()?
+ 6 XPath and Document Traversal
+ 6.1 What are the ``findall()`` and ``xpath()`` methods on Element(Tree)?
+ 6.2 Why doesn't ``findall()`` support full XPath expressions?
+ 6.3 How can I find out which namespace prefixes are used in a document?
+ 6.4 How can I specify a default namespace for XPath expressions?
General Questions
@@ -167,6 +170,64 @@
.. _`build instructions`: build.html
+Contributing
+============
+
+Why is lxml not written in Python?
+----------------------------------
+
+lxml interfaces with two C libraries: libxml2 and libxslt. Accessing them at
+the C-level is required for performance reasons.
+
+To avoid writing plain C-code and caring too much about the details of
+built-in types and reference counting, lxml is written in Pyrex_, a
+Python-like language that is translated into C-code. Chances are that if you
+know Python, you can write code that Pyrex accepts. Again, the C-ish style
+used in the lxml code is just for performance optimisations. If you want to
+contribute, don't bother with the details; a Python implementation of your
+contribution is better than none. And keep in mind that lxml's flexible API
+often favours an implementation of features in pure Python, without bothering
+with C-code at all.
+
+Please contact the `mailing list`_ if you need any help.
+
+.. _Pyrex: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
+
+
+How can I contribute?
+---------------------
+
+Besides enhancing the code, there are a lot of places where you can help the
+project and its user base. You can
+
+* spread the word and write about lxml. Many users (especially new Python
+ users) have not yet heard about lxml, although our user base is constantly
+ growing. If you write your own blog and feel like saying something about
+ lxml, go ahead and do so. If we think your contribution or criticism is
+ valuable to other users, we may even put a link or a quote on the project
+ page.
+
+* provide code examples for the general usage of lxml or specific problems
+ solved with lxml. Readable code is a very good way of showing how a library
+ can be used and what great things you can do with it. Again, if we hear
+ about it, we can set a link on the project page.
+
+* work on the documentation. The web page is generated from a set of ReST_
+ `text files`_. It is meant both as a representative project page for lxml
+ and as a site for documenting lxml's API and usage. If you have questions
+ or an idea how to make it more readable and accessible while you are reading
+ it, please send a comment to the `mailing list`_.
+
+.. _ReST: http://docutils.sourceforge.net/rst.html
+.. _`text files`: http://codespeak.net/svn/lxml/trunk/doc/
+
+* improve the docstrings. lxml uses docstrings to support Python's integrated
+ online ``help()`` function. However, sometimes these are not sufficient to
+ grasp the details of the function in question. If you find such a place,
+ you can try to write up a better description and send it to the `mailing
+ list`_.
+
+
Bugs
====
@@ -176,7 +237,7 @@
One of the goals of lxml is "no segfaults", so if there is no clear warning in
the documentation that you were doing something potentially harmful, you have
found a bug and we would like to hear about it. Please report this bug to the
-mailing list. See the next section on how to do that.
+`mailing list`_. See the next section on how to do that.
I think I have found a bug in lxml. What should I do?
Modified: lxml/trunk/doc/build.txt
==============================================================================
--- lxml/trunk/doc/build.txt (original)
+++ lxml/trunk/doc/build.txt Thu May 3 21:10:48 2007
@@ -2,8 +2,10 @@
=============================
To build lxml from source, you need libxml2 and libxslt properly installed,
-including header files (possibly shipped in -dev packages). The build process
-also requires setuptools_.
+*including the header files*. These are likely shipped in separate ``-dev``
+or ``-devel`` packages like ``libxml2-dev``, which you need to install. The
+build process also requires setuptools_. The lxml source distribution comes
+with a script called ``ez_setup.py`` that can be used to install them.
.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
@@ -34,18 +36,22 @@
Newer versions of lxml depend on features and bug fixes that are not yet
available in an official Pyrex release. This includes support for the
- external C-API of lxml, for Python 2.5 and for 64 bit architectures.
+ external C-API of lxml.etree, for Python 2.5 and for 64 bit architectures.
To build lxml 1.1 and later from non-release or modified sources, you must
- therefore install an updated Pyrex version from here:
+ therefore use an updated Pyrex version from here:
http://codespeak.net/svn/lxml/pyrex/
- Since version 1.1.2, the lxml source distribution includes this Pyrex
- version. It will be used if the 'pyrex' directory is available in the lxml
- root directory. If you install from SVN or delete this directory from the
- unpacked distribution directory, the normally installed Pyrex version will
- be used.
+ A subversion checkout of lxml will automatically retrieve the latest Pyrex
+ as external project source (``svn:externals``). Look out for the ``Pyrex``
+ directory in the source tree.
+
+ Since version 1.1.2, the lxml source distribution also includes this Pyrex
+ version. It will be used if the ``Pyrex`` directory is available in the
+ lxml root directory. If you install from SVN or delete this directory from
+ the unpacked distribution directory, the normally installed Pyrex version
+ will be used.
* lxml 1.0 and earlier
@@ -86,6 +92,10 @@
python setup.py build
+or::
+
+ python setup.py bdist_egg
+
If you want to test lxml from the source directory, it is better to build it
in-place like this::
@@ -96,15 +106,24 @@
make
If you get errors about missing header files (e.g., ``libxml/xmlversion.h``)
-then you need to add the location of that file to the include path like::
+then you need to make sure the development packages of libxml2 and libxslt are
+properly installed. If this doesn't help, you may have to add the location of
+the header files to the include path like::
- python setup.py build_ext -i -I /usr/include/libxml2
+ python setup.py build_ext -i -I /usr/include/libxml2
where the file is in ``/usr/include/libxml2/libxml/xmlversion.h``
To use lxml.etree in-place, you can place lxml's ``src`` directory on your
Python module search path (PYTHONPATH) and then import ``lxml.etree`` to play
-with it.
+with it::
+
+ # cd lxml
+ # PYTHONPATH=src python
+ Python 2.5.1
+ Type "help", "copyright", "credits" or "license" for more information.
+ >>> from lxml import etree
+ >>>
To recompile after changes, note that you may have to run ``make clean`` or
delete the file ``src/lxml/etree.c``. Distutils do not automatically pick up
@@ -125,8 +144,8 @@
make test
-To run the ElementTree and cElementTree compatibility tests, make sure
-you have lxml on your PYTHONPATH first, then run::
+This also runs the ElementTree and cElementTree compatibility tests. To call
+them separately, make sure you have lxml on your PYTHONPATH first, then run::
python selftest.py
@@ -147,15 +166,16 @@
This is the procedure to make an lxml egg for your platform:
-* download the lxml-x.y.tar.gz release. This contains the pregenerated C so we
- don't run into any Pyrex issues. Unpack it and cd into it.
+* Download the lxml-x.y.tar.gz release. This contains the pregenerated C so
+ that you don't run into any Pyrex issues. Unpack it and cd into it.
* python setup.py build
-* if you're on a unixy platform, cd into build/lib.your.platform and
- strip any .so file you find there. This reduces the size of the egg.
+* If you're on a unixy platform, cd into ``build/lib.your.platform`` and strip
+ any ``.so`` file you find there. This reduces the size of the egg
+ considerably.
-* python setup.py bdist_egg upload
+* ``python setup.py bdist_egg upload``
The last 'upload' step only works if you have access to the lxml cheeseshop
entry. If not, you can just make an egg with ``bdist_egg`` and mail it to the