[lxml-dev] Many thanks to...
Gilles Lenfant
gilles.lenfant at gmail.com
Tue Jan 29 19:16:50 CET 2008
The great team of lxml developers.
A few days ago I released openxmllib, a Python library that extracts
text and metadata from OpenXML documents (MS Office 2007, Apple iWork,
and some others) for full-text indexing. More features may come in the
future.
http://code.google.com/p/openxmllib/
I got headaches reading and understanding the OpenXML docs.
Fortunately, lxml is so easy to work with and so fast...
The words of a 60-page Word .docx document are now extracted in 0.2
seconds instead of 8 seconds on my MacBook, and I removed 60% of the
code volume when I switched from the standard XML libs that come with
Python 2.4.
lxml rocks and grooves
--
Gilles Lenfant
gilles.lenfant at gmail.com