[lxml-dev] Many thanks to...

Gilles Lenfant gilles.lenfant at gmail.com
Tue Jan 29 19:16:50 CET 2008


The great team of lxml developers.

A few days ago I released openxmllib, a Python library that
extracts text and metadata from OpenXML documents (MS Office 2007,
Apple iWork, and some others) for full-text indexing. More features
may follow in the future.

http://code.google.com/p/openxmllib/

I got headaches reading and understanding the OpenXML docs. Fortunately,
lxml is so easy to work with and so fast...

The words of a 60-page Word .docx document are now extracted in 0.2
seconds instead of 8 seconds on my MacBook, and I removed 60% of the
code volume when I switched from the standard XML libs that come with
Python 2.4.
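
For readers curious what this kind of extraction looks like, here is a
minimal sketch. It is not the openxmllib code: the helper name
docx_words is hypothetical, and it only assumes that a .docx file is a
zip archive whose main body lives in word/document.xml, with the text
held in w:t elements. It falls back to the standard library's
ElementTree when lxml is not installed, so it runs either way.

```python
import zipfile

try:
    from lxml import etree  # the fast path praised in this post
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree

# WordprocessingML main namespace used by .docx body text.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_words(path_or_file):
    """Return the words of the main body of a .docx file (hypothetical helper)."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = etree.fromstring(archive.read("word/document.xml"))
    words = []
    # Every run of visible text is stored in a <w:t> element.
    for node in root.iter("{%s}t" % W_NS):
        if node.text:
            words.extend(node.text.split())
    return words
```

Calling docx_words("report.docx") on a real document would return a
plain list of words ready to hand to an indexer; headers, footnotes and
metadata live in other parts of the archive and are not covered here.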

lxml rocks and grooves
-- 
Gilles Lenfant
gilles.lenfant at gmail.com


