[lxml-dev] Many thanks to...

Stefan Behnel stefan_ml at behnel.de
Wed Jan 30 14:44:36 CET 2008


Hi,

Gilles Lenfant wrote:
> A few days ago I released openxmllib, a Python library that
> extracts text and metadata from OpenXML documents (MS Office 2007,
> Apple iWork, and some others) for full-text indexing. Perhaps
> more features in the future.
> 
> http://code.google.com/p/openxmllib/

Cool, thanks for sharing that.


> Got headaches reading and understanding OpenXML docs. Hopefully, lxml  
> is so easy to work with and so fast...

I assume you meant something like "luckily" rather than "hopefully". :)


> The words of a 60-page Word .docx document are now extracted in 0.2
> seconds instead of 8 seconds on my MacBook

That's a substantial speed-up...


> and I removed 60% of the  
> code volume since I switched from the standard XML libs that come with  
> Python 2.4.

... but I would even expect that to be a much more important gain in the long
term.


> lxml rocks and grooves

:]

Regards,
Stefan

