[lxml-dev] Many thanks to...
Stefan Behnel
stefan_ml at behnel.de
Wed Jan 30 14:44:36 CET 2008
Hi,
Gilles Lenfant wrote:
> A few days ago I released openxmllib, a Python library that
> extracts text and metadata from OpenXML documents (MS Office 2007,
> Apple iWork, and some others) for full-text indexing. More features
> may follow in the future.
>
> http://code.google.com/p/openxmllib/
Cool, thanks for sharing that.
> Got headaches reading and understanding OpenXML docs. Hopefully, lxml
> is so easy to work with and so fast...
I assume you meant something like "luckily" rather than "hopefully". :)
> The words of a 60-page Word .docx document are now extracted in 0.2
> seconds instead of 8 seconds on my MacBook
That's a substantial speed-up...
> and I removed 60% of the
> code volume since I switched from the standard XML libs that come with
> Python 2.4.
... but I would expect that to be an even more important gain in the long
term.
> lxml rocks and grooves
:]
Regards,
Stefan