[lxml-dev] gzip compression detection
Stefan Behnel
stefan_ml at behnel.de
Mon Jul 28 20:27:26 CEST 2008
Hi,
Dr R. Sanderson wrote:
> In the documentation it says that lxml can automatically detect and
> process gzipped xml (.gz). Which I'm sure (but haven't tried) works
> when it's parsing from a file with the appropriate extension, but is
> this possible from an in memory string?
>
> My situation: I have a berkeley db based storage system which maintains
> gzipped xml. I currently just use python's gzip module to uncompress
> before sending to lxml, but if I could skip this step I'm sure there'd
> be good performance benefits.
Yes, I recently thought about that, too, mainly in the context of pickling.
http://comments.gmane.org/gmane.comp.gnome.lib.xml.general/14465
It would be something to implement, though, as the support in libxml2 is
restricted to files. Supporting this for in-memory data isn't that hard, but
it would require writing a callback-driven filter for a libxml2 I/O output
buffer: buffer what gets written, compress it, write it out to the next output
buffer. Not hard, but not entirely trivial either.
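
Until something like that exists, the reading side can at least stay entirely in memory at the Python level. A minimal sketch, assuming a current Python 3: the helper names parse_gzipped and parse_gzipped_chunked and the sample record are made up for illustration, and the fallback to the standard library's ElementTree is only there so the snippet runs without lxml installed. The chunked variant feeds decompressed pieces to the parser's feed interface, so the fully decompressed document never has to exist as one string.

```python
import gzip
import zlib

try:
    from lxml import etree  # used when lxml is installed
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in here,
    # since it offers the same fromstring() and XMLParser.feed() calls.
    import xml.etree.ElementTree as etree


def parse_gzipped(blob):
    """Decompress the whole gzip blob in memory, then parse it."""
    return etree.fromstring(gzip.decompress(blob))


def parse_gzipped_chunked(blob, chunk_size=4096):
    """Decompress in chunks and feed each piece to an incremental parser."""
    # wbits = 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
    parser = etree.XMLParser()
    for start in range(0, len(blob), chunk_size):
        data = inflater.decompress(blob[start:start + chunk_size])
        if data:
            parser.feed(data)
    tail = inflater.flush()
    if tail:
        parser.feed(tail)
    return parser.close()  # returns the root element


# Hypothetical stored record, compressed the way the database might hold it.
stored = gzip.compress(b"<record><title>example</title></record>")
root_a = parse_gzipped(stored)
root_b = parse_gzipped_chunked(stored, chunk_size=16)
print(root_a.tag, root_b[0].text)
```

This does not avoid the decompression work itself, it only avoids touching the file system. Whether the chunked form is any faster than the one-shot call would have to be measured on real data.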
Stefan