[lxml-dev] Huge memory leak in latest 2.0
Stefan Behnel
stefan_ml at behnel.de
Mon Dec 10 00:11:18 CET 2007
Artur Siekielski wrote:
> I'm using the latest 2.0 version from trunk, rev. 49494 (because it supports
> the 'encoding' keyword in HTMLParser). I'm parsing many HTML documents in
> a loop, 100-200 kB each. I have noticed that the memory used by my program
> increases by about 1 MB after each document processed, so after a few
> hundred passes the system is about to hang. Running the same code with
> lxml 1.3.6 doesn't cause such a memory usage increase.
>
> I'm using the following library calls:
>   tree = etree.parse( <opened file>, HTMLParser(encoding=...))
>   etree.tostring(tree)
>   el.xpath(...)
>   getting children and attributes of elements
>
> I'm using libxml2 version 2.6.28.
>
> If anyone knows of a solution or workaround, please let me know.
Hmmm, weird. The problem doesn't result from any change in lxml, just from the
switch to Cython 0.9.6.8+. And I don't even see any obvious problem in the
generated code.
Anyway, here's a patch that seems to make the leak go away on my side. Could
you give it a try?
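For anyone wanting to check whether memory still grows with the patch applied, a rough sketch of a test loop might look like the following. It is not taken from the original report: the helper names (peak_rss, measure_growth, parse_once) and the synthetic document are made up for illustration, and it assumes a Unix system where Python's resource module is available. It exercises the calls listed in the report (etree.parse with HTMLParser(encoding=...), etree.tostring, xpath, children and attributes).

```python
import io
import resource


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(work, passes):
    """Call work() `passes` times and record the peak RSS after each call."""
    readings = []
    for _ in range(passes):
        work()
        readings.append(peak_rss())
    return readings


def parse_once(html_bytes, encoding="utf-8"):
    """Run the calls named in the report on one in-memory document."""
    # Imported lazily so the helpers above can be used without lxml installed.
    from lxml import etree

    parser = etree.HTMLParser(encoding=encoding)
    tree = etree.parse(io.BytesIO(html_bytes), parser)
    etree.tostring(tree)
    for el in tree.xpath("//*"):
        list(el)         # touch the children
        dict(el.attrib)  # touch the attributes


if __name__ == "__main__":
    # Hypothetical document of roughly 150 kB; real input would be the
    # HTML files from the report.
    body = b"<p class='x'>hello <a href='#'>link</a></p>" * 3500
    doc = b"<html><body>" + body + b"</body></html>"
    readings = measure_growth(lambda: parse_once(doc), passes=200)
    print("peak RSS grew by", readings[-1] - readings[0],
          "over", len(readings), "passes")
```

Since ru_maxrss is a high-water mark, it can only show growth, never memory being returned. A figure that keeps climbing steadily across passes would suggest the leak is still there, while one that levels off after the first few passes would suggest it is gone.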
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: leak-fix.patch
Type: text/x-patch
Size: 460 bytes
Desc: not available
Url : http://codespeak.net/pipermail/lxml-dev/attachments/20071210/34f72308/attachment.bin