[lxml-dev] Huge memory leak in latest 2.0

Stefan Behnel stefan_ml at behnel.de
Sat Dec 8 17:20:58 CET 2007


Hi,

Artur Siekielski wrote:
> I'm using the latest 2.0 version from trunk, rev. 49494 (because it
> supports the 'encoding' keyword in HTMLParser). I'm parsing many HTML
> documents in a loop, 100-200 kB each. I have noticed that the memory used
> by my program increases by about 1 MB after each document processed, so
> after a few hundred passes the system is about to hang. Running the same
> code with lxml 1.3.6 doesn't cause such an increase in memory usage.
> 
> I'm using the following library calls:
> tree = etree.parse( <opened file>, HTMLParser(encoding=...))
> etree.tostring(tree)
> el.xpath(...)
> getting children and attributes of elements

Thanks for the report. I can reproduce this with a simple call to the parser.
I'll look into it.
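
For anyone who wants to check their own setup, here is a rough sketch of how
the growth per document could be measured. It is not the actual test case,
only an illustration built from the calls listed in the report. The helper
names (peak_rss, measure_growth, process_document), the "//a" XPath
expression and the default encoding are made up for the example, and the
resource module is only available on Unix-like systems.

```python
import resource


def peak_rss():
    """Peak resident set size of this process (kB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(work, passes=200):
    """Call work() `passes` times and return how much the peak RSS grew."""
    work()  # warm-up pass so one-time allocations are not counted
    before = peak_rss()
    for _ in range(passes):
        work()
    return peak_rss() - before


def process_document(path, encoding="utf-8"):
    """One pass over a document using the calls named in the report.

    The XPath expression and the default encoding are placeholders.
    """
    from lxml import etree  # third-party, imported only when actually used

    parser = etree.HTMLParser(encoding=encoding)
    with open(path, "rb") as f:
        tree = etree.parse(f, parser)
    etree.tostring(tree)
    root = tree.getroot()
    root.xpath("//a")
    for el in root.iter("*"):  # "*" restricts iteration to element nodes
        dict(el.attrib)        # read the attributes
        list(el)               # read the children
```

Calling measure_growth(lambda: process_document("some.html")) with a real
file name under both lxml versions should show whether the peak keeps rising
with the number of passes. Since ru_maxrss is a peak value, it only tells you
that memory grew, not that it was ever released again.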

Stefan

