[lxml-dev] Leaking tracebacks

Christian Heimes lists at cheimes.de
Fri Nov 14 18:13:18 CET 2008


Hello!

A colleague ran into a memory leak on one of our servers a couple of days 
ago. With the help of Dowser - a CherryPy plugin - we were able to narrow 
the leak down to traceback and frame objects. Some code in Martijn 
Faassen's oaipmh package kept references to an exception traceback.

Today I was able to track down the issue. I strongly suspect lxml to be 
the culprit. Whenever the code reaches lxml with an exception still set 
in sys.exc_info(), the exception leaks. My Dowser patch shows a refcount 
of 4 on the root traceback object, and those references can't be traced 
to any object in the gc.get_objects() list.
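
For anyone who wants to look for the same symptom, here is a rough sketch 
of this kind of check. It is not the actual Dowser patch, the helper names 
live_tracebacks() and capture_traceback() are made up for illustration, 
and it is written for Python 3 rather than the 2.5 setup described here. 
It lists the traceback objects the collector knows about and counts the 
gc-visible referrers of one of them.

```python
import gc
import sys
import types


def live_tracebacks():
    """Return every traceback object the garbage collector currently tracks."""
    # type(...) is used instead of isinstance() so that no attribute access
    # is triggered on unrelated objects while scanning.
    return [obj for obj in gc.get_objects()
            if type(obj) is types.TracebackType]


def capture_traceback():
    """Raise and catch an exception, returning its traceback object."""
    try:
        raise ValueError("boom")
    except ValueError:
        return sys.exc_info()[2]


# Deliberately keep a traceback alive, then look for it.
held = capture_traceback()
found = live_tracebacks()
print("traceback objects alive:", len(found))

# Referrers visible to the gc; references held by untracked (e.g. C-level)
# owners do not show up here, which is the situation described above.
print("gc-visible referrers:", len(gc.get_referrers(held)))
```

If the interpreter's reference count for a traceback is higher than the 
number of referrers found this way, the extra references are held 
somewhere the collector cannot see.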

A minor change to oaipmh.server.XMLTreeServer gets rid of the leaking 
tracebacks:

     def handleException(self, exception):
         if isinstance(exception, error.ErrorBase):
             # REF LEAK FIX: drop the exception state (and its traceback)
             # held by the interpreter before the code reaches lxml.
             sys.exc_clear()
             envelope = self._outputErrors(
                 [(exception.oainame(), str(exception))])
             return envelope
         # unhandled exception, so raise again
         raise
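
To illustrate why a lingering traceback matters, here is a minimal sketch 
under a few assumptions: it targets Python 3 (where sys.exc_clear() no 
longer exists), it does not involve lxml or oaipmh at all, and Payload, 
fail_with_payload() and run() are hypothetical names. It shows that a 
held traceback keeps the locals of every frame in it alive, and that 
releasing the traceback frees them again.

```python
import gc
import sys
import traceback
import weakref


class Payload(object):
    """Stand-in for a large object held in a frame's local variables."""


def fail_with_payload(refs):
    payload = Payload()
    # A weak reference lets us observe the payload without keeping it alive.
    refs.append(weakref.ref(payload))
    raise ValueError("boom")


def run(keep_traceback):
    """Trigger the failure; optionally hold on to the traceback afterwards."""
    refs = []
    held = None
    try:
        fail_with_payload(refs)
    except ValueError:
        if keep_traceback:
            held = sys.exc_info()[2]
    gc.collect()
    return refs[0], held


# Case 1: the traceback is kept, so the failing frame's locals stay alive.
ref, held = run(keep_traceback=True)
alive_while_held = ref() is not None

# Clearing the frames in the traceback and dropping it releases the payload.
traceback.clear_frames(held)
del held
gc.collect()
freed_after_clear = ref() is None

# Case 2: nothing keeps the traceback, so the payload goes away by itself.
ref2, _ = run(keep_traceback=False)
freed_without_hold = ref2() is None

print(alive_while_held, freed_after_clear, freed_without_hold)
```

In the Python 2 code above, sys.exc_clear() plays the role of dropping 
the reference: it resets the interpreter's stored exception state so 
that nothing called afterwards can pick up the traceback.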

We are using lxml 2.1.2 on Python 2.5.2. I think lxml 2.0.7 didn't have 
the problem.

Christian


