[lxml-dev] thread-related crash when using xslt
Stefan Behnel
stefan_ml at behnel.de
Thu Feb 26 09:24:21 CET 2009
Hi Martijn,
Martijn Faassen wrote:
> Attached is a small tarball that demonstrates code that crashes when the
> code is run in a thread but doesn't crash when it is run stand-alone. I
> isolated the specific XSLT + XML combination that seems to trigger this
> crash. I suspect it has to do with passing an XSLT object to a thread.
I've seen enough of these all over the place to consider this possible. ;)
I'll look into this as soon as I get to it.
I was about to release another beta anyway - the latest changelog has
gotten longer than I expected, and I really love being able to say that
lxml is now fully Py3 compatible. So I'll see if I can get this to work
before putting out a 2.2beta4. The still-future Cython 0.11 has also
matured a lot by now, so it's worth another release.
> I run this with lxml 2.1.5 in Python 2.4, libxml2 2.6.32 and libxslt
> 1.1.24
Just in case, if the crash is related to transformation errors, you might
want to try with 2.2beta3, or even with the trunk, if you also install the
latest trunk Cython (sorry for that).
> By the way, the FAQ implies that passing an XSLT object into a thread
> will slow things down (probably as the XSLT would be re-interpreted). Is
> that still true in the current codebase? I had the impression from
> previous discussions that this would change.
Yes, that has changed: the (ugly) code section that this statement referred
to was killed somewhere in 2.1.x. I removed the paragraph from the FAQ and
also clarified a couple of other things while at it. lxml now even has a
working test case for passing trees along a thread pipeline, so the safety
of threading really has improved a lot lately.
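For illustration, here is a minimal sketch of what passing trees along a
thread pipeline could look like. It is not the actual lxml test case; the
names run_pipeline, parse_stage and annotate_stage are made up for this
example, and it assumes a recent lxml on a Python that ships the queue
module.

```python
import queue
import threading

from lxml import etree


def run_pipeline(xml_documents):
    """Parse trees in one thread, then modify and serialise them in another."""
    parsed = queue.Queue()
    results = []

    def parse_stage():
        # Stage 1: parse each document and hand the tree to the next thread.
        for data in xml_documents:
            parsed.put(etree.fromstring(data))
        parsed.put(None)  # sentinel: no more trees

    def annotate_stage():
        # Stage 2: modify trees that were created in a different thread.
        while True:
            root = parsed.get()
            if root is None:
                break
            root.set("items", str(len(root)))
            results.append(etree.tostring(root))

    threads = [
        threading.Thread(target=parse_stage),
        threading.Thread(target=annotate_stage),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


serialized = run_pipeline([b"<a><b/><b/></a>", b"<a/>"])
```

The sentinel value keeps the second stage simple: it just drains the queue
until the parsing thread signals that it is done.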
It's impressively hard to get these things right. Threads are just plain
evil. Their only excuse in lxml is that XML handling is often I/O-heavy
and can involve major time-consuming operations inside libxml2 and libxslt
(XSLT is really a great candidate for that). So freeing the GIL when we
know we are about to do most of our work outside of the Python interpreter
gets you pretty far.
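As a rough sketch of how user code can benefit from this, the following
compiles a stylesheet once and applies it from several worker threads.
The stylesheet, the sample documents and the helper parse_and_transform
are invented for the example, and concurrent.futures assumes a much newer
Python than the 2.4 mentioned above. How much the threads actually overlap
depends on the work done inside libxml2 and libxslt.

```python
from concurrent.futures import ThreadPoolExecutor

from lxml import etree

# Hypothetical stylesheet: output the sum of all <n> children of <doc>.
XSLT_SOURCE = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/doc">
    <xsl:value-of select="sum(n)"/>
  </xsl:template>
</xsl:stylesheet>
"""

# Compiled once in the main thread, then shared with the workers.
transform = etree.XSLT(etree.fromstring(XSLT_SOURCE))


def parse_and_transform(xml_bytes):
    """Parse and transform inside the worker thread, return the text result."""
    doc = etree.fromstring(xml_bytes)
    return str(transform(doc))


documents = [
    b"<doc><n>1</n><n>2</n></doc>",
    b"<doc><n>10</n><n>20</n></doc>",
    b"<doc><n>5</n></doc>",
]

with ThreadPoolExecutor(max_workers=3) as pool:
    # pool.map() returns results in input order.
    totals = list(pool.map(parse_and_transform, documents))
```

Parsing inside the worker rather than up front means both the parse and the
transformation run in the threads, which is where the time is spent.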
Stefan