[lxml-dev] etree.XSLT gets stuck

Stefan Behnel stefan_ml at behnel.de
Thu Jan 31 18:28:02 CET 2008


Hi Dmitri,

Dmitri Fedoruk wrote:
> I'm a very happy lxml user, just in case :)

:)


>>  this has been a suspiciously calm week on the list
> Well, I wanted to write this later, but as I see this line... Yes,
> there is a problem I encounter.

I just knew I shouldn't have said that... :)


> I'm using lxml-2.0beta1 at the moment. Did not upgrade to beta2 as
> I've waited for the release. As I have mentioned before, I use xslt
> for html generation out of xml.
> 
> I have a bunch of templates and several machines with one and the same
> configuration. libxml2-2.6.30, libxslt-1.1.22, python 2.5.1, apache
> 2.2/mod_python 3.3.1, amd64, FreeBSD 6.2 .

Note that libxml2 2.6.30 has a security-relevant bug, which matters if you
cannot control where your XML files come from.


> Every machine runs the same routine when my process is initialised -
> read a template, compile it & store it in the dictionary. The same
> code, the same data. It looks like this:
> 
> xslt_parser = etree.XMLParser(no_network = False, resolve_entities =
> True, load_dtd = True)
> xslt_doc = etree.parse( urllib2.urlopen( xslt_path ), xslt_parser )

If "xslt_path" is a local filename or an HTTP/FTP URL, then there's no need
to call urlopen(), as libxml2 can handle those for you. Just pass the path or
URL right in. That's simpler and should be quite a bit faster (it also frees
the GIL, although that might not help you here).
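
A minimal sketch of what I mean, written in current Python 3 syntax rather
than the 2.5 you are running. The parser options are copied from your snippet.
The stylesheet and the names DEMO_XSL, compile_stylesheet() and demo() are
made up for illustration, and the temporary file only stands in for your
"xslt_path":

```python
import os
import tempfile

from lxml import etree

# Hypothetical minimal stylesheet, only here to make the sketch runnable.
DEMO_XSL = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="/doc/title"/>
  </xsl:template>
</xsl:stylesheet>
"""


def compile_stylesheet(path_or_url):
    """Parse and compile a stylesheet, letting libxml2 open the source."""
    parser = etree.XMLParser(no_network=False, resolve_entities=True,
                             load_dtd=True)
    # The filename (or URL) goes straight to parse(); no urlopen() involved.
    xslt_doc = etree.parse(path_or_url, parser)
    return etree.XSLT(xslt_doc)


def demo():
    """Write the demo stylesheet to a temp file, compile it, and apply it."""
    fd, path = tempfile.mkstemp(suffix=".xsl")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(DEMO_XSL)
        transform = compile_stylesheet(path)
        result = transform(etree.XML("<doc><title>hello</title></doc>"))
        return str(result)
    finally:
        os.remove(path)
```

Calling demo() compiles the stylesheet from the file path and returns the
transformed text.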


> transformations[ xslt_path ] = etree.XSLT(xslt_doc)
> 
> Normally etree.XSLT runs in ~0.05 seconds. Nevertheless, on one
> machine sometimes it gets stuck for 3-5 seconds. I'm sure it's not
> because of the urlopen, as I've measured the compilation time itself.

Hmmm, I wouldn't know where the XSLT() call could hang. I mean, you said you
are using processes, not threads, right? And the stylesheet is always the
exact same one?

When I run 1000 XSLT() calls over one and the same tree, I get almost
predictable numbers, somewhere within 160-210 msecs for a 400KB XSL file with
some 5500 XSL Elements (actually I didn't even know libxslt was *that* fast).
And I don't see any spikes anywhere.
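
If you want to repeat that kind of measurement on the suspicious machine,
something along these lines should do. This is only a rough sketch, not the
script I used. The tiny stylesheet and the names DEMO_XSL,
time_compilations() and summarize() are made up for illustration, so
substitute your real stylesheet tree:

```python
import time

from lxml import etree

# Hypothetical stand-in stylesheet; replace with your real one.
DEMO_XSL = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <out><xsl:value-of select="name(/*)"/></out>
  </xsl:template>
</xsl:stylesheet>
"""


def time_compilations(xslt_doc, runs=1000):
    """Compile the same parsed stylesheet repeatedly, timing each XSLT() call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        etree.XSLT(xslt_doc)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return min, median and max in seconds; a spike shows up in 'max'."""
    ordered = sorted(timings)
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    doc = etree.XML(DEMO_XSL)
    print(summarize(time_compilations(doc)))
```

A "max" that sits far above the median on one machine only would point at
that machine rather than at lxml.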

But you said "on one machine". Is that always the same one? Maybe there's
something wrong with the setup, or some background task is running, or it
swaps, or ...

Stefan

