[lxml-dev] lxml + mod_python: cannot unmarshal code objects in restricted execution mode
David Danier
goliath.mailinglist at gmx.de
Thu Sep 13 18:02:21 CEST 2007
> Everything became much easier and 10 times
> faster, but I've encountered the subject problem.
Same problem here, but with different code and versions:
* Django as web framework
* Apache 2.0.59 and 2.2.4
* lxml 1.3.x (all versions)
* mod_python 3.2.10 and 3.3.1
* libxml2 2.6.28 / libxslt 1.1.20
I think this might have something to do with mod_python fiddling with
__builtins__. At least, googling for the error message told me that
Python switches to restricted mode when that happens (though this might
be only one trigger of many). lxml seems to run callbacks in its own
"sandbox" (or something like it; at least it seems to be a different
environment from the one the outer code runs in), which works fine
unless restricted mode is triggered.
Somehow restricted mode is only mentioned in the docs for RExec
(http://docs.python.org/lib/module-rexec.html), which should not be
available any more, so I don't know what exactly lxml does to run callbacks.
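To make the __builtins__ theory checkable: in CPython 2, a frame counts as restricted when the builtins it sees are not the interpreter's own builtins dictionary. The following is a minimal diagnostic sketch of that comparison, written in current Python syntax (restricted mode itself no longer exists in Python 3). uses_interpreter_builtins is a hypothetical helper name, not part of lxml or mod_python, and it only inspects a function's globals, not the live frame that a callback actually runs in.

```python
import builtins


def uses_interpreter_builtins(func):
    """Return True if func's globals refer to the interpreter's own builtins.

    Hypothetical diagnostic helper: it compares dictionaries by identity,
    which mirrors the idea behind the restricted-mode check but is not the
    check itself.
    """
    found = func.__globals__.get("__builtins__")
    if found is None:
        return False
    # __builtins__ is either the builtins module or that module's __dict__.
    found_dict = found if isinstance(found, dict) else vars(found)
    return found_dict is vars(builtins)


def callback(ctx, *args):
    return len(args)


# A function compiled against a replaced __builtins__ mapping, as a stand-in
# for an environment whose builtins were swapped out.
sandbox = {"__builtins__": {"len": len}}
exec("def sandboxed(ctx, *args):\n    return len(args)", sandbox)

print(uses_interpreter_builtins(callback))
print(uses_interpreter_builtins(sandbox["sandboxed"]))
```

In an ordinary interpreter the first call reports a match and the second does not. Logging the same comparison from inside a callback under mod_python would show whether the builtins really differ there.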
Some further debugging revealed that the "unmarshaling" error only
occurred if all modules used in the callback were loaded before the
callback ran. If I load them inside the callback, the error differs.
Example:
------------8<----------------------------------------------------
# unmarshaling error
from foo import bar
def callback(ctx, *args):
    return bar()
---------------------------------------------------->8------------
------------8<----------------------------------------------------
# other error
def callback(ctx, *args):
    from foo import bar
    return bar()
---------------------------------------------------->8------------
As I don't have the required mod_python configuration set up here, I
can't tell you what the other error is, but I will add this later. (I
think it was some ImportError.)
I did not report this problem because I was not sure which part of the
chain that produces the web pages was responsible. Django fiddles with
__builtins__, too (but removing that didn't help). And perhaps this is
simply a mod_python bug. So I switched to FastCGI, which works well.
But I'm very interested in a better solution. ;-)
For the questions raised by Lee Brown:
> Are you trying to execute this code in a Handler or in a Filter? There's world
> of hidden trouble lurking in Filters because of their re-entrant nature.
I use normal XSLT callbacks. I tried different ways of telling lxml which
callbacks I have (global namespace, callbacks as the "extensions"
parameter for etree.XSLT); none of them worked.
XSLT-sample-snippet:
<xsl:value-of select="py:highlightInline(string(.))"
disable-output-escaping="yes"/>
(Namespace is defined, callback gets called and works fine...until I try
to use the code with mod_python)
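For reference, the two registration methods mentioned above could look roughly like this. It is a minimal sketch, assuming a current lxml. The namespace URI urn:example:py, the stylesheet, and the body of highlight_inline are made up for illustration, since the real highlightInline callback is not shown in this thread. The sketch uses text output instead of disable-output-escaping to keep the result easy to inspect.

```python
from lxml import etree

PY_NS = "urn:example:py"  # hypothetical namespace URI, not from the thread


def highlight_inline(context, text):
    # Stand-in for the real highlighter; lxml passes the XPath evaluation
    # context first, then the XPath arguments.
    return "<em>%s</em>" % text


XSLT_SOURCE = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:py="urn:example:py">
  <xsl:output method="text"/>
  <xsl:template match="/code">
    <xsl:value-of select="py:highlightInline(string(.))"/>
  </xsl:template>
</xsl:stylesheet>
"""


def transform_with_extensions(xml_text):
    # Method 1: pass the callback via the "extensions" parameter,
    # keyed by (namespace URI, function name).
    extensions = {(PY_NS, "highlightInline"): highlight_inline}
    transform = etree.XSLT(etree.XML(XSLT_SOURCE), extensions=extensions)
    return str(transform(etree.XML(xml_text)))


def transform_with_global_namespace(xml_text):
    # Method 2: register the callback in a global function namespace.
    ns = etree.FunctionNamespace(PY_NS)
    ns["highlightInline"] = highlight_inline
    transform = etree.XSLT(etree.XML(XSLT_SOURCE))
    return str(transform(etree.XML(xml_text)))


print(transform_with_extensions("<code>x = 1</code>"))
print(transform_with_global_namespace("<code>x = 1</code>"))
```

Both variants call the same Python function from the stylesheet; they differ only in whether the registration is global or scoped to one etree.XSLT object.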
> Which Apache MPM are you using? If you're using a multiple-process module, you
> might try swithing to a single-process-multiple-thread module to see if this
> behavior changes.
I'm using prefork here, as all threaded MPMs have problems with mod_php.
mod_php might be another source of errors: I read something about failing
DB connections when using mod_php and mod_python together. But I don't
really think disabling mod_php will make a difference here.
Greetings, David Danier