execnet: Distributed Python deployment and communication

home |  install |  examples |  basic API |  support 

Table Of Contents

Previous topic

Receive file contents from remote SSH account

Next topic

execnet CHANGELOG

API in a nutshell

execnet ad-hoc instantiates local and remote Python interpreters. Each interpreter is accessible through a Gateway which manages code and data communication. Channels allow to exchange data between the local and the remote end. Groups help to manage creation and termination of sub interpreters.

_images/basic1.png

Gateways: bootstrapping Python interpreters

All Gateways are instantiated via a call to makegateway() passing it a gateway specification or URL.

execnet.makegateway(spec)

create and configure a gateway to a Python interpreter. The spec string encodes the target gateway type and configuration information. The general format is:

key1=value1//key2=value2//...

If you leave out the =value part a True value is assumed. Valid types: popen, ssh=hostname, socket=host:port. Valid configuration:

id=<string>     specifies the gateway id
python=<path>   specifies which python interpreter to execute
execmodel=model 'thread', 'eventlet', 'gevent' model for execution
chdir=<path>    specifies to which directory to change
nice=<path>     specifies process priority of new process
env:NAME=value  specifies a remote environment variable setting.

If no spec is given, self.defaultspec is used.

Here is an example which instantiates a simple Python subprocess:

>>> gateway = execnet.makegateway()

gateways allow to remote execute code and exchange data bidirectionally.

examples for valid gateway specifications

  • ssh=wyvern//python=python3.3//chdir=mycache specifies a Python3.3 interpreter on the host wyvern. The remote process will have mycache as its current working directory.
  • ssh=-p 5000 myhost makes execnet pass “-p 5000 myhost” arguments to the underlying ssh client binary, effectively specifying a custom port.
  • popen//python=python2.6//nice=20 specification of a python subprocess using the python2.6 executable which must be discoverable through the system PATH; running with the lowest CPU priority (“nice” level). By default current dir will be the current dir of the instantiator.
  • popen//dont_write_bytecode uses the same executable as the current Python, and also passes the -B flag on startup, which tells Python not write .pyc or .pyo files. Note that this only works under CPython 2.6 and newer.
  • popen//env:NAME=value specifies a subprocess that uses the same interpreter as the one it is initiated from and additionally remotely sets an environment variable NAME to value.
  • popen//execmodel=eventlet specifies a subprocess that uses the same interpreter as the one it is initiated from but will run the other side using eventlet for handling IO and dispatching threads.
  • socket=192.168.1.4:8888 specifies a Python Socket server process that listens on 192.168.1.4:8888``

remote_exec: execute source code remotely

All gateways offer a simple method to execute source code in the instantiated subprocess-interpreter:

Gateway.remote_exec(source)

return channel object and connect it to a remote execution thread where the given source executes.

  • source is a string: execute source string remotely with a channel put into the global namespace.
  • source is a pure function: serialize source and call function with **kwargs, adding a channel object to the keyword arguments.
  • source is a pure module: execute source of module with a channel in its global namespace

In all cases the binding __name__='__channelexec__' will be available in the global namespace of the remotely executing code.

It is allowed to pass a module object as source code in which case it’s source code will be obtained and get sent for remote execution. remote_exec returns a channel object whose symmetric counterpart channel is available to the remotely executing source.

Gateway.reconfigure([py2str_as_py3str=True, py3str_as_py2str=False])

reconfigures the string-coercion behaviour of the gateway

Channels: exchanging data with remote code

A channel object allows to send and receive data between two asynchronously running programs.

Channel.send(item)

sends the given item to the other side of the channel, possibly blocking if the sender queue is full. The item must be a simple python type and will be copied to the other side by value. IOError is raised if the write pipe was prematurely closed.

Channel.receive(timeout)

receive a data item that was sent from the other side. timeout: None [default] blocked waiting. A positive number indicates the number of seconds after which a channel.TimeoutError exception will be raised if no item was received. Note that exceptions from the remotely executing code will be reraised as channel.RemoteError exceptions containing a textual representation of the remote traceback.

Channel.setcallback(callback, endmarker=_NOENDMARKER)

set a callback function for receiving items.

All already queued items will immediately trigger the callback. Afterwards the callback will execute in the receiver thread for each received data item and calls to receive() will raise an error. If an endmarker is specified the callback will eventually be called with the endmarker when the channel closes.

Channel.makefile(mode, proxyclose=False)

return a file-like object. mode can be ‘w’ or ‘r’ for writeable/readable files. if proxyclose is true file.close() will also close the channel.

Channel.close(error)

close down this channel with an optional error message. Note that closing of a channel tied to remote_exec happens automatically at the end of execution and cannot be done explicitely.

Channel.waitclose(timeout)

wait until this channel is closed (or the remote side otherwise signalled that no more data was being sent). The channel may still hold receiveable items, but not receive any more after waitclose() has returned. Exceptions from executing code on the other side are reraised as local channel.RemoteErrors. EOFError is raised if the reading-connection was prematurely closed, which often indicates a dying process. self.TimeoutError is raised after the specified number of seconds (default is None, i.e. wait indefinitely).

Channel.RemoteError = <class 'execnet.gateway_base.RemoteError'>
Channel.TimeoutError = <class 'execnet.gateway_base.TimeoutError'>

Grouped Gateways and robust termination

All created gateway instances are part of a group. If you call execnet.makegateway it actually is forwarded to the execnet.default_group. Group objects are container objects (see group examples) and manage the final termination procedure:

Group.terminate(timeout=None)

trigger exit of member gateways and wait for termination of member gateways and associated subprocesses. After waiting timeout seconds try to to kill local sub processes of popen- and ssh-gateways. Timeout defaults to None meaning open-ended waiting and no kill attempts.

This method is implicitely called for each gateway group at process-exit, using a small timeout. This is fine for interactive sessions or random scripts which you rather like to error out than hang. If you start many processes then you often want to call group.terminate() yourself and specify a larger or not timeout.

threading models: gevent, eventlet, thread

New in version 1.2: (status: experimental!)

execnet supports “thread”, “eventlet” and “gevent” as thread models on each of the two sides. You need to decide which model to use before you create any gateways:

# content of threadmodel.py
import execnet
# locally use "eventlet", remotely use "thread" model
execnet.set_execmodel("eventlet", "thread")
gw = execnet.makegateway()
print (gw)
print (gw.remote_status())
print (gw.remote_exec("channel.send(1)").receive())

You need to have eventlet installed in your environment and then you can execute this little test file:

$ python threadmodel.py
<Gateway id='gw0' receive-live, eventlet model, 0 active channels>
<RInfo 'numchannels=0, numexecuting=0, execmodel=thread'>
1

Note

With python3 you can (as of December 2013) only use the thread model because neither eventlet-0.14.0 nor gevent-1.0 support Python3. When they start to support Python3, execnet will probably just work because it is itself Python3 compatible.

How to execute in the main thread

When the remote side of a gateway uses the ‘thread’ model, execution will preferably run in the main thread. This allows GUI loops or other code to behave correctly. If you, however, start multiple executions concurrently, they will run in non-main threads.

remote_status: get low-level execution info

All gateways offer a simple method to obtain some status information from the remote side.

Gateway.remote_status(source)

return information object about remote execution status.

Calling this method tells you e.g. how many execution tasks are queued, how many are executing and how many channels are active.

rsync: synchronise filesystem with remote

execnet implements a simple efficient rsyncing protocol. Here is a basic example for using RSync:

rsync = execnet.RSync('/tmp/source')
gw = execnet.makegateway()
rsync.add_target(gw, '/tmp/dest')
rsync.send()

And here is API info about the RSync class.

class execnet.RSync(sourcedir, callback=None, verbose=True)

This class allows to send a directory structure (recursively) to one or multiple remote filesystems.

There is limited support for symlinks, which means that symlinks pointing to the sourcetree will be send “as is” while external symlinks will be just copied (regardless of existance of such a path on remote side).

add_target(gateway, destdir, finishedcallback=None, **options)

Adds a remote target specified via a gateway and a remote destination directory.

send(raises=True)

Sends a sourcedir to all added targets. Flag indicates whether to raise an error or return in case of lack of targets

Debugging execnet

By setting the environment variable EXECNET_DEBUG you can configure a tracing mechanism:

EXECNET_DEBUG=1:
 write per-process trace-files to execnet-debug-PID
EXECNET_DEUBG=2:
 perform tracing to stderr (popen-gateway slaves will send this to their instantiator)

cross-interpreter serialization of python objects

New in version 1.1.

Execnet exposes a function pair which you can safely use to store and load values from different Python interpreters (e.g. Python2 and Python3, PyPy and Jython). Here is a basic example:

>>> import execnet
>>> dump = execnet.dumps([1,2,3])
>>> execnet.loads(dump)
[1,2,3]

For more examples see Dumping and loading values across interpreter versions.

execnet.dumps(spec)

return a serialized bytestring of the given obj.

The obj and all contained objects must be of a builtin python type (so nested dicts, sets, etc. are all ok but not user-level instances).

execnet.loads(spec)

return the object as deserialized from the given bytestring.

py2str_as_py3str: if true then string (str) objects previously
dumped on Python2 will be loaded as Python3 strings which really are text objects.
py3str_as_py2str: if true then string (str) objects previously
dumped on Python3 will be loaded as Python2 strings instead of unicode objects.

if the bytestring was dumped with an incompatible protocol version or if the bytestring is corrupted, the execnet.DataFormatError will be raised.