[pypy-svn] r37282 - pypy/dist/pypy/doc/discussion
antocuni at codespeak.net
Wed Jan 24 18:10:48 CET 2007
Author: antocuni
Date: Wed Jan 24 18:10:47 2007
New Revision: 37282
Added:
pypy/dist/pypy/doc/discussion/VM-integration.txt (contents, props changed)
Log:
Some random thoughts on how to integrate PyPy with .NET. Any reviewing
is kindly welcome :-)
Added: pypy/dist/pypy/doc/discussion/VM-integration.txt
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/doc/discussion/VM-integration.txt Wed Jan 24 18:10:47 2007
@@ -0,0 +1,263 @@
+==============================================
+Integration of PyPy with host Virtual Machines
+==============================================
+
+This document is based on the discussion I had with Samuele during the
+Duesseldorf sprint. It's not much more than random thoughts -- to be
+reviewed!
+
+Terminology disclaimer: both PyPy and .NET have the concept of
+"wrapped" or "boxed" objects. To avoid confusion I will use "wrapping"
+on the PyPy side and "boxing" on the .NET side.
+
+General idea
+============
+
+The goal is to find a way to efficiently integrate the PyPy
+interpreter with a hosting environment such as .NET. What we would
+like to do includes, but is not limited to:
+
+ - calling .NET methods and instantiating .NET classes from Python
+
+ - subclassing a .NET class from Python
+
+ - handling native .NET objects as transparently as possible
+
+ - automatically applying obvious Python <--> .NET conversions when
+ crossing the borders (e.g. integers, strings, etc.)
+
+One possible solution is the "proxy" approach, in which we manually
+(un)wrap/(un)box all the objects when they cross the border.
+
+Example
+-------
+
+ ::
+
+ public static int foo(int x) { return x; }
+
+ >>>> from somewhere import foo
+ >>>> print foo(42)
+
+In this case we need to take the intval field of W_IntObject, box it
+to .NET System.Int32, call foo using reflection, then unbox the return
+value and construct a new W_IntObject (or reuse an existing one).
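The round trip described above can be sketched in plain Python. This is only an illustration under stated assumptions: ``BoxedInt32``, ``host_foo`` and ``call_through_proxy`` are hypothetical stand-ins for a boxed ``System.Int32``, for the .NET method ``foo`` invoked through reflection, and for the proxy layer. None of them is an actual PyPy or .NET API.

```python
# Hypothetical stand-ins; these names are assumptions made for illustration.

class W_IntObject:
    """PyPy-side wrapped integer."""
    def __init__(self, intval):
        self.intval = intval


class BoxedInt32:
    """Plays the role of a boxed System.Int32 on the host side."""
    def __init__(self, value):
        self.value = value


def host_foo(boxed_x):
    """Plays the role of the .NET method foo, as seen through reflection."""
    return BoxedInt32(boxed_x.value)


def call_through_proxy(host_method, w_arg):
    boxed_arg = BoxedInt32(w_arg.intval)    # unwrap, then box
    boxed_res = host_method(boxed_arg)      # cross the border and come back
    return W_IntObject(boxed_res.value)     # unbox, then wrap again


w_result = call_through_proxy(host_foo, W_IntObject(42))
```

Every call pays for two allocations on the way in and out, which is the cost the approach in the next section tries to avoid.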
+
+The other approach
+------------------
+
+The general idea for solving this problem is to split the
+"stateful" and "behavioral" parts of wrapped objects, and use already
+boxed values for storing the state.
+
+This way when we cross the Python --> .NET border we can just throw
+away the behavioral part; when crossing .NET --> Python we have to
+find the correct behavioral part for that kind of boxed object and
+reconstruct the pair.
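A minimal sketch of the two border crossings, assuming a lookup table keyed by the type of the boxed value. The table ``BEHAVIOUR_FOR``, the helpers ``to_host`` and ``from_host``, and the placeholder classes are all hypothetical names introduced here; the document does not say how the behavioral part is actually found.

```python
# Hypothetical model: a pair is represented as a (state, behaviour) tuple.

class BoxedInt32:
    def __init__(self, value):
        self.value = value


class BoxedString:
    def __init__(self, value):
        self.value = value


class IntBehaviour:
    """Stateless behavioral part for integers."""


class StrBehaviour:
    """Stateless behavioral part for strings."""


# One shared behaviour instance per kind of boxed object (an assumption).
BEHAVIOUR_FOR = {
    BoxedInt32: IntBehaviour(),
    BoxedString: StrBehaviour(),
}


def to_host(pair):
    """Python --> .NET: just throw away the behavioral part."""
    state, _behaviour = pair
    return state


def from_host(boxed):
    """.NET --> Python: find the behaviour for this kind of boxed object."""
    return (boxed, BEHAVIOUR_FOR[type(boxed)])


boxed = BoxedInt32(42)
pair = from_host(boxed)
round_tripped = to_host(pair)
```

Note that no new boxed object is created in either direction: the state that crosses the border is the very same object.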
+
+
+Split state and behaviour in the flowgraphs
+===========================================
+
+The idea is to write a graph transformation that takes a usual
+ootyped flowgraph and splits the classes and objects we want into a
+stateful part and a behavioral part.
+
+We need to introduce the new ootypesystem type ``Pair``: it acts like
+a Record but it has no identity of its own: the id of the Pair is the
+id of its first member.
+
+ XXX about ``Pair``: I'm not sure this is totally right. It means
+ that an object can change identity simply by changing the value of a
+ field??? Maybe we could add the constraint that the "id" field
+ can't be modified after initialization (but it's not easy to
+ enforce).
+
+ XXX-2 about ``Pair``: how to implement it in the backends? One
+ possibility is to use "struct-like" types if available (as in
+ .NET). But in this case it's hard to implement methods/functions
+ that modify the state of the object (such as __init__, usually). The
+ other possibility is to use a reference type (i.e., a class), but in
+ this case there will be a gap between the RPython identity (in which
+ two Pairs with the same state are indistinguishable) and the .NET
+ identity (in which the two objects will have a different identity,
+ of course).
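Both concerns can be made concrete with a small Python model. It is only a sketch under assumptions: ``Pair`` here is an ordinary class, ``pair_id`` is a hypothetical helper that implements "the id of the Pair is the id of its first member", and Python's ``is`` stands in for .NET reference identity.

```python
class Box:
    """Stand-in for a boxed host value."""
    def __init__(self, value):
        self.value = value


class Pair:
    """Hypothetical model of Pair implemented as a reference type."""
    def __init__(self, state, behaviour):
        self.state = state
        self.behaviour = behaviour


def pair_id(pair):
    # RPython-level identity: the identity of the first member.
    return id(pair.state)


shared_state = Box(41)
p1 = Pair(shared_state, "bhvr")
p2 = Pair(shared_state, "bhvr")

# XXX-2: the two notions of identity disagree.
same_rpython_identity = pair_id(p1) == pair_id(p2)
same_host_identity = p1 is p2

# First XXX: rebinding the first member changes the identity of p1.
p1.state = Box(41)
identity_changed = pair_id(p1) != pair_id(p2)
```

Here ``same_rpython_identity`` is true while ``same_host_identity`` is false, and after the assignment ``identity_changed`` is true even though the stored number is the same.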
+
+Step 1: RPython source code
+---------------------------
+
+ ::
+
+ class W_IntObject:
+     def __init__(self, intval):
+         self.intval = intval
+
+     def foo(self, x):
+         return self.intval + x
+
+ def bar():
+     x = W_IntObject(41)
+     return x.foo(1)
+
+
+Step 2: RTyping
+---------------
+
+For the sake of simplicity, the following examples are sometimes not
+100% accurate (e.g. we directly list the types of methods instead of
+the ootype._meth instances that contain them).
+
+Low level types
+
+ ::
+
+ W_IntObject = Instance(
+ "W_IntObject", # name
+ ootype.OBJECT, # base class
+ {"intval": (Signed, 0)}, # attributes
+ {"foo": Meth([Signed], Signed)} # methods
+ )
+
+
+Prebuilt constants (referred to by name in the flowgraphs)
+
+ ::
+
+ W_IntObject_meta_pbc = (...)
+ W_IntObject.__init__ = (static method pbc - see below for the graph)
+
+
+Flowgraphs
+
+ ::
+
+ bar() {
+ 1. x = new(W_IntObject)
+ 2. oosetfield(x, "meta", W_IntObject_meta_pbc)
+ 3. direct_call(W_IntObject.__init__, x, 41)
+ 4. result = oosend("foo", x, 1)
+ 5. return result
+ }
+
+ W_IntObject.__init__(W_IntObject self, Signed intval) {
+ 1. oosetfield(self, "intval", intval)
+ }
+
+ W_IntObject.foo(W_IntObject self, Signed x) {
+ 1. value = oogetfield(self, "intval")
+ 2. result = int_add(value, x)
+ 3. return result
+ }
+
+Step 3: Transformation
+----------------------
+
+This step is done before the backend plays any role, but it's still
+driven by the backend's needs, because at this point we need a mapping
+that tells us which classes to split and how (i.e., which boxed value
+we want to use).
+
+Let's suppose we want to map W_IntObject.intval to the .NET boxed
+``System.Int32``. This is possible only because W_IntObject contains
+a single field. Note that the "meta" field inherited from
+ootype.OBJECT is special-cased because we know that it will never
+change, so we can store it in the behaviour.
+
+
+Low level types
+
+ ::
+
+ W_IntObject_bhvr = Instance(
+ "W_IntObject_bhvr",
+ ootype.OBJECT,
+ {}, # no more fields!
+ {"foo": Meth([W_IntObject_pair, Signed], Signed)} # the Pair is also explicitly passed
+ )
+
+ W_IntObject_pair = Pair(
+ ("value", (System.Int32, 0)), # (name, (TYPE, default))
+ ("behaviour", (W_IntObject_bhvr, W_IntObject_bhvr_pbc))
+ )
+
+
+Prebuilt constants
+
+ ::
+
+ W_IntObject_meta_pbc = (...)
+ W_IntObject.__init__ = (static method pbc - see below for the graph)
+ W_IntObject_bhvr_pbc = new(W_IntObject_bhvr); W_IntObject_bhvr_pbc.meta = W_IntObject_meta_pbc
+ W_IntObject_value_default = new System.Int32(0)
+
+
+Flowgraphs
+
+ ::
+
+ bar() {
+ 1. x = new(W_IntObject_pair) # the behaviour has been already set because
+ # it's the default value of the field
+
+ 2. # skipped (meta is already set in the W_IntObject_bhvr_pbc)
+
+ 3. direct_call(W_IntObject.__init__, x, 41)
+
+ 4. bhvr = oogetfield(x, "behaviour")
+ result = oosend("foo", bhvr, x, 1) # note that "x" is explicitly passed to foo
+
+ 5. return result
+ }
+
+ W_IntObject.__init__(W_IntObject_pair self, Signed value) {
+ 1. boxed = clibox(value) # boxed is of type System.Int32
+ oosetfield(self, "value", boxed)
+ }
+
+ W_IntObject.foo(W_IntObject_bhvr bhvr, W_IntObject_pair self, Signed x) {
+ 1. boxed = oogetfield(self, "value")
+ value = unbox(boxed, Signed)
+
+ 2. result = int_add(value, x)
+
+ 3. return result
+ }
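The transformed flowgraphs above can be mirrored in ordinary Python, as a rough sketch rather than what the transformation would really emit. ``BoxedInt32`` is a hypothetical stand-in for a boxed ``System.Int32``, ``W_IntObject_init`` stands in for the ``W_IntObject.__init__`` static method, and reading or writing ``.value`` on the box plays the role of the unbox and box operations.

```python
class BoxedInt32:
    """Stand-in for a boxed System.Int32 (an assumption, not a real API)."""
    def __init__(self, value):
        self.value = value


class W_IntObject_bhvr:
    """Behavioral part: no fields, the pair is passed explicitly."""
    def foo(self, pair, x):
        value = pair.value.value      # oogetfield + unbox
        return value + x              # int_add


# Prebuilt constant shared by every W_IntObject_pair.
W_IntObject_bhvr_pbc = W_IntObject_bhvr()


class W_IntObject_pair:
    """Stateful part plus a reference to the shared behaviour."""
    def __init__(self):
        self.value = BoxedInt32(0)               # field default
        self.behaviour = W_IntObject_bhvr_pbc    # field default


def W_IntObject_init(pair, intval):
    pair.value = BoxedInt32(intval)   # box + oosetfield


def bar():
    x = W_IntObject_pair()
    W_IntObject_init(x, 41)
    bhvr = x.behaviour
    return bhvr.foo(x, 1)             # "x" is explicitly passed to foo


result = bar()
```

Since the behaviour object carries no per-instance state, a single prebuilt instance can serve all pairs, and handing ``x.value`` to the host requires no further conversion.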
+
+
+Inheritance
+-----------
+
+Applying the transformation to a whole class (sub)hierarchy is a bit
+more complex. Basically we want to mimic the same hierarchy also on
+the ``Pair``\s, but we have to fight the VM limitations. In .NET for
+example, we can't have "covariant fields"::
+
+ class Base {
+ public Base field;
+ }
+
+ class Derived: Base {
+ public Derived field;
+ }
+
+A solution is to use only one kind of ``Pair``, whose ``value`` and
+``behaviour`` fields have the most precise types that can hold all the
+values needed by the subclasses::
+
+ class W_Object: pass
+ class W_IntObject(W_Object): ...
+ class W_StringObject(W_Object): ...
+
+ ...
+
+ W_Object_pair = Pair(System.Object, W_Object_bhvr)
+
+Where ``System.Object`` is of course the most precise type that can
+hold both ``System.Int32`` and ``System.String``.
+
+This means that the low level type of all the ``W_Object`` subclasses
+will be ``W_Object_pair``, but it also means that we will need to
+insert the appropriate downcasts every time we want to access its
+fields. I'm not sure how much this would impact performance.
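To show where those downcasts would end up, here is a small Python sketch built on assumptions: Python's ``int`` and ``str`` stand in for ``System.Int32`` and ``System.String``, ``downcast`` is a hypothetical helper that models a checked cast, and the methods ``add`` and ``length`` are invented for the example.

```python
def downcast(obj, expected_type):
    """Model of a checked downcast from the most general type."""
    if not isinstance(obj, expected_type):
        raise TypeError("invalid cast to %s" % expected_type.__name__)
    return obj


class W_Object_bhvr:
    pass


class W_IntObject_bhvr(W_Object_bhvr):
    def add(self, pair, x):
        value = downcast(pair.value, int)     # downcast on every access
        return value + x


class W_StringObject_bhvr(W_Object_bhvr):
    def length(self, pair):
        value = downcast(pair.value, str)     # downcast on every access
        return len(value)


class W_Object_pair:
    """The single Pair type: value is only known to be an object."""
    def __init__(self, value, behaviour):
        self.value = value
        self.behaviour = behaviour


w_int = W_Object_pair(41, W_IntObject_bhvr())
w_str = W_Object_pair("hello", W_StringObject_bhvr())

int_result = w_int.behaviour.add(w_int, 1)
str_result = w_str.behaviour.length(w_str)
```

Each method that touches the state starts with a cast, so the cost scales with the number of field accesses rather than with the number of border crossings.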
+
+