[pypy-svn] r37282 - pypy/dist/pypy/doc/discussion

antocuni at codespeak.net antocuni at codespeak.net
Wed Jan 24 18:10:48 CET 2007


Author: antocuni
Date: Wed Jan 24 18:10:47 2007
New Revision: 37282

Added:
   pypy/dist/pypy/doc/discussion/VM-integration.txt   (contents, props changed)
Log:
Some random thoughts on how to integrate PyPy with .NET. Any reviewing
is kindly welcome :-)



Added: pypy/dist/pypy/doc/discussion/VM-integration.txt
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/doc/discussion/VM-integration.txt	Wed Jan 24 18:10:47 2007
@@ -0,0 +1,263 @@
+==============================================
+Integration of PyPy with host Virtual Machines
+==============================================
+
+This document is based on the discussion I had with Samuele during the
+Duesseldorf sprint. It's not much more than random thoughts -- to be
+reviewed!
+
+Terminology disclaimer: both PyPy and .NET have the concept of
+"wrapped" or "boxed" objects. To avoid confusion I will use "wrapping"
+on the PyPy side and "boxing" on the .NET side.
+
+General idea
+============
+
+The goal is to find a way to efficiently integrate the PyPy
+interpreter with a hosting environment such as .NET. What we would
+like to do includes, but is not limited to:
+
+  - calling .NET methods and instantiating .NET classes from Python
+
+  - subclassing a .NET class from Python
+
+  - handling native .NET objects as transparently as possible
+
+  - automatically applying obvious Python <--> .NET conversions when
+    crossing the borders (e.g. integers, strings, etc.)
+
+One possible solution is the "proxy" approach, in which we manually
+(un)wrap/(un)box all the objects when they cross the border.
+
+Example
+-------
+
+  ::
+
+    public static int foo(int x) { return x; }
+
+    >>>> from somewhere import foo
+    >>>> print foo(42)
+
+In this case we need to take the intval field of W_IntObject, box it
+to .NET System.Int32, call foo using reflection, then unbox the return
+value and construct a new W_IntObject (or reuse an existing one).
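+
+A minimal sketch of the proxy round trip in plain Python, not the actual
+implementation: ``BoxedInt32``, ``host_foo`` and ``call_through_proxy``
+are hypothetical names, and the reflective .NET call is simulated by an
+ordinary function call.
+
```python
class BoxedInt32:
    """Hypothetical stand-in for a boxed .NET System.Int32."""
    def __init__(self, value):
        self.value = value


class W_IntObject:
    """Wrapped interpreter-level integer, as in the example above."""
    def __init__(self, intval):
        self.intval = intval


def host_foo(boxed_x):
    """Stand-in for the .NET method foo: takes and returns boxed values."""
    return BoxedInt32(boxed_x.value)


def call_through_proxy(host_method, w_arg):
    boxed_arg = BoxedInt32(w_arg.intval)     # unwrap, then box
    boxed_result = host_method(boxed_arg)    # simulated reflective call
    return W_IntObject(boxed_result.value)   # unbox, then wrap again


w_result = call_through_proxy(host_foo, W_IntObject(42))
print(w_result.intval)
```
+
+In this sketch every call allocates a box on the way in and a wrapper
+on the way out.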
+
+The other approach
+------------------
+
+The general idea to solve this problem is to split the
+"stateful" and "behavioral" parts of wrapped objects, and use already
+boxed values for storing the state.
+
+This way when we cross the Python --> .NET border we can just throw
+away the behavioral part; when crossing .NET --> Python we have to
+find the correct behavioral part for that kind of boxed object and
+reconstruct the pair.
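+
+The border crossing described above could look roughly like this in
+plain Python. It is only an illustration: the boxed classes, the
+behaviour classes, ``BEHAVIOUR_FOR``, ``to_host`` and ``from_host`` are
+all hypothetical names, and the pair is modelled as a simple tuple.
+
```python
class BoxedInt32:
    """Hypothetical stand-in for a boxed System.Int32."""
    def __init__(self, value):
        self.value = value


class BoxedString:
    """Hypothetical stand-in for a System.String."""
    def __init__(self, value):
        self.value = value


class IntBehaviour:
    def repr(self, boxed):
        return "int(%d)" % boxed.value


class StringBehaviour:
    def repr(self, boxed):
        return "str(%s)" % boxed.value


# One shared, stateless behaviour object per kind of boxed value.
BEHAVIOUR_FOR = {
    BoxedInt32: IntBehaviour(),
    BoxedString: StringBehaviour(),
}


def to_host(pair):
    """Python --> .NET: keep the boxed state, drop the behaviour."""
    boxed, _behaviour = pair
    return boxed


def from_host(boxed):
    """.NET --> Python: look up the behaviour and rebuild the pair."""
    return (boxed, BEHAVIOUR_FOR[type(boxed)])


boxed = BoxedInt32(42)
pair = from_host(boxed)
```
+
+Note that ``to_host`` allocates nothing: the boxed value that crosses
+the border is the same object that was stored in the pair.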
+
+
+Split state and behaviour in the flowgraphs
+===========================================
+
+The idea is to write a graph transformation that takes a usual
+ootyped flowgraph and splits the classes and objects we want into a
+stateful part and a behavioral part.
+
+We need to introduce the new ootypesystem type ``Pair``: it acts like
+a Record but it doesn't have its own identity: the id of the Pair is
+the id of its first member.
+
+  XXX about ``Pair``: I'm not sure this is totally right. It means
+  that an object can change identity simply by changing the value of a
+  field???  Maybe we could add the constraint that the "id" field
+  can't be modified after initialization (but it's not easy to
+  enforce).
+
+  XXX-2 about ``Pair``: how to implement it in the backends? One
+  possibility is to use "struct-like" types if available (as in
+  .NET). But in this case it's hard to implement methods/functions
+  that modify the state of the object (such as __init__, usually). The
+  other possibility is to use a reference type (i.e., a class), but in
+  this case there will be a gap between the RPython identity (in which
+  two Pairs with the same state are indistinguishable) and the .NET
+  identity (in which the two objects will have a different identity,
+  of course).
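+
+Both XXX notes can be illustrated with a small Python model, assuming
+``Pair`` is implemented as a reference type. ``Pair``, ``Boxed`` and
+``is_same`` are hypothetical names; ``is_same`` models the RPython-level
+identity, while Python's ``is`` plays the role of the host VM identity.
+
```python
class Boxed:
    """Hypothetical stand-in for a boxed host value."""
    def __init__(self, value):
        self.value = value


class Pair:
    """Model of the proposed Pair: its identity is that of its first member."""
    def __init__(self, state, behaviour):
        self.state = state          # first member, defines the identity
        self.behaviour = behaviour

    def is_same(self, other):
        # RPython-level identity: compare the first members, not the Pairs.
        return self.state is other.state


shared = Boxed(42)
p1 = Pair(shared, "bhvr")
p2 = Pair(shared, "bhvr")

host_identical = p1 is p2             # host-level identity of the two Pairs
rpython_identical = p1.is_same(p2)    # identity as seen by RPython code

# The concern of the first XXX: rebinding the field changes the identity.
p2.state = Boxed(42)
identical_after_rebind = p1.is_same(p2)
```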
+
+Step 1: RPython source code
+---------------------------
+
+  ::
+
+    class W_IntObject:
+        def __init__(self, intval):
+            self.intval = intval
+    
+        def foo(self, x):
+            return self.intval + x
+
+    def bar():
+        x = W_IntObject(41)
+        return x.foo(1)
+
+
+Step 2: RTyping
+---------------
+
+For the sake of simplicity the following examples are not always 100%
+accurate (e.g. we directly list the types of methods instead of the
+ootype._meth instances that contain them).
+
+Low level types
+
+  ::
+
+    W_IntObject = Instance(
+        "W_IntObject",                   # name
+        ootype.OBJECT,                   # base class
+        {"intval": (Signed, 0)},         # attributes
+        {"foo": Meth([Signed], Signed)}  # methods
+    )
+
+
+Prebuilt constants (referred to by name in the flowgraphs)
+
+  ::
+
+    W_IntObject_meta_pbc = (...)
+    W_IntObject.__init__ = (static method pbc - see below for the graph)
+
+
+Flowgraphs
+
+  ::
+
+    bar() {
+      1.    x = new(W_IntObject)
+      2.    oosetfield(x, "meta", W_IntObject_meta_pbc)
+      3.    direct_call(W_IntObject.__init__, x, 41)
+      4.    result = oosend("foo", x, 1)
+      5.    return result
+    }
+
+    W_IntObject.__init__(W_IntObject self, Signed intval) {
+      1.    oosetfield(self, "intval", intval)
+    }
+
+    W_IntObject.foo(W_IntObject self, Signed x) {
+      1.    value = oogetfield(self, "intval")
+      2.    result = int_add(value, x)
+      3.    return result
+    }
+
+Step 3: Transformation
+----------------------
+
+This step is done before the backend plays any role, but it's still
+driven by the backend's needs, because at this point we want a mapping
+that tells us which classes to split and how (i.e., which boxed value
+we want to use).
+
+Let's suppose we want to map W_IntObject.intval to the .NET boxed
+``System.Int32``. This is possible only because W_IntObject contains
+only one field. Note that the "meta" field inherited from
+ootype.OBJECT is special-cased because we know that it will never
+change, so we can store it in the behaviour.
+
+
+Low level types
+
+  ::
+
+    W_IntObject_bhvr = Instance(
+        "W_IntObject_bhvr",
+        ootype.OBJECT,
+        {},                                               # no more fields!
+        {"foo": Meth([W_IntObject_pair, Signed], Signed)} # the Pair is also explicitly passed
+    )
+
+    W_IntObject_pair = Pair(
+        ("value", (System.Int32, 0)),  # (name, (TYPE, default))
+        ("behaviour", (W_IntObject_bhvr, W_IntObject_bhvr_pbc))
+    )
+
+
+Prebuilt constants
+
+  ::
+
+    W_IntObject_meta_pbc = (...)
+    W_IntObject.__init__ = (static method pbc - see below for the graph)
+    W_IntObject_bhvr_pbc = new(W_IntObject_bhvr); W_IntObject_bhvr_pbc.meta = W_IntObject_meta_pbc
+    W_IntObject_value_default = new System.Int32(0)
+
+
+Flowgraphs
+
+  ::
+
+    bar() {
+      1.    x = new(W_IntObject_pair) # the behaviour has already been set because
+                                      # it's the default value of the field
+
+      2.    # skipped (meta is already set in the W_IntObject_bhvr_pbc)
+
+      3.    direct_call(W_IntObject.__init__, x, 41)
+
+      4.    bhvr = oogetfield(x, "behaviour")
+            result = oosend("foo", bhvr, x, 1) # note that "x" is explicitly passed to foo
+
+      5.    return result
+    }
+
+    W_IntObject.__init__(W_IntObject_pair self, Signed value) {
+      1.    boxed = clibox(value)             # boxed is of type System.Int32
+            oosetfield(self, "value", boxed)
+    }
+
+    W_IntObject.foo(W_IntObject_bhvr bhvr, W_IntObject_pair self, Signed x) {
+      1.    boxed = oogetfield(self, "value")
+            value = unbox(boxed, Signed)
+
+      2.    result = int_add(value, x)
+
+      3.    return result
+    }
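+
+To make the transformed graphs easier to follow, here is a rough
+executable rendering in plain Python. It is only a model: ``BoxedInt32``
+is a hypothetical stand-in for ``System.Int32``, ``clibox`` and ``unbox``
+are ordinary Python functions here, and ``W_IntObject_init`` plays the
+role of the ``W_IntObject.__init__`` static method.
+
```python
class BoxedInt32:
    """Hypothetical stand-in for a boxed System.Int32."""
    def __init__(self, value):
        self.value = value


def clibox(value):
    return BoxedInt32(value)


def unbox(boxed):
    return boxed.value


class W_IntObject_bhvr:
    """Behavioural part: no fields, the pair is passed explicitly."""
    def foo(self, pair, x):
        value = unbox(pair.value)
        return value + x


# Prebuilt constant shared by every pair.
W_IntObject_bhvr_pbc = W_IntObject_bhvr()


class W_IntObject_pair:
    """Stateful part plus a reference to the shared behaviour."""
    def __init__(self):
        self.value = clibox(0)                   # default value of the field
        self.behaviour = W_IntObject_bhvr_pbc    # default value of the field


def W_IntObject_init(pair, value):
    pair.value = clibox(value)


def bar():
    x = W_IntObject_pair()
    W_IntObject_init(x, 41)
    bhvr = x.behaviour
    return bhvr.foo(x, 1)    # "x" is explicitly passed to foo


print(bar())
```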
+
+
+Inheritance
+-----------
+
+Applying the transformation to a whole class (sub)hierarchy is a bit
+more complex. Basically we want to mimic the same hierarchy on the
+``Pair``\s as well, but we have to fight the VM limitations. In .NET,
+for example, we can't have "covariant fields"::
+
+  class Base {
+        public Base field;
+  }
+
+  class Derived: Base {
+        public Derived field;
+  }
+
+A solution is to use only one kind of ``Pair``, whose ``value`` and
+``behaviour`` types are the most precise types that can hold all the
+values needed by the subclasses::
+
+   class W_Object: pass
+   class W_IntObject(W_Object): ...
+   class W_StringObject(W_Object): ...
+
+   ...
+
+   W_Object_pair = Pair(System.Object, W_Object_bhvr)
+
+Where ``System.Object`` is of course the most precise type that can
+hold both ``System.Int32`` and ``System.String``.
+
+This means that the low level type of all the ``W_Object`` subclasses
+will be ``W_Object_pair``, but it also means that we will need to
+insert the appropriate downcasts every time we want to access its
+fields. I'm not sure how much this can impact performance.
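+
+A small Python model of the single-``Pair`` scheme, under the assumption
+that each behaviour downcasts the value before using it. ``downcast``
+and ``describe`` are hypothetical helpers, and Python's ``int``, ``str``
+and ``object`` stand in for ``System.Int32``, ``System.String`` and
+``System.Object``.
+
```python
def downcast(value, expected_type):
    """Model of a checked downcast from System.Object to a precise type."""
    if not isinstance(value, expected_type):
        raise TypeError("cannot cast %r to %s" % (value, expected_type.__name__))
    return value


class W_Object_bhvr:
    def describe(self, pair):
        raise NotImplementedError


class W_IntObject_bhvr(W_Object_bhvr):
    def describe(self, pair):
        value = downcast(pair.value, int)    # int stands in for System.Int32
        return "int:%d" % value


class W_StringObject_bhvr(W_Object_bhvr):
    def describe(self, pair):
        value = downcast(pair.value, str)    # str stands in for System.String
        return "str:%s" % value


class W_Object_pair:
    """The single Pair type: value is only known to be an object."""
    def __init__(self, value, behaviour):
        self.value = value
        self.behaviour = behaviour


def describe(pair):
    return pair.behaviour.describe(pair)


w_int = W_Object_pair(42, W_IntObject_bhvr())
w_str = W_Object_pair("spam", W_StringObject_bhvr())
```
+
+Every field access in a behaviour goes through ``downcast``, which is
+where the possible performance cost mentioned above would come from.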
+
+

