Scientific and technological objectives of the project & state of the art =============================================================================== Problem to be solved --------------------------- Current language implementations are static and hard to modify by users. Even the implementations of open-source dynamic languages have non-flexible designs crafted by a small group of developers. While designing a good programming language is no easy task, often application developers seek more support and configurability of their language. This is especially the case for the rising number of specialized runtime environments where small memory footprints, concurrency or real-time features are essential. Very-high-level languages (VHLL) offer a highly productive development tool. However, whole-program static compilation is often not suitable, so that VHLL are usually implemented by introducing a bytecode and interpretation level, resulting in a loss of performance. This is in contrast to lower-level languages such as C, which offer performance but greatly decrease productivity. The architectures of current interpreter implementations are a compromise between initial simplicity, Unix-style portability and performance in a single code base. The resulting monolithic (C) code bases are inflexible and expensive to evolve. Early design decisions concerning aspects such as memory management, object model and layout, security, multi-threading model are deeply tangled in the code and cannot be reverted, experimented with or adapted to specific runtime environments. In the long run, user pressure tends to shape interpreters towards performance at the expense of other aspects (simplicity, flexibility), possibly culminating in the introduction of a native 'Just-In-Time' (JIT) compiler, adding a significant amount of code and complexity and further impairing the flexibility. [S03]_ Consequently, application developers are often left with bad choices not only regarding productivity versus performance, but they have to work around hard-wired characteristics built into the languages. Instead they would like to adapt and configure a VHLL to suit their needs while at the same time developing on top of a custom but standardized language offering both productivity and performance. Quantified specific objective ----------------------------------- The aim of the PyPy project is to research and implement an interpreter and runtime environment for the Python language, built on a unique architecture enabling both productivity and performance. The main building block is the concept of an Object Space which cleanly encapsulates computational operations on objects, separated from the bytecode interpreter. A number of features, most of which are hard to implement as application-level libraries, can become part of the language in a manageable and user-configurable way if they are implemented modularly as Object Spaces. For example: * choice of memory and threading model * choice of speed vs. memory trade-offs * distributed/parallel execution (SMP or Networked) * orthogonal persistence * pervasive security support * logic and aspect-oriented programming The second key idea is to write the interpreter and Object Spaces in a VHLL language (Python itself) and recover performance with a separate translation process producing specialized efficient low-level code. The translation is not one-shot: it can be repeated and tailored to weave different aspects into releases with different trade-offs and features, giving the user of the language a real choice. The PyPy project plans to deliver a compliant language implementation passing all unit-tests of the current reference C-implementation that are relevant to the new one. At least 90% of existing unit tests will be directly +applicable.[*]_ Moreover, we will be able to add techniques such as JIT compilation without making the interpreter more complex. Thus we will get speed increases of 2-10 times over the reference C-implementation. We expect algorithmic code to run at least at 50% of the speed of pure, optimized C code. .. [*] The reference C-implementation contains some tests that depend on implementation details. The exact line between a language feature and an implementation detail might at times be hard to draw precisely, but in all cases this only concerns a minority of the tests (less than 10%). Current State of The Art ------------------------------ Interpreters Modularity ++++++++++++++++++++++++ Haskell monadic modular interpreters [LHJ95]_ [H98]_ are a researched attempt at achieving modularity for interpreters. They have not been tried on something as large as Python but in the context of Domain-Specific Languages (DSLs). However, the approach is hard to relate to for programmers accustomed to more conventional Object-Oriented (OO) programming and requires sophisticated partial evaluation to remove the monadic overhead. On the positive side, monad transformers are powerful enough to modularize continuation passing, exceptions and other control flow aspects. Some of the kind of modularity we are interested in for interpreters - subsetting, choice among implementation of basics aspects (memory management,...), feature selection - has some similarity with recent research on OS modularity and software composition, e.g. the Flux Group OSKit project and their composition tool Knit [RFSLE00]_. In its basics, the approach of writing a language interpreter in the language itself (a subset thereof) and then producing a running low-level code version through a translation process has already been taken e.g. for the Scheme (Scheme 48) [K97]_ and the Smalltalk (Squeak) [IKM97]_ languages. These approaches are typically based on translation from high-level source code (or parsed source code) into C. Obtaining in this way an interpreter comparable with the current C Python implementation would be within current state of the art; but note that our approach differs in three respects (as explained in more details in the next section): what we will translate is a dynamically loaded and initialized program state instead of the static source code; the translator works by abstract interpretation (also known as symbolic interpretation), which makes it independent from the actual source or bytecode; and we will put the key modularity and flexibility concerns into the translator. We plan to exploit the gained flexibility of the translator much further. It will enable separation on concerns between the translator, the core interpreter, and the Object Spaces, in the spirit of Aspect Oriented Programming (AOP) as developed in [KLM97]_. AOP separate cross-cutting aspects into separate components. This approach has however not been used on interpreters for large practical languages. JITs and Dynamic Optimisation Complexity ++++++++++++++++++++++++++++++++++++++++++ JIT compilers have been reasonably well studied; an account of their history is given in [A03]_ . But actually writing a JIT compiler for a given language is generally a major task [WAU99]_ . For example the recent Jalapeno Java VM [AFGHS00]_ and its compilers, although being in written in Java, are a complex architecture, requiring fine tuning, using runtime sampling to assess at runtime the dynamic call graph to drive inlining. Different techniques to ease this path have been recently explored: * To implement a dynamic language in a flexible way, it can be written on top of another one, e.g. as a dynamic translator that produces bytecodes for an existing virtual machine. If the virtual machine is already equipped with state-of-the-art JIT compilation, it is possible to leverage its benefits to the new language. This path is not only much shorter than designing a complete custom JIT compiler, but it is also easier to maintain and evolve. This idea is explored for the Self virtual machine in [WAU99]_, building on top of it experimental Java and Smalltalk implementations . As pointed out in that paper, some source language features may not match any of the target virtual machine features. When this issue arises, we are left with the hard problem of refactoring an efficient JIT compiler-based virtual machine. * Using a completely different approach, to make it easier to derive a JIT compiler from an existing C interpreter, DynamoRIO instruments the execution of compiled programs and optimizes them dynamically. It has been extended with specific support for interpreters. With minimal amounts of changes in the source of an interpreter, it can significantly reduce the processor-time interpretative overhead [S03]_ . While this offers a highly competitive gain/effort ratio, performance does not reach the levels of a hand-crafted JIT compiler. The current Python C implementation (CPython) is a straight-forward bytecode interpreter. Psyco [R03]_ is an extension to it implementing a highly experimental [APJ03]_ specializing JIT compiler based on abstract interpretation. In its current incarnation it is also a delicate hand-crafted piece of code, which is hard to maintain and not flexible. But this should not inherently be the case. Moreover it would likely benefit from integration with type-feedback techniques such as those developed for Self [HCU91]_ [HU94]_. Beyond State of The Art ----------------------------- Our architecture is based on Object Spaces and translation. The interpreter's main dispatch loop handles control flow and supporting data structures (frame stack, bytecode objects...); each individual operation on objects is dispatched to the Object Space. Object Spaces, in contrast to monadic interpreters, will allow customization with OO techniques, as they encapsulate the object operation semantics, creation and global state of all objects. As mentioned above, monad transformers can be used to modularize continuation passing, exceptions and other control flow aspects. We expect to encapsulate each of those either in Object Spaces, in the interpreter loop, or as aspects of the translation process (which would then generate for example continuation-passing code). This latter point is in contrast to the related work previously cited: we don't expect to encode a fixed single interpreter in all its details in a subset of our VHLL (Python). We plan to exploit the flexibility and abstraction gained by using the VHLL and -- most importantly -- the indirectness of translation in order to "weave" aspects such as continuation passing, memory management, object layout, threading model etc. at translation time. Many of these aspects are really cross-cutting concerns in the original Aspect Oriented Programming (AOP) sense. E.g. we expect to code addition between Python integers in a high-level way independent of memory management and of boxing issues. In general our approach relates to the seminal AOP ideas of [KLM97]_: the subset of Python in which we express the core interpreter and Object Spaces is the *component language* in the terminology of the paper, while the translator is a *weaver* written with the full dynamism of Python. In summary, we will explore for each feature and extension how to best implement it by separation of concerns: either in OO-customized Object Spaces, or in the core or in modules to be translated, or as a pluggable behavior of the translation itself. The front-end of the translator itself will be innovative in that it is based on abstract (symbolic) interpretation. The translation of code from our Python subset (in particular the source of PyPy itself) will thus be driven by PyPy's own Python interpreter, made symbolic by plugging a custom Object Space. This turns the Object Space operations into the semantics units of translation as well, leveraging the coherence of our architecture and gaining independence from surface issues including the details of the bytecode itself. Moreover, we can use the full dynamism of Python at system definition time, i.e. when PyPy loads, until we reach a "stable-and-ready" state that the translator can freeze into a snapshot containing exactly the features and modules that we need. In previous work, the translator only operated on the static source code. During the translation process, weaving custom aspects will be done mainly as intermediate stages, while the customizable back-end generates low-level code in various styles as needed. For example, it can be in continuation passing style (CPS), or it can target altogether different runtime environments like Java -- providing a cheap way to regenerate a replacement for Jython, the Python interpreter written in Java, without the burden of having to maintain it synchronized with CPython. Another key innovative aspect of our project is to generate the JIT compiler instead of writing it manually. Indeed, although Psyco was hand-written, large parts have a one-to-one correspondence to whole sections of the CPython interpreter. This is uncommon for JIT compilers, but Psyco's good results give ample proof-of-concept. (This property can be explicitly related to the DynamoRIO project cited above, which instruments the interpreter itself.) The one-to-one correspondence between parts of CPython and Psyco is as follows: to each expression in the source of CPython corresponds a more complex expression in Psyco, which does instrumentation and optimizations. Our plan is thus to generate automatically as much of the JIT as possible, by changing the translator to generate instrumenting/optimizing C instructions instead of (or in addition to) the normal C instructions that would be the direct translation. Finally we expect to develop automatic interpreter build tools that allow choices about aspects, predefined or user-supplied modifications and extensions to be made from the language user. ---------------------------------------------- **References** .. [A03] John Aycock, "A Brief History of Just-In-Time", ACM Computing Surveys 35, 2 (June 2003), pp. 97-113. .. file = jit-history.pdf .. [AFGHS00] Matthew Arnold, Stephen J. Fink, David Grove, Michael Hind, and Peter F. Sweeney, "Adaptive optimization in the Jalapeno JVM", In Conference on Object-Oriented Prorgramming and Systems, pp. 47-65, 2000. http://citeseer.nj.nec.com/arnold00adaptive.html .. [APJ03] John Aycock and David Pereira and Georges Jodoin, "UCPy: Reverse-Engineering Python", PyCon DC 2003, 9pp. http://pages.cpsc.ucalgary.ca/~aycock/papers/ucpy.pdf .. file = ucpy-reverse-engineering-python.pdf .. [H98] Paul Hudak. "Modular Domain Specific Languages and Tools". ICSR 98. 1998. http://haskell.org/frp/dsl.ps .. file = dsl.ps.gz .. [HCU91] Urs Hölzle, Craig Chambers, and David Ungar, "Optimizing Dynamically-Typed Object-Oriented Languages with Polymorphic Inline Caches", ECOOP'91 Conference Proceedings, Geneva, 1991. Published as Springer Verlag Lecture Notes in Computer Science 512, Springer Verlag, Berlin, 1991. http://self.sunlabs.com/papers/ecoop91.ps.Z .. file = hlzle91optimizing.ps.gz .. [HU94] Urs Hölzle, David Ungar, "Reconciling Responsiveness with Performance in Pure Object-Oriented Languages", PLDI '94 and OOPSLA '94 http://www.cs.ucsb.edu/oocsb/papers/toplas96.pdf/reconciling-responsiveness-with-performance.pdf .. file = reconciling-responsiveness-with-performance.pdf .. [IKM97] Dan Ingalls, Ted Kaehler, John Maloney, Scott Wallace, and Alan Kay. "Back to the future: The story of Squeak, A practical Smalltalk written in itself." In Proceedings OOPSLA'97, pages 318--326, November 1997. ftp://st.cs.uiuc.edu/Smalltalk/Squeak/docs/OOPSLA.Squeak.html .. file = OOPSLA.Squeak.htm .. [K97] Richard Kelsey, "Pre-Scheme: A Scheme Dialect for Systems Programming".1997. http://citeseer.nj.nec.com/kelsey97prescheme.html .. file = kelsey97prescheme.pdf .. [KLM97] Gregor Kiczales and John Lamping and Anurag Menhdhekar and Chris Maeda and Cristina Lopes and Jean-Marc Loingtier and John Irwin, "Aspect-Oriented Programming", ECOOP'97 http://www2.parc.com/csl/groups/sda/publications/papers/Kiczales-ECOOP97/for-web.pdf .. file = kiczales97aspectoriented.pdf .. [LHJ95] Sheng Liang, Paul Hudak and Mark Jones. "Monad Transformers and Modular Interpreters". 22nd ACM Symposium on Principles of Programming Languages (POPL'95). January 1995. http://java.sun.com/people/sl/papers/popl95.ps.gz .. file = popl95.ps.gz .. [R03] Armin Rigo, http://psyco.sourceforge.net .. file = psycoguide.ps.gz .. [RFSLE00] Alastair Reid, Matthew Flatt, Leigh Stoller, Jay Lepreau, and Eric Eide, "Knit: Component Composition for Systems Software", in Proceedings of the Fourth Symposium on Operating Systems Design and Implementation OSDI'2000, 2000. http://www.cs.utah.edu/flux/papers/knit-osdi00.pdf .. [S03] Gregory Sullivan et. al., "Dynamic Native Optimization of Native Interpreters". IVME 03. 2003. http://www.ai.mit.edu/~gregs/dynamorio.html .. file = ivme03.pdf .. [WAU99] Mario Wolczko, Ole Agesen, David Ungar, "Towards a Universal Implementation Substrate for Object-Oriented Languages",OOPSLA 99 workshop on Simplicity, Performance and Portability in Virtual Machine Design, OOPSLA '99, Denver, CO, Nov 2, 1999. http://research.sun.com/research/kanban/oopsla-vm-wkshp.pdf .. file = oopsla-vm-wkshp.pdf