PyPy
PyPy JIT[overview]
modified Sep 17, 2008 by Armin Rigo

Motivating JIT Compiler Generation

This is a non-technical introduction and motivation for PyPy's approach to Just-In-Time compiler generation.

1   Motivation

1.1   Overview

Writing an interpreter for a complex dynamic language like Python is not a small task. So doing it a high-level language looks like a good idea, because high-level languages have many advantages over low-level ones (flexibility, ease of implementation, no low-level issues...). This is the basic observation behind PyPy.

But coding in a high-level language has other benefits beyond the obvious ones. Perhaps most importantly, it allows the language interpreter to be analyzed and turned into a compiler. This is precisely what our JIT compiler generator does. Based on partial evaluation techniques, it can turn the interpreter of an arbitrary dynamic language into a just-in-time compiler for the same language. It works mostly automatically and only needs guidance by the language implementor in the form of a small number of hints in the source code of the interpreter. The resulting JIT compiler has the same language semantics as the original interpreter by construction. It generates machine code for the user program while aggressively optimizing it, leading to a big performance boost.

1.2   Partial evaluation vs. manually-written JIT compilers

Partial evaluation is a well-known and much-researched topic, considered to be very promising. There have been many attempts to use it to automatically transform an interpreter into a compiler. However, none of them have lead to substantial speedups for real-world languages. We believe that the missing key insight is to use partial evaluation to produce just-in-time compilers, rather than classical ahead-of-time compilers. If this turns out to be correct, the practical speed of dynamic languages could be vastly improved.

By comparison, getting a JIT compiler today involves either manually writing one, which is a huge amount of effort, or reusing an existing well-tuned JIT (like that of the JVM). However, the latter is hard because of concept mismatches between the implementor's language and the host VM language: the former needs to be compiled to the target environment in such a way that the JIT is able to speed it up significantly - an approach which essentially has failed in Python so far: even though CPython is a simple interpreter, its Java and .NET re-implementations are not faster (and cannot run all existing Python applications).

1.3   Practical results

The JIT compilers that we generate use some techniques that are not in widespread use so far, but they are not exactly new either. The point we want to make here is not that we are pushing the theoretical limits of how fast a given dynamic language can be run. Our point is: we are making it practical to have reasonably good Just-In-Time compilers for all dynamic languages, no matter how complicated or non-widespread (e.g. Open Source dynamic languages without large industry or academic support, or internal domain-specific languages). By practical we mean that this should be:

  • Easy: requires little more efforts than writing the interpreter in the first place.
  • Maintainable: our generated JIT compilers are not separate projects (we do not generate separate source code, but only throw-away C code that is compiled into the generated VM). In other words, the whole JIT compiler is regenerated anew every time the high-level interpreter is modified, so that they cannot get out of sync no matter how fast the language evolves.
  • Fast enough: we think that we can get some rather good performance out of the generated JIT compilers. That's the whole point, of course. Previous experience shows that it should be possible (our generated JIT compilers are similar to the hand-written Psyco).

2   Alternative approaches to improve speed

NOTE:Please take the following section as just a statement of opinion. In order to be debated over, the summaries should first be expanded into full arguments. We include them here as links; we are aware of them, even if sometimes pessimistic about them :-)

There are a large number of approaches to improving the execution speed of dynamic programming languages, most of which only produce small improvements and none offer the flexibility and customisability provided by our approach. Over the last 6 years of tweaking, the speed of CPython has only improved by a factor of 1.3 or 1.4 (depending on benchmarks). Many tweaks are applicable to PyPy as well. Indeed, some of the CPython tweaks originated as tweaks for PyPy.

IronPython initially achieved a speed of about 1.8 times that of CPython by leaving out some details of the language and by leveraging the large investment that Microsoft has put into making the .NET platform fast; the current, more complete implementation has roughly the same speed as CPython. In general, the existing approaches have reached the end of the road, speed-wise. Microsoft's Dynamic Language Runtime (DLR), often cited in this context, is essentially only an API to make the techniques pioneered in IronPython official. At best, it will give another small improvement.

Another technique regularly mentioned is adding types to the language in order to speed it up: either explicit optional typing or soft typing (i.e., inferred "likely" types). For Python, all projects in this area have started with a simplified subset of the language; no project has scaled up to anything close to the complete language. This would be a major effort and be platform- and language-specific. Moreover maintenance would be a headache: we believe that many changes that are trivial to implement in CPython, are likely to invalidate previous carefully-tuned optimisations.

For major improvements in speed, JIT techniques are necessary. For Python, Psyco gives typical speedups of 2 to 4 times - up to 100 times in algorithmic examples. It has come to a dead end because of the difficulty and huge costs associated with developing and maintaining it. It has a relatively poor encoding of language semantics - knowledge about Python behavior needs to be encoded by hand and kept up-to-date. At least, Psyco works correctly even when encountering one of the numerous Python constructs it does not support, by falling back to CPython. The PyPy JIT can be seen as a metaprogrammatic, non-language-specific equivalent of Psyco.

A different kind of prior art are self-hosting JIT compilers such as Jikes. Jikes is a JIT compiler for Java written in Java. It has a poor encoding of language semantics; it would take an enormous amount of work to encode all the details of a Python-like language directly into a JIT compiler. It also has limited portability, which is an issue for Python; it is likely that large parts of the JIT compiler would need retargetting in order to run in a different environment than the intended low-level one.

More recently, several larger projects have started in the JIT area. For instance, Sun Microsystems is investing in JRuby, which aims to use the Java Hotspot JIT to improve the performance of Ruby. However, this requires a lot of hand crafting and will only provide speedups for one language on one platform. Some issues are delicate, e.g., how to remove the overhead of constantly boxing and unboxing, typical in dynamic languages. An advantage compared to PyPy is that there are some hand optimizations that can be performed, that do not fit in the metaprogramming approach. But metaprogramming makes the PyPy JIT reusable for many different languages on many different execution platforms. It is also possible to combine the approaches - we can get substantial speedups using our JIT and then feed the result to Java's Hotspot JIT for further improvement. One of us is even a member of the JSR 292 Expert Group to define additions to the JVM to better support dynamic languages, and is contributing insights from our JIT research, in ways that will also benefit PyPy.

3   Further reading

The basic ideas have been shown to work in our prototype JIT-enabled pypy-c, as described in Status. Further technical documentation is in progress; see the Index page. You can also read more about the Theory of partial evaluation.