PyPy
PyPy[faq]
modified Apr 17, 2009 by Armin Rigo

Frequently Asked Questions

General

What is PyPy?

PyPy is both:

  • a reimplementation of Python in Python, and
  • a framework for implementing interpreters and virtual machines for programming languages, especially dynamic languages.

PyPy tries to find new answers about ease of creation, flexibility, maintainability and speed trade-offs for language implementations. For further details see our goal and architecture document .

Is PyPy a drop in replacement for CPython?

Almost!

The mostly likely stumbling block for any given project is support for extension modules. PyPy supports a continually growing number of extension modules, but so far mostly only those found in the standard library.

The language features (including builtin types and functions) are very complete and well tested, so if your project does not use many extension modules there is a good chance that it will work with PyPy.

We list the differences we know about in cpython_differences.

On what platforms does PyPy run?

PyPy is regularly and extensively tested on Linux machines and on Mac OS X and mostly works under Windows too (but is tested there less extensively). PyPy needs a CPython running on the target platform to bootstrap, as cross compilation is not really meant to work yet. At the moment you need CPython 2.4 (with ctypes 0.9.9.6 or newer) or CPython 2.5 for the translation process.

PyPy also basically works in a 64-bit Linux environment.

Which Python version (2.x?) does PyPy implement?

PyPy currently aims to be fully compatible with Python 2.5. That means that it contains the standard library of Python 2.5 and that it supports 2.5 features (such as the with statement).

Do threads work? What are the modules that work?

Operating system-level threads basically work. If you enable the thread module then PyPy will get support for GIL based threading. Note that PyPy also fully supports stackless-like microthreads (although both cannot be mixed yet).

As for other modules: The general rule of thumb is that pure-Python modules work, C extension modules don't. However, many of the C extension modules of the standard library have been re-implemented in pure Python or as a mixed module (for some there were also older pure-Python versions available). The list of supported modules can be found as follows:

  • pure Python implementations: all the modules found in the pypy/lib/ directory
  • mixed module implementations: all the modules listed in essential_modules, default_modules and working_modules in pypy/config/pypyoption.py.

Can I use CPython extension modules?

No and there are no short-term plans to support this. CPython extension modules rely heavily on CPython's C API which contains a lot of implementation details like reference counting, exact C-level object implementation and layout etc.

A module based on ctypes is a different matter -- we support these nowadays.

The long-term answer might be different. In principle, it should be possible for PyPy to support the CPython C API (or at least a large subset of its official part). It means that "in the fullness of time" you might be able to simply recompile existing CPython extension modules and use them with PyPy.

How fast is PyPy?

The answer to this question depends on how the PyPy interpreter is run. When optimized and translated to C, the PyPy interpreter runs somewhere between 1 and 2 times slower than CPython. If the PyPy interpreter is run on top of CPython, i.e., without translation of any kind, then it is slower by a factor of 2000. Obtaining good measurements for the performance when run on the CLI or JVM is difficult, but it is safe to say that PyPy currently runs slower than IronPython (CLI) or Jython (JVM) for most tests.

The work on the Just-In-Time compiler is currently at its fifth revision, but it looks like this revision is finally going to make it. It is work in progress; see the JIT documentation.

Can PyPy support intepreters for other languages beyond Python?

The toolsuite that translates the PyPy interpreter is quite general and can be used to create optimized versions of interpreters for any language, not just Python. Of course, these interpreters can make use of the same features that PyPy brings to Python: translation to various languages, stackless features, garbage collection, implementation of various things like arbitrarily long integers, etc.

Currently, we have preliminary versions of a JavaScript interpreter (Leonardo Santagada as his Summer of PyPy project), a Prolog interpreter (Carl Friedrich Bolz as his Bachelor thesis), and a SmallTalk interpreter (produced during a sprint). All of them are unfinised at the moment.

Development

How do I get into PyPy development? Can I come to sprints?

Sure you can come to sprints! We always welcome newcomers and try to help them get started in the project as much as possible (e.g. by providing tutorials and pairing them with experienced PyPy developers). Newcomers should have some Python experience and read some of the PyPy documentation before coming to a sprint.

Coming to a sprint is usually also the best way to get into PyPy development. If you want to start on your own, take a look at the list of project suggestions. If you get stuck or need advice, contact us. Usually IRC is the most immediate way to get feedback (at least during some parts of the day; many PyPy developers are in Europe) and the mailing list is better for long discussions.

I am getting strange errors while playing with PyPy, what should I do?

It seems that a lot of strange, unexplainable problems can be magically solved by removing all the *.pyc files from the PyPy source tree (the script py.cleanup from py/bin will do that for you). Another thing you can do is removing the directory pypy/_cache completely. If the error is persistent and still annoys you after this treatment please send us a bug report (or even better, a fix :-)

PyPy translation tool chain

Can PyPy compile normal Python programs?

No, PyPy is not a Python compiler.

In Python, it is mostly impossible to prove anything about the types that a program will manipulate by doing a static analysis. It should be clear if you are familiar with Python, but if in doubt see [BRETT].

What could be attempted is static "soft typing", where you would use a whole bunch of heuristics to guess what types are probably going to show up where. In this way, you could compile the program into two copies of itself: a "fast" version and a "slow" version. The former would contain many guards that allow it to fall back to the latter if needed. That would be a wholly different project than PyPy, though. (As far as we understand it, this is the approach that the LLVM group would like to see LLVM used for, so if you feel like working very hard and attempting something like this, check with them.)

What PyPy contains is, on the one hand, an non-soft static type inferencer for RPython, which is a sublanguage that we defined just so that it's possible and not too hard to do that; and on the other hand, for the full Python language, we have an interpreter, and a JIT generator which can produce a Just-In-Time Compiler from the interpreter. The resulting JIT works for the full Python language in a way that doesn't need type inference at all.

For more motivation and details about our approach see also [D05.1], section 3.

[BRETT]Brett Cannon, Localized Type Inference of Atomic Types in Python, http://www.ocf.berkeley.edu/~bac/thesis.pdf
[D05.1]Compiling Dynamic Language Implementations, Report from the PyPy project to the E.U., http://codespeak.net/svn/pypy/extradoc/eu-report/D05.1_Publish_on_translating_a_very-high-level_description.pdf

What is this RPython language?

RPython is a restricted subset of the Python language. It is used for implementing dynamic language interpreters within the PyPy framework. The restrictions are to ensure that type inference (and so, ultimately, translation to other languages) of RPython programs is possible. These restrictions only apply after the full import happens, so at import time arbitrary Python code can be executed.

The property of "being RPython" always applies to a full program, not to single functions or modules (the translation tool chain does a full program analysis). "Full program" in the context of "being RPython" is all the code reachable from an "entry point" function. The translation toolchain follows all calls recursively and discovers what belongs to the program and what not.

The restrictions that apply to programs to be RPython mostly limit the ability of mixing types in arbitrary ways. RPython does not allow the usage of two different types in the same variable. In this respect (and in some others) it feels a bit like Java. Other features not allowed in RPython are the usage of special methods (__xxx__) except __init__ and __del__, and the usage of reflection capabilities (e.g. __dict__).

Most existing standard library modules are not RPython, except for some functions in os, math and time that are natively supported. In general it is quite unlikely that an existing Python program is by chance RPython; it is most likely that it would have to be heavily rewritten. To read more about the RPython limitations read the RPython description.

Does RPython have anything to do with Zope's Restricted Python?

No. Zope's RestrictedPython aims to provide a sandboxed execution environment for CPython. PyPy's RPython is the implementation language for dynamic language interpreters. However, PyPy also provides a robust sandboxed Python Interpreter.

Can I use PyPy and RPython to compile smaller parts of my Python program?

No. That would be possible, and we played with early attempts in that direction, but there are many delicate issues: for example, how the compiled and the non-compiled parts exchange data. Supporting this in a nice way would be a lot of work.

PyPy is certainly a good starting point for someone that would like to work in that direction. Early attempts were dropped because they conflicted with refactorings that we needed in order to progress on the rest of PyPy; the currently active developers of PyPy have different priorities. If someone wants to start working in that direction I imagine that he might get a (very little) bit of support from us, though.

Alternatively, it's possible to write a mixed-module, i.e. an extension module for PyPy in RPython, which you can then import from your Python program when it runs on top of PyPy. This is similar to writing a C extension module for CPython in term of investment of effort (without all the INCREF/DECREF mess, though).

What's the "NOT_RPYTHON" I see in some docstrings?

If you put "NOT_RPYTHON" into the docstring of a function and that function is found while trying to translate an RPython program, the translation process stops and reports this as an error. You can therefore mark functions as "NOT_RPYTHON" to make sure that they are never analyzed.

Couldn't we simply take a Python syntax tree and turn it into Lisp?

It's not necessarily nonsense, but it's not really The PyPy Way. It's pretty hard, without some kind of type inference, to translate, say this Python:

a + b

into anything significantly more efficient than this Common Lisp:

(py:add a b)

And making type inference possible is what RPython is all about.

You could make #'py:add a generic function and see if a given CLOS implementation is fast enough to give a useful speed (but I think the coercion rules would probably drive you insane first). -- mwh

Do I have to rewrite my programs in RPython?

No. PyPy always runs your code in its own interpreter, which is a full and compliant Python 2.4 interpreter. RPython is only the language in which parts of PyPy itself are written and extension modules for it. The answer to whether something needs to be written as an extension module, apart from the "gluing to external libraries" reason, will change over time as speed for normal Python code improves.

Which backends are there?

Currently, there are backends for C, the CLI, and the JVM. All of these can translate the entire PyPy interpreter. To learn more about backends take a look at the translation document.

How do I compile my own interpreters?

Start from the example of pypy/translator/goal/targetnopstandalone.py, which you compile by typing:

python translate.py targetnopstandalone

You can have a look at intermediate C source code, which is (at the moment) put in /tmp/usession-*/testing_1/testing_1.c. Of course, all the functions and stuff used directly and indirectly by your entry_point() function has to be RPython. Another example you may want to look at is pypy/translator/goal/targetprologstandalone.py, the target for the in-progress Prolog implementation; this target for example enables a stackless build programmatically.