============== PyPy ============== :Author: Maciej Fijalkowski, merlinux GmbH :Title: PyPy - Automatic Generation of VMs for Dynamic Languages :Date: 6th March 2008 VMs for dynamic languages =================================== * Example - Python * Coded in a low-level language, such as C (or Java or C#) * Hard-coded design decisions (garbage collection, threading model) * Primary example - CPython. - refcounting fixed hard to change - global interpreter lock hard to change Ideally, we would... ==================== * Use a high-level language * Use a statically analyzable implementation language * Have a flexible compiler toolchain to do part of the job for you Generating VMs ============== * Writing by hand is a lot of trouble * We want to perform each of the following once: - encode the language semantics - encode language-agnostic design decisions (ie object model) - encode platform details PyPy Motivation =============== * Create highly-flexible compiler toolchain * Use high-level language for encoding semantics * Use compiler toolchain to try hard to make things such as object model orthogonal to language semantics encoding Development technology ======================= * We use test driven development extensively * We've got a test suite which would need ~8 hours to run on a single processor * Our special testing tool, py.test, little hassle and a lot of features * Sprint-driven development General idea ============ .. raw:: html Even more ========= * Take any interpreter, python, javascript, smalltalk... * Make some decisions (which gc to use...) * Compile it to your platform (C/POSIX, .NET, ...) An example of benefit ===================== * Python has complicated semantics * Python guarantees that it won't segfault on a stack exhaustion * CPython includes some stack checks in the source, but they don't catch every case * We include it automatically so all cases are guaranteed to be covered Example of static analysis ======================================================== * we analyze interpreter source, not user code! * replace all external calls with stubs * sandboxing - a very small piece of code to trust * build your own library (in python) which implements your own security policy * no need to worry about third party modules RPython ======= * We use RPython, a proper subset of Python to implement a Python interpreter * More static than Python * Designed for speed, and designed for writing interpreters * Still high-level, fully analyzable * User code (executed by the interpreter) is full python (not restricted) Abstract interpretation ======================= * We start by importing the RPython modules on top of CPython * Next we analyze the bytecode and produce a forest of flow graphs * We call this the abstract interpretation of bytecode * The bytecode can be prepared dynamically (meta-programming) Flow graphs =========== * Intermediate representation * Can encode different layers of abstraction Annotation ========== * The part which does the type inference over existing graphs * A Powerful but practical type system * Still very high-level RTyper ====== * Translates high-level graphs into lower-levels * Two type systems: - lltype, for backends that understand pointers, structures etc. - ootype, for backends that have notion of objects * lltype has more low-level details, such as an implementation of strings (ootype assumes strings built-in) Translation aspects =================== * Optimizations (inlining, malloc removal) * Different GCs * ... GC framework ============ * flexible * implemented as a translation aspect * gc-specific things, such as write barriers, are inserted automatically * different root finding strategies * effective (RPython is faster than C when GC is a bottleneck) * GCs are written in RPython as well * Motivation: MMtk Backends ======== * lltype-based - C, LLVM * ootype-based - CLI, JVM, ... * llinterpreter - a virtual backend for running flow graphs directly, useful for testing * .NET bindings, for example, are backend specific Special translation aspect ============================== * generating Just-in-time compilers from interpreters JIT - motivation ================ * Interpreters are much easier to write than compilers * For many languages, one cannot achieve C-level performance with static analysis * Dynamic compilation can produce faster results than static compilation (e.g. a good Java VM versus gcj) Traditional JIT approach ======================== * Written by hand * Carefully encoded language semantics * Hard to maintain as language evolves * We can do better! PyPy approach to JIT ===================== .. raw:: html
JIT - basics ============ * partial evaluation * automatic generation of a compiler from an interpreter * an old idea with relatively few practical applications so far JIT - general idea =================== * constant-propagate python bytecode through the interpreter * may not yield good performance (our experiments show about 2x for removing intepretation overhead) * things such as the types of Python objects are still not known Solution: promotion =================== * Enhance partial evaluation to be able to *promote* run-time values into compile-time values * Implementation-wise this is a generalization of the polymorphic in-line caches used in Java VM JITs Concrete ingredients ==================== * Variety of hints to guide partial evaluation * Promotion on certain paths to achieve "static-enough" parts * Lazy allocation of objects (allocation only happens when the object escapes) * Use CPU stack and registers as a cache for Python frame objects that live in the heap Irrelevant to interpreter ============================== * These techniques can be applied relatively easily to other interpreters, just by adding a few hints * Or to any other program written in RPython (for instance a templating language where the templates are constant) The hint annotator ============================ * This annotator is very similiar to the normal one; it adds color propagation * Green - compile-time, can be constant-folded * Red - runtime * Some rules about the color propagation The rainbow interpreter ===================================== * Very very experimental, lives on a branch right now * We dump colored graphs as a bytecode and interpret them JIT details - backends ====================== * JIT backends produce assembly code from coloured graphs * As usual, we plan to have many (PPC, i386, JVM, dummy one...) * They're very rudimentary right now * It's a relatively pleasant job to write Python that generates assembler code * We would definitely benefit from help from an assembler expert JIT conclusion ============== * We're able to run carefully crafted examples ~60x faster than CPython * About the same as gcc -O0 * We can definitely do better, for example by enhancing the backends * There is a lot of work to be done in JIT area JIT plans ========= * Faster than C! * For long-enough running programs that's theoretically possible * We have more information at runtime than one can have at compile-time JIT short term plans ===================== * Compile just hotspots (not everything) * Extend backends and make them smarter * Make generating assembly code faster * Benchmark bottlenecks It's not only technical =========================== * EU 6th Framework Programme * Actively searching for both grants and consultancy contracts Project future =============== * More JIT work * Making PyPy's python interpreter run existing applications * Interfacing with external libraries * We're very friendly for newcomers