.. include:: ================================================= PyPy - Implementing Python in Python ================================================= :Authors: Armin Rigo, Samuele Pedroni & Carl Friedrich Bolz :Date: 23rd April 2006 :Location: Tokyo, AIST Python implementation facts =========================== - Parser/Compiler produces bytecode - Virtual Machine interprets bytecode - strongly dynamically typed - clean object model at Python and C level Python implementations =========================== - CPython: main Python version (BDFL'ed by Guido) - Jython: compiles to Java Bytecode - IronPython (MS): compiles to .NET's CLR - PyPy: self-contained - self-translating - flexible PyPy project facts ======================= - started 2003 as a grass-root effort - aims: flexibility, research, speed - test-driven development - received EU-funding from end 2004 on - 350 subscribers to pypy-dev, 150.000 LOCs, 20.000 visitors per month, - MIT license Three public releases ===================== - 0.6 quite compliant python implementation - 0.7 compliant self-contained python implementation - 0.8 full parser and compiler, "10-50 times" better speed - next 0.9: own GCs, coroutine, tasklets ... Documentation ====================== - http://codespeak.net/pypy - talks, papers, slides available on the site - technical reports to the EU PyPy development method ======================== - sprints - agile - test-driven - open source culture PyPy implementation facts ============================ - implements Python language in Python itself - parts implemented in a restricted subset: RPython - "static enough" for full-program type inference - at boot time we allow unrestricted python! PyPy/Python architecture ============================= - parser and compiler - bytecode interpreter - Standard Object Space / Type implementations - Python VM = interpreter + Standard Object Space - builtin and fundamental modules PyPy/Python architecture picture ================================== - PDF Parser and Compiler =================== - parses python source code to AST - compiles AST to code objects (bytecode) - works from the CPython grammar definition - can be modified/extended at runtime (almost) Bytecode interpreter ==================== - interprets bytecode/code objects through Frame objects - Frames tie to global and local variable scopes - implements control flow (loops, branches, exceptions, calls) - dispatches all operations on objects to an object implementation library ("Object Space") Object Spaces ============= - library of all python types and operations on them - encapsulates all knowledge about app-level objects - is not concerned with control flow or bytecode - e.g. enough control to implement lazy evaluation Builtin and Fundamental Modules =============================== - around 200 builtin functions and classes - fundamental modules like 'sys' and 'os' implemented - quite fully compliant to CPython's regression tests - a number of modules missing or incomplete (socket ...) PyPy/Translation architecture ============================= - bytecode interpreter - Abstract Interpretation (Flow Object Space) - Type Inference (Annotation) - Specialising to lltypesystem / ootypesystem - C and LLVM Backends to lltypesystem PyPy/Translation overview ========================= - PDF Abstract Interpretation ======================== - bytecode interpreter dispatches to Flow Object Space - Flow Object Space implements abstract operations - produces flow graphs as a side effect - starts from "live" byte code NOT source code - pygame demonstration Type Inference =============== - performs forward propagating type inference - is used to infer the types in flow graphs - needs types of the entry point function's arguments - assumes that the used types are static - goes from very special to more general values Specialization =========================== - annotated flow graphs are specialized for language families - lltypesystem (for C like languages): C, LLVM - ootypesystem (for OO languages): Smalltalk, .NET CLI, (Java/Javascript...) - result is specialized flow graphs - these contain operations at target level Backends ========== - produce code out of specialized flow graphs - complete backends: C, LLVM - ongoing: .NET CLI, Squeak (JavaScript) - foreign function calls: manually written glue snippets - big example Translation Aspects ==================== - implementation decisions (GC, threading, CC) at translation time - most other language implementations do a "fixed" decision - translation aspects are weaved into the produced code - independent from language semantics (python interpreter) Aspects: Memory Models =================================== - Currently implemented: refcounting, Boehm-collector - more general exact GCs - mark & sweep (in-progress) - copying - ... Aspects: Threading Models ================================= - currently implemented: single thread and global interpreter lock - future plans: free threading models - stacklessness: don't use the C stack for user-level recursion - Continuation Passing Style (CPS) - implemented as a part of the backends technical outlook 2006 ====================== .. html:
Ongoing, next: - specialising JIT-compiler, processor backends - stackless tasklet pickling - GC / threading integration + extensions - orthogonal persistence and distribution (see thunk example) - built-in security (e-lang ...) Just-in-time compilation ================================= - PDF .. |bullet| unicode:: U+02022 .. footer:: Samuele Pedroni, Carl Friedrich Bolz, Armin Rigo |bullet| 23rd April 2006