\documentclass{llncs} %%%% Compression Light+: LNCS margin reduced by +/-7mm along all edges (RG). %\textwidth=130mm % LNCS: 122mm %\textheight=203mm % LNCS: 193mm \renewcommand{\baselinestretch}{0.97} \usepackage{amssymb} \usepackage{amsmath} \usepackage[sans]{dsfont} \usepackage{color} \usepackage{ifthen} \usepackage{xspace} \usepackage{listings} \usepackage{fancyvrb} \usepackage{multirow} %\usepackage[pdftex]{graphicx} %\input{macros} \pagestyle{plain} %\lstset{mathescape=true,language=Java,basicstyle=\tt,keywordstyle=\bf} \lstset{language=Python, basicstyle=\scriptsize\ttfamily, keywordstyle=\color{blue}, % I couldn't find a way to make chars both bold and tt frame=none, stringstyle=\color{blue}, fancyvrb=true, xleftmargin=20pt,xrightmargin=20pt, showstringspaces=false} \setlength{\tabcolsep}{1ex} \let\oldcite=\cite \renewcommand\cite[1]{\ifthenelse{\equal{#1}{XXX}}{[citation~needed]}{\oldcite{#1}}} \begin{document} \title{Towards a More Effective Implementation of Python on Top of Statically Typed Virtual Machines\thanks{This work has been partially supported by MIUR EOS DUE - Extensible Object Systems for Dynamic and Unpredictable Environments and by the EU-funded project: IST 004779 PyPy (PyPy: Implementing Python in Python).}} \author{Antonio Cuni} \institute{DISI, University of Genova, Italy} \maketitle \section{Introduction} The main goal of our thesis is ``to develop new techniques, and possibly adapt old ones which have not been yet applied to Python, in order to provide a new, \textbf{more efficient implementation} of the language''. Our research mainly focus on providing a good and portable implementation of Python for object oriented virtual machines, like CLI and JVM. We believe that the main reason for bad performances of all current Python implementations is the lack of information about types at compile time, due to the extreme dynamicity of the language. PyPy\footnote{http://codespeak.net/pypy} is a mature project which provides a Python implementation for several platforms, but unfortunately it still suffers from efficiency problems. Among the other feature, PyPy can automatically generate a JIT compiler by examining the source code of the Python interpreter. Our plan is to exploit this JIT compiler generator to get a fast implementation of Python that runs on top of CLI/.NET. Moreover, since the two virtual machines have a lot in common, we think that our research can be applied to the JVM as well. Ideally, the JIT compiler should eliminate most of (if not all) the overhead of Python over statically typed languages such as C\#. \section{Current status of the research} The first step was to develop a JIT compiler for Python. Instead of writing a new compiler from scratch, we reused PyPy translation toolchain which already provided an automatic JIT compiler generator from the the interpreter of Python. The existing JIT compiler generator was only able to emit CPU machine code and, therefore, was not usable for virtual machines as CLI and JVM. Thus, the first step was to adapt the existing JIT compiler generator to target object oriented virtual machines; then, we could develop a \emph{CLI backend} for the JIT compiler generator. At the moment of writing, the CLI JIT backend is almost completed, and all the challenging problems have been resolved. Unfortunately, some minor features are still unimplemented, thus preventing the JIT generator to be applied to the full Python language, but this is a minor implementation detail. \section{Preliminary results} The preliminary results we got are very promising and much better than those expected. Since we still cannot experiment the approach with the full Python language, we decided to implement an interpreter for a toy language with dynamic features similar to those of Python and experiment with the JIT compiler automatically generated from the interpreter of this toy language. We experimented with some example programs in the toy language and measured the execution time of such programs by running the normal interpreter with or without the JIT compiler automatically generated from it. Results are surprising: on some benchmarks, we got a speedup factor up to 500, which means an execution time 1.5 times slower than the same algorithm written in C\#. Moreover, for one benchmark we got an execution time 1.62 times \textbf{faster} than the same algorithm written in C\#. More details can be found on the correspondent technical report\cite{ancona_cuni_bolz_rigo_2009} and on the official \emph{PyPy Blog}\cite{Cuni08-1}\cite{Cuni08-2}\cite{Cuni08-3}. These results show that a JIT compiler can produce efficient code also for object oriented virtual machines. To our knowledge, this is the first time that JIT compilation have been extensively applied to emit bytecode for virtual machines. This benchmark involves mainly numeric operations, but by construction the JIT compilers generated by PyPy can yield substantial improvements also for other operations; for example, they can easily eliminate the overhead of dynamic method dispatch, and we plan to write new benchmarks to demonstrate that. Unfortunately, sometimes the CLI is \emph{not expressive enough} for our needs, forcing us to emit non optimal code in some cases \cite{ancona_cuni_bolz_rigo_2009}; the results described above show that this does not prevent our toy language to run very fast, but it is unclear whether this will be true also for the full Python language. However, we are confident that it is possible to find alternate solutions that will allow us to generate much better code at the cost of more time spent during the compilation phase. \section{Evaluation Plan} The main goal of our research is to show that, with a proper implementation, Python programs can run at the same order of magnitude of speed as C\# programs on .NET. We don't expect to speed up \emph{all kind} of Python programs: for the scope of the thesis, we expect to get very good performances on benchmarks that mostly involve numerical operations. Then, we will try to optimize the object oriented features of Python, such as method dispatch. However, we don't know if there will be enough time to develop this area in detail before the end of the PhD: in that case, it will become the subject of future research. \bibliographystyle{plain} \bibliography{paper} \end{document}