[pypy-svn] r45103 - pypy/extradoc/talk/dyla2007
arigo at codespeak.net
Sun Jul 15 14:30:02 CEST 2007
Author: arigo
Date: Sun Jul 15 14:30:01 2007
New Revision: 45103
Modified:
pypy/extradoc/talk/dyla2007/dyla.tex
Log:
Some more progress.
Modified: pypy/extradoc/talk/dyla2007/dyla.tex
==============================================================================
--- pypy/extradoc/talk/dyla2007/dyla.tex (original)
+++ pypy/extradoc/talk/dyla2007/dyla.tex Sun Jul 15 14:30:01 2007
@@ -51,26 +51,24 @@
\section{Introduction}
Dynamic languages are traditionally implemented by writing a virtual
-machine for them in a low-level language like C or in a language that
-can relatively easily be turned into C. The machine implements an
+machine (VM) for them in a low-level language like C or in a language that
+can relatively easily be turned into C. The VM implements an
object model supporting the high level dynamic language's objects. It
typically provides features like automatic garbage collection. Recent
languages like Python, Ruby, Perl and JavaScript have complicated
semantics which are most easily mapped to a naive interpreter operating
on syntax trees or bytecode; simpler languages\footnote
{
-In the sense of the primitive semantics. ``Simple'' here is
-as opposed to ``complicated'', not as opposed to ``complex'': Common
-Lisp for example is not a small language, but it can at least in theory
-be expressed from a smaller core of primitives. In Python, all
-primitive operations have complicated semantics. The argument developed
-in the present paper is more relevant to ``complicated'' dynamic languages.
+In the sense of the primitive semantics. For example, in Python most
+primitive operations have complicated semantics; by contrast, in Common
+Lisp complex features like the reader and printer can in theory be
+implemented in terms of simpler primitives as library code.
}
like Lisp, Smalltalk and Self typically have more
efficient implementations based on code generation.
The effort required to build a new virtual machine is relatively
-large. This is particularly true for languages which are complex
+large. This is particularly true for languages which are complicated
and in constant evolution. Language implementation communities from an
open-source or academic context have only limited resources. Therefore they
cannot afford to have a highly complex implementation and often choose simpler
@@ -82,10 +80,12 @@
For these reasons writing a virtual machine in C is problematic because it
forces the language implementer to deal with many low-level details (like
-garbage collection and threading issues). Limitations
-of the C implementation lead to alternative implementations which draw
+garbage collection and threading issues). If a language becomes popular,
+limitations of the C implementation eventually
+lead to alternative implementations which draw
resources from the reference implementation. An alternative to writing
-implementations in C is to build them on top of one of the newer object oriented
+implementations in C is to build them on top of one of the newer
+general-purpose object-oriented
virtual machines (``OO VM'') such as the JVM (Java Virtual Machine) or the
CLR (Common Language Runtime of the .NET framework). This is often wanted by
the community anyway, since it leads to the ability to re-use the libraries of
@@ -98,7 +98,7 @@
In this paper, we will argue that it is possible to
benefit from and integrate with OO VMs while keeping the dynamic
-language implemented with a single, simple source code base. The idea is
+language implemented as a single, simple source code base. The idea is
to write an interpreter for that language in another sufficiently
high-level but less dynamic language. This interpreter plays the role
of a specification for the dynamic language. With a sufficiently capable
@@ -106,7 +106,7 @@
this specification -- either wholly custom VMs for C-level operating
systems or as a layer on top of various OO VMs. In other words,
meta-programming techniques can be used to successfully replace a
-foreseeable one-VM-fits-all standardization attempt.
+foreseeable one-OO-VM-fits-all standardization attempt.
The crux of the argument is that VMs for dynamic languages should not be
written by hand! The PyPy project \cite{pypy} is the justification,
@@ -118,14 +118,14 @@
PyPy contains a Python interpreter implemented in Python, from which
Python VMs can be generated. The reader is referred to
\cite{pypyvmconstruction} for a technical presentation. Let us emphasize
-again that the argument we make in the present paper is \emph{not} that
+that the argument we make in the present paper is \emph{not} that
VMs for dynamic languages should be written in their own host language
(as many projects like Squeak \cite{Squeak} and others have done) but
instead that VMs should not be \emph{written} in the first place -- they
should be generated from simple interpreters written in any suitable
high-level\footnote{``High-level'' is taken by opposition to languages
like Scheme48's PreScheme \cite{kelsey-prescheme} or Squeak's \cite{Squeak}
-SLang which use the syntax and
+SLang, which use the syntax and
metaprogramming facilities of a high-level language but encode
low-level details like object layout and memory management.} language.
@@ -133,7 +133,8 @@
implemented in C and on top of OO VMs and some of the problems of these
approaches, using various Python implementations as the main example. In
section \ref{sect:metaprogramming} we will describe our proposed
-meta-programming approach.
+meta-programming approach and compare the two solutions. We summarize
+our position and conclude in section \ref{sect:conclusion}.
\section{Approaches to Dynamic Language Implementation}
@@ -141,8 +142,9 @@
\def\implname#1{\emph{#1}}
-The observation that limitations of a C-based implementation of a dynamic
-language leads to the emergence of additional implementations is clear
+Limitations of a C-based implementation of a dynamic
+language lead to the emergence of additional implementations --
+this observation is clear
in the case of Python. The reference implementation, \implname{CPython}
\cite{cpy251}, is a simple recursive interpreter. \implname{Stackless
Python} \cite{stackless} is a fork that adds micro-threading
@@ -151,7 +153,8 @@
implementation too complex. Another implementation of the Python
language is \implname{Psyco} \cite{psyco-software}, an extension of
CPython which adds a JIT-compiler. Finally, \implname{Jython} is a
-re-implementation for the Java VM and \implname{IronPython} for
+re-implementation for the Java VM and \implname{IronPython}
+a re-implementation for
the CLR. All of these ultimately need to be kept synchronized with the
relatively fast evolution of the language.
@@ -168,7 +171,7 @@
implementing it in C. Let's take a look at the advantages that are usually
cited for basing a
language implementation of a dynamic language on a standard object-oriented
-virtual machine, for example the JVM or the CLR.
+virtual machine.
\begin{itemize}
\item
@@ -537,6 +540,7 @@
\section{Conclusion}
+\label{sect:conclusion}
Here are the two central points that we have asserted in the present
paper: