[pypy-dev] Re: Compiler
Ludovic Aubry
Ludovic.Aubry at logilab.fr
Mon May 9 12:59:01 CEST 2005
Hi,
On Sun, May 08, 2005 at 11:26:59AM +0200, Armin Rigo wrote:
> Hi Ludovic, hi all,
>
> I had a look at recparser, and how to integrate it into PyPy. Ideally,
> it can be exported as the 'parser' module by adding a line to
> interpreter/baseobjspace.py (see the commented-out line about the other
> 'parser').
Yes, we've been experimenting with that already.
> A few comments about the interface file pyparser.py (this
> should be put in some documentation...):
>
> * applevel() requires obscure tweaking about the 'import compiler'
> statement, to prevent the whole compiler package from being dragged in and
> compiled by PyPy (which may be what we want later, but for now it just
> doesn't work, I expect). I checked that in.
Maybe this will solve the problem we're seeing when trying to compile
parse trees generated by either parser.
> * the interpleveldef exports a class, 'STType'. I added another hack in
> lazymodule.py to make that work. Basically, the interp-level exports
> had to be wrapped objects, or functions -- which get wrapped
> automatically. Types now also get wrapped automatically. Previously,
> you'd have needed an interpleveldef like
>
> 'STType': 'space.gettypeobject(pyparser.STType.typedef)'
>
> which fishes the typedef (i.e. the definition of the app-level type)
> corresponding to the class STType, and asks the space to build a real
> app-level type object for it.
ok
> At the moment, with the above changes, it appears to work rather nicely
> (at least the few exported methods). But we cannot feed the parse
> tuples to the pure Python compiler package because the latter expects
> tuples with line number information, and as far as I see you're always
> generating tuples without. It seems that you're collecting the
> information already so it should not be difficult to fix.
>
> The next step would be to integrate it so that it is used by the
> built-ins, like compile(). There is a new abstraction, class Compiler,
> in pypy.interpreter.compiler. Its purpose is to be subclassed by
> concrete compilers; currently there is only CPythonCompiler, which
> cheats and calls compile() at interpreter-level. I guess that it should
> be possible to create another subclass that uses recparser and the pure
> Python compiler package to do its job, or even a generic PythonCompiler
> that uses whatever built-in 'parser' module is available, and then the
> pure Python compiler package.
>
> All of PyPy ends up using the compiler instance that is stored in the current
> execution context whenever it needs to compile source code (including at
> the interactive prompt).
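
A minimal, self-contained sketch of the arrangement described above, in plain Python. Only the names Compiler and CPythonCompiler come from the message; the method signature, the ExecutionContext class and the run_source() helper are assumptions made for illustration and are not PyPy's actual interfaces.

```python
class Compiler:
    """Abstract base (signature assumed): turn source text into a code object."""

    def compile(self, source, filename, mode):
        raise NotImplementedError


class CPythonCompiler(Compiler):
    """The 'cheating' variant: delegate to the host interpreter's compile()."""

    def compile(self, source, filename, mode):
        # Inside the method body, 'compile' resolves to the built-in function.
        return compile(source, filename, mode)


class ExecutionContext:
    """Hypothetical stand-in: holds the compiler instance everything else uses."""

    def __init__(self, compiler):
        self.compiler = compiler


def run_source(ec, source):
    """Hypothetical helper: compile an expression via the context, then evaluate it."""
    code = ec.compiler.compile(source, "<sketch>", "eval")
    return eval(code)


ec = ExecutionContext(CPythonCompiler())
result = run_source(ec, "1 + 2")
```

A compiler built on recparser and the pure Python compiler package would then be one more subclass handed to the context; callers such as run_source() would not need to change.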
>
>
> Finally, a quick look over the recparser sources shows a few constructs
> that are clearly not "RPython", i.e. too dynamic. We need to think a
> bit and see how to address the issue. About RPython:
> http://codespeak.net/pypy/index.cgi?doc/coding-style.html#restricted-python
>
> Before we actually try to perform type inference on recparser, it's a
> bit hard to know if there are type problems or not. It is often the
> case that even when we write code knowing that it should be RPython we
> overlook some subtle typing problem. I'll give it a try, I guess (this
> is done by enabling the recparser module in baseobjspace as hinted
> above, running "dist/goal/translate_pypy.py targetpypy", and trying to
> make sense out of the obscure assertion errors and enormous flow graphs
> we get...)
> For now, a problematic feature that is obvious is the visitor pattern
> that you use extensively. It's definitely a great pattern, but not one
> that immediately applies to C- or Java-like languages. I'm not saying
> that you should rewrite all of recparser; more that we need to find a
> trick to implement visitor patterns without the getattr() with a
> computed attribute name. Possibly something along these lines:
>
> class MyVisitor:
>     def visit_name1(self, node):
>         ...
>     def visit_name2(self, node):
>         ...
>
>     # this can be computed by a for loop instead:
>     VISIT_MAP = {'name1': visit_name1,
>                  'name2': visit_name2,
>                  }
>
> class Node:
>     def visit(self, visitor):
>         visit_meth = visitor.VISIT_MAP[self.name]
>         visit_meth(visitor, self)
>
> The difference with the getattr() case is that the operation that
> replaces it, a getitem on a constant dictionary, has a reasonable
> C-level equivalent, namely a (precomputed) hash table lookup.
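
A runnable sketch of the dictionary-based dispatch suggested above, with the table built by a loop once at import time instead of being written out by hand. The `seen` list, the Node constructor and the "visit_" prefix convention are assumptions added so the example can run on its own; they are not taken from recparser.

```python
class MyVisitor:
    def __init__(self):
        self.seen = []  # records which handler ran (illustration only)

    def visit_name1(self, node):
        self.seen.append(("visit_name1", node.name))

    def visit_name2(self, node):
        self.seen.append(("visit_name2", node.name))


# Build the dispatch table once, right after the class is defined.
# Keys are node names; values are the plain functions from the class body.
MyVisitor.VISIT_MAP = {
    attr[len("visit_"):]: func
    for attr, func in vars(MyVisitor).items()
    if attr.startswith("visit_")
}


class Node:
    def __init__(self, name):
        self.name = name

    def visit(self, visitor):
        # A getitem on a constant dictionary replaces getattr() with a
        # computed attribute name.
        visit_meth = visitor.VISIT_MAP[self.name]
        visit_meth(visitor, self)


visitor = MyVisitor()
for node in (Node("name1"), Node("name2")):
    node.visit(visitor)
```

After the loop, visitor.seen holds one entry per node, in visiting order. A node whose name has no handler raises KeyError here; whether that or a default handler is preferable is left open.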
Sure, I discussed that with Holger already. The thing is, the visitor isn't
used for parsing but only by the EBNFParser, which parses the Python
grammar file and turns it into a tree of grammar objects.
This should be called only at startup time.
I must say I am not sure whether the following call in recparser/__init__.py:
PYTHON_PARSER = pythonutil.python_grammar()
is really called at bootstrap time?
Anyway, at this point PYTHON_PARSER is a static tree of objects
representing the grammar, and for now the parsing is done by providing a
'builder' object to the match method of the tree (in fact there are
several subtrees, one for each grammar target).
>
> That's it for now. Don't hesitate to ask if I'm not making sense, or
> for more help about integration issues. I am aware that it is some kind
> of guesswork at the moment. Just feel free to post to pypy-dev.
>
>
> A bientot,
>
> Armin.
>
--
Ludovic Aubry LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org