From cfbolz at gmx.de Thu Jul 1 10:06:46 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Thu, 01 Jul 2010 10:06:46 +0200
Subject: [pypy-dev] CGO 2011 Conference
Message-ID: <4C2C4C96.9030808@gmx.de>

Hi all,

I think this conference could be interesting to us:

http://www.cgo.org/cgo2011/call_papers.html

From the call for papers:

* Techniques for efficient execution of dynamically typed languages

The deadline is 15 September.

Cheers,

Carl Friedrich

From fijall at gmail.com Thu Jul 1 12:02:31 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 1 Jul 2010 04:02:31 -0600
Subject: [pypy-dev] PyPy 1.3 released
In-Reply-To:
References:
Message-ID:

On Wed, Jun 30, 2010 at 3:24 PM, Phyo Arkar wrote:
> So far, python-mysql is still not working.
>
> Has anyone successfully got it to work?

Hey.

I'm not aware of anyone who has had any success. You can come to #pypy
on irc.freenode.net and we can see how to solve the problem.

>
> On Fri, Jun 25, 2010 at 11:27 PM, Maciej Fijalkowski wrote:
>>
>> =======================
>> PyPy 1.3: Stabilization
>> =======================
>>
>> Hello.
>>
>> We're pleased to announce the release of PyPy 1.3. This release has two
>> major improvements. First of all, we stabilized the JIT compiler since
>> the 1.2 release, answered user issues, fixed bugs, and generally
>> improved speed.
>>
>> We're also pleased to announce alpha support for loading CPython extension
>> modules written in C. While the main purpose of this release is increased
>> stability, this feature is in alpha stage and it is not yet suited for
>> production environments.
>>
>> Highlights of this release
>> ==========================
>>
>> * We introduced support for CPython extension modules written in C. As of
>>  now, this support is in alpha, and it's very unlikely unaltered C
>>  extensions will work out of the box, due to missing functions or
>>  refcounting details. The support is disabled by default, so you have
>>  to do::
>>
>>   import cpyext
>>
>>  before trying to import any .so file. Also, libraries are
>>  source-compatible and not binary-compatible. That means you need to
>>  recompile binaries, using for example::
>>
>>   python setup.py build
>>
>>  Details may vary, depending on your build system. Make sure you include
>>  the above line at the beginning of setup.py or put it in your
>>  PYTHONSTARTUP.
>>
>>  This is an alpha feature. It'll likely segfault. You have been warned!
>>
>> * JIT bugfixes. A lot of bugs reported for the JIT have been fixed, and
>>  its stability has greatly improved since the 1.2 release.
>>
>> * Various small improvements have been added to the JIT code, as well as
>>  a great speedup of compile time.
>>
>> Cheers,
>> Maciej Fijalkowski, Armin Rigo, Alex Gaynor, Amaury Forgeot d'Arc and
>> the PyPy team
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
>
>

From hakan at debian.org Thu Jul 1 16:02:30 2010
From: hakan at debian.org (Hakan Ardo)
Date: Thu, 1 Jul 2010 16:02:30 +0200
Subject: [pypy-dev] array performace?
Message-ID:

Hi,
are there any Python constructs that the JIT will be able to compile
into C-type array accesses? Consider the following test:

    l=0.0
    for i in xrange(640,640*480):
        l+=img[i]
        intimg[i]=intimg[i-640]+l

With the 1.3 release of the JIT it executes about 20 times slower than
a similar construction in C if I create the arrays using:

    import _rawffi
    RAWARRAY = _rawffi.Array('d')
    img=RAWARRAY(640*480, autofree=True)
    intimg=RAWARRAY(640*480, autofree=True)

Using a list is about 40 times slower and using array.array is about
400 times slower. Any suggestions on how to improve the performance of
this kind of construction?

Thanks.

--
Håkan Ardö

From fijall at gmail.com Thu Jul 1 16:46:09 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 1 Jul 2010 08:46:09 -0600
Subject: [pypy-dev] array performace?
In-Reply-To:
References:
Message-ID:

Hey.
There are a variety of reasons why those behave like this (for example,
the array module in PyPy is written in Python, using _rawffi). There is a
branch that plans to fix that for all lists, but it's not finished yet.

From arigo at tunes.org Thu Jul 1 17:28:27 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 1 Jul 2010 17:28:27 +0200
Subject: [pypy-dev] array performace?
In-Reply-To:
References:
Message-ID: <20100701152827.GA30661@code0.codespeak.net>

Hi,

On Thu, Jul 01, 2010 at 04:02:30PM +0200, Hakan Ardo wrote:
> are there any python construct that the jit will be able to compile
> into c-type array accesses? Consider the following test:
>
>     l=0.0
>     for i in xrange(640,640*480):
>         l+=img[i]
>         intimg[i]=intimg[i-640]+l

A plain list is still implemented as a list of Python objects (as
expected, because the JIT cannot prove that we won't suddenly try to put
something other than a float in the same list).

Using _rawffi.Array('d') directly is the best option right now.
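
The difference between a list and a typed array can be sketched in a few lines. This is only an illustration, assuming Python 3 syntax and the standard-library array module in place of _rawffi; the function name integral_image and the tiny 3x2 test image are made up for the example and do not come from the thread. It shows the same loop over storage that can only hold C doubles, which is the property a list cannot guarantee.

```python
from array import array

def integral_image(img, width):
    """Run the loop from the thread over typed arrays of C doubles."""
    n = len(img)
    intimg = array('d', [0.0]) * n      # n doubles, all 0.0
    running = 0.0                       # called "l" in the original loop
    for i in range(width, n):
        running += img[i]
        intimg[i] = intimg[i - width] + running
    return intimg

# Tiny stand-in for the 640x480 image: 3 rows of 2 pixels, all 1.0.
img = array('d', [1.0]) * 6
result = integral_image(img, width=2)

# A typed array refuses anything that is not a number, unlike a list.
try:
    img[0] = "not a float"
    rejected = False
except TypeError:
    rejected = True
```

Whether this form runs fast on a given PyPy version depends on how the array module is implemented there, which is exactly what the rest of the thread is about.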
I'm not sure why the array.array module is 400 times slower, but it's
definitely slower given that it's implemented at app-level using a
_rawffi.Array('c') and doing the conversion by itself (for some partially
stupid reasons like doing the right kind of error checking).

A bientot,

Armin.

From fijall at gmail.com Thu Jul 1 17:35:17 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 1 Jul 2010 09:35:17 -0600
Subject: [pypy-dev] array performace?
In-Reply-To: <20100701152827.GA30661@code0.codespeak.net>
References: <20100701152827.GA30661@code0.codespeak.net>
Message-ID:

The main reason why _rawffi.Array is slow is that the JIT does not look
into that module, so there is wrapping and unwrapping going on.
Relatively easy to fix I suppose, but _rawffi.Array was not meant to
be used like that (array.array looks like a better candidate).

From alex.gaynor at gmail.com Thu Jul 1 17:40:38 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 1 Jul 2010 10:40:38 -0500
Subject: [pypy-dev] array performace?
In-Reply-To:
References: <20100701152827.GA30661@code0.codespeak.net>
Message-ID:

If array.array performance is important to your work, the array.py
module looks like a good target for writing at interp level, and it's
not too much code.

Alex

--
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From glavoie at gmail.com Thu Jul 1 20:57:43 2010
From: glavoie at gmail.com (Gabriel Lavoie)
Date: Thu, 1 Jul 2010 14:57:43 -0400
Subject: [pypy-dev] Improving Stackless/Coroutines implementation
In-Reply-To:
References:
Message-ID:

Hello everyone,
the change is implemented in r75735. I also added a coroutine.throw()
method to raise any exception inside any coroutine. I don't know if
other people need this but I personally do. For now, it's implemented
approximately like greenlet.throw().

The documentation for the stackless.html page was updated in
pypy/doc/stackless.txt. If someone could review the changes and possibly
update the documentation on the website it would be appreciated. ;)

Gabriel

2010/6/29 Gabriel Lavoie

> Hello everyone,
> as a few here know, I'm working heavily with PyPy's "stackless"
> module for my Master's degree project to make it more distributed. Since I
> started to work full time on this project I've encountered a few bugs
> (mostly related to pickling of tasklets) and missing implementation details
> in the module. The latest problem I've encountered is being able to detect
> when tasklet.kill() is called, within the tasklet being killed. With
> Stackless CPython, TaskletExit is raised and can be caught, but this part
> wasn't really implemented in PyPy's stackless module. Since the module is
> implemented on top of coroutines and since coroutine.kill() is called within
> tasklet.kill(), the exception thrown by the coroutine implementation needs
> to be caught. Here's the problem:
>
> http://codespeak.net/pypy/dist/pypy/doc/stackless.html#coroutines
>
> -
>
> coro.kill()
>
> Kill coro by sending an exception to it. (At the moment, the exception
> is not visible to app-level, which means that you cannot catch it, and that
> try: finally: clauses are not honored. This will be fixed in the
> future.)
>
>
> The exception is not thrown at app level and a coroutine dies silently.
> I took a look at the code and I've been able to expose a CoroutineExit
> exception to app level, on which I intend to implement TaskletExit correctly.
> I'm also able to catch the exception as expected but the code is not yet
> complete.
>
> Right now, I have a question on how to correctly expose the CoroutineExit
> and TaskletExit exceptions to app level. Here's what I did:
>
> W_CoroutineExit = _new_exception('CoroutineExit', W_Exception,
>                                  'Exit requested...')
>
> class AppCoroutine(Coroutine): # XXX, StacklessFlags):
>
>     def __init__(self, space, state=None):
>         # Some other code here
>
>         # Exporting new exception to __builtins__ and "exceptions" modules
>         self.w_CoroutineExit = space.gettypefor(W_CoroutineExit)
>         space.setitem(
>             space.exceptions_module.w_dict,
>             space.new_interned_str('CoroutineExit'),
>             self.w_CoroutineExit)
>         space.setitem(space.builtin.w_dict,
>                       space.new_interned_str('CoroutineExit'),
>                       self.w_CoroutineExit)
>
> I talked about this on #pypy (IRC) but people weren't sure about exporting
> new names to __builtins__. On my side I wanted to make it look as much as
> possible like how Stackless CPython did it with TaskletExit, which is directly
> available in __builtins__. This would make code compatible with both
> Stackless Python and PyPy's stackless module. Also, exporting names this way
> would only make them appear in __builtins__ when the "_stackless" module is
> enabled (pypy-c built with --stackless).
>
> What are your opinions about it? (Maciej, I already know about yours! ;)
>
> Thank you very much,
>
> Gabriel (WildChild)
>
> --
> Gabriel Lavoie
> glavoie at gmail.com
>

--
Gabriel Lavoie
glavoie at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/pypy-dev/attachments/20100701/ca112188/attachment-0001.htm

From hakan at debian.org Fri Jul 2 07:24:07 2010
From: hakan at debian.org (Hakan Ardo)
Date: Fri, 2 Jul 2010 07:24:07 +0200
Subject: [pypy-dev] array performace?
In-Reply-To:
References: <20100701152827.GA30661@code0.codespeak.net>
Message-ID:

OK, so making an interpreter-level implementation of array.array seems
like a good idea. Would it be possible to get the JIT to remove the
wrapping/unwrapping in that case, to get better performance than
_rawffi.Array('d'), which is already an interpreter-level
implementation?

Are there some docs to get me started writing interpreter-level
objects? I've had a look at _rawffi/array.py and am a bit confused
about the W_Array.typedef = TypeDef('Array',...) construction. Maybe
there is an easier example to start with?

--
Håkan Ardö

From alex.gaynor at gmail.com Fri Jul 2 07:40:21 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Fri, 2 Jul 2010 00:40:21 -0500
Subject: [pypy-dev] array performace?
In-Reply-To:
References: <20100701152827.GA30661@code0.codespeak.net>
Message-ID:

I'd take a look at the cStringIO module; it's a decent example of the
APIs (and not too much code). FWIW, one thing to note is that array uses
the struct module, which is also pure Python. I believe it's possible to
still use that with an interp-level module, but it may just become
another bottleneck; just something to consider.

Alex

--
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From fijall at gmail.com Fri Jul 2 08:04:26 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 2 Jul 2010 00:04:26 -0600
Subject: [pypy-dev] array performace?
In-Reply-To:
References: <20100701152827.GA30661@code0.codespeak.net>
Message-ID:

On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote:
> OK, so making an interpreter level implementation of array.array seams
> like a good idea. Would it be possible to get the jit to remove the
> wrapping/unwrapping in that case to get better performance than
> _rawffi.Array('d'), which is already an interpreter level
> implementation?

It should work mostly out of the box (you can also try this for the
_rawffi.array part of the module, if you want to). It's probably enough
to enable the module in pypy/module/pypyjit/policy.py so the JIT can have
a look there.
In the case of _rawffi, a couple of hints telling the JIT not to look
inside some functions (those which do external calls, for example) will
probably also be needed, since the JIT as of now does not support raw
mallocs (using C malloc and not our GC). Still, making the array module
interp-level is probably the sanest approach.

> Are there some docs to get me started at writing interpreter level
> objects? I've had a look at _rawffi/array.py and am a bit confused
> about the W_Array.typedef = TypeDef('Array',...) construction. Maybe
> there is a easier example to start with?

TypeDef is a way to expose an interpreter-level (RPython) object to
app-level (Python). It tells which methods, properties and attributes
the object has.

From fijall at gmail.com Fri Jul 2 08:45:15 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 2 Jul 2010 00:45:15 -0600
Subject: [pypy-dev] [pypy-svn] r75683 - in pypy/trunk: include lib-python/modified-2.5.2/distutils lib-python/modified-2.5.2/distutils/command pypy/_interfaces pypy/module/cpyext pypy/module/cpyext/test
In-Reply-To: <20100630145114.AB57C282BE3@codespeak.net>
References: <20100630145114.AB57C282BE3@codespeak.net>
Message-ID:

Hey.

Any reason why we should copy .h files during translation and can't
just have them there?

Cheers,
fijal

On Wed, Jun 30, 2010 at 8:51 AM, wrote:
> Author: antocuni
> Date: Wed Jun 30 16:51:13 2010
> New Revision: 75683
>
> Added:
>   pypy/trunk/include/   (props changed)
>   pypy/trunk/include/README
> Removed:
>   pypy/trunk/pypy/_interfaces/
> Modified:
>   pypy/trunk/lib-python/modified-2.5.2/distutils/command/build_ext.py
>   pypy/trunk/lib-python/modified-2.5.2/distutils/sysconfig_pypy.py
>   pypy/trunk/pypy/module/cpyext/api.py
>   pypy/trunk/pypy/module/cpyext/test/test_api.py
> Log:
> create a directory trunk/include to contain all the header files.
> They are automatically copied there from cpyext/include during
> translation. The generated pypy_decl.h and pypy_macros.h are also put
> there, instead of the now-gone pypy/_interfaces.
>
> The goal is to have the svn checkout as similar as possible to release
> tarballs and virtualenvs, which have an include/ dir at the top
>
>
>
> Added: pypy/trunk/include/README
> ==============================================================================
> --- (empty file)
> +++ pypy/trunk/include/README   Wed Jun 30 16:51:13 2010
> @@ -0,0 +1,7 @@
> +This directory contains all the include files needed to build cpython
> +extensions with PyPy.  Note that these are just copies of the original headers
> +that are in pypy/module/cpyext/include: they are automatically copied from
> +there during translation.
> +
> +Moreover, pypy_decl.h and pypy_macros.h are automatically generated, also
> +during translation.
>
> Modified: pypy/trunk/lib-python/modified-2.5.2/distutils/command/build_ext.py
> ==============================================================================
> --- pypy/trunk/lib-python/modified-2.5.2/distutils/command/build_ext.py (original)
> +++ pypy/trunk/lib-python/modified-2.5.2/distutils/command/build_ext.py Wed Jun 30 16:51:13 2010
> @@ -167,7 +167,7 @@
>          # for Release and Debug builds.
>          # also Python's library directory must be appended to library_dirs
>          if os.name == 'nt':
> -            self.library_dirs.append(os.path.join(sys.prefix, 'pypy', '_interfaces'))
> +            self.library_dirs.append(os.path.join(sys.prefix, 'include'))
>              if self.debug:
>                  self.build_temp = os.path.join(self.build_temp, "Debug")
>              else:
>
> Modified: pypy/trunk/lib-python/modified-2.5.2/distutils/sysconfig_pypy.py
> ==============================================================================
> --- pypy/trunk/lib-python/modified-2.5.2/distutils/sysconfig_pypy.py    (original)
> +++ pypy/trunk/lib-python/modified-2.5.2/distutils/sysconfig_pypy.py    Wed Jun 30 16:51:13 2010
> @@ -13,12 +13,7 @@
>
>  def get_python_inc(plat_specific=0, prefix=None):
>      from os.path import join as j
> -    cand = j(sys.prefix, 'include')
> -    if os.path.exists(cand):
> -        return cand
> -    if plat_specific:
> -        return j(sys.prefix, "pypy", "_interfaces")
> -    return j(sys.prefix, 'pypy', 'module', 'cpyext', 'include')
> +    return j(sys.prefix, 'include')
>
>  def get_python_version():
>      """Return a string containing the major and minor Python version,
>
> Modified: pypy/trunk/pypy/module/cpyext/api.py
> ==============================================================================
> --- pypy/trunk/pypy/module/cpyext/api.py        (original)
> +++ pypy/trunk/pypy/module/cpyext/api.py        Wed Jun 30 16:51:13 2010
> @@ -45,11 +45,9 @@
>  pypydir = py.path.local(autopath.pypydir)
>  include_dir = pypydir / 'module' / 'cpyext' / 'include'
>  source_dir = pypydir / 'module' / 'cpyext' / 'src'
> -interfaces_dir = pypydir / "_interfaces"
>  include_dirs = [
>      include_dir,
>      udir,
> -    interfaces_dir,
>      ]
>
>  class CConfig:
> @@ -100,9 +98,16 @@
>  udir.join('pypy_macros.h').write("/* Will be filled later */")
>  globals().update(rffi_platform.configure(CConfig_constants))
>
> -def copy_header_files():
> +def copy_header_files(dstdir):
> +    assert dstdir.check(dir=True)
> +    headers = include_dir.listdir('*.h') + include_dir.listdir('*.inl')
>      for name in ("pypy_decl.h", "pypy_macros.h"):
> -        udir.join(name).copy(interfaces_dir / name)
> +        headers.append(udir.join(name))
> +    for header in headers:
> +        header.copy(dstdir)
> +        target = dstdir.join(header.basename)
> +        target.chmod(0444) # make the file read-only, to make sure that nobody
> +                           # edits it by mistake
>
>  _NOT_SPECIFIED = object()
>  CANNOT_FAIL = object()
> @@ -881,7 +886,8 @@
>          deco(func.get_wrapper(space))
>
>      setup_init_functions(eci)
> -    copy_header_files()
> +    trunk_include = pypydir.dirpath() / 'include'
> +    copy_header_files(trunk_include)
>
>  initfunctype = lltype.Ptr(lltype.FuncType([], lltype.Void))
>  @unwrap_spec(ObjSpace, str, str)
>
> Modified: pypy/trunk/pypy/module/cpyext/test/test_api.py
> ==============================================================================
> --- pypy/trunk/pypy/module/cpyext/test/test_api.py      (original)
> +++ pypy/trunk/pypy/module/cpyext/test/test_api.py      Wed Jun 30 16:51:13 2010
> @@ -1,3 +1,4 @@
> +import py
>  from pypy.conftest import gettestobjspace
>  from pypy.rpython.lltypesystem import rffi, lltype
>  from pypy.interpreter.baseobjspace import W_Root
> @@ -68,3 +69,13 @@
>          api.PyPy_GetWrapped(space.w_None)
>          api.PyPy_GetReference(space.w_None)
>
> +
> +def test_copy_header_files(tmpdir):
> +    api.copy_header_files(tmpdir)
> +    def check(name):
> +        f = tmpdir.join(name)
> +        assert f.check(file=True)
> +        py.test.raises(py.error.EACCES, "f.open('w')") # check that it's not writable
> +    check('Python.h')
> +    check('modsupport.inl')
> +    check('pypy_decl.h')
> _______________________________________________
> pypy-svn mailing list
> pypy-svn at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-svn
>

From fijall at gmail.com Fri Jul 2 09:28:15 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 2 Jul 2010 01:28:15 -0600
Subject: [pypy-dev] [pypy-svn] r75683 - in pypy/trunk: include lib-python/modified-2.5.2/distutils lib-python/modified-2.5.2/distutils/command pypy/_interfaces pypy/module/cpyext pypy/module/cpyext/test
In-Reply-To: <4C2D9369.7030004@gmail.com>
References: <20100630145114.AB57C282BE3@codespeak.net> <4C2D9369.7030004@gmail.com>
Message-ID:

On Fri, Jul 2, 2010 at 1:21 AM, Antonio Cuni wrote:
> On 02/07/10 08:45, Maciej Fijalkowski wrote:
>>
>> Hey.
>>
>> Any reason why we should copy .h files during translation and can't
>> just have them there?
>>
>
> I talked with Amaury and he told me that he prefers to keep all the
> cpyext-related files together, which I think makes sense.  Moreover, we need
> to generate and copy pypy_decl.h and pypy_macros.h anyway, so we can copy the
> others as well while we are at it.
>
> ciao,
> Anto
>

Fine by me. Can you fix test_package then? It assumes there is
Python.h in include (which might not be there).

From anto.cuni at gmail.com Fri Jul 2 09:21:13 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Fri, 02 Jul 2010 09:21:13 +0200
Subject: [pypy-dev] [pypy-svn] r75683 - in pypy/trunk: include lib-python/modified-2.5.2/distutils lib-python/modified-2.5.2/distutils/command pypy/_interfaces pypy/module/cpyext pypy/module/cpyext/test
In-Reply-To:
References: <20100630145114.AB57C282BE3@codespeak.net>
Message-ID: <4C2D9369.7030004@gmail.com>

On 02/07/10 08:45, Maciej Fijalkowski wrote:
> Hey.
>
> Any reason why we should copy .h files during translation and can't
> just have them there?
>

I talked with Amaury and he told me that he prefers to keep all the
cpyext-related files together, which I think makes sense.
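
The copy step in r75683 can be approximated outside PyPy in a few lines. The sketch below is only an illustration using the Python 3 standard library (pathlib, shutil) rather than the py.path API that the real commit uses; the signature with explicit include_dir and generated arguments, and the temporary directory names, are assumptions made for the example. It copies every *.h and *.inl file plus the generated headers into one directory and marks each copy read-only.

```python
import shutil
import stat
import tempfile
from pathlib import Path

def copy_header_files(include_dir, generated, dstdir):
    """Copy static and generated headers into dstdir and make them read-only."""
    include_dir, dstdir = Path(include_dir), Path(dstdir)
    assert dstdir.is_dir()
    headers = sorted(include_dir.glob('*.h')) + sorted(include_dir.glob('*.inl'))
    headers.extend(Path(p) for p in generated)
    copied = []
    for header in headers:
        target = dstdir / header.name
        if target.exists():
            target.chmod(0o644)       # allow overwriting an earlier read-only copy
        shutil.copyfile(header, target)
        target.chmod(0o444)           # read-only, so nobody edits the copy by mistake
        copied.append(target)
    return copied

# Hypothetical layout standing in for cpyext/include, udir and trunk/include.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, 'cpyext_include'); src.mkdir()
    gen = Path(tmp, 'udir'); gen.mkdir()
    dst = Path(tmp, 'include'); dst.mkdir()
    (src / 'Python.h').write_text('/* header */\n')
    (src / 'modsupport.inl').write_text('/* inline */\n')
    (gen / 'pypy_decl.h').write_text('/* generated */\n')
    copied = copy_header_files(src, [gen / 'pypy_decl.h'], dst)
    names = sorted(p.name for p in copied)
    modes = {p.name: stat.S_IMODE(p.stat().st_mode) for p in copied}
```

Copying at build time keeps a single editable source of each header, at the cost that the destination directory is empty until the build has run, which is the trade-off the thread goes on to discuss.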
Moreover, we need to generate and copy pypy_decl.h and pypy_macros.h
anyway, so we can copy the others as well while we are at it.

ciao,
Anto

From anto.cuni at gmail.com Fri Jul 2 09:30:30 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Fri, 02 Jul 2010 09:30:30 +0200
Subject: [pypy-dev] [pypy-svn] r75683 - in pypy/trunk: include lib-python/modified-2.5.2/distutils lib-python/modified-2.5.2/distutils/command pypy/_interfaces pypy/module/cpyext pypy/module/cpyext/test
In-Reply-To:
References: <20100630145114.AB57C282BE3@codespeak.net> <4C2D9369.7030004@gmail.com>
Message-ID: <4C2D9596.5050105@gmail.com>

On 02/07/10 09:28, Maciej Fijalkowski wrote:

> Fine by me. Can you fix test_package then? It assumes there is
> Python.h in include (which might not be there).

Ah right... because when we run own-test, translation didn't happen, so
the .h files are not there. OK, I'll fix it later.

ciao,
Anto

From tobami at googlemail.com Fri Jul 2 09:27:10 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Fri, 2 Jul 2010 09:27:10 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To:
References:
Message-ID:

Hi Paolo,

hey! I think it is a great idea. With logs you get both: correct
normalized totals AND the ability to display the individual stacked
series, which necessarily add arithmetically.

But it strikes me: hasn't anyone written a paper about that method
already? Or at least documented it?

Anyway, I need to check that the math is right (hopefully today), and
then I would go and implement it. I'll tell you how it goes.

Cheers,
Miquel

2010/6/30 Paolo Giarrusso:
> Hi Miquel,
> I'm quite busy (because of a paper deadline next Tuesday), sorry for
> not answering earlier.
>
> I was just struck by an idea: there is a stacked bar plot where the
> total bar is related to the geometric mean, such that it is
> normalization-invariant. But this graph _is_ complicated.
>
> It is a stacked plot of _logarithms_ of performance ratios?
This way, > the complete stacked bar shows the logarithm of the product, rather > than their sum, i.e. the log of the (geometric mean)^N rather than > their arithmetic mean. log of the (geometric mean)^N = N*log of the > (geometric mean). > > Some simple maths (I didn't write it out, so please recheck!) seems to > show that showing (a+b*log (ratio)), instead of log(ratio), gives > still a fair comparison, obtaining N*a+b*N*log(geomean) = > \Theta(log(geomean)). You need to put a and b because showing if the > ratio is 1, log(1) is zero (b is the representation scale which is > always there). > > About your workaround: I would like a table with the geometric mean of > the ratios, where we get the real global performance ratio among the > interpreters. As far as the results of your solution do not contradict > that _real_ table, it should be a reasonable workaround (but I would > embed the check in the code - otherwise other projects _will be_ > bitten by that). Probably, I would like the website to offer such a > table to users, and I would like a graph of the overall performance > ratio over time (actually revisions). > > Finally, the docs of your web application should at the very least > reference the paper and this conversation (if there's a public archive > of the ML, as I think), and ideally explain the issue. > > Sorry for being too dense, maybe - if I was unclear, please tell me > and I'll answer next week. > > Best regards, > Paolo > > On Mon, Jun 28, 2010 at 11:21, Miquel Torres wrote: >> Hi Paolo, >> >> I read the paper, very interesting. It is perfectly clear that to >> calculate a normalized total only the geometric mean makes sense. >> >> However, a stacked bars plot shows the individual benchmarks so it >> implicitly is an arithmetic mean. The only solution (apart from >> removing the stacked charts and only offering total bars) is the >> weighted approach. >> >> External weights are not very practical though. 
Codespeed is used by >> other projects so an extra option would need to be added to the >> settings to allow the introducing of arbitrary weights to benchmarks. >> A bit cumbersome. I have an idea that may work. Take the weights from >> a defined baseline so that the run times are equal, which is the same >> as normalizing to a baseline. It would be the same as now, only that >> you can't choose the normalization, it will be weighted (normalized) >> according the default baseline (which you already can already >> configure in the settings). >> >> You may say that it is still an arithmetic mean, but there won't be >> conflicting results because there is only a single normalization. For >> PyPy that would be cpython, and everything would make sense. >> I know it is a work around, not a solution. If you think it is a bad >> idea, the only other possibility is not to have stacked bars (as in >> "showing individual benchmarks"). But I find them useful. Yes you can >> see the individual benchmark results better in the normal bars chart, >> but there you don't see visually which benchmarks take the biggest >> part of the pie, which helps visualize what parts of your program need >> most improving. >> >> What do you think? >> >> Regards, >> Miquel >> >> >> 2010/6/25 Paolo Giarrusso : >>> On Fri, Jun 25, 2010 at 19:08, Miquel Torres wrote: >>>> Hi Paolo, >>>> >>>> I am aware of the problem with calculating benchmark means, but let me >>>> explain my point of view. >>>> >>>> You are correct in that it would be preferable to have absolute times. Well, >>>> you actually can, but see what it happens: >>>> http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars >>> >>> Ahah! I didn't notice that I could skip normalization! This does not >>> fully invalidate my point, however. >>> >>>> Absolute values would only work if we had carefully chosen benchmaks >>>> runtimes to be very similar (for our cpython baseline). 
As it is, html5lib, >>>> spitfire and spitfire_cstringio completely dominate the cummulative time. >>> >>> I acknowledge that (btw, it should be cumulative time, with one 'm', >>> both here and in the website). >>> >>>> And not because the interpreter is faster or slower but because the >>>> benchmark was arbitrarily designed to run that long. Any improvement in the >>>> long running benchmarks will carry much more weight than in the short >>>> running. >>> >>>> What is more useful is to have comparable slices of time so that the >>>> improvements can be seen relatively over time. >>> >>> If you want to sum up times (but at this point, I see no reason for >>> it), you should rather have externally derived weights, as suggested >>> by the paper (in Rule 3). >>> As soon as you take weights from the data, lots of maths that you need >>> is not going to work any more - that's generally true in many cases in >>> statistics. >>> And the only way making sense to have external weights is to gather >>> them from real world programs. Since that's not going to happen >>> easily, just stick with the geometric mean. Or set an arbitrarily low >>> weight, manually, without any math, so that the long-running >>> benchmarks stop dominating the res. It's no fraud, since the current >>> graph is less valid anyway. >>> >>>> Normalizing does that I >>>> think. >>> Not really. >>> >>>> It just says: we have 21 tasks which take 1 second to run each on >>>> interpreter X (cpython in the default case). Then we see how other >>>> executables compare to that. What would the geometric mean achieve here, >>>> exactly, for the end user? >>> >>> You actually need the geomean to do that. Don't forget that the >>> geomean is still a mean: it's a mean performance ratio which averages >>> individual performance ratios. >>> If PyPy's geomean is 0.5, it means that PyPy is going to run that task >>> in 10.5 seconds instead of 21. To me, this sounds exactly like what >>> you want to achieve.
Moreover, it actually works, unlike what you use. >>> >>> For instance, ignore PyPy-JIT, and look only CPython and pypy-c (no >>> JIT). Then, change the normalization among the two: >>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars >>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars >>> with the current data, you get that in one case cpython is faster, in >>> the other pypy-c is faster. >>> It can't happen with the geomean. This is the point of the paper. >>> >>> I could even construct a normalization baseline $base such that >>> CPython seems faster than PyPy-JIT. Such a base should be very fast >>> on, say, ai (where CPython is slower), so that $cpython.ai/$base.ai >>> becomes 100 and $pypyjit.ai/$base.ai becomes 200, and be very slow on >>> other benchmarks (so that they disappear in the sum). >>> >>> So, the only difference I see is that geomean works, arithm. mean >>> doesn't. That's why Real Benchmarkers use geomean. >>> >>> Moreover, you are making a mistake quite common among non-physicists. >>> What you say makes sense under the implicit assumption that dividing >>> two times gives something you can use as a time. When you say "Pypy's >>> runtime for a 1 second task", you actually want to talk about a >>> performance ratio, not about the time. In the same way as when you say >>> "this bird runs 3 meters long in one second", a physicist would sum >>> that up as "3 m/s" rather than "3 m". >>> >>>> I am not really calculating any mean. You can see that I carefully avoided >>>> to display any kind of total bar which would indeed incur in the problem you >>>> mention. That a stacked chart implicitly displays a total is something you >>>> can not avoid, and for that kind of chart I still think normalized results >>>> is visually the best option. 
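The baseline flip described above can be reproduced with a small sketch. The timings and the names interp_a, interp_b, bench1 and bench2 below are made up for illustration and are not taken from speed.pypy.org; the only point is that the arithmetic mean of normalized times can change the ranking when the baseline changes, while the geometric mean cannot.

```python
import math

# Hypothetical timings in seconds; not real speed.pypy.org data.
times = {
    "interp_a": {"bench1": 1.0, "bench2": 10.0},
    "interp_b": {"bench1": 2.0, "bench2": 6.0},
}

def ratios(exe, base):
    """Per-benchmark time of `exe` divided by the time of `base`."""
    return [times[exe][b] / times[base][b] for b in sorted(times[base])]

def arith_mean(xs):
    return sum(xs) / len(xs)

def geo_mean(xs):
    # exp of the mean of logs is the N-th root of the product
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def winner(mean, base):
    """Interpreter with the smallest mean normalized time (smaller is faster)."""
    return min(times, key=lambda exe: mean(ratios(exe, base)))

for base in times:
    # columns: baseline, winner by arithmetic mean, winner by geometric mean
    print(base, winner(arith_mean, base), winner(geo_mean, base))
```

With these numbers the arithmetic mean names whichever interpreter is used as the baseline as the faster one, whereas the geometric mean picks interp_a under both normalizations.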
>>> >>> But on a stacked bars graph, I'm not going to look at individual bars >>> at all, just at the total: it's actually less convenient than in >>> "normal bars" to look at the result of a particular benchmark. >>> >>> I hope I can find guidelines against stacked plots, I have a PhD >>> colleague reading on how to make graphs. >>> >>> Best regards >>> -- >>> Paolo Giarrusso - Ph.D. Student >>> http://www.informatik.uni-marburg.de/~pgiarrusso/ >>> >> > > > > -- > Paolo Giarrusso - Ph.D. Student > http://www.informatik.uni-marburg.de/~pgiarrusso/ > From hakan at debian.org Fri Jul 2 09:37:03 2010 From: hakan at debian.org (Hakan Ardo) Date: Fri, 2 Jul 2010 09:37:03 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: Hi, I've got a simple implementation of array now, wrapping lltype.malloc with no error checking yet (cStringIO was great help, thx). How can I test this with the jit? Do I need to translate the entire pypy or is there a quicker way? > there. In case of _rawffi, probably a couple of hints for the jit to > not look inside some functions (which do external calls for example) > should also be needed, since for example JIT as of now does not > support raw mallocs (using C malloc and not our GC). Still, making an > array module interp-level is probably the sanest approach. Do I need to guard the lltype.malloc call with such hints? What is the syntax? -- H?kan Ard? From p.giarrusso at gmail.com Fri Jul 2 09:47:57 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Fri, 2 Jul 2010 09:47:57 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski wrote: > On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote: >> OK, so making an interpreter level implementation of array.array seams >> like a good idea. 
Would it be possible to get the jit to remove the >> wrapping/unwrapping in that case to get better performance than >> _rawffi.Array('d'), which is already an interpreter level >> implementation? > > it should work mostly out of the box (you can also try this for > _rawffi.array part of module, if you want to). It's probably enough to > enable module in pypy/module/pypyjit/policy.py so JIT can have a look > there. In case of _rawffi, probably a couple of hints for the jit to > not look inside some functions (which do external calls for example) > should also be needed, since for example JIT as of now does not > support raw mallocs (using C malloc and not our GC). > Still, making an > array module interp-level is probably the sanest approach. That might be a bad sign. For CPython, people recommend to write extensions in C for performance, i.e. to make them less maintainable and understandable for performance. A good JIT should make this unnecessary in as many cases as possible. Of course, the array module might be an exception, if it's a single case. But performance 20x slower than C, with a JIT, is a big warning, since fast interpreters are documented to be (in general) just 10x slower than C. In this case, the JIT should be instructed to look into that module; if the result is still slow, the missing optimizations need to be traced down and added. Also, it seems that at some point in the future, the JIT should in general look into the whole standard library by default _and_ learn to be careful to such external calls. Isn't it? Comments appreciated. Best regards -- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From p.giarrusso at gmail.com Fri Jul 2 09:53:04 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Fri, 2 Jul 2010 09:53:04 +0200 Subject: [pypy-dev] array performace? 
In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: On Fri, Jul 2, 2010 at 09:47, Paolo Giarrusso wrote: > On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski wrote: >> On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote: >>> OK, so making an interpreter level implementation of array.array seams >>> like a good idea. Would it be possible to get the jit to remove the >>> wrapping/unwrapping in that case to get better performance than >>> _rawffi.Array('d'), which is already an interpreter level >>> implementation? >> >> it should work mostly out of the box (you can also try this for >> _rawffi.array part of module, if you want to). It's probably enough to >> enable module in pypy/module/pypyjit/policy.py so JIT can have a look >> there. In case of _rawffi, probably a couple of hints for the jit to >> not look inside some functions (which do external calls for example) >> should also be needed, since for example JIT as of now does not >> support raw mallocs (using C malloc and not our GC). > >> Still, making an >> array module interp-level is probably the sanest approach. > > That might be a bad sign. > For CPython, people recommend to write extensions in C for > performance, i.e. to make them less maintainable and understandable > for performance. Here, I forgot to state explicitly that having to rewrite a module at the interpreter level is somehow similar. Imagine that was suggested, the day PyPy will be standard, to application authors. > A good JIT should make this unnecessary in as many cases as possible. > Of course, the array module might be an exception, if it's a single > case. > But performance 20x slower than C, with a JIT, is a big warning, since > fast interpreters are documented to be (in general) just 10x slower > than C. > In this case, the JIT should be instructed to look into that module; > if the result is still slow, the missing optimizations need to be > traced down and added. 
> Also, it seems that at some point in the future, the JIT should in > general look into the whole standard library by default _and_ learn to > be careful to such external calls. Isn't it? > Comments appreciated. -- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From p.giarrusso at gmail.com Fri Jul 2 09:58:29 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Fri, 2 Jul 2010 09:58:29 +0200 Subject: [pypy-dev] New speed.pypy.org version In-Reply-To: References: Message-ID: On Fri, Jul 2, 2010 at 09:27, Miquel Torres wrote: > Hi Paolo, > > hey! I think it is a great idea. With logs you get both: correct > normalized totals AND the ability to display the individual stacked > series, which necessarily add arithmetically. But it strikes me, > hasn't anyone written a paper about that method already? or at least > documented it? I guess the problem is that the graph is weird enough, and that you need arbitrary a and b to make it work, since the logarithm might get negative, and arbitrarily big. log 0 = - inf. I still think that's fair and makes sense, but it's somewhat hard to sell. > Anyway I need to check that the math is right (hopefully today), and > then I would go and implement it. > I'll tell you how it goes. > > Cheers, > Miquel > > > > 2010/6/30 Paolo Giarrusso : >> Hi Miquel, >> I'm quite busy (because of a paper deadline next Tuesday), sorry for >> not answering earlier. >> >> I was just struck by an idea: there is a stacked bar plot where the >> total bar is related to the geometric mean, such that it is >> normalization-invariant. But this graph _is_ complicated. >> >> It is a stacked plot of _logarithms_ of performance ratios? This way, >> the complete stacked bar shows the logarithm of the product, rather >> than their sum, i.e. the log of the (geometric mean)^N rather than >> their arithmetic mean. log of the (geometric mean)^N = N*log of the >> (geometric mean). 
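A quick numeric check of the log-stacked idea, as a sketch only: the ratios and the constants a and b below are arbitrary values chosen for illustration. It verifies that stacking a + b*log(ratio) segments gives a total of N*a + b*N*log(geomean), and that a has to be large enough for every segment to stay positive.

```python
import math

def geo_mean(xs):
    # exp of the mean of logs is the N-th root of the product
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def segments(ratios, a, b):
    """Height of each stacked segment: a + b*log(ratio)."""
    return [a + b * math.log(r) for r in ratios]

# Hypothetical performance ratios (time / baseline time), one per benchmark.
ratios = [0.5, 2.0, 0.25, 1.0]
a, b = 3.0, 2.0  # arbitrary offset and scale, assumed for this sketch
n = len(ratios)

heights = segments(ratios, a, b)
total = sum(heights)
expected = n * a + b * n * math.log(geo_mean(ratios))

print(heights)
print(total, expected)
```

A ratio close to 0 would push its segment towards minus infinity, so a fixed a cannot keep every possible segment positive; that is the practical weakness mentioned above.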
>> >> Some simple maths (I didn't write it out, so please recheck!) seems to >> show that showing (a+b*log (ratio)), instead of log(ratio), gives >> still a fair comparison, obtaining N*a+b*N*log(geomean) = >> \Theta(log(geomean)). You need to put a and b because showing if the >> ratio is 1, log(1) is zero (b is the representation scale which is >> always there). >> >> About your workaround: I would like a table with the geometric mean of >> the ratios, where we get the real global performance ratio among the >> interpreters. As far as the results of your solution do not contradict >> that _real_ table, it should be a reasonable workaround (but I would >> embed the check in the code - otherwise other projects _will be_ >> bitten by that). Probably, I would like the website to offer such a >> table to users, and I would like a graph of the overall performance >> ratio over time (actually revisions). >> >> Finally, the docs of your web application should at the very least >> reference the paper and this conversation (if there's a public archive >> of the ML, as I think), and ideally explain the issue. >> >> Sorry for being too dense, maybe - if I was unclear, please tell me >> and I'll answer next week. >> >> Best regards, >> Paolo >> >> On Mon, Jun 28, 2010 at 11:21, Miquel Torres wrote: >>> Hi Paolo, >>> >>> I read the paper, very interesting. It is perfectly clear that to >>> calculate a normalized total only the geometric mean makes sense. >>> >>> However, a stacked bars plot shows the individual benchmarks so it >>> implicitly is an arithmetic mean. The only solution (apart from >>> removing the stacked charts and only offering total bars) is the >>> weighted approach. >>> >>> External weights are not very practical though. Codespeed is used by >>> other projects so an extra option would need to be added to the >>> settings to allow the introducing of arbitrary weights to benchmarks. >>> A bit cumbersome. I have an idea that may work. 
Take the weights from >>> a defined baseline so that the run times are equal, which is the same >>> as normalizing to a baseline. It would be the same as now, only that >>> you can't choose the normalization, it will be weighted (normalized) >>> according the default baseline (which you already can already >>> configure in the settings). >>> >>> You may say that it is still an arithmetic mean, but there won't be >>> conflicting results because there is only a single normalization. For >>> PyPy that would be cpython, and everything would make sense. >>> I know it is a work around, not a solution. If you think it is a bad >>> idea, the only other possibility is not to have stacked bars (as in >>> "showing individual benchmarks"). But I find them useful. Yes you can >>> see the individual benchmark results better in the normal bars chart, >>> but there you don't see visually which benchmarks take the biggest >>> part of the pie, which helps visualize what parts of your program need >>> most improving. >>> >>> What do you think? >>> >>> Regards, >>> Miquel >>> >>> >>> 2010/6/25 Paolo Giarrusso : >>>> On Fri, Jun 25, 2010 at 19:08, Miquel Torres wrote: >>>>> Hi Paolo, >>>>> >>>>> I am aware of the problem with calculating benchmark means, but let me >>>>> explain my point of view. >>>>> >>>>> You are correct in that it would be preferable to have absolute times. Well, >>>>> you actually can, but see what it happens: >>>>> http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars >>>> >>>> Ahah! I didn't notice that I could skip normalization! This does not >>>> fully invalidate my point, however. >>>> >>>>> Absolute values would only work if we had carefully chosen benchmaks >>>>> runtimes to be very similar (for our cpython baseline). As it is, html5lib, >>>>> spitfire and spitfire_cstringio completely dominate the cummulative time. >>>> >>>> I acknowledge that (btw, it should be cumulative time, with one 'm', >>>> both here and in the website). 
>>>> >>>>> And not because the interpreter is faster or slower but because the >>>>> benchmark was arbitrarily designed to run that long. Any improvement in the >>>>> long running benchmarks will carry much more weight than in the short >>>>> running. >>>> >>>>> What is more useful is to have comparable slices of time so that the >>>>> improvements can be seen relatively over time. >>>> >>>> If you want to sum up times (but at this point, I see no reason for >>>> it), you should rather have externally derived weights, as suggested >>>> by the paper (in Rule 3). >>>> As soon as you take weights from the data, lots of maths that you need >>>> is not going to work any more - that's generally true in many cases in >>>> statistics. >>>> And the only way making sense to have external weights is to gather >>>> them from real world programs. Since that's not going to happen >>>> easily, just stick with the geometric mean. Or set an arbitrarily low >>>> weight, manually, without any math, so that the long-running >>>> benchmarks stop dominating the res. It's no fraud, since the current >>>> graph is less valid anyway. >>>> >>>>> Normalizing does that I >>>>> think. >>>> Not really. >>>> >>>>> It just says: we have 21 tasks which take 1 second to run each on >>>>> interpreter X (cpython in the default case). Then we see how other >>>>> executables compare to that. What would the geometric mean achieve here, >>>>> exactly, for the end user? >>>> >>>> You actually need the geomean to do that. Don't forget that the >>>> geomean is still a mean: it's a mean performance ratio which averages >>>> individual performance ratios. >>>> If PyPy's geomean is 0.5, it means that PyPy is going to run that task >>>> in 10.5 seconds instead of 21. To me, this sounds exactly like what >>>> you want to achieve. Moreover, it actually works, unlike what you use. >>>> >>>> For instance, ignore PyPy-JIT, and look only CPython and pypy-c (no >>>> JIT).
Then, change the normalization among the two: >>>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars >>>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars >>>> with the current data, you get that in one case cpython is faster, in >>>> the other pypy-c is faster. >>>> It can't happen with the geomean. This is the point of the paper. >>>> >>>> I could even construct a normalization baseline $base such that >>>> CPython seems faster than PyPy-JIT. Such a base should be very fast >>>> on, say, ai (where CPython is slower), so that $cpython.ai/$base.ai >>>> becomes 100 and $pypyjit.ai/$base.ai becomes 200, and be very slow on >>>> other benchmarks (so that they disappear in the sum). >>>> >>>> So, the only difference I see is that geomean works, arithm. mean >>>> doesn't. That's why Real Benchmarkers use geomean. >>>> >>>> Moreover, you are making a mistake quite common among non-physicists. >>>> What you say makes sense under the implicit assumption that dividing >>>> two times gives something you can use as a time. When you say "Pypy's >>>> runtime for a 1 second task", you actually want to talk about a >>>> performance ratio, not about the time. In the same way as when you say >>>> "this bird runs 3 meters long in one second", a physicist would sum >>>> that up as "3 m/s" rather than "3 m". >>>> >>>>> I am not really calculating any mean. You can see that I carefully avoided >>>>> to display any kind of total bar which would indeed incur in the problem you >>>>> mention. That a stacked chart implicitly displays a total is something you >>>>> can not avoid, and for that kind of chart I still think normalized results >>>>> is visually the best option. 
>>>> >>>> But on a stacked bars graph, I'm not going to look at individual bars >>>> at all, just at the total: it's actually less convenient than in >>>> "normal bars" to look at the result of a particular benchmark. >>>> >>>> I hope I can find guidelines against stacked plots, I have a PhD >>>> colleague reading on how to make graphs. >>>> >>>> Best regards >>>> -- >>>> Paolo Giarrusso - Ph.D. Student >>>> http://www.informatik.uni-marburg.de/~pgiarrusso/ >>>> >>> >> >> >> >> -- >> Paolo Giarrusso - Ph.D. Student >> http://www.informatik.uni-marburg.de/~pgiarrusso/ >> > -- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From fijall at gmail.com Fri Jul 2 10:14:36 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 02:14:36 -0600 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: On Fri, Jul 2, 2010 at 1:47 AM, Paolo Giarrusso wrote: > On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski wrote: >> On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote: >>> OK, so making an interpreter level implementation of array.array seams >>> like a good idea. Would it be possible to get the jit to remove the >>> wrapping/unwrapping in that case to get better performance than >>> _rawffi.Array('d'), which is already an interpreter level >>> implementation? >> >> it should work mostly out of the box (you can also try this for >> _rawffi.array part of module, if you want to). It's probably enough to >> enable module in pypy/module/pypyjit/policy.py so JIT can have a look >> there. In case of _rawffi, probably a couple of hints for the jit to >> not look inside some functions (which do external calls for example) >> should also be needed, since for example JIT as of now does not >> support raw mallocs (using C malloc and not our GC). > >> Still, making an >> array module interp-level is probably the sanest approach. > > That might be a bad sign. 
> For CPython, people recommend to write extensions in C for > performance, i.e. to make them less maintainable and understandable > for performance. > A good JIT should make this unnecessary in as many cases as possible. > Of course, the array module might be an exception, if it's a single > case. > But performance 20x slower than C, with a JIT, is a big warning, since > fast interpreters are documented to be (in general) just 10x slower > than C. There is a lot of unsupported claims in your sentences, however, that's not my point. array module is the main source in Python for single-type arrays (including C types which are not available under Python). The other would be numpy. That makes sense to write in C/RPython, since it's lower-level than Python has. From fijall at gmail.com Fri Jul 2 10:16:13 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 02:16:13 -0600 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: On Fri, Jul 2, 2010 at 1:37 AM, Hakan Ardo wrote: > Hi, > I've got a simple implementation of array now, wrapping lltype.malloc > with no error checking yet (cStringIO was great help, thx). How can I > test this with the jit? Do I need to translate the entire pypy or is > there a quicker way? > >> there. In case of _rawffi, probably a couple of hints for the jit to >> not look inside some functions (which do external calls for example) >> should also be needed, since for example JIT as of now does not >> support raw mallocs (using C malloc and not our GC). Still, making an >> array module interp-level is probably the sanest approach. > > Do I need to guard the lltype.malloc call with such hints? What is the syntax? > I can see into making raw_malloc just a call from JIT. That shouldn't be a big issue. 
For now you can either: a) use from pypy.rlib import rgc and use rgc.malloc_nonmovable (not sure if jit'll like it), so you'll get a gc-managed non-movable memory b) just wrap call to malloc in a function with decorator dont_look_inside (from pypy.rlib.jit) From arigo at tunes.org Fri Jul 2 10:17:04 2010 From: arigo at tunes.org (Armin Rigo) Date: Fri, 2 Jul 2010 10:17:04 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: <20100702081704.GA12280@code0.codespeak.net> Hi Fijal, On Thu, Jul 01, 2010 at 09:35:17AM -0600, Maciej Fijalkowski wrote: > The main reason why _rawffi.Array is slow is that JIT does not look > into that module, so there is wrapping and unwrapping going on. > Relatively easy to fix I suppose, but _rawffi.Array was not meant to > be used like that (array.array looks like a better candidate). If you mean "better candidate" for being fast right now, then you missed my point: our array.array module is implemented on top of _rawffi.Array. If you mean "better candidate" for being optimizable given some work, then yes, I agree that the array module is a good target. A bientot, Armin. From arigo at tunes.org Fri Jul 2 10:18:59 2010 From: arigo at tunes.org (Armin Rigo) Date: Fri, 2 Jul 2010 10:18:59 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> Message-ID: <20100702081859.GB12280@code0.codespeak.net> Hi Alex, On Fri, Jul 02, 2010 at 12:40:21AM -0500, Alex Gaynor wrote: > FWIW one thing to note is that array > uses the struct module, which is also pure python. No: we have a pure Python version, but in a normally compiled pypy-c, there is an interp-level version of 'struct' too. A bientot, Armin. From fijall at gmail.com Fri Jul 2 10:19:02 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 02:19:02 -0600 Subject: [pypy-dev] array performace? 
In-Reply-To: <20100702081704.GA12280@code0.codespeak.net> References: <20100701152827.GA30661@code0.codespeak.net> <20100702081704.GA12280@code0.codespeak.net> Message-ID: On Fri, Jul 2, 2010 at 2:17 AM, Armin Rigo wrote: > Hi Fijal, > > On Thu, Jul 01, 2010 at 09:35:17AM -0600, Maciej Fijalkowski wrote: >> The main reason why _rawffi.Array is slow is that JIT does not look >> into that module, so there is wrapping and unwrapping going on. >> Relatively easy to fix I suppose, but _rawffi.Array was not meant to >> be used like that (array.array looks like a better candidate). > > If you mean "better candidate" for being fast right now, then you missed > my point: our array.array module is implemented on top of > _rawffi.Array. If you mean "better candidate" for being optimizable > given some work, then yes, I agree that the array module is a good > target. > By "better candidate" I mean that having JIT see _rawffi might mean some struggle for it to understand what's going on with raw pointers and writing array in interp-level would be better. From arigo at tunes.org Fri Jul 2 10:23:10 2010 From: arigo at tunes.org (Armin Rigo) Date: Fri, 2 Jul 2010 10:23:10 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <20100702081704.GA12280@code0.codespeak.net> Message-ID: <20100702082310.GC12280@code0.codespeak.net> Hi Fijal, On Fri, Jul 02, 2010 at 02:19:02AM -0600, Maciej Fijalkowski wrote: > By "better candidate" I mean that having JIT see _rawffi might mean > some struggle for it to understand what's going on with raw pointers > and writing array in interp-level would be better. Ah, right. Armin. From fijall at gmail.com Fri Jul 2 10:35:20 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 02:35:20 -0600 Subject: [pypy-dev] array performace?
In-Reply-To: <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: On Fri, Jul 2, 2010 at 2:26 AM, wrote: >> On Fri, Jul 2, 2010 at 1:47 AM, Paolo Giarrusso > >> wrote: >> > On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski > wrote: >> >> On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote: >> >>> OK, so making an interpreter level implementation of array.array > seams >> >>> like a good idea. Would it be possible to get the jit to remove > the >> >>> wrapping/unwrapping in that case to get better performance than >> >>> _rawffi.Array('d'), which is already an interpreter level >> >>> implementation? >> >> >> >> it should work mostly out of the box (you can also try this for >> >> _rawffi.array part of module, if you want to). It's probably enough > to >> >> enable module in pypy/module/pypyjit/policy.py so JIT can have a > look >> >> there. In case of _rawffi, probably a couple of hints for the jit > to >> >> not look inside some functions (which do external calls for > example) >> >> should also be needed, since for example JIT as of now does not >> >> support raw mallocs (using C malloc and not our GC). >> > >> >> Still, making an >> >> array module interp-level is probably the sanest approach. >> > >> > That might be a bad sign. >> > For CPython, people recommend to write extensions in C for >> > performance, i.e. to make them less maintainable and understandable >> > for performance. >> > A good JIT should make this unnecessary in as many cases as > possible. >> > Of course, the array module might be an exception, if it's a single >> > case. >> > But performance 20x slower than C, with a JIT, is a big warning, > since >> > fast interpreters are documented to be (in general) just 10x slower >> > than C. 
>> >> There is a lot of unsupported claims in your sentences, however, >> that's not my point. >> > > That's a little harsh. When the JIT was originally developed it was > envisaged that it would be faster to re-write code to app level to give > speed-ups. If that's changed that's fine, but it's not an "unsupported > claim" > > Ben > Unsupported claim is for example that fast interpreters are 10x slower than C. On what exactly? Did he write this particular benchmark in C and in fast interpreter to compare? Another unsupported claim is that JIT is 20x slower than C here. Array module is not even JITted, because it's based on _rawffi which itself operates on low-level pointers which JIT does not want to deal with. That's exactly the reason why JIT doesn't look into _rawffi module and making it look there doesn't sound like a good idea (instead, we're trying to replace it with something JIT-friendly that knows how to do FFI calls into C, there is a summer of code project). All I'm trying to say is that there are valid reasons that array module should be on interpreter level and none of this has anything to do with incapabilities of the JIT. Cheers, fijal From Ben.Young at sungard.com Fri Jul 2 11:26:18 2010 From: Ben.Young at sungard.com (Ben.Young at sungard.com) Date: Fri, 2 Jul 2010 10:26:18 +0100 Subject: [pypy-dev] PyPy Speed Message-ID: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> http://speed.pypy.org/overview/ seems to have been unavailable for the last couple of days. It gives a 500 whenever I visit it Ben Young - Senior Software Engineer SunGard - Enterprise House, Vision Park, Histon, Cambridge, CB24 9ZR Tel +44 1223 266042 - Main +44 1223 266100 - http://www.sungard.com/ CONFIDENTIALITY: This email (including any attachments) may contain confidential, proprietary and privileged information, and unauthorized disclosure or use is prohibited. 
If you received this email in error, please notify the sender and delete this email from your system. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/pypy-dev/attachments/20100702/110c7818/attachment.htm
From fijall at gmail.com Fri Jul 2 11:28:03 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 03:28:03 -0600 Subject: [pypy-dev] PyPy Speed In-Reply-To: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: Hey. I know miquel was talking about rolling in new version. Apparently, did not work :)
From Ben.Young at sungard.com Fri Jul 2 11:36:28 2010 From: Ben.Young at sungard.com (Ben.Young at sungard.com) Date: Fri, 2 Jul 2010 10:36:28 +0100 Subject: [pypy-dev] PyPy Speed In-Reply-To: References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: <01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> Ok thanks :)
From tobami at googlemail.com Fri Jul 2 11:49:53 2010 From: tobami at googlemail.com (Miquel Torres) Date: Fri, 2 Jul 2010 11:49:53 +0200 Subject: [pypy-dev] PyPy Speed In-Reply-To: <01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: Hi Ben, no, that is not the case, the new version has been online for a week without problems. The reason is the renaming of the "overview" to "changes". Maybe I should have left the URL /overview/ active with a redirection to /changes/, sorry. You would have seen that if you had checked the root URL (speed.pypy.org) btw. Anyway thanks for pointing it out. Cheers, Miquel
From Ben.Young at sungard.com Fri Jul 2 11:51:36 2010 From: Ben.Young at sungard.com (Ben.Young at sungard.com) Date: Fri, 2 Jul 2010 10:51:36 +0100 Subject: [pypy-dev] PyPy Speed In-Reply-To: References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp><01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: <01781CA2CC22B145B230504679ECF48C01AC44B1@EMEA-EXCHANGE03.internal.sungard.corp> Ah, ok thanks. I had bookmarked the other page, so I just clicked and assumed it was broken Thanks, Ben
From fijall at gmail.com Fri Jul 2 12:20:18 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 04:20:18 -0600 Subject: [pypy-dev] PyPy Speed In-Reply-To: <01781CA2CC22B145B230504679ECF48C01AC44B1@EMEA-EXCHANGE03.internal.sungard.corp> References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC44B1@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: To be fair it's not like it said "404 not found" to me
From p.giarrusso at gmail.com Fri Jul 2 14:08:35 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Fri, 2 Jul 2010 14:08:35 +0200 Subject: [pypy-dev] array performace?
In-Reply-To: <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: On Fri, Jul 2, 2010 at 10:55, wrote: >> On Fri, Jul 2, 2010 at 2:26 AM, ? wrote: >> >> On Fri, Jul 2, 2010 at 1:47 AM, Paolo Giarrusso >> > >> >> wrote: >> >> > On Fri, Jul 2, 2010 at 08:04, Maciej Fijalkowski >> > wrote: >> >> >> On Thu, Jul 1, 2010 at 1:18 PM, Hakan Ardo wrote: >> >> >>> OK, so making an interpreter level implementation of array.array >> > seams >> >> >>> like a good idea. Would it be possible to get the jit to remove >> > the >> >> >>> wrapping/unwrapping in that case to get better performance than >> >> >>> _rawffi.Array('d'), which is already an interpreter level >> >> >>> implementation? >> >> >> >> >> >> it should work mostly out of the box (you can also try this for >> >> >> _rawffi.array part of module, if you want to). It's probably enough >> > to >> >> >> enable module in pypy/module/pypyjit/policy.py so JIT can have a >> > look >> >> >> there. In case of _rawffi, probably a couple of hints for the jit >> > to >> >> >> not look inside some functions (which do external calls for >> > example) >> >> >> should also be needed, since for example JIT as of now does not >> >> >> support raw mallocs (using C malloc and not our GC). >> >> > >> >> >> Still, making an >> >> >> array module interp-level is probably the sanest approach. >> >> > >> >> > That might be a bad sign. >> >> > For CPython, people recommend to write extensions in C for >> >> > performance, i.e. to make them less maintainable and understandable >> >> > for performance. >> >> > A good JIT should make this unnecessary in as many cases as >> > possible. >> >> > Of course, the array module might be an exception, if it's a single >> >> > case. 
>> >> > But performance 20x slower than C, with a JIT, is a big warning, >> > since >> >> > fast interpreters are documented to be (in general) just 10x slower >> >> > than C. >> >> >> >> There is a lot of unsupported claims in your sentences, however, >> >> that's not my point. >> >> >> > >> > That's a little harsh. When the JIT was originally developed it was >> > envisaged that it would be faster to re-write code to app level to give >> > speed-ups. If that's changed that's fine, but it's not an "unsupported >> > claim" >> > >> > Ben >> > >> >> Unsupported claim is for example that fast interpreters are 10x slower >> than C.
That's the only unsupported claim, but it comes from "The Structure and Performance of Efficient Interpreters". I studied that as a student on VM, you are writing one, so I (unconsciously) guessed that everybody knows that paper - I know that's a completely broken way of writing, but I didn't spot it.
>>On what exactly? Did he write this particular benchmark in C >> and in fast interpreter to compare? Another unsupported claim is that >> JIT is 20x slower than C here.
I did not claim that - I am aware that it is not even JITted. I complain against the lack of JITting.
>> Array module is not even JITted, >> because it's based on _rawffi which itself operates on low-level >> pointers which JIT does not want to deal with.
I would say that instead of doing manual annotations or rewriting at the interp-level (which doesn't scale), it would be overall simpler to make the JIT learn itself how to deal with those calls (i.e. inline everything around, leave the external call as a call), once and for all. What you suggest below might be a way to do it.
>> That's exactly the >> reason why JIT doesn't look into _rawffi module and making it look >> there doesn't sound like a good idea (instead, we're trying to replace >> it with something JIT-friendly that knows how to do FFI calls into C, >> there is a summer of code project).
Well, at the abstraction level I'm speaking, it sounds like there in the end, the JIT will be able to do what is needed. I am not aware of the details. But then, at the end of that project, it seems to me that it should be possible to write the array module in pure Python using this new FFI interface and have the JIT look at it, shouldn't it? I do not concentrate on array specifically - rewriting a few modules at interpreter level is fine. But as a Python developer I should have no need for that.
>> All I'm trying to say is that there are valid reasons that array >> module should be on interpreter level and none of this has anything to >> do with incapabilities of the JIT. > Fair enough, and I do see your point, but I think Paolo comment was not aimed at array, just the implication (in this case) that to get performance you need to re-write in rpython. I think his point in general is correct, even if he picked the wrong example to mention it :) (and his 20x claim comes from the original email, so I don't think it's entirely unsupported)
Thanks for understanding my point. I'm unsure whether an ideal JIT could allow leaving array at the app-level (and I noted also in the original mail that I was unsure on this).
> Of course in this case I'm sure there are good reasons, but it is certainly interesting to see the push towards more rpython code than app-level. I guess that's because the JIT can "see" and accelerate rpython code too I believe, so it's win-win (because of the code size issues and things like that) > Incidentally, is there a reason that geninterped code is so bloated compared to rpython code that looks like it could have been generated from the app-level equivalent? Would there be a way of annotating the app-level code so that when it's geninterped it's as tight as the equivalent rpython?
-- Paolo Giarrusso - Ph.D.
Student http://www.informatik.uni-marburg.de/~pgiarrusso/
From tobami at googlemail.com Fri Jul 2 16:16:14 2010 From: tobami at googlemail.com (Miquel Torres) Date: Fri, 2 Jul 2010 16:16:14 +0200 Subject: [pypy-dev] PyPy Speed In-Reply-To: References: <01781CA2CC22B145B230504679ECF48C01AC448A@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC449C@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC44B1@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID:
> To be fair it's not like it said "404 not found" to me
right, that is wrong
From cfbolz at gmx.de Fri Jul 2 20:35:46 2010 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Fri, 02 Jul 2010 20:35:46 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> Message-ID: <4C2E3182.4020307@gmx.de> Hi Paolo, On 07/02/2010 02:08 PM, Paolo Giarrusso wrote:
>>> Unsupported claim is for example that fast interpreters are 10x >>> slower than C. > That's the only unsupported claim, but it comes from "The Structure > and Performance of Efficient Interpreters". I studied that as a > student on VM, you are writing one, so I (unconsciously) guessed > that everybody knows that paper - I know that's a completely broken > way of writing, but I didn't spot it.
Even if something is claimed by a well-known paper, it doesn't necessarily have to be true.
The paper considers a class of interpreters where each specific bytecode does very little work (the paper does not make this assumption explicit). This is not the case for Python at all, so I think that the conclusions of the paper don't apply directly. This is explained quite clearly in the following paper: Virtual-Machine Abstraction and Optimization Techniques by Stefan Brunthaler in Bytecode 2009. [...] > Well, at the abstraction level I'm speaking, it sounds like there in > the end, the JIT will be able to do what is needed. I am not aware > of the details. But then, at the end of that project, it seems to me > that it should be possible to write the array module in pure Python > using this new FFI interface and have the JIT look at it, shouldn't > it? I do not concentrate on array specifically - rewriting a few > modules at interpreter level is fine. But as a Python developer I > should have no need for that. That's a noble goal :-). I agree with the goal, but I still wanted to point out that the case of array is really quite outside of the range of possibilities of typical JIT compilers. Consider the hypothetical problem of having to write a pure-Python array module without using any other module, only builtin types. Then you would have to map arrays to be normal Python lists, and you would have no way to circumvent the fact that all objects in the lists are boxed. The JIT is now not helping you at all, because it only optimizes on a code level, and cannot change the way your data is structured in memory. I know that this is not at all how you are proposing the array module should be written, but I still wanted to point out that current JITs don't help you much if your data is represented in a bad way. We have some ideas how data representations could be optimized at runtime, but nothing implemented yet. 
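[Editorial sketch, not part of the original message.] The data-representation point above can be illustrated roughly as follows. This is a minimal sketch written for Python 3, assuming only the standard `array` and `struct` modules; the names `boxed`, `packed`, `raw`, `first` and `rejected` are made up for the illustration, and nothing here measures PyPy 1.3 itself. It only shows that a list holds references to arbitrary objects, while `array.array('d')` holds raw C doubles that have to be boxed again whenever they are read.

```python
import array
import struct

boxed = [0.5, 1.5, 2.5, 3.5]        # list: references to separate float objects
packed = array.array('d', boxed)    # array: raw C doubles stored back to back

# The array's payload is exactly len * itemsize bytes, with no per-item header.
raw = packed.tobytes()
assert len(raw) == len(packed) * packed.itemsize

# Decoding the first itemsize bytes by hand recovers the first element.
first = struct.unpack('d', raw[:packed.itemsize])[0]
assert first == boxed[0]

# A list accepts any object, so its storage cannot assume unboxed doubles...
boxed.append("not a number")

# ...while the typed array refuses anything that is not a float.
try:
    packed.append("not a number")
    rejected = False
except TypeError:
    rejected = True
assert rejected
```

Because the element type of the array is fixed when it is created, its storage can stay unboxed; a list makes no such promise, which is why optimizing the code alone does not change how the list's data is laid out.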
Cheers, Carl Friedrich
From p.giarrusso at gmail.com Fri Jul 2 21:35:55 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Fri, 2 Jul 2010 21:35:55 +0200 Subject: [pypy-dev] array performace? In-Reply-To: <4C2E3182.4020307@gmx.de> References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On Fri, Jul 2, 2010 at 20:35, Carl Friedrich Bolz wrote:
> Hi Paolo, > > On 07/02/2010 02:08 PM, Paolo Giarrusso wrote: >>>> Unsupported claim is for example that fast interpreters are 10x >>>> slower than C. >> That's the only unsupported claim, but it comes from "The Structure >> and Performance of Efficient Interpreters". I studied that as a >> student on VM, you are writing one, so I (unconsciously) guessed >> that everybody knows that paper - I know that's a completely broken >> way of writing, but I didn't spot it. > > Even if something is claimed by a well-known paper, it doesn't > necessarily have to be true. The paper considers a class of interpreters > where each specific bytecode does very little work (the paper does not > make this assumption explicit). This is not the case for Python at all, > so I think that the conclusions of the paper don't apply directly.
Well, actually what I mention is not a conclusion of that paper, but what you say probably applies to the original paper which is referenced, so it doesn't matter.
> This is explained quite clearly in the following paper: > > Virtual-Machine Abstraction and Optimization Techniques by Stefan > Brunthaler in Bytecode 2009.
I already mentioned that paper, a couple of years ago, when discussing threading in PyPy, and my point was dismissed on general arguments. I'm happy to see now a paper stating your point, so that it can be discussed more precisely.
But the obvious question is: given the mixed characteristics of the Lua interpreter, what is the instruction subdivision in that case? They write it's in the same class without any measurement, while it can complete an addition in 5 instructions instead of 3, and avoiding the need for separate loads. In Python, instead, refcounting alone is a very expensive operation. Beyond that, that paper also acknowledges that a virtual machine for Prolog, even if using dynamic types like Python, was in the same efficiency class as lower-level VMs. I agree however that other optimizations are needed first. I would expect Lua to seem more 'low-level' also from this point of view, and thus able to benefit more from threading. And with Python 3.0, where the distinction between int and long is gone, the Lua implementation would be almost fine, if one uses tagged integer and optimizes overflow checking through assembler (it's two lines of assembly code on x86/x86_64). > That's a noble goal :-). I agree with the goal, but I still wanted to > point out that the case of array is really quite outside of the range of > possibilities of typical JIT compilers. Consider the hypothetical > problem of having to write a pure-Python array module without using any > other module, only builtin types. Then you would have to map arrays to > be normal Python lists, and you would have no way to circumvent the fact > that all objects in the lists are boxed. The JIT is now not helping you > at all, because it only optimizes on a code level, and cannot change the > way your data is structured in memory. > I know that this is not at all how you are proposing the array module > should be written, but I still wanted to point out that current JITs > don't help you much if your data is represented in a bad way. We have > some ideas how data representations could be optimized at runtime, but > nothing implemented yet. OK, agreed. 
It would still be generally useful if the JIT _could_ optimize such cases, but that's hard enough. Especially, trying to recognize that the list is used with homogeneous elements does not look easy in such a setting. However, again, what about tagged integers? They wouldn't allow optimizing all uses of arrays, but they would be generally useful on at least 31-bit integers and narrow characters. If I had more free time, and then also enough disk space to translate PyPy (I recall I hadn't when I conceived trying), I could maybe try doing that myself, with some help. Don't hold your breath for that, though. Best regards
-- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/
From hakan at debian.org Fri Jul 2 21:59:01 2010 From: hakan at debian.org (Hakan Ardo) Date: Fri, 2 Jul 2010 21:59:01 +0200 Subject: [pypy-dev] Interpreter level array implementation Message-ID: Hi, we got the simplest possible interpreter level implementation of an array-like object running (in the interplevel-array branch) and it executes my previous example about 2 times slower than optimized C. Attached is the trace generated by the following example:

    img=array(640*480);   l=0;   i=0;
    while i<640*480:
        l+=img[i]
        i+=1

a simplified version of that trace is:

   1. [p0, p1, p2, p3, i4, p5, p6, p7, p8, p9, p10, f11, i]
   2. i14 = int_lt(i, 307200)
   3.   guard_true(i14, descr=)
   4.   guard_nonnull_class(p10, 145745952, descr=)
   5. img = getfield_gc(p10, descr=)
   6. f17 = getarrayitem_gc(img, i, descr=)
   7. f18 = float_add(f11, f17)
   8. i20 = int_add_ovf(i, 1)
   9.   guard_no_overflow(, descr=) #
  10. i23 = getfield_raw(149604768, descr=)
  11. i25 = int_add(i23, 1)
  12. setfield_raw(149604768, i25, descr=)
  13. i28 = int_and(i25, -2131755008)
  14. i29 = int_is_true(i28)
  15.   guard_false(i29, descr=)
  16. jump(p0, p1, p2, p3, 27, ConstPtr(ptr31), ConstPtr(ptr32),
           ConstPtr(ptr33), p8, p9, p10, f18, i20)

Do these operations more or less correspond to assembler instructions?
I guess that the extra overhead here as compared to the C version would be lines 4, 5, 9 and 10-15. What's 10-15 all about? I guess that most of these additional operations would not affect the performance of more complicated loops as they will only occur once per loop (although combining the guard on line 9 with line 3 might be a possible optimization)? Line 4 will appear once for each array used in the loop and line 5 once for every array access, right? Can the array implementation be designed in some way that would not generate line 5 above? Or would it be possible to get rid of it by some optimization? -- Håkan Ardö -------------- next part -------------- A non-text attachment was scrubbed... Name: log Type: application/octet-stream Size: 2316 bytes Desc: not available Url : http://codespeak.net/pipermail/pypy-dev/attachments/20100702/0fbdacbc/attachment.obj
From alex.gaynor at gmail.com Fri Jul 2 22:12:19 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 2 Jul 2010 15:12:19 -0500 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: In addition to the things you noted, I guess the int overflow check can be optimized out, since i+=1 can never cause it to overflow given that i is bounded at 640*480. I suppose in general that would require more dataflow analysis. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me
From fijall at gmail.com Fri Jul 2 23:16:36 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 15:16:36 -0600 Subject: [pypy-dev] array performace?
In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: [snip] > the need for separate loads. In Python, instead, refcounting alone is > a very expensive operation. How does that apply to pypy? From fijall at gmail.com Fri Jul 2 23:21:17 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 15:21:17 -0600 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: General note - we consider 2x optimized C a pretty good result :) Details below On Fri, Jul 2, 2010 at 1:59 PM, Hakan Ardo wrote: > Hi, > we got the simplest possible interpreter level implementation of an > array-like object running (in the interplevel-array branch) and it > executes my previous example about 2 times slower than optimized C. > Attached is the trace generated by the following example: > > ? ?img=array(640*480); ? l=0; ? i=0; > ? ?while i<640*480: > ? ? ? ?l+=img[i] > ? ? ? ?i+=1 > > a simplified version of that trace is: > > ? 1. [p0, p1, p2, p3, i4, p5, p6, p7, p8, p9, p10, f11, i] > ? 2. i14 = int_lt(i, 307200) > ? 3. ? guard_true(i14, descr=) > ? 4. ? guard_nonnull_class(p10, 145745952, descr=) > ? 5. img = getfield_gc(p10, descr=) > ? 6. f17 = getarrayitem_gc(img, i, descr=) > ? 7. f18 = float_add(f11, f17) > ? 8. i20 = int_add_ovf(i, 1) > ? 9. ? guard_no_overflow(, descr=) # > ?10. i23 = getfield_raw(149604768, descr=) > ?11. i25 = int_add(i23, 1) > ?12. setfield_raw(149604768, i25, descr=) > ?13. i28 = int_and(i25, -2131755008) > ?14. i29 = int_is_true(i28) > ?15. ? guard_false(i29, descr=) > ?16. jump(p0, p1, p2, p3, 27, ConstPtr(ptr31), ConstPtr(ptr32), > ? ? ? ? ? ConstPtr(ptr33), p8, p9, p10, f18, i20) > > Does these operation more or less correspond to assembler > instructions? Yes. Use PYPYJITLOG=log pypy-c ... 
to get assembler. View using pypy/jit/backend/x86/tool/viewcode.py > I guess that the extra overhead here as compared to the > C version would be line 4, 5, 9 and 10-15. What's 10-15 all about? It's about a couple of things that the Python interpreter has to perform, notably asynchronous signal checking and thread switching with the GIL. > I guess that most of these additional operations would not affect the > performance of more complicated loops as they will only occur once per > loop (although combining the guard on line 9 with line 3 might be a > possible optimization)? Line 4 will appear once for each array used in > the loop and line 5 once for every array access, right? Yes. We don't do loop-invariant optimizations for several reasons, the main one being that you can always add a bridge to a loop, and the bridge may invalidate the invariant. > > Can the array implementation be designed in some way that would not > generate line 5 above? Or would it be possible to get rid of it by > some optimization? No, it's about optimizations in the JIT itself (it's an artifact of Python looping rather than of the array module). > > -- > Håkan Ardö > > _______________________________________________ > pypy-dev at codespeak.net > http://codespeak.net/mailman/listinfo/pypy-dev > Cheers, fijal From bokr at oz.net Sat Jul 3 00:56:39 2010 From: bokr at oz.net (Bengt Richter) Date: Fri, 02 Jul 2010 15:56:39 -0700 Subject: [pypy-dev] array performace? In-Reply-To: <4C2E3182.4020307@gmx.de> References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote: > Hi Paolo, > > On 07/02/2010 02:08 PM, Paolo Giarrusso wrote: >>>> Unsupported claim is for example that fast interpreters are 10x >>>> slower than C.
>> That's the only unsupported claim, but it comes from "The Structure >> and Performance of Efficient Interpreters". I studied that as a >> student on VM, you are writing one, so I (unconsciously) guessed >> that everybody knows that paper - I know that's a completely broken >> way of writing, but I didn't spot it. > > Even if something is claimed by a well-known paper, it doesn't > necessarily have to be true. The paper considers a class of interpreters > where each specific bytecode does very little work (the paper does not > make this assumption explicit). This is not the case for Python at all, > so I think that the conclusions of the paper don't apply directly. > > This is explained quite clearly in the following paper: > > Virtual-Machine Abstraction and Optimization Techniques by Stefan > Brunthaler in Bytecode 2009. > > > [...] >> Well, at the abstraction level I'm speaking, it sounds like there in >> the end, the JIT will be able to do what is needed. I am not aware >> of the details. But then, at the end of that project, it seems to me >> that it should be possible to write the array module in pure Python >> using this new FFI interface and have the JIT look at it, shouldn't >> it? I do not concentrate on array specifically - rewriting a few >> modules at interpreter level is fine. But as a Python developer I >> should have no need for that. > > That's a noble goal :-). I agree with the goal, but I still wanted to > point out that the case of array is really quite outside of the range of > possibilities of typical JIT compilers. Consider the hypothetical > problem of having to write a pure-Python array module without using any > other module, only builtin types. Then you would have to map arrays to > be normal Python lists, and you would have no way to circumvent the fact > that all objects in the lists are boxed.
The JIT is now not helping you > at all, because it only optimizes on a code level, and cannot change the > way your data is structured in memory. > > I know that this is not at all how you are proposing the array module > should be written, but I still wanted to point out that current JITs > don't help you much if your data is represented in a bad way. We have > some ideas how data representations could be optimized at runtime, but > nothing implemented yet. A thought/question: Could/does JIT make use of information in an assert statement? E.g., could we write assert set(type(x) for x in img) == set([float]) and len(img)==640*480 in front of a loop operating on img and have JIT use the info as assumed true even when "if __debug__:" suites are optimized away? Could such assertions allow e.g. a list to be implemented as a homogeneous vector of unboxed representations? What kind of guidelines for writing assertions would have to exist to make them useful to JIT most easily? Regards, Bengt Richter From amauryfa at gmail.com Sat Jul 3 01:14:40 2010 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Sat, 3 Jul 2010 01:14:40 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: Hi, 2010/7/3 Bengt Richter : > A thought/question: > > Could/does JIT make use of information in an assert statement? E.g., could we write > ? ? assert set(type(x) for x in img) == set([float]) and len(img)==640*480 > in front of a loop operating on img and have JIT use the info as assumed true > even when "if __debug__:" suites are optimized away? > > Could such assertions allow e.g. a list to be implemented as a homogeneous vector > of unboxed representations? 
> > What kind of guidelines for writing assertions would have to exist to make them > useful to JIT most easily? If efficient python code needs this, I'd better write the loop in C and explicitly choose the types. The C code could be inlined in the python script, and compiled on demand. At least you'll know what you get. -- Amaury Forgeot d'Arc From bokr at oz.net Sat Jul 3 02:38:16 2010 From: bokr at oz.net (Bengt Richter) Date: Fri, 02 Jul 2010 17:38:16 -0700 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: <4C2E8678.5070208@oz.net> On 07/02/2010 04:14 PM Amaury Forgeot d'Arc wrote: > Hi, > > 2010/7/3 Bengt Richter : >> A thought/question: >> >> Could/does JIT make use of information in an assert statement? E.g., could we write >> assert set(type(x) for x in img) == set([float]) and len(img)==640*480 >> in front of a loop operating on img and have JIT use the info as assumed true >> even when "if __debug__:" suites are optimized away? >> >> Could such assertions allow e.g. a list to be implemented as a homogeneous vector >> of unboxed representations? >> >> What kind of guidelines for writing assertions would have to exist to make them >> useful to JIT most easily? > > If efficient python code needs this, I'd better write the loop in C > and explicitly choose the types. > The C code could be inlined in the python script, and compiled on demand. > At least you'll know what you get. > Well, even C accepts hints like 'register' (and may ignore you, so you are not truly sure what you get ;-) The point of using assert would be to let the user remain within the python language, while still passing useful hints to the compiler. 
If I wanted to mix languages (not uninteresting!), I'd go with racket (the star formerly known as PLT-scheme) http://www.racket-lang.org/ They have extended programmability right down to the reader/tokenizer, so it might well be possible for them to accept literal C as a translated sub/macro-language, given the appropriate syntax definitions written in racket. For more, see http://docs.racket-lang.org/guide/languages.html and more specifically http://docs.racket-lang.org/guide/hash-reader.html Regards, Bengt Richter From fijall at gmail.com Sat Jul 3 07:00:33 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 2 Jul 2010 23:00:33 -0600 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On Fri, Jul 2, 2010 at 4:56 PM, Bengt Richter wrote: > On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote: >> Hi Paolo, >> >> On 07/02/2010 02:08 PM, Paolo Giarrusso wrote: >>>>> Unsupported claim is for example that fast interpreters are 10x >>>>> slower than C. >>> That's the only unsupported claim, but it comes from "The Structure >>> and Performance of Efficient Interpreters". I studied that as a >>> student on VM, you are writing one, so I (unconsciously) guessed >>> that everybody knows that paper - I know that's a completely broken >>> way of writing, but I didn't spot it. >> >> Even if something is claimed by a well-known paper, it doesn't >> necessarily have to be true. The paper considers a class of interpreters >> where each specific bytecode does very little work (the paper does not >> make this assumption explicit). This is not the case for Python at all, >> so I think that the conclusions of the paper don't apply directly.
>> >> This is explained quite clearly in the following paper: >> >> Virtual-Machine Abstraction and Optimization Techniques by Stefan >> Brunthaler in Bytecode 2009. >> >> >> [...] >>> Well, at the abstraction level I'm speaking, it sounds like there in >>> the end, the JIT will be able to do what is needed. I am not aware >>> of the details. But then, at the end of that project, it seems to me >>> that it should be possible to write the array module in pure Python >>> using this new FFI interface and have the JIT look at it, shouldn't >>> it? I do not concentrate on array specifically - rewriting a few >>> modules at interpreter level is fine. But as a Python developer I >>> should have no need for that. >> >> That's a noble goal :-). I agree with the goal, but I still wanted to >> point out that the case of array is really quite outside of the range of >> possibilities of typical JIT compilers. Consider the hypothetical >> problem of having to write a pure-Python array module without using any >> other module, only builtin types. Then you would have to map arrays to >> be normal Python lists, and you would have no way to circumvent the fact >> that all objects in the lists are boxed. The JIT is now not helping you >> at all, because it only optimizes on a code level, and cannot change the >> way your data is structured in memory. >> >> I know that this is not at all how you are proposing the array module >> should be written, but I still wanted to point out that current JITs >> don't help you much if your data is represented in a bad way. We have >> some ideas how data representations could be optimized at runtime, but >> nothing implemented yet. > > A thought/question: > > Could/does JIT make use of information in an assert statement? E.g., could we write > ? ? 
assert set(type(x) for x in img) == set([float]) and len(img)==640*480 > in front of a loop operating on img and have JIT use the info as assumed true > even when "if __debug__:" suites are optimized away? > > Could such assertions allow e.g. a list to be implemented as a homogeneous vector > of unboxed representations? > > What kind of guidelines for writing assertions would have to exist to make them > useful to JIT most easily? > > Regards, > Bengt Richter If you look closer, this assertion is insanely complex to derive any information from (you even used a generator expression). Besides, nothing stops you from breaking that assumption later. You would need some sort of static analysis, which is either very hard or plain impossible in Python. Instead, we would rather pursue ways of getting runtime profiling data to learn usage patterns. Cheers, fijal From hakan at debian.org Sat Jul 3 08:14:00 2010 From: hakan at debian.org (Hakan Ardo) Date: Sat, 3 Jul 2010 08:14:00 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: On Fri, Jul 2, 2010 at 11:21 PM, Maciej Fijalkowski wrote: > General note - we consider 2x optimized C a pretty good result :) Details below As do I :) I just want to make this as jit-friendly as possible without really knowing what's jit-friendly... > Yes. We don't do loop invariant optimizations for some reasons, the > best of it being the fact that to loop you can always add a bridge > which will invalidate this invariant. Are you telling me that you probably never will include that kind of optimization because of the limitations it imposes on other parts of the jit or just that it would be a lot of work to get it in place? What is a bridge? -- Håkan Ardö
From fijall at gmail.com Sat Jul 3 08:20:01 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 3 Jul 2010 00:20:01 -0600 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: On Sat, Jul 3, 2010 at 12:14 AM, Hakan Ardo wrote: > On Fri, Jul 2, 2010 at 11:21 PM, Maciej Fijalkowski wrote: >> General note - we consider 2x optimized C a pretty good result :) Details below > > As do I :) I just want to make this as jit-friendly as possible > without really knowing what's jit-friendly... I think it's fairly JIT friendly. You can look into traces (as you did), but it seems fine to me. > >> Yes. We don't do loop invariant optimizations for some reasons, the >> best of it being the fact that to loop you can always add a bridge >> which will invalidate this invariant. > > Are you telling me that you probably never will include that kind of > optimization because of the limitations it imposes on other parts of > the jit or just that it would be a lot of work to get it in place? It requires thinking. It's harder to do because we don't know statically upfront how many paths we'll compile to assembler, but I can think about ways to mitigate that. > > What is a bridge? If a guard fails often enough, the code path starting at that guard is traced and compiled to assembler. That's a bridge. > > -- > Håkan Ardö > From p.giarrusso at gmail.com Sat Jul 3 08:58:34 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Sat, 3 Jul 2010 08:58:34 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: On Sat, Jul 3, 2010 at 08:20, Maciej Fijalkowski wrote: > On Sat, Jul 3, 2010 at 12:14 AM, Hakan Ardo wrote: >> On Fri, Jul 2, 2010 at 11:21 PM, Maciej Fijalkowski wrote: >>> General note - we consider 2x optimized C a pretty good result :) Details below >> >> As do I :) I just want to make this as jit-friendly as possible >> without really knowing what's jit-friendly... > > I think it's fairly JIT friendly.
You can look into traces (as you > did), but seems fine to me. >>> Yes. We don't do loop invariant optimizations for some reasons, the >>> best of it being the fact that to loop you can always add a bridge >>> which will invalidate this invariant. >> >> Are you telling me that you probably never will include that kind of >> optimization because of the limitations it imposes on other parts of >> the jit or just that it would be a lot of work to get it in place? > > It requires thinking. It's harder to do because we don't know > statically upfront how many paths we'll compile to assembler, but I > can think about ways to mitigate that. Isn't there some existing research about that in the 'tracing' community? As far as I remember, the theory is that traces are assembled in trace trees, and that each time a (simplified*) SSA optimization pass is applied to the trace tree to compile it. Not sure whether they do it also for Javascript, since there compilation times have to be very fast, but I guess they did so in their Java compiler. Also, in other cases the general JIT approach is 'optimize and invalidate if needed'. For instance, if a Java class has no subclass, it's not safe to assume this will hold forever to perform optimization; but the optimization is performed and a hook is installed so that class loading will undo the optimization. Another issue: what is i4 for? It's not used at all in the loop, but it is reset to 27 at the end of it, each time. Doesn't such a var waste some (little) time? * SSA on trace trees took advantage of their simpler structure compared to graphs for some operations. -- Paolo Giarrusso - Ph.D. 
Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From arigo at tunes.org Sat Jul 3 09:14:02 2010 From: arigo at tunes.org (Armin Rigo) Date: Sat, 3 Jul 2010 09:14:02 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: <20100703071402.GA19649@code0.codespeak.net> Hi Alex, On Fri, Jul 02, 2010 at 03:12:19PM -0500, Alex Gaynor wrote: > In addition to the things you noted, I guess the int overflow check > can be optimized out, since i+=1 can never cause it to overflow given > that i is bounded at 640*480. I suppose in general that would require > more dataflow analysis. Hakan mentioned this. It's actually an easy optimization in our linear code; I guess I will give it a try. A bientot, Armin. From arigo at tunes.org Sat Jul 3 09:28:14 2010 From: arigo at tunes.org (Armin Rigo) Date: Sat, 3 Jul 2010 09:28:14 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: <20100703072814.GB19649@code0.codespeak.net> Hi Paolo, On Sat, Jul 03, 2010 at 08:58:34AM +0200, Paolo Giarrusso wrote: > Isn't there some existing research about that in the 'tracing' > community? (...) Not sure > whether they do it also for Javascript, since there compilation times > have to be very fast, but I guess they did so in their Java compiler. We are not very good at mentioning existing research, but at least for the case of tracing JITs I think we know pretty much everything published, which you might find by googling for tracing JIT. (It's always a better approach than doing "guesses" in an unrelated project's mailing list.) For how PyPy's tracing JIT compares to existing approaches, there is a PyPy paper at: http://codespeak.net/svn/pypy/extradoc/talk/icooolps2009/ As well as the start of a draft about virtuals at: http://codespeak.net/svn/pypy/extradoc/talk/s3-2010/ And you should not miss Benjamin's great summary at: http://codespeak.net/pypy/dist/pypy/doc/jit/pyjitpl5.html A bientot, Armin. 
From anto.cuni at gmail.com Sat Jul 3 09:52:37 2010 From: anto.cuni at gmail.com (Antonio Cuni) Date: Sat, 03 Jul 2010 09:52:37 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: <4C2EEC45.3090205@gmail.com> On 03/07/10 08:14, Hakan Ardo wrote: > What is a bridge? You might be interested in reading the chapter of my PhD thesis that explains exactly that, with diagrams: http://codespeak.net/svn/user/antocuni/phd/thesis/thesis.pdf In particular, section 6.4 explains the difference between loops, bridges and entry bridges. ciao, Anto From cfbolz at gmx.de Sat Jul 3 10:03:27 2010 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Sat, 3 Jul 2010 10:03:27 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: Hi Paolo, 2010/7/3 Paolo Giarrusso : >> It requires thinking. It's harder to do because we don't know >> statically upfront how many paths we'll compile to assembler, but I >> can think about ways to mitigate that. > > Isn't there some existing research about that in the 'tracing' > community? As far as I remember, the theory is that traces are > assembled in trace trees, and that each time a (simplified*) SSA > optimization pass is applied to the trace tree to compile it. Not sure > whether they do it also for Javascript, since there compilation times > have to be very fast, but I guess they did so in their Java compiler. There are two ways to deal with attaching new traces to existing ones. On the one hand there are trace trees, which recompile the whole tree of traces when a new one is added. This can be costly. On the other hand, there is trace stitching, which just patches the existing trace to jump to the new one. PyPy (and TraceMonkey, I think) use trace stitching. The problem with loop-invariant code motion is that when you stitch in a new trace (what we call a bridge) it is not clear that the code that was invariant so far is invariant on the new path as well.
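A toy illustration of that last point, in plain Python rather than in terms of PyPy's actual JIT data structures: the names Box, sum_reloading and sum_hoisted are hypothetical, and the `i == 2` branch merely stands in for a rarely taken path that would only later be compiled as a bridge. Hoisting the field read out of the loop is only correct as long as no path inside the loop can change the field.

```python
class Box(object):
    """Hypothetical container; `data` plays the role of the field read by getfield_gc."""
    def __init__(self, data):
        self.data = data


def sum_reloading(box, n):
    """Re-read box.data on every iteration, as the trace above does."""
    total = 0.0
    for i in range(n):
        if i == 2:                    # rare path; stands in for a bridge added later
            box.data = [10.0] * n     # the "invariant" field changes here
        total += box.data[i]          # fresh read, always sees the current list
    return total


def sum_hoisted(box, n):
    """Same loop with the field read hoisted out, as loop-invariant code motion would do."""
    data = box.data                   # read once, before the loop
    total = 0.0
    for i in range(n):
        if i == 2:
            box.data = [10.0] * n     # this write is invisible to `data`
        total += data[i]              # stale after the rare path has run
    return total


reloaded = sum_reloading(Box([1.0] * 4), 4)   # 1 + 1 + 10 + 10 = 22.0
hoisted = sum_hoisted(Box([1.0] * 4), 4)      # 1 + 1 + 1 + 1 = 4.0
```

The two functions agree until the rare path is taken; once it exists, the hoisted version computes a different result, which is why a stitched-in bridge can silently break an optimization that was valid for the original loop alone.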
Cheers, Carl Friedrich From santagada at gmail.com Sat Jul 3 09:57:51 2010 From: santagada at gmail.com (Leonardo Santagada) Date: Sat, 3 Jul 2010 04:57:51 -0300 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: References: Message-ID: On Jul 3, 2010, at 3:58 AM, Paolo Giarrusso wrote: > Another issue: what is i4 for? It's not used at all in the loop, but > it is reset to 27 at the end of it, each time. Doesn't such a var > waste some (little) time? This I found interesting. Does anyone know the answer? -- Leonardo Santagada santagada at gmail.com From p.giarrusso at gmail.com Sat Jul 3 10:14:34 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Sat, 3 Jul 2010 10:14:34 +0200 Subject: [pypy-dev] Interpreter level array implementation In-Reply-To: <20100703072814.GB19649@code0.codespeak.net> References: <20100703072814.GB19649@code0.codespeak.net> Message-ID: On Sat, Jul 3, 2010 at 09:28, Armin Rigo wrote: > Hi Paolo, > > On Sat, Jul 03, 2010 at 08:58:34AM +0200, Paolo Giarrusso wrote: >> Isn't there some existing research about that in the 'tracing' >> community? (...) Not sure >> whether they do it also for Javascript, since there compilation times >> have to be very fast, but I guess they did so in their Java compiler. > > We are not very good at mentioning existing research, but at least for > the case of tracing JITs I think we know pretty much everything > published, which you might find by googling for tracing JIT. (It's > always a better approach than doing "guesses" in an unrelated project's > mailing list.) If you had read the next sentence you'd have found out that I did read some papers about that (where I learned about trace trees). My guess was just about whether their Java compiler used trace trees or the other possibility, i.e., trace stitching (as I now learned). But thanks for the references, I'll have a look later. -- Paolo Giarrusso - Ph.D.
Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From william.leslie.ttg at gmail.com Sat Jul 3 16:20:46 2010 From: william.leslie.ttg at gmail.com (William Leslie) Date: Sun, 4 Jul 2010 00:20:46 +1000 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On 3 July 2010 08:56, Bengt Richter wrote: > On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote: > A thought/question: > > Could/does JIT make use of information in an assert statement? E.g., could we write > assert set(type(x) for x in img) == set([float]) and len(img)==640*480 > in front of a loop operating on img and have JIT use the info as assumed true > even when "if __debug__:" suites are optimized away? There are several reasons we can't make use of such information from the JIT at the moment. It requires more information than we have, and it is difficult to analyse quickly. If img is visible from outside the current thread, for example, the ad-hoc memory model of the Python language means we would have to order writes and reads to img from other threads with the JIT's own accesses. Similarly, functions that we call may insert objects that break this invariant. Determining when this may occur requires analysing a lot of code - for example, if *one* type was not int, it could implement a __radd__ method that broke the invariant. It's typically faster to just execute the code than to find out. In the presence of whole-program optimisation this sort of thing is possible; with the right analysis it may be possible within the JIT, but the question remains whether it will be profitable. (This is an area I have been exploring, but don't hold your breath for results.)
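A small sketch of how such an invariant can be broken after the assertion has passed. This is only an illustration under stated assumptions: all_floats, Sneaky, swap_in_sneaky and total are hypothetical names, and the callback stands in for any function the loop calls whose body the compiler cannot see up front.

```python
def all_floats(seq):
    """The invariant from the proposed assertion: every element is exactly a float."""
    return set(type(x) for x in seq) == set([float])


class Sneaky(object):
    """Hypothetical non-float element whose __radd__ mutates the list being summed."""
    def __init__(self, target):
        self.target = target

    def __radd__(self, other):
        # Called for `other + self` because float.__add__ returns NotImplemented.
        self.target.append("surprise")   # arbitrary side effect on the list
        return other + 100.0


def swap_in_sneaky(img, i):
    """Stands in for a called function that inserts an object breaking the invariant."""
    if i == 1:
        img[2] = Sneaky(img)


def total(img, callback):
    assert all_floats(img)        # holds here, says nothing about later iterations
    n = len(img)
    acc = 0.0
    for i in range(n):
        callback(img, i)          # may do anything to img
        acc = acc + img[i]        # may dispatch to a user-defined __radd__
    return acc


img = [1.0, 2.0, 3.0]
result = total(img, swap_in_sneaky)   # 1.0 + 2.0, then Sneaky.__radd__(3.0) gives 103.0
```

After the call, `all_floats(img)` is false and the list has grown by one element, even though the assertion passed at the top of the function. A compiler that had committed to an unboxed float layout on the strength of the assertion would have to detect both changes at run time anyway.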
On 3 July 2010 10:38, Bengt Richter wrote: > On 07/02/2010 04:14 PM Amaury Forgeot d'Arc wrote: >> If efficient python code needs this, I'd better write the loop in C >> and explicitly choose the types. >> The C code could be inlined in the python script, and compiled on demand. >> At least you'll know what you get. >> > Well, even C accepts hints like 'register' (and may ignore you, so you are not truly sure what you get ;-) > > The point of using assert would be to let the user remain within the python language, while still passing > useful hints to the compiler. Interesting you mention racket. Racket comes with a static language that integrates with their usual dynamic Scheme. Many common lisp implementations provide optional typing. Paolo recently bemoaned the trend toward writing modules at interp level for speed* - I'm not really sure if it is a trend now or not - but at some point it might be fun looking at optional typing annotations that compile the case for those assumptions. It might be a precursor to cython or pyrex support. * with justification : though ok for the stdlib, translating pypy every time you add an extension module is going to get old. fast. > Could such assertions allow e.g. a list to be implemented as a homogeneous vector > of unboxed representations? Pypy is already great in terms of data layout, for example pypy uses shadow classes in the form of 'structures', but supporting more complicated layout optimisations (such as row or column order storage for structures so the JIT can do relational algebra) would probably be unique. It doesn't seem so far off considering that in the progression (list int) -> (list unpacked tuple int) -> (list unpacked homogenous structure), the first step, limiting or otherwise determining the item type, is the most complicated. 
> If I wanted to mix languages (not uninteresting!), I'd go with > racket (the star formerly known as PLT-scheme) -- possible can of worms -- As for mixing languages, that is the pinnacle of awesome; but this is probably not the list for it. MLVMs such as JVM+JSR-292, Racket, GNU Guile, and Parrot; it seems to me that once you settle on an execution / object model and / or bytecode format, you've already decided which languages (where the 's' seems superfluous) are going to get first-class support. Don't get me wrong, I find each of these really exciting, but good multi-platform integration is a much harder problem than writing a few compilers with a common bytecode format; and even the common bytecode format is probably not a good idea, because different languages need (really) different primitives, as pirate has borne out. Other impedance mismatches, such as calling conventions (e.g., javascript and lua functions silently accepting an incorrect number of arguments), reduction methods (applicative vs normal order vs call-by-name), mutable strings, TCE, various type systems involving structural types, Oliveira/Sulzmann classes, existential types, dependent types, value types, single and multiple inheritance, and the completely insane (prolog) make implementing real multi-language platforms a mammoth task. And even if you manage to get that working, how do you make exception hierarchies work? Why can't I cast my Java ArrayList as a C# ArrayList? etc. Sure, you could probably hook up a few of the bundled VMs; IO or E would make for a great twisted integration DSL. But actually convincing people to lock themselves into an unstandardised, unproven chimera? Let's just say that doing multi-language right is NP-hard. Doing it while targeting JVM and CLI, offering platform integration while supporting exotic language constructs like real continuations? Likely impossible. It's a nice idea, but probably out of PyPy's scope.
-- William Leslie From p.giarrusso at gmail.com Sat Jul 3 18:51:49 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Sat, 3 Jul 2010 18:51:49 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On Fri, Jul 2, 2010 at 23:16, Maciej Fijalkowski wrote: > [snip] > >> the need for separate loads. In Python, instead, refcounting alone is >> a very expensive operation. > > > How does that apply to pypy? I was talking about the original paper. -- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/ From p.giarrusso at gmail.com Sat Jul 3 19:22:54 2010 From: p.giarrusso at gmail.com (Paolo Giarrusso) Date: Sat, 3 Jul 2010 19:22:54 +0200 Subject: [pypy-dev] array performace? In-Reply-To: References: <20100701152827.GA30661@code0.codespeak.net> <01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp> <01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp> <4C2E3182.4020307@gmx.de> Message-ID: On Sat, Jul 3, 2010 at 16:20, William Leslie wrote: > On 3 July 2010 08:56, Bengt Richter wrote: >> On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote: > Paolo recently bemoaned the > trend toward writing modules at interp level for speed* - I'm not > really sure if it is a trend now or not - but at some point it might > be fun looking at optional typing annotations that compile the case > for those assumptions. It might be a precursor to cython or pyrex > support. > * with justification : though ok for the stdlib, translating pypy > every time you add an extension module is going to get old. fast. That's one point, but it's not the biggest one. 
I guess that if that happens often enough, at some point one will need
to implement separate compilation for RPython as well (at least for
development). I mean, whole-program optimization (which one would maybe
lose) is optional in other languages.

1) The real problem is that you don't want users to need interp-level
coding for their program. If they need it, something is wrong (and I now
think/hope it's not the case).
2) Another instance of the same issue happens when Python developers are
advised to write extensions in C or to perform inlining by hand.
3) The last case is users avoiding Python (or another high-level
language) altogether because of bad performance.

The common factor is that in all cases, a weakness of the implementation
makes the abstraction less desirable, and thus user programs are
hand-optimized and become less maintainable. That's why efficient JITs
(including PyPy) are important. It is interesting that 2) also stems
from the desire of Guido van Rossum to keep CPython simple, while
complicating life for its users.

>> Could such assertions allow e.g. a list to be implemented as a homogeneous vector
>> of unboxed representations?

> Pypy is already great in terms of data layout, for example pypy uses
> shadow classes in the form of 'structures', but supporting more
> complicated layout optimisations (such as row or column order storage
> for structures so the JIT can do relational algebra) would probably be
> unique. It doesn't seem so far off considering that in the progression
> (list int) -> (list unpacked tuple int) -> (list unpacked homogeneous
> structure), the first step, limiting or otherwise determining the item
> type, is the most complicated.

> As for mixing languages, that is the pinnacle of awesome; but this is
> probably not the list for it. MLVMs such as JVM+JSR-292, Racket, GNU
> Guile, and Parrot; it seems to me that once you settle on an execution
> / object model and / or bytecode format, you've already decided what
> languages (where the 's' seems superfluous) support is going to be
> first class for.

You are right about "first class support". But assembly doesn't offer
first class support for anything, and still you can make it work. Of
course, bytecodes are more limited, but sometimes you might manage.

Three fellow students of mine implemented, for instance, a Python-to-JVM
bytecode compiler which was way faster than Jython. What was the trick?
Python methods were encoded as Java classes (maybe with static methods),
and they performed inline-caching in bytecode, i.e., each call was
converted to something like:

    if (target.class() == this_class)
        specificMethodClass.perform(target, args)
    else
        (perform normal method resolution, and possibly regenerate the class)

I'm unsure about the actual code produced for the call - either they
used static classes, or they just relied on inline-caching/inlining by
the underlying JIT. Another detail (I guess) is that you need some form
of shadow classes (like Self, V8, and also PyPy I guess - if you talk
about the same thing).

Unfortunately, I don't know whether they published their code - it was
for a term project for a course held by Lars Bak (the V8 author) in
Aarhus. It worked quite well, and there was still potential for
optimization. I don't know how feature-complete they were, though;
still, they managed to perform a meta-implementation of Inline-Caching
(and the same trick also allows polymorphic inline-caching), where meta-
is used like in meta-interpreter.

I guess it would still be possible to interoperate with Java classes -
you can still provide, I think, a conventional interface (where methods
become just... methods), even if possibly it will be slower.
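
The inline-caching idea described above can be sketched in a few lines
of Python. This is only an illustration of the general technique, not
the students' compiler: the names CallSite, lookup and invoke are made
up for the example, the cache is keyed on the receiver's class rather
than on a shadow class, and invalidation when a class is modified is
left out.

```python
class CallSite(object):
    """One call site with a monomorphic inline cache (hypothetical names)."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_class = None    # receiver class seen at the last call
        self.cached_func = None     # function resolved for that class
        self.misses = 0

    def lookup(self, cls):
        # Slow path: normal method resolution along the MRO.
        for base in cls.__mro__:
            if self.method_name in base.__dict__:
                return base.__dict__[self.method_name]
        raise AttributeError(self.method_name)

    def invoke(self, target, *args):
        cls = type(target)
        if cls is not self.cached_class:
            # Cache miss: resolve the method and remember the result.
            self.misses += 1
            self.cached_func = self.lookup(cls)
            self.cached_class = cls
        # Fast path: one class check followed by a direct call.
        return self.cached_func(target, *args)


class Circle(object):
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r   # crude approximation, enough for a demo


class Square(object):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


site = CallSite('area')
shapes = [Square(2), Square(3), Circle(1), Circle(2)]
results = [site.invoke(shape) for shape in shapes]
```

With this receiver order the site resolves the method only twice, once
per class; a polymorphic variant would keep a small table of (class,
function) pairs instead of a single entry.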
> Other impedance mismatches, such as calling conventions (eg,
> javascript and lua functions silently accepting an incorrect number of
> arguments), reduction methods (applicative vs normal order vs
> call-by-name), mutable strings, TCE, various type systems involving
> structural types, Oliveira/Sulzmann classes, existential types,
> dependent types, value types, single and multiple inheritance, and the
> completely insane (prolog) make implementing real multi-language
> platforms a mammoth task. And even if you manage to get that working,
> how do you make exception hierarchies work?

> Why can't I cast my Java
> ArrayList as a C# ArrayList? etc.

Well, this latter question seems somehow solved by .NET, even if they
don't really support the original libraries. Or you just use the VM and
write conversion functions for that.

> Sure, you could probably hook up a few of the bundled VMs, IO or E

IO? E?

> would make for a great twisted integration DSL. But actually
> convincing people to lock themselves into an unstandardised, unproven
> chimera? Lets just say that doing multi-language right is NP-hard.

> Doing it while targeting JVM and CLI, offering platform integration
> while supporting exotic language constructs like real continuations?

Now that you mention it, I wonder about how Scala's future support (in
next release) for (delimited) continuations will work.

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From anto.cuni at gmail.com  Sat Jul  3 21:23:32 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Sat, 03 Jul 2010 21:23:32 +0200
Subject: [pypy-dev] array performace?
In-Reply-To: 
References: <20100701152827.GA30661@code0.codespeak.net>
	<01781CA2CC22B145B230504679ECF48C01AC4415@EMEA-EXCHANGE03.internal.sungard.corp>
	<01781CA2CC22B145B230504679ECF48C01AC445A@EMEA-EXCHANGE03.internal.sungard.corp>
	<4C2E3182.4020307@gmx.de>
Message-ID: <4C2F8E34.3030701@gmail.com>

On 03/07/10 19:22, Paolo Giarrusso wrote:

> I had 3 colleague students who implemented, for instance, a
> Python-to-JVM bytecode compiler which was way faster than Jython.
> Which was the trick?
[cut]

I'm ready to bet that they did not implement a Python compiler, but a
Python-like language that implements 80/90/95% of Python features. The
web is full of projects like this.

I'm not saying that the techniques used for that project are not worthy
of attention, just that probably "the trick" was not to support the
features of Python that are hardest to implement efficiently.

ciao,
Anto

From hakan at debian.org  Sun Jul  4 10:50:25 2010
From: hakan at debian.org (Hakan Ardo)
Date: Sun, 4 Jul 2010 10:50:25 +0200
Subject: [pypy-dev] Interpreter level array implementation
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jul 3, 2010 at 8:20 AM, Maciej Fijalkowski wrote:
>>> Yes. We don't do loop invariant optimizations for some reasons, the
>>> best of it being the fact that to loop you can always add a bridge
>>> which will invalidate this invariant.
>>
>> Are you telling me that you probably never will include that kind of
>> optimization because of the limitations it imposes on other parts of
>> the jit or just that it would be a lot of work to get it in place?
>
> It requires thinking. It's harder to do because we don't know
> statically upfront how many paths we'll compile to assembler, but I
> can think about ways to mitigate that.

Could it be treated similarly to how you handle:

    s=0
    i=0
    while i<100000:
        s+=i
        i+=1
        if i>50000:
            i=float(i)

which nicely generates two separate traces I believe...

-- 
Håkan Ardö
From p.giarrusso at gmail.com  Sun Jul  4 11:04:01 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Sun, 4 Jul 2010 11:04:01 +0200
Subject: [pypy-dev] Interpreter level array implementation
In-Reply-To: 
References: 
Message-ID: 

Hi Carl,
first, thanks for reading and for your explanation.

On Sat, Jul 3, 2010 at 10:03, Carl Friedrich Bolz wrote:
> 2010/7/3 Paolo Giarrusso :
>>> It requires thinking. It's harder to do because we don't know
>>> statically upfront how many paths we'll compile to assembler, but I
>>> can think about ways to mitigate that.
>>
>> Isn't there some existing research about that in the 'tracing'
>> community? As far as I remember, the theory is that traces are
>> assembled in trace trees, and that each time a (simplified*) SSA
>> optimization pass is applied to the trace tree to compile it. Not sure
>> whether they do it also for Javascript, since there compilation times
>> have to be very fast, but I guess they did so in their Java compiler.
>
> There are two ways to deal with attaching new traces to existing ones.
> On the one hand there are trace trees, which recompile the whole tree
> of traces when a new one is added. This can be costly. On the other
> hand, there is trace stitching, which just patches the existing trace
> to jump to the new one. PyPy (and TraceMonkey, I think) uses trace
> stitching.

For TraceMonkey, response times suggest the usage of trace stitching.
The original Java compiler used trace trees. But if I have a Python
application server, I'm probably willing to accept the bigger
compilation time, especially if compilation is performed by a background
thread. Would it be possible to accommodate this case?

> The problem with loop-invariant code motion is that when you stitch in
> a new trace (what we call a bridge) it is not clear that the code that
> was invariant so far is invariant on the new path as well.
I see - but what about noting potential modifications to the involved
objects and invalidating the old traces, similarly to how classloading
invalidates other optimizations? Of course, some heuristics and tuning
would be needed I guess, since I expect that invalidations here would be
much more frequent otherwise. Such heuristics would probably approximate
a solution to the problem mentioned by Maciej:

> It requires thinking. It's harder to do because we don't know
> statically upfront how many paths we'll compile to assembler, but I
> can think about ways to mitigate that.

However, I still wonder how easy it is to recognize a potential write.

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Sun Jul  4 22:25:30 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 4 Jul 2010 14:25:30 -0600
Subject: [pypy-dev] [pypy-svn] r75824 - in pypy/branch/interplevel-array/pypy/module/array: . test
In-Reply-To: <20100704190622.4F935282B9D@codespeak.net>
References: <20100704190622.4F935282B9D@codespeak.net>
Message-ID: 

> +
> +    def item_w(self, w_item):
> +        space=self.space
> +        if self.typecode == 'c':
> +            return self.space.str_w(w_item)
> +        elif self.typecode == 'u':
> +            return self.space.unicode_w(w_item)
> +
> +        elif self.typecode == 'b':
> +            item=self.space.int_w(w_item)
> +            if item<-128:
> +                msg='signed char is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>127:
> +                msg='signed char is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(rffi.SIGNEDCHAR, item)
> +        elif self.typecode == 'B':
> +            item=self.space.int_w(w_item)
> +            if item<0:
> +                msg='unsigned byte integer is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>255:
> +                msg='unsigned byte integer is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(rffi.UCHAR, item)
> +
> +        elif self.typecode == 'h':
> +            item=self.space.int_w(w_item)
> +            if item<-32768:
> +                msg='signed short integer is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>32767:
> +                msg='signed short integer is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(rffi.SHORT, item)
> +        elif self.typecode == 'H':
> +            item=self.space.int_w(w_item)
> +            if item<0:
> +                msg='unsigned short integer is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>65535:
> +                msg='unsigned short integer is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(rffi.USHORT, item)
> +
> +        elif self.typecode in ('i', 'l'):
> +            item=self.space.int_w(w_item)
> +            if item<-2147483648:
> +                msg='signed integer is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>2147483647:
> +                msg='signed integer is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(lltype.Signed, item)
> +        elif self.typecode in ('I', 'L'):
> +            item=self.space.int_w(w_item)
> +            if item<0:
> +                msg='unsigned integer is less than minimum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            elif item>4294967295:
> +                msg='unsigned integer is greater than maximum'
> +                raise OperationError(space.w_OverflowError, space.wrap(msg))
> +            return rffi.cast(lltype.Unsigned, item)
> +
> +        elif self.typecode == 'f':
> +            item=self.space.float_w(w_item)
> +            return rffi.cast(lltype.SingleFloat, item)
> +        elif self.typecode == 'd':
> +            return self.space.float_w(w_item)
> +

Hey. This looks a bit ugly, you can definitely do it with some
constant dict or something (we have special support for iterating over
constants and unrolling the iteration, look for unrolling_iterable).
Also, annotator can fold a bunch of ifs into a switch, but not if "in"
operator is used (or is fine though).

From hakan at debian.org  Mon Jul  5 07:54:59 2010
From: hakan at debian.org (Hakan Ardo)
Date: Mon, 5 Jul 2010 07:54:59 +0200
Subject: [pypy-dev] [pypy-svn] r75824 - in pypy/branch/interplevel-array/pypy/module/array: . test
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jul 4, 2010 at 10:25 PM, Maciej Fijalkowski wrote:
>
> Hey. This looks a bit ugly,

It does, doesn't it :)

> you can definitely do it with some
> constant dict or something

Yes, there is an overflow check needed on the integer types but not on
the character and float types, but I guess that could be solved with a
flag in the dict.

I was actually considering introducing separate subclasses for each
typecode, overriding item_w and descr_getitem. That would get rid of
the typecode attribute lookup altogether.

> (we have special support for iterating over
> constants and unrolling the iteration, look for unrolling_iterable).
> Also, annotator can fold a bunch of ifs into a switch, but not if "in"
> operator is used (or is fine though).

Those are nice features, good to know about. Thanx.

-- 
Håkan Ardö
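
The "constant dict" suggestion might look roughly like the sketch below.
It is plain Python rather than RPython and not the code that was later
checked in: the table name INT_RANGES and the helper check_int_range are
hypothetical, the real module would raise OperationError through the
object space instead of OverflowError, and in RPython the table would
presumably be walked with unrolling_iterable as mentioned above.

```python
# typecode -> (minimum, maximum, description used in the error messages)
INT_RANGES = {
    'b': (-128, 127, 'signed char'),
    'B': (0, 255, 'unsigned byte integer'),
    'h': (-32768, 32767, 'signed short integer'),
    'H': (0, 65535, 'unsigned short integer'),
    'i': (-2147483648, 2147483647, 'signed integer'),
    'l': (-2147483648, 2147483647, 'signed integer'),
    'I': (0, 4294967295, 'unsigned integer'),
    'L': (0, 4294967295, 'unsigned integer'),
}


def check_int_range(typecode, item):
    """Return item unchanged if it fits the typecode, else raise."""
    minimum, maximum, name = INT_RANGES[typecode]
    if item < minimum:
        raise OverflowError('%s is less than minimum' % name)
    if item > maximum:
        raise OverflowError('%s is greater than maximum' % name)
    return item
```

The character and float typecodes need no range check, so they are
simply absent from the table here; a flag per entry, as proposed in the
reply, would be another way to express that.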
From hakan at debian.org  Mon Jul  5 08:53:20 2010
From: hakan at debian.org (Hakan Ardo)
Date: Mon, 5 Jul 2010 08:53:20 +0200
Subject: [pypy-dev] [pypy-svn] r75824 - in pypy/branch/interplevel-array/pypy/module/array: . test
In-Reply-To: 
References: <20100704190622.4F935282B9D@codespeak.net>
Message-ID: 

I've checked in a dict-based version. Not sure it became that clean
after all. Is the getattr(space, tc.unwrap) construction ok?

On Mon, Jul 5, 2010 at 7:26 AM, Hakan Ardo wrote:
> On Sun, Jul 4, 2010 at 10:25 PM, Maciej Fijalkowski wrote:
>>
>> Hey. This looks a bit ugly,
>
> It does, doesn't it :)
>
>> you can definitely do it with some
>> constant dict or something
>
> Yes, there is an overflow check needed on the integer types but not on
> the character and float types, but I guess that could be solved with a
> flag in the dict.
>
> I was actually considering introducing separate subclasses for each
> typecode, overriding item_w and descr_getitem. That would get rid of
> the typecode attribute lookup altogether.
>
>> (we have special support for iterating over
>> constants and unrolling the iteration, look for unrolling_iterable).
>> Also, annotator can fold a bunch of ifs into a switch, but not if "in"
>> operator is used (or is fine though).
>
> Those are nice features, good to know about. Thanx.
>
> --
> Håkan Ardö
>

-- 
Håkan Ardö

From bhartsho at yahoo.com  Fri Jul  9 04:08:19 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Thu, 8 Jul 2010 19:08:19 -0700 (PDT)
Subject: [pypy-dev] Interactive Translation and JIT
Message-ID: <756045.59228.qm@web114009.mail.gq1.yahoo.com>

I'm using Jason Creighton's branch, and I am trying to test the JIT from interactive translation. Is it now allowed? I'm getting this error:

NotImplementedError: --gcrootfinder=asmgcc requires standalone

Or am I not setting the options correctly on the translator? Here is how I'm translating:

from pypy.translator.interactive import Translation
t = Translation( pypy_entry_point )
t.config.translation.suggest(jit=True, jit_debug='steps', jit_backend='x86', gc='boehm')
t.annotate()
t.rtype()
f = t.compile_c()
f()

complete code: http://pastebin.com/T42cqSbz
demo: http://www.youtube.com/watch?v=HwbDG3Rdi_Q

-brett

From arigo at tunes.org  Fri Jul  9 09:43:51 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 9 Jul 2010 09:43:51 +0200
Subject: [pypy-dev] Interactive Translation and JIT
In-Reply-To: <756045.59228.qm@web114009.mail.gq1.yahoo.com>
References: <756045.59228.qm@web114009.mail.gq1.yahoo.com>
Message-ID: <20100709074351.GA8538@code0.codespeak.net>

Hi Brett,

On Thu, Jul 08, 2010 at 07:08:19PM -0700, Hart's Antler wrote:
> I'm using Jason Creighton branch, and i am trying to test the JIT from
> interactive translation. Is it now allowed? I'm getting this error:
> NotImplementedError: --gcrootfinder=asmgcc requires standalone

Indeed, it is not allowed. As far as I know, the interactive
translation does not support making standalone programs. You need to
run translate.py as described e.g. here:

http://codespeak.net/pypy/dist/pypy/doc/getting-started-python.html#translating-the-pypy-python-interpreter

A bientot,

Armin.

From arigo at tunes.org  Fri Jul  9 09:47:19 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 9 Jul 2010 09:47:19 +0200
Subject: [pypy-dev] Interactive Translation and JIT
In-Reply-To: <20100709074351.GA8538@code0.codespeak.net>
References: <756045.59228.qm@web114009.mail.gq1.yahoo.com>
	<20100709074351.GA8538@code0.codespeak.net>
Message-ID: <20100709074719.GB8538@code0.codespeak.net>

Re-hi,

On Fri, Jul 09, 2010 at 09:43:51AM +0200, Armin Rigo wrote:
> You need to run translate.py as described e.g. here:

...
or to use pypy/jit/tl/pypyjit.py for a quick test of the JIT running on
top of PyPy -- although you won't get any assembler, but only the
so-called 'llgraph' backend, which emulates assembler by hand using
higher-level type-safe operations. (It should be possible in theory to
tweak pypyjit.py to really use the x86 backend.)

A bientot,

Armin.

From Dave.Cross at cdl.co.uk  Tue Jul 13 13:09:56 2010
From: Dave.Cross at cdl.co.uk (Dave Cross)
Date: Tue, 13 Jul 2010 12:09:56 +0100
Subject: [pypy-dev] Windows binaries
Message-ID: 

Hi,

Is there a likely delivery date for Windows binaries of PyPy 1.3?

Dave.


From fijall at gmail.com  Tue Jul 13 13:57:31 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 13 Jul 2010 13:57:31 +0200
Subject: [pypy-dev] Windows binaries
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jul 13, 2010 at 1:09 PM, Dave Cross  wrote:
> Hi,
>
>
>
> Is there a likely delivery date for Windows binaries of PyPy 1.3?
>

Eh, sorry, my fault, will upload them today.

>
>
> Dave.
>

From fijall at gmail.com  Sun Jul 18 18:36:24 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 18 Jul 2010 18:36:24 +0200
Subject: [pypy-dev] [pypy-svn] r76268 - pypy/branch/micronumpy/pypy/tool
In-Reply-To: <20100716224104.25C49282BD4@codespeak.net>
References: <20100716224104.25C49282BD4@codespeak.net>
Message-ID: 

Benchmarks generally should go to pypy/benchmarks directory in the
main source tree (that is svn+ssh://codespeak.net/svn/pypy/benchmarks)

On Sat, Jul 17, 2010 at 12:41 AM,   wrote:
> Author: dan
> Date: Sat Jul 17 00:41:02 2010
> New Revision: 76268
>
> Added:
>   pypy/branch/micronumpy/pypy/tool/convolve.py
> Modified:
>   pypy/branch/micronumpy/pypy/tool/numpybench.py
> Log:
> Oops, I forgot the most important part of the benchmark!
>
> Added: pypy/branch/micronumpy/pypy/tool/convolve.py
> ==============================================================================
> --- (empty file)
> +++ pypy/branch/micronumpy/pypy/tool/convolve.py        Sat Jul 17 00:41:02 2010
> @@ -0,0 +1,43 @@
> +from __future__ import division
> +from __main__ import numpy as np
> +
> +def naive_convolve(f, g):
> +    # f is an image and is indexed by (v, w)
> +    # g is a filter kernel and is indexed by (s, t),
> +    #   it needs odd dimensions
> +    # h is the output image and is indexed by (x, y),
> +    #   it is not cropped
> +    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
> +        raise ValueError("Only odd dimensions on filter supported")
> +    # smid and tmid are number of pixels between the center pixel
> +    # and the edge, ie for a 5x5 filter they will be 2.
> +    #
> +    # The output size is calculated by adding smid, tmid to each
> +    # side of the dimensions of the input image.
> +    vmax = f.shape[0]
> +    wmax = f.shape[1]
> +    smax = g.shape[0]
> +    tmax = g.shape[1]
> +    smid = smax // 2
> +    tmid = tmax // 2
> +    xmax = vmax + 2*smid
> +    ymax = wmax + 2*tmid
> +    # Allocate result image.
> +    h = np.zeros([xmax, ymax], dtype=f.dtype)
> +    # Do convolution
> +    for x in range(xmax):
> +        for y in range(ymax):
> +            # Calculate pixel value for h at (x,y). Sum one component
> +            # for each pixel (s, t) of the filter g.
> +            s_from = max(smid - x, -smid)
> +            s_to = min((xmax - x) - smid, smid + 1)
> +            t_from = max(tmid - y, -tmid)
> +            t_to = min((ymax - y) - tmid, tmid + 1)
> +            value = 0
> +            for s in range(s_from, s_to):
> +                for t in range(t_from, t_to):
> +                    v = x - smid + s
> +                    w = y - tmid + t
> +                    value += g[smid - s, tmid - t] * f[v, w]
> +            h[x, y] = value
> +    return h
>
> Modified: pypy/branch/micronumpy/pypy/tool/numpybench.py
> ==============================================================================
> --- pypy/branch/micronumpy/pypy/tool/numpybench.py      (original)
> +++ pypy/branch/micronumpy/pypy/tool/numpybench.py      Sat Jul 17 00:41:02 2010
> @@ -21,13 +21,29 @@
>     return numpy.array(kernel)
>
>  if __name__ == '__main__':
> -    from sys import argv as args
> -    width, height, kwidth, kheight = [int(x) for x in args[1:]]
> +    from optparse import OptionParser
> +
> +    option_parser = OptionParser()
> +    option_parser.add_option('--kernel-size', dest='kernel', default='3x3',
> +                             help="The size of the convolution kernel, given as WxH. ie 3x3"
> +                                  "Note that both dimensions must be odd.")
> +    option_parser.add_option('--image-size', dest='image', default='256x256',
> +                             help="The size of the image, given as WxH. ie. 256x256")
> +    option_parser.add_option('--runs', '--count', dest='count', default=1000,
> +                             help="The number of times to run the convolution filter")
> +
> +    options, args = option_parser.parse_args()
> +
> +    def parse_dimension(arg):
> +        return [int(s.strip()) for s in arg.split('x')]
> +
> +    width, height = parse_dimension(options.image)
> +    kwidth, kheight = parse_dimension(options.kernel)
> +    count = int(options.count)
>
>     image = generate_image(width, height)
>     kernel = generate_kernel(kwidth, kheight)
>
>     from timeit import Timer
>     convolve_timer = Timer('naive_convolve(image, kernel)', 'from convolve import naive_convolve; from __main__ import image, kernel; gc.enable()')
> -    count = 100
>     print "%.5f sec/pass" % (convolve_timer.timeit(number=count)/count)
> _______________________________________________
> pypy-svn mailing list
> pypy-svn at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-svn
>
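
As a quick sanity check of the indexing used in naive_convolve, here is
a NumPy-free sketch of the same full (uncropped) convolution on nested
lists. The name convolve_lists is made up for this example, and it uses
explicit bounds checks instead of the precomputed s_from/s_to ranges, so
it illustrates the arithmetic rather than reproducing the committed
code.

```python
def convolve_lists(f, g):
    """Full 2D convolution of image f with an odd-sized kernel g.

    Both arguments are rectangular nested lists; the result has
    len(f) + 2*smid rows and len(f[0]) + 2*tmid columns.
    """
    vmax, wmax = len(f), len(f[0])
    smax, tmax = len(g), len(g[0])
    if smax % 2 != 1 or tmax % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    smid, tmid = smax // 2, tmax // 2
    xmax, ymax = vmax + 2 * smid, wmax + 2 * tmid
    h = [[0] * ymax for _ in range(xmax)]
    for x in range(xmax):
        for y in range(ymax):
            value = 0
            for s in range(-smid, smid + 1):
                for t in range(-tmid, tmid + 1):
                    v = x - smid + s
                    w = y - tmid + t
                    # Skip filter taps that fall outside the input image.
                    if 0 <= v < vmax and 0 <= w < wmax:
                        value += g[smid - s][tmid - t] * f[v][w]
            h[x][y] = value
    return h


kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
impulse = [[1]]
result = convolve_lists(impulse, kernel)
```

Convolving a one-pixel image with a kernel should give back the kernel
itself, which makes for an easy property to assert when changing the
loop bounds.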

From jcreigh at gmail.com  Thu Jul 22 15:34:55 2010
From: jcreigh at gmail.com (Jason Creighton)
Date: Thu, 22 Jul 2010 09:34:55 -0400
Subject: [pypy-dev] Building a shared library on x86-64 fails due to static
	linking of libffi
Message-ID: 

Hello,

While working on asmgcc-64, I ran into this issue. For some reason, PyPy
wants to link libffi statically on some platforms, Linux included. But when
compiling with the "shared" option (as is done in some asmgcroot tests), you
get link errors like:

/usr/bin/ld: /usr/lib/libffi.a(ffi64.o): relocation R_X86_64_32S against
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/lib/libffi.a: could not read symbols: Bad value

I interpret this to mean that since we are building a shared library, the
resulting library must be position independent, so we can't link in non-PIC
code such as is found in the static version of libffi on my system. (Ubuntu
10.04, x86-64). And indeed, if I switch to linking dynamically, the error
goes away and things seem to work.

However, I don't want to just blindly enable dynamic linking, because there
must be a reason it was configured to link statically in the first place.
What is that reason?

Also, what steps should I take here? I think I need to enable dynamic
linking of libffi on x86-64 Linux when building a shared library at the very
least, but to reduce the number of code paths, I'm somewhat inclined to link
dynamically whether we're building a library or not. What do you guys think?

Thanks,

Jason

From amauryfa at gmail.com  Thu Jul 22 17:03:57 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 22 Jul 2010 17:03:57 +0200
Subject: [pypy-dev] Building a shared library on x86-64 fails due to
	static linking of libffi
In-Reply-To: 
References: 
Message-ID: 

Hi,

2010/7/22 Jason Creighton :
> Hello,
>
> While working on asmgcc-64, I ran into this issue. For some reason, PyPy
> wants to link libffi statically on some platforms, Linux included. But when
> compiling with the "shared" option (as is done in some asmgcroot tests), you
> get link errors like:
>
> /usr/bin/ld: /usr/lib/libffi.a(ffi64.o): relocation R_X86_64_32S against
> `.rodata' can not be used when making a shared object; recompile with -fPIC
> /usr/lib/libffi.a: could not read symbols: Bad value
>
> I interpret this to mean that since we are building a shared library, the
> resulting library must be position independent, so we can't link in non-PIC
> code such as is found in the static version of libffi on my system. (Ubuntu
> 10.04, x86-64). And indeed, if I switch to linking dynamically, the error
> goes away and things seem to work.

Exactly

> However, I don't want to just blindly enable dynamic linking, because there
> must be a reason it was configured to link statically in the first place.
> What is that reason?
>
> Also, what steps should I take here? I think I need to enable dynamic
> linking of libffi on x86-64 Linux when building a shared library at the very
> least, but to reduce the number of code paths, I'm somewhat inclined to link
> dynamically whether we're building a library or not. What do you guys think?

The reason is actually in the code: pypy/rlib/libffi.py

    # On some platforms, we try to link statically libffi, which is small
    # anyway and avoids endless troubles for installing.  On other platforms
    # libffi.a is typically not there, so we link dynamically.

Static linking to libffi should probably be disabled on 64-bit platforms.
Or just skip the test: as far as I know, --shared is not really needed
on Unix platforms.

-- 
Amaury Forgeot d'Arc

From ndbecker2 at gmail.com  Thu Jul 22 17:59:37 2010
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 22 Jul 2010 11:59:37 -0400
Subject: [pypy-dev] Building a shared library on x86-64 fails due to
	static linking of libffi
References: 
	
Message-ID: 

AFAIK, i386 is the only platform that allows building a shared lib linked 
with a static lib.


From bhartsho at yahoo.com  Fri Jul 23 06:49:53 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Thu, 22 Jul 2010 21:49:53 -0700 (PDT)
Subject: [pypy-dev] rpython questions, **kw, __call__, __getattr__
Message-ID: <92687.49811.qm@web114018.mail.gq1.yahoo.com>

Looking through the PyPy source code I see **kw, __call__ and __getattr__ are used, but when I try to write my own RPython code that uses these conventions, I get translation errors. Do I need to borrow from "application space" in order to do this, or can I just give hints to the annotator?
Thanks,
-brett



#this is allowed
def func(*args): print(args)

#but this is not?
def func(**kw): print(kw)
#error call pattern too complex

#this class fails to translate, are we not allowed to define our own __call__ and __getattr__ in rpython?
class A(object):
  def __call__(self, *args): print(args)
  def __getattr__(self, name): print(name)






From fijall at gmail.com  Fri Jul 23 10:20:40 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 23 Jul 2010 10:20:40 +0200
Subject: [pypy-dev] rpython questions, **kw, __call__, __getattr__
In-Reply-To: <92687.49811.qm@web114018.mail.gq1.yahoo.com>
References: <92687.49811.qm@web114018.mail.gq1.yahoo.com>
Message-ID: 

Hello.

__call__ and __getattr__ won't work. You see it in pypy source code,
because not all of pypy source code is RPython (in fact, Python is a
metaprogramming language for RPython). Same goes for **kw.

On Fri, Jul 23, 2010 at 6:49 AM, Hart's Antler  wrote:
> Looking through the pypy source code i see **kw, __call__ and __getattr__ are used, but when i try to write my own rpython code that uses these conventions, i get translation errors.  Do i need to borrow from "application space" in order to do this or can i just give hints to the annotator?
> Thanks,
> -brett
>
>
>
> #this is allowed
> def func(*args): print(args)
>
> #but this is not?
> def func(**kw): print(args)
> #error call pattern too complex
>
> #this class fails to translate, are we not allowed to define our own __call__ and __getattr__ in rpython?
> class A(object):
>   __call__(*args): print(args)
>   __getattr__(self,name): print(name)
>
>
>
>
>

From cfbolz at gmx.de  Fri Jul 23 10:23:15 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Fri, 23 Jul 2010 10:23:15 +0200
Subject: [pypy-dev] rpython questions, **kw, __call__, __getattr__
In-Reply-To: <92687.49811.qm@web114018.mail.gq1.yahoo.com>
References: <92687.49811.qm@web114018.mail.gq1.yahoo.com>
Message-ID: <4C495173.9070107@gmx.de>

On 07/23/2010 06:49 AM, Hart's Antler wrote:
> Looking through the pypy source code i see **kw, __call__ and
> __getattr__ are used,

Where exactly are they used? Not all of the code in PyPy is RPython.

> but when i try to write my own rpython code
> that uses these conventions, i get translation errors.  Do i need to
> borrow from "application space" in order to do this or can i just
> give hints to the annotator? Thanks, -brett
>
>
>
> #this is allowed
 > def func(*args):
 >     print(args)
>
> #but this is not?
 > def func(**kw):
 >     print(args)
 > #error call pattern too complex
>
> #this class fails to translate, are we not allowed to define our own
> __call__ and __getattr__ in rpython?


> class A(object):
 >     __call__(*args):
 >          print(args)
 >     __getattr__(self,name):
 >          print(name)

You cannot use any __xxx__ functions in RPython, only __init__ and 
__del__. Anyway, you cannot translate a class, so "fails to translate" 
has no meaning :-).
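
A minimal sketch of how such code is usually restructured to avoid the
unsupported constructs: an explicit method where __call__ was wanted, an
explicit lookup where __getattr__ was wanted, and fixed named parameters
where **kw was wanted. All names below (Greeter, call, lookup, describe)
are hypothetical, and the snippet has only been checked as plain Python;
it has not been run through the RPython translator.

```python
# Hypothetical names throughout; plain Python written in a restricted style.
# Not verified against the RPython translator.


class Greeter(object):
    def __init__(self, greeting):
        # __init__ is one of the special methods named above as allowed
        self.greeting = greeting
        self.aliases = {}

    def call(self, name):
        # explicit method used where __call__ would otherwise be tempting
        return self.greeting + ", " + name

    def lookup(self, key):
        # explicit dictionary lookup instead of __getattr__
        if key in self.aliases:
            return self.aliases[key]
        return ""


def describe(host, port=80, verbose=False):
    # fixed, named parameters instead of **kw
    text = host + ":" + str(port)
    if verbose:
        text = text + " (verbose)"
    return text
```

The point of the design is that every call site and every attribute is
visible by name, which is what a static annotator needs.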

Cheers,

Carl Friedrich

From bhartsho at yahoo.com  Sat Jul 24 11:06:09 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Sat, 24 Jul 2010 02:06:09 -0700 (PDT)
Subject: [pypy-dev] PyPyGTK v0.1
Message-ID: <114704.12351.qm@web114014.mail.gq1.yahoo.com>

http://pastebin.com/UhnEurqb

The above is a crude way to run pygtk from rpython (through CPython and talking over a pipe), but at least it partially works.  Callbacks are limited to quoted lambdas, but passing some simple return types back to rpython is possible - i'm going to try that next.  There is no support for dynamic attribute access, but most of pygtk involves function calls.  Where attribute access is required, i guess extra proxy functions could be written.

-brett 




From cfbolz at gmx.de  Sat Jul 24 11:21:28 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Sat, 24 Jul 2010 11:21:28 +0200
Subject: [pypy-dev] PyPyGTK v0.1
In-Reply-To: <114704.12351.qm@web114014.mail.gq1.yahoo.com>
References: <114704.12351.qm@web114014.mail.gq1.yahoo.com>
Message-ID: <4C4AB098.1010306@gmx.de>

Hi Brett,

On 07/24/2010 11:06 AM, Hart's Antler wrote:
> http://pastebin.com/UhnEurqb

Nice. Did you also see this?: 
http://morepypy.blogspot.com/2009/11/using-cpython-extension-modules-with.html
I guess it could be used for GTK as well.

BTW, I guess if we ever wanted "real" GTK support, without proxying 
CPython, we should use the GObject-introspection features, which should 
make the wrapping rather simple.

Cheers,

Carl Friedrich

From bhartsho at yahoo.com  Mon Jul 26 07:47:02 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Sun, 25 Jul 2010 22:47:02 -0700 (PDT)
Subject: [pypy-dev] PyPy Proxy
Message-ID: <345280.58625.qm@web114015.mail.gq1.yahoo.com>

The code from PyPyGTK has been generalized so that it can work with PyGame, and PyODE.  Function calls are improved so that different arg types can be accepted by defining custom wrappers per function.  Custom wrappers can also be made for the return values, so different types can be handled as well (it seems that rpython restricts what can be returned from a function to the same types).  Proxy objects can move back and forth from CPython to RPython.  The wrappers for pygtk, pyode, and pygame are by no means complete, but some basic tests are working.  Callbacks are limited to quoted lambdas, but it could be improved.

http://pastebin.com/rWEfgMSN

I had seen the other proxy method before, but found few examples. How does it work: from RPython or from the PyPy interpreter?
http://morepypy.blogspot.com/2009/11/using-cpython-extension-modules-with.html




From kevinar18 at hotmail.com  Tue Jul 27 04:09:16 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Mon, 26 Jul 2010 22:09:16 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
Message-ID: 


Might as well warn you: This is going to be a rather long post.
I'm not sure if this is appropriate to post here or if it would fit right in with the mailing list. Sorry if it is the wrong place to post about this.


I've looked through the documentation (http://codespeak.net/pypy/dist/pypy/doc/stackless.html) and didn't really see what I was looking for. I've also investigated several options in the default CPython.

What I'm trying to accomplish:
I am trying to write a particular threading scenario that follows these rules. It is partly an experiment and partly for actual production code.

1. Hundreds or thousands of micro-threads that are essentially small self-contained programs (not really, but you can think of them that way).
2. No shared state - data is passed around from one micro-thread to another; only one micro-thread has access to the data at a time. (although the programmer gets the impression there is no shared state, in reality, the underlying implementation uses shared memory / shared state for speed; the data does not move; you just pass around a reference/pointer to some shared memory)
3. The micro-threads can run in parallel on different cpu cores, get moved to a different core, etc....
4. The micro-threads are truly pre-emptive (uses hardware interrupt pre-emption).
5. It is my intention to write my own scheduler that will suspend the micro-threads, start them, control the sharing of data, assign them to different CPU cores etc.... In fact, for my purposes, I MUST write my own scheduler as I have very specific requirements on when they should and should not run.


Now, I have spent some time trying to find a way to achieve this ... and I can implement a rather poor version using default Python. However, I don't see any way to implement my ideal version. Maybe someone here might have some pointers for me.

Shared Memory between parallel processes
----------------------------------------
Quick Question: Do queues from the multiprocessing module use shared memory? If the answer is YES, you can just skip this section, because that would solve this particular problem.

(For simplicity, let's assume a quad core CPU)
It is my intent to create 4 threads/processes (one per core) and use the scheduler to assign a micro-thread (of which there may be hundreds) to one of the 4 threads/processes. However, the micro-threads need to exchange data quickly; to do that I need shared memory -- and that is where I'm having some trouble.
Normally, 4 threads would be the ideal solution -- as they can run in parallel and use shared memory. However, because of the Python GIL, I can't use threads in this way; thus, I have to use 4 processes, which are not set up to share memory.

Question: How can I share Python Objects between processes USING SHARED MEMORY? I do not want to have to copy or "pass" data back and forth between processes or have to use a proxy "server" process. These are both too much of a performance hit for my needs; shared memory is what I need.

The multiprocessing module offers me 4 options: "queues", "pipes", "shared memory map", and a "server process".
"Shared memory map" won't work as it only handles C values and arrays (not Python objects or variables).
"Server Process" sounds like a bad idea. Am I correct in that this option requires extra processing power and does not even use shared memory? If so, that would be a very bad choice for me.
The big question then... do "queues" and "pipes" use shared memory or do they pass data back and forth between processes? (if they use shared memory, then that would be perfect)

Does PyPy have any other options for me?


True Pre-emptive scheduling?
----------------------------
Any way to get pre-emptive micro-threads? Stackless (the real
Stackless, not the one in PyPy) has the ability to suspend them after a
certain number of interpreter instructions; however, this is prone to
problems because it can run much longer than expected. Ideally, I would
like to have true pre-emptive scheduling using hardware interrupts based
on timing or CPU cycles (like the OS does for real threads).

I am currently not aware of any way to achieve this in CPython, PyPy, Unladen Swallow, Stackless, etc....


Are there detailed docs on why the Python GIL exists?
-----------------------------------------------------
I don't mean trivial statements like "because of C extensions" or "because the interpreter can't handle it".
It may be possible that my particular usage would not require the GIL. However, I won't know this until I can understand what threading problems the Python interpreter has that the GIL was meant to protect against. Is there detailed documentation about this anywhere that covers all the threading issues that the GIL was meant to solve?




Thanks,
Kevin
 		 	   		  

From evan at theunixman.com  Tue Jul 27 08:27:03 2010
From: evan at theunixman.com (Evan Cofsky)
Date: Mon, 26 Jul 2010 23:27:03 -0700
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: 
Message-ID: <20100727062702.GE12699@tunixman.com>

On 07/26 22:09, Kevin Ar18 wrote:
> What I'm trying to accomplish:
>
> I am trying to write a particular threading scenario that follows these
> rules. It is partly an experiment and partly for actual
> production code.

This is actually interesting to me as well. I can't count the number of
times I've had to implement something like this for projects. It would be
nice to be able to use a public module instead of writing it all
yet again.

> Now, I have spent some time trying to find a way to achieve this ... and
> I can implement a rather poor version using default Python. However, I
> don't see any way to implement my ideal version. Maybe someone here
> might have some pointers for me.

> Shared Memory between parallel processes

This is the way I usually implement it. I'm currently mulling over some
sort of byte-addressable abstraction that can use a buffer or any sequence
as a backing store, which would make it useful for mmap objects as well.
And I'm thinking about using the class definitions and inheritance to
handle nested structures in some way.

> Quick Question: Do queues from the multiprocessing module use shared
> memory? If the answer is YES, you can just skip this section, because
> that would solve this particular problem.

I can't imagine it wouldn't, but I haven't checked the source yet.

> Question: How can I share Python Objects between processes USING SHARED
> MEMORY? I do not want to have to copy or "pass" data back and forth
> between processes or have to use a proxy "server" process. These are
> both too much of a performance hit for my needs; shared memory is what
> I need.

Anonymous memory-mapped regions would work, with a suitable data
abstraction. Or even memory-mapped files, which aren't really all that
different on systems anymore.
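
A minimal sketch of that idea, assuming fixed-size records are enough: an
anonymous mmap region plus a struct layout acts as the "data abstraction".
The names SharedRecords and RECORD and the record layout are hypothetical.
On Unix, a child created with os.fork() after the mapping exists sees the
same pages, because Python maps anonymous memory with MAP_SHARED by
default; the sketch itself only exercises the region from one process.

```python
import mmap
import struct

# Hypothetical record layout: little-endian int32 id followed by a float64.
RECORD = struct.Struct("<id")


class SharedRecords(object):
    """Fixed-size records stored in an anonymous memory-mapped region."""

    def __init__(self, count):
        self.count = count
        # fileno -1 requests anonymous memory, which starts out zero-filled
        self.buf = mmap.mmap(-1, RECORD.size * count)

    def store(self, index, ident, value):
        # write the record in place; nothing is pickled or copied elsewhere
        RECORD.pack_into(self.buf, index * RECORD.size, ident, value)

    def load(self, index):
        # read the record back as a (ident, value) tuple
        return RECORD.unpack_from(self.buf, index * RECORD.size)

    def close(self):
        self.buf.close()
```

This only covers flat C-like values; arbitrary Python objects would still
need some encoding into such records.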

> The multiprocessing module offers me 4 options: "queues", "pipes", "shared memory map", and a "server process".
> "Shared memory map" won't work as it only handles C values and arrays (not Python objects or variables).

cPickle could help. But then there's a serialization/deserialization step
which wouldn't really be too fast. It's not slow, but the cost of copying
the data is far outweighed by the cost of the dumps/loads, and if you need
to share multiple copies you're really going to feel it.

> "Server Process" sounds like a bad idea. Am I correct in that this
> option requires extra processing power and does not even use
> shared memory?

Not really. It depends on how you would implement it.

> The big question then... do "queues" and "pipes" use shared memory or
> do they pass data back and forth between processes? (if they use
> shared memory, then that would be perfect)
 
Queues most likely do, pipes absolutely do not.
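
For what it is worth, the multiprocessing documentation describes objects
put on a queue as being pickled and then flushed to an underlying pipe, so
the receiver would get a copy rather than a shared object. The sketch
below only illustrates that copy semantics with pickle directly (on
Python 3 the pickle module already uses the C implementation that cPickle
provided on Python 2); send_through_pickle is a hypothetical helper, not
part of any library.

```python
import pickle


def send_through_pickle(obj):
    """Serialize and deserialize obj, as a message-passing channel would."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    # the receiver rebuilds a brand new object graph from the bytes
    return pickle.loads(payload), len(payload)


message = {"id": 1, "samples": [0.5, 1.5, 2.5]}
received, nbytes = send_through_pickle(message)

# equal contents, but a distinct object: mutations are not shared
received["samples"].append(3.5)
```

After this runs, message["samples"] still has three entries while
received["samples"] has four, which is exactly the behaviour that rules
this approach out when true sharing is required.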

> Does PyPy have any other options for me?

I wonder if it could be done with an object space, or similarly done
"behind the scenes" in the PyPy interpreter, sort of the way ZODB works
semi-transparently. Only in this case completely transparently.

> True Pre-emptive scheduling?

This wouldn't really be difficult, although doing it efficiently might
very well be without some serious black magic. But PyPy may also be the
right tool for that since the black magic can be written in Python or
RPython instead of C.

> Any way to get pre-emptive micro-threads? Stackless (the real
> Stackless, not the one in PyPy) has the ability to suspend them after a
> certain number of interpreter instructions; however, this is prone to
> problems because it can run much longer than expected. Ideally, I would
> like to have true pre-emptive scheduling using hardware interrupts based
> on timing or CPU cycles (like the OS does for real threads).

By using a process for each thread, and some shared memory arena for the
bulk of the application data structures, this is probably quite possible
without reimplementing the OS in Python.

> I am currently not aware of any way to achieve this in CPython, PyPy,
> Unladen Swallow, Stackless, etc....

I've done this a number of times, both with threads and with processes.
Processes ironically give you finer control over scheduling since you
aren't stuck behind the GIL, but as you are finding, you need some way to
share data.

> Are there detailed docs on why the Python GIL exists?

Here is the page from the Python Wiki:

http://wiki.python.org/moin/GlobalInterpreterLock

And here is an interesting article on the GIL problem:

http://blog.ianbicking.org/gil-of-doom.html

-- 
Evan Cofsky "The UNIX Man" 

From fijall at gmail.com  Tue Jul 27 11:43:50 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 27 Jul 2010 11:43:50 +0200
Subject: [pypy-dev] PyPy Proxy
In-Reply-To: <345280.58625.qm@web114015.mail.gq1.yahoo.com>
References: <345280.58625.qm@web114015.mail.gq1.yahoo.com>
Message-ID: 

Hey.

Does it come with tests? Or how can I see how it works?

On Mon, Jul 26, 2010 at 7:47 AM, Hart's Antler  wrote:
> The code from PyPyGTK has been generalized so that it can work with PyGame, and PyODE.  Function calls are improved so that different arg types can be accepted by defining custom wrappers per function.  Custom wrappers can also be made for the return values, so different types can be handled as well (it seems that rpython restricts what can be returned from a function to the same types).  Proxy objects can move back and forth from CPython to RPython.  The wrappers for pygtk, pyode, and pygame are by no means complete, but some basic tests are working.  Callbacks are limited to quoted lambdas, but it could be improved.
>
> http://pastebin.com/rWEfgMSN
>
> I had seen the other proxy method before, but found few examples, how does it work, from Rpython or the PyPy interpreter?
> http://morepypy.blogspot.com/2009/11/using-cpython-extension-modules-with.html
>
>
>

From fijall at gmail.com  Tue Jul 27 11:48:57 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 27 Jul 2010 11:48:57 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jul 27, 2010 at 4:09 AM, Kevin Ar18  wrote:
>
> Might as well warn you: This is going to be a rather long post.
> I'm not sure if this is appropriate to post here or if would fit right in with the mailing list. Sorry, if it is the wrong place to post about this.
>

This is a relevant list for some of questions below. I'll try to answer them.

> Quick Question: Do queues from the multiprocessing module use shared memory? If the answer is YES, you can just skip this section, because that would solve this particular problem.

PyPy has no multiprocessing module so far (besides, I think it's an
ugly hack, but that's another issue).

>
> Does PyPy have any other options for me?
>

Right now, no. But there are ways in which you can experiment. Truly
concurrent threads (depends on implicit vs explicit shared memory)
might require a truly concurrent GC to achieve performance. This is
work (although not as big as removing refcounting from CPython for
example).

>
> True Pre-emptive scheduling?
>
> ----------------------------
>
> Any way to get pre-emptive micro-threads? Stackless (the real
> Stackless, not the one in PyPy) has the ability to suspend them after a
> certain number of interpreter instructions; however, this is prone to
> problems because it can run much longer than expected. Ideally, I would
> like to have true pre-emptive scheduling using
> hardware interrupts based on timing or CPU cycles (like the OS does for
> real threads).
>
> I am currently not aware of any way to achieve this in CPython, PyPy, Unladen Swallow, Stackless, etc....
>

Sounds relatively easy, but you would need to write this part in
RPython (however, that does not mean you get rid of GIL).

>
> Are there detailed docs on why the Python GIL exists?
> -----------------------------------------------------
> I don't mean trivial statements like "because of C extensions" or "because the interpreter can't handle it".
> It may be possible that my particular usage would not require the GIL. However, I won't know this until I can understand what threading problems the Python interpreter has that the GIL was meant to protect against. Is there detailed documentation about this anywhere that covers all the threading issues that the GIL was meant to solve?

The short answer is "yes". The long answer is that it's much easier to
write an interpreter assuming the GIL is around. For fine-grained locking
to work and be efficient, you would need:

* Some sort of concurrent GC (not specifically running in a separate
thread, but having different pools of memory to allocate from)
* Possibly a JIT optimization that would remove some locking.
* The aforementioned locking, to ensure that it's not that easy to
screw things up.

So, in short, "work".

From bhartsho at yahoo.com  Tue Jul 27 14:43:47 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Tue, 27 Jul 2010 05:43:47 -0700 (PDT)
Subject: [pypy-dev] PyPy Proxy
In-Reply-To: 
Message-ID: <929795.85917.qm@web114016.mail.gq1.yahoo.com>

Hi Maciej,

Yes it comes with its own test, just save the file from pastebin and run it.  You should see two gtk windows pop up and a pygame window that draws a circle.

I have a new version with proxy support for one module from Blender2.5 (bpy.ops); note that pygtk, ode, and pygame are broken in this version because i had to change some things so it can run in Blender's embedded python3.1

http://pastebin.com/TsYNqd8p



--- On Tue, 7/27/10, Maciej Fijalkowski  wrote:

> From: Maciej Fijalkowski 
> Subject: Re: [pypy-dev] PyPy Proxy
> To: "Hart's Antler" 
> Cc: pypy-dev at codespeak.net
> Date: Tuesday, 27 July, 2010, 2:43 AM
> Hey.
> 
> Does it come with tests? Or how can I look how is it
> working?
> 
> On Mon, Jul 26, 2010 at 7:47 AM, Hart's Antler 
> wrote:
> > The code from PyPyGTK has been generalized so that it
> can work with PyGame, and PyODE. Function calls are
> improved so that different arg types can be accepted by
> defining custom wrappers per function. Custom wrappers can
> also be made for the return values, so different types can
> be handled as well (it seems that rpython restricts what can
> be returned from a function to the same types). Proxy
> objects can move back and forth from CPython to RPython.
> The wrappers for pygtk, pyode, and pygame are by no means
> complete, but some basic tests are working. Callbacks are
> limited to quoted lambdas, but it could be improved.
> >
> > http://pastebin.com/rWEfgMSN
> >
> > I had seen the other proxy method before, but found
> few examples, how does it work, from Rpython or the PyPy
> interpreter?
> > http://morepypy.blogspot.com/2009/11/using-cpython-extension-modules-with.html
> >
> >
> >
> 




From p.giarrusso at gmail.com  Tue Jul 27 15:17:29 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Tue, 27 Jul 2010 15:17:29 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: <20100727062702.GE12699@tunixman.com>
References: 
	<20100727062702.GE12699@tunixman.com>
Message-ID: 

On Tue, Jul 27, 2010 at 08:27, Evan Cofsky  wrote:
> On 07/26 22:09, Kevin Ar18 wrote:
>> Are there detailed docs on why the Python GIL exists?
>
> Here is the page from the Python Wiki:
>
> http://wiki.python.org/moin/GlobalInterpreterLock

To keep it short, CPython uses refcounting, and without the GIL the
refcount incs and decs would need to be atomic, with a huge
performance impact (that's discussed in the below links).

However, you can look at this answer from Guido van Rossum:
http://www.artima.com/weblogs/viewpost.jsp?thread=214235

And these two attempts to remove the GIL:
http://code.google.com/p/unladen-swallow/wiki/ProjectPlan#Global_Interpreter_Lock
http://code.google.com/p/python-safethread/

PyPy does not have this problem, but you still need to make
thread-safe the dictionaries holding members of each object. You don't
need to make lists thread-safe, I think, because the programmer is
supposed to lock them, but you want to allow a thread to add a member
to an object while another thread performs a method call.

Anyway, all this just explains why the GIL is still there, which is a
slightly different question from the original one. With
state-of-the-art technology, it is bad on every front, except
simplicity of implementation.

> And here is an interesting article on the GIL problem:
>
> http://blog.ianbicking.org/gil-of-doom.html

Given that processor frequencies aren't going to increase a lot in the
future as they used to do, while the number of cores is going to
increase much more, this article seems outdated nowadays - see also
http://atlee.ca/blog/2006/06/27/python-warts-2/.

This other link (http://poshmodule.sourceforge.net/) used to be
interesting for the problem you are discussing, but seems also dead -
there are other modules here:
http://wiki.python.org/moin/ParallelProcessing.

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Tue Jul 27 15:42:55 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 27 Jul 2010 15:42:55 +0200
Subject: [pypy-dev] rotting buildbot infrastructure
Message-ID: 

Hello.

According to current buildbot status, both osx and win machines are
offline. No clue how to get them back. Anyway, our OS X machine is
unable to translate pypy, so it's not exactly the best buildbot ever.
Can anyone contribute any machine for one of those buildbots?

Cheers
fijal

From p.giarrusso at gmail.com  Tue Jul 27 16:36:26 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Tue, 27 Jul 2010 16:36:26 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	
Message-ID: 

Hi all!

I am possibly interested in doing work on this, even if not in the
immediate future.

On Tue, Jul 27, 2010 at 11:48, Maciej Fijalkowski  wrote:
> On Tue, Jul 27, 2010 at 4:09 AM, Kevin Ar18  wrote:

> Truly
> concurrent threads (depends on implicit vs explicit shared memory)
> might require a truly concurrent GC to achieve performance. This is
> work (although not as big as removing refcounting from CPython for
> example).

>> Are there detailed docs on why the Python GIL exists?
>> -----------------------------------------------------
>> I don't mean trivial statements like "because of C extensions" or "because the interpreter can't handle it".
>> It may be possible that my particular usage would not require the GIL. However, I won't know this until I can understand what threading problems the Python interpreter has that the GIL was meant to protect against. Is there detailed documentation about this anywhere that covers all the threading issues that the GIL was meant to solve?

> The short answer is "yes". The long answer is that it's much easier to
> write interpreter assuming GIL is around. For fine-grained locking to
> work and be efficient, you would need:

> * The forementioned locking, to ensure that it's not that easy to
> screw things up.
I've wondered about the guarantees we need to offer to the
programmer, and my guess was that Jython's memory model is similar.
I've been concentrating on the dictionary of objects, on the
assumption that lists and most other built-in structures should be
locked by the programmer in case of concurrent modifications.

However, we don't want to require locking to support something like:
Thread 1:
obj.newmember=1;
Thread 2:
a = obj.oldmember;
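
The pattern can be tried out as a runnable sketch. On CPython the GIL
already makes it safe, so this only shows the behaviour a GIL-free
interpreter would have to preserve, not how to implement it. The names
Obj and run_race are hypothetical.

```python
import threading


class Obj(object):
    def __init__(self):
        self.oldmember = 42


def run_race(rounds=1000):
    """One thread adds new attributes while another reads an existing one."""
    obj = Obj()
    seen = []

    def writer():
        # each assignment inserts a new key into the object's attribute dict
        for i in range(rounds):
            setattr(obj, "newmember_%d" % i, i)

    def reader():
        # concurrent reads of an unrelated attribute take no explicit lock
        for _ in range(rounds):
            seen.append(obj.oldmember)

    threads = [threading.Thread(target=writer),
               threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj, seen
```

Every read should observe the original value and every write should end
up in the object, without the program taking any lock of its own.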

Looking for Jython memory model on Google produces some garbage and
then this document from Unladen Swallow:
http://code.google.com/p/unladen-swallow/wiki/MemoryModel
It implicitly agrees on what's above (since Jython and IronPython both
use thread-safe dictionaries), and then delves into issues about
allowed reorderings.
However, it requires that even racy code does not make the interpreter crash.

> * Possibly a JIT optimization that would remove some locking.
Any more specific ideas on this?
> * Some sort of concurrent GC (not specifically running in a separate
> thread, but having different pools of memory to allocate from)

Among all points, this seems the easiest design-wise. Having
per-thread pools is nowadays standard, so it's _just_ work (as opposed
to 'complicated design'). Parallel GCs become important just when lots
of garbage must be reclaimed.
A GC is called concurrent, rather than parallel, when it runs
concurrently with the mutator, and this usually reduces both pause
times and throughput, so you probably don't want this as default (it
is useful for particular programs, such as heavily interactive
programs or videogames, I guess), do you?

More details are here:
http://www.ibm.com/developerworks/java/library/j-jtp11253/

The trick used in the (mostly) concurrent collector of Hotspot seems
interesting: it uses two short-stop-the-world phases and lets the
program run in between. I think I'll look for a paper on it.

Cheers,
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Tue Jul 27 17:07:59 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 27 Jul 2010 17:07:59 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	 
	
Message-ID: 

On Tue, Jul 27, 2010 at 4:36 PM, Paolo Giarrusso  wrote:
> Hi all!
>
> I am possibly interested in doing work on this, even if not in the
> immediate future.

Well, talk is cheap. Would be great to see some work done of course.

Cheers,
fijal

From fijall at gmail.com  Tue Jul 27 17:11:43 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 27 Jul 2010 17:11:43 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	 
	
Message-ID: 

>
>> Truly
>> concurrent threads (depends on implicit vs explicit shared memory)
>> might require a truly concurrent GC to achieve performance. This is
>> work (although not as big as removing refcounting from CPython for
>> example).
>
>>> Are there detailed docs on why the Python GIL exists?
>>> -----------------------------------------------------
>>> I don't mean trivial statements like "because of C extensions" or "because the interpreter can't handle it".
>>> It may be possible that my particular usage would not require the GIL. However, I won't know this until I can understand what threading problems the Python interpreter has that the GIL was meant to protect against. Is there detailed documentation about this anywhere that covers all the threading issues that the GIL was meant to solve?
>
>> The short answer is "yes". The long answer is that it's much easier to
>> write interpreter assuming GIL is around. For fine-grained locking to
>> work and be efficient, you would need:
>
>> * The forementioned locking, to ensure that it's not that easy to
>> screw things up.
> I've wondered around the guarantees we need to offer to the
> programmer, and my guess was that Jython's memory model is similar.
> I've been concentrating on the dictionary of objects, on the
> assumption that lists and most other built-in structures should be
> locked by the programmer in case of concurrent modifications.
>
> However, we don't want to require locking to support something like:
> Thread 1:
> obj.newmember=1;
> Thread 2:
> a = obj.oldmember;
>
> Looking for Jython memory model on Google produces some garbage and
> then this document from Unladen Swallow:
> http://code.google.com/p/unladen-swallow/wiki/MemoryModel
> It implicitly agrees on what's above (since Jython and IronPython both
> use thread-safe dictionaries), and then delves into issues about
> allowed reorderings.
> However, it requires that even racy code does not make the interpreter crash.

I guess the main restraint is "interpreter should not crash" indeed.

>
>> * Possibly a JIT optimization that would remove some locking.
> Any more specific ideas on this?

Well, yes. Determining when object is local so you don't need to do
any locking, even though it escapes (this is also "just work", since
it has been done before).

>> * Some sort of concurrent GC (not specifically running in a separate
>> thread, but having different pools of memory to allocate from)
>
> Among all points, this seems the easiest design-wise. Having
> per-thread pools is nowadays standard, so it's _just_ work (as opposed
> to 'complicated design'). Parallel GCs become important just when lots
> of garbage must be reclaimed.
> A GC is called concurrent, rather than parallel, when it runs
> concurrently with the mutator, and this usually reduces both pause
> times and throughput, so you probably don't want this as default (it
> is useful for particular programs, such as heavily interactive
> programs or videogames, I guess), do you?

I guess I meant parallel then.

>
> More details are here:
> http://www.ibm.com/developerworks/java/library/j-jtp11253/
>
> The trick used in the (mostly) concurrent collector of Hotspot seems
> interesting: it uses two short-stop-the-world phases and lets the
> program run in between. I think I'll look for a paper on it.

Would be interested in that.

>
> Cheers,
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
>

From holger at merlinux.eu  Tue Jul 27 18:05:49 2010
From: holger at merlinux.eu (holger krekel)
Date: Tue, 27 Jul 2010 18:05:49 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared
	memory	message passing?
In-Reply-To: 
References: 
	
	
	
Message-ID: <20100727160548.GJ14601@trillke.net>

On Tue, Jul 27, 2010 at 17:07 +0200, Maciej Fijalkowski wrote:
> On Tue, Jul 27, 2010 at 4:36 PM, Paolo Giarrusso  wrote:
> > Hi all!
> >
> > I am possibly interested in doing work on this, even if not in the
> > immediate future.
> 
> Well, talk is cheap. Would be great to see some work done of course.

Well, I think it can be useful to state intentions and interest.  At least
for my projects I feel a difference if people express interest (even through
negative feedback or broken code) or if they are indifferent, 
not saying or doing anything. 

best,
holger

From jbaker at zyasoft.com  Tue Jul 27 19:58:17 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Tue, 27 Jul 2010 11:58:17 -0600
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: <20100727160548.GJ14601@trillke.net>
References: 
	 
	 
	 
	<20100727160548.GJ14601@trillke.net>
Message-ID: 

A much shorter version of the Jython memory model can be found in my book:
http://jythonpodcast.hostjava.net/jythonbook/en/1.0/Concurrency.html#python-memory-model

In general, I would think the coroutine mechanism being implemented by Lukas
Stadler for the MLVM version of the hotspot JVM might be a good option; you
can directly control the scheduling, although I don't think you can change the
mapping from one hardware thread to another. (That's probably not
interesting.)

There are good results with JRuby; it would be nice to replicate them with
Jython, and it should be really straightforward to do that. See
http://classparser.blogspot.com/

- Jim

On Tue, Jul 27, 2010 at 10:05 AM, holger krekel  wrote:

> On Tue, Jul 27, 2010 at 17:07 +0200, Maciej Fijalkowski wrote:
> > On Tue, Jul 27, 2010 at 4:36 PM, Paolo Giarrusso 
> wrote:
> > > Hi all!
> > >
> > > I am possibly interested in doing work on this, even if not in the
> > > immediate future.
> >
> > Well, talk is cheap. Would be great to see some work done of course.
>
> Well, I think it can be useful to state intentions and interest.  At least
> for my projects I feel a difference if people express interest (even
> through
> negative feedback or broken code) or if they are indifferent,
> not saying or doing anything.
>
> best,
> holger
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From kevinar18 at hotmail.com  Tue Jul 27 20:20:10 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Tue, 27 Jul 2010 14:20:10 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: <20100727062702.GE12699@tunixman.com>
References: ,
	<20100727062702.GE12699@tunixman.com>
Message-ID: 


I won't even bother giving individual replies. It's
going to take me some time to go through all that information on the
GIL, so I guess there's not much of a reply I can give anyways. :) Let me explain what this is all about in greater detail.



BTW, if there are more links on the GIL, feel free to post.

> Anonymous memory-mapped regions would work, with a suitable data
> abstraction. Or even memory-mapped files, which aren't really all that
> different on systems anymore.
I considered that... however, that would mean writing a significant library to convert Python data types to C/machine types and I wasn't looking forward to that prospect... although after some experimenting, maybe I will find that it won't be that big a deal for my particular situation.

-----------------------
What this is all about:
-----------------------
I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf). There is something very similar in Python (http://www.kamaelia.org/MiniAxon.html). Also, there are some similarities to Erlang - the share-nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.

Consider p74 and p75 of the FBP book (http://www.jpaulmorrison.com/fbp/book.pdf). Programs essentially consist of many "black boxes" connected together. A box receives data, processes it and passes it along to another box, sends it to output, or drops/deletes it. Each box is like a mini-program written in a traditional programming language (like C++ or Python).

The process of connecting the boxes together was actually designed to be programmed visually, as you can see from the examples in the book (I have no idea if it works well, as I am merely starting to experiment with it).

Each box is a self-contained "program"; the only data it has access to comes in 3 parts:
(1) its own internal variables
(2) The "in ports". These are connections from other boxes allowing the box to receive data to be processed (very similar to the arguments in a function call)
(3) The "out ports". After processing the data, the box sends results to various "out ports" (which, in turn, go to another box's "in port" or to system output). There is no "return" like in functions... and a box can continually generate many pieces of data on the "out ports", unlike a function which only generates one return.
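
As a rough illustration of the three parts listed above, a box could be sketched in Python as follows. This is only a guess at a reasonable shape, not anything taken from the FBP book: the names Box, has_input, step, classify and demo are invented, and ports are assumed to be plain deques.

```python
from collections import deque


class Box(object):
    """A box sees only its own state, its in ports and its out ports."""

    def __init__(self, body, in_names, out_names):
        self.body = body
        self.in_ports = dict((name, deque()) for name in in_names)
        self.out_ports = dict((name, deque()) for name in out_names)

    def has_input(self):
        # True if at least one in port holds data waiting to be processed.
        return any(self.in_ports.values())

    def step(self):
        # The body is handed the ports and nothing else.
        self.body(self.in_ports, self.out_ports)


def classify(in_ports, out_ports):
    # Unlike a function with a single return value, a box may emit
    # zero or more pieces of data on any of its out ports.
    number = in_ports["numbers"].popleft()
    out_ports["all"].append(number)
    if number % 2 == 0:
        out_ports["even"].append(number)


def demo():
    box = Box(classify, ["numbers"], ["all", "even"])
    box.in_ports["numbers"].extend([1, 2, 3, 4])
    while box.has_input():
        box.step()
    return list(box.out_ports["all"]), list(box.out_ports["even"])
```

Connecting two boxes would then amount to letting one box's out port and another box's in port be the same deque.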


------------------------
At this point, my understanding of the FBP concept is extremely limited. Unfortunately, the author does not have very detailed documentation on the implementation details. So, I am going to try exploring the concept on my own and see if I can actually use it in some production code.


Implementation of FBP requires a custom scheduler for several reasons:
(1) A box can only run if it has actual data on the "in port(s)". Thus, the scheduler would only schedule boxes to run when they can actually process some data.
(2) In theory, it may be possible to end up with hundreds or thousands of these lightweight boxes. Using heavyweight OS threads or processes for every one is out of the question.


The Kamaelia website describes a simplistic single-threaded way to write a scheduler in Python that would work for the FBP concept (even though they had never heard of FBP when they designed Kamaelia). Based on that, it seems like writing a simple scheduler would be rather easy.
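
A minimal sketch of such a single-threaded scheduler, assuming boxes are written as Python generators that yield whenever they are willing to give up control. This is an assumption about the general style, not Kamaelia's actual code, and the names producer, doubler, run and demo are invented. The scheduler only resumes a box when its in port holds data, which is requirement (1) above.

```python
from collections import deque


def producer(outbox, items):
    # A source box: it has no in port and emits one item per turn.
    for item in items:
        outbox.append(item)
        yield  # hand control back to the scheduler


def doubler(inbox, outbox):
    # A processing box: drains its in port, then waits for more data.
    while True:
        while inbox:
            outbox.append(inbox.popleft() * 2)
        yield


def run(boxes):
    """Round-robin over (generator, inbox) pairs until nothing can run.

    An inbox of None marks a source box, which is always runnable
    until its generator finishes.
    """
    boxes = list(boxes)
    progressed = True
    while progressed:
        progressed = False
        for entry in list(boxes):
            gen, inbox = entry
            if inbox is not None and not inbox:
                continue  # no data on the in port: do not schedule this box
            try:
                next(gen)
                progressed = True
            except StopIteration:
                boxes.remove(entry)


def demo():
    pipe = deque()
    results = deque()
    run([
        (producer(pipe, [1, 2, 3]), None),
        (doubler(pipe, results), pipe),
    ])
    return list(results)
```

Switching between boxes here is just a generator resume, so thousands of boxes cost far less than thousands of OS threads.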


In a perfect world, here's what I might do:
* Assume a quad core cpu
(1) Spawn 1 process
(2) Spawn 4 threads & assign each thread to only 1 core -- in other words, don't let the OS handle moving threads around to different cores
(3) Inside each thread, have a mini scheduler that switches back and forth between the many micro-threads (or "boxes") -- note that the OS should not handle any of the switching between micro-threads/boxes as it does it all wrong (and too heavyweight) for this situation.
(4) Using a shared memory queue, each of the 4 schedulers can get the next box to run... or add more boxes to the schedule queue.

(5) Each box has access to its "in ports" and "out ports" only -- and nothing else. These can be implemented as shared memory for speed.
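
The structure of steps (1) to (4) can be sketched with the standard threading and queue modules. This shows only the shape of the design: with a GIL the worker threads will not actually run Python code in parallel, and pinning threads to cores is left out. The names worker, run_boxes, make_countdown and demo are invented for the example.

```python
import queue
import threading


def worker(run_queue):
    """One scheduler thread: run boxes from the shared queue until told to stop."""
    while True:
        box = run_queue.get()
        try:
            if box is None:      # sentinel: no more work for this worker
                return
            box(run_queue)       # a box may schedule follow-up boxes
        finally:
            run_queue.task_done()


def run_boxes(initial_boxes, n_workers=4):
    run_queue = queue.Queue()    # the shared schedule queue of step (4)
    for box in initial_boxes:
        run_queue.put(box)
    workers = [threading.Thread(target=worker, args=(run_queue,))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    run_queue.join()             # wait until every scheduled box has run
    for _ in workers:
        run_queue.put(None)      # one shutdown sentinel per worker
    for w in workers:
        w.join()


def make_countdown(n, out_port):
    # A box that emits n and, while n > 0, schedules its successor.
    def box(run_queue):
        out_port.put(n)
        if n > 0:
            run_queue.put(make_countdown(n - 1, out_port))
    return box


def demo():
    out_port = queue.Queue()
    run_boxes([make_countdown(3, out_port), make_countdown(2, out_port)])
    items = []
    while not out_port.empty():
        items.append(out_port.get())
    return sorted(items)
```

A follow-up box is put on the queue before the current box is marked done, so the join cannot return while work is still pending.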


Some notes:
Garbage Collection - I noticed that one of the issues mentioned about the GIL was garbage collection. Within the FBP concept, this MIGHT be easily solved: (a) only 1 running piece of code (1 box) can access a piece of data at a time, so there are no worries about whether there are dangling pointers to the var/object somewhere, etc... (b) data must be manually "dropped" inside a box to get rid of it; thus, there is no need to go checking for data that is not used anymore

Threading protection - In theory, there are significantly fewer threading issues since: (a) only one box can control/access data at a time (b) the only place where there is contention is when you push/pop from the in/out ports ... and that is trivial to protect against.



Anyways, I appreciate the replies. At this point, I guess I'll just go for a simplistic implementation to get a feel for how things work. Then, maybe I can check whether something better can be done in PyPy.
 		 	   		  

From cfbolz at gmx.de  Tue Jul 27 23:56:26 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Tue, 27 Jul 2010 23:56:26 +0200
Subject: [pypy-dev] rotting buildbot infrastructure
In-Reply-To: 
References: 
Message-ID: <4C4F560A.6080101@gmx.de>

On 07/27/2010 03:42 PM, Maciej Fijalkowski wrote:
> Hello.
>
> According to current buildbot status, both osx and win machines are
> offline. No clue how to get them back. Anyway, our OS X machine is
> unable to translate pypy, so it's not exactly the best buildbot ever.
> Can anyone contribute any machine for one of those buildbots?

Sorry, I will only be able to look at the OS X machine in August. Why 
can't it translate PyPy?

Carl Friedrich

From fijall at gmail.com  Wed Jul 28 08:42:22 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 28 Jul 2010 08:42:22 +0200
Subject: [pypy-dev] rotting buildbot infrastructure
In-Reply-To: <4C4F560A.6080101@gmx.de>
References:  
	<4C4F560A.6080101@gmx.de>
Message-ID: 

On Tue, Jul 27, 2010 at 11:56 PM, Carl Friedrich Bolz  wrote:
> On 07/27/2010 03:42 PM, Maciej Fijalkowski wrote:
>> Hello.
>>
>> According to current buildbot status, both osx and win machines are
>> offline. No clue how to get them back. Anyway, our OS X machine is
>> unable to translate pypy, so it's not exactly the best buildbot ever.
>> Can anyone contribute any machine for one of those buildbots?
>
> Sorry, I will only be able to look at the OS X machine in August. Why
> can't it translate PyPy?

There is not enough memory (the build times out after 4 or 5 hours).

>
> Carl Friedrich

From stephen at thorne.id.au  Wed Jul 28 09:29:51 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Wed, 28 Jul 2010 17:29:51 +1000
Subject: [pypy-dev] rotting buildbot infrastructure
In-Reply-To: 
References: 
	<4C4F560A.6080101@gmx.de>
	
Message-ID: <20100728072951.GF1338@thorne.id.au>

On 2010-07-28, Maciej Fijalkowski wrote:
> On Tue, Jul 27, 2010 at 11:56 PM, Carl Friedrich Bolz  wrote:
> > On 07/27/2010 03:42 PM, Maciej Fijalkowski wrote:
> >> Hello.
> >>
> >> According to current buildbot status, both osx and win machines are
> >> offline. No clue how to get them back. Anyway, our OS X machine is
> >> unable to translate pypy, so it's not exactly the best buildbot ever.
> >> Can anyone contribute any machine for one of those buildbots?
> >
> > Sorry, I will only be able to look at the OS X machine in August. Why
> > can't it translate PyPy?
> 
> There is not enough memory (the build times out after 4 or 5 hours).

I have a quad core ppc OSX (10.4) machine that isn't currently operating. It
only has a few of its RAM slots filled. If anyone wanted to fill it with RAM it
would make a reasonable build machine.

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

From william.leslie.ttg at gmail.com  Wed Jul 28 14:54:39 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Wed, 28 Jul 2010 22:54:39 +1000
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
Message-ID: 

On 28 July 2010 04:20, Kevin Ar18  wrote:
> I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf). There is something very similar in Python (http://www.kamaelia.org/MiniAxon.html). Also, there are some similarities to Erlang - the share-nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.

Does anyone know if there is a central resource for incompatible
python memory model proposals? I know of Jython, Python-Safethread,
and Mont-E.

I do like the idea of MiniAxon, but let me mention a topic that has
slowly been bubbling to the front of my mind for the last few months.

Concurrency in the face of shared mutable state is hard. It makes it
trivial to introduce bugs all over the place. Nondeterminacy-related
bugs are so much harder to test, diagnose, and fix than anything else that
I would almost mandate static verification (via optional typing,
probably) of task noninterference if I was moving to a concurrent
environment with shared mutable state. There might be a reasonable
middle ground where, if a task attempts to violate the required static
semantics, it fails dynamically. At least then, latent bugs make
plenty of noise. An example for MiniAxon (as I understand it, which is
not very well) would be verification that a "task" (including
functions that the task calls) never closes over and yields the same
mutable objects, and never mutates globally reachable objects.

I wonder if you could close such tasks off with a clever subclass of
the proxy object space that detects and rejects such memory model
violations? With only semantics that make the program deterministic?

The moral equivalent would be cooperating processes with a large
global (or not) shared memory store for immutable objects, queues for
communication, and the additional semantic that objects in a queue are
either immutable or the queue holds their only reference. The trouble
is that it is so hard to work out what immutable really means.
Non-optional annotations would not be very Pythonic.
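
A very rough sketch of the "fails dynamically" middle ground for the queue semantics described above, in plain Python. It is single-threaded and purely illustrative. All names (Owned, TransferQueue, IMMUTABLE_TYPES) are invented, and the check is only a convention: a sender that kept the raw object obtained from get() can still cheat.

```python
from collections import deque

# Assumed whitelist of types treated as immutable for this sketch.
IMMUTABLE_TYPES = (int, float, str, bytes, bool, type(None))


class Owned(object):
    """Handle to a mutable object; only a valid handle may reach the object."""

    def __init__(self, value):
        self._value = value
        self._valid = True

    def get(self):
        if not self._valid:
            raise RuntimeError("object was handed over to a queue")
        return self._value

    def _release(self):
        # Give up the payload and invalidate this handle.
        value = self.get()
        self._valid = False
        self._value = None
        return value


class TransferQueue(object):
    """Queue whose items are either immutable or exclusively owned."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        if isinstance(item, Owned):
            # Move the payload into a fresh handle; the sender's handle dies,
            # so later use by the sender fails loudly.
            self._items.append(Owned(item._release()))
        elif isinstance(item, IMMUTABLE_TYPES):
            self._items.append(item)
        else:
            raise TypeError("only immutable values or Owned handles may be sent")

    def get(self):
        return self._items.popleft()
```

After a put, the sender's handle raises on access, so a latent sharing bug makes noise at the point of misuse instead of corrupting data silently.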

-- 
William Leslie

From p.giarrusso at gmail.com  Wed Jul 28 15:12:40 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Wed, 28 Jul 2010 15:12:40 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
Message-ID: 

On Tue, Jul 27, 2010 at 20:20, Kevin Ar18  wrote:
>
> I won't even bother giving individual replies. It's
> going to take me some time to go through all that information on the
> GIL, so I guess there's not much of a reply I can give anyways. :) Let me explain what this is all about in greater detail.

> BTW, if there are more links on the GIL, feel free to post.
>
>> Anonymous memory-mapped regions would work, with a suitable data
>> abstraction. Or even memory-mapped files, which aren't really all that
>> different on systems anymore.
> I considered that... however, that would mean writing a significant library to convert Python data types to C/machine types and I wasn't looking forward to that prospect... although after some experimenting, maybe I will find that it won't be that big a deal for my particular situation.

> I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf). There is something very similar in Python (http://www.kamaelia.org/MiniAxon.html). Also, there are some similarities to Erlang - the share-nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.
Except for the "visual programming" part, the general idea you
describe stems from CSP (Communicating Sequential Processes) and is
also found at least in the Scala actor library and in Google's Go with
goroutines.

In both languages you can easily pretend that no memory is shared by
not sharing any pointers (unlike in C, even buggy code can't modify
a pointer which wasn't shared), and Go recommends programming this
way. One difference is that this is only a convention.

For the "visual programming", it looks like a particular case of what
the Eclipse Modeling Framework is doing (they allow you to define
types of diagrams, called metamodels, and a way to convert them to
code, and generate a diagram editor and other support stuff. I'm not
an expert on that).
From what you describe, FBP seems to give nothing new, except the
combination of "visual programming" with this idea. Disclaimer: I
did not read the book.

> Consider p74 and p75 of the FBP book (http://www.jpaulmorrison.com/fbp/book.pdf). Programs essentially consist of many "black boxes" connected together. A box receives data, processes it and passes it along to another box, sends it to output, or drops/deletes it. Each box is like a mini-program written in a traditional programming language (like C++ or Python).
>
> The process of connecting the boxes together was actually designed to be programmed visually, as you can see from the examples in the book (I have no idea if it works well, as I am merely starting to experiment with it).
>
> Each box is a self-contained "program"; the only data it has access to comes in 3 parts:
> (1) its own internal variables
> (2) The "in ports". These are connections from other boxes allowing the box to receive data to be processed (very similar to the arguments in a function call)
> (3) The "out ports". After processing the data, the box sends results to various "out ports" (which, in turn, go to another box's "in port" or to system output). There is no "return" like in functions... and a box can continually generate many pieces of data on the "out ports", unlike a function which only generates one return.
>
>
> ------------------------
> At this point, my understanding of the FBP concept is extremely limited. Unfortunately, the author does not have very detailed documentation on the implementation details. So, I am going to try exploring the concept on my own and see if I can actually use it in some production code.
>
>
> Implementation of FBP requires a custom scheduler for several reasons:
> (1) A box can only run if it has actual data on the "in port(s)". Thus, the scheduler would only schedule boxes to run when they can actually process some data.
> (2) In theory, it may be possible to end up with hundreds or thousands of these lightweight boxes. Using heavyweight OS threads or processes for every one is out of the question.
>
>
> The Kamaelia website describes a simplistic single-threaded way to write a scheduler in Python that would work for the FBP concept (even though they had never heard of FBP when they designed Kamaelia). Based on that, it seems like writing a simple scheduler would be rather easy.

> In a perfect world, here's what I might do:
> * Assume a quad core cpu
> (1) Spawn 1 process
> (2) Spawn 4 threads & assign each thread to only 1 core -- in other words, don't let the OS handle moving threads around to different cores
> (3) Inside each thread, have a mini scheduler that switches back and forth between the many micro-threads (or "boxes") -- note that the OS should not handle any of the switching between micro-threads/boxes as it does it all wrong (and too heavyweight) for this situation.
> (4) Using a shared memory queue, each of the 4 schedulers can get the next box to run... or add more boxes to the schedule queue.

Most of this is usual or standard, although some people won't set
thread-CPU affinity, possibly because they don't know about the
syscalls to do it, i.e. sched_setaffinity. IIRC, this was not
mentioned in the paper I read about the Scala actor library.
Look for 'N:M threading library' (without quotes) on Google.
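
For reference, Python 3.3 and later expose the sched_setaffinity syscall mentioned above through the os module on platforms that support it, mainly Linux. A minimal sketch follows; the helper name pin_to_cpu is invented, and it relies on the Linux behaviour that pid 0 refers to the calling thread.

```python
import os


def pin_to_cpu(cpu=None):
    """Pin the calling thread to one CPU.

    Returns the CPU id used, or None where the call is not available
    (for example on macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))
    if cpu is None:
        cpu = allowed[0]             # pick any CPU we are allowed to run on
    os.sched_setaffinity(0, {cpu})   # pid 0: the caller itself
    return cpu
```

Each scheduler thread would call this once at startup with its own CPU id.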

> (5) Each box has access to its "in ports" and "out ports" only -- and nothing else. These can be implemented as shared memory for speed.

> Some notes:
> Garbage Collection - I noticed that one of the issues mentioned about the GIL was garbage collection. Within the FBP concept, this MIGHT be easily solved: (a) only 1 running piece of code (1 box) can access a piece of data at a time, so there are no worries about whether there are dangling pointers to the var/object somewhere, etc...

> (b) data must be manually "dropped" inside a box to get rid of it; thus, there is no need to go checking for data that is not used anymore

A "piece of data" can point to other objects, and the pointer can be
modified. So you need GC anyway: having that, requiring data to be
dropped explicitly seems just an annoyance (there might be deeper
reasons, however).

> Threading protection - In theory, there are significantly fewer threading issues since: (a) only one box can control/access data at a time (b) the only place where there is contention is when you push/pop from the in/out ports ... and that is trivial to protect against.
Agreed.
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From p.giarrusso at gmail.com  Wed Jul 28 15:37:07 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Wed, 28 Jul 2010 15:37:07 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
	
Message-ID: 

On Wed, Jul 28, 2010 at 14:54, William Leslie
 wrote:
> On 28 July 2010 04:20, Kevin Ar18  wrote:
>> I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf). There is something very similar in Python (http://www.kamaelia.org/MiniAxon.html). Also, there are some similarities to Erlang - the share-nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.

> Does anyone know if there is a central resource for incompatible
> python memory model proposals? I know of Jython, Python-Safethread,
> and Mont-E.

Add Unladen Swallow to your list - the "Jython memory model" is undocumented.
I don't know of Mont-E, can't find its website through Google (!), and
there seems to be no such central resource.

> I do like the idea of MiniAxon, but let me mention a topic that has
> slowly been bubbling to the front of my mind for the last few months.

> Concurrency in the face of shared mutable state is hard. It makes it
> trivial to introduce bugs all over the place. Nondeterminacy-related
> bugs are so much harder to test, diagnose, and fix than anything else that
> I would almost mandate static verification (via optional typing,
> probably) of task noninterference if I was moving to a concurrent
> environment with shared mutable state.

This is a general issue with concurrency, and usually I try to solve
this using more pencil-and-paper design than usual.

> There might be a reasonable
> middle ground where, if a task attempts to violate the required static
> semantics, it fails dynamically. At least then, latent bugs make
> plenty of noise.

In general, I've seen lots of research on this, and something
implemented in Valgrind - see here for links:
http://blaisorbladeprog.blogspot.com/2010/07/automatic-race-detection.html.
Given the interest in this, the lack of complete tools might mean that
it is just too hard currently.

> An example for MiniAxon (as I understand it, which is
> not very well) would be verification that a "task" (including
> functions that the task calls) never closes over and yields the same
> mutable objects, and never mutates globally reachable objects.

I guess that 'close over' here means 'getting as input'.

> I wonder if you could close such tasks off with a clever subclass of
> the proxy object space that detects and rejects such memory model
> violations? With only semantics that make the program deterministic?

> The moral equivalent would be cooperating processes with a large
> global (or not) shared memory store for immutable objects, queues for
> communication, and the additional semantic that objects in a queue are
> either immutable or the queue holds their only reference.

In C++, auto_ptr does this, but that's hard in Python.

> The trouble
> is that it is so hard to work out what immutable really means.
> Non-optional annotations would be not very pythonian.

If you want static guarantees, you need a statically typed language.
The usual argument for dynamic languages is that instead of static
typing, you need to write unit tests, and since you must do that
anyway, dynamic languages are a win. We have two incomplete approaches
to making programs correct, plus a possible middle ground:
- Types give strong guarantees against a subset of errors (you
_never_ get certain errors from a program which compiles)
- Testing gives weak guarantees (which go just as far as you test),
but covers all classes of errors
- The middle ground would be to require annotations to prove
properties. One would need (once and for all) to annotate even strings
as immutable!
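
To illustrate why it is hard to pin down what immutable means, here is a small sketch of a dynamic check based on a whitelist. The function name and the whitelist are assumptions made for the example, and the check is deliberately conservative: it rejects anything it does not know, including subclasses of whitelisted types, since those may carry mutable attributes.

```python
# Built-in types assumed to be immutable all the way down.
ATOMIC_IMMUTABLE = (int, float, complex, bool, str, bytes, type(None))


def is_deeply_immutable(obj):
    """Conservatively decide whether obj can never change after creation."""
    if type(obj) in ATOMIC_IMMUTABLE:
        return True
    if type(obj) in (tuple, frozenset):
        # An immutable container is only as immutable as its contents.
        return all(is_deeply_immutable(item) for item in obj)
    return False
```

A tuple holding a list is immutable at the top level but not deeply, which is exactly the kind of case an annotation scheme would have to spell out.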

Cheers,
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From william.leslie.ttg at gmail.com  Wed Jul 28 16:56:43 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 00:56:43 +1000
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
Message-ID: 

On 28 July 2010 23:37, Paolo Giarrusso  wrote:
> On Wed, Jul 28, 2010 at 14:54, William Leslie
>  wrote:
>> Does anyone know if there is a central resource for incompatible
>> python memory model proposals? I know of Jython, Python-Safethread,
>> and Mont-E.
>
> Add Unladen Swallow to your list - the "Jython memory model" is undocumented.
> I don't know of Mont-E, can't find its website through Google (!), and
> there seems to be no such central resource.

Mont-E was, for a long time, the hypothetical capability-secure subset
of python based on E and discussed on cap-talk. A handful of people
started work on it in earnest as a cpython fork fairly recently, but
it does seem to be pretty quiet, and documentation free. I did find a
repository and a presentation:
  http://bytebucket.org/habnabit/mont-e/overview
  https://docs.google.com/present/view?id=d9wrrrq_15ch78nq9n

> This is a general issue with concurrency, and usually I try to solve
> this using more pencil-and-paper design than usual.

I found the following paper pretty interesting. The motivating study
is some concurrency experts implementing software for proving the lack
of deadlock in Java. Even with the sort of dedication that only a
researcher with no life can provide, their deadlock inference software
itself deadlocked after many years of use.
www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

>> An example for MiniAxon (as I understand it, which is
>> not very well) would be verification that a "task" (including
>> functions that the task calls) never closes over and yields the same
>> mutable objects, and never mutates globally reachable objects.
>
> I guess that 'close over' here means 'getting as input'.

I mean that it keeps a reference to the objects between invocations.
Hence, sharing mutable state.
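
A small illustration in Python, with invented names, of a task that keeps a reference to an object between invocations and hands the same mutable object out each time, next to a variant that hands out immutable snapshots instead:

```python
def leaky_task():
    buffer = []              # kept alive between invocations
    for i in range(3):
        buffer.append(i)
        yield buffer         # every receiver gets the same list object


def isolated_task():
    buffer = []
    for i in range(3):
        buffer.append(i)
        yield tuple(buffer)  # an immutable snapshot; nothing is shared


def demo():
    leaked = list(leaky_task())
    snapshots = list(isolated_task())
    return leaked, snapshots
```

In the first case, everything received earlier changes under the receiver's feet as the task keeps running; in the second, each received value stays as it was.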

>> The trouble
>> is that it is so hard to work out what immutable really means.
>> Non-optional annotations would be not very pythonian.
>
> If you want static guarantees, you need a statically typed language.
> The usual argument for dynamic languages is that instead of static
> typing, you need to write unit tests, and since you must do that
> anyway, dynamic languages are a win.

One thing that many even very experienced hackers miss is that
(static) types (and typesystems) actually cover a broad range of
usages, and many of them are very different to the structural
typesafety systems you are used to in C# and Java. A typesystem can
prove anything that is statically computable, from the noninterference
of effects to program termination, the ability to stack allocate data
structures, and that privileged information can't be tainted.

It's important to realise that these are orthogonal to, not supersets
of, typesystems that validate structural safety. So it can be
reasonable, if yet a little more difficult, to apply them to dynamic
languages.

-- 
William Leslie

From glavoie at gmail.com  Wed Jul 28 21:32:38 2010
From: glavoie at gmail.com (Gabriel Lavoie)
Date: Wed, 28 Jul 2010 15:32:38 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
Message-ID: 

Hello Kevin,
     I don't know if it can be a solution to your problem but for my Master's
thesis I'm working on making Stackless Python distributed. What I did is
working but not complete and I'm right now in the process of writing the
thesis (in French unfortunately). My code currently works with PyPy's
"stackless" module only and uses some PyPy-specific things. Here's what I
added to Stackless:

- Possibility to move tasklets easily (ref_tasklet.move(node_id)). A node is
an instance of an interpreter.
- Each tasklet has its global namespace (to avoid sharing of data). The
state is also easier to move to another interpreter this way.
- Distributed channels: All requests are known by all nodes using the
channel.
- Distributed objects: When a reference is sent to a remote node, the object
is not copied; a reference is created using PyPy's proxy object space.
- Automated dependency recovery when an object or a tasklet is loaded on
another interpreter

With a proper scheduler, many tasklets could be automatically spread in
multiple interpreters to use multiple cores or on multiple computers. A bit
like the N:M threading model where N lightweight threads/coroutines can be
executed on M threads.

The API is described here in French but it's pretty straightforward:
https://w3.mutehq.net/wiki/maitrise/API_DStackless

The code is available here (Just click on the Download link next to the
trunk folder):
https://w3.mutehq.net/websvn/wildchild/dstackless/trunk/

You need pypy-c built with --stackless. The code is a bit buggy right now
though...

From kevinar18 at hotmail.com  Thu Jul 29 02:59:21 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Wed, 28 Jul 2010 20:59:21 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: ,
	
Message-ID: 

> I don't know if it can be a solution to your problem but for my Master
> Thesis I'm working on making Stackless Python distributed.

It might be of use.  Thanks for the heads up.  I do have several questions:

1) Is it PyPy's stackless module or Stackless Python (stackless.com)?  Or are they the same module?
2) Do you have a non-https version of the site or one with a publicly signed certificate?

P.S. You can send your reply over private email if you want, so as to not bother the list. :)

Date: Wed, 28 Jul 2010 15:32:38 -0400
Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared memory 	message passing?
From: glavoie at gmail.com
To: kevinar18 at hotmail.com
CC: pypy-dev at codespeak.net

Hello Kevin,
     I don't know if it can be a solution to your problem but for my Master Thesis I'm working on making Stackless Python distributed. What I did is working but not complete and I'm right now in the process of writing the thesis (in French unfortunately). My code currently works with PyPy's "stackless" module only and uses some PyPy specific things. Here's what I added to Stackless:

- Possibility to move tasklets easily (ref_tasklet.move(node_id)). A node is an instance of an interpreter.
- Each tasklet has its global namespace (to avoid sharing of data). The state is also easier to move to another interpreter this way.
- Distributed channels: All requests are known by all nodes using the channel.
- Distributed objects: When a reference is sent to a remote node, the object is not copied; instead, a reference is created using PyPy's proxy object space.
- Automated dependency recovery when an object or a tasklet is loaded on another interpreter

With a proper scheduler, many tasklets could be automatically spread across multiple interpreters to use multiple cores or multiple computers. A bit like the N:M threading model where N lightweight threads/coroutines can be executed on M threads.

The API is described here in French but it's pretty straightforward: https://w3.mutehq.net/wiki/maitrise/API_DStackless

The code is available here (Just click on the Download link next to the trunk folder): https://w3.mutehq.net/websvn/wildchild/dstackless/trunk/

You need pypy-c built with --stackless. The code is a bit buggy right now though...
 		 	   		  

From kevinar18 at hotmail.com  Thu Jul 29 03:33:04 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Wed, 28 Jul 2010 21:33:04 -0400
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: ,
	<20100727062702.GE12699@tunixman.com>,
	,
	
Message-ID: 

As a followup to my earlier post:
"pre-emptive micro-threads utilizing shared memory message passing?"

I am actually finding that the biggest hurdle to accomplishing what I want is the lack of ANY type of shared memory -- even if it is limited.  I wonder if I might ask a question:

Would the following be a possible way to offer a limited type of shared memory:

Summary: create a system very, very similar to POSH, but with differences:

In detail, here's what I mean:
* unlike POSH, utilize OS threads and shared memory (not processes)
* Create a special shared memory location where you can place Python objects
* Each Python object you place into this location can only be accessed (modified) by 1 thread.
* You must manually assign ownership of an object to a particular thread.
* The thread that "owns" the object is the only one that can modify it.
* You can transfer ownership to another thread (but, as always only the owner can modify it).

* There is no GIL when a thread interacts with these special objects.  You can have true thread parallelism if your code uses a lot of these special objects.
* The GIL remains in place for all other data access.
* If your code has a mixture of access to the special objects and regular data, then once you hit a point where a thread starts to interact with data not in the special storage, then that thread must follow GIL rules.

Granted, there might be some difficulty with the GIL part... but I thought I might ask anyways. :)
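
The ownership rules listed above can be modelled in ordinary Python to see how they would feel in use. This is only a sketch under assumptions: the class names Owned and OwnershipError are hypothetical, the check happens at run time on every access, and nothing here removes or bypasses the GIL.

```python
import threading

class OwnershipError(RuntimeError):
    """Raised when a thread that does not own the object tries to use it."""

class Owned:
    """Hypothetical wrapper: only the owning thread may access the wrapped value."""

    def __init__(self, value):
        self._value = value
        self._owner = threading.get_ident()   # the creating thread owns it first

    def _check(self):
        if threading.get_ident() != self._owner:
            raise OwnershipError("object is owned by another thread")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value

    def transfer(self, thread_ident):
        """Hand the object to another thread; only the current owner may do so."""
        self._check()
        self._owner = thread_ident
```

After box.transfer(other_thread.ident) the previous owner gets an OwnershipError on any further get() or set(). The wrapper cannot stop a thread from keeping the raw value it obtained earlier through get(), which is one reason the real thing would need interpreter support.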

> Date: Wed, 28 Jul 2010 22:54:39 +1000
> Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared memory 	message passing?
> From: william.leslie.ttg at gmail.com
> To: kevinar18 at hotmail.com
> CC: pypy-dev at codespeak.net
> 
> On 28 July 2010 04:20, Kevin Ar18  wrote:
> > I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf)  There is something very similar in Python: http://www.kamaelia.org/MiniAxon.html  Also, there are some similarities to Erlang - the share nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.
> 
> Does anyone know if there is a central resource for incompatible
> python memory model proposals? I know of Jython, Python-Safethread,
> and Mont-E.
> 
> I do like the idea of MiniAxon, but let me mention a topic that has
> slowly been bubbling to the front of my mind for the last few months.
> 
> Concurrency in the face of shared mutable state is hard. It makes it
> trivial to introduce bugs all over the place. Nondeterminacy related
> bugs are far harder to test, diagnose, and fix than anything else that
> I would almost mandate static verification (via optional typing,
> probably) of task noninterference if I was moving to a concurrent
> environment with shared mutable state. There might be a reasonable
> middle ground where, if a task attempts to violate the required static
> semantics, it fails dynamically. At least then, latent bugs make
> plenty of noise. An example for MiniAxon (as I understand it, which is
> not very well) would be verification that a "task" (including
> functions that the task calls) never closes over and yields the same
> mutable objects, and never mutates globally reachable objects.
> 
> I wonder if you could close such tasks off with a clever subclass of
> the proxy object space that detects and rejects such memory model
> violations? With only semantics that make the program deterministic?
> 
> The moral equivalent would be cooperating processes with a large
> global (or not) shared memory store for immutable objects, queues for
> communication, and the additional semantic that objects in a queue are
> either immutable or the queue holds their only reference. The trouble
> is that it is so hard to work out what immutable really means.
> Non-optional annotations would be not very pythonian.
> 
> -- 
> William Leslie
 		 	   		  

From alex.gaynor at gmail.com  Thu Jul 29 03:44:23 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Wed, 28 Jul 2010 20:44:23 -0500
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
Message-ID: 

On Wed, Jul 28, 2010 at 8:33 PM, Kevin Ar18  wrote:
> As a followup to my earlier post:
> "pre-emptive micro-threads utilizing shared memory message passing?"
>
> I am actually finding that the biggest hurdle to accomplishing what I want
> is the lack of ANY type of shared memory -- even if it is limited.  I wonder
> if I might ask a question:
>
> Would the following be a possible way to offer a limited type of shared
> memory:
>
> Summary: create a system very, very similar to POSH, but with differences:
>
> In detail, here's what I mean:
> * unlike POSH, utilize OS threads and shared memory (not processes)
> * Create a special shared memory location where you can place Python objects
> * Each Python object you place into this location can only be accessed
> (modified) by 1 thread.
> * You must manually assign ownership of an object to a particular thread.
> * The thread that "owns" the object is the only one that can modify it.
> * You can transfer ownership to another thread (but, as always only the
> owner can modify it).
>
> * There is no GIL when a thread interacts with these special objects.  You
> can have true thread parallelism if your code uses a lot of these special
> objects.
> * The GIL remains in place for all other data access.
> * If your code has a mixture of access to the special objects and regular
> data, then once you hit a point where a thread starts to interact with data
> not in the special storage, then that thread must follow GIL rules.
>
> Granted, there might be some difficulty with the GIL part... but I thought I
> might ask anyways. :)
>
>> Date: Wed, 28 Jul 2010 22:54:39 +1000
>> Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared memory
>> message passing?
>> From: william.leslie.ttg at gmail.com
>> To: kevinar18 at hotmail.com
>> CC: pypy-dev at codespeak.net
>>
>> On 28 July 2010 04:20, Kevin Ar18  wrote:
>> > I am attempting to experiment with FBP - Flow Based Programming
>> > (http://www.jpaulmorrison.com/fbp/ and book:
>> > http://www.jpaulmorrison.com/fbp/book.pdf)  There is something very similar
>> > in Python: http://www.kamaelia.org/MiniAxon.html  Also, there are some
>> > similarities to Erlang - the share nothing memory model... and on some very
>> > broad levels, there are similarities that can be found in functional
>> > languages.
>>
>> Does anyone know if there is a central resource for incompatible
>> python memory model proposals? I know of Jython, Python-Safethread,
>> and Mont-E.
>>
>> I do like the idea of MiniAxon, but let me mention a topic that has
>> slowly been bubbling to the front of my mind for the last few months.
>>
>> Concurrency in the face of shared mutable state is hard. It makes it
>> trivial to introduce bugs all over the place. Nondeterminacy related
>> bugs are far harder to test, diagnose, and fix than anything else that
>> I would almost mandate static verification (via optional typing,
>> probably) of task noninterference if I was moving to a concurrent
>> environment with shared mutable state. There might be a reasonable
>> middle ground where, if a task attempts to violate the required static
>> semantics, it fails dynamically. At least then, latent bugs make
>> plenty of noise. An example for MiniAxon (as I understand it, which is
>> not very well) would be verification that a "task" (including
>> functions that the task calls) never closes over and yields the same
>> mutable objects, and never mutates globally reachable objects.
>>
>> I wonder if you could close such tasks off with a clever subclass of
>> the proxy object space that detects and rejects such memory model
>> violations? With only semantics that make the program deterministic?
>>
>> The moral equivalent would be cooperating processes with a large
>> global (or not) shared memory store for immutable objects, queues for
>> communication, and the additional semantic that objects in a queue are
>> either immutable or the queue holds their only reference. The trouble
>> is that it is so hard to work out what immutable really means.
>> Non-optional annotations would be not very pythonian.
>>
>> --
>> William Leslie
>

Honestly, that sounds really difficult; removing the GIL outright
would probably be easier.

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From kevinar18 at hotmail.com  Thu Jul 29 04:07:57 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Wed, 28 Jul 2010 22:07:57 -0400
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: ,
	<20100727062702.GE12699@tunixman.com>,
	,
	,
	,
	
Message-ID: 


> Honestly, that sounds really difficult, out and out removing the GIL
> would probably be easier.
Based on the extremely limited info on the GIL, the big issue I noticed was two pieces of code trying to modify the same object at the same time, which is a problem because of the way objects are stored internally in Python and because of garbage collection.
I figured that if you have special objects that cannot be accessed simultaneously, then maybe that would be possible.

From william.leslie.ttg at gmail.com  Thu Jul 29 07:18:57 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 15:18:57 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
Message-ID: 

On 29 July 2010 11:33, Kevin Ar18  wrote:
> In detail, here's what I mean:
> * unlike POSH, utilize OS threads and shared memory (not processes)
> * Create a special shared memory location where you can place Python objects
> * Each Python object you place into this location can only be accessed
> (modified) by 1 thread.
> * You must manually assign ownership of an object to a particular thread.
> * The thread that "owns" the object is the only one that can modify it.
> * You can transfer ownership to another thread (but, as always only the
> owner can modify it).

When an object is mutable, it must be visible to at most one thread.
This means it can participate in return values, arguments and queues,
but the sender cannot keep a reference to an object it sends, because
if the receiver mutates the object, this will need to be reflected in
the sender's thread to ensure internal consistency. Well, you could
ignore internal consistency, require explicit locking, and have it
segfault when the change to the length of your list has propagated but
not the element you have added, but that wouldn't be much fun. The
alternative, implicitly writing updates back to memory as soon as
possible and reading them out of memory every time, can be hundreds or
more times slower. So you really can't have two tasks sharing mutable
objects, ever.
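
The rule that a sender cannot keep a reference to what it sends can be sketched as a channel that moves objects instead of sharing them. This is an illustration only: Handle and MoveChannel are hypothetical names, and the check is by convention, since Python cannot stop the sender from holding on to a raw reference it took earlier.

```python
import queue

_MOVED = object()  # sentinel marking a handle whose object was sent away

class Handle:
    """Hypothetical single-owner handle around a mutable object."""

    def __init__(self, obj):
        self._obj = obj

    def get(self):
        if self._obj is _MOVED:
            raise RuntimeError("object was sent; this handle is no longer valid")
        return self._obj

    def _take(self):
        # Empty this handle and return the object it held.
        obj = self.get()
        self._obj = _MOVED
        return obj

class MoveChannel:
    """Queue that transfers objects: sending invalidates the sender's handle."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, handle):
        # The receiver gets a fresh handle; the sender's handle is emptied.
        self._queue.put(Handle(handle._take()))

    def receive(self):
        return self._queue.get()
```

At any moment at most one live handle refers to the object, which is the "visible to at most one thread" property as far as the handles are concerned.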

-- 
William Leslie

From fijall at gmail.com  Thu Jul 29 09:27:05 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 29 Jul 2010 09:27:05 +0200
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
	 
	
	
Message-ID: 

On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
 wrote:
> On 29 July 2010 11:33, Kevin Ar18  wrote:
>> In detail, here's what I mean:
>> * unlike POSH, utilize OS threads and shared memory (not processes)
>> * Create a special shared memory location where you can place Python objects
>> * Each Python object you place into this location can only be accessed
>> (modified) by 1 thread.
>> * You must manually assign ownership of an object to a particular thread.
>> * The thread that "owns" the object is the only one that can modify it.
>> * You can transfer ownership to another thread (but, as always only the
>> owner can modify it).
>
> When an object is mutable, it must be visible to at most one thread.
> This means it can participate in return values, arguments and queues,
> but the sender cannot keep a reference to an object it sends, because
> if the receiver mutates the object, this will need to be reflected in
> the sender's thread to ensure internal consistency. Well, you could
> ignore internal consistency, require explicit locking, and have it
> segfault when the change to the length of your list has propagated but
> not the element you have added, but that wouldn't be much fun. The
> alternative, implicitly writing updates back to memory as soon as
> possible and reading them out of memory every time, can be hundreds or
> more times slower. So you really can't have two tasks sharing mutable
> objects, ever.
>
> --
> William Leslie

Hi.

Do you have any data points supporting your claim?

Cheers,
fijal

From william.leslie.ttg at gmail.com  Thu Jul 29 09:32:57 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 17:32:57 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
Message-ID: 

On 29 July 2010 17:27, Maciej Fijalkowski  wrote:
> On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
>  wrote:
>> When an object is mutable, it must be visible to at most one thread.
>> This means it can participate in return values, arguments and queues,
>> but the sender cannot keep a reference to an object it sends, because
>> if the receiver mutates the object, this will need to be reflected in
>> the sender's thread to ensure internal consistency. Well, you could
>> ignore internal consistency, require explicit locking, and have it
>> segfault when the change to the length of your list has propagated but
>> not the element you have added, but that wouldn't be much fun. The
>> alternative, implicitly writing updates back to memory as soon as
>> possible and reading them out of memory every time, can be hundreds or
>> more times slower. So you really can't have two tasks sharing mutable
>> objects, ever.
>>
>> --
>> William Leslie
>
> Hi.
>
> Do you have any data points supporting your claim?

About the performance of programs that involve a cache miss on every
memory access, or internal consistency?

-- 
William Leslie

From fijall at gmail.com  Thu Jul 29 09:40:21 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 29 Jul 2010 09:40:21 +0200
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
	 
	
	 
	 
	
Message-ID: 

On Thu, Jul 29, 2010 at 9:32 AM, William Leslie
 wrote:
> On 29 July 2010 17:27, Maciej Fijalkowski  wrote:
>> On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
>>  wrote:
>>> When an object is mutable, it must be visible to at most one thread.
>>> This means it can participate in return values, arguments and queues,
>>> but the sender cannot keep a reference to an object it sends, because
>>> if the receiver mutates the object, this will need to be reflected in
>>> the sender's thread to ensure internal consistency. Well, you could
>>> ignore internal consistency, require explicit locking, and have it
>>> segfault when the change to the length of your list has propagated but
>>> not the element you have added, but that wouldn't be much fun. The
>>> alternative, implicitly writing updates back to memory as soon as
>>> possible and reading them out of memory every time, can be hundreds or
>>> more times slower. So you really can't have two tasks sharing mutable
>>> objects, ever.
>>>
>>> --
>>> William Leslie
>>
>> Hi.
>>
>> Do you have any data points supporting your claim?
>
> About the performance of programs that involve a cache miss on every
> memory access, or internal consistency?
>

I think I lost some implication here. Did I get you right - you claim
that per-object locking in case threads share objects is very
expensive, is that correct? If not, I completely misunderstood you and
my question makes no sense, please explain. If yes, why does it mean a
cache miss on every read/write?

Cheers,
fijal

From william.leslie.ttg at gmail.com  Thu Jul 29 09:57:58 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 17:57:58 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
	
	
Message-ID: 

On 29 July 2010 17:40, Maciej Fijalkowski  wrote:
> On Thu, Jul 29, 2010 at 9:32 AM, William Leslie
>  wrote:
>> On 29 July 2010 17:27, Maciej Fijalkowski  wrote:
>>> On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
>>>  wrote:
>>>> When an object is mutable, it must be visible to at most one thread.
>>>> This means it can participate in return values, arguments and queues,
>>>> but the sender cannot keep a reference to an object it sends, because
>>>> if the receiver mutates the object, this will need to be reflected in
>>>> the sender's thread to ensure internal consistency. Well, you could
>>>> ignore internal consistency, require explicit locking, and have it
>>>> segfault when the change to the length of your list has propagated but
>>>> not the element you have added, but that wouldn't be much fun. The
>>>> alternative, implicitly writing updates back to memory as soon as
>>>> possible and reading them out of memory every time, can be hundreds or
>>>> more times slower. So you really can't have two tasks sharing mutable
>>>> objects, ever.
>>>>
>>>> --
>>>> William Leslie
>>>
>>> Hi.
>>>
>>> Do you have any data points supporting your claim?
>>
>> About the performance of programs that involve a cache miss on every
>> memory access, or internal consistency?
>>
>
> I think I lost some implication here. Did I get you right - you claim
> that per-object locking in case threads share objects is very
> expensive, is that correct? If not, I completely misunderstood you and
> my question makes no sense, please explain. If yes, why does it mean a
> cache miss on every read/write?

I claim that there are two alternatives in the face of one thread
mutating an object and the other observing:

0. You can give up consistency and do fine-grained locking, which is
reasonably fast but error prone, or
1. Expect python to handle all of this for you, effectively not making
a change to the memory model. You could do this with implicit
per-object locks which might be reasonably fast in the absence of
contention, but not when several threads are trying to use the object.

Queues already are in a sense your per-object-lock,
one-thread-mutating, but usually one thread has acquire semantics and
one has release semantics, and that combination actually works. It's
when you expect to have a full memory barrier that is the problem.

Come to think of it, you might be right Kevin: as long as only one
thread mutates the object, the mutating thread never /needs/ to
acquire, as it knows that it has the latest revision.
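
For reference, alternative 1 (implicit per-object locks) might look roughly like the proxy below. This is a sketch under assumptions, not how any Python implementation does it: Synchronized is a hypothetical name, and the lock is taken around each attribute lookup and each method call on the wrapped object.

```python
import threading

class Synchronized:
    """Hypothetical proxy: every method call on the target runs under one lock."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.RLock()

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself,
        # so _target and _lock are looked up normally.
        with self._lock:
            attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)
        return locked
```

Each single call such as shared.append(x) is then atomic with respect to other calls through the proxy. A compound operation (check the length, then pop) is still two separate lock acquisitions, so it needs explicit locking, which is alternative 0.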

Have I missed something?

-- 
William Leslie

From fijall at gmail.com  Thu Jul 29 10:02:30 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 29 Jul 2010 10:02:30 +0200
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
	 
	
	 
	 
	 
	 
	
Message-ID: 

On Thu, Jul 29, 2010 at 9:57 AM, William Leslie
 wrote:
> On 29 July 2010 17:40, Maciej Fijalkowski  wrote:
>> On Thu, Jul 29, 2010 at 9:32 AM, William Leslie
>>  wrote:
>>> On 29 July 2010 17:27, Maciej Fijalkowski  wrote:
>>>> On Thu, Jul 29, 2010 at 7:18 AM, William Leslie
>>>>  wrote:
>>>>> When an object is mutable, it must be visible to at most one thread.
>>>>> This means it can participate in return values, arguments and queues,
>>>>> but the sender cannot keep a reference to an object it sends, because
>>>>> if the receiver mutates the object, this will need to be reflected in
>>>>> the sender's thread to ensure internal consistency. Well, you could
>>>>> ignore internal consistency, require explicit locking, and have it
>>>>> segfault when the change to the length of your list has propagated but
>>>>> not the element you have added, but that wouldn't be much fun. The
>>>>> alternative, implicitly writing updates back to memory as soon as
>>>>> possible and reading them out of memory every time, can be hundreds or
>>>>> more times slower. So you really can't have two tasks sharing mutable
>>>>> objects, ever.
>>>>>
>>>>> --
>>>>> William Leslie
>>>>
>>>> Hi.
>>>>
>>>> Do you have any data points supporting your claim?
>>>
>>> About the performance of programs that involve a cache miss on every
>>> memory access, or internal consistency?
>>>
>>
>> I think I lost some implication here. Did I get you right - you claim
>> that per-object locking in case threads share objects is very
>> expensive, is that correct? If not, I completely misunderstood you and
>> my question makes no sense, please explain. If yes, why does it mean a
>> cache miss on every read/write?
>
> I claim that there are two alternatives in the face of one thread
> mutating an object and the other observing:
>
> 0. You can give up consistency and do fine-grained locking, which is
> reasonably fast but error prone, or
> 1. Expect python to handle all of this for you, effectively not making
> a change to the memory model. You could do this with implicit
> per-object locks which might be reasonably fast in the absence of
> contention, but not when several threads are trying to use the object.
>
> Queues already are in a sense your per-object-lock,
> one-thread-mutating, but usually one thread has acquire semantics and
> one has release semantics, and that combination actually works. It's
> when you expect to have a full memory barrier that is the problem.
>
> Come to think of it, you might be right Kevin: as long as only one
> thread mutates the object, the mutating thread never /needs/ to
> acquire, as it knows that it has the latest revision.
>
> Have I missed something?
>
> --
> William Leslie
>

So my question is why you think 1. is really expensive (can you find
evidence?). I don't see what it has to do with cache misses. Besides,
in Python you cannot guarantee much about mutability of objects. So
you don't know if an object passed in a queue is mutable or not, unless
you restrict yourself to some very simple types (in which case there
is no shared memory, since you only pass immutable objects).
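
Restricting yourself to "very simple types" could be checked along these lines. This is a conservative sketch with a hypothetical helper name: it accepts only exact built-in types (no subclasses, which could carry mutable attributes) and rejects everything else, including user-defined classes that happen to be immutable.

```python
# Exact built-in types whose instances cannot be mutated.
_ATOMIC_TYPES = (type(None), bool, int, float, complex, str, bytes)

def is_deeply_immutable(obj):
    """Return True only if obj is built entirely from immutable built-in values."""
    kind = type(obj)          # exact type check: subclasses are rejected
    if kind in _ATOMIC_TYPES:
        return True
    if kind in (tuple, frozenset):
        return all(is_deeply_immutable(item) for item in obj)
    return False
```

For example, (1, "a", (2.0, None)) passes, while (1, [2]) does not, because the tuple holds a list.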

Cheers,
fijal

From william.leslie.ttg at gmail.com  Thu Jul 29 10:50:52 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 18:50:52 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
	
	
	
	
Message-ID: 

On 29 July 2010 18:02, Maciej Fijalkowski  wrote:
> On Thu, Jul 29, 2010 at 9:57 AM, William Leslie
>  wrote:
>> I claim that there are two alternatives in the face of one thread
>> mutating an object and the other observing:
>>
>> 0. You can give up consistency and do fine-grained locking, which is
>> reasonably fast but error prone, or
>> 1. Expect python to handle all of this for you, effectively not making
>> a change to the memory model. You could do this with implicit
>> per-object locks which might be reasonably fast in the absence of
>> contention, but not when several threads are trying to use the object.
>>
>> Queues already are in a sense your per-object-lock,
>> one-thread-mutating, but usually one thread has acquire semantics and
>> one has release semantics, and that combination actually works. It's
>> when you expect to have a full memory barrier that is the problem.
>>
>> Come to think of it, you might be right Kevin: as long as only one
>> thread mutates the object, the mutating thread never /needs/ to
>> acquire, as it knows that it has the latest revision.
>>
>> Have I missed something?
>>
>> --
>> William Leslie
>>
>
> So my question is why you think 1. is really expensive (can you find
> evidence?). I don't see what it has to do with cache misses. Besides,
> in Python you cannot guarantee much about mutability of objects. So
> you don't know if an object passed in a queue is mutable or not, unless
> you restrict yourself to some very simple types (in which case there
> is no shared memory, since you only pass immutable objects).

If task X expects that task Y will mutate some object it has, it needs
to go back to the source for every read. This means that if you do use
mutation of some shared object for communication, it needs to be
synchronised before every access. What this means for us is that every
read from a possibly mutable object requires an acquire, and every
write requires a release. It's as if every reference in the program is
implemented with a volatile pointer. Even if the object is never
mutated, there can be a lot of unnecessary bus chatter waiting for
MESI to tell us so.

-- 
William Leslie

From fijall at gmail.com  Thu Jul 29 10:55:25 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 29 Jul 2010 10:55:25 +0200
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com> 
	
	 
	
	 
	 
	 
	 
	 
	 
	
Message-ID: 

On Thu, Jul 29, 2010 at 10:50 AM, William Leslie
 wrote:
> On 29 July 2010 18:02, Maciej Fijalkowski  wrote:
>> On Thu, Jul 29, 2010 at 9:57 AM, William Leslie
>>  wrote:
>>> I claim that there are two alternatives in the face of one thread
>>> mutating an object and the other observing:
>>>
>>> 0. You can give up consistency and do fine-grained locking, which is
>>> reasonably fast but error prone, or
>>> 1. Expect python to handle all of this for you, effectively not making
>>> a change to the memory model. You could do this with implicit
>>> per-object locks which might be reasonably fast in the absence of
>>> contention, but not when several threads are trying to use the object.
>>>
>>> Queues already are in a sense your per-object-lock,
>>> one-thread-mutating, but usually one thread has acquire semantics and
>>> one has release semantics, and that combination actually works. It's
>>> when you expect to have a full memory barrier that is the problem.
>>>
>>> Come to think of it, you might be right Kevin: as long as only one
>>> thread mutates the object, the mutating thread never /needs/ to
>>> acquire, as it knows that it has the latest revision.
>>>
>>> Have I missed something?
>>>
>>> --
>>> William Leslie
>>>
>>
>> So my question is why you think 1. is really expensive (can you find
>> evidence?). I don't see what it has to do with cache misses. Besides,
>> in Python you cannot guarantee much about mutability of objects. So
>> you don't know if an object passed in a queue is mutable or not, unless
>> you restrict yourself to some very simple types (in which case there
>> is no shared memory, since you only pass immutable objects).
>
> If task X expects that task Y will mutate some object it has, it needs
> to go back to the source for every read. This means that if you do use
> mutation of some shared object for communication, it needs to be
> synchronised before every access. What this means for us is that every
> read from a possibly mutable object requires an acquire, and every
> write requires a release. It's as if every reference in the program is
> implemented with a volatile pointer. Even if the object is never
> mutated, there can be a lot of unnecessary bus chatter waiting for
> MESI to tell us so.
>

I do agree there is an overhead. Can you provide some data on how large
this overhead is? Python is not a very simple language and a lot of
things are complex and time consuming, so I wonder how it compares to
locking per object.
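
One cheap way to get a first data point is a micro-benchmark like the sketch below. It is a rough measurement under stated assumptions: it times an uncontended threading.Lock taken around every element read in a single thread, so it shows interpreter-level locking cost only and says nothing about cache coherence traffic or contended locks. The function names are made up for this example, and no numbers are quoted here because they depend on the interpreter and machine.

```python
import threading
import timeit

def sum_plain(items):
    """Read every element without any synchronisation."""
    total = 0
    for i in range(len(items)):
        total += items[i]
    return total

def sum_locked(items, lock):
    """Read every element, taking the object's lock around each access."""
    total = 0
    for i in range(len(items)):
        with lock:
            total += items[i]
    return total

def compare(n=100000, repeat=5):
    """Return (plain_seconds, locked_seconds), the best of `repeat` runs each."""
    items = list(range(n))
    lock = threading.Lock()
    plain = min(timeit.repeat(lambda: sum_plain(items), number=1, repeat=repeat))
    locked = min(timeit.repeat(lambda: sum_locked(items, lock), number=1, repeat=repeat))
    return plain, locked
```

Calling compare() and dividing the second value by the first gives the slowdown factor for this particular access pattern.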

From sparks.m at gmail.com  Thu Jul 29 11:44:52 2010
From: sparks.m at gmail.com (Michael Sparks)
Date: Thu, 29 Jul 2010 10:44:52 +0100
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
Message-ID: 

Would comments from a project using this approach in real systems be
of interest/use/help? Whilst I didn't know about Morrison's FBP
(Balzer's work predates him btw - don't listen to hype) I had heard of
(and played with) Occam among other more influential things, and
Kamaelia is a real tool. Also there is already a pre-existing FBP tool
for Stackless, and then historically there's also MASCOT & friends. It
just looks to me that you're tying yourself up in knots over things
that aren't problems, when there are some things which could be useful
(in practice) & interesting in this space.

Oh, incidentally, Mini Axon is a toy/teaching/testing system - as the
name suggests. The main Axon is more complete -- in the areas we've
needed - it's been driven by real system needs.

(for those who don't know me, Kamaelia is my project, I don't bite,
but I do sometimes talk or type fast)

Regards,

Michael Sparks
--
http://www.kamaelia.org/PragmaticConcurrency.html
http://yeoldeclue.com/blog

On 7/29/10, Kevin Ar18  wrote:
> As a followup to my earlier post:
> "pre-emptive micro-threads utilizing shared memory message passing?"
>
> I am actually finding that the biggest hurdle to accomplishing what I want
> is the lack of ANY type of shared memory -- even if it is limited.  I wonder
> if I might ask a question:
>
> Would the following be a possible way to offer a limited type of shared
> memory:
>
> Summary: create a system very, very similar to POSH, but with differences:
>
> In detail, here's what I mean:
> * unlike POSH, utilize OS threads and shared memory (not processes)
> * Create a special shared memory location where you can place Python objects
> * Each Python object you place into this location can only be accessed
> (modified) by 1 thread.
> * You must manually assign ownership of an object to a particular thread.
> * The thread that "owns" the object is the only one that can modify it.
> * You can transfer ownership to another thread (but, as always only the
> owner can modify it).
>
> * There is no GIL when a thread interacts with these special objects.  You
> can have true thread parallelism if your code uses a lot of these special
> objects.
> * The GIL remains in place for all other data access.
> * If your code has a mixture of access to the special objects and regular
> data, then once you hit a point where a thread starts to interact with data
> not in the special storage, then that thread must follow GIL rules.
>
> Granted, there might be some difficulty with the GIL part... but I thought I
> might ask anyways. :)
>
>> Date: Wed, 28 Jul 2010 22:54:39 +1000
>> Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared memory
>> 	message passing?
>> From: william.leslie.ttg at gmail.com
>> To: kevinar18 at hotmail.com
>> CC: pypy-dev at codespeak.net
>>
>> On 28 July 2010 04:20, Kevin Ar18  wrote:
>> > I am attempting to experiment with FBP - Flow Based Programming
>> > (http://www.jpaulmorrison.com/fbp/ and book:
>> > http://www.jpaulmorrison.com/fbp/book.pdf)  There is something very
>> > similar in Python: http://www.kamaelia.org/MiniAxon.html  Also, there
>> > are some similarities to Erlang - the share nothing memory model... and
>> > on some very broad levels, there are similarities that can be found in
>> > functional languages.
>>
>> Does anyone know if there is a central resource for incompatible
>> python memory model proposals? I know of Jython, Python-Safethread,
>> and Mont-E.
>>
>> I do like the idea of MiniAxon, but let me mention a topic that has
>> slowly been bubbling to the front of my mind for the last few months.
>>
>> Concurrency in the face of shared mutable state is hard. It makes it
>> trivial to introduce bugs all over the place. Nondeterminacy-related
>> bugs are so much harder to test, diagnose, and fix than anything else that
>> I would almost mandate static verification (via optional typing,
>> probably) of task noninterference if I was moving to a concurrent
>> environment with shared mutable state. There might be a reasonable
>> middle ground where, if a task attempts to violate the required static
>> semantics, it fails dynamically. At least then, latent bugs make
>> plenty of noise. An example for MiniAxon (as I understand it, which is
>> not very well) would be verification that a "task" (including
>> functions that the task calls) never closes over and yields the same
>> mutable objects, and never mutates globally reachable objects.
>>
>> I wonder if you could close such tasks off with a clever subclass of
>> the proxy object space that detects and rejects such memory model
>> violations? With only semantics that make the program deterministic?
>>
>> The moral equivalent would be cooperating processes with a large
>> global (or not) shared memory store for immutable objects, queues for
>> communication, and the additional semantic that objects in a queue are
>> either immutable or the queue holds their only reference. The trouble
>> is that it is so hard to work out what immutable really means.
>> Non-optional annotations would not be very Pythonic.
>>
>> --
>> William Leslie
>

From william.leslie.ttg at gmail.com  Thu Jul 29 15:15:32 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 29 Jul 2010 23:15:32 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: <20100727062702.GE12699@tunixman.com>
Message-ID: 

On 29 July 2010 18:55, Maciej Fijalkowski  wrote:
> On Thu, Jul 29, 2010 at 10:50 AM, William Leslie
>  wrote:
>> If task X expects that task Y will mutate some object it has, it needs
>> to go back to the source for every read. This means that if you do use
>> mutation of some shared object for communication, it needs to be
>> synchronised before every access. What this means for us is that every
>> read from a possibly mutable object requires an acquire, and every
>> write requires a release. It's as if every reference in the program is
>> implemented with a volatile pointer. Even if the object is never
>> mutated, there can be a lot of unnecessary bus chatter waiting for
>> MESI to tell us so.
>>
>
> I do agree there is an overhead. Can you provide some data how much
> this overhead is? Python is not a very simple language and a lot of
> things are complex and time consuming, so I wonder how it compares to
> locking per object.

It *is* locking per object, but you also spend time looking for the
data if someone else has invalidated your cache line.

Come to think of it, that isn't as bad as it first seemed to me. If
the sender never mutates the object, it will Just Work on any machine
with a fairly flat cache architecture.

Sorry. Carry on.

-- 
William Leslie

From kevinar18 at hotmail.com  Thu Jul 29 18:56:28 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 29 Jul 2010 12:56:28 -0400
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: <20100727062702.GE12699@tunixman.com>
Message-ID: 


> I claim that there are two alternatives in the face of one thread
> mutating an object and the other observing:
Well, I did consider the possibility of one thread being able to change an object while the others observe it, but I have no idea whether that is too complicated, as you are suggesting.
However, that is not even necessary.  An even more limited form would work fine (at least for me):
 
Two possible modes:
Read/Write from 1 thread:
* ONLY one thread can change and observe (read) the object -- no other threads have access of any kind or even know of its existence until you transfer control to another thread (then only the thread you transferred control to has access).
(Optional) read only from all threads:
* Optionally, you could have objects that are in read-only mode, which all threads can observe.
 
To make things easier, maybe special GIL-free threads could be added.  (They would still be OS-level threads, but with special properties in Python.) These threads would have the property that they could ONLY access data stored in the special object store to which they have read/write privilege.  They can't access other objects not in the special store.  As a result, these special threads would be free of the GIL and could run in parallel.
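
As a rough illustration of the read/write mode above, here is a sketch in present-day Python 3. The names `Owned`, `OwnershipError` and `transfer` are made up for this example, and it only checks ownership at runtime; it does nothing about the GIL, which is the hard part of the proposal.

```python
import threading


class OwnershipError(RuntimeError):
    """Raised when a thread touches an object it does not own."""


class Owned:
    """Hypothetical wrapper: only the owning thread may read or write the value."""

    def __init__(self, value):
        self._value = value
        self._owner = threading.get_ident()  # the creating thread starts as owner

    def _check(self):
        if threading.get_ident() != self._owner:
            raise OwnershipError("object is owned by another thread")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value

    def transfer(self, thread_ident):
        """Hand the object over; the previous owner loses all access."""
        self._check()
        self._owner = thread_ident


def demo():
    box = Owned([1, 2, 3])
    results = {}
    handed_over = threading.Event()

    def worker():
        handed_over.wait()          # wait until ownership has been transferred
        box.get().append(4)         # allowed: this thread is the owner now
        results["worker_saw"] = list(box.get())

    t = threading.Thread(target=worker)
    t.start()
    box.transfer(t.ident)           # t.ident is available once the thread has started
    handed_over.set()
    t.join()

    try:
        box.get()                   # the main thread gave ownership away
    except OwnershipError:
        results["main_blocked"] = True
    return results
```

Calling `demo()` shows the worker mutating the list after the hand-over, while the original owner is refused access.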

> Queues already are in a sense your per-object-lock,
> one-thread-mutating, but usually one thread has acquire semantics and
> one has release semantics, and that combination actually works. It's
> when you expect to have a full memory barrier that is the problem.

Now you brought up something interesting: queues
To be honest, something like queues and pipes would be good enough for my purposes -- if they used shared memory.  Currently, the implementation of queues and pipes in the multiprocessing module seems rather costly, as they use processes and require copying data back and forth.
In particular, what would be useful:
 
* A queue that holds self-contained Python objects (with no pointers/references to other data not in the queue so as to prevent threading issues)
* The queue can be accessed by all special threads simultaneously (in parallel).  You would only need locks around queue operations, but that is pretty easy to do -- unless there is some hidden Interpreter problem that would make this easy task hard.
* Streaming buffers -- like a file buffer or something similar, so you can send data from one thread to another as it comes in (when you don't know when it will end or it may never end).  Only two threads have access: one to put data in, the other to extract it.
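
For what it's worth, the last item (a stream between exactly two threads) can be sketched with the standard library's thread-safe queue in Python 3. This is only an illustration of the shape of the API: `_END`, `producer`, `consumer` and `run_stream` are hypothetical names, and on CPython the two threads still take turns under the GIL.

```python
import queue
import threading

_END = object()  # hypothetical sentinel marking the end of the stream


def producer(stream, chunks):
    for chunk in chunks:
        stream.put(chunk)       # Queue does its own locking around put/get
    stream.put(_END)


def consumer(stream, out):
    while True:
        chunk = stream.get()    # blocks until data arrives
        if chunk is _END:
            break
        out.append(chunk.upper())


def run_stream(chunks):
    # A small bound keeps the producer from running far ahead of the consumer.
    stream = queue.Queue(maxsize=2)
    out = []
    threads = [
        threading.Thread(target=producer, args=(stream, chunks)),
        threading.Thread(target=consumer, args=(stream, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

The consumer never needs to know how long the stream is; it just keeps reading until the sentinel shows up.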
 
> 0. You can give up consistency and do fine-grained locking, which is
> reasonably fast but error prone, or
> 1. Expect python to handle all of this for you, effectively not making
> a change to the memory model. You could do this with implicit
> per-object locks which might be reasonably fast in the absence of
> contention, but not when several threads are trying to use the object.
> 
...
> 
> Come to think of it, you might be right Kevin: as long as only one
> thread mutates the object, the mutating thread never /needs/ to
> acquire, as it knows that it has the latest revision.
> 
> Have I missed something?
I'm afraid I don't know enough about Python's Interpreter to say much.  The only way would be for me to do some studying on interpreters/compilers and get digging into the codebase -- and I'm not sure how much time I have to do that right now. :)
Perhaps the part about one thread only having read & write changes the situation?
 
One possible implementation might be similar to how POSH does it:
Now, I'm suggesting this not because I know enough to say it is possible, but just to put something out there that might work.
Create a special virtual memory address or lookup table for each thread.  When you assign a read+write object to a thread, it gets added to the virtual address/memory table.
Optionally, it could be up to the programmer to make sure they don't try to access data from a thread that does not have ownership/control of that object.  If a programmer does try to access it, it would fail as the memory address would point to nowhere/bad data/etc....
 
Of course, there are probably other, better ways to do it that are not as fickle as this... but I don't know if the limitations of the Python Interpreter and GIL would allow better methods. 		 	   		  

From andrewfr_ice at yahoo.com  Thu Jul 29 18:56:52 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Thu, 29 Jul 2010 09:56:52 -0700 (PDT)
Subject: [pypy-dev] pypy-dev Digest, Vol 360, Issue 13
In-Reply-To: 
Message-ID: <680164.96893.qm@web120007.mail.ne1.yahoo.com>

Hi Kevin:

Message: 1
Date: Tue, 27 Jul 2010 14:20:10 -0400
From: Kevin Ar18 
Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared
    memory message passing?
To: 
Message-ID: 
Content-Type: text/plain; charset="iso-8859-1"


>I am attempting to experiment with FBP - Flow Based Programming (http://www.jpaulmorrison.com/fbp/ and book: http://www.jpaulmorrison.com/fbp/book.pdf)  There is something very similar in Python: http://www.kamaelia.org/MiniAxon.html  Also, there are some similarities to Erlang - the share nothing memory model... and on some very broad levels, there are similarities that can be found in functional languages.

I just came back from EuroPython. A lot of discussion on concurrency....

Well, in functional languages (like Erlang), variables tend to be immutable. This is a bonus in a concurrent system - it makes it easier to reason about the system - and helps to avoid various race conditions. As for shared memory, I think there is a difference between whether things are shared at the application programmer level, or under the hood controlled by the system. Programmers tend to be bad at the former.

>http://www.kamaelia.org/MiniAxon.html

I took a quick look. Maybe I am biased but Stackless Python gives you most of that. Also tasklets and channels can do everything a generator can and more (a generator is more specialised than a coroutine). Also it is easy to mimic asynchrony with a CSP style messaging system where microthreads and channels are cheap. A line from the book "Actors: A Model of Concurrent Computation in Distributed Systems" by Gul A. Agha comes to mind: "synchrony is mere buffered asynchrony."

>The process of connecting the boxes together was actually designed to be programmed visually, as you can see from the examples in the book (I have no idea if it works well, as I am merely starting to experiment with it).

What brought me to Stackless Python and PyPy was work concerning WS-BPEL. Allegedly, WS-BPEL/XLang/WSFL (Web-Services Flow Language) are based on formalisms like pi calculus.

Since I don't own a multi-core machine and I am not doing CPU-intensive stuff, I never really cared. However I have been doing things where I needed to impose logical orderings upon processes (i.e., process C can only run after process A and B are finished). My initial naive uses of Stackless (easy to do in any system based on CSP) resulted in deadlocking the system. So I found understanding deadlock to be very important.
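
To make the ordering constraint concrete, here is a small sketch using plain threads and a queue in Python 3 rather than Stackless tasklets and channels (that substitution, and the names `run_ordered`, `task` and `task_c`, are mine). The queue is unbounded, so A and B never block when they signal completion; with a synchronous rendezvous channel the sender would block until C is ready to receive, and that is where a careless ordering can deadlock.

```python
import queue
import threading


def run_ordered():
    done = queue.Queue()        # stands in for a channel
    log = []
    log_lock = threading.Lock()

    def task(name):
        with log_lock:
            log.append(name)
        done.put(name)          # signal completion; never blocks on an unbounded queue

    def task_c():
        # C waits for two completion messages, in whichever order they arrive.
        finished = {done.get(), done.get()}
        with log_lock:
            log.append("C after " + ",".join(sorted(finished)))

    threads = [threading.Thread(target=task_c)]
    threads += [threading.Thread(target=task, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

C accepts the two messages in either order, so it does not matter whether A or B finishes first.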

>Each box, being a self contained "program," the only data it has access to is 3 parts:

>Implementation of FBP requires a custom scheduler for several reasons:
>(1) A box can only run if it has actual data on the "in port(s)"  Thus, the scheduler would only schedule boxes to run when they can actually process some data.

Stackless Python already works like this. No custom scheduler needed. I would recommend you read Rob Pike's paper "The Implementation of Newsqueak" or some of the Cardelli papers to understand how CSP constructs with channels work. And if you need to customize schedulers, you have two routes: 1) use pre-existing classes and API, or 2) experiment with PyPy's stackless.py
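
To show the "only run a box when it has data" rule without Stackless, here is a single-threaded sketch with plain generators in Python 3. It is not how Stackless or Kamaelia implement it; `Box`, `doubler`, `adder` and `run` are hypothetical names, and the in/out ports are just deques.

```python
from collections import deque


class Box:
    """Hypothetical FBP-style box: a generator plus an in-port and an out-port."""

    def __init__(self, func):
        self.inbox = deque()
        self.outbox = deque()
        self.gen = func(self.inbox, self.outbox)
        next(self.gen)              # prime the generator up to its first yield


def doubler(inbox, outbox):
    while True:
        yield                       # give control back to the scheduler
        while inbox:
            outbox.append(inbox.popleft() * 2)


def adder(inbox, outbox):
    total = 0
    while True:
        yield
        while inbox:
            total += inbox.popleft()
            outbox.append(total)


def run(pipeline):
    """Run boxes until none has pending input; each outbox feeds the next inbox."""
    progress = True
    while progress:
        progress = False
        for i, box in enumerate(pipeline):
            if not box.inbox:
                continue            # nothing on the in-port: do not schedule this box
            next(box.gen)           # resume the box; it drains its inbox and yields again
            progress = True
            if i + 1 < len(pipeline):
                pipeline[i + 1].inbox.extend(box.outbox)
                box.outbox.clear()
    return list(pipeline[-1].outbox)
```

Feeding `[1, 2, 3]` into a `doubler` box connected to an `adder` box leaves the running sums of the doubled values on the last out-port.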

>(2) In theory, it may be possible to end up with hundreds or thousands of these light weight boxes.  Using heavy-weight OS threads or processes for every one is out of the question.

Stackless Python.

>In a perfect world, here's what I might do:
>* Assume a quad core cpu
>(1) Spawn 1 process
>(2) Spawn 4 threads & assign each thread to only 1 core -- in other words, don't let the OS handle moving threads around to different cores
>(3) Inside each thread, have a mini scheduler that switches back and forth between the many micro-threads (or "boxes") -- note that the OS should not handle any of the switching between micro-threads/boxes as it does it all wrong (and too heavyweight) for this situation.
>(4) Using a shared memory queue, each of the 4 schedulers can get the next box to run... or add more boxes to the schedule queue.

My advice: get stuff properly working under a single threaded model first so you understand the machinery. That said, I think Carlos Eduardo de Paula a few years ago played with adapting Stackless for multi-processing.

Second piece of advice: start looking at how Go does things. Stackless Python and Go share a common ancestor. However Go does much more on the multi-core front.

Cheers,
Andrew


      


From kevinar18 at hotmail.com  Thu Jul 29 19:35:14 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 29 Jul 2010 13:35:14 -0400
Subject: [pypy-dev] pypy-dev Digest, Vol 360, Issue 13
In-Reply-To: <680164.96893.qm@web120007.mail.ne1.yahoo.com>
References: ,
	<680164.96893.qm@web120007.mail.ne1.yahoo.com>
Message-ID: 


> Well, in functional languages (like Erlang), variables tend to be immutable. This is a bonus in a concurrent system - it makes it easier to reason about the system - and helps to avoid various race conditions. As for shared memory, I think there is a difference between whether things are shared at the application programmer level, or under the hood controlled by the system. Programmers tend to be bad at the former.

You're right... and I am actually talking about non-shared memory from the perspective of the programmer, but under the hood, it MUST use shared memory for implementation.  The problem I am running into is that there is no way to implement it under the hood because there is no way to do shared memory in Python.
 
Thanks for bringing that up.  Maybe that will clarify what I was going on about. :)
 
> I took a quick look. Maybe I am biased but Stackless Python gives you most of that. Also tasklets and channels can do everything a generator can and more (a generator is more specialised than a coroutine). Also it is easy to mimic asynchrony with a CSP style messaging system where microthreads and channels are cheap. A line from the book "Actors: A Model of Concurrent Computation in Distributed Systems" by Gul A. Agha comes to mind: "synchrony is mere buffered asynchrony."

Agreed.  Stuff like the stackless module in PyPy, greenlets, twisted, and others do offer some useful options that are even better than generators...  I could definitely make use of them for some of the broader implementation details.  However, the problem is always that there is no way to make them parallel within Python itself, because there is no shared memory that I can use for "under the hood" implementation.
 
Now, if there is a true parallel implementation of stackless, greenlets, twisted, etc... maybe it could fit my purposes... but I'd have to check.  I did some basic searching on various Python threading implementations in the past and didn't really find one that did... but, like you suggested, maybe there is one out there somewhere.
 
> >The process of connecting the boxes together was actually designed to be programmed visually, as you can see from the examples in the book (I have no idea if it works well, as I am merely starting to experiment with it).
> 
> What brought me to Stackless Python and PyPy was work concerning WS-BPEL. Allegedly, WS-BPEL/XLang/WSFL (Web-Services Flow Language) are based on formalisms like pi calculus.
> 
> Since I don't own a multi-core machine and I am not doing CPU-intensive stuff, I never really cared. However I have been doing things where I needed to impose logical orderings upon processes (i.e., process C can only run after process A and B are finished). My initial naive uses of Stackless (easy to do in any system based on CSP) resulted in deadlocking the system. So I found understanding deadlock to be very important.
> 
Thanks... and, uh, about all I can do is bookmark this for later.  Really, thanks for the links; I may very well want to research each and every one of these at some point and see what I can learn from each one.  If you have more stuff like that, feel free to let me know. :)
 
> My advice: get stuff properly working under a single threaded model first so you understand the machinery. That said, I think Carlos Eduardo de Paula a few years ago played with adapting Stackless for multi-processing.
Yeah, I've been considering that.  Maybe I'll just go ahead with a single threaded implementation... and if I feel like it, I could always try to edit PyPy or one of the other implementations later (although I probably never will due to time constraints :) ).  Still, I figured I might as well ask around and see if it was possible to do a parallel implementation sooner.
 
Or... what I may end up doing is using the slow multiprocessing module and queues.  Granted, it will probably be slow since it doesn't use shared memory "under the hood", but it would be parallel.
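
To show the copying cost I mean, here is a tiny Python 3 sketch. It does not spawn any processes; it just compares a thread queue, which hands over a reference, with a pickle round trip, which is how multiprocessing queues move objects between processes. The function names are made up for the example.

```python
import pickle
import queue


def passes_same_object():
    """A thread queue hands the consumer the very same object."""
    q = queue.Queue()
    payload = {"frame": list(range(5))}
    q.put(payload)
    return q.get() is payload


def pickled_copy_is_distinct():
    """Crossing a process boundary means serialising and rebuilding the object."""
    payload = {"frame": list(range(5))}
    copy = pickle.loads(pickle.dumps(payload))
    return copy == payload and copy is not payload
```

Both functions return `True`: the first because nothing was copied, the second because everything was.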
 
> Second piece of advice: start looking at how Go does things. Stackless Python and Go share a common ancestor. However Go does much more on the multi-core front.
I have looked at Go Goroutines.... albeit briefly.  I noticed that they are co-operative like stackless and, based on your comments, I'm guessing they work on multiple cores?  I was really disappointed that they were not pre-emptive, however.  I haven't really looked much into it beyond that, but maybe I'll give it another look; but using it would mean not using Python. :( 		 	   		  

From kevinar18 at hotmail.com  Thu Jul 29 19:44:39 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 29 Jul 2010 13:44:39 -0400
Subject: [pypy-dev] FW: Would the following shared memory model be possible?
In-Reply-To: 
References: <20100727062702.GE12699@tunixman.com>
Message-ID: 


> Would comments from a project using this approach in real systems be
> of interest/use/help? Whilst I didn't know about Morrison's FBP
> (Balzer's work predates him btw - don't listen to hype) I had heard of
> (and played with) Occam among other more influential things, and
> Kamaelia is a real tool. Also there is already a pre-existing FBP tool
> for Stackless, and then historically there's also MASCOT & friends. It

You brought up a lot of topics.  I went ahead and sent you a private email.  There's always lots of interesting things I can add to my list of things to learn about. :)
 
> just looks to me that you're tieing yourself up in knots over things
> that aren't problems, when there are some things which could be useful
> (in practice) & interesting in this space.
The particular issue in this situation is that there is no way to make Kamaelia, FBP, or other concurrency concepts run in parallel (unless you are willing to accept lots of overhead like with the multiprocessing queues).
 
Since you have worked with Kamaelia code a lot... you understand a lot more about implementation details.  Do you think the previous shared memory concept or something like it would let you make Kamaelia parallel?
If not, can you think of any method that would let you make Kamaelia parallel?
 		 	   		  

From kevinar18 at hotmail.com  Thu Jul 29 20:02:38 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 29 Jul 2010 14:02:38 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References:
Message-ID: 


> Hello Kevin,
> I don't know if it can be a solution to your problem but for my
> Master Thesis I'm working on making Stackless Python distributed. What
> I did is working but not complete and I'm right now in the process of
> writing the thesis (in french unfortunately). My code currently works
> with PyPy's "stackless" module only and uses some PyPy-specific
> things. Here's what I added to Stackless:
>
> - Possibility to move tasklets easily (ref_tasklet.move(node_id)). A
> node is an instance of an interpreter.
> - Each tasklet has its global namespace (to avoid sharing of data). The
> state is also easier to move to another interpreter this way.
> - Distributed channels: All requests are known by all nodes using the
> channel.
> - Distributed objects: When a reference is sent to a remote node, the
> object is not copied, a reference is created using PyPy's proxy object
> space.
> - Automated dependency recovery when an object or a tasklet is loaded
> on another interpreter
>
> With a proper scheduler, many tasklets could be automatically spread in
> multiple interpreters to use multiple cores or on multiple computers. A
> bit like the N:M threading model where N lightweight threads/coroutines
> can be executed on M threads.

Was able to have a look at the API...
If others don't mind my asking this on the mailing list:
 
* .send() and .receive()
What type of data can you send and receive between the tasklets?  Can you pass entire Python objects?
 
* .send() and .receive() memory model
When you send data between tasklets (pass messages), or whatever you want to call it, how is this implemented under the hood?  Does it use shared memory under the hood or does it involve a more costly copying of the data?  I realize that if it is on another machine you have to copy the data, but what about between two threads?  You mentioned PyPy's proxy object.... guess I'll need to read up on that.

From sparks.m at gmail.com  Thu Jul 29 19:21:25 2010
From: sparks.m at gmail.com (Michael Sparks)
Date: Thu, 29 Jul 2010 18:21:25 +0100
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References:
Message-ID: <201007291821.26318.sparks.m@gmail.com>

I make it a point these days to only reply on-list. It leads to endless 
repetition otherwise. If you repost this cc'ing the pypy-dev list I'll reply. 
If you think it's off topic there, then I see no point.


Michael.

On Thursday 29 July 2010 18:05:27 you wrote:
> Thanks for the reply.
> 
> > Would comments from a project using this approach in real systems be
> > of interest/use/help?
> 
> I contacted someone from Kamaelia a while back (probably you).
> Yes, use of the dataflow concept would be really useful (no
> MIT/BSD/Python/PD license).  However, licensing was an issue, so I went
> out on my own.  I find the concept rather interesting both to maybe learn
> from and to actually try and use in an actual application.
> 
> > Whilst I didn't know about Morrison's FBP
> > (Balzer's work predates him btw - don't listen to hype) I had heard of
> > (and played with) Occam among other more influential things, and
> > Kamaelia is a real tool.
> 
> What is this Balzer and Occam? :)  Do you have any links I can look at?
> 
> > Also there is already a pre-existing FBP tool
> > for Stackless
> 
> The problem is that Stackless is not parallel, which is what I would really
> like to do.
> 
> > , and then historically there's also MASCOT & friends.
> 
> Do you have a link about this?

-- 
>>>

From andrewfr_ice at yahoo.com  Thu Jul 29 22:39:16 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Thu, 29 Jul 2010 13:39:16 -0700 (PDT)
Subject: [pypy-dev] Would the following shared memory model be possible?
Message-ID: <557968.57023.qm@web120009.mail.ne1.yahoo.com>

Hi Michael:

--- On Thu, 7/29/10, Michael Sparks  wrote:

> It's a pity we didn't get a chance to chat at the
> conference. (I was the one videoing everything for upload after transcoding :)

Yes, I noticed. I gave the talk "Prototyping Go's Select with stackless.py for Stackless Python." Much of that talk dealt with rendezvous semantics courtesy of synchronous channels.

I will post the original slides and the revised version (mistakes corrected) in a day or two.
 
> > >http://www.kamaelia.org/MiniAxon.html
> > 

> I'm biassed towards Kamaelia (naturally :-), but I agree.
> MiniAxon is just a toy/tutorial. Early in kamaelia's history we considered using stackless, but rejected it simply because we wanted to work mainly with mainline python, rather than a specialised version.

Fair enough. Currently Stackless Python is being integrated with Psyco and will be available as a module. 

> Other things in Stackless's favour (IIRC) - include the
> fact that you can pickle generators, and send them across network connection, unpickle them and let them continue running. I don't know
> if you do the same with tasklets, but I wouldn't be surprised if you do :)

As long as you do not have a C Frame involved, you can pickle a tasklet.
That was the subject of my "Silly Stackless Python Trick" lightning talk.
I was going to demonstrate a version of the Sieve of Eratosthenes that could be pickled and resumed on another machine. However my HP Netbook had a non-standard VGA output connection and I needed to install Stackless
on a loaner ThinkPad that died as I hooked it up. However you saw all
that :-(

> That means you have potential for process migration.

Yep. Gabriel Lavoie does a lot of work in that area with PyPy (thanks Gabriel !)

> Doing that sensibly though IMO would require better understanding
> in the system of what the user is trying to achieve and what they're sending. (It's easy to think of examples where this causes more pain than it's worth after all)

You have to understand what can be pickled. Occasionally you are in for
a surprise (i.e. functools).

>You could argue in that case that the biggest _real_ difference is 
>that we try to use a unified API for different concurrency
>methods. 

Well, I would argue that Stackless has a simple, elegant model. The addition of select just adds more power. Stackless channels can also serve as generators (they are iterable). I recently took a stab at writing the Sleeping Barber problem. I think in Stackless, the basic solution was about 30 lines. Very little clutter.

> One **highly subjective** other thing in our favour, is
> that generators are limited to a single level of control flow
> (ie non-nestable without a trampoline). This doesn't sound like
> an advantage, but it tends to lead to simpler components which are
> in turn reusable. (and that I view as useful :)

Okay. I attended Raymond Hettinger's talk on Monocle. In the past
I have encountered situations where I bumped up against the nesting problem.
If I recall, the problem involved request handlers that had an RPC style AND made additional Twisted deferred calls:

class MyRequestHandler(...):

    @defer.inlineCallbacks
    def process(self):
        try:
            result = yield client.getPage("http://www.google.com")
        except Exception, err:
            log.err(err, "process getPage call failed")
        else:
            # do some processing with the result
            return result

This looks reasonable, but Python will balk: nested generators. The only way around it was to hope that the Twisted protocol was properly written and to chain deferreds.
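
For the curious, the trampoline idea mentioned in the quoted text can be sketched in plain Python 3 with no Twisted involved. The convention here is my own invention, not Twisted's or Monocle's API: yielding a generator means "call it", and a generator "returns" by yielding a `Result`. (Python 3.3 and later also have `yield from` and allow `return value` inside a generator, which covers much of this natively.)

```python
import types


class Result:
    """Hypothetical marker: yield Result(x) to 'return' x to the calling generator."""

    def __init__(self, value):
        self.value = value


def trampoline(gen):
    """Drive a generator that may yield sub-generators; return its final value."""
    stack = [gen]
    value = None
    while stack:
        try:
            yielded = stack[-1].send(value)
        except StopIteration:
            stack.pop()             # finished without a Result: caller receives None
            value = None
            continue
        if isinstance(yielded, types.GeneratorType):
            stack.append(yielded)   # nested call: run the sub-generator first
            value = None            # a fresh generator must be started with None
        elif isinstance(yielded, Result):
            stack.pop()             # "return": hand the value to the caller
            value = yielded.value
        else:
            value = yielded         # plain value: echo it straight back
    return value


def fetch(url):
    # Stands in for an asynchronous call such as getPage.
    yield Result("<html>%s</html>" % url)


def process():
    page = yield fetch("http://example.invalid/")
    yield Result(len(page))
```

`trampoline(process())` runs `fetch` to completion, feeds its result back into `process`, and returns the length that `process` computes.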

> Have fun,

I do :-)

Cheers,
Andrew



      


From p.giarrusso at gmail.com  Thu Jul 29 22:53:22 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 29 Jul 2010 22:53:22 +0200
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: <20100727062702.GE12699@tunixman.com>
Message-ID: 

On Thu, Jul 29, 2010 at 15:15, William Leslie
 wrote:
> On 29 July 2010 18:55, Maciej Fijalkowski  wrote:
>> On Thu, Jul 29, 2010 at 10:50 AM, William Leslie
>>  wrote:
>>> If task X expects that task Y will mutate some object it has, it needs
>>> to go back to the source for every read. This means that if you do use
>>> mutation of some shared object for communication, it needs to be
>>> synchronised before every access. What this means for us is that every
>>> read from a possibly mutable object requires an acquire, and every
>>> write requires a release. It's as if every reference in the program is
>>> implemented with a volatile pointer. Even if the object is never
>>> mutated, there can be a lot of unnecessary bus chatter waiting for
>>> MESI to tell us so.
>>>
>>
>> I do agree there is an overhead. Can you provide some data how much
>> this overhead is? Python is not a very simple language and a lot of
>> things are complex and time consuming, so I wonder how it compares to
>> locking per object.

Below I try to prove that locking is still too expensive, even for an
interpreter.
Also, for many things the clever optimizations you do allow making
those costs small, at least for the average case / fast path. I have
been taught to consider clever optimizations as required. With JIT
compilation, specialization and shadow classes, are method calls much
more expensive than a guard and (if no inlining is done, as might
happen in PyPy in the worst case for big functions) an assembler
'call' opcode, and possibly stack shuffling? How many cycles is that?
How much more expensive is that than optimized JavaScript (which is not far
from C, the only difference being the guard)? You can assume the case
of plain calls without keyword arguments and so on (and with inlining,
keyword arguments should pay no runtime cost).

Also, the free threading patches which tried removing the GIL gave an
unacceptable (IIRC 2x) slowdown to CPython in the old days of CPython
1.5. And I don't think they tried to lock every object, just what you
need to lock (which included refcounts).

> It *is* locking per object, but you also spend time looking for the
> data if someone else has invalidated your cache line.

That overhead is already there in locking per object, I think (locking
can be much more expensive than a cache miss, see below).
However, locking per object does not prevent race conditions unless
you make atomic regions as big as actually needed (locking per
statement does not work), it just prevents data races (a conflict
between a write and a memory operation which are not synchronized
between each other). And you can't extend atomic regions indefinitely,
as that implies starvation. Even software transactional memory
requires the programmer to allocate which regions have to be atomic.
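
The difference between a data race and a race condition can be shown
with a small sketch. The Account class and the hand-replayed schedule
below are purely illustrative (they are not from any project discussed
in this thread): every single read and write is locked, yet two tasks
that each do "read, then write" still lose an update unless the whole
read-modify-write is made one atomic region.

```python
import threading


class Account:
    """An object whose every read and write takes the object's own lock."""

    def __init__(self, balance):
        self._lock = threading.Lock()
        self._balance = balance

    def get(self):
        with self._lock:
            return self._balance

    def set(self, value):
        with self._lock:
            self._balance = value

    def deposit(self, amount):
        # The whole read-modify-write is one atomic region.
        with self._lock:
            self._balance += amount


def interleaved_deposits(account, a, b):
    """Replay by hand the bad schedule of two tasks doing get() then set()."""
    seen_by_1 = account.get()     # task 1 reads
    seen_by_2 = account.get()     # task 2 reads the same value
    account.set(seen_by_1 + a)    # task 1 writes
    account.set(seen_by_2 + b)    # task 2 overwrites task 1's update
    return account.get()


# No data race (each access is locked), but the first deposit is lost.
racy = interleaved_deposits(Account(100), 10, 20)

# With a region as big as actually needed, both deposits survive.
safe_account = Account(100)
safe_account.deposit(10)
safe_account.deposit(20)
safe = safe_account.get()
```

The schedule is replayed sequentially only to keep the outcome
deterministic; with real threads the same interleaving can happen, just
not on every run.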

Given the additional cost (discussed elsewhere in this mail), and
given that there is not much benefit, I think locking-per-object is
not worth it (but I'd still love to know more about why the effort on
python-safethread was halted).

> Come to think of it, that isn't as bad as it first seemed to me. If
> the sender never mutates the object, it will Just Work on any machine
> with a fairly flat cache architecture.

You first wrote: "The alternative, implicitly writing updates back to
memory as soon as possible and reading them out of memory every time,
can be hundreds or more times slower."
This is not "locking per object", it is just semantically close to it,
and becomes equivalent if only one thread has a reference at any time.

They are very different though performance-wise, and each of them is
better for some usages. In the Linux kernel (which I consider quite
authoritative here, on what you can do in C) both are used for valid
performance reasons, and a JIT compiler could choose between them.
Here, first I describe the two alternatives mentioned. Finally, I go
to the combination for the "unshared case".

- What you first described (memory barriers or uncached R/Ws) can be
faster for small updates, depending on the access pattern. An uncached
memory area does not disturb other memory traffic, unlike memory
barriers which are global, but I don't think an unprivileged process
is allowed to obtain one (by modifying MSRs or PATs, for x86).

Cost: each memory op goes to main memory and is thus as slow as a
cache miss (hundreds of clock cycles). When naively reading a Python
field, many such reads can be possible, but a JIT compiler can bring
it down to the equivalent of a C access with shadow classes and
specialization, and this would pay even more here (V8 does it for
JavaScript and I think PyPy already does most or all of it).

- Locking per object (monitors): slow upfront, but you can do each r/w
out of your cache, so if the object is kept locked for some time, this
is more efficient.
How slow? A system call to perform locking can cost tens of thousands
of cycles. But Java locks, and nowadays even Linux futexes (and
Windows locks), perform everything in userspace in as many cases as
possible (the slowpath is when there is actually contention on the
lock, but it's uncommon with locking-per-object). I won't sum up here
the literature on this.

- Since no contention is expected here, a simple pair of memory
barriers is needed on send/receive (a write barrier for send, a read
one for receive, IIRC). Allowing read-only access to another thread
already brings us back to a mixture of the above two solutions. However,
in the 1st solution, using memory barriers, you'd need a write barrier
for every write, but you could save on read barriers.
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From exarkun at twistedmatrix.com  Thu Jul 29 23:24:58 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Thu, 29 Jul 2010 21:24:58 -0000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: <557968.57023.qm@web120009.mail.ne1.yahoo.com>
References: <557968.57023.qm@web120009.mail.ne1.yahoo.com>
Message-ID: <20100729212458.2188.24074246.divmod.xquotient.34@localhost.localdomain>

On 08:39 pm, andrewfr_ice at yahoo.com wrote:
>
>Okay. I attended Ray's Hettinger's talk on Monocle. In the past
>I have encountered situations where I bumped up with the nesting 
>problem.
>If I recall, the problem involved request handlers that had a RPC style 
>AND made additional Twisted deferred calls:
>
>class MyRequestHandler(...):
>
>    @defer.inlineCallbacks
>    def process(self):
>        try:
>            result = yield client.getPage("http://www.google.com")
>        except Exception, err:
>            log.err(err, "process getPage call failed")
>        else:
>            # do some processing with the result
>            return result
>
>looks reasonable but Python will balk.

Aside from the "return result" (should be defer.returnValue(result), 
generators can't return with a value), this looks fine to me too.  Why 
do you say Python will balk?
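
For readers unfamiliar with the idiom, here is a minimal sketch of how a
"return value" helper can carry a result out of a generator by raising
an exception. It is not Twisted's implementation: run, return_value,
_ReturnValue and fake_get_page are hypothetical names, and the driver
treats every yielded value as an already-available result instead of
waiting on a Deferred.

```python
class _ReturnValue(BaseException):
    """Hypothetical carrier for a generator's result.

    It derives from BaseException so that a broad ``except Exception``
    in user code does not swallow it by accident.
    """

    def __init__(self, value):
        BaseException.__init__(self, value)
        self.value = value


def return_value(value):
    """Stand-in for a returnValue()-style helper: raise to hand back a value."""
    raise _ReturnValue(value)


def run(generator_function):
    """Drive a generator to completion, feeding each yielded value back in."""

    def wrapper(*args, **kwargs):
        gen = generator_function(*args, **kwargs)
        to_send = None
        try:
            while True:
                # A real driver would wait for an asynchronous result here.
                to_send = gen.send(to_send)
        except _ReturnValue as carried:
            return carried.value
        except StopIteration:
            return None

    return wrapper


def fake_get_page(url):
    # Hypothetical, synchronous replacement for client.getPage.
    return "<html>%s</html>" % url


@run
def process():
    try:
        result = yield fake_get_page("http://www.google.com")
    except Exception as err:
        print("process getPage call failed: %s" % err)
    else:
        # do some processing with the result, then hand it back
        return_value(result.upper())


page = process()
```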

Jean-Paul

From william.leslie.ttg at gmail.com  Fri Jul 30 09:35:29 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Fri, 30 Jul 2010 17:35:29 +1000
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
	
	
	
	
	
	
	
	
Message-ID: 

On 30 July 2010 06:53, Paolo Giarrusso  wrote:
>> Come to think of it, that isn't as bad as it first seemed to me. If
>> the sender never mutates the object, it will Just Work on any machine
>> with a fairly flat cache architecture.
>
> You first wrote: "The alternative, implicitly writing updates back to
> memory as soon as possible and reading them out of memory every time,
> can be hundreds or more times slower."
> This is not "locking per object", it is just semantically close to it,
> and becomes equivalent if only one thread has a reference at any time.

Yes, direct memory access was misdirection (sorry), as the cache
already handles consistency even in NUMA systems of the same size that
sit on most desktops today, and most significantly you still need to
lock objects in many cases, such as looking up an entry in a dict,
which can change size while probing. Not only are uncached accesses
needlessly slow in the typical case, but they are not sufficient to
ensure consistency of some resizable rpython data structures.
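
As a rough illustration of the dict case, the toy open-addressing table
below takes one lock per table around both insertion and lookup. It is a
sketch only, not how RPython dicts are implemented; LockedTable and fill
are made-up names. The point is that a probe computes an index into a
slot array that a concurrent resize may replace, so the lookup has to
hold the lock for the whole probe.

```python
import threading


class LockedTable:
    """Toy open-addressing hash table guarded by one lock per table."""

    def __init__(self, capacity=4):
        self._lock = threading.Lock()
        self._slots = [None] * capacity
        self._used = 0

    def _probe(self, slots, key):
        # Linear probing; terminates because the table is never full.
        index = hash(key) % len(slots)
        while slots[index] is not None and slots[index][0] != key:
            index = (index + 1) % len(slots)
        return index

    def _grow(self):
        old = self._slots
        self._slots = [None] * (len(old) * 2)
        for entry in old:
            if entry is not None:
                self._slots[self._probe(self._slots, entry[0])] = entry

    def put(self, key, value):
        with self._lock:
            if (self._used + 1) * 2 > len(self._slots):
                self._grow()  # replaces self._slots wholesale
            index = self._probe(self._slots, key)
            if self._slots[index] is None:
                self._used += 1
            self._slots[index] = (key, value)

    def get(self, key, default=None):
        # Without the lock, a concurrent _grow() could swap self._slots
        # between computing the index and reading the slot.
        with self._lock:
            entry = self._slots[self._probe(self._slots, key)]
            return default if entry is None else entry[1]


def fill(table, base, count):
    for offset in range(count):
        table.put(base + offset, offset)


table = LockedTable()
writers = [threading.Thread(target=fill, args=(table, base, 200))
           for base in (0, 1000, 2000)]
for writer in writers:
    writer.start()
for writer in writers:
    writer.join()
```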

-- 
William Leslie

From evan at theunixman.com  Fri Jul 30 21:36:28 2010
From: evan at theunixman.com (Evan Cofsky)
Date: Fri, 30 Jul 2010 12:36:28 -0700
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: 
	
Message-ID: <20100730193627.GB2082@tunixman.com>

On 07/27 11:48, Maciej Fijalkowski wrote:
> Right now, no. But there are ways in which you can experiment. Truly
> concurrent threads (depends on implicit vs explicit shared memory)
> might require a truly concurrent GC to achieve performance. This is
> work (although not as big as removing refcounting from CPython for
> example).

Would starting to remove the GIL then be a useful project for someone
(like me, for example) to undertake? It might be a good start to
experimentation with other kinds of concurrency. I've been interested in
Software Transactional Memory
(http://en.wikipedia.org/wiki/Software_transactional_memory).

-- 
Evan Cofsky 

From fijall at gmail.com  Fri Jul 30 21:40:35 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 30 Jul 2010 21:40:35 +0200
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: <20100730193627.GB2082@tunixman.com>
References: 
	 
	<20100730193627.GB2082@tunixman.com>
Message-ID: 

On Fri, Jul 30, 2010 at 9:36 PM, Evan Cofsky  wrote:
> On 07/27 11:48, Maciej Fijalkowski wrote:
>> Right now, no. But there are ways in which you can experiment. Truly
>> concurrent threads (depends on implicit vs explicit shared memory)
>> might require a truly concurrent GC to achieve performance. This is
>> work (although not as big as removing refcounting from CPython for
>> example).
>
> Would starting to remove the GIL then be a useful project for someone
> (like me, for example) to undertake? It might be a good start to
> experimentation with other kinds of concurrency. I've been interested in
> Software Transactional Memory
> (http://en.wikipedia.org/wiki/Software_transactional_memory).
>
> --
> Evan Cofsky 
>

I think removing GIL is not a good place to start. It's far too
complex without knowing codebase (it's fairly complex with knowing
codebase). There are many related projects, which are smaller in size
and eventually might lead to having some idea how to remove the GIL.
If you're interested, come to #pypy on IRC to discuss.

Cheers,
fijal

From evan at theunixman.com  Fri Jul 30 21:54:09 2010
From: evan at theunixman.com (Evan Cofsky)
Date: Fri, 30 Jul 2010 12:54:09 -0700
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: 
	
	<20100730193627.GB2082@tunixman.com>
	
Message-ID: <20100730195408.GC2082@tunixman.com>

On 07/30 21:40, Maciej Fijalkowski wrote:
> If you're interested, come to #pypy on IRC to discuss.

Sounds reasonable enough. I'll hang out on #pypy and see what happens.

Thanks

-- 
Evan Cofsky 

From sparks.m at gmail.com  Sat Jul 31 03:08:49 2010
From: sparks.m at gmail.com (Michael Sparks)
Date: Sat, 31 Jul 2010 02:08:49 +0100
Subject: [pypy-dev] FW: Would the following shared memory model be
	possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
	
Message-ID: 

On Thu, Jul 29, 2010 at 6:44 PM, Kevin Ar18  wrote:
> You brought up a lot of topics.  I went ahead and sent you a private email.
> There's always lots of interesting things I can add to my list of things to
> learn about. :)

Yes, there are lots of interesting things. I have a limited amount of
time however (I should be in bed, it's very late here, but I do /try/
to reply to on-list mails), so cannot spoon-feed you. Mailing me
directly rather than a (relevant) list precludes you getting answers
from someone other than me. Not being on lists also precludes you
getting answers to questions by chance. Changing emails and names in
email headers also makes keeping track of people hard...

(For example you asked off list last year about Kamaelia's license
from a different email address. Since it wasn't searchable I
completely forgot. You also asked all sorts of questions but didn't
want the answers public, so I didn't reply. If instead you'd
subscribed to the list, and asked there, you'd've found out that
Kamaelia's license changed - to the Apache Software License v2 ...)

If I mention something you find interesting, please Google first and
then ask publicly somewhere relevant. (the answer and question are
then googleable, and you're doing the community a service IMO if you
ask q's that way - if your question is somewhere relevant and shows
you've already googled prior work as far as you can... People are
always willing to help people who show willing to help themselves in
my experience.)

>> just looks to me that you're tieing yourself up in knots over things
>> that aren't problems, when there are some things which could be useful
>> (in practice) & interesting in this space.
> The particular issue in this situation is that there is no way to make
> Kamaelia, FBP, or other concurrency concepts run in parallel (unless you are
> willing to accept lots of overhead like with the multiprocessing queues).
>
> Since you have worked with Kamaelia code a lot... you understand a lot more
> about implementation details.  Do you think the previous shared memory
> concept or something like it would let you make Kamaelia parallel?
> If not, can you think of any method that would let you make Kamaelia
> parallel?

Kamaelia already CAN run components in parallel in different processes
(has been able to do so for quite some time) or on different
processors. Indeed, all you do is use a ProcessPipeline or
ProcessGraphline rather than Pipeline or Graphline, and the components
in the top level are spread across processes. I still view the code as
experimental, but it does work, and when needed is very useful.

Kamaelia running on Iron Python can run on separate processors sharing
data efficiently (due to lack of GIL there) happily too. Threaded
components there do that naturally - I don't use IronPython, but it
does run on Iron Python. On Windows this is easiest, though Mono works
just as well.

I believe Jython also is GIL free, and Kamaelia's Axon runs there
cleanly too. As a result because Kamaelia is pure python, it runs
truly in parallel there too (based on hearing from people using
kamaelia on jython). Cpython is the exception (and a rather big one at
that). (Pypy has a choice IIUC)

Personally, I think if PyPy worked with generators better (which is
why I keep an eye on PyPy) and cpyext was improved, it'd provide a
really compelling platform for me. (I was rather gutted at Europython
to hear that PyPy's generator support was still ... problematic)

Regarding the *efficiency* and *enforcement* of the approach taken, I
feel you're barking up the wrong tree, but let's go there.

What approach does baseline (non-Iron Python running) kamaelia take
for multi-process work?

For historical reasons, it builds on top of pprocess rather than the
multiprocessing module. This means for interprocess
communications objects are pickled before being sent over operating
system pipes.

This provides an obvious communications overhead - and this isn't
really kamaelia specific at this point.
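
To make the overhead concrete, here is a small sketch of the "pickle it
and push it through an OS pipe" pattern. It is not pprocess or Kamaelia
code; send, receive and read_exactly are illustrative helpers, and both
ends of the pipe live in one process purely to keep the example
self-contained (the message is small enough to fit in the pipe buffer).

```python
import os
import pickle
import struct


def send(fd, obj):
    """Pickle obj and write it to fd with a 4-byte length prefix."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    os.write(fd, struct.pack("!I", len(payload)) + payload)
    return len(payload)


def read_exactly(fd, count):
    """Read exactly count bytes from fd, or raise EOFError."""
    chunks = []
    while count:
        chunk = os.read(fd, count)
        if not chunk:
            raise EOFError("pipe closed early")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def receive(fd):
    """Read one length-prefixed pickle from fd and rebuild the object."""
    (length,) = struct.unpack("!I", read_exactly(fd, 4))
    return pickle.loads(read_exactly(fd, length))


read_fd, write_fd = os.pipe()
message = {"frame": 1, "pixels": [0.0] * 16}
sent_bytes = send(write_fd, message)
received = receive(read_fd)
os.close(read_fd)
os.close(write_fd)
```

The receiver ends up with an equal but distinct object: every message is
serialised, copied through the kernel, and rebuilt, which is the cost
being described.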

However, shifting data from one CPU to another is expensive, and only
worth doing in some circumstances. (Consider a machine with several
physical CPUs - each has a local CPU cache, and the data needs to be
transferred from one to another, which is partly why people worry
about thread/CPU affinity etc.)

Basically, if you can manage it, you don't want to shift data between
CPUs, you want to partition the processing.

ie you may want to start caring about the size of messages and number
of messages going between processes. Sending small and few between
processes is going to be preferable to sending large and many for
throughput purposes.

In the case of small and few, the approach of pickling and sending
across OS pipes isn't such a bad idea. It works.

If you do want to share data between CPUs, and it sounds like you do,
then most OSs already provide a means of doing that - threads. The
conventions people use for using threads are where they become
unpicked, but as a mechanism, threads do generally work, and work
well.

As well as channels/boxes, you can use an STM approach, such as that
in Axon.STM ...
    * http://www.kamaelia.org/STM.html
    * http://code.google.com/p/kamaelia/source/browse/trunk/Code/Python/Bindings/STM/

...which is logically very similar to version control for variables. A
downside of STM (at least with this approach) however, is that for it
to work, you need either copy on write semantics for objects, or full
copying of objects or similar. Personally I use a biological metaphor
here, in that channels/boxes and components, and similar perform a
similar function to axons and neurons in the body, and that STM is
akin to the hormonal system for maintaining and controlling system
state. (I modelled biological tree growth many moons ago)
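
The "version control for variables" idea can be sketched as follows.
This is not the Axon.STM API; VersionedStore, checkout, commit and
ConflictError are hypothetical names chosen for the illustration. Each
checkout hands out a private copy plus the version it was based on, and
a commit only succeeds if nobody else committed in between.

```python
import copy
import threading


class ConflictError(Exception):
    """Raised when a commit is based on an out-of-date version."""


class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # name -> (version, value)

    def checkout(self, name):
        """Return (version, private copy of the value)."""
        with self._lock:
            version, value = self._data.get(name, (0, None))
            return version, copy.deepcopy(value)

    def commit(self, name, based_on, value):
        """Store value if based_on is still current, else raise."""
        with self._lock:
            current, _ = self._data.get(name, (0, None))
            if current != based_on:
                raise ConflictError(name)
            self._data[name] = (current + 1, copy.deepcopy(value))
            return current + 1


store = VersionedStore()
v0, _ = store.checkout("score")
store.commit("score", v0, [1])

va, a = store.checkout("score")  # task A checks out version 1
vb, b = store.checkout("score")  # task B checks out version 1 as well

a.append(2)
store.commit("score", va, a)     # A commits first: version 2

b.append(3)
try:
    store.commit("score", vb, b)  # B's copy is stale
    conflicted = False
except ConflictError:
    conflicted = True
    vb, b = store.checkout("score")  # update, redo the change, retry
    b.append(3)
    store.commit("score", vb, b)

final_version, final_value = store.checkout("score")
```

The deepcopy calls are where the "full copying of objects" cost
mentioned above shows up in this particular sketch.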

Anyhow, coming back to threads, that brings us back to python, and
implementations with a GIL, and those without.

For implementations with a GIL, you then have a choice: do I choose to
try and implement a memory model that _enforces_ data locality? that
is if a piece of data is in use inside a single "process" or "thread"
(from hereon I'll use "task" as a generic phrase) that trying to use
it inside another causes a problem for the task attempting to breach
the model.

In order to enforce this, I personally believe you'd need to use
multiple processes, and only share data through dedicated code
managing shared memory. You could of course do this outside user code.
To do this you'd need an abstraction that made sense, and something
like stackless' channels or kamaelia's (in/out) box model makes sense
there. (The CELL API uses a mailbox metaphor as well for reference)

In that case, you have a choice. You either copy the data into shared
memory, or you share the data in situ. The former gives you back
precisely the same overhead previously described, while the latter
fragments your memory (since you can no longer access it). You could
also have compaction.

However, personally, I think any possible benefits here are outweighed
by the costs and complexity.

The alternative is to _encourage_ data locality. That is encourage the
usage and sharing of data such that whilst you could share data
between tasks and cause corruption that the common way of using the
system discourages such actions. In essence that's what I try to do in
Kamaelia, and it seems to work. Specifically, the model says:

    * If I take a piece of data from an inbox, I own it and can do anything
      with it that I like. If you think of a physical piece of paper and
      I take it from an intray, then that really is the case.

    * If I put a piece of data in an outbox, I no longer own it and should
      not attempt to do anything more with it. Again, using a physical
      metaphor, and naming scheme helps here. In particular, if I put a
      piece of paper in the post, I can no longer modify it. How it gets
      to its recipient is not my concern either.

In practice this does actually work. If you add in immutable tuples,
and immutable strings then it becomes a lot clearer how this can work.
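
A minimal sketch of that convention, using plain threads and
queue.Queue as stand-ins for components and boxes (this is not Axon
code, and producer, doubler, collect and STOP are names made up for the
example). Each task only touches what it took from its inbox, and
forgets whatever it has put in its outbox; sending immutable tuples
makes the "no longer mine" rule hard to break by accident.

```python
import queue
import threading

STOP = None  # sentinel meaning "no more messages"


def producer(outbox):
    for n in range(5):
        message = ("reading", n)  # immutable tuple
        outbox.put(message)
        # By convention the message is no longer ours after this point.
    outbox.put(STOP)


def doubler(inbox, outbox):
    while True:
        message = inbox.get()  # taken from the inbox: now we own it
        if message is STOP:
            outbox.put(STOP)
            return
        kind, value = message
        outbox.put((kind, value * 2))


def collect(inbox):
    results = []
    while True:
        message = inbox.get()
        if message is STOP:
            return results
        results.append(message)


link_a = queue.Queue()  # producer's outbox, doubler's inbox
link_b = queue.Queue()  # doubler's outbox, collector's inbox
workers = [
    threading.Thread(target=producer, args=(link_a,)),
    threading.Thread(target=doubler, args=(link_a, link_b)),
]
for worker in workers:
    worker.start()
collected = collect(link_b)
for worker in workers:
    worker.join()
```

Nothing here enforces the ownership rule; as described above, it is
encouraged by the shape of the code rather than policed.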

Is there a risk here of accidental modification? Yes. However, the
size and general simplicity of components tends to lead to such
problems being picked up early. It also enables component level
acceptance tests. (We tend to build small examples of usage, which in
turn effectively form acceptance tests)

[ An alternative is to make the "send" primitive make a copy on send.
That would be quite an overhead, and also limit the types of data you
can send. ]
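
A sketch of what copy-on-send could look like, and of both drawbacks.
send_copy is a hypothetical helper, not part of Axon. The deep copy
costs time proportional to the message, and some objects (a lock is used
here as an example) cannot be copied at all.

```python
import copy
import queue
import threading


def send_copy(outbox, message):
    """Put a deep copy in the outbox so the sender may keep its original."""
    outbox.put(copy.deepcopy(message))


outbox = queue.Queue()
original = {"values": [1, 2, 3]}
send_copy(outbox, original)
original["values"].append(4)  # the sender mutates its own copy afterwards
delivered = outbox.get()      # the receiver is unaffected

# Not everything can be copied: deep-copying a lock raises TypeError
# in CPython, so such a message could not be sent this way.
try:
    send_copy(outbox, {"guard": threading.Lock()})
    lock_was_sendable = True
except TypeError:
    lock_was_sendable = False
```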

In practical terms, it works. (Stackless proves this as well IMO,
since despite some differences, there's also lots of similarities)

The other question that arises, is "isn't the GIL a problem with
threads?". Well, the answer to that really depends on what you're
doing. David Beazley's talk on what happens on mixing different sorts
of threads shows that it isn't ideal, and if you're hitting that
behaviour, then actually switching to real processes makes sense.
However if you're doing CPU intensive work inside a C extension which
releases the GIL (eg numpy), then it's less of an issue in practice.
Custom extensions can do the same.

So, for example, picking something which I know colleagues [1] at work
do, you can use a DVS broadcast capture card to capture video frames,
pass those between threads which are doing processing on them, and
inside those threads use c extensions to process the data efficiently
(since image processing does take time...), and those release the GIL
boosting throughput.

   [1] On this project : http://www.bbc.co.uk/rd/projects/2009/10/i3dlive.shtml
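
The same pattern can be tried with only the standard library. The
sketch below uses hashlib as a stand-in for an image-processing
extension, on the understanding that CPython's hashlib releases the GIL
while hashing larger buffers; the 1 MiB "frames" and the digest helper
are invented for the example. It checks only that threaded and
sequential runs agree, and makes no claim about the speedup, which
depends on the machine and the Python build.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(frame):
    """CPU-heavy work done inside a C extension."""
    return hashlib.sha256(frame).hexdigest()


# Four 1 MiB buffers standing in for captured video frames.
frames = [bytes([n]) * (1 << 20) for n in range(4)]

# Each worker thread spends most of its time inside the C hashing code.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, frames))

sequential = [digest(frame) for frame in frames]
```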

So, that makes it all sound great - ie things can, after various
fashions, run in parallel on various versions of python, to practical
benefit. But obviously it could be improved.

Personally, I think the project most likely to make a difference here
is actually pypy. Now, talk is very cheap, and easy, and I'm not
likely to implement this, so I'll aim to be brief. Execution is hard.

In particular, what I think is most likely to be beneficial is
something _like_ this:

Assume pypy runs without a GIL. Then allow the creation of a green
process. A green process is implemented using threads, but with data
created on the heap such that it defaults to being marked private to
the thread (ie ala thread local storage, but perhaps implemented
slightly differently - via references from the thread local storage
into the heap) rather than shared. Sharing between green processes
(for channels or boxes) would "simply" be detagged as being owned by
one thread, and passed to another.

In particular this would mean that you need a mechanism for doing
this. Simply attempting to call another green process (or thread) from
another with mutable data types would be sufficient to raise the
equivalent of a segmentation fault.
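
Purely as a thought experiment, the tagging idea might look something
like this at the Python level. Nothing below exists in PyPy; Owned,
OwnershipError, release and claim are hypothetical, and a real
implementation would live in the runtime rather than in a wrapper
class. An object is tagged with its owning thread, access from any
other thread raises, and handing it over means detagging and retagging.

```python
import threading


class OwnershipError(RuntimeError):
    """Raised when a task touches data that is private to another task."""


class Owned:
    """Wraps a value and remembers which thread currently owns it."""

    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value
        self._owner = threading.get_ident()

    def get(self):
        if self._owner != threading.get_ident():
            raise OwnershipError("value is private to another task")
        return self._value

    def release(self):
        """Detag the value so that it can be passed to another task."""
        with self._lock:
            if self._owner != threading.get_ident():
                raise OwnershipError("only the owner can release")
            self._owner = None

    def claim(self):
        """Tag a released value as owned by the calling thread."""
        with self._lock:
            if self._owner is not None:
                raise OwnershipError("value has not been released")
            self._owner = threading.get_ident()
        return self


def try_read(box, log):
    try:
        log.append(box.get())
    except OwnershipError:
        log.append("denied")


def claim_and_read(box, log):
    log.append(box.claim().get())


box = Owned([1, 2, 3])
log = []

other = threading.Thread(target=try_read, args=(box, log))
other.start()
other.join()                 # another task peeks: denied

box.release()                # the owner gives the value up
other = threading.Thread(target=claim_and_read, args=(box, log))
other.start()
other.join()                 # the new owner can read it

try_read(box, log)           # the old owner is now locked out
```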

Secondly, improve cpyext to the extent that each cpython extension
gets its own version of the GIL (ie each extension runs with its own
logical runtime, and thinks that it has its own GIL which it can lock
and release; in practice it's faked by the PyPy runtime). This is
essentially similar conceptually to creating green processes.

It's worth considering that the Linux kernel went through similar
changes, in that in the 2.0 days there was a large single big lock,
which was replaced by ever more granular locks. I personally think that
since there are so many extensions that rely on the existence of the
GIL simply waving a wand to get rid of it isn't likely. However
logically providing a GIL per C-Extension may be plausible, and _may_
be sufficient.

However, I don't know - it might well not - I've not looked at the
code, and talk is cheap - execution is hard.

Hopefully the above (cheap :) comments are in some small way useful.

Regards,


Michael.

From cfbolz at gmx.de  Sat Jul 31 08:34:49 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Sat, 31 Jul 2010 08:34:49 +0200
Subject: [pypy-dev] S3 2010 deadline extension
Message-ID: <4C53C409.1060101@gmx.de>

The S3 2010 paper deadline was extended by two weeks, and is now 
August 13, 2010.


*** Workshop on Self-sustaining Systems (S3) 2010 ***

September 27-28, 2010
The University of Tokyo, Japan
http://www.hpi.uni-potsdam.de/swa/s3/s3-10/

In cooperation with ACM SIGPLAN

=== Call for papers ===

The Workshop on Self-sustaining Systems (S3) is a forum for discussion 
of topics relating to computer systems and languages that are able to 
bootstrap, implement, modify, and maintain themselves. One property of 
these systems is that their implementation is based on small but 
powerful abstractions; examples include (amongst others) 
Squeak/Smalltalk, COLA, Klein/Self, PyPy/Python, Rubinius/Ruby, and 
Lisp. Such systems are the engines of their own replacement, giving 
researchers and developers great power to experiment with, and explore 
future directions from within, their own small language kernels.

S3 will take place September 27-28, 2010 at The University of Tokyo, 
Japan. It is an exciting opportunity for researchers and practitioners 
interested in self-sustaining systems to meet and share their knowledge, 
experience, and ideas for future research and development.

--- Submissions and proceedings ---

S3 invites submissions of high-quality papers reporting original 
research, or describing innovative contributions to, or experience with, 
self-sustaining systems, their implementation, and their application. 
Papers that depart significantly from established ideas and practices 
are particularly welcome.

Submissions must not have been published previously and must not be 
under review for any other refereed event or publication. The program 
committee will evaluate each contributed paper based on its relevance, 
significance, clarity, and originality. Revised papers will be published 
as post-proceedings in the ACM Digital Library.

Papers should be submitted electronically via EasyChair at 
http://www.easychair.org/conferences/?conf=s32010 in PDF format. 
Submissions must be written in English (the official language of the 
workshop) and must not exceed 10 pages. They should use the ACM SIGPLAN 
10 point format, templates for which are available at 
http://www.acm.org/sigs/sigplan/authorInformation.htm.

--- Venue ---

The University of Tokyo, Komaba Campus, Japan

--- Important dates ---

Submission of papers: *EXTENDED* August 13, 2010
Author notification: August 27, 2010
Early registration: September 3, 2010
Revised papers: September 10, 2010
S3 workshop: September 27-28, 2010
Final papers for ACM-DL post-proceedings: October 15, 2010

--- Invited talks ---

Yukihiro Matsumoto: "From Lisp to Ruby to Rubinius"
Takashi Ikegami: "Sustainable Autonomy and Designing Mind Time"

--- Chairs ---

Robert Hirschfeld (Hasso-Plattner-Institut Potsdam, Germany)
hirschfeld at hpi.uni-potsdam.de
Hidehiko Masuhara (The University of Tokyo, Japan)
masuhara at graco.c.u-tokyo.ac.jp
Kim Rose (Viewpoints Research Institute, USA)
kim.rose at vpri.org

--- Program committee ---

Carl Friedrich Bolz, University of Duesseldorf, Germany
Johan Brichau, Universite Catholique de Louvain, Belgium
Shigeru Chiba, Tokyo Institute of Technology, Japan
Brian Demsky, University of California, Irvine, USA
Marcus Denker, INRIA Lille, France
Richard P. Gabriel, IBM Research, USA
Michael Haupt, Hasso-Plattner-Institut, Germany
Robert Hirschfeld, Hasso-Plattner-Institut, Germany (co-chair)
Atsushi Igarashi, University of Kyoto, Japan
David Lorenz, The Open University, Israel
Hidehiko Masuhara, University of Tokyo, Japan (co-chair)
Eliot Miranda, Teleplace, USA
Ian Piumarta, Viewpoints Research Institute, USA
Martin Rinard, MIT, USA
Antero Taivalsaari, Nokia, Finland
David Ungar, IBM, USA

_______________________________________________
fonc mailing list
fonc at vpri.org
http://vpri.org/mailman/listinfo/fonc

From andrewfr_ice at yahoo.com  Sat Jul 31 12:00:49 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Sat, 31 Jul 2010 03:00:49 -0700 (PDT)
Subject: [pypy-dev] Would the following shared memory model be possible?
In-Reply-To: 
Message-ID: <597371.6380.qm@web120001.mail.ne1.yahoo.com>

Hi JP:

Message: 1
Date: Thu, 29 Jul 2010 21:24:58 -0000
From: exarkun at twistedmatrix.com
Subject: Re: [pypy-dev] Would the following shared memory model be
    possible?
To: pypy-dev at codespeak.net
Message-ID:
    <20100729212458.2188.24074246.divmod.xquotient.34 at localhost.localdomain>
   
Content-Type: text/plain; charset="utf-8"; format="flowed"

On 08:39 pm, andrewfr_ice at yahoo.com wrote:
>
>Okay. I attended Ray's Hettinger's talk on Monocle. In the past
>I have encountered situations where I bumped up with the nesting
>problem.
>If I recall, the problem involved request handlers that had a RPC style
>AND made additional Twisted deferred calls:
>
>class MyRequestHandler(...):
>
>    @defer.inlineCallbacks
>    def process(self):
>        try:
>            result = yield client.getPage("http://www.google.com")
>        except Exception, err:
>            log.err(err, "process getPage call failed")
>        else:
>            # do some processing with the result
>            return result
>
>looks reasonable but Python will balk.

JP>Aside from the "return result" (should be defer.returnValue(result),
JP>generators can't return with a value), this looks fine to me too.  Why
JP>do you say Python will balk?

Well the return with a value was the deal breaker. I used this example because this is where I came face-to-face with nested generators - and generated a mistrust for them in regard to exotic uses. There was something else about the real example (I am having a hard time finding the posts - somewhere in 2007) - I think it was a very early version of PyAMF and it really wanted a return (HTTP is okay). I believe under the hood, if the protocol returns a deferred or None, the reactor will expect further output in the future.

Cheers,
Andrew



      


From sparks.m at gmail.com  Sat Jul 31 19:43:32 2010
From: sparks.m at gmail.com (Michael Sparks)
Date: Sat, 31 Jul 2010 18:43:32 +0100
Subject: [pypy-dev] FW: Would the following shared memory model be
	possible?
In-Reply-To: 
References: 
	<20100727062702.GE12699@tunixman.com>
	
	
	
	
	
	
	
	
Message-ID: 

[ cc'ing the list in case anyone else took my words the same way as Kevin :-( ]

On Sat, Jul 31, 2010 at 5:26 PM, Kevin Ar18  wrote:
> I have no idea what I did to warrant your hateful replies towards me, but
> they really are not appropriate (in public or private email).

I had absolutely no intention of offending you, and am deeply sorry
for any offense that I may have caused you.

In my reply I merely wanted to flag that I don't have time to go into
everything (like most people), that asking questions in a public realm
is better because you may then get answers from multiple people, and
that people who appear to do some research first tend to get better
answers. I also tried to give an example, but that doesn't appear to
have been helpful. (I'm fallible like everyone else)

My intention there was to be helpful and to explain why I have that
view of only replying on list, and it appears to have offended you
instead, and I apologise. (one person's direct and helpful speech in
one place can be a mortal insult somewhere else)

After those couple of paragraphs, I tried to add to your discussion by
replying to your specific points which you asked about parallel
execution, noting places and examples where it is possible today. (to
varying degrees of satisfaction) I then also tried to answer your
point of "if something extra could be done, what would probably be
generally useful". To that I noted that *my* talk there was cheap, and
that execution was hard.

Somehow along the way, my intent to try to be helpful to you has
resulted in offending and upsetting you, and for that I am truly sorry
- life is simply too short for people to upset each other, and in no
way was my post intended as "hateful", and once again, my apologies.
In future please assume good intentions - I assumed good intentions on
your part.

I'll bow out at this point.

Best Regards,


Michael.

>
>> Date: Sat, 31 Jul 2010 02:08:49 +0100
>> Subject: Re: [pypy-dev] FW: Would the following shared memory model be
>> possible?
>> From: sparks.m at gmail.com
>> To: kevinar18 at hotmail.com
>> CC: pypy-dev at codespeak.net
>>
>> On Thu, Jul 29, 2010 at 6:44 PM, Kevin Ar18  wrote:
>> > You brought up a lot of topics. I went ahead and sent you a private
>> > email.
>> > There's always lots of interesting things I can add to my list of things
>> > to
>> > learn about. :)
>>
>> Yes, there are lots of interesting things. I have a limited amount of
>> time however (I should be in bed, it's very late here, but I do /try/
>> to reply to on-list mails), so cannot spoon-feed you. Mailing me
>> directly rather than a (relevant) list precludes you getting answers
>> from someone other than me. Not being on lists also precludes you
>> getting answers to questions by chance. Changing emails and names in
>> email headers also makes keeping track of people hard...
>>
>> (For example you asked off list last year about Kamaelia's license
>> from a different email address. Since it wasn't searchable I
>> completely forgot. You also asked all sorts of questions but didn't
>> want the answers public, so I didn't reply. If instead you'd
>> subscribed to the list, and asked there, you'd've found out that
>> Kamaelia's license changed - to the Apache Software License v2 ...)
>>
>> If I mention something you find interesting, please Google first and
>> then ask publicly somewhere relevant. (the answer and question are
>> then googleable, and you're doing the community a service IMO if you
>> ask q's that way - if your question is somewhere relevant and shows
>> you've already googled prior work as far as you can... People are
>> always willing to help people who show willing to help themselves in
>> my experience.)
>>
>> >> just looks to me that you're tying yourself up in knots over things
>> >> that aren't problems, when there are some things which could be useful
>> >> (in practice) & interesting in this space.
>> > The particular issue in this situation is that there is no way to make
>> > Kamaelia, FBP, or other concurrency concepts run in parallel (unless you
>> > are
>> > willing to accept lots of overhead like with the multiprocessing
>> > queues).
>> >
>> > Since you have worked with Kamaelia code a lot... you understand a lot
>> > more
>> > about implementation details. Do you think the previous shared memory
>> > concept or something like it would let you make Kamaelia parallel?
>> > If not, can you think of any method that would let you make Kamaelia
>> > parallel?
>>
>> Kamaelia already CAN run components in parallel in different processes
>> (has been able to do so for quite some time) or on different
>> processors. Indeed, all you do is use a ProcessPipeline or
>> ProcessGraphline rather than Pipeline or Graphline, and the components
>> in the top level are spread across processes. I still view the code as
>> experimental, but it does work, and when needed is very useful.
>>
>> Kamaelia running on Iron Python can run on separate processors sharing
>> data efficiently (due to lack of GIL there) happily too. Threaded
>> components there do that naturally - I don't use IronPython, but it
>> does run on Iron Python. On windows this is easiest, though Mono works
>> just as well.
>>
>> I believe Jython also is GIL free, and Kamaelia's Axon runs there
>> cleanly too. As a result because Kamaelia is pure python, it runs
>> truly in parallel there too (based on hearing from people using
>> kamaelia on jython). CPython is the exception (and a rather big one at
>> that). (Pypy has a choice IIUC)
>>
>> Personally, I think if PyPy worked with generators better (which is
>> why I keep an eye on PyPy) and cpyext was improved, it'd provide a
>> really compelling platform for me. (I was rather gutted at Europython
>> to hear that PyPy's generator support was still ... problematic)
>>
>> Regarding the *efficiency* and *enforcement* of the approach taken, I
>> feel you're barking up the wrong tree, but let's go there.
>>
>> What approach does baseline (non-Iron Python running) kamaelia take
>> for multi-process work?
>>
>> For historical reasons, it builds on top of pprocess rather than the
>> multiprocessing module. This means for interprocess
>> communications objects are pickled before being sent over operating
>> system pipes.
>>
>> This provides an obvious communications overhead - and this isn't
>> really kamaelia specific at this point.
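
To make the "pickle, then send over an OS pipe" step concrete, here is a minimal sketch using only the standard library. It is not Kamaelia's or pprocess's actual code; the helper names (send_obj, recv_obj, roundtrip) and the 4-byte length prefix are assumptions made for illustration. Both pipe ends live in one process here purely so the example is easy to run; in real use the reader would be a different process.

```python
import os
import pickle
import struct


def send_obj(fd, obj):
    """Pickle obj and write it to fd with a 4-byte length prefix."""
    payload = pickle.dumps(obj)
    # Small messages only: a large payload could fill the pipe buffer
    # and block, because nobody is reading concurrently in this sketch.
    os.write(fd, struct.pack("!I", len(payload)) + payload)


def read_exact(fd, n):
    """Read exactly n bytes from fd, looping over short reads."""
    chunks = []
    while n:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed before message was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_obj(fd):
    """Read one length-prefixed pickle from fd and rebuild the object."""
    (size,) = struct.unpack("!I", read_exact(fd, 4))
    return pickle.loads(read_exact(fd, size))


def roundtrip(obj):
    """Send obj through an OS pipe and return the rebuilt copy."""
    read_fd, write_fd = os.pipe()
    try:
        send_obj(write_fd, obj)
        return recv_obj(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The receiver always gets a copy, never the original object, which is where the overhead described above comes from: every message is serialised, copied through the kernel, and rebuilt.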
>>
>> However, shifting data from one CPU to another is expensive, and only
>> worth doing in some circumstances. (Consider a machine with several
>> physical CPUs - each has a local CPU cache, and the data needs to be
>> transferred from one to another, which is partly why people worry
>> about thread/CPU affinity etc)
>>
>> Basically, if you can manage it, you don't want to shift data between
>> CPUs, you want to partition the processing.
>>
>> ie you may want to start caring about the size of messages and number
>> of messages going between processes. Sending small and few between
>> processes is going to be preferable to sending large and many for
>> throughput purposes.
>>
>> In the case of small and few, the approach of pickling and sending
>> across OS pipes isn't such a bad idea. It works.
>>
>> If you do want to share data between CPUs, and it sounds like you do,
>> then most OSs already provide a means of doing that - threads. The
>> conventions people use for using threads are where they become
>> unpicked, but as a mechanism, threads do generally work, and work
>> well.
>>
>> As well as channels/boxes, you can use an STM approach, such as that
>> in Axon.STM ...
>> * http://www.kamaelia.org/STM.html
>> *
>> http://code.google.com/p/kamaelia/source/browse/trunk/Code/Python/Bindings/STM/
>>
>> ...which is logically very similar to version control for variables. A
>> downside of STM (at least with this approach) however, is that for it
>> to work, you need either copy on write semantics for objects, or full
>> copying of objects or similar. Personally I use a biological metaphor
>> here, in that channels/boxes and components, and similar perform a
>> similar function to axons and neurons in the body, and that STM is
>> akin to the hormonal system for maintaining and controlling system
>> state. (I modelled biological tree growth many moons ago)
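
The "version control for variables" idea can be sketched in a few lines. This is not the Axon.STM implementation or its API; MiniStore, Value, checkout and commit are hypothetical names chosen for this illustration. It takes the "full copying of objects" route mentioned above: every checkout hands out a private copy, and a commit only succeeds if nobody else committed in between.

```python
import copy
import threading


class ConcurrentUpdate(Exception):
    """Raised when a commit is based on an out-of-date checkout."""


class Value:
    """A private working copy of one named variable."""

    def __init__(self, store, name, version, value):
        self.store = store
        self.name = name
        self.version = version
        self.value = value

    def commit(self):
        self.store._commit(self)


class MiniStore:
    """Hypothetical versioned store: checkout copies, commit checks versions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # name -> (version, value)

    def checkout(self, name):
        with self._lock:
            version, value = self._data.get(name, (0, None))
            # Deep copy so the caller can never mutate the stored value.
            return Value(self, name, version, copy.deepcopy(value))

    def _commit(self, val):
        with self._lock:
            current_version, _ = self._data.get(val.name, (0, None))
            if current_version != val.version:
                raise ConcurrentUpdate(val.name)
            val.version = current_version + 1
            self._data[val.name] = (val.version, copy.deepcopy(val.value))
```

A caller that gets ConcurrentUpdate would normally check the variable out again, redo its change, and retry, much like resolving a conflict after a rejected commit.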
>>
>> Anyhow, coming back to threads, that brings us back to python, and
>> implementations with a GIL, and those without.
>>
>> For implementations with a GIL, you then have a choice: do I choose to
>> try and implement a memory model that _enforces_ data locality? that
>> is if a piece of data is in use inside a single "process" or "thread"
>> (from hereon I'll use "task" as a generic phrase) that trying to use
>> it inside another causes a problem for the task attempting to breach
>> the model.
>>
>> In order to enforce this, I personally believe you'd need to use
>> multiple processes, and only share data through dedicated code
>> managing shared memory. You could of course do this outside user code.
>> To do this you'd need an abstraction that made sense, and something
>> like stackless' channels or kamaelia's (in/out) box model makes sense
>> there. (The CELL API uses a mailbox metaphor as well for reference)
>>
>> In that case, you have a choice. You either copy the data into shared
>> memory, or you share the data in situ. The former gives you back
>> precisely the same overhead previously described, and the latter
>> fragments your memory (since you can no longer access it). You could
>> also have compaction.
>>
>> However, personally, I think any possible benefits here are outweighed
>> by the costs and complexity.
>>
>> The alternative is to _encourage_ data locality. That is encourage the
>> usage and sharing of data such that whilst you could share data
>> between tasks and cause corruption that the common way of using the
>> system discourages such actions. In essence that's what I try to do in
>> Kamaelia, and it seems to work. Specifically, the model says:
>>
>> * If I take a piece of data from an inbox, I own it and can do anything
>> with it that I like. If you think of a physical piece of paper and
>> I take it from an intray, then that really is the case.
>>
>> * If I put a piece of data in an outbox, I no longer own it and should
>> not attempt to do anything more with it. Again, using a physical
>> metaphor, and naming scheme helps here. In particular, if I put a
>> piece of paper in the post, I can no longer modify it. How it gets
>> to its recipient is not my concern either.
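
The two rules above can be sketched with plain threads and a queue standing in for an outbox linked to an inbox. This is not Kamaelia code; producer, consumer and run are names invented for the example, and nothing here enforces the rules. The point is only that the convention is easy to follow when components are small.

```python
import queue
import threading


def producer(outbox, n):
    """Build n messages and post them; never touch a message after posting."""
    for i in range(n):
        frame = {"id": i, "pixels": [i] * 4}
        outbox.put(frame)  # posted: by convention we no longer own it
        del frame          # drop our reference so we cannot modify it later
    outbox.put(None)       # sentinel meaning "no more messages"


def consumer(inbox, results):
    """Take messages from the inbox; once taken, we own them outright."""
    while True:
        item = inbox.get()
        if item is None:
            break
        # We own the item now, so in-place modification is safe.
        item["pixels"] = [p * 2 for p in item["pixels"]]
        results.append(item)


def run(n=3):
    """Wire one producer to one consumer and return what was received."""
    box = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(box, n)),
        threading.Thread(target=consumer, args=(box, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Only a reference travels through the queue, so there is no copying cost; safety depends entirely on the sender keeping its side of the bargain.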
>>
>> In practice this does actually work. If you add in immutable tuples,
>> and immutable strings then it becomes a lot clearer how this can work.
>>
>> Is there a risk here of accidental modification? Yes. However, the
>> size and general simplicity of components tends to lead to such
>> problems being picked up early. It also enables component level
>> acceptance tests. (We tend to build small examples of usage, which in
>> turn effectively form acceptance tests)
>>
>> [ An alternative is to make the "send" primitive make a copy on send.
>> That would be quite an overhead, and also limit the types of data you
>> can send. ]
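
For comparison, a copy-on-send primitive might look like the sketch below. CopyingOutbox is a hypothetical class, not part of Kamaelia or Stackless. It shows both sides of the trade-off: the sender can keep mutating its own object safely, but every send pays for a deep copy, and objects that cannot be copied (open sockets or locks, for instance) could not be sent at all.

```python
import copy
import queue


class CopyingOutbox:
    """Hypothetical outbox that deep-copies every message on send."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, obj):
        # The receiver gets an independent copy, so later changes made
        # by the sender are invisible to it.
        self._queue.put(copy.deepcopy(obj))

    def recv(self):
        return self._queue.get()
```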
>>
>> In practical terms, it works. (Stackless proves this as well IMO,
>> since despite some differences, there's also lots of similarities)
>>
>> The other question that arises, is "isn't the GIL a problem with
>> threads?". Well, the answer to that really depends on what you're
>> doing. David Beazley's talk on what happens on mixing different sorts
>> of threads shows that it isn't ideal, and if you're hitting that
>> behaviour, then actually switching to real processes makes sense.
>> However if you're doing CPU intensive work inside a C extension which
>> releases the GIL (eg numpy), then it's less of an issue in practice.
>> Custom extensions can do the same.
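
A small sketch of threads doing their heavy work inside C code that releases the GIL. It uses hashlib from the standard library rather than numpy, purely so the example has no dependencies; CPython's hashlib is documented to release the GIL while hashing larger buffers. The function names and the worker count are arbitrary choices for the illustration, and no timing claims are made here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(buf):
    """Hash one buffer; the hashing itself runs in C."""
    return hashlib.sha256(buf).hexdigest()


def digest_all(buffers, workers=4):
    """Hash several buffers on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, buffers))
```

The threads share the buffers directly (no pickling, no pipes), which is the situation described above: data passed between threads by reference, with the expensive part done outside the interpreter lock.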
>>
>> So, for example, picking something which I know colleagues [1] at work
>> do, you can use a DVS broadcast capture card to capture video frames,
>> pass those between threads which are doing processing on them, and
>> inside those threads use c extensions to process the data efficiently
>> (since image processing does take time...), and those release the GIL
>> boosting throughput.
>>
>> [1] On this project :
>> http://www.bbc.co.uk/rd/projects/2009/10/i3dlive.shtml
>>
>> So, that makes it all sound great - ie things can, after various
>> fashions, run in parallel on various versions of python, to practical
>> benefit. But obviously it could be improved.
>>
>> Personally, I think the project most likely to make a difference here
>> is actually pypy. Now, talk is very cheap, and easy, and I'm not
>> likely to implement this, so I'll aim to be brief. Execution is hard.
>>
>> In particular, what I think is most likely to be beneficial is
>> something _like_ this:
>>
>> Assume pypy runs without a GIL. Then allow the creation of a green
>> process. A green process is implemented using threads, but with data
>> created on the heap such that it defaults to being marked private to
>> the thread (ie ala thread local storage, but perhaps implemented
>> slightly differently - via references from the thread local storage
>> into the heap) rather than shared. Sharing between green processes
>> (for channels or boxes) would "simply" be detagged as being owned by
>> one thread, and passed to another.
>>
>> In particular this would mean that you need a mechanism for doing
>> this. Simply attempting to call another green process (or thread) from
>> another with mutable data types would be sufficient to raise the
>> equivalent of a segmentation fault.
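
The tagging idea can be mimicked at the Python level, which may help when reasoning about it. This is only a user-space imitation, not how a PyPy runtime would implement it; Owned, OwnershipError, release and claim are hypothetical names. Data is tagged with the thread that owns it, access from any other thread raises an error (standing in for the "segmentation fault"), and handing data over means detagging it and letting the receiver claim it.

```python
import threading


class OwnershipError(RuntimeError):
    """Raised when a thread touches data it does not own."""


class Owned:
    """Hypothetical wrapper tagging a value with its owning thread."""

    def __init__(self, value):
        self._value = value
        self._owner = threading.get_ident()
        self._lock = threading.Lock()

    def get(self):
        if self._owner != threading.get_ident():
            raise OwnershipError("value is private to another thread")
        return self._value

    def release(self):
        """Detag the value; only the current owner may do this."""
        with self._lock:
            if self._owner != threading.get_ident():
                raise OwnershipError("only the owner can release")
            self._owner = None

    def claim(self):
        """Take ownership of a released value and return the wrapper."""
        with self._lock:
            if self._owner is not None:
                raise OwnershipError("value has not been released")
            self._owner = threading.get_ident()
        return self
```

In a runtime-level version the check would have to cover every access path to the object, which is a large part of why execution is harder than talk.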
>>
>> Secondly, improve cpyext to the extent that each cpython extension
>> gets its own version of the GIL. (ie each extension runs with its own
>> logical runtime, and thinks that it has its own GIL which it can lock
>> and release. In practice it's faked by the PyPy runtime.) This is
>> essentially similar conceptually to creating green processes.
>>
>> It's worth considering that the Linux kernel went through similar
>> changes, in that in the 2.0 days there was a large single big lock,
>> which was replaced by ever more granular locks. I personally think that
>> since there are so many extensions that rely on the existence of the
>> GIL simply waving a wand to get rid of it isn't likely. However
>> logically providing a GIL per C-Extension may be plausible, and _may_
>> be sufficient.
>>
>> However, I don't know - it might well not - I've not looked at the
>> code, and talk is cheap - execution is hard.
>>
>> Hopefully the above (cheap :) comments are in some small way useful.
>>
>> Regards,
>>
>>
>> Michael.
>

From kevinar18 at hotmail.com  Sun Aug  1 04:09:28 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Sat, 31 Jul 2010 22:09:28 -0400
Subject: [pypy-dev] FW: Would the following shared memory model be
 possible?
In-Reply-To: 
References: ,
	<20100727062702.GE12699@tunixman.com>,
	,
	,
	,
	,
	,
	,
	,
	,
	
Message-ID: 


> > I have no idea what I did to warrant your hateful replies towards me, but
> > they really are not appropriate (in public or private email).
>
> I had absolutely no intention of offending you, and am deeply sorry
> for any offense that I may have caused you.

I must admit, I'm rather surprised by your reply -- and also thank you.  I'm sorry for the trouble I caused you with this.  I had hoped for a good conversation about the issues related to Kamaelia, yet every time I got a reply back, it seemed like you were mad at me for some unknown reason.
 
As a simple example of what I mean.  In your first email, you mentioned a lot of different programming styles related to FBP and Kamaelia.  Since I am interested in parallel "research" I put those words into google and made a whole bookmark section so that I would have them for future study.  When I replied back, I figured that this would be a good way to lighten the mood in the email, so I thanked you for the info and asked for any more links/ideas you might want to mention.  A shared point of interest might be a good way to foster a nice friendly atmosphere.  Unfortunately, I am assuming you must have misunderstood me, because instead of stirring up a friendly interest, I received several paragraphs about me being inconsiderate (not searching google for something) and putting an undue burden on you.
 
At this point, it would be really unfair to talk about it further.  I guess to sum things up, I got the impression that you were mad at me for some unknown reason: it was like each successive email was going further and further downhill -- and I didn't know why.
However, in the end, I am glad that the whole situation could be resolved the way it has been.
 
 
> I'll bow out at this point.
I wouldn't want you to have to do that; your input can be very useful to people.
I apologized, you apologized....  Some stuff was cleared up, etc....  I don't think anybody here is holding a grudge or going to rehash the topic again (me and you included).
 
You have very specific knowledge related to Kamaelia that could be useful to people exploring micro-threading implementations, parallel computing, etc....
 
---------------
Now, to change the topic slightly (and hopefully in a positive way).
 
 
I'm not sure if it really matters to you, but I have been considering another possible way to make a parallel tasklet (like for FBP and Kamaelia) in PyPy... but I don't have 3+ months to spend ironing out the flaws, learning PyPy, writing an implementation, etc....   ... and to be honest, I would not feel comfortable asking someone else (here or otherwise) to try and make something for my benefit.
 
On another note... something that might actually interest you:  I have done some work on a graphical front-end for FBP ... nothing super special, mind you, but I could keep you informed in the future if it is something of interest to you.
 
 
 
Anyways, hope this email turns out on a positive note for you and everyone else.
Kevin
 
 
> > I have no idea what I did to warrant your hateful replies towards me, but
> > they really are not appropriate (in public or private email).
>
> I had absolutely no intention of offending you, and am deeply sorry
> for any offense that I may have caused you.
>
> In my reply I merely wanted to flag that I don't have time to go into
> everything (like most people), that asking questions in a public realm
> is better because you may then get answers from multiple people, and
> that people who appear to do some research first tend to get better
> answers. I also tried to give an example, but that doesn't appear to
> have been helpful. (I'm fallible like everyone else)
>
> My intention there was to be helpful and to explain why I have that
> view of only replying on list, and it appears to have offended you
> instead, and I apologise. (one person's direct and helpful speech in
> one place can be a mortal insult somewhere else)
>
> After those couple of paragraphs, I tried to add to your discussion by
> replying to your specific points which you asked about parallel
> execution, noting places and examples where it is possible today. (to
> varying degrees of satisfaction) I then also tried to answer your
> point of "if something extra could be done, what would probably be
> generally useful". To that I noted that *my* talk there was cheap, and
> that execution was hard.
>
> Somehow along the way, my intent to try to be helpful to you has
> resulted in offending and upsetting you, and for that I am truly sorry
> - life is simply too short for people to upset each other, and in no
> way was my post intended as "hateful", and once again, my apologies.
> In future please assume good intentions - I assumed good intentions on
> your part.
>
> I'll bow out at this point.
>
> Best Regards,
>
>
> Michael.

From holger at merlinux.eu  Sun Aug  1 13:50:29 2010
From: holger at merlinux.eu (holger krekel)
Date: Sun, 1 Aug 2010 13:50:29 +0200
Subject: [pypy-dev] py.test/debian and pypy issue
Message-ID: <20100801115029.GL1914@trillke.net>

Hi all, 

just for your information: if you are running a Debian-based system (e.g. Ubuntu 10.04)
and install "py.test" (codespeak-python-lib) from there you get the
9-month old py.test-1.1 which cannot run PyPy's trunk-test suite.  Solutions:

* uninstall the debian version.  install 'py' from PyPI with e.g. 
  "pip install py" or "easy_install py" - this should get you 
  the 1.3.3 version which should work fine. 

* uninstall the debian version, don't install any other and 
  then alias "py.test" to "trunk/pypy/py/bin/py.test" which 
  means you use the pypy-included py version, currently version 1.3.1
  which is also the version used in nightly test runs etc. 

sidenote: Fedora 13 ships 1.3.2 and Gentoo ships 1.3.3 so you
mostly only get the issues on debian-based systems, i guess. 

best,
holger

From glavoie at gmail.com  Sun Aug  1 22:04:29 2010
From: glavoie at gmail.com (Gabriel Lavoie)
Date: Sun, 1 Aug 2010 16:04:29 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	
	
Message-ID: 

Sorry for the late answer, I was unavailable in the last few days.

About send() and receive(), it depends on whether the communication is local or
not. For a local communication, anything can be passed since only the
reference is sent. This is the base model for Stackless channels. For a
remote communication (between two interpreters), any picklable object can be
sent (a copy will then be made), and this includes channels and tasklets (for
which a reference will automatically be created).

The use of the PyPy proxy object space is to make remote communication more
Stackless-like by passing objects by reference. If a ref_object is made, only
a reference will be passed when a tasklet is moved or the object is sent on
a channel. The object always resides where it was created. A move()
operation will also be implemented on those objects so they can be moved
around like tasklets.

I hope it helps,

Gabriel

2010/7/29 Kevin Ar18 

>
> > Hello Kevin,
> > I don't know if it can be a solution to your problem but for my
> > Master Thesis I'm working on making Stackless Python distributed. What
> > I did is working but not complete and I'm right now in the process of
> > writing the thesis (in french unfortunately). My code currently works
> > with PyPy's "stackless" module only and uses some PyPy specific
> > things. Here's what I added to Stackless:
> >
> > - Possibility to move tasklets easily (ref_tasklet.move(node_id)). A
> > node is an instance of an interpreter.
> > - Each tasklet has its global namespace (to avoid sharing of data). The
> > state is also easier to move to another interpreter this way.
> > - Distributed channels: All requests are known by all nodes using the
> > channel.
> > - Distributed objects: When a reference is sent to a remote node, the
> > object is not copied, a reference is created using PyPy's proxy object
> > space.
> > - Automated dependency recovery when an object or a tasklet is loaded
> > on another interpreter
> >
> > With a proper scheduler, many tasklets could be automatically spread in
> > multiple interpreters to use multiple cores or on multiple computers. A
> > bit like the N:M threading model where N lightweight threads/coroutines
> > can be executed on M threads.
>
> Was able to have a look at the API...
> If others don't mind my asking this on the mailing list:
>
> * .send() and .receive()
> What type of data can you send and receive between the tasklets?  Can you
> pass entire Python objects?
>
> * .send() and .receive() memory model
> When you send data between tasklets (pass messages) or whatever you want to
> call it, how is this implemented under the hood?  Does it use shared memory
> under the hood or does it involve a more costly copying of the data?  I
> realize that if it is on another machine you have to copy the data, but what
> about between two threads?  You mentioned PyPy's proxy object.... guess I'll
> need to read up on that.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>



-- 
Gabriel Lavoie
glavoie at gmail.com

From alex.gaynor at gmail.com  Mon Aug  2 04:11:15 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Sun, 1 Aug 2010 22:11:15 -0400
Subject: [pypy-dev] I broke stackless
Message-ID: 

The work I did changing CALL_METHOD to support keyword arguments moved
some rstack.resume_point calls around, and it seems to have
inadvertently broken stackless on trunk.  The latest translation failure
can be found here:
http://buildbot.pypy.org/builders/pypy-c-stackless-app-level-linux-x86-32/builds/597/steps/translate/logs/stdio.
 Anyone have a suggestion as to what exactly I need to do to get this
working?

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From benjamin at python.org  Mon Aug  2 04:52:34 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 1 Aug 2010 21:52:34 -0500
Subject: [pypy-dev] I broke stackless
In-Reply-To: 
References: 
Message-ID: 

2010/8/1 Alex Gaynor :
> The work I did changing CALL_METHOD to support keyword arguments moved
> some rstack.resume_point calls around, and it seems to have
> inadvertently broken stackless on trunk.  The latest translation failure
> can be found here:
> http://buildbot.pypy.org/builders/pypy-c-stackless-app-level-linux-x86-32/builds/597/steps/translate/logs/stdio.
>  Anyone have a suggestion as to what exactly I need to do to get this
> working?

Revert it! :)



-- 
Regards,
Benjamin

From todd.a.anderson at intel.com  Tue Aug  3 19:29:49 2010
From: todd.a.anderson at intel.com (Anderson, Todd A)
Date: Tue, 3 Aug 2010 10:29:49 -0700
Subject: [pypy-dev] Percentage Python as RPython.
Message-ID: <9662F248D13E8C45B097A77F005E9729B0C39B74@orsmsx503.amr.corp.intel.com>

Sorry if this has been asked before.  I did some searching of the archive and didn't see anything but I might have missed it.

I am curious what percentage of real-world Python programs in use are also RPython programs.  I know that the FAQ says that the translator is not intended for Python programs in general but only for the PyPy interpreter itself but I've also seen a few mentions (on other sites) of attempting to translate Python to C.  I've been thinking about adding a backend to the translator but would only want to do so if a significant amount of Python programs could use it.

thanks,

Todd

From fijall at gmail.com  Tue Aug  3 20:52:54 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 3 Aug 2010 20:52:54 +0200
Subject: [pypy-dev] Percentage Python as RPython.
In-Reply-To: <9662F248D13E8C45B097A77F005E9729B0C39B74@orsmsx503.amr.corp.intel.com>
References: <9662F248D13E8C45B097A77F005E9729B0C39B74@orsmsx503.amr.corp.intel.com>
Message-ID: 

On Tue, Aug 3, 2010 at 7:29 PM, Anderson, Todd A
 wrote:
> Sorry if this has been asked before.  I did some searching of the archive
> and didn't see anything but I might have missed it.
>
>
>
> I am curious what percentage of real-world Python programs in use are also
> RPython programs.  I know that the FAQ says that the translator is not
> intended for Python programs in general but only for the PyPy interpreter
> itself but I've also seen a few mentions (on other sites) of attempting to
> translate Python to C.  I've been thinking about adding a backend to the
> translator but would only want to do so if a significant amount of Python
> programs could use it.
>

0 - 0.5% (generally, none. You write programs for RPython in a
different manner).

>
>
> thanks,
>
>
>
> Todd
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From ademan555 at gmail.com  Tue Aug  3 22:08:07 2010
From: ademan555 at gmail.com (Dan Roberts)
Date: Tue, 3 Aug 2010 13:08:07 -0700
Subject: [pypy-dev] Percentage Python as RPython.
In-Reply-To: <9662F248D13E8C45B097A77F005E9729B0C39B74@orsmsx503.amr.corp.intel.com>
References: <9662F248D13E8C45B097A77F005E9729B0C39B74@orsmsx503.amr.corp.intel.com>
Message-ID: 

Hi Todd,
I'm not sure what your goals are, but my position is that if you write a
translator backend and a JIT backend (please do) you can have fast (and
improving) python on platform X.  What were you hoping to target with your
backend?
Cheers,
Dan

On Aug 3, 2010 10:43 AM, "Anderson, Todd A" 
wrote:

 Sorry if this has been asked before.  I did some searching of the archive
and didn't see anything but I might have missed it.



I am curious what percentage of real-world Python programs in use are also
RPython programs.  I know that the FAQ says that the translator is not
intended for Python programs in general but only for the PyPy interpreter
itself but I've also seen a few mentions (on other sites) of attempting to
translate Python to C.  I've been thinking about adding a backend to the
translator but would only want to do so if a significant amount of Python
programs could use it.



thanks,



Todd

_______________________________________________
pypy-dev at codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev

From bhartsho at yahoo.com  Wed Aug  4 10:21:10 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Wed, 4 Aug 2010 01:21:10 -0700 (PDT)
Subject: [pypy-dev] demoting method, cannot follow, call result degenerated
Message-ID: <267005.44211.qm@web114012.mail.gq1.yahoo.com>

I'm still struggling to learn all the rules of RPython, i have read the coding guide, and the PDF's PyGirl and Ancona's RPython paper, but still i feel i'm not fully grasping everything.

I have a function that returns different classes that all share a common base class.  It works until i introduce a new subclass that has some methods of the same name.  Then i get the demotion, can not follow, degenerated error.

I googled, but all i can find is an IRC log where Fijal seems to be talking about my problem.
http://www.tismer.com/pypy/irc-logs/pypy/%23pypy.log.20070125

 pedronis: if function can return (in rpython) set of classes with common superclass, than all methods that I call later must be defined on that superclass, right?

[11:30]  [15:01]  yes, unless you assert a specific subclass 

So i just need to use an assert statement before the function return, and assert the class i am returning?

I am blogging about my progress while learning RPython, i have posted about meta-programming in Rpython which is a new concept to me.

http://pyppet.blogspot.com/2010/08/meta-programming-in-rpython.html

-brett




From fijall at gmail.com  Wed Aug  4 10:25:59 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 4 Aug 2010 10:25:59 +0200
Subject: [pypy-dev] demoting method, cannot follow,
	call result 	degenerated
In-Reply-To: <267005.44211.qm@web114012.mail.gq1.yahoo.com>
References: <267005.44211.qm@web114012.mail.gq1.yahoo.com>
Message-ID: 

Hey.

If at any place in code you want to call methods on a thing that can't
be proven to be of a specific subclass, they have to be defined on a
superclass (even dummy versions).

If you are however sure that this object will be of a specific subclass, write:
assert isinstance(x, MySubclass)
x.specific_method

that's fine
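
A minimal sketch of both options. The class and function names (Base, Mesh,
Lamp, make, location_of, vertices_of) are made up for illustration and are not
from the original code. It runs as ordinary Python; the point is only the shape
of the code that the RPython annotator should accept: a dummy method on the
superclass, or an isinstance assert before a subclass-only call.

```python
class Base(object):
    def get_location(self):
        # Dummy version on the superclass, so the call is valid on any Base.
        return (0.0, 0.0, 0.0)


class Mesh(Base):
    def get_location(self):
        return (1.0, 2.0, 3.0)

    def vertex_count(self):
        # Defined only on Mesh, not on Base.
        return 8


class Lamp(Base):
    pass


def make(kind):
    # Returns instances of different subclasses that share Base.
    if kind == 'mesh':
        return Mesh()
    return Lamp()


def location_of(kind):
    x = make(kind)
    # Fine: get_location is defined on the common superclass.
    return x.get_location()


def vertices_of(kind):
    x = make(kind)
    # Narrow the type before calling a method that only Mesh defines.
    assert isinstance(x, Mesh)
    return x.vertex_count()
```

Note that the assert goes where the subclass-only method is called, not before
the return of the function that creates the object.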

On Wed, Aug 4, 2010 at 10:21 AM, Hart's Antler  wrote:
> I'm still struggling to learn all the rules of RPython, i have read the coding guide, and the PDF's PyGirl and Ancona's RPython paper, but still i feel i'm not fully grasping everything.
>
> I have a function that returns different classes that all share a common base class.  It works until i introduce a new subclass that has some methods of the same name.  Then i get the demotion, can not follow, degenerated error.
>
> I googled, but all i can find is an IRC log where Fijal seems to be talking about my problem.
> http://www.tismer.com/pypy/irc-logs/pypy/%23pypy.log.20070125
>
>  pedronis: if function can return (in rpython) set of classes with common superclass, than all methods that I call later must be defined on that superclass, right?
>
> [11:30]  [15:01]  yes, unless you assert a specific subclass
>
> So i just need to use an assert statement before the function return, and assert the class i am returning?
>
> I am blogging about my progress while learning RPython, i have posted about meta-programming in Rpython which is a new concept to me.
>
> http://pyppet.blogspot.com/2010/08/meta-programming-in-rpython.html
>
> -brett
>
>
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From bhartsho at yahoo.com  Thu Aug  5 03:04:00 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Wed, 4 Aug 2010 18:04:00 -0700 (PDT)
Subject: [pypy-dev] Percentage Python as RPython.
Message-ID: <805862.40025.qm@web114009.mail.gq1.yahoo.com>

Todd,
I think you want a frontend, not a backend.  The frontend would take in normal Python and convert it to RPython.  RPython seems to have infinite meta-programming possibilities, so it's just a matter of how hard it would be to make the meta frontend.  Probably too hard since Python is so dynamic, but maybe it's possible with a new subset of Python halfway to RPython; developers would then only have to port to Not-So-Restricted-Python, and then the frontend does the final job of converting to RPython.
-brett




From bhartsho at yahoo.com  Thu Aug  5 03:14:10 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Wed, 4 Aug 2010 18:14:10 -0700 (PDT)
Subject: [pypy-dev] demoting method, cannot follow,
	call result  degenerated
In-Reply-To: 
Message-ID: <836035.68548.qm@web114016.mail.gq1.yahoo.com>

Thanks for clarifying Fijal, putting dummy functions on the base class fixes the demotion errors.

But now i have a new problem, from the bookkeeper, unpackiterable.
pypy.annotation.bookkeeper.CallPatternTooComplex': '*' argument must be SomeTuple
	.. v2 = call_args(v0, ((0, (), True, False)), v1)
	.. '(rbpy:1)BPY_Object_MESH.GET_location'

I checked the object, instead of SomeTuple it is SomeObject.
I'm trying to understand what causes the CallPatternTooComplex error, i can not reproduce it with a simple model that is close to what my actual code is doing.

class T(object):
	def hi( self, *args ): pass
class TA( T ):
	def hi( self, a,b,c ): pass
class TB( T ):
	def hi( self, y ): pass

def pypy_entrypoint():
	t = T()
	ta = TA()
	tb = TB()
	ta.hi(1,2,'x')
	tb.hi()
	tb.hi('xxx')
	print 'too complex test'

the above translates just fine, no TooComplex error.
-brett


--- On Wed, 8/4/10, Maciej Fijalkowski  wrote:

> From: Maciej Fijalkowski 
> Subject: Re: [pypy-dev] demoting method, cannot follow, call result  degenerated
> To: "Hart's Antler" 
> Cc: pypy-dev at codespeak.net
> Date: Wednesday, 4 August, 2010, 1:25 AM
> Hey.
> 
> If at any place in code you want to call methods on a thing
> that can't
> be proven to be of a specific subclass, they have to be
> defined on a
> superclass (even dummy versions).
> 
> If you are however sure that this object will be of a
> specific subclass, write:
> assert isinstance(x, MySubclass)
> x.specific_method
> 
> that's fine
> 
> On Wed, Aug 4, 2010 at 10:21 AM, Hart's Antler 
> wrote:
> > I'm still struggling to learn all the rules of
> RPython, i have read the coding guide, and the PDF's PyGirl
> and Ancona's RPython paper, but still i feel i'm not fully
> grasping everything.
> >
> > I have a function that returns different classes that
> all share a common base class.  It works until i introduce
> a new subclass that has some methods of the same name.
>  Then i get the demotion, can not follow, degenerated
> error.
> >
> > I googled, but all i can find is an IRC log where
> Fijal seems to be talking about my problem.
> > http://www.tismer.com/pypy/irc-logs/pypy/%23pypy.log.20070125
> >
> >  pedronis: if function can return (in
> rpython) set of classes with common superclass, than all
> methods that I call later must be defined on that
> superclass, right?
> >
> > [11:30]  [15:01]  yes,
> unless you assert a specific subclass
> >
> > So i just need to use an assert statement before the
> function return, and assert the class i am returning?
> >
> > I am blogging about my progress while learning
> RPython, i have posted about meta-programming in Rpython
> which is a new concept to me.
> >
> > http://pyppet.blogspot.com/2010/08/meta-programming-in-rpython.html
> >
> > -brett
> >
> >
> >
> > _______________________________________________
> > pypy-dev at codespeak.net
> > http://codespeak.net/mailman/listinfo/pypy-dev
> >
> 




From kevinar18 at hotmail.com  Fri Aug  6 04:30:27 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 5 Aug 2010 22:30:27 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: ,
	,
	,
	
Message-ID: 


Note: Gabriel, do you think we should discuss this on another mailing list (or in private) as I'm not sure this is related to PyPy dev anymore?
 
 
Anyway, what are your future plans for the project?
Is it just an experiment for school ... maybe in the hopes that others would maintain it if it was found to be interesting?
...
are you planning actual future development, maintenance, promotion of it yourself?


-----------

On a personal note... the concept has a lot of similarities to what I am exploring. However, I would have to make so many additional modifications. Perhaps you can give some thoughts on whether it would take me a long time to add such things?

Some examples:

* Two additional message passing styles (in addition to your own)
Queues - multiple tasklets can push onto queue, only one tasklet can pop.... multiple tasklets can access the property to find out if there is any data in the queue. Queues can be set to an infinite size or set with a max # of entries allowed.

Streams - I'm not sure of the exact name, but kind of like an infinite stream/buffer ... useful for passing infinite amounts of data. Only one tasklet can write/add data. Only one tasklet can read/extract data.


* Message passing
When you create a tasklet, you assign a set number of queues or streams to it (it can have many) and whether they extract data from them or write to them (they can only either extract or write to it as noted above). The tasklet's global namespace has access to these queues or streams and can extract or add data to them.

In my case, I look at message passing from the perspective of the tasklet. A tasklet can either be assigned a certain number of "in ports" and a certain number of "out ports." In this case the "in ports" are the .read() end of a queue or stream and the "out ports" are the .send() part of a queue or stream.


* Scheduler
For the scheduler, I would need to control when a tasklet runs. Currently, I am thinking that I would look at all the "in ports" that a tasklet has and make sure each one has some data. Only then would the tasklet be scheduled to run by the scheduler.



------------
On another note, I am curious how you handled the issue of "nested" objects. Consider send() and receive() that you use to pass objects around in your project. Am I correct in that these objects cannot contain references outside of themselves? Also, how do you handle extracting out of the tree and making sure there are not references outside the object?

For example, consider the following object, where "->" means it has a reference to that object

Object 1 -> Object 2

Object 2 -> Object 3
Object 2 -> Object 4

Object 4 -> Object 2


Now, let's say I have a tasklet like the following:

.... -> incoming data = pointer/reference to Object 1

1. read incoming data (get Object 1 reference)
2. remove Object 3
3. send Object 3 to tasklet B
4. send Object 1 to tasklet C

Result:
tasklet B now has this object:
pointer/reference to Object 1, which contains the following tree:
Object 1 -> Object 2
Object 2 -> Object 4
Object 4 -> Object 2


tasklet C now has this object:
pointer/reference to Object 3, which contains the following tree:
Object 3



On the other hand, consider the following scenario:
 
1. read incoming data (get Object 1 reference)
2. remove Object 4
ERROR: this would not be possible, as it refers to Object 2
 

> Sorry for the late answer, I was unavailable in the last few days.
>
> About send() and receive(), it depends on if the communication is local
> or not. For a local communication, anything can be passed since only
> the reference is sent. This is the base model for Stackless channels.
> For a remote communication (between two interpreters), any picklable
> object (a copy will then be made) and it includes channels and tasklets
> (for which a reference will automatically be created).
>
> The use of the PyPy proxy object space is to make remote communication
> more Stackless like by passing object by reference. If a ref_object is
> made, only a reference will be passed when a tasklet is moved or the
> object is sent on a channel. The object always resides where it was
> created. A move() operation will also be implemented on those objects
> so they can be moved around like tasklets.
>
> I hope it helps,
>
> Gabriel
>
> 2010/7/29 Kevin Ar18>
>
>> Hello Kevin,
>> I don't know if it can be a solution to your problem but for my
>> Master Thesis I'm working on making Stackless Python distributed. What
>> I did is working but not complete and I'm right now in the process of
>> writing the thesis (in french unfortunately). My code currently works
>> with PyPy's "stackless" module only and uses some PyPy specific
>> things. Here's what I added to Stackless:
>>
>> - Possibility to move tasklets easily (ref_tasklet.move(node_id)). A
>> node is an instance of an interpreter.
>> - Each tasklet has its global namespace (to avoid sharing of data). The
>> state is also easier to move to another interpreter this way.
>> - Distributed channels: All requests are known by all nodes using the
>> channel.
>> - Distributed objects: When a reference is sent to a remote node, the
>> object is not copied, a reference is created using PyPy's proxy object
>> space.
>> - Automated dependency recovery when an object or a tasklet is loaded
>> on another interpreter
>>
>> With a proper scheduler, many tasklets could be automatically spread in
>> multiple interpreters to use multiple cores or on multiple computers. A
>> bit like the N:M threading model where N lightweight threads/coroutines
>> can be executed on M threads.
>
> Was able to have a look at the API...
> If others don't mind my asking this on the mailing list:
>
> * .send() and .receive()
> What type of data can you send and receive between the tasklets? Can
> you pass entire Python objects?
>
> * .send() and .receive() memory model
> When you send data between tasklets (pass messages) or whatever you want
> to call it, how is this implemented under the hood? Does it use shared
> memory under the hood or does it involve a more costly copying of the
> data? I realize that if it is on another machine you have to copy the
> data, but what about between two threads? You mentioned PyPy's proxy
> object.... guess I'll need to read up on that.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>
>
>
> --
> Gabriel Lavoie
> glavoie at gmail.com 		 	   		  

From glavoie at gmail.com  Fri Aug  6 05:31:15 2010
From: glavoie at gmail.com (Gabriel Lavoie)
Date: Thu, 5 Aug 2010 23:31:15 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
	message passing?
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

I don't mind replying to the mailing list unless it annoys someone? Maybe
some people could be interested by this discussion.

You have a lot of questions! :) My answers are inline.

2010/8/5 Kevin Ar18 

>
> Note: Gabriel, do you think we should discuss this on another mailing list
> (or in private) as I'm not sure this is related to PyPy dev anymore?
>
>
> Anyway, what are your future plans for the project?
> Is it just an experiment for school ... maybe in the hopes that others
> would maintain it if it was found to be interesting?
> ...
> are you planning actual future development, maintenance, promotion of it
> yourself?
>

Based on the interest and time I and other people will have, I plan to
debug this as much as possible. If people are interested in joining in after my
thesis, I'll be more than happy to welcome them to the project. Right now,
I'm writing my report and I'm also looking for a job. I won't have much time
to touch the code again before next month, when I'll prepare it for my
presentation, along with a lot of examples and use cases.


>
>
> -----------
>
> On a personal note... the concept has a lot of similarities to what I am
> exploring. However, I would have to make so many additional modifications.
> Perhaps you can give some thoughts on whether it would take me a long time
> to add such things?
>

All right, my plan was to make all the needed lower level constructs that can
be used to build more complex things. For example, a mix of tasklets and sync
channels could be wrapped in an API to create async channels. I know this is
far from complete and I have a few ideas on how it could be improved in the
future but it's currently not needed for my project.

For now, the idea was to stay as close as possible to standard Stackless
Python and only add the needed APIs and functionalities to support
distributing tasklets between multiple interpreters.


>
> Some examples:
>
> * Two additional message passing styles (in addition to your own)
> Queues - multiple tasklets can push onto queue, only one tasklet can
> pop.... multiple tasklets can access the property to find out if there is
> any data in the queue. Queues can be set to an infinite size or set with a max
> # of entries allowed.


This could easily be implemented using a standard channel and by starting
multiple tasklets to send data. With some helper methods on a channel it
could be possible to know how many tasklets are waiting to send their data.
A channel already has a built-in queue for send/receive requests. This
queue contains a list of all tasklets waiting for a send/receive operation.
Tasklets are supposed to be lightweight enough to support something like
this.
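
As a rough illustration of that pattern (several senders, a single receiver, a
bounded number of pending entries), here is a sketch that uses threads and the
standard library queue as stand-ins for tasklets and a channel. It does not use
the stackless module, and run_queue_demo and its parameters are made-up names,
so treat it as a model of the behaviour rather than the actual implementation.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def run_queue_demo(n_producers=3, items_each=2, max_entries=4):
    # queue.Queue stands in for a channel with a bounded number of entries;
    # threads stand in for tasklets.
    q = queue.Queue(maxsize=max_entries)

    def producer(pid):
        for i in range(items_each):
            q.put((pid, i))   # blocks while the queue is full

    producers = [threading.Thread(target=producer, args=(p,))
                 for p in range(n_producers)]
    for t in producers:
        t.start()

    # Single consumer: pops exactly as many items as were produced.
    received = [q.get() for _ in range(n_producers * items_each)]

    for t in producers:
        t.join()
    # Arrival order depends on scheduling, so sort for a stable result.
    return sorted(received)
```

In this stand-in, q.qsize() plays the role of the helper method that tells how
much data is waiting.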


>

> Streams - I'm not sure of the exact name, but kind of like an infinite
> stream/buffer ... useful for passing infinite amounts of data. Only one
> tasklet can write/add data. Only one tasklet can read/extract data.
>

Like a UNIX pipe()? Async? Again, some code wrapping standard channels could
be used for this.


>
>
> * Message passing
> When you create a tasklet, you assign a set number of queues or streams to
> it (it can have many) and whether they extract data from them or write to
> them (they can only either extract or write to it as noted above). The
> tasklet's global namespace has access to these queues or streams and can
> extract or add data to them.
>
> In my case, I look at message passing from the perspective of the tasklet.
> A tasklet can either be assigned a certain number of "in ports" and a
> certain number of "out ports." In this case the "in ports" are the .read()
> end of a queue or stream and the "out ports" are the .send() part of a queue
> or stream.
>
>
Sorry, I don't really understand what you're trying to explain here. Maybe
an example could be helpful? :)


>
> * Scheduler
> For the scheduler, I would need to control when a tasklet runs. Currently,
> I am thinking that I would look at all the "in ports" that a tasklet has and
> make sure each one has some data. Only then would the tasklet be scheduled
> to run by the scheduler.
>
>
Couldn't all those ports (channels) be read one at a time, and then the
processing be done? I don't really see the need to play with the
scheduler. Channels are blocking. A tasklet will be unscheduled anyway when
it tries to read from a channel on which no data is available.
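
To make the idea concrete, here is a small sketch, again with a thread and
standard library queues standing in for a tasklet and its channels (adder and
run_ports_demo are made-up names, not part of any Stackless API). The worker
simply does one blocking read per "in port"; it cannot proceed until every
port has delivered something, without any special scheduler support.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def adder(in_a, in_b, out):
    # One blocking read per in port, one after the other. The worker is
    # suspended on each get() until data is available on that port.
    a = in_a.get()
    b = in_b.get()
    out.put(a + b)


def run_ports_demo():
    in_a, in_b, out = queue.Queue(), queue.Queue(), queue.Queue()
    worker = threading.Thread(target=adder, args=(in_a, in_b, out))
    worker.start()

    in_b.put(40)   # arrives first, but adder is still waiting on in_a
    in_a.put(2)    # now both ports have data and adder can finish

    result = out.get()
    worker.join()
    return result
```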


>
>
> ------------
> On another note, I am curious how you handled the issue of "nested"
> objects. Consider send() and receive() that you use to pass objects around
> in your project. Am I correct in that these objects cannot contain
> references outside of themselves? Also, how do you handle extracting out of
> the tree and making sure there are not references outside the object?
>

Right now, I have not really dug too far into this problem. With a local
communication, a reference to the object is sent through a channel. The
receiver tasklet will have the same access to the object and all its
sub-objects as the sender tasklet.

For remote communications, pickling is involved. The object to send must be
picklable. This excludes any I/O object unless the programmer creates their own
pickling protocol for those. A copy of the whole object tree will then be
made. Sometimes that's good (small objects), sometimes it's bad (really complex,
big objects, I/O objects, etc.). This is why I added the concept of
ref_object() using PyPy's proxy object space. For such objects, a proxy can
be made and only a reference object will be sent to the remote side. This
object will have the same type as the original object but all operations
will be forwarded to the host node. All replies will also be wrapped by
proxies when sent back to the remote reference object. The only case where a
proxy object is not created is with atomic types (string, int, float, etc).
It's useless for those because they are immutable anyway. A remote access to
those would introduce useless latency. With ref_object(), the object tree
always stays on the initial node. A move() operation will also be added to
those ref_object()s to be able to move them between interpreters if needed.
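
The difference between the local and the remote case can be shown with the
standard pickle module alone. This is only a sketch: send_local and send_remote
are made-up helpers that mimic what happens to the object, not the real channel
API, and plain dicts stand in for the objects of the example tree (Object 1 ->
Object 2, Object 2 -> Object 3 and Object 4, Object 4 -> Object 2).

```python
import pickle


def build_tree():
    # Plain dicts stand in for the objects of the example tree.
    objs = dict((i, {'name': 'Object %d' % i, 'refs': []})
                for i in range(1, 5))
    objs[1]['refs'].append(objs[2])
    objs[2]['refs'].append(objs[3])
    objs[2]['refs'].append(objs[4])
    objs[4]['refs'].append(objs[2])   # cycle back to Object 2
    return objs[1]


def send_local(obj):
    # Local communication: only the reference travels, nothing is copied.
    return obj


def send_remote(obj):
    # Remote communication: the whole tree reachable from obj is pickled
    # and rebuilt on the other side, so the receiver gets a copy.
    return pickle.loads(pickle.dumps(obj))
```

With send_local the receiver sees later changes made by the sender; with
send_remote it gets an independent copy of the whole tree, with the cycle
between Object 2 and Object 4 preserved inside the copy.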


>
> For example, consider the following object, where "->" means it has a
> reference to that object
>
> Object 1 -> Object 2
>
> Object 2 -> Object 3

> Object 2 -> Object 4


> Object 4 -> Object 2
>
>
> Now, let's say I have a tasklet like the following:
>
> .... -> incoming data = pointer/reference to Object 1
>
> 1. read incoming data (get Object 1 reference)
> 2. remove Object 3
> 3. send Object 3 to tasklet B
> 4. send Object 1 to tasklet C
>
> Result:
> tasklet B now has this object:
> pointer/reference to Object 1, which contains the following tree:

> Object 1 -> Object 2
> Object 2 -> Object 4
> Object 4 -> Object 2
>
>
> tasklet C now has this object:
> pointer/reference to Object 3, which contains the following tree:
> Object 3
>
>
I think you swapped tasklet B and tasklet C for the end result! ;)


>
>
> On the other hand, consider the following scenario:
>
> 1. read incoming data (get Object 1 reference)
> 2. remove Object 4
> ERROR: this would not be possible, as it refers to Object 2
>

Why isn't it possible?
By removing "Object 4" I guess you mean removing this link: Object 2 ->
Object 4? This is the only way Object 4 could be removed.


>
> > Sorry for the late answer, I was unavailable in the last few days.
> >
> > About send() and receive(), it depends on if the communication is local
> > or not. For a local communication, anything can be passed since only
> > the reference is sent. This is the base model for Stackless channels.
> > For a remote communication (between two interpreters), any picklable
> > object (a copy will then be made) and it includes channels and tasklets
> > (for which a reference will automatically be created).
> >
> > The use of the PyPy proxy object space is to make remote communication
> > more Stackless like by passing object by reference. If a ref_object is
> > made, only a reference will be passed when a tasklet is moved or the
> > object is sent on a channel. The object always resides where it was
> > created. A move() operation will also be implemented on those objects
> > so they can be moved around like tasklets.
> >
> > I hope it helps,
> >
> > Gabriel
> >
> > 2010/7/29 Kevin Ar18>
> >
> >> Hello Kevin,
> >> I don't know if it can be a solution to your problem but for my
> >> Master Thesis I'm working on making Stackless Python distributed. What
> >> I did is working but not complete and I'm right now in the process of
> >> writing the thesis (in French unfortunately). My code currently works
> >> with PyPy's "stackless" module only and uses some PyPy specific
> >> things. Here's what I added to Stackless:
> >>
> >> - Possibility to move tasklets easily (ref_tasklet.move(node_id)). A
> >> node is an instance of an interpreter.
> >> - Each tasklet has its global namespace (to avoid sharing of data). The
> >> state is also easier to move to another interpreter this way.
> >> - Distributed channels: All requests are known by all nodes using the
> >> channel.
> >> - Distributed objects: When a reference is sent to a remote node, the
> >> object is not copied, a reference is created using PyPy's proxy object
> >> space.
> >> - Automated dependency recovery when an object or a tasklet is loaded
> >> on another interpreter
> >>
> >> With a proper scheduler, many tasklets could be automatically spread in
> >> multiple interpreters to use multiple cores or on multiple computers. A
> >> bit like the N:M threading model where N lightweight threads/coroutines
> >> can be executed on M threads.
> >
> > Was able to have a look at the API...
> > If others don't mind my asking this on the mailing list:
> >
> > * .send() and .receive()
> > What type of data can you send and receive between the tasklets? Can
> > you pass entire Python objects?
> >
> > * .send() and .receive() memory model
> > When you send data between tasklets (pass messages) or whatever you want
> > to call it, how is this implemented under the hood? Does it use shared
> > memory under the hood or does it involve a more costly copying of the
> > data? I realize that if it is on another machine you have to copy the
> > data, but what about between two threads? You mentioned PyPy's proxy
> > object.... guess I'll need to read up on that.
> > _______________________________________________
> > pypy-dev at codespeak.net
> > http://codespeak.net/mailman/listinfo/pypy-dev
> >
> >
> >
> > --
> > Gabriel Lavoie
> > glavoie at gmail.com
>
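
The local versus remote distinction quoted above can be illustrated with a small sketch. It uses only the standard pickle module; send_local() and send_remote() are hypothetical stand-ins for a channel, not the Stackless or PyPy API, and the sketch ignores the proxy object space entirely.

```python
import pickle


def send_local(obj):
    """Local case: only the reference is passed, so both sides share the object."""
    return obj


def send_remote(obj):
    """Remote case (assumed): the object is pickled, so the receiver gets a copy."""
    return pickle.loads(pickle.dumps(obj))


original = {"payload": [1, 2, 3]}
local_view = send_local(original)
remote_view = send_remote(original)

# A change made through the local reference is visible to the sender,
# while the remote copy stays as it was when it was pickled.
local_view["payload"].append(4)
```

After the append, original and local_view are the same object and both contain the new item, while remote_view still holds the three original values.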

By the way, if you come to #pypy on FreeNode, I'm WildChild! I'm always
there though not always available. I'm in the EST timezone (UTC-5).

See ya,

Gabriel

-- 
Gabriel Lavoie
glavoie at gmail.com

From bhartsho at yahoo.com  Tue Aug 10 08:32:31 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Mon, 9 Aug 2010 23:32:31 -0700 (PDT)
Subject: [pypy-dev] rstruct where is pack?
Message-ID: <250182.36782.qm@web114016.mail.gq1.yahoo.com>

Seems like struct.pack is not RPython?  I see the examples for unpack in the tests folder, but not for packing.




From benjamin at python.org  Tue Aug 10 15:14:04 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 10 Aug 2010 08:14:04 -0500
Subject: [pypy-dev] rstruct where is pack?
In-Reply-To: <250182.36782.qm@web114016.mail.gq1.yahoo.com>
References: <250182.36782.qm@web114016.mail.gq1.yahoo.com>
Message-ID: 

2010/8/10 Hart's Antler :
> Seems like struct.pack is not RPython?  I see the examples for unpack in the tests folder, but not for packing.

struct.pack() is implemented in pypy/module/rstruct/.



-- 
Regards,
Benjamin

From anto.cuni at gmail.com  Tue Aug 10 15:32:29 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Tue, 10 Aug 2010 15:32:29 +0200
Subject: [pypy-dev] rstruct where is pack?
In-Reply-To: 
References: <250182.36782.qm@web114016.mail.gq1.yahoo.com>
	
Message-ID: <4C6154ED.1070904@gmail.com>

On 10/08/10 15:14, Benjamin Peterson wrote:
> 2010/8/10 Hart's Antler :
>> Seems like struct.pack is not RPython?  I see the examples for unpack in the tests folder, but not for packing.
> 
> struct.pack() is implemented in pypy/module/rstruct/.

I suppose you mean pypy/module/struct.

But if the OP is looking for an rpython lib to use in his rpython program,
this is not exactly what he looks for, although I agree it could be adapted
and ported to rlib.

ciao,
Anto

From bhartsho at yahoo.com  Wed Aug 11 03:08:51 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Tue, 10 Aug 2010 18:08:51 -0700 (PDT)
Subject: [pypy-dev] rstruct where is pack?
In-Reply-To: <4C6154ED.1070904@gmail.com>
Message-ID: <951074.37494.qm@web114002.mail.gq1.yahoo.com>

I have made an RPython replacement for struct pack/unpack that could go in rlib.  It is not a drop-in replacement, and for some reason I can't get long to work, but for simple packing and unpacking it will work.  I posted the code on my blog in case anybody ever runs into the same problem:
http://pyppet.blogspot.com/2010/08/rpython-struct.html
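
To give a rough idea of the approach without following the link, here is a minimal sketch. It is not the code from the blog post; it only handles one format (unsigned 32-bit, little-endian), and the function names are made up for illustration. It sticks to simple loops, chr() and ord(), which I assume are within what RPython accepts.

```python
def pack_uint32_le(value):
    """Pack an unsigned 32-bit integer into a 4-character little-endian string."""
    chars = []
    for shift in (0, 8, 16, 24):
        chars.append(chr((value >> shift) & 0xFF))
    return "".join(chars)


def unpack_uint32_le(data):
    """Inverse of pack_uint32_le: read 4 characters back into an integer."""
    result = 0
    for index in range(4):
        result |= ord(data[index]) << (8 * index)
    return result


packed = pack_uint32_le(0x12345678)
roundtrip = unpack_uint32_le(packed)
```

The packed value is a character string with the low byte first, so it corresponds to what struct.pack("<I", value) produces byte for byte.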


--- On Tue, 8/10/10, Antonio Cuni  wrote:

> From: Antonio Cuni 
> Subject: Re: [pypy-dev] rstruct where is pack?
> To: "Benjamin Peterson" 
> Cc: "Hart's Antler" , pypy-dev at codespeak.net
> Date: Tuesday, 10 August, 2010, 6:32 AM
> On 10/08/10 15:14, Benjamin Peterson
> wrote:
> > 2010/8/10 Hart's Antler :
> >> Seems like struct.pack is not RPython?  I see the examples for unpack
> >> in the tests folder, but not for packing.
> > 
> > struct.pack() is implemented in pypy/module/rstruct/.
> 
> I suppose you mean pypy/module/struct.
> 
> But if the OP is looking for an rpython lib to use in his
> rpython program,
> this is not exactly what he looks for, although I agree it
> could be adapted
> and ported to rlib.
> 
> ciao,
> Anto
> 




From arigo at tunes.org  Wed Aug 11 14:20:31 2010
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 11 Aug 2010 14:20:31 +0200
Subject: [pypy-dev] I broke stackless
In-Reply-To: 
References: 
	
Message-ID: <20100811122031.GA2733@code0.codespeak.net>

Hi,

For reference, after IRC discussions I fixed it in r76475.


Armin

From arigo at tunes.org  Wed Aug 11 14:24:20 2010
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 11 Aug 2010 14:24:20 +0200
Subject: [pypy-dev] Percentage Python as RPython.
In-Reply-To: <805862.40025.qm@web114009.mail.gq1.yahoo.com>
References: <805862.40025.qm@web114009.mail.gq1.yahoo.com>
Message-ID: <20100811122420.GB2733@code0.codespeak.net>

Hi Hart,

On Wed, Aug 04, 2010 at 06:04:00PM -0700, Hart's Antler wrote:
> I think you want a frontend, not a backend.  The frontend would take
> in normal Python and convert it to RPython.

I think the chances of getting this to work are "0 - 0.5 %", as per
fijal's previous excellent answer.

Writing in RPython requires a different state of mind than writing in
normal Python (unless, maybe, you are a Java programmer that writes Java
with the Python syntax; for that case, I would suggest that writing in
Java in the first place is just as easy).
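
A small illustration of the difference, as a sketch only: both functions below run fine as normal Python, but as far as I know only the first one is in the style that the RPython type inference accepts, because every variable keeps a single type. The function names are made up for this example.

```python
def total_static(values):
    """Static style: 'total' is an integer on every path through the function."""
    total = 0
    for value in values:
        total += value
    return total


def describe_dynamic(flag):
    """Dynamic style: 'result' is an int on one path and a str on the other.

    Perfectly normal Python, but (to my understanding) the RPython
    annotator cannot give 'result' a single type and rejects it.
    """
    if flag:
        result = 42
    else:
        result = "forty-two"
    return result


static_total = total_static([1, 2, 3])
dynamic_values = (describe_dynamic(True), describe_dynamic(False))
```

A frontend would have to rewrite code of the second kind automatically, and typical Python programs are full of it.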


A bientot,

Armin.

From stefan_ml at behnel.de  Thu Aug 12 08:49:09 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 12 Aug 2010 08:49:09 +0200
Subject: [pypy-dev] What can Cython do for PyPy?
Message-ID: 

Hi,

there has recently been a move towards a .NET/IronPython port of Cython, 
mostly driven by the need for a fast NumPy port. During the related 
discussion, the question came up how much it would take to let Cython also 
target other runtimes, including PyPy.

Given that PyPy already has a CPython C-API compatibility layer, I doubt 
that it would be hard to enable that. With my limited knowledge about the 
internals of that layer, I guess the question thus becomes: is there 
anything Cython could do to the C code it generates that would make the 
Cython generated extension modules run faster/better/safer on PyPy than 
they would currently? I never tried to make a Cython module actually run on 
PyPy (simply because I don't use PyPy), but I have my doubts that they'd 
run perfectly out of the box. While generally portable, I'm pretty sure the 
C code relies on some specific internals of CPython that PyPy can't easily 
(or efficiently) provide.

Stefan


From fijall at gmail.com  Thu Aug 12 10:05:01 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 12 Aug 2010 10:05:01 +0200
Subject: [pypy-dev] What can Cython do for PyPy?
In-Reply-To: 
References: 
Message-ID: 

Hi Stefan.

CPython extension compatibility layer is in alpha at best. I heavily
doubt that anything would run out of the box. However, this is a
cpython compatibility layer anyway, it's not meant to be used as a
long-term solution. First of all it's inefficient (and it's unclear
whether it will ever be efficient), but it's also unjitable. This means
that to the JIT, a cpython extension is like a black box which should
not be touched.  Also, several concepts, like refcounting, are
completely alien to pypy and emulated.

For example for numpy, I think a rewrite is necessary to make it fast
(and as experiments have shown, it's possible to make it really fast),
so I would not worry about using cython for speeding things up. In
theory you should not need it and the boundary layer between
cython-compiled code and JITted code would make you suffer anyway.
There is another usecase for using cython for providing access to C
libraries. This is a bit harder question and I don't have a good
answer for that, but maybe cpython compatibility layer would be good
enough in this case? I can't see how Cython can produce a "native" C
code instead of CPython C code without some major effort.

Cheers,
fijal

On Thu, Aug 12, 2010 at 8:49 AM, Stefan Behnel  wrote:
> Hi,
>
> there has recently been a move towards a .NET/IronPython port of Cython,
> mostly driven by the need for a fast NumPy port. During the related
> discussion, the question came up how much it would take to let Cython also
> target other runtimes, including PyPy.
>
> Given that PyPy already has a CPython C-API compatibility layer, I doubt
> that it would be hard to enable that. With my limited knowledge about the
> internals of that layer, I guess the question thus becomes: is there
> anything Cython could do to the C code it generates that would make the
> Cython generated extension modules run faster/better/safer on PyPy than
> they would currently? I never tried to make a Cython module actually run on
> PyPy (simply because I don't use PyPy), but I have my doubts that they'd
> run perfectly out of the box. While generally portable, I'm pretty sure the
> C code relies on some specific internals of CPython that PyPy can't easily
> (or efficiently) provide.
>
> Stefan

From stefan_ml at behnel.de  Thu Aug 12 11:25:18 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 12 Aug 2010 11:25:18 +0200
Subject: [pypy-dev] What can Cython do for PyPy?
In-Reply-To: 
References: 
	
Message-ID: 

Maciej Fijalkowski, 12.08.2010 10:05:
> On Thu, Aug 12, 2010 at 8:49 AM, Stefan Behnel wrote:
>> there has recently been a move towards a .NET/IronPython port of Cython,
>> mostly driven by the need for a fast NumPy port. During the related
>> discussion, the question came up how much it would take to let Cython also
>> target other runtimes, including PyPy.
>>
>> Given that PyPy already has a CPython C-API compatibility layer, I doubt
>> that it would be hard to enable that. With my limited knowledge about the
>> internals of that layer, I guess the question thus becomes: is there
>> anything Cython could do to the C code it generates that would make the
>> Cython generated extension modules run faster/better/safer on PyPy than
>> they would currently? I never tried to make a Cython module actually run on
>> PyPy (simply because I don't use PyPy), but I have my doubts that they'd
>> run perfectly out of the box. While generally portable, I'm pretty sure the
>> C code relies on some specific internals of CPython that PyPy can't easily
>> (or efficiently) provide.
>
> CPython extension compatibility layer is in alpha at best. I heavily
> doubt that anything would run out of the box. However, this is a
> cpython compatibility layer anyway, it's not meant to be used as a
> long-term solution. First of all it's inefficient (and it's unclear
> whether it will ever be efficient)

If you only use it to call into non-trivial Cython code (e.g. some heavy 
calculations on NumPy tables), the call overhead should be mostly 
negligible, maybe even close to that in CPython. You could even provide 
some kind of fast-path to 'cpdef' functions (i.e. functions that are 
callable from both C and Python) and 'api' functions (which are currently 
exported at the module API level using the PyCapsule mechanism). That would 
reduce the call overhead to that of a C call.

Then, a lot of Cython code doesn't do much ref-counting and the like but 
simply runs in plain C. So, often enough, there won't be that much overhead 
involved in the code itself either, especially in tight loops where users 
prune away all CPython interaction anyway.


> but it's also unjitable. This means that to JIT, cpython
> extension is like a black box which should not be touched.

Well, unless both sides learn about each other, that is. It won't 
necessarily impact the JIT, but then again, a JIT usually won't have a 
noticeable impact on the performance of Cython code anyway.


> Also, several concepts, like refcounting are completely alien to pypy
> and emulated.

Sure. That's why I asked if there is anything that Cython can help to 
improve here. For example, the code it generates for INCREF/DECREF 
operations is not only configurable at the C preprocessor level.


> For example for numpy, I think a rewrite is necessary to make it fast
> (and as experiments have shown, it's possible to make it really fast),
> so I would not worry about using cython for speeding things up.

This isn't only about making things fast when being rewritten. This is also 
about accessing and reusing existing code in a new environment. Cython is 
becoming increasingly popular in the numerics community, and a lot of 
Cython code is being written as we speak, not only in the SciPy/NumPy 
environment. People even find it attractive enough to start rewriting their 
CPython extension modules (most often library wrappers) from C in Cython, 
both for performance and TCO reasons.


> There is another usecase for using cython for providing access to C
> libraries. This is a bit harder question and I don't have a good
> answer for that, but maybe cpython compatibility layer would be good
> enough in this case? I can't see how Cython can produce a "native" C
> code instead of CPython C code without some major effort.

Native (standalone) C code isn't the goal, just something that adapts well 
to what PyPy can provide as a CPython compatibility layer. If Cython 
modules work across independent Python implementations, that would be the 
most simple way by far to make lots of them available cross-platform, thus 
making it a lot simpler to switch between different implementations.

Stefan


From santagada at gmail.com  Thu Aug 12 16:31:01 2010
From: santagada at gmail.com (Leonardo Santagada)
Date: Thu, 12 Aug 2010 11:31:01 -0300
Subject: [pypy-dev] What can Cython do for PyPy?
In-Reply-To: 
References: 
Message-ID: <303CCB23-07C0-4D0F-90D2-0DD908DB4043@gmail.com>


On Aug 12, 2010, at 3:49 AM, Stefan Behnel wrote:

> Hi,
> 
> there has recently been a move towards a .NET/IronPython port of Cython, 
> mostly driven by the need for a fast NumPy port. During the related 
> discussion, the question came up how much it would take to let Cython also 
> target other runtimes, including PyPy.
> 
> Given that PyPy already has a CPython C-API compatibility layer, I doubt 
> that it would be hard to enable that. With my limited knowledge about the 
> internals of that layer, I guess the question thus becomes: is there 
> anything Cython could do to the C code it generates that would make the 
> Cython generated extension modules run faster/better/safer on PyPy than 
> they would currently? I never tried to make a Cython module actually run on 
> PyPy (simply because I don't use PyPy), but I have my doubts that they'd 
> run perfectly out of the box. While generally portable, I'm pretty sure the 
> C code relies on some specific internals of CPython that PyPy can't easily 
> (or efficiently) provide.


A possible solution, I think, would be to write an OO backend for Cython, which could be made to generate C# or RPython code. The problem remains that PyPy still doesn't have separate compilation, so you cannot build an external module for the PyPy interpreter after it has been translated.

So it is hard, maybe harder than anyone on cython would like, but I still think it is a good solution. (Unless I'm mistaken in any of my assumptions, and then it is a terrible solution :)

--
Leonardo Santagada
santagada at gmail.com




From p.giarrusso at gmail.com  Thu Aug 12 17:35:40 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 12 Aug 2010 17:35:40 +0200
Subject: [pypy-dev] What can Cython do for PyPy?
In-Reply-To: 
References: 
	
	
Message-ID: 

I agree with the motivations given by Stefan - two interesting
possibilities would be to:
a) first, test the compatibility layer with Cython generated code

b) possibly, allow users to use the Python API while replacing
refcounting with another, more meaningful, PyPy-specific API* for a
garbage collected heap.

However, such an API is radically different. I'm also not sure how
well such an API would mesh with the CPython API, actually. If Cython
could support such an API, that would be great. But I'm unsure whether
this is worth it, for Cython, and more in general for other modules
(one could easily and elegantly support both CPython and PyPy with
preprocessor tricks).

See further below about why call overhead is not the biggest
performance problem when not inlining.

* I thought the Java Native Interface (JNI) design of local and global
references (http://download.oracle.com/javase/6/docs/technotes/guides/jni/spec/design.html#wp16785)
would work here, with some adaptation.
However, if your moving GCs support pinning of objects, as I expect to
be necessary to interact with CPython code, I would do an important
change to that API: instead of having object references be pointers to
(movable by the GC) pointers to objects, like in the JNI API, PyPy
should use plain pinned pointers. The pinning would not be apparent in
the type, but that should be fine I guess.
Problems arise when PyPy-aware code calls code which still uses the
refcounting API. It is mostly safe to ignore the refcounting (even
decreases) for local references, but I'm unsure about persistent
references, even if it's probably still the best solution, so that the
PyPy-aware code handles the lifecycle by itself.

On Thu, Aug 12, 2010 at 11:25, Stefan Behnel  wrote:
> Maciej Fijalkowski, 12.08.2010 10:05:
>> On Thu, Aug 12, 2010 at 8:49 AM, Stefan Behnel wrote:

> If you only use it to call into non-trivial Cython code (e.g. some heavy
> calculations on NumPy tables), the call overhead should be mostly
> negligible, maybe even close to that in CPython. You could even provide
> some kind of fast-path to 'cpdef' functions (i.e. functions that are
> callable from both C and Python) and 'api' functions (which are currently
> exported at the module API level using the PyCapsule mechanism). That would
> reduce the call overhead to that of a C call.

>> but it's also unjitable. This means that to JIT, cpython
>> extension is like a black box which should not be touched.

> Well, unless both sides learn about each other, that is. It won't
> necessarily impact the JIT, but then again, a JIT usually won't have a
> noticeable impact on the performance of Cython code anyway.

Call overhead is not the biggest problem, I guess (well, if it's
bigger than that in C, it might be); it's IMHO the minor problem when
you can't inline. Inlining is important because it allows to do more
optimizations on the combined code. Now, it might or might not apply
to your typical use cases (present and future), you should just keep
this issue in mind, too. Whenever you say "If you only use it to call
into non-trivial Cython code", you imply that some kind of functional
abstraction, the one where you write short functions, such as
accessors, are not efficiently supported.

For instance, if you call two functions, each containing a parallel
for loop, fusing the loops requires inlining the functions to expose
the loops.
Inlining accessors (getters and setters) allows to recognize that they
often don't need to be called over and over again, i.e., common
subexpression elimination, which you can't do on a normal (impure)
function.

To make a particularly dramatic example (since it comes from C) of a
quadratic-to-linear optimization: a loop like
for (i = 0; i < strlen(s); i++) {
  //do something on s without modifying it
}

takes quadratic time, because strlen takes linear time and is called
on each iteration. Can the optimizer fix this? The simplest way for it is
to inline everything; then it could notice that calculating strlen
only once is safe. In C with GCC extensions, one could annotate strlen
as pure, and use functions which take s as a const parameter (but I'm
unsure if it actually works). In Python (and even in Java), something
like this should work without annotations.

Of course, one can't rely on this quadratic-linear optimization unless
it's guaranteed to work (like tail call elimination), so I wouldn't do
it in this case; this point relates to the wider issue of unreliable
optimizations and "sufficiently smart compilers", better discussed at
http://prog21.dadgum.com/40.html (not mine).
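
To put numbers on the strlen example, here is a sketch in Python rather than C. CountingLength is a made-up stand-in for strlen that walks the string and counts every character it visits; the two loops differ only in whether the length is recomputed in the loop condition or computed once by hand, which is the transformation an optimizer would have to prove safe.

```python
class CountingLength(object):
    """Linear-time length function that records how many characters it visits."""
    def __init__(self):
        self.visited = 0

    def __call__(self, text):
        count = 0
        for _ in text:
            count += 1
            self.visited += 1
        return count


def loop_recomputing(text, length):
    """Mirrors 'for (i = 0; i < strlen(s); i++)': length is called per test."""
    i = 0
    while i < length(text):
        i += 1
    return i


def loop_hoisted(text, length):
    """Same loop with the length computed once before the loop."""
    n = length(text)
    i = 0
    while i < n:
        i += 1
    return i


slow = CountingLength()
loop_recomputing("x" * 100, slow)
recomputed_visits = slow.visited

fast = CountingLength()
loop_hoisted("x" * 100, fast)
hoisted_visits = fast.visited
```

For a 100-character string the first loop evaluates its condition 101 times and visits 10100 characters in total, while the hoisted version visits 100.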
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From jbaker at zyasoft.com  Thu Aug 12 17:41:39 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Thu, 12 Aug 2010 09:41:39 -0600
Subject: [pypy-dev] What can Cython do for PyPy?
In-Reply-To: 
References: 
	
	
Message-ID: 

[crossposting to jython-dev]

Because of some conversations I had with Maciej (mostly at Folsom Coffee in
Boulder :) ), we are considering adding support for the CPython C-Extension
API for Jython, modeling what has already been done in PyPy and IronPython.
Although I think it may make a lot of sense to port NumPy to Java, and have
argued for it in the past, being pragmatic suggests it's better to work with
the tide of NumPy/Cython than against it. Also, this can bring in a large
swath of existing libraries to work with Jython, including those coded
against SWIG, at the cost that it will not run under most security manager
policies. I think that's a reasonable tradeoff.

Similar concerns that Maciej raises apply to Jython. No Java JIT will inline
such native code, marshaling from the Java domain to the native one will be
expensive, etc. But this is (mostly) true of Jython today, from Python code
to Java (although invokedynamic will at least reduce some of those costs).
But users can still take advantage of Java to achieve much better
performance from Jython, if they are careful about structuring the execution
of their code. At the end of the day, Jython to C code, including that
produced by Cython should see a similar performance profile to CPython to C
code, as long as they don't hammer the INCREF/DECREF *functions*. (JRuby is
implementing something similar, and we probably can borrow their
"refcounting" support.) But of course that's exactly what one needs to avoid
to write performant extension code anyway in CPython, at least if it's to be
multithreaded.

One interesting part of this discussion is whether we can support lock
eliding. This is one part of JIT inlining that you don't want to give up for
multithreaded performance. Rather than having C code callback into Java to
release the GIL (which is only global for such C code!), it would be better
to have a marker on the C code that allows for immediate release, or perhaps
some other inlinable Java stub. I could imagine this could be readily
supported by Cython (and perhaps already is).

Lastly, I want to emphasize again that if/when Jython adds support for the C
extension API, the "GIL" and "refcounting" support will only be for such C
code! We like our concurrency support and we are not giving it up :)

- Jim

On Thu, Aug 12, 2010 at 3:25 AM, Stefan Behnel  wrote:

> Maciej Fijalkowski, 12.08.2010 10:05:
> > On Thu, Aug 12, 2010 at 8:49 AM, Stefan Behnel wrote:
> >> there has recently been a move towards a .NET/IronPython port of Cython,
> >> mostly driven by the need for a fast NumPy port. During the related
> >> discussion, the question came up how much it would take to let Cython
> also
> >> target other runtimes, including PyPy.
> >>
> >> Given that PyPy already has a CPython C-API compatibility layer, I doubt
> >> that it would be hard to enable that. With my limited knowledge about
> the
> >> internals of that layer, I guess the question thus becomes: is there
> >> anything Cython could do to the C code it generates that would make the
> >> Cython generated extension modules run faster/better/safer on PyPy than
> >> they would currently? I never tried to make a Cython module actually run
> on
> >> PyPy (simply because I don't use PyPy), but I have my doubts that they'd
> >> run perfectly out of the box. While generally portable, I'm pretty sure
> the
> >> C code relies on some specific internals of CPython that PyPy can't
> easily
> >> (or efficiently) provide.
> >
> > CPython extension compatibility layer is in alpha at best. I heavily
> > doubt that anything would run out of the box. However, this is a
> > cpython compatibility layer anyway, it's not meant to be used as a
> > long-term solution. First of all it's inefficient (and it's unclear
> > whether it will ever be efficient)
>
> If you only use it to call into non-trivial Cython code (e.g. some heavy
> calculations on NumPy tables), the call overhead should be mostly
> negligible, maybe even close to that in CPython. You could even provide
> some kind of fast-path to 'cpdef' functions (i.e. functions that are
> callable from both C and Python) and 'api' functions (which are currently
> exported at the module API level using the PyCapsule mechanism). That would
> reduce the call overhead to that of a C call.
>
> Then, a lot of Cython code doesn't do much ref-counting and the like but
> simply runs in plain C. So, often enough, there won't be that much overhead
> involved in the code itself either, especially in tight loops where users
> prune away all CPython interaction anyway.
>
>
> > but it's also unjitable. This means that to JIT, cpython
> > extension is like a black box which should not be touched.
>
> Well, unless both sides learn about each other, that is. It won't
> necessarily impact the JIT, but then again, a JIT usually won't have a
> noticeable impact on the performance of Cython code anyway.
>
>
> > Also, several concepts, like refcounting are completely alien to pypy
> > and emulated.
>
> Sure. That's why I asked if there is anything that Cython can help to
> improve here. For example, the code it generates for INCREF/DECREF
> operations is not only configurable at the C preprocessor level.
>
>
> > For example for numpy, I think a rewrite is necessary to make it fast
> > (and as experiments have shown, it's possible to make it really fast),
> > so I would not worry about using cython for speeding things up.
>
> This isn't only about making things fast when being rewritten. This is also
> about accessing and reusing existing code in a new environment. Cython is
> becoming increasingly popular in the numerics community, and a lot of
> Cython code is being written as we speak, not only in the SciPy/NumPy
> environment. People even find it attractive enough to start rewriting their
> CPython extension modules (most often library wrappers) from C in Cython,
> both for performance and TCO reasons.
>
>
> > There is another usecase for using cython for providing access to C
> > libraries. This is a bit harder question and I don't have a good
> > answer for that, but maybe cpython compatibility layer would be good
> > enough in this case? I can't see how Cython can produce a "native" C
> > code instead of CPython C code without some major effort.
>
> Native (standalone) C code isn't the goal, just something that adapts well
> to what PyPy can provide as a CPython compatibility layer. If Cython
> modules work across independent Python implementations, that would be the
> most simple way by far to make lots of them available cross-platform, thus
> making it a lot simpler to switch between different implementations.
>
> Stefan

From anto.cuni at gmail.com  Thu Aug 12 20:31:06 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 12 Aug 2010 20:31:06 +0200
Subject: [pypy-dev] [pypy-svn] r76608 - in
 pypy/branch/jit-bounds/pypy/jit/metainterp: . test
In-Reply-To: <20100812170214.45AC5282B9E@codespeak.net>
References: <20100812170214.45AC5282B9E@codespeak.net>
Message-ID: <4C643DEA.9020809@gmail.com>

On 12/08/10 19:02, hakanardo at codespeak.net wrote:

> +    def boundint_gt(self, val):
> +        if val is None: return
> +        self.minint = val + 1

what happens if val == sys.maxint?
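
To spell out the concern, here is a sketch in plain Python. It is not the code from the branch: IntBound here is a stripped-down, hypothetical class, wrap() emulates what val + 1 would do on a signed machine word, and sys.maxsize stands in for sys.maxint. Treating "greater than maxint" as an empty bound is just one possible answer, shown as an assumption.

```python
import sys

MAXINT = sys.maxsize           # stands in for Python 2's sys.maxint
MININT = -sys.maxsize - 1


def wrap(n):
    """Emulate signed machine-word wraparound for an arbitrary integer."""
    width = MAXINT - MININT + 1
    return (n - MININT) % width + MININT


class IntBound(object):
    """Hypothetical, minimal version of an integer bound."""
    def __init__(self):
        self.minint = MININT
        self.maxint = MAXINT
        self.empty = False      # True if no integer can satisfy the bound

    def boundint_gt(self, val):
        if val is None:
            return
        if val == MAXINT:
            # Nothing is greater than MAXINT; val + 1 would wrap to MININT
            # on a machine word and silently loosen the bound instead.
            self.empty = True
            return
        self.minint = val + 1


wrapped = wrap(MAXINT + 1)

ordinary = IntBound()
ordinary.boundint_gt(5)

extreme = IntBound()
extreme.boundint_gt(MAXINT)
```

Without the extra check, a wrapped val + 1 would become the new lower bound, which is the weakest bound possible rather than the strongest.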

ciao,
Anto

From kevinar18 at hotmail.com  Fri Aug 13 01:42:54 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Thu, 12 Aug 2010 19:42:54 -0400
Subject: [pypy-dev] pre-emptive micro-threads utilizing shared memory
 message passing?
In-Reply-To: 
References: ,
	,
	,
	,
	,
	
Message-ID: 


Sorry for not getting back to you sooner.


I don't mind replying to the mailing list unless it annoys someone? Maybe some people could be interested by this discussion. 



You have a lot of questions! :) My answers are inline.


  


* Message passing
When you create a tasklet, you assign a set number of queues or streams to it (it can have many) and whether they extract data from them or write to them (they can only either extract or write to it as noted above). The tasklet's global namespace has access to these queues or streams and can extract or add data to them.

In my case, I look at message passing from the perspective of the tasklet. A tasklet can either be assigned a certain number of "in ports" and a certain number of "out ports." In this case the "in ports" are the .read() end of a queue or stream and the "out ports" are the .send() part of a queue or stream.




Sorry, I don't really understand what you're trying to explain here. Maybe an example could be helpful? :)


* Scheduler
For the scheduler, I would need to control when a tasklet runs. Currently, I am thinking that I would look at all the "in ports" that a tasklet has and make sure each one has some data. Only then would the tasklet be scheduled to run by the scheduler.




Couldn't all those ports (channels) be read one at a time, then the processing could be done? I don't exactly see the need to play with the scheduler. Channels are blocking. A tasklet will be anyway unscheduled when it tries to read on a channel in which no data is available.
 
http://www.jpaulmorrison.com/fbp/concepts.htm
Figure 3.6 and Figure 3.7 are a good example.
Let's say Figure 3.7 is the tasklet (the one with a local namespace and no access to global memory or memory in other tasklets).
IN, ACC, REJ are pointers to a shared memory location (from an implementation standpoint).
IN, ACC, REJ are either a queue or buffer/pipe/stream (from the perspective of the programmer).
The tasklet can only read/extract data from IN.
The tasklet can only write to ACC and REJ.
 
> Couldn't all those ports (channels) be read one at a time, then the processing could be done?
Not sure exactly, what you mean, but as shown in Figure 3.7, different parts of code will read or write to different ports at different times.
> A tasklet will be anyway unscheduled when it tries to read on a channel in which no data is available.
Good idea.  If there's no data to read, the tasklet can yield. ... but I need to know when the tasklet can be put back into the scheduler queue
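
One possible way to handle the "put back into the scheduler queue" part, sketched with plain generators. This is only an illustration under my own assumptions (one reader per channel, cooperative scheduling); the Channel and Scheduler classes are hypothetical and not how stackless.py does it:

```python
from collections import deque


class Channel:
    def __init__(self):
        self.items = deque()      # data waiting to be read
        self.waiting = deque()    # tasklets parked because the channel was empty


class Scheduler:
    def __init__(self):
        self.runnable = deque()

    def spawn(self, tasklet):
        self.runnable.append(tasklet)

    def send(self, channel, item):
        channel.items.append(item)
        # Data arrived: put one parked reader back into the run queue.
        if channel.waiting:
            self.runnable.append(channel.waiting.popleft())

    def run(self):
        while self.runnable:
            tasklet = self.runnable.popleft()
            try:
                wanted = next(tasklet)
            except StopIteration:
                continue                         # tasklet finished
            if wanted is None or wanted.items:
                self.runnable.append(tasklet)    # plain yield, or data is ready
            else:
                wanted.waiting.append(tasklet)   # park until send() wakes it


def consumer(channel, out):
    while True:
        while not channel.items:
            yield channel                        # nothing to read: ask to be parked
        item = channel.items.popleft()
        if item is None:                         # hypothetical end-of-stream marker
            return
        out.append(item)


def producer(scheduler, channel, values):
    for value in values:
        scheduler.send(channel, value)
        yield None                               # give other tasklets a turn


out = []
sched = Scheduler()
ch = Channel()
sched.spawn(consumer(ch, out))
sched.spawn(producer(sched, ch, [1, 2, 3, None]))
sched.run()
```

The point of the design is that the scheduler never polls: a tasklet that finds its channel empty is taken out of the run queue, and the send() on that same channel is what puts it back.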
 
Then again, I don't know how I will want to do the scheduler... and would like the low level primitives to explore different styles.
 
 
 
 
Anyways, at this point, I guess this whole discussion is not that important.  I should probably make something simpler for now just to try things out.  Then maybe I'll know if I want to even bother working on something better.   However, if you would like me to keep you up to date, I can contact you via email a few months from now.  (Let me know and I'll give you a different email to use). 		 	   		  

From bhartsho at yahoo.com  Fri Aug 13 03:42:23 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Thu, 12 Aug 2010 18:42:23 -0700 (PDT)
Subject: [pypy-dev] Wrapping C, void pointer
Message-ID: <986277.65536.qm@web114019.mail.gq1.yahoo.com>

I am wrapping PortAudio for RPython.  Following the source code in RSDL as a guide i can clearly see what to do with constants, structs, functions and so on.  So far so good until i reached something new in PortAudio i am not sure how to deal with, a void typedef and a void pointer.  In the RSDL example pointers were defined by Ptr = lltype.Ptr(lltype.ForwardReference()), then in the CConfig class the struct was defined and rffi_platform.configure parses, finally the Ptr is told TO become the type - given the output of platform.configure.  How do we deal with this situation from PortAudio?

# from portaudio.h
typedef void PaStream;
PaError Pa_OpenStream( PaStream** stream,
                       const PaStreamParameters *inputParameters,
                       const PaStreamParameters *outputParameters,
                       double sampleRate,
                       unsigned long framesPerBuffer,
                       PaStreamFlags streamFlags,
                       PaStreamCallback *streamCallback,
                       void *userData );

###############################
I have tried the following but it fails when i try to malloc the void pointer.

OpenDefaultStream = external(	'Pa_OpenDefaultStream', 
	[
		rffi.VOIDPP,	# PaStream** 
		rffi.INT, 		# numInputChannels
		rffi.INT, 		# numOutputChannels
		rffi.INT, 		# sampleFormat
		rffi.INT, 		# sampleRate
		rffi.INT, 		# framesPerBuffer
		rffi.INT,		#streamcallback
		rffi.VOIDP,		#userData
	],	
	rffi.INT
)
Stream = lltype.Void		#rffi.VOIDP
def test():
	print 'portaudio version %s' %GetVersion()
	assert Initialize() == 0		# paNoError = 0, error code is returned on init fail.

	stream = lltype.malloc(Stream, flavor='raw')
	try: ok = OpenDefaultStream( stream, 1, 1, Int16, 22050, FramesPerBufferUnspecified, 0 )
	finally: lltype.free(stream, flavor='raw')
	Terminate()


-brett




From arigo at tunes.org  Fri Aug 13 10:39:49 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 13 Aug 2010 10:39:49 +0200
Subject: [pypy-dev] Wrapping C, void pointer
In-Reply-To: <986277.65536.qm@web114019.mail.gq1.yahoo.com>
References: <986277.65536.qm@web114019.mail.gq1.yahoo.com>
Message-ID: <20100813083948.GA20768@code0.codespeak.net>

Hi Hart,

On Thu, Aug 12, 2010 at 06:42:23PM -0700, Hart's Antler wrote:
> I am wrapping PortAudio for RPython.

Why?  Writing it in standard ctypes would give really bad performance?
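
For reference, the PaStream** out-parameter pattern looks roughly like this in plain ctypes. This is a sketch only: so that it runs without PortAudio installed, a Python function wrapped in a C function type stands in for the real library call, and the names fake_open_stream and _fake_open are made up:

```python
import ctypes

# Stand-in for a C function with the shape: int open_stream(void **stream).
# A real wrapper would load the PortAudio shared library instead; this fake
# exists only so that the sketch runs on its own.
OPEN_FUNC = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(ctypes.c_void_p))


def _fake_open(stream_out):
    stream_out[0] = 0x1234    # pretend address of an opaque stream object
    return 0                  # 0 plays the role of paNoError


fake_open_stream = OPEN_FUNC(_fake_open)

stream = ctypes.c_void_p()                       # the opaque handle (void *)
err = fake_open_stream(ctypes.byref(stream))     # pass its address (void **)
```

The caller only allocates room for one pointer and passes its address; the opaque void object behind it is never allocated on the Python side.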


Armin

From bhartsho at yahoo.com  Fri Aug 13 15:02:12 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Fri, 13 Aug 2010 06:02:12 -0700 (PDT)
Subject: [pypy-dev] Wrapping C, void pointer
In-Reply-To: <20100813083948.GA20768@code0.codespeak.net>
Message-ID: <634594.45561.qm@web114019.mail.gq1.yahoo.com>

Hi Armin, i wanted something faster than ctypes; i think that's why Hubert Pham used the Python C API when doing pyaudio before. Also, i want to do DSP on the samples and want the option to do as many effects as possible in real-time.
I figured out my problem, rpyportaudio is on google code now, http://code.google.com/p/rpyportaudio/


--- On Fri, 8/13/10, Armin Rigo  wrote:

> From: Armin Rigo 
> Subject: Re: [pypy-dev] Wrapping C, void pointer
> To: "Hart's Antler" 
> Cc: pypy-dev at codespeak.net
> Date: Friday, 13 August, 2010, 1:39 AM
> Hi Hart,
> 
> On Thu, Aug 12, 2010 at 06:42:23PM -0700, Hart's Antler
> wrote:
> > I am wrapping PortAudio for RPython.
> 
> Why?  Writing it in standard ctypes would give really
> bad performance?
> 
> 
> Armin
> 




From andrewfr_ice at yahoo.com  Fri Aug 13 18:52:14 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Fri, 13 Aug 2010 09:52:14 -0700 (PDT)
Subject: [pypy-dev] pypy-dev Digest, Vol 361, Issue 5
In-Reply-To: 
Message-ID: <66172.2718.qm@web120712.mail.ne1.yahoo.com>

Hi Kevin:

Message: 4
Date: Thu, 12 Aug 2010 19:42:54 -0400
From: Kevin Ar18 
Subject: Re: [pypy-dev] pre-emptive micro-threads utilizing shared
    memory message passing?
To: 
Message-ID: 
Content-Type: text/plain; charset="iso-8859-1"


>I don't mind replying to the mailing list unless it annoys someone? Maybe
>some people could be interested by this discussion.

I am finding it a bit difficult to follow this thread. I am not sure who is saying what. Also I don't know if you are talking about an entirely new system or the stackless.py module.

>In my case, I look at message passing from the perspective of the
>tasklet. A tasklet can either be assigned a certain number of "in ports"
>and a certain number of "out ports." In this case the "in ports" are the
>.read() end of a queue or stream and the "out ports" are the .send() part
>of a queue or stream.

A part of the model that Stackless uses is that tasklets have channels.
Channels have send() and receive() operations. 

>For the scheduler, I would need to control when a tasklet runs.
>Currently, I am thinking that I would look at all the "in ports" that a
>tasklet has and make sure each one has some data. Only then would the
>tasklet be scheduled to run by the scheduler.

The current scheduler already does this. However there are no in or out ports, just operations that can proceed.

>Couldn't all those ports (channels) be read one at a time, then the
>processing could be done?

If you are using stackless.py - the tasklet will block if it encounters
a channel with no target on the other side. I wrote a select() function that allows monitoring on multiple channels.

>Good idea.  If there's no data to read, the tasklet can yield. ... but I
>need to know when the tasklet can be put back into the scheduler queue

I don't want to toot my own horn, but I gave a talk at EuroPython that covers how rendezvous semantics work: http://andrewfr.wordpress.com/2010/07/24/prototyping-gos-select-and-beyond/

Cheers,
Andrew



      

From bhartsho at yahoo.com  Sat Aug 14 03:51:00 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Fri, 13 Aug 2010 18:51:00 -0700 (PDT)
Subject: [pypy-dev] RPython function callback from C
Message-ID: <662548.18463.qm@web114020.mail.gq1.yahoo.com>

I have the PortAudio blocking API working, simple reading and writing to the sound card works.  PortAudio also has an async API where samples are fed to a callback as they stream in.  But i'm not sure how to define a RPython function that will be called as a callback from C, is this even possible?  I see some references in the source of rffi that seems to suggest it is possible.  Full source code is here http://pastebin.com/6YHbT7CU


I'm passing the callback like this:

def stream_callback( *args ):
	print 'stream callback'
	return 0		# 0=continue, 1=complete, 2=abort

stream_callback_ptr = rffi.CCallback([], rffi.INT)

OpenDefaultStream = rffi.llexternal(	'Pa_OpenDefaultStream', 
	[
		StreamRefPtr,		# PaStream** 
		rffi.INT, 		# numInputChannels
		rffi.INT, 		# numOutputChannels
		rffi.INT, 		# sampleFormat
		rffi.DOUBLE, 		# double sampleRate
		rffi.INT, 		# unsigned long framesPerBuffer
		#rffi.VOIDP,		#PaStreamCallback *streamCallback
		stream_callback_ptr,
		rffi.VOIDP,		#void *userData
	],	
	rffi.INT,		# return
	compilation_info=eci,
	_callable=stream_callback
)

def entrypoint():
   ...
   callback = lltype.nullptr( stream_callback_ptr.TO )
   ok = OpenDefaultStream( streamptr, 2, 2, Int16, 22050.0, 512, callback, userdata )





From kevinar18 at hotmail.com  Sat Aug 14 05:29:15 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Fri, 13 Aug 2010 23:29:15 -0400
Subject: [pypy-dev] ongoing microthread discussions
In-Reply-To: <66172.2718.qm@web120712.mail.ne1.yahoo.com>
References: ,
	<66172.2718.qm@web120712.mail.ne1.yahoo.com>
Message-ID: 


> >I don't mind replying to the mailing list unless it annoys someone? Maybe some people could be interested by this discussion.
> 
> I am finding it a bit difficult to follow this thread. I am not sure who is saying what. Also I don't know if you are talking about an entirely new system or the stackless.py module.
An entirely new system/way of doing things -- meaning I don't think the stackless style would fit.
 
Originally, I was hoping for some way to achieve what I want in Python across multiple cores, but I'm finding there are no such primitives to do that effectively.  I know the basics of how I would do it in a lower level language.
 
Yes, there are many different topics that this brought up.  Here's a summary:
* I wanted to work on a different way of doing things (different than stackless)... but I needed lower level primitives that allowed me to pass data back and forth between threads using shared memory queues or pipes (instead of the current method that copies the data back and forth)
* I then asked about the difficulty in doing some form of limited shared memory (one that wouldn't involve a GIL overhaul)
* A branch of the discussion involved people discussing various locking problems that might cause...
* The author of Kamaelia posted a message and we had a brief discussion down that road.  (His project is very similar to what I want to do.)
* Gabriel mentioned his project and we had a brief discussion.  His project has some similarities ... but still is probably too different for my needs, but maybe would be very interesting to other people here.
* In one of the emails, I brought up a possible solution to offering shared memory "message passing" that would not require locks or locking issues... but it really is too much for me to get involved with now.
 
... and I guess by now the discussion has pretty much died off as there was really nothing more.... 		 	   		  

From jbaker at zyasoft.com  Sat Aug 14 08:13:26 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Sat, 14 Aug 2010 00:13:26 -0600
Subject: [pypy-dev] ongoing microthread discussions
In-Reply-To: 
References: 
	<66172.2718.qm@web120712.mail.ne1.yahoo.com>
	
Message-ID: 

Kevin,

You may want to broaden your candidates. Jython already supports multiple
cores with no GIL and shared memory with well-defined memory semantics
derived directly from Java's memory model (and compatible with the informal
memory model that we see in CPython). Because JRuby needs it for efficient
support of Ruby 1.9 generators, which are more general than Python's
(non-nested yields), there has been substantial attention paid to the MLVM
coroutine support which has demonstrated 1M+ microthread scalability in a
single JVM process.

It would be amazing if someone spent some time looking at this in Jython.

- Jim

On Fri, Aug 13, 2010 at 9:29 PM, Kevin Ar18  wrote:

> > >I don't mind replying to the mailing list unless it annoys someone?
> > >Maybe some people could be interested by this discussion.
> >
> > I am finding it a bit difficult to follow this thread. I am not sure who
> > is saying what. Also I don't know if you are talking about an entirely new
> > system or the stackless.py module.
> An entirely new system/way of doing things -- meaning I don't think the
> stackless style would fit.
>
> Originally, I was hoping for some way to achieve what I want in Python
> across multiple cores, but I'm finding there are no such primitives to do
> that effectively.  I know the basics of how I would do it in a lower level
> language.
>
> Yes, there are many different topics that this brought up.  Here's a
> summary:
> * I wanted to work on a different way of doing things (different than
> stackless)... but I needed lower level primitives that allowed me to pass
> data back and forth between threads using shared memory queues or pipes
> (instead of the current method that copies the data back and forth)
> * I then asked about the difficulty in doing some form of limited shared
> memory (one that wouldn't involve a GIL overhaul)
> * A branch of the discussion involved people discussing various locking
> problems that might cause...
> * The author of Kamaelia posted a message and we had a brief discussion
> down that road.  (His project is very similar to what I want to do.)
> * Gabriel mentioned his project and we had a brief discussion.  His project
> has some similarities ... but still is probably too different for my needs,
> but maybe would be very interesting to other people here.
> * In one of the emails, I brought up a possible solution to offering shared
> memory "message passing" that would not require locks or locking issues...
> but it really is too much for me to get involved with now.
>
> ... and I guess by now the discussion has pretty much died off as there was
> really nothing more....
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From arigo at tunes.org  Sat Aug 14 15:50:02 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 14 Aug 2010 15:50:02 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
Message-ID: <20100814135002.GA1941@code0.codespeak.net>

Hi all,

The PyPy speed center does not display any update more recent than July 29.
The buildbot infrastructure correctly puts them into files
codespeak.net:~buildmaster/bench_results/REV.json, but the web site at
http://speed.pypy.org/ does not get updated.

Help please!


A bientot,

Armin.

From arigo at tunes.org  Sat Aug 14 16:23:21 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 14 Aug 2010 16:23:21 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
In-Reply-To: <20100814135002.GA1941@code0.codespeak.net>
References: <20100814135002.GA1941@code0.codespeak.net>
Message-ID: <20100814142321.GA4071@code0.codespeak.net>

Hi all,

On Sat, Aug 14, 2010 at 03:50:02PM +0200, Armin Rigo wrote:
> The PyPy speed center does not display any update more recent than July 29.

Wrong (thanks Antonio).  It's only the twisted_web benchmark that stops
at July 29; it was certainly removed at that date.  For the others it
works as expected.

The most recent results of today (76624) have been run on the
kill-caninline branch.


Armin.

From kevinar18 at hotmail.com  Sun Aug 15 02:24:20 2010
From: kevinar18 at hotmail.com (Kevin Ar18)
Date: Sat, 14 Aug 2010 20:24:20 -0400
Subject: [pypy-dev] ongoing microthread discussions
In-Reply-To: 
References: ,
	<66172.2718.qm@web120712.mail.ne1.yahoo.com>
	,
	
Message-ID: 



You may want to broaden your candidates. Jython already supports multiple cores with no GIL and shared memory with well-defined memory semantics derived directly from Java's memory model (and compatible with the informal memory model that we see in CPython). Because JRuby needs it for efficient support of Ruby 1.9 generators, which are more general than Python's (non-nested yields), there has been substantial attention paid to the MLVM coroutine support which has demonstrated 1M+ microthread scalability in a single JVM process.


It would be amazing if someone spent some time looking at this in Jython.
 
For me, anything based on the Java VM or copyleft code is out of the question.  However, you are quite right in that it is not necessary that I use PyPy.  For example, if Unladen Swallow had the primitives I needed, that would be great too.
 
As a side note, PyPy does have two advantages: speed and that it is coded in RPython: which might even allow me to just hack PyPy itself at some point. :)
 
BTW, thanks for the suggestion.  Now that you brought up the topic of different implementations, I should probably check on what is going on in regards to Unladen Swallow, etc.... 		 	   		  

From jbaker at zyasoft.com  Sun Aug 15 16:31:38 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Sun, 15 Aug 2010 08:31:38 -0600
Subject: [pypy-dev] ongoing microthread discussions
In-Reply-To: 
References: 
	<66172.2718.qm@web120712.mail.ne1.yahoo.com>
	
	
	
Message-ID: 

To clarify, there are numerous implementations of the JVM that are not
copyleft, such as Apache Harmony. Of course the MLVM work I cited is
not one of them.

Jython itself is licensed under the
Python Software License.

On Sat, Aug 14, 2010 at 6:24 PM, Kevin Ar18  wrote:

>  You may want to broaden your candidates. Jython already supports multiple
> cores with no GIL and shared memory with well-defined memory semantics
> derived directly from Java's memory model (and compatible with the informal
> memory model that we see in CPython). Because JRuby needs it for efficient
> support of Ruby 1.9 generators, which are more general than Python's
> (non-nested yields), there has been substantial attention paid to the MLVM
> coroutine support which has demonstrated 1M+ microthread scalability in a
> single JVM process.
>
> It would be amazing if someone spent some time looking at this in Jython.
>
>
> For me, anything based on the Java VM or copyleft code is out of the question.
> However, you are quite right in that it is not necessary that I use PyPy.
> For example, if Unladen Swallow had the primitives I needed, that would be
> great too.
>
> As a side note, PyPy does have two advantages: speed and that it is coded
> in RPython: which might even allow me to just hack PyPy itself at some
> point. :)
>
> BTW, thanks for the suggestion.  Now that you brought up the topic of
> different implementations, I should probably check on what is going on in
> regards to Unladen Swallow, etc....
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From amauryfa at gmail.com  Sun Aug 15 21:32:51 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sun, 15 Aug 2010 21:32:51 +0200
Subject: [pypy-dev] RPython function callback from C
In-Reply-To: <662548.18463.qm@web114020.mail.gq1.yahoo.com>
References: <662548.18463.qm@web114020.mail.gq1.yahoo.com>
Message-ID: 

Hi,

On 14 August 2010 at 03:51:00 UTC+2, Hart's Antler wrote:
> I have the PortAudio blocking API working, simple reading and writing to the
> sound card works.  PortAudio also has an async API where samples are fed to
> a callback as they stream in.  But i'm not sure how to define a RPython
> function that will be called as a callback from C, is this even possible?  I
> see some references in the source of rffi that seems to suggest it is
> possible.

Yes this is possible.
See an example in pypy/rpython/lltypesystem/test/test_ll2ctypes.py
in the function test_qsort_callback().

Why are you passing _callable=stream_callback?
It should be enough to pass stream_callback directly as a function argument.
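
To show the general shape of a callback called from C, here is a qsort example in the same spirit as that test, but written with plain ctypes rather than RPython's rffi, so it is only an analogy. It assumes a POSIX system where the C library can be loaded as below; compare_ints is an illustrative name:

```python
import ctypes
import ctypes.util

# Load the C library (assumption: POSIX; if find_library returns None,
# CDLL(None) still exposes libc symbols on such systems).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C type of the comparison callback: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))


def compare_ints(a, b):
    # Called from inside C's qsort for each pair it compares.
    return a[0] - b[0]


# Keep a reference to the wrapped callback for as long as C may call it.
compare_callback = CMPFUNC(compare_ints)

libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), compare_callback)
sorted_values = list(values)
```

As in the advice above, the function object itself is what gets passed in the argument position that has the callback type; nothing extra is attached to the declaration of the external function.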

-- 
Amaury Forgeot d'Arc

From ademan555 at gmail.com  Mon Aug 16 01:18:10 2010
From: ademan555 at gmail.com (Dan Roberts)
Date: Sun, 15 Aug 2010 16:18:10 -0700
Subject: [pypy-dev] JIT Failure on lltype.Array access
Message-ID: 

As best I can tell, the JIT cannot handle my code properly, it corrupts
memory and returns 0.0 for float arrays. I don't know whether the true
problem is in my code or the JIT, but I need to get this resolved quickly.

I know the JIT and my code are interacting badly because py.py works fine
(though slow) and translated pypy-c with jit and --jit threshold=9999999
both work fine.

Here's what I've tried to resolve the issue:
Removing my _immutable_fields_ hints.
Hand implementing bh_{get,set}arrayitem_raw_{r,i,f} (though I don't know whether my
implementation was right; I simply copied the gc version and removed the
first offset, since raw arrays have no header, right? Although I expect that
the gc version would have simply gotten 0 for the header size... I tried it
anyway).

A few thoughts:
descr.py alludes to a FloatArrayDescr which I never saw defined
Could the asm backend be part of the problem? Rather than the code in
llmodel.py?

Unfortunately I'm ill equipped to resolve this issue, so any help is
appreciated (I'm on my phone but I'll happily furnish exact errors upon
request)

From tobami at googlemail.com  Mon Aug 16 08:19:33 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Mon, 16 Aug 2010 08:19:33 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
In-Reply-To: <20100814142321.GA4071@code0.codespeak.net>
References: <20100814135002.GA1941@code0.codespeak.net>
	<20100814142321.GA4071@code0.codespeak.net>
Message-ID: 

Hi Armin,

are all results going to be run on a branch now?

If you run results on a branch, but don't change the config on
codespeed, the commit logs won't work because it will try to pull them
from trunk.


2010/8/14 Armin Rigo :
> Hi all,
>
> On Sat, Aug 14, 2010 at 03:50:02PM +0200, Armin Rigo wrote:
>> The PyPy speed center does not display any update more recent than July 29.
>
> Wrong (thanks Antonio).  It's only the twisted_web benchmark that stops
> at July 29; it was certainly removed at that date.  For the others it
> works as expected.
>
> The most recent results of today (76624) have been run on the
> kill-caninline branch.
>
>
> Armin.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From fijall at gmail.com  Mon Aug 16 09:00:52 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 16 Aug 2010 09:00:52 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
In-Reply-To: 
References: <20100814135002.GA1941@code0.codespeak.net>
	<20100814142321.GA4071@code0.codespeak.net>
	
Message-ID: 

I disabled twisted_web because it was running out of TCP connections.
Regarding branches: how can we have branches visible side-by-side with
trunk? Submit them as a different interpreter?

On Mon, Aug 16, 2010 at 8:19 AM, Miquel Torres  wrote:
> Hi Armin,
>
> are all results going to be run on a branch now?
>
> If you run results on a branch, but don't change the config on
> codespeed, the commit logs won't work because it will try to pull them
> from trunk.
>
>
> 2010/8/14 Armin Rigo :
>> Hi all,
>>
>> On Sat, Aug 14, 2010 at 03:50:02PM +0200, Armin Rigo wrote:
>>> The PyPy speed center does not display any update more recent than July 29.
>>
>> Wrong (thanks Antonio).  It's only the twisted_web benchmark that stops
>> at July 29; it was certainly removed at that date.  For the others it
>> works as expected.
>>
>> The most recent results of today (76624) have been run on the
>> kill-caninline branch.
>>
>>
>> Armin.
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
>>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From arigo at tunes.org  Mon Aug 16 15:06:48 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 16 Aug 2010 15:06:48 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
In-Reply-To: 
References: <20100814135002.GA1941@code0.codespeak.net>
	<20100814142321.GA4071@code0.codespeak.net>
	
Message-ID: <20100816130648.GA15483@code0.codespeak.net>

Hi Miquel,

On Mon, Aug 16, 2010 at 08:19:33AM +0200, Miquel Torres wrote:
> are all results going to be run on a branch now?

No no, I just ran manually twice on a branch.


A bientot,

Armin.

From arigo at tunes.org  Mon Aug 16 15:10:33 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 16 Aug 2010 15:10:33 +0200
Subject: [pypy-dev] JIT Failure on lltype.Array access
In-Reply-To: 
References: 
Message-ID: <20100816131033.GB15483@code0.codespeak.net>

Hi Dan,

The issue was that the JIT was silently and incorrectly accepting the
type lltype.Array(), which is a non-GC but with-length-prefix array, and
it was (by mistake) considering it to be a GC array.  That's where the
errors come from.

Now the JIT explicitly refuses to work with such arrays.  As explained
on IRC, you need anyway in micronumpy to use the type rffi.CArray(),
which does not contain the length prefix.
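
To make the layout difference concrete, here is a rough ctypes analogy (not RPython, and not the actual layouts the translator produces): one block that stores a length before its items, and one that holds only the items. The class and variable names are made up for the illustration:

```python
import ctypes

N = 4

# Rough analogue of a plain C array: the items and nothing else.
BareArray = ctypes.c_double * N


class LengthPrefixedArray(ctypes.Structure):
    # Rough analogue of an array that stores its length before the items.
    _fields_ = [("length", ctypes.c_long),
                ("items", BareArray)]


bare = BareArray(1.0, 2.0, 3.0, 4.0)
prefixed = LengthPrefixedArray(N, BareArray(1.0, 2.0, 3.0, 4.0))

# In the prefixed layout the items do not start at offset 0.
items_offset = LengthPrefixedArray.items.offset

# Reading the prefixed block as if it were a bare array of doubles starts at
# the header, so the first "item" is built from the bytes of the length field.
as_doubles = ctypes.cast(ctypes.pointer(prefixed),
                         ctypes.POINTER(ctypes.c_double))
misread = as_doubles[0]
```

Code that assumes the wrong one of the two layouts computes item addresses that are off by the size of the header, which is the kind of mismatch that produces wrong values or corrupted memory.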


A bientot,

Armin.

From tobami at googlemail.com  Mon Aug 16 16:15:39 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Mon, 16 Aug 2010 16:15:39 +0200
Subject: [pypy-dev] PyPy speed center not updating any more
In-Reply-To: <20100816130648.GA15483@code0.codespeak.net>
References: <20100814135002.GA1941@code0.codespeak.net>
	<20100814142321.GA4071@code0.codespeak.net>
	
	<20100816130648.GA15483@code0.codespeak.net>
Message-ID: 

Maciej: sorry, we had this issue pending for a long time already.

The best way would be to add a new project per branch. So instead of
project = 'PyPy'

save as
project = 'experimental_branchX'

then in the admin (the project entry will be created when the first
results are saved), choose whether to "track" the project (show or
hide in the changes view), and customize the commit log info (pull
logs from the corresponding subdir in svn instead of trunk).

Note: to avoid confusion, executable names are unique, so exe
(interpreter) names will need to be different as well (it could be
changed if needed)

Cheers,
Miquel


2010/8/16 Armin Rigo :
> Hi Miquel,
>
> On Mon, Aug 16, 2010 at 08:19:33AM +0200, Miquel Torres wrote:
>> are all results going to be run on a branch now?
>
> No no, I just ran manually twice on a branch.
>
>
> A bientot,
>
> Armin.
>

From bhartsho at yahoo.com  Thu Aug 19 03:25:02 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Wed, 18 Aug 2010 18:25:02 -0700 (PDT)
Subject: [pypy-dev] JIT'ed function performance degrades
Message-ID: <786138.15701.qm@web114011.mail.gq1.yahoo.com>

I am starting to learn how to use the JIT, and i'm confused why my function gets slower over time, twice as slow after running for a few minutes.  Using a virtualizable did speed up my code, but it still has the degrading performance problem.  I have yesterday's SVN and am using 64bit with boehm.  I understand boehm is slower, but overall my JIT'ed function is many times slower than un-jitted; is this expected behavior from boehm?

code is here:
http://pastebin.com/9VGJHpNa




From sakesun at gmail.com  Thu Aug 19 06:25:42 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Thu, 19 Aug 2010 11:25:42 +0700
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
Message-ID: 

Hi,

I have encountered this quite a few times when learning PyPy from internet
resources: code like this

>>> open('xxx', 'w').write('stuff')

does not work on PyPy because it relies on CPython's refcounting
behaviour.

I don't get it. Why?  I thought the code should be similar to storing the
file object in a temporary variable like this

>>> f = open('xxx', 'w')
>>> f.write('stuff')
>>> del f

Also, I've tried that with both Jython and IronPython and they all work
fine.

Why does this cause a problem for PyPy?  Do I have to avoid writing code like
this in the future?

From sakesun at gmail.com  Thu Aug 19 06:49:36 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Thu, 19 Aug 2010 11:49:36 +0700
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
Message-ID: 

That makes sense.  I've tried on both IronPython and Jython with:

ipy -c "open('xxx', 'w').write('stuff')"
jython -c "open('xxx', 'w').write('stuff')"

When the interpreter terminates, the file is closed. That's why it didn't
cause any problem.

Perhaps I should always use the "with" statement from now on.

>>> with open('xxx', 'w') as f: f.write('stuff')
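
A small sketch of why the timing of close() matters, assuming default buffering and a throwaway temporary directory (the variable names are only for illustration). A short write usually sits in a buffer until the file is flushed or closed, so a file object that is never closed promptly may leave nothing on disk for a while:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "xxx")

f = open(path, "w")
f.write("stuff")
# With default buffering, such a short write is typically still in memory here.
size_before_close = os.path.getsize(path)
f.close()                                   # flushes the buffer to disk
size_after_close = os.path.getsize(path)

# The "with" form closes the file at the end of the block, whatever the
# garbage collector does or does not do afterwards.
with open(path, "w") as g:
    g.write("more stuff")
closed_after_with = g.closed

with open(path) as h:
    content = h.read()
```

With reference counting the close happens as soon as the last reference goes away, so the one-liner appears to work; with a different garbage collector only the explicit forms give that guarantee.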

Thanks

On Thu, Aug 19, 2010 at 11:40 AM, Aaron DeVore wrote:

> If I understand correctly, PyPy will garbage collect (and close) the
> file object at an indeterminate time. That time could be as long as
> until the program exits. Because CPython uses reference counting, it
> closes the file immediately after the file object goes out of scope.
>
> Of course, I may be entirely wrong.
>
> -Aaron DeVore
>
> On Wed, Aug 18, 2010 at 9:25 PM, sakesun roykiatisak 
> wrote:
> > Hi,
> >  I encountered this quite a few times when learning pypy from internet
> > resources:
> >   the code like this
> >>>> open('xxx', 'w').write('stuff')
> > This code is not working on pypy because it rely on CPython refcounting
> > behaviour.
> > I don't get it. Why ?  I thought the code should be similar to storing
> the
> > file object in temporary variable like this
> >>>> f = open('xxx', 'w')
> >>>> f.write('stuff')
> >>>> del f
> > Also, I've tried that with both Jython and IronPython and they all work
> > fine.
> > Why does this cause problem to pypy ?  Do I have to avoid writing code
> like
> > this in the future ?
> > _______________________________________________
> > pypy-dev at codespeak.net
> > http://codespeak.net/mailman/listinfo/pypy-dev
> >
>

From sakesun at gmail.com  Thu Aug 19 07:07:47 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Thu, 19 Aug 2010 12:07:47 +0700
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
	
Message-ID: 

One small problem is that the "with" statement does not yet work in PyPy.

:)
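
If "with" is not available, try/finally gives the same guarantee that the file is closed. A minimal sketch, where write_text is a made-up helper name and the temporary path is only there to keep the example self-contained. (On a Python 2.5 level interpreter, "with" normally has to be enabled with "from __future__ import with_statement" at the top of the module; whether that is the issue here is only a guess.)

```python
import os
import tempfile


def write_text(path, data):
    """Write data to path and close the file even if write() raises."""
    f = open(path, "w")
    try:
        f.write(data)
    finally:
        f.close()


path = os.path.join(tempfile.mkdtemp(), "xxx")
write_text(path, "stuff")

# Read the file back the same way, without relying on the garbage collector.
f = open(path)
try:
    content = f.read()
finally:
    f.close()
```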


On Thu, Aug 19, 2010 at 11:49 AM, sakesun roykiatisak wrote:

> That makes sense.  I've tried on both IronPython and Jython with:
>
> ipy -c "open('xxx', 'w').write('stuff')"
> jython -c "open('xxx', 'w').write('stuff')"
>
> When the interpreter terminate the file is closed. That's why it didn't
> cause any problem.
>
> Perhaps, I should always use "with" statement from now on.
>
> >>> with open('xxx', 'w') as f: f.write('stuff')
>
> Thanks
>
> On Thu, Aug 19, 2010 at 11:40 AM, Aaron DeVore wrote:
>
>> If I understand correctly, PyPy will garbage collect (and close) the
>> file object at an indeterminate time. That time could be as long as
>> until the program exits. Because CPython uses reference counting, it
>> closes the file immediately after the file object goes out of scope.
>>
>> Of course, I may be entirely wrong.
>>
>> -Aaron DeVore
>>
>> On Wed, Aug 18, 2010 at 9:25 PM, sakesun roykiatisak 
>> wrote:
>> > Hi,
>> >  I encountered this quite a few times when learning pypy from internet
>> > resources:
>> >   the code like this
>> >>>> open('xxx', 'w').write('stuff')
>> > This code is not working on pypy because it rely on CPython refcounting
>> > behaviour.
>> > I don't get it. Why ?  I thought the code should be similar to storing
>> the
>> > file object in temporary variable like this
>> >>>> f = open('xxx', 'w')
>> >>>> f.write('stuff')
>> >>>> del f
>> > Also, I've tried that with both Jython and IronPython and they all work
>> > fine.
>> > Why does this cause problem to pypy ?  Do I have to avoid writing code
>> like
>> > this in the future ?
>> > _______________________________________________
>> > pypy-dev at codespeak.net
>> > http://codespeak.net/mailman/listinfo/pypy-dev
>> >
>>
>
>

From alex.gaynor at gmail.com  Thu Aug 19 07:09:25 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 19 Aug 2010 00:09:25 -0500
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Thu, Aug 19, 2010 at 12:07 AM, sakesun roykiatisak  wrote:
>
> A little problem is that, "with" statement is yet to work in pypy.
> :)
>
> On Thu, Aug 19, 2010 at 11:49 AM, sakesun roykiatisak 
> wrote:
>>
>> That makes sense.  I've tried on both IronPython and Jython with:
>> ipy -c "open('xxx', 'w').write('stuff')"
>> jython -c "open('xxx', 'w').write('stuff')"
>> When the interpreter terminates, the file is closed. That's why it didn't
>> cause any problem.
>> Perhaps, I should always use "with" statement from now on.
>> >>> with open('xxx', 'w') as f: f.write('stuff')
>> Thanks
>>
>> On Thu, Aug 19, 2010 at 11:40 AM, Aaron DeVore 
>> wrote:
>>>
>>> If I understand correctly, PyPy will garbage collect (and close) the
>>> file object at an indeterminate time. That time could be as long as
>>> until the program exits. Because CPython uses reference counting, it
>>> closes the file immediately after the file object goes out of scope.
>>>
>>> Of course, I may be entirely wrong.
>>>
>>> -Aaron DeVore
>>>
>>> On Wed, Aug 18, 2010 at 9:25 PM, sakesun roykiatisak 
>>> wrote:
>>> > Hi,
>>> >  I encountered this quite a few times when learning pypy from internet
>>> > resources:
>>> >   the code like this
>>> >>>> open('xxx', 'w').write('stuff')
>>> > This code is not working on pypy because it rely on CPython refcounting
>>> > behaviour.
>>> > I don't get it. Why ?  I thought the code should be similar to storing
>>> > the
>>> > file object in temporary variable like this
>>> >>>> f = open('xxx', 'w')
>>> >>>> f.write('stuff')
>>> >>>> del f
>>> > Also, I've tried that with both Jython and IronPython and they all work
>>> > fine.
>>> > Why does this cause problem to pypy ?  Do I have to avoid writing code
>>> > like
>>> > this in the future ?
>>> > _______________________________________________
>>> > pypy-dev at codespeak.net
>>> > http://codespeak.net/mailman/listinfo/pypy-dev
>>> >
>>
>
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

Since PyPy implements Python 2.5 at present, you'll need to use `from
__future__ import with_statement` to use it.

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From sakesun at gmail.com  Thu Aug 19 07:12:46 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Thu, 19 Aug 2010 12:12:46 +0700
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

Wow, thanks.  Pypy is a really precise implementation.


> Since PyPy implements Python 2.5 at present you'll need to use `from
> __future__ import with_statement` to use it.
>
> Alex
>
>

From william.leslie.ttg at gmail.com  Thu Aug 19 07:13:36 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 19 Aug 2010 15:13:36 +1000
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
	
	
Message-ID: 

A good resource I recently read on this is this entry in Raymond Chen's blog:

http://blogs.msdn.com/b/oldnewthing/archive/2010/08/09/10047586.aspx

Together with the following entry, which explains why the lifetime of
the variable has nothing to do with the lifetime of the object, this
should help you understand.

You should consider automatically closing a file to be an
implementation detail; even CPython may not respect such semantics in
the future. That is why the with statement was created.
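
A minimal sketch of that style, for illustration only: the scratch path
and the variable names are made up for the example, and on Python 2.5
the module would additionally need `from __future__ import
with_statement` as its first statement.

```python
import os
import tempfile

# Hypothetical scratch location, used only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'xxx')

# The with block closes the file when the block exits, whether or not a
# garbage collector has run, so nothing here depends on reference counting.
with open(path, 'w') as f:
    f.write('stuff')

closed_after_block = f.closed

# Reading the file back right away is safe because it was closed explicitly.
with open(path) as g:
    contents = g.read()
```

The point of the design is that the close happens at a known place in
the program instead of whenever the object happens to be collected.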

-- 
William Leslie

From sakesun at gmail.com  Thu Aug 19 07:20:35 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Thu, 19 Aug 2010 12:20:35 +0700
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

Thanks.

Interestingly, this is not the first time I have been pointed to
Raymond Chen's blog for further reading.

http://www.mail-archive.com/users@lists.ironpython.com/msg05792.html

:)

On Thu, Aug 19, 2010 at 12:13 PM, William Leslie <
william.leslie.ttg at gmail.com> wrote:

> A good resource I recently read on this is this entry in Raymond Chen's
> blog:
>
> http://blogs.msdn.com/b/oldnewthing/archive/2010/08/09/10047586.aspx
>
> Together with the following entry, which explains why the lifetime of
> the variable has nothing to do with the lifetime of the object, this
> should help you understand.
>
> You should consider automatically closing a file to be an
> implementation detail, even cpython may not respect such semantics in
> future. That is why the with statement was created.
>
> --
> William Leslie
>

From p.giarrusso at gmail.com  Thu Aug 19 08:48:05 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 19 Aug 2010 08:48:05 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
References: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
Message-ID: 

On Thu, Aug 19, 2010 at 03:25, Hart's Antler  wrote:
> I am starting to learn how to use the JIT, and i'm confused why my function gets slower over time, twice as slow after running for a few minutes.  Using a virtualizable did speed up my code, but it still has the degrading performance problem.  I have yesterdays SVN and using 64bit with boehm.  I understand boehm is slower, but overall my JIT'ed function is many times slower than un-jitted, is this expected behavior from boehm?
>
> code is here:
> http://pastebin.com/9VGJHpNa

I think this has nothing to do with Boehm.

Is it swapping? If yes, that explains the slowdown.
Is memory usage growing over time? I expect yes, and it's a
misbehavior which could be explained by my analysis below.
Is it JITting code? I think no, or not to an advantage, but that's a
more complicated guess.
BTW, when debugging such things, _always_ ask and answer these
questions yourself.

Moreover, I'm not sure you need to use the JIT yourself.
- Your code is RPython, so you could as well just translate it without
JIT annotations, and it will be compiled to C code.
- Otherwise, you could write that as an app-level function, i.e. in
normal Python, and pass it to a translated PyPy-JIT interpreter. Did
you try and benchmark the code?
Can I ask you why you did not write that as an app-level function, i.e.
as normal Python code, to use PyPy's JIT directly, without needing
detailed understanding of the JIT?
It would be interesting to see a comparison (and have it on the web,
after some code review).

Especially, I'm not sure that as currently written you're getting any
speedup, and I seriously wonder whether the JIT could give an
additional speedup over RPython here (the regexp interpreter is a
completely different case, since it compiles a regexp, but why do you
compile an array?).
I think just raw CPython can be 340x slower than C (I assume NumPy
uses C), and since your code is RPython, there must be something basic
wrong.

I think you have too many green variables in your code:
"At runtime, for a given value of the green variables, one piece of
machine code will be generated. This piece of machine code can
therefore assume that the value of the green variable is constant."
[1]

So, every time you change the value of a green variable, the JIT will
have to recompile the function again. Note that actually, I think, for
each new value of the variable, first a given number of iterations
have to occur (1000? 10 000? I'm not sure), then the JIT will spend
time creating a trace and compiling it. The length of the involved
arrays is maybe around the threshold, maybe smaller, so you get "all
pain, and no gain".

From your code:
complex_dft_jitdriver = JitDriver(
        greens = 'index length accum array'.split(),
        reds = 'k a b J'.split(),
        virtualizables = 'a'.split()
        #can_inline=True
)

The only acceptable green variables there are, IMHO, array and length,
because in the calling code the others change on each invocation, I
think.
I also think that only length should be green (and that could give a
speedup), and that marking array as green gives negligible or no
speedup.
Marking length as green allows specializing the function on the size
of the array - something one would not do in C probably, but that one
could do in C++. Whether it is worth it depends on the specific code &
optimizations available - I think here the speedup should be small.

Best regards

[1] http://morepypy.blogspot.com/2010/06/jit-for-regular-expression-matching.html
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Thu Aug 19 12:03:17 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 19 Aug 2010 12:03:17 +0200
Subject: [pypy-dev] What's wrong with >>> open(’xxx’, ’w’).write(’stuff’) ?
In-Reply-To: 
References: 
Message-ID: 

Hi.

Yes, those two things are equivalent and they both work. However, if
you try to read the file immediately after deleting the variable,
you'll find out that the file is empty on any implementation but
cpython.
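
A rough sketch of the difference, assuming a small write that stays in
the file object's buffer. The path and names are only illustrative, and
keeping `f` alive stands in for "the garbage collector has not run yet"
on a non-refcounting implementation.

```python
import os
import tempfile

# Hypothetical scratch location, used only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'xxx')

f = open(path, 'w')
f.write('stuff')

# The file object is still alive and unclosed here, which is the state a
# non-refcounting implementation is in until its collector runs.
r = open(path)
seen_before_close = r.read()   # typically '' because the write is still buffered
r.close()

f.close()                      # an explicit close flushes the buffer

r = open(path)
seen_after_close = r.read()    # the data is on disk now
r.close()
```

Only the read after the explicit close() is reliable on every
implementation; the earlier read depends on buffering and on when the
object is finalized.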

On Thu, Aug 19, 2010 at 6:25 AM, sakesun roykiatisak  wrote:
> Hi,
>  I encountered this quite a few times when learning pypy from internet
> resources:
>   the code like this
>>>> open('xxx', 'w').write('stuff')
> This code is not working on pypy because it rely on CPython refcounting
> behaviour.
> I don't get it. Why ?  I thought the code should be similar to storing the
> file object in temporary variable like this
>>>> f = open('xxx', 'w')
>>>> f.write('stuff')
>>>> del f
> Also, I've tried that with both Jython and IronPython and they all work
> fine.
> Why does this cause problem to pypy ?  Do I have to avoid writing code like
> this in the future ?
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From fijall at gmail.com  Thu Aug 19 12:11:29 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 19 Aug 2010 12:11:29 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: 
References: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
	
Message-ID: 

Hi

On Thu, Aug 19, 2010 at 8:48 AM, Paolo Giarrusso  wrote:
> On Thu, Aug 19, 2010 at 03:25, Hart's Antler  wrote:
>> I am starting to learn how to use the JIT, and i'm confused why my function gets slower over time, twice as slow after running for a few minutes.  Using a virtualizable did speed up my code, but it still has the degrading performance problem.  I have yesterdays SVN and using 64bit with boehm.  I understand boehm is slower, but overall my JIT'ed function is many times slower than un-jitted, is this expected behavior from boehm?
>>
>> code is here:
>> http://pastebin.com/9VGJHpNa
>
> I think this has nothing to do with Boehm.

I don't think so either.



> Moreover, I'm not sure you need to use the JIT yourself.
> - Your code is RPython, so you could as well just translate it without
> JIT annotations, and it will be compiled to C code.
> - Otherwise, you could write that as a app-level function, i.e. in
> normal Python, and pass it to a translated PyPy-JIT interpreter. Did
> you try and benchmark the code?
> Can I ask you why you did not write that as a app-level function, i.e.
> as normal Python code, to use PyPy's JIT directly, without needing
> detailed understanding of the JIT?
> It would be interesting to see a comparison (and have it on the web,
> after some code review).

The JIT essentially gets its speedup from constant folding based on
the bytecode. The bytecode should be the only green variable here, and
all others (that you don't want to specialize over) should be red and
not promoted. In your case it's very likely you compile a new loop very
often (overspecialization).

>
> Especially, I'm not sure that as currently written you're getting any
> speedup, and I seriously wonder whether the JIT could give an
> additional speedup over RPython here (the regexp interpreter is a
> completely different case, since it compiles a regexp, but why do you
> compile an array?).

That's silly, our python interpreter is an RPython program. Anything
that can have a meaningfully defined "bytecode" or a "compile time
constant" can be sped up by the JIT. For example a templating
language.

> I think just raw CPython can be 340x slower than C (I assume NumPy
> uses C)

You should check more and make fewer assumptions.

>
> So, every time you change the value of a green variable, the JIT will
> have to recompile again the function. Note that actually, I think, for
> each new value of the variable, first a given number of iterations
> have to occur (1000? 10 000? I'm not sure), then the JIT will spend
> time creating a trace and compiling it. The length of the involved
> arrays is maybe around the threshold, maybe smaller, so you get "all
> pain, and no gain".
>

To be precise, for each combination of green variables there have to
be 1000 (by default) iterations. If that never happens, you'll never
compile code and will simply spend time bookkeeping.
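
A toy model of that bookkeeping, in plain Python. This is not PyPy's
actual implementation: `simulate`, `N` and `THRESHOLD` are made-up names,
and the sizes are scaled down from the 1024 and 1000 discussed here so
that it runs quickly.

```python
from collections import defaultdict

N = 128          # scaled-down array length (1024 in the discussion)
THRESHOLD = 100  # scaled-down hotness threshold (1000 by default in the JIT)

def simulate(green_keys):
    """Count loop iterations per green key.

    Returns (number of distinct keys, number of keys that reached the
    threshold and would therefore be traced and compiled).
    """
    counters = defaultdict(int)
    for key in green_keys:
        counters[key] += 1
    hot = sum(1 for count in counters.values() if count >= THRESHOLD)
    return len(counters), hot

# Naive DFT shape: an outer loop over `index` and an inner loop over `k`.
# Only the length is green: one key, one compiled loop.
only_length = simulate((N,) for index in range(N) for k in range(N))

# `index` is green too: one separately compiled loop per outer iteration.
index_and_length = simulate((index, N) for index in range(N) for k in range(N))

# A green that changes on every iteration: no key ever gets hot, so the
# model only ever does bookkeeping and never compiles anything.
changes_every_time = simulate((index, k) for index in range(N) for k in range(N))
```

In this model the three cases come out as (1, 1), (128, 128) and
(16384, 0): one useful loop, many redundant loops, and pure overhead.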

Cheers,
fijal

From p.giarrusso at gmail.com  Thu Aug 19 13:34:12 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 19 Aug 2010 13:34:12 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: 
References: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
	
	
Message-ID: 

Hi Maciej,
I think you totally misunderstood me, possibly because I was not
clear, see below. In short, I was wondering whether the approach of
the original code made any sense, and my guess was "mostly not",
exactly because there is little constant folding possible in the code,
as it is written.

[Hart, I don't think that any O(N^2) implementation of DFT (what is in
the code), i.e. two nested for loops, should be written to explicitly
take advantage of the JIT. I don't know about the FFT algorithm, but a
few vague ideas say "yes", because constant folding the length could
_maybe_ allow constant folding the permutations applied to data in the
Cooley-Tukey FFT algorithm.]

On Thu, Aug 19, 2010 at 12:11, Maciej Fijalkowski  wrote:
> Hi


>> Moreover, I'm not sure you need to use the JIT yourself.
>> - Your code is RPython, so you could as well just translate it without
>> JIT annotations, and it will be compiled to C code.
>> - Otherwise, you could write that as a app-level function, i.e. in
>> normal Python, and pass it to a translated PyPy-JIT interpreter. Did
>> you try and benchmark the code?
>> Can I ask you why you did not write that as a app-level function, i.e.
>> as normal Python code, to use PyPy's JIT directly, without needing
>> detailed understanding of the JIT?
>> It would be interesting to see a comparison (and have it on the web,
>> after some code review).
>
> JIT can essentially speed up based on constant folding based on
> bytecode. Bytecode should be the only green variable here and all
> others (that you don't want to specialize over) should be red and not
> promoted. In your case it's very likely you compile new loop very
> often (overspecialization).

I see no bytecode in the example - it's a DFT implementation.
For each combination of green variables, there are 1024 iterations,
and there are 1024 such combinations, so overspecialization is almost
guaranteed.

My next question, inspired from the specific code, is: is JITted code
ever thrown away, if too much is generated? Even for valid use cases,
most JITs can generate too much code, and they need then to choose
what to keep and what to throw away.

>> Especially, I'm not sure that as currently written you're getting any
>> speedup, and I seriously wonder whether the JIT could give an
>> additional speedup over RPython here (the regexp interpreter is a
>> completely different case, since it compiles a regexp, but why do you
>> compile an array?).
>
> That's silly, our python interpreter is an RPython program. Anything
> that can have a meaningfully defined "bytecode" or a "compile time
> constant" can be sped up by the JIT. For example a templating
> language.

You misunderstood me, I totally agree with you, and my understanding
is that in the given program (which I read almost fully) constant
folding makes little sense.
Since that program is written with RPython + JIT, but it has green
variables which are not at all "compile time constants", "I wonder
seriously" was meant as "I wonder seriously whether what you are
trying makes any sense". As I argued, the only constant folding
possible is for the array length. And again, I wonder whether it's
worth it, my guess tends towards "no", but a benchmark is needed
(there will be some improvement probably).

I was just a bit vaguer because I just studied docs on PyPy (and
papers about tracing compilation). But your answer confirms that my
original analysis is correct, and that I should write more clearly
maybe.

>> I think just raw CPython can be 340x slower than C (I assume NumPy
>> uses C)

> You should check more and have less assumptions.

I did some checks, on PyPy's blog actually, not definitive though, and
I stand by what I meant (see below). Without reading the pastie in
full, however, my comments are out of context.
I guess your tone is fine, since you thought I wrote nonsense. But in
general, I have yet to see a guideline forbidding "IIRC" and similar
ways of discussing (the above was an _educated_ guess), especially
when the writer remembers correctly (as in this case).
Having said that, I'm always happy to see counterexamples and learn
something, if they exist. In this case, for what I actually meant (and
wrote, IMHO), a counterexample would be an RPython or a JITted program
that is >= 340x slower than C.

For the speed ratio, the code pastie writes that RPython JITted code
is 340x slower than NumPy code, and I was writing that it's
unreasonable; in this case, it happens because of overspecialization
caused by misuse of the JIT.

For speed ratios among CPython, C, RPython, I was comparing to
http://morepypy.blogspot.com/2010/06/jit-for-regular-expression-matching.html.
What I meant is that JITted code can't be so much slower than C.

For NumPy, I had read this:
http://morepypy.blogspot.com/2009/07/pypy-numeric-experiments.html,
and it mostly implies that NumPy is written in C (it actually says
"NumPy's C version", but I missed it). And for the specific discussed
microbenchmark, the performance gap between NumPy and CPython is
~100x.

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Thu Aug 19 13:55:00 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 19 Aug 2010 13:55:00 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: 
References: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
	
	
	
Message-ID: 

On Thu, Aug 19, 2010 at 1:34 PM, Paolo Giarrusso  wrote:
> Hi Maciej,
> I think you totally misunderstood me, possibly because I was not
> clear, see below. In short, I was wondering whether the approach of
> the original code made any sense, and my guess was "mostly not",
> exactly because there is little constant folding possible in the code,
> as it is written.

That's always possible :)

>
> [Hart, I don't think that any O(N^2) implementation of DFT (what is in
> the code), i.e. two nested for loops, should be written to explicitly
> take advantage of the JIT. I don't know about the FFT algorithm, but a
> few vague ideas say "yes", because constant folding the length could
> _maybe_ allow constant folding the permutations applied to data in the
> Cooley-Tukey FFT algorithm.]
>
> On Thu, Aug 19, 2010 at 12:11, Maciej Fijalkowski  wrote:
>> Hi
> 
>
>>> Moreover, I'm not sure you need to use the JIT yourself.
>>> - Your code is RPython, so you could as well just translate it without
>>> JIT annotations, and it will be compiled to C code.
>>> - Otherwise, you could write that as a app-level function, i.e. in
>>> normal Python, and pass it to a translated PyPy-JIT interpreter. Did
>>> you try and benchmark the code?
>>> Can I ask you why you did not write that as a app-level function, i.e.
>>> as normal Python code, to use PyPy's JIT directly, without needing
>>> detailed understanding of the JIT?
>>> It would be interesting to see a comparison (and have it on the web,
>>> after some code review).
>>
>> JIT can essentially speed up based on constant folding based on
>> bytecode. Bytecode should be the only green variable here and all
>> others (that you don't want to specialize over) should be red and not
>> promoted. In your case it's very likely you compile new loop very
>> often (overspecialization).
>
> I see no bytecode in the example - it's a DFT implementation.
> For each combination of green variables, there are 1024 iterations,
> and there are 1024 such combinations, so overspecialization is almost
> guaranteed.

Agreed.

>
> My next question, inspired from the specific code, is: is JITted code
> ever thrown away, if too much is generated? Even for valid use cases,
> most JITs can generate too much code, and they need then to choose
> what to keep and what to throw away.

No, as of now, never. In general, in the case of Python it would have
to be a heuristic anyway (since code objects are mostly immortal and you
can't decide whether a certain combination of assumptions will occur in
the future or not). We have some ideas about which code will never run
any more, and besides that, we need to implement some heuristics for
when to throw away code.

>
>>> Especially, I'm not sure that as currently written you're getting any
>>> speedup, and I seriously wonder whether the JIT could give an
>>> additional speedup over RPython here (the regexp interpreter is a
>>> completely different case, since it compiles a regexp, but why do you
>>> compile an array?).
>>
>> That's silly, our python interpreter is an RPython program. Anything
>> that can have a meaningfully defined "bytecode" or a "compile time
>> constant" can be sped up by the JIT. For example a templating
>> language.
>
> You misunderstood me, I totally agree with you, and my understanding
> is that in the given program (which I read almost fully) constant
> folding makes little sense.

Great :) I might have misunderstood you.

> Since that program is written with RPython + JIT, but it has green
> variables which are not at all "compile time constants", "I wonder
> seriously" was meant as "I wonder seriously whether what you are
> trying makes any sense". As I argued, the only constant folding
> possible is for the array length. And again, I wonder whether it's
> worth it, my guess tends towards "no", but a benchmark is needed
> (there will be some improvement probably).

I guess the answer is "hell no", simply because if you don't constant
fold, our assembler would not be nearly as good as gcc's (if nothing
else).

>
> I was just a bit vaguer because I just studied docs on PyPy (and
> papers about tracing compilation). But your answer confirms that my
> original analysis is correct, and that I should write more clearly
> maybe.
>
>>> I think just raw CPython can be 340x slower than C (I assume NumPy
>>> uses C)
>
>> You should check more and have less assumptions.
>
> I did some checks, on PyPy's blog actually, not definitive though, and
> I stand by what I meant (see below). Without reading the pastie in
> full, however, my comments are out of context.
> I guess your tone is fine, since you thought I wrote nonsense. But in
> general, I have yet to see a guideline forbidding "IIRC" and similar
> ways of discussing (the above was an _educated_ guess), especially
> when the writer remembers correctly (as in this case).
> Having said that, I'm always happy to see counterexamples and learn
> something, if they exist. In this case, for what I actually meant (and
> wrote, IMHO), a counterexample would be a RPython or a JITted program
> >= 340x slower than C.

My comment was merely about "numpy is written in C".

>
> For the speed ratio, the code pastie writes that RPython JITted code
> is 340x slower than NumPy code, and I was writing that it's
> unreasonable; in this case, it happens because of overspecialization
> caused by misuse of the JIT.

Yes.

>
> For speed ratios among CPython, C, RPython, I was comparing to
> http://morepypy.blogspot.com/2010/06/jit-for-regular-expression-matching.html.
> What I meant is that JITted code can't be so much slower than C.
>
> For NumPy, I had read this:
> http://morepypy.blogspot.com/2009/07/pypy-numeric-experiments.html,
> and it mostly implies that NumPy is written in C (it actually says
> "NumPy's C version", but I missed it). And for the specific discussed
> microbenchmark, the performance gap between NumPy and CPython is
> ~100x.

Yes, there is a slight difference :-) numpy is written mostly in C (at
least glue code), but a lot of algorithms call back to some other
stuff (depending what you have installed) which as far as I'm
concerned might be whatever (most likely fortran or SSE assembler at
some level.)

>
> Best regards
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
>

From bhartsho at yahoo.com  Fri Aug 20 08:04:48 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Thu, 19 Aug 2010 23:04:48 -0700 (PDT)
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: 
Message-ID: <23133.28490.qm@web114006.mail.gq1.yahoo.com>

Hi Paolo,

Thanks for your in-depth response. I tried your suggestions and noticed a big speed improvement with no more degrading performance; I didn't realize that having more green variables is bad.  However, it still runs 4x slower than just plain old compiled RPython. I checked whether the JIT was really running, and you're right, it's not actually using any JIT'ed code: it only traces and then aborts, though after trying several things I still cannot figure out why it aborts.

I didn't write this as an app-level function because I wanted to understand how the JIT works on a deeper level and with RPython.  I had seen the blog post by Carl Friedrich Bolz about JIT'ing, in which he was able to make things 22x faster than plain RPython translated to C, so that got me curious about the JIT.  Now I understand that that was an exceptional case, but in what other cases might RPython+JIT be useful?  And it's good to see here what speedup, if any, there will be in the worst-case scenario.

Sorry about all the confusion about NumPy being 340x faster. I should have added to that note that I compared NumPy's fast Fourier transform to an RPython direct Fourier transform, and the direct one is known to be hundreds of times slower.  (NumPy lacks a DFT to compare to.)
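
As a rough illustration of where that gap comes from (a plain-Python
sketch, not the code from the pastebin; `naive_dft` is a made-up name):
the direct transform uses two nested loops, so it does about N*N
multiply-adds, while an FFT does on the order of N*log2(N).

```python
import cmath
import math

def naive_dft(samples):
    """Direct O(N^2) discrete Fourier transform with two nested loops."""
    n = len(samples)
    out = []
    for k in range(n):              # one output bin per k
        acc = 0j
        for j in range(n):          # sum over every input sample
            acc += samples[j] * cmath.exp(-2j * cmath.pi * j * k / n)
        out.append(acc)
    return out

# A unit impulse has a flat spectrum: every bin equals 1.
spectrum = naive_dft([1.0, 0.0, 0.0, 0.0])

# Operation-count estimate for N = 1024, ignoring constant factors.
n = 1024
direct_ops = n * n                  # 1048576
fft_ops = n * math.log2(n)          # 10240.0
ratio = direct_ops / fft_ops        # 102.4
```

So for N = 1024 the algorithm alone accounts for a factor of roughly
100, before any difference between the implementations is counted.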

updated code with only the length as green: http://pastebin.com/DnJikXze

The jitted function now checks jit.we_are_jitted(), and prints 'unjitted' if there is no jitting.
abort: trace too long seems to happen every trace, so we_are_jitted() is never true, and the 4x overhead compared to compiled RPython is then understandable.

trace_limit is set to its maximum, so why is it aborting?  Here are my settings:
	jitdriver.set_param('threshold', 4)
	jitdriver.set_param('trace_eagerness', 4)
	jitdriver.set_param('trace_limit', sys.maxint)
	jitdriver.set_param('debug', 3)


Tracing:      	80	1.019871
Backend:      	0	0.000000
Running asm:     	0
Blackhole:       	80
TOTAL:      		16.785704
ops:             	1456160
recorded ops:    	1200000
  calls:         	99080
guards:          	430120
opt ops:         	0
opt guards:      	0
forcings:        	0
abort: trace too long:	80
abort: compiling:	0
abort: vable escape:	0
nvirtuals:       	0
nvholes:         	0
nvreused:        	0


--- On Thu, 8/19/10, Maciej Fijalkowski  wrote:

> From: Maciej Fijalkowski 
> Subject: Re: [pypy-dev] JIT'ed function performance degrades
> To: "Paolo Giarrusso" 
> Cc: "Hart's Antler" , pypy-dev at codespeak.net
> Date: Thursday, 19 August, 2010, 4:55 AM
> On Thu, Aug 19, 2010 at 1:34 PM,
> Paolo Giarrusso 
> wrote:
> > Hi Maciej,
> > I think you totally misunderstood me, possibly because
> I was not
> > clear, see below. In short, I was wondering whether
> the approach of
> > the original code made any sense, and my guess was
> "mostly not",
> > exactly because there is little constant folding
> possible in the code,
> > as it is written.
> 
> That's always possible :)
> 
> >
> > [Hart, I don't think that any O(N^2) implementation of
> DFT (what is in
> > the code), i.e. two nested for loops, should be
> written to explicitly
> > take advantage of the JIT. I don't know about the FFT
> algorithm, but a
> > few vague ideas say "yes", because constant folding
> the length could
> > _maybe_ allow constant folding the permutations
> applied to data in the
> > Cooley-Tukey FFT algorithm.]
> >
> > On Thu, Aug 19, 2010 at 12:11, Maciej Fijalkowski
> 
> wrote:
> >> Hi
> > 
> >
> >>> Moreover, I'm not sure you need to use the JIT
> yourself.
> >>> - Your code is RPython, so you could as well
> just translate it without
> >>> JIT annotations, and it will be compiled to C
> code.
> >>> - Otherwise, you could write that as a
> app-level function, i.e. in
> >>> normal Python, and pass it to a translated
> PyPy-JIT interpreter. Did
> >>> you try and benchmark the code?
> >>> Can I ask you why you did not write that as a
> app-level function, i.e.
> >>> as normal Python code, to use PyPy's JIT
> directly, without needing
> >>> detailed understanding of the JIT?
> >>> It would be interesting to see a comparison
> (and have it on the web,
> >>> after some code review).
> >>
> >> JIT can essentially speed up based on constant
> folding based on
> >> bytecode. Bytecode should be the only green
> variable here and all
> >> others (that you don't want to specialize over)
> should be red and not
> >> promoted. In your case it's very likely you
> compile new loop very
> >> often (overspecialization).
> >
> > I see no bytecode in the example - it's a DFT
> implementation.
> > For each combination of green variables, there are
> 1024 iterations,
> > and there are 1024 such combinations, so
> overspecialization is almost
> > guaranteed.
> 
> Agreed.
> 
> >
> > My next question, inspired from the specific code, is:
> is JITted code
> > ever thrown away, if too much is generated? Even for
> valid use cases,
> > most JITs can generate too much code, and they need
> then to choose
> > what to keep and what to throw away.
> 
> No, as of now, never. In general in case of Python it would
> have to be
> a heuristic anyway (since code objects are mostly immortal
> and you
> can't decide whether certain combination of assumptions
> will occur in
> the future or not). We have some ideas which code will
> never run any
> more and besides that, we need to implement some heuristics
> when to
> throw away code.
> 
> >
> >>> Especially, I'm not sure that as currently
> written you're getting any
> >>> speedup, and I seriously wonder whether the
> JIT could give an
> >>> additional speedup over RPython here (the
> regexp interpreter is a
> >>> completely different case, since it compiles a
> regexp, but why do you
> >>> compile an array?).
> >>
> >> That's silly, our python interpreter is an RPython
> program. Anything
> >> that can have a meaningfully defined "bytecode" or
> a "compile time
> >> constant" can be sped up by the JIT. For example a
> templating
> >> language.
> >
> > You misunderstood me, I totally agree with you, and my
> understanding
> > is that in the given program (which I read almost
> fully) constant
> > folding makes little sense.
> 
> Great :) I might have misunderstood you.
> 
> > Since that program is written with RPython + JIT, but
> it has green
> > variables which are not at all "compile time
> constants", "I wonder
> > seriously" was meant as "I wonder seriously whether
> what you are
> > trying makes any sense". As I argued, the only
> constant folding
> > possible is for the array length. And again, I wonder
> whether it's
> > worth it, my guess tends towards "no", but a benchmark
> is needed
> > (there will be some improvement probably).
> 
> I guess the answer is "hell no", simply because if you
> don't constant
> fold our assembler would not be nearly as good as gcc's one
> (if
> nothing else).
> 
> >
> > I was just a bit vaguer because I just studied docs on
> PyPy (and
> > papers about tracing compilation). But your answer
> confirms that my
> > original analysis is correct, and that I should write
> more clearly
> > maybe.
> >
> >>> I think just raw CPython can be 340x slower
> than C (I assume NumPy
> >>> uses C)
> >
> >> You should check more and have less assumptions.
> >
> > I did some checks, on PyPy's blog actually, not
> definitive though, and
> > I stand by what I meant (see below). Without reading
> the pastie in
> > full, however, my comments are out of context.
> > I guess your tone is fine, since you thought I wrote
> nonsense. But in
> > general, I have yet to see a guideline forbidding
> "IIRC" and similar
> > ways of discussing (the above was an _educated_
> guess), especially
> > when the writer remembers correctly (as in this
> case).
> > Having said that, I'm always happy to see counterexamples and learn
> > something, if they exist. In this case, for what I actually meant (and
> > wrote, IMHO), a counterexample would be an RPython or a JITted program
> > that is >= 340x slower than C.
> 
> My comment was merely about "numpy is written in C".
> 
> >
> > For the speed ratio, the code pastie writes that
> RPython JITted code
> > is 340x slower than NumPy code, and I was writing that
> it's
> > unreasonable; in this case, it happens because of
> overspecialization
> > caused by misuse of the JIT.
> 
> Yes.
> 
> >
> > For speed ratios among CPython, C, RPython, I was
> comparing to
> > http://morepypy.blogspot.com/2010/06/jit-for-regular-expression-matching.html.
> > What I meant is that JITted code can't be so much
> slower than C.
> >
> > For NumPy, I had read this:
> > http://morepypy.blogspot.com/2009/07/pypy-numeric-experiments.html,
> > and it mostly implies that NumPy is written in C (it
> actually says
> > "NumPy's C version", but I missed it). And for the
> specific discussed
> > microbenchmark, the performance gap between NumPy and
> CPython is
> > ~100x.
> 
> Yes, there is a slight difference :-) numpy is written
> mostly in C (at
> least glue code), but a lot of algorithms call back to some
> other
> stuff (depending what you have installed) which as far as
> I'm
> concerned might be whatever (most likely fortran or SSE
> assembler at
> some level.)
> 
> >
> > Best regards
> > --
> > Paolo Giarrusso - Ph.D. Student
> > http://www.informatik.uni-marburg.de/~pgiarrusso/
> >
> 




From timon.elviejo at gmail.com  Fri Aug 20 09:58:07 2010
From: timon.elviejo at gmail.com (Jorge Timón)
Date: Fri, 20 Aug 2010 09:58:07 +0200
Subject: [pypy-dev] gpgpu and pypy
Message-ID: 

Hi, I'm just curious about the feasibility of running Python code on a GPU
by extending PyPy.
I don't have the time (and probably not the knowledge either) to develop that
PyPy extension, but I just want to know if it's possible.
I'm interested in languages like OpenCL and NVIDIA's CUDA because I think
the future of supercomputing is going to be GPGPU. There are people working on
bringing GPGPU to Python:

http://mathema.tician.de/software/pyopencl
http://mathema.tician.de/software/pycuda

Would it be possible to run Python code in parallel without the developer
having to actively parallelize the code?
I'm not talking about code with hard concurrency, but about code with intrinsic
parallelism (let's say matrix multiplication).
Would JIT compilation be capable of detecting parallelism?
Would that be interesting, or is it a job we must leave to humans for now?
What do you think?

I don't know if I have explained myself well, because English is not my first
language.
Cheers,
Jorge Timón

From william.leslie.ttg at gmail.com  Fri Aug 20 10:05:50 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Fri, 20 Aug 2010 18:05:50 +1000
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: <23133.28490.qm@web114006.mail.gq1.yahoo.com>
References: 
	<23133.28490.qm@web114006.mail.gq1.yahoo.com>
Message-ID: 

On 20 August 2010 16:04, Hart's Antler  wrote:
> Hi Paolo,
>
> thanks for your in-depth response. I tried your suggestions and noticed a big speed improvement with no more degrading performance; I didn't realize having more greens is bad.  However, it still runs 4x slower than just plain old compiled RPython. I checked whether the JIT was really running, and you're right: it's not actually using any JIT'ed code, it only traces and then aborts, and after trying several things I still cannot figure out why it aborts.
>
> I didn't write this as an app-level function because I wanted to understand how the JIT works on a deeper level and with RPython.  I had seen the blog post by Carl Friedrich Bolz about JIT'ing, where he was able to make things 22x faster than plain RPython translated to C, so that got me curious about the JIT.  Now I understand that that was an exceptional case, but in what other cases might RPython+JIT be useful?  It's also good to see here what speedup, if any, there will be in the worst-case scenario.
>
> Sorry about all the confusion about numpy being 340x faster. I should have added a note that I compared numpy's fast Fourier transform to an RPython direct Fourier transform, and the direct form is known to be hundreds of times slower.  (numpy lacks a DFT to compare to.)
>
> updated code with only the length as green: http://pastebin.com/DnJikXze
>
> The jitted function now checks jit.we_are_jitted(), and prints 'unjitted' if there is no jitting.
> "abort: trace too long" seems to happen on every trace, so we_are_jitted() is never true, and the 4x overhead compared to compiled RPython is then understandable.
>
> trace_limit is set to its maximum, so why is it aborting?  Here are my settings:
>        jitdriver.set_param('threshold', 4)
>        jitdriver.set_param('trace_eagerness', 4)
>        jitdriver.set_param('trace_limit', sys.maxint)
>        jitdriver.set_param('debug', 3)
>
>
> Tracing:                80      1.019871
> Backend:                0       0.000000
> Running asm:            0
> Blackhole:              80
> TOTAL:                  16.785704
> ops:                    1456160
> recorded ops:           1200000
>   calls:                99080
> guards:                 430120
> opt ops:                0
> opt guards:             0
> forcings:               0
> abort: trace too long:  80
> abort: compiling:       0
> abort: vable escape:    0
> nvirtuals:              0
> nvholes:                0
> nvreused:               0

This application probably isn't a very good use for the jit because it
has very little control flow. It may unroll the loop, but you're
probably not gaining anything there. As long as the methods get
inlined (as there is no polymorphic dispatch here that I can see), jit
can't improve on this much. What optimisations do you expect it to
make?

-- 
William Leslie

From arigo at tunes.org  Fri Aug 20 11:31:58 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 20 Aug 2010 11:31:58 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
References: <786138.15701.qm@web114011.mail.gq1.yahoo.com>
Message-ID: <20100820093158.GA16244@code0.codespeak.net>

Hi Hart,

On Wed, Aug 18, 2010 at 06:25:02PM -0700, Hart's Antler wrote:
> I am starting to learn how to use the JIT, and i'm confused why my
> function gets slower over time, twice as slow after running for a few
> minutes.  Using a virtualizable did speed up my code, but it still has
> the degrading performance problem.  I have yesterdays SVN and using
> 64bit with boehm.  I understand boehm is slower, but overall my JIT'ed
> function is many times slower than un-jitted, is this expected
> behavior from boehm?

It seems that there are still issues with the 64-bit JIT -- it could be
something along the line of "the guards are not correctly overwritten",
or likely something more subtle along these lines, causing more and more
assembler to be produced.  We have observed "infinite"-looking memory
usage for long-running programs, too.

Note that in the example you posted, you are making the common mistake of
putting some code (the looping condition) between can_enter_jit and
jit_merge_point.  We should really do something about checking that
people don't do that.  It mostly works, except in some cases where it
doesn't :-(  The issue is more precisely:

    while x < y:
        my_jit_driver.jit_merge_point(...)
        ...loop body...
        my_jit_driver.can_enter_jit(...)

In this case, the "x < y" is evaluated between can_enter_jit and
jit_merge_point, and that's the mistake.  You should rewrite your
examples as:

    while x < y:
        my_jit_driver.can_enter_jit(...)
        my_jit_driver.jit_merge_point(...)
        ...loop body...
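
The recommended layout can be tried out in plain Python with a stub. This is
only a minimal sketch: StubJitDriver, less_than and sum_below are hypothetical
names invented for illustration, and the stub merely logs the hint calls; it
is not RPython's real JitDriver and does no tracing. The choice of 'n' as the
only green variable is likewise an assumption made for the example.

```python
class StubJitDriver(object):
    """Hypothetical stand-in for RPython's JitDriver: it only logs the hints."""

    def __init__(self, greens, reds):
        self.greens = greens
        self.reds = reds
        self.events = []

    def can_enter_jit(self, **live_vars):
        self.events.append('can_enter_jit')

    def jit_merge_point(self, **live_vars):
        self.events.append('jit_merge_point')


def less_than(driver, a, b):
    # Log the loop condition so its position relative to the hints is visible.
    driver.events.append('condition')
    return a < b


def sum_below(driver, n):
    """Sum the integers 0..n-1 with both hints at the top of the loop body."""
    i = 0
    total = 0
    while less_than(driver, i, n):
        # Nothing is evaluated between the two hints in this layout.
        driver.can_enter_jit(n=n, i=i, total=total)
        driver.jit_merge_point(n=n, i=i, total=total)
        total += i
        i += 1
    return total


driver = StubJitDriver(greens=['n'], reds=['i', 'total'])
result = sum_below(driver, 4)
```

In the logged events, every 'can_enter_jit' is directly followed by
'jit_merge_point', and the loop condition never falls between them.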


A bientot,

Armin.

From arigo at tunes.org  Fri Aug 20 11:45:24 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 20 Aug 2010 11:45:24 +0200
Subject: [pypy-dev] JIT'ed function performance degrades
In-Reply-To: <23133.28490.qm@web114006.mail.gq1.yahoo.com>
References: 
	<23133.28490.qm@web114006.mail.gq1.yahoo.com>
Message-ID: <20100820094524.GB16244@code0.codespeak.net>

Hi Hart,

On Thu, Aug 19, 2010 at 11:04:48PM -0700, Hart's Antler wrote:
> I had seen the blog post before by Carl Friedrich Bolz about JIT'ing
> and that he was able to speed things up 22x faster than plain RPython
> translated to C, so that got me curious about the JIT.

You cannot expect any program to get 22x faster with RPython+JIT than it
is with just RPython.  That would be like saying that any C program can
get 22x faster if we apply some special JIT on it.  For a general C
program, such a statement makes no sense -- no JIT can help.

The PyPy JIT can help *only* if the RPython program in question is some
kind of interpreter, with a loose definition of interpreter.  That's why
we can apply the PyPy JIT to the Python interpreter written in RPython;
or to some other examples, like Carl Friedrich's blog post about the
regular expressions "interpreter".


A bientot,

Armin.

From arigo at tunes.org  Fri Aug 20 11:57:21 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 20 Aug 2010 11:57:21 +0200
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
Message-ID: <20100820095721.GC16244@code0.codespeak.net>

Hi Sakesun,

On Thu, Aug 19, 2010 at 11:25:42AM +0700, sakesun roykiatisak wrote:
> >>> f = open('xxx', 'w')
> >>> f.write('stuff')
> >>> del f
> 
> Also, I've tried that with both Jython and IronPython and they all work
> fine.

I guess that you didn't try exactly the same thing.  If I do:

    arigo at tannit ~ $ jython                 
    Jython 2.2.1 on java1.6.0_20
    Type "copyright", "credits" or "license" for more information.
    >>> open('x', 'w').write('hello')
    >>> 

Then "cat x" in another terminal shows an empty file.  The file "x" is
only filled when I exit Jython.  It is exactly the same behavior as I
get on PyPy.  Maybe I missed something, and there is a different way to
do things such that it works on Jython but not on PyPy; if so, can you
describe it more precisely?  Thanks!


A bientot,

Armin.

From donny.viszneki at gmail.com  Fri Aug 20 12:23:26 2010
From: donny.viszneki at gmail.com (Donny Viszneki)
Date: Fri, 20 Aug 2010 06:23:26 -0400
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: <20100820095721.GC16244@code0.codespeak.net>
References: 
	<20100820095721.GC16244@code0.codespeak.net>
Message-ID: 

Armin: Sakesun used "del f" and it appears you did not. In Python
IIRC, an explicit call to del should kick off the finalizer to flush
and close the file!

open('x', 'w').write('hello') alone does not imply the file instance
(return value of open()) has been finalized because the garbage
collector may not have hit it yet.

Jython and IronPython are pretty much guaranteed to behave differently
under a wide variety of circumstances when it comes to the garbage
collector. Do not rely on the garbage collector for program semantics!

Because Sakesun has used "del f" it should be quite a concern that the
file has not been finalized properly!

On Fri, Aug 20, 2010 at 5:57 AM, Armin Rigo  wrote:
> Hi Sakesun,
>
> On Thu, Aug 19, 2010 at 11:25:42AM +0700, sakesun roykiatisak wrote:
>> >>> f = open('xxx', 'w')
>> >>> f.write('stuff')
>> >>> del f
>>
>> Also, I've tried that with both Jython and IronPython and they all work
>> fine.
>
> I guess that you didn't try exactly the same thing.  If I do:
>
>    arigo at tannit ~ $ jython
>    Jython 2.2.1 on java1.6.0_20
>    Type "copyright", "credits" or "license" for more information.
>    >>> open('x', 'w').write('hello')
>    >>>
>
> Then "cat x" in another terminal shows an empty file.  The file "x" is
> only filled when I exit Jython.  It is exactly the same behavior as I
> get on PyPy.  Maybe I missed something, and there is a different way to
> do things such that it works on Jython but not on PyPy; if so, can you
> describe it more precisely?  Thanks!
>
>
> A bientot,
>
> Armin.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>



-- 
http://codebad.com/

From william.leslie.ttg at gmail.com  Fri Aug 20 12:32:34 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Fri, 20 Aug 2010 20:32:34 +1000
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
Message-ID: 

It seems you too have missed the difference between deleting some reference
to the object (as del does) and finalising.

On 20/08/2010 8:23 PM, "Donny Viszneki"  wrote:

Armin: Sakesun used "del f" and it appears you did not. In Python
IIRC, an explicit call to del should kick off the finalizer to flush
and close the file!

open('x', 'w').write('hello') alone does not imply the file instance
(return value of open()) has been finalized because the garbage
collector may not have hit it yet.

Jython and IronPython are pretty much guaranteed to behave differently
under a wide variety of circumstances when it comes to the garbage
collector. Do not rely on the garbage collector for program semantics!

Because Sakesun has used "del f" it should be quite a concern that the
file has not been finalized properly!


On Fri, Aug 20, 2010 at 5:57 AM, Armin Rigo  wrote:
> Hi Sakesun,
>
> On Thu, Aug ...
--
http://codebad.com/

_______________________________________________
pypy-dev at codespeak.net
http://codespeak.net/mailman/...

From arigo at tunes.org  Fri Aug 20 13:06:49 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 20 Aug 2010 13:06:49 +0200
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
Message-ID: <20100820110649.GA23268@code0.codespeak.net>

Hi Donny,

On Fri, Aug 20, 2010 at 06:23:26AM -0400, Donny Viszneki wrote:
> Armin: Sakesun used "del f" and it appears you did not.

As explained earlier this makes no difference.  E.g. in any Python
version, the following code would not call the __del__ method of the
object x either:

>>> x = SomeClassWithADel()
>>> y = x
>>> del x
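
The same point as a small self-contained script. This is only a sketch: the
class name Tracked and its finalized flag are made up for illustration.

```python
class Tracked(object):
    """Hypothetical class that records whether its __del__ has run."""
    finalized = False

    def __del__(self):
        Tracked.finalized = True


x = Tracked()
y = x          # a second reference to the same object
del x          # removes only the name 'x'; the object is still reachable via 'y'

still_alive = not Tracked.finalized
print(still_alive)
```

While 'y' is alive, __del__ cannot have run on any implementation. What
happens after a later "del y" differs: CPython's refcounting finalizes the
object at once, whereas a tracing GC (PyPy, Jython, IronPython) may do so
much later, or only at exit.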


A bientot,

Armin.

From p.giarrusso at gmail.com  Fri Aug 20 15:39:22 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 20 Aug 2010 15:39:22 +0200
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
Message-ID: 

On Fri, Aug 20, 2010 at 12:23, Donny Viszneki  wrote:
> Armin: Sakesun used "del f" and it appears you did not.
Actually, he didn't either. He said "I think that open('xxx',
'w').write('stuff')" is equivalent to using del (which he thought
would work), and the equivalence was correct.

Anyway, in the _first reply_ message, he realized that using:

ipy -c "open('xxx', 'w').write('stuff')"
jython -c "open('xxx', 'w').write('stuff')"

made a difference (because the interpreter exited), so that problem
was solved. His mail implies that on PyPy he typed the code at the
prompt, rather than passing it with -c.

> In Python
> IIRC, an explicit call to del should kick off the finalizer to flush
> and close the file!

No, as shown by Armin. del will just clear the reference, which on
CPython means decreasing the refcount. With refcounting, the object is
finalized immediately once the count drops to zero; with a tracing GC it
is finalized at some later point, if the GC runs at all - there's no such
guarantee on Java and .NET. For Java, that's unless you do special unsafe
setup (System.runFinalizersOnExit(); it's discouraged for a number of
reasons, see the docs). On .NET, I expect such a method to exist too,
since the designers were so unaware of the problems with finalizers in
.NET 1.0 that they gave them the syntax of destructors.
But .NET 2.0 has SafeHandles, which guarantee release of critical
resources if the "finalization" code follows some restrictions, using
_reference counting_:

http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.safehandle.aspx
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.safehandle.dangerousaddref.aspx

> open('x', 'w').write('hello') alone does not imply the file instance
> (return value of open()) has been finalized because the garbage
> collector may not have hit it yet.

On CPython, you have such an implication, because of refcounting semantics.

> On Fri, Aug 20, 2010 at 5:57 AM, Armin Rigo  wrote:
>> Hi Sakesun,
>>
>> On Thu, Aug 19, 2010 at 11:25:42AM +0700, sakesun roykiatisak wrote:
>>> >>> f = open('xxx', 'w')
>>> >>> f.write('stuff')
>>> >>> del f
>>>
>>> Also, I've tried that with both Jython and IronPython and they all work
>>> fine.
>>
>> I guess that you didn't try exactly the same thing.  If I do:
>>
>>    arigo at tannit ~ $ jython
>>    Jython 2.2.1 on java1.6.0_20
>>    Type "copyright", "credits" or "license" for more information.
>>    >>> open('x', 'w').write('hello')
>>    >>>
>>
>> Then "cat x" in another terminal shows an empty file.  The file "x" is
>> only filled when I exit Jython.  It is exactly the same behavior as I
>> get on PyPy.  Maybe I missed something, and there is a different way to
>> do things such that it works on Jython but not on PyPy; if so, can you
>> describe it more precisely?  Thanks!

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From arigo at tunes.org  Fri Aug 20 16:06:31 2010
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 20 Aug 2010 16:06:31 +0200
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
Message-ID: <20100820140631.GA3513@code0.codespeak.net>

Hi Donny,

On Fri, Aug 20, 2010 at 06:23:26AM -0400, Donny Viszneki wrote:
> Armin: Sakesun used "del f" and it appears you did not. In Python
> IIRC, an explicit call to del should kick off the finalizer to flush
> and close the file!

No, you are wrong.  Try for example:

    >>> f = open('xxx')
    >>> g = f
    >>> del f

After this, 'g' still refers to the file, and it is still open.

If you want the file to be flushed and closed, then call 'f.close()' :-)


A bientot,

Armin.

From p.giarrusso at gmail.com  Fri Aug 20 19:01:07 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 20 Aug 2010 19:01:07 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
Message-ID: 

2010/8/20 Jorge Timón :
> Hi, I'm just curious about the feasibility of running python code in a gpu
> by extending pypy.
Disclaimer: I am not a PyPy developer, even if I've been following the
project with interest. Nor am I an expert on GPUs - I provide links to
the literature I've read.
Yet, I believe that such an attempt is unlikely to be interesting.
Quoting Wikipedia's synthesis:
"Unlike CPUs however, GPUs have a parallel throughput architecture
that emphasizes executing many concurrent threads slowly, rather than
executing a single thread very fast."
And significant optimizations are needed anyway to get performance for
GPU code (and if you don't need the last bit of performance, why
bother with a GPU?), so I think that the need to use a C-like language
is the smallest problem.

> I don't have the time (and probably the knowledge neither) to develop that
> pypy extension, but I just want to know if it's possible.
> I'm interested in languages like openCL and nvidia's CUDA because I think
> the future of supercomputing is going to be GPGPU.

I would like to point out that while for some cases it might be right,
the importance of GPGPU is probably often exaggerated:

http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#

Researchers in the field are mostly aware of the fact that GPGPU is
the way to go only for a very restricted category of code. For that
code, fine.
Thus, instead of running Python code on a GPU, designing from scratch
an easy way to program a GPU efficiently, for those tasks, is better,
and projects for that already exist (i.e. what you cite).

Additionally, it would probably take a different kind of JIT to
exploit GPUs. No branch prediction, very small non-coherent caches, no
efficient synchronization primitives, as I read in this paper... I'm
no expert, but I guess you'd need to re-architect the needed
optimizations from scratch.
And it took 20-30 years to get from the first, slow Lisp (1958) to,
say, Self (1991), a landmark in performant high-level languages,
derived from Smalltalk. Most of that would have to be redone.

So, I guess that the effort to compile Python code for a GPU is not
worth it. There might be further reasons due to the kind of code a JIT
generates, since a GPU has no branch predictor, no caches, and so on,
but I'm no GPU expert and I would have to check again.

Finally, for general purpose code, exploiting the big expected number
of CPUs on our desktop systems is already a challenge.

> There's people working in
> bringing GPGPU to python:
>
> http://mathema.tician.de/software/pyopencl
> http://mathema.tician.de/software/pycuda
>
> Would it be possible to run python code in parallel without the need (for
> the developer) of actively parallelizing the code?

I would say that Python is not yet the language to use to write
efficient parallel code, because of the Global Interpreter Lock
(Google for "Python GIL"). The two implementations having no GIL are
IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
and the current focus is not on removing it.
Scientific computing uses external libraries (like NumPy) - for the
supported algorithms, one could introduce parallelism at that level.
If that's enough for your application, good.
If you want to write a parallel algorithm in Python, we're not there yet.

> I'm not talking about code of hard concurrency, but of code with intrinsic
> parallelism (let's say matrix multiplication).

Automatic parallelization is hard, see:
http://en.wikipedia.org/wiki/Automatic_parallelization

Lots of scientists have tried, lots of money has been invested, but
it's still hard.
The only practical approaches still require the programmer to
introduce parallelism, but in ways much simpler than using
multithreading directly. Google OpenMP and Cilk.

> Would a JIT compilation be capable of detecting parallelism?
Summing up what is above, probably not.

Moreover, matrix multiplication may not be as easy as one might think.
I do not know how to write it for a GPU, but at the end I reference
some suggestions from that paper (where it is one of the benchmarks).
But here, I explain why writing it for a CPU is complicated. You can
multiply two matrices with a triply nested for loop, but such an algorithm
has poor performance for big matrices because of bad cache locality.
GPUs, according to the above-mentioned paper, provide no caches and
hide latency in other ways.

See here for the two main alternative ideas which allow solving this
problem of writing an efficient matrix multiplication algorithm:
http://en.wikipedia.org/wiki/Cache_blocking
http://en.wikipedia.org/wiki/Cache-oblivious_algorithm

Then, you need to parallelize the resulting code yourself, which might
or might not be easy (depending on the interactions between the
parallel blocks that are found there).
In that paper, where matrix multiplication is called SGEMM (the
BLAS routine implementing it), they suggest using a cache-blocked
version of matrix multiplication for both CPUs and GPUs, and argue
that parallelization is then easy.
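
To make the difference concrete, here is a minimal pure-Python sketch of the
plain triply nested loop next to a cache-blocked variant. The function names
and the default block size of 2 are arbitrary choices for illustration, not
taken from the paper; in pure Python the blocking will not itself bring a
speedup, the sketch only shows the loop structure.

```python
def matmul_naive(a, b):
    """Triply nested loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same result, computed tile by tile so each tile is reused while hot."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The three outer loops walk over tiles; the three inner loops stay
    # inside one tile of a, b and c at a time.
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
naive = matmul_naive(a, b)
blocked = matmul_blocked(a, b, block=2)
```

Each (i0, j0) tile of the result depends only on its own tiles of the inputs,
which is what makes the blocked form a natural unit for handing out to
parallel workers.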

Cheers,
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From jbaker at zyasoft.com  Fri Aug 20 20:20:17 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Fri, 20 Aug 2010 12:20:17 -0600
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: <20100820140631.GA3513@code0.codespeak.net>
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
	<20100820140631.GA3513@code0.codespeak.net>
Message-ID: 

Obviously please close the file, ideally using something like the
with-statement or at least finally. But for perhaps the convenience of
scripters, and the sorrow of everyone else ;), Jython will close the file
upon clean termination of the JVM via registering a closer of such files
with Runtime#addShutdownHook
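
For reference, a minimal sketch of the with-statement form. The temporary
directory is only there to keep the example self-contained; the file name and
contents are the ones from the thread.

```python
import os
import tempfile

# Hypothetical scratch location so the example does not touch the cwd.
path = os.path.join(tempfile.mkdtemp(), 'xxx')

# The file is flushed and closed when the block exits, on any Python
# implementation, without relying on refcounting or a finalizer.
with open(path, 'w') as f:
    f.write('stuff')

closed_after_block = f.closed

with open(path) as g:
    content = g.read()
```

The data is on disk as soon as the first block ends, which is what the
original open('xxx', 'w').write('stuff') only achieves on CPython.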

This is currently part of the most important outstanding bug in
Jython 2.5.2, and something that has to be resolved for 2.5.2 beta 2,
because of how it interacts with classloaders and prevents their class GC
upon reload (thus potentially exhausting permgen).

On Fri, Aug 20, 2010 at 8:06 AM, Armin Rigo  wrote:

> Hi Donny,
>
> On Fri, Aug 20, 2010 at 06:23:26AM -0400, Donny Viszneki wrote:
> > Armin: Sakesun used "del f" and it appears you did not. In Python
> > IIRC, an explicit call to del should kick off the finalizer to flush
> > and close the file!
>
> No, you are wrong.  Try for example:
>
>    >>> f = open('xxx')
>    >>> g = f
>    >>> del f
>
> After this, 'g' still refers to the file, and it is still open.
>
> If you want the file to be flushed and closed, then call 'f.close()' :-)
>
>
> A bientot,
>
> Armin.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From jbaker at zyasoft.com  Fri Aug 20 20:25:11 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Fri, 20 Aug 2010 12:25:11 -0600
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
Message-ID: 

Jython single-threaded performance has little to do with a lack of the GIL.
Probably the only direct manifestation is seen in the overhead of allocating
__dict__ (or dict) objects because Python attributes have volatile memory
semantics, which is ensured by the backing of a ConcurrentHashMap, which can
be expensive to allocate. There are workarounds.

2010/8/20 Paolo Giarrusso 

> 2010/8/20 Jorge Timón :
> > Hi, I'm just curious about the feasibility of running python code in a
> gpu
> > by extending pypy.
> Disclaimer: I am not a PyPy developer, even if I've been following the
> project with interest. Nor am I an expert of GPU - I provide links to
> the literature I've read.
> Yet, I believe that such an attempt is unlikely to be interesting.
> Quoting Wikipedia's synthesis:
> "Unlike CPUs however, GPUs have a parallel throughput architecture
> that emphasizes executing many concurrent threads slowly, rather than
> executing a single thread very fast."
> And significant optimizations are needed anyway to get performance for
> GPU code (and if you don't need the last bit of performance, why
> bother with a GPU?), so I think that the need to use a C-like language
> is the smallest problem.
>
> > I don't have the time (and probably the knowledge neither) to develop
> that
> > pypy extension, but I just want to know if it's possible.
> > I'm interested in languages like openCL and nvidia's CUDA because I think
> > the future of supercomputing is going to be GPGPU.
>
> I would like to point out that while for some cases it might be right,
> the importance of GPGPU is probably often exaggerated:
>
>
> http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#
>
> Researchers in the field are mostly aware of the fact that GPGPU is
> the way to go only for a very restricted category of code. For that
> code, fine.
> Thus, instead of running Python code in a GPU, designing from scratch
> an easy way to program a GPU efficiently, for those task, is better,
> and projects for that already exist (i.e. what you cite).
>
> Additionally, it would take probably a different kind of JIT to
> exploit GPUs. No branch prediction, very small non-coherent caches, no
> efficient synchronization primitives, as I read from this paper... I'm
> no expert, but I guess you'd need to rearchitecture from scratch the
> needed optimizations.
> And it took 20-30 years to get from the first, slow Lisp (1958) to,
> say, Self (1991), a landmark in performant high-level languages,
> derived from SmallTalk. Most of that would have to be redone.
>
> So, I guess that the effort to compile Python code for a GPU is not
> worth it. There might be further reasons due to the kind of code a JIT
> generates, since a GPU has no branch predictor, no caches, and so on,
> but I'm no GPU expert and I would have to check again.
>
> Finally, for general purpose code, exploiting the big expected number
> of CPUs on our desktop systems is already a challenge.
>
> > There's people working in
> > bringing GPGPU to python:
> >
> > http://mathema.tician.de/software/pyopencl
> > http://mathema.tician.de/software/pycuda
> >
> > Would it be possible to run python code in parallel without the need (for
> > the developer) of actively parallelizing the code?
>
> I would say that Python is not yet the language to use to write
> efficient parallel code, because of the Global Interpreter Lock
> (Google for "Python GIL"). The two implementations having no GIL are
> IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
> and the current focus is not on removing it.
> Scientific computing uses external libraries (like NumPy) - for the
> supported algorithms, one could introduce parallelism at that level.
> If that's enough for your application, good.
> If you want to write a parallel algorithm in Python, we're not there yet.
>
> > I'm not talking about code of hard concurrency, but of code with intrinsic
> > parallelism (let's say matrix multiplication).
>
> Automatic parallelization is hard, see:
> http://en.wikipedia.org/wiki/Automatic_parallelization
>
> Lots of scientists have tried, lots of money has been invested, but
> it's still hard.
> The only practical approaches still require the programmer to
> introduce parallelism, but in ways much simpler than using
> multithreading directly. Google OpenMP and Cilk.
>
> > Would a JIT compilation be capable of detecting parallelism?
> Summing up what is above, probably not.
>
> Moreover, matrix multiplication may not be so easy as one might think.
> I do not know how to write it for a GPU, but in the end I reference
> some suggestions from that paper (where it is one of the benchmarks).
> But here, I explain why writing it for a CPU is complicated. You can
> multiply two matrixes with a triply nested for, but such an algorithm
> has poor performance for big matrixes because of bad cache locality.
> GPUs, according to the above-mentioned paper, provide no caches and
> hide latency in other ways.
>
> See here for the two main alternative ideas which allow solving this
> problem of writing an efficient matrix multiplication algorithm:
> http://en.wikipedia.org/wiki/Cache_blocking
> http://en.wikipedia.org/wiki/Cache-oblivious_algorithm
>
> Then, you need to parallelize the resulting code yourself, which might
> or might not be easy (depending on the interactions between the
> parallel blocks that are found there).
> In that paper, where matrix multiplication is called SGEMM (the
> BLAS routine implementing it), they suggest using a cache-blocked
> version of matrix multiplication for both CPUs and GPUs, and argue
> that parallelization is then easy.
>
> Cheers,
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev

From p.giarrusso at gmail.com  Fri Aug 20 22:45:21 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 20 Aug 2010 22:45:21 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
Message-ID: 

2010/8/20 Jim Baker :
> Jython single-threaded performance has little to do with a lack of the GIL.

Never implied that - I do believe that a GIL-less fast Python is
possible. I just meant we don't have one yet.

> Probably the only direct manifestation is seen in the overhead of allocating
> __dict__ (or dict) objects because Python attributes have volatile memory
> semantics
Uh? Searching for "Jython memory model" doesn't seem to find anything. Are
there any docs on this, with the rationale for the choice you describe?

I've only found the Unladen Swallow proposals for a memory model:
http://code.google.com/p/unladen-swallow/wiki/MemoryModel (and
python-safethread, which I don't like).

As a Java programmer using Jython, I wouldn't expect any field to be
volatile, but I would expect to be able to act on different
fields independently - the race conditions we have to protect against are
the ones on structural modification (unless the table uses open
addressing).
_This_ can be implemented through ConcurrentHashMap (which also makes
all fields volatile), but an implementation not guaranteeing volatile
semantics (if possible) would have been equally valid.
I am interested because I want to experiment with alternatives.

Of course, you can offer stronger semantics, but then you should also
advertise that fields are volatile, thus I don't need a lock to pass a
reference.
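
The pattern in question can be sketched in Python as follows. This is a
hypothetical illustration only, not Jython code: the Box class and both
function names are made up. One thread stores a reference in a plain
attribute and another reads it, with no lock around the field. On CPython
this works because of the GIL; on a GIL-less implementation it would only be
safe if attribute stores had volatile-like visibility, which is the guarantee
that would then need to be advertised.

```python
import threading
import time


class Box(object):
    """Hypothetical holder; `value` is an ordinary instance attribute."""

    def __init__(self):
        self.value = None


def publish(box, payload):
    # The payload is fully built by the caller before the reference
    # is stored. No lock is taken around the store.
    box.value = payload


def wait_for_value(box, timeout=5.0):
    # Poll the attribute without a lock until a reference shows up
    # or the deadline passes.
    deadline = time.time() + timeout
    while box.value is None and time.time() < deadline:
        time.sleep(0.001)
    return box.value


box = Box()
producer = threading.Thread(target=publish, args=(box, {"answer": 42}))
producer.start()
received = wait_for_value(box)
producer.join()
```

Whether the reader is guaranteed to see a fully constructed payload is exactly
what a memory model has to specify; the sketch only shows the shape of the
code that relies on it.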

> , which is ensured by the backing of a ConcurrentHashMap, which can
> be expensive to allocate. There are workarounds.

I'm also curious about such workarounds - are they currently
implemented, or just speculation?




-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Fri Aug 20 22:51:42 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 20 Aug 2010 22:51:42 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
Message-ID: 

2010/8/20 Paolo Giarrusso :
> 2010/8/20 Jorge Timón:
>> Hi, I'm just curious about the feasibility of running python code in a gpu
>> by extending pypy.
> Disclaimer: I am not a PyPy developer, even if I've been following the
> project with interest. Nor am I an expert of GPU - I provide links to
> the literature I've read.
> Yet, I believe that such an attempt is unlikely to be interesting.
> Quoting Wikipedia's synthesis:
> "Unlike CPUs however, GPUs have a parallel throughput architecture
> that emphasizes executing many concurrent threads slowly, rather than
> executing a single thread very fast."
> And significant optimizations are needed anyway to get performance for
> GPU code (and if you don't need the last bit of performance, why
> bother with a GPU?), so I think that the need to use a C-like language
> is the smallest problem.
>
>> I don't have the time (and probably not the knowledge either) to develop that
>> pypy extension, but I just want to know if it's possible.
>> I'm interested in languages like openCL and nvidia's CUDA because I think
>> the future of supercomputing is going to be GPGPU.

Python is a very different language from CUDA or OpenCL, hence it's
not completely trivial to map Python's semantics onto something that will
make sense for a GPU.

>
> I would like to point out that while for some cases it might be right,
> the importance of GPGPU is probably often exaggerated:
>
> http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#
>
> Researchers in the field are mostly aware of the fact that GPGPU is
> the way to go only for a very restricted category of code. For that
> code, fine.
> Thus, instead of running Python code on a GPU, it is better to design from
> scratch an easy way to program a GPU efficiently for those tasks,
> and projects for that already exist (i.e. the ones you cite).
>
> Additionally, it would probably take a different kind of JIT to
> exploit GPUs. No branch prediction, very small non-coherent caches, no
> efficient synchronization primitives, as I read from this paper... I'm
> no expert, but I guess you'd need to re-architect the needed
> optimizations from scratch.
> And it took 20-30 years to get from the first, slow Lisp (1958) to,
> say, Self (1991), a landmark in performant high-level languages,
> derived from SmallTalk. Most of that would have to be redone.
>
> So, I guess that the effort to compile Python code for a GPU is not
> worth it. There might be further reasons due to the kind of code a JIT
> generates, since a GPU has no branch predictor, no caches, and so on,
> but I'm no GPU expert and I would have to check again.
>
> Finally, for general purpose code, exploiting the big expected number
> of CPUs on our desktop systems is already a challenge.
>
>> There's people working in
>> bringing GPGPU to python:
>>
>> http://mathema.tician.de/software/pyopencl
>> http://mathema.tician.de/software/pycuda
>>
>> Would it be possible to run python code in parallel without the need (for
>> the developer) of actively parallelizing the code?
>
> I would say that Python is not yet the language to use to write
> efficient parallel code, because of the Global Interpreter Lock
> (Google for "Python GIL"). The two implementations having no GIL are
> IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
> and the current focus is not on removing it.
> Scientific computing uses external libraries (like NumPy) - for the
> supported algorithms, one could introduce parallelism at that level.
> If that's enough for your application, good.
> If you want to write a parallel algorithm in Python, we're not there yet.
>
>> I'm not talking about code of hard concurrency, but of code with intrinsic
>> parallelism (let's say matrix multiplication).
>
> Automatic parallelization is hard, see:
> http://en.wikipedia.org/wiki/Automatic_parallelization
>
> Lots of scientists have tried, lots of money has been invested, but
> it's still hard.
> The only practical approaches still require the programmer to
> introduce parallelism, but in ways much simpler than using
> multithreading directly. Google OpenMP and Cilk.
>
>> Would a JIT compilation be capable of detecting parallelism?
> Summing up what is above, probably not.
>
> Moreover, matrix multiplication may not be so easy as one might think.
> I do not know how to write it for a GPU, but in the end I reference
> some suggestions from that paper (where it is one of the benchmarks).
> But here, I explain why writing it for a CPU is complicated. You can
> multiply two matrixes with a triply nested for, but such an algorithm
> has poor performance for big matrixes because of bad cache locality.
> GPUs, according to the above-mentioned paper, provide no caches and
> hide latency in other ways.
>
> See here for the two main alternative ideas which allow solving this
> problem of writing an efficient matrix multiplication algorithm:
> http://en.wikipedia.org/wiki/Cache_blocking
> http://en.wikipedia.org/wiki/Cache-oblivious_algorithm
>
> Then, you need to parallelize the resulting code yourself, which might
> or might not be easy (depending on the interactions between the
> parallel blocks that are found there).
> In that paper, where matrix multiplication is called SGEMM (the
> BLAS routine implementing it), they suggest using a cache-blocked
> version of matrix multiplication for both CPUs and GPUs, and argue
> that parallelization is then easy.

What would be interesting in combining a GPU and a JIT is optimizing numpy
vectorized operations, to speed up things like big_array_a +
big_array_b using SSE and the GPU. However, I don't think anyone plans to
work on it in the near future, and if you don't have time this stays a
topic of interest only :)
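
For concreteness, here is a sketch of what such a vectorized addition
computes, written as a plain Python loop over array.array buffers. The
function name add_arrays is made up for illustration and numpy itself is not
used. Each output element depends only on the inputs at the same index, which
is the property that would in principle let a JIT map the loop onto SSE lanes
or GPU threads.

```python
from array import array


def add_arrays(a, b):
    """Elementwise sum of two equal-length arrays of doubles."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    out = array('d', [0.0]) * len(a)
    # Each iteration is independent of all the others.
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


big_array_a = array('d', [1.0, 2.0, 3.0, 4.0])
big_array_b = array('d', [10.0, 20.0, 30.0, 40.0])
result = add_arrays(big_array_a, big_array_b)
```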

>
> Cheers,
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev

From jonah at eecs.berkeley.edu  Fri Aug 20 23:05:15 2010
From: jonah at eecs.berkeley.edu (Jeff Anderson-Lee)
Date: Fri, 20 Aug 2010 14:05:15 -0700
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
Message-ID: <4C6EEE0B.7060500@eecs.berkeley.edu>

  On 8/20/2010 1:51 PM, Maciej Fijalkowski wrote:
> 2010/8/20 Paolo Giarrusso:
>> 2010/8/20 Jorge Timón:
>>> Hi, I'm just curious about the feasibility of running python code in a gpu
>>> by extending pypy.
>> Disclaimer: I am not a PyPy developer, even if I've been following the
>> project with interest. Nor am I an expert of GPU - I provide links to
>> the literature I've read.
>> Yet, I believe that such an attempt is unlikely to be interesting.
>> Quoting Wikipedia's synthesis:
>> "Unlike CPUs however, GPUs have a parallel throughput architecture
>> that emphasizes executing many concurrent threads slowly, rather than
>> executing a single thread very fast."
>> And significant optimizations are needed anyway to get performance for
>> GPU code (and if you don't need the last bit of performance, why
>> bother with a GPU?), so I think that the need to use a C-like language
>> is the smallest problem.
>>
>>> I don't have the time (and probably not the knowledge either) to develop that
>>> pypy extension, but I just want to know if it's possible.
>>> I'm interested in languages like openCL and nvidia's CUDA because I think
>>> the future of supercomputing is going to be GPGPU.
> Python is a very different language from CUDA or OpenCL, hence it's
> not completely trivial to map Python's semantics onto something that will
> make sense for a GPU.
Try googling: copperhead cuda
Also look at:

http://code.google.com/p/copperhead/wiki/Installing


From fijall at gmail.com  Fri Aug 20 23:18:12 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 20 Aug 2010 23:18:12 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: <4C6EEE0B.7060500@eecs.berkeley.edu>
References: 
	
	
	<4C6EEE0B.7060500@eecs.berkeley.edu>
Message-ID: 

On Fri, Aug 20, 2010 at 11:05 PM, Jeff Anderson-Lee
 wrote:
> On 8/20/2010 1:51 PM, Maciej Fijalkowski wrote:
>> 2010/8/20 Paolo Giarrusso:
>>> 2010/8/20 Jorge Timón:
>>>> Hi, I'm just curious about the feasibility of running python code in a gpu
>>>> by extending pypy.
>>> Disclaimer: I am not a PyPy developer, even if I've been following the
>>> project with interest. Nor am I an expert of GPU - I provide links to
>>> the literature I've read.
>>> Yet, I believe that such an attempt is unlikely to be interesting.
>>> Quoting Wikipedia's synthesis:
>>> "Unlike CPUs however, GPUs have a parallel throughput architecture
>>> that emphasizes executing many concurrent threads slowly, rather than
>>> executing a single thread very fast."
>>> And significant optimizations are needed anyway to get performance for
>>> GPU code (and if you don't need the last bit of performance, why
>>> bother with a GPU?), so I think that the need to use a C-like language
>>> is the smallest problem.
>>>
>>>> I don't have the time (and probably not the knowledge either) to develop that
>>>> pypy extension, but I just want to know if it's possible.
>>>> I'm interested in languages like openCL and nvidia's CUDA because I think
>>>> the future of supercomputing is going to be GPGPU.
>> Python is a very different language from CUDA or OpenCL, hence it's
>> not completely trivial to map Python's semantics onto something that will
>> make sense for a GPU.
> Try googling: copperhead cuda
> Also look at:
>
> http://code.google.com/p/copperhead/wiki/Installing
>

What's the point of posting a project here that has not released any code?

From jbaker at zyasoft.com  Fri Aug 20 23:27:20 2010
From: jbaker at zyasoft.com (Jim Baker)
Date: Fri, 20 Aug 2010 15:27:20 -0600
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
	
Message-ID: 

The Unladen Swallow doc, which was derived from a PEP that Jeff proposed,
seems to be a fair descriptive outline of Python memory models in general,
and Jython's in particular.

Obviously the underlying implementation in the JVM provides happens-before
consistency; everything else derives from there. The ConcurrentHashMap (CHM)
provides additional consistency constraints that should imply sequential
consistency for a (vast) subset of Python programs. However, I can readily
construct a program that violates sequential consistency: maybe it uses slots
(stored in a Java array), or the array module (which also just wraps Java
arrays), or it accesses local variables in a frame from another thread (same
storage, same problem). Likewise I can also create Python programs that access
Java classes (since this is Jython!), and they too will only see
happens-before consistency.

Naturally, the workarounds I mentioned for improving performance in object
allocation all rely on not using CHM and its (modestly) expensive semantics.
So this would mean using a Java class in some way, possibly a HashMap
(especially one that's been exposed through our type expose mechanism to
avoid reflection overhead), or directly using a Java class of some kind
(again exposing is best, much like our builtin types such as PyInteger),
possibly with all fields marked as volatile.
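
As a small Python-level illustration of the slots case mentioned above (a
sketch with made-up class names, showing standard Python behaviour rather
than anything Jython-specific): a class that declares __slots__ gets a fixed
attribute layout and no per-instance __dict__, so there is no dictionary to
allocate per object. That such slots end up in a Java array on Jython is
taken from the description above, not shown by this code.

```python
class DictPoint(object):
    """Ordinary class: attributes live in a per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint(object):
    """Fixed attribute layout: no per-instance __dict__ is created."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


d = DictPoint(1, 2)
s = SlotPoint(1, 2)

has_dict = hasattr(d, "__dict__")
slot_has_dict = hasattr(s, "__dict__")

try:
    s.z = 3  # not declared in __slots__, so the store is rejected
    rejected = False
except AttributeError:
    rejected = True
```

The trade-off is the usual one: the fixed layout avoids the dictionary, but
attributes can no longer be added to instances at run time.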

Hope this helps! If you are interested in studying this problem in more
depth for Jython, or other implementations, and the implications of our
hybrid model, it would certainly be most welcome. Unfortunately, it's not
something that Jython development itself will be working on (standard time
constraints apply here).

- Jim


From jonah at eecs.berkeley.edu  Fri Aug 20 23:28:14 2010
From: jonah at eecs.berkeley.edu (Jeff Anderson-Lee)
Date: Fri, 20 Aug 2010 14:28:14 -0700
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
	<4C6EEE0B.7060500@eecs.berkeley.edu>
	
Message-ID: <4C6EF36E.5000507@eecs.berkeley.edu>

  On 8/20/2010 2:18 PM, Maciej Fijalkowski wrote:
> On Fri, Aug 20, 2010 at 11:05 PM, Jeff Anderson-Lee
>   wrote:
>>   On 8/20/2010 1:51 PM, Maciej Fijalkowski wrote:
>>> 2010/8/20 Paolo Giarrusso:
>>>> 2010/8/20 Jorge Timón:
>>>>> Hi, I'm just curious about the feasibility of running python code in a gpu
>>>>> by extending pypy.
>>>> Disclaimer: I am not a PyPy developer, even if I've been following the
>>>> project with interest. Nor am I an expert of GPU - I provide links to
>>>> the literature I've read.
>>>> Yet, I believe that such an attempt is unlikely to be interesting.
>>>> Quoting Wikipedia's synthesis:
>>>> "Unlike CPUs however, GPUs have a parallel throughput architecture
>>>> that emphasizes executing many concurrent threads slowly, rather than
>>>> executing a single thread very fast."
>>>> And significant optimizations are needed anyway to get performance for
>>>> GPU code (and if you don't need the last bit of performance, why
>>>> bother with a GPU?), so I think that the need to use a C-like language
>>>> is the smallest problem.
>>>>
>>>>> I don't have the time (and probably not the knowledge either) to develop that
>>>>> pypy extension, but I just want to know if it's possible.
>>>>> I'm interested in languages like OpenCL and NVIDIA's CUDA because I think
>>>>> the future of supercomputing is going to be GPGPU.
>>> Python is a very different language than CUDA or openCL, hence it's
>>> not completely to map python's semantics to something that will make
>>> sense for GPU.
>> Try googling: copperhead cuda
>> Also look at:
>>
>> http://code.google.com/p/copperhead/wiki/Installing
>>
> What's the point of posting a project here that has not released any code?
1) He is packaging it up for release this month:
> Comment by bryan.catanzaro, Aug 05, 2010
>
> Before the end of August. I'm working on packaging it up right now. =)
>
2) Bryan's got a good head on his shoulders and has been working on this
problem for some time. Rather than (or at least before) starting off in a
completely new direction, it's worth looking at something that has been
in the works for a while now and is attaining some maturity.
3) You are welcome to ignore it, but some folks might be interested, and
at least they now know it is there and where to look for more
information and forthcoming code.

From ncbray at gmail.com  Sat Aug 21 00:46:53 2010
From: ncbray at gmail.com (Nick Bray)
Date: Fri, 20 Aug 2010 17:46:53 -0500
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
Message-ID: 

I can't speak for GPGPU, but I have compiled a subset of Python onto
the GPU for real-time rendering.  The subset is a little broader than
RPython in some ways (for example, attributes are semantically
identical to Python) and a little narrower in others (many forms of
recursion are disallowed).  The big idea is that it allows you to
create a real-time rendering system with a single code base, and
transparently share functions and data structures between the CPU and
GPU.

http://www.ncbray.com/pystream.html
http://www.ncbray.com/ncbray-dissertation.pdf

It's at least ~100,000x faster than interpreting Python on the CPU.
"At least" because the measurements neglect doing things on the CPU
like texture sampling.  This speedup is pretty obscene, but if you
break it down it isn't too unbelievable: 100x for interpreted ->
compiled, 10x for abstraction overhead of using floats instead of
doubles, and 100x for using the GPU and using it for a task it was built
for (100 x 10 x 100 = 100,000).

Parallelism issues are sidestepped by explicitly identifying the
parallel sections (one function processes every vertex, one function
processes every fragment), and by requiring that the parallel sections
have no global side effects and that certain I/O conventions are followed.
Sorry, no big answers here - it's essentially Pythonic stream
programming.
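
To make "Pythonic stream programming" concrete, here is a minimal CPU-only
sketch. All names in it (shade_vertex, shade_fragment, run_stream) are
hypothetical and chosen for illustration; they are not PyStream's API, and
the real system compiles such functions to GLSL rather than looping in
Python.

```python
# Hypothetical sketch: the names below are illustrative, not PyStream's API.

def shade_vertex(position, offset):
    """Runs once per vertex; reads only its arguments, writes nothing global."""
    x, y, z = position
    dx, dy, dz = offset
    return (x + dx, y + dy, z + dz)


def shade_fragment(depth):
    """Runs once per fragment; maps a depth in [0, 1] to a grey level."""
    level = int(255 * (1.0 - depth))
    return (level, level, level)


def run_stream(kernel, stream, *uniforms):
    """Apply a side-effect-free kernel to every element of a stream.

    No call depends on any other call, so a GPU would be free to run them
    all at once; this CPU stand-in simply runs them one after another.
    """
    return [kernel(element, *uniforms) for element in stream]


vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = run_stream(shade_vertex, vertices, (0.5, 0.5, 0.0))
colors = run_stream(shade_fragment, [0.0, 0.5, 1.0])
```

The point of the restriction is visible in run_stream: because the kernels
touch no shared state, the order of the calls cannot matter.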

The biggest issue with getting Python onto the GPU is memory.  I was
actually targeting GLSL, not CUDA (which can't access the full rendering
pipeline), so pointers were not available.  To work around this, the
code is optimized to an extreme degree to remove as many memory
operations as possible.  The remaining memory operations are emulated
by splitting the heap into regions, indirecting through arrays, and
copying constant data wherever possible.  From what I've seen this is
where PyPy would have the most trouble: its analysis algorithms are
good enough for inferring types and allowing compilation /
translation, but they aren't designed to enable aggressive optimization
of memory operations (there's not a huge reason to do this if you're
translating RPython into C; the C compiler will do it for you).  In
general, GPU programming doesn't work well with memory access (too
many functional units, too little bandwidth).  Most of the "C-like"
GPU languages are designed so they can easily boil down into code
operating out of registers.  Python, on the other hand, is addicted to
heap memory.  Even if you target CUDA, eliminating memory operations
will be a huge win.
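
As a rough illustration of "indirecting through arrays" when pointers are
unavailable, the sketch below flattens objects of one type into one array
per field, so that an object reference becomes a plain integer index. The
Particle class and the flatten/step helpers are assumptions made up for
this example; the actual region-splitting scheme is more involved.

```python
# Hypothetical sketch of pointer-free object access; not PyStream's actual scheme.

class Particle(object):
    def __init__(self, x, velocity):
        self.x = x
        self.velocity = velocity


def flatten(particles):
    """Split a list of objects into one array per field.

    The arrays together act as the heap region for Particle objects, and an
    object reference becomes an integer index into them.
    """
    xs = [p.x for p in particles]
    velocities = [p.velocity for p in particles]
    return xs, velocities


def step(index, xs, velocities, dt):
    """Kernel body for 'particle.x + particle.velocity * dt', with each
    attribute load replaced by an array lookup."""
    return xs[index] + velocities[index] * dt


particles = [Particle(0.0, 1.0), Particle(10.0, -2.0)]
xs, velocities = flatten(particles)
new_xs = [step(i, xs, velocities, 0.5) for i in range(len(xs))]
```

This only works when the number of objects can be bounded ahead of time,
since the arrays have to be sized before the kernel runs.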

I'll freely admit there are some ugly things going on, such as the lack
of recursion, reliance on exhaustive inlining, requiring that GPU code
follow a specific form, and not working well with container objects in
certain situations (it needs to bound the size of the heap).  In the
end, however, it's a talking dog... the grammar may not be perfect,
but the dog talks!  If anyone has questions, either privately or on the
list, I'd be happy to answer them.  I have not done enough to
advertise my project, and this seems like a good place to start.

- Nick Bray

2010/8/20 Paolo Giarrusso :
> 2010/8/20 Jorge Timón :
>> Hi, I'm just curious about the feasibility of running python code in a gpu
>> by extending pypy.
> Disclaimer: I am not a PyPy developer, even if I've been following the
> project with interest. Nor am I an expert of GPU - I provide links to
> the literature I've read.
> Yet, I believe that such an attempt is unlikely to be interesting.
> Quoting Wikipedia's synthesis:
> "Unlike CPUs however, GPUs have a parallel throughput architecture
> that emphasizes executing many concurrent threads slowly, rather than
> executing a single thread very fast."
> And significant optimizations are needed anyway to get performance for
> GPU code (and if you don't need the last bit of performance, why
> bother with a GPU?), so I think that the need to use a C-like language
> is the smallest problem.
>
>> I don't have the time (and probably not the knowledge either) to develop that
>> pypy extension, but I just want to know if it's possible.
>> I'm interested in languages like openCL and nvidia's CUDA because I think
>> the future of supercomputing is going to be GPGPU.
>
> I would like to point out that while for some cases it might be right,
> the importance of GPGPU is probably often exaggerated:
>
> http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#
>
> Researchers in the field are mostly aware of the fact that GPGPU is
> the way to go only for a very restricted category of code. For that
> code, fine.
> Thus, instead of running Python code in a GPU, designing from scratch
> an easy way to program a GPU efficiently, for those tasks, is better,
> and projects for that already exist (i.e. what you cite).
>
> Additionally, it would probably take a different kind of JIT to
> exploit GPUs. No branch prediction, very small non-coherent caches, no
> efficient synchronization primitives, as I read from this paper... I'm
> no expert, but I guess you'd need to rearchitect from scratch the
> needed optimizations.
> And it took 20-30 years to get from the first, slow Lisp (1958) to,
> say, Self (1991), a landmark in performant high-level languages,
> derived from Smalltalk. Most of that would have to be redone.
>
> So, I guess that the effort to compile Python code for a GPU is not
> worth it. There might be further reasons due to the kind of code a JIT
> generates, since a GPU has no branch predictor, no caches, and so on,
> but I'm no GPU expert and I would have to check again.
>
> Finally, for general purpose code, exploiting the big expected number
> of CPUs on our desktop systems is already a challenge.
>
>> There are people working on
>> bringing GPGPU to Python:
>>
>> http://mathema.tician.de/software/pyopencl
>> http://mathema.tician.de/software/pycuda
>>
>> Would it be possible to run python code in parallel without the need (for
>> the developer) of actively parallelizing the code?
>
> I would say that Python is not yet the language to use to write
> efficient parallel code, because of the Global Interpreter Lock
> (Google for "Python GIL"). The two implementations having no GIL are
> IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
> and the current focus is not on removing it.
> Scientific computing uses external libraries (like NumPy) - for the
> supported algorithms, one could introduce parallelism at that level.
> If that's enough for your application, good.
> If you want to write a parallel algorithm in Python, we're not there yet.
>
>> I'm not talking about code of hard concurrency, but of code with intrinsic
>> parallelism (let's say matrix multiplication).
>
> Automatic parallelization is hard, see:
> http://en.wikipedia.org/wiki/Automatic_parallelization
>
> Lots of scientists have tried, lots of money has been invested, but
> it's still hard.
> The only practical approaches still require the programmer to
> introduce parallelism, but in ways much simpler than using
> multithreading directly. Google OpenMP and Cilk.
>
>> Would a JIT compilation be capable of detecting parallelism?
> Summing up what is above, probably not.
>
> Moreover, matrix multiplication may not be so easy as one might think.
> I do not know how to write it for a GPU, but in the end I reference
> some suggestions from that paper (where it is one of the benchmarks).
> But here, I explain why writing it for a CPU is complicated. You can
> multiply two matrices with a triply nested for loop, but such an algorithm
> has poor performance for big matrices because of bad cache locality.
> GPUs, according to the above-mentioned paper, provide no caches and
> hide latency in other ways.
>
> See here for the two main alternative ideas which allow solving this
> problem of writing an efficient matrix multiplication algorithm:
> http://en.wikipedia.org/wiki/Cache_blocking
> http://en.wikipedia.org/wiki/Cache-oblivious_algorithm
>
> Then, you need to parallelize the resulting code yourself, which might
> or might not be easy (depending on the interactions between the
> parallel blocks that are found there).
> In that paper, where matrix multiplication is called SGEMM (the
> BLAS routine implementing it), they suggest using a cache-blocked
> version of matrix multiplication for both CPUs and GPUs, and argue
> that parallelization is then easy.
>
> Cheers,
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev

From p.giarrusso at gmail.com  Sat Aug 21 01:46:28 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Sat, 21 Aug 2010 01:46:28 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

2010/8/20 Jim Baker :
> The Unladen Swallow doc, which was derived from a PEP that Jeff proposed,
> seems to be a fair descriptive outline of Python memory models in general,
> and Jython's in specific.
> Obviously the underlying implementation in the JVM is happens-before
> consistency; everything else derives from there. The CHM provides
> additional consistency constraints that should imply sequential consistency
> for a (vast) subset of Python programs. However, I can readily construct a
> program that violates sequential consistency: maybe it uses slots (stored in
> a Java array), or the array module (which also just wraps Java arrays), or
> accesses local variables in a frame from another thread (same storage,
> same problem). Likewise I can also create Python programs that access Java
> classes (since this is Jython!), and they too will only see happens-before
> consistency.
OK, I guess that volatile semantics for fields were just a side effect.
As far as I can see, you get sequential consistency only in practice,
not in theory - you have happens-before edges only when a reader and a
writer touch the same field. In practice, the few cases where it
matters can't apply here as far as I know, because a hash function
decides to which of the submaps a mapping belongs.

Your mention of slots is very cool! You made me recall that once you
get shadow classes in Python, you can not only do inline caching, but
you also have the _same_ object layout as in slots, because adding a
member causes a hidden class transition, getting rid of any kind of
dictionary _after compilation_. Some exceptions:
* an immutable dictionary mapping field names to offsets is used both
during JIT compilation and when inline caching fails, for
* a fallback case for when __dict__ is used is needed, I guess. A
dictionary need not necessarily be used, though: one could also make
__dict__ usage just cause class transitions.
* beyond a certain member count, i.e., if __dict__ is used as a
general-purpose dictionary, one might want to switch back to a
dictionary representation. This only applies if this is done in
Pythonic code (guess not) - I remember this case from V8, for
JavaScript, where the expected usage is different.
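
A minimal sketch of the shadow (hidden) class idea, written as plain Python
for illustration. The names Map, Instance and make_cached_getter are
hypothetical and this is not how Jython, PyPy or V8 actually implement it;
it only shows that objects built the same way share one layout description,
that their values live in flat slot-like storage, and that a call site can
cache the offset it found.

```python
# Hypothetical sketch of hidden classes with a one-entry inline cache.

class Map(object):
    """Layout description shared by all objects built the same way."""

    def __init__(self, offsets):
        self.offsets = offsets      # attribute name -> index into storage
        self.transitions = {}       # attribute name -> Map with it added

    def with_attribute(self, name):
        """Return the map reached by adding `name`, creating it on first use."""
        if name not in self.transitions:
            offsets = dict(self.offsets)
            offsets[name] = len(offsets)
            self.transitions[name] = Map(offsets)
        return self.transitions[name]


EMPTY_MAP = Map({})


class Instance(object):
    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []           # flat slots, no per-object dictionary

    def set(self, name, value):
        offset = self.map.offsets.get(name)
        if offset is None:
            # Adding a member is a hidden class transition.
            self.map = self.map.with_attribute(name)
            self.storage.append(value)
        else:
            self.storage[offset] = value


def make_cached_getter(name):
    """Simulate one attribute-read call site with a one-entry inline cache."""
    cache = {"map": None, "offset": None}

    def getter(obj):
        if obj.map is not cache["map"]:
            # Cache miss: fall back to the name -> offset dictionary.
            cache["map"] = obj.map
            cache["offset"] = obj.map.offsets[name]
        # Cache hit: a plain indexed load.
        return obj.storage[cache["offset"]]

    return getter


a, b = Instance(), Instance()
a.set("x", 1)
a.set("y", 2)
b.set("x", 10)
b.set("y", 20)
get_y = make_cached_getter("y")
values = [get_y(a), get_y(b)]
```

Since a and b gained their members in the same order, they end up with the
same Map object, so the second get_y call takes the cached path.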

> Naturally, the workarounds I mentioned for improving performance in object
> allocation all rely on not using CHM and its (modestly) expensive semantics.
> So this would mean using a Java class in some way, possibly a HashMap
> (especially one that's been exposed through our type expose mechanism to
> avoid reflection overhead), or directly using a Java class of some kind
> (again exposing is best, much like are builtin types like PyInteger),
> possibly with all fields marked as volatile.
> Hope this helps! If you are interested in studying this problem in more
> depth for Jython, or other implementations, and the implications of our
> hybrid model, it would certainly be most welcome. Unfortunately, it's not
> something that Jython development itself will be working on (standard time
> constraints apply here).

Such constraints apply to me too - but I hope to work on that.

> - Jim
> 2010/8/20 Paolo Giarrusso 
>>
>> 2010/8/20 Jim Baker :
>> > Jython single-threaded performance has little to do with a lack of the
>> > GIL.
>>
>> Never implied that - I do believe that a GIL-less fast Python is
>> possible. I just meant we don't have one yet.
>>
>> > Probably the only direct manifestation is seen in the overhead of
>> > allocating
>> > __dict__ (or dict) objects because Python attributes have volatile
>> > memory
>> > semantics
>> Uh? "Jython memory model" doesn't seem to find anything. Are there any
>> docs on this, with the rationale for the choice you describe?
>>
>> I've only found the Unladen Swallow proposals for a memory model:
>> http://code.google.com/p/unladen-swallow/wiki/MemoryModel (and
>> python-safethread, which I don't like).
>>
>> As a Java programmer using Jython, I wouldn't expect to have any
>> volatile field ever, but I would expect to be able to act on different
>> fields independently - the race conditions we have to protect from are
>> the ones on structural modification (unless the table uses open
>> addressing).
>> _This_ can be implemented through ConcurrentHashMap (which also makes
>> all fields volatile), but an implementation not guaranteeing volatile
>> semantics (if possible) would have been equally valid.
>> I am interested because I want to experiment with alternatives.
>>
>> Of course, you can offer stronger semantics, but then you should also
>> advertise that fields are volatile, thus I don't need a lock to pass a
>> reference.
>>
>> > , which is ensured by the backing of a ConcurrentHashMap, which can
>> > be expensive to allocate. There are workarounds.
>>
>> I'm also curious about such workarounds - are they currently
>> implemented or speculations?
>>
>>
>>
>>
>> --
>> Paolo Giarrusso - Ph.D. Student
>> http://www.informatik.uni-marburg.de/~pgiarrusso/
>
>



-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From sakesun at gmail.com  Sat Aug 21 05:20:11 2010
From: sakesun at gmail.com (sakesun roykiatisak)
Date: Sat, 21 Aug 2010 10:20:11 +0700
Subject: [pypy-dev] What's wrong with >>> open('xxx',
	'w').write('stuff') ?
In-Reply-To: 
References: 
	<20100820095721.GC16244@code0.codespeak.net>
	
	
Message-ID: 

This discussion is getting a little longer than necessary, at least for
me.  :)
Most of the PyPy talk videos have pretty poor recording quality. Most of the
time I can barely make out the content from the slides.

I have always understood the difference between resource lifetime and object
lifetime.
Actually, in recent years my sole Python interpreter has been
the non-refcounting IronPython.  And I always wrap file operations
inside try/finally or a with statement.

The problem is the example that is claimed to cause the problem:

>>> open('xxx', 'w').write('stuff')

I misinterpreted it as saying that the problem occurs in the "write" method.
The above statement causes no problem by itself, but subsequent use of
the file will. That's what I missed.
In fact, it might be more intuitive to demonstrate with a slightly longer
sample.

>>> open('xxx', 'w').write('stuff')
>>> # Might fail! The first file might not be closed yet!
>>> assert open('xxx').read() == 'stuff'
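
For comparison, a sketch of the same sequence with the file lifetime made
explicit through the with statement, so that it does not depend on when the
interpreter collects the file object. The temporary directory is only there
to keep the example self-contained; the file name 'xxx' is taken from the
sample above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "xxx")

# Leaving the with block closes (and therefore flushes) the file,
# regardless of when the file object itself is garbage-collected.
with open(path, "w") as f:
    f.write("stuff")

# The data is on disk by now, on refcounting and non-refcounting
# interpreters alike.
with open(path) as f:
    content = f.read()
```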


Cheers.








On Fri, Aug 20, 2010 at 8:39 PM, Paolo Giarrusso wrote:

> On Fri, Aug 20, 2010 at 12:23, Donny Viszneki 
> wrote:
> > Armin: Sakesun used "del f" and it appears you did not.
> Actually, he didn't either. He said "I think that open('xxx',
> 'w').write('stuff')" is equivalent to using del (which he thought
> would work), and the equivalence was correct.
>
> Anyway, in the _first reply_ message, he realized that using:
>
> ipy -c "open('xxx', 'w').write('stuff')"
> jython -c "open('xxx', 'w').write('stuff')"
>
> made a difference (because the interpreter exited), so that problem
> was solved. His mail implies that on PyPy he typed the code at the
> prompt, rather than at -c.
>
>

From hakan at debian.org  Sat Aug 21 09:06:10 2010
From: hakan at debian.org (Hakan Ardo)
Date: Sat, 21 Aug 2010 09:06:10 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 
	
	
Message-ID: 

Hi,
here is another effort allowing you to write GPU kernels using
Python, targeted at GPGPU. The programmer has to explicitly state the
parallelism and there are restrictions on what kind of constructs are
allowed in the kernels, but it's pretty cool:

  http://www.cs.lth.se/home/Calle_Lejdfors/pygpu/

On Sat, Aug 21, 2010 at 12:46 AM, Nick Bray  wrote:
> I can't speak for GPGPU, but I have compiled a subset of Python onto
> the GPU for real-time rendering. ?The subset is a little broader than
> RPython in some ways (for example, attributes are semantically
> identical to Python) and a little narrower in some ways (many forms of
> recursion are disallowed.) ?This big idea is that it allows you to
> create a real-time rendering system with a single code base, and
> transparently share functions and data structures between the CPU and
> GPU.
>
> http://www.ncbray.com/pystream.html
> http://www.ncbray.com/ncbray-dissertation.pdf
>
> It's at least ~100,000x faster than interpreting Python on the CPU.
> "At least" because the measurements neglect doing things on the CPU
> like texture sampling. ?This speedup is pretty obscene, but if you
> break it down it isn't too unbelievable... 100x for interpreted ->
> compiled, 10x for abstraction overhead of using floats instead of
> doubles, 100x for using the GPU and using it for a task it was built
> for.
>
> Parallelism issues are sidestepped by explicitly identifying the
> parallel sections (one function processes every vertex, one function
> processes every fragment), requiring the parallel sections have no
> global side effects, and that certain I/O conventions are followed.
> Sorry, no big answers here - it's essentially Pythonic stream
> programming.
>
> The biggest issues with getting Python onto the GPU is memory. ?I was
> actually targeting GLSL, not CUDA (it can't access the full rendering
> pipeline), so pointers were not available. ?To work around this, the
> code is optimized to an extreme degree to remove as many memory
> operations as possible. ?The remaining memory operations are emulated
> by splitting the heap into regions, indirecting through arrays, and
> copying constant data wherever possible. ?From what I've seen this is
> where PyPy would have the most trouble: its analysis algorithms are
> good enough for inferring types and ?allowing compilation /
> translation... they aren't designed to enable aggressive optimization
> of memory operations (there's not a huge reason to do this if you're
> translating RPython into C... the C compiler will do it for you). ?In
> general, GPU programming doesn't work well with memory access (too
> many functional units, too little bandwidth). ?Most of the "C-like"
> GPU languages are designed to they can easily boil down into code
> operating out of registers. ?Python, on the other hand, is addicted to
> heap memory. ?Even if you target CUDA, eliminating memory operations
> will be a huge win.
>
> I'll freely admit there's some ugly things going on, such as the lack
> of recursion, reliance on exhaustive inlining, requiring GPU code
> follow a specific form, and not working well with container objects in
> certain situations (it needs to bound the size of the heap). ?In the
> end, however, it's a talking dog... the grammar may not be perfect,
> but the dog talks! ?If anyone has questions, either private or on the
> list, I'd be happy to answer them. ?I have not done enough to
> advertise my project, and this seems like a good place to start.
>
> - Nick Bray
>
> 2010/8/20 Paolo Giarrusso :
>> 2010/8/20 Jorge Tim?n :
>>> Hi, I'm just curious about the feasibility of running python code in a gpu
>>> by extending pypy.
>> Disclaimer: I am not a PyPy developer, even if I've been following the
>> project with interest. Nor am I an expert of GPU - I provide links to
>> the literature I've read.
>> Yet, I believe that such an attempt is unlikely to be interesting.
>> Quoting Wikipedia's synthesis:
>> "Unlike CPUs however, GPUs have a parallel throughput architecture
>> that emphasizes executing many concurrent threads slowly, rather than
>> executing a single thread very fast."
>> And significant optimizations are needed anyway to get performance for
>> GPU code (and if you don't need the last bit of performance, why
>> bother with a GPU?), so I think that the need to use a C-like language
>> is the smallest problem.
>>
>>> I don't have the time (and probably the knowledge neither) to develop that
>>> pypy extension, but I just want to know if it's possible.
>>> I'm interested in languages like openCL and nvidia's CUDA because I think
>>> the future of supercomputing is going to be GPGPU.
>>
>> I would like to point out that while for some cases it might be right,
>> the importance of GPGPU is probably often exaggerated:
>>
>> http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#
>>
>> Researchers in the field are mostly aware of the fact that GPGPU is
>> the way to go only for a very restricted category of code. For that
>> code, fine.
>> Thus, instead of running Python code in a GPU, designing from scratch
>> an easy way to program a GPU efficiently, for those task, is better,
>> and projects for that already exist (i.e. what you cite).
>>
>> Additionally, it would take probably a different kind of JIT to
>> exploit GPUs. No branch prediction, very small non-coherent caches, no
>> efficient synchronization primitives, as I read from this paper... I'm
>> no expert, but I guess you'd need to rearchitecture from scratch the
>> needed optimizations.
>> And it took 20-30 years to get from the first, slow Lisp (1958) to,
>> say, Self (1991), a landmark in performant high-level languages,
>> derived from SmallTalk. Most of that would have to be redone.
>>
>> So, I guess that the effort to compile Python code for a GPU is not
>> worth it. There might be further reasons due to the kind of code a JIT
>> generates, since a GPU has no branch predictor, no caches, and so on,
>> but I'm no GPU expert and I would have to check again.
>>
>> Finally, for general purpose code, exploiting the big expected number
>> of CPUs on our desktop systems is already a challenge.
>>
>>> There's people working in
>>> bringing GPGPU to python:
>>>
>>> http://mathema.tician.de/software/pyopencl
>>> http://mathema.tician.de/software/pycuda
>>>
>>> Would it be possible to run python code in parallel without the need (for
>>> the developer) of actively parallelizing the code?
>>
>> I would say that Python is not yet the language to use to write
>> efficient parallel code, because of the Global Interpreter Lock
>> (Google for "Python GIL"). The two implementations having no GIL are
>> IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
>> and the current focus is not on removing it.
>> Scientific computing uses external libraries (like NumPy) - for the
>> supported algorithms, one could introduce parallelism at that level.
>> If that's enough for your application, good.
>> If you want to write a parallel algorithm in Python, we're not there yet.
>>
>>> I'm not talking about code of hard concurrency, but of code with intrinsic
>>> parallelism (let's say matrix multiplication).
>>
>> Automatic parallelization is hard, see:
>> http://en.wikipedia.org/wiki/Automatic_parallelization
>>
>> Lots of scientists have tried, lots of money has been invested, but
>> it's still hard.
>> The only practical approaches still require the programmer to
>> introduce parallelism, but in ways much simpler than using
>> multithreading directly. Google OpenMP and Cilk.
>>
>>> Would a JIT compilation be capable of detecting parallelism?
>> Summing up what is above, probably not.
>>
>> Moreover, matrix multiplication may not be so easy as one might think.
>> I do not know how to write it for a GPU, but in the end I reference
>> some suggestions from that paper (where it is one of the benchmarks).
>> But here, I explain why writing it for a CPU is complicated. You can
>> multiply two matrixes with a triply nested for, but such an algorithm
>> has poor performance for big matrixes because of bad cache locality.
>> GPUs, according to the above mentioned paper, provide no caches and
>> hides latency in other ways.
>>
>> See here for the two main alternative ideas which allow solving this
>> problem of writing an efficient matrix multiplication algorithm:
>> http://en.wikipedia.org/wiki/Cache_blocking
>> http://en.wikipedia.org/wiki/Cache-oblivious_algorithm
>>
>> Then, you need to parallelize the resulting code yourself, which might
>> or might not be easy (depending on the interactions between the
>> parallel blocks that are found there).
>> In that paper, where matrix multiplication is called as SGEMM (the
>> BLAS routine implementing it), they suggest using a cache-blocked
>> version of matrix multiplication for both CPUs and GPUs, and argue
>> that parallelization is then easy.
>>
>> Cheers,
>> --
>> Paolo Giarrusso - Ph.D. Student
>> http://www.informatik.uni-marburg.de/~pgiarrusso/
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>
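
The cache-blocking idea mentioned in the quoted text can be sketched in
plain Python as follows. This is only an illustration under stated
assumptions: the names matmul_naive and matmul_blocked and the block
parameter are made up here, matrices are lists of lists, and the code is
not taken from the paper or from any BLAS implementation.

```python
def matmul_naive(a, b):
    """Triply nested loop: simple, but walks b column-wise (poor locality)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=32):
    """Cache-blocked variant: work on block x block tiles so that the tiles
    of a, b and c being touched stay small enough to fit in cache.
    'block' is a hypothetical tuning parameter."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # Multiply one tile of a with one tile of b.
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, m)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

Each (ii, jj) tile of the result is independent of the others, which is
presumably why parallelizing the blocked version is described as easy.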



-- 
Håkan Ardö

From cfbolz at gmx.de  Sat Aug 21 10:25:39 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Sat, 21 Aug 2010 10:25:39 +0200
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 				
	
Message-ID: <4C6F8D83.8060709@gmx.de>

Hi Paolo,

On 08/21/2010 01:46 AM, Paolo Giarrusso wrote:
[...]
> Your mention of slots is very cool! You made me recall that once you
> get shadow classes in Python, you can not only do inline caching, but
> you also have the _same_ object layout as in slots, because adding a
> member causes a hidden class transition, getting rid of any kind of
> dictionary _after compilation_. Two exceptions:
> * an immutable dictionary mapping field names to offsets is used both
> during JIT compilation and when inline caching fails, for
> * a fallback case for when __dict__ is used, I guess, is needed. Not
> necessarily a dictionary must be used though: one could also make
> __dict__ usage just cause class transitions.
> * beyond a certain member count, i.e., if __dict__ is used as a
> general-purpose dictionary, one might want to switch back to a
> dictionary representation. This only applies if this is done in
> Pythonic code (guess not) - I remember this case from V8, for
> JavaScript, where the expected usage is different.
>

Just as a note: PyPy's Python interpreter does all this already, and I 
am working on making it even cooler :-).
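
For readers unfamiliar with the technique, here is a minimal sketch of
hidden classes ("maps") as described in the quoted text. It is an
illustration only, not PyPy's actual implementation; the names Map,
Instance, with_attribute and EMPTY_MAP are hypothetical.

```python
class Map(object):
    """Shared, effectively immutable description of an object layout:
    attribute name -> index into the instance's storage list."""

    def __init__(self, offsets=None):
        self.offsets = offsets or {}
        self.transitions = {}  # attribute name -> successor Map

    def lookup(self, name):
        return self.offsets.get(name, -1)

    def with_attribute(self, name):
        # Adding a member causes a (cached) hidden-class transition.
        if name not in self.transitions:
            new_offsets = dict(self.offsets)
            new_offsets[name] = len(self.offsets)
            self.transitions[name] = Map(new_offsets)
        return self.transitions[name]


EMPTY_MAP = Map()


class Instance(object):
    """Object without a per-instance dictionary: just a map and a list."""

    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []

    def setattr(self, name, value):
        index = self.map.lookup(name)
        if index >= 0:
            self.storage[index] = value
        else:
            self.map = self.map.with_attribute(name)
            self.storage.append(value)

    def getattr(self, name):
        index = self.map.lookup(name)
        if index < 0:
            raise AttributeError(name)
        return self.storage[index]
```

Two instances that receive the same attributes in the same order end up
sharing one Map, so a JIT can guard on the map identity and then read a
fixed storage index, much like with slots.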

[...]

Cheers,

Carl Friedrich

From hakan at debian.org  Sat Aug 28 15:05:11 2010
From: hakan at debian.org (Hakan Ardo)
Date: Sat, 28 Aug 2010 15:05:11 +0200
Subject: [pypy-dev] Loop invaraints
Message-ID: 

Hi,
some time ago, there was some discussion about loop invariants, but
no conclusion. What do you think about the following approach:

- Let optimize_loop mark the arguments in loop.inputargs as invariant
if they appear at the same position in the jump instruction at the end,
before calling propagate_forward.

- Let the optimize_... methods emit operations that only use
invariant arguments to some preamble instead of emitting them to
self.newoperations whenever that is safe. Also, the result of these
operations should probably be marked as invariant.

- Insert the created preamble at every point where the loop is called,
right before the jump.

- When compiling a bridge from a failing guard, run the preamble
through propagate_forward and discard the emitted operations, to
inherit that part of the state of Optimizer.

This should place the invariant instructions at the end of the entry
bridge, which is a suitable place, right? At the end of a bridge from
a failing guard that maintains the invariants the optimizer should
remove the inserted preamble again, right? And at the end of a bridge
that invalidates them, enough of the preamble will be kept to maintain
correct behavior, right?
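
The first two steps above can be sketched roughly as follows. This is a
toy model, not the real optimizeopt code: Op, PURE_OPS,
find_invariant_args and split_preamble are hypothetical names, variables
are plain strings, and integer arguments stand in for constants.

```python
from collections import namedtuple

# Toy trace operation: name, list of argument names/constants, result name.
Op = namedtuple("Op", "name args result")

# Assumed set of side-effect-free operations that are safe to hoist.
PURE_OPS = frozenset(["int_add", "int_mul", "int_lt"])


def find_invariant_args(inputargs, jumpargs):
    """An input argument is invariant if the closing jump passes it back
    unchanged at the same position."""
    assert len(inputargs) == len(jumpargs)
    return set(a for a, j in zip(inputargs, jumpargs) if a == j)


def split_preamble(operations, invariant):
    """Move pure operations whose arguments are all invariant (or
    constants) to a preamble; their results become invariant too."""
    invariant = set(invariant)
    preamble, body = [], []
    for op in operations:
        hoistable = op.name in PURE_OPS and all(
            isinstance(arg, int) or arg in invariant for arg in op.args)
        if hoistable:
            preamble.append(op)
            invariant.add(op.result)
        else:
            body.append(op)
    return preamble, body
```

For a loop with inputargs (i, n, a) that jumps back with (i1, n, a), the
arguments n and a are invariant, so an int_mul(a, 2) would go to the
preamble while int_add(i, t0) stays in the loop body. Guards and anything
with side effects are never hoisted in this sketch.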

-- 
Håkan Ardö

From william.leslie.ttg at gmail.com  Sun Aug 29 00:05:06 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Sun, 29 Aug 2010 08:05:06 +1000
Subject: [pypy-dev] Loop invaraints
In-Reply-To: 
References: 
Message-ID: 

The other part of the work is the algorithm that finds loop variants. It is
similar to the algorithm for variable colour inference, so you do have a
starting point.

On 28/08/2010 11:12 PM, "Hakan Ardo"  wrote:

Hi,
some time ago, there were some discussion about loop invaraints, but
no conclusion. What do you think about the following approach:

- Let optimize_loop mark the arguments in loop.inputargs as invariant
if they appear at the same position in the jump instruction at the end
before calling propagate_formward

- Let the optimize_... methods emit operations that only uses
invariant arguments to some preamble instead of emitting them to
self.newoperations whenever that is safe. Also, the result of these
operations should probably be marked as invariant.

- Insert the created preamble at every point where the loop is called,
right before the jump.

- When compiling a bridge from a failing guard, run the the preamble
through propagate_formward and discard the emitted operations, to
inherit that part of the state of Optimizer.

This should place the invariant instructions at the end of the entry
bridge, which is a suitable place, right? At the end of a bridge from
a failing guard that maintains the invariants the optimizer should
remove the inserted preamble again, right? And at the end of a bridge
that invalidates them, enough of the preamble will be kept to maintain
correct behavior, right?

--
Håkan Ardö
_______________________________________________
pypy-dev at codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev

From cfbolz at gmx.de  Sun Aug 29 12:32:23 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Sun, 29 Aug 2010 12:32:23 +0200
Subject: [pypy-dev] Loop invaraints
In-Reply-To: 
References: 
Message-ID: <4C7A3737.50902@gmx.de>

Hi Håkan,

thanks for taking up the topic.

On 08/28/2010 03:05 PM, Hakan Ardo wrote:
> - Let optimize_loop mark the arguments in loop.inputargs as
> invariant if they appear at the same position in the jump instruction
> at the end before calling propagate_formward

sounds good.

> - Let the optimize_... methods emit operations that only uses
> invariant arguments to some preamble instead of emitting them to
> self.newoperations whenever that is safe. Also, the result of these
> operations should probably be marked as invariant.

Need to be a bit careful about operations with side-effects, but
basically yes.

> - Insert the created preamble at every point where the loop is
> called, right before the jump.

This part makes sense to me. The code would have to be careful to match
the variables in the trace and in the preamble.

> - When compiling a bridge from a failing guard, run the the preamble
> through propagate_formward and discard the emitted operations, to
> inherit that part of the state of Optimizer.

... but I don't see why this is needed. Wouldn't you rather need the 
whole trace of the loop including the preamble up to the failing guard? 
This would be bad, because you need to store the full trace then.

> This should place the invariant instructions at the end of the entry
> bridge, which is a suitable place, right? At the end of a bridge
> from a failing guard that maintains the invariants the optimizer
> should remove the inserted preamble again, right? And at the end of a
> bridge that invalidates them, enough of the preamble will be kept to
> maintain correct behavior, right?

Yes to all the questions, at least as far as I can see. I guess in 
practice there might be complications.

Cheers,

Carl Friedrich


P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think 
it would be good if the bounds-related optimizations could move out of 
optimizeopt.py to their own file, because otherwise optimizeopt.py is 
getting really unwieldy. Does that make sense?

From arigo at tunes.org  Sun Aug 29 13:04:11 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 29 Aug 2010 13:04:11 +0200
Subject: [pypy-dev] Loop invaraints
In-Reply-To: 
References: 
Message-ID: <20100829110411.GA13704@code0.codespeak.net>

Hi,

On Sat, Aug 28, 2010 at 03:05:11PM +0200, Hakan Ardo wrote:
> some time ago, there were some discussion about loop invaraints, but
> no conclusion.

A general answer to that question: there are two kinds of goals we can
have when optimizing.  One is to get the fastest possible code for small
Python loops, e.g. doing numerical computations.  The other is to get
reasonably good code for large and complicated loops, e.g. the dispatch
loop of some network application.  Although loop-invariant code motion
would definitely be great for the first kind of loops, it's unclear that
it helps on the second kind of loops.

As a similar consideration, I am thinking about trying to remove the
optimization that passes "virtuals" from one iteration of the loop to
the next one.  Although it has good effects on small loops, it has
actually a negative effect on large loops, because the loop taking
virtual arguments cannot be directly jumped to from the interpreter.

I'm not saying that loop-invariant code motion could also have a
negative effect on large loops; I think it's a pure win, so it's
probably worth a try.  I'm just giving a warning: it may not help much
in the case of a "general Python program doing lots of stuff", but only
in the case of small numerical computation loops.


A bientot,

Armin.

From hakan at debian.org  Sun Aug 29 13:49:23 2010
From: hakan at debian.org (Hakan Ardo)
Date: Sun, 29 Aug 2010 13:49:23 +0200
Subject: [pypy-dev] Loop invaraints
In-Reply-To: <4C7A3737.50902@gmx.de>
References: 
	<4C7A3737.50902@gmx.de>
Message-ID: 

On Sun, Aug 29, 2010 at 12:32 PM, Carl Friedrich Bolz  wrote:
>
> ... but I don't see why this is needed. Wouldn't you rather need the

My thinking was that for the preamble to be removed from the end of a
bridge maintaining the invariant this would be needed? But I might be
mistaken?

> whole trace of the loop including the preamble up to the failing guard?
> This would be bad, because you need to store the full trace then.

OK, so that might be a problem. Maybe it would be possible to extract
what part of the state it would be safe to inherit even if only the
preamble has been processed, i.e. self.pure_operations might be ok?

> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
> it would be good if the bounds-related optimizations could move out of
> optimizeopt.py to their own file, because otherwise optimizeopt.py is
> getting really unwieldy. Does that make sense?

Well, class IntBound and the propagate_bounds_ methods could probably
be moved elsewhere, but a lot of the work is done in optimize_...
methods, which I'm not so sure it would make sense to split up.

-- 
Håkan Ardö

From cfbolz at gmx.de  Sun Aug 29 14:03:37 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Sun, 29 Aug 2010 14:03:37 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: 
References: 	<4C7A3737.50902@gmx.de>
	
Message-ID: <4C7A4C99.2050803@gmx.de>

On 08/29/2010 01:49 PM, Hakan Ardo wrote:
>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
>> it would be good if the bounds-related optimizations could move out of
>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
>> getting really unwieldy. Does that make sense?
>
> Well, class IntBound and the propagate_bounds_ methods could probably
> be moved elsewhere, but a lot of the work is done in optimize_...
> methods, which I'm not so sure it would make sens to split up.

I guess then the things that can be sanely moved should move. The file 
is nearly 2000 lines, which is way too big. I guess also the heap 
optimizations could go to their own file.

Carl Friedrich

From fijall at gmail.com  Sun Aug 29 22:05:49 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 29 Aug 2010 22:05:49 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: <4C7A4C99.2050803@gmx.de>
References: 
	<4C7A3737.50902@gmx.de>
	
	<4C7A4C99.2050803@gmx.de>
Message-ID: 

On Sun, Aug 29, 2010 at 2:03 PM, Carl Friedrich Bolz  wrote:
> On 08/29/2010 01:49 PM, Hakan Ardo wrote:
>>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
>>> it would be good if the bounds-related optimizations could move out of
>>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
>>> getting really unwieldy. Does that make sense?
>>
>> Well, class IntBound and the propagate_bounds_ methods could probably
>> be moved elsewhere, but a lot of the work is done in optimize_...
>> methods, which I'm not so sure it would make sens to split up.
>
> I guess then the things that can be sanely moved should move. The file
> is nearly 2000 lines, which is way too big. I guess also the heap
> optimizations could go to their own file.
>
> Carl Friedrich

How about a couple of files (preferably small) each containing a
contained optimization if possible? (maybe a package?)

From hakan at debian.org  Tue Aug 31 09:25:13 2010
From: hakan at debian.org (Hakan Ardo)
Date: Tue, 31 Aug 2010 09:25:13 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: 
References: 
	<4C7A3737.50902@gmx.de>
	
	<4C7A4C99.2050803@gmx.de>
	
Message-ID: 

Ok, so we split it up into a set of Optimization classes in separate
files, each containing a subset of the optimize_... methods. Then we
have the propagate_forward method iterate over the instructions,
passing them to one Optimization after the other? That way we keep the
single iteration over the instructions. Would it be preferable to
separate them even more and have each Optimization contain its own
loop over the instructions?
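
A minimal sketch of the first variant (one loop over the instructions,
each instruction handed down a chain) might look like this. Only
propagate_forward and newoperations are names from the existing code;
Optimization, emit, Collector, FoldAddZero, RemoveSameAs and optimize are
hypothetical, and operations are modelled as (name, args, result) tuples.

```python
class Optimization(object):
    """One link in the chain; by default it passes operations through."""

    def __init__(self):
        self.next_optimization = None

    def propagate_forward(self, op):
        self.emit(op)

    def emit(self, op):
        self.next_optimization.propagate_forward(op)


class Collector(Optimization):
    """Last link: gathers whatever survives the chain."""

    def __init__(self):
        Optimization.__init__(self)
        self.newoperations = []

    def propagate_forward(self, op):
        self.newoperations.append(op)


class FoldAddZero(Optimization):
    """Rewrites int_add(x, 0) into same_as(x)."""

    def propagate_forward(self, op):
        name, args, result = op
        if name == "int_add" and args[1] == 0:
            self.emit(("same_as", (args[0],), result))
        else:
            self.emit(op)


class RemoveSameAs(Optimization):
    """Drops same_as and renames later uses of its result."""

    def __init__(self):
        Optimization.__init__(self)
        self.renames = {}

    def propagate_forward(self, op):
        name, args, result = op
        args = tuple(self.renames.get(arg, arg) for arg in args)
        if name == "same_as":
            self.renames[result] = args[0]
        else:
            self.emit((name, args, result))


def optimize(operations, optimizations):
    """Link the optimizations into a chain and iterate once over the trace."""
    collector = Collector()
    chain = list(optimizations) + [collector]
    for first, second in zip(chain, chain[1:]):
        first.next_optimization = second
    for op in operations:
        chain[0].propagate_forward(op)
    return collector.newoperations
```

In this sketch a later link sees the output of an earlier one within the
same pass, so FoldAddZero followed by RemoveSameAs removes int_add(x, 0)
entirely without a second loop over the instructions.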

On Sun, Aug 29, 2010 at 10:05 PM, Maciej Fijalkowski  wrote:
> On Sun, Aug 29, 2010 at 2:03 PM, Carl Friedrich Bolz  wrote:
>> On 08/29/2010 01:49 PM, Hakan Ardo wrote:
>>>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
>>>> it would be good if the bounds-related optimizations could move out of
>>>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
>>>> getting really unwieldy. Does that make sense?
>>>
>>> Well, class IntBound and the propagate_bounds_ methods could probably
>>> be moved elsewhere, but a lot of the work is done in optimize_...
>>> methods, which I'm not so sure it would make sens to split up.
>>
>> I guess then the things that can be sanely moved should move. The file
>> is nearly 2000 lines, which is way too big. I guess also the heap
>> optimizations could go to their own file.
>>
>> Carl Friedrich
>
> How about a couple of files (preferably small) each containing a
> contained optimization if possible? (maybe a package?)
>



-- 
Håkan Ardö

From hakan at debian.org  Tue Aug 31 09:20:15 2010
From: hakan at debian.org (Hakan Ardo)
Date: Tue, 31 Aug 2010 09:20:15 +0200
Subject: [pypy-dev] Loop invaraints
In-Reply-To: <20100829110411.GA13704@code0.codespeak.net>
References: 
	<20100829110411.GA13704@code0.codespeak.net>
Message-ID: 

On Sun, Aug 29, 2010 at 1:04 PM, Armin Rigo  wrote:
>
> I'm not saying that loop-invariant code motion could also have a
> negative effect on large loops; I think it's a pure win, so it's
> probably worth a try.  I'm just giving a warning: it may not help much
> in the case of a "general Python program doing lots of stuff", but only
> in the case of small numerical computation loops.

Right. I write a lot of numerical computation loops these days, both
small and somewhat bigger, and I am typically forced to write them in C
to get decent performance. So the motivation here would rather be to
broaden the usability of python than to improve performance of
existing python programs.

Another motivation might be to help pypy developers focus on the
important instructions while staring at traces, i.e. by hiding the
instructions that will be inserted only once :)


-- 
Håkan Ardö

From fijall at gmail.com  Tue Aug 31 10:38:22 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 31 Aug 2010 10:38:22 +0200
Subject: [pypy-dev] Loop invaraints
In-Reply-To: 
References: 
	<20100829110411.GA13704@code0.codespeak.net>
	
Message-ID: 

On Tue, Aug 31, 2010 at 9:20 AM, Hakan Ardo  wrote:
> On Sun, Aug 29, 2010 at 1:04 PM, Armin Rigo  wrote:
>>
>> I'm not saying that loop-invariant code motion could also have a
>> negative effect on large loops; I think it's a pure win, so it's
>> probably worth a try. ?I'm just giving a warning: it may not help much
>> in the case of a "general Python program doing lots of stuff", but only
>> in the case of small numerical computation loops.
>
> Right. I write a lot of numerical computation loops these days, both
> small and somewhat bigger, and I am typically force to write them in C
> to get decent performance. So the motivation here would rater be to
> broaden the usability of python than to improve performance of
> exciting python programs.
>
> Another motivation might be to help pypy developers focus on the
> important instruction while staring at traces, ie by hiding the
> instructions that will be inserted only once :)
>

I second hakan here - small loops are not uninteresting, since it
broadens areas where you can use python, not limiting yourself to
existing python programs.

From elmo.mantynen at iki.fi  Wed Sep  1 11:01:34 2010
From: elmo.mantynen at iki.fi (Elmo)
Date: Wed, 01 Sep 2010 12:01:34 +0300
Subject: [pypy-dev] gpgpu and pypy
In-Reply-To: 
References: 		
	
Message-ID: <4C7E166E.9030008@iki.fi>

This seems similar to what MyHDL does. It's a python framework to be 
used as an HDL (hardware description language) for describing gate array 
configurations (outputs Verilog and VHDL for FPGAs or ASICs). It uses a 
similar approach to RPython as the compilation works on objects created 
by running the python code, with restrictions applying mostly to the 
code inside the generators that are used to encode the intended behavior 
(which is intrinsically parallel).

A bit on a different level, since you could use MyHDL to describe (and 
implement) a GPU, but I thought it would be interesting :)

Elmo

On 08/21/2010 10:06 AM, Hakan Ardo wrote:
> Hi,
> here is a another effort allowing you to write GPU kernels using
> python, targeted at gpgpu. The programmer has to explicitly state the
> parallelism and there are restrictions on what kind of constructs are
> allowed in the kernels, but it's pretty cool:
>
>    http://www.cs.lth.se/home/Calle_Lejdfors/pygpu/
>
> On Sat, Aug 21, 2010 at 12:46 AM, Nick Bray  wrote:
>> I can't speak for GPGPU, but I have compiled a subset of Python onto
>> the GPU for real-time rendering.  The subset is a little broader than
>> RPython in some ways (for example, attributes are semantically
>> identical to Python) and a little narrower in some ways (many forms of
>> recursion are disallowed.)  This big idea is that it allows you to
>> create a real-time rendering system with a single code base, and
>> transparently share functions and data structures between the CPU and
>> GPU.
>>
>> http://www.ncbray.com/pystream.html
>> http://www.ncbray.com/ncbray-dissertation.pdf
>>
>> It's at least ~100,000x faster than interpreting Python on the CPU.
>> "At least" because the measurements neglect doing things on the CPU
>> like texture sampling.  This speedup is pretty obscene, but if you
>> break it down it isn't too unbelievable... 100x for interpreted ->
>> compiled, 10x for abstraction overhead of using floats instead of
>> doubles, 100x for using the GPU and using it for a task it was built
>> for.
>>
>> Parallelism issues are sidestepped by explicitly identifying the
>> parallel sections (one function processes every vertex, one function
>> processes every fragment), requiring the parallel sections have no
>> global side effects, and that certain I/O conventions are followed.
>> Sorry, no big answers here - it's essentially Pythonic stream
>> programming.
>>
>> The biggest issues with getting Python onto the GPU is memory.  I was
>> actually targeting GLSL, not CUDA (it can't access the full rendering
>> pipeline), so pointers were not available.  To work around this, the
>> code is optimized to an extreme degree to remove as many memory
>> operations as possible.  The remaining memory operations are emulated
>> by splitting the heap into regions, indirecting through arrays, and
>> copying constant data wherever possible.  From what I've seen this is
>> where PyPy would have the most trouble: its analysis algorithms are
>> good enough for inferring types and  allowing compilation /
>> translation... they aren't designed to enable aggressive optimization
>> of memory operations (there's not a huge reason to do this if you're
>> translating RPython into C... the C compiler will do it for you).  In
>> general, GPU programming doesn't work well with memory access (too
>> many functional units, too little bandwidth).  Most of the "C-like"
>> GPU languages are designed to they can easily boil down into code
>> operating out of registers.  Python, on the other hand, is addicted to
>> heap memory.  Even if you target CUDA, eliminating memory operations
>> will be a huge win.
>>
>> I'll freely admit there's some ugly things going on, such as the lack
>> of recursion, reliance on exhaustive inlining, requiring GPU code
>> follow a specific form, and not working well with container objects in
>> certain situations (it needs to bound the size of the heap).  In the
>> end, however, it's a talking dog... the grammar may not be perfect,
>> but the dog talks!  If anyone has questions, either private or on the
>> list, I'd be happy to answer them.  I have not done enough to
>> advertise my project, and this seems like a good place to start.
>>
>> - Nick Bray
>>
>> 2010/8/20 Paolo Giarrusso:
>>> 2010/8/20 Jorge Timón:
>>>> Hi, I'm just curious about the feasibility of running python code in a gpu
>>>> by extending pypy.
>>> Disclaimer: I am not a PyPy developer, even if I've been following the
>>> project with interest. Nor am I an expert of GPU - I provide links to
>>> the literature I've read.
>>> Yet, I believe that such an attempt is unlikely to be interesting.
>>> Quoting Wikipedia's synthesis:
>>> "Unlike CPUs however, GPUs have a parallel throughput architecture
>>> that emphasizes executing many concurrent threads slowly, rather than
>>> executing a single thread very fast."
>>> And significant optimizations are needed anyway to get performance for
>>> GPU code (and if you don't need the last bit of performance, why
>>> bother with a GPU?), so I think that the need to use a C-like language
>>> is the smallest problem.
>>>
>>>> I don't have the time (and probably the knowledge neither) to develop that
>>>> pypy extension, but I just want to know if it's possible.
>>>> I'm interested in languages like openCL and nvidia's CUDA because I think
>>>> the future of supercomputing is going to be GPGPU.
>>>
>>> I would like to point out that while for some cases it might be right,
>>> the importance of GPGPU is probably often exaggerated:
>>>
>>> http://portal.acm.org/citation.cfm?id=1816021&coll=GUIDE&dl=GUIDE&CFID=11111111&CFTOKEN=2222222&ret=1#
>>>
>>> Researchers in the field are mostly aware of the fact that GPGPU is
>>> the way to go only for a very restricted category of code. For that
>>> code, fine.
>>> Thus, instead of running Python code in a GPU, designing from scratch
>>> an easy way to program a GPU efficiently, for those task, is better,
>>> and projects for that already exist (i.e. what you cite).
>>>
>>> Additionally, it would take probably a different kind of JIT to
>>> exploit GPUs. No branch prediction, very small non-coherent caches, no
>>> efficient synchronization primitives, as I read from this paper... I'm
>>> no expert, but I guess you'd need to rearchitecture from scratch the
>>> needed optimizations.
>>> And it took 20-30 years to get from the first, slow Lisp (1958) to,
>>> say, Self (1991), a landmark in performant high-level languages,
>>> derived from SmallTalk. Most of that would have to be redone.
>>>
>>> So, I guess that the effort to compile Python code for a GPU is not
>>> worth it. There might be further reasons due to the kind of code a JIT
>>> generates, since a GPU has no branch predictor, no caches, and so on,
>>> but I'm no GPU expert and I would have to check again.
>>>
>>> Finally, for general purpose code, exploiting the big expected number
>>> of CPUs on our desktop systems is already a challenge.
>>>
>>>> There's people working in
>>>> bringing GPGPU to python:
>>>>
>>>> http://mathema.tician.de/software/pyopencl
>>>> http://mathema.tician.de/software/pycuda
>>>>
>>>> Would it be possible to run python code in parallel without the need (for
>>>> the developer) of actively parallelizing the code?
>>>
>>> I would say that Python is not yet the language to use to write
>>> efficient parallel code, because of the Global Interpreter Lock
>>> (Google for "Python GIL"). The two implementations having no GIL are
>>> IronPython (as slow as CPython) and Jython (slower). PyPy has a GIL,
>>> and the current focus is not on removing it.
>>> Scientific computing uses external libraries (like NumPy) - for the
>>> supported algorithms, one could introduce parallelism at that level.
>>> If that's enough for your application, good.
>>> If you want to write a parallel algorithm in Python, we're not there yet.
>>>
>>>> I'm not talking about code of hard concurrency, but of code with intrinsic
>>>> parallelism (let's say matrix multiplication).
>>>
>>> Automatic parallelization is hard, see:
>>> http://en.wikipedia.org/wiki/Automatic_parallelization
>>>
>>> Lots of scientists have tried, lots of money has been invested, but
>>> it's still hard.
>>> The only practical approaches still require the programmer to
>>> introduce parallelism, but in ways much simpler than using
>>> multithreading directly. Google OpenMP and Cilk.
>>>
>>>> Would a JIT compilation be capable of detecting parallelism?
>>> Summing up what is above, probably not.
>>>
>>> Moreover, matrix multiplication may not be so easy as one might think.
>>> I do not know how to write it for a GPU, but in the end I reference
>>> some suggestions from that paper (where it is one of the benchmarks).
>>> But here, I explain why writing it for a CPU is complicated. You can
>>> multiply two matrixes with a triply nested for, but such an algorithm
>>> has poor performance for big matrixes because of bad cache locality.
>>> GPUs, according to the above mentioned paper, provide no caches and
>>> hides latency in other ways.
>>>
>>> See here for the two main alternative ideas which allow solving this
>>> problem of writing an efficient matrix multiplication algorithm:
>>> http://en.wikipedia.org/wiki/Cache_blocking
>>> http://en.wikipedia.org/wiki/Cache-oblivious_algorithm
>>>
>>> Then, you need to parallelize the resulting code yourself, which might
>>> or might not be easy (depending on the interactions between the
>>> parallel blocks that are found there).
>>> In that paper, where matrix multiplication is called as SGEMM (the
>>> BLAS routine implementing it), they suggest using a cache-blocked
>>> version of matrix multiplication for both CPUs and GPUs, and argue
>>> that parallelization is then easy.
>>>
>>> Cheers,
>>> --
>>> Paolo Giarrusso - Ph.D. Student
>>> http://www.informatik.uni-marburg.de/~pgiarrusso/
>>> _______________________________________________
>>> pypy-dev at codespeak.net
>>> http://codespeak.net/mailman/listinfo/pypy-dev
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
>>
>
>
>

From cfbolz at gmx.de  Wed Sep  1 15:41:19 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Wed, 01 Sep 2010 15:41:19 +0200
Subject: [pypy-dev] Registration for S3 - A Workshop on Self-Sustaining
 Systems is now open.
Message-ID: <4C7E57FF.3000601@gmx.de>

Registration for S3 - A Workshop on Self-Sustaining Systems is now open.

     http://www.hpi.uni-potsdam.de/hirschfeld/s3/s3-10/

We hope you will join us on September 27-28 in Tokyo.

Invited speakers:
     * Takashi Ikegami: Sustainable Autonomy and Designing Mind Time
       (The University of Tokyo)
     * Yukihiro Matsumoto: From Lisp to Ruby to Rubinius
       (Rakuten Institute of Technology)
     * Vishal Sikka: On Sustainable Business Solutions (SAP)

You can find the workshop at:
http://www.hpi.uni-potsdam.de/hirschfeld/s3/s3-10/program/index.html

Kim Rose, Hidehiko Masuhara, and Robert Hirschfeld

From hakan at debian.org  Thu Sep  2 07:32:54 2010
From: hakan at debian.org (Hakan Ardo)
Date: Thu, 2 Sep 2010 07:32:54 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: 
References: 
	<4C7A3737.50902@gmx.de>
	
	<4C7A4C99.2050803@gmx.de>
	
	
Message-ID: 

Hi,
I've checked in a version of optimizeopt that is a package with
support for building a chain of optimizations and passing instructions
down this chain. Does this design make sense? If so I'll start moving
the different optimizations to the different files. It will require
some refactoring, but not too much I hope...

On Tue, Aug 31, 2010 at 9:25 AM, Hakan Ardo  wrote:
> Ok, so we split it up into a set of Optimization classes in separate
> files. Each containing a subset of the optimize_... methods. Then we
> have the propagate_forward method iterate over the instructions
> passing them to one Optimization after the other? That way we keep the
> single iteration over the instructions. Would it be preferable to
> separate them even more and have each Optimization contain it's own
> loop over the instructions?
>
> On Sun, Aug 29, 2010 at 10:05 PM, Maciej Fijalkowski  wrote:
>> On Sun, Aug 29, 2010 at 2:03 PM, Carl Friedrich Bolz  wrote:
>>> On 08/29/2010 01:49 PM, Hakan Ardo wrote:
>>>>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
>>>>> it would be good if the bounds-related optimizations could move out of
>>>>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
>>>>> getting really unwieldy. Does that make sense?
>>>>
>>>> Well, class IntBound and the propagate_bounds_ methods could probably
>>>> be moved elsewhere, but a lot of the work is done in optimize_...
>>>> methods, which I'm not so sure it would make sens to split up.
>>>
>>> I guess then the things that can be sanely moved should move. The file
>>> is nearly 2000 lines, which is way too big. I guess also the heap
>>> optimizations could go to their own file.
>>>
>>> Carl Friedrich
>>
>> How about a couple of files (preferably small) each containing a
>> contained optimization if possible? (maybe a package?)
>>
>
>
>
> --
> Håkan Ardö
>



-- 
Håkan Ardö

From sarvi at yahoo.com  Thu Sep  2 07:54:02 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Wed, 1 Sep 2010 22:54:02 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
Message-ID: <525425.61205.qm@web53702.mail.re2.yahoo.com>


I understand from various threads here that RPython is not for general-purpose
use.
Why this lack of focus on general use?

I am looking at this and comparing it to a corporation that is working on
an awesome product.

They are so focused on the final product vision that they fail to
realize the awesome potential of some of its intermediate side deliverables.

PyPy is definitely gaining momentum.
But as a strategy to build that momentum and gain new converts, it should put
some focus on some of its niche strengths:
things other python implementations cannot do.

One such niche is RPython and the RPython compiler.
No other python implementation can convert python programs to executables.
I am seeing growing interest in writing RPython code for performance-critical
code and even potentially compiling it to binaries.

http://olliwang.com/2009/12/20/aes-implementation-in-rpython/ 
http://alexgaynor.net/2010/may/15/pypy-future-python/


Is it possible the PyPy team may be understating the significance of RPython?
Am I crazy to think this way? :-)

Sarvi 


      

From mcneil at hku.hk  Thu Sep  2 08:27:32 2010
From: mcneil at hku.hk (Douglas McNeil)
Date: Thu, 2 Sep 2010 14:27:32 +0800
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <525425.61205.qm@web53702.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
Message-ID: 

> No other python implementation can convert python programs to executables.

There's shedskin, which is actually very good as these things go:

http://code.google.com/p/shedskin/

Like RPython, you have to write in a small subset of python which can
be a little frustrating once you've gotten used to pythonic freedom.
But I've found it very useful for some short numerical codes (putting
on my OEIS associate editor hat).  And Cython is pretty powerful these
days.

ObPyPy: the other day I had cause to run a very short, unoptimized,
mostly integer-arithmetic code.  With shedskin, it took between ~42s
(with ints) and ~1m43 (with longs), as compared with only ~3m30 or so
to run under pypy.  That's only a factor of two (if I'd needed longs).
 Both could be much improved, and a lower-level version in C would
beat them both, but I was very impressed by how little difference
there was.  Major props!
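
To make the comparison concrete, here is the arithmetic behind "a factor of two", using only the approximate timings quoted above. The variable names are just for illustration.

```python
# Approximate wall-clock times from the paragraph above, in seconds.
shedskin_ints = 42             # ~42s
shedskin_longs = 1 * 60 + 43   # ~1m43
pypy = 3 * 60 + 30             # ~3m30

# How much longer the pypy run took than each shedskin run.
ratio_vs_longs = pypy / float(shedskin_longs)
ratio_vs_ints = pypy / float(shedskin_ints)

print("pypy / shedskin (longs): %.2f" % ratio_vs_longs)
print("pypy / shedskin (ints):  %.2f" % ratio_vs_ints)
```

So the gap is roughly 2x against the longs build and 5x against the ints build, given these rough timings.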

For numerics it'd be interesting to have a JIT option which didn't
care about compilation times, and instead of generating assembly
itself generated assembly-like C which was then delegated to an
external compiler.


Doug

-- 
Department of Earth Sciences
University of Hong Kong

From p.giarrusso at gmail.com  Thu Sep  2 09:02:49 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 2 Sep 2010 09:02:49 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: 

On Thu, Sep 2, 2010 at 08:27, Douglas McNeil  wrote:
>> No other python implementation can convert python programs to executables.
>
> There's shedskin, which is actually very good as these things go:
>
> http://code.google.com/p/shedskin/
>
> Like RPython, you have to write in a small subset of python which can
> be a little frustrating once you've gotten used to pythonic freedom.
> But I've found it very useful for some short numerical codes (putting
> on my OEIS associate editor hat).  And Cython is pretty powerful these
> days.
>
> ObPyPy: the other day I had cause to run a very short, unoptimized,
> mostly integer-arithmetic code.  With shedskin, it took between ~42s
> (with ints) and ~1m43 (with longs), as compared with only ~3m30 or so
> to run under pypy.  That's only a factor of two (if I'd needed longs).
> Both could be much improved, and a lower-level version in C would
> beat them both, but I was very impressed by how little difference
> there was.  Major props!

> For numerics it'd be interesting to have a JIT option which didn't
> care about compilation times, and instead of generating assembly
> itself generated assembly-like C which was then delegated to an
> external compiler.

A more interesting road (which is mentioned somewhere in the PyPy
blog) is to use LLVM in place of this "external JIT compiler", so that
you generate "assembly-like LLVM Intermediate Representation". A bit
like what Unladen Swallow is doing, with the difference of having a saner
runtime model to start with (say, no reference counting). Once you
start with LLVM, you are free to choose which optimization passes to
run, from very few to -O3 or even more.
The other C compilers incur huge startup costs for no good reason, and
usually can't be used as a library, if only because of engineering
problems. LLVM is so much cooler anyway, especially now that, say,
_everybody_ is switching to it.

About the compilation times tradeoff, you can look for "tiered
compilation", which is a general strategy for doing it automatically,
possibly allowing different tunings (say, like java -server, which is
tuned for performance rather than responsiveness). My authoritative
reference is Cliff Click's blog [1], but you probably want to stop
reading it after the introduction, as I did in this case.

[1] http://www.azulsystems.com/blog/cliff-click/2010-07-16-tiered-compilation
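
For readers who have not met the term, here is a toy sketch of the idea behind tiered compilation. It is only an illustration in plain Python: the class name, the threshold, and the two hand-written "tiers" are all made up for this example, and no real VM works at this level.

```python
class TieredFunction(object):
    """Run a cheap tier first; switch to an expensive tier once the call is hot."""

    def __init__(self, baseline, optimized, threshold=1000):
        self.baseline = baseline      # starts immediately, runs slowly
        self.optimized = optimized    # stands in for heavily optimized code
        self.threshold = threshold    # calls before the function counts as hot
        self.calls = 0
        self.tier = "baseline"

    def __call__(self, *args):
        if self.tier == "baseline":
            self.calls += 1
            if self.calls > self.threshold:
                # A real system would pay the compilation cost here.
                self.tier = "optimized"
        impl = self.optimized if self.tier == "optimized" else self.baseline
        return impl(*args)


def sum_to_baseline(n):
    # "Interpreter-like" version: simple and slow.
    total = 0
    for i in range(n + 1):
        total += i
    return total


def sum_to_optimized(n):
    # Stand-in for what an optimizing tier might produce.
    return n * (n + 1) // 2


sum_to = TieredFunction(sum_to_baseline, sum_to_optimized, threshold=2)
results = [sum_to(10) for _ in range(4)]
print(results, sum_to.tier)
```

The threshold is the tuning knob: a low value favours quick warm-up, a high value avoids spending compilation effort on code that is rarely run.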
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From william.leslie.ttg at gmail.com  Thu Sep  2 09:09:03 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 2 Sep 2010 17:09:03 +1000
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <525425.61205.qm@web53702.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
Message-ID: 

On 2 September 2010 15:54, Saravanan Shanmugham  wrote:
>
> I understand from various threads here,  that RPython is not for general purpose
> use.
> Why this lack of Focus on general use.

Because then we would have to support that general use.

Python benefits from being reasonably standardised: you can be sure
that most python you write will run on any implementation that
supports the version you are targeting. On the other hand, if you are
mangling cpython or pypy bytecode, you are asking for trouble. Rpython
is an example of such an implementation detail - we* might like to
change features of it here or there to better support some needed
pattern.

Introducing yet another incompatible and complicated language to the
python ecosystem is not a worthwhile goal in itself.

* Just my opinion. Others might feel like standardising some amount of
rpython is a worthwhile idea.

> They are so focused on this awesome final product vision that they fail to
> realize the awesome potential of some of its intermediate side deliverables.
>
> PyPy is definitely gaining momentum.
> But as a strategy to build that momentum, and gain new converts it should put
> some focus on some of its niche strengths.
> Things other python implementations cannot do.
>
> One such niche is its RPython and RPython Compiler.
> No other python implementation can convert python programs to executables.

I can't see why you would ever want to do this - if you use py2exe or
the like instead, you get a large standard library and a great
language to work in, neither of which you get if you use rpython.

> I am seeing growing interest in writing Rpython code for performance critical
> code and even potentially compiling it to binaries.

The intention is to get almost the same performance out of the JIT.
For those that actually care about the last few percent, it would be
nicer to provide hints to generate specialised code at module compile
time, that way you can still work at python level.

> Is it possible the PyPy team may be understating the significance of RPython?
> Am I crazy to think this way? :-)

Supporting better integration between app-level python and other
languages that interact with interpreter level would be nice. CLI
integration is good, and JVM integration is lagging just a little. But
once you can interact with that level, there are much saner languages
that you could use for your low-level code than rpython - languages
/designed/ to be general purpose languages.

At the moment, the lack of separate compilation is a real issue
standing in the way of using rpython as a general purpose language, or
even as an extension language. Having to re-translate *everything*
every time you want to install an extension module is not on. Even C
doesn't require that.

The other is that type inference is global and changes you make to one
function can have far-reaching consequences. The error messages when
you do screw up aren't very friendly either.
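
A small illustration of why whole-program inference makes changes non-local. This is plain Python and has not been run through the rpython toolchain; the function names are made up, and the comments describe how an inferencer that assigns one type per variable would be expected to behave, which is an assumption rather than a quote from any error message.

```python
def scale(x, factor):
    # A whole-program inferencer has to pick one type for `x` that
    # covers every call site in the program.
    return x * factor


def caller_a():
    return scale(2, 3)       # here x is an int


def caller_b():
    # Adding this call somewhere else in the program means x is now
    # "int or str", so the inferred type of scale() changes for
    # caller_a too, even though caller_a was not edited.
    return scale("ab", 3)


print(caller_a())
print(caller_b())
```

Full Python runs both callers happily; the point is only that a single new call site can change what is inferred for code far away from it.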

If you want a low-level general purpose language with type inference
and garbage collection that has implementations for every platform
pypy targets, there are already plenty of options.

-- 
William Leslie

From p.giarrusso at gmail.com  Thu Sep  2 09:56:24 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 2 Sep 2010 09:56:24 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: 

Hi,
I was curious about the interplay between type inference and separate
compilation.

On Thu, Sep 2, 2010 at 09:09, William Leslie
 wrote:
> At the moment, the lack of separate compilation is a real issue
> standing in the way of using rpython as a general purpose language, or
> even as an extension language. Having to re-translate *everything*
> every time you want to install an extension module is not on. Even C
> doesn't require that.

> The other is that type inference is global and changes you make to one
> function can have far-reaching consequences.
Is it module-global or is it performed on the whole program?
I guess you'd need modular type inference before allowing separate
compilation, and of course lots of implementation work.
Functional languages allow separate compilation - is there any
RPython-specific problem for that? I've omitted my guesses here.
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From sarvi at yahoo.com  Thu Sep  2 09:57:33 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 00:57:33 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: <69412.54494.qm@web53706.mail.re2.yahoo.com>

I'm afraid people are missing the point here.
For an average engineer it's better to be an expert in one language than
average at four.
That's my take on things.

Take Mercurial (an SCM): it is 95% Python and 5% C, and it gives Git a run for
its money.
This could be 95% Python and 5% RPython.

As far as I can tell, writing RPython is still simpler than writing C/C++.
Garbage collection alone justifies its use, in my opinion.
The option of using the interpreter during development is just a huge amount
of icing on the cake.

As I read it, that's why y'all chose to write PyPy in RPython.
Y'all could have done this in C/C++, right?

So far as I can tell RPython is a strict subset of Python. I don't see why it
shouldn't continue to be.
And even if y'all needed to make a very small set of static extensions to
RPython, you wouldn't be any worse off than Cython and Shedskin.

I would still rather work with just one interpreter/compiler, say PyPy. Better
than PyPy/CPython for interpreter and Cython/Shedskin for compiling, with
interpreter support during development.

I am just seeing Cython/Shedskin as fragmentation of resources.
A lot more could be accomplished if these projects came together.
Sarvi



From sarvi at yahoo.com  Thu Sep  2 10:10:06 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 01:10:06 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: <569927.57152.qm@web53706.mail.re2.yahoo.com>

If PyPy is using RPython for its compiler implementation, it should and will
eventually be optimized so that its compiler/JIT is fast.
Which just tells me that the performance gap between Shedskin and PyPy will be
narrowed or beaten pretty soon.

I would still rather work with just one interpreter/compiler, say PyPy.
Better than using PyPy/CPython for an interpreter and Cython/Shedskin for
compiling, without interpreter support during development.

I am just seeing Cython/Shedskin as fragmentation of resources.
A lot more could be accomplished if these projects came together with PyPy.


If you ask on the Shedskin/Cython mailing lists why they shouldn't pool
resources into making the PyPy RPython compiler a first-class citizen/goal of
PyPy, they will immediately tell you it's not a goal of PyPy.

Why not officially make it so?
Formalize RPython and its compiler.

Obviate the need for Cython/Shedskin and get them on board.

Like the example I quoted: Mercurial is 95% Python and 5% C for performance. It
should be 95% Python and 5% RPython.

We have Pickle and cPickle for performance. Pickle could simply have been
rewritten in RPython and probably compiled, and we wouldn't need different
versions :-))

Sarvi



From stefan_ml at behnel.de  Thu Sep  2 10:11:10 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 02 Sep 2010 10:11:10 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <69412.54494.qm@web53706.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	
	<69412.54494.qm@web53706.mail.re2.yahoo.com>
Message-ID: 

Saravanan Shanmugham, 02.09.2010 09:57:
> I'm afraid people are missing the point here.
> For an average engineer it's better to be an expert in one language than
> average at four.

Well, it's certainly better to be an almost-expert in two, than a 
no-left-no-right expert in only one.


> I am just seeing Cython/Shedskin as fragmentation of resources.

You might want to take a closer look at the projects and their goals before
judging that way.

Stefan


From sarvi at yahoo.com  Thu Sep  2 10:18:43 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 01:18:43 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<69412.54494.qm@web53706.mail.re2.yahoo.com>
	
Message-ID: <694746.77392.qm@web53706.mail.re2.yahoo.com>

I have researched these projects quite extensively.
They are quite similar beasts as far as I can tell.

Cython/Pyrex are used to write Python extensions. They use statically typed
variants of Python which get compiled into C, which can then be compiled.

Shedskin is a slightly more general-purpose Restricted Python to C++ compiler.

PyPy, as I understand it, can convert RPython into C code.

Am I missing something here?

Sarvi





From p.giarrusso at gmail.com  Thu Sep  2 10:22:24 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 2 Sep 2010 10:22:24 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: 
References: 
	<4C7A3737.50902@gmx.de>
	
	<4C7A4C99.2050803@gmx.de>
	
	
Message-ID: 

On Tue, Aug 31, 2010 at 09:25, Hakan Ardo  wrote:
> Ok, so we split it up into a set of Optimization classes in separate
> files. Each containing a subset of the optimize_... methods. Then we
> have the propagate_forward method iterate over the instructions
> passing them to one Optimization after the other? That way we keep the
> single iteration over the instructions. Would it be preferable to
> separate them even more and have each Optimization contain its own
> loop over the instructions?

But won't this affect performance? That is very important in a JIT compiler.
When compiling traces bigger than a cache line, it might even affect
locality, i.e. become an important performance problem.
Unless your RPython compiler can join the loops. If they are just
loops, it could. If they are tree visits, it likely can't; that's done
by the Haskell compiler (google Haskell, stream fusion, shortcut
deforestation, I guess), but the techniques are unlikely to generalize
to languages with side effects; it's also done/doable in some
Domain-Specific Languages for tree visitors.
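
To make the two designs under discussion concrete, here is a toy sketch: one loop that pushes each instruction through a chain of Optimization objects, versus each Optimization running its own loop. Everything in it is hypothetical (the tuple encoding of instructions, the two example passes, the function names other than propagate_forward); it is not the code in optimizeopt.py.

```python
class Optimization(object):
    """One self-contained pass; returns a list of replacement instructions."""

    def optimize(self, op):
        return [op]


class RemoveAddZero(Optimization):
    def optimize(self, op):
        # ("add", dst, src, 0) is just a move from src to dst.
        if op[0] == "add" and op[3] == 0:
            return [("move", op[1], op[2])]
        return [op]


class RemoveSelfMove(Optimization):
    def optimize(self, op):
        # ("move", x, x) does nothing and can be dropped.
        if op[0] == "move" and op[1] == op[2]:
            return []
        return [op]


def propagate_forward(ops, optimizations):
    """Single loop over the trace; each op flows through every pass in turn."""
    result = []
    for op in ops:
        pending = [op]
        for opt in optimizations:
            next_pending = []
            for p in pending:
                next_pending.extend(opt.optimize(p))
            pending = next_pending
        result.extend(pending)
    return result


def run_passes_separately(ops, optimizations):
    """Alternative: every pass makes its own full loop over the trace."""
    for opt in optimizations:
        new_ops = []
        for op in ops:
            new_ops.extend(opt.optimize(op))
        ops = new_ops
    return ops


trace = [("add", "a", "a", 0), ("add", "b", "a", 1), ("move", "c", "c")]
passes = [RemoveAddZero(), RemoveSelfMove()]
fused = propagate_forward(trace, passes)
separate = run_passes_separately(trace, passes)
print(fused)
print(separate)
```

For stateless passes like these the two drivers give the same result, and the difference is only how many times the trace is walked. Passes that carry state from one instruction to the next (bounds, for instance) would need more care before assuming the two are equivalent.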

> On Sun, Aug 29, 2010 at 10:05 PM, Maciej Fijalkowski  wrote:
>> On Sun, Aug 29, 2010 at 2:03 PM, Carl Friedrich Bolz  wrote:
>>> On 08/29/2010 01:49 PM, Hakan Ardo wrote:
>>>>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
>>>>> it would be good if the bounds-related optimizations could move out of
>>>>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
>>>>> getting really unwieldy. Does that make sense?
>>>>
>>>> Well, class IntBound and the propagate_bounds_ methods could probably
>>>> be moved elsewhere, but a lot of the work is done in optimize_...
> >>>> methods, which I'm not so sure it would make sense to split up.
>>>
>>> I guess then the things that can be sanely moved should move. The file
>>> is nearly 2000 lines, which is way too big. I guess also the heap
>>> optimizations could go to their own file.
>>>
>>> Carl Friedrich
>>
>> How about a couple of files (preferably small) each containing a
>> contained optimization if possible? (maybe a package?)
>>
>
>
>
> --
> Håkan Ardö
>



-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From amauryfa at gmail.com  Thu Sep  2 10:28:14 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 2 Sep 2010 10:28:14 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <569927.57152.qm@web53706.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
Message-ID: 

Hi,

2010/9/2 Saravanan Shanmugham :
> We have Pickle and cPickle for performance. The Pickle could have simply been
> rewritten in RPython and probably compiled and we don't need  different versions
> :-))

The PyPy way is much simpler:
there is only the original pickle.py, written in plain full Python,
and it's as fast as a C or RPython implementation.

-- 
Amaury Forgeot d'Arc

From sarvi at yahoo.com  Thu Sep  2 10:37:39 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 01:37:39 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
	
Message-ID: <527977.40072.qm@web53701.mail.re2.yahoo.com>

awesome.

The point I was making is that RPython (a static subset of Python) will be faster
than dynamic Python code on a JIT or compiled to machine code.

Sarvi



From william.leslie.ttg at gmail.com  Thu Sep  2 10:40:03 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 2 Sep 2010 18:40:03 +1000
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <527977.40072.qm@web53701.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
	
	<527977.40072.qm@web53701.mail.re2.yahoo.com>
Message-ID: 

But what makes you think that? A dynamic compiler has more information, so
it should be able to produce better code.


From p.giarrusso at gmail.com  Thu Sep  2 10:56:14 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 2 Sep 2010 10:56:14 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
	
	<527977.40072.qm@web53701.mail.re2.yahoo.com>
	
Message-ID: 

On Thu, Sep 2, 2010 at 10:40, William Leslie
wrote:

> But what makes you think that? A dynamic compiler has more information, so
> it should be able to produce better code.
>
Note that he's not arguing about a static compiler for the same code, which
has no type information, and where you are obviously right. He's arguing
about a statically typed language, where the type information is already
there in the source, e.g. C - there is much less information missing.
Actually, your point can still be made, but it becomes much less obvious.
For this case, it's much more contested which is best - see the "Java faster
than C" debate. Nobody has yet given a proof convincing enough to close the
debate.

I would say that there's a tradeoff between JIT and Ahead-Of-Time
compilation, when AOT makes sense (not in Python, SmallTalk, Self...).

On 02/09/2010 6:37 PM, "Saravanan Shanmugham"  wrote:
>
> The point I was making is that RPython(a static subset of Python) will be
> faster
> than Dynamic Python code on a JIT or compiled to machine code.
>

Note that, with enough implementation effort, that doesn't need to be true.
Run-time specialization would allow exactly the same code to be generated,
without any extra guards in the inner loop. Java can do that at times, and
can even be better than C, but not always (see above). You'd need a static
compiler with Profile-Guided Optimization and have a profile which matches
runtime, to guarantee superior results.
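
A rough sketch of what "no extra guards in the inner loop" means. It is written in plain Python purely to show the shape of the idea; the function names are invented, and a real JIT would do this on machine code rather than on source.

```python
from array import array


def generic_sum(values):
    total = 0.0
    for v in values:
        # Type check repeated on every iteration of the loop.
        if not isinstance(v, float):
            v = float(v)
        total += v
    return total


def specialized_sum(values):
    # Guard checked once, before the loop: a 'd' array can only hold
    # C doubles, so nothing needs to be checked per element.
    if isinstance(values, array) and values.typecode == "d":
        total = 0.0
        for v in values:
            total += v
        return total
    # Guard failed: fall back to the general code.
    return generic_sum(values)


print(specialized_sum(array("d", [1.0, 2.0, 3.5])))
print(specialized_sum([1, 2, "3"]))
```

The specialized path is what a static compiler would produce from typed source; the run-time version only adds the one check at loop entry and the fallback.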

Cheers
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From sarvi at yahoo.com  Thu Sep  2 11:00:27 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 02:00:27 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
	
	<527977.40072.qm@web53701.mail.re2.yahoo.com>
	
Message-ID: <996190.16331.qm@web53703.mail.re2.yahoo.com>

So far as I can tell from Unladen Swallow and PyPy, it is some of these dynamic
features of Python, such as dynamic typing, that make it hard to compile/optimize
and hit C-like speeds.

Hence the need for RPython in PyPy or Restricted Python in Shedskin?

Sarvi




From fijall at gmail.com  Thu Sep  2 11:35:05 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 2 Sep 2010 11:35:05 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	
Message-ID: 

On Thu, Sep 2, 2010 at 9:56 AM, Paolo Giarrusso  wrote:
> Hi,
> I was curious about the interplay between type inference and separate
> compilation.
>
> On Thu, Sep 2, 2010 at 09:09, William Leslie
>  wrote:
>> At the moment, the lack of separate compilation is a real issue
>> standing in the way of using rpython as a general purpose language, or
>> even as an extension language. Having to re-translate *everything*
>> every time you want to install an extension module is not on. Even C
>> doesn't require that.
>
>> The other is that type inference is global and changes you make to one
>> function can have far-reaching consequences.
> Is it module-global or is it performed on the whole program?
> I guess you'd need modular type inference before allowing separate
> compilation, and of course lots of implementation work.
> Functional languages allow separate compilation - is there any
> RPython-specific problem for that? I've omitted my guesses here.

There is no notion of a "module" in RPython. RPython is compiled from
live Python objects (hence Python is a metaprogramming language for
RPython). There are a bunch of technical problems, but it's generally
possible to implement separate compilation (it's work though).
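
A small sketch of what "compiled from live objects" means in practice. This is ordinary Python with made-up names; it is not claimed to be accepted by the RPython toolchain as written, it only shows the division of labour between import time and the functions that exist afterwards.

```python
def make_scaler(factor):
    def scale(x):
        return x * factor
    scale.__name__ = "scale_by_%d" % factor
    return scale


# Ordinary Python running at import time: loops, closures, renaming.
SCALERS = {}
for factor in (2, 3, 10):
    SCALERS[factor] = make_scaler(factor)


def entry_point(x):
    # By the time a toolchain inspects this function, SCALERS is just a
    # finished dictionary of function objects; the loop that built it
    # is gone.
    return SCALERS[2](x) + SCALERS[10](x)


print(sorted(f.__name__ for f in SCALERS.values()))
print(entry_point(4))
```

Because the starting point is the set of objects reachable from the entry point rather than source files, there is no natural module boundary at which to cut the program.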

Cheers,
fijal

> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
>

From jacob at openend.se  Thu Sep  2 11:40:45 2010
From: jacob at openend.se (Jacob Hallén)
Date: Thu, 2 Sep 2010 11:40:45 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <525425.61205.qm@web53702.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
Message-ID: <201009021140.53415.jacob@openend.se>

Thursday 02 September 2010 you wrote:
> I understand from various threads here,  that RPython is not for general
> purpose use.
> Why this lack of Focus on general use.
> 
> I am looking at this and I am thinking and comparing this to a corporation
> that is working on this awesome product.
> 
> They are so focused on this awesome final product vision that they fail to
> realize the awesome potential of some of its intermediate side
> deliverables.
> 
> PyPy is definitely gaining momentum.
> But as a strategy to build that momentum, and gain new converts it should
> put some focus on some of its niche strengths.
> Things other python implementions cannot do.
> 
> One such niche is its RPython and RPython Compiler.
> No other python implementation can convert python programs to executables.
> I am seeing growing interest in writing Rpython code for performance
> critical code and even potentially compiling it to binaries.
> 
> http://olliwang.com/2009/12/20/aes-implementation-in-rpython/
> http://alexgaynor.net/2010/may/15/pypy-future-python/
> 
> 
> Is it possible the PyPy team may be understating the significance of
> RPython? Am I crazy to think this way? :-)

RPython was tried in a production environment some years ago and while it 
produced some very nice results, it was quite difficult to work with. Dealing 
with those difficulties requires a group of people who are willing to build 
RPython code for general applications, run the code and identify what the 
difficulties actually are. Then they need to come up with strategies for how 
to remedy the problems and implement them in code. This is a very large 
undertaking for which PyPy does not have the manpower. It also requires people
who are interested in building support for compiled programming languages.
PyPy is a volunteer effort and the only person who was interested in this has
retired from the project.

Jacob Hallén

From william.leslie.ttg at gmail.com  Thu Sep  2 11:52:56 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 2 Sep 2010 19:52:56 +1000
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <996190.16331.qm@web53703.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<569927.57152.qm@web53706.mail.re2.yahoo.com>
	
	<527977.40072.qm@web53701.mail.re2.yahoo.com>
	
	<996190.16331.qm@web53703.mail.re2.yahoo.com>
Message-ID: 

On 2 September 2010 19:00, Saravanan Shanmugham  wrote:
> So far as I can tell from Unladen Swallow and PyPy, it is some of these
> Dynamic features of Python, such as Dynamic Typing that make it hard to
> compile/optimize and hit C like speeds.
> Hence the need for RPython in PyPy or Restricted Python in Shedskin?

Hence the need for the JIT, not RPython. RPython is an implementation
detail, to support translation easily to C as well as CLI and JVM
bytecode, and to support translation aspects such as stackless, and
testing on top of a full python environment. Rewriting things in
rpython for performance is a hack that should stop happening as the
JIT matures.

Dynamic typing means you need to do more work to produce code of the
same performance, but it's not impossible.


On 2 September 2010 18:56, Paolo Giarrusso  wrote:
> On Thu, Sep 2, 2010 at 10:40, William Leslie 
> wrote:
>>
>> But what makes you think that? A dynamic compiler has more information, so
>> it should be able to produce better code.
>
> Note that he's not arguing about a static compiler for the same code, which
> has no type information, and where you are obviously right. He's arguing
> about a statically typed language, where the type information is already
> there in the source, e.g. C - there is much less information missing.
> Actually, your point can still be made, but it becomes much less obvious.
> For this case, it's much more contended what's best - see the "java faster
> than C" debate. Nobody has yet given a proof convincing enough to close the
> debate.

Sure - having static type guarantees is another case of "more information".

There is a little more room for discussion here, because there are
cases where a dynamic compiler for a safe runtime can do better at
considering certain optimisations, too. We have been talking about our
stock-standard type systems here, which ensure that our object will
have the field or method that we are interested in at runtime, and
perhaps (as long as it isn't an interface method, which we don't have
in rpython anyway) the offset into the instance or vtable
respectively. That makes for a pretty decent optimisation, but type
systems can infer much more than this, including which objects may
escape (via region typing à la Cyclone), which fields may be None, and
which instructions are loop invariant. The point is that some of these
type systems work fine with separate compilation, and some do
significantly better with runtime or linktime specialisation.

On 2 September 2010 17:56, Paolo Giarrusso  wrote:
> On Thu, Sep 2, 2010 at 09:09, William Leslie
>  wrote:
>> The other is that type inference is global and changes you make to one
>> function can have far-reaching consequences.
> Is it module-global or is it performed on the whole program?

Rtyping is whole-program.
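
As a rough illustration of why that has far-reaching consequences, here is
a toy model. It is an assumption-laden sketch and not how the RPython
annotator is actually implemented; infer_param_types and the "scale"
function are made up. Each parameter's type is the union of what every
call site in the program passes:

```python
# Toy model only: real RPython annotation is far more elaborate.

def infer_param_types(call_sites):
    """Union the argument types seen at every call site of each function.

    call_sites is a list of (function_name, tuple_of_argument_values).
    Returns {function_name: [set_of_type_names_per_parameter, ...]}.
    """
    inferred = {}
    for name, args in call_sites:
        slots = inferred.setdefault(name, [set() for _ in args])
        for slot, arg in zip(slots, args):
            slot.add(type(arg).__name__)
    return inferred

program = [("scale", (2, 3)), ("scale", (4, 5))]
before = infer_param_types(program)

# One new call, possibly in a distant part of the program, changes what
# is known about `scale` everywhere.
after = infer_param_types(program + [("scale", (1.5, 2))])
```

In this toy, `before` knows the first parameter of scale only as an int,
while `after` has to treat it as int-or-float, even though the body of
scale never changed.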

> Functional languages allow separate compilation - is there any
> RPython-specific problem for that? I've omitted my guesses here.

Many do, yes. To use ML derivatives as an example, you require the
signature of any modules you directly import. I was recently reading
about MLKit's module system, which is quite interesting (it has region
typing, and the way it unifies these types at the module boundary is
cute - carrying region information around in the source text is
fragile, so must be inferred). Haskell is kind of a special case,
requiring dictionaries to be passed around at runtime to determine
which method of some typeclass to call.
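
Dictionary passing can be sketched in a few lines of Python. This is only
an analogy for the technique, not what any Haskell compiler literally
emits, and SHOW_INT, SHOW_STR and show_all are hypothetical names:

```python
# Hypothetical sketch: an explicit "dictionary" standing in for a
# typeclass instance.

SHOW_INT = {"show": lambda n: "int:%d" % n}
SHOW_STR = {"show": lambda s: "str:%s" % s}

def show_all(show_dict, items):
    # The caller supplies the method table; show_all can be compiled
    # once without knowing the concrete element type.
    return [show_dict["show"](item) for item in items]
```

A call like show_all(SHOW_INT, [1, 2]) picks the method at runtime, which
is what lets the polymorphic function be compiled separately from its
callers.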

For OCaml (most MLs are similar) see section 2.5: "Modules and
separate compilation" of
http://pauillac.inria.fr/ocaml/htmlman/manual004.html

On MLKit's module implementation and region inference:
http://www.itu.dk/research/mlkit/index.php/Static_Interpretation

-- 
William Leslie

From p.giarrusso at gmail.com  Thu Sep  2 16:10:07 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 2 Sep 2010 16:10:07 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <694746.77392.qm@web53706.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	
	<69412.54494.qm@web53706.mail.re2.yahoo.com>
	
	<694746.77392.qm@web53706.mail.re2.yahoo.com>
Message-ID: 

On Thu, Sep 2, 2010 at 10:18, Saravanan Shanmugham  wrote:
> I have researched these projects quite extensively.
> Quite similar beasts as far as I can tell.
>
> Cython/Pyrex used to write python extensions. They use statically typed variants
> of Python which gets compiled into C which can then be compiled.
>
> Shedskin is slightly more general purpose Restricted Python to C++ compiler.
>
> PyPy as I understand can convert RPython into C code
>
> Am I missing something here?

Maybe you have done extensive research, but the above is not enough
for the conclusion, which might still be valid.
There could be some cool way to reuse each other's code, and that
would be cool given the available manpower.

The question is:
=> Do different goals cause _incompatible_ design/implementation choices?
Currently, static typing versus global type inference seems to be
already a fundamental difference. Modular type inference, if
polymorphic (and I guess it has to be), would require using boxed or
tagged integers more often, as far as I can see.

RPython is intended to be compiled to various environments, with
different variations (choose GC/stackful or stackless/heaps of other
choices), and its programmers are somewhat OK with its limitations; it
has type inference, with its set of tradeoffs. This for instance
prevents reusing shedskin, and probably even prevents reusing any of
its code.

Cython/Shedskin are intended to be used by more people and to be simpler.

==> Would making RPython usable for people harm its usability for PyPy?

I see no trivial answer to the above questions which allows merging,
but I don't develop any of them.
However, a discussion of this could probably end in a PyPy FAQ.

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From sarvi at yahoo.com  Thu Sep  2 19:18:56 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 10:18:56 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <201009021140.53415.jacob@openend.se>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
Message-ID: <933850.84117.qm@web53705.mail.re2.yahoo.com>

one response inline


----- Original Message ----
From: Jacob Hallén
To: pypy-dev at codespeak.net
Sent: Thu, September 2, 2010 2:40:45 AM
Subject: Re: [pypy-dev] Question on the future of RPython

RPython was tried in a production environment some years ago and while it 
produced some very nice results, it was quite difficult to work with. Dealing 
with those difficulties requires a group of people who are willing to build 
RPython code for general applications, run the code and identify what the 
difficulties actually are. Then they need to come up with strategies for how 
to remedy the problems and implement them in code. This is a very large 
undertaking for which PyPy does not have the manpower. It also requires people
who are interested in building support for compiled programming languages.
PyPy is a volunteer effort and the only person who was interested in this has
retired from the project.

Sarvi>>>>
This makes sense. 
But wouldn't the answer to this problem be to invite people like the 
Shedskin/Cython developers to join forces with PyPy?
So that they can pursue the general RPython usecase you mention above while the 
others focus on JIT and stuff on a common code base?

Wouldn't that be a win-win for everybody?

This collaboration feels so obvious to me, that I am confused why it isn't to 
others.
Considering that Shedskin's goals feel almost like a strict subset of PyPy.


Sarvi




From santagada at gmail.com  Thu Sep  2 20:18:32 2010
From: santagada at gmail.com (Leonardo Santagada)
Date: Thu, 2 Sep 2010 15:18:32 -0300
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <933850.84117.qm@web53705.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
Message-ID: 

On Thu, Sep 2, 2010 at 2:18 PM, Saravanan Shanmugham  wrote:
> This makes sense.
> But wouldn't the answer to this problem be to invite people like the
> Shedskin/Cython developers to join forces with PyPy?
> So that they can pursue the general RPython usecase you mention above while the
> others focus on JIT and stuff on a common code base?
>
> Wouldn't that be a win-win for everybody?
>
> This collaboration feels so obvious to me, that I am confused why it isn't to
> others.
> Considering that Shedskin's goals feel almost like a strict subset of PyPy.

I think what you don't get is how open source works: there are always
ten projects doing almost the same thing. Everyone has at least once
thought "why does Linux have this many media players/text editors/flash
implementations/JVMs when all we need is a really good one with lots of
support". It does get me depressed sometimes, but this is the way it
is.

Cython has a big user base that they have to support and lots of
programs that are in production today, Shedskin is looking for pure
performance and the PyPy guys want to have a faster Python. Although I
also think that maybe RPython and the PyPy Python interpreter could
solve all these problems someday, it doesn't do so right now.

I have used RPython in the past and the error messages alone would
drive some people away. Some group of people could work to fix this,
but I doubt it will happen soon.

What I think could be done to make pypy more visible to people would
be to have a killer app running on pypy way faster/better than on
cpython. For me this app is either mercurial or django.

-- 
Leonardo Santagada

From sarvi at yahoo.com  Thu Sep  2 22:03:31 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Thu, 2 Sep 2010 13:03:31 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: <56902.77920.qm@web53708.mail.re2.yahoo.com>





----- Original Message ----
From: Leonardo Santagada 
To: Saravanan Shanmugham 
Cc: Jacob Hallén ; pypy-dev at codespeak.net
Sent: Thu, September 2, 2010 11:18:32 AM
Subject: Re: [pypy-dev] Question on the future of RPython

I think what you don't get is how open source works: there are always
ten projects doing almost the same thing. Everyone has at least once
thought "why does Linux have this many media players/text editors/flash
implementations/JVMs when all we need is a really good one with lots of
support". It does get me depressed sometimes, but this is the way it
is.

Cython has a big user base that they have to support and lots of
programs that are in production today, Shedskin is looking for pure
performance and the PyPy guys want to have a faster Python. Although I
also think that maybe RPython and the PyPy Python interpreter could
solve all these problems someday, it doesn't do so right now.

I have used RPython in the past and the error messages alone would
drive some people away. Some group of people could work to fix this,
but I doubt it will happen soon.



Sarvi>> Yeah, having this thread of conversation on 4 separate aliases,
Python.org, Unladen Swallow, Shedskin and PyPy, was just my attempt
at seeing if these can come together.
Oh well.



What I think could be done to make pypy more visible to people would
be to have a killer app running on pypy way faster/better than on
cpython. For me this app is either mercurial or django.

Sarvi>>> Very True. 

Sarvi





From jacob at openend.se  Thu Sep  2 22:54:01 2010
From: jacob at openend.se (Jacob Hallén)
Date: Thu, 2 Sep 2010 22:54:01 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <933850.84117.qm@web53705.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
Message-ID: <201009022254.07665.jacob@openend.se>

Thursday 02 September 2010 you wrote:
> This makes sense.
> But wouldn't the answer to this problem be to invite people like the
> Shedskin/Cython developers to join forces with PyPy?
> So that they can pursue the general RPython usecase you mention above while
> the others focus on JIT and stuff on a common code base?
> 
> Wouldn't that be a win-win for everybody?
> 
> This collaboration feels so obvious to me, that I am confused why it isn't
> to others.
> Considering that Shedskin's goals feel almost like a strict subset of PyPy.

It is a matter of personal pride, I think. If we made the invitation to the
Shedskin people they would see this as "PyPy thinks they are way cooler than
us, so they invite us to be part of their project". This would naturally
generate a refusal, because even though we don't make such value statements,
it would be viewed that way.

So, we don't make such invitations, even if they make sense.

What we hope is that some people examine the PyPy project and find that it
actually is a really cool piece of technology with lots of possible side
projects and expansion possibilities. If they decide to join the project we
will give them all the help we are capable of.

Most people have actually joined PyPy in this way. The most recent example is
Håkan Ardö, who wanted to expand PyPy in the direction of numeric calculations.
The learning curve is fairly steep, but there are quite a few people on the
IRC channel who are ready to help you overcome the hurdles.

Jacob Hallén

From sarvi at yahoo.com  Fri Sep  3 11:11:13 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Fri, 3 Sep 2010 02:11:13 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <201009022254.07665.jacob@openend.se>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
Message-ID: <810929.87945.qm@web53702.mail.re2.yahoo.com>

I have heard repeatedly in this alias that PyPy's RPython is very difficult to 
use.

I have also heard here and elsewhere that Shedskin is fast and is great for what it
does, i.e. translate its version of Restricted Python to C++.

Which then raises the question: would it make sense for PyPy to adopt Shedskin to
compile its RPython code into C++/binary?

Sarvi



From stefan_ml at behnel.de  Fri Sep  3 11:30:32 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 03 Sep 2010 11:30:32 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <810929.87945.qm@web53702.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	<201009021140.53415.jacob@openend.se>	<933850.84117.qm@web53705.mail.re2.yahoo.com>	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
Message-ID: 

Saravanan Shanmugham, 03.09.2010 11:11:
> From: Jacob Hallén
>> It is a matter of personal pride, I think. If we made the invitation to the
>> Shedskin people they would see this as "Pypy thinks they are way cooler than
>> us, so they invite us to be part of their project". This would naturally
>> generate a refusal, because even though we don't make such value statements,
>> it would be viewed that way.
>>
>> So, we don't make such invitations, even if they make sense.
>
> I have heard repeatedly in this alias that PyPy's RPython is very difficult to
> use.
>
> I have also heard here and elsewhere that Shedskin is fast and is great for what it
> does, i.e. translate its version of Restricted Python to C++.
>
> Which then raises the question: would it make sense for PyPy to adopt Shedskin to
> compile its RPython code into C++/binary?

You should seriously read and try to understand the e-mails that you reply 
to, instead of top-posting them away.

Stefan


From sarvi at yahoo.com  Fri Sep  3 19:22:58 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Fri, 3 Sep 2010 10:22:58 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: <609691.54456.qm@web53705.mail.re2.yahoo.com>

Let's not be a little presumptuous, shall we?
This is the second time you seem to be claiming that I haven't done my
research/reading.

I have been following the progress of PyPy for over 2 years. It's great work. So is
Shedskin.
Just for the record, I have used Pyrex and Cython.
And I have read the documentation and/or sample code for both Shedskin and PyPy.

If people say that there are emotional and pragmatic reasons for the 2 projects
not coming together, that makes sense.

I just don't see any logical reasons, that's all. And I haven't heard any on this
thread either.

BTW, just because I top-post for "readability" doesn't mean I haven't read all
the threads in detail.

Sarvi



From stefan_ml at behnel.de  Fri Sep  3 19:52:41 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 03 Sep 2010 19:52:41 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <609691.54456.qm@web53705.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	<201009021140.53415.jacob@openend.se>	<933850.84117.qm@web53705.mail.re2.yahoo.com>	<201009022254.07665.jacob@openend.se>	<810929.87945.qm@web53702.mail.re2.yahoo.com>	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
Message-ID: 

Saravanan Shanmugham, 03.09.2010 19:22:
> Let's not be a little presumptuous, shall we?
> This is the second time you seem to be claiming that I haven't done my
> research/reading.

That's just the impression that I get from what you write and how you write it.


> I just don't see any logical reasons, thats all. And I haven't heard any on this
> thread either.

Well, you are talking to people who know a lot more about what you are 
talking about than you do. It's normal that they are not equally 
enthusiastic about pie-in-the-sky ideas that someone throws at them.


> BTW, Just because I top post for "readability" doesn't mean I haven't read all
> the threads in detail.

I like the fact that you put marks of irony around the word "readability".

Stefan


From amauryfa at gmail.com  Fri Sep  3 20:16:11 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Fri, 3 Sep 2010 20:16:11 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <810929.87945.qm@web53702.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
Message-ID: 

Hi,
2010/9/3 Saravanan Shanmugham 
>
> I have heard repeatedly in this alias that PyPy's RPython is very difficult to
> use.
>
> I have also heard here and elsewhere that Shedskin fast and is great for what it
> does i.e. translate its version of Restricted Python to C++.
>
> Which then begs the question, would it make sense for PyPy to adopt Shedskin to
> compile its PyPy RPython code into C++/binary.

But PyPy does not translate RPython code to C++.
Or, before doing so, it performs transformations on the code that
require analysis of the program as a whole and that a C++ compiler cannot
do, like the choice of a garbage collector, the stackless mode, and most
of all the generation of a tracing JIT.

It also operates on the bytecode, which offers interesting metaprogramming
techniques that are used throughout the code (similar to C++ templates,
for example, except that it's written in Python :-) )
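
A small sketch of the template-like style, under the assumption that a
class factory is a fair stand-in for it: make_fixed_array and FixedArray
are hypothetical names, not taken from the PyPy source, and the code is
ordinary Python rather than checked RPython. The factory plays the role of
a template parameter, and the bytecode of each generated method can be
read straight off the live object:

```python
import dis

def make_fixed_array(size):
    """Hypothetical factory, in the spirit of a C++ template parameter."""
    class FixedArray(object):
        SIZE = size

        def __init__(self):
            self.items = [0] * size

        def total(self):
            return sum(self.items)

    FixedArray.__name__ = "FixedArray%d" % size
    return FixedArray

# Two distinct "instantiations", created by running normal Python code.
Array4 = make_fixed_array(4)
Array8 = make_fixed_array(8)

# The bytecode of the generated method is available on the live object.
opnames = [ins.opname for ins in dis.get_instructions(Array4.total)]
```

A tool that reads source text or an AST would see one class definition
here; a tool that starts from the objects sees two separate classes.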

Shedskin, on the other hand, performs a more direct translation of
Python code (it uses the AST).

The two projects don't have the same goals.

--
Amaury Forgeot d'Arc

From sarvi at yahoo.com  Fri Sep  3 21:06:14 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Fri, 3 Sep 2010 12:06:14 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: <294101.49986.qm@web53704.mail.re2.yahoo.com>

Stefan, 
    If I were to go with my impressions, based on you being the lead developer 
of Cython, I could have claimed you have an ulterior motive on this thread.
But then I didn't because, in spite of first impressions/scepticism, I believe we 
are all here with a genuine interest to improve the Python environment and get 
more visibility and momentum for PyPy. Personally I can't wait to see PyPy become 
the default Python. :-)

Let's start with an understanding that we are all smart people with good ideas, 
and let's also not get so cocky as to think we have all the answers.

I saw some genuine synergies that I was calling out.

And I have heard some pragmatic arguments from others though no one has 
necessarily claimed logical impossibility on why this may not work.
And I can understand that.

Sarvi



----- Original Message ----
From: Stefan Behnel 
To: pypy-dev at codespeak.net
Sent: Fri, September 3, 2010 10:52:41 AM
Subject: Re: [pypy-dev] Question on the future of RPython

Saravanan Shanmugham, 03.09.2010 19:22:
> Lets not be a little presumptious shall we.
> This is the second time you seem to be claiming that I haven't done my
> research/reading.

That's just the impression that I get from what you write and how you write it.


> I just don't see any logical reasons, thats all. And I haven't heard any on this
> thread either.

Well, you are talking to people who know a lot more about what you are 
talking about than you do. It's normal that they are not equally 
enthusiastic about pie-in-the-sky ideas that someone throws at them.


> BTW, Just because I top post for "readability" doesn't mean I haven't read all
> the threads in detail.

I like the fact that you put marks of irony around the word "readability".

Stefan




      

From stefan_ml at behnel.de  Fri Sep  3 21:13:26 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 03 Sep 2010 21:13:26 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <294101.49986.qm@web53704.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	<201009021140.53415.jacob@openend.se>	<933850.84117.qm@web53705.mail.re2.yahoo.com>	<201009022254.07665.jacob@openend.se>	<810929.87945.qm@web53702.mail.re2.yahoo.com>		<609691.54456.qm@web53705.mail.re2.yahoo.com>	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
Message-ID: 

Saravanan Shanmugham, 03.09.2010 21:06:
>      If I were to go with my impressions, based on you being the lead developer
> of Cython, I could have claimed you have an ulterior motive on this thread.

*shrug*

Stefan


From p.giarrusso at gmail.com  Sat Sep  4 01:34:24 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Sat, 4 Sep 2010 01:34:24 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
Message-ID: 

> You should seriously read and try to understand the e-mails that you reply
> to, instead of top-posting them away.

Stefan, there are different ways to argue the same valid thing, and
the way you chose is IMHO counterproductive for you - the only result
is offensive comments.
Also, while I seldom top-post, especially in public forums/MLs, IIRC
several PyPy contributors routinely top-post, and I see some sensible
arguments (see http://en.wikipedia.org/wiki/Posting_style#Top-posting).

Saravanan, a small part of the issue is that many people consider top
posting inappropriate and/or lame (for instance
http://www.caliburn.nl/topposting.html). Be aware of that risk if you
top-post. And please, never claim it increases readability - it makes
your post readable only if you read the whole thread. Non-crappy email
clients highlight new text differently from quoted text,
making the former easy to find.

In particular, by top-posting you never address the comments which
explain why merging does not necessarily make sense (like some of
mine), or the ones which argue it's a bad idea (like Amaury's last
mail). Interleaved replying leads instead to point-by-point answers.

See other comments below.
On Fri, Sep 3, 2010 at 11:30, Stefan Behnel  wrote:
> Saravanan Shanmugham, 03.09.2010 11:11:
>> From: Jacob Hallén
>> I have heard repeatedly in this alias that PyPy's RPython is very difficult to
>> use.
This alias?? You mean this _thread_, don't you?
>> I have also heard here and elsewhere that Shedskin fast and is great for what it
>> does i.e. translate its version of Restricted Python to C++.

>> Which then begs the question, would it make sense for PyPy to adopt Shedskin to
>> compile its PyPy RPython code into C++/binary.

The answer is already implicit in one of my previous emails, and is a
very clear "no, unless considerable extra merging effort is done,
which might be more than the effort to make the RPython compiler
better than Shedskin". I paste a relevant subset of that mail at the
end; while I can believe that you have read it, I, like everybody,
often do not understand all the implications of something complex the
first time I read it, so do not be offended if I suggest that you
re-read it.

A similar, more detailed argument is discussed by Amaury Forgeot d'Arc
in an email where he replies to you.

In other mails, you write that:

> I just don't see any logical reasons [against the merger], thats all. And I haven't heard any on this
> thread either.

> no one has necessarily claimed logical impossibility on why this may not work.

which strikes me as _wrong_. The mails I mentioned explain why there
are different design goals - Amaury, who knows more about Shedskin
than me, explained why it is less general. That's already an answer
for me.
Of course, this does not prove impossibility, it only suggests that it
may not be a good idea to merge the projects. You shouldn't care about
logical impossibility, which makes _NO SENSE_ in such questions in
software engineering; doing something that is possible but bad makes
little sense.

If you meant "nobody claimed that this is necessarily a bad idea",
then I agree. We believe there's no obvious way to combine the
projects; anybody, including you, is welcome to address the specific
issues and find some clever solution. You haven't even scratched the
surface yet. And while you claimed experience with using the projects, or
reading their documentation, it is not clear at all that you
understand their internals, and this is required to address these
problems.

The only idea which makes some sense is that instead of starting the
development of Shedskin, the author could have tried achieving the
same results by improving RPython, fixing its error messages and so on.
However, I can imagine a ton of possible reasons for which he might
have consciously decided to do something else. The keyword here is
"design tradeoff": a design choice can make a product better in some
respects and worse in other ones. Shedskin is less flexible, but
possibly this gives technical advantages which are important. That's
the same thing as explained below.

Best regards

=====

=> Do different goals cause _incompatible_ design/implementation choices?
Currently, static typing versus global type inference seems to be
already a fundamental difference. Modular type inference, if
polymorphic (and I guess it has to be), would require using boxed or
tagged integers more often, as far as I can see.

RPython is intended to be compiled to various environments, with
different variations (choose GC/stackful or stackless/heaps of other
choices), and its programmers are somewhat OK with its limitations; it
has type inference, with its set of tradeoffs. This for instance
prevents reusing shedskin, and probably even prevents reusing any of
its code.
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From p.giarrusso at gmail.com  Sat Sep  4 02:15:47 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Sat, 4 Sep 2010 02:15:47 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <294101.49986.qm@web53704.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
Message-ID: 

On Fri, Sep 3, 2010 at 21:06, Saravanan Shanmugham  wrote:
> Stefan,
>    If I were to go with my impressions, based on you being the lead developer
> of Cython, I could have claimed you have an ulterior motive on this thread.

> But then I didn't because, inspite of first impressions/scepticism I believe we
> are all here with a genuine interest to improve the Python environment

While you didn't initiate the flame, I think that's totally
inappropriate, and I can say so even without knowing Stefan.
You wrote:
> Lets not be a little presumptious shall we.
Well, rereading previous comments, it is clear that:
a) you don't know some basics of virtual machines which have been explained.
Which is fine, but then you shouldn't consider yourself a peer of the
developers of these projects. And you shouldn't claim you have done
proper research if you just used the projects.
You are welcome to be curious, but with such a comment you are the
presumptuous one.
Note that I already remarked that Stefan's comment was not appropriate in style.

b) your email client _is_ crappy, given the way you reply inline (I
was mentioning crappy clients in my previous email).
Socially speaking, in an Open Source community, not using a decent
email client can look as bad as dressing very very wrong. I'm not so
picky, but it does mean you're not a hacker.

Note I'm not a developer of PyPy, and I don't claim to be an expert,
but I have some technical knowledge of its documentation about
internals and of some literature, and some small experience with a
Python implementation.

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From sarvi at yahoo.com  Sat Sep  4 04:33:16 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Fri, 3 Sep 2010 19:33:16 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	
Message-ID: <597924.69989.qm@web53705.mail.re2.yahoo.com>





----- Original Message ----
From: Paolo Giarrusso 
To: Stefan Behnel ; Saravanan Shanmugham 
Cc: pypy-dev at codespeak.net
Sent: Fri, September 3, 2010 4:34:24 PM
Subject: Re: [pypy-dev] Question on the future of RPython

> You should seriously read and try to understand the e-mails that you reply
> to, instead of top-posting them away.

Stefan, there are different ways to argue the same valid thing, and
the way you chose is IMHO counterproductive for you - the only result
is offensive comments.
Also, while I seldom top-post, especially in public forums/MLs, IIRC
several PyPy contributors routinely top-post, and I see some sensible
arguments (see http://en.wikipedia.org/wiki/Posting_style#Top-posting).

Saravanan, a small part of the issue is that many people consider top
posting inappropriate and/or lame (for instance
http://www.caliburn.nl/topposting.html). Be aware of that risk if you
top-post. And please, never claim it increases readability - it makes
your post only readable if you read the whole thread. Non-crappy email
clients highlight differently the new text from the quoted text,
making the former easy to find.

Sarvi>> Point taken. Will keep that in mind.
It was a misguided notion of what would be readable.

Sarvi 

In particular, by top-posting you never address the comments which
explain why merging does not necessarily make sense (like some of
mine), or the ones which argue it's a bad idea (like last Amaury's
mail). Interleaved replying brings instead to point-by-point answers.

See other comments below.
On Fri, Sep 3, 2010 at 11:30, Stefan Behnel  wrote:
> Saravanan Shanmugham, 03.09.2010 11:11:
>> From: Jacob Hallén
>> I have heard repeatedly in this alias that PyPy's RPython is very difficult to
>> use.
This alias?? You mean this _thread_, don't you?
>> I have also heard here and elsewhere that Shedskin fast and is great for what it
>> does i.e. translate its version of Restricted Python to C++.

>> Which then begs the question, would it make sense for PyPy to adopt Shedskin to
>> compile its PyPy RPython code into C++/binary.

The answer is already implicit in one of my previous emails, and is a
very clear "no, unless considerable extra merging effort is done,
which might be more than the effort to make the RPython compiler
better than Shedskin". I paste a relevant subset of that mail at the
end; while I can believe that you have read it, I often do not
understand all the implications of what I read the first time, if
that's complex, like it is for everybody, so do not be offended if I
suggest you to re-read it again.

A similar, more detailed argument is discussed by Amaury Forgeot d'Arc
in an email where he replies to you.

In other mails, you write that:

> I just don't see any logical reasons [against the merger], thats all. And I haven't heard any on this
> thread either.

> no one has necessarily claimed logical impossibility on why this may not work.

which strikes me as _wrong_. The mails I mentioned explain why there
are different design goals - Amaury, who knows more about Shedskin
than me, explained why it is less general. That's already an answer
for me.
Of course, this does not prove impossibility, it only suggests that it
may be not a good idea to merge the projects. You shouldn't care about
logical impossibility, which makes _NO SENSE_ in such questions in
software engineering; what is possible and bad makes little sense.

If you meant "nobody claimed that this is necessarily a bad idea",
then I agree. We believe there's no obvious way to combine the
projects; anybody, including you, is welcome to address the specific
issues and find some clever solution. You didn't even scratch them,
yet. And while you claimed experience with using the projects, or
reading their documentation, it is not clear at all that you
understand their internals, and this is required to address these
problems.

The only idea which makes some sense is that instead of starting the
development of Shedskin, the author could have tried achieving the
same results improving RPython, fixing its error messages and so on.
However, I can imagine a ton of possible reasons for which he might
have consciously decided to do something else. The keyword here is
"design tradeoff": a design choice can make a product better in some
respects and worse in other ones. Shedskin is less flexible, but
possibly this gives technical advantages which are important. That's
the same thing as explained below.

Best regards

=====

=> Do different goals cause _incompatible_ design/implementation choices?
Currently, static typing versus global type inference seems to be
already a fundamental difference. Modular type inference, if
polymorphic (and I guess it has to be), would require using boxed or
tagged integers more often, as far as I can see.

RPython is intended to be compiled to various environments, with
different variations (choose GC/stackful or stackless/heaps of other
choices), and its programmers are somewhat OK with its limitations; it
has type inference, with its set of tradeoffs. This for instance
prevents reusing shedskin, and probably even prevents reusing any of
its code.
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/



      

From sarvi at yahoo.com  Sat Sep  4 04:56:49 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Fri, 3 Sep 2010 19:56:49 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
Message-ID: <860977.83657.qm@web53705.mail.re2.yahoo.com>





----- Original Message ----
From: Paolo Giarrusso 
To: Saravanan Shanmugham 
Cc: Stefan Behnel ; pypy-dev at codespeak.net
Sent: Fri, September 3, 2010 5:15:47 PM
Subject: Re: [pypy-dev] Question on the future of RPython

On Fri, Sep 3, 2010 at 21:06, Saravanan Shanmugham  wrote:
> Stefan,
>    If I were to go with my impressions, based on you being the lead developer
> of Cython, I could have claimed you have an ulterior motive on this thread.

> But then I didn't because, inspite of first impressions/scepticism I believe we
> are all here with a genuine interest to improve the Python environment

While you didn't initiate the flame, I think that's totally
inappropriate, and I can say so even without knowing Stefan.

Sarvi>> I believe it is an appropriate response to the flame bait.
BTW, I was very careful not to make the accusation. 
No real offense meant.
It was just a what-if argument to drive home the point that if everyone responded 
like that, based on impressions and presumptions, it would be wrong.
So I stand by that.

You wrote:
> Lets not be a little presumptious shall we.
Well, rereading previous comments it is clear that:
a) you don't know well some basics of virtual machines which have been explained
Which is fine, but then you shouldn't consider yourself a peer to
developers of these projects. And you shouldn't claim you have done
proper research if you just used the projects.
You are welcome to be curious, but with such a comment you are the
presumptuous one.
Note that I already remarked that Stefan's comment was not appropriate in style.

Sarvi>> We may have to agree to disagree here. 
I don't believe my thread of discussion has anything to do with Virtual Machines 
at all. 
What I have been saying has more to do with compiling plain RPython code into 
C/C++/ASM executables.

Shedskin uses a statically typed restricted version of Python that gets 
converted to C++.
PyPy converts a statically typed restricted version of Python to C that can 
then be compiled to an executable.
So, though the approaches differ, the final goal in both cases is to produce a 
compiled binary executable for the RPython code.

Agreed, PyPy additionally allows using Language/JIT hints to help 
write/generate JIT compilers as well.

That does not rule out the possibility that the statically typed version of 
Restricted Python used by Shedskin could be a full subset of PyPy's RPython.
Nor does it rule out the possibility of using PyPy as just a plain Restricted 
Python compiler, pure and simple.

This thought angle has nothing to do with Virtual Machines, really.


b) your email client _is_ crappy, given the way you reply inline (I
was mentioning crappy clients in my previous email).
Socially speaking, in an Open Source community, not using a decent
email client can look as bad as dressing very very wrong. I'm not so
picky, but it does mean you're not a hacker.

Sarvi>>> Point taken. 
I use plain Yahoo Web Mail. Do you have any suggestion how I could do better 
with the Yahoo Web Mail client???
I am open to learning a better way. :-) Will look into it.


Thanks,
Sarvi


Note I'm not a developer of PyPy, and I don't claim being an expert,
but I have some technical knowledge of its documentation about
internals and of some literature, and some small experience with a
Python implementation.

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/



      

From arigo at tunes.org  Sat Sep  4 11:03:24 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 4 Sep 2010 11:03:24 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <860977.83657.qm@web53705.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
Message-ID: 

Hi,

Can we please close this thread?  The basic answer you will get from
anybody that actually worked at least a bit with PyPy is that all your
discussions are moving air around and nothing else.

There is no one working with PyPy that is interested in using RPython
for the purpose of compiling some RPython programs to C code
statically (except interpreters).  If anyone is really interested in
this topic he can (again) give it a try.  He would get some help from
us, i.e. the rest of the PyPy team, but it would be a fork for now.  I
say "again" because there are some previous attempts at doing that,
which all failed.  As long as no such project exists and is successful
-- and I have some doubts about it -- I will not believe in the nice
(and, to me, completely bogus) claims made on this thread, like "let's
bring RPython and Shedskin together".


A bientot,

Armin.

From bhartsho at yahoo.com  Sun Sep  5 06:50:44 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Sat, 4 Sep 2010 21:50:44 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
Message-ID: <579650.12542.qm@web114018.mail.gq1.yahoo.com>

RPython is a nice language even right now; it's very useful, and I'm using it
all the time.  Too many people downplay its potential, saying it's too hard or
complex.  I like the error messages; they help you understand how it all works.
However, it would be nice to see an official FAQ where the common errors are
explained in plain English. RPython is only lacking a version number.
-brett




From sarvi at yahoo.com  Sun Sep  5 18:51:26 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Sun, 5 Sep 2010 09:51:26 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython - My Donation to
	the Cause
In-Reply-To: <579650.12542.qm@web114018.mail.gq1.yahoo.com>
References: <579650.12542.qm@web114018.mail.gq1.yahoo.com>
Message-ID: <333806.72573.qm@web53704.mail.re2.yahoo.com>

I think the work PyPy and Shedskin are doing is excellent for the Python 
community.
And I think RPython has an excellent future if we give it a little bit of a 
push.

I am just a private citizen.
So here is a small bounty/prize as motivation for the PyPy team.
I understand people do open source work for pride and not money.
And yes, $200 may sound small. 

I would contribute code if I had the time, but I already have my hands on too 
many side projects. 
So this is me helping another way.

Think of this as me hosting pizza and Coke for one of the endless sprints y'all 
do :-))

Also, I am hoping this is just seed money to motivate the RPython cause.
And I am sincerely hoping others interested in seeing RPython and the C backend 
of PyPy develop more completely will also add to this prize pool and show 
support for the cause.


Thx,
Sarvi


----- Original Message ----
> From: Hart's Antler 
> To: pypy-dev at codespeak.net
> Sent: Sat, September 4, 2010 9:50:44 PM
> Subject: Re: [pypy-dev] Question on the future of RPython
> 
> RPython is a nice language even right now, its very useful, i'm using it all
> the time.  Too many people downplay its potentials saying its too hard or
> complex.  I like the error messages, they help you understand how it all works;
> however it would be nice to see an official FAQ where the common errors are
> explained in plain english. RPython is only lacking a version number.
> -brett
> 
> 
> 
> 


      

From sarvi at yahoo.com  Mon Sep  6 20:27:08 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Mon, 6 Sep 2010 11:27:08 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: <792921.22517.qm@web53703.mail.re2.yahoo.com>

Hi Armin,
   Could you point me to some of these previous attempts at improving 
the RPython-to-executable capability? 
I would like to understand what was attempted.

Hart's Antler, who seems to be working on RPython quite extensively, contacted me 
privately about doing some work in the RPython area.
I am considering sponsoring him to do some work on PyPy, but only if it is done 
with the PyPy team's blessing and will help PyPy as a whole.

Is there a wish list of RPython enhancements somewhere that the PyPy team might 
be considering? 
Stuff that would benefit RPython users in general.

Sarvi



----- Original Message ----
> From: Armin Rigo 
> To: Saravanan Shanmugham 
> Cc: Paolo Giarrusso ; pypy-dev at codespeak.net; Stefan Behnel
> Sent: Sat, September 4, 2010 2:03:24 AM
> Subject: Re: [pypy-dev] Question on the future of RPython
> 
> Hi,
> 
> Can we please close this thread?  The basic answer you will get from
> anybody that actually worked at least a bit with PyPy is that all your
> discussions are moving air around and nothing else.
>
> There is no one working with PyPy that is interested in using RPython
> for the purpose of compiling some RPython programs to C code
> statically (except interpreters).  If anyone is really interested in
> this topic he can (again) give it a try.  He would get some help from
> us, i.e. the rest of the PyPy team, but it would be a fork for now.  I
> say "again" because there are some previous attempts at doing that,
> which all failed.  As long as no such project exists and is successful
> -- and I have some doubts about it -- I will not believe in the nice
> (and, to me, completely bogus) claims made on this thread, like "let's
> bring RPython and Shedskin together".
> 
> 
> A bientot,
> 
> Armin.
> 


      

From hakan at debian.org  Tue Sep  7 10:51:00 2010
From: hakan at debian.org (Hakan Ardo)
Date: Tue, 7 Sep 2010 10:51:00 +0200
Subject: [pypy-dev] jit-bounds branch (was: Loop invaraints)
In-Reply-To: 
References: 
	<4C7A3737.50902@gmx.de>
	
	<4C7A4C99.2050803@gmx.de>
	
	
	
Message-ID: 

Hi,
there is now a package version of optimizeopt in the jit-bounds
branch. In optimizeopt/__init__.py a chain of optimizers is created:

    optimizations = [OptIntBounds(),
                     OptRewrite(),
                     OptVirtualize(),
                     OptHeap(),
                    ]

The operations are passed from one optimizer to the next, which means
we keep the single loop over the operations. Each optimization is
located in its own file, and it should be straightforward to add
more optimizations and even make them optional using runtime arguments
if that is of interest.
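
As a rough illustration of the chaining idea, here is a toy sketch. It is
not the code from the branch: the class names (Stage, DropAddZero,
DropRepeatedGuards, Collector), the optimize() helper and the tuple format
used for operations are all invented for this example. Each stage either
rewrites, drops or forwards an operation to the next stage, so the trace is
still walked only once:

```python
class Stage(object):
    """Base class: by default an operation is forwarded unchanged."""

    def __init__(self):
        self.next = None

    def handle(self, op):
        self.emit(op)

    def emit(self, op):
        self.next.handle(op)


class DropAddZero(Stage):
    """Rewrite ('int_add', result, x, 0) into ('same_as', result, x)."""

    def handle(self, op):
        if op[0] == "int_add" and op[3] == 0:
            op = ("same_as", op[1], op[2])
        self.emit(op)


class DropRepeatedGuards(Stage):
    """Skip a guard that is identical to one already emitted."""

    def __init__(self):
        Stage.__init__(self)
        self.seen = set()

    def handle(self, op):
        if op[0].startswith("guard_"):
            if op in self.seen:
                return              # already checked: forward nothing
            self.seen.add(op)
        self.emit(op)


class Collector(Stage):
    """Last link of the chain: stores the optimized operations."""

    def __init__(self):
        Stage.__init__(self)
        self.result = []

    def handle(self, op):
        self.result.append(op)


def optimize(operations, stages):
    """Link the stages together and push every operation through once."""
    collector = Collector()
    chain = list(stages) + [collector]
    for current, following in zip(chain, chain[1:]):
        current.next = following
    for op in operations:           # the single loop over the trace
        chain[0].handle(op)
    return collector.result
```

Making a stage optional then only means leaving it out of the list passed
to optimize(), for example optimize(trace, [DropAddZero()]).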

I believe this branch is ready to be merged now.

On Thu, Sep 2, 2010 at 10:22 AM, Paolo Giarrusso  wrote:
>
> On Tue, Aug 31, 2010 at 09:25, Hakan Ardo  wrote:
> > Ok, so we split it up into a set of Optimization classes in separate
> > files. Each containing a subset of the optimize_... methods. Then we
> > have the propagate_forward method iterate over the instructions
> > passing them to one Optimization after the other? That way we keep the
> > single iteration over the instructions. Would it be preferable to
> > separate them even more and have each Optimization contain it's own
> > loop over the instructions?
>
> But won't this affect performance? Which is very important in a JIT compiler.
> When compiling traces bigger than a cacheline, it might even affect
> locality, i.e. be an important performance problem.
> Unless your RPython compiler can join the loops. If they are just
> loops, it could. If they are tree visits, it likely can't; it's done
> by the Haskell compiler (google Haskell, stream fusion, shortcut
> deforestation, I guess), but the techniques are unlikely to generalize
> to languages with side effects; it's also done/doable in some
> Domain-Specific Languages for tree visitors.
>
> > On Sun, Aug 29, 2010 at 10:05 PM, Maciej Fijalkowski  wrote:
> >> On Sun, Aug 29, 2010 at 2:03 PM, Carl Friedrich Bolz  wrote:
> >>> On 08/29/2010 01:49 PM, Hakan Ardo wrote:
> >>>>> P.S.: A bit unrelated, but a comment on the jit-bounds branch: I think
> >>>>> it would be good if the bounds-related optimizations could move out of
> >>>>> optimizeopt.py to their own file, because otherwise optimizeopt.py is
> >>>>> getting really unwieldy. Does that make sense?
> >>>>
> >>>> Well, class IntBound and the propagate_bounds_ methods could probably
> >>>> be moved elsewhere, but a lot of the work is done in optimize_...
> >>>> methods, which I'm not so sure it would make sens to split up.
> >>>
> >>> I guess then the things that can be sanely moved should move. The file
> >>> is nearly 2000 lines, which is way too big. I guess also the heap
> >>> optimizations could go to their own file.
> >>>
> >>> Carl Friedrich
> >>
> >> How about a couple of files (preferably small) each containing a
> >> contained optimization if possible? (maybe a package?)
> >>
> >
> >
> >
> > --
> > Håkan Ardö
> > _______________________________________________
> > pypy-dev at codespeak.net
> > http://codespeak.net/mailman/listinfo/pypy-dev
> >
>
>
>
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/



--
Håkan Ardö

From arigo at tunes.org  Tue Sep  7 10:57:15 2010
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 7 Sep 2010 10:57:15 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <792921.22517.qm@web53703.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
Message-ID: 

Hi,

On Mon, Sep 6, 2010 at 8:27 PM, Saravanan Shanmugham
> Is there a wish list of RPython enhancements somewhere that the
> PyPy team might be considering?
> Stuff that would benefit RPython users in general.

I feel like I am repeating myself so that's my last mail to this
thread.  There are no enhancements we are considering to benefit other
RPython users because *there* *are* *no* *other* *RPython* *users.*
There is only us and RPython suits us just fine for the purpose for
which it was designed.

Again, feel free to make a fork or a branch of PyPy and try to develop
a version of RPython that is more suited to writing general programs
in.  I don't know if there is a wish list of what is missing, but
certainly I haven't given it much thought myself.  Personally, I
think that writing RPython programs is kind of fun, but in a perverse
way -- if I could just write plain Python that was as fast or mostly
as fast, it would be perfect.


A bientôt,

Armin.

From stefan_ml at behnel.de  Tue Sep  7 11:07:42 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 07 Sep 2010 11:07:42 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	<201009021140.53415.jacob@openend.se>	<933850.84117.qm@web53705.mail.re2.yahoo.com>	<201009022254.07665.jacob@openend.se>	<810929.87945.qm@web53702.mail.re2.yahoo.com>		<609691.54456.qm@web53705.mail.re2.yahoo.com>		<294101.49986.qm@web53704.mail.re2.yahoo.com>		<860977.83657.qm@web53705.mail.re2.yahoo.com>		<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
Message-ID: 

Armin Rigo, 07.09.2010 10:57:
> On Mon, Sep 6, 2010 at 8:27 PM, Saravanan Shanmugham
>> Is there a wish list of RPython enhancements somewhere that the
>> PyPy team might be considering?
>> Stuff that would benefit RPython users in general.
>
> Again, feel free to make a fork or a branch of PyPy and try to develop
> a version of RPython that is more suited to writing general programs
> in.

In that case, I suggest working on Shedskin or Cython instead.

Stefan


From angelflow at yahoo.com  Thu Sep  9 20:04:11 2010
From: angelflow at yahoo.com (Andy)
Date: Thu, 9 Sep 2010 11:04:11 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
Message-ID: <599423.93719.qm@web111315.mail.gq1.yahoo.com>

Hi,

I'm interested in using PyPy JIT with async greenlet-based frameworks such as gevent or meinheld.

Is there any plan to make PyPy JIT work with greenlet and C extensions? If so when may that be available?

Thanks.
Andy


      

From angelflow at yahoo.com  Thu Sep  9 20:06:20 2010
From: angelflow at yahoo.com (Andy)
Date: Thu, 9 Sep 2010 11:06:20 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT and Django
Message-ID: <215409.51875.qm@web111306.mail.gq1.yahoo.com>

I'd like to run Django on PyPy JIT. 

Could you give me some instructions on how to do that? I couldn't really find any documentation in that area.

Thanks.

Andy


      

From amauryfa at gmail.com  Thu Sep  9 20:18:22 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 9 Sep 2010 20:18:22 +0200
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: <215409.51875.qm@web111306.mail.gq1.yahoo.com>
References: <215409.51875.qm@web111306.mail.gq1.yahoo.com>
Message-ID: 

Hi,

2010/9/9 Andy :
> I'd like to run Django on PyPy JIT.
>
> Could you give me some instructions on how to do that? I couldn't really find any documentation in that area.

I suggest starting with the obvious:
- Install PyPy
- download Django, unpack the archive
- in the Django directory, run "pypy setup.py install"
And tell us how it behaves!

-- 
Amaury Forgeot d'Arc

From angelflow at yahoo.com  Thu Sep  9 20:32:55 2010
From: angelflow at yahoo.com (Andy)
Date: Thu, 9 Sep 2010 11:32:55 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: 
Message-ID: <515725.20046.qm@web111303.mail.gq1.yahoo.com>


--- On Thu, 9/9/10, Amaury Forgeot d'Arc  wrote:

> I suggest to start with the obvious:
> - Install PyPy
> - download Django, unpack the archive
> - in the Django directory, run "pypy setup.py install"
> And tell us how it behaves!
> 

Would this replace my existing Python interpreter with PyPy? I want to keep my existing Python.

I'd like to have the option to choose either the standard CPython or PyPy. I could set up 2 virtualenvs I suppose - 1 for CPython and 1 for PyPy. Does PyPy work with virtualenv?


      

From fuzzyman at voidspace.org.uk  Thu Sep  9 20:35:14 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 9 Sep 2010 19:35:14 +0100
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: <515725.20046.qm@web111303.mail.gq1.yahoo.com>
References: 
	<515725.20046.qm@web111303.mail.gq1.yahoo.com>
Message-ID: 

On 9 September 2010 19:32, Andy  wrote:

>
> --- On Thu, 9/9/10, Amaury Forgeot d'Arc  wrote:
>
> > I suggest to start with the obvious:
> > - Install PyPy
> > - download Django, unpack the archive
> > - in the Django directory, run "pypy setup.py install"
> > And tell us how it behaves!
> >
>
> Would this replace my existing Python interpreter with PyPy? I want to keep
> my existing Python.
>
>
Nope, the pypy interpreter is called pypy not python.


> I'd like to have the option to choose either the standard CPython or PyPy.
> I could set up 2 virtualenvs I suppose - 1 for CPython and 1 for PyPy. Does
> PyPy work with virtualenv?
>
>
There is definitely a version of virtualenv that works with pypy. It may
even be included in pypy.

All the best,

Michael



>
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>



-- 
http://www.voidspace.org.uk

From anto.cuni at gmail.com  Thu Sep  9 20:42:10 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 09 Sep 2010 20:42:10 +0200
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: <515725.20046.qm@web111303.mail.gq1.yahoo.com>
References: <515725.20046.qm@web111303.mail.gq1.yahoo.com>
Message-ID: <4C892A82.5080804@gmail.com>

On 09/09/10 20:32, Andy wrote:
> Would this replace my existing Python interpreter with PyPy? I want to keep my existing Python.

Well, no: if you run "pypy setup.py install" for a package, the package
is installed into PyPy's site-packages directory instead of
CPython's.
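
As a hedged aside on the point above: a minimal sketch for checking which
interpreter a script runs under and where that interpreter installs
pure-Python packages. It assumes a reasonably modern interpreter (the
`sysconfig` module and `platform.python_implementation()` are newer than the
PyPy 1.3 discussed in this thread), and `describe_interpreter` is a
hypothetical helper name, not something shipped with PyPy.

```python
import platform
import sys
import sysconfig


def describe_interpreter():
    """Return the facts that decide where `setup.py install` puts a package."""
    return {
        # "CPython" or "PyPy", depending on which binary runs this script.
        "implementation": platform.python_implementation(),
        # Path of the running interpreter binary, as reported by the runtime.
        "executable": sys.executable,
        # Default directory for pure-Python packages of this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in sorted(describe_interpreter().items()):
        print("%s: %s" % (key, value))
```

Running the same file once with `python` and once with `pypy` should report
two different site-packages directories, which is why installing into one
does not touch the other.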

> I'd like to have the option to choose either the standard CPython or PyPy. I could set up 2 virtualenvs I suppose - 1 for CPython and 1 for PyPy. Does PyPy work with virtualenv?

yes, pypy supports virtualenv but:

1) you need a pypy newer than pypy 1.3: you can build one by yourself from
svn, or download one of our nightly builds:
http://buildbot.pypy.org/nightly/trunk/

note that you probably want the one with -jit and -linux in it, which is a 32
bit executable. There is no pypy-jit for linux 64 bit yet (but it might be
there soon).

2) you need a recent version of virtualenv, as pypy support has been added
only recently (http://bitbucket.org/ianb/virtualenv/changeset/a03cb042dd81).
AFAIK, no released version supports it, so you need to install/run it from the
mercurial repository.

ciao,
Anto

From angelflow at yahoo.com  Thu Sep  9 22:29:53 2010
From: angelflow at yahoo.com (Andy)
Date: Thu, 9 Sep 2010 13:29:53 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: <4C892A82.5080804@gmail.com>
Message-ID: <392170.19756.qm@web111313.mail.gq1.yahoo.com>

Thanks Antonio.

So right now there's no way to run PyPy JIT on Linux 64 bit? 

--- On Thu, 9/9/10, Antonio Cuni  wrote:

> From: Antonio Cuni 
> Subject: Re: [pypy-dev] PyPy JIT and Django
> To: "Andy" 
> Cc: "Amaury Forgeot d'Arc" , pypy-dev at codespeak.net
> Date: Thursday, September 9, 2010, 2:42 PM
> On 09/09/10 20:32, Andy wrote:
> > Would this replace my existing Python interpreter with
> PyPy? I want to keep my existing Python.
> 
> well, no: if you run pypy setup.py install on whatever
> package, what happens
> is that you install this package in pypy's site-package
> directory instead of
> cpython's one.
> 
> > I'd like to have the option to choose either the
> standard CPython or PyPy. I could set up 2 virtualenvs I
> suppose - 1 for CPython and 1 for PyPy. Does PyPy work with
> virtualenv?
> 
> yes, pypy supports virtualenv but:
> 
> 1) you need a pypy newer than pypy 1.3: you can build one
> by yourself from
> svn, or download one of our nightly builds:
> http://buildbot.pypy.org/nightly/trunk/
> 
> note that you probably want the one with -jit and -linux in
> it, which is a 32
> bit executable. There is no pypy-jit for linux 64 bit yet
> (but it might be
> there soon).
> 
> 2) you need a recent version of virtualenv, as pypy support
> has been added
> only recently (http://bitbucket.org/ianb/virtualenv/changeset/a03cb042dd81).
> AFAIK, no released version supports it, so you need to
> install/run it from the
> mercurial repository.
> 
> ciao,
> Anto
> 


      

From amauryfa at gmail.com  Thu Sep  9 22:37:15 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 9 Sep 2010 22:37:15 +0200
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: <392170.19756.qm@web111313.mail.gq1.yahoo.com>
References: <4C892A82.5080804@gmail.com>
	<392170.19756.qm@web111313.mail.gq1.yahoo.com>
Message-ID: 

2010/9/9 Andy :
> Thanks Antonio.
>
> So right now there's no way to run PyPy JIT on Linux 64 bit?

Yes there is!
Armin merged the asmgcc-64 branch yesterday.
You have to use the trunk version of PyPy, and build it yourself.

-- 
Amaury Forgeot d'Arc

From fijall at gmail.com  Thu Sep  9 22:55:55 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 9 Sep 2010 22:55:55 +0200
Subject: [pypy-dev] PyPy JIT and Django
In-Reply-To: 
References: <4C892A82.5080804@gmail.com>
	<392170.19756.qm@web111313.mail.gq1.yahoo.com>
	
Message-ID: 

On Thu, Sep 9, 2010 at 10:37 PM, Amaury Forgeot d'Arc
 wrote:
> 2010/9/9 Andy :
>> Thanks Antonio.
>>
>> So right now there's no way to run PyPy JIT on Linux 64 bit?
>
> Yes there is!
> Armin merged the asmgcc-64 branch yesterday.
> You have to use trunk version of PyPy, and build it yourself.

Although I would mark this support as "experimental" for now, it works :-)

>
> --
> Amaury Forgeot d'Arc
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From arigo at tunes.org  Sat Sep 11 16:57:41 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 11 Sep 2010 16:57:41 +0200
Subject: [pypy-dev] External RPython mailing list
Message-ID: 

Hi,

To anyone interested, Sarvi(?) created an RPython mailing list (Thanks
Bea for spotting this):

    http://pyppet.blogspot.com/2010/09/rpython-mailing-list.html

The following paragraph should have been posted as a comment to that
blog post, but it doesn't record my post no matter how much I try, so
I'll put it here:

"""
Ah, sorry about the money issue.  I didn't realize that you already
sent it to us; I misunderstood that you would not send it at all after
we told you that we don't have resources and motivation to make
RPython more user-friendly (even with $200).  Now I suppose that we
can arrange for you to get the money back if you like, or else thank
you properly for it if it's ours to keep anyway :-)
"""

About the non-money issue, I end up looking like the bad guy.  I
suppose I should not have tried to say and repeat "no" so many times
in the previous thread in increasingly bad tones; now Sarvi points
only to my most negative e-mail.


A bientôt,

Armin

From bea at changemaker.nu  Sun Sep 12 21:58:14 2010
From: bea at changemaker.nu (Bea During)
Date: Sun, 12 Sep 2010 21:58:14 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
Message-ID: <4C8D30D6.10802@changemaker.nu>

Hi there

Armin Rigo skrev:
> Hi,
>
> To anyone interested, Sarvi(?) created an RPython mailing list (Thanks
> Bea for spotting this):
>
>     http://pyppet.blogspot.com/2010/09/rpython-mailing-list.html
>
> The following paragraph should have been posted as a comment to that
> blog post, but it doesn't record my post no matter how much I try, so
> I'll put it here:
>
> """
> Ah, sorry about the money issue.  I didn't realize that you already
> sent it to us; I misunderstood that you would not send it at all after
> we told you that we don't have resources and motivation to make
> RPython more user-friendly (even with $200).  Now I suppose that we
> can arrange for you to get the money back if you like, or else thank
> you properly for it if it's ours to keep anyway :-)
> """
>
> About the non-money issue, I end up looking like the bad guy.  I
> suppose I should not have tried to say and repeat "no" so many times
> in the previous thread in increasingly bad tones; now Sarvi points
> only to my most negative e-mail.
>
>
> A bientôt,
>
> Armin
>   

Thanks for posting this Armin. Let's focus on the future then.

Maybe we should be clear in our documentation somewhere on
where we stand regarding RPython, and maybe give some friendly
advice on how to get started with experimenting (because that is
what trying out RPython for purposes other than the ones PyPy
uses it for amounts to). And if more questions like these pop up,
we can refer the inquiries there?

That way it's the core dev team who expresses the views, in one voice
so to say.

Just a suggestion.

Cheers

Bea

> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>
>   


From angelflow at yahoo.com  Mon Sep 13 07:21:03 2010
From: angelflow at yahoo.com (Andy)
Date: Sun, 12 Sep 2010 22:21:03 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
Message-ID: <173287.30107.qm@web111312.mail.gq1.yahoo.com>



--- On Fri, 9/10/10, Armin Rigo  wrote:

> On Thu, Sep 9, 2010 at 8:04 PM, Andy 
> wrote:
> > Is there any plan to make PyPy JIT work with greenlet
> and C extensions?
> > If so when may that be available?
> 
> As far as I can tell it works already.  Just download
> (http://buildbot.pypy.org/nightly/trunk/)
> or build yourself a
> stackless version of PyPy; it should contain "cpyext",
> which means
> that it should support C extension modules.
> 
> Note that callbacks from C to Python do not play nicely
> with stackless
> features: if you try e.g. to switch to another greenlet
> while you are
> in such a callback, you get a fatal error.  Hopefully
> that's a rare
> case, but still, it should at least be improved to give you
> a regular
> catchable Python exception.


Thanks Armin.

Does that mean PyPy will not work with greenlet/gevent/etc?

Is there any plan to make PyPy support greenlet? Or is there some fundamental obstacle that would prevent it from doing so?

Andy


      

From arigo at tunes.org  Mon Sep 13 09:57:28 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 13 Sep 2010 09:57:28 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <4C8D30D6.10802@changemaker.nu>
References: 
	<4C8D30D6.10802@changemaker.nu>
Message-ID: 

Hi,

On Sun, Sep 12, 2010 at 9:58 PM, Bea During  wrote:
> Maybe we should be clear in our documentation somewhere on
> where we stand regarding RPython

What about renaming it first?  There is at least one other project
that uses the name RPython.  What about something like InterpPy or
InterpPython to make it clear that it's supposed to be used to write
interpreters?  It doesn't sound terrific but I don't really care --
so, comments welcome, but please no infinite discussion on the pros
and cons of various names.


A bientôt,

Armin.

From fijall at gmail.com  Mon Sep 13 10:03:01 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 13 Sep 2010 10:03:01 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
Message-ID: 

On Mon, Sep 13, 2010 at 9:57 AM, Armin Rigo  wrote:
> Hi,
>
> On Sun, Sep 12, 2010 at 9:58 PM, Bea During  wrote:
>> Maybe we should be clear in our documentation somewhere on
>> where we stand regarding RPython
>
> What about renaming it first?  There is at least one other project
> that uses the name RPython.  What about something like InterpPy or
> InterpPython to make it clear that it's supposed to be used to write
> interpreters?  It doesn't sound terrific but I don't really care --
> so, comments welcome, but please no infinite discussion on the pros
> and cons of various names.
>

While we're at it, how about splitting the translation toolchain from
pypy interpreter? I don't mean on technical merits, it can still be
the same or mostly the same source codebase, but more on the
conceptual level, to have 2 different websites names etc.

From arigo at tunes.org  Mon Sep 13 10:10:40 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 13 Sep 2010 10:10:40 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <173287.30107.qm@web111312.mail.gq1.yahoo.com>
References: 
	<173287.30107.qm@web111312.mail.gq1.yahoo.com>
Message-ID: 

Hi Andy,

On Mon, Sep 13, 2010 at 7:21 AM, Andy  wrote:
> Does that mean PyPy will not work with greenlet/gevent/etc?

Sorry if I wasn't clear.  PyPy contains greenlet support (since
2005-6).  It's part of the same package that we call "pypy-stackless".


Armin

From fijall at gmail.com  Mon Sep 13 10:11:42 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 13 Sep 2010 10:11:42 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
Message-ID: 

On Mon, Sep 13, 2010 at 10:08 AM, Armin Rigo  wrote:
> Hi Maciej,
>
> On Mon, Sep 13, 2010 at 10:03 AM, Maciej Fijalkowski  wrote:
>> While we're at it, how about splitting the translation toolchain from
>> pypy interpreter? I don't mean on technical merits, it can still be
>> the same or mostly the same source codebase, but more on the
>> conceptual level, to have 2 different websites names etc.
>
> I don't care too much right now.  My motivation was to make RPython
> *less* visible, not create a second website for the translation
> toolchain (which would make RPython more visible).
>

I don't think it's hideable. What we can do instead is leave some
kind of description of what it is and why it is the way it is. Trying
to hide it suggests to some people that we have an awesome tool that
we don't want to share. Instead it's worth explaining why we don't
share this (e.g. because it's hard to use).

>
> A bientôt,
>
> Armin.
>

From arigo at tunes.org  Mon Sep 13 10:14:01 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 13 Sep 2010 10:14:01 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
Message-ID: 

Hi,

On Mon, Sep 13, 2010 at 10:11 AM, Maciej Fijalkowski  wrote:
> I don't think it's hideable.

Sorry, I wasn't clear.  I'm not really trying to hide it.  But I'm
also not really trying to push it forward (which seems to be what
creating a website for it would do).


Armin

From arigo at tunes.org  Mon Sep 13 10:08:51 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 13 Sep 2010 10:08:51 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
Message-ID: 

Hi Maciej,

On Mon, Sep 13, 2010 at 10:03 AM, Maciej Fijalkowski  wrote:
> While we're at it, how about splitting the translation toolchain from
> pypy interpreter? I don't mean on technical merits, it can still be
> the same or mostly the same source codebase, but more on the
> conceptual level, to have 2 different websites names etc.

I don't care too much right now.  My motivation was to make RPython
*less* visible, not create a second website for the translation
toolchain (which would make RPython more visible).


A bientôt,

Armin.

From anto.cuni at gmail.com  Mon Sep 13 10:20:07 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Mon, 13 Sep 2010 10:20:07 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
References: 	<173287.30107.qm@web111312.mail.gq1.yahoo.com>
	
Message-ID: <4C8DDEB7.3080006@gmail.com>

On 13/09/10 10:10, Armin Rigo wrote:
> Hi Andy,
> 
> On Mon, Sep 13, 2010 at 7:21 AM, Andy  wrote:
>> Does that mean PyPy will not work with greenlet/gevent/etc?
> 
> Sorry if I wasn't clear.  PyPy contains greenlet support (since
> 2005-6).  It's part of the same package that we call "pypy-stackless".

yes, but it must also be said that at the moment, pypy-stackless and pypy-jit
do not work together.
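
Since greenlet support depends on which PyPy build is in use, a script can
probe for it at start-up instead of assuming it. A minimal sketch, assuming a
modern interpreter where `importlib.util.find_spec` is available (far newer
than the PyPy builds discussed here); `has_module` is a hypothetical helper
name:

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found."""
    # find_spec locates a top-level module without importing it.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("greenlet"):
        print("greenlet is available on this interpreter")
    else:
        print("greenlet is not available on this interpreter")
```

This only tells you whether the module can be found, not whether the build
also has a working JIT.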

ciao,
Anto

From anto.cuni at gmail.com  Mon Sep 13 10:22:54 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Mon, 13 Sep 2010 10:22:54 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 	<4C8D30D6.10802@changemaker.nu>				
	
Message-ID: <4C8DDF5E.8060303@gmail.com>

On 13/09/10 10:14, Armin Rigo wrote:
> Hi,
> 
> On Mon, Sep 13, 2010 at 10:11 AM, Maciej Fijalkowski  wrote:
>> I don't think it's hideable.
> 
> Sorry, I wasn't clear.  I'm not really trying to hide it.  But I'm
> also not really trying to push it forward (which seems to be what
> creating a website for it would do).

well, I don't think that hiding it or pushing it backward is a good idea.  In
theory, we would like other people to start using rpython to write interpreters.

What we don't like is the use of rpython as a general-purpose language, but
that's a slightly different issue, IMHO.

ciao,
Anto

From fijall at gmail.com  Mon Sep 13 10:27:27 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 13 Sep 2010 10:27:27 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <4C8DDF5E.8060303@gmail.com>
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
	
	<4C8DDF5E.8060303@gmail.com>
Message-ID: 

On Mon, Sep 13, 2010 at 10:22 AM, Antonio Cuni  wrote:
> On 13/09/10 10:14, Armin Rigo wrote:
>> Hi,
>>
>> On Mon, Sep 13, 2010 at 10:11 AM, Maciej Fijalkowski  wrote:
>>> I don't think it's hideable.
>>
>> Sorry, I wasn't clear.  I'm not really trying to hide it.  But I'm
>> also not really trying to push it forward (which seems to be what
>> creating a website for it would do).
>
> well, I don't think that hiding it or pushing it backward is a good idea.  In
> theory, we would like if other people start to use rpython to write interpreters.
>
> What we don't like is to use rpython as a general purpose language, but that's
> a slightly different issue, IMHO.

Is it really about interpreters? (what's interpreter-specific after
all in RPython) or is it just that it's hard to use and does not
integrate with CPython well?

>
> ciao,
> Anto
>

From arigo at tunes.org  Mon Sep 13 10:40:02 2010
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 13 Sep 2010 10:40:02 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <4C8DDEB7.3080006@gmail.com>
References: 
	<173287.30107.qm@web111312.mail.gq1.yahoo.com>
	
	<4C8DDEB7.3080006@gmail.com>
Message-ID: 

Hi,

On Mon, Sep 13, 2010 at 10:20 AM, Antonio Cuni  wrote:
> yes, but it must also be said that at the moment, pypy-stackless and pypy-jit
> do not work together.

Oups, sorry. I missed the word "JIT" in the original message of this
thread :-(  Sorry for the confusion.

To answer the original question: it would be nice if someone would
show up and help contribute JIT support for Stackless builds of PyPy.
I think that the status is that none of us is ready to invest a lot
of time there, but we can definitely give pointers and get people
started and follow their progress.


A bientôt,

Armin.

From anto.cuni at gmail.com  Mon Sep 13 10:50:34 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Mon, 13 Sep 2010 10:50:34 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
	
	<4C8DDF5E.8060303@gmail.com>
	
Message-ID: <4C8DE5DA.9050909@gmail.com>

On 13/09/10 10:27, Maciej Fijalkowski wrote:

> Is it really about interpreters? (what's interpreter-specific after
> all in RPython) or is it just that it's hard to use and does not
> integrate with CPython well?

my point is that it's definitely good enough for writing interpreters. For the
rest, it's a bit unknown (in the sense that nobody has ever tried), and we
don't care about knowing :-)

From holger at merlinux.eu  Mon Sep 13 11:24:16 2010
From: holger at merlinux.eu (holger krekel)
Date: Mon, 13 Sep 2010 11:24:16 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <4C8DE5DA.9050909@gmail.com>
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
	
	<4C8DDF5E.8060303@gmail.com>
	
	<4C8DE5DA.9050909@gmail.com>
Message-ID: <20100913092416.GH32478@trillke.net>

On Mon, Sep 13, 2010 at 10:50 +0200, Antonio Cuni wrote:
> On 13/09/10 10:27, Maciej Fijalkowski wrote:
> 
> > Is it really about interpreters? (what's interpreter-specific after
> > all in RPython) or is it just that it's hard to use and does not
> > integrate with CPython well?
> 
> my point is that it's definitely good enough for writing interpreters. For the
> rest, it's a bit unknown (in the sense that nobody has ever tried), and we
> don't care about knowing :-)

People have written apps and libs in RPython at several points in its
history.  And while i find it perfectly acceptable and fine for PyPy core devs 
to not want to care for usage of RPython for non-interpreter purposes i 
am a bit tired of this ever ongoing competition of expressing dis-interest
and uttering discouraging statements. 

best,
holger

From benjamin at python.org  Mon Sep 13 17:42:07 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 13 Sep 2010 10:42:07 -0500
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
Message-ID: 

2010/9/13 Maciej Fijalkowski :
> On Mon, Sep 13, 2010 at 9:57 AM, Armin Rigo  wrote:
>> Hi,
>>
>> On Sun, Sep 12, 2010 at 9:58 PM, Bea During  wrote:
>>> Maybe we should be clear in our documentation somewhere on
>>> where we stand regarding RPython
>>
>> What about renaming it first?  There is at least one other project
>> that uses the name RPython.  What about something like InterpPy or
>> InterpPython to make it clear that it's supposed to be used to write
>> interpreters?  It doesn't sound terrific but I don't really care --
>> so, comments welcome, but please no infinite discussion on the pros
>> and cons of various names.
>>
>
> While we're at it, how about splitting the translation toolchain from
> pypy interpreter? I don't mean on technical merits, it can still be
> the same or mostly the same source codebase, but more on the
> conceptual level, to have 2 different websites names etc.

-0. We don't need more websites/trees to maintain. Anyway, it's not
clear to me where the split would be, since the translator and the
python interpreter are very interdependent.

-- 
Regards,
Benjamin

From angelflow at yahoo.com  Mon Sep 13 18:18:15 2010
From: angelflow at yahoo.com (Andy)
Date: Mon, 13 Sep 2010 09:18:15 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
Message-ID: <122423.28497.qm@web111302.mail.gq1.yahoo.com>


--- On Mon, 9/13/10, Armin Rigo  wrote:

> > yes, but it must also be said that at the moment,
> pypy-stackless and pypy-jit
> > do not work together.
> 
> Oups, sorry. I missed the word "JIT" in the original
> message of this
> thread :-(  Sorry for the confusion.
> 
> To answer the original question: it would be nice if
> someone would
> show up and help contribute JIT support for Stackless
> builds of PyPy.
> I think that the status is that no-one of us is ready to
> invest a lot

OK let me make sure I got it right:

PyPy-JIT does not work with pypy-stackless. I'm mostly interested in greenlet, not stackless python. Is pypy-stackless required for greenlet support?

Looks like you're saying PyPy-JIT doesn't support greenlet and there's no plan to do so, correct?






      

From cfbolz at gmx.de  Mon Sep 13 19:15:38 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Mon, 13 Sep 2010 19:15:38 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <122423.28497.qm@web111302.mail.gq1.yahoo.com>
References: <122423.28497.qm@web111302.mail.gq1.yahoo.com>
Message-ID: <4C8E5C3A.7080302@gmx.de>

Hi Andy,

On 09/13/2010 06:18 PM, Andy wrote:
> OK let me make sure I got it right:
>
> PyPy-JIT does not work with pypy-stackless. I'm mostly interested in
> greenlet, not stackless python. Is pypy-stackless required for
> greenlet support?
>
> Looks like you're saying PyPy-JIT doesn't support greenlet

That's correct.

> there's no plan to do so, correct?

I think it's more a case of "no manpower". If somebody is interested in 
implementing it and shows up in the channel, we can give help. We have 
currently no time to do it ourselves.

Cheers,

Carl Friedrich


From sarvi at yahoo.com  Tue Sep 14 01:10:59 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Mon, 13 Sep 2010 16:10:59 -0700 (PDT)
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: 
References: 
Message-ID: <157589.10631.qm@web53701.mail.re2.yahoo.com>

No, I didn't create the mailing list. Possibly it was Hart's Antler who did.

And the money to PyPy had no strings attached.
I really hope PyPy replaces CPython as the standard python sooner, rather than 
later.

True, y'all are not interested in standardizing an implicitly static subset of
Python that can be used to create compiled executables or Python extension
libraries.

But if PyPy gains momentum, I am pretty sure this idea will gain momentum 
eventually.

Keep up the good work.

Sarvi




----- Original Message ----
> From: Armin Rigo 
> To: pypy-dev at codespeak.net
> Sent: Sat, September 11, 2010 7:57:41 AM
> Subject: [pypy-dev] External RPython mailing list
> 
> Hi,
> 
> To anyone interested, Sarvi(?) created an RPython mailing list  (Thanks
> Bea for spotting this):
> 
>      http://pyppet.blogspot.com/2010/09/rpython-mailing-list.html
> 
> The  following paragraph should have been posted as a comment to that
> blog post,  but it doesn't record my post no matter how much I try, so
> I'll put it  here:
> 
> """
> Ah, sorry about the money issue.  I didn't realize that  you already
> sent it to us; I misunderstood that you would not send it at all  after
> we told you that we don't have resources and motivation to  make
> RPython more user-friendly (even with $200).  Now I suppose that  we
> can arrange for you to get the money back if you like, or else  thank
> you properly for it if it's ours to keep anyway :-)
> """
> 
> About  the non-money issue, I end up looking like the bad guy.  I
> suppose I  should not have tried to say and repeat "no" so many times
> in the previous  thread in increasingly bad tones; now Sarvi points
> only to my most negative  e-mail.
> 
> 
> A bientôt,
> 
> Armin
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
> 


      

From sarvi at yahoo.com  Tue Sep 14 01:36:07 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Mon, 13 Sep 2010 16:36:07 -0700 (PDT)
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <4C8DE5DA.9050909@gmail.com>
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
	
	<4C8DDF5E.8060303@gmail.com>
	
	<4C8DE5DA.9050909@gmail.com>
Message-ID: <328327.91624.qm@web53705.mail.re2.yahoo.com>

I believe Hart's Antler has done quite a bit of work in RPython.

Yeah, that said, more work to do there.

My point in the RPython thread is that I believe there is an implicitly
static subset of Python that can be compiled into standalone executables and
DLLs without needing a JIT or a VM.
It can serve 2 purposes:
   1. Make standalone executables just like C/C++ code.
   2. Write Python extension modules that can be compiled into shared DLL
modules for CPython and PyPy
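
To make "implicitly static" concrete, here is a minimal sketch in ordinary
Python. The function names are hypothetical, and the comments describe what a
type-inferring translator would likely accept or reject; they are an
illustration, not a statement of RPython's exact rules.

```python
def mean(values):
    # `total` is always a float and `values` is used as a list of floats,
    # so every variable keeps a single type throughout the function.
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def describe(n):
    # Valid Python, but `result` is an int on one path and a str on the
    # other, so no single static type can be inferred for it.
    if n > 0:
        result = n
    else:
        result = "non-positive"
    return result
```

Both functions run fine on a normal interpreter; only the first one stays
within a subset where types can be inferred without annotations.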

Looking through the various threads on PyPy, Shedskin and Cython, I believe it's
just a matter of time.

Sarvi



  

----- Original Message ----
> From: Antonio Cuni 
> To: Maciej Fijalkowski 
> Cc: pypy-dev at codespeak.net; Armin Rigo 
> Sent: Mon, September 13, 2010 1:50:34 AM
> Subject: Re: [pypy-dev] External RPython mailing list
> 
> On 13/09/10 10:27, Maciej Fijalkowski wrote:
> 
> > Is it really about  interpreters? (what's interpreter-specific after
> > all in RPython) or is  it just that it's hard to use and does not
> > integrate with CPython  well?
> 
> my point is that it's definitely good enough for writing interpreters. For the
> rest, it's a bit unknown (in the sense that nobody has ever tried), and we
> don't care about knowing :-)
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
> 


      

From fijall at gmail.com  Tue Sep 14 11:39:30 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 14 Sep 2010 11:39:30 +0200
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <328327.91624.qm@web53705.mail.re2.yahoo.com>
References: 
	<4C8D30D6.10802@changemaker.nu>
	
	
	
	
	
	<4C8DDF5E.8060303@gmail.com>
	
	<4C8DE5DA.9050909@gmail.com>
	<328327.91624.qm@web53705.mail.re2.yahoo.com>
Message-ID: 

Hi.

Speaking from a personal perspective here, I would help people write
standalone executables using RPython. This has been tried (even with
success) for small examples and works. However, in most places where
it was tried, it was an ill-chosen tool for the purpose (slight
Python optimizations or using the JIT would have worked equally well), and
RPython sometimes has bizarre limitations that we're not willing to
work on.

The second part (writing Python extensions) I think is not a very good
target, but it can be done. However, I don't want people telling me that
I should work on it. If some people want to implement this, they can
get my help.

I think this defines a rough outline of where I'm (personally) willing to
help or not to help people using RPython.

On Tue, Sep 14, 2010 at 1:36 AM, Saravanan Shanmugham  wrote:
> I believe Hart's Antler has done quite bit of work in RPython.
>
> Yeah, that said, more work to do there.
>
> My goal with my RPython thread, is that I believe that there is an implicitly
> static subset of Python that can be compiled into standalone executables and
> DLLs without needing JIT or VMs.
> Can serve 2 purposes.
>    1. Make standalone executables just like C/C++ code.
>    2. Write Python Extension modules that can be compiled into shared DLL
> modules for CPython and PyPy
>
> Looking through the various threads on PyPy, Shedskin and Cython, I believe its
> just a matter of time.
>
> Sarvi
>
>
>
>
>
> ----- Original Message ----
>> From: Antonio Cuni 
>> To: Maciej Fijalkowski 
>> Cc: pypy-dev at codespeak.net; Armin Rigo 
>> Sent: Mon, September 13, 2010 1:50:34 AM
>> Subject: Re: [pypy-dev] External RPython mailing list
>>
>> On 13/09/10 10:27, Maciej Fijalkowski wrote:
>>
>> > Is it really about interpreters? (what's interpreter-specific after
>> > all in RPython) or is it just that it's hard to use and does not
>> > integrate with CPython well?
>>
>> my point is that it's definitely good enough for writing interpreters. For the
>> rest, it's a bit unknown (in the sense that nobody has ever tried), and we
>> don't care about knowing :-)
>> _______________________________________________
>> pypy-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/pypy-dev
>>
>
>
>
>

From sarvi at yahoo.com  Tue Sep 14 21:32:27 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Tue, 14 Sep 2010 12:32:27 -0700 (PDT)
Subject: [pypy-dev] PyPy to generate C/C++ code
Message-ID: <553714.16055.qm@web53703.mail.re2.yahoo.com>

To be very clear, this is not a question about PyPy's RPython itself. :-))

But I had another thought and wanted to run it by the PyPy team.


As I understand it, PyPy is foremost a language development framework.
It is about implementing the Python interpreter in RPython, plus
additional hints to assist in JIT generation.

If the Python language implementation in RPython has enough
information to create a Python interpreter and do JIT compilation,
I am thinking it should have enough information to generate C/C++ code,
the kind that Shedskin has under shedskin/lib/.

Basically: port the type inference engine from Shedskin over to PyPy and use the
bulk of Shedskin's C++ code, but use the PyPy language
framework to implement the Python compiler that Shedskin implements.


In other words, can PyPy be the language framework that Shedskin is
implemented in or ported onto?

Looking at Shedskin and PyPy, do y'all have a rough feel for how difficult this would
be?

Why the question?
I am planning to fund some prize money for an undergraduate or graduate school project back
in India and am looking for ideas.

This means we would be able to motivate a team of 2-5 smart young engineers for
about 6 months to do something interesting for them but beneficial for the
Python community.


One area I am obviously looking at is compiling Python code.


I was thinking the project could be to
   1. take the C++ code under shedskin/lib as is
   2. have them implement/port the Shedskin type inference engine onto the PyPy
framework and create a PyPy backend that generates C++ code
that uses shedskin/lib


What would y'all think of such an idea?
Estimates? Feasibility?
Do you see any benefits to this work for Shedskin or PyPy or both?

Sarvi 



      

From benjamin at python.org  Tue Sep 14 23:26:31 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 14 Sep 2010 16:26:31 -0500
Subject: [pypy-dev] PyPy to generate C/C++ code
In-Reply-To: <553714.16055.qm@web53703.mail.re2.yahoo.com>
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
Message-ID: 

2010/9/14 Saravanan Shanmugham :
> To be very clear this is not a question on PyPY RPython itself. :-))
>
> But I had another thought and wanted to run it by PyPy team.
>
>
> As I understand it PyPy is foremost a language development framework.
> It is about implementing the python interpreter in RPython, plus
> additional hints to assist in JIT generation.
>
> If the Python language implementation in RPython has enough
> information to create a python interpreter and do JIT compilation.
> I am thinking it should have enough information to generate C/C++ code.

Creating a JIT compiler is completely different from statically
compiling code. In a JIT, you use runtime information to optimize the
code. You can't do anything about this in C.



-- 
Regards,
Benjamin

From sarvi at yahoo.com  Wed Sep 15 00:15:36 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Tue, 14 Sep 2010 15:15:36 -0700 (PDT)
Subject: [pypy-dev] PyPy to generate C/C++ code
In-Reply-To: 
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
	
Message-ID: <708334.39926.qm@web53705.mail.re2.yahoo.com>

I don't expect this Python compiler to handle full Python, just a restricted, 
statically typed subset of Python as defined by Shedskin.

Yes, JIT annotations may not serve the purpose of generating a compiler.
Hence the porting of the type inference engine, and maybe using the JIT 
annotations where they can be used. 

Sarvi



----- Original Message ----
> From: Benjamin Peterson 
> To: Saravanan Shanmugham 
> Cc: pypy-dev at codespeak.net
> Sent: Tue, September 14, 2010 2:26:31 PM
> Subject: Re: [pypy-dev] PyPy to generate C/C++ code
> 
> 2010/9/14 Saravanan Shanmugham :
> > To be very clear this is not a question on PyPY RPython itself. :-))
> >
> > But I had another thought and wanted to run it by PyPy team.
> >
> >
> > As I understand it PyPy is foremost a language development framework.
> > It is about implementing the python interpreter in RPython, plus
> > additional hints to assist in JIT generation.
> >
> > If the Python language implementation in RPython has enough
> > information to create a python interpreter and do JIT compilation.
> > I am thinking it should have enough information to generate C/C++ code.
> 
> Creating a JIT compiler is completely different from statically
> compiling code. In a JIT, you use runtime information to optimize the
> code. You can't do anything about this in C.
> 
> 
> 
> -- 
> Regards,
> Benjamin
> 


      

From benjamin at python.org  Wed Sep 15 00:19:36 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 14 Sep 2010 17:19:36 -0500
Subject: [pypy-dev] Fwd:  PyPy to generate C/C++ code
In-Reply-To: 
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
	
	<708334.39926.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: 

---------- Forwarded message ----------
From: Benjamin Peterson 
Date: 2010/9/14
Subject: Re: [pypy-dev] PyPy to generate C/C++ code
To: Saravanan Shanmugham 


2010/9/14 Saravanan Shanmugham :
> I don't expect this python compiler to be for full python but just a Restricted
> statically typed subset of python as defined by Shedskin.

So how is that any different than the existing RPython?



--
Regards,
Benjamin



-- 
Regards,
Benjamin

From sarvi at yahoo.com  Wed Sep 15 00:45:11 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Tue, 14 Sep 2010 15:45:11 -0700 (PDT)
Subject: [pypy-dev] Fwd:  PyPy to generate C/C++ code
In-Reply-To: 
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
	
	<708334.39926.qm@web53705.mail.re2.yahoo.com>
	
	
Message-ID: <313654.95430.qm@web53701.mail.re2.yahoo.com>

This gives us an opportunity to come up with a more general-purpose definition of 
RPython.

Based on previous threads, RPython as currently defined in PyPy is not up for 
general use,
and the team seems to have no interest in exploring this RPython for general 
use.
And to be honest, I can see to some extent why: 
it keeps RPython simple and domain-specific for the purpose of language definition, 
which is PyPy's goal.

So my intent is instead to go for a more general-purpose RPython, by starting with 
Shedskin's definition of restricted Python, which is pretty close to PyPy's 
RPython.
Refer to Hart's comparison 
here: http://groups.google.com/group/shedskin-discuss/browse_thread/thread/a8b473f0b4b52217


Why port Shedskin onto PyPy? 
   1. See if PyPy as a language definition framework can be used to generate 
compilers and not just interpreters.
   2. Leverage the C++ code under shedskin/lib to quickly get to that goal. 
   3. I am thinking that bringing them together will bring more interest and momentum 
to PyPy and, more importantly, to a general-purpose restricted Python compiler.

Sarvi 


----- Original Message ----
> From: Benjamin Peterson 
> To: PyPy Dev 
> Sent: Tue, September 14, 2010 3:19:36 PM
> Subject: [pypy-dev] Fwd:  PyPy to generate C/C++ code
> 
> ---------- Forwarded message ----------
> From: Benjamin Peterson 
> Date: 2010/9/14
> Subject: Re: [pypy-dev] PyPy to generate C/C++ code
> To: Saravanan Shanmugham 
> 
> 
> 2010/9/14 Saravanan Shanmugham :
> > I don't expect this python compiler to be for full python but just a Restricted
> > statically typed subset of python as defined by Shedskin.
> 
> So how is that any different than the existing RPython?
> 
> 
> 
> --
> Regards,
> Benjamin
> 
> 
> 
> -- 
> Regards,
> Benjamin
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
> 


      

From amauryfa at gmail.com  Wed Sep 15 01:11:28 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 15 Sep 2010 01:11:28 +0200
Subject: [pypy-dev] PyPy to generate C/C++ code
In-Reply-To: <708334.39926.qm@web53705.mail.re2.yahoo.com>
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
	
	<708334.39926.qm@web53705.mail.re2.yahoo.com>
Message-ID: 

2010/9/15 Saravanan Shanmugham :
> I don't expect this python compiler to be for full python but just a Restricted
> statically typed subset of python as defined by Shedskin.
>
> Yes. JIT annotation may not serve the purpose of generating a compiler.
> Hence the porting of the type inference engine and may be use JIT notations if
> it can be.

I've downloaded and read the source code of Shedskin.
From what I understand, here are some differences between PyPy and Shedskin.

- Shedskin analyses and generates code directly by walking the AST of a python
module.  (there are two passes: the first to grab information about global types
and functions, the second to emit code)

- Shedskin does very little type inference. Shedskin's type system is based on
C++ templates, and once a variable's type has been determined, generic code is
emitted and the C++ compiler will select the correct implementation.  Other
inference engines also work on the AST; Logilab's pylint, for example, works
much harder to check all instructions and the type of all variables.  Shedskin
does not seem to need such power.

- On the other hand, PyPy analyzes imported modules, and works on the bytecode
of functions living in memory.  It does a complete type inference and emits
low-level C code or Java intermediate representation.

- PyPy has its own way to write generic code and templates, the language for
meta-programming is Python itself!  [I'm referring to loops that
generate classes and functions, and things like "specialize:argtype(0)",
"unrolling_iterable" combined with constant propagation].

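A rough illustration of the meta-programming point, in plain Python rather
than code from the PyPy sources (the names BINOPS, make_binop and GENERATED
are made up for this sketch): the comprehension below runs once at import
time and builds one small function per operator. Since PyPy analyzes the
functions living in memory after import, a translator would only ever see
the generated functions, not the code that built them.

```python
import operator

# Hypothetical table for this sketch: operator name -> implementation.
BINOPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def make_binop(name, func):
    # Each call builds a fresh function closed over one fixed operator,
    # so every generated function has a single, simple behaviour.
    def binop(a, b):
        return func(a, b)
    binop.__name__ = "binop_" + name
    return binop


# Ordinary Python executed at import time: this is the "template" step.
GENERATED = {name: make_binop(name, func) for name, func in BINOPS.items()}


def run(name, a, b):
    # Dispatch to one of the generated functions by name.
    return GENERATED[name](a, b)
```

For example, run("add", 2, 3) calls the generated function named binop_add.
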
In most cases, PyPy does not generate better code than Shedskin. When Shedskin
compiles code, it does it well.  And its restrictions are easier to work with;
RPython is really tricky to get right sometimes.

Of course, PyPy's goal is different: it does not only generate low-level C code,
it also generates a JIT compiler that can optimize calls at runtime - in the
context of an interpreter.  I can't see which computations made there could be
applied to static code.

Bottom line: if you want to generate efficient C code from python, use (and
improve) Shedskin.  If you want python code to run faster, don't translate
anything, and use the PyPy interpreter.

-- 
Amaury Forgeot d'Arc

From cfbolz at gmx.de  Wed Sep 15 10:05:17 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Wed, 15 Sep 2010 10:05:17 +0200
Subject: [pypy-dev] PyPy to generate C/C++ code
In-Reply-To: 
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>		<708334.39926.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: <4C907E3D.10104@gmx.de>

Hi Amaury,

On 09/15/2010 01:11 AM, Amaury Forgeot d'Arc wrote:
> 2010/9/15 Saravanan Shanmugham:
>> I don't expect this python compiler to be for full python but just a Restricted
>> statically typed subset of python as defined by Shedskin.
>>
>> Yes. JIT annotation may not serve the purpose of generating a compiler.
>> Hence the porting of the type inference engine and may be use JIT notations if
>> it can be.
>
> I've downloaded and read the source code of shedskin.
> From what I understand, here are some differences between PyPy and Shedskin.
>
> - Shedskin analyses and generates code directly by walking the AST of a python
> module.  (there are two passes: the first to grab information about global types
> and functions, the second to emit code)
>
> - Shedskin does very little type inference. Shedskin's type system is based on
> C++ templates, and once a variable's type has been determined, generic code is
> emitted and the C++ compiler will select the correct implementation.  Other
> inference engines also work on the AST; Logilab's pylint, for example, works
> much harder to check all instructions and the type of all variables.  Shedskin
> does not seem to need such power.
>
> - On the other hand, PyPy analyzes imported modules, and works on the bytecode
> of functions living in memory.  It does a complete type inference and emits
> low-level C code or Java intermediate representation.
>
> - PyPy has its own way to write generic code and templates, the language for
> meta-programming is Python itself!  [I'm referring to loops that
> generate classes and functions, and things like "specialize:argtype(0)",
> "unrolling_iterable" combined with constant propagation].
>
> In most cases, PyPy does not generate better code than Shedskin. When Shedskin
> compiles code, it does it well.  And its restrictions are easier to work with;
> RPython is really tricky to get right sometimes.

Nice analysis and description, thank you!

Carl Friedrich

From jacob at openend.se  Wed Sep 15 11:01:26 2010
From: jacob at openend.se (Jacob Hallén)
Date: Wed, 15 Sep 2010 11:01:26 +0200
Subject: [pypy-dev] PyPy to generate C/C++ code
In-Reply-To: 
References: <553714.16055.qm@web53703.mail.re2.yahoo.com>
	<708334.39926.qm@web53705.mail.re2.yahoo.com>
	
Message-ID: <201009151101.27281.jacob@openend.se>

At a higher level of abstraction, Python is a dynamic language. The dynamicity 
is what makes it slow. There are simply so many things that might occur at 
runtime that have to be taken into account in the code. The JIT is designed to 
find cases where the dynamic properties of the language are not being used in 
that particular instance of execution, and generate faster code for that bit 
of the program.
This has almost nothing in common with trying to generate C or machine code 
from a static language that superficially looks like Python. Square Peg, Round 
Hole.
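
A toy sketch of that idea in plain Python (this is not how PyPy's JIT is
implemented; generic_add and specialized_add are invented names): the
generic path has to consider every way an addition might be implemented,
while the specialized path checks a cheap guard and only falls back to the
generic path when the guard fails.

```python
def generic_add(a, b):
    # Fully dynamic path: look up how `+` is implemented for these types.
    add = getattr(type(a), "__add__", None)
    result = add(a, b) if add is not None else NotImplemented
    if result is NotImplemented:
        # Left operand could not handle it; try the reflected method.
        radd = getattr(type(b), "__radd__", None)
        result = radd(b, a) if radd is not None else NotImplemented
    if result is NotImplemented:
        raise TypeError("unsupported operand types")
    return result


def specialized_add(a, b):
    # What a JIT effectively produces after only ever observing ints:
    # a guard on the types, then the fast path.
    if type(a) is int and type(b) is int:   # guard
        return a + b                        # fast path
    return generic_add(a, b)                # guard failed: generic path
```

The specialization is only valid for one observed pattern of execution,
which is why it needs the guard; a static compiler for a restricted
language has no such runtime observations to work from.
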

Jacob Hallén

From fijall at gmail.com  Wed Sep 15 18:18:05 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 15 Sep 2010 18:18:05 +0200
Subject: [pypy-dev] [pypy-svn] r77083 - pypy/branch/jitffi
In-Reply-To: <20100915110721.C94F0282C16@codespeak.net>
References: <20100915110721.C94F0282C16@codespeak.net>
Message-ID: 

Hey anto.

There was a SoC project about that; I guess it would be good to chat about it
at least (personally I think rlib/libffi is exactly the wrong layer
to jit, and some experiments were done).

Cheers,
fijal

On Wed, Sep 15, 2010 at 1:07 PM,   wrote:
> Author: antocuni
> Date: Wed Sep 15 13:07:20 2010
> New Revision: 77083
>
> Added:
>    pypy/branch/jitffi/   (props changed)
>       - copied from r77082, pypy/trunk/
> Log:
> a branch in which to try to jit rlib/libffi.py
>
>
> _______________________________________________
> pypy-svn mailing list
> pypy-svn at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-svn
>

From bhartsho at yahoo.com  Thu Sep 16 02:44:35 2010
From: bhartsho at yahoo.com (Hart's Antler)
Date: Wed, 15 Sep 2010 17:44:35 -0700 (PDT)
Subject: [pypy-dev] External RPython mailing list
Message-ID: <463721.60230.qm@web114014.mail.gq1.yahoo.com>

Porting ShedSkin to use the PyPy translation toolchain sounds like a good idea on the surface, but it's not if we look at the details.  The first issue is legal: PyPy is MIT licensed, which works very well when integrated into commercial software.  But Shedskin uses the GNU GPL3, so importing any of its code (or code that imports GPL code, etc.) into the user's compiled program also binds it to the GPL, which is no good for commercial software.  Shedskin within the PyPy toolchain may taint the user's program with GPL code because some RPython programs will import from rlib (which may in some way depend on Mark's GPL code).

The second issue is technical: not much is gained for the likely great amount of effort it would take to merge ShedSkin.  Let's look at what is gained:
	1. mutable globals
	2. None and (int,float) are intermixable as attributes on an instance because ShedSkin has some limited support for dynamic sub-types.  (PyPy cannot mix None with int and float)
	3. operator overloading (except for __iter__ and __call__), PyPy only allows overloading __init__ and __del__

#1 would be nice to have, but using singleton instances is an easy workaround.
#2, no big advantage.
#3, this is a big advantage; I wish I could at least overload __getattr__ in PyPy Rpy.
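
A minimal sketch of the singleton workaround for #1, in plain Python. It
assumes that the translator treats module-level bindings as constants but
allows attributes of a prebuilt instance to change, which is what the
workaround relies on; State, state and bump are names made up for this
example.

```python
class State(object):
    """Holds what would otherwise be module-level mutable globals."""

    def __init__(self):
        self.counter = 0
        self.verbose = False


# The only module-level name is a single prebuilt instance; the name
# `state` itself is never rebound after this point.
state = State()


def bump():
    # Mutates an attribute of the instance instead of rebinding a global.
    state.counter += 1
    return state.counter
```

Code that would have written `global counter` reads and writes
state.counter instead.
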

ShedSkin is behind PyPy Rpy in the following areas:
	1. no getattr, hasattr etc.
	2. *args		(Mark says he can bring it back but only for homogeneous types (PyPy supports *args with non-homogeneous types))
	3. passing method references
	4. no interface to C
	5. mixed-type tuples are limited to length two (PyPy allows for any length)
	6. multiple inheritance

The ShedSkin readme itself states that "the type inference techniques employed by Shed Skin currently do not scale very well beyond several hundred lines of code" and recommends ShedSkin only for small programs.  The lack of the 6 items above is an additional reason why ShedSkin is not ideal for writing large programs; these are serious limitations for a large program, especially #4: not having an easy way to interface with C is a huge show-stopper, and PyPy Rpy has an amazingly simple way to interface with C (rffi).  The other disadvantages of ShedSkin are slow object allocation (Phil Hassey did a test showing ShedSkin 30% slower than RPython), and that it only translates to C++ while PyPy can translate to C, the JVM, and the CLI.

There are simple ways to improve PyPy Rpy that will benefit both the PyPy project and those who strictly want to use the translation toolchain.  I've been following the progress of this year's Google Summer of Code projects, and I see that a big stumbling block for everybody was RPython.  PyPy today would have a better 64-bit JIT, faster ctypes, and better numpy support if RPython itself were better.

RPython Wishlist:
	. documentation
	. iteration over tuples of any length (with any mixed types)
	. overloading __getattr__, __setattr__
	. pickle support; if it's limited, that's OK.
	. rstruct is incomplete
	. llvm backend, what happened to llvm support?
	. not having to define dummy functions on the base class to prevent 'demotion'
	. not having to use the hack `assert isinstance(a,MySubClass)` to call methods with incompatible signatures.
	. we already have the decorator: @specialize.argtype(1), why can't we have @specialize.argtype(*) so that all arguments can have flexible types?
	. methods stored in a list for easy dispatch cannot have mismatched signatures.
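
To make the two "not having to" items concrete, here is a small sketch in
plain Python (Node, Const, Neg and strip_neg are invented for this
example). In ordinary Python it runs with or without the dummy method and
the assert; the point of the wishlist is that a type-inferring translator
wants both spelled out.

```python
class Node(object):
    def evaluate(self):
        # Dummy definition on the base class: it exists only so that
        # `evaluate` is known to be available on every Node.
        raise NotImplementedError


class Const(Node):
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Neg(Node):
    def __init__(self, child):
        self.child = child

    def evaluate(self):
        return -self.child.evaluate()

    def unwrap(self):
        # A method that exists only on this subclass.
        return self.child


def strip_neg(node):
    # The "hack": an assert that narrows the type of `node` before a
    # subclass-only method is called on it.
    assert isinstance(node, Neg)
    return node.unwrap()
```

Passing anything other than a Neg to strip_neg fails at the assert rather
than at the method call.
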

I agree with Fijal: CPython module extension should be a low priority.  There is a big speed overhead when passing data back and forth from CPython, and speed is the whole point of going through the trouble of writing in RPython.  Those who are less concerned with speed and want CPython module extensions can use Cython, which is well tested.  Those who are interested in RPython and want a simple way to get started can use ShedSkin to make an extension module, and then migrate their module to a standalone app with PyPy if they choose.

-hart




From william.leslie.ttg at gmail.com  Thu Sep 16 04:10:29 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 16 Sep 2010 12:10:29 +1000
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
References: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
Message-ID: 

On 16 September 2010 10:44, Hart's Antler  wrote:
> [the GPL] is no good for commerical software.

Please don't go there.

> ... what is gained:
>        2. None and (int,float) are intermixable as attributes on an instance because ShedSkin has some limited support for dynamic-sub-types.  (PyPy can not mix None with int and float)

And it's not obvious we would want this anyway, as such type
information has to live somewhere, and having to carry that around
makes using ints and floats more expensive. In pypy-c, small applevel
ints are already tagged in the obvious way (the lowest bit of a
pointer is always unset, so setting it tags an app-level integer), but
rpython ints shouldn't have to live with the performance hit and
additional accuracy loss where tagged ints are cast to native ints.
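
A sketch of the tagging scheme described above, using Python integers to
stand in for machine words (tag_int, is_tagged_int and untag_int are names
made up here, not PyPy functions):

```python
def tag_int(n):
    # Shift left by one and set the lowest bit: the word is now odd.
    return (n << 1) | 1


def is_tagged_int(word):
    # Aligned pointers are even, so an odd word must be a tagged integer.
    return (word & 1) == 1


def untag_int(word):
    # An arithmetic right shift drops the tag bit and restores the value,
    # including for negative numbers.
    return word >> 1
```

On a 64-bit word the payload shrinks to 63 bits, which is presumably the
loss referred to above, and every use of the value pays for the check and
the shift.
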

>        3. operator overloading (except for __iter__ and __call__), PyPy only allows overloading __init__ and __del__

Makes rtyping more work. It takes long enough as it is.

> ShedSkin is behind PyPy Rpy in the following areas:
>        1. no getattr, hasattr etc.

Does rpython really have this? As it appears in, say, the main python
eval loop, it's constant folded. This is a good example of python
usage in metaprogramming rpython.
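
As an analogy only: RPython gets this effect through constant folding
(for instance with the unrolling_iterable mentioned earlier in the
thread), not through exec. The plain-Python sketch below makes the
unrolled result visible by generating a function in which every attribute
name is fixed before the function is ever called. make_copier, Point and
copy_point are hypothetical names.

```python
def make_copier(fields):
    # Build the source of a function with one plain assignment per field,
    # so no getattr/setattr call with a variable name survives in it.
    lines = ["def copy_fields(src, dst):"]
    for name in fields:
        lines.append("    dst.%s = src.%s" % (name, name))
    lines.append("    return dst")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["copy_fields"]


class Point(object):
    pass


# The field names are constants at the time the function is built.
copy_point = make_copier(("x", "y"))
```

The generated copy_point contains only the statements dst.x = src.x and
dst.y = src.y, which is roughly what is left of a getattr loop once the
names have been folded in.
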

>        6. multiple inheritance

That would limit rpython's ability to target the CLI and JVM if
implemented naively. It's doable, but means we need to use interfaces
on ootype targets and need to do more work to look up methods in the C
backend - possibly implementing hotspot-style itables or virtual
inheritance or passing a dictionary around. It could get ugly quickly,
not only for the backends, but also for the JIT generator, which
already has too much to think about in terms of the model it is
implementing.

>        . not having to define dummy functions on the base class to prevent 'demotion'

Some concept of an interface would be handy.

-- 
William Leslie

From santagada at gmail.com  Thu Sep 16 05:01:50 2010
From: santagada at gmail.com (Leonardo Santagada)
Date: Thu, 16 Sep 2010 00:01:50 -0300
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
References: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
Message-ID: 

If anyone wants to pay developers to work on rpython, they should
probably follow this wishlist and not focus on trying to merge
shedskin into pypy (for all the reasons given above).

Here is my version of the list

On Wed, Sep 15, 2010 at 9:44 PM, Hart's Antler  wrote:
> RPython Wishlist:
>        . documentation
Yes, this should be the highest priority. I could add documentation,
doctests explaining stuff, templates for new modules and other things,
better error messages.

>        . iteration over tuples of any length (with any mixed types)
>        . overloading __getattr__, __setattr__
>        . pickle support, if its limited thats ok.
Did anyone ever really need this? Would be cool I think, but unnecessary.

>        . rstruct is incomplete
>        . llvm backend, what happened to llvm support?
Also, I don't know about the utility of this for rpython itself; it would
be interesting if there was a jit backend that used llvm to better jit
everything.

>        . not having to define dummy functions on the base class to prevent 'demotion'
>        . not having to use the hack `assert isinstance(a,MySubClass)` to call methods with incompatible signatures.
Both are nice to have in a language with type inference. Without this
being explicit in the code, maybe the error messages would be even more
complex to deal with.

>        . we already have the decorator: @specialize.argtype(1), why can't we have @specialize.argtype(*) so that all arguments can have flexible types?
could be @specialize.argtype(42)

>        . methods stored in a list for easy dispatch can not have mismatched signatures.

I would add:
        . tool to automatically generate stubs for c libraries
        . c++ support (to be able to use c++ libs directly on rpython)
        . better java support (because the jvm matters).
        . better tooling (profiling, debugging etc)
        . separate compilation and/or some cache to speed up compilation.


-- 
Leonardo Santagada

From sarvi at yahoo.com  Thu Sep 16 05:03:43 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Wed, 15 Sep 2010 20:03:43 -0700 (PDT)
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
References: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
Message-ID: <119208.74110.qm@web53706.mail.re2.yahoo.com>





----- Original Message ----
> From: Hart's Antler 
> To: pypy-dev at codespeak.net
> Cc: Saravanan Shanmugham 
> Sent: Wed, September 15, 2010 5:44:35 PM
> Subject: Re: External RPython mailing list
> 
> Porting ShedSkin to use the PyPy translation toolchain on the surface sounds
> like a good idea, but it's not if we look at the details.  The first issue is
> legal, PyPy is MIT licensed, which works very well when integrated by
> commercial software.  But Shedskin uses the GNU GPL3, so importing any of its
> code (or code that imports GPL code, etc) into the user's compiled program also
> binds it to the GPL - which is no good for commercial software.  Shedskin
> within the PyPy toolchain may taint the users program with GPL code because
> some RPython programs will import from rlib (which may in someway depend on
> Mark's GPL code).

Sarvi: Good point. I hadn't noticed that the generated C++ code was GPL.
I think Mark might be open to MIT licensing the generated C++ code, because without 
it Shedskin as a tool chain would make no sense.
But then that's a different story.

Either way, it looks like there is not much enthusiasm for porting Shedskin onto 
PyPy and having PyPy generate a compiler instead of an interpreter.

From various threads on python.org as well as pypy-dev itself, I see a lot of interest 
in a compiler for a statically typed subset of Python.
I also feel that a statically typed subset of Python can be faster than the 
dynamic superset. 

Which is why I have been exploring options to spur some interest and drive some 
momentum in this area. It hasn't been easy.

I can see why some might feel that RPython is not for general use and only for 
language development. 

But what totally surprises me is this: as a language developer, I would 
want RPython to be as flexible as possible, within feasibility of course.

Anyway, I am going to keep it simple and just start exploring expanding 
PyPy's RPython compiler to make it more general purpose.

Yes, on a separate branch :-)) of course. :-)

I'll start with some of the items below.

Let's see where that can go.

Sarvi


> 
> The second issue is technical, not much is gained  for the likely great amount 
>of effort it would take to merge ShedSkin.   Lets look at what is gained:
>     1. muteable  globals
>     2. None and (int,float) are intermixable as  attributes on an instance 
>because ShedSkin has some limited support for  dynamic-sub-types.  (PyPy can not 
>mix None with int and  float)
>     3. operator overloading (except for __iter__ and  __call__), PyPy only 
>allows overloading __init__ and __del__
> 
> #1, would be  nice to have but its an easy workaround to use singleton 
>instances.
> #2, no  big advantage.
> #3, this is a big advantage, i wish i could at least overload  __getattr__ in 
>PyPy Rpy.
> 
> ShedSkin is behind PyPy Rpy in the following  areas:
>     1. no getattr, hasattr etc.
>      2. *args        (Mark says he can bring it back  but only for homogenous 
>types (PyPy supports *args with non-homogenous  types))
>     3. passing method references
>      4. no interface to C
>     5. mixed-type tuples are limited to  length two (PyPy allows for any 
>length)
>     6. multiple  inheritance
> 
> The ShedSkin readme itself states that " the type inference  techniques 
>employed by Shed Skin currently do not scale very well beyond several  hundred 
>lines of code" and recommends ShedSkin only for small programs.   Not having the 
>6 items above are additional reasons why ShedSkin is not ideal  for writting 
>large programs, these are serious limitations for a large program,  especially 
>#4 - not having a easy way to interface with C is a huge  show-stopper; PyPy Rpy 
>has an amazingly simple way to interface with C  (rffi).  The other 
>dissadvantages of ShedSkin are: slow object allocation  (Phil Hassey did a test 
>showing ShedSkin 30% slower than RPython), and it only  translates to C++ while 
>PyPy can translate to C, Java, and #C.
> 
> There are  simple ways to improve PyPy Rpy that will benifit both the PyPy 
>project and  those who strictly want to use the translation toolchain.  I've 
>been  following the progress of this years Google Summer of Code projects, and i 
>see a  big stumbling block for everybody was RPython.  PyPy today would have a  
>better 64bit JIT, faster ctypes, and better numpy support if RPython was itself  
>better.
> 
> RPython Wishlist:
>     .  documentation
>     . iteration over tuples of any length (with  any mixed types)
>     . overloading __getattr__,  __setattr__
>     . pickle support, if its limited thats  ok.
>     . rstruct is incomplete
>     . llvm  backend, what happened to llvm support?
>     . not having to  define dummy functions on the base class to prevent  
>'demotion'
>     . not having to use the hack `assert  isinstance(a,MySubClass)` to call 
>methods with incompatible  signatures.
>     . we already have the decorator:  @specialize.argtype(1), why can't we have 
>@specialize.argtype(*) so that all  arguments can have flexible types?
>     . methods stored in a  list for easy dispatch can not have mismatched 
>signatures.
> 
> I agree with Fijal: CPython module extension should be a low priority.  There 
>is a big speed overhead when passing data back and forth from CPython, and 
>speed is the whole point of going through the trouble of writing in RPython.  
>Those who are less concerned with speed and want CPython module extensions can 
>use Cython, which is well tested.  Those who are interested in RPython and want 
>a simple way to get started can use ShedSkin to make an extension module, and 
>then migrate their module to a standalone app with PyPy if they choose.
> 
> -hart
> 
> 
> 
> 
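
The `assert isinstance(a,MySubClass)` item in the wishlist above can be
illustrated with a small sketch. It is plain Python and runs untranslated;
the names (Shape, Circle, Rect, grow) are made up for illustration and are
not taken from PyPy. The assumption is the one stated in the quoted message:
when a variable is only known to hold the base class, the assert is what
tells the RPython annotator which subclass it is, so that a subclass-only
method (here with a different signature per subclass) can be called.

```python
class Shape(object):
    """Base class; note that it deliberately has no scale() method."""
    def area(self):
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # crude integer "pi" keeps the numbers exact

    def scale(self, factor):  # one-argument signature
        return Circle(self.r * factor)


class Rect(Shape):
    def __init__(self, w, h):
        self.w = w
        self.h = h

    def area(self):
        return self.w * self.h

    def scale(self, fx, fy):  # two-argument signature, incompatible with Circle.scale
        return Rect(self.w * fx, self.h * fy)


def grow(shape, is_circle):
    """Double a shape; `shape` is only known to be a Shape at this point."""
    if is_circle:
        # The "hack": narrow the type before calling the subclass method.
        assert isinstance(shape, Circle)
        return shape.scale(2)
    assert isinstance(shape, Rect)
    return shape.scale(2, 2)
```

Under CPython the asserts are just runtime checks; whether a given RPython
version accepts exactly this shape of code is not something this sketch
verifies.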


      

From william.leslie.ttg at gmail.com  Thu Sep 16 05:42:11 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 16 Sep 2010 13:42:11 +1000
Subject: [pypy-dev] External RPython mailing list
In-Reply-To: <119208.74110.qm@web53706.mail.re2.yahoo.com>
References: <463721.60230.qm@web114014.mail.gq1.yahoo.com>
	<119208.74110.qm@web53706.mail.re2.yahoo.com>
Message-ID: 

On 16 September 2010 13:03, Saravanan Shanmugham  wrote:
> Either way, it looks like there is not much enthusiasm for porting Shedskin to
> PyPy and having pypy generate a compiler instead of an interpreter.

In a sense, it already does :). And of course translation is compilation, too.

> From various threads on python.org as well as pypy itself, I see a lot of interest
> in a compiler for a statically typed subset of python.
> I also feel that a statically typed subset of python can be faster than the
> dynamic superset.

It can be, there's nothing stopping you from dynamically compiling a
static language, and feeding back profiling information is easy too.
It's just probably not going to be *that* much faster to be of value
at the end of the day. Then again, the end of the day could be a long
way away.

> I can see why some might feel that RPython is not for general use and only for
> language development.
>
> But what totally surprises me though is that as a language developer, I would
> want RPython to be as flexible as possible, within feasibility of course.

You also have to look at it from the other perspective - that of
someone implementing a backend or translation aspect, such as a
garbage collector or a JIT compiler generator. This is my perspective
coming to pypy - I am experimenting with a range of optimisations
based on extensive region and effect analysis, and I fear rpython
already makes this difficult.

For example, the use of abstract interpretation to generate the
flowgraph IR means that you now have no information about which loop
is the 'outer' one, and that information can be useful in generating
heuristics. Similar things could be said about the JIT and generators,
which is not something I have looked at extensively, but dealing with
the generator case would have been implicit from the start if the IR
used a CPS transform to represent all instruction flow. In short:
rpython is complicated enough already.

It happens to do the job it was created for, but not a whole lot more
than that. It happens to be well suited to my experiments for two
unrelated reasons*. But I can't imagine choosing it to write extension
modules or inner loops - there are plenty of languages that do it
better, like cython, pyrex, D, cyclone, SML, etc.

* it's (memory) safe and the rffi is sane, particularly about letting
native code deal with rpython objects. And thanks to the pypy python
interpreter, there's a large body of code to test it on.

-- 
William Leslie

From anto.cuni at gmail.com  Thu Sep 16 08:57:12 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 16 Sep 2010 08:57:12 +0200
Subject: [pypy-dev] [pypy-svn] r77083 - pypy/branch/jitffi
In-Reply-To: 
References: <20100915110721.C94F0282C16@codespeak.net>
	
Message-ID: <4C91BFC8.9060600@gmail.com>

Hi,

On 15/09/10 18:18, Maciej Fijalkowski wrote:
> Hey anto.
>
> There was a SoC about that, I guess it would be good to chat about it
> at least (personally I think rlib/libffi is exactly the wrong layer
> to be jitted, and some experiments were done).

Yes, I read the code in the fast-ctypes branch but I wanted to take another 
(simpler) approach.  Note that my goal is not only to speed up ctypes, but 
also to provide a useful building block for cppyy (the module to call C++ 
functions that we started at the CERN sprint).

My basic idea was to mark libffi.FuncPtr.{push_arg,call} in a special way, so 
that the backend can recognize the pattern (i.e. push* + call) and emit a 
single assembler call.
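
The push* + call idea above can be sketched in a few lines of plain Python.
This is only an illustration of the pattern matching, not the jitffi code:
the operation tuples ("push_arg", "call", "direct_call") and the function
name fuse_calls are hypothetical, and a real backend would work on JIT
operations rather than on tuples.

```python
def fuse_calls(ops):
    """Collapse each run of push_arg ops followed by a call into one op.

    `ops` is a list of tuples: ("push_arg", value), ("call", funcname),
    or any other tuple, which is passed through untouched.
    """
    fused = []
    pending = []  # arguments pushed since the last call
    for op in ops:
        kind = op[0]
        if kind == "push_arg":
            pending.append(op[1])
        elif kind == "call":
            # The whole push* + call sequence becomes a single operation.
            fused.append(("direct_call", op[1], pending))
            pending = []
        else:
            # An unrelated op interrupts the pattern: re-emit the pushes
            # unchanged rather than guessing.
            fused.extend(("push_arg", a) for a in pending)
            pending = []
            fused.append(op)
    fused.extend(("push_arg", a) for a in pending)
    return fused
```

For instance, two pushes followed by a call to a function named "add"
come out as the single tuple ("direct_call", "add", [1, 2]).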

I even started to write a bit of code, but then I realized that libffi.FuncPtr 
is not used at all, as _rawffi uses RawFuncPtr: the bad news is that 
RawFuncPtr uses a different interface, as it does not have push_arg but passes 
the arguments already packed in a list, so my easy solution above cannot work. 
  Note however that doing it at the level of FuncPtr might still be useful for 
cppyy.

Question: why does _rawffi use RawFuncPtr instead of FuncPtr? Would it be 
possible/easy/hard/whatever to switch to FuncPtr?

ciao,
Anto

From anto.cuni at gmail.com  Thu Sep 16 09:03:32 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 16 Sep 2010 09:03:32 +0200
Subject: [pypy-dev] [pypy-svn] r77101 - in pypy/trunk/pypy: jit/tl
 module/__builtin__ module/__builtin__/test module/pypyjit/test
In-Reply-To: <20100916052748.2421F282C23@codespeak.net>
References: <20100916052748.2421F282C23@codespeak.net>
Message-ID: <4C91C144.3010100@gmail.com>

Hi,

On 16/09/10 07:27, hakanardo at codespeak.net wrote:

> Log:
> Allow jit to unroll calls to max() and min() with more than one argument.
>
[cut]
> +@unroll_safe
>   @specialize.arg(2)
>   def min_max(space, args, implementation_of):
>       if implementation_of == "max":
>           compare = space.gt
>       else:
>           compare = space.lt
> +
> +    args_w = args.arguments_w
> +    if len(args_w) > 1 and not args.keywords: # Unrollable case
> +        w_max_item = None
> +        for w_item in args_w:
> +            if w_max_item is None or \
> +                   space.is_true(compare(w_item, w_max_item)):
> +                w_max_item = w_item
> +        return w_max_item
> +    else:
> +        return min_max_loop(space, args, implementation_of)


I don't think it's a good idea. What happens if I call max() over a list of 1 
million elements? We obviously don't want the jit to unroll 1 million 
iterations. Or am I missing something?

ciao,
Anto

From hakan at debian.org  Thu Sep 16 09:18:57 2010
From: hakan at debian.org (Hakan Ardo)
Date: Thu, 16 Sep 2010 09:18:57 +0200
Subject: [pypy-dev] [pypy-svn] r77101 - in pypy/trunk/pypy: jit/tl
 module/__builtin__ module/__builtin__/test module/pypyjit/test
In-Reply-To: <4C91C144.3010100@gmail.com>
References: <20100916052748.2421F282C23@codespeak.net>
	<4C91C144.3010100@gmail.com>
Message-ID: 

On Thu, Sep 16, 2010 at 9:03 AM, Antonio Cuni wrote:
>> +
>> +    args_w = args.arguments_w
>> +    if len(args_w) > 1 and not args.keywords: # Unrollable case
>> +        w_max_item = None
>> +        for w_item in args_w:
>> +            if w_max_item is None or \
>> +                   space.is_true(compare(w_item, w_max_item)):
>> +                w_max_item = w_item
>> +        return w_max_item
>> +    else:
>> +        return min_max_loop(space, args, implementation_of)
>
>
> I don't think it's a good idea. What happens if I call max() over a list of 1
> million elements? We obviously don't want the jit to unroll 1 million
> iterations. Or am I missing something?

If lst is your list, the call max(lst) has a single argument, the
list, and it will be passed to the old implementation now called
min_max_loop. However if you call max(*lst) the jit will unroll it.
But why would you do that? The idea here was to optimize the case
max(i,0) where you typically only have a few arguments. Anyway, how
about calling min_max_loop() as soon as len(args_w) > 10 to be on the
safe side?
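
The dispatch described above, including the suggested cut-off, can be
sketched in plain Python. This is a simplification and not the code from
r77101: there is no object space, keyword arguments are ignored, and the
names UNROLL_LIMIT and want_max are made up for the example. The limit of
10 is the value floated above, not a measured one.

```python
UNROLL_LIMIT = 10  # cut-off suggested in the message; not a tuned value


def min_max_loop(args, want_max):
    """Generic path: a single iterable argument, or many positional ones."""
    items = args[0] if len(args) == 1 else args
    it = iter(items)
    try:
        best = next(it)
    except StopIteration:
        raise ValueError("min()/max() arg is an empty sequence")
    for item in it:
        if (item > best) if want_max else (item < best):
            best = item
    return best


def min_max(args, want_max):
    """`args` is the tuple of positional arguments of the max()/min() call."""
    if 1 < len(args) <= UNROLL_LIMIT:
        # Short, fixed-size case such as max(i, 0): the loop a JIT would unroll.
        best = None
        for item in args:
            if best is None or ((item > best) if want_max else (item < best)):
                best = item
        return best
    # max(lst) has one argument, and max(*lst) with a long list exceeds the
    # limit, so both fall back to the ordinary loop.
    return min_max_loop(args, want_max)
```

With this shape, max(lst) never reaches the unrolled branch, and max(*lst)
only does so for at most UNROLL_LIMIT elements.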

-- 
Håkan Ardö

From arigo at tunes.org  Thu Sep 16 17:44:05 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 16 Sep 2010 17:44:05 +0200
Subject: [pypy-dev] [pypy-svn] r77101 - in pypy/trunk/pypy: jit/tl
 module/__builtin__ module/__builtin__/test module/pypyjit/test
In-Reply-To: 
References: <20100916052748.2421F282C23@codespeak.net>
	<4C91C144.3010100@gmail.com>
	
Message-ID: 

Hi,

Can you maybe make a branch, and move the checkin on the branch?
That's the kind of change that needs careful consideration...


Armin

From garyrob at me.com  Thu Sep 23 17:51:13 2010
From: garyrob at me.com (Gary Robinson)
Date: Thu, 23 Sep 2010 11:51:13 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
Message-ID: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>

Hi,

I saw the PyPy Status Blog post mentioning that there is a working asmgcc for x86_64 linux. I wonder if you could clarify the status of it a bit further. The last thing Jason Creighton wrote on the subject that I can find, from Aug 13, was: "...the bottom line is that the main goal of my GSoC was accomplished: A working 64-bit PyPy JIT. Hopefully I'll be able to complete asmgcc-64, and make the JIT even faster..."

But the new Status Blog post says " It not only includes working 64bit JIT (merged into PyPy trunk), but also a working asmgcc for x86_64 linux platform, that makes it possible to run the JIT on this architecture with our advanced garbage collectors"

So it sounds like he (or someone) DID get the Linux version of it working. Has it been merged into the trunk? Does it seem stable? You say: "Expect this to be a major selling point for the next PyPy release :-)"  Do you have an estimate of when that'll come out?

I'm looking forward to testing PyPy for some of our music recommendation code. The main thing holding me back so far is the lack of 64-bit support.

The other thing in the way is that I need to use multiple cores. I can home-grow a solution for my needs, but it would be great if the python multiprocessing library were to be supported. I see "r77223 - in	pypy/branch/fast-forward/pypy/module/_multiprocessing: . test" in the svn commit log, dated Tuesday of this week (http://permalink.gmane.org/gmane.comp.python.pypy.cvs/29865)... I'm hoping that means it's going to be supported soon? That would be really great.

Thanks,
Gary

-- 

Gary Robinson
CTO
Emergent Discovery, LLC
personal email: garyrob at me.com
work email: grobinson at emergentdiscovery.com
Company: http://www.emergentdiscovery.com
Blog:    http://www.garyrobinson.net





From alex.gaynor at gmail.com  Thu Sep 23 18:54:16 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 23 Sep 2010 12:54:16 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
Message-ID: 

Yes, 64-bit support for asmgcc was merged; however, there appears to be
a performance issue with it: it's not nearly as fast as it should be.

multiprocessing was added to the stdlib in 2.6. We have a
fast-forward branch that's aiming to implement 2.7, so when it's
released it will contain a multiprocessing module.

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From garyrob at me.com  Thu Sep 23 23:28:23 2010
From: garyrob at me.com (Gary Robinson)
Date: Thu, 23 Sep 2010 17:28:23 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
Message-ID: <885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>

Thanks for your response Alex... I have a couple follow-up questions:

> Yes, 64-bit support for asmgcc was merged; however, there appears to be
> a performance issue with it: it's not nearly as fast as it should be.

Is this a matter that is getting PyPy developer attention, or is expected to in the relatively near future?

> multiprocessing was added to the stdlib in 2.6. We have a
> fast-forward branch that's aiming to implement 2.7, so when it's
> released it will contain a multiprocessing module.

That's great news. Is there any estimate of when a fairly stable beta will be available?

Thanks!
Gary

-- 

Gary Robinson
CTO
Emergent Discovery, LLC
personal email: garyrob at me.com
work email: grobinson at emergentdiscovery.com
Company: http://www.emergentdiscovery.com
Blog:    http://www.garyrobinson.net






From alex.gaynor at gmail.com  Thu Sep 23 23:34:54 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 23 Sep 2010 17:34:54 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: <885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
Message-ID: 

On Thu, Sep 23, 2010 at 5:28 PM, Gary Robinson  wrote:
> Thanks for your response Alex... I have a couple follow-up questions:
>
>> Yes, 64-bit support for asmgcc was merged; however, there appears to be
>> a performance issue with it: it's not nearly as fast as it should be.
>
> Is this a matter that is getting PyPy developer attention, or is expected to in the relatively near future?
>

We're aware of it, and it will definitely be fixed before we do any sort
of release.

>> multiprocessing was added to the stdlib in 2.6. We have a
>> fast-forward branch that's aiming to implement 2.7, so when it's
>> released it will contain a multiprocessing module.
>
> That's great news. Is there any estimate of when a fairly stable beta will be available?
>

Amaury or Benjamin could better say.

Alex

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From benjamin at python.org  Thu Sep 23 23:55:55 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 23 Sep 2010 16:55:55 -0500
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
Message-ID: 

2010/9/23 Alex Gaynor :
> On Thu, Sep 23, 2010 at 5:28 PM, Gary Robinson  wrote:
>> Thanks for your response Alex... I have a couple follow-up questions:
>>
>>> Yes, 64-bit support for asmgcc was merged; however, there appears to be
>>> a performance issue with it: it's not nearly as fast as it should be.
>>
>> Is this a matter that is getting PyPy developer attention, or is expected to in the relatively near future?
>>
>
> We're aware of it, and it will definitely be fixed before we do any sort
> of release.
>
>>> multiprocessing was added to the stdlib in 2.6. We have a
>>> fast-forward branch that's aiming to implement 2.7, so when it's
>>> released it will contain a multiprocessing module.
>>
>> That's great news. Is there any estimate of when a fairly stable beta will be available?
>>
>
> Amaury or Benjamin could better say.

"Longer than you want"


-- 
Regards,
Benjamin

From alex.gaynor at gmail.com  Fri Sep 24 04:36:16 2010
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Thu, 23 Sep 2010 22:36:16 -0400
Subject: [pypy-dev] PyCon Call for Proposals -- PyCon 2011
Message-ID: 

Call for proposals -- PyCon 2011 -- 
===============================================================

Proposal Due date: November 1st, 2010

PyCon is back! With a rocking new website, a great location and
more Python hackers and luminaries under one roof than you could
possibly shake a stick at. We've also added an "Extreme" talk
track this year - no introduction, no fluff - only the pure
technical meat!

PyCon 2011 will be held March 9th through the 17th, 2011 in Atlanta,
Georgia. (Home of some of the best southern food you can possibly
find on Earth!) The PyCon conference days will be March 11-13,
preceded by two tutorial days (March 9-10), and followed by four
days of development sprints (March 14-17).

PyCon 2011 is looking for proposals for the formal presentation
tracks (this includes "extreme talks"). A request for proposals for
poster sessions and tutorials will come separately.

Want to showcase your skills as a Python Hacker? Want to have
hundreds of people see your talk on the subject of your choice? Have
some hot button issue you think the community needs to address, or have
some package, code or project you simply love talking about? Want to
launch your master plan to take over the world with Python?

PyCon is your platform for getting the word out and teaching something
new to hundreds of people, face to face.

In the past, PyCon has had a broad range of presentations, from reports
on academic and commercial projects, tutorials on a broad range of
subjects, and case studies. All conference speakers are volunteers and
come from a myriad of backgrounds: some are new speakers, some have been
speaking for years. Everyone is welcome, so bring your passion and your
code! We've had some incredible past PyCons, and we're looking to you to
help us top them!

Online proposal submission is open now! Proposals  will be accepted
through November 10th, with acceptance notifications coming out by
January 20th. To get started, please see:

   

For videos of talks from previous years - check out:

   

For more information on "Extreme Talks" see:

   

We look forward to seeing you in Atlanta!

Please also note - registration for PyCon 2011 will also be capped at a
maximum of 1,500 delegates, including speakers. When registration opens
(soon), you're going to want to make sure you register early! Speakers
with accepted talks will have a guaranteed slot.

Important Dates:
   * November 1st, 2010: Talk proposals due.
   * December 15th, 2010: Acceptance emails sent.
   * January 19th, 2011: Early bird registration closes.
   * March 9-10th, 2011: Tutorial days at PyCon.
   * March 11-13th, 2011: PyCon main conference.
   * March 14-17th, 2011: PyCon sprints days.

Contact Emails:
   Van Lindberg (Conference Chair) - van at python.org
   Jesse Noller (Co-Chair) - jnoller at python.org
   PyCon Organizers list: pycon-organizers at python.org

-- 
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me

From garyrob at me.com  Fri Sep 24 18:40:22 2010
From: garyrob at me.com (Gary Robinson)
Date: Fri, 24 Sep 2010 12:40:22 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
	
Message-ID: 



-- 

Gary Robinson
CTO
Emergent Discovery, LLC
personal email: garyrob at me.com
work email: grobinson at emergentdiscovery.com
Company: http://www.emergentdiscovery.com
Blog:    http://www.garyrobinson.net


> 
> "Longer than you want"

Ain't that the truth! :)

And yet, if you felt like one of the following blocks of time was more likely than the others, I'd be very interested in knowing:

   < 6 months
   < 1 year
   < 2 years

Or if you'd rather not say anything even at that level, I understand.

Thanks!
Gary



From amauryfa at gmail.com  Fri Sep 24 18:56:15 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Fri, 24 Sep 2010 18:56:15 +0200
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
	
	
Message-ID: 

Hi,
2010/9/24 Gary Robinson :
>> "Longer than you want"
>
> Ain't that the truth! :)
>
> And yet, if you felt like one of the following blocks of time was more likely than the others, I'd be very interested in knowing:
>
>   < 6 months
>   < 1 year
>   < 2 years

I will certainly give up before this delay.
At the moment, 1/3 of the files in the test suite pass without error,
they were zero yesterday.
We need volunteers to help us and implement the failing/missing parts!

-- 
Amaury Forgeot d'Arc

From garyrob at me.com  Fri Sep 24 19:16:01 2010
From: garyrob at me.com (Gary Robinson)
Date: Fri, 24 Sep 2010 13:16:01 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
	
	
	
Message-ID: 

> I will certainly give up before this delay.
> At the moment, 1/3 of the files in the test suite pass without error,
> they were zero yesterday.
> We need volunteers to help us and implement the failing/missing parts!

0 to 1/3 is big progress. Sounds to me like the <6 mo time frame might be the right one.

I unfortunately can't contribute code/patches at this point (I don't have the time-freedom or competence in JITs or C++ to try), but when you have 64-bit code that supports the multiprocessing module ready for end-user testing, I'll be eager to install it and try it against my code, and of course I'll let you know of any problems. Sounds like it's not quite there yet, but I'll continue to follow developments, perhaps popping up here sometimes to ask again about progress. I'm excited about the work on 64-bit and multiprocessing...

Best,
Gary

-- 

Gary Robinson
CTO
Emergent Discovery, LLC
personal email: garyrob at me.com
work email: grobinson at emergentdiscovery.com
Company: http://www.emergentdiscovery.com
Blog:    http://www.garyrobinson.net





From santagada at gmail.com  Fri Sep 24 20:29:14 2010
From: santagada at gmail.com (Leonardo Santagada)
Date: Fri, 24 Sep 2010 15:29:14 -0300
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
	
	
	
	
Message-ID: 

On Fri, Sep 24, 2010 at 2:16 PM, Gary Robinson  wrote:
> I unfortunately can't contribute code/patches at this point (I don't have the time-freedom or competence in JITs or C++ to try),

Just to be clear, the pypy python interpreter is written in RPython,
and for most of the stuff that is missing on the fast-forward branch
you don't need to know anything about how the JIT works. It's a pain to
learn RPython, but just to be clear on what you need to know.

Are there any tasks that you can do with pure python?


-- 
Leonardo Santagada

From garyrob at me.com  Fri Sep 24 20:43:56 2010
From: garyrob at me.com (Gary Robinson)
Date: Fri, 24 Sep 2010 14:43:56 -0400
Subject: [pypy-dev] Readiness of asmgcc for x86_64 linux?
In-Reply-To: 
References: <740FA798-2388-47B2-BF54-5F274DED752F@me.com>
	
	<885F1724-7CBB-4EF9-BBCE-0A81DA4A8A7D@me.com>
	
	
	
	
	
	
Message-ID: <3A450FE3-CDF4-428C-A8ED-AEDE92FA83FA@me.com>

> Just to be clear, the pypy python interpreter is written in RPython,
> and for most of the stuff that is missing on the fast-forward branch
> you don't need to know anything about how the JIT works. It's a pain to
> learn RPython, but I just want to be clear about what you need to know.
> 
> Are there any tasks that you can do with pure python?

I would love to do it if I had the time-freedom. The information you give above tells me that I probably do have the skills -- I've written a ton of python code over the last 12 years or so. But I'm working to the point that I'm depriving my family already, because my company needs my full attention now. There will come a time when I'll be able to do it, and I suspect PyPy will still be able to benefit from code and patch contributors then. But, of course, that's easy to say. The point is that PyPy could use more contributors right now, and I, unfortunately, can't be among them.

But I will be an active tester when it's ready for end-user testing.

-- 

Gary Robinson
CTO
Emergent Discovery, LLC
personal email: garyrob at me.com
work email: grobinson at emergentdiscovery.com
Company: http://www.emergentdiscovery.com
Blog:    http://www.garyrobinson.net






From horace3d at gmail.com  Sat Sep 25 17:47:20 2010
From: horace3d at gmail.com (horace grant)
Date: Sat, 25 Sep 2010 17:47:20 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
Message-ID: 

i just had a (probably) silly idea. :)

if some people like rpython so much, how about writing a rpython
interpreter in rpython? wouldn't it be much easier for the jit to
optimize rpython code? couldn't jitted rpython code theoretically be
as fast as a program that got compiled to c from rpython?

hm... but i wonder if this would make sense at all. maybe if you ran
rpython code with pypy-c-jit, it already could be jitted as well as
with a special rpython interpreter? ...if there were a special rpython
interpreter, would the current jit generator have to be changed to
take advantage of the more simple language?

just curious...


On Tue, Sep 7, 2010 at 11:07 AM, Stefan Behnel  wrote:
> Armin Rigo, 07.09.2010 10:57:
>> On Mon, Sep 6, 2010 at 8:27 PM, Saravanan Shanmugham
>>> Is there a wish list of RPython enhancements somewhere that the
>>> PyPy team might be considering?
>>> Stuff that would benefit RPython users in general.
>>
>> Again, feel free to make a fork or a branch of PyPy and try to develop
>> a version of RPython that is more suited to writing general programs
>> in.
>
> In that case, I suggest working on Shedskin or Cython instead.
>
> Stefan
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

From william.leslie.ttg at gmail.com  Sun Sep 26 01:45:37 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Sun, 26 Sep 2010 09:45:37 +1000
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
Message-ID: 

The current JIT generator creates a tracing jit, which gives a very different
performance profile from static compilation. For tight loops etc. this might be
OK, but it might be different for the specific use case people are interested
in (I admit I still don't know what that is).


From list-sink at trainedmonkeystudios.org  Sun Sep 26 23:28:12 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Sun, 26 Sep 2010 14:28:12 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
Message-ID: <1285536492.8752.28.camel@localhost>

On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
> i just had a (probably) silly idea. :)
> 
> if some people like rpython so much, how about writing a rpython
> interpreter in rpython? wouldn't it be much easier for the jit to
> optimize rpython code? couldn't jitted rpython code theoretically be
> as fast as a program that got compiled to c from rpython?
>
> hm... but i wonder if this would make sense at all. maybe if you ran
> rpython code with pypy-c-jit, it already could be jitted as well as
> with a special rpython interpreter? ...if there were a special rpython
> interpreter, would the current jit generator have to be changed to
> take advantage of the more simple language?

An excellent question at least.  

A better idea, I think, would be to ask what subset of full-python will
jit well.  What I'd really like to see is a static analyzer that can
display (e.g. by coloring names or lines) how "jit friendly" a piece of
python code is.  This would allow a programmer to get an idea of what
help the jit is going to be when running their code and, hopefully, help
people avoid tragic performance results.  Naturally, for performance
intensive code, you would still need to profile, but for a lot of uses,
simply not having catastrophically bad performance is more than enough
for a good user experience.  

With such a tool, it wouldn't really matter if the answer to "what is
faster" is RPython -- it would be whatever python language subset
happens to work well in a particular case.  I've started working on
something like this [1], but given that I'm doing a startup, I don't
have nearly the time I would need to make this useful in the near-term.

-Terrence

[1] http://github.com/terrence2/melano





From sarvi at yahoo.com  Mon Sep 27 08:57:16 2010
From: sarvi at yahoo.com (Saravanan Shanmugham)
Date: Sun, 26 Sep 2010 23:57:16 -0700 (PDT)
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285536492.8752.28.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
	<1285536492.8752.28.camel@localhost>
Message-ID: <827221.6704.qm@web53704.mail.re2.yahoo.com>

Well, I am happy to see that my interest in a general purpose RPython is not
as isolated as I was led to believe :-))
Thx,

Sarvi



From ipc at srand.net  Mon Sep 27 13:49:11 2010
From: ipc at srand.net (Ian P. Cooke)
Date: Mon, 27 Sep 2010 06:49:11 -0500
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
Message-ID: <4CA084B7.3030106@srand.net>


There was a recent thread with the same subject and I would like to look 
into this a bit more.
I knew pypy-stackless wouldn't work after I built a working 64-bit pypy 
w/ JIT, well, now I'm intrigued.

I will look at the code more closely soon.  Armin, Carl Friedrich, would
you answer a couple of questions in the meantime?

What is the largest roadblock to making pypy-stackless work on pypy w/ JIT?
Would it be possible/easier to port the greenlet module?

Having built-in support for co-routines would be very nice but my own 
goal is to get greenlet working in any manner.
If I could build a 64-bit pypy w/ JIT and then easy_install greenlet, 
that would work for me.

Thanks,
     Ian

P.S. congratulations on all your recent progress!  I always look forward
to the next pypy blog update :)

From fijall at gmail.com  Mon Sep 27 14:23:14 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 27 Sep 2010 14:23:14 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <4CA084B7.3030106@srand.net>
References: <4CA084B7.3030106@srand.net>
Message-ID: 

Hey.

The greenlet C module is quite incompatible with pypy and won't work.
However, making pypy work with both the jit and stackless requires only
a bit of work (mostly teaching the jit how to unroll the stack), and I
plan to look into it in the very near future.

Cheers,
fijal


From angelflow at yahoo.com  Mon Sep 27 14:29:49 2010
From: angelflow at yahoo.com (Andy)
Date: Mon, 27 Sep 2010 05:29:49 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
Message-ID: <187982.37806.qm@web111302.mail.gq1.yahoo.com>

Why wouldn't pypy work with greenlet but would work with Stackless? greenlet calls itself a spin-off of Stackless. Isn't greenlet a subset of Stackless without the scheduling? Could you explain a bit more?

Thanks.


From fijall at gmail.com  Mon Sep 27 14:30:50 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 27 Sep 2010 14:30:50 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <187982.37806.qm@web111302.mail.gq1.yahoo.com>
References: 
	<187982.37806.qm@web111302.mail.gq1.yahoo.com>
Message-ID: 

PyPy stackless does support greenlets. PyPy would not work with the
CPython C module called "greenlet".


From list-sink at trainedmonkeystudios.org  Mon Sep 27 21:44:51 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Mon, 27 Sep 2010 12:44:51 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <827221.6704.qm@web53704.mail.re2.yahoo.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
	<1285536492.8752.28.camel@localhost>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
Message-ID: <1285616691.5954.34.camel@localhost>

On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> Well, I am happy to see that my interest in a general purpose RPython is not
> as isolated as I was led to believe :-))
> Thx,

What I wrote has apparently been widely misunderstood, so let me explain
what I mean in more detail.  What I want is _not_ RPython and it is
_not_ Shedskin.  What I want is not a compiler at all.  What I want is a
visual tool, for example, a plugin to an IDE.  This tool would perform
static analysis on a piece of python code.  Instead of generating code
with this information, it would mark up the python code in the text
display with colors, weights, etc in order to show properties from the
static analysis.  This would be something like semantic highlighting, as
opposed to syntax highlighting.  

I think it possible that this information would, if created and
presented in the correct way, represent the sort of optimizations that
pypy-c-jit -- a full python implementation, not a language subset --
would likely perform on the code if run.  Given this sort of feedback,
it would be much easier for a python coder to write code that works well
with the jit: for example, moving a declaration inside a loop to avoid
boxing, based on the information presented.
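
A minimal sketch of the kind of static markup described above, assuming a
deliberately crude heuristic: calls to a few highly dynamic builtins inside
a loop get flagged. The names DYNAMIC_CALLS, LoopCallVisitor, annotate and
render are hypothetical, and the heuristic is not derived from how
pypy-c-jit actually works; a real tool would need rules taken from the jit
itself. It only shows how per-line properties from an analysis could be
attached to the text display.

```python
import ast

# Hypothetical heuristic: builtins assumed (not verified against PyPy)
# to be harder to optimize when called inside a loop.
DYNAMIC_CALLS = {"eval", "exec", "getattr", "setattr", "globals", "locals"}


class LoopCallVisitor(ast.NodeVisitor):
    """Collect notes for flagged calls that appear inside for/while loops."""

    def __init__(self):
        self.depth = 0   # current loop nesting depth
        self.notes = {}  # line number -> list of note strings

    def _visit_loop(self, node):
        self.depth += 1
        self.generic_visit(node)
        self.depth -= 1

    visit_For = _visit_loop
    visit_While = _visit_loop

    def visit_Call(self, node):
        if (self.depth > 0
                and isinstance(node.func, ast.Name)
                and node.func.id in DYNAMIC_CALLS):
            note = "dynamic call %s() inside loop" % node.func.id
            self.notes.setdefault(node.lineno, []).append(note)
        self.generic_visit(node)  # also look at nested calls


def annotate(source):
    """Return {line_number: [notes]} for the given source text."""
    visitor = LoopCallVisitor()
    visitor.visit(ast.parse(source))
    return visitor.notes


def render(source):
    """Plain-text stand-in for highlighting: '!' marks a flagged line."""
    notes = annotate(source)
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in notes:
            out.append("! %3d  %s    # %s"
                       % (number, line, "; ".join(notes[number])))
        else:
            out.append("  %3d  %s" % (number, line))
    return "\n".join(out)
```

An editor plugin would replace render() with real coloring, and each
"mode" of the display could correspond to a different visitor class.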

Ideally, such a tool would perform instantaneous syntax highlighting
while editing and do full parsing and analysis in the background to
update the semantic highlighting as frequently as possible.  Obviously,
detailed static analysis will provide far more information than it would
be possible to display on the code at once, so I see this gui as having
several modes -- like predator vision -- that show different information
from the analysis.  Naturally, what those modes are will depend strongly
on the details of how pypy-c-jit works internally, what sort of
information can be sanely collected through static analysis, and,
naturally, user testing.

I was somewhat baffled at first as to how what I wrote before was
interpreted as interest in a static python.  I think the disconnect here
is the assumption on many people's part that a static language will
always be faster than a dynamic one.  Given the existing tools that
provide basically no feedback from the compiler / interpreter / jitter,
this is inevitably true at the moment.  I foresee a future, however,
where better tools let us use the full power of a dynamic python AND let
us tighten up our code for speed to get the full advantages of jit
compilation as well.  I believe that in the end, this combination will
prove superior to any fully static compiler.

-Terrence





From santagada at gmail.com  Mon Sep 27 21:58:51 2010
From: santagada at gmail.com (Leonardo Santagada)
Date: Mon, 27 Sep 2010 16:58:51 -0300
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285616691.5954.34.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
	<1285536492.8752.28.camel@localhost>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
Message-ID: 

On Mon, Sep 27, 2010 at 4:44 PM, Terrence Cole
 wrote:
> On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
>> Well, I am happy to see that my interest in a general purpose RPython is not
>> as isolated as I was led to believe :-))
>> Thx,
>
> What I wrote has apparently been widely misunderstood, so let me explain
> what I mean in more detail.  What I want is _not_ RPython and it is
> _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> visual tool, for example, a plugin to an IDE.  This tool would perform
> static analysis on a piece of python code.  Instead of generating code
> with this information, it would mark up the python code in the text
> display with colors, weights, etc in order to show properties from the
> static analysis.  This would be something like semantic highlighting, as
> opposed to syntax highlighting.
>
> I think it possible that this information would, if created and
> presented in the correct way, represent the sort of optimizations that
> pypy-c-jit -- a full python implementation, not a language subset --
> would likely perform on the code if run.  Given this sort of feedback,
> it would be much easier for a python coder to write code that works well
> with the jit: for example, moving a declaration inside a loop to avoid
> boxing, based on the information presented.
>
> Ideally, such a tool would perform instantaneous syntax highlighting
> while editing and do full parsing and analysis in the background to
> update the semantic highlighting as frequently as possible.  Obviously,
> detailed static analysis will provide far more information than it would
> be possible to display on the code at once, so I see this gui as having
> several modes -- like predator vision -- that show different information
> from the analysis.  Naturally, what those modes are will depend strongly
> on the details of how pypy-c-jit works internally, what sort of
> information can be sanely collected through static analysis, and,
> naturally, user testing.
>
> I was somewhat baffled at first as to how what I wrote before was
> interpreted as interest in a static python.  I think the disconnect here
> is the assumption on many people's part that a static language will
> always be faster than a dynamic one.  Given the existing tools that
> provide basically no feedback from the compiler / interpreter / jitter,
> this is inevitably true at the moment.  I foresee a future, however,
> where better tools let us use the full power of a dynamic python AND let
> us tighten up our code for speed to get the full advantages of jit
> compilation as well.  I believe that in the end, this combination will
> prove superior to any fully static compiler.

This all looks interesting, and if you can plug that into emacs or
textmate I would be really happy, but it is not what I want. I would
settle for a tool that generates, at runtime, information about what the
jit is doing in a simple text format (json, yaml or something even
simpler?) and a tool to visualize this so you can easily optimize python
programs to run on pypy. The biggest difference is that just
collecting this info from the JIT appears to be much, much easier than
somehow implementing a static processor for python code that does some
form of analysis.
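
To make the idea concrete, here is a rough sketch of the consuming side,
assuming a log with one JSON object per line and the keys "event", "file"
and "line". That format, the event names and the functions summarize and
report are all hypothetical; pypy does not emit this format, it is only a
guess at what "a simple text format" could look like.

```python
import json
from collections import Counter

# Hypothetical JSON-lines log; the event names are made up for illustration.
SAMPLE_LOG = """\
{"event": "loop_compiled", "file": "app.py", "line": 10}
{"event": "guard_failed", "file": "app.py", "line": 12}
{"event": "guard_failed", "file": "app.py", "line": 12}
{"event": "trace_aborted", "file": "util.py", "line": 40}
"""


def summarize(log_text):
    """Count how often each (file, line, event) triple occurs in the log."""
    counts = Counter()
    for raw in log_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        counts[(record["file"], record["line"], record["event"])] += 1
    return counts


def report(counts):
    """Render the summary as plain text, most frequent events first."""
    rows = []
    for (path, line, event), n in counts.most_common():
        rows.append("%s:%d  %-14s x%d" % (path, line, event, n))
    return "\n".join(rows)


if __name__ == "__main__":
    print(report(summarize(SAMPLE_LOG)))
```

A visualizer could then map the per-line counts back onto the source
file, which needs no static analysis at all.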

I think that fijal is at least thinking about doing such a tool, right?

-- 
Leonardo Santagada

From p.giarrusso at gmail.com  Tue Sep 28 00:52:48 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Tue, 28 Sep 2010 00:52:48 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
	<1285536492.8752.28.camel@localhost>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	
Message-ID: 

On Mon, Sep 27, 2010 at 21:58, Leonardo Santagada  wrote:
> On Mon, Sep 27, 2010 at 4:44 PM, Terrence Cole
>  wrote:
>> On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
>>> Well, I am happy to see that my interest in a general purpose RPython is not
>>> as isolated as I was led to believe :-))
>>> Thx,
>>
>> What I wrote has apparently been widely misunderstood, so let me explain
>> what I mean in more detail.  What I want is _not_ RPython and it is
>> _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
>> visual tool, for example, a plugin to an IDE.  This tool would perform
>> static analysis on a piece of python code.  Instead of generating code
>> with this information, it would mark up the python code in the text
>> display with colors, weights, etc in order to show properties from the
>> static analysis.  This would be something like semantic highlighting, as
>> opposed to syntax highlighting.
>>
>> I think it possible that this information would, if created and
>> presented in the correct way, represent the sort of optimizations that
>> pypy-c-jit -- a full python implementation, not a language subset --
>> would likely perform on the code if run.  Given this sort of feedback,
>> it would be much easier for a python coder to write code that works well
>> with the jit: for example, moving a declaration inside a loop to avoid
>> boxing, based on the information presented.
>>
>> Ideally, such a tool would perform instantaneous syntax highlighting
>> while editing and do full parsing and analysis in the background to
>> update the semantic highlighting as frequently as possible.  Obviously,
>> detailed static analysis will provide far more information than it would
>> be possible to display on the code at once, so I see this gui as having
>> several modes -- like predator vision -- that show different information
>> from the analysis.  Naturally, what those modes are will depend strongly
>> on the details of how pypy-c-jit works internally, what sort of
>> information can be sanely collected through static analysis, and,
>> naturally, user testing.
>>
>> I was somewhat baffled at first as to how what I wrote before was
>> interpreted as interest in a static python.  I think the disconnect here
>> is the assumption on many people's part that a static language will
>> always be faster than a dynamic one.  Given the existing tools that
>> provide basically no feedback from the compiler / interpreter / jitter,
>> this is inevitably true at the moment.  I foresee a future, however,
>> where better tools let us use the full power of a dynamic python AND let
>> us tighten up our code for speed to get the full advantages of jit
>> compilation as well.  I believe that in the end, this combination will
>> prove superior to any fully static compiler.
>
> This all looks interesting, and if you can plug that on emacs or
> textmate I would be really happy, but it is not what I want. I would
> settle for a tool that generates at runtime information about what the
> jit is doing in a simple text format (json, yaml or something even
> simpler?) and a tool to visualize this so you can optimize python
> programs to run on pypy easily. The biggest difference is that just
> collecting this info from the JIT appears to be much, much easier than
> somehow implementing a static processor for python code that does some
> form of analysis.

Have you looked at what the Azul Java VM supports for Java, in
particular RTPM (Real Time Performance Monitoring)?

Academic accounts are available, and from Cliff Click's presentations,
it seems to be a production-quality solution for this (for Java),
which could give interesting ideas. Azul's business is exclusively
centered around Java optimization at the JVM level, so while
not-so-famous they are quite relevant.

See slide 28 of: www.azulsystems.com/events/vee_2009/2009_VEE.pdf for
some more details.
See also wiki.jvmlangsummit.com/pdf/36_Click_fastbcs.pdf, and the
account about JRuby's slowness (caused by unreliable performance
analysis tools).

Given that a JIT can beat static compilation only through forms of
profile-directed optimization, I also believe that the interesting
information should be obtained through logs from the JIT. A static
analyser can't do anything better than a static compiler - not
reliably at least.

_However_, static semantic highlighting might still be interesting:
while it does not help understanding profile-directed optimizations
done by the JIT, it might help understanding the consequences of the
execution model of the language itself, where it has a weird impact on
performance.
E.g., for CPython, it might be very useful simply to highlight uses
of global variables, which require a dict lookup, as "bad", especially
in tight loops. OTOH, that kind of optimization should be done by a
JIT like PyPy, not by the programmer.
I believe that CALL_LIKELY_BUILTIN and hidden classes already allow
PyPy to fix the problem without changing the source code.
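
As a rough illustration of what such a highlighter could key on in CPython,
the sketch below lists the names a function reads through the LOAD_GLOBAL
bytecode. It assumes CPython 3.4 or later (for dis.get_instructions); the
function names and the SCALE constant are made up for the example, and, as
said above, a JIT should make the hand-optimized variant unnecessary.

```python
import dis

SCALE = 3  # module-level global, looked up by name at run time


def scaled_sum_global(values):
    total = 0
    for v in values:
        total += v * SCALE  # global lookup on every iteration
    return total


def scaled_sum_local(values, scale=SCALE):
    total = 0
    for v in values:
        total += v * scale  # 'scale' is a local, bound once at definition
    return total


def global_names_read(func):
    """Return the sorted names that func's bytecode reads via LOAD_GLOBAL."""
    return sorted({ins.argval
                   for ins in dis.get_instructions(func)
                   if ins.opname == "LOAD_GLOBAL"})
```

A highlighter could colour every name returned by global_names_read() that
sits inside a loop body; this sketch does not try to detect loops.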

The question then is: which kinds of constructs are unexpectedly slow
in Python, even with a good JIT?

Best regards
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From jacob at openend.se  Tue Sep 28 01:57:11 2010
From: jacob at openend.se (Jacob Hallén)
Date: Tue, 28 Sep 2010 01:57:11 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285616691.5954.34.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
Message-ID: <201009280157.17264.jacob@openend.se>

Monday 27 September 2010 you wrote:
> On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> > Well, I am happy to see that my interest in a general purpose RPython
> > is not as isolated as I was led to believe :-))
> > Thx,
> 
> What I wrote has apparently been widely misunderstood, so let me explain
> what I mean in more detail.  What I want is _not_ RPython and it is
> _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> visual tool, for example, a plugin to an IDE.  This tool would perform
> static analysis on a piece of python code.  Instead of generating code
> with this information, it would mark up the python code in the text
> display with colors, weights, etc in order to show properties from the
> static analysis.  This would be something like semantic highlighting, as
> opposed to syntax highlighting.
> 
> I think it possible that this information would, if created and
> presented in the correct way, represent the sort of optimizations that
> pypy-c-jit -- a full python implementation, not a language subset --
> would likely perform on the code if run.  Given this sort of feedback,
> it would be much easier for a python coder to write code that works well
> with the jit: for example, moving a declaration inside a loop to avoid
> boxing, based on the information presented.
> 
> Ideally, such a tool would perform instantaneous syntax highlighting
> while editing and do full parsing and analysis in the background to
> update the semantic highlighting as frequently as possible.  Obviously,
> detailed static analysis will provide far more information than it would
> be possible to display on the code at once, so I see this gui as having
> several modes -- like predator vision -- that show different information
> from the analysis.  Naturally, what those modes are will depend strongly
> on the details of how pypy-c-jit works internally, what sort of
> information can be sanely collected through static analysis, and,
> naturally, user testing.
> 
> I was somewhat baffled at first as to how what I wrote before was
> interpreted as interest in a static python.  I think the disconnect here
> is the assumption on many people's part that a static language will
> always be faster than a dynamic one.  Given the existing tools that
> provide basically no feedback from the compiler / interpreter / jitter,
> this is inevitably true at the moment.  I foresee a future, however,
> where better tools let us use the full power of a dynamic python AND let
> us tighten up our code for speed to get the full advantages of jit
> compilation as well.  I believe that in the end, this combination will
> prove superior to any fully static compiler.
> 
> -Terrence
> 
> > Sarvi
> > 
> > 
> > ----- Original Message ----
> > 
> > > From: Terrence Cole 
> > > To: pypy-dev at codespeak.net
> > > Sent: Sun, September 26, 2010 2:28:12 PM
> > > Subject: Re: [pypy-dev] Question on the future of RPython
> > > 
> > > On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
> > > > i just had a (probably) silly idea. :)
> > > > 
> > > > if some people like rpython so much, how about writing a rpython
> > > > interpreter in rpython? wouldn't it be much easier for the jit to
> > > > optimize rpython code? couldn't jitted rpython code theoretically be
> > > > as fast as a program that got compiled to c from rpython?
> > > > 
> > > > hm... but i wonder if this would make sense at all.  maybe if you ran
> > > > rpython code with pypy-c-jit, it already could be jitted as well as
> > > > with a special rpython interpreter? ...if there were a special
> > > > rpython interpreter, would the current jit generator have to be 
> > > > changed to take advantage of the more simple language?
> > > 
> > > An excellent question at least.
> > > 
> > > A better idea, I think, would be to ask what subset of full-python
> > > will jit well.  What I'd really like to see is a static analyzer that
> > > can display (e.g. by coloring names or lines) how "jit friendly" a
> > > piece of python code is.  This would allow a programmer to get an
> > > idea of what help the jit is going to be when running their code and,
> > > hopefully, help people avoid tragic performance results.  Naturally,
> > > for performance intensive code, you would still need to profile, but
> > > for a lot of uses, simply not having catastrophically bad performance
> > > is more than enough for a good user experience.
> > > 
> > > With such a tool, it wouldn't really matter if the answer to "what is
> > > faster" is RPython -- it would be whatever python language subset
> > > happens to work well in a particular case.  I've started working on
> > > something like this [1], but given that I'm doing a startup, I don't
> > > have nearly the time I would need to make this useful in the 
> > > near-term.

The JIT works because it has more information at runtime than what is 
available at compile time. If the information were available at compile time, 
we could do the optimizations then and not have to invoke the extra complexity 
required by the JIT. Examples of the extra information include things like 
knowing that introspection will not be used in the current evaluation of a 
loop, that specific argument types will be used in calls, and that some 
arguments will be known to be constant over part of the program execution. 
Knowing these bits allows you to optimize away large chunks of the code that 
otherwise would have been executed.
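
A toy model of the "specific argument types" case may help. The sketch below
records the argument types seen on the first call and afterwards takes a fast
path only while a guard on those types holds. It illustrates the principle
only; it is not how PyPy's JIT is implemented, and the class name and counters
are invented for the example.

```python
import operator


class SpecializingAdd(object):
    """Toy model of runtime specialization guarded by a type check."""

    def __init__(self):
        self.seen_types = None   # argument types observed on the first call
        self.fast_hits = 0       # calls that passed the guard
        self.guard_failures = 0  # calls that fell back to the generic path

    def __call__(self, a, b):
        types = (type(a), type(b))
        if self.seen_types is None:
            # First call: record what only the running program can tell us.
            self.seen_types = types
        if types == self.seen_types:
            # Guard passed: a real JIT would run code compiled for these types.
            self.fast_hits += 1
            return a + b
        # Guard failed: fall back to the fully general operation.
        self.guard_failures += 1
        return operator.add(a, b)
```

In this toy both paths compute the same thing; in a real JIT the guarded path
is machine code with the type dispatch removed, which is where the speedup
comes from.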

Static analysis assumes that none of the above-mentioned possibilities can 
actually take place. It is impossible to make such assumptions at compile time 
in a dynamic language. Therefore PyPy is a bad match for people wanting to 
statically compile subsets of Python. Applying the JIT to RPython code is not 
workable, because the JIT is optimized to remove bits of generated assembler 
code that never shows up in the compilation of RPython code.

These are very basic first principle concepts, and it is a mystery to me why 
people can't work them out for themselves.

Jacob Hallén

From list-sink at trainedmonkeystudios.org  Tue Sep 28 02:03:08 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Mon, 27 Sep 2010 17:03:08 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<201009021140.53415.jacob@openend.se>
	<933850.84117.qm@web53705.mail.re2.yahoo.com>
	<201009022254.07665.jacob@openend.se>
	<810929.87945.qm@web53702.mail.re2.yahoo.com>
	
	<609691.54456.qm@web53705.mail.re2.yahoo.com>
	
	<294101.49986.qm@web53704.mail.re2.yahoo.com>
	
	<860977.83657.qm@web53705.mail.re2.yahoo.com>
	
	<792921.22517.qm@web53703.mail.re2.yahoo.com>
	
	
	
	<1285536492.8752.28.camel@localhost>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	
	
Message-ID: <1285632188.5954.94.camel@localhost>

On Tue, 2010-09-28 at 00:52 +0200, Paolo Giarrusso wrote:
> On Mon, Sep 27, 2010 at 21:58, Leonardo Santagada  wrote:
> > On Mon, Sep 27, 2010 at 4:44 PM, Terrence Cole
> >  wrote:
> >> On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> >>> Well, I am happy to see that my interest in a general purpose RPython is not
> >>> as isolated as I was led to believe :-))
> >>> Thx,
> >>
> >> What I wrote has apparently been widely misunderstood, so let me explain
> >> what I mean in more detail.  What I want is _not_ RPython and it is
> >> _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> >> visual tool, for example, a plugin to an IDE.  This tool would perform
> >> static analysis on a piece of python code.  Instead of generating code
> >> with this information, it would mark up the python code in the text
> >> display with colors, weights, etc in order to show properties from the
> >> static analysis.  This would be something like semantic highlighting, as
> >> opposed to syntax highlighting.
> >>
> >> I think it possible that this information would, if created and
> >> presented in the correct way, represent the sort of optimizations that
> >> pypy-c-jit -- a full python implementation, not a language subset --
> >> would likely perform on the code if run.  Given this sort of feedback,
> >> it would be much easier for a python coder to write code that works well
> >> with the jit: for example, moving a declaration inside a loop to avoid
> >> boxing, based on the information presented.
> >>
> >> Ideally, such a tool would perform instantaneous syntax highlighting
> >> while editing and do full parsing and analysis in the background to
> >> update the semantic highlighting as frequently as possible.  Obviously,
> >> detailed static analysis will provide far more information than it would
> >> be possible to display on the code at once, so I see this gui as having
> >> several modes -- like predator vision -- that show different information
> >> from the analysis.  Naturally, what those modes are will depend strongly
> >> on the details of how pypy-c-jit works internally, what sort of
> >> information can be sanely collected through static analysis, and,
> >> naturally, user testing.
> >>
> >> I was somewhat baffled at first as to how what I wrote before was
> >> interpreted as interest in a static python.  I think the disconnect here
> >> is the assumption on many people's part that a static language will
> >> always be faster than a dynamic one.  Given the existing tools that
> >> provide basically no feedback from the compiler / interpreter / jitter,
> >> this is inevitably true at the moment.  I foresee a future, however,
> >> where better tools let us use the full power of a dynamic python AND let
> >> us tighten up our code for speed to get the full advantages of jit
> >> compilation as well.  I believe that in the end, this combination will
> >> prove superior to any fully static compiler.
> >
> > This all looks interesting, and if you can plug that on emacs or
> > textmate I would be really happy, but it is not what I want. I would
> > settle for a tool that generates at runtime information about what the
> > jit is doing in a simple text format (json, yaml or something even
> > simpler?) and a tool to visualize this so you can optimize python
> > programs to run on pypy easily. The biggest difference is that just
> > collecting this info from the JIT appears to be much, much easier than
> > somehow implementing a static processor for python code that does some
> > form of analysis.
> 
> Have you looked at what the Azul Java VM supports for Java, in
> particular RTPM (Real Time Performance Monitoring)?

Briefly, but it's not open source, and it's a Java thing, so it didn't
pique my interest significantly.

> Academic accounts are available, and from Cliff Click's presentations,
> it seems to be a production-quality solution for this (for Java),
> which could give interesting ideas. Azul's business is exclusively
> centered around Java optimization at the JVM level, so while
> not-so-famous they are quite relevant.
> 
> See slide 28 of: www.azulsystems.com/events/vee_2009/2009_VEE.pdf for
> some more details.
> See also wiki.jvmlangsummit.com/pdf/36_Click_fastbcs.pdf, and the
> account about JRuby's slowness (caused by unreliable performance
> analysis tools).
> 
> Given that a JIT can beat static compilation only through forms of
> profile-directed optimization, I also believe that the interesting
> information should be obtained through logs from the JIT. A static
> analyser can't do anything better than a static compiler - not
> reliably at least.

I'd be pursuing the jit logging approach much more aggressively if I
cared at all about Python2 anymore.  All of the source I care about
analyzing is in Python3.  However, considering the rate I'm going, pypy
will doubtless support python3 by the time I get a halfway decent
static analyzer working anyway, so it's probably worth considering.

> _However_, static semantic highlighting might still be interesting:
> while it does not help understanding profile-directed optimizations
> done by the JIT, it might help understanding the consequences of the
> execution model of the language itself, where it has a weird impact on
> performance.
> E.g., for CPython, it might be very useful simply to highlight uses
> of global variables, which require a dict lookup, as "bad", especially
> in tight loops. OTOH, that kind of optimization should be done by a
> JIT like PyPy, not by the programmer.
> I believe that CALL_LIKELY_BUILTIN and hidden classes already allow
> PyPy to fix the problem without changing the source code.
> 
> The question then is: which kinds of constructs are unexpectedly slow
> in Python, even with a good JIT?

Precisely.  I'd love a good answer to that question.

In addition to jitting, although it would not technically be python
anymore, I see a place for something like SPUR or JaegerMonkey --
combined compilation and jitting.  Naturally, the performance of such a
beast over a jit alone would be dependent on how much boxing the
compiler could remove.  My goal for this work is about half geared
towards answering that single question, just so I'll know if I should
stop dreaming about python eventually having performance parity with
C/C++.

I tend to think that having a solid (if never perfect) static analyzer
for python could help in many areas.   I had thought that helping coders
help the jit out would be a good first use, but as you say, there will
be problems with that.  Regardless, my hope is that a library for static
analysis of python will be more generally useful than my own
hare-brained schemes.

In any case, I'm working on this in the form of a code editor first
because, regardless of what the answer to the previous question is, I
know from experience that highlighting for python like what
SourceInsight does for C++ will be extremely useful. 


Thank you for the kind feedback, your comments are much appreciated.
-Terrence

> Best regards



From list-sink at trainedmonkeystudios.org  Tue Sep 28 02:43:33 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Mon, 27 Sep 2010 17:43:33 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <201009280157.17264.jacob@openend.se>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
Message-ID: <1285634613.5954.112.camel@localhost>

On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
> Monday 27 September 2010 you wrote:
> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> > > Well, I am happy to see that my interest in a general purpose RPython
> > > is not as isolated as I was led to believe :-))
> > > Thx,
> > 
> > What I wrote has apparently been widely misunderstood, so let me explain
> > what I mean in more detail.  What I want is _not_ RPython and it is
> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> > visual tool, for example, a plugin to an IDE.  This tool would perform
> > static analysis on a piece of python code.  Instead of generating code
> > with this information, it would mark up the python code in the text
> > display with colors, weights, etc in order to show properties from the
> > static analysis.  This would be something like semantic highlighting, as
> > opposed to syntax highlighting.
> > 
> > I think it possible that this information would, if created and
> > presented in the correct way, represent the sort of optimizations that
> > pypy-c-jit -- a full python implementation, not a language subset --
> > would likely perform on the code if run.  Given this sort of feedback,
> > it would be much easier for a python coder to write code that works well
> > with the jit: for example, moving a declaration inside a loop to avoid
> > boxing, based on the information presented.
> > 
> > Ideally, such a tool would perform instantaneous syntax highlighting
> > while editing and do full parsing and analysis in the background to
> > update the semantic highlighting as frequently as possible.  Obviously,
> > detailed static analysis will provide far more information than it would
> > be possible to display on the code at once, so I see this gui as having
> > several modes -- like predator vision -- that show different information
> > from the analysis.  Naturally, what those modes are will depend strongly
> > on the details of how pypy-c-jit works internally, what sort of
> > information can be sanely collected through static analysis, and,
> > naturally, user testing.
> > 
> > I was somewhat baffled at first as to how what I wrote before was
> > interpreted as interest in a static python.  I think the disconnect here
> > is the assumption on many people's part that a static language will
> > always be faster than a dynamic one.  Given the existing tools that
> > provide basically no feedback from the compiler / interpreter / jitter,
> > this is inevitably true at the moment.  I foresee a future, however,
> > where better tools let us use the full power of a dynamic python AND let
> > us tighten up our code for speed to get the full advantages of jit
> > compilation as well.  I believe that in the end, this combination will
> > prove superior to any fully static compiler.
> > 
> > -Terrence
> > 
> > > Sarvi
> > > 
> > > 
> > > ----- Original Message ----
> > > 
> > > > From: Terrence Cole 
> > > > To: pypy-dev at codespeak.net
> > > > Sent: Sun, September 26, 2010 2:28:12 PM
> > > > Subject: Re: [pypy-dev] Question on the future of RPython
> > > > 
> > > > On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
> > > > > i just had a (probably) silly idea. :)
> > > > > 
> > > > > if some people like rpython so much, how about writing a rpython
> > > > > interpreter in rpython? wouldn't it be much easier for the jit to
> > > > > optimize rpython code? couldn't jitted rpython code theoretically be
> > > > > as fast as a program that got compiled to c from rpython?
> > > > > 
> > > > > hm... but i wonder if this would make sense at all.  maybe if you ran
> > > > > rpython code with pypy-c-jit, it already could be jitted as well as
> > > > > with a special rpython interpreter? ...if there were a special
> > > > > rpython interpreter, would the current jit generator have to be 
> > > > > changed to take advantage of the more simple language?
> > > > 
> > > > An excellent question at least.
> > > > 
> > > > A better idea, I think, would be to ask what subset of full-python
> > > > will jit well.  What I'd really like to see is a static analyzer that
> > > > can display (e.g. by coloring names or lines) how "jit friendly" a
> > > > piece of python code is.  This would allow a programmer to get an
> > > > idea of what help the jit is going to be when running their code and,
> > > > hopefully, help people avoid tragic performance results.  Naturally,
> > > > for performance intensive code, you would still need to profile, but
> > > > for a lot of uses, simply not having catastrophically bad performance
> > > > is more than enough for a good user experience.
> > > > 
> > > > With such a tool, it wouldn't really matter if the answer to "what is
> > > > faster" is RPython -- it would be whatever python language subset
> > > > happens to work well in a particular case.  I've started working on
> > > > something like this [1], but given that I'm doing a startup, I don't
> > > > have nearly the time I would need to make this useful in the 
> > > > near-term.
> 
> The JIT works because it has more information at runtime than what is 
> available at compile time. If the information was available at compile time we 
> could do the optimizations then and not have to invoke the extra complexity 
> required by the JIT. Examples of the extra information include things like 
> knowing that introspection will not be used in the current evaluation of a 
> loop, specific argument types will be used in calls and that some arguments 
> will be known to be constant over part of the program execution. Knowing 
> these bits allows you to optimize away large chunks of the code that 
> otherwise would have been executed.
> 
> Static analysis assumes that none of the above mentioned possibilities can 
> actually take place. It is impossible to make such assumptions at compile time 
> in a dynamic language. Therefore PyPy is a bad match for people wanting to 
> statically compile subsets of Python. Applying the JIT to RPython code

Yes, that idea is just dumb.  It's also not what I suggested at all.  I
can see now that what I said would be easy to misinterpret, but on
re-reading it, it clearly doesn't say what you think it does.

>  is not 
> workable, because the JIT is optimized to remove bits of generated assembler 
> code that never shows up in the compilation of RPython code.
> 
> These are very basic first principle concepts, and it is a mystery to me why 
> people can't work them out for themselves.

You are quite right that static analysis will be able to do little to
help an optimal jit.  However, I doubt that in the near term pypy's jit
will cover all the dark corners of python equally well -- C has been
around for 38 years and it's still got room for optimization.

-Terrence

> Jacob Hallén



From william.leslie.ttg at gmail.com  Tue Sep 28 03:55:06 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Tue, 28 Sep 2010 11:55:06 +1000
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285634613.5954.112.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
Message-ID: 

On 28 September 2010 10:43, Terrence Cole
 wrote:
> On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
>> Monday 27 September 2010 you wrote:
>> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
>> > > Well, I am happy to see that my interest in a general purpose RPython
>> > > is not as isolated as I was led to believe :-))
>> > > Thx,
>> >
>> > What I wrote has apparently been widely misunderstood, so let me explain
>> > what I mean in more detail.  What I want is _not_ RPython and it is
>> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
>> > visual tool, for example, a plugin to an IDE.  This tool would perform
>> > static analysis on a piece of python code.  Instead of generating code
>> > with this information, it would mark up the python code in the text
>> > display with colors, weights, etc in order to show properties from the
>> > static analysis.  This would be something like semantic highlighting, as
>> > opposed to syntax highlighting.
>> >
>> > I think it possible that this information would, if created and
>> > presented in the correct way, represent the sort of optimizations that
>> > pypy-c-jit -- a full python implementation, not a language subset --
>> > would likely perform on the code if run.  Given this sort of feedback,
>> > it would be much easier for a python coder to write code that works well
>> > with the jit: for example, moving a declaration inside a loop to avoid
>> > boxing, based on the information presented.
>> >
>> > Ideally, such a tool would perform instantaneous syntax highlighting
>> > while editing and do full parsing and analysis in the background to
>> > update the semantic highlighting as frequently as possible.  Obviously,
>> > detailed static analysis will provide far more information than it would
>> > be possible to display on the code at once, so I see this gui as having
>> > several modes -- like predator vision -- that show different information
>> > from the analysis.  Naturally, what those modes are will depend strongly
>> > on the details of how pypy-c-jit works internally, what sort of
>> > information can be sanely collected through static analysis, and,
>> > naturally, user testing.
>> >
>> > I was somewhat baffled at first as to how what I wrote before was
>> > interpreted as interest in a static python.  I think the disconnect here
>> > is the assumption on many people's part that a static language will
>> > always be faster than a dynamic one.  Given the existing tools that
>> > provide basically no feedback from the compiler / interpreter / jitter,
>> > this is inevitably true at the moment.  I foresee a future, however,
>> > where better tools let us use the full power of a dynamic python AND let
>> > us tighten up our code for speed to get the full advantages of jit
>> > compilation as well.  I believe that in the end, this combination will
>> > prove superior to any fully static compiler.
>>
>> The JIT works because it has more information at runtime than what is
>> available at compile time. If the information was available at compile time we
>> could do the optimizations then and not have to invoke the extra complexity
>> required by the JIT. Examples of the extra information include things like
>> knowing that introspection will not be used in the current evaluation of a
>> loop, specific argument types will be used in calls and that some arguments
>> will be known to be constant over part of the program execution. Knowing
>> these bits allows you to optimize away large chunks of the code that
>> otherwise would have been executed.
>>
>> Static analysis assumes that none of the above mentioned possibilities can
>> actually take place. It is impossible to make such assumptions at compile time
>> in a dynamic language. Therefore PyPy is a bad match for people wanting to
>> statically compile subsets of Python. Applying the JIT to RPython code
>
> Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> can see now that what I said would be easy to misinterpret, but on
> re-reading it, it clearly doesn't say what you think it does.

It does make /some/ sense, I think. From the perspective of the JIT,
operating at interp-level, the app-level python program *is the
biggest part of* the "stuff you don't know about until runtime". That
is, you don't know the program source at translation time, and most of
the information the JIT is supposed to find concerns app-level constructs
(e.g. app-level loops).

Of course any such analysis will fall flat in certain cases, like
eval(raw_input(...)). But you should still be able to gather enough
information for most fairly hygienic code.

What sort of analyses did you have in mind?

>> is not
>> workable, because the JIT is optimized to remove bits of generated assembler
>> code that never shows up in the compilation of RPython code.
>>
>> These are very basic first principle concepts, and it is a mystery to me why
>> people can't work them out for themselves.
>
> You are quite right that static analysis will be able to do little to
> help an optimal jit.  However, I doubt that in the near term pypy's jit
> will cover all the dark corners of python equally well -- C has been
> around for 38 years and it's still got room for optimization.

There are some undesirable things about static analysis, but it can
sure be useful from optimisation, security and reliability
perspectives. There's also code browsing, too; IDEs require a
different (fuzzier) parser, but the question of 'what types does this
object probably have' makes more sense with a little dependent region
analysis. Optimising when you can be fairly confident of the types
involved could be useful. That doesn't really sound like pypy at that
point, though.

-- 
William Leslie

From fijall at gmail.com  Tue Sep 28 15:20:46 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 28 Sep 2010 15:20:46 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285634613.5954.112.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
Message-ID: 

On Tue, Sep 28, 2010 at 2:43 AM, Terrence Cole
 wrote:
> On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
>> Monday 27 September 2010 you wrote:
>> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
>> > > Well, I am happy to see that my interest in a general purpose RPython
>> > > is not as isolated as I was led to believe :-))
>> > > Thx,
>> >
>> > What I wrote has apparently been widely misunderstood, so let me explain
>> > what I mean in more detail.  What I want is _not_ RPython and it is
>> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
>> > visual tool, for example, a plugin to an IDE.  This tool would perform
>> > static analysis on a piece of python code.  Instead of generating code
>> > with this information, it would mark up the python code in the text
>> > display with colors, weights, etc in order to show properties from the
>> > static analysis.  This would be something like semantic highlighting, as
>> > opposed to syntax highlighting.
>> >
>> > I think it possible that this information would, if created and
>> > presented in the correct way, represent the sort of optimizations that
>> > pypy-c-jit -- a full python implementation, not a language subset --
>> > would likely perform on the code if run.  Given this sort of feedback,
>> > it would be much easier for a python coder to write code that works well
>> > with the jit: for example, moving a declaration inside a loop to avoid
>> > boxing, based on the information presented.
>> >
>> > Ideally, such a tool would perform instantaneous syntax highlighting
>> > while editing and do full parsing and analysis in the background to
>> > update the semantic highlighting as frequently as possible.  Obviously,
>> > detailed static analysis will provide far more information than it would
>> > be possible to display on the code at once, so I see this gui as having
>> > several modes -- like predator vision -- that show different information
>> > from the analysis.  Naturally, what those modes are will depend strongly
>> > on the details of how pypy-c-jit works internally, what sort of
>> > information can be sanely collected through static analysis, and,
>> > naturally, user testing.
>> >
>> > I was somewhat baffled at first as to how what I wrote before was
>> > interpreted as interest in a static python.  I think the disconnect here
>> > is the assumption on many people's part that a static language will
>> > always be faster than a dynamic one.  Given the existing tools that
>> > provide basically no feedback from the compiler / interpreter / jitter,
>> > this is inevitably true at the moment.  I foresee a future, however,
>> > where better tools let us use the full power of a dynamic python AND let
>> > us tighten up our code for speed to get the full advantages of jit
>> > compilation as well.  I believe that in the end, this combination will
>> > prove superior to any fully static compiler.
>> >
>> > -Terrence
>> >
>> > > Sarvi
>> > >
>> > >
>> > > ----- Original Message ----
>> > >
>> > > > From: Terrence Cole 
>> > > > To: pypy-dev at codespeak.net
>> > > > Sent: Sun, September 26, 2010 2:28:12 PM
>> > > > Subject: Re: [pypy-dev] Question on the future of RPython
>> > > >
>> > > > On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
>> > > > > i just had a (probably) silly idea. :)
>> > > > >
>> > > > > if some people like rpython so much, how about writing a rpython
>> > > > > interpreter in rpython? wouldn't it be much easier for the jit to
>> > > > > optimize rpython code? couldn't jitted rpython code theoretically be
>> > > > > as fast as a program that got compiled to c from rpython?
>> > > > >
>> > > > > hm... but i wonder if this would make sense at all.  maybe if you ran
>> > > > > rpython code with pypy-c-jit, it already could be jitted as well as
>> > > > > with a special rpython interpreter? ...if there were a special
>> > > > > rpython interpreter, would the current jit generator have to be
>> > > > > changed to take advantage of the more simple language?
>> > > >
>> > > > An excellent question at least.
>> > > >
>> > > > A better idea, I think, would be to ask what subset of full-python
>> > > > will jit well.  What I'd really like to see is a static analyzer that
>> > > > can display (e.g. by coloring names or lines) how "jit friendly" a
>> > > > piece of python code is.  This would allow a programmer to get an
>> > > > idea of what help the jit is going to be when running their code and,
>> > > > hopefully, help people avoid tragic performance results.  Naturally,
>> > > > for performance intensive code, you would still need to profile, but
>> > > > for a lot of uses, simply not having catastrophically bad performance
>> > > > is more than enough for a good user experience.
>> > > >
>> > > > With such a tool, it wouldn't really matter if the answer to "what is
>> > > > faster" is RPython -- it would be whatever python language subset
>> > > > happens to work well in a particular case.  I've started working on
>> > > > something like this [1], but given that I'm doing a startup, I don't
>> > > > have nearly the time I would need to make this useful in the
>> > > > near-term.
>>
>> The JIT works because it has more information at runtime than what is
>> available at compile time. If the information was available at compile time we
>> could do the optimizations then and not have to invoke the extra complexity
>> required by the JIT. Examples of the extra information include things like
>> knowing that introspection will not be used in the current evaluation of a
>> loop, specific argument types will be used in calls and that some arguments
>> will be known to be constant over part of the program execution. Knowing
>> these bits allows you to optimize away large chunks of the code that
>> otherwise would have been executed.
>>
>> Static analysis assumes that none of the above mentioned possibilities can
>> actually take place. It is impossible to make such assumptions at compile time
>> in a dynamic language. Therefore PyPy is a bad match for people wanting to
>> statically compile subsets of Python. Applying the JIT to RPython code
>
> Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> can see now that what I said would be easy to misinterpret, but on
> re-reading it, it clearly doesn't say what you think it does.
>
>> is not
>> workable, because the JIT is optimized to remove bits of generated assembler
>> code that never shows up in the compilation of RPython code.
>>
>> These are very basic first principle concepts, and it is a mystery to me why
>> people can't work them out for themselves.
>
> You are quite right that static analysis will be able to do little to
> help an optimal jit.  However, I doubt that in the near term pypy's jit
> will cover all the dark corners of python equally well -- C has been
> around for 38 years and it's still got room for optimization.
>
> -Terrence
>
>> Jacob Hallén

Hey.

I'm really interested in having jit feedback displayed as text info
(say for profiling purposes). Do you have any particular ideas in mind
or just a general one?

From tismer at stackless.com  Tue Sep 28 16:39:38 2010
From: tismer at stackless.com (Christian Tismer)
Date: Tue, 28 Sep 2010 16:39:38 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <187982.37806.qm@web111302.mail.gq1.yahoo.com>
References: <187982.37806.qm@web111302.mail.gq1.yahoo.com>
Message-ID: <4CA1FE2A.5060105@stackless.com>

  On 9/27/10 2:29 PM, Andy wrote:
> Why wouldn't pypy work with greenlet but would work with Stackless? greenlet calls itself a spin-off of Stackless. Isn't greenlet a subset of Stackless without the scheduling? Could you explain a bit more?
>

This is a deep misconception. Neither stackless nor greenlets
work with PyPy. Instead, a special coroutine version was written
for PyPy's RPython, and then Stackless was written as an application
module. There is a greenlet implementation as well.

They both rely on stack unwinding, which is not yet implemented
for the JIT.

The original greenlets and stackless have some similarities, since
they use the same tricks to modify the stack in assembly. This is
not related to PyPy; this reasoning is just at the wrong level.

ciao - chris

-- 
Christian Tismer             :^)
tismerysoft GmbH             :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9A     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key ->  http://wwwkeys.pgp.net/
work +49 30 802 86 56  mobile +49 173 24 18 776  fax +49 30 80 90 57 05
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
       whom do you want to sponsor today?   http://www.stackless.com/


From list-sink at trainedmonkeystudios.org  Tue Sep 28 21:49:08 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Tue, 28 Sep 2010 12:49:08 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
Message-ID: <1285703348.5276.56.camel@localhost>

On Tue, 2010-09-28 at 11:55 +1000, William Leslie wrote:
> On 28 September 2010 10:43, Terrence Cole
>  wrote:
> > On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
> >> Monday 27 September 2010 you wrote:
> >> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> >> > > Well, I am happy to see that my interest in a general purpose RPython
> >> > > is not as isolated as I was led to believe :-))
> >> > > Thx,
> >> >
> >> > What I wrote has apparently been widely misunderstood, so let me explain
> >> > what I mean in more detail.  What I want is _not_ RPython and it is
> >> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> >> > visual tool, for example, a plugin to an IDE.  This tool would perform
> >> > static analysis on a piece of python code.  Instead of generating code
> >> > with this information, it would mark up the python code in the text
> >> > display with colors, weights, etc in order to show properties from the
> >> > static analysis.  This would be something like semantic highlighting, as
> >> > opposed to syntax highlighting.
> >> >
> >> > I think it possible that this information would, if created and
> >> > presented in the correct way, represent the sort of optimizations that
> >> > pypy-c-jit -- a full python implementation, not a language subset --
> >> > would likely perform on the code if run.  Given this sort of feedback,
> >> > it would be much easier for a python coder to write code that works well
> >> > with the jit: for example, moving a declaration inside a loop to avoid
> >> > boxing, based on the information presented.
> >> >
> >> > Ideally, such a tool would perform instantaneous syntax highlighting
> >> > while editing and do full parsing and analysis in the background to
> >> > update the semantic highlighting as frequently as possible.  Obviously,
> >> > detailed static analysis will provide far more information than it would
> >> > be possible to display on the code at once, so I see this gui as having
> >> > several modes -- like predator vision -- that show different information
> >> > from the analysis.  Naturally, what those modes are will depend strongly
> >> > on the details of how pypy-c-jit works internally, what sort of
> >> > information can be sanely collected through static analysis, and,
> >> > naturally, user testing.
> >> >
> >> > I was somewhat baffled at first as to how what I wrote before was
> >> > interpreted as interest in a static python.  I think the disconnect here
> >> > is the assumption on many people's part that a static language will
> >> > always be faster than a dynamic one.  Given the existing tools that
> >> > provide basically no feedback from the compiler / interpreter / jitter,
> >> > this is inevitably true at the moment.  I foresee a future, however,
> >> > where better tools let us use the full power of a dynamic python AND let
> >> > us tighten up our code for speed to get the full advantages of jit
> >> > compilation as well.  I believe that in the end, this combination will
> >> > prove superior to any fully static compiler.
> >>
> >> The JIT works because it has more information at runtime than what is
> >> available at compile time. If the information was available at compile time we
> >> could do the optimizations then and not have to invoke the extra complexity
> >> required by the JIT. Examples of the extra information include things like
> >> knowing that introspection will not be used in the current evaluation of a
> >> loop, specific argument types will be used in calls and that some arguments
> >> will be known to be constant over part of the program execution. Knowing
> >> these bits allows you to optimize away large chunks of the code that
> >> otherwise would have been executed.
> >>
> >> Static analysis assumes that none of the above mentioned possibilities can
> >> actually take place. It is impossible to make such assumptions at compile time
> >> in a dynamic language. Therefore PyPy is a bad match for people wanting to
> >> statically compile subsets of Python. Applying the JIT to RPython code
> >
> > Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> > can see now that what I said would be easy to misinterpret, but on
> > re-reading it, it clearly doesn't say what you think it does.
> 
> It does make /some/ sense, I think. From the perspective of the JIT,
> operating at interp-level, 

I think this is a disconnect.  Applying a jit to a non-interpreted
language -- Jacob here seems to think I was talking about a static,
compiled subset of python -- makes little sense.  Static analysis to
provide help to an interpreter does, as you say, make some sense, and
not just to me.  Brett Cannon applied static type analysis to the
CPython interpreter for his PhD thesis [1], looking for a speed boost by
removing some typing abstraction.  Unfortunately, it was not
spectacularly helpful for CPython.  I think for pypy-jit, however, it
has much greater potential because of the possibility of full unboxing.
Given past results, however, it's not the first place I'd go looking for
speedups.  Others may have better ideas in this area than I do though.

> the app-level python program *is the
> biggest part of* the "stuff you don't know about until runtime". That
> is, you don't know the program source at translation time, and most of
> the information the JIT is supposed to find concerns app-level constructs
> (e.g. app-level loops).

This is one of the reasons that I had to pull together my own parsing
(largely borrowed from pypy, actually) and analysis infrastructure,
rather than just using pypy's off-the-shelf.  Even without pypy's neat
analysis code, the fact that it ditches character-level info when making
an ast means you can't apply highlighting with it without groping about
half-blindly in the source.
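
To illustrate the kind of character-level information a highlighter needs, here is a minimal sketch using the standard library's tokenize module, which keeps the row and column of every token. It targets a current Python 3 and is not the parser described above; name_positions and the sample string are invented for the example.

```python
import io
import tokenize


def name_positions(source):
    """Return (name, row, col) for every NAME token in the source text."""
    positions = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            row, col = tok.start  # 1-based row, 0-based column
            positions.append((tok.string, row, col))
    return positions


# Sample text is only tokenized, never executed, so xrange is harmless here.
sample = "l = 0.0\nfor i in xrange(10):\n    l += img[i]\n"
found = name_positions(sample)
```

Note that NAME tokens include keywords such as "for" and "in", so a real highlighter would still have to tell identifiers and keywords apart.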

> Of course any such analysis will fall flat in certain cases, like
> eval(raw_input(...)). But you should still be able to gather enough
> information for most fairly hygienic code.

Given the choice between the status quo and an extremely slow eval, but
much faster python overall, I think most people would pick the second.

> What sort of analyses did you have in mind?

As this is a side project, for the moment I am focusing on simple stuff,
mostly things I need/want for work.  In the short term these include
Python3 linting (which is almost working) and static type analysis.  The
second will be particularly interesting because we have (at work)
annotated most of our interfaces with type data, so this will probably
net much more specific and helpful data than it would in many projects.
I am also, specifically, as I mentioned to Paolo yesterday, trying to
find out how much of our code could be fully unboxed, given that we have
extensive type contracts at our interfaces.  If the answer is "most of
it", then it may make sense for us to build something like JaegerMonkey
for python someday.

> >>  is not
> >> workable, because the JIT is optimized to remove bits of generated assembler
> >> code that never shows up in the compilation of RPython code.
> >>
> >> These are very basic first principle concepts, and it is a mystery to me why
> >> people can't work them out for themselves.
> >
> > You are quite right that static analysis will be able to do little to
> > help an optimal jit.  However, I doubt that in the near term pypy's jit
> > will cover all the dark corners of python equally well -- C has been
> > around for 38 years and it's still got room for optimization.
> 
> There are some undesirable things about static analysis, but it can
> sure be useful from optimisation, security and reliability
> perspectives. 

Brendan Eich agrees [2].  This is heartening, because javascript has
much in common with python.

I agree too, for that matter, but that's probably a lot less
heartening :-).

> There's also code browsing, too; IDEs require a
> different (fuzzier) parser, 

Reason number two that I have to maintain a separate parser/analyzer.

> but the question of 'what types does this
> object probably have' makes more sense with a little dependent region
> analysis. Optimising when you can be fairly confident of the types
> involved could be useful. That doesn't really sound like pypy at that
> point, though.

Given that I want to work with Python3 anyway (and that I'd never be
able to beat pypy's performance before it supports Python3), I'm
focusing mostly on a tool to help make reliable and correct code.  

However, performance is always in the back of my mind these days.  It
seems from this thread that I won't be able to do much in that regard
with my current approach, unfortunately.  Maybe by the time I can focus
on it, pypy will support python3 and I can work on providing real-time
jit feedback.

-Terrence

[1] http://www.ocf.berkeley.edu/~bac/thesis.pdf
[2] http://brendaneich.com/2010/08/static-analysis-ftw/



From list-sink at trainedmonkeystudios.org  Tue Sep 28 22:33:14 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Tue, 28 Sep 2010 13:33:14 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
Message-ID: <1285705994.5276.99.camel@localhost>

On Tue, 2010-09-28 at 15:20 +0200, Maciej Fijalkowski wrote:
> On Tue, Sep 28, 2010 at 2:43 AM, Terrence Cole
>  wrote:
> > On Tue, 2010-09-28 at 01:57 +0200, Jacob Hallén wrote:
> >> Monday 27 September 2010 you wrote:
> >> > On Sun, 2010-09-26 at 23:57 -0700, Saravanan Shanmugham wrote:
> >> > > Well, I am happy to see that my interest in a general purpose RPython
> >> > > is not as isolated as I was led to believe :-))
> >> > > Thx,
> >> >
> >> > What I wrote has apparently been widely misunderstood, so let me explain
> >> > what I mean in more detail.  What I want is _not_ RPython and it is
> >> > _not_ Shedskin.  What I want is not a compiler at all.  What I want is a
> >> > visual tool, for example, a plugin to an IDE.  This tool would perform
> >> > static analysis on a piece of python code.  Instead of generating code
> >> > with this information, it would mark up the python code in the text
> >> > display with colors, weights, etc in order to show properties from the
> >> > static analysis.  This would be something like semantic highlighting, as
> >> > opposed to syntax highlighting.
> >> >
> >> > I think it possible that this information would, if created and
> >> > presented in the correct way, represent the sort of optimizations that
> >> > pypy-c-jit -- a full python implementation, not a language subset --
> >> > would likely perform on the code if run.  Given this sort of feedback,
> >> > it would be much easier for a python coder to write code that works well
> >> > with the jit: for example, moving a declaration inside a loop to avoid
> >> > boxing, based on the information presented.
> >> >
> >> > Ideally, such a tool would perform instantaneous syntax highlighting
> >> > while editing and do full parsing and analysis in the background to
> >> > update the semantic highlighting as frequently as possible.  Obviously,
> >> > detailed static analysis will provide far more information than it would
> >> > be possible to display on the code at once, so I see this gui as having
> >> > several modes -- like predator vision -- that show different information
> >> > from the analysis.  Naturally, what those modes are will depend strongly
> >> > on the details of how pypy-c-jit works internally, what sort of
> >> > information can be sanely collected through static analysis, and,
> >> > naturally, user testing.
> >> >
> >> > I was somewhat baffled at first as to how what I wrote before was
> >> > interpreted as interest in a static python.  I think the disconnect here
> >> > is the assumption on many people's part that a static language will
> >> > always be faster than a dynamic one.  Given the existing tools that
> >> > provide basically no feedback from the compiler / interpreter / jitter,
> >> > this is inevitably true at the moment.  I foresee a future, however,
> >> > where better tools let us use the full power of a dynamic python AND let
> >> > us tighten up our code for speed to get the full advantages of jit
> >> > compilation as well.  I believe that in the end, this combination will
> >> > prove superior to any fully static compiler.
> >> >
> >> > -Terrence
> >> >
> >> > > Sarvi
> >> > >
> >> > >
> >> > > ----- Original Message ----
> >> > >
> >> > > > From: Terrence Cole 
> >> > > > To: pypy-dev at codespeak.net
> >> > > > Sent: Sun, September 26, 2010 2:28:12 PM
> >> > > > Subject: Re: [pypy-dev] Question on the future of RPython
> >> > > >
> >> > > > On Sat, 2010-09-25 at 17:47 +0200, horace grant wrote:
> >> > > > > i just had a  (probably) silly idea. :)
> >> > > > >
> >> > > > > if some people like rpython so much,  how about writing a rpython
> >> > > > > interpreter in rpython? wouldn't it be much  easier for the jit to
> >> > > > > optimize rpython code? couldn't jitted rpython  code theoretically be
> >> > > > > as fast as a program that got compiled to c from  rpython?
> >> > > > >
> >> > > > > hm... but i wonder if this would make sense at all.  maybe if you ran
> >> > > > > rpython code with pypy-c-jit, it already could be  jitted as well as
> >> > > > > with a special rpython interpreter? ...if there were a  special
> >> > > > > rpython interpreter, would the current jit generator have to be
> >> > > > > changed to take advantage of the more simple language?
> >> > > >
> >> > > > An  excellent question at least.
> >> > > >
> >> > > > A better idea, I think, would be to  ask what subset of full-python
> >> > > > will jit well.  What I'd really like to  see is a static analyzer that
> >> > > > can display (e.g. by coloring names or lines)  how "jit friendly" a
> >> > > > piece of python code is.  This would allow a  programmer to get an
> >> > > > idea of what help the jit is going to be when running  their code and,
> >> > > > hopefully, help people avoid tragic performance  results.  Naturally,
> >> > > > for performance intensive code, you would still  need to profile, but
> >> > > > for a lot of uses, simply not having catastrophically  bad performance
> >> > > > is more than enough for a good user experience.
> >> > > >
> >> > > > With such a tool, it wouldn't really matter if the answer to "what  is
> >> > > > faster" is RPython -- it would be whatever python language  subset
> >> > > > happens to work well in a particular case.  I've started working  on
> >> > > > something like this [1], but given that I'm doing a startup, I  don't
> >> > > > have nearly the time I would need to make this useful in the
> >> > > > near-term.
> >>
> >> The JIT works because it has more information at runtime than what is
> >> available at compile time. If the information was available at compile time we
> >> could do the optimizations then and not have to invoke the extra complexity
> >> required by the JIT. Examples of the extra information include things like
> >> knowing that introspection will not be used in the current evaluation of a
> >> loop, specific argument types will be used in calls and that some arguments
> >> will be known to be constant over part of the program execution. Knowing
> >> these bits allows you to optimize away large chunks of the code that
> >> otherwise would have been executed.
> >>
> >> Static analysis assumes that none of the above mentioned possibilities can
> >> actually take place. It is impossible to make such assumptions at compile time
> >> in a dynamic language. Therefore PyPy is a bad match for people wanting to
> >> statically compile subsets of Python. Applying the JIT to RPython code
> >
> > Yes, that idea is just dumb.  It's also not what I suggested at all.  I
> > can see now that what I said would be easy to misinterpret, but on
> > re-reading it, it clearly doesn't say what you think it does.
> >
> >>  is not
> >> workable, because the JIT is optimized to remove bits of generated assembler
> >> code that never shows up in the compilation of RPython code.
> >>
> >> These are very basic first principle concepts, and it is a mystery to me why
> >> people can't work them out for themselves.
> >
> > You are quite right that static analysis will be able to do little to
> > help an optimal jit.  However, I doubt that in the near term pypy's jit
> > will cover all the dark corners of python equally well -- C has been
> > around for 38 years and it's still got room for optimization.
> >
> > -Terrence
> >
> >> Jacob Hallén
> 
> Hey.
> 
> I'm really interested in having jit feedback displayed as text info
> (say for profiling purposes). Do you have any particular ideas in mind
> or just a general one?

Lots.  They're almost all probably wrong though, so be warned :-).  I'm
also not entirely clear on what you mean, so let me tell you what I have
in mind and you can tell me if I'm way off base.

I assume workflow would go like this:  1) run pypy on a bunch of code in
profiling mode, 2) pypy spits out lots of data about what happened in
the jit when the program exits, 3) start up external analysis program
pointing it at this data, 4) browse the python source with the data from
the jit overlaid as color, formatting, etc on top of the source.
Potentially there would be several separate modes for viewing different
aspects of the jit info.  This could also include the ability to select
different program elements (loops, variables, functions, etc) and get
detailed information about their runtime usage in a side-pane.  Ideally,
this workflow would be taken care of automatically by pushing the run
button in your IDE.

As a more specific example of what the gui would do in, for instance,
escape analysis mode:  display local variables that do not escape any
loops in green, others in red.  Hovering over a red variable would show
information about how, why, and where it escapes the loop in a tooltip
or bubble.  Selecting a red variable would show the same info in a pane and
would draw arrows on the source showing where it escapes from a
loop/function etc.
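
A text-only sketch of that escape-analysis mode might look like the following. Everything here is hypothetical: the ESCAPES table stands in for data a real tool would derive from the jit's profiling output, and the [ok:...] / [esc:...] tags stand in for green and red coloring in a gui.

```python
import io
import tokenize

# Hypothetical profiler output: variable name -> True if it escapes a loop.
# These values are made up; nothing here is read from a real pypy run.
ESCAPES = {"total": False, "item": True}


def mark_escapes(source, escapes):
    """Wrap known variables in [ok:name] or [esc:name] tags."""
    out_lines = source.splitlines()
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and tok.string in escapes
    ]
    # Rewrite from the last token backwards so earlier columns stay valid.
    for tok in reversed(tokens):
        row, col = tok.start
        tag = "esc" if escapes[tok.string] else "ok"
        line = out_lines[row - 1]
        end = col + len(tok.string)
        out_lines[row - 1] = line[:col] + "[%s:%s]" % (tag, tok.string) + line[end:]
    return "\n".join(out_lines)


code = "for item in data:\n    total += 1\n    keep.append(item)\n"
marked = mark_escapes(code, ESCAPES)
```

Keying the table on bare names is a simplification; a usable tool would need to key on scopes and source positions, since the same name can escape in one loop and not in another.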

In my ideal world, this profiling data analysis would sit side-by-side
with various display modes that show useful static analysis feedback,
all inside a full-fledged python IDE.

This is all, of course, a long way off still.  What I'm working on right
now is basic linting for python3 so that I can add a lint step to our
hudson server and start to get some graphs up.

What I _really_ would like to work on, if I had the time, is making pypy
support Python3 so that I could use it at work.  However, I think I'd
mostly just get in the way if I tried that, given my other time
commitments.

I hope there was something helpful in that brain-dump, but I suspect I
may be way off target at this point.

-Terrence




From anto.cuni at gmail.com  Wed Sep 29 11:37:05 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Wed, 29 Sep 2010 11:37:05 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285705994.5276.99.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	<827221.6704.qm@web53704.mail.re2.yahoo.com>	<1285616691.5954.34.camel@localhost>	<201009280157.17264.jacob@openend.se>	<1285634613.5954.112.camel@localhost>	
	<1285705994.5276.99.camel@localhost>
Message-ID: <4CA308C1.5030200@gmail.com>

Hi Terrence, hi all

On 28/09/10 22:33, Terrence Cole wrote:
> I assume workflow would go like this:  1) run pypy on a bunch of code in
> profiling mode, 2) pypy spits out lots of data about what happened in
> the jit when the program exits, 3) start up external analysis program
> pointing it at this data, 4) browse the python source with the data from
> the jit overlaid as color, formatting, etc on top of the source.

You can already do it (partially) by using the PYPYLOG environment variable 
like this:

PYPYLOG=jit-log-opt:mylog ./pypy -m test.pystone

then, mylog contains all the loops and bridges produced by the jit. The 
interesting point is that there are also special operations called 
"debug_merge_point" that are emitted for each python bytecode, so you can 
easily map the low-level jit instructions back to the original python source.

E.g., take line 214 of pystone:
     Array1Par[IntLoc+30] = IntLoc

The corresponding python bytecode is this:
214          38 LOAD_FAST                4 (IntLoc)
              41 LOAD_FAST                0 (Array1Par)
              44 LOAD_FAST                4 (IntLoc)
              47 LOAD_CONST               3 (30)
              50 BINARY_ADD
              51 STORE_SUBSCR
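
A listing like the one above can be reproduced with the standard dis module.
This is only a sketch: the function name store_example is made up for
illustration, and opcode names and offsets depend on the interpreter version,
so they may not match the listing exactly.

```python
import dis


def store_example(Array1Par, IntLoc):
    # Same statement as line 214 of pystone.
    Array1Par[IntLoc + 30] = IntLoc


# Collect the opcode names in order. Exact names and offsets vary between
# Python versions, so only the final subscript store is relied upon here.
opnames = [ins.opname for ins in dis.get_instructions(store_example)]
print(opnames)

# Running the function stores IntLoc at index IntLoc + 30.
data = [0] * 40
store_example(data, 8)
```

With IntLoc equal to 8, the store lands in element 38, which is the case
discussed in the log excerpt.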


By searching in the logs, you find the following (I edited it a bit to improve 
readability):

debug_merge_point(' #38 LOAD_FAST')
debug_merge_point(' #41 LOAD_FAST')
debug_merge_point(' #44 LOAD_FAST')
debug_merge_point(' #47 LOAD_CONST')
debug_merge_point(' #50 BINARY_ADD')
debug_merge_point(' #51 STORE_SUBSCR')
p345 = new_with_vtable(ConstClass(W_IntObject))
setfield_gc(p345, 8, descr=)
call(ConstClass(ll_setitem__dum_checkidxConst_listPtr_Signed_objectPtr),
      p333, 38, p345, descr=)
guard_no_exception(, descr=) [p1, p0, p71, p345, p312, p3, p4, p6, 
p308, p315, p335, p12, p13, p14, p15, p16, p18, p19, p178, p26, p320, p328, 
i124, p25, i329]

Here, you can see that most opcodes are "empty" (i.e., no operations between 
one debug_merge_point and the next). In general, all the opcodes that 
manipulate the python stack are optimized away by the jit, because all the 
python variables on the stack become "local variables" in the assembler.

Moreover, you can see that BINARY_ADD is also empty: this probably means that 
the loop was specialized for the specific value of IntLoc, so the addition has 
been constant-folded away.  Indeed, the only opcode that does real work is 
STORE_SUBSCR.  What it does is allocate a new W_IntObject whose value is 8 
(i.e., boxing IntLoc on the fly, because it's escaping), and store it into the 
element 38 of the list stored in p333.

Finally, we check that no exception was raised.
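
The mapping from bytecodes to jit operations described above can be sketched
as a small log-grouping script. This is a minimal sketch under assumptions:
the sample text is an abbreviated copy of the edited excerpt above (a real
PYPYLOG file contains more detail and may be formatted differently), and the
names SAMPLE_LOG, MERGE_POINT and group_by_bytecode are hypothetical.

```python
import re
from collections import OrderedDict

# Abbreviated copy of the edited excerpt; not a verbatim PYPYLOG dump.
SAMPLE_LOG = """\
debug_merge_point(' #38 LOAD_FAST')
debug_merge_point(' #41 LOAD_FAST')
debug_merge_point(' #44 LOAD_FAST')
debug_merge_point(' #47 LOAD_CONST')
debug_merge_point(' #50 BINARY_ADD')
debug_merge_point(' #51 STORE_SUBSCR')
p345 = new_with_vtable(ConstClass(W_IntObject))
setfield_gc(p345, 8, descr=)
guard_no_exception(, descr=)
"""

# Matches lines such as: debug_merge_point(' #38 LOAD_FAST')
MERGE_POINT = re.compile(r"debug_merge_point\('\s*#(\d+)\s+(\w+)'\)")


def group_by_bytecode(log_text):
    """Map (bytecode offset, opcode name) to the jit operations that follow it."""
    groups = OrderedDict()
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = MERGE_POINT.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
            # setdefault keeps earlier operations if the same bytecode
            # shows up again later in the log.
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].append(line)
    return groups


groups = group_by_bytecode(SAMPLE_LOG)
# Opcodes with no operations attached were optimized away by the jit.
empty = [name for (offset, name), ops in groups.items() if not ops]
print(empty)
```

Here every opcode except STORE_SUBSCR ends up with an empty operation list,
which matches the reading of the excerpt given above.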

Obviously, when presenting this information to the user you must consider 
that there is not a 1-to-1 mapping from python source to jit loops.  In the 
example above, the very same opcodes are also compiled in another loop (which 
by chance has the same jit-operations, but they might also be very 
different, depending on the case).

As you can see, there is already a lot of information that can be useful to the 
user.  However, don't ask me how to present it visually :-)

ciao,
anto

From fijall at gmail.com  Wed Sep 29 12:35:00 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 29 Sep 2010 12:35:00 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <4CA308C1.5030200@gmail.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285705994.5276.99.camel@localhost> <4CA308C1.5030200@gmail.com>
Message-ID: 

>
> As you can see, there is already a lot of information that can be useful to
> the user.  However, don't ask me how to present it visually :-)
>

As you've probably noticed, it takes quite a bit of skill to actually
read it and say, for example, which variables are unescaped locals.

From list-sink at trainedmonkeystudios.org  Wed Sep 29 22:40:59 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Wed, 29 Sep 2010 13:40:59 -0700
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <4CA308C1.5030200@gmail.com>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285705994.5276.99.camel@localhost>  <4CA308C1.5030200@gmail.com>
Message-ID: <1285792859.5129.79.camel@localhost>

On Wed, 2010-09-29 at 11:37 +0200, Antonio Cuni wrote:
> Hi Terrence, hi all
> 
> On 28/09/10 22:33, Terrence Cole wrote:
> > I assume workflow would go like this:  1) run pypy on a bunch of code in
> > profiling mode, 2) pypy spits out lots of data about what happened in
> > the jit when the program exits, 3) start up external analysis program
> > pointing it at this data, 4) browse the python source with the data from
> > the jit overlayed as color, formatting, etc on top of the source.
> 
> You can already do it (partially) by using the PYPYLOG environment variable 
> like this:
> 
> PYPYLOG=jit-log-opt:mylog ./pypy -m test.pystone
> 
> then, mylog contains all the loops and bridges produced by the jit. The 
> interesting point is that there are also special operations called 
> "debug_merge_point" that are emitted for each python bytecode, so you can 
> easily map the low-level jit instructions back to the original python source.

I think that 'easily' in that last sentence is missing scare-quotes. :-)

> E.g., take line 214 of pystone:
>      Array1Par[IntLoc+30] = IntLoc
> 
> The corresponding python bytecode is this:
> 214          38 LOAD_FAST                4 (IntLoc)
>               41 LOAD_FAST                0 (Array1Par)
>               44 LOAD_FAST                4 (IntLoc)
>               47 LOAD_CONST               3 (30)
>               50 BINARY_ADD
>               51 STORE_SUBSCR
> 
> 
> By searching in the logs, you find the following (I edited it a bit to improve 
> readability):
> 
> debug_merge_point(' #38 LOAD_FAST')
> debug_merge_point(' #41 LOAD_FAST')
> debug_merge_point(' #44 LOAD_FAST')
> debug_merge_point(' #47 LOAD_CONST')
> debug_merge_point(' #50 BINARY_ADD')
> debug_merge_point(' #51 STORE_SUBSCR')
> p345 = new_with_vtable(ConstClass(W_IntObject))
> setfield_gc(p345, 8, descr= pypy.objspace.std.intobject.W_IntObject.inst_intval 8>)
> call(ConstClass(ll_setitem__dum_checkidxConst_listPtr_Signed_objectPtr),
>       p333, 38, p345, descr=)
> guard_no_exception(, descr=) [p1, p0, p71, p345, p312, p3, p4, p6, 
> p308, p315, p335, p12, p13, p14, p15, p16, p18, p19, p178, p26, p320, p328, 
> i124, p25, i329]
> 
> Here, you can see that most opcodes are "empty" (i.e., no operations between 
> one debug_merge_point and the next). In general, all the opcodes that 
> manipulate the python stack are optimized away by the jit, because all the 
> python variables on the stack become "local variables" in the assembler.
> 
> Moreover, you can see that BINARY_ADD is also empty: this probably means that 
> the loop was specialized for the specific value of IntLoc, so the addition has 
> been constant-folded away.  Indeed, the only opcode that does real work is 
> STORE_SUBSCR.  What it does is allocate a new W_IntObject whose value is 8 
> (i.e., boxing IntLoc on the fly, because it's escaping), and store it into the 
> element 38 of the list stored in p333.
> 
> Finally, we check that no exception was raised.

Wow, thank you for the awesome explanation.  I think the only surprising
thing in there is that I actually understood all of that.

> Obviously, when presenting this information to the user you must consider 
> that there is not a 1-to-1 mapping from python source to jit loops.  In the 
> example above, the very same opcodes are also compiled in another loop (which 
> by chance has the same jit-operations, but they might also be very 
> different, depending on the case).

Currently, in my hacked together parsing chain, the low-level parser
keeps a reference to the underlying token when it creates a new node and
subsequently the ast builder keeps a reference to the low-level parse
node when it creates an ast node.  This way, I can easily map down to
the individual source chars and full context when walking the AST to do
highlighting, linting, etc.  

My first inclination would be to continue this chain and add a bytecode
compiler on top of the ast builder.  This would keep ast node references
in the instructions it creates.  If the algorithms don't diverge too
much, I think this would allow the debug output to be mapped all the way
back to the source chars with minimal effort.  I'm not terrifically
familiar with the specifics of how python emits bytecode from an ast, so
I'd appreciate any feedback if you think this is crazy-talk.

> As you can see, there is already a lot of information that can be useful to the 
> user.  However, don't ask me how to present it visually :-)

Neither do I, but finding out is going to be the fun part.  

> ciao,
> anto

I'm excited to try some of this out, but unfortunately, there is an
annoying problem in the way.  All of my work in the last year has been
on python3.  Having worked in python3 for awhile now, my opinion is that
it's just a much better language -- surprisingly so, considering how
little it changed from python2.  If pypy supported python3, then I could
maintain my parser as a diff against pypy (you are switching to
Mercurial at some point, right?), which would make it much easier to
avoid divergence.

So what I'm getting at is: what is the pypy story for python3 support?
I haven't seen anything on pypy-dev, or in my occasional looks at the
repository, to suggest that it is being worked on but I'm sure you have
a plan of some sort.  I'm willing to help out with python3 support, if I
can do so without getting in anyone's way.  It seems like the sort of
thing that will be disruptive, however, so I have been leery of jumping
in, considering how little time I have to contribute, at the moment.

In my mind, the python3 picture is something like:
At the compilation level, it's easy enough to dump Grammar3.2 in
pypy/interpreter/pyparser/data and to modify astbuilder for python3 --
I'll backport the changes I made, if you want.  Interpreter support for
the new language features will be harder, but that's probably already
done since 2.7 is almost working.  The only potential problems I see are
the big string/unicode switch and the management of the fairly large
changes to astbuilder -- I'm sure you want to continue supporting
python2 into the future.  I don't know how much the bytecode changed
between 2 and 3, so I'm not sure if there are jit issues to worry about.
Am I missing anything big?


-Terrence



From andrewfr_ice at yahoo.com  Wed Sep 29 23:39:55 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Wed, 29 Sep 2010 14:39:55 -0700 (PDT)
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
Message-ID: <69092.15439.qm@web120710.mail.ne1.yahoo.com>

Hi Maciej:

Message: 4
Date: Mon, 27 Sep 2010 14:23:14 +0200
From: Maciej Fijalkowski 
Subject: Re: [pypy-dev] PyPy JIT & C extensions, greenlet
To: "Ian P. Cooke" 
Cc: pypy-dev at codespeak.net
Message-ID:
    
Content-Type: text/plain; charset=UTF-8

>greenlet C module is quite incompatible with pypy and won't work.
>However making pypy work with jit and stackless is something that
>requires a bit of work only (teaching jit how to unroll the stack
>mostly) and I plan to look into it in the very near future.

I talked briefly with Armin at EuroPython about Stackless and JIT. 
I poke around pypy-dev and use stackless.py. However I am very interested
in learning about how the stackless transform works and how pypy works so
I could help. I have been at Stackless for about five years now and I wouldn't mind spending a year learning pypy.

Cheers,
Andrew





      

From p.giarrusso at gmail.com  Thu Sep 30 08:35:22 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 30 Sep 2010 08:35:22 +0200
Subject: [pypy-dev] Fwd:  Question on the future of RPython
In-Reply-To: <1285809701.5129.150.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
Message-ID: 

I'm forwarding this mail that Terrence Cole sent me privately for no
reason apparent to me, because without it my mail makes less sense.

---------- Forwarded message ----------
From: Terrence Cole 
Date: Thu, Sep 30, 2010 at 03:21
Subject: Re: [pypy-dev] Question on the future of RPython
To: Paolo Giarrusso 


On Wed, 2010-09-29 at 23:50 +0200, Paolo Giarrusso wrote:
> The uselessness of static type analysis in dynamic languages was shown
> concretely in 1991 on a SmallTalk derivative, Self, the father of
> JavaScript, but I've seen the result only in the PhD thesis* of U.
> Hölzle, one of Self's authors - and I'm pointing it out because I'm not
> sure how well known it is.

I've read parts of that at some point in the past.

> That might explain why Brett Cannon's PhD thesis was not that
> successful. Of course, after reading his thesis you might be able in
> theory to address his concerns, and he acknowledges that type analysis
> might help with a few specific cases. I'm not sure about Brendan
> Eich's points, but I don't have the time, unfortunately, to
> investigate them.

Dr. Cannon achieved an ~1% performance improvement.  From his conclusion,
the most significant problem he found was that most program data is
stored in object attributes and object attributes were not
type-inferable with the methods he used.

> * Title: "Adaptive optimization for Self: Reconciling High Performance
> with Exploratory Programming". See, among others, sec. 7.2, "Type
> analysis does not improve performance of object-oriented programs",
> sec. 7.4.1, "type analysis exhibits unstable performance". The
> conclusion is that in most cases it helps more to use "type feedback",
> indicates profile-guided optimization applied

Scanning through sec. 7.2, they say:

"In general, type analysis cannot infer the result types of non-inlined
message sends, of arguments of non-inlined sends, or of assignable slots
(i.e., instance variables). Since sends to such values are fairly
frequent, a large fraction of message sends does not benefit from type
analysis."

So they seem to have run into similar problems: the type inferencing
algorithm doesn't get to sink its teeth into any of the bits that most
need it.

Chapter 3 of Cannon05 details what specific parts of python can't be
type-inferred.  It turns out this is most of python.  For example, type
inferencing can't cross module boundaries because the module used at
runtime might be different from the one present at compile time.  While
perfectly true, this is probably going to be quite rare in practice.
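
The module-boundary problem can be shown with a few lines of Python. This is
a minimal sketch: the module name "shapes" and the function names are invented
for illustration, and a types.ModuleType object stands in for a real imported
module.

```python
import types

# A stand-in for an imported module (hypothetical name "shapes").
shapes = types.ModuleType("shapes")
shapes.area = lambda w, h: w * h      # returns an int for int arguments


def describe(w, h):
    # An analysis that only reads the definition above would infer "int" here.
    return shapes.area(w, h)


before = describe(2, 3)

# Any code may rebind the module attribute at runtime, so the inferred
# type is no longer valid for the very same call site.
shapes.area = lambda w, h: "%dx%d" % (w, h)

after = describe(2, 3)
print(type(before).__name__, type(after).__name__)
```

The call site in describe() is unchanged, yet its result type differs between
the two calls, which is why an inference across the module boundary has to be
treated as an assumption rather than a fact.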

I think it makes sense to treat the sorts of optimizations you can do
with static analysis of a dynamic language with the same sort of guarded
optimism that is used in pypy's jit compiler: run with the assumption
that it will all work out, but watch for failure and fall back to a safe
slow-path nicely when things go off the rails.
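
The "guarded optimism" above can be sketched as a guard in front of a
specialized fast path, with a generic slow path as the fallback. This is an
illustration only: the function names and the counters are made up and do not
correspond to anything in pypy.

```python
def make_guarded_add():
    """Return an add function with a guarded fast path, plus its counters."""
    stats = {"fast": 0, "slow": 0}

    def generic_add(a, b):
        # Safe slow path: ordinary dynamic dispatch on the operand types.
        stats["slow"] += 1
        return a + b

    def guarded_add(a, b):
        # Guard: the optimistic assumption is that both operands are plain ints.
        if type(a) is int and type(b) is int:
            stats["fast"] += 1
            return a + b      # stands in for specialized integer code
        # Guard failed: fall back instead of producing a wrong result.
        return generic_add(a, b)

    return guarded_add, stats


add, stats = make_guarded_add()
int_result = add(1, 2)
str_result = add("a", "b")
print(int_result, str_result, stats)
```

The point of the design is that a failed assumption costs time but never
correctness: the string case still gets the right answer through the slow path.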

> See below further inline replies.
>
> On Tue, Sep 28, 2010 at 21:49, Terrence Cole
>  wrote:
> > I think this is a disconnect.  Applying a jit to a non-interpreted
> > language -- Jacob here seems to think I was talking about a static,
> > compiled subset of python -- makes little sense.
>
> JIT just makes much _more_ sense for dynamic languages. Profile-guided
> optimization (PGO) gives impressive performance improvements for C, but
> can also worsen performance when the execution profile changes
> (because of different inputs), and JIT would solve this problem. Since
> we are talking about optimized C, "impressive performance
> improvements" is more like 20% more (so to say) than like 10x-100x,
> and it's mostly about things like static branch prediction.
> For RPython, it makes more sense to reuse the support from the
> translation backend (JVM and .NET should do this by default; in C you
> can use PGO on the translation output).
>
> >> There are some undesirable things about static analysis, but it can
> >> sure be useful from optimisation, security and reliability
> >> perspectives.
> >
> > Brendan Eich agrees [2].  This is heartening, because javascript has
> > much in common with python.
>
> > [1] http://www.ocf.berkeley.edu/~bac/thesis.pdf
> > [2] http://brendaneich.com/2010/08/static-analysis-ftw/
>





-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From p.giarrusso at gmail.com  Thu Sep 30 08:33:02 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Thu, 30 Sep 2010 08:33:02 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285809701.5129.150.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
Message-ID: 

On Thu, Sep 30, 2010 at 03:21, Terrence Cole
 wrote:
> On Wed, 2010-09-29 at 23:50 +0200, Paolo Giarrusso wrote:

> I think it makes sense to treat the sorts of optimizations you can do
> with static analysis of a dynamic language with the same sort of guarded
> optimism that is used in pypy's jit compiler: run with the assumption
> that it will all work out, but watch for failure and fallback to a safe
> slow-path nicely when things go off the rails.

Agreed, but "watch out for failure" is done by guards, and one of the
further advantages of type inference, on top of "type feedback",
described there, is being able to remove some of those guards, because
that might help in some inner loops.
In some cases, that's still possible, by "watching out" not during
code execution but during execution of infrequent events. In Java
loading a class might invalidate some optimization assumptions (like
"this class has no subclass", useful for inlining without guards), but
that's checked by the classloading fast-path.

The same could apply to allow Python cross-module type-inferencing.
I don't know exactly how modules at JIT compile-time and runtime can
be different, but I guess that invalidation at module loading should
catch that. And invalidate lots of compiled code, which is usually
fine. The interaction of this with tracing is actually interesting: in
a Python tracing JIT, you could keep the traces and restore omitted
guards, but when your JIT traces the Python interpreter, I wonder how
you express any of this. One can insert a guard only if needed, but
telling "hey, this is invalid" requires some special API.

My proposal, here, would be a "virtual guard", which is recorded in
the trace but omitted from the output. The "omission" is what can be
invalidated, but the trace itself (not its compiled version) is kept
(because it can still be executed).
If some form of this actually makes sense (I've only thought about it for 5
minutes), this is something worth publishing, which would allow me to
take some time of my PhD to work on it - if it's not already done by
papers on tracing JITs.
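
A toy model of the "virtual guard" idea described above: the guard is kept
with the trace but skipped while an assumption holds, and a rare event
restores it. This is a sketch under assumptions; the classes Assumption and
TraceCode are hypothetical and say nothing about how pypy or any other JIT
implements this.

```python
class Assumption:
    """A fact that compiled code relies on, e.g. "Animal has no subclass"."""

    def __init__(self, description):
        self.description = description
        self.dependents = []

    def invalidate(self):
        # Called from the rare event (class or module loading),
        # never from the hot loop.
        for code in self.dependents:
            code.guard_omitted = False
        self.dependents = []


class TraceCode:
    """A trace whose guard is recorded but initially omitted from execution."""

    def __init__(self, fast_path, guard, slow_path, assumption):
        self.fast_path = fast_path
        self.guard = guard
        self.slow_path = slow_path
        self.guard_omitted = True
        self.guard_checks = 0
        assumption.dependents.append(self)

    def run(self, obj):
        if self.guard_omitted:
            return self.fast_path(obj)       # no per-call check
        self.guard_checks += 1               # guard restored after invalidation
        if self.guard(obj):
            return self.fast_path(obj)
        return self.slow_path(obj)


class Animal:
    def speak(self):
        return "generic"


no_subclass = Assumption("Animal has no subclass overriding speak()")
code = TraceCode(
    fast_path=lambda obj: "generic",         # speak() inlined for Animal
    guard=lambda obj: type(obj) is Animal,
    slow_path=lambda obj: obj.speak(),
    assumption=no_subclass,
)

first = code.run(Animal())                   # runs without any guard check


class Dog(Animal):
    def speak(self):
        return "woof"


# In a real system the class-creation hook would trigger this.
no_subclass.invalidate()
second = code.run(Dog())
third = code.run(Animal())
print(first, second, third, code.guard_checks)
```

The trace object itself survives invalidation; only the permission to skip the
guard is withdrawn, which is the distinction the proposal draws between the
trace and its compiled version.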
-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From holger at merlinux.eu  Thu Sep 30 09:37:46 2010
From: holger at merlinux.eu (holger krekel)
Date: Thu, 30 Sep 2010 09:37:46 +0200
Subject: [pypy-dev] Python3,
	Python2.7 - fast-forward (was Re: Question on	the future of RPython)
In-Reply-To: <1285792859.5129.79.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285705994.5276.99.camel@localhost> <4CA308C1.5030200@gmail.com>
	<1285792859.5129.79.camel@localhost>
Message-ID: <20100930073746.GL20695@trillke.net>

Hi Terrence, all,

On Wed, Sep 29, 2010 at 13:40 -0700, Terrence Cole wrote:
> So what I'm getting at is: what is the pypy story for python3 support?
> I haven't seen anything on pypy-dev, or in my occasional looks at the
> repository, to suggest that it is being worked on but I'm sure you have
> a plan of some sort.  I'm willing to help out with python3 support, if I
> can do so without getting in anyone's way.  It seems like the sort of
> thing that will be disruptive, however, so I have been leery of jumping
> in, considering how little time I have to contribute, at the moment.

In fact, there has been work from Benjamin Peterson and there is some 
ongoing work from Amaury and Alex to complete the

    http://codespeak.net/svn/pypy/branch/fast-forward/

branch.  It aims at offering Python2.7 compatibility.  This is a
good intermediate step to jump to Python3 at some point.  Most
PyPy core devs are focusing on JIT related tasks so this is 
a good place to help out in general.  

If you'd like to help you can drop by at #pypy on freenode and/or maybe
some of the people involved can point to some tasks here. 

cheers,
holger

> In my mind, the python3 picture is something like:
> At the compilation level, it's easy enough to dump Grammar3.2 in
> pypy/interpreter/pyparser/data and to modify astbuilder for python3 --
> I'll backport the changes I made, if you want.  Interpreter support for
> the new language features will be harder, but that's probably already
> done since 2.7 is almost working.  The only potential problems I see are
> the big string/unicode switch and the management of the fairly large
> changes to astbuilder -- I'm sure you want to continue supporting
> python2 into the future.  I don't know how much the bytecode changed
> between 2 and 3, so I'm not sure if there are jit issues to worry about.
> Am I missing anything big?
> 
> 
> -Terrence
> 
> 
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
> 

-- 

From fijall at gmail.com  Thu Sep 30 10:51:01 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Thu, 30 Sep 2010 10:51:01 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285792859.5129.79.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285705994.5276.99.camel@localhost> <4CA308C1.5030200@gmail.com>
	<1285792859.5129.79.camel@localhost>
Message-ID: 

[not answering the rest]

> Am I missing anything big?

Standard library is usually the biggest thing when porting from one
version of python to another. Other big issues are about RPython
itself. Do we want RPython to be python3 compatible? How?

From arigo at tunes.org  Thu Sep 30 12:05:10 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Sep 2010 12:05:10 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285705994.5276.99.camel@localhost> <4CA308C1.5030200@gmail.com>
	<1285792859.5129.79.camel@localhost>
	
Message-ID: 

Hi Maciej,

On Thu, Sep 30, 2010 at 10:51 AM, Maciej Fijalkowski  wrote:
> Other big issues are about RPython
> itself. Do we want RPython to be python3 compatible? How?

No, I'm pretty sure that even if we want to support python3 at some
point, RPython will remain what it is now, and translate.py will
remain a python2 tool.


Armin

From arigo at tunes.org  Thu Sep 30 13:01:59 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Sep 2010 13:01:59 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
	
Message-ID: 

Hi Paolo,

On Thu, Sep 30, 2010 at 8:33 AM, Paolo Giarrusso  wrote:
> My proposal, here, would be a "virtual guard", (...)

Yes, this proposal makes sense.  It's an optimization that is
definitely done in regular JITs, and we have a "to-do" task about it
in http://codespeak.net/svn/pypy/extradoc/planning/jit.txt (where they
are called "out-of-line guards").


Armin.

From arigo at tunes.org  Thu Sep 30 13:23:24 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Sep 2010 13:23:24 +0200
Subject: [pypy-dev] Fwd: Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
	
Message-ID: 

Hi Terrence,

I think that what you are describing is found in informal discussions
about LLVM/HLVM, and more formally in the plans for Unladen Swallow at
http://code.google.com/p/unladen-swallow/wiki/ProjectPlan .  See in
particular the section about Feedback-Directed Optimization.  Maybe
you want to discuss these ideas with the Unladen Swallow guys instead
:-)


Armin.

From arigo at tunes.org  Thu Sep 30 14:25:53 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Sep 2010 14:25:53 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: <69092.15439.qm@web120710.mail.ne1.yahoo.com>
References: 
	<69092.15439.qm@web120710.mail.ne1.yahoo.com>
Message-ID: 

Hi,

On Wed, Sep 29, 2010 at 11:39 PM, Andrew Francis  wrote:
> I talked briefly with Armin at EuroPython about Stackless and JIT.
> I poke around pypy-dev and use stackless.py. However I am very interested
> in learning about how the stackless transform works and how pypy works so
> I could help. I have been at Stackless for about five years now and I wouldn't mind
> spending a year learning pypy.

Maybe I should expand on an idea posted on #pypy by fijal.  He
mentioned that he would like to try to support Stackless in PyPy
without using the stackless transform, just by using the same
low-level stack hacks that are done by greenlet.c and optionally by
Stackless Python.  This means that there would be two different
approaches we can consider to support Stackless in PyPy:

    stackless transform (done)          C-level stack switching
    --------------------------------    --------------------------------

    approach from Stackless Python 1    approach from Stackless Python 2

    10-20% speed penalty in the         no speed penalty
    whole interpreter

    JIT support needed                  JIT support comes for free
    (missing so far)

    fully portable                      needs a little bit of assembler

    some issues to integrate with       easy to integrate with C code
    non-PyPy C code

    tasklet-switching Python code       tasklet-switching Python code
    becomes a single loop in            becomes N loops with residual calls
    machine code                        to switch() functions
    (potentially very good)             (less good)


As you can see from the above summary, the main issue with the 2nd
approach would be that Python tasklet-switching loops do not turn into
a form that is as efficient as what we would get in the 1st approach.
Nevertheless it is an interesting approach because it makes basic JIT
support and integration easier.
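
For readers unfamiliar with tasklets, the kind of "tasklet-switching Python
code" compared above can be approximated with plain generators. This is only
an analogy: it uses neither the stackless module nor greenlet, and the names
tasklet and run_round_robin are invented for this sketch.

```python
from collections import deque


def tasklet(name, count, log):
    """A cooperative task that records its progress and then switches away."""
    for i in range(count):
        log.append((name, i))
        yield                     # switch point: hand control to the scheduler


def run_round_robin(tasklets):
    """Run each tasklet until its next switch point, in turn, until all finish."""
    queue = deque(tasklets)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue              # finished: do not reschedule
        queue.append(current)


log = []
run_round_robin([tasklet("a", 2, log), tasklet("b", 3, log)])
print(log)
```

Each yield plays the role of a switch() call; whether such a loop becomes one
machine-code loop or several is exactly the difference between the two columns
of the table.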


A bientôt,

Armin.

From anto.cuni at gmail.com  Thu Sep 30 14:32:01 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 30 Sep 2010 14:32:01 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: <1285792859.5129.79.camel@localhost>
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>	
	<827221.6704.qm@web53704.mail.re2.yahoo.com>	
	<1285616691.5954.34.camel@localhost>	<201009280157.17264.jacob@openend.se>	
	<1285634613.5954.112.camel@localhost>	
		
	<1285705994.5276.99.camel@localhost> <4CA308C1.5030200@gmail.com>
	<1285792859.5129.79.camel@localhost>
Message-ID: <4CA48341.3020106@gmail.com>

On 29/09/10 22:40, Terrence Cole wrote:

>> then, mylog contains all the loops and bridges produced by the jit. The
>> interesting point is that there are also special operations called
>> "debug_merge_point" that are emitted for each python bytecode, so you can
>> easily map the low-level jit instructions back to the original python source.
>
> I think that 'easily' in that last sentence is missing scare-quotes. :-)

well, it's easy as long as you have a bytecode-compiled version around. With 
only the AST I agree that it might be a bit trickier.

[cut]
> My first inclination would be to continue this chain and add a bytecode
> compiler on top of the ast builder.  This would keep ast node references
> in the instructions it creates.  If the algorithms don't diverge too
> much, I think this would allow the debug output to be mapped all the way
> back to the source chars with minimal effort.  I'm not terrifically
> familiar with the specifics of how python emits bytecode from an ast, so
> I'd appreciate any feedback if you think this is crazy-talk.

Are you using your custom-made AST or the one from the standard library? In 
the latter case, you can just pass the ast to the compile() builtin function 
to get the corresponding bytecode.
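
A minimal sketch of that route, using only the standard ast module and the
compile() builtin. The source string and the "<example>" filename are made up
for illustration.

```python
import ast

source = "values = [0] * 40\nvalues[8 + 30] = 8\n"

# Parse to a standard-library AST; every statement node keeps its position.
tree = ast.parse(source, filename="<example>")
positions = [(type(node).__name__, node.lineno, node.col_offset)
             for node in tree.body]

# compile() accepts the AST object directly and returns a code object.
code = compile(tree, "<example>", "exec")

namespace = {}
exec(code, namespace)
print(positions, namespace["values"][38])
```

The lineno and col_offset attributes are what allow an AST node to be mapped
back to the source characters; a custom-made AST would need to carry equivalent
fields.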


From arigo at tunes.org  Thu Sep 30 14:55:15 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Sep 2010 14:55:15 +0200
Subject: [pypy-dev] PyPy JIT & C extensions, greenlet
In-Reply-To: 
References: 
	<69092.15439.qm@web120710.mail.ne1.yahoo.com>
	
Message-ID: 

Re-hi,

I forgot to mention that the improvement over the "stackless
transform" approach for PyPy might be to not apply the stackless
transform on the interpreter, but replace it with resuming C code
using the blackhole interpreter that we have anyway in the JIT.

Also, any "final" long-term approach that anyone should at least
consider if taking all of this seriously might be a mix of the two
approaches, similar to Stackless Python 3 which combines both
Stackless Python 1 and Stackless Python 2 features.


Armin

From renesd at gmail.com  Thu Sep 30 16:35:05 2010
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Thu, 30 Sep 2010 16:35:05 +0200
Subject: [pypy-dev] Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
	
	
Message-ID: 

Hi,

on the topic of optimizations, since I should get these out of my
brain, but you probably don't want them in yours... but I'll write
them anyway.

Considered Duff's device for loop unwinding?  For dividing the number
of loop-counter increments by 8 or more:
http://en.wikipedia.org/wiki/Duff%27s_Device
I'm guessing pypy already does this or something better.

Outputting gcc extensions, and pragmas, for example using
__builtin_expect to help branch prediction:
#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

As well as restrict, to tell it about pointer aliasing?
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Restricted-Pointers.html


Also, have you considered using PREFETCH* (or gccs __builtin_prefetch)
instructions when you are iterating over sequences?  If you know there is
some memory coming, and can slip in some of these instructions, it's
usually a win.
    http://gcc.gnu.org/projects/prefetch.html

__builtin_constant_p for constant detection?

SSE2 optimized hash functions?  It seems this is a big speedup for
interpreters when the hash function is sped up... I guess pypy already
uses an inline cache, but I hope it would still speed things up.



... almost deleted this email, since I hate suggesting things that
people might like to work on... but I didn't. oops, sorry.



On Thu, Sep 30, 2010 at 1:01 PM, Armin Rigo  wrote:
>
> Hi Paolo,
>
> On Thu, Sep 30, 2010 at 8:33 AM, Paolo Giarrusso  wrote:
> > My proposal, here, would be a "virtual guard", (...)
>
> Yes, this proposal makes sense.  It's an optimization that is
> definitely done in regular JITs, and we have a "to-do" task about it
> in http://codespeak.net/svn/pypy/extradoc/planning/jit.txt (where they
> are called "out-of-line guards").
>
>
> Armin.

From list-sink at trainedmonkeystudios.org  Thu Sep 30 22:01:11 2010
From: list-sink at trainedmonkeystudios.org (Terrence Cole)
Date: Thu, 30 Sep 2010 13:01:11 -0700
Subject: [pypy-dev] Fwd:  Question on the future of RPython
In-Reply-To: 
References: <525425.61205.qm@web53702.mail.re2.yahoo.com>
	<827221.6704.qm@web53704.mail.re2.yahoo.com>
	<1285616691.5954.34.camel@localhost>
	<201009280157.17264.jacob@openend.se>
	<1285634613.5954.112.camel@localhost>
	
	<1285703348.5276.56.camel@localhost>
	
	<1285809701.5129.150.camel@localhost>
	
Message-ID: <1285876871.9991.7.camel@localhost>

On Thu, 2010-09-30 at 08:35 +0200, Paolo Giarrusso wrote:
> I'm forwarding this mail that Terrence Cole sent me privately for no
> reason apparent to me, because without it my mail makes less sense.

I replied to you privately because the mail from you that I was replying
to did not have the list cc'd.  I checked the headers several times to
be sure that my mail client was not lying to me.  I thought that you
wanted to take the ensuing discussion off list, as it is starting to
range even farther off topic for pypy-dev than it was before.

Oh well.  Sorry for the noise.

> ---------- Forwarded message ----------
> From: Terrence Cole 
> Date: Thu, Sep 30, 2010 at 03:21
> Subject: Re: [pypy-dev] Question on the future of RPython
> To: Paolo Giarrusso 
> 
> 
> On Wed, 2010-09-29 at 23:50 +0200, Paolo Giarrusso wrote:
> > The uselessness of static type analysis in dynamic languages was shown
> > concretely in 1991 on a Smalltalk derivative, Self, the father of
> > JavaScript, but I've seen the result only in the PhD thesis* of U.
> > Hölzle, one of Self's authors - and I'm pointing it out because I'm not
> > sure how well known it is.
> 
> I've read parts of that at some point in the past.
> 
> > That might explain why Brett Cannon's PhD thesis was not that
> > successful. Of course, after reading his thesis you might be able in
> > theory to address his concerns, and he acknowledges that type analysis
> > might help with a few specific cases. I'm not sure about Brendan
> > Eich's points, but I don't have the time, unfortunately, to
> > investigate them.
> 
> Dr. Cannon achieved an ~1% performance improvement.  From his conclusion,
> the most significant problem he found was that most program data is
> stored in object attributes and object attributes were not
> type-inferable with the methods he used.
> 
> > * Title: "Adaptive optimization for Self: Reconciling High Performance
> > with Exploratory Programming". See, among others, sec. 7.2, "Type
> > analysis does not improve performance of object-oriented programs",
> > sec. 7.4.1, "type analysis exhibits unstable performance". The
> > conclusion is that in most cases it helps more to use  "type feedback"
> > ,  indicates profile-guided optimization applied
> 
> Scanning through sec. 7.2, they say:
> 
> "In general, type analysis cannot infer the result types of non-inlined
> message sends, of arguments of non-inlined sends, or of assignable slots
> (i.e., instance variables). Since sends to such values are fairly
> frequent, a large fraction of message sends does not benefit from type
> analysis."
> 
> So they seem to have run into similar problems: the type inferencing
> algorithm doesn't get to sink its teeth into any of the bits that most
> need it.
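
A small illustration of why assignable slots defeat static inference, as a sketch only: the class Pixel and the function double are hypothetical names, not taken from either thesis.

```python
class Pixel(object):
    def __init__(self, value):
        # The slot's type is whatever the caller happened to pass in.
        self.value = value


def double(p):
    # A static analyser sees only "p.value * 2"; the result type depends on
    # whatever was last stored in the slot, which it cannot know here.
    return p.value * 2


p = Pixel(3)
int_result = double(p)    # the slot holds an int
p.value = "ab"            # any code holding a reference can rebind the slot
str_result = double(p)    # same call site, now a str
```

The same call site yields an int and then a str, so without whole-program knowledge the analyser has to treat p.value as "any object".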
> 
> Chapter 3 of Cannon05 details what specific parts of python can't be
> type-inferred.  It turns out this is most of python.  For example, type
> inferencing can't cross module boundaries because the module used at
> runtime might be different from the one present at compile time.  While
> perfectly true, this is probably going to be quite rare in practice.
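
The module-boundary problem can be shown in a few lines. This is a sketch under assumptions: "geometry" is a hypothetical module built in memory with types.ModuleType so that the example is self-contained, and area is a made-up function.

```python
import types

# Hypothetical module, created in memory instead of imported from disk.
geometry = types.ModuleType("geometry")
geometry.area = lambda w, h: w * h      # returns an int for int inputs


def room_size():
    # The name is looked up in the module's namespace at call time,
    # not when room_size() was compiled.
    return geometry.area(3, 4)


before = room_size()
# Any code can rebind the attribute later; the "same" call now returns a str.
geometry.area = lambda w, h: "%dx%d" % (w, h)
after = room_size()
```

An inferencer that assumed area() returns an int would be right for the first call and wrong for the second, even though room_size() itself never changed.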
> 
> I think it makes sense to treat the sorts of optimizations you can do
> with static analysis of a dynamic language with the same sort of guarded
> optimism that is used in pypy's jit compiler: run with the assumption
> that it will all work out, but watch for failure and fall back to a safe
> slow path nicely when things go off the rails.
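
The "guarded optimism" pattern might look roughly like this. It is a sketch only: make_specialized_add and add_generic are hypothetical names, the plain a + b on the fast path merely stands in for type-specialized code, and none of this is how pypy's jit is actually implemented.

```python
def add_generic(a, b):
    """Slow path: full dynamic dispatch, always correct."""
    return a + b


def make_specialized_add(expected_type):
    """Return an add() specialized for one operand type, plus its counters."""
    stats = {"fast": 0, "fallback": 0}

    def add(a, b):
        # Guard: check the assumption that the analysis (or a profile) made.
        if type(a) is expected_type and type(b) is expected_type:
            stats["fast"] += 1
            return a + b          # stands in for type-specialized code
        # Guard failed: fall back to the safe, fully general path.
        stats["fallback"] += 1
        return add_generic(a, b)

    return add, stats


add_int, stats = make_specialized_add(int)
results = [add_int(1, 2), add_int(1.5, 2), add_int("a", "b")]
```

Only the first call passes the guard; the other two still produce correct results through the slow path, which is the whole point of the scheme.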
> 
> > See below further inline replies.
> >
> > On Tue, Sep 28, 2010 at 21:49, Terrence Cole
> >  wrote:
> > > I think this is a disconnect.  Applying a jit to a non-interpreted
> > > language -- Jacob here seems to think I was talking about a static,
> > > compiled subset of python -- makes little sense.
> >
> > JIT just makes much _more_ sense for dynamic languages. Profile-guided
> > optimization (PGO) gives impressive performance improvements for C, but
> > can also worsen performance when the execution profile changes
> > (because of different inputs), and JIT would solve this problem. Since
> > we are talking about optimized C, "impressive performance
> > improvements" is more like 20% more (so to say) than like 10x-100x,
> > and it's mostly about things like static branch prediction.
> > For RPython, it makes more sense to reuse the support from the
> > translation backend (JVM and .NET should do this by default; in C you
> > can use PGO on the translation output).
> >
> > >> There are some undesirable things about static analysis, but it can
> > >> sure be useful from optimisation, security and reliability
> > >> perspectives.
> > >
> > > Brendan Eich agrees [2].  This is heartening, because javascript has
> > > much in common with python.
> >
> > > [1] http://www.ocf.berkeley.edu/~bac/thesis.pdf
> > > [2] http://brendaneich.com/2010/08/static-analysis-ftw/
> >
> 
> 
> 
> 
> 
> -- 
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/