From greg.ewing at canterbury.ac.nz Thu May 1 00:21:25 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 May 2008 10:21:25 +1200
Subject: [Cython] [Pyrex] pyrex problems
In-Reply-To: <5533c3c80804211621p39aec737lb31a2dd76cc5f2b5@mail.gmail.com>
References: <5533c3c80804211527o32568558sea7961803c79711c@mail.gmail.com> <85e81ba30804211612i3e38bebw5a74825815def96d@mail.gmail.com> <5533c3c80804211621p39aec737lb31a2dd76cc5f2b5@mail.gmail.com>
Message-ID: <4818F0E5.70101@canterbury.ac.nz>

On Mon, Apr 21, 2008 at 3:27 PM, Marco Zanger wrote:

> ctypedef struct cGCoptimization "GCoptimization":
>     EnergyType expansionIter "expansion"(int max_num_iterations)

You need to declare the member functions as function pointers, e.g.

    ctypedef struct cGCoptimization "GCoptimization":
        EnergyType (*expansionIter "expansion")(int max_num_iterations)

--
Greg

From greg.ewing at canterbury.ac.nz Thu May 1 00:24:37 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 May 2008 10:24:37 +1200
Subject: [Cython] pxi or pxd for numpy?
In-Reply-To: <96de71860804241417i4ffa5b2bm7b663b3c4f34b273@mail.gmail.com>
References: <6ce0ac130804241316p56006278y8319e49db10eb7be@mail.gmail.com> <96de71860804241417i4ffa5b2bm7b663b3c4f34b273@mail.gmail.com>
Message-ID: <4818F1A5.3080907@canterbury.ac.nz>

On Thu, Apr 24, 2008 at 1:16 PM, Brian Granger wrote:

> The only way I can get all this to work is the rename numpy.pxi ->
> c_numpy.pxd and use cimport. Then all works well. But, this seems to
> go against the recommendation that pxd files should not be used for
> this purpose.

What recommendation are you talking about? Using a pxd file
to provide a namespace is fine as far as I'm concerned.
It's one of the things they were invented for.

--
Greg

From ellisonbg.net at gmail.com Thu May 1 00:52:56 2008
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 30 Apr 2008 16:52:56 -0600
Subject: [Cython] pxi or pxd for numpy?
In-Reply-To: <4818F1A5.3080907@canterbury.ac.nz> References: <6ce0ac130804241316p56006278y8319e49db10eb7be@mail.gmail.com> <96de71860804241417i4ffa5b2bm7b663b3c4f34b273@mail.gmail.com> <4818F1A5.3080907@canterbury.ac.nz> Message-ID: <6ce0ac130804301552y62186885i208b0b262452dd9c@mail.gmail.com> > What recommendation are you talking about? Using a pxd file > to provide a namespace is fine as far as I'm concerned. > It's one of the things they were invented for. Previously, (on this list and at an in person meeting of Sage devs) some people were saying that .pxi files should be used for this purpose. Because of these discussions, numpy.pxd was renamed numpy.pxi, but now this looks like a bad decision. > -- > Greg > From greg.ewing at canterbury.ac.nz Thu May 1 01:08:36 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 01 May 2008 11:08:36 +1200 Subject: [Cython] pxi or pxd for numpy? In-Reply-To: <6ce0ac130804301552y62186885i208b0b262452dd9c@mail.gmail.com> References: <6ce0ac130804241316p56006278y8319e49db10eb7be@mail.gmail.com> <96de71860804241417i4ffa5b2bm7b663b3c4f34b273@mail.gmail.com> <4818F1A5.3080907@canterbury.ac.nz> <6ce0ac130804301552y62186885i208b0b262452dd9c@mail.gmail.com> Message-ID: <4818FBF4.4010808@canterbury.ac.nz> Brian Granger wrote: > Previously, (on this list and at an in person meeting of Sage devs) > some people were saying that .pxi files should be used for this > purpose. Because of these discussions, numpy.pxd was renamed > numpy.pxi, but now this looks like a bad decision. Indeed, I would say the opposite. You should almost *never* use a .pxi file for anything. The reason is the same reason that "import *" is almost always a bad idea. When you cimport a name, it's easy to find out where it came from, but if you get it by including a .pxi, it can be a lot harder. The 'include' statement is only present in Pyrex mostly for historical reasons. If I had invented 'cimport' earlier, it might never have existed at all. 
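
(Illustration only, not part of the original message.) The "import *" comparison above can be sketched in plain Python. The choice of the standard-library modules math and cmath is arbitrary and purely an assumption for the example. After two star-imports the name sqrt silently belongs to whichever module came last, whereas qualified names show their origin where they are used:

```python
import cmath
import math

# Star-imports into one namespace: the later import silently wins,
# and nothing at the point of use says where "sqrt" came from.
star_ns = {}
exec("from math import *", star_ns)
exec("from cmath import *", star_ns)
shadowed_sqrt = star_ns["sqrt"]

# Qualified names: the origin of each function is visible at the call site.
real_root = math.sqrt(4.0)
complex_root = cmath.sqrt(-4.0)
```

The same reasoning is what makes a cimported name easier to trace than one pulled in through an included .pxi file.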
-- Greg From wstein at gmail.com Thu May 1 01:15:59 2008 From: wstein at gmail.com (William Stein) Date: Wed, 30 Apr 2008 16:15:59 -0700 Subject: [Cython] pxi or pxd for numpy? In-Reply-To: <6ce0ac130804301552y62186885i208b0b262452dd9c@mail.gmail.com> References: <6ce0ac130804241316p56006278y8319e49db10eb7be@mail.gmail.com> <96de71860804241417i4ffa5b2bm7b663b3c4f34b273@mail.gmail.com> <4818F1A5.3080907@canterbury.ac.nz> <6ce0ac130804301552y62186885i208b0b262452dd9c@mail.gmail.com> Message-ID: <85e81ba30804301615t95e5d1excd279c0d2ca93aec@mail.gmail.com> On Wed, Apr 30, 2008 at 3:52 PM, Brian Granger wrote: > > What recommendation are you talking about? Using a pxd file > > to provide a namespace is fine as far as I'm concerned. > > It's one of the things they were invented for. > > Previously, (on this list and at an in person meeting of Sage devs) > some people were saying that .pxi files should be used for this > purpose. Because of these discussions, numpy.pxd was renamed > numpy.pxi, but now this looks like a bad decision. > It's me. I gave them some advice about using pxi versus pxd files at Sage Days 8, but I did not actually *look* at numpy.pxd before giving said advice. Clearly I was wrong. Sorry for any confusion caused by my offhand comment. -- William From robertwb at math.washington.edu Thu May 1 03:17:31 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 30 Apr 2008 18:17:31 -0700 Subject: [Cython] __getattribute__ In-Reply-To: <20080430004208.GJ15181@tilt> References: <20080430004208.GJ15181@tilt> Message-ID: <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> On Apr 29, 2008, at 5:42 PM, Peter Todd wrote: > Is there a __getattribute__ work-alike in Cython? > > Essentially I need direct control over an objects tp_getattro and > tp_setattro slots to implement a wrapper class. Specificly > wrapped.__class__ should go to the wrapped objects class attribute, > not > the wrapping objects __class__ attribute. 
> > __getattr__ outputs C-source that includes a call to > PyObject_GenericGetAttr first, and won't run my code if that call > succeeds. > > Thanks, > > Peter Not that I'm aware of, though I'd imagine that implementing __getattribute__ (if it exists) as being called at the top of this function would be fairly easy to do. One would want to match Python symantics exactly. - Robert -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080430/e5d1c3c1/attachment-0001.pgp From robertwb at math.washington.edu Thu May 1 03:49:09 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 30 Apr 2008 18:49:09 -0700 Subject: [Cython] Porting the Docs In-Reply-To: <20080430153918.GA6561@giton> References: <20080426000805.GA5117@basestar> <20080430153918.GA6561@giton> Message-ID: On Apr 30, 2008, at 8:39 AM, Gabriel Gellner wrote: > So I have finished the first rough mockup, which mainly consists of > getting > the markup correct. There are still some rough edges, mainly with > tables, and > some of the latex output, but in the spirit of release early > release often ;-) > > Check out the html at: > http://www.mudskipper.ca/cython-doc > > Get the pdf at: > http://www.mudskipper.ca/cython.pdf > > And finally get the source at: > http://www.mudskipper.ca/cython_doc.tar.gz > or > http://www.mudskipper.ca/cython_doc.zip > > Again I would appreciate any comments on if I am screwing up > authorship. Down > the road I think it would be good to attribute everything to the > 'Cython Doc > Team' and have a page that lists contributers. To do this we should > make the > license and authorship on the wiki more explicit. Tell me what you > think? I am > no lawyer, and certainly don't want to piss anyone off who has put > the hard > work into documenting either cython or pyrex. 
> > My plan for the next steps (in order of importance, any comments): > - Make a PyGments lexer for cython/pyrex so we get nice color coding. > - Fix up the latex style file so that boxes are not messed up when > we have > code examples. > - Put on my writing hat, and do an overhaul of the structure of the > docs so > that it is faster to navigate. I will be using the python doc > structure as a > reference. > - Get some simple howto's written. > - Thinking if there is an easy way to test the cython code in the > docs so that > I can ensure accuracy. Thanks again for putting these all together in a nice form. While browsing I noticed some inaccuracies (often just dealing with the differences between Cython and Pyrex). Should I send them directly to you, or should we set up some revision control (need not be centralized). - Robert From pete at petertodd.org Thu May 1 03:41:07 2008 From: pete at petertodd.org (Peter Todd) Date: Wed, 30 Apr 2008 21:41:07 -0400 Subject: [Cython] __getattribute__ In-Reply-To: <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> Message-ID: <20080501014107.GK15181@tilt> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, Apr 30, 2008 at 06:17:31PM -0700, Robert Bradshaw wrote: > On Apr 29, 2008, at 5:42 PM, Peter Todd wrote: > > >Is there a __getattribute__ work-alike in Cython? > > > >Essentially I need direct control over an objects tp_getattro and > >tp_setattro slots to implement a wrapper class. Specificly > >wrapped.__class__ should go to the wrapped objects class attribute, > >not > >the wrapping objects __class__ attribute. > > > >__getattr__ outputs C-source that includes a call to > >PyObject_GenericGetAttr first, and won't run my code if that call > >succeeds. 
> > > >Thanks, > > > >Peter > > Not that I'm aware of, though I'd imagine that implementing > __getattribute__ (if it exists) as being called at the top of this > function would be fairly easy to do. One would want to match Python > symantics exactly. Thanks for the reply. I've been looking at the Cython source myself, and it looks doable, I'll put some effort into implementing it. From what I can see I'd be updating generate_getattro_function() to do a scope.lookup_here() on '__getattribute__' first, if that succeeds, output code to call it, otherwise go on to the existing code. With the additional complication that if __getattribute__ is defined, as well as __getattr__, the latter must be called if the former raises AttributeError. Another question is I noticed that the latest cython-devel code emits warnings on usage of __new__, saying to use __cinit__ instead. I take it there is planned work to make __new__ more like Python's __new__ and have it create the object as well? Again, that's another feature I could really use. FWIW I'm writing a electrical design automation library called Tuke. Currently I have some C extension API code written to wrap objects in contexts, but would like to re-write that in Cython as well as implement a whole lot of other basic functionality. 
http://github.com/retep/tuke/tree/master - -- http://petertodd.org 'peter'[:-1]@petertodd.org -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFIGR+z3bMhDbI9xWQRAjUsAKCx7XeM/+KRs9vCA3+XUuundIfF0wCfRJcf nH3ByGq6MPaRUktCxWho5KE= =oz0w -----END PGP SIGNATURE----- From robertwb at math.washington.edu Thu May 1 04:12:42 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 30 Apr 2008 19:12:42 -0700 Subject: [Cython] __getattribute__ In-Reply-To: <20080501014107.GK15181@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080501014107.GK15181@tilt> Message-ID: <8367C02B-B674-44BB-AF04-4472EE8AABED@math.washington.edu> On Apr 30, 2008, at 6:41 PM, Peter Todd wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Wed, Apr 30, 2008 at 06:17:31PM -0700, Robert Bradshaw wrote: >> On Apr 29, 2008, at 5:42 PM, Peter Todd wrote: >> >>> Is there a __getattribute__ work-alike in Cython? >>> >>> Essentially I need direct control over an objects tp_getattro and >>> tp_setattro slots to implement a wrapper class. Specificly >>> wrapped.__class__ should go to the wrapped objects class attribute, >>> not >>> the wrapping objects __class__ attribute. >>> >>> __getattr__ outputs C-source that includes a call to >>> PyObject_GenericGetAttr first, and won't run my code if that call >>> succeeds. >>> >>> Thanks, >>> >>> Peter >> >> Not that I'm aware of, though I'd imagine that implementing >> __getattribute__ (if it exists) as being called at the top of this >> function would be fairly easy to do. One would want to match Python >> symantics exactly. > > Thanks for the reply. > > I've been looking at the Cython source myself, and it looks doable, > I'll > put some effort into implementing it. Great! 
> From what I can see I'd be > updating generate_getattro_function() to do a scope.lookup_here() on > '__getattribute__' first, if that succeeds, output code to call it, > otherwise go on to the existing code. Sounds like a good plan. Is __getattribute__ inherited? > > With the additional complication > that if __getattribute__ is defined, as well as __getattr__, the > latter > must be called if the former raises AttributeError. Yep, it should go onto the existing code (or call the super/default getattribute) on an error, even if __getattr__ is not defined, to match the specs. > Another question is I noticed that the latest cython-devel code emits > warnings on usage of __new__, saying to use __cinit__ instead. I > take it > there is planned work to make __new__ more like Python's __new__ and > have it create the object as well? Again, that's another feature I > could > really use. Eventually we may implement this (I looked into it some, but it wasn't as straightforward as I had hoped), but using __cinit__ now is good to avoid the confusion. > FWIW I'm writing a electrical design automation library called Tuke. > Currently I have some C extension API code written to wrap objects in > contexts, but would like to re-write that in Cython as well as > implement > a whole lot of other basic functionality. > > http://github.com/retep/tuke/tree/master Cool. Should I add this to http://wiki.cython.org/projects ? - Robert -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080430/38c2d1ac/attachment.pgp

From robertwb at math.washington.edu Thu May 1 04:16:11 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 30 Apr 2008 19:16:11 -0700
Subject: [Cython] a problem with declaring and initializing (*pointer)[N] = NULL
In-Reply-To:
References:
Message-ID:

That is really strange, no idea why.

On Apr 30, 2008, at 8:22 AM, Lisandro Dalcin wrote:

> Just pulled from cython-devel repo, this does not work (neither
> before the pull)
>
> cdef int (*iranges)[3] = NULL
>
> but this indeed work
>
> cdef int (*iranges)[3]
> iranges = NULL
>
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

From pete at petertodd.org Thu May 1 05:16:03 2008
From: pete at petertodd.org (Peter Todd)
Date: Wed, 30 Apr 2008 23:16:03 -0400
Subject: [Cython] __getattribute__
In-Reply-To: <8367C02B-B674-44BB-AF04-4472EE8AABED@math.washington.edu>
References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080501014107.GK15181@tilt> <8367C02B-B674-44BB-AF04-4472EE8AABED@math.washington.edu>
Message-ID: <20080501031603.GL15181@tilt>

On Wed, Apr 30, 2008 at 07:12:42PM -0700, Robert Bradshaw wrote:
> >From what I can see I'd be
> >updating generate_getattro_function() to do a scope.lookup_here() on
> >'__getattribute__' first, if that succeeds, output code to call it,
> >otherwise go on to the existing code.
>
> Sounds like a good plan. Is __getattribute__ inherited?

Yes, and when inherited, it overrides __getattr__:

>>> class a(object):
...     def __getattribute__(self,n):
...         print 'a getattribute',n
...         raise AttributeError
...
>>> class b(a):
...     def __getattr__(self,n):
...         print 'b getattr',n
...         return None
...
>>> o = b()
>>> print o.asdf
a getattribute asdf
b getattr asdf
None
>>>

But note that b.__getattr__ is called in the end. If b defines
__setattr__, it is called for any setattr as well.

The Python C extension API docs also state that tp_getattro and
tp_setattro are "inherited by subtypes together with tp_getattr: a
subtype inherits both tp_getattr and tp_getattro from its base type when
the subtype's tp_getattr and tp_getattro are both NULL."

I think this will cause a conflict, in the following situation:

cdef class a:
    def __getattribute__(self,n):
        return n

cdef class b(a):
    def __setattr__(self,n,v):
        print 'b setattr',n,v

Evaluating b().asdf will not evaluate to 'asdf', rather an
AttributeError will be raised. Why? Because PyObject_GenericGetAttr()
doesn't look in the base class for attributes at all. And tp_getattro
for type b will end up pointing to PyObject_GenericGetAttr due to the
above rule of only inheriting if *both* tp_getattro and tp_setattro are
not set.

From what I can see in the slot_table definition, tp_getattro and
tp_setattro are handled independently, and setting setattr will leave
getattro set to 0.

> >
> >With the additional complication
> >that if __getattribute__ is defined, as well as __getattr__, the latter
> >must be called if the former raises AttributeError.
>
> Yep, it should go onto the existing code (or call the super/default
> getattribute) on an error, even if __getattr__ is not defined, to
> match the specs.

Will do.

> >Another question is I noticed that the latest cython-devel code emits
> >warnings on usage of __new__, saying to use __cinit__ instead.
I > >take it > >there is planned work to make __new__ more like Python's __new__ and > >have it create the object as well? Again, that's another feature I > >could > >really use. > > Eventually we may implement this (I looked into it some, but it > wasn't as straightforward as I had hoped), but using __cinit__ now is > good to avoid the confusion. Thanks, I'll take a look at it as well. > >FWIW I'm writing a electrical design automation library called Tuke. > >Currently I have some C extension API code written to wrap objects in > >contexts, but would like to re-write that in Cython as well as > >implement > >a whole lot of other basic functionality. > > > >http://github.com/retep/tuke/tree/master > > Cool. Should I add this to http://wiki.cython.org/projects ? Sure, although, it may still be another two or three weeks before they is any actual Cython code in the above tree. ;) -- http://petertodd.org 'peter'[:-1]@petertodd.org -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080430/f62fafeb/attachment.pgp From robertwb at math.washington.edu Thu May 1 07:32:10 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 30 Apr 2008 22:32:10 -0700 Subject: [Cython] Cython 0.9.6.14 release candidate Message-ID: Up at http://www.cython.org/Cython-0.9.6.14.tar.gz (or pull from the devel branch). If no problems are found, this will be the release. (I have already tested Sage). 
- Robert From greg.ewing at canterbury.ac.nz Thu May 1 07:44:36 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 01 May 2008 17:44:36 +1200 Subject: [Cython] __getattribute__ In-Reply-To: <20080501014107.GK15181@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080501014107.GK15181@tilt> Message-ID: <481958C4.2070205@canterbury.ac.nz> Peter Todd wrote: > Another question is I noticed that the latest cython-devel code emits > warnings on usage of __new__, saying to use __cinit__ instead. I take it > there is planned work to make __new__ more like Python's __new__ and > have it create the object as well? Again, that's another feature I could > really use. That's the idea. I'm not sure exactly what I'm going to do with __new__ yet, but I want to free up the name so I can do something better with it later. -- Greg From stefan_ml at behnel.de Thu May 1 07:54:52 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 01 May 2008 07:54:52 +0200 Subject: [Cython] __getattribute__ In-Reply-To: <20080501031603.GL15181@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080501014107.GK15181@tilt> <8367C02B-B674-44BB-AF04-4472EE8AABED@math.washington.edu> <20080501031603.GL15181@tilt> Message-ID: <48195B2C.6060301@behnel.de> Hi, Peter Todd wrote: >>> From what I can see I'd be >>> updating generate_getattro_function() to do a scope.lookup_here() on >>> '__getattribute__' first, if that succeeds, output code to call it, >>> otherwise go on to the existing code. Please see the code in Python's typeobject.c, functions "slot_tp_getattr_hook" and "slot_tp_getattro" (essentially a fast path), and the comment above them. The code Cython generates must behave like these two. > On Wed, Apr 30, 2008 at 07:12:42PM -0700, Robert Bradshaw wrote: >> Sounds like a good plan. Is __getattribute__ inherited? 
Python is already doing a couple of things here for us. There is some code in
"inherit_slots" (typeobject.c) that inherits the slots, but only if both get*
functions are set to NULL, or set* functions respectively:

    if (type->tp_getattr == NULL && type->tp_getattro == NULL) {
        type->tp_getattr = base->tp_getattr;
        type->tp_getattro = base->tp_getattro;
    }
    if (type->tp_setattr == NULL && type->tp_setattro == NULL) {
        type->tp_setattr = base->tp_setattr;
        type->tp_setattro = base->tp_setattro;
    }

Again, Cython must behave alike here.

Stefan

From robertwb at math.washington.edu Thu May 1 13:01:09 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 1 May 2008 04:01:09 -0700
Subject: [Cython] [Pyrex] Python 3
In-Reply-To:
References: <15D9902A-0C11-4939-95C4-86C50B5B816F@math.washington.edu> <4816BB88.6000708@canterbury.ac.nz> <4816C6FC.2050103@behnel.de> <4816CC06.1040706@canterbury.ac.nz> <4816F091.1000505@cheimes.de> <4817A2D1.9010802@canterbury.ac.nz>
Message-ID:

On Apr 29, 2008, at 4:16 PM, Christian Heimes wrote:

> Greg Ewing wrote:
>> Thanks, I didn't know that.
>>
>> So it seems that we could, if it were considered desirable,
>> have an automatic cast from unicode to char *. But the encoding
>> would *have* to be utf8 -- anything else would require memory
>> allocation.
>
> You're welcome! :)
>
> The UTF-8 default encoding is hard coded in Python 3.0. IMHO it's the
> most sensible encoding for users from the Western world. Asian users
> would probably prefer UTF-16 but that's a waste of memory for the
> rest.

Yes. Even ignoring memory concerns, UTF-8 plays much nicer with
unix/other tools that are not unicode-aware.

> In my opinion wchar_t support is much more important than casting
> PyUnicode objects to char*. Especially Windows developers need wchar_t
> for the wide Windows API. Python 2.6 and 3.0 have dropped support for
> the Windows 9x/ME/NT series. Only 2k SP4 and newer are supported.
> wchar_t support is an important step for the poor souls ... err > Windows > developers. ;) I'm curious what you mean by wchar_t support--one can typedef wchar_t to be an integer type. Or perhaps you mean something more... - Robert From robertwb at math.washington.edu Thu May 1 13:02:00 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 1 May 2008 04:02:00 -0700 Subject: [Cython] MultiCores/AMD open source tools In-Reply-To: References: Message-ID: <1C5D2BE2-66BB-4362-9244-332B7B1A7461@math.washington.edu> There's the issue of the GIL, but if you don't touch the interpreter you can do threading just as in C. On Apr 29, 2008, at 5:26 PM, Gottfried F. Zojer wrote: > From: Gottfried F. Zojer > Date: Apr 30, 2008 2:12 AM > Subject: MultiCores/AMD open source tools > To: cython-dev at codespeak.net > > > Hallo, > > Just wondering if someone is using cython in combination with any > of AMD > open-source projects like framewave or have any other thoughts to use > cython/python for parallel programming or Multi Core programming. > Any feedback welcome > > Rgds Gottfried > > > www.5152.eu > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From ndbecker2 at gmail.com Thu May 1 13:56:44 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 01 May 2008 07:56:44 -0400 Subject: [Cython] Cython 0.9.6.14 release candidate References: Message-ID: Robert Bradshaw wrote: > Up at http://www.cython.org/Cython-0.9.6.14.tar.gz (or pull from the > devel branch). If no problems are found, this will be the release. (I > have already tested Sage). 
>
> - Robert

I did

hg clone http://hg.cython.org/cython/
python setup.py build && sudo python setup.py install
python runtests.py:

Ran 317 tests in 52.931s

FAILED (failures=3)

From bblais at gmail.com Thu May 1 15:32:33 2008
From: bblais at gmail.com (Brian Blais)
Date: Thu, 1 May 2008 09:32:33 -0400
Subject: [Cython] access to numpy functions?
Message-ID:

Hello,

I have been using Pyrex/Cython for a couple years now, interfacing
with the numpy ndarray objects. I was wondering recently if there is
direct access to some of the functions of numpy, like dot products,
addition and subtraction, etc... without going through the python
API? If I have two objects in Cython/Pyrex, like:

    c_numpy.ndarray x,y

and I want to do a dot product, or add the two arrays, is there
access to the numpy C-version of this? I know I could write my own
for the dot product, but I'd rather not reinvent the wheel when
possible.

thanks,

Brian Blais

From dagss at student.matnat.uio.no Thu May 1 16:39:44 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 01 May 2008 16:39:44 +0200
Subject: [Cython] access to numpy functions?
In-Reply-To:
References:
Message-ID: <4819D630.9060602@student.matnat.uio.no>

> API? If I have two objects in Cython/Pyrex, like:
>
>
> c_numpy.ndarray x,y
>
> and I want to do a dot product, or add the two arrays, is there
> access to the numpy C-version of this? I know I could write my own
> for the dot product, but I'd rather not reinvent the wheel when
> possible.
>

First off, you can use the Python functions directly, like this (Cython
syntax):

---
import numpy
cimport c_numpy

cdef c_numpy.ndarray a = ..., b = ...
cdef c_numpy.ndarray c = numpy.dot(a, b)
---

As for calling the NumPy C functions directly: a place to start
looking seems to be core/src/multiarraymodule.c. (You may want to ask in
the numpy mailing list as well though. I'm not very familiar with NumPy
internals.)
My first impression is that the implementations (say, PyArray_InnerProduct for instance) seem to be a) very generic (any number of dimensions/strides etc. and b) coded using the Python/C API directly, so there is no "inner layer" to communicate with. I.e., arguments to functions are Python objects. So you're probably just as well off by calling the functions through the Python API like stated above. If that is not fast enough, there doesn't seem to be much to be gained by "direct calls" to the C functions; and a new implementation in Cython seems in order. The main reason is that to make this faster you will want to make use of additional information -- if you know the number of dimensions (or even their size) and datatype at compile-time, you can make some optimizations that functions in the NumPy library will never be able to do. (For most if not all cases I'm guessing that we're talking very small gains if any gains here; though if you tend to multiply really small arrays a lot (like doing coordinate transformation...) then I suppose you must write your own code.) -- Dag Sverre From stefan_ml at behnel.de Thu May 1 18:52:46 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 01 May 2008 18:52:46 +0200 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: References: Message-ID: <4819F55E.5010909@behnel.de> Hi, Neal Becker wrote: > Robert Bradshaw wrote: >> Up at http://www.cython.org/Cython-0.9.6.14.tar.gz (or pull from the >> devel branch). If no problems are found, this will be the release. (I >> have already tested Sage). >> >> - Robert > > I did > hg clone http://hg.cython.org/cython/ That's the wrong repository. 
As Robert said, the RC is in the devel branch:

http://hg.cython.org/cython-devel/

Stefan

From ndbecker2 at gmail.com Thu May 1 19:07:24 2008
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 01 May 2008 13:07:24 -0400
Subject: [Cython] Cython 0.9.6.14 release candidate
References: <4819F55E.5010909@behnel.de>
Message-ID:

Stefan Behnel wrote:

> Hi,
>
> Neal Becker wrote:
>> Robert Bradshaw wrote:
>>> Up at http://www.cython.org/Cython-0.9.6.14.tar.gz (or pull from the
>>> devel branch). If no problems are found, this will be the release. (I
>>> have already tested Sage).
>>>
>>> - Robert
>>
>> I did
>> hg clone http://hg.cython.org/cython/
>
> That's the wrong repository. As Robert said, the RC is in the devel
> branch:
>
> http://hg.cython.org/cython-devel/
>
> Stefan

Thanks, now I got:

Traceback (most recent call last):
  File "/usr/lib64/python2.5/doctest.py", line 2112, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for r_vree_1
  File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 190, in r_vree_1

----------------------------------------------------------------------
File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 203, in r_vree_1
Failed example:
    test(sys.maxint + 1)
Expected:
    2147483648L
Got:
    9223372036854775808L
----------------------------------------------------------------------
File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 205, in r_vree_1
Failed example:
    test(sys.maxint * 2 + 1)
Expected:
    4294967295L
Got:
    18446744073709551615L


----------------------------------------------------------------------
Ran 213 tests in 53.887s

FAILED (failures=1)

These are probably harmless 32/64 bit errors? Tests should be fixed?
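
(Illustration only, not part of the original thread.) The failure above comes from a doctest that hard-codes values valid only where a C long is 32 bits wide. A minimal sketch of one way to compute such expectations from the platform instead is given below. The helper names c_long_bits and expected_overflow_values are hypothetical, and this is an assumption about how a test could be written, not the change actually made to r_vree_1. It uses the struct module rather than sys.maxint so that it also runs on Python 3, where sys.maxint no longer exists:

```python
import struct


def c_long_bits():
    """Return the width of a C long on this platform, in bits."""
    return struct.calcsize("l") * 8


def expected_overflow_values():
    """Return the first values past the signed and unsigned C long ranges.

    On Python 2 the largest signed C long is what sys.maxint reported,
    so these correspond to sys.maxint + 1 and sys.maxint * 2 + 1.
    """
    bits = c_long_bits()
    max_signed = 2 ** (bits - 1) - 1
    return max_signed + 1, max_signed * 2 + 1


if __name__ == "__main__":
    print(c_long_bits(), expected_overflow_values())
```

With a 32-bit long this yields the "Expected" values from the report (2147483648 and 4294967295); with a 64-bit long it yields the "Got" values.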
From stefan_ml at behnel.de Thu May 1 19:20:55 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 01 May 2008 19:20:55 +0200 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: References: <4819F55E.5010909@behnel.de> Message-ID: <4819FBF7.9030901@behnel.de> Hi, Neal Becker wrote: > Traceback (most recent call last): > File "/usr/lib64/python2.5/doctest.py", line 2112, in runTest > raise self.failureException(self.format_failure(new.getvalue())) > AssertionError: Failed doctest test for r_vree_1 > File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 190, in r_vree_1 > > ---------------------------------------------------------------------- > File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 203, in r_vree_1 > Failed example: > test(sys.maxint + 1) > Expected: > 2147483648L > Got: > 9223372036854775808L > ---------------------------------------------------------------------- > File "/home/nbecker/cython-devel/BUILD/r_vree_1.so", line 205, in r_vree_1 > Failed example: > test(sys.maxint * 2 + 1) > Expected: > 4294967295L > Got: > 18446744073709551615L > > > ---------------------------------------------------------------------- > Ran 213 tests in 53.887s > > FAILED (failures=1) > > These are probably harmless 32/64 bit errors? Tests should be fixed? Sure, looks like the test is broken here. What is sys.maxint on a 64 bit platform? Could you dig into this and try to provide a fix? Stefan From robertwb at math.washington.edu Thu May 1 19:48:54 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 1 May 2008 10:48:54 -0700 Subject: [Cython] access to numpy functions? In-Reply-To: <4819D630.9060602@student.matnat.uio.no> References: <4819D630.9060602@student.matnat.uio.no> Message-ID: <65DA8D65-87AE-4D1D-9777-D843BD2B00C1@math.washington.edu> On May 1, 2008, at 7:39 AM, Dag Sverre Seljebotn wrote: > >> API? 
If I have two objects in Cython/Pyrex, like:
>>
>>
>> c_numpy.ndarray x,y
>>
>> and I want to do a dot product, or add the two arrays, is there
>> access to the numpy C-version of this? I know I could write my own
>> for the dot product, but I'd rather not reinvent the wheel when
>> possible.
>>
>
> First off, you can use the Python functions directly, like this
> (Cython syntax):
>
> ---
> import numpy
> cimport c_numpy
>
> cdef c_numpy.ndarray a = ..., b = ...
> cdef c_numpy.ndarray c = numpy.dot(a, b)
> ---
>
> As for calling the NumPy C functions directly: a place to start
> looking seems to be core/src/multiarraymodule.c. (You may want to
> ask on the numpy mailing list as well though. I'm not very familiar
> with NumPy internals.)
>
> My first impression is that the implementations (say,
> PyArray_InnerProduct for instance) seem to be a) very generic (any
> number of dimensions/strides etc.) and b) coded using the Python/C API
> directly, so there is no "inner layer" to communicate with. I.e.,
> arguments to functions are Python objects.
>
> So you're probably just as well off calling the functions through the
> Python API as stated above.
>
> If that is not fast enough, there doesn't seem to be much to be gained
> by "direct calls" to the C functions; and a new implementation in
> Cython seems in order.

And fortunately this is another GSoC project.

> The main reason is that to make this faster you will
> want to make use of additional information -- if you know the
> number of dimensions (or even their size) and datatype at
> compile-time, you can make some optimizations that functions in the
> NumPy library will never be able to do. (For most if not all cases
> I'm guessing that we're talking very small gains if any gains here;
> though if you tend to multiply really small arrays a lot (like doing
> coordinate transformation...) then I suppose you must write your own
> code.)
> > -- > Dag Sverre > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From ndbecker2 at gmail.com Thu May 1 19:59:39 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 01 May 2008 13:59:39 -0400 Subject: [Cython] Cython 0.9.6.14 release candidate References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> Message-ID: Stefan Behnel wrote: > Hi, > > Neal Becker wrote: ... >> These are probably harmless 32/64 bit errors? Tests should be fixed? > > Sure, looks like the test is broken here. What is sys.maxint on a 64 bit > platform? > python -c 'import sys; print sys.maxint' 9223372036854775807 This is x86_64 From stefan_ml at behnel.de Thu May 1 20:10:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 01 May 2008 20:10:38 +0200 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> Message-ID: <481A079E.8030705@behnel.de> Hi, Neal Becker wrote: > Stefan Behnel wrote: >> Neal Becker wrote: > ... >>> These are probably harmless 32/64 bit errors? Tests should be fixed? >> Sure, looks like the test is broken here. What is sys.maxint on a 64 bit >> platform? > > python -c 'import sys; print sys.maxint' > 9223372036854775807 > > This is x86_64 I pushed a fix for the test case. Could you try it? Thanks, Stefan From ndbecker2 at gmail.com Thu May 1 20:46:42 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 01 May 2008 14:46:42 -0400 Subject: [Cython] Cython 0.9.6.14 release candidate References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> <481A079E.8030705@behnel.de> Message-ID: Stefan Behnel wrote: > Hi, > > Neal Becker wrote: >> Stefan Behnel wrote: >>> Neal Becker wrote: >> ... >>>> These are probably harmless 32/64 bit errors? Tests should be fixed? >>> Sure, looks like the test is broken here. 
What is sys.maxint on a 64 bit >>> platform? >> >> python -c 'import sys; print sys.maxint' >> 9223372036854775807 >> >> This is x86_64 > > I pushed a fix for the test case. Could you try it? > > Thanks, > Stefan Ran 213 tests in 22.508s OK From kirr at landau.phys.spbu.ru Fri May 2 07:58:44 2008 From: kirr at landau.phys.spbu.ru (Kirill Smelkov) Date: Fri, 2 May 2008 09:58:44 +0400 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> <481A079E.8030705@behnel.de> Message-ID: <20080502055844.GA10755@evo> On Thu, May 01, 2008 at 02:46:42PM -0400, Neal Becker wrote: > Stefan Behnel wrote: > > > Hi, > > > > Neal Becker wrote: > >> Stefan Behnel wrote: > >>> Neal Becker wrote: > >> ... > >>>> These are probably harmless 32/64 bit errors? Tests should be fixed? > >>> Sure, looks like the test is broken here. What is sys.maxint on a 64 bit > >>> platform? > >> > >> python -c 'import sys; print sys.maxint' > >> 9223372036854775807 > >> > >> This is x86_64 > > > > I pushed a fix for the test case. Could you try it? > > > > Thanks, > > Stefan > Ran 213 tests in 22.508s I still have 3 failures with latest cython-devel (b869698d6f22) Attached is a log of 'python runtests.py' run on Debian testing. -- ????? ????????, ??????. -------------- next part -------------- Doctest: notinop ... ok Doctest: ishimoto2 ... ok Doctest: classkwonlyargs ... ok Doctest: assert ... ok Doctest: kwargproblems ... ok Doctest: ct_IF ... ok Doctest: behnel3 ... ok Doctest: r_jiba1 ... ok Doctest: extkwonlyargs ... ok Doctest: if ... ok Doctest: subop ... ok Doctest: dict ... ok Doctest: ishimoto3 ... ok Doctest: addressof ... ok Doctest: baas3 ... ok Doctest: r_docstrings ... ok Doctest: r_hordijk1 ... ok Doctest: unicodeliterals ... ok Doctest: tuple ... ok Doctest: powop ... ok Doctest: compiledef ... ok Doctest: print ... ok Doctest: r_mang1 ... ok Doctest: jarausch1 ... ok Doctest: unop ... ok Doctest: bishop2 ... 
ok Doctest: cstruct ... ok Doctest: unicodeliteralslatin1 ... ok Doctest: pynumop ... ok Doctest: isnonebool ... ok Doctest: exarkun ... ok Doctest: extinstantiate ... ok Doctest: unpacklistcomp ... ok Doctest: r_starargs ... ok Doctest: kostyrka2 ... ok Doctest: cfuncdef ... ok Doctest: attr ... ok modop.c:252: warning: ?__pyx_f_5modop_modptr? defined but not used Doctest: modop ... ok Doctest: bishop1 ... ok Doctest: r_spamtype ... ok Doctest: wundram1 ... ok Doctest: kostyrka ... ok Doctest: varargcall ... ok Doctest: pyintop ... ok Doctest: include ... ok Doctest: classpass ... ok Doctest: addloop ... ok Doctest: extinherit ... ok Doctest: cunion ... ok Doctest: cvardef ... ok Doctest: pinard7 ... ok Doctest: append ... ok Doctest: boolop ... ok Doctest: r_pythonapi ... ok Doctest: pass ... ok Doctest: slice3 ... ok Doctest: unpack ... ok Doctest: pyextattrref ... ok Doctest: pylistsubtype ... ok Doctest: ass2global ... ok Doctest: pycmp ... ok Doctest: r_starargcall ... ok Doctest: new_style_exceptions ... FAIL Doctest: r_argdefault ... ok Doctest: extstarargs ... ok Doctest: kwonlyargs ... ok Doctest: pinard8 ... ok Doctest: rodriguez_1 ... ok Doctest: r_extstarargs ... ok Doctest: cstringmul ... ok Doctest: multass ... ok Doctest: r_barbieri1 ... ok Doctest: list ... ok Doctest: r_mitch_chapman_2 ... ok Doctest: slice2 ... ok Doctest: backquote ... ok Doctest: r_toofewargs ... ok Doctest: strconstinclass ... ok Doctest: extclasspass ... ok Doctest: pinard6 ... ok Doctest: tuplereassign ... FAIL Doctest: r_starargsonly ... ok Doctest: dietachmayer1 ... ok Doctest: lepage_1 ... ok Doctest: exttype ... ok Doctest: r_pyclassdefault ... ok Doctest: r_addint ... ok Doctest: unicodeliteralsdefault ... ok Doctest: r_extcomplex2 ... ok Doctest: return ... ok Doctest: behnel2 ... ok Doctest: concatcstrings ... ok Doctest: getattr3call ... ok Doctest: unicodefunction ... ok Doctest: pinard5 ... ok Doctest: r_primes ... ok Doctest: r_pyclass ... ok Doctest: r_print ... 
ok Doctest: ass2local ... ok Doctest: r_vree_1 ... ok Doctest: simpcall ... ok Doctest: modbody ... ok Doctest: altet2 ... ok Doctest: r_mcintyre1 ... ok Doctest: specialfloat ... ok Doctest: addop ... ok Doctest: r_forloop ... ok Doctest: strfunction ... ok Doctest: cintop ... ok Doctest: ref2local ... ok Doctest: literals ... ok Doctest: ct_DEF ... ok Doctest: extlen ... ok Doctest: sizeof ... ok Doctest: inop ... ok Doctest: starargs ... ok Doctest: behnel1 ... ok compiling undefinedname ... ok compiling encoding ... ok compiling void_as_arg ... ok compiling arrayptrcompat ... arrayptrcompat.c:123: warning: ?__pyx_f_14arrayptrcompat_f? defined but not used ok compiling extsetattr ... ok compiling ctypedef ... ok compiling extpropertydoc ... ok compiling extimportedsubtype ... ok compiling extsetitem ... ok compiling cargdef ... ok compiling enumintcompat ... enumintcompat.c:135: warning: ?__pyx_f_13enumintcompat_f? defined but not used ok compiling omittedargnames ... ok compiling import ... ok compiling callingconvention ... ok compiling lepage_2 ... ok compiling extpropertyget ... ok compiling builtin ... ok compiling formfeed ... ok compiling emptytry ... emptytry.c:111: warning: ?__pyx_f_8emptytry_f? defined but not used ok compiling extcmethcall ... extcmethcall.c:179: warning: ?__pyx_f_12extcmethcall_tomato? defined but not used ok compiling extdelslice ... ok compiling ewing4 ... ewing4.c:110: warning: ?__pyx_f_6ewing4_f? defined but not used ok compiling extindex ... ok compiling huss2 ... huss2.c:125: warning: ?__pyx_f_5huss2_f? defined but not used ok compiling declarations ... declarations.c:125: warning: ?__pyx_f_12declarations_f? defined but not used declarations.c:146: warning: ?__pyx_f_12declarations_g? defined but not used ok compiling extdelitem ... ok compiling excvaldecl ... excvaldecl.c:116: warning: ?__pyx_f_10excvaldecl_spam? defined but not used excvaldecl.c:131: warning: ?__pyx_f_10excvaldecl_eggs? 
defined but not used excvaldecl.c:146: warning: ?__pyx_f_10excvaldecl_grail? defined but not used excvaldecl.c:161: warning: ?__pyx_f_10excvaldecl_tomato? defined but not used excvaldecl.c:176: warning: ?__pyx_f_10excvaldecl_brian? defined but not used excvaldecl.c:190: warning: ?__pyx_f_10excvaldecl_silly? defined but not used ok compiling ctypedefenum ... ok compiling tryexcept ... ok compiling for ... ok compiling specmethextarg ... ok compiling while ... while.c: In function ?__pyx_pf_5while_f?: while.c:197: warning: ?__pyx_v_i? is used uninitialized in this function ok compiling jiba3 ... ok compiling varargdecl ... varargdecl.c:110: warning: ?__pyx_f_10varargdecl_grail? defined but not used ok compiling extsetslice ... ok compiling belchenko1 ... belchenko1.c:113: warning: ?__pyx_f_10belchenko1__is_aligned? defined but not used ok compiling ewing6 ... ewing6.c:207: warning: ?__pyx_f_6ewing6_f? defined but not used ok compiling jiba4 ... ok compiling nogil ... nogil.c:114: warning: ?__pyx_f_5nogil_f? defined but not used ok compiling extargdefault ... ok compiling typecast ... typecast.c:113: warning: ?__pyx_f_8typecast_f? defined but not used ok compiling coventry1 ... ok compiling fromimport ... ok compiling cstructreturn ... cstructreturn.c:124: warning: ?__pyx_f_13cstructreturn_f? defined but not used ok compiling globvardef ... ok compiling ctypedefclass ... ok compiling ia_cdefblock ... ia_cdefblock.c:157: warning: ?__pyx_f_12ia_cdefblock_priv_f? defined but not used ok compiling coercetovoidptr ... coercetovoidptr.c:111: warning: ?__pyx_f_15coercetovoidptr_f? defined but not used ok compiling withgil ... withgil.c:114: warning: ?__pyx_f_7withgil_f? defined but not used withgil.c:140: warning: ?__pyx_f_7withgil_g? defined but not used ok compiling declandimpl ... ok compiling ishimoto1 ... ok compiling specmethargdefault ... ok compiling funcptr ... ok compiling globalonly ... ok compiling extdelattr ... ok compiling pyclass ... 
ok compiling tryfinally ... tryfinally.c: In function ?__pyx_pf_10tryfinally_f?: tryfinally.c:213: warning: ?__pyx_exc_type? is used uninitialized in this function tryfinally.c:214: warning: ?__pyx_exc_value? is used uninitialized in this function tryfinally.c:215: warning: ?__pyx_exc_tb? is used uninitialized in this function tryfinally.c:223: warning: ?__pyx_exc_lineno? is used uninitialized in this function tryfinally.c:311: warning: ?__pyx_exc_tb? is used uninitialized in this function tryfinally.c:311: warning: ?__pyx_exc_value? is used uninitialized in this function tryfinally.c:311: warning: ?__pyx_exc_type? is used uninitialized in this function tryfinally.c:312: warning: ?__pyx_exc_lineno? is used uninitialized in this function ok compiling doda1 ... doda1.c:127: warning: ?__pyx_f_5doda1_foo? defined but not used ok compiling extgetattr ... ok compiling johnson2 ... ok compiling cenum ... cenum.c:129: warning: ?__pyx_f_5cenum_eggs? defined but not used ok compiling coercearraytoptr ... coercearraytoptr.c:126: warning: ?__pyx_f_16coercearraytoptr_eggs? defined but not used ok compiling kleckner1 ... kleckner1.c:126: warning: ?__pyx_f_9kleckner1_g? defined but not used ok compiling exthash ... ok compiling extern ... extern.c:131: warning: ?__pyx_f_6extern_grail? defined but not used ok compiling pinard4 ... ok compiling hinsen2 ... ok compiling ewing9 ... ok compiling longunsigned ... ok compiling extinheritset ... ok compiling ewing7 ... ok compiling ishimoto4 ... ishimoto4.c:110: warning: ?__pyx_f_9ishimoto4_f? defined but not used ok compiling khavkine1 ... khavkine1.c:126: warning: ?__pyx_f_9khavkine1_f? defined but not used ok compiling a_capi ... ok compiling ctypedefunion ... ok compiling extpropertydel ... ok compiling withnogil ... withnogil.c:113: warning: ?__pyx_f_9withnogil_f? defined but not used withnogil.c:145: warning: ?__pyx_f_9withnogil_g? defined but not used ok compiling extimported ... ok compiling ctypedefstruct ... 
ok compiling constexpr ... ok compiling ewing8 ... ok compiling signedtypes ... ok compiling nonctypedefclass ... ok compiling cdefexternfromstar ... ok compiling complexbasetype ... complexbasetype.c:115: warning: ?__pyx_f_15complexbasetype_brian? defined but not used ok compiling classmethargdefault ... ok compiling cforfromloop ... ok compiling extpropertyset ... ok compiling ewing3 ... ok compiling behnel4 ... behnel4.c:123: warning: ?__pyx_f_7behnel4_f? defined but not used ok compiling specmethdocstring ... specmethdocstring.c:131: warning: ?__pyx_doc_17specmethdocstring_1C___init__? defined but not used specmethdocstring.c:155: warning: ?__pyx_doc_17specmethdocstring_1C_3foo___get__? defined but not used specmethdocstring.c:171: warning: ?__pyx_doc_17specmethdocstring_1C_3foo___set__? defined but not used ok compiling extforward ... ok compiling cverylongtypes ... ok compiling argdefault ... ok compiling nullptr ... ok compiling extinheritdel ... ok ====================================================================== FAIL: Doctest: new_style_exceptions ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/doctest.py", line 2157, in runTest raise self.failureException(self.format_failure(new.getvalue())) AssertionError: Failed doctest test for new_style_exceptions File "/home/kirr/src/tools/cython/cython-devel/BUILD/new_style_exceptions.so", line unknown line number, in new_style_exceptions ---------------------------------------------------------------------- File "/home/kirr/src/tools/cython/cython-devel/BUILD/new_style_exceptions.so", line ?, in new_style_exceptions Failed example: test(Exception('hi')) Expected: Raising: Exception('hi',) Caught: Exception('hi',) Got: Raising: Caught: ====================================================================== FAIL: Doctest: tuplereassign ---------------------------------------------------------------------- Traceback (most recent 
call last): File "/usr/lib/python2.4/doctest.py", line 2157, in runTest raise self.failureException(self.format_failure(new.getvalue())) AssertionError: Failed doctest test for tuplereassign File "/home/kirr/src/tools/cython/cython-devel/BUILD/tuplereassign.so", line 23, in tuplereassign ---------------------------------------------------------------------- File "/home/kirr/src/tools/cython/cython-devel/BUILD/tuplereassign.so", line 31, in tuplereassign Failed example: testnonsense() Expected: Traceback (most recent call last): TypeError: 'int' object is not iterable Got: Traceback (most recent call last): File "/usr/lib/python2.4/doctest.py", line 1248, in __run compileflags, 1) in test.globs File "", line 1, in ? testnonsense() File "tuplereassign.pyx", line 26, in tuplereassign.testnonsense (tuplereassign.c:%u) TypeError: iteration over non-sequence ---------------------------------------------------------------------- Ran 213 tests in 289.050s FAILED (failures=2) Creating lexicon... Done (0.21 seconds) Pickling lexicon... Done (0.02 seconds) From stefan_ml at behnel.de Fri May 2 09:13:10 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 02 May 2008 09:13:10 +0200 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: <20080502055844.GA10755@evo> References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> <481A079E.8030705@behnel.de> <20080502055844.GA10755@evo> Message-ID: <481ABF06.7020707@behnel.de> Hi, Kirill Smelkov wrote: > I still have 3 failures with latest cython-devel (b869698d6f22) > Attached is a log of 'python runtests.py' run on Debian testing. Thanks. That's running against Python 2.4, while the two tests were depending on Python 2.5 specifics (an exception message and new style exception types). I relaxed them a bit to run on Py2.3-2.5. The test runner script also prints the versions of Cython and Python now, to put some more context into the test log. 
Stefan From kirr at landau.phys.spbu.ru Fri May 2 09:46:38 2008 From: kirr at landau.phys.spbu.ru (Kirill Smelkov) Date: Fri, 2 May 2008 11:46:38 +0400 Subject: [Cython] Cython 0.9.6.14 release candidate In-Reply-To: <481ABF06.7020707@behnel.de> References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> <481A079E.8030705@behnel.de> <20080502055844.GA10755@evo> <481ABF06.7020707@behnel.de> Message-ID: <20080502074637.GB10755@evo> Hi Stefan, On Fri, May 02, 2008 at 09:13:10AM +0200, Stefan Behnel wrote: > Hi, > > Kirill Smelkov wrote: > > I still have 3 failures with latest cython-devel (b869698d6f22) > > Attached is a log of 'python runtests.py' run on Debian testing. > > Thanks. That's running against Python 2.4, while the two tests were depending > on Python 2.5 specifics (an exception message and new style exception types). > I relaxed them a bit to run on Py2.3-2.5. > > The test runner script also prints the versions of Cython and Python now, to > put some more context into the test log. kirr at evo:~/src/tools/cython/cython-devel$ python runtests.py Running tests against Cython 0.9.6.13.1 Python 2.4.5 (#2, Mar 12 2008, 00:15:51) [GCC 4.2.3 (Debian 4.2.3-2)] ... Ran 213 tests in 243.547s OK ---- Thanks! -- ????? ????????, ??????. From robertwb at math.washington.edu Fri May 2 10:38:42 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 2 May 2008 01:38:42 -0700 Subject: [Cython] Cython 0.9.6.14 released Message-ID: <463FD591-1904-4CD0-A67B-0EBB62AB6A4E@math.washington.edu> Cython 0.9.6.14 is out. 
This is mostly a bugfix release; however, there are several other
improvements, notably:

- Source code encoding support (PEP 263) and UTF-8 default source
  encoding (PEP 3120) (Stefan Behnel)
- New command line option -w to change the working directory when
  running Cython (Gary Furnish)
- L.append(x) is now optimized if L is a (runtime) list (Robert Bradshaw)
- Cdef variables may be declared as Python builtin types (CEP 507),
  though there is much more potential for optimization (Robert Bradshaw)
- Enums declared "public" will get exported to the (Python-accessible)
  module namespace (Robert Bradshaw)
- Correct special float values (Christian Heimes/Stefan Behnel)

Thanks also to all those who submitted good bug reports.

- Robert

From robertwb at math.washington.edu Fri May 2 10:40:11 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 2 May 2008 01:40:11 -0700
Subject: [Cython] Cython 0.9.6.14 release candidate
In-Reply-To: <20080502074637.GB10755@evo>
References: <4819F55E.5010909@behnel.de> <4819FBF7.9030901@behnel.de> <481A079E.8030705@behnel.de> <20080502055844.GA10755@evo> <481ABF06.7020707@behnel.de> <20080502074637.GB10755@evo>
Message-ID:

On May 2, 2008, at 12:46 AM, Kirill Smelkov wrote:

> Hi Stefan,
>
> On Fri, May 02, 2008 at 09:13:10AM +0200, Stefan Behnel wrote:
>> Hi,
>>
>> Kirill Smelkov wrote:
>>> I still have 3 failures with latest cython-devel (b869698d6f22)
>>> Attached is a log of 'python runtests.py' run on Debian testing.
>>
>> Thanks. That's running against Python 2.4, while the two tests
>> were depending on Python 2.5 specifics (an exception message and
>> new style exception types).
>> I relaxed them a bit to run on Py2.3-2.5.
>>
>> The test runner script also prints the versions of Cython and
>> Python now, to put some more context into the test log.
>
> kirr at evo:~/src/tools/cython/cython-devel$ python runtests.py
> Running tests against Cython 0.9.6.13.1
> Python 2.4.5 (#2, Mar 12 2008, 00:15:51)
> [GCC 4.2.3 (Debian 4.2.3-2)]
>
> ...
>
> Ran 213 tests in 243.547s
>
> OK
>
> ----
>
> Thanks!

Thanks for fixing this. It didn't make it into Cython 0.9.6.14 (which I
pushed out this morning) but at least it's a bug in the testing code
rather than in the compiler itself, and these changes will certainly be
in the next release.

- Robert

From dagss at student.matnat.uio.no Fri May 2 13:22:35 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 02 May 2008 13:22:35 +0200
Subject: [Cython] Some small phase refactorings
Message-ID: <481AF97B.8030305@student.matnat.uio.no>

This isn't much, but it is a start. Phase refactoring seems like it will
become a bit difficult and I think it is better left for dev1 if
possible; I need to work with someone who knows the source more
intimately. It appears that some parts must almost be "rebuilt", or at
least have their control flow heavily altered between statements; it is
all about keeping the changes to a minimum... The good news is that a
certain amount of spaghetti will disappear through the process (you
know -- the things that break in one part of the code when you change
another part of the code; while not necessarily the root of all evil, it
does make it harder to get to know a codebase).

But I have a patch for some trivial stuff at the "top" of the call chain
that at least is a good point for discussion (if somebody else cares how
this is done). It can be found here:

http://wiki.cython.org/DagSverreSeljebotn/patches?action=AttachFile&do=get&target=phaserefactoring1.diff

and passes the test cases; I consider it ready for inclusion.
Basically, what it does is turn this (pseudo-code describing the overall
structure, you won't find this anywhere):

    for each function in module:
        construct_function_scope
        analyse_control_flow
        analyse_declarations
        analyse_expressions
        generate_code

into

    for each function in module:
        construct_function_scope
    for each function in module:
        analyse_control_flow
    for each function in module:
        analyse_declarations
    for each function in module:
        analyse_expressions
    for each function in module:
        generate_code

However, each of these is in turn run as recursive calls just like
before. In particular, analyse_expressions is still one phase and not
easily separable (it should be split in three I think: analyse types,
coercion, allocate temps).

Also, these are all "function-level"; on module-level things happen like
before, i.e. analyse_declarations on the module level is run at another
time (and for the near future it should remain this way; just consider
module-level and function-level analysis different phases).

The new code is implemented using visitor transforms. One could argue
about this, but I do think it leads to pretty neat code. The third
for-loop in the pseudo-code above is implemented like this:

Cython/Compiler/Transforms/Analysis.py [is this source structure ok?]:

    class AnalyseFunctionBodyDeclarations(VisitorTransform):
        def pre_FuncDefNode(self, node):
            node.body.analyse_declarations(node.scope)
            return False # do not recurse beyond first function level

Cython/Compiler/ModuleNode.py:

    class ModuleNode:
        def generate_c_code(self, env, options, result):
            ...
            import Cython.Compiler.Transforms.Analysis as Analysis
            ...
            Analysis.AnalyseFunctionBodyDeclarations()(self, env=env)
            ...
            self.body.generate_function_definitions(env, code, options.transforms)
            ...
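
The reordering in the pseudo-code above can be sketched as a toy program. This is only an illustration under stated assumptions: `FuncNode`, `PHASES`, `run_interleaved` and `run_phased` are hypothetical names, and each phase is reduced to a log entry rather than real compiler work.

```python
# Hypothetical stand-ins: none of these names come from the Cython sources.
PHASES = [
    "construct_function_scope",
    "analyse_control_flow",
    "analyse_declarations",
    "analyse_expressions",
    "generate_code",
]


class FuncNode:
    """Toy function node; a real node would carry a body and a scope."""

    def __init__(self, name):
        self.name = name


def run_interleaved(functions, log):
    """Old structure: every phase runs on one function before the next."""
    for func in functions:
        for phase in PHASES:
            log.append((phase, func.name))


def run_phased(functions, log):
    """New structure: one phase runs on all functions before the next phase."""
    for phase in PHASES:
        for func in functions:
            log.append((phase, func.name))


if __name__ == "__main__":
    funcs = [FuncNode("f"), FuncNode("g")]
    old_order, new_order = [], []
    run_interleaved(funcs, old_order)
    run_phased(funcs, new_order)
    print(old_order[:2])
    print(new_order[:2])
```

Both variants do the same work; only the order changes, which is what lets a later phase rely on an earlier phase having finished for every function.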
The only non-trivial part is that the function scope is constructed by a
tree transform that adds the "scope" and "scopenode" attributes in the
following way: The FuncDefNodes and ModuleNodes get their "scope"
attribute set to the object that was previously passed around everywhere
as the "env" parameter (this is then read back in order to pass the
"env" correctly). All nodes also have their "scopenode" attribute set to
the node that caused creation of the enclosing scope. I.e., rather than
relying on an env passed recursively from the function/module, it is
possible for nodes to look up self.scopenode.scope. This will aid
separation of phases.

--
Dag Sverre

From dagss at student.matnat.uio.no Fri May 2 13:38:53 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 02 May 2008 13:38:53 +0200
Subject: [Cython] Visitor patterns and Python 2.4
Message-ID: <481AFD4D.4050401@student.matnat.uio.no>

Are there any real reasons for leaving the Cython compiler (not talking
about generated or supported code of course) at Python 2.3, rather than
a small bump to 2.4? Reason: I'd like decorators.

The rationale: Notice that parse tree visitors can currently be written
like this:

    class AnalyseControlFlow(VisitorTransform):
        def pre_FuncDefNode(self, node):
            node.body.analyse_control_flow(node.scope)
            return False # do not recurse beyond first function level

The VisitorTransform parent implements a tree iteration/recursion and
calls "pre_X", which is only allowed to visit and stop recursion at that
point, "process_X", which has full control (can replace or remove the
node, do various infix processing -- if "process_X" is implemented, pre
and post are not called) and "post_X" (which only visits).

VisitorTransform basically looks up node.__class__.__name__ to figure
out the name of the function to call, and uses introspection. Things are
cached.

There are two disadvantages to this though:

- Relying on node.__class__.__name__ is a bit fragile.
  Multiple classes can share a name, and simply renaming a class breaks
  the lookup. It would be better to use a real reference.
- If using multiple instances of a Transform class, each instance has
  to rebuild its call table cache. (This is not a real issue though, I
  don't expect such cases to come up.) Using a metaclass, the call table
  could however be built at class definition time, which is "nicer".
  This is not a good reason.

If bumping the version to 2.4, decorators can solve this. OTOH, I'd much
rather make do with function names than resort to "manually calling
decorators". The example above could then look like this:

    class AnalyseControlFlow(VisitorTransform):
        @pre(FuncDefNode)
        def call_analyse_control_flow(self, node):
            node.body.analyse_control_flow(node.scope)
            return False # do not recurse beyond first function level

This has the advantage that the class is directly referenced rather than
being embedded in a string, so that name clashes, class renaming etc.
work correctly. Like this:

    from Nodes import FuncDefNode
    funcdef = FuncDefNode

    class AnalyseControlFlow(VisitorTransform):
        @pre(funcdef)
        def call_analyse_control_flow(self, node):
            ...

Note that the standard visitor pattern (implementing accept on each and
every node) was shot down by Fabrizio earlier (in private communication)
as wasting code lines and unpythonic, and I guess I agree, so I
implemented his suggestion instead. (The only reason I can see is speed
when running under Cython compilation, if one uses "cdef" and
compilation binding so that a dictionary lookup happens anyway. Not the
thing to worry about at this stage IMO.)
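
The decorator idea above could be sketched roughly as follows. This is a hedged illustration, not the implementation from the patch: `pre`, the `_pre_for` attribute, the toy `Node` classes, `NestedFuncNode` and `CollectFunctions` are all assumptions made for the example, and it uses present-day Python syntax rather than Python 2.4.

```python
class Node:
    """Hypothetical parse tree node with a name and child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class FuncDefNode(Node):
    pass


class NestedFuncNode(FuncDefNode):
    """Hypothetical subclass, used to show dispatch through the MRO."""


def pre(node_class):
    """Mark a method as the pre-visit handler for node_class."""
    def mark(method):
        method._pre_for = node_class  # a real class reference, not a string
        return method
    return mark


class VisitorTransform:
    def __init__(self):
        # Build the dispatch table from the decorated methods.
        self._pre_handlers = {}
        for attr_name in dir(self):
            attr = getattr(self, attr_name)
            node_class = getattr(attr, "_pre_for", None)
            if node_class is not None:
                self._pre_handlers[node_class] = attr

    def _find_pre(self, node):
        # Walk the class hierarchy so subclasses reuse a parent's handler.
        for cls in type(node).__mro__:
            handler = self._pre_handlers.get(cls)
            if handler is not None:
                return handler
        return None

    def __call__(self, node):
        handler = self._find_pre(node)
        # A handler returning False stops recursion below this node.
        if handler is not None and handler(node) is False:
            return node
        for child in node.children:
            self(child)
        return node


class CollectFunctions(VisitorTransform):
    def __init__(self):
        super().__init__()
        self.seen = []

    @pre(FuncDefNode)
    def record_function(self, node):
        self.seen.append(node.name)
        return False  # do not recurse beyond first function level


if __name__ == "__main__":
    inner = FuncDefNode("inner")
    tree = Node("module", [FuncDefNode("f", [inner]), NestedFuncNode("g")])
    visitor = CollectFunctions()
    visitor(tree)
    print(visitor.seen)
```

Because the handler is keyed on the class object, renaming `FuncDefNode` fails loudly at import time instead of silently skipping the handler. This sketch still builds the table per instance; a metaclass could build it once per class.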
--
Dag Sverre

From stefan_ml at behnel.de Sat May 3 07:53:30 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 03 May 2008 07:53:30 +0200
Subject: [Cython] Some small phase refactorings
In-Reply-To: <481AF97B.8030305@student.matnat.uio.no>
References: <481AF97B.8030305@student.matnat.uio.no>
Message-ID: <481BFDDA.8060007@behnel.de>

Hi,

since no-one replied so far (and since I think public code-review is
important), here I go. Lines starting with ### are mine.

In general, I'm +0 for the change and -0 for the patch. I think, if we
use transformations, there should be a good way to subscribe them to
what they work on. We brought up the idea of a path language, but maybe
a subscription to an AST class and just calling the transform to let it
return either the unchanged input or a modified AST subtree might
already be enough.

diff -r 3c924a0594ba Cython/Compiler/ModuleNode.py
--- a/Cython/Compiler/ModuleNode.py Fri May 02 10:22:20 2008 +0200
+++ b/Cython/Compiler/ModuleNode.py Fri May 02 11:42:17 2008 +0200
@@ -33,7 +33,7 @@ class ModuleNode(Nodes.Node, Nodes.Block
     # module_temp_cname string
     # full_module_name string

-    children_attrs = ["body"]
+    child_attrs = ["body"]

     def analyse_declarations(self, env):
         if Options.embed_pos_in_docstring:

### This looks like a separate fix to me.

+        import Cython.Compiler.Transforms.Analysis as Analysis
+        Analysis.CreateFunctionScope()(self, env=env)
+        Analysis.AnalyseControlFlow()(self, env=env)
+        Analysis.AnalyseFunctionBodyDeclarations()(self, env=env)
+        Analysis.AnalyseFunctionBodyExpressions()(self, env=env)
+        options.transforms.run('after_function_analysis', self, global_env=env)
+

### These look like functions, they should follow PEP 8 naming. Then again,
### why aren't they functions?
+    def get_visitfunc(self, prefix, cls):
+        mname = prefix + cls.__name__
+        m = self.visitmethods[prefix].get(mname)
+        if m is None:
+            # Must resolve, try entire hierarchy
+            for cls in inspect.getmro(cls):
+                m = getattr(self, prefix + cls.__name__, None)
+                if m is not None:
+                    break
+            if m is None: raise RuntimeError("Not a Node descendant: " + node)
+            self.visitmethods[prefix][mname] = m
+        return m

### I'm not convinced of this one. I understand why you do it, but I believe
### that using the class itself rather than dispatching on a string would help
### here (I saw your decorator proposal - still thinking about it, but might
### be worth doing).

@@ -110,7 +174,6 @@ class TransformSet(dict):
     def run(self, name, node, **options):
         assert name in self
         for transform in self[name]:
-            transform.initialize(phase=name, **options)
-            transform.process_node(node, "(root)")
+            transform(node, phase=name, **options)

### I like this part.

From stefan_ml at behnel.de Sat May 3 07:56:42 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 03 May 2008 07:56:42 +0200
Subject: [Cython] Visitor patterns and Python 2.4
In-Reply-To: <481AFD4D.4050401@student.matnat.uio.no>
References: <481AFD4D.4050401@student.matnat.uio.no>
Message-ID: <481BFE9A.4080600@behnel.de>

Hi,

Dag Sverre Seljebotn wrote:
> Are there any real reasons for leaving the Cython compiler (not talking
> about generated or supported code of course) at Python 2.3, rather than
> a small bump to 2.4? Reason: I'd like decorators.
>
> The rationale: Notice that parse tree visitors can currently be written
> like this:
>
> class AnalyseControlFlow(VisitorTransform):
>     def pre_FuncDefNode(self, node):
>         node.body.analyse_control_flow(node.scope)
>         return False # do not recurse beyond first function level

If this way of doing it is accepted, I'm actually for accepting 2.4
code. I don't know any important platform that doesn't come with at
least Py2.4.
And note that most people won't even need to run Cython themselves, even
if they use software implemented in Cython.

Stefan

From dagss at student.matnat.uio.no Sat May 3 10:27:16 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 3 May 2008 10:27:16 +0200 (CEST)
Subject: [Cython] Some small phase refactorings
In-Reply-To: <481BFDDA.8060007@behnel.de>
References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de>
Message-ID: <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no>

Thanks for the feedback.

> - children_attrs = ["body"]
> + child_attrs = ["body"]
>
>
> ### This looks like a separate fix to me.

Indeed (sorry).

>
> + import Cython.Compiler.Transforms.Analysis as Analysis
> + Analysis.CreateFunctionScope()(self, env=env)
> + Analysis.AnalyseControlFlow()(self, env=env)
> + Analysis.AnalyseFunctionBodyDeclarations()(self, env=env)
> + Analysis.AnalyseFunctionBodyExpressions()(self, env=env)
> + options.transforms.run('after_function_analysis', self,
> global_env=env)
> +
>
> ### These look like functions, they should follow PEP 8 naming. Then
> again,
> ### why aren't they functions?

Note that this code will almost certainly be moved again and rewritten
at some later point (they can't really belong to ModuleNode); but more
refactoring must happen before they can be moved to their proper
location and they serve a good purpose where they are for now though.

They have to be classes as they are transform objects with member
methods, some state etc. etc. But thinking about it one could have
functions like this in Analysis.py:

    def analyse_function_body_declarations(tree, **opts):
        return AnalyseFunctionBodyDeclarations()(tree, **opts)

if that helps.
The reason I started using the __call__ is that I think in time one can treat these as functions, like this: pipeline = [ f, g, AnalyseFunctionBodyDeclarations(), Coercions(), ] for transform in pipeline: tree = transform(tree) (ie, basically saying that "pipeline = f g h"...) > ### I'm not convinced of this one. I understand why you do it, but I > believe > ### that using the class itself rather than dispatching on a string would > help > ### here (I saw your decorator proposal - still thinking about it, but > might > ### be worth doing). If not, then the classical visitor pattern might put you at ease?: class FuncDefNode: def accept(self, visitor): visitor.visit_FuncDefNode(self) However, that's a lot of extra trivial code to add (this would have to be added to _all_ classes), and when one is using Python anyway I'd like to avoid pretending I'm writing Java... :-) I hope we can start using Python 2.4, then I'll implement a decorator/metaclass solution instead. In conclusion, I'd like to mention that I really think the important thing here is to consider the "grand, large-scale" features of the patch. I didn't polish the details in any way, because I think that what is important here is the changes they make possible in the application structure. How the visitors look like can be changed entirely in Visitor.py and Analysis.py without interfering with existing code; while the phase refactoring is going to intrude everywhere and make changes all over the place, so the form the phase refactoring will take is the important point. (OTOH; I guess it is a good time to think about the details as well so that when the 1000 line coercion refactoring patch should be written one knows what to do in the details...) (I've asked myself as to whether it is all worth it BTW. It is a heavy and non-fun task. But I'm still convinced there's absolutely no way around it if the feature-set of Cython is to grow significantly in any way. 
And realistically, it can be done in two or three days with for instance me, you and Robert working together... So this might be too early to talk about it; but I end up working on it anyway because it is effectively a blocker for me and I cannot get anywhere with my GSoC stuff until it is done :-) ) Dag Sverre From kirr at mns.spb.ru Sat May 3 18:21:36 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sat, 03 May 2008 20:21:36 +0400 Subject: [Cython] [PATCH 2 of 3] Fix cimport example in "Sharing Declarations" section In-Reply-To: Message-ID: # HG changeset patch # User Kirill Smelkov # Date 1209831534 -14400 # Node ID a81645475dc33f1916f6dc7ee5ec8f52141b20e8 # Parent 0745e2b1f904c2ab8b7c70218d59bd933cbbe3ab Fix cimport example in "Sharing Declarations" section - we need to define struct variables with 'cdef' - there is no -> in Cython/Pyrex diff --git a/docs/sharing_declarations.rst b/docs/sharing_declarations.rst --- a/docs/sharing_declarations.rst +++ b/docs/sharing_declarations.rst @@ -82,9 +82,9 @@ uses it. d.filler = dishes.sausage def serve(): - spamdish d + cdef spamdish d prepare(&d) - print "%d oz spam, filler no. %d" % (d->oz_of_spam, d->otherstuff) + print "%d oz spam, filler no. 
%d" % (d.oz_of_spam, d.otherstuff) It is important to understand that the :keyword:`cimport` statement can only be used to import C data types, C functions and variables, and extension From kirr at mns.spb.ru Sat May 3 18:21:37 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sat, 03 May 2008 20:21:37 +0400 Subject: [Cython] [PATCH 3 of 3] Reminder to put the table about styles of struct, union and enum declarations In-Reply-To: Message-ID: <630bbffdd62e0e82c551.1209831697@tugrik2.mns.mnsspb.ru> # HG changeset patch # User Kirill Smelkov # Date 1209831643 -14400 # Node ID 630bbffdd62e0e82c551334cbf4579b751859248 # Parent a81645475dc33f1916f6dc7ee5ec8f52141b20e8 Reminder to put the table about styles of struct, union and enum declarations diff --git a/docs/external_C_code.rst b/docs/external_C_code.rst --- a/docs/external_C_code.rst +++ b/docs/external_C_code.rst @@ -158,6 +158,8 @@ header file, and the corresponding Cytho header file, and the corresponding Cython declaration that you should put in the ``cdef extern`` from block. Struct declarations are used as an example; the same applies equally to union and enum declarations. + +**TODO** put table here... Note that in all the cases below, you refer to the type in Cython code simply as :ctype:`Foo`, not ``struct Foo``. 
From kirr at mns.spb.ru Sat May 3 18:21:35 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sat, 03 May 2008 20:21:35 +0400 Subject: [Cython] [PATCH 1 of 3] Fix typos in documentation here and there In-Reply-To: Message-ID: <0745e2b1f904c2ab8b7c.1209831695@tugrik2.mns.mnsspb.ru> # HG changeset patch # User Kirill Smelkov # Date 1209831403 -14400 # Node ID 0745e2b1f904c2ab8b7c70218d59bd933cbbe3ab # Parent bc0d87c4a2953e32beed52bdb461bba99a7adda5 Fix typos in documentation here and there diff --git a/docs/extension_types.rst b/docs/extension_types.rst --- a/docs/extension_types.rst +++ b/docs/extension_types.rst @@ -209,25 +209,26 @@ time it is written to, returns the list time it is written to, returns the list when it is read, and empties the list when it is deleted.:: - #cheesy.pyx Test input + # cheesy.pyx cdef class CheeseShop: - cdef object cheeses + cdef object cheeses - def __cinit__(self): - self.cheeses = [] + def __cinit__(self): + self.cheeses = [] - property cheese: + property cheese: - def __get__(self): - return "We don't have: %s" % self.cheeses + def __get__(self): + return "We don't have: %s" % self.cheeses - def __set__(self, value): - self.cheeses.append(value) + def __set__(self, value): + self.cheeses.append(value) - def __del__(self): - del self.cheeses[:] + def __del__(self): + del self.cheeses[:] + # Test input from cheesy import CheeseShop shop = CheeseShop() @@ -242,7 +243,7 @@ when it is deleted.:: del shop.cheese print shop.cheese - #Test output + # Test output We don't have: [] We don't have: ['camembert'] We don't have: ['camembert', 'cheddar'] @@ -280,18 +281,17 @@ functions, C methods are declared using :keyword:`def`. C methods are "virtual", and may be overridden in derived extension types.:: - pets.pyx - Output + # pets.pyx cdef class Parrot: - cdef void describe(self): - print "This parrot is resting." + cdef void describe(self): + print "This parrot is resting." 
cdef class Norwegian(Parrot): - cdef void describe(self): - Parrot.describe(self) - print "Lovely plumage!" + cdef void describe(self): + Parrot.describe(self) + print "Lovely plumage!" cdef Parrot p1, p2 @@ -301,7 +301,9 @@ extension types.:: p1.describe() print "p2:" p2.describe() - p1: + + # Output + p1: This parrot is resting. p2: This parrot is resting. diff --git a/docs/external_C_code.rst b/docs/external_C_code.rst --- a/docs/external_C_code.rst +++ b/docs/external_C_code.rst @@ -293,18 +293,16 @@ Any public C type or extension type decl Any public C type or extension type declarations in the Cython module are also made available when you include :file:`modulename_api.h`.:: - delorean.pyx - - marty.c - + # delorean.pyx cdef public struct Vehicle: - int speed - float power + int speed + float power cdef api void activate(Vehicle *v): if v.speed >= 88 and v.power >= 1.21: print "Time travel achieved" + # marty.c #include "delorean_api.h" Vehicle car; diff --git a/docs/language_basics.rst b/docs/language_basics.rst --- a/docs/language_basics.rst +++ b/docs/language_basics.rst @@ -327,7 +327,7 @@ Error return values Error return values ------------------- -If you don't do anything special, a function declared with :keyword`cdef` that +If you don't do anything special, a function declared with :keyword:`cdef` that does not return a Python object has no way of reporting Python exceptions to its caller. If an exception is detected in such a function, a warning message is printed and the exception is ignored. @@ -520,7 +520,7 @@ statements, combined using any of the Py statements, combined using any of the Python expression syntax. The following compile-time names are predefined, corresponding to the values -returned by :func:``os.uname``. +returned by :func:`os.uname`. 
UNAME_SYSNAME, UNAME_NODENAME, UNAME_RELEASE, UNAME_VERSION, UNAME_MACHINE diff --git a/docs/pyrex_differences.rst b/docs/pyrex_differences.rst --- a/docs/pyrex_differences.rst +++ b/docs/pyrex_differences.rst @@ -108,7 +108,7 @@ python. One can declare variables and re :ctype:`bint` type. For example:: cdef int i = x - Cdef bint b = x + cdef bint b = x The first conversion would happen via ``x.__int__()`` whereas the second would happen via ``x.__nonzero__()``. (Actually, if ``x`` is the python object @@ -156,7 +156,7 @@ method on the class directly, e.g.:: cdef class A: cpdef foo(self): - pass + pass x = A() x.foo() # will check to see if overridden @@ -197,10 +197,10 @@ It does not stop one from casting where It does not stop one from casting where there is no conversion (though it will emit a warning). If one really wants the address, cast to a ``void *`` first. -As in Pyrex ``x`` will cast ``x`` to type `MyExtensionType` without any +As in Pyrex ``x`` will cast ``x`` to type :ctype:`MyExtensionType` without any type checking. Cython supports the syntax ```` to do the cast with type checking (i.e. it will throw an error if ``x`` is not a (subclass of) -`MyExtensionType`. +:ctype:`MyExtensionType`. Optional arguments in cdef/cpdef functions ------------------------------------------ diff --git a/docs/sharing_declarations.rst b/docs/sharing_declarations.rst --- a/docs/sharing_declarations.rst +++ b/docs/sharing_declarations.rst @@ -16,7 +16,12 @@ modules, and an implementation file with modules, and an implementation file with a ``.pyx`` suffix, containing everything else. When a module wants to use something declared in another module's definition file, it imports it using the :keyword:`cimport` -statement. What a Definition File contains A definition file can contain: +statement. + +What a Definition File contains +------------------------------- + +A definition file can contain: * Any kind of C type declaration. * extern C function or variable declarations. 
@@ -139,11 +144,11 @@ C functions defined at the top level of :keyword:`cimport` by putting headers for them in the ``.pxd`` file, for example,: -:file:`volume.pxd`: +:file:`volume.pxd`:: + + cdef float cube(float) :file:`spammery.pyx`:: - - cdef float cube(float) from volume cimport cube @@ -185,17 +190,22 @@ Here is an example of a module which def Here is an example of a module which defines and exports an extension type, and another module which uses it.:: - Shrubbing.pxd Shrubbing.pyx + # Shrubbing.pxd cdef class Shrubbery: cdef int width - cdef int length cdef class Shrubbery: + cdef int length + + # Shrubbing.pyx + cdef class Shrubbery: def __new__(self, int w, int l): self.width = w self.length = l def standard_shrubbery(): return Shrubbery(3, 7) - Landscaping.pyx + + + # Landscaping.pyx cimport Shrubbing import Shrubbing From kirr at mns.spb.ru Sat May 3 18:21:34 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sat, 03 May 2008 20:21:34 +0400 Subject: [Cython] [PATCH 0 of 3] Various fixes to documentation Message-ID: Hi, While studying (again!) Cython/Pyrex, I've fixed few bits in the documentation. Three patches attached, please apply. Thanks! From ggellner at uoguelph.ca Sat May 3 19:53:48 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Sat, 3 May 2008 13:53:48 -0400 Subject: [Cython] [PATCH 0 of 3] Various fixes to documentation In-Reply-To: References: Message-ID: <20080503175348.GA13881@basestar> On Sat, May 03, 2008 at 08:21:34PM +0400, Kirill Smelkov wrote: > Hi, > > While studying (again!) Cython/Pyrex, I've fixed few bits in the documentation. > > Three patches attached, please apply. > Thanks for all the changes! I have applied them. For future reference the source is now at: http://hg.cython.org/cython-docs (I have not pushed the new file upstream yet, but I will do it as soon as I can. You can see the changes in the website and pdf). 
Gabriel From pete at petertodd.org Sun May 4 07:15:32 2008 From: pete at petertodd.org (Peter Todd) Date: Sun, 4 May 2008 01:15:32 -0400 Subject: [Cython] __getattribute__ In-Reply-To: <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> Message-ID: <20080504051532.GA7720@tilt> On Wed, Apr 30, 2008 at 06:17:31PM -0700, Robert Bradshaw wrote: > On Apr 29, 2008, at 5:42 PM, Peter Todd wrote: > > >Is there a __getattribute__ work-alike in Cython? > > > >Essentially I need direct control over an object's tp_getattro and > >tp_setattro slots to implement a wrapper class. Specifically > >wrapped.__class__ should go to the wrapped object's class attribute, > >not > >the wrapping object's __class__ attribute. > > > >__getattr__ outputs C-source that includes a call to > >PyObject_GenericGetAttr first, and won't run my code if that call > >succeeds. > > > >Thanks, > > > >Peter > > Not that I'm aware of, though I'd imagine that implementing > __getattribute__ (if it exists) as being called at the top of this > function would be fairly easy to do. One would want to match Python > semantics exactly. Here's my first patch. This correctly implements __getattribute__ and __getattr__ in the single class case. FWIW I also have a mercurial tree if it'd be better to pull from it than apply patches. I'm working on making subclasses behave correctly; I've got test cases written up showing where things fail, but no solutions to that written yet. The slot_tp_getattro stuff Stefan mentioned is useful though. -- http://petertodd.org 'peter'[:-1]@petertodd.org -------------- next part -------------- A non-text attachment was scrubbed... Name: __getattribute__.patch Type: text/x-diff Size: 4799 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080504/376ae4c6/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080504/376ae4c6/attachment.pgp From kirr at mns.spb.ru Sun May 4 10:20:56 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 04 May 2008 12:20:56 +0400 Subject: [Cython] [PATCH 2 of 2] Initial .hgignore In-Reply-To: Message-ID: # HG changeset patch # User Kirill Smelkov # Date 1209814507 -14400 # Node ID aaa56370d9fb26ca99e2aa774e373b7383ffd685 # Parent 2fe2f4f4e58972b2a3eb106964cd68db12d88e0e Initial .hgignore We ignore *.pyc, vim swap files, build results under BUILD/ and Lexicon.pickle This is handy, because otherwise, say after runtest.py run, 'hg status' shows lots of unrelated info, thus lowering signal-to-noise ratio. diff --git a/.hgignore b/.hgignore new file mode 100644 --- /dev/null +++ b/.hgignore @@ -0,0 +1,8 @@ +syntax: glob + +*.pyc +*.swp + +Cython/Compiler/Lexicon.pickle +BUILD/ + From kirr at mns.spb.ru Sun May 4 10:20:54 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 04 May 2008 12:20:54 +0400 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree Message-ID: Hi, Here is a couple of handy patches for the case when cython is used "in-tree", i.e. when no installation is done. - First we need to 'chmod a+x' relevant files in bin/ , - and second, it would be convenient to finally create reasonable .hgignore Please apply. 
From kirr at mns.spb.ru Sun May 4 10:20:55 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 04 May 2008 12:20:55 +0400 Subject: [Cython] [PATCH 1 of 2] chmod a+x bin/* In-Reply-To: Message-ID: <2fe2f4f4e58972b2a3eb.1209889255@tugrik2.mns.mnsspb.ru> # HG changeset patch # User Kirill Smelkov # Date 1209814181 -14400 # Node ID 2fe2f4f4e58972b2a3eb106964cd68db12d88e0e # Parent 3c924a0594baee13d595e66de8577fcd1691ee7c chmod a+x bin/* This is handy when using cython right from development tree diff --git a/bin/cython b/bin/cython old mode 100644 new mode 100755 diff --git a/bin/update_references b/bin/update_references old mode 100644 new mode 100755 From kirr at mns.spb.ru Sun May 4 11:12:23 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 13:12:23 +0400 Subject: [Cython] [PATCH 0 of 3] Various fixes to documentation In-Reply-To: <20080503175348.GA13881@basestar> References: <20080503175348.GA13881@basestar> Message-ID: <200805041312.23099.kirr@mns.spb.ru> On Saturday 03 May 2008, Gabriel Gellner wrote: > On Sat, May 03, 2008 at 08:21:34PM +0400, Kirill Smelkov wrote: > > Hi, > > > > While studying (again!) Cython/Pyrex, I've fixed few bits in the documentation. > > > > Three patches attached, please apply. > > > Thanks for all the changes! I have applied them. Thanks! Kirill. From stefan_ml at behnel.de Sun May 4 11:18:27 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 11:18:27 +0200 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: References: Message-ID: <481D7F63.5030402@behnel.de> Hi, Kirill Smelkov wrote: > Here is a couple of handy patches for the case when cython is used "in-tree", > i.e. when no installation is done. > > - First we need to 'chmod a+x' relevant files in bin/ , > - and second, it would be convenient to finally create reasonable .hgignore Thanks, your patches are appreciated.
Still, for the next time, could you send a bundle of changesets (hg bundle) instead of one mail per patch? Although this list is called "cython-dev", it's a somewhat general purpose list for Cython, so many people who read it do not care about small fixes and changes. Thanks, Stefan From kirr at mns.spb.ru Sun May 4 12:08:03 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 14:08:03 +0400 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <481D7F63.5030402@behnel.de> References: <481D7F63.5030402@behnel.de> Message-ID: <200805041408.03746.kirr@mns.spb.ru> Hi, On Sunday 04 May 2008, Stefan Behnel wrote: > Hi, > > Kirill Smelkov wrote: > > Here is a couple of handy patches for the case when cython is used "in-tree", > > i.e. when no installation is done. > > > > - First we need to 'chmod a+x' relevant files in bin/ , > > - and second, it would be convenient to finally create reasonable .hgignore > > Thanks, your patches are appreciated. Still, for the next time, could you send > a bundle of changesets (hg bundle) instead of one mail per patch? Although > this list is called "cython-dev", it's a somewhat general purpose list for > Cython, so many people who read it do not care about small fixes and changes. Stefan, I see your points. I used 'hg email --outgoing' to send my patches, so if I want them sent as a bundle, I'd use 'hg email --bundle'. I see one problem with this approach though: a bundle is a binary file, and if someone would like to comment on the patch itself - he'll need to manually paste relevant chunks into the reply mail. I understand this list is general purpose, so maybe it would be a good idea to create a separate list for patches? e.g.
cython-patches at codespeak.net Then sending patches there in plain text would be ok for all: - readable patches right in email client - ready-to-apply patches right from email client (one button in mutt) - easy to comment on patches This is a common technique, and for example we do this in sympy-patches: http://groups.google.com/group/sympy-patches/ What do you think? Kirill. From gfurnish at gfurnish.net Sun May 4 13:22:46 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 4 May 2008 05:22:46 -0600 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <200805041408.03746.kirr@mns.spb.ru> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> Message-ID: <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> I strongly disagree. Bundles are horrible for small changes. Text patches are easy to examine without using hg. I should not have to modify my repo to inspect a 3 line patch. On Sun, May 4, 2008 at 4:08 AM, Kirill Smelkov wrote: > Hi, > > On Sunday 04 May 2008, Stefan Behnel wrote: > > > Hi, > > > > Kirill Smelkov wrote: > > > Here is a couple of handy patches for the case when cython is used "in-tree", > > > i.e. when no installation is done. > > > > > > - First we need to 'chmod a+x' relevant files in bin/ , > > > - and second, it would be convenient to finally create reasonable .hgignore > > > > Thanks, your patches are appreciated. Still, for the next time, could you send > > a bundle of changesets (hg bundle) instead of one mail per patch? Although > > this list is called "cython-dev", it's a somewhat general purpose list for > > Cython, so many people who read it do not care about small fixes and changes. > > Stefan, I see your points. > > I used 'hg email --outgoing' to send my patches, so if i want it to be sent as > bundle, I'd use 'hg email --bundle'.
> > I see one problem with this approach though: bundle is a binary file, and if > someone would like to comment on the patch itself - he'll need to manually > paste relevant chunks into reply mail. > > I understand this list is general purpose, so maybe it would be good idea to > create separate list for patches? e.g. cython-patches at codespeak.net > > Then sending patches there in plain text would be ok for all: > > - readable patches right in email client > - ready-to apply patches right from email client (one button in mutt) > - easy to comment on patches > > This is common technique, and for example we do this in sympy-patches: > > http://groups.google.com/group/sympy-patches/ > > What do you think? > > Kirill. > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From stefan_ml at behnel.de Sun May 4 13:34:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 13:34:09 +0200 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> Message-ID: <481D9F31.7040800@behnel.de> Hi, Gary Furnish top-posted: > I strongly disagree. Bundles are horrible for small changes. Text > patches are easy to examine without using hg. I should not have to > modify my repo to inspect a 3 line patch. Ok, what do you suggest then? 
Stefan From gfurnish at gfurnish.net Sun May 4 13:51:28 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 4 May 2008 05:51:28 -0600 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <481D9F31.7040800@behnel.de> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> Message-ID: <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> I agree with the suggestion to have a cython-patches mailing list, either on google or codespeak. On Sun, May 4, 2008 at 5:34 AM, Stefan Behnel wrote: > Hi, > > Gary Furnish top-posted: > > > I strongly disagree. Bundles are horrible for small changes. Text > > patches are easy to examine without using hg. I should not have to > > modify my repo to inspect a 3 line patch. > > Ok, what do you suggest then? > > Stefan > > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From stefan_ml at behnel.de Sun May 4 14:27:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 14:27:09 +0200 Subject: [Cython] code review In-Reply-To: <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> Message-ID: <481DAB9D.1070003@behnel.de> Hi, Gary Furnish top-posted again: > I agree with the suggestion to have a cython-patches mailing list, > either on google or codespeak. I can ask on codespeak. However, wouldn't a tool like this be better: http://codereview.appspot.com/ Guido pronounced he'd be open-sourcing it tomorrow (Apache 2 license). 
http://comments.gmane.org/gmane.comp.python.python-3000.devel/13110 We could see how much work it would be to get it running with hg, and if it's doable, maybe Robert and William could host it on their site, next to the hg repositories? Stefan From gfurnish at gfurnish.net Sun May 4 14:38:56 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 4 May 2008 06:38:56 -0600 Subject: [Cython] code review In-Reply-To: <481DAB9D.1070003@behnel.de> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> <481DAB9D.1070003@behnel.de> Message-ID: <8f8f8530805040538o5ae2be44w333b7974e6894b44@mail.gmail.com> Why not just use trac? The interface is much nicer, and we could move away from the slow and unfriendly launchpad at the same time (and we could easily host it too). It is not integrated with svn like codereview is, but it does have the online patch viewer functionality, and using trac saves us from having to port some other solution to hg. --Gary On Sun, May 4, 2008 at 6:27 AM, Stefan Behnel wrote: > Hi, > > Gary Furnish top-posted again: > > I agree with the suggestion to have a cython-patches mailing list, > > either on google or codespeak. > > I can ask on codespeak. However, wouldn't a tool like this be better: > > http://codereview.appspot.com/ > > Guido pronounced he'd be open-sourcing it tomorrow (Apache 2 license). > > http://comments.gmane.org/gmane.comp.python.python-3000.devel/13110 > > We could see how much work it would be to get it running with hg, and if it's > doable, maybe Robert and William could host it on their site, next to the hg > repositories? 
> > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From kirr at mns.spb.ru Sun May 4 14:55:31 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 16:55:31 +0400 Subject: [Cython] code review In-Reply-To: <481DAB9D.1070003@behnel.de> References: <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> <481DAB9D.1070003@behnel.de> Message-ID: <200805041655.31562.kirr@mns.spb.ru> ? ????????? ?? ??????????? 04 ??? 2008 Stefan Behnel ???????(a): > Hi, > > Gary Furnish top-posted again: > > I agree with the suggestion to have a cython-patches mailing list, > > either on google or codespeak. > > I can ask on codespeak. However, wouldn't a tool like this be better: > > http://codereview.appspot.com/ > > Guido pronounced he'd be open-sourcing it tomorrow (Apache 2 license). > > http://comments.gmane.org/gmane.comp.python.python-3000.devel/13110 I'm looking forward for this! Actually, for sympy-patches we already started to think about setting up, say roundup instance somewhere, sitting and watching for mails on sympy-patches. Then every review request will get into our issues tracker, and every mail reply gets into tracker too, and also the whole thing could be managed then from issues interface too. I'm +1 for everything that would help manage reviews, but I think it should integrate pretty well with plain old email and text editor. Roundup has good email integration, and I hope Guido's tool do too. Looking forward! > We could see how much work it would be to get it running with hg, and if it's > doable, maybe Robert and William could host it on their site, next to the hg > repositories? In SymPy we are too interested in such things, so the effort could be joined. Kirill. 
From stefan_ml at behnel.de Sun May 4 14:58:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 14:58:38 +0200 Subject: [Cython] code review In-Reply-To: <8f8f8530805040538o5ae2be44w333b7974e6894b44@mail.gmail.com> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> <481DAB9D.1070003@behnel.de> <8f8f8530805040538o5ae2be44w333b7974e6894b44@mail.gmail.com> Message-ID: <481DB2FE.70309@behnel.de> Hi, Gary Furnish continues to top-post: > Why not just use trac? The interface is much nicer [...] (and we > could easily host it too). It is not integrated with svn like > codereview is, but it does have the online patch viewer functionality, Hmm, I never used that. Would you know an example to look at? Is it just a patch "viewer" or can you add comments? And if so, at what granularity? Per patch? Per line? > and using trac saves us from having to port some other solution to hg. That is definitely a plus. > the slow and unfriendly launchpad Interesting. Why do you think so? Stefan From kirr at mns.spb.ru Sun May 4 16:24:45 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 18:24:45 +0400 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <481D7F63.5030402@behnel.de> References: <481D7F63.5030402@behnel.de> Message-ID: <200805041824.45735.kirr@mns.spb.ru> Hi Stefan, On Sunday 04 May 2008, Stefan Behnel wrote: > Hi, > > Kirill Smelkov wrote: > > Here is a couple of handy patches for the case when cython is used "in-tree", > > i.e. when no installation is done. > > > > - First we need to 'chmod a+x' relevant files in bin/ , > > - and second, it would be convenient to finally create reasonable .hgignore > > Thanks, your patches are appreciated.
I look here http://hg.cython.org/cython-devel/ and still can't see they were applied. Does it mean I have to do them another way, or am I missing something? Thanks, Kirill. From gfurnish at gfurnish.net Sun May 4 16:46:20 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 4 May 2008 08:46:20 -0600 Subject: [Cython] code review In-Reply-To: <481DB2FE.70309@behnel.de> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> <481DAB9D.1070003@behnel.de> <8f8f8530805040538o5ae2be44w333b7974e6894b44@mail.gmail.com> <481DB2FE.70309@behnel.de> Message-ID: <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> The canonical example of a project that uses trac is Sage: http://trac.sagemath.org/sage_trac It has good email integration (notify on ticket modification, etc, although it is supposed to be even better in the next version). Normally we attach patches to a ticket and then just comment them in the ticket associated with the patch. This associates the bug/feature ticket with the patch (as opposed to needing two systems if you went with something like codeview + launchpad). Compared to trac, launchpad is slow for webpage response time. It is also slow on ticket creation time (I can create a ticket for a given item in maybe 30 seconds on one page in trac, whereas launchpad has so much complexity it is significantly more time consuming, requires a multipage creation step, etc). --Gary On Sun, May 4, 2008 at 6:58 AM, Stefan Behnel wrote: > Hi, > > Gary Furnish continues to top-post: > > > Why not just use trac? The interface is much nicer [...] (and we > > > could easily host it too). It is not integrated with svn like > > codereview is, but it does have the online patch viewer functionality, > > Hmm, I never used that. Would you know an example to look at? 
Is it just a > patch "viewer" or can you add comments? And if so, at what granularity? Per > patch? Per line? > > > > > and using trac saves us from having to port some other solution to hg. > > that is definitely a plus. > > > > > the slow and unfriendly launchpad > > Interesting. Why do you think so? > > Stefan > > From kirr at mns.spb.ru Sun May 4 17:16:49 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 19:16:49 +0400 Subject: [Cython] code review In-Reply-To: <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> References: <481DB2FE.70309@behnel.de> <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> Message-ID: <200805041916.49167.kirr@mns.spb.ru> ? ????????? ?? ??????????? 04 ??? 2008 Gary Furnish ???????(a): > The canonical example of a project that uses trac is Sage: > http://trac.sagemath.org/sage_trac > It has good email integration (notify on ticket modification, etc, > although it is supposed to be even better in the next version). I'd like to clarify what I mean saying "good email integration": Good email integration is two-way That is: -> it is possible to affect state of the issues, add comments to patch review, etc... by sending mail. <- you get notification mails, when someone changes something through web interface, or another way. Personally, I think having the first entry is important - a lot of tasks could be done via plain emails, and at least some people are more productive with keyboard & text editor (compared to clicking with mouse) :) Does Trac have two-way email integration? > Normally we attach patches to a ticket and then just comment them in > the ticket associated with the patch. This associates the bug/feature > ticket with the patch (as opposed to needing two systems if you went > with something like codeview + launchpad). Compared to trac, > launchpad is slow for webpage response time. 
It is also slow on > ticket creation time (I can create a ticket for a given item in maybe > 30 seconds on one page in trac, whereas launchpad has so much > complexity it is significantly more time consuming, requires a > multipage creation step, etc). Gary, All, imagine you could create a patch issue with plain 'hg email', or create new issue with just sending mail to special address. Isn't this cool!? Roundup has this now and it works. Also, although Roundup is not so shiny, it was choosen as the tracker for Python itself: http://bugs.python.org/ What do you think? From stefan_ml at behnel.de Sun May 4 17:36:19 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 17:36:19 +0200 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <200805041824.45735.kirr@mns.spb.ru> References: <481D7F63.5030402@behnel.de> <200805041824.45735.kirr@mns.spb.ru> Message-ID: <481DD7F3.6040401@behnel.de> Hi, Kirill Smelkov wrote: > Hi Stefan, > > ? ????????? ?? ??????????? 04 ??? 2008 Stefan Behnel ???????(a): >> Hi, >> >> Kirill Smelkov wrote: >>> Here is a couple of handy patches for the case when cython is used "in-tree", >>> i.e. when no installation is done. >>> >>> - First we need to 'chmod a+x' relevant files in bin/ , >>> - and second, it would be convenient to finally create reasonable .hgignore >> Thanks, your patches are appreciated. > > I look here > > http://hg.cython.org/cython-devel/ > > and still can't see they were applied. See? That's one of the advantages of a bundle. It allows you to send a complete set of changes against the current trunk, so that things don't get lost. Anything else missing? 
Stefan From gfurnish at gfurnish.net Sun May 4 17:37:28 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 4 May 2008 09:37:28 -0600 Subject: [Cython] code review In-Reply-To: <200805041916.49167.kirr@mns.spb.ru> References: <481DB2FE.70309@behnel.de> <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> <200805041916.49167.kirr@mns.spb.ru> Message-ID: <8f8f8530805040837k6cd312f4u47e64337b3dbea20@mail.gmail.com> Trac doesn't have two way notification emails, only outgoing (as far as I know). Roundup looks neat (especially the two way email integration), but I'm not happy about the lack of a good interface for viewing patches compared to trac's (Example: http://trac.sagemath.org/sage_trac/attachment/ticket/3025/9609.patch ), as roundup just displays the raw text. On Sun, May 4, 2008 at 9:16 AM, Kirill Smelkov wrote: > ? ????????? ?? ??????????? 04 ??? 2008 Gary Furnish ???????(a): > > > The canonical example of a project that uses trac is Sage: > > http://trac.sagemath.org/sage_trac > > It has good email integration (notify on ticket modification, etc, > > although it is supposed to be even better in the next version). > > I'd like to clarify what I mean saying "good email integration": > > Good email integration is two-way > > That is: > > -> it is possible to affect state of the issues, add comments to patch review, > etc... by sending mail. > <- you get notification mails, when someone changes something through web > interface, or another way. > > > Personally, I think having the first entry is important - a lot of tasks could > be done via plain emails, and at least some people are more productive with > keyboard & text editor (compared to clicking with mouse) :) > > Does Trac have two-way email integration? > > > > > > Normally we attach patches to a ticket and then just comment them in > > the ticket associated with the patch. 
This associates the bug/feature > > ticket with the patch (as opposed to needing two systems if you went > > with something like codeview + launchpad). Compared to trac, > > launchpad is slow for webpage response time. It is also slow on > > ticket creation time (I can create a ticket for a given item in maybe > > 30 seconds on one page in trac, whereas launchpad has so much > > complexity it is significantly more time consuming, requires a > > multipage creation step, etc). > > Gary, All, imagine you could create a patch issue with plain 'hg email', > or create new issue with just sending mail to special address. > > Isn't this cool!? > > Roundup has this now and it works. Also, although Roundup is not so shiny, > it was choosen as the tracker for Python itself: > > http://bugs.python.org/ > > What do you think? > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From stefan_ml at behnel.de Sun May 4 17:41:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 17:41:38 +0200 Subject: [Cython] code review In-Reply-To: <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> References: <481D7F63.5030402@behnel.de> <200805041408.03746.kirr@mns.spb.ru> <8f8f8530805040422i1a7fce66h94086394bb5cf41b@mail.gmail.com> <481D9F31.7040800@behnel.de> <8f8f8530805040451t3dfbea2cx670948fd8a8725b2@mail.gmail.com> <481DAB9D.1070003@behnel.de> <8f8f8530805040538o5ae2be44w333b7974e6894b44@mail.gmail.com> <481DB2FE.70309@behnel.de> <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> Message-ID: <481DD932.4010407@behnel.de> Hi, Gary Furnish consistently continues to top-post: > The canonical example of a project that uses trac is Sage: > http://trac.sagemath.org/sage_trac obviously :) > Normally we attach patches to a ticket and then just comment them in > the ticket associated with the patch. 
This associates the bug/feature > ticket with the patch (as opposed to needing two systems if you went > with something like codeview + launchpad). Launchpad can do these things, but it lacks hg integration. > Compared to trac, > launchpad is slow for webpage response time. I rarely had a problem with that. Maybe we're just using it from different time zones. > It is also slow on > ticket creation time (I can create a ticket for a given item in maybe > 30 seconds on one page in trac, whereas launchpad has so much > complexity it is significantly more time consuming, requires a > multipage creation step, etc). I can't second that either. In launchpad, you write a subject and on submit it will tell you (a goodie I really like) if it found any possible matches amongst the existing tickets. You can then look at them or decide to write a new ticket. I found that totally straight forward (even the first time I used it) and very helpful for unexperienced users. Stefan From stefan_ml at behnel.de Sun May 4 17:43:20 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 04 May 2008 17:43:20 +0200 Subject: [Cython] code review In-Reply-To: <200805041916.49167.kirr@mns.spb.ru> References: <481DB2FE.70309@behnel.de> <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> <200805041916.49167.kirr@mns.spb.ru> Message-ID: <481DD998.8000204@behnel.de> Hi, Kirill Smelkov wrote: > I'd like to clarify what I mean saying "good email integration": > > Good email integration is two-way > > That is: > > -> it is possible to affect state of the issues, add comments to patch review, > etc... by sending mail. > <- you get notification mails, when someone changes something through web > interface, or another way. Launchpad also has that. You can just reply to the bug notification email and it will show up in the ticket comments. 
Stefan From mabshoff at googlemail.com Sun May 4 17:06:10 2008 From: mabshoff at googlemail.com (Michael Abshoff) Date: Sun, 04 May 2008 17:06:10 +0200 Subject: [Cython] code review In-Reply-To: <200805041916.49167.kirr@mns.spb.ru> References: <481DB2FE.70309@behnel.de> <8f8f8530805040746q3da11076l44c0709531f3a2bb@mail.gmail.com> <200805041916.49167.kirr@mns.spb.ru> Message-ID: <481DD0E2.9090207@googlemail.com> Kirill Smelkov wrote: > ? ????????? ?? ??????????? 04 ??? 2008 Gary Furnish ???????(a): > >> The canonical example of a project that uses trac is Sage: >> http://trac.sagemath.org/sage_trac >> It has good email integration (notify on ticket modification, etc, >> although it is supposed to be even better in the next version). >> > > I'd like to clarify what I mean saying "good email integration": > > Good email integration is two-way > > That is: > > -> it is possible to affect state of the issues, add comments to patch review, > etc... by sending mail. > Nope. > <- you get notification mails, when someone changes something through web > interface, or another way. > > Yes. > Personally, I think having the first entry is important - a lot of tasks could > be done via plain emails, and at least some people are more productive with > keyboard & text editor (compared to clicking with mouse) :) > > Well, some people like the Debian bugtracker which does let you submit item email, some people don't. Trac does not offer that in the default config and I don't think it is a good idea to do that. Trac offers some subset of wiki tags which make it preferable to use only the webinterface. The Debian bug tracker reads like an email list, so if you want that there is no advantage to switching to Trac. > Does Trac have two-way email integration? > Nope. It might be something available at trac-hacks, but last time I checked I didn't see anything. >> Normally we attach patches to a ticket and then just comment them in >> the ticket associated with the patch. 
This associates the bug/feature >> ticket with the patch (as opposed to needing two systems if you went >> with something like codeview + launchpad). Compared to trac, >> launchpad is slow for webpage response time. It is also slow on >> ticket creation time (I can create a ticket for a given item in maybe >> 30 seconds on one page in trac, whereas launchpad has so much >> complexity it is significantly more time consuming, requires a >> multipage creation step, etc). >> > > Gary, All, imagine you could create a patch issue with plain 'hg email', > or create new issue with just sending mail to special address. > > Isn't this cool!? > No. I don't really see the benefit there by just firing off an email. Our work flows are probably very different and I am no mouse pusher, but I prefer Trac and its workflow. I was skeptical initially, but since I spend a lot of time daily with Trac putting releases together for Sage I am quite sold on its work flow. I generally dislie the "submit bug report by email" work flow. We have sage-devel for the discussion bit and if we really have an issue we open a ticket. We run 0.10.x, which is the current stable release. It has a couple issues, but those are mostly getting sorted out in 0.11. The main point of running your own trac install is that you own your data and do not depend on some other org to do things for you. Robert Bradshaw wrote an hg inspection module to let you check out patches and bundles online. There is also svn integration per default and some optional code to watch commits in an hg repo. > Roundup has this now and it works. Also, although Roundup is not so shiny, > it was choosen as the tracker for Python itself: > > Sure, but you need Django and a couple other things. Trac is self contained and has next to no dependencies, i.e. the default db is sqlite. It scales well for Sage, so I would assume it will work well for Cython which deals with a lot fewer patches on average compared to Sage. 
> http://bugs.python.org/ > > What do you think? > :) Cheers, Michael > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From kirr at mns.spb.ru Sun May 4 18:10:43 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 20:10:43 +0400 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <481DD7F3.6040401@behnel.de> References: <200805041824.45735.kirr@mns.spb.ru> <481DD7F3.6040401@behnel.de> Message-ID: <200805042010.43445.kirr@mns.spb.ru> On Sunday 04 May 2008, Stefan Behnel wrote: > Hi, > > Kirill Smelkov wrote: > > Hi Stefan, > > > > On Sunday 04 May 2008, Stefan Behnel wrote: > >> Hi, > >> > >> Kirill Smelkov wrote: > >>> Here is a couple of handy patches for the case when cython is used "in-tree", > >>> i.e. when no installation is done. > >>> > >>> - First we need to 'chmod a+x' relevant files in bin/ , > >>> - and second, it would be convenient to finally create reasonable .hgignore > >> Thanks, your patches are appreciated. > > > > I look here > > > > http://hg.cython.org/cython-devel/ > > > > and still can't see they were applied. > > See? That's one of the advantages of a bundle. It allows you to send a > complete set of changes against the current trunk, so that things don't get lost. No :) I see bundles vs. text patches as a separate matter: you import a bundle, whereas you apply patches. Actually, applying patches is more convenient for me, since some of the patches may be good and some not. That's why I'd like to apply only some of them, whereas with a bundle you apply them all. Just look at how things are done in Linux, Mercurial, etc.: they send plain-text patches. > Anything else missing? Everything is in, thanks! I'm (mildly) curious why you changed the patch description in the 'chmod a+x bin/*' patch. Kirill.
From kirr at mns.spb.ru Sun May 4 19:08:24 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Sun, 4 May 2008 21:08:24 +0400 Subject: [Cython] code review In-Reply-To: <481DD0E2.9090207@googlemail.com> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> Message-ID: <200805042108.24619.kirr@mns.spb.ru> Hi, All ? ????????? ?? ??????????? 04 ??? 2008 Gary Furnish ???????(a): > Trac doesn't have two way notification emails, only outgoing (as far > as I know). Roundup looks neat (especially the two way email > integration), but I'm not happy about the lack of a good interface for > viewing patches compared to trac's (Example: > http://trac.sagemath.org/sage_trac/attachment/ticket/3025/9609.patch > ), as roundup just displays the raw text. Gary, I see your points. Having patch viewer is indeed useful, like having two-way emails are useful too. ? ????????? ?? ??????????? 04 ??? 2008 Stefan Behnel ???????(a): > Kirill Smelkov wrote: > > I'd like to clarify what I mean saying "good email integration": > > > > Good email integration is two-way > > > > That is: > > > > -> it is possible to affect state of the issues, add comments to patch review, > > etc... by sending mail. > > <- you get notification mails, when someone changes something through web > > interface, or another way. > > Launchpad also has that. You can just reply to the bug notification email and > it will show up in the ticket comments. Then this is a plus for Launchpad. From the other way is it possible to export all stored data from launchpad into some interoperable format, e.g. csv? ? ????????? ?? ??????????? 04 ??? 2008 Michael Abshoff ???????(a): > Kirill Smelkov wrote: > > ? ????????? ?? ??????????? 04 ??? 
2008 Gary Furnish ???????(a): > > > >> The canonical example of a project that uses trac is Sage: > >> http://trac.sagemath.org/sage_trac > >> It has good email integration (notify on ticket modification, etc, > >> although it is supposed to be even better in the next version). > >> > > > > I'd like to clarify what I mean saying "good email integration": > > > > Good email integration is two-way > > > > That is: > > > > -> it is possible to affect state of the issues, add comments to patch review, > > etc... by sending mail. > > > > Nope. What do you mean here? You don't agree with my understanding of email integration? Could you please clarify? > > <- you get notification mails, when someone changes something through web > > interface, or another way. > > > > > Yes. > > > Personally, I think having the first entry is important - a lot of tasks could > > be done via plain emails, and at least some people are more productive with > > keyboard & text editor (compared to clicking with mouse) :) > > > > > Well, some people like the Debian bugtracker which does let you submit > item email, some people don't. Trac does not offer that in the default > config and I don't think it is a good idea to do that. Trac offers some > subset of wiki tags which make it preferable to use only the > webinterface. The Debian bug tracker reads like an email list, so if you > want that there is no advantage to switching to Trac. I don't want to be forced to only use either one of web or email. I *do* want to use web interface often, but sometimes it is more convenient to reply via email. And when it comes to patch review, I think it is imho handy to use full-featured editor, when commenting on a patch. Maybe at least for me... > > Does Trac have two-way email integration? > > > Nope. It might be something available at trac-hacks, but last time I > checked I didn't see anything. 
> > >> Normally we attach patches to a ticket and then just comment them in > >> the ticket associated with the patch. This associates the bug/feature > >> ticket with the patch (as opposed to needing two systems if you went > >> with something like codeview + launchpad). Compared to trac, > >> launchpad is slow for webpage response time. It is also slow on > >> ticket creation time (I can create a ticket for a given item in maybe > >> 30 seconds on one page in trac, whereas launchpad has so much > >> complexity it is significantly more time consuming, requires a > >> multipage creation step, etc). > >> > > > > Gary, All, imagine you could create a patch issue with plain 'hg email', > > or create new issue with just sending mail to special address. > > > > Isn't this cool!? > > > > No. I don't really see the benefit there by just firing off an email. > Our work flows are probably very different and I am no mouse pusher, but > I prefer Trac and its workflow. I was skeptical initially, but since I > spend a lot of time daily with Trac putting releases together for Sage I > am quite sold on its work flow. I generally dislie the "submit bug > report by email" work flow. We have sage-devel for the discussion bit > and if we really have an issue we open a ticket. Ok, I agree emails on usual issues maybe not so handy, and creating issue via email may have its own drawbacks. But my initial context was about patch reviews, and then I think it is handy to do it in fully-featured text editor. e.g. 
OP sends a mid-size patch, and reviewer comments, e.g.: http://groups.google.com/group/sympy-patches/msg/5a23e70f64b1b776 http://groups.google.com/group/sympy-patches/msg/aa5a9ce17b9165bc http://groups.google.com/group/sympy-patches/msg/0b5107e7885f6dcb So maybe mailing bug reports is not a good idea, but submitting patches via 'hg email' seems (at least to me) to be good, because sending them is one button, and importing them is one button, and reviewing them is one button + text edits + "y" in the end. I agree this is not ideal, and could be improved, e.g. thus made text edits could go into patch issue, and when viewing the diff there would be added comments from various people. But you know, nothing is ideal, and the scheme I'm talking about works ok in a lot of places. > We run 0.10.x, which is the current stable release. It has a couple > issues, but those are mostly getting sorted out in 0.11. The main point > of running your own trac install is that you own your data and do not > depend on some other org to do things for you. +1 about managing your own data by yourself. > Robert Bradshaw wrote an > hg inspection module to let you check out patches and bundles online. Do you mean importing patches from trac with 'hg somecommand'? This would be handy. And also, if there were another 'hg somecommand2' moddeled after 'hg email' to send patches to trac, it would be handy^2. Btw, how do I check myself how it works? > There is also svn integration per default and some optional code to > watch commits in an hg repo. > > Roundup has this now and it works. Also, although Roundup is not so shiny, > > it was choosen as the tracker for Python itself: > > > > > Sure, but you need Django and a couple other things. Trac is self No, Django is not needed to run roundup. > contained and has next to no dependencies, i.e. the default db is > sqlite. 
It scales well for Sage, so I would assume it will work well for > Cython which deals with a lot fewer patches on average compared to Sage. Actually there are very little dependencies for roundup - it can even use plain files as DB (not to depend on anything), or optionally sqlite and other dbs. I'd say roundup depends only on Python and optionally on sqlite, pytz. And also you need to setup exim so that incoming emails work. > > > http://bugs.python.org/ > > > > What do you think? > > > :) :) Guys, please don't get me wrong. I'm not saying "let's use roundup, period." For amount of patches I observe I think plain cython-patches would do, and then, if needed, we could use any tool which would do the job. Only myself I observed that doing things in text editor is handy, thus I value tools that provide this ability. I really hope Guido's codereview would be good though and helpful and well designed. Let's settle on something practical, Kirill. From pete at petertodd.org Mon May 5 05:37:49 2008 From: pete at petertodd.org (Peter Todd) Date: Sun, 4 May 2008 23:37:49 -0400 Subject: [Cython] __getattribute__ In-Reply-To: <20080504051532.GA7720@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080504051532.GA7720@tilt> Message-ID: <20080505033749.GF4087@tilt> On Sun, May 04, 2008 at 01:15:32AM -0400, Peter Todd wrote: > Here's my first patch. This correctly implements __getattribute__ and > __getattr__ in the single class case. FWIW I also have a mercurial tree > if it'd be better to pull from it than apply patches. > > I'm working on making subclasses behave correctly, I've got test cases > written up showing where things fail, but no solutions to that written > yet. The slot_tp_getattro stuff Stefan mentioned is useful though. __getattr(ibute)__ support is now working with subclasses. The semantics should match Python exactly if my test cases are correct. 
All that has changed from the previous patch is that at compile time base classes are checked for __getattr(ibute)__ methods and those methods are used if found. Attached is an hg bundle of the two commits. -- http://petertodd.org 'peter'[:-1]@petertodd.org -------------- next part -------------- A non-text attachment was scrubbed... Name: getattr_support.hg Type: application/octet-stream Size: 2517 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080504/e2e831c7/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080504/e2e831c7/attachment.pgp From robertwb at math.washington.edu Mon May 5 20:06:16 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 5 May 2008 11:06:16 -0700 Subject: [Cython] code review In-Reply-To: <200805042108.24619.kirr@mns.spb.ru> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> <200805042108.24619.kirr@mns.spb.ru> Message-ID: <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> > Guys, please don't get me wrong. > > I'm not saying "let's use roundup, period." For amount of patches I > observe I > think plain cython-patches would do, and then, if needed, we could > use any > tool which would do the job. > > Only myself I observed that doing things in text editor is handy, > thus I > value tools that provide this ability. > > I really hope Guido's codereview would be good though and helpful > and well > designed. > > Let's settle on something practical, I agree that we need something practical. I have also been frustrated with the speed (both page load time and having to navigate multiple pages to do simple tasks) with launchpad, which is why I don't use it as much as I probably should. I would also much prefer to self-host our own bug tracker. 
I'd also vote for trac, but I guess that's no surprise because I'm also part of the Sage team (as are a lot of other potential Cython developers). The workflow of using trac to manage bugs and feature requests works well. Also, cython.org is hosted from the same server as sagemath, so we know that trac will always run well on it as it is being used for the Sage project. I wrote a simple bundle inspector for trac, it has good support for outgoing email notifications, and when the UW math department started using Trac for their help system I wrote a simple plugin that lets it accept incoming tickets as well if people want this. (One big problem with this is that one then has to deal with the issue of spam, so I would suggest only ticket modification, not creation, be allowed via email if we decide to do this.) If people like this suggestion I can go ahead and set up a trac server. On the questions of patches vs. bundles, for small changes, especially if there aren't interdependencies, I prefer patches. Once you get beyond a couple of changesets bundles become easier to deal with, especially to make sure nothing slips between the cracks (this is less of an issue if we start using a good ticketing system as discussed above). Note that one can use "hg incoming -p" to see the contents of a bundle without applying it. - Robert From robertwb at math.washington.edu Mon May 5 20:30:30 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 5 May 2008 11:30:30 -0700 Subject: [Cython] [Pyrex] newbie list processing question In-Reply-To: <481F4CF3.1020204@cc.gatech.edu> References: <481F4CF3.1020204@cc.gatech.edu> Message-ID: <0F507576-A9F6-43CA-8FF7-B744C1A6796B@math.washington.edu> On May 5, 2008, at 11:07 AM, Daniel Ashbrook wrote: > So I've done a lot of Google searching and haven't found an answer to > this question, or at least one that I understand. > > I'm trying to write some fairly simple list-processing code using > pyrex. 
> As a simple example, let's say I wanted to translate this: > > def addOne(l): > assert(isinstance(l, list)) > for i in xrange(len(l)): > l[i] += 1 > > or even this: > > def addOne(l): > return [i+1 for i in l] > > to fast-running pyrex code. This last implementation of addOne should work as is in Cython, and will be nearly optimal (assuming your CPU has reasonable branch prediction). However, if you are manipulating word-sized integers, using C arrays will give you a manyfold speedup over Python arithmetic. > How can I actually access the list properly? > (I should point out that in my real problem, I actually need random > access to the list.) I've seen some talk of using > PyObject_AsWriteBuffer > to do it, but the post was related to array objects, which I'd rather > not use unless I have to. I've been trying to figure it out using > PyList_GetItem/SetItem and so on, but it seems extremely > convoluted, like: > > PyList_SetItem(l,i,PyInt_FromLong(PyInt_AsLong(PyList_GetItem(l,i))+1)) > > Surely there is another way! (Plus that line doesn't work - trying to > use "for i from" gives me errors about making i an integer from a > pointer without a cast.) Did you do cdef int i for i from 0 <= i < len(L): ... > > Any tips? If the person who maintains documentation for pyrex is > reading, I'd love to see a simple list-processing example included. In Cython, if one writes L[i] where i is a cdef int, then it checks to see at runtime if L is a list and accesses its elements via a macro. Otherwise one can use PyList_SetItem and friends, but as you have noticed that is cumbersome (as well as being hard to read).
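For reference, a minimal plain-Python sketch of the two addOne variants from the question. It is only an illustration, not Pyrex/Cython code: it assumes Python 3 (so range stands in for xrange), and the function names add_one_in_place and add_one_copy are made up for this example.

```python
def add_one_in_place(values):
    """Increment every element of a list in place, using index access."""
    assert isinstance(values, list)
    # Iterate over the indices 0 .. len(values) - 1, not over the list object.
    for i in range(len(values)):
        values[i] += 1


def add_one_copy(values):
    """Return a new list with every element incremented; the input is untouched."""
    return [v + 1 for v in values]


data = [1, 2, 3]
add_one_in_place(data)          # data is modified in place
new_data = add_one_copy(data)   # data itself stays unchanged here
```

The in-place variant is the one that needs random access to the list, which is the case where declaring the index as a C integer is said above to help in Cython.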
- Robert From robertwb at math.washington.edu Mon May 5 21:19:38 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 5 May 2008 12:19:38 -0700 Subject: [Cython] [Pyrex] newbie list processing question In-Reply-To: <481F55E8.20603@cc.gatech.edu> References: <481F4CF3.1020204@cc.gatech.edu> <0F507576-A9F6-43CA-8FF7-B744C1A6796B@math.washington.edu> <481F55E8.20603@cc.gatech.edu> Message-ID: On May 5, 2008, at 11:46 AM, Daniel Ashbrook wrote: > Robert Bradshaw wrote: >>> def addOne(l): >>> return [i+1 for i in l] > >> ... >> This last implementation of addOne should work as is in Cython, >> and will be nearly optimal (assuming your CPU has reasonable >> branch prediction). However, if you are manipulating word-sized >> integers, using C arrays will give you a manyfold speedup over Python >> arithmetic. > > So in the real code, I'm actually doing float math. And I'll be > wanting to return my results in a list object. There will be many > thousands of float results; what's the best way to deal with that? > Use a C-specific data structure to store the results then turn it > into a list somehow? I would do all my computations with C doubles, and convert to a list only at the very end. I would also seriously consider using NumPy arrays http://numpy.scipy.org/ . There is a fair amount of support for NumPy/Cython integration, and it's only going to get better this summer (two SoC projects). >> Did you do >> cdef int i >> for i from 0 <= i < len(L): >> ... > > Ah, I missed the "cdef int i" part of it. Yeah. One thing you can do is use the -a option, which will spit out an "annotated" html file that will catch stuff like this (bright yellow lines mean there's a lot of unnecessary conversion going on, usually indicating that a variable wasn't cdef'd). >> In Cython, if one writes >> L[i] >> where i is a cdef int, then it checks to see at runtime if L is a >> list and accesses its elements via a macro.
Otherwise one can use >> PyList_SetItem and friends, but as you have noticed that is >> cumbersome (as well as being hard to read). > > Oh ho! That works very nicely; I'll include code to help others in > the future: > > cdef int i > for i from 0 <= i < len(l): > l[i] = l[i] + 1 > > > Thanks for the help! No problem. - Robert From stefan_ml at behnel.de Mon May 5 08:16:11 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 05 May 2008 08:16:11 +0200 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <200805042010.43445.kirr@mns.spb.ru> References: <200805041824.45735.kirr@mns.spb.ru> <481DD7F3.6040401@behnel.de> <200805042010.43445.kirr@mns.spb.ru> Message-ID: <481EA62B.6000405@behnel.de> Hi, Kirill Smelkov wrote: > I (minorly) wonder why you changed patch description in 'chmod a+x bin/*' patch Because it's more work for me to save an email to a file and import it as patch, than to apply the trivial change myself and commit it. :) Stefan From stefan_ml at behnel.de Mon May 5 21:35:58 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 05 May 2008 21:35:58 +0200 Subject: [Cython] code review In-Reply-To: <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> <200805042108.24619.kirr@mns.spb.ru> <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> Message-ID: <481F619E.50908@behnel.de> Hi Robert, Robert Bradshaw wrote: > I'd also vote for trac, > [...] If people like this suggestion I can go ahead and set up a > trac server. ok, I didn't hear any major reason not to use trac so far, so if you can handle the setup and hosting, I think trac is the best solution.
Stefan From greg.ewing at canterbury.ac.nz Tue May 6 03:51:16 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 06 May 2008 13:51:16 +1200 Subject: [Cython] __getattribute__ In-Reply-To: <20080505033749.GF4087@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080504051532.GA7720@tilt> <20080505033749.GF4087@tilt> Message-ID: <481FB994.9030102@canterbury.ac.nz> Peter Todd wrote: > __getattr(ibute)__ support is now working with subclasses. > > Attached is an hg bundle of the two commits. Can someone send me these as plain text? I may want to incorporate them into Pyrex as well. Thanks, Greg From robertwb at math.washington.edu Tue May 6 07:30:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 5 May 2008 22:30:12 -0700 Subject: [Cython] __getattribute__ In-Reply-To: <481FB994.9030102@canterbury.ac.nz> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080504051532.GA7720@tilt> <20080505033749.GF4087@tilt> <481FB994.9030102@canterbury.ac.nz> Message-ID: On May 5, 2008, at 6:51 PM, Greg Ewing wrote: > Peter Todd wrote: > >> __getattr(ibute)__ support is now working with subclasses. >> >> Attached is an hg bundle of the two commits. > > Can someone send me these as plain text? I may want to > incorporate them into Pyrex as well. Sure. Here you go. There are two patches in this file. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: getattr_support.patch Type: application/octet-stream Size: 10506 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080505/fa7b3961/attachment.obj From robertwb at math.washington.edu Tue May 6 08:59:23 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 5 May 2008 23:59:23 -0700 Subject: [Cython] code review In-Reply-To: <481F619E.50908@behnel.de> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> <200805042108.24619.kirr@mns.spb.ru> <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> <481F619E.50908@behnel.de> Message-ID: <832C57CB-F230-4A14-B520-150FBDD7B5A9@math.washington.edu> On May 5, 2008, at 12:35 PM, Stefan Behnel wrote: > Hi Robert, > > Robert Bradshaw wrote: >> I'd also vote for trac, >> [...] If people like this suggestion I can go ahead and set up a >> trac server. > > ok, I didn't hear any major reason not to use trac so far, so if > you can > handle the setup and hosting, I think trac is the best solution. Up at http://trac.cython.org/cython_trac/ From kirr at mns.spb.ru Tue May 6 09:02:50 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Tue, 6 May 2008 11:02:50 +0400 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <481EA62B.6000405@behnel.de> References: <200805042010.43445.kirr@mns.spb.ru> <481EA62B.6000405@behnel.de> Message-ID: <200805061102.50657.kirr@mns.spb.ru> Hi, On Monday 05 May 2008 Stefan Behnel wrote: > Hi, > > Kirill Smelkov wrote: > > I (minorly) wonder why you changed patch description in 'chmod a+x bin/*' patch > > Because it's more work for me to save an email to a file and import it as > patch, than to apply the trivial change myself and commit it. :) Stefan, most mail clients can simply be set up to do patch imports with one keypress.
For example in mutt, I use something like this: macro pager A "(umask 022; hg -R ~/src/sympy/sympy import -)" And for git there is git-am to apply a series of patches from a mailbox. So I think we should review our patch applying practices :) Kirill. From ondrej at certik.cz Tue May 6 13:02:41 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Tue, 6 May 2008 13:02:41 +0200 Subject: [Cython] [PATCH 0 of 2] A couple of handy patches when using cython in-tree In-Reply-To: <200805061102.50657.kirr@mns.spb.ru> References: <200805042010.43445.kirr@mns.spb.ru> <481EA62B.6000405@behnel.de> <200805061102.50657.kirr@mns.spb.ru> Message-ID: <85b5c3130805060402i50449ef5nc6c1d11aac54f173@mail.gmail.com> On Tue, May 6, 2008 at 9:02 AM, Kirill Smelkov wrote: > Hi, > > On Monday 05 May 2008 Stefan Behnel wrote: >> Hi, >> >> Kirill Smelkov wrote: >> > I (minorly) wonder why you changed patch description in 'chmod a+x bin/*' patch >> >> Because it's more work for me to save an email to a file and import it as >> patch, than to apply the trivial change myself and commit it. :) > > Stefan, most mail clients can simply be set up to do patch imports with one keypress. > > For example in mutt, I use something like this: > > macro pager A "(umask 022; hg -R ~/src/sympy/sympy import -)" > > And for git there is git-am to apply a series of patches from a mailbox. > > So I think we should review our patch applying practices :) BTW, I need to try this as well and write it up in our tutorial at docs.sympy.org. So far I also use the "manual" method and so it sucks, exactly as Stefan said.
:) Ondrej From stefan_ml at behnel.de Tue May 6 19:09:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 6 May 2008 19:09:29 +0200 (CEST) Subject: [Cython] code review In-Reply-To: <832C57CB-F230-4A14-B520-150FBDD7B5A9@math.washington.edu> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> <200805042108.24619.kirr@mns.spb.ru> <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> <481F619E.50908@behnel.de> <832C57CB-F230-4A14-B520-150FBDD7B5A9@math.washington.edu> Message-ID: <43251.194.114.62.38.1210093769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Bradshaw wrote: > Up at http://trac.cython.org/cython_trac/ Mighty cool! :) Let's see how people get along with it. Stefan From ellisonbg.net at gmail.com Tue May 6 19:19:27 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Tue, 6 May 2008 11:19:27 -0600 Subject: [Cython] non cdef'd class in a pyx file Message-ID: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> Hi, I am wondering if there are any issues with declaring a regular python class in a pyx file. Specifically, we need to define an exception like this: class FooError(Exception): ... I have done this before and it seems to work, but are there any subtleties or things to be aware of? Thanks Brian From robertwb at math.washington.edu Tue May 6 19:25:06 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 6 May 2008 10:25:06 -0700 Subject: [Cython] non cdef'd class in a pyx file In-Reply-To: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> References: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> Message-ID: This should work just fine. On May 6, 2008, at 10:19 AM, Brian Granger wrote: > Hi, > > I am wondering if there are any issues with declaring a regular python > class in a pyx file. Specifically, we need to define an exception > like this: > > class FooError(Exception): > ... 
> > I have done this before and it seems to work, but are there any > subtleties or things to be aware of? > > Thanks > > Brian > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From ellisonbg.net at gmail.com Tue May 6 19:47:48 2008 From: ellisonbg.net at gmail.com (Brian Granger) Date: Tue, 6 May 2008 11:47:48 -0600 Subject: [Cython] non cdef'd class in a pyx file In-Reply-To: References: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> Message-ID: <6ce0ac130805061047n136bd362ya93b3342923cac6c@mail.gmail.com> Thanks! On Tue, May 6, 2008 at 11:25 AM, Robert Bradshaw wrote: > This should work just fine. > > > > On May 6, 2008, at 10:19 AM, Brian Granger wrote: > > Hi, > > > > I am wondering if there are any issues with declaring a regular python > > class in a pyx file. Specifically, we need to define an exception > > like this: > > > > class FooError(Exception): > > ... > > > > I have done this before and it seems to work, but are there any > > subtleties or things to be aware of? 
> > > > Thanks > > > > Brian > > _______________________________________________ > > Cython-dev mailing list > > Cython-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From mistobaan at gmail.com Tue May 6 19:56:16 2008 From: mistobaan at gmail.com (Fabrizio Milo aka misto) Date: Tue, 6 May 2008 10:56:16 -0700 Subject: [Cython] code review In-Reply-To: <43251.194.114.62.38.1210093769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <200805041916.49167.kirr@mns.spb.ru> <481DD0E2.9090207@googlemail.com> <200805042108.24619.kirr@mns.spb.ru> <52199062-9690-4C1F-8E5A-40195520E386@math.washington.edu> <481F619E.50908@behnel.de> <832C57CB-F230-4A14-B520-150FBDD7B5A9@math.washington.edu> <43251.194.114.62.38.1210093769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: > > Up at http://trac.cython.org/cython_trac/ Really Good. Fabrizio From robertwb at math.washington.edu Wed May 7 00:21:25 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 6 May 2008 15:21:25 -0700 Subject: [Cython] Visitor patterns and Python 2.4 In-Reply-To: <481BFE9A.4080600@behnel.de> References: <481AFD4D.4050401@student.matnat.uio.no> <481BFE9A.4080600@behnel.de> Message-ID: On May 2, 2008, at 10:56 PM, Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> Are there any real reasons for leaving the Cython compiler (not >> talking >> about generated or supported code of course) at Python 2.3, rather >> than >> a small bump to 2.4? Reason: I'd like decorators. 
>> >> The rationale: Notice that parse tree visitors can currently be >> written >> like this: >> >> class AnalyseControlFlow(VisitorTransform): >> def pre_FuncDefNode(self, node): >> node.body.analyse_control_flow(node.scope) >> return False # do not recurse beyond first function level > > If this way of doing it is accepted, I'm actually for accepting 2.4 > code. I > don't know any important platform that doesn't come with at least > Py2.4. And > note that most people won't even need to run Cython themselves, > even if they > use software implemented in Cython. We ship our own Python (2.5.2) with Sage, but I would rather stick with 2.3 compatibility. You might not consider it "important" but my computer (OS X 10.4, yet I know 10.5 is out but I haven't found the time to upgrade yet) has 2.3 by default, as did OS X 10.3. I very much dislike using __class__.__name__ to decide what functions to call, but it doesn't look like you need decorators to get around this, do you? - Robert From greg.ewing at canterbury.ac.nz Wed May 7 01:42:52 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 07 May 2008 11:42:52 +1200 Subject: [Cython] non cdef'd class in a pyx file In-Reply-To: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> References: <6ce0ac130805061019s179c398ocdd4c47fdc22d40c@mail.gmail.com> Message-ID: <4820ECFC.7010006@canterbury.ac.nz> Brian Granger wrote: > I am wondering if there are any issues with declaring a regular python > class in a pyx file. No, that should work pretty much the same as it does in Python. There are a couple of very minor differences -- it creates the class before executing the class body rather than after, and it wraps methods in unbound method objects before putting them in the class (because they're C-implemented functions rather than Python ones, so the usual descriptor magic doesn't get done). But you probably won't notice any of this except in very rare circumstances. 
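A minimal sketch of the case discussed above, not taken from anyone's actual code: a regular (non-cdef) exception class and a function that raises it, as they might sit in a .pyx file. `FooError` is the name from the question; `parse_count` is a hypothetical helper added purely for illustration. The snippet is plain Python, so it can be tried in the interpreter before compiling it.

```python
# Could live in a .pyx module next to cdef classes; nothing here is
# Cython-specific.

class FooError(Exception):
    """Regular (non-cdef) exception class, as in the question."""


def parse_count(text):
    # Hypothetical helper: convert text to a non-negative int,
    # raising FooError on bad input.
    try:
        value = int(text)
    except ValueError:
        raise FooError("not an integer: %r" % (text,))
    if value < 0:
        raise FooError("count must be non-negative, got %d" % value)
    return value
```

Callers can catch `FooError` like any other Python exception, whether the module is compiled or imported as plain Python.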
-- Greg From robertwb at math.washington.edu Wed May 7 03:03:03 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 6 May 2008 18:03:03 -0700 Subject: [Cython] Some small phase refactorings In-Reply-To: <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> Message-ID: <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> On May 3, 2008, at 1:27 AM, Dag Sverre Seljebotn wrote: >> >> + import Cython.Compiler.Transforms.Analysis as Analysis >> + Analysis.CreateFunctionScope()(self, env=env) >> + Analysis.AnalyseControlFlow()(self, env=env) >> + Analysis.AnalyseFunctionBodyDeclarations()(self, env=env) >> + Analysis.AnalyseFunctionBodyExpressions()(self, env=env) >> + options.transforms.run('after_function_analysis', self, >> global_env=env) >> + >> >> ### These look like functions, they should follow PEP 8 naming. Then >> again, >> ### why aren't they functions? > > Note that this code will almost certainly be moved again and > rewritten at > some later point (they can't really belong to ModuleNode); but more > refactoring must happen before they can be moved to their proper > location > and they serve a good purpose where they are for now though. > > They have to be classes as they are transform objects with member > methods, > some state etc. etc. But thinking about it one could have functions > like > this in Analysis.py: > > def analyse_function_body_declarations(tree, **opts): > return AnalyseFunctionBodyDeclarations()(tree, **opts) > > if that helps. The reason I started using the __call__ is that I > think in > time one can treat these as functions, like this: > > pipeline = [ > f, > g, > AnalyseFunctionBodyDeclarations(), > Coercions(), > ] > for transform in pipeline: > tree = transform(tree) > > (ie, basically saying that "pipeline = f g h"...) > > >> ### I'm not convinced of this one. 
I understand why you do it, but I >> believe >> ### that using the class itself rather than dispatching on a >> string would >> help >> ### here (I saw your decorator proposal - still thinking about it, >> but >> might >> ### be worth doing). > > If not, then the classical visitor pattern might put you at ease?: > > class FuncDefNode: > def accept(self, visitor): > visitor.visit_FuncDefNode(self) > > However, that's a lot of extra trivial code to add (this would have > to be > added to _all_ classes), and when one is using Python anyway I'd > like to > avoid pretending I'm writing Java... :-) I think one could put a generic accept function in the Node class that would dispatch based on class name (or some class attribute) if one wanted, rather than having to write it for every class. > I hope we can start using Python 2.4, then I'll implement a > decorator/metaclass solution instead. > > > In conclusion, I'd like to mention that I really think the > important thing > here is to consider the "grand, large-scale" features of the patch. I > didn't polish the details in any way, because I think that what is > important here is the changes they make possible in the application > structure. How the visitors look like can be changed entirely in > Visitor.py and Analysis.py without interfering with existing code; > while > the phase refactoring is going to intrude everywhere and make > changes all > over the place, so the form the phase refactoring will take is the > important point. > > (OTOH; I guess it is a good time to think about the details as well so > that when the 1000 line coercion refactoring patch should be > written one > knows what to do in the details...) > > (I've asked myself as to whether it is all worth it BTW. It is a > heavy and > non-fun task. But I'm still convinced there's absolutely no way > around it > if the feature-set of Cython is to grow significantly in any way. 
And > realistically, it can be done in two or three days with for > instance me, > you and Robert working together... So this might be too early to talk > about it; but I end up working on it anyway because it is > effectively a > blocker for me and I cannot get anywhere with my GSoC stuff until > it is > done :-) ) I think this is good to talk about now. Basically, you want to change Cython to use the "visitor pattern" rather than the recursive pattern that it currently uses. This has pros and cons, but I think this could be a good thing, and you are certainly convinced that its needed to get started on the GSoC stuff. Something doesn't sit right about the patch though, but I can't quite put my finger on it. I think that it should happen at a higher level than inside the module node (i.e. ModuleNode should be "visited" rather than initiate the visitors.) As for the "phase shift" between function bodies and the ambient code, the ambient analysis phase needs to happen before entering function bodies, but I think it could be done immediately after at the tail end of the same phase (so all declarations would be analyzed before any times need to be looked at) which may make things cleaner. Probably http://wiki.cython.org/enhancements/treevisitors should be fleshed out with more specific details and a plan. - Robert From mistobaan at gmail.com Wed May 7 04:49:10 2008 From: mistobaan at gmail.com (Fabrizio Milo aka misto) Date: Tue, 6 May 2008 19:49:10 -0700 Subject: [Cython] Some small phase refactorings In-Reply-To: <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> Message-ID: > (i.e. ModuleNode should be "visited" rather than initiate the > visitors.) Absolutely. Fabrizio ------------- Luck favors the prepared mind. 
(Pasteur) From dalcinl at gmail.com Wed May 7 06:16:03 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 7 May 2008 01:16:03 -0300 Subject: [Cython] WARNING!! cython, classmethod's and Python 2.6 Message-ID: Python 2.6 introduces a new feature, a cache for methods in C-level Python types. I believe this will severely interfere with the way Cython implements 'classmethod' hackery. I was getting strange segfaults or very strange behavior (when accessing a classmethod, I actually got another method) in my new mpi4py. Then I started to look at the whole beast, and I believe I've found the fix. But I do not know how we could modify Cython to implement this, so I'll show an example of what needs to be done. Look at the 'else' at the very end of the pasted generated code; I added it manually!!!
  /* "/u/dalcinl/Devel/Cython/tmp/cls.pyx":3
   * cdef class Foo:
   *     def bar(cls): pass
   *     bar = classmethod(bar) # <<<<<<<<<<<<<<
   *
   */
  __pyx_1 = __Pyx_GetName((PyObject *)__pyx_ptype_3cls_Foo, __pyx_n_bar); if (unlikely(!__pyx_1)) { /* tb stuff */}
  __pyx_2 = __Pyx_Method_ClassMethod(__pyx_1); if (unlikely(!__pyx_2)) { /* tb stuff*/}
  Py_DECREF(__pyx_1); __pyx_1 = 0;
  if (PyDict_SetItem((PyObject *)__pyx_ptype_3cls_Foo->tp_dict, __pyx_n_bar, __pyx_2) < 0) { /* tb stuff*/}
  /* AND NOW THE FIX !!!!! */
  else { (&__pyx_type_3cls_Foo)->tp_flags &= ~Py_TPFLAGS_VALID_VERSION_TAG; }
This way, in the next _PyType_Lookup, Python will ignore the method cache and do a normal lookup and then update the cache. This seems to solve all the nasty problems I was experiencing. Of course, I'm not completely sure if my fix is the best one. You will surely need to ask at Python-Dev in the hope that the guy who implemented this very, very clever hackery on type objects can give some directions for Cython. Finally, there is a public function PyType_ClearCache(), but this function invalidates all the caches starting from the base 'PyBaseObject_Type' type object and recursing on all subclasses.
Perhaps this would be the right way to go, but this would mean that every time you import a cython extension module, you have a full traversal of type objects clearing caches, and all this because of a 'classmethod' in a cdef class... Comments? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Wed May 7 08:30:47 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 08:30:47 +0200 Subject: [Cython] Visitor patterns and Python 2.4 In-Reply-To: References: <481AFD4D.4050401@student.matnat.uio.no> <481BFE9A.4080600@behnel.de> Message-ID: <48214C97.2030302@student.matnat.uio.no> > We ship our own Python (2.5.2) with Sage, but I would rather stick > with 2.3 compatibility. You might not consider it "important" but my > computer (OS X 10.4, yet I know 10.5 is out but I haven't found the > time to upgrade yet) has 2.3 by default, as did OS X 10.3. > It is certainly an argument I didn't consider. Still... Python 2.4 was released 3 1/2 years ago, and I think Cython users are going to be power users who will know how to upgrade their Python. > I very much dislike using __class__.__name__ to decide what functions > to call, but it doesn't look like you need decorators to get around > this, do you? > Not having decorators makes the overhead of not using the function name much greater. I.e. compare: class MyVisitor(VisitorTransform): @pre(cls=FuncDefNode) def create_function_scope(self, node): ... (or similar) with something like (changing the match function slightly) class MyVisitor(VisitorTransform): def create_function_scope(self, node): ...
create_function_scope = pre(create_function_scope, cls=FuncDefNode) The latter one is so heavy to write that I'd rather simply do class MyVisitor(VisitorTransform): def pre_FuncDefNode(self, node): .... however I don't think this is as nice. Yes, I know, it is the standard visitor pattern, but I think that pattern is the way it is because it is made for less powerful languages than Python; using a real class reference seems better than relying on __name__, on a naming convention, or on some other extra piece of information to care about. Also, using decorators opens the way for other types of matches (though I won't propose a full XPath match again, perhaps simple "attr" matching, i.e. which attribute of which parent the node belongs to): @pre(parent_cls=FuncDefNode, attr="body") ... -- Dag Sverre From wstein at gmail.com Wed May 7 08:42:23 2008 From: wstein at gmail.com (William Stein) Date: Tue, 6 May 2008 23:42:23 -0700 Subject: [Cython] Visitor patterns and Python 2.4 In-Reply-To: <48214C97.2030302@student.matnat.uio.no> References: <481AFD4D.4050401@student.matnat.uio.no> <481BFE9A.4080600@behnel.de> <48214C97.2030302@student.matnat.uio.no> Message-ID: <85e81ba30805062342g9a125d4m2b48a0d4d987f728@mail.gmail.com> On Tue, May 6, 2008 at 11:30 PM, Dag Sverre Seljebotn wrote: > > > We ship our own Python (2.5.2) with Sage, but I would rather stick > > with 2.3 compatibility. You might not consider it "important" but my > > computer (OS X 10.4, yet I know 10.5 is out but I haven't found the > > time to upgrade yet) has 2.3 by default, as did OS X 10.3. > > > It is certainly an argument I didn't consider. > > Still... Python 2.4 was released 3 1/2 years ago, and I think Cython > users are going to be power users who will know how to upgrade their Python. > > > > I very much dislike using __class__.__name__ to decide what functions > > to call, but it doesn't look like you need decorators to get around > > this, do you?
> > > Not having decorators makes the overhead of not using the function name > much greater. I.e. compare: > > class MyVisitor(VisitorTransform): > @pre(cls=FuncDefNode) > def create_function_scope(self, node): > ... > > (or similar) with something like (changing the match function slightly) > > class MyVisitor(VisitorTransform): > def create_function_scope(self, node): > ... > create_function_scope = pre(create_function_scope, cls=FuncDefNode) > > > The latter one is so heavy to write that I'd rather simply do > > class MyVisitor(VisitorTransform): > > def pre_FuncDefNode(self, node): > .... > > however I don't think this is as nice. Yes, I know, it is the standard > visitor pattern, but I think that pattern is the way it is because it is > made for less powerful languages than Python; using a real class > reference seems better than relying on __name__, on a naming > convention, or on some other extra piece of information to care about. > > Also, using decorators opens the way for other types of matches (though I > won't propose a full XPath match again, perhaps simple "attr" matching, > i.e. which attribute of which parent the node belongs to): > > @pre(parent_cls=FuncDefNode, attr="body") > ... > Two comments: (1) I personally think Python 2.3 is pretty old at this point, and we shouldn't worry too much about supporting it. I.e., I buy your "Cython users are power users" argument. (2) If you use decorators then you'll have to implement them in Cython so that compiling Cython itself using Cython will work. That seems like good motivation.
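For readers following the decorator proposal, here is a rough, self-contained sketch of how `@pre(cls=...)` dispatch might work. Everything in it is an assumption made for illustration: `Node`, `FuncDefNode`, `StatNode`, `pre`, `VisitorTransform` and `CollectFunctions` are small stand-ins written for this example and are not the Cython compiler's actual classes or the code from the patch under discussion.

```python
# Hypothetical stand-ins for compiler tree nodes.
class Node(object):
    child_attrs = []


class StatNode(Node):
    pass


class FuncDefNode(Node):
    child_attrs = ["body"]

    def __init__(self, body):
        self.body = body


def pre(cls):
    """Mark a method as the pre-visit handler for nodes of type `cls`."""
    def mark(func):
        func._pre_cls = cls
        return func
    return mark


class VisitorTransform(object):
    def __init__(self):
        # Collect the methods marked with @pre, keyed by node class.
        self._handlers = {}
        for name in dir(self):
            method = getattr(self, name)
            node_cls = getattr(method, "_pre_cls", None)
            if node_cls is not None:
                self._handlers[node_cls] = method

    def __call__(self, node):
        # Dispatch on the real class (walking the MRO), not on its name.
        handler = None
        for klass in type(node).__mro__:
            if klass in self._handlers:
                handler = self._handlers[klass]
                break
        recurse = True
        if handler is not None:
            # A handler returning False stops recursion below this node.
            recurse = handler(node) is not False
        if recurse:
            for attr in node.child_attrs:
                child = getattr(node, attr)
                if child is not None:
                    self(child)
        return node


class CollectFunctions(VisitorTransform):
    def __init__(self):
        VisitorTransform.__init__(self)
        self.seen = []

    @pre(cls=FuncDefNode)
    def create_function_scope(self, node):
        self.seen.append(node)
        return False  # do not recurse beyond the first function level
```

Without decorator syntax (Python 2.3), the same registration can be spelled `create_function_scope = pre(cls=FuncDefNode)(create_function_scope)` after the method definition, which is the heavier form the thread is weighing against name-based `pre_FuncDefNode` methods.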
-- William From dagss at student.matnat.uio.no Wed May 7 08:46:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 08:46:25 +0200 Subject: [Cython] Some small phase refactorings In-Reply-To: <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> Message-ID: <48215041.7070805@student.matnat.uio.no> > I think one could put a generic accept function in the Node class > that would dispatch based on class name (or some class attribute) if > one wanted, rather than having to write it for every class. > Yes, simply move the function. You spoke against using the class name in the other thread though. I suppose I could use an attribute; still I like decorators much better. > I think this is good to talk about now. Basically, you want to change > Cython to use the "visitor pattern" rather than the recursive pattern > that it currently uses. This has pros and cons, but I think this > could be a good thing, and you are certainly convinced that its > needed to get started on the GSoC stuff. Something doesn't sit right > about the patch though, but I can't quite put my finger on it. I > think that it should happen at a higher level than inside the module > node (i.e. ModuleNode should be "visited" rather than initiate the > visitors.) As for the "phase shift" between function bodies and the > But of course! I'll repeat what I said to Stefan (the context is the codelines in ModuleNode.py): --- Note that this code will almost certainly be moved again and rewritten at some later point (they can't really belong to ModuleNode); but more refactoring must happen before they can be moved to their proper location and they serve a good purpose where they are for now though. --- It's a rather big knot, and one has to begin to untangle it somewhere. 
I chose the middle of ModuleNode. One could make a (perhaps more natural) decision to start untangling it at the top (i.e. in Main.py); however two things spoke against it:
- I wanted to jump right in and get a feel for the most difficult phase separation we were going to transform. If I started at the top, I would have to refactor ModuleNode before going on, and that would take some time yet still be a (comparatively) easy task -- and I wouldn't get time there and then to take on the challenging refactoring of the main tree and see what the real issues were. So I wouldn't gain any real new knowledge from starting "at the right place" (which was needed at this stage, in order to start thinking about this refactoring).
- It will also take somewhat longer to see the immediate gains one gets from this when starting from the top, but that is fine...
Of course, different needs drive Cython patch inclusion than what I need to investigate; and I'm fine with the patch not being included at this stage. -- Dag Sverre From dagss at student.matnat.uio.no Wed May 7 08:48:38 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 08:48:38 +0200 Subject: [Cython] Some small phase refactorings In-Reply-To: <48215041.7070805@student.matnat.uio.no> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> Message-ID: <482150C6.9070704@student.matnat.uio.no> An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080507/565d0d72/attachment.htm From dagss at student.matnat.uio.no Wed May 7 08:51:59 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 08:51:59 +0200 Subject: [Cython] Visitor patterns and Python 2.4 In-Reply-To: <85e81ba30805062342g9a125d4m2b48a0d4d987f728@mail.gmail.com> References: <481AFD4D.4050401@student.matnat.uio.no> <481BFE9A.4080600@behnel.de> <48214C97.2030302@student.matnat.uio.no> <85e81ba30805062342g9a125d4m2b48a0d4d987f728@mail.gmail.com> Message-ID: <4821518F.6050406@student.matnat.uio.no> > (2) If you use decorators then you'll have to implement them in Cython > so that compiling Cython itself using Cython will work. That seems > like good motivation. > And not too hard either. The funny thing is that I'd really want to have decorators available for writing such support :-) Which brings up the question of a parser...will raise it in a different thread. -- Dag Sverre From dagss at student.matnat.uio.no Wed May 7 09:00:21 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 09:00:21 +0200 Subject: [Cython] PyPy parser Message-ID: <48215385.50207@student.matnat.uio.no> We all wish Fabrizio had got his SoC. The question is though, what do we do now? I was thinking about if it is scaled down a lot and one only considers the PyPy parser and a parser consumer that builds the existing Pyrex tree (drop-in replacement of Scanning.py and Parsing.py). How much work would that be? It seems wasteful to put any more work into the Cython parser when PyPy is already maintaining one, and there needs to be a few changes in the parser this summer (type arguments, probably with statements and decorators) so it seems like a good time. Not volunteering to take it on though (I'll have a GSoC to care for) but I'd like to know if there are any plans in this area (or if we can make one). 
-- Dag Sverre From stefan_ml at behnel.de Wed May 7 09:51:41 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 7 May 2008 09:51:41 +0200 (CEST) Subject: [Cython] PyPy parser In-Reply-To: <48215385.50207@student.matnat.uio.no> References: <48215385.50207@student.matnat.uio.no> Message-ID: <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Dag Sverre Seljebotn wrote: > We all wish Fabrizio had got his SoC. The question is though, what do we > do now? > > I was thinking about if it is scaled down a lot and one only considers > the PyPy parser and a parser consumer that builds the existing Pyrex > tree (drop-in replacement of Scanning.py and Parsing.py). How much work > would that be? Implied question: how much work is it to keep in sync with the PyPy parser development? We'd have to extend the parser to understand C types and 'cdef', so the first thing to check is if there's a way to do that without having to patch into the parser too heavily. So, even in the long term, the PyPy parser does not come for free. We're actually lucky we have a couple of compiler tests by now, but the first thing to do before starting such a project is growing the test suite. We should consider including the test suites of Python and PyPy, but that still wouldn't give us enough tests for the stuff that is most important here: the C type integration, which is neither a part of Python nor PyPy. To some extent, the question of parser rewriting reminds me of this: http://www.joelonsoftware.com/articles/fog0000000069.html Doesn't "if it ain't broken, don't fix it" apply here? 
Stefan From dagss at student.matnat.uio.no Wed May 7 10:01:35 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 07 May 2008 10:01:35 +0200 Subject: [Cython] PyPy parser In-Reply-To: <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <482161DF.9010108@student.matnat.uio.no> > We're actually lucky we have a couple of compiler tests by now, but the > first thing to do before starting such a project is growing the test > suite. We should consider including the test suites of Python and PyPy, > but that still wouldn't give us enough tests for the stuff that is most > important here: the C type integration, which is neither a part of Python > nor PyPy. > > To some extent, the question of parser rewriting reminds me of this: > > http://www.joelonsoftware.com/articles/fog0000000069.html > > Doesn't "if it ain't broken, don't fix it" apply here? > Yes, I think you're right :-) -- Dag Sverre From mistobaan at gmail.com Wed May 7 17:08:34 2008 From: mistobaan at gmail.com (Fabrizio Milo aka misto) Date: Wed, 7 May 2008 08:08:34 -0700 Subject: [Cython] PyPy parser In-Reply-To: <482161DF.9010108@student.matnat.uio.no> References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no> Message-ID: > > Doesn't "if it ain't broken, don't fix it" apply here? 
The fact is that it actually breaks with Python 3.0 :)

I see three solutions:

1)
 Copy the main parts of PyPy's parser and put them inside Cython.
 * Pro:
   easier to modify and clean up
   specialized for Cython's purposes
 * Con:
   duplicated code among libraries

2)
 Keep the main parser logic in the PyPy repository, with a
 compatibility layer inside the Cython repository.

 * Pro: no code duplication
 * Con: Cython becomes dependent on another library

3) Ask the PyPy guys to create a Python parser library for Python
 from PyPy's parser, so that both Cython and Python use it.
 They would get Python 3k support, as would Cython.

I will do my best to work on this parser.

What do you think?

Fabrizio
--------------------------
Luck favors the prepared mind. (Pasteur)

From robertwb at math.washington.edu Wed May 7 20:01:05 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 7 May 2008 11:01:05 -0700
Subject: [Cython] PyPy parser
In-Reply-To:
References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no>
Message-ID:

On May 7, 2008, at 8:08 AM, Fabrizio Milo aka misto wrote:
>>> Doesn't "if it ain't broken, don't fix it" apply here?
> The fact is that it actually breaks with Python 3.0 :)

Yes, this is true. Actually modifying the parser to accept, for example, decorators will be easy. With statements are already implemented at the parser level, as are inner functions (just have to remove the line disallowing them). Parameterized types shouldn't be too hard to do either. For just about every 2.x deficiency I can think of, modifying the parser is going to be by far the easiest part of the job. I am not aware of any huge changes in 3.0, but there may be some.
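As a rough illustration of the parser-level constructs mentioned above (decorators, with statements, inner functions), here is a minimal sketch using the `ast` module from a modern Python standard library. This is an assumption for illustration only: it is not Cython's own parser (Scanning.py / Parsing.py), and the source snippet and the names `decorator`, `outer` and `inner` are hypothetical.

```python
import ast

# Hypothetical source using the three constructs discussed:
# a decorator, an inner function, and a with statement.
SOURCE = '''
@decorator
def outer():
    def inner():
        pass
    with open("f") as fh:
        pass
'''

# Parsing only builds a syntax tree; nothing is executed, so the
# undefined name "decorator" is not a problem here.
tree = ast.parse(SOURCE)
outer = tree.body[0]

# Node types of the statements directly inside outer().
kinds = [type(node).__name__ for node in outer.body]

# Names of the decorators applied to outer().
decorators = [d.id for d in outer.decorator_list]
```

The point of the sketch is only that each construct shows up as its own node in the tree; what a compiler then does with those nodes is the harder part of the job.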
If we decide to replace the parser, I would like to see one that's automatically generated from a grammar file (rather than hand-coded) and lends itself well to optimization when we compile Cython with itself (as parsing is a huge portion of the total runtime).

> I see three solutions:
> 1)
> Copy the main parts of PyPy's parser and put them inside Cython.
> * Pro:
>   easier to modify and clean up
>   specialized for Cython's purposes
> * Con:
>   duplicated code among libraries

This seems like the best idea, depending on how much code we are talking about here, and as long as there aren't any license issues.

> 2)
> Keep the main parser logic in the PyPy repository, with a
> compatibility layer inside the Cython repository.
>
> * Pro: no code duplication
> * Con: Cython becomes dependent on another library

I really don't want to increase dependencies, and I have the feeling that the "compatibility layer" wouldn't be very clean.

> 3) Ask the PyPy guys to create a Python parser library for Python
> from PyPy's parser, so that both Cython and Python use it.
> They would get Python 3k support, as would Cython.

I'm not sure why they would do this for us, and would we rely on them to do it repeatedly as the language changes?

> I will do my best to work on this parser.
>
> What do you think?
>
> Fabrizio

I don't think the discussion would be complete without at least looking at the possibility of using the parser that comes with Python itself.

- Robert

From mistobaan at gmail.com Thu May 8 00:37:05 2008
From: mistobaan at gmail.com (Fabrizio Milo aka misto)
Date: Wed, 7 May 2008 15:37:05 -0700
Subject: [Cython] PyPy parser
In-Reply-To:
References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no>
Message-ID:

> I don't think the discussion would be complete without at least
> looking at the possibility of using the parser that comes with Python
> itself.

The one written in C ?
and extending it with Cython itself ? not sure to understand :| Fabrizio From jek-gmane at kleckner.net Thu May 8 03:07:26 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Wed, 07 May 2008 18:07:26 -0700 Subject: [Cython] Offtopic: Good Python IDEs? In-Reply-To: <47FF20EC.4040803@behnel.de> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> Message-ID: Stefan Behnel wrote: ... > For debugging, I mostly use print and unit tests for Python and print, tests > and valgrind for Cython, so I can't really comment on the debugging > environments (which actually *are* available for emacs). Stefan, do you have some tips for Valgrind? Do you have a suppression file that you like? I ran valgrind again and found a sea of messages with a small number of suspicious ones. Two of them have the form: __pyx_4 = PyObject_Call(__pyx_12, __pyx_empty_tuple, NULL); if (unlikely(!__pyx_4)) { which suggests that the NULL is fooling Valgrind in some way. This page might be a good place to put such things as suppression files: http://wiki.cython.org/UsingValgrindToDebug From jek-gmane at kleckner.net Thu May 8 03:12:57 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Wed, 07 May 2008 18:12:57 -0700 Subject: [Cython] Offtopic: Good Python IDEs? In-Reply-To: References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> Message-ID: Jim Kleckner wrote: > Stefan Behnel wrote: > ... >> For debugging, I mostly use print and unit tests for Python and print, tests >> and valgrind for Cython, so I can't really comment on the debugging >> environments (which actually *are* available for emacs). > > Stefan, do you have some tips for Valgrind? > Do you have a suppression file that you like? > > I ran valgrind again and found a sea of messages with a small number > of suspicious ones. 
> Two of them have the form: > __pyx_4 = PyObject_Call(__pyx_12, __pyx_empty_tuple, NULL); if > (unlikely(!__pyx_4)) { > which suggests that the NULL is fooling Valgrind in some way. > > This page might be a good place to put such things as suppression files: > http://wiki.cython.org/UsingValgrindToDebug > This page explains some of the PyMalloc issues: http://svn.python.org/projects/python/trunk/Misc/README.valgrind From stefan_ml at behnel.de Thu May 8 07:46:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 08 May 2008 07:46:29 +0200 Subject: [Cython] Debugging with valgrind In-Reply-To: References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> Message-ID: <482293B5.3080802@behnel.de> Hi, Jim Kleckner wrote: > Jim Kleckner wrote: >> Stefan Behnel wrote: >> ... >>> For debugging, I mostly use print and unit tests for Python and print, tests >>> and valgrind for Cython, so I can't really comment on the debugging >>> environments (which actually *are* available for emacs). >> Stefan, do you have some tips for Valgrind? >> Do you have a suppression file that you like? >> >> I ran valgrind again and found a sea of messages with a small number >> of suspicious ones. >> Two of them have the form: >> __pyx_4 = PyObject_Call(__pyx_12, __pyx_empty_tuple, NULL); if >> (unlikely(!__pyx_4)) { >> which suggests that the NULL is fooling Valgrind in some way. >> >> This page might be a good place to put such things as suppression files: >> http://wiki.cython.org/UsingValgrindToDebug >> > > This page explains some of the PyMalloc issues: > http://svn.python.org/projects/python/trunk/Misc/README.valgrind Yes, that's where we took the suppression file from that we use for lxml. 
http://codespeak.net/svn/lxml/trunk/valgrind-python.supp

Stefan

From dalcinl at gmail.com Thu May 8 08:05:39 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 8 May 2008 03:05:39 -0300
Subject: [Cython] working patch for generating code targeting Py 2.6 and Py 3.0
Message-ID:

For the last four hours (3:00 AM right now here in Argentina) I've been working on a patch to enable Cython to generate code that works with Python 2.6 and Python 3.0.

So far, the generated code (at least for the full mpi4py project) compiles and links fine with no errors.

However, I have a big problem that I do not know how to 'fix' in Cython. It is related to unbound methods disappearing in Py3K. As a result, normal Python classes do not work, but the cdef ones are fine.

Is there any interest in this going mainstream? I was very conservative about the PyString/PyUnicode issue. The right one is used on a place-by-place basis. Of course, because of this, I have to pass 'bytes' to MPI, and I get 'bytes' from the C calls.

Finally, I'm completely sure that I've not fixed all the relevant parts, but this is IMHO a good starting point.

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Thu May 8 08:56:19 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 08 May 2008 08:56:19 +0200
Subject: [Cython] working patch for generating code targeting Py 2.6 and Py 3.0
In-Reply-To:
References:
Message-ID: <4822A413.8060902@behnel.de>

Hi,

Lisandro Dalcin wrote:
> For the last four hours (3:00 AM right now here in Argentina) I've been
> working on a patch to enable Cython to generate code that works with
> Python 2.6 and Python 3.0.
> > Until now, the generated code (at least for the full mpi4py project) > compiles and link fine with no errors. > > However, I have a big problems I do not know how to 'fix' in Cython. > It is related to unbound methods disapearing in Py3K. Then, normal > python classes does not work, but the cdef ones are fine. These are the kind of things that we planned to address at the dev1 workshop. The C-API of Python is expected to stabilize next month, I don't know how stable these parts currently are. > Is there any interest on this to go mainstream? Sure, totally! > I was very > conservative about the PyString/PyUnicode issue. The right one is used > in a place-by-place base. Of course, because of this, I have to pass > 'bytes' to MPI, and I get 'bytes' from the C calls. Yes, I think this is the right way to deal with it. Python2 was very lax in terms of semantics here, so the two have to be separated on a case-by-case basis. > Finally, I'm completelly sure that I've not fixed all the relevant > parts, but this is IMHO a good starting point. Is it one big patch or did you/can you split it up? As this is potentially a big change, trac is the wrong place to discuss it. We should put up an official Py3 branch that people can actively work on without impacting the main trunk, so that we can merge working stuff gradually. Can you send a bundle against cython-devel to me and Robert for now? He can set it up. Stefan From dalcinl at gmail.com Thu May 8 17:33:55 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 12:33:55 -0300 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <4822A413.8060902@behnel.de> References: <4822A413.8060902@behnel.de> Message-ID: Well, I send here a patch obtained with 'hg diff'. Of course, before seeting up a new repo with this patch as a base, look at it for possible pitfalls regarding to style. Additionally, take care of the comments below. 
- The PyNumberMethods stuff is the hard part. Some of the slots changed their name (nb_nonzero -> nb_bool) in py3k. I have not handled this name change yet. Other slots are gone; I basically added a flag to manage this. The really annoying part is that some slots are UNUSED, but they are still there in the struct. Could you ask for clarification on Python-Dev about all this crap? IMHO, either all slots should be kept, or all slots unused in py3k should be removed.

- PyMethod_New changed. I've tried to emulate this with a macro, but normal classes are not working anyway. PyClass_New is gone; I've emulated it with a call to 'type' (but in C, of course). Perhaps the class dict should be filled before calling type(name, bases, dict)?? But then I do not know how to hack Cython to do that...

- Finally, in the very clever part of the traceback hackery, the call to PyCode_New receives 'filenames' created with PyUnicode_AsString. I do not know what encoding the C preprocessor uses for the __FILE__ macro. If it is always ASCII, then all is fine; if not, the filesystem encoding should be taken into account.

Regards,

On 5/8/08, Stefan Behnel wrote:
> Hi,
>
> Lisandro Dalcin wrote:
> > For the last four hours (3:00 AM right now here in Argentina) I've been
> > working on a patch to enable Cython to generate code that works with
> > Python 2.6 and Python 3.0.
> >
> > So far, the generated code (at least for the full mpi4py project)
> > compiles and links fine with no errors.
> >
> > However, I have a big problem that I do not know how to 'fix' in Cython.
> > It is related to unbound methods disappearing in Py3K. As a result, normal
> > Python classes do not work, but the cdef ones are fine.
>
> These are the kind of things that we planned to address at the dev1 workshop.
> The C-API of Python is expected to stabilize next month, I don't know how
> stable these parts currently are.
>
> > Is there any interest in this going mainstream?
>
> Sure, totally!
>
> > I was very
> > conservative about the PyString/PyUnicode issue. The right one is used
> > on a place-by-place basis. Of course, because of this, I have to pass
> > 'bytes' to MPI, and I get 'bytes' from the C calls.
>
> Yes, I think this is the right way to deal with it. Python2 was very lax in
> terms of semantics here, so the two have to be separated on a case-by-case basis.
>
> > Finally, I'm completely sure that I've not fixed all the relevant
> > parts, but this is IMHO a good starting point.
>
> Is it one big patch or did you/can you split it up?
>
> As this is potentially a big change, trac is the wrong place to discuss it. We
> should put up an official Py3 branch that people can actively work on without
> impacting the main trunk, so that we can merge working stuff gradually. Can
> you send a bundle against cython-devel to me and Robert for now? He can set it up.
>
> Stefan
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cython-py3k.patch Type: application/octet-stream Size: 21182 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080508/c5896734/attachment-0001.obj From robertwb at math.washington.edu Thu May 8 17:37:43 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 08:37:43 -0700 Subject: [Cython] working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <4822A413.8060902@behnel.de> References: <4822A413.8060902@behnel.de> Message-ID: <800C6DA2-4F48-4BE2-9647-623E09146A63@math.washington.edu> On May 7, 2008, at 11:56 PM, Stefan Behnel wrote: > Hi, > > Lisandro Dalcin wrote: >> The last four hours (3:00 AM right now here at Argentina) I've been >> working in a patch for enabling Cython generate code working with >> Python 2.6 and Python 3.0. >> >> Until now, the generated code (at least for the full mpi4py project) >> compiles and link fine with no errors. >> >> However, I have a big problems I do not know how to 'fix' in Cython. >> It is related to unbound methods disapearing in Py3K. Then, normal >> python classes does not work, but the cdef ones are fine. > > These are the kind of things that we planned to address at the dev1 > workshop. > The C-API of Python is expected to stabilize next month, I don't > know how > stable these parts currently are. > > >> Is there any interest on this to go mainstream? > > Sure, totally! Yes, we're very interested! Is it backwards compatible too? >> I was very >> conservative about the PyString/PyUnicode issue. The right one is >> used >> in a place-by-place base. Of course, because of this, I have to pass >> 'bytes' to MPI, and I get 'bytes' from the C calls. > > Yes, I think this is the right way to deal with it. Python2 was > very lax in > terms of semantics here, so the two have to be separated on a case- > by-case basis. > > >> Finally, I'm completelly sure that I've not fixed all the relevant >> parts, but this is IMHO a good starting point. 
> > Is it one big patch or did you/can you split it up? > > As this is potentially a big change, trac is the wrong place to > discuss it. We > should put up an official Py3 branch that people can actively work > on without > impacting the main trunk, so that we can merge working stuff > gradually. Can > you send a bundle against cython-devel to me and Robert for now? He > can set it up. Trac is a good place to put the patches and status though. Are you sure we want a separate branch for this, as we want to retain Py2 compatibility as we work on this. - Robert From robertwb at math.washington.edu Thu May 8 17:55:49 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 08:55:49 -0700 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: References: <4822A413.8060902@behnel.de> Message-ID: On May 8, 2008, at 8:33 AM, Lisandro Dalcin wrote: > Well, I send here a patch obtained with 'hg diff'. Thanks. I briefly glanced at it and it looks good. What you probably want to do is commit and then export the patch (that way it will have some history attached, and you'll get credit for it.) FYI, I am preparing for my general exams that I take in a couple of weeks, so won't have tons of time to put into Cython in the near future, but this looks like really good work. > Of course, before seeting up a new repo with this patch as a base, > look at it for > possible pitfalls regarding to style. Additionally, take care of the > comments below. One remark I had is that perhaps more care needs to be done with "PyInt_CheckExact" as in Py2 this implies that the size fits into a long, so it's not quite the same as PyLong_CheckExact. > - The PyNumberMethods stuff is the hard part. Some of them changed it > name (nb_nonzero -> nb_bool) in py3k. I did not managed this name > change yet. Other slots are gone, I basically added a flag for manage > this. 
The realy anoying part is that some slots are UNUSED, but they > are still there in the struct. Could you ask for clarifications in > Python-Dev about all this crap? IMHO, all sloots should be keept, or > all unused in py3k should be removed. Could you elaborate a bit more on what you mean by "unused?" If they're still in the struct, I'd think we would need to keep them so it has the right size/layout, right? > - PyMember_New changed, I've tried to emulate this with a macro, but > normal classes are not working anyway. PyClass_New is gone, I've > emulated it with a call to 'type' (but in C, of course). Perhaps the > class dict should be filled before calling type(name, bases, dict)?? > But then I do not know how to hack cython to do that... Filling the dict ahead of time might actually work better...one would execute the body in a temporary scope and then construct the class. This would bring things more in line with how Python does things. > - Finally, in the very clever part of traceback hackery, the call to > Py_CodeNew receives 'filenames' created with PyUnicode_AsString. I do > not know what the C preprocessor uses for enconding __FILE__ macro, it > its always ASCII, then all is fine, if not, the filesystem encoding > should be taken into account. I actually have no idea on this one--maybe there's some system call that can give this information? Stefan would probably know better than me. > > Regards, > > > On 5/8/08, Stefan Behnel wrote: >> Hi, >> >> >> Lisandro Dalcin wrote: >>> The last four hours (3:00 AM right now here at Argentina) I've been >>> working in a patch for enabling Cython generate code working with >>> Python 2.6 and Python 3.0. >>> >>> Until now, the generated code (at least for the full mpi4py project) >>> compiles and link fine with no errors. >>> >>> However, I have a big problems I do not know how to 'fix' in >>> Cython. >>> It is related to unbound methods disapearing in Py3K. 
Then, normal >>> python classes does not work, but the cdef ones are fine. >> >> >> These are the kind of things that we planned to address at the >> dev1 workshop. >> The C-API of Python is expected to stabilize next month, I don't >> know how >> stable these parts currently are. >> >> >> >>> Is there any interest on this to go mainstream? >> >> >> Sure, totally! >> >> >> >>> I was very >>> conservative about the PyString/PyUnicode issue. The right one is >>> used >>> in a place-by-place base. Of course, because of this, I have to pass >>> 'bytes' to MPI, and I get 'bytes' from the C calls. >> >> >> Yes, I think this is the right way to deal with it. Python2 was >> very lax in >> terms of semantics here, so the two have to be separated on a >> case-by-case basis. >> >> >> >>> Finally, I'm completelly sure that I've not fixed all the relevant >>> parts, but this is IMHO a good starting point. >> >> >> Is it one big patch or did you/can you split it up? >> >> As this is potentially a big change, trac is the wrong place to >> discuss it. We >> should put up an official Py3 branch that people can actively >> work on without >> impacting the main trunk, so that we can merge working stuff >> gradually. Can >> you send a bundle against cython-devel to me and Robert for now? >> He can set it up. 
>>
>> Stefan
>>
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> <cython-py3k.patch>

From stefan_ml at behnel.de Thu May 8 17:57:09 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 08 May 2008 17:57:09 +0200
Subject: [Cython] working patch for generating code targeting Py 2.6 and Py 3.0
In-Reply-To: <800C6DA2-4F48-4BE2-9647-623E09146A63@math.washington.edu>
References: <4822A413.8060902@behnel.de> <800C6DA2-4F48-4BE2-9647-623E09146A63@math.washington.edu>
Message-ID: <482322D5.3020600@behnel.de>

Hi,

Robert Bradshaw wrote:
> Trac is a good place to put the patches and status though.

Sure.

> Are you sure
> we want a separate branch for this, as we want to retain Py2
> compatibility as we work on this.

I do not expect this to be finished with a single patch, so there will be things that may break somewhere in between and there may be things that require larger refactorings. Keeping cython-devel somewhat stable and having a separate branch where things can become unstable along the way helps us in keeping up the ability to continue the normal development without being directly impacted by Py3 work. The idea is to merge things over step-by-step when they prove to be correct and "stable enough".
Stefan From robertwb at math.washington.edu Thu May 8 18:00:07 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 09:00:07 -0700 Subject: [Cython] PyPy parser In-Reply-To: References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no> Message-ID: <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu> On May 7, 2008, at 3:37 PM, Fabrizio Milo aka misto wrote: >> I don't think the discussion would be complete without at least >> looking at the possibility of using the parser that comes with >> Python >> itself. > > The one written in C ? and extending it with Cython itself ? > not sure to understand :| There are some python parsing files as well, see, for example, src/ Lib/compiler in the Python sources. We could perhaps even use some stuff (classes, ast vistor, etc) from the compiler modules. On the other hand, we don't want to rely to heavily on the specific version of Python the user has. A simple grammar -> parser module like you've talked about before seems ideal, and would probably be fairly compact. - Robert From robertwb at math.washington.edu Thu May 8 18:01:39 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 09:01:39 -0700 Subject: [Cython] Debugging with valgrind In-Reply-To: <482293B5.3080802@behnel.de> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> Message-ID: <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> On May 7, 2008, at 10:46 PM, Stefan Behnel wrote: > Hi, > > Jim Kleckner wrote: >> Jim Kleckner wrote: >>> Stefan Behnel wrote: >>> ... >>>> For debugging, I mostly use print and unit tests for Python and >>>> print, tests >>>> and valgrind for Cython, so I can't really comment on the debugging >>>> environments (which actually *are* available for emacs). 
>>> Stefan, do you have some tips for Valgrind? >>> Do you have a suppression file that you like? >>> >>> I ran valgrind again and found a sea of messages with a small number >>> of suspicious ones. >>> Two of them have the form: >>> __pyx_4 = PyObject_Call(__pyx_12, __pyx_empty_tuple, NULL); if >>> (unlikely(!__pyx_4)) { >>> which suggests that the NULL is fooling Valgrind in some way. >>> >>> This page might be a good place to put such things as suppression >>> files: >>> http://wiki.cython.org/UsingValgrindToDebug >>> >> >> This page explains some of the PyMalloc issues: >> http://svn.python.org/projects/python/trunk/Misc/README.valgrind > > Yes, that's where we took the suppression file from that we use for > lxml. > > http://codespeak.net/svn/lxml/trunk/valgrind-python.supp I bet turning off string interning, etc. would help clean up lots of the noise too. - Robert From robertwb at math.washington.edu Thu May 8 18:08:43 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 09:08:43 -0700 Subject: [Cython] Abandon support for Python 2.3? Message-ID: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> As mentioned in another thread, we are considering requiring Python 2.4 or greater to run the Cython compiler, mostly so we can use decorators in the compiler code. Is anyone still using Cython with this version of Python? 
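For context on why decorators tie the compiler to Python 2.4 or later: the `@` syntax was added in 2.4, while the equivalent manual rebinding works in older versions. A minimal sketch follows; the `traced` decorator, the `calls` list and the `visit_*` functions are hypothetical names made up for this illustration and are not taken from the Cython sources.

```python
calls = []  # records the name of every decorated function that gets called


def traced(func):
    """Hypothetical decorator: note each call, then run the function."""
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


# Decorator syntax, available from Python 2.4 on.
@traced
def visit_new(node):
    return "visited " + node


# The spelling that also works on Python 2.3: define, then rebind by hand.
def visit_old(node):
    return "visited " + node
visit_old = traced(visit_old)


result_new = visit_new("x")
result_old = visit_old("y")
```

Both forms behave the same at run time; the decorator syntax only moves the wrapping next to the function header, which is the readability gain being weighed against dropping 2.3.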
- Robert From mabshoff at googlemail.com Thu May 8 17:30:41 2008 From: mabshoff at googlemail.com (Michael Abshoff) Date: Thu, 08 May 2008 17:30:41 +0200 Subject: [Cython] Debugging with valgrind In-Reply-To: <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> Message-ID: <48231CA1.5080702@googlemail.com> Robert Bradshaw wrote: > On May 7, 2008, at 10:46 PM, Stefan Behnel wrote: > Hi, >> Yes, that's where we took the suppression file from that we use for >> lxml. >> >> http://codespeak.net/svn/lxml/trunk/valgrind-python.supp >> > > I am actually pretty paranoid about using suppression files and I do prefer recompiling python "--without-pymalloc". But that is likely to be something most people will not do since they do not build their own python interpreter from sources. > I bet turning off string interning, etc. would help clean up lots of > the noise too. > Hmm, I have seen a lot of noise with Cython 0.9.6.14 even when I set generate_cleanup_code = 3 In Sage about 100 doctests segfault at exit [out of 900] with that setting on, nearly all of them involved matrix[2].pyx somehow IIRC. Something for dev1 maybe? It is my understanding that this isn't the default of Cython, but more which classes import other classes and in which order. Anyhow: How do I turn off "string interning"? 
> - Robert > Cheers, Michael > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > > From dalcinl at gmail.com Thu May 8 18:22:25 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 13:22:25 -0300 Subject: [Cython] Debugging with valgrind In-Reply-To: <48231CA1.5080702@googlemail.com> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com> Message-ID: I'm using Python 2.5.1 and run valgrind with the Misc/valgrind-python.supp inside the Python source distribution, and I do not get any warnings, even if I activate cleanup code. Of course, my project is not cimporting anything. On 5/8/08, Michael Abshoff wrote: > Robert Bradshaw wrote: > > On May 7, 2008, at 10:46 PM, Stefan Behnel wrote: > > > > Hi, > > > >> Yes, that's where we took the suppression file from that we use for > >> lxml. > >> > >> http://codespeak.net/svn/lxml/trunk/valgrind-python.supp > >> > > > > > > I am actually pretty paranoid about using suppression files and I do > prefer recompiling python "--without-pymalloc". But that is likely to be > something most people will not do since they do not build their own > python interpreter from sources. > > > > I bet turning off string interning, etc. would help clean up lots of > > the noise too. > > > > Hmm, I have seen a lot of noise with Cython 0.9.6.14 even when I set > > generate_cleanup_code = 3 > > In Sage about 100 doctests segfault at exit [out of 900] with that > setting on, nearly all of them involved matrix[2].pyx somehow IIRC. > Something for dev1 maybe? It is my understanding that this isn't the > default of Cython, but more which classes import other classes and in > which order. > > Anyhow: How do I turn off "string interning"? 
> > - Robert
> >
> Cheers,
>
> Michael
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Thu May 8 18:26:01 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 8 May 2008 09:26:01 -0700
Subject: [Cython] Debugging with valgrind
In-Reply-To: <48231CA1.5080702@googlemail.com>
References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com>
Message-ID: <2742AEAA-F1CF-404B-9190-69284823B8D5@math.washington.edu>

On May 8, 2008, at 8:30 AM, Michael Abshoff wrote:
> Robert Bradshaw wrote:
>> On May 7, 2008, at 10:46 PM, Stefan Behnel wrote:
>>
>
> Hi,
>
>>> Yes, that's where we took the suppression file from that we use for
>>> lxml.
>>>
>>> http://codespeak.net/svn/lxml/trunk/valgrind-python.supp
>>>
>>
> I am actually pretty paranoid about using suppression files and I do
> prefer recompiling python "--without-pymalloc". But that is likely to be
> something most people will not do since they do not build their own
> python interpreter from sources.
>
>> I bet turning off string interning, etc. would help clean up lots of
>> the noise too.
>> > Hmm, I have seen a lot of noise with Cython 0.9.6.14 even when I set > > generate_cleanup_code = 3 > > In Sage about 100 doctests segfault at exit [out of 900] with that > setting on, nearly all of them involved matrix[2].pyx somehow IIRC. > Something for dev1 maybe? It is my understanding that this isn't the > default of Cython, but more which classes import other classes and in > which order. Cleanup is inherently dangerous... Suppose one has two modules a and b, with classes A and B respectively, both of which have non-empty __del__ or __dealloc__ methods. Then if a holds an instance of B and b holds an instance of A (think parent-element) then it is impossible to clean up a or b after the other module has been deallocated. > Anyhow: How do I turn off "string interning"? It's in Compiler/Options.py. This may make some segfaults go away too, but slows things down. - Robert From robertwb at math.washington.edu Thu May 8 18:33:49 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 09:33:49 -0700 Subject: [Cython] Some small phase refactorings In-Reply-To: <48215041.7070805@student.matnat.uio.no> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> Message-ID: <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> On May 6, 2008, at 11:46 PM, Dag Sverre Seljebotn wrote: >> I think one could put a generic accept function in the Node class >> that would dispatch based on class name (or some class attribute) if >> one wanted, rather than having to write it for every class. >> > Yes, simply move the function. You spoke against using the class > name in > the other thread though. I suppose I could use an attribute; still I > like decorators much better. Yes, I do too. I'm not totally dead set against them either.
I just sent out a (much more explicit) email on this note; if it doesn't get any replies then I guess that's an indication. I am actually a bit worried about how the use of decorators will impact the ability of Cython to compile itself into ultra-efficient C. >> I think this is good to talk about now. Basically, you want to change >> Cython to use the "visitor pattern" rather than the recursive pattern >> that it currently uses. This has pros and cons, but I think this >> could be a good thing, and you are certainly convinced that it's >> needed to get started on the GSoC stuff. Something doesn't sit right >> about the patch though, but I can't quite put my finger on it. I >> think that it should happen at a higher level than inside the module >> node (i.e. ModuleNode should be "visited" rather than initiate the >> visitors.) As for the "phase shift" between function bodies and the >> > But of course! I'll repeat what I said to Stefan (the context is the > codelines in ModuleNode.py): > > --- > > Note that this code will almost certainly be moved again and > rewritten at > some later point (they can't really belong to ModuleNode); but more > refactoring must happen before they can be moved to their proper > location > and they serve a good purpose where they are for now though. > > --- > > It's a rather big knot, and one has to begin to untangle it > somewhere. I > chose in the middle of ModuleNode. > > One could make a (perhaps more natural) decision to start untangling > it at > the top (i.e. in Main.py); however two things spoke against it: > > - I wanted to jump right in and have a feel for the most difficult > phase > separation we were going to transform. If I started at the top, I > would > have to refactor ModuleNode before going on, and that would take some > time yet still be a (comparatively) easy task -- and I wouldn't get > time > there and then for challenging refactoring the main tree and see what > the real issues were.
So I wouldn't gain any real new knowledge from > starting "at the right place" (which was needed at this stage, in > order > to start thinking about this refactoring). > > - It will also take somewhat longer to see the immediate gains one gets > from this starting from the top, but that is fine... Your reasoning makes sense here, as long as it doesn't make it more difficult to move things up to the top at the end. > Of course, different needs drive Cython patch inclusion than what I > need > to investigate; and I'm fine with the patch not being included at this > stage. It just didn't seem to add anything except another level of indirection, but that will of course eventually be needed. There is a lot of re-factoring that will need to be done. I've just got too many other obligations (general exam) to be able to put much time/thought into this for the next two weeks or so :-(. - Robert From robertwb at math.washington.edu Thu May 8 18:39:07 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 09:39:07 -0700 Subject: [Cython] __getattribute__ In-Reply-To: <20080505033749.GF4087@tilt> References: <20080430004208.GJ15181@tilt> <5EB02A25-EE20-4FAB-A780-19DEC5EFA16A@math.washington.edu> <20080504051532.GA7720@tilt> <20080505033749.GF4087@tilt> Message-ID: Thanks! It works great for me. On May 4, 2008, at 8:37 PM, Peter Todd wrote: > On Sun, May 04, 2008 at 01:15:32AM -0400, Peter Todd wrote: >> Here's my first patch. This correctly implements __getattribute__ and >> __getattr__ in the single class case. FWIW I also have a mercurial >> tree >> if it'd be better to pull from it than apply patches. >> >> I'm working on making subclasses behave correctly, I've got test >> cases >> written up showing where things fail, but no solutions to that >> written >> yet. The slot_tp_getattro stuff Stefan mentioned is useful though. > > __getattr(ibute)__ support is now working with subclasses.
The > semantics > should match Python exactly if my test cases are correct. All that has > changed from the previous patch is that at compile time base > classes are > checked for __getattr(ibute)__ methods and those methods are used if > found. > > Attached is an hg bundle of the two commits. > > -- > http://petertodd.org 'peter'[:-1]@petertodd.org > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080508/46ab919d/attachment.pgp From robert.kern at gmail.com Thu May 8 18:58:16 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 11:58:16 -0500 Subject: [Cython] Abandon support for Python 2.3? In-Reply-To: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> Message-ID: <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> On Thu, May 8, 2008 at 11:08 AM, Robert Bradshaw wrote: > As mentioned in another thread, we are considering requiring Python > 2.4 or greater to run the Cython compiler, mostly so we can use > decorators in the compiler code. Is anyone still using Cython with > this version of Python? We have been considering moving from Pyrex to Cython for part of numpy. numpy will be remaining with Python 2.3 for some time. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From dagss at student.matnat.uio.no Thu May 8 19:17:10 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 08 May 2008 19:17:10 +0200 Subject: [Cython] A high-level discussion of the Cython internals Message-ID: <48233596.5020204@student.matnat.uio.no> This is in response to this statement by Robert: -- I think this is good to talk about now. Basically, you want to change Cython to use the "visitor pattern" rather than the recursive pattern that it currently uses. This has pros and cons, but I think this could be a good thing, and you are certainly convinced that it's needed to get started on the GSoC stuff. -- So, leaving the patch and implementation strategies aside for now I'm going to attempt discussing this from the mountain top. (If a lot of this seems trivial, all the better -- it can then start to serve as documentation for a direction, rather than my "private" efforts :-). If not, it will be good to discuss it.) I honestly believe that any time spent discussing and implementing this now can repay itself many times later on in saved developer hours and a lower learning curve for new developers. If people agree with me in how important this is, I think it would be good to perhaps spend some time at the beginning of dev1 talking and discussing this (with some slides etc.). Here we go: Assertion 1: Development of Cython has a potential for becoming stagnant (if it hasn't already) quickly. My (subjective) feeling is that the codebase has a somewhat high learning curve. If one wants to add a fundamentally new feature (as opposed to quick fixes, which do currently happen very quickly -- but think type inference!) it looks like one has to be very careful to keep many different effects of one's changes in one's head at the same time. Assertion 2: The reason for this state can at least in some part be attributed to some confusion as to what the current code tree *is*.
I believe that the reason that the codebase can be practical in the current state is tightly connected with how similar C and CPython are to Python as a language. The design, so to speak, arose in an historical happy accident, because the input and output are almost 1:1 related. A Python function and a C function loosely correspond, and so FuncDefNode can serve the role as both the parsed element and the element due for C serialization. However there is no reason to assume that the 1:1 correspondence holds everywhere; and I think "ugly hacks" already tend to arise in situations where the correspondence in program structure diverges more. (To consider how this might become worse, think about a "method template node".) If this was only a theoretically founded criticism one might simply dismiss it, but I believe the problems in assertion 1 come from this very fact, and that they can be solved by dealing with this conceptual problem. The proposed solution: Work towards "The ideal solution" below in tiny, incremental steps. All *new* major features are implemented with the ideal solution in mind, and old code refactored as far as is needed for that to happen, but one doesn't start rewriting working and bugfixed code just for the sake of abstract purity. The ideal solution: A strict pipeline-based approach where the "data" (one or more trees, graphs or similar) is in different stages, with different phases transforming one into the other. I.e., always try to fit what one does into the following example scheme: data stage:string of code -> phase:parsing -> data stage:syntax tree -> phase:expand with statements -> data stage:tree without with statements -> phase:create scopes -> data stage:tree with scopes -> phase:analyse types -> data stage:tree with ".type" attributes -> ... ...
data stage:tree closely related to *structure* of C source -> phase:output Whenever something doesn't quite fit in the scheme of naive input/output, split it up into more phases until it does :-) (Digression: Closely related to C structure above means, for instance, no inner functions. No with statements. No untyped variables. That kind of thing. Classes may be included though, since one can create a 1:1 correspondence between C code using the CPython API and a class; so it is not about language "features" as such but some deeper structure.) Note that not all phases need to be visitors! For instance, the output-to-C phase will stay a recursive process for the foreseeable future. This is a way of thinking, not a recipe for implementation. (And indeed, this way of thinking is already followed to a degree -- but not everywhere, and we get problems at the spots where it is not.) A very important side-effect of this is that it lends itself to "implementing complex Cython statements as more simple Cython statements". (Though ideally "more simple Cython statements" can also be "intermediate simpler instructions" that bridge the gap between Cython and C in a more fine-grained way in order to give more code reuse than there is today. But I digress.) (Implementation notes: This does not necessarily mean that the classes of the nodes change between phases; one does whatever is most convenient. For instance, if one has T2 = expand_with_nodes(T1), then T1 is allowed to keep WithStatementNode, T2 is however not, but otherwise the trees are the same. In the same way, if T1 = type_analysis(T0), then ExprNodes in T0 are not allowed to have the type attribute, while ExprNodes in T1 are required to have them. This can be a documentation matter or enforced (in a number of ways), that issue is better left for later.). Final words: This might seem trivial. It is not.
The reason is that currently, almost wherever you try to implement two simple phases f and g one ends up having the result of f depend on what is done in g for some nodes, and the result of g depend on f for other nodes. And so neither f(g(T)) nor g(f(T)) can be made to work. The point is, this way of changing how one thinks mostly affects the compiler design as such, more than it is a specific style of coding. Visitors happen to be the most natural way to implement this, but are not the main point. I'd be happy (or, happier than I am today) with a cleanly defined phase-based recursive process. -- Dag Sverre From stefan_ml at behnel.de Thu May 8 19:27:15 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 08 May 2008 19:27:15 +0200 Subject: [Cython] Abandon support for Python 2.3? In-Reply-To: <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> Message-ID: <482337F3.8000105@behnel.de> Hi, Robert Kern wrote: > On Thu, May 8, 2008 at 11:08 AM, Robert Bradshaw > wrote: >> As mentioned in another thread, we are considering requiring Python >> 2.4 or greater to run the Cython compiler, mostly so we can use >> decorators in the compiler code. Is anyone still using Cython with >> this version of Python? > > We have been considering moving from Pyrex to Cython for part of > numpy. numpy will be remaining with Python 2.3 for some time. what Robert meant was: the compiler itself would no longer run on Py2.3, the generated code would still support it. So you would only need Py2.4 or later if you changed the code, not to compile and run the generated sources. Would that really impact numpy? Stefan From robert.kern at gmail.com Thu May 8 19:30:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 May 2008 12:30:27 -0500 Subject: [Cython] Abandon support for Python 2.3? 
In-Reply-To: <482337F3.8000105@behnel.de> References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> <482337F3.8000105@behnel.de> Message-ID: <3d375d730805081030w35c0c532xbc6eb9fb44f3c755@mail.gmail.com> On Thu, May 8, 2008 at 12:27 PM, Stefan Behnel wrote: > Hi, > > Robert Kern wrote: >> On Thu, May 8, 2008 at 11:08 AM, Robert Bradshaw >> wrote: >>> As mentioned in another thread, we are considering requiring Python >>> 2.4 or greater to run the Cython compiler, mostly so we can use >>> decorators in the compiler code. Is anyone still using Cython with >>> this version of Python? >> >> We have been considering moving from Pyrex to Cython for part of >> numpy. numpy will be remaining with Python 2.3 for some time. > > what Robert meant was: the compiler itself would no longer run on Py2.3, the > generated code would still support it. So you would only need Py2.4 or later > if you changed the code, not to compile and run the generated sources. > > Would that really impact numpy? I do not like being unable to develop with the platform we're deploying on. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robertwb at math.washington.edu Thu May 8 19:42:53 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 10:42:53 -0700 Subject: [Cython] Abandon support for Python 2.3? 
In-Reply-To: <3d375d730805081030w35c0c532xbc6eb9fb44f3c755@mail.gmail.com> References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> <482337F3.8000105@behnel.de> <3d375d730805081030w35c0c532xbc6eb9fb44f3c755@mail.gmail.com> Message-ID: On May 8, 2008, at 10:30 AM, Robert Kern wrote: > On Thu, May 8, 2008 at 12:27 PM, Stefan Behnel > wrote: >> Hi, >> >> Robert Kern wrote: >>> On Thu, May 8, 2008 at 11:08 AM, Robert Bradshaw >>> wrote: >>>> As mentioned in another thread, we are considering requiring Python >>>> 2.4 or greater to run the Cython compiler, mostly so we can use >>>> decorators in the compiler code. Is anyone still using Cython with >>>> this version of Python? >>> >>> We have been considering moving from Pyrex to Cython for part of >>> numpy. numpy will be remaining with Python 2.3 for some time. Numpy and numpy-using code are very important targets for Cython. >> what Robert meant was: the compiler itself would no longer run on >> Py2.3, the >> generated code would still support it. So you would only need >> Py2.4 or later >> if you changed the code, not to compile and run the generated >> sources. >> >> Would that really impact numpy? > > I do not like being unable to develop with the platform we're > deploying on. This is a valid point. Also, as we work (via Dag's GSoC project for example) to make it easier to use numpy effectively from Cython, it does seem kind of odd to say "you need 2.3 to use NumPy, but 2.4 to use NumPy+Cython." I was finally starting to be convinced that moving to Py2.4 might be OK, but if it impacts our relation with SciPy then that is too high of a price to pay. Py2.4 features were suggested for convenience, not necessity. 
- Robert From dagss at student.matnat.uio.no Thu May 8 19:50:26 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 08 May 2008 19:50:26 +0200 Subject: [Cython] Some small phase refactorings In-Reply-To: <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> Message-ID: <48233D62.7000200@student.matnat.uio.no> > any replies than I guess that's an indication. I am actually a bit > worried about how the use of decorators will impact the ability of > Cython to compile itself into ultra-efficient C. > Efficiency doesn't bother me at all, and I have reasons! What the decorators would do (does, in prototype code) is stuff the class name and function into a dictionary at *class construction time* (or, without metaclass support, at object construction time), i.e. a O(1) overhead that is negligible even when running in the Python interpreter. During tree traversal, it's just a dict lookup on class id and a dispatch to the resulting function. If one were generating C++ objects then the classical visitor pattern would be faster because real vtables are faster than dict lookups, however Cython polymorphism is (if I'm not wrong, haven't looked at this closely) dict-based anyway so it shouldn't make a difference. Here's another way to use 2.3 that might be acceptable: class WithStatementHandler(VisitorTransform): def handle_with(self, node): ... matches = [ class_match(WithStatementNode, handle_with) ] It would end up as a dict of class -> function like the other approach (but written like a list of objects because I'd like to leave a way open up for other types of matches though). 
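A minimal sketch of the class -> function dispatch described above, written without decorators so it would also fit the Python 2.3 constraint. Everything here is an assumption made for illustration: `Node`, `PrintNode`, `class_match`, and the body of `VisitorTransform` are hypothetical stand-ins rather than the actual Cython compiler classes, and the with-statement "expansion" is only a placeholder.

```python
# Hypothetical stand-ins for illustration; not Cython's real compiler classes.
class Node(object):
    def __init__(self, *children):
        self.children = list(children)


class WithStatementNode(Node):
    pass


class PrintNode(Node):
    pass


def class_match(node_class, handler):
    # One (class, handler) pair. Other kinds of matches could be added later.
    return (node_class, handler)


class VisitorTransform(object):
    matches = []  # subclasses list their (class, handler) pairs here

    def __init__(self):
        # Built once per transform object, so the setup cost is negligible.
        self.dispatch = dict(self.matches)

    def visit(self, node):
        # Transform children first, then dispatch on the node's exact class.
        node.children = [self.visit(child) for child in node.children]
        handler = self.dispatch.get(node.__class__)
        if handler is None:
            return node  # no handler registered: keep the node unchanged
        return handler(self, node)


class WithStatementHandler(VisitorTransform):
    def handle_with(self, node):
        # Placeholder rewrite: swap the with statement for a plain Node.
        return Node(*node.children)

    matches = [class_match(WithStatementNode, handle_with)]
```

During traversal each node costs one dict lookup plus a call. Because the lookup uses the exact class, a subclass of `WithStatementNode` would not match in this sketch; walking the class's bases would be one way to relax that.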
> It just didn't seem to add anything except another level of > indirection, but that will of course eventually be needed. There is a > lot of re-factoring that will need to be done. I've just got other > obligations (general exam) to be able to put much time/though into > this for the next two weeks or so :-(. > What it added was the capability to plug in new transforms after type analysis, but before generation, on a module-wide basis, rather than having the transform run once for each function. I didn't make that very clear; I suppose because you're right, it's real value is in another layer of indirection -- but one that will be needed and is useful during refactoring, and can allow "in-production" refactoring, rather than applying it all at once. " The only problem that cannot be solved by another layer of indirection is too many layers of indirection " :-) -- Dag Sverre From dagss at student.matnat.uio.no Thu May 8 19:52:41 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 08 May 2008 19:52:41 +0200 Subject: [Cython] Abandon support for Python 2.3? In-Reply-To: References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu> <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com> <482337F3.8000105@behnel.de> <3d375d730805081030w35c0c532xbc6eb9fb44f3c755@mail.gmail.com> Message-ID: <48233DE9.90800@student.matnat.uio.no> > This is a valid point. Also, as we work (via Dag's GSoC project for > example) to make it easier to use numpy effectively from Cython, it > does seem kind of odd to say "you need 2.3 to use NumPy, but 2.4 to > use NumPy+Cython." I agree, this seems to pretty much override all my arguments. 
-- Dag Sverre From dalcinl at gmail.com Thu May 8 20:01:51 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 15:01:51 -0300 Subject: [Cython] working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <482322D5.3020600@behnel.de> References: <4822A413.8060902@behnel.de> <800C6DA2-4F48-4BE2-9647-623E09146A63@math.washington.edu> <482322D5.3020600@behnel.de> Message-ID: On 5/8/08, Stefan Behnel wrote: > I do not expect this to be finished with a single patch, so there will be > things that may break somewhere in between and there may be things that > require larger refactorings. Indeed. Although my patch is (and all future ones must be, IMHO) backward compatible at the C level, the best would be to have a new repo. > Keeping cython-devel somewhat stable and having a > separate branch where things can become unstable along the way helps us in > keeping up the ability to continue the normal development without being > directly impacted by Py3 work. > > The idea is to merge things over step-by-step when they prove to be correct > and "stable enough".
> > > Stefan > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Thu May 8 20:02:06 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 11:02:06 -0700 Subject: [Cython] A high-level discussion of the Cython internals In-Reply-To: <48233596.5020204@student.matnat.uio.no> References: <48233596.5020204@student.matnat.uio.no> Message-ID: On May 8, 2008, at 10:17 AM, Dag Sverre Seljebotn wrote: > This is in response to this statement by Robert: > > > -- > > I think this is good to talk about now. Basically, you want to change > Cython to use the "visitor pattern" rather than the recursive pattern > that it currently uses. This has pros and cons, but I think this > could be a good thing, and you are certainly convinced that its > needed to get started on the GSoC stuff. > > -- > > So, leaving the patch and implementation strategies aside for now I'm > going to attempt discussing this from the mountain top. (If a lot of > this seems trivial, all the better -- it can then start to serve as > documentation for a direction, rather than my "private" > efforts :-). If > not, it will be good to discuss it.) > > I honestly believe that any time spent discussing and implementing > this > now can repay itself many times later on in saved developer hours > and a > lower learning curve for new developers. If people agree with me in > how > important this is, I think it would be good to perhaps spend some time > at the beginning of dev1 talking and discussing this (with some slides > etc.).
+1 > > Here we go: > > Assertion 1: Development of Cython has a potential for becoming > stagnant > (if it hasn't already) quickly. In the past 18 months or so since I started first hacking on the code, development has increased at an exponential rate, far from becoming stagnant. This does not mitigate your point though--we need to have this discussion to minimize brittle interdependence, lessen the learning curve, and take things to the next level. > My (subjective) felling is that the codebase has a somewhat high > learning curve. If one wants to add a fundamentally new features (as > opposed to quick fixes, which do currently happen very quickly -- but > think type inference!) it looks like one has to be very careful to > keep > many different effects of one's changes in the head at the same time. > > > Assertion 2: The reason for this state can at least in some part be > attributed to some confusion as to what the current code tree *is*. > > I believe that the reason that the codebase can be practical in the > current state is tightly connected with how similar C and CPython is > with Python as a language. The design, so to speak, arose in an > historical happy accident, because the input and output are almost 1:1 > related. A Python function and a C function loosely correspond, and so > FuncDefNode can serve the role as both the parsed element and the > element due for C serialization. However there is no reason to assume > that the 1:1 correspondance holds everywhere; and I think there is a > tendency that "ugly hacks" already tend to arise in situations > where the > correspondance in program structure diverges more. (To consider how > this > might become worse, think about a "method template node".) > > If this was only a theoretically founded critisism one might simply > dismiss it, but I believe the problems in assertion 1 comes from this > very fact, and that they can be solved by dealing with this conceptual > problem. 
> > > The proposed solution: Work towards "The ideal solution" below in > tiny, > incremental steps. All *new* major features are implemented with the > ideal solution in mind, and old code refactored as far as is needed > for > that to happen, but one doesn't start rewriting working and bugfixed > code just for the sake of abstract purity. > > > The ideal solution: A strict pipeline-based approach where the "data" > (one or more trees, graphs or similar) is in different stages, with > different phases transforming one into the other. I.e, always try > to fit > what one does into the following example scheme: > > data stage:string of code -> phase:parsing -> > data stage:syntax tree -> phase:expand with statements -> > data stage:tree without with statements -> phase:create scopes -> > data stage:tree with scopes -> phase:analyse types -> > data stage:tree with ".type" attributes -> ... > ... > data stage:tree closely related to *structure* of C source -> > phase:output > > Whenever something doesn't quit fit in the scheme of naive input/ > output, > split it up into more phases until it does :-) > > (Digression: Closely related to C structure above means, for instance, > no inner functions. No with statements. No untyped variables. That > kind > of thing. Classes may be included though, since one can create a 1:1 > correspondance between C code using the CPython API and a class; so it > is not about language "features" as such but some deeper structure.) > > Note that all phases does not need to be visitors! For instance, the > output-to-C phase will stay a recursive process for the foreseeable > future. This is a way of thinking, not a recipe for implementation. > (And > indeed, this way of thinking is already followed to a degree -- but > not > everywhere, and we get problems at the spots where it is not.) > > A very important side-effect of this is that it lends itself to > "implementing complex Cython statements as more simple Cython > statements". 
(Though ideally "more simple Cython statements" can > also be > "intermediate simpler instructions" that bridges the gap between > Cython > and C in a more fine-grained way in order to give more code reuse than > there is today. But I digress.) > > (Implementation notes: This does not necesarrily mean that the classes > of the nodes changes between phases; one does whatever is most > convenient. For instance, if one has T2 = expand_with_nodes(T1), > then T1 > is allowed to keep WithStatementNode, T2 is however not, but otherwise > the trees are the same. In the same way, if T1 = type_analysis(T0), > then > ExprNodes in T0 is not allowed to have the type attribute, while > ExprNodes in T1 are required to have them. This can be a documentation > matter or enforced (in a number of ways), that issue is better left > for > later.). > > > Final words: This might seem trivial. It is not. The reason is that > currently, almost wherever you try to implement two simple phases f > and > g one ends up having the result of f depend on what is done in g for > some nodes, and the result of g depend on f for other nodes. And so > neither f(g(T)) nor g(f(T)) can be made to work. > > The point is, this way of changing how one thinks mostly affects the > compiler design as such, more than it is a specific style of coding. > Visitors happen to be the most natural way to implement this, but are > not the main point. I'd be happy (or, happier than I am today) with a > cleanly defined phase-based recursive process. +1 to this whole idea. The visitor pattern certainly has the advantage that one can add phases without changing every node (which is perhaps why, e.g., the "analyze" phase does so many things that could be logically separated). 
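To make the phase idea concrete, here is a rough sketch of a strict pipeline in which each phase is paired with a check on the data stage it is supposed to produce. It borrows the names `expand_with_nodes` and `type_analysis` from the quoted proposal, but the node classes, the `walk` helper, the checks, and the placeholder type are all assumptions made for illustration, not the existing Cython code.

```python
# Hypothetical node classes and phases, for illustration only.
class Node(object):
    def __init__(self, *children):
        self.children = list(children)
        self.type = None  # filled in by the type analysis phase


class WithStatementNode(Node):
    pass


class ExprNode(Node):
    pass


def walk(tree):
    # Yield every node in the tree, parents before children.
    yield tree
    for child in tree.children:
        for node in walk(child):
            yield node


def expand_with_nodes(tree):
    # Phase: the resulting tree contains no WithStatementNode.
    children = [expand_with_nodes(child) for child in tree.children]
    if isinstance(tree, WithStatementNode):
        return Node(*children)  # placeholder for the real expansion
    tree.children = children
    return tree


def type_analysis(tree):
    # Phase: every ExprNode gets a ".type" attribute.
    for node in walk(tree):
        if isinstance(node, ExprNode):
            node.type = "object"  # placeholder type
    return tree


def check_no_with(tree):
    for node in walk(tree):
        assert not isinstance(node, WithStatementNode)


def check_typed(tree):
    for node in walk(tree):
        if isinstance(node, ExprNode):
            assert node.type is not None


# Each entry pairs a phase with the invariant its output must satisfy.
PIPELINE = [
    (expand_with_nodes, check_no_with),
    (type_analysis, check_typed),
]


def run_pipeline(tree, pipeline=PIPELINE):
    for phase, check in pipeline:
        tree = phase(tree)
        check(tree)
    return tree
```

Adding a phase then means appending one entry to `PIPELINE` rather than touching every node class. Whether the invariants are enforced at run time like this or only documented is the open question raised in the quoted implementation notes.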
- Robert From dalcinl at gmail.com Thu May 8 20:12:12 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 15:12:12 -0300 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: References: <4822A413.8060902@behnel.de> Message-ID: On 5/8/08, Robert Bradshaw wrote: > Thanks. I briefly glanced at it and it looks good. What you probably > want to do is commit and then export the patch (that way it will have > some history attached, and you'll get credit for it.) OK. But perhaps better... Is it really hard for you to clone cython-devel and name it cython-devel-py3k, and give me commit access to only that repo? > One remark I had is that perhaps more care needs to be done with > "PyInt_CheckExact" as in Py2 this implies that the size fits into a > long, so it's not quite the same as PyLong_CheckExact. Good catch! I believe there is PyLong_FitsInLong or something similar to this. I have to check... > IMHO, all slots should be kept, or > > all unused in py3k should be removed. > > Could you elaborate a bit more on what you mean by "unused?" If > they're still in the struct, I'd think we would need to keep them so > it has the right size/layout, right? I mean that Python 3.0 sources are maintaining some slots in the struct for no good reason... for example, nb_hex and nb_oct. IMHO, they should be removed in core Python... But if they are not removed, of course we still have to fill the struct. > PyClass_New is gone, I've > > Filling the dict ahead of time might actually work better... Just a question: Why does Cython not implement old-style classes this way for Python 2.x? Does filling a dict with PyCFunctions and then calling PyClass_New not work (I've never tried)? > > I actually have no idea on this one--maybe there's some system call > that can give this information? Stefan would probably know better > than me. Anyway this is not really important at this point.
In the worst scenario, Python would not be able to find a source file for displaying traceback lines in a fancy way (like the one in IPython), but this is not a real problem. > > > > > > Regards, > > > > > > On 5/8/08, Stefan Behnel wrote: > >> Hi, > >> > >> > >> Lisandro Dalcin wrote: > >>> The last four hours (3:00 AM right now here at Argentina) I've been > >>> working in a patch for enabling Cython generate code working with > >>> Python 2.6 and Python 3.0. > >>> > >>> Until now, the generated code (at least for the full mpi4py project) > >>> compiles and link fine with no errors. > >>> > >>> However, I have a big problems I do not know how to 'fix' in > >>> Cython. > >>> It is related to unbound methods disapearing in Py3K. Then, normal > >>> python classes does not work, but the cdef ones are fine. > >> > >> > >> These are the kind of things that we planned to address at the > >> dev1 workshop. > >> The C-API of Python is expected to stabilize next month, I don't > >> know how > >> stable these parts currently are. > >> > >> > >> > >>> Is there any interest on this to go mainstream? > >> > >> > >> Sure, totally! > >> > >> > >> > >>> I was very > >>> conservative about the PyString/PyUnicode issue. The right one is > >>> used > >>> in a place-by-place base. Of course, because of this, I have to pass > >>> 'bytes' to MPI, and I get 'bytes' from the C calls. > >> > >> > >> Yes, I think this is the right way to deal with it. Python2 was > >> very lax in > >> terms of semantics here, so the two have to be separated on a > >> case-by-case basis. > >> > >> > >> > >>> Finally, I'm completelly sure that I've not fixed all the relevant > >>> parts, but this is IMHO a good starting point. > >> > >> > >> Is it one big patch or did you/can you split it up? > >> > >> As this is potentially a big change, trac is the wrong place to > >> discuss it. 
We > >> should put up an official Py3 branch that people can actively > >> work on without > >> impacting the main trunk, so that we can merge working stuff > >> gradually. Can > >> you send a bundle against cython-devel to me and Robert for now? > >> He can set it up. > >> > >> Stefan > >> _______________________________________________ > >> Cython-dev mailing list > >> Cython-dev at codespeak.net > >> http://codespeak.net/mailman/listinfo/cython-dev > >> > > > > > > -- > > Lisandro Dalc?n > > --------------- > > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > > > Tel/Fax: +54-(0)342-451.1594 > py3k.patch>_______________________________________________ > > > Cython-dev mailing list > > Cython-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Thu May 8 20:20:25 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 11:20:25 -0700 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: References: <4822A413.8060902@behnel.de> Message-ID: <2A65A78D-FAE9-47FB-940B-2DE0E0D2E6A6@math.washington.edu> On May 8, 2008, at 11:12 AM, Lisandro Dalcin wrote: > On 5/8/08, Robert Bradshaw wrote: >> Thanks. I briefly glanced at it and it looks good. 
What you probably >> want to do is commit and then export the patch (that way it will >> have >> some history attached, and you'll get credit for it.) > > OK. But perhaps better... It's really hard for you to clone > cython-devel and name it cython-devel-py3k, and give me commit access > to only that repo? Sure. I'll just need a htpasswd file so you can authenticate. (This will add you as a user to trac too.) >> One remark I had is that perhaps more care needs to be done with >> "PyInt_CheckExact" as in Py2 this implies that the size fits into a >> long, so it's not quite the same as PyLong_CheckExact. > > Good catch!. I believe there is PyLong_FitsInLong or something similar > to this. I have to check... Perhaps this should be handled at a higher level--i.e. we should see where in the source we're using PyInt_CheckExact, etc. and do something else there so we don't need the #defines... >> IMHO, all sloots should be keept, or >>> all unused in py3k should be removed. >> >> Could you elaborate a bit more on what you mean by "unused?" If >> they're still in the struct, I'd think we would need to keep them so >> it has the right size/layout, right? > > I mean that Python 3.0 sources are maintaining some slots in the > struct for no good reason... for example, nb_hex and nb_oct. IMHO, > they should be removed in core Python... But if they are not removed, > of course we still have to fill the struct. > >> PyClass_New is gone, I've >> >> Filling the dict ahead of time might actually work better... > > Just a question: Why Cython does not implement old-style classes this > way for Python2.X ? Filling a dict with PyCFunction's and next calling > PyClass_New does not work (I've never tried)? No idea, this is how Greg wrote it. > >> >> I actually have no idea on this one--maybe there's some system call >> that can give this information? Stefan would probably know better >> than me. > > Anyway this is not really important at this point. 
In the worst > scenario, Python would not be able to find a source file for > displaying traceback lines in a fancy way (like the one in IPython), > but this is not a real problem. Having good tracebacks is important to me at least, hopefully we'll find a way to do it. - Robert From dalcinl at gmail.com Thu May 8 20:33:31 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 15:33:31 -0300 Subject: [Cython] A high-level discussion of the Cython internals In-Reply-To: <48233596.5020204@student.matnat.uio.no> References: <48233596.5020204@student.matnat.uio.no> Message-ID: Definitely +1 on all your comments ... Being a new cython user with a rather good knowledge of Python at the C level, and having a serious project (at least, I believe it is serious) already working in a couple of weeks, the really BIG, BIG problem I have with Cython is that it is almost impossible for me to learn and follow the internals in order to help Cython improve. Any step in this direction would be very well welcome. And Robert is right, Cython is not stagnat, but I have the feel that all power users are depending too much on core developers. For example, after trying to add Py3K compat, I cannot have any idea of the goodness of this work, how much has to be done, or what is broken. And that should definitelly not happen. The more understandable the code base is, less work for core developers to review contributions from other 'trusted' developers. But at this point, I do not even trust on myself when hacking inside Cython. On 5/8/08, Dag Sverre Seljebotn wrote: > This is in response to this statement by Robert: > > > -- > > I think this is good to talk about now. Basically, you want to change > Cython to use the "visitor pattern" rather than the recursive pattern > that it currently uses. This has pros and cons, but I think this > could be a good thing, and you are certainly convinced that its > needed to get started on the GSoC stuff. 
> > -- > > So, leaving the patch and implementation strategies aside for now I'm > going to attempt discussing this from the mountain top. (If a lot of > this seems trivial, all the better -- it can then start to serve as > documentation for a direction, rather than my "private" efforts :-). If > not, it will be good to discuss it.) > > I honestly believe that any time spent discussing and implementing this > now can repay itself many times later on in saved developer hours and a > lower learning curve for new developers. If people agree with me in how > important this is, I think it would be good to perhaps spend some time > at the beginning of dev1 talking and discussing this (with some slides > etc.). > > Here we go: > > Assertion 1: Development of Cython has a potential for becoming stagnant > (if it hasn't already) quickly. > > My (subjective) felling is that the codebase has a somewhat high > learning curve. If one wants to add a fundamentally new features (as > opposed to quick fixes, which do currently happen very quickly -- but > think type inference!) it looks like one has to be very careful to keep > many different effects of one's changes in the head at the same time. > > > Assertion 2: The reason for this state can at least in some part be > attributed to some confusion as to what the current code tree *is*. > > I believe that the reason that the codebase can be practical in the > current state is tightly connected with how similar C and CPython is > with Python as a language. The design, so to speak, arose in an > historical happy accident, because the input and output are almost 1:1 > related. A Python function and a C function loosely correspond, and so > FuncDefNode can serve the role as both the parsed element and the > element due for C serialization. 
However there is no reason to assume > that the 1:1 correspondance holds everywhere; and I think there is a > tendency that "ugly hacks" already tend to arise in situations where the > correspondance in program structure diverges more. (To consider how this > might become worse, think about a "method template node".) > > If this was only a theoretically founded critisism one might simply > dismiss it, but I believe the problems in assertion 1 comes from this > very fact, and that they can be solved by dealing with this conceptual > problem. > > > The proposed solution: Work towards "The ideal solution" below in tiny, > incremental steps. All *new* major features are implemented with the > ideal solution in mind, and old code refactored as far as is needed for > that to happen, but one doesn't start rewriting working and bugfixed > code just for the sake of abstract purity. > > > The ideal solution: A strict pipeline-based approach where the "data" > (one or more trees, graphs or similar) is in different stages, with > different phases transforming one into the other. I.e, always try to fit > what one does into the following example scheme: > > data stage:string of code -> phase:parsing -> > data stage:syntax tree -> phase:expand with statements -> > data stage:tree without with statements -> phase:create scopes -> > data stage:tree with scopes -> phase:analyse types -> > data stage:tree with ".type" attributes -> ... > ... > data stage:tree closely related to *structure* of C source -> phase:output > > Whenever something doesn't quit fit in the scheme of naive input/output, > split it up into more phases until it does :-) > > (Digression: Closely related to C structure above means, for instance, > no inner functions. No with statements. No untyped variables. That kind > of thing. 
Classes may be included though, since one can create a 1:1 > correspondance between C code using the CPython API and a class; so it > is not about language "features" as such but some deeper structure.) > > Note that all phases does not need to be visitors! For instance, the > output-to-C phase will stay a recursive process for the foreseeable > future. This is a way of thinking, not a recipe for implementation. (And > indeed, this way of thinking is already followed to a degree -- but not > everywhere, and we get problems at the spots where it is not.) > > A very important side-effect of this is that it lends itself to > "implementing complex Cython statements as more simple Cython > statements". (Though ideally "more simple Cython statements" can also be > "intermediate simpler instructions" that bridges the gap between Cython > and C in a more fine-grained way in order to give more code reuse than > there is today. But I digress.) > > (Implementation notes: This does not necesarrily mean that the classes > of the nodes changes between phases; one does whatever is most > convenient. For instance, if one has T2 = expand_with_nodes(T1), then T1 > is allowed to keep WithStatementNode, T2 is however not, but otherwise > the trees are the same. In the same way, if T1 = type_analysis(T0), then > ExprNodes in T0 is not allowed to have the type attribute, while > ExprNodes in T1 are required to have them. This can be a documentation > matter or enforced (in a number of ways), that issue is better left for > later.). > > > Final words: This might seem trivial. It is not. The reason is that > currently, almost wherever you try to implement two simple phases f and > g one ends up having the result of f depend on what is done in g for > some nodes, and the result of g depend on f for other nodes. And so > neither f(g(T)) nor g(f(T)) can be made to work. 
>
> The point is, this way of changing how one thinks mostly affects the
> compiler design as such, more than it is a specific style of coding.
> Visitors happen to be the most natural way to implement this, but are
> not the main point. I'd be happy (or, happier than I am today) with a
> cleanly defined phase-based recursive process.
>
>
> --
> Dag Sverre
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Thu May 8 22:35:30 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 8 May 2008 17:35:30 -0300
Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0
In-Reply-To: <2A65A78D-FAE9-47FB-940B-2DE0E0D2E6A6@math.washington.edu>
References: <4822A413.8060902@behnel.de>
 <2A65A78D-FAE9-47FB-940B-2DE0E0D2E6A6@math.washington.edu>
Message-ID:

On 5/8/08, Robert Bradshaw wrote:
> On May 8, 2008, at 11:12 AM, Lisandro Dalcin wrote:
> Sure. I'll just need a htpasswd file so you can authenticate. (This
> will add you as a user to trac too.)

Then you will send me the password, right?

>
> >> "PyInt_CheckExact" as in Py2 this implies that the size fits into a
> >> long, so it's not quite the same as PyLong_CheckExact.
>
> Perhaps this should be handled at a higher level--i.e. we should see
> where in the source we're using PyInt_CheckExact, etc. and do
> something else there so we don't need the #defines...

You are right. But I do not know how to do that right now, so
this is for the near future...
> > Just a question: Why Cython does not implement old-style classes this
> > way for Python2.X ? Filling a dict with PyCFunction's and next calling
> > PyClass_New does not work (I've never tried)?
>
>
> No idea, this is how Greg wrote it.

Well, could you ask Greg to post some comment about this? This way, I
can follow the right path trying to fix this for the Py3K case...

> Having good tracebacks is important to me at least, hopefully we'll
> find a way to do it.

BTW, I've tried to install the source files alongside the generated
extension modules, but IPython does not show the traceback. Could you
give me a tip about this? Please note that my *.so extension module
is inside a dir with a __init__.py file, so it is actually a module
inside a package...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Thu May 8 22:46:39 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 8 May 2008 13:46:39 -0700
Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0
In-Reply-To:
References: <4822A413.8060902@behnel.de>
 <2A65A78D-FAE9-47FB-940B-2DE0E0D2E6A6@math.washington.edu>
Message-ID:

On May 8, 2008, at 1:35 PM, Lisandro Dalcin wrote:

> On 5/8/08, Robert Bradshaw wrote:
>> On May 8, 2008, at 11:12 AM, Lisandro Dalcin wrote:
>> Sure. I'll just need a htpasswd file so you can authenticate. (This
>> will add you as a user to trac too.)
>
> Then you will send me the password, right?

An htpasswd file is a file that contains a username and (hashed)
password. You can do

htpasswd -c -s somefile username

(available on most systems) and it will prompt for a password to
associate with username and store the SHA-1 hash.
Then send me the somefile it created and I'll add it to the list. See man htpasswd for more details. > >> >>>> "PyInt_CheckExact" as in Py2 this implies that the size fits >>>> into a >>>> long, so it's not quite the same as PyLong_CheckExact. >> >> Perhaps this should be handled at a higher level--i.e. we should see >> where in the source we're using PyInt_CheckExact, etc. and do >> something else there so we don't need the #defines... > > You are right. But then I do not know how to do that right now, so > this is for the near future... > >>> Just a question: Why Cython does not implement old-style classes >>> this >>> way for Python2.X ? Filling a dict with PyCFunction's and next >>> calling >>> PyClass_New does not work (I've never tried)? >> >> >> No idea, this is how Greg wrote it. > > Well, could you ask Greg to post some comment about this? This way, I > can follow the right path trying to fix this for the Py3K case... I believe on this list now :-) >> Having good tracebacks is important to me at least, hopefully we'll >> find a way to do it. > > BTW, I've tried to install the source files alongside the generated > extensions modules, but IPython does not show the traceback. Could you > give me a tip about this? Please note that my *.so extension module > is inside a dir with a __init__.py file, then it is actually a module > inside a package... Yeah, I'm not sure what mechanism IPython uses to look up source files, but something needs to be fixed there (either on our side or theirs). You can set it to not give context in which case it will give a traceback. - Robert From dalcinl at gmail.com Thu May 8 22:48:50 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 8 May 2008 17:48:50 -0300 Subject: [Cython] Abandon support for Python 2.3? 
In-Reply-To: <48233DE9.90800@student.matnat.uio.no>
References: <47797395-6F61-44F6-B232-8C3F9CC751E4@math.washington.edu>
 <3d375d730805080958q65d7cd39v22d9f1cd86e1ba5d@mail.gmail.com>
 <482337F3.8000105@behnel.de>
 <3d375d730805081030w35c0c532xbc6eb9fb44f3c755@mail.gmail.com>
 <48233DE9.90800@student.matnat.uio.no>
Message-ID:

Dag, suppose the only 2.4 feature you really need is @decorators. Then
do you believe it would be too hard to also implement a code
conversion tool transforming the Cython 2.4 sources to valid 2.3 code?

On 5/8/08, Dag Sverre Seljebotn wrote:
>
> > This is a valid point. Also, as we work (via Dag's GSoC project for
> > example) to make it easier to use numpy effectively from Cython, it
> > does seem kind of odd to say "you need 2.3 to use NumPy, but 2.4 to
> > use NumPy+Cython."
>
> I agree, this seems to pretty much override all my arguments.

I'm still not convinced... as long as Cython is used as a command line
tool, you do not need Cython to depend on 2.3. Of course, if you can
develop a code conversion tool for decorators, then go on with 2.4!!!

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Fri May 9 03:55:40 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 09 May 2008 13:55:40 +1200
Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0
In-Reply-To:
References: <4822A413.8060902@behnel.de>
Message-ID: <4823AF1C.5090900@canterbury.ac.nz>

Robert Bradshaw wrote:

> Filling the dict ahead of time might actually work better...one would
> execute the body in a temporary scope and then construct the class.
> This would bring things more in line with how Python does things.

That's probably the right thing to do. I can't remember exactly,
but I think the main reason I created the class first was so
that I could use the same code to access a name as I was already
using for module variables. If you can remember somehow that you
need to use dict access instead of attribute access, I don't think
there should be any problem.

-- Greg

From greg.ewing at canterbury.ac.nz Fri May 9 04:01:25 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 09 May 2008 14:01:25 +1200
Subject: [Cython] PyPy parser
In-Reply-To: <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu>
References: <48215385.50207@student.matnat.uio.no>
 <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
 <482161DF.9010108@student.matnat.uio.no>
 <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu>
Message-ID: <4823B075.5000903@canterbury.ac.nz>

Robert Bradshaw wrote:

> A simple grammar -> parser module like you've talked about before
> seems ideal, and would probably be fairly compact.

Before diving into this, there are a couple of things
you should take into account:

1. A parser generator of the kind used by CPython is
   probably not capable of parsing C declarations without
   doing some hackery somewhere.

2. There is computation embedded in the current Pyrex
   parser in various places that you would need to track
   down and move somewhere else.
-- Greg From greg.ewing at canterbury.ac.nz Fri May 9 04:13:21 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 09 May 2008 14:13:21 +1200 Subject: [Cython] __getattribute__ In-Reply-To: References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com> Message-ID: <4823B341.8070209@canterbury.ac.nz> I've been thinking some more about __getattribute__, and I think that rather than trying to emulate the Python semantics, the right thing to do is simply to expose the C type slot directly. This would be in keeping with the way the other type slots are handled. The philosophy is that when defining an extension type, you should have as much control and as little overhead in the way as possible. The only places where I've done things differently are where there isn't a directly corresponding C slot, such as __getattr__. Since __getattribute__ corresponds most closely to tp_getattr, it should just fill it directly. -- Greg From robertwb at math.washington.edu Fri May 9 04:25:21 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 19:25:21 -0700 Subject: [Cython] PyPy parser In-Reply-To: <4823B075.5000903@canterbury.ac.nz> References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no> <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu> <4823B075.5000903@canterbury.ac.nz> Message-ID: On May 8, 2008, at 7:01 PM, Greg Ewing wrote: > Robert Bradshaw wrote: > >> A simple grammar -> parser module like you've talked about before >> seems ideal, and would probably be fairly compact. > > Before diving into this, there are a couple of things > you should take into account: > > 1. 
A parser generator of the kind used by CPython is > probably not capable of parsing C declarations without > doing some hackery somewhere. I think the C declarations are representable by a generative grammar, so this method could work. > 2. There is computation embedded in the current Pyrex > parser in various places that you would need to track > down and move somewhere else. Yes, I've seen this too. I think most people who have proposed replacing the parser have underestimated the complexity and subtleties of parsing Pyrex/Cython compared to pure Python. On the surface it is very close to Python, but those parts that differ (e.g. type declaration) can be very complicated. That being said I think it can be done (I don't want to discourage anyone, there are some very capable people on this list) and with Python 3.0 on the horizon significant changes will need to be made one way or another. Other than that the current parser works very well. - Robert From robertwb at math.washington.edu Fri May 9 04:30:03 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 8 May 2008 19:30:03 -0700 Subject: [Cython] __getattribute__ In-Reply-To: <4823B341.8070209@canterbury.ac.nz> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com> <4823B341.8070209@canterbury.ac.nz> Message-ID: <9E4E3C4E-DD03-4536-8328-838CB4A19676@math.washington.edu> On May 8, 2008, at 7:13 PM, Greg Ewing wrote: > I've been thinking some more about __getattribute__, and I think > that rather than trying to emulate the Python semantics, the > right thing to do is simply to expose the C type slot directly. > > This would be in keeping with the way the other type slots > are handled. The philosophy is that when defining an extension > type, you should have as much control and as little overhead > in the way as possible. 
I think this is a philosophical difference between Cython and Pyrex-- in Cython we want to stick to Python semantics as closely as possible so the user doesn't even have to think about whether they're writing an extension class or not. Eventually one should be able to take any .py file, run Cython on it, and have it behave exactly the same. If this is not the case than it is a bug. (I'm not sure exactly what to do about arithmetic though--that might be one exception.) > The only places where I've done things differently are where > there isn't a directly corresponding C slot, such as __getattr__. > Since __getattribute__ corresponds most closely to tp_getattr, > it should just fill it directly. In terms of efficiency, __getattribute__ is called at the top, and on success tp_getattr returns immediately, so it should be just about as fast. - Robert From greg.ewing at canterbury.ac.nz Fri May 9 05:24:28 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 09 May 2008 15:24:28 +1200 Subject: [Cython] __getattribute__ In-Reply-To: <9E4E3C4E-DD03-4536-8328-838CB4A19676@math.washington.edu> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com> <4823B341.8070209@canterbury.ac.nz> <9E4E3C4E-DD03-4536-8328-838CB4A19676@math.washington.edu> Message-ID: <4823C3EC.4080306@canterbury.ac.nz> Robert Bradshaw wrote: > I think this is a philosophical difference between Cython and Pyrex-- > ... Eventually one should be able to take > any .py file, run Cython on it, and have it behave exactly the same. Actually, it would, because a pure Python file isn't going to contain any extension classes, and whatever else is done, __getattribute__ on a Python class should follow Python semantics. But this is certainly a difference in goals between Pyrex and Cython. 
The goal of Pyrex is not to compile Python efficiently, but to enable efficient bridging between Python and C. > In terms of efficiency, __getattribute__ is called at the top, and on > success tp_getattr returns immediately, so it should be just about as > fast. To my mind, the semantic difference is actually a feature -- it lets you completely control what it means to look up an attribute of your object. A compromise might be to find another name for the method that directly fills tp_getattr. But then we'd have *three* variants of attribute-getting methods to keep straight... not sure if that would be an improvement or not. -- Greg From pete at petertodd.org Fri May 9 05:26:12 2008 From: pete at petertodd.org (Peter Todd) Date: Thu, 8 May 2008 23:26:12 -0400 Subject: [Cython] __getattribute__ In-Reply-To: <4823B341.8070209@canterbury.ac.nz> References: <47FE4418.40703@student.matnat.uio.no> <47FF20EC.4040803@behnel.de> <482293B5.3080802@behnel.de> <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu> <48231CA1.5080702@googlemail.com> <4823B341.8070209@canterbury.ac.nz> Message-ID: <20080509032612.GB27761@tilt> On Fri, May 09, 2008 at 02:13:21PM +1200, Greg Ewing wrote: > I've been thinking some more about __getattribute__, and I think > that rather than trying to emulate the Python semantics, the > right thing to do is simply to expose the C type slot directly. > > This would be in keeping with the way the other type slots > are handled. The philosophy is that when defining an extension > type, you should have as much control and as little overhead > in the way as possible. > > The only places where I've done things differently are where > there isn't a directly corresponding C slot, such as __getattr__. > Since __getattribute__ corresponds most closely to tp_getattr, > it should just fill it directly. 

If no __getattr__ is defined, __getattribute__ is essentially a
tp_getattr slot filler:

static PyObject *__pyx_tp_getattro_3foo_foo(PyObject *o, PyObject *n) {
    PyObject *v = __pyx_pf_3foo_3foo___getattribute__(o, n);
    return v;
}

One level of indirection, but that's just an artifact of the current
implementation that the compiler might very well optimize out anyway.

--
http://petertodd.org 'peter'[:-1]@petertodd.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080508/e816f6fe/attachment.pgp

From greg.ewing at canterbury.ac.nz Fri May 9 05:33:03 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 09 May 2008 15:33:03 +1200
Subject: [Cython] __getattribute__
In-Reply-To: <20080509032612.GB27761@tilt>
References: <47FE4418.40703@student.matnat.uio.no>
 <47FF20EC.4040803@behnel.de>
 <482293B5.3080802@behnel.de>
 <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu>
 <48231CA1.5080702@googlemail.com>
 <4823B341.8070209@canterbury.ac.nz>
 <20080509032612.GB27761@tilt>
Message-ID: <4823C5EF.6050700@canterbury.ac.nz>

Peter Todd wrote:

> If no __getattr__ is defined, __getattribute__ is essentially a
> tp_getattr slot filler:
>
> static PyObject *__pyx_tp_getattro_3foo_foo(PyObject *o, PyObject *n) {
>     PyObject *v = __pyx_pf_3foo_3foo___getattribute__(o, n);
>     return v;
> }

What if a __getattr__ is inherited from a base class?

What if this class doesn't define a __getattr__, but
another class that inherits from it does, and doesn't
override __getattribute__?
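
As a reference point, the two situations asked about above behave as follows for ordinary Python classes. This is a minimal sketch: the class names are invented for illustration, and it says nothing about what the C code generated for an extension type does.

```python
class FallbackBase:
    # Only the base class provides the __getattr__ fallback.
    def __getattr__(self, name):
        return "fallback:" + name


class Strict(FallbackBase):
    # Case 1: __getattribute__ is defined here, __getattr__ is inherited.
    def __getattribute__(self, name):
        if name.startswith("hidden"):
            raise AttributeError(name)
        return object.__getattribute__(self, name)


class Base:
    # Only __getattribute__ is defined; there is no __getattr__ at all.
    def __getattribute__(self, name):
        if name == "blocked":
            raise AttributeError(name)
        return object.__getattribute__(self, name)


class Derived(Base):
    # Case 2: the subclass adds __getattr__ and inherits __getattribute__.
    def __getattr__(self, name):
        return "fallback:" + name


s = Strict()
s.real = 1
assert s.real == 1                           # found by __getattribute__
assert s.hidden_x == "fallback:hidden_x"     # AttributeError -> inherited __getattr__

d = Derived()
assert d.blocked == "fallback:blocked"       # inherited __getattribute__ raised
assert d.missing == "fallback:missing"       # normal lookup failed

try:
    Base().blocked
except AttributeError:
    pass                                     # no fallback without __getattr__
else:
    raise AssertionError("expected AttributeError")
```

In both cases the fallback runs whenever the class or any of its bases defines __getattr__, regardless of which class supplied __getattribute__. A slot that only forwards to __getattribute__ would not reproduce this.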

-- Greg

From pete at petertodd.org Fri May 9 05:51:05 2008
From: pete at petertodd.org (Peter Todd)
Date: Thu, 8 May 2008 23:51:05 -0400
Subject: [Cython] __getattribute__
In-Reply-To: <4823C5EF.6050700@canterbury.ac.nz>
References: <47FF20EC.4040803@behnel.de>
 <482293B5.3080802@behnel.de>
 <5447846C-35D8-4E52-98BA-AB0F0082E311@math.washington.edu>
 <48231CA1.5080702@googlemail.com>
 <4823B341.8070209@canterbury.ac.nz>
 <20080509032612.GB27761@tilt>
 <4823C5EF.6050700@canterbury.ac.nz>
Message-ID: <20080509035105.GC27761@tilt>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, May 09, 2008 at 03:33:03PM +1200, Greg Ewing wrote:
> Peter Todd wrote:
>
> > If no __getattr__ is defined, __getattribute__ is essentially a
> > tp_getattr slot filler:
> >
> > static PyObject *__pyx_tp_getattro_3foo_foo(PyObject *o, PyObject *n) {
> >     PyObject *v = __pyx_pf_3foo_3foo___getattribute__(o, n);
> >     return v;
> > }
>
> What if a __getattr__ is inherited from a base class?
>
> What if this class doesn't define a __getattr__, but
> another class that inherits from it does, and doesn't
> override __getattribute__?

My simplistic idea of the world breaks? :)

No, you are totally right. A __tp_getattro__ special method could be
defined; it'd be about 15 lines of code I think, plus another test case,
including error messages if __tp_getattro__ and either __getattribute__
or __getattr__ are defined.

I'd be happy to do it, but I think it'd be better to think about other
cases like it first.
-- http://petertodd.org 'peter'[:-1]@petertodd.org From stefan_ml at behnel.de Fri May 9 10:25:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 10:25:14 +0200 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: References: <4822A413.8060902@behnel.de> Message-ID: <48240A6A.7040700@behnel.de> Hi, skipping through the patch, the changes seem to make sense in general. I'll have to look through it a bit more thoroughly, but not today. There are still a couple of places where you mix up strings (i.e. "bytes") and unicode, like this: ------------------------ @@ -4309,7 +4336,11 @@ static int __Pyx_InitStrings(__Pyx_Strin if (t->is_unicode) { *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL); } else { + #if PY_MAJOR_VERSION < 3 *t->p = PyString_FromStringAndSize(t->s, t->n - 1); + #else + *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1); + #endif } if (!*t->p) return -1; ------------------------ The original code already handles unicode strings properly. For Py3 source code, this has to be adapted in the Cython parser, not in the generated code (i.e. parse "abc" as if it was u"abc" and b"abc" like "abc"). For current Cython code, this should just work unchanged - assuming you import the string compatibility header file that #defines PyString_* as PyBytes_*. Although I might want to see plain PyBytes_* calls generated here - not sure yet. Robert Bradshaw wrote: > On May 8, 2008, at 8:33 AM, Lisandro Dalcin wrote: >> - Finally, in the very clever part of traceback hackery, the call to >> Py_CodeNew receives 'filenames' created with PyUnicode_AsString.
I do >> not know what the C preprocessor uses for encoding __FILE__ macro, if >> it's always ASCII, then all is fine, if not, the filesystem encoding >> should be taken into account. > > I actually have no idea on this one--maybe there's some system call > that can give this information? Stefan would probably know better > than me. Not sure either. I expect __FILE__ to be in the platform specific filesystem encoding, so decoding to unicode can easily fail. We should store the UTF-8 encoded filename in the file so that it can be read instead of __FILE__. Stefan From stefan_ml at behnel.de Fri May 9 11:21:07 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 11:21:07 +0200 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <48240A6A.7040700@behnel.de> References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> Message-ID: <48241783.5080002@behnel.de> Hi again, Stefan Behnel wrote: > ------------------------ > @@ -4309,7 +4336,11 @@ static int __Pyx_InitStrings(__Pyx_Strin > if (t->is_unicode) { > *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL); > } else { > + #if PY_MAJOR_VERSION < 3 > *t->p = PyString_FromStringAndSize(t->s, t->n - 1); > + #else > + *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1); > + #endif > } > if (!*t->p) > return -1; > ------------------------ I think this should read if (t->is_unicode) { #if PY_MAJOR_VERSION < 3 *t->p = PyUnicode_DecodeUTF8(t->s, t->n - 1, NULL); #else *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1); #endif } else { *t->p = PyString_FromStringAndSize(t->s, t->n - 1); } Also this: ------------------------ @@ -4289,7 +4312,11 @@ static int __Pyx_InternStrings(__Pyx_Int """,""" static int __Pyx_InternStrings(__Pyx_InternTabEntry *t) { while (t->p) { + #if PY_MAJOR_VERSION < 3 *t->p = PyString_InternFromString(t->s); + #else + *t->p = PyUnicode_InternFromString(t->s); + #endif if (!*t->p) return -1; ++t; ------------------------ The thing
here is that we currently do not intern unicode strings at all, so this must continue to return byte strings. The actual problem should be fixed in the compiler, which should know how to distinguish byte strings from unicode strings in its interned string dictionary, and generate similar code as for the normal string table (i.e. with a "unicode" flag). See the add_py_string() method in Symtab.py for a start. I noticed that cython-devel-py3 is up, but I'll wait for Lisandro to commit his patch before I start working on it. Stefan From robertwb at math.washington.edu Fri May 9 11:53:58 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 9 May 2008 02:53:58 -0700 Subject: [Cython] [Pyrex] Callbacks In-Reply-To: <66a8c45d0805090242lb47b651p388945a78e329ff7@mail.gmail.com> References: <66a8c45d0805090242lb47b651p388945a78e329ff7@mail.gmail.com> Message-ID: <86D7288D-E9D9-4EC4-836D-B37524C135E7@math.washington.edu> On May 9, 2008, at 2:42 AM, Daniele Pianu wrote: [...] > Is there a way to solve the problem without importing anything from > Python.h? No, unless you want to cache the func/data manually somewhere else. Also note that this may cause a memory leak, as the reference count will never be decremented. - Robert From dagss at student.matnat.uio.no Fri May 9 12:13:24 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 09 May 2008 12:13:24 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) Message-ID: <482423C4.9040005@student.matnat.uio.no> I think there's a fundamental flaw in my NumPy proposal, which needs correction. I've thought about it as "polishing the access to the extension type", so that after cdef object x = numpy.zeros([3,3]) cdef numpy.ndarray y = x y will behave efficiently when treated like a Python object.
There are problems with this way of thinking though: "print y.strides" will not work as it is a pointer-array, while "print x.strides" will work as it is a tuple (one has to do "print (<object>y).strides", which is not going to fly with the users of my NumPy project). On the other hand, one has to remember to do "print y.shape" for tuple-access but "y.dimensions[i]" for speedy, non-Python access. (And this difference in behaviour comes entirely from strides having a name-clash while shape/dimensions do not). So while my approach has been to make the y variable act "more like a numpy array", I now think this is flawed. Optimizations through typing and access to extension structs should probably instead be treated as fundamentally different things, and the typical NumPy user shouldn't deal with a reference to the extension struct even if typing for speed is wanted. I'll now propose a solution for this. It has been proposed before in a different form (pxd shadowing etc.); I hope I succeed better now. I think that what is wanted here is to /speed up how NumPy objects are accessed/, and the extension type only comes into it peripherally. So I'd like a new syntax for modifying the compile-time behaviour of how objects are treated. It could look something like the following. I'm calling the keyword "compiletimefeatures" but that is for lack of a better word, also I use cython_ndarray to avoid namespace clashes (have ideas for allowing "ndarray" directly but I'd like to leave that out of the discussion for now). Anyway, numpy.ndarray is an extension type like before, while cython_ndarray is a new "type specifier" providing extra compile-time optimizations to the variable that carries its type. numpy.pxd: cdef extern from "numpy/arrayobject.h": ctypedef class numpy.ndarray [object PyArrayObject]: cdef char *data cdef int nd cdef Py_intptr_t *dimensions cdef Py_intptr_t *strides ...
compiletimefeatures cython_ndarray: def __applyfeatures__(self): if not isinstance(self, numpy.ndarray): raise TypeError(...) @property cdef shape(self): cdef numpy.ndarray imp = self return (imp.dimensions[idx] for idx in range(imp.nd)) # See my gsoc project for __getitem__, just replace "self" with "imp" ... User code: cdef cython_ndarray y = x print y.shape So, what happens: * cython_ndarray is registered as a new "cdef-able" type. * When assigning something to y, a runtime call to __applyfeatures__ is automatically added (i.e. in principle the same thing that happens with extension type declarations, but more specific and versatile) * When y is operated on, it is first checked whether cython_ndarray contains any compile time optimizations. If so, they are called (using y as the "self", like a class, however "self" is a Python object!) * End result: Rather than having to hope for some good luck in the namespace resolution towards the extension type struct to provide the speedup, one can be explicit about it. Note: The "optimizations" provided above will only be faster if Cython gets quite sophisticated unrolling/optimization. While I'd like to have this, a more declarative approach might be more realistic. So something like: compiletimefeatures cython_ndarray: carray_or_tuple shape(x): extension_type(x).shape Sorry, couldn't think of a good declarative syntax now :-) Note that this is instead of the plans to allow inlineable code in extension type declarations. Extension type declarations wouldn't have to be touched at all then. Flame away :-) (yes, I see that this could be confusing to OOP. However it is no worse than the current situation, one cannot really override extension type struct items either.)
-- Dag Sverre From stefan_ml at behnel.de Fri May 9 13:50:15 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 13:50:15 +0200 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <48240A6A.7040700@behnel.de> References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> Message-ID: <48243A77.9040008@behnel.de> Hi, Stefan Behnel wrote: > For current Cython code, this should just work unchanged - assuming you import > the string compatibility header file that #defines PyString_* as PyBytes_*. sorry for the confusion: there is no special header involved. The current API in Py3.0a5 uses these names: bytearray -> PyBytes_* bytes -> PyString_* str -> PyUnicode_* Described here http://www.python.org/dev/peps/pep-3137/ with a summary here: http://www.python.org/dev/peps/pep-3137/#summary We should eventually special case bytearray(b'') literals, although that is not strictly required. It's more important to actually support these types and the coercion between them first. ;) Stefan From greg.ewing at canterbury.ac.nz Fri May 9 13:57:21 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 09 May 2008 23:57:21 +1200 Subject: [Cython] ANN: Pyrex 0.9.7 Message-ID: <48243C21.2020207@canterbury.ac.nz> Pyrex 0.9.7 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Highlights of this version: * I have streamlined the integer for-loop syntax. Instead of the loop variable redundantly appearing in two places, it's now just for x < i < y: ... * If you declare a variable as a list or dict, then calls to some of its methods will be compiled into type-specific Python API calls instead of generic ones. * Most built-in constants are referenced directly instead of via dict lookup. * There are two new builtin functions, typecheck() and issubtype(), for checking the types of arguments more safely (since isinstance and issubclass can be fooled). What is Pyrex? 
-------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. From greg.ewing at canterbury.ac.nz Fri May 9 14:20:04 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 10 May 2008 00:20:04 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <482423C4.9040005@student.matnat.uio.no> References: <482423C4.9040005@student.matnat.uio.no> Message-ID: <48244174.10506@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > And this > difference in behaviour comes entirely from strides having a name-clash > while shape/dimensions do not. I missed the beginning of this, so I'm not sure exactly what problem is being discussed here, but I suspect you're making all this much more complicated than it needs to be. If a name clash is the only problem, it should be easily resolvable by using a C name declaration to give the offending attribute a different name in the Pyrex/Cython code. -- Greg From dagss at student.matnat.uio.no Fri May 9 14:52:06 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 09 May 2008 14:52:06 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <48244174.10506@canterbury.ac.nz> References: <482423C4.9040005@student.matnat.uio.no> <48244174.10506@canterbury.ac.nz> Message-ID: <482448F6.2050600@student.matnat.uio.no> An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20080509/bed2e4bd/attachment.htm From stefan_ml at behnel.de Fri May 9 17:21:52 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 17:21:52 +0200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <48243C21.2020207@canterbury.ac.nz> References: <48243C21.2020207@canterbury.ac.nz> Message-ID: <48246C10.6050703@behnel.de> Hi Greg, Greg Ewing wrote: * I have streamlined the integer for-loop syntax. 
Instead > of the loop variable redundantly appearing in two places, > it's now just > > for x < i < y: > ... while this is shorter, it also breaks existing code - although I assume you left the old syntax in? Also, I'm not sure this is more readable if x is non trivial. > * If you declare a variable as a list or dict, then calls > to some of its methods will be compiled into type-specific > Python API calls instead of generic ones. Cython has a bit of that also. This would be something to bring back in line. > * Most built-in constants are referenced directly instead of > via dict lookup. Again, something like that is in Cython as well, but I'm not up-to-date with the details. Probably works different... > * There are two new builtin functions, typecheck() and > issubtype(), for checking the types of arguments more safely > (since isinstance and issubclass can be fooled). Could you elaborate on the actual problem here? I also noticed you have a couple of new test cases. Could you imagine using our doctest based test runner as well? That way, we could at least share the test suites and see more easily which problems apply to both code bases. http://hg.cython.org/cython-devel/file/tip/runtests.py Stefan From dalcinl at gmail.com Fri May 9 18:20:50 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 9 May 2008 13:20:50 -0300 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <48243A77.9040008@behnel.de> References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> Message-ID: Well, at this point I'm a bit confused... The unicode stuff is the weakest point in my knowledge of Python internals... Anyway, I will not be able to work on this in the next two weeks, so if anyone wants to go on and give a try, just start applying the patch to 'cython-devel-py3' repo (I'm not concerned at all about credits for the original patch ;-) ) in order this does not stagnate. 
At any point, I can contribute testing against my project, just ask me for help. Finally, IMHO Cython pyx-level and generated C-level code should work with any Python version. So, if in a pyx file I wrote "abc", then, by default, it should be a ascii string for Py2 and a unicode one for Py3. And Cython should grow a new special 'char' type (like it has for bint) to specify ascii string. Then, at the C-level, and depending on Python major version, the proper conversion py->c or c->py can be managed. If not, wrapping legacy C libs and targeting Py2 and Py3 is going to be a nightmare. On 5/9/08, Stefan Behnel wrote: > Hi, > > > Stefan Behnel wrote: > > For current Cython code, this should just work unchanged - assuming you import > > the string compatibility header file that #defines PyString_* as PyBytes_*. > > > sorry for the confusion: there is no special header involved. The current API > in Py3.0a5 uses these names: > > bytearray -> PyBytes_* > bytes -> PyString_* > str -> PyUnicode_* > > Described here > > http://www.python.org/dev/peps/pep-3137/ > > with a summary here: > > http://www.python.org/dev/peps/pep-3137/#summary > > We should eventually special case bytearray(b'') literals, although that is > not strictly required. It's more important to actually support these types and > the coercion between them first. 
;) > > > Stefan > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri May 9 19:12:24 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 19:12:24 +0200 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> Message-ID: <482485F8.5080508@behnel.de> Hi, Lisandro Dalcin wrote: > Well, at this point I'm a bit confused... The unicode stuff is the > weakest point in my knowledge of Python internals... :) that's just because Unicode is not exactly trivial - especially if you have to start thinking about bytes at some point. > Anyway, I will not be able to work on this in the next two weeks, so > if anyone wants to go on and give a try, just start applying the patch > to 'cython-devel-py3' repo (I'm not concerned at all about credits for > the original patch ;-) ) in order this does not stagnate. done. > At any > point, I can contribute testing against my project, just ask me for > help. some test cases would be greatly appreciated. :) > Finally, IMHO Cython pyx-level and generated C-level code should work > with any Python version. But your code shouldn't change semantics when you change the runtime environment. Otherwise it would become completely impossible to write portable code. > So, if in a pyx file I wrote "abc", then, by > default, it should be a ascii string for Py2 and a unicode one for > Py3. Not quite right.
If you did that in a .py file, that would be the case. If you do that in Cython, a literal must not change its type, because it compiles to C. > And Cython should grow a new special 'char' type (like it has for > bint) to specify ascii string. Why is that? Two things: what's special about an ASCII string compared to a byte string in general? And: why aren't "bytes" and "bytearray" enough? > Then, at the C-level, and depending on > Python major version, the proper conversion py->c or c->py can be > managed. If not, wrapping legacy C libs and targeting Py2 and Py3 is > going to be a nightmare. It's a nightmare if your code changes behaviour. I think there should be a command line switch "-3" for compiling plain Py3 code at some point (as opposed to normal Py2 code) or maybe that should be determined by the runtime environment of the *compiler*, but the Cython language itself should stay compatible to Py2.6 for now. Stefan From stefan_ml at behnel.de Fri May 9 19:51:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 19:51:14 +0200 Subject: [Cython] Long strings Message-ID: <48248F12.7070803@behnel.de> Hi, I've seen this problem already a while ago: MSVC 2003 has a problem with strings that are longer than 2KB. Honestly, a compiler that chokes on 2048 tiny bytes! This can obviously affect string data in the source code, which I worked around back then by splitting long strings up and concatenating them at runtime. However, now I have the problem that a function in lxml has a docstring(!) that is too long, so there is not much I can do outside Cython (except for cutting down the documentation, but that would be stupid, right?). The obvious way to work around this (may I say it?) buggy compiler is to do the splitting in Cython and reconstruct the complete string at module initialisation time. Essentially, this Python code a = "... loads ... of ... bytes ... " would map to the (pre-built) equivalent of this: a = "... loads" + " ... of " + "... 
bytes " + "... " Before I start working on a hack like this, are there other ideas how to deal with this problem? Thanks for any ideas, Stefan From dagss at student.matnat.uio.no Fri May 9 20:11:28 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 09 May 2008 20:11:28 +0200 Subject: [Cython] Long strings In-Reply-To: <48248F12.7070803@behnel.de> References: <48248F12.7070803@behnel.de> Message-ID: <482493D0.7070705@student.matnat.uio.no> > I've seen this problem already a while ago: MSVC 2003 has a problem with > strings that are longer than 2KB. Honestly, a compiler that chokes on 2048 > tiny bytes! > Hehe, sounds familiar. I remember struggling with getting VC6 to compile template code -- nice to know they kept up their "quality" since then... > Before I start working on a hack like this, are there other ideas how to deal > with this problem? > I know very, very little about post-2000 Microsoft stuff. But at least in VC6 it was possible (and common) to seperately link in binary resource files which was accessed through the Windows API. One could have a VC-mode where a ".res"-file is dumped containing strings, and these could then be compiled in, and looked up run-time. This should be very slightly faster as you get your data linked into the executable in one block (don't know what the .NET-equivalent would be though); or at least, any overhead is O(1) with respect to string length (unlike your approach, which is O(N) -- though probably not noticeable anyway). Your approach sounds a lot saner though :-) Would be nice if you would code it up as a transform, so that we had some transforms to start building a pipeline structure around and some real code to work out issues with the approach. It is an ideal transform situation -- in goes string nodes, out goes addition nodes and string nodes; right after the parsing phase which is available now. 
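The splitting discussed in this thread, one long string turned into several shorter pieces that are joined again at initialisation time, might be sketched like this in plain Python. The helper names split_literal and rebuild and the 2048-byte default are assumptions for illustration; they are not part of Cython, and the exact compiler limit is taken from the description above rather than verified.

```python
def split_literal(data, max_chunk=2048):
    """Split a byte string into pieces of at most max_chunk bytes (hypothetical helper)."""
    return [data[i:i + max_chunk] for i in range(0, len(data), max_chunk)]


def rebuild(chunks):
    """Concatenate the pieces again, as module initialisation would have to do."""
    return b"".join(chunks)


docstring = b"x" * 5000           # stand-in for an over-long docstring
pieces = split_literal(docstring)
print(len(pieces))                # number of pieces emitted
print(rebuild(pieces) == docstring)
```

Each piece stays under the limit on its own, and the cost of rebuilding is linear in the total length, which matches the O(N) remark made above.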
In fact, if you don't find a better solution, I'll volunteer to do the implementation, since it sounds so incredibly ideal for a simple transform demonstration (and it is always nicer to have something real than something contrived). Dag Sverre From stefan_ml at behnel.de Fri May 9 20:18:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 20:18:44 +0200 Subject: [Cython] Long strings In-Reply-To: <482493D0.7070705@student.matnat.uio.no> References: <48248F12.7070803@behnel.de> <482493D0.7070705@student.matnat.uio.no> Message-ID: <48249584.2070507@behnel.de> Hi Dag, Dag Sverre Seljebotn wrote: >> Before I start working on a hack like this, are there other ideas how to deal >> with this problem? > > Would be nice if you would code it up as a transform Nope. I actually envisioned to code it right into __Pyx_InitStrings(). We could also write two different versions of that function and write the more complex one into the .c file only if we detect that it's needed. Stefan From dagss at student.matnat.uio.no Fri May 9 20:20:37 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 09 May 2008 20:20:37 +0200 Subject: [Cython] Long strings In-Reply-To: <482493D0.7070705@student.matnat.uio.no> References: <48248F12.7070803@behnel.de> <482493D0.7070705@student.matnat.uio.no> Message-ID: <482495F5.3020900@student.matnat.uio.no> Dag Sverre wrote: > I know very, very little about post-2000 Microsoft stuff. But at least > in VC6 it was possible (and common) to seperately link in binary > resource files which was accessed through the Windows API. One could > OK, looked it up, apparently the old approach is still valid when compiling "non-managed" code. Basically you output an RC file, which must then be compiled and linked in using VC tools. For .NET-code you'd use "resource bundles" or something like that. 
Dag Sverre From dalcinl at gmail.com Fri May 9 20:26:26 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 9 May 2008 15:26:26 -0300 Subject: [Cython] Fwd: working patch for generating code targeting Py 2.6 and Py 3.0 In-Reply-To: <482485F8.5080508@behnel.de> References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> <482485F8.5080508@behnel.de> Message-ID: You see!! All your observations just seem to indicate I do not actually understand the whole thing. Please, if you have the time, can you clarify this a bit for me: I have this class Info (something like a dictionary) wrapping some MPI calls for getting and setting (key,value) pairs (they have to be ascii, 8-bit strings so that MPI understands them) cdef class Info: # ... def Get(self, char key[]): """ Retrieve the value associated with a key """ cdef char value[MPI_MAX_INFO_VAL] cdef int flag = 0 CHKERR( MPI_Info_get(self.ob_mpi, key, maxlen, value, &flag) ) if not flag: return (None, False) else: return (value, True) def Set(self, char key[], char value[]): """ Add the (key,value) pair to info, and overrides the value if a value for the same key was previously set """ CHKERR( MPI_Info_set(self.ob_mpi, key, value) ) So then all your comments actually mean that I should not take any special action in Cython for wrapping this, and then * - If I run this in Python 2, I just do: info.Set("key", "value") * - If I run this in Python 3, I should do: info.Set(b"key", b"value") In that case, suppose now that a user running on Python 3 does the following: >>> info.Set("key", "value") . This is broken, because MPI will not recognize the key or the value, as they are not C plain char arrays containing null-terminated ascii 8-bit strings. Then, how should I modify my *.pyx code to detect this and generate an error/warning, or even try to coerce the input to ascii 8-bits if the input is 'unicode', or pass it unchanged if the input is 'bytes'?
And all this working both in 2.3/2.4/2.5/2.6 and 3.0?? On 5/9/08, Stefan Behnel wrote: > Hi, > > > Lisandro Dalcin wrote: > > Well, at this point I'm a bit confused... The unicode stuff is the > > weakest point in my knowledge of Python internals... > > > :) that's just because Unicode is not exactly trivial - especially if you have > to start thinking about bytes at some point. > > > > > Anyway, I will not be able to work on this in the next two weeks, so > > if anyone wants to go on and give a try, just start applying the patch > > to 'cython-devel-py3' repo (I'm not concerned at all about credits for > > the original patch ;-) ) in order this does not stagnate. > > > done. > > > > > At any > > point, I can contribute testing against my project, just ask me for > > help. > > > some test cases would be greatly appreciated. :) > > > > > Finally, IMHO Cython pyx-level and generated C-level code should work > > with any Python version. > > > But your code shouldn't change semantics when you change the runtime > environment. Otherwise it would become completely impossible to write portable > code. > > > > > So, if in a pyx file I wrote "abc", then, by > > default, it should be a ascii string for Py2 and a unicode one for > > Py3. > > > Not quite right. If you did that in a .py file, that would be the case. If you > do that in Cython, a literal must not change its type, because it compiles to C. > > > > > And Cython should grow a new special 'char' type (like it has for > > bint) to specify ascii string. > > > Why is that? Two things: what's special about an ASCII string compared to a > byte string in general? And: why aren't "bytes" and "bytearray" enough? > > > > > Then, at the C-level, and depending on > > Python major version, the proper conversion py->c or c->py can be > > managed. If not, wrapping legacy C libs and targeting Py2 and Py3 is > > going to be a nightmare. > > > It's a nightmare if your code changes behaviour. 
I think there should be a > command line switch "-3" for compiling plain Py3 code at some point (as > opposed to normal Py2 code) or maybe that should be determined by the runtime > environment of the *compiler*, but the Cython language itself should stay > compatible to Py2.6 for now. > > > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri May 9 21:30:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 09 May 2008 21:30:38 +0200 Subject: [Cython] How to deal with byte strings and unicode strings at the API level In-Reply-To: References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> <482485F8.5080508@behnel.de> Message-ID: <4824A65E.90007@behnel.de> Hi, Lisandro Dalcin wrote: > I have this class Info (something like a dictionary ) wrapping some > MPI calls for getting and setting (key,value) pairs (they have to be > ascii, 8-bits strings in order MPI understands them) :) first problem: ASCII is 7-bit, not 8-bit. > cdef class Info: > def Get(self, char key[]): > [...] > def Set(self, char key[], char value[]): > [...] Here, you define your API as taking byte strings as input. Fair enough. However, in your example, you do not verify that your input is really ASCII encoded, so you allow users to pass 8-bit strings without any warning.
> So then all your comments actually means that I should not take any > special action in Cython for wrapping this, and then > > * - If I run this in Python 2, I just do: info.Set("key", ''value") This works, as plain string literals in Python are byte strings. Also, conversion between plain ASCII byte strings and unicode strings happens automatically in Python2 (the infamous UnicodeDecodeError on print). > * - If I run this in Python 3, I should do: info.Set(b"key", b"value") Again, this works as you pass byte strings, as enforced by your API. > In that case, suppose now that a user running on Python 3 does the following: > >>>> info.Set("key", "value") . > > This is broken, It's not broken, it's just incorrect API usage. In a way, it's like doing this: >>> info.Set(-999, 123456789) and, guess what: Python will raise a TypeError for this! > because MPI will not recognize the key or the value, > as they are not C plain char arrays containing null-terminated ascii > 8-bits string. > > Then, how should I modify my *.pyx code to detect this and generate an > error/warning, Once the code is in place, Cython will generate a TypeError for you, just like Py3 itself does when attempting automatic conversion between unicode strings and bytes objects. > or even try to coerce the input to ascii 8-bits if the input is 'unicode' No, that's one of the problems why there is a lot of broken code in Python2: "works on my machine, so it can't be broken, can it?" > or pass it unchanged if the input is 'bytes'? That will work, as a bytes object (which is actually a PyStringObject in both Py2 and Py3) is compatible with a char*. However, imagine this line in a Python2 source file: info.Set("???", "??????") or this line in a Python3 source file: info.Set(b"???", b"??????") What will be the byte sequence that you get in your char[] for key and value? Well, it depends on the source code encoding, which you can declare at the beginning of your source file. 
If, for example, it's "UTF-8", you will get a UTF-8 encoded byte sequence, which is 6 bytes long for key and 12 bytes long for value. If, on the other hand, it's "iso-8859-1", you will get a 3-byte sequence for key and a 6-byte sequence for value. Same code, looks exactly the same in an editor, but results in completely different input to your methods. How is your code going to deal with this? How will it even know what happens? > And all this working both in 2.3/2.4/2.5/2.6 and 3.0?? If byte strings is what your API deals with, this will continue to work across all of those versions. You can use the 2to3 tool to convert your Python2 code to Python3. It works quite well and will change this info.Set("key", "value") into this info.Set(b"key", b"value") for you (you can even call the tool from your setup.py). If, however, what you actually want is text input (i.e. unicode characters), you can fix your API like this (probably plus some more input checking): cdef class Info: def Get(self, key): key = key.encode("ASCII") # or whatever encoding you use internally [...] def Set(self, key, value): key = key.encode("ASCII") # or whatever encoding you use value = value.encode("ASCII") # or whatever encoding you use [...] That way, users can call your API in Python2 like this: info.Set(u"key", u"value") and the 2to3 tool will convert this to info.Set("key", "value") for them, which (again) will continue to work, and (now the cool thing) your methods will always receive the input in the expected encoding, and your users will get an encoding exception if they pass non-ASCII strings. :) So all you have to take care of is what your actual API is: bytes? characters? Does this make things a bit clearer? 
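Both points made here, that the same characters yield different byte sequences under different encodings and that encoding at the API boundary rejects non-ASCII text, can be checked in plain Python. This is a minimal sketch: the three sample characters are chosen arbitrarily (the ones in the original message did not survive archiving), and to_ascii_bytes is a hypothetical helper, not part of any library.

```python
# Three non-ASCII characters, chosen arbitrarily for illustration.
text = u"\u00e4\u00f6\u00fc"

utf8_len = len(text.encode("utf-8"))         # two bytes per character here
latin1_len = len(text.encode("iso-8859-1"))  # one byte per character
print(utf8_len, latin1_len)


def to_ascii_bytes(value):
    """Encode text to ASCII bytes; non-ASCII input raises UnicodeEncodeError."""
    return value.encode("ascii")


print(to_ascii_bytes(u"key"))

try:
    to_ascii_bytes(text)
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

Doing the encode step inside the method, as suggested above, means the wrapped C call always sees bytes in one known encoding, whatever the caller's source file encoding was.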
Stefan

From stefan_ml at behnel.de Fri May 9 21:48:16 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 09 May 2008 21:48:16 +0200
Subject: [Cython] How to deal with byte strings and unicode strings at the API level
In-Reply-To: <4824A65E.90007@behnel.de>
References: <4822A413.8060902@behnel.de> <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> <482485F8.5080508@behnel.de> <4824A65E.90007@behnel.de>
Message-ID: <4824AA80.3040803@behnel.de>

Hi again,

Stefan Behnel wrote:
> Lisandro Dalcin wrote:
>> or even try to coerce the input to ascii 8-bits if the input is 'unicode'
>
> No, that's one of the problems why there is a lot of broken code in Python2:

Sorry, I misread your sentence here. You were asking how to fix your code and I thought you meant Cython should do it for you (which it can't and also shouldn't...)

If your input is unicode characters, then yes, encoding them to a well defined byte sequence is the right thing to do.

Stefan

From robertwb at math.washington.edu Fri May 9 22:07:21 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 9 May 2008 13:07:21 -0700
Subject: [Cython] Python type optimizations (NumPy GSoC-related)
In-Reply-To: <482423C4.9040005@student.matnat.uio.no>
References: <482423C4.9040005@student.matnat.uio.no>
Message-ID: <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu>

On May 9, 2008, at 3:13 AM, Dag Sverre Seljebotn wrote:

> I think there's a fundamental flaw to my NumPy proposal, which need
> correction. I've thought about it as "polishing the access to the
> extension type", so that after
>
> cdef object x = numpy.zeros([3,3])
> cdef numpy.ndarray y = x
>
> y will behave efficiently when treated like a Python object.
>
> There are problems to this way of thinking though: "print y.strides"
> will not work as it is a pointer-array, while "print x.strides" will
> work as it is a tuple (one has to do "print (<object>y).strides", which
> is not going to fly with for the users of my NumPy project).

I think the correct way to handle this (if we want to) is to make y.strides transform into (<object>y).strides when coerced to a PyObject (assuming that y.strides is not a type we already know how to turn into an object).

> On the
> other hand, one has to remember to do "print y.shape" for tuple-access
> but "y.dimensions[i]" for speedy, non-Python access. (And this
> difference in behaviour comes entirely from strides having a name-clash
> while shape/dimensions do not).

This is a numpy api issue, nothing to do with Cython.

>
> So while my approach has been to make the y variable act "more like a
> numpy array", I now think this is flawed. Optimizations through typing
> and access to extension structs should probably instead be treated as
> fundamentally different things, and the typical NumPy user shouldn't
> deal with a reference to the extension struct even if typing for speed
> is wanted.
>
> I'll now propose a solution for this. It has been proposed before in a
> different form (pxd shadowing etc.); I hope I succeed better now.
>
> I think that what is wanted here is to /speed up how NumPy objects are
> accessed/, and the extension type only comes into it peripherally. So
> I'd like a new syntax for modifying the compile-time behaviour of how
> objects are treated. It could look something like the following.
>
> I'm calling the keyword "compiletimefeatures" but that is for lack of a
> better word, also I use cython_ndarray to avoid namespace clashes (have
> ideas for allowing "ndarray" directly but I'd like to leave that out of
> the discussion for now).
>
> Anyway, numpy.ndarray is an extension type like before, while
> cython_ndarray is a new "type specifier" providing extra compile-time
> optimizations to the variable that carries its type.

numpy.pxd: [...]

> Flame away :-) (yes, I see that this could be confusing to OOP. However
> it is no worse than the current situation, one cannot really override
> extension type struct items either.)

You asked for it... I think this is a very bad idea. The "compile time features" of a type belong to the type, and putting them in some overlay (with a different name, that one has to know about) seems counterintuitive. And I really don't see any advantages (assuming the name clashing can be solved as above).

Here is a very simple prototype of what I think could be done in the pxd:

---------a.pxd----------

cdef class A:
    cdef int len
    cdef int* data
    cdef inline [final?] int __getitem__(A a, int i):
        """
        Note that subtypes can't override this.
        """
        if i < 0 or i >= a.len:
            raise IndexError
        return a.data[i]

---------b.pyx--------

from a cimport A

cdef A(len=10) a = A([1,2,3,4,5,6,7,8,9,10]) # I'll leave the init function to your imagination
print a[9] # the code from __getitem__ gets inlined here, and since len is known the a.len is resolved to 10 at compile time.

(Here a.len tries to do a lookup first on the compile time type of a, and that failing the runtime type of a. The compile time types need not be struct members, but if they're not then they must be specified because the "runtime" lookup would fail.)

- Robert

From dalcinl at gmail.com Fri May 9 22:50:23 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 9 May 2008 17:50:23 -0300
Subject: [Cython] How to deal with byte strings and unicode strings at the API level
In-Reply-To: <4824AA80.3040803@behnel.de>
References: <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> <482485F8.5080508@behnel.de> <4824A65E.90007@behnel.de> <4824AA80.3040803@behnel.de>
Message-ID:

Stefan, many, many thanks for your explanations. I believe I've started to understand the whole beast. Please clarify this for me:

Suppose I write this method in a pyx file:

def foo(char value[]): pass

and next call it (in a Py3 runtime env) in these two ways:

1- foo(b"abc")

2- foo("abc")

then (1) should be just fine, Cython should pass the raw C data as it comes.
But in the case of (2), will/should Cython (or the Python C API level) generate an error?

If the answer is NO, does it make sense to extend Cython with a C 'pseudotype' (à la bint), let's call it 'bchar', to make the generated C code strict about the type of the Python-level argument?

On 5/9/08, Stefan Behnel wrote:
> Hi again,
>
> Stefan Behnel wrote:
> > Lisandro Dalcin wrote:
> >> or even try to coerce the input to ascii 8-bits if the input is 'unicode'
> >
> > No, that's one of the problems why there is a lot of broken code in Python2:
>
> Sorry, I misread your sentence here. You were asking how to fix your code and
> I thought you meant Cython should do it for you (which it can't and also
> shouldn't...)
>
> If your input is unicode characters, then yes, encoding them to a well defined
> byte sequence is the right thing to do.
>
> Stefan
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From curiousjan at gmail.com Fri May 9 23:43:17 2008
From: curiousjan at gmail.com (Jan Strube)
Date: Fri, 9 May 2008 14:43:17 -0700
Subject: [Cython] cython add vs __add__
Message-ID: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>

Hi List,
sorry for the crosspost, I have also asked the question here https://answers.launchpad.net/cython/+question/31930, but probably the list is a better place to ask.

In the new version of Cython I am excited to report that overloading operators seems to work just fine.
This is great, because in the high energy physics community, people think it's a good idea to abuse the () operator for iterators to mean next (among other nonsense).

So I have a short class in C++, the code is below, and as posted, everything works fine.
However, when I try to change "def add(self)" to "def __add__(self)" in myTest I get complaints of the sort

  File "test.py", line 37, in <module>
    print test + 3
  File "stdMatrix.pyx", line 67, in stdMatrix.myTest.__add__ (stdMatrix.cpp:452)
    return self.thisPtr.add(other)
AttributeError: 'stdMatrix.myTest' object has no attribute 'thisPtr'

I find this quite surprising, and could use some enlightenment. I can easily enough work around this, but this is a bug, no?

Thanks for cython and best of luck on your sprint.
Best,
Jan

------------ class_operators.cc ---------------
class TEST
{
public:
    int x;
    TEST(int y) {x=y;}
    int operator() () {x = 17; return x;}
    int operator+(int y) {return x+y;}
    int minimize() {return 25;}
};
----------------------------------------------------------

------stdMatrix.pyx----------------------------------------
cdef extern from "class_operators.cc":
    ctypedef struct monkey "TEST":
        int x
        int (*minimize)()
        int add "operator+"(int y)
        int call "operator()"()
    monkey* new_Test "new TEST"(int y)
    void del_Test "delete"(monkey* x)

cdef class myTest:
    cdef monkey *thisPtr
    def __cinit__(self, int y):
        self.thisPtr = new_Test(y)
    def __dealloc__(self):
        del_Test(self.thisPtr)
    def min(self):
        return self.thisPtr.minimize()
    def plus(self, other):
        return self.thisPtr.add(other)
    def call(self):
        return self.thisPtr.call()
--------------------------------------------------------------------

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080509/6fa4b0d0/attachment.htm

From dalcinl at gmail.com Sat May 10 00:59:01 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 9 May 2008 19:59:01 -0300
Subject: [Cython] cython add vs __add__
In-Reply-To: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>
References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>
Message-ID:

I believe you are right, it seems to be a bug. I've experienced this in the past and used this fix:

cdef class myTest:
    def __add__(myTest self, other): # note declaration: myTest self
        return self.thisPtr.add(other)

You should go on with this. It should even work when this gets fixed (in case this is actually a bug and I'm not missing something).

So rename your 'plus' and 'call' to standard __add__ and __call__ with the declaration trick above, and all should just work.

Regards,

On 5/9/08, Jan Strube wrote:
>
> Hi List,
> sorry for the crosspost, I have also asked the question here
> https://answers.launchpad.net/cython/+question/31930, but
> probably the list is a better place to ask.
>
>
> In the new version of Cython I am excited to report that overloading
> operators seems to work just fine. This is great, because in the high energy
> physics community, people think it's a good idea to abuse the () operator
> for iterators to mean next (among other nonsense).
>
> So I have a short class in C++, the code is below, and as posted, everything
> works fine.
> However, when I try to change "def add(self)" to "def __add__(self)" in
> myTest I get complaints of the sort
>
> File "test.py", line 37, in <module>
> print test + 3
> File "stdMatrix.pyx", line 67, in stdMatrix.myTest.__add__
> (stdMatrix.cpp:452)
> return self.thisPtr.add(other)
> AttributeError: 'stdMatrix.myTest' object has no attribute 'thisPtr'
>
> I find this quite surprising, and could use some enlightenment. I can easily
> enough work around this, but this is a bug, no?
>
> Thanks for cython and best of luck on your sprint.
> Best,
> Jan
>
> ------------ class_operators.cc ---------------
> class TEST
> {
> public:
>     int x;
>     TEST(int y) {x=y;}
>     int operator() () {x = 17; return x;}
>     int operator+(int y) {return x+y;}
>     int minimize() {return 25;}
> };
> ----------------------------------------------------------
>
> ------stdMatrix.pyx----------------------------------------
> cdef extern from "class_operators.cc":
>     ctypedef struct monkey "TEST":
>         int x
>         int (*minimize)()
>         int add "operator+"(int y)
>         int call "operator()"()
>     monkey* new_Test "new TEST"(int y)
>     void del_Test "delete"(monkey* x)
>
> cdef class myTest:
>     cdef monkey *thisPtr
>     def __cinit__(self, int y):
>         self.thisPtr = new_Test(y)
>     def __dealloc__(self):
>         del_Test(self.thisPtr)
>     def min(self):
>         return self.thisPtr.minimize()
>     def plus(self, other):
>         return self.thisPtr.add(other)
>     def call(self):
>         return self.thisPtr.call()
> --------------------------------------------------------------------
>
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From curiousjan at gmail.com Sat May 10 01:17:23 2008
From: curiousjan at gmail.com (Jan Strube)
Date: Fri, 9 May 2008 16:17:23 -0700
Subject: [Cython] cython add vs __add__
In-Reply-To:
References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>
Message-ID: <76513339-9B13-435A-8F56-27DD69EEA841@gmail.com>

Very interesting. Thanks for the tip.
Cheers, Jan On May 09, 2008, at 3:59 PM, Lisandro Dalcin wrote: > I believe you are right, it seems a bug, I'v experienced this in the > past, and there and used this fixes > > cdef class myTest > def __add__(myTest self, other): # note declaration: myTest self > return self.thisPtr.add(other) > > You should go on with this, It should even work when this get fixed > (in the case this is actually a bug and I'm not missing somethig) > > So rename your 'plus' and 'call' to standard __add__ and __call__ with > the declaration trick above, and all should just work. > > Regards, > > > > On 5/9/08, Jan Strube wrote: >> >> Hi List, >> sorry for the crosspost, I have also asked the question here >> https://answers.launchpad.net/cython/+question/31930, but >> probably the list is a better place to ask. >> >> >> In the new version of Cython I am excited to report that overloading >> operators seems to work just fine. This is great, because in the >> high energy >> physics community, people think it's a good idea to abuse the () >> operator >> for iterators to mean next (among other nonsense). >> >> So I have a short class in C++, the code is below, and as posted, >> everything >> works fine. >> However, when I try to change "def add(self)" to "def __add__ >> (self)" in >> myTest I get complaints of the sort >> >> File "test.py", line 37, in >> print test + 3 >> File "stdMatrix.pyx", line 67, in stdMatrix.myTest.__add__ >> (stdMatrix.cpp:452) >> return self.thisPtr.add(other) >> AttributeError: 'stdMatrix.myTest' object has no attribute 'thisPtr' >> >> I find this quite surprising, and could use some enlightenment. I >> can easily >> enough work around this, but this is a bug, no? >> >> Thanks for cython and best of luck on your sprint. 
>> Best,
>> Jan
>>
>> ------------ class_operators.cc ---------------
>> class TEST
>> {
>> public:
>>     int x;
>>     TEST(int y) {x=y;}
>>     int operator() () {x = 17; return x;}
>>     int operator+(int y) {return x+y;}
>>     int minimize() {return 25;}
>> };
>> ----------------------------------------------------------
>>
>> ------stdMatrix.pyx----------------------------------------
>> cdef extern from "class_operators.cc":
>>     ctypedef struct monkey "TEST":
>>         int x
>>         int (*minimize)()
>>         int add "operator+"(int y)
>>         int call "operator()"()
>>     monkey* new_Test "new TEST"(int y)
>>     void del_Test "delete"(monkey* x)
>>
>> cdef class myTest:
>>     cdef monkey *thisPtr
>>     def __cinit__(self, int y):
>>         self.thisPtr = new_Test(y)
>>     def __dealloc__(self):
>>         del_Test(self.thisPtr)
>>     def min(self):
>>         return self.thisPtr.minimize()
>>     def plus(self, other):
>>         return self.thisPtr.add(other)
>>     def call(self):
>>         return self.thisPtr.call()
>> --------------------------------------------------------------------
>>
>>
>> _______________________________________________
>> Cython-dev mailing list
>> Cython-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/cython-dev
>>
>>
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Sat May 10 03:12:04 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 10 May 2008 13:12:04 +1200
Subject: [Cython] PyPy parser
In-Reply-To:
References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no> <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu> <4823B075.5000903@canterbury.ac.nz>
Message-ID:
<4824F664.2000807@canterbury.ac.nz> Robert Bradshaw wrote: > I think the C declarations are representable by a generative grammar, > so this method could work. It's not the grammar that's the issue, it's the parsing algorithm. To do it with an LL(1) parser, you need to be able to tell whether you're looking at a type name. The current parser does this by remembering all the names it's seen being declared as types, and this affects the decisions it makes later on. It would be hard to do this without being able to embed arbitrary actions in the grammar rules. There are also other places where it uses contextual information. For example, it passes down a flag indicating whether a statement is inside a cdef extern block, and if so, it treats everything as though it had 'cdef' in front of it. -- Greg From greg.ewing at canterbury.ac.nz Sat May 10 04:31:30 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 10 May 2008 14:31:30 +1200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <48246C10.6050703@behnel.de> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> Message-ID: <48250902.2020804@canterbury.ac.nz> Stefan Behnel wrote: >> for x < i < y: >> ... > > while this is shorter, it also breaks existing code - although I assume you > left the old syntax in? Yes, it will be there for a while, although it will produce a deprecation warning. Ultimately I would like there to be One Way to do it. >>* There are two new builtin functions, typecheck() and >> issubtype(), for checking the types of arguments more safely >> (since isinstance and issubclass can be fooled). > > Could you elaborate on the actual problem here? It's possible for an object to lie about its class to isinstance() via its __class__ attribute. This is not acceptable when you need to ensure that the object has a particular C layout. > Could you imagine using our doctest based test runner as well? To do what, exactly? 
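
The remark above about isinstance() being fooled through __class__ can be reproduced in plain Python. The following is only a sketch: the class names `Real` and `Liar` are made up for illustration, and Pyrex's typecheck() itself is not shown; the exact-type comparison merely mimics the kind of check it is described as performing.

```python
class Real(object):
    """Stands in for an extension type with a particular C layout."""
    pass


class Liar(object):
    """Unrelated class that claims to be a Real through __class__."""

    @property
    def __class__(self):
        return Real


obj = Liar()

# isinstance() falls back to obj.__class__ when the real type does not match.
fooled = isinstance(obj, Real)

# Looking at the real type cannot be overridden from Python code.
exact = issubclass(type(obj), Real)

print(fooled, exact)  # True False
```

For code that goes on to read C struct fields, only the second kind of check is safe, since `Liar` instances do not share the layout of `Real`.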
-- Greg From greg.ewing at canterbury.ac.nz Sat May 10 05:04:01 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 10 May 2008 15:04:01 +1200 Subject: [Cython] cython add vs __add__ In-Reply-To: References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> Message-ID: <482510A1.7090806@canterbury.ac.nz> Lisandro Dalcin wrote: > I believe you are right, it seems a bug, > > cdef class myTest > def __add__(myTest self, other): # note declaration: myTest self > return self.thisPtr.add(other) It's not a bug. With extension types, the first argument of operator methods isn't necessarily self, so it's not automatically typed as such. See the section on Special Methods of Extension Types in the docs for a full explanation. -- Greg From robertwb at math.washington.edu Sat May 10 05:54:29 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 9 May 2008 20:54:29 -0700 Subject: [Cython] PyPy parser In-Reply-To: <4824F664.2000807@canterbury.ac.nz> References: <48215385.50207@student.matnat.uio.no> <10619.194.114.62.39.1210146701.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482161DF.9010108@student.matnat.uio.no> <743600FF-F1BC-48F4-BDC6-EC89A99396BA@math.washington.edu> <4823B075.5000903@canterbury.ac.nz> <4824F664.2000807@canterbury.ac.nz> Message-ID: On May 9, 2008, at 6:12 PM, Greg Ewing wrote: > Robert Bradshaw wrote: >> I think the C declarations are representable by a generative grammar, >> so this method could work. > > It's not the grammar that's the issue, it's the parsing > algorithm. To do it with an LL(1) parser, you need to > be able to tell whether you're looking at a type name. > The current parser does this by remembering all the > names it's seen being declared as types, and this affects > the decisions it makes later on. It would be hard to do > this without being able to embed arbitrary actions in > the grammar rules. Yes, determining whether or not a identifier is a type or not would happen at a higher level. 
This needs to be dealt with anyways to handle parameterized types. > There are also other places where it uses contextual > information. For example, it passes down a flag indicating > whether a statement is inside a cdef extern block, and > if so, it treats everything as though it had 'cdef' in > front of it. This isn't a blocker, it would be written into the grammar in a more direct way. - Robert From stefan_ml at behnel.de Sat May 10 07:32:34 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 10 May 2008 07:32:34 +0200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <48250902.2020804@canterbury.ac.nz> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> <48250902.2020804@canterbury.ac.nz> Message-ID: <48253372.9060802@behnel.de> Hi Greg, Greg Ewing wrote: > Stefan Behnel wrote: >>> * There are two new builtin functions, typecheck() and >>> issubtype(), for checking the types of arguments more safely >>> (since isinstance and issubclass can be fooled). >> >> Could you elaborate on the actual problem here? > > It's possible for an object to lie about its class > to isinstance() via its __class__ attribute. This is > not acceptable when you need to ensure that the object > has a particular C layout. Wouldn't special casing a type test for builtins and extension classes be better here than adding new functions? I mean, we already had that with getattr3(), where the right thing to do would have been to fix builtin functions. >> Could you imagine using our doctest based test runner as well? > > To do what, exactly? To share a common test suite. Currently, you have a test suite that is mostly based on comparing the C source, while our test suite is mostly based on building the module and running a doctest against it. 
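
A doctest-based test of the kind described here might look roughly like the following sketch. The function `clamp` and the helper `run_doctests` are hypothetical and are not taken from either test suite; only the standard `doctest` module is assumed. In the real setup the tested function would live in a compiled .pyx module rather than in plain Python.

```python
import doctest


def clamp(value, low, high):
    """
    The expected behaviour lives in the docstring, so it does not depend
    on what the generated C source looks like.

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


def run_doctests(func):
    """Run the doctests in func's docstring and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


print(run_doctests(clamp))
```

Because the checks only compare observable results, the same test file could in principle be built with either compiler and still be expected to pass.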
The sources generated by Pyrex and Cython look very different, so your approach is not portable, but our test suite can easily be used for both and the tests that are supported by both compilers must obviously yield the same results. So if you added doctests to your test modules (as we do for ours), you could still validate the resulting source code for Pyrex, but we could both join forces and benefit from a growing test suite. This would allow us to see if a bug that gets fixed in one compiler must also be fixed in the other, and it would also make it easier to keep the feature set of both compilers somewhat aligned. Stefan From stefan_ml at behnel.de Sat May 10 07:36:15 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 10 May 2008 07:36:15 +0200 Subject: [Cython] How to deal with byte strings and unicode strings at the API level In-Reply-To: References: <48240A6A.7040700@behnel.de> <48243A77.9040008@behnel.de> <482485F8.5080508@behnel.de> <4824A65E.90007@behnel.de> <4824AA80.3040803@behnel.de> Message-ID: <4825344F.20201@behnel.de> Hi, Lisandro Dalcin wrote: > Stefan, many, many thanks for your explanations. I believe I've > started to understand the whole beast. Please clarify me this: > > Suppose I write this method in a pyx file: > > def foo(char value[]): pass > > and next call it (in a Py3 runtime env) in this two ways: > > 1- foo(b"abc") > > 2- foo("abc") > > then (1) should be just fine, Cython should pass the raw C data as it > comes. But in the case of (2), will/should Cython (or at the Python C > API level) generate an error? Yes, you will get a TypeError for (2). 
The Python3 equivalent is this: def foo(value): value = bytes(value) Stefan From dagss at student.matnat.uio.no Sat May 10 12:14:44 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 10 May 2008 12:14:44 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> Message-ID: <48257594.7080709@student.matnat.uio.no> An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20080510/a3c1f2a8/attachment.htm From stefan_ml at behnel.de Sat May 10 14:34:43 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 10 May 2008 14:34:43 +0200 Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7 In-Reply-To: <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu> References: <48243C21.2020207@canterbury.ac.nz> <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu> Message-ID: <48259663.3070009@behnel.de> Hi, Brian Blais wrote: > In python, I am used to syntax: > > for var in stuff: > > where my eye finds "var" and then "stuff", so I first find out the > relevant variable, and then the values it will take. > > in old pyrex syntax, > > for var in begin<= var < end: > > my eye finds "var" and then "begin" and "end", so I first find out the > relevant variable, and then the values it will take. Actually, in the old Pyrex syntax, it was for var *from* begin <= var < end: but since now we have > for begin <= var < end: would it be that bad to always use for var in ... instead and just distinguish between for var in something: and for var in begin <= var < end: ? Then again, there even is a switch in Cython that allows you to convert for var in range(begin, end): to for var in begin <= var < end: so maybe the whole discussion is somewhat pointless anyway. 
I don't quite see why we shouldn't just always convert for var in range(begin, end): to for var in begin <= var < end: *iff* var is cdef-ed as a C integer type. According to Robert, there's a difference if the loop overflows, but that case is almost certainly a programming error when var is a C type, and there is no such thing as Python compatibility for C variables anyway. Opinions? Stefan From dagss at student.matnat.uio.no Sat May 10 14:59:15 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 10 May 2008 14:59:15 +0200 Subject: [Cython] (repost in plain text) Python type optimizations (NumPy GSoC-related) In-Reply-To: <48257594.7080709@student.matnat.uio.no> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <48257594.7080709@student.matnat.uio.no> Message-ID: <48259C23.9020904@student.matnat.uio.no> [Sorry about the HTML, have no idea what happened. Posting again in a proper format.] Thanks for the feedback, I have no problems seeing weaknesses in my suggestion, but I think there are real issues here that you don't address either. >> On the >> other hand, one has to remember to do "print y.shape" for tuple-access >> but "y.dimensions[i]" for speedy, non-Python access. (And this >> difference in behaviour comes entirely from strides having a name- >> clash >> while shape/dimensions do not). >> > > This is a numpy api issue, nothing to do with Cython. > Sure. Remember however: Which API are we talking about here, when we say "NumPy API"? When using the extension type, that is essentially the C API. Basically, this seem to imply: a) The NumPy community has to provide a stable API for their extension type (i.e. stable API in header files) b) ...that lies close to the Python API What I'd like is to have support for a thin layer in the pxd file so that one is able to emulate/reimplement parts of the _Python_ API. 
Then, that pxd file can be shipped with NumPy, and if changes are made to the C API/extension type then just change and ship a new pxd as well; client code remains the same. This is so that code written in Python towards the Python NumPy API can simply be typed and compiled. The "NumPy C API" (which is what is really used when one declares extension types etc.) is however really a different API to NumPy, and so one cannot in general assume that one will ever be able to simply "type and compile" Python code towards it. My point is that we would like this to be attractive for current Python users, using the current Python API. The less they have to know about the C API the better, and then relying on "fudging" the C API to "almost look like" the Python API doesn't seem to cut it. Ahhh.... We might have slightly different goals here? That is, can you agree with this statement?: * Your goal is making writing Cython code towards the NumPy "C API" (extension type etc.) a little more convenient * My goal is to start with the Python API and simply speed it up. Then that is perhaps a fundamental question that should be resolved first. And I do not think they can coincide, that is discussable as well. On to your suggestion: In your prototype you use the "cdef class". Sure, for classes implemented in Cython the C API and Python API coincide, but that is not so for other extension types. Do you mean to imply that NumPy ndarrays will have to be reimplemented entirely in Cython in order to generate effective code for then? I don't think that will happen (nor should it be necesarry). If one changes your example slightly and putting the __getitem__ in a cdef struct for an extension type then you end up "putting them in some overlay", since the real implementation is in another module (written in C). I just wanted to make the overlay more explicit, or if it is not explicit in the syntax as such, at least we need to know that that is what we're doing. 
BTW, as I said, the "different name thing" was something I hoped to get rid of, it was just to make it clear what was happpening. Sorry, that was a bit confusing. -- Dag Sverre From bblais at gmail.com Sat May 10 16:09:13 2008 From: bblais at gmail.com (Brian Blais) Date: Sat, 10 May 2008 10:09:13 -0400 Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7 In-Reply-To: <48259663.3070009@behnel.de> References: <48243C21.2020207@canterbury.ac.nz> <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu> <48259663.3070009@behnel.de> Message-ID: <69731ED6-129E-454F-AE7C-2F22652B3DBF@gmail.com> On May 10, 2008, at May 10:8:34 AM, Stefan Behnel wrote: > Brian Blais wrote: >> In python, I am used to syntax: >> >> for var in stuff: >> >> where my eye finds "var" and then "stuff", so I first find out the >> relevant variable, and then the values it will take. >> >> in old pyrex syntax, >> >> for var in begin<= var < end: >> >> my eye finds "var" and then "begin" and "end", so I first find out >> the >> relevant variable, and then the values it will take. > > Actually, in the old Pyrex syntax, it was > > for var *from* begin <= var < end: ooops, my bad. :) > > so maybe the whole discussion is somewhat pointless anyway. I don't > quite see > why we shouldn't just always convert > > for var in range(begin, end): > > to > > for var in begin <= var < end: > > *iff* var is cdef-ed as a C integer type. According to Robert, > there's a > difference if the loop overflows, but that case is almost certainly a > programming error when var is a C type, and there is no such thing > as Python > compatibility for C variables anyway. > that would be great! bb From stefan_ml at behnel.de Sat May 10 16:23:55 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 10 May 2008 16:23:55 +0200 Subject: [Cython] String interning and Python 3 Message-ID: <4825AFFB.6040305@behnel.de> Hi, I'm wondering how to continue the support for this feature given the fact that identifiers are Unicode strings in Py3. 
We currently only intern byte strings that look like Python
identifiers, so in Py3, they simply no longer look like identifiers, as
they are not Unicode strings.

I can see four ways how to deal with this:

1) drop string interning completely

2) disable string interning in Py3 and use normally created byte
   strings instead

3) keep separate sets of identifier-like byte strings and unicode
   strings in the compiler and write them into the C file. Then,
   depending on the Python version, either intern the byte strings or
   the unicode strings, and create the other set as un-interned strings.

4) keep the information if a string should be interned for all strings
   we deal with (bytes and unicode), remove the intern tab and merge it
   with the general string tab by adding an additional field "intern".
   Then __Pyx_InitStrings() would create the strings differently
   depending on the compile time Python version, i.e., it would intern
   Unicode identifiers in Py3 and byte string identifiers in Py2, and
   create everything else as normal strings.

Personally, I favour 4) - although I could live with 1) - but since I'm
not quite sure what the original intention of string interning was
(saving memory?), I'd like to hear other opinions first.

Stefan

From languitar at semipol.de Sat May 10 18:47:30 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Sat, 10 May 2008 18:47:30 +0200
Subject: [Cython] cimport'ing module with cdef public declarations
Message-ID: <4825D1A2.5000900@semipol.de>

Hi again,

a few weeks ago I reported a problem[1] about cimport'ing a module that
contains cdef public declarations. I could manage that problem that
time but now I'm facing the same problem again. Does anyone know a
solution to this?

[1] http://codespeak.net/pipermail/cython-dev/2008-April/000582.html

Thanks
Johannes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080510/49480ebd/attachment.pgp

From robertwb at math.washington.edu Sat May 10 19:47:16 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sat, 10 May 2008 10:47:16 -0700
Subject: [Cython] Python type optimizations (NumPy GSoC-related)
In-Reply-To: <48257594.7080709@student.matnat.uio.no>
References: <482423C4.9040005@student.matnat.uio.no>
 <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu>
 <48257594.7080709@student.matnat.uio.no>
Message-ID: <5DB2E374-D355-4A6E-A2EC-9382AC4E5556@math.washington.edu>

On May 10, 2008, at 3:14 AM, Dag Sverre Seljebotn wrote:

> Thanks for the feedback, I have no problems seeing weaknesses in my
> suggestion, but I think there are real issues here that you don't
> address either.
>
>>> On the other hand, one has to remember to do "print y.shape" for
>>> tuple-access but "y.dimensions[i]" for speedy, non-Python access.
>>> (And this difference in behaviour comes entirely from strides
>>> having a name-clash while shape/dimensions do not).
>> This is a numpy api issue, nothing to do with Cython.
> Sure. Remember however: Which API are we talking about here, when
> we say "NumPy API"? When using the extension type, that is
> essentially the C API.

Yes, I was talking about the NumPy C API being different from the NumPy
Py API.

> Basically, this seems to imply:
>
> a) The NumPy community has to provide a stable API for their
> extension type (i.e. stable API in header files)
> b) ...that lies close to the Python API
>
> What I'd like is to have support for a thin layer in the pxd file
> so that one is able to emulate/reimplement parts of the _Python_
> API. Then, that pxd file can be shipped with NumPy, and if changes
> are made to the C API/extension type then just change and ship a
> new pxd as well; client code remains the same.
>
> This is so that code written in Python towards the Python NumPy API
> can simply be typed and compiled. The "NumPy C API" (which is what
> is really used when one declares extension types etc.) is however
> really a different API to NumPy, and so one cannot in general
> assume that one will ever be able to simply "type and compile"
> Python code towards it.
>
> My point is that we would like this to be attractive for current
> Python users, using the current Python API. The less they have to
> know about the C API the better, and then relying on "fudging" the
> C API to "almost look like" the Python API doesn't seem to cut it.
>
> Ahhh.... We might have slightly different goals here? That is, can
> you agree with this statement?:
>
> * Your goal is making writing Cython code towards the NumPy "C
> API" (extension type etc.) a little more convenient
> * My goal is to start with the Python API and simply speed it up.
>
> Then that is perhaps a fundamental question that should be resolved
> first. And I do not think they can coincide, that is discussable as
> well.

I think both goals are important, and can coincide. Our target audience
is SciPy developers as well as SciPy users.

> On to your suggestion: In your prototype you use the "cdef class".
> Sure, for classes implemented in Cython the C API and Python API
> coincide, but that is not so for other extension types. Do you mean
> to imply that NumPy ndarrays will have to be reimplemented entirely
> in Cython in order to generate effective code for them? I don't
> think that will happen (nor should it be necessary).

Incidentally, some of the classes may be implemented in Cython (there
is a relevant GSoC for the SciPy folks on this), but that is completely
beside the point. Looking at my example, I had a.pxd and b.pyx that
used a.A. If A was implemented in Cython or pure C it made no
difference.
> If one changes your example slightly and puts the __getitem__ in
> a cdef struct for an extension type then you end up "putting them
> in some overlay", since the real implementation is in another
> module (written in C). I just wanted to make the overlay more
> explicit, or if it is not explicit in the syntax as such, at least
> we need to know that that is what we're doing.

Yes, any actual code in a pxd file is an "overlay" as you call it. But
semantically it belongs to the class, and should be used whenever that
class is used. Code in a .pyx file is compiled to the module. Code in a
.pxd file is only used to help compile other modules.

> BTW, as I said, the "different name thing" was something I hoped to
> get rid of, it was just to make it clear what was happening.
> Sorry, that was a bit confusing.

No problem. BTW, did you have any comments about my proposal for how to
parameterize types?

- Robert

From roed at math.harvard.edu Sat May 3 09:27:14 2008
From: roed at math.harvard.edu (David Roe)
Date: Sat, 3 May 2008 03:27:14 -0400
Subject: [Cython] Comments on Documentation
Message-ID: <5cddadeb0805030027qe47e3e1y311c242ad2456680@mail.gmail.com>

Hey guys,
I saw the post about the documentation at
http://www.mudskipper.ca/cython-doc/ on sage-devel, and have a few
typos to point out. It looks awesome though.
1) Indentation problem
(http://www.mudskipper.ca/cython-doc/docs/extension_types.html):

cdef Shrubbery another_shrubbery(Shrubbery sh1):
    cdef Shrubbery sh2
    sh2 = Shrubbery()
    sh2.width = sh1.width
    sh2.height = sh1.height
    return sh2

2) Another indentation error
(http://www.mudskipper.ca/cython-doc/docs/extension_types.html):

cdef class CheeseShop:

    cdef object cheeses

    def __cinit__(self):
        self.cheeses = []

    property cheese:

        def __get__(self):
            return "We don't have: %s" % self.cheeses

        def __set__(self, value):
            self.cheeses.append(value)

        def __del__(self):
            del self.cheeses[:]

3) Again (http://www.mudskipper.ca/cython-doc/docs/extension_types.html):

cdef class Parrot:

    cdef void describe(self):
        print "This parrot is resting."

cdef class Norwegian(Parrot):

    cdef void describe(self):
        Parrot.describe(self)
        print "Lovely plumage!"

4) This should be a header?
(http://www.mudskipper.ca/cython-doc/docs/sharing_declarations.html):
What a Definition File contains A definition file can contain:

5) A linebreak error in
(http://www.mudskipper.ca/cython-doc/docs/sharing_declarations.html):
"cdef float cube(float)" should go before "spammery.pyx"

6) The Sharing extension types example is confusing because the file
names are mixed into the code. They should be split (and in different
colors as in the "Using cimport to resolve naming conflicts" section).

7) The highlighting on "The cdef extern from clause does three
things:" is wrong
(http://www.mudskipper.ca/cython-doc/docs/external_C_code.html)

8) There's a table missing in the "Styles of struct, union and enum
declaration" section
(http://www.mudskipper.ca/cython-doc/docs/external_C_code.html)

9) In "Declaring a function as callable without the GIL," I think
there should be a with before the nogil in the final example. The
last sentence may have a typo as well. I could be wrong though: I
don't actually know the syntax you're explaining.
10) Capitalization problem
(http://www.mudskipper.ca/cython-doc/docs/pyrex_differences.html)
Cdef bint b = x

David

From kirr at mns.spb.ru Sat May 3 13:38:07 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Sat, 03 May 2008 15:38:07 +0400
Subject: [Cython] [PATCH 0 of 2] Handy patches for those,
 running Cython in-tree
Message-ID:

Hi,

Two handy patches to run cython in-tree, i.e. without installation at
all.

Please apply.

From anjiro at cc.gatech.edu Mon May 5 20:46:00 2008
From: anjiro at cc.gatech.edu (Daniel Ashbrook)
Date: Mon, 05 May 2008 14:46:00 -0400
Subject: [Cython] [Pyrex] newbie list processing question
In-Reply-To: <0F507576-A9F6-43CA-8FF7-B744C1A6796B@math.washington.edu>
References: <481F4CF3.1020204@cc.gatech.edu>
 <0F507576-A9F6-43CA-8FF7-B744C1A6796B@math.washington.edu>
Message-ID: <481F55E8.20603@cc.gatech.edu>

Robert Bradshaw wrote:
>> def addOne(l):
>>     return [i+1 for i in l]
>> ...
>
> This last implementation of addOne should work as is in Cython, and will
> be nearly optimal (assuming your CPU has reasonable branch prediction).
> However, if you are manipulating word-sized integers, using C arrays
> will give you a manyfold speedup over python arithmetic.

So in the real code, I'm actually doing float math. And I'll be wanting
to return my results in a list object. There will be many thousands of
float results; what's the best way to deal with that? Use a C-specific
data structure to store the results then turn it into a list somehow?

> Did you do
>
> cdef int i
> for i from 0 <= i < len(L):
>     ...

Ah, I missed the "cdef int i" part of it.

> In Cython, if one writes
>
> L[i]
>
> where i is a cdef int, then it checks to see at runtime if L is a list
> and accesses its elements via a macro. Otherwise one can use
> PyList_SetItem and friends, but as you have noticed that is cumbersome
> (as well as being hard to read).

Oh ho!
That works very nicely; I'll include code to help others in the future:

cdef int i
for i from 0 <= i < len(l):
    l[i] = l[i] + 1

Thanks for the help!

dan

From greg at cosc.canterbury.ac.nz Fri May 9 11:07:21 2008
From: greg at cosc.canterbury.ac.nz (greg)
Date: Fri, 09 May 2008 21:07:21 +1200
Subject: [Cython] ANN: Pyrex 0.9.7
Message-ID: <48241449.5040309@cosc.canterbury.ac.nz>

Pyrex 0.9.7 is now available:

  http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/

Highlights of this version:

* I have streamlined the integer for-loop syntax. Instead
  of the loop variable redundantly appearing in two places,
  it's now just

    for x < i < y:
        ...

* If you declare a variable as a list or dict, then calls
  to some of its methods will be compiled into type-specific
  Python API calls instead of generic ones.

* Most built-in constants are referenced directly instead
  of via dict lookup.

* There are two new builtin functions, typecheck() and
  issubtype(), for checking the types of arguments more
  safely (since isinstance and issubclass can be fooled).

What is Pyrex?
--------------

Pyrex is a language for writing Python extension modules.
It lets you freely mix operations on Python and C data, with
all Python reference counting and error checking handled
automatically.

From bblais at bryant.edu Sat May 10 13:16:26 2008
From: bblais at bryant.edu (Brian Blais)
Date: Sat, 10 May 2008 07:16:26 -0400
Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7
In-Reply-To: <48243C21.2020207@canterbury.ac.nz>
References: <48243C21.2020207@canterbury.ac.nz>
Message-ID: <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu>

On May 9, 2008, at May 9:7:57 AM, Greg Ewing wrote:

>
> * I have streamlined the integer for-loop syntax. Instead
> of the loop variable redundantly appearing in two places,
> it's now just
>
> for x < i < y:
> ...
>
>

Not that my opinion matters a whole lot, but I prefer the redundant
syntax. This new syntax solves a problem, and introduces a problem
which (in my opinion) is worse.
In python, I am used to syntax:

for var in stuff:

where my eye finds "var" and then "stuff", so I first find out the
relevant variable, and then the values it will take.

in old pyrex syntax,

for var in begin<= var < end:

my eye finds "var" and then "begin" and "end", so I first find out the
relevant variable, and then the values it will take.

In the new syntax,

for begin <= var < end:

my eye finds some values of var, then the relevant variable, then the
end. It changes the logic of the for-loop organization in my head, and
as I read. It makes it less like Python, and thus more jarring to go
between the two.

Personally, I never minded the redundancy (how much extra typing is it
really?) I find breaking readability, and a parallel with existing
python syntax to be far far worse. I hope you keep the two syntaxes,
and see what the response is when developers actually use it for a
while. My gut feeling is that the new syntax will not be favored, but I
could very well be wrong.

Just my 2c

thanks,

Brian Blais

--
Brian Blais
bblais at bryant.edu
http://web.bryant.edu/~bblais

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080510/1c331c88/attachment-0001.htm

From jek-gmane at kleckner.net Sat May 10 22:27:30 2008
From: jek-gmane at kleckner.net (Jim Kleckner)
Date: Sat, 10 May 2008 13:27:30 -0700
Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7
In-Reply-To: <48259663.3070009@behnel.de>
References: <48243C21.2020207@canterbury.ac.nz>
 <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu>
 <48259663.3070009@behnel.de>
Message-ID:

Stefan Behnel wrote:
> Brian Blais wrote:
>> In python, I am used to syntax:
>>
>> for var in stuff:
>>
>> where my eye finds "var" and then "stuff", so I first find out the
>> relevant variable, and then the values it will take.

+1 to the value of seeing the "var" first and being more consistent
with Python (being more Pythonic).
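
For readers comparing the loop spellings discussed in this thread, here
is a minimal sketch in plain Python (not Pyrex or Cython). It only
illustrates that a range() loop over integers visits the same values as
the C-style counting loop an integer for-loop is meant to stand for. The
function names loop_with_range and loop_c_style are hypothetical and do
not come from either project.

```python
# Illustration only: plain Python, not Pyrex/Cython.
# The Pyrex spellings from the thread are shown as comments because
# they are not valid Python:
#   old Pyrex:    for var from begin <= var < end:
#   Pyrex 0.9.7:  for begin <= var < end:


def loop_with_range(begin, end):
    """Python spelling: the loop variable is named first, then its values."""
    seen = []
    for var in range(begin, end):
        seen.append(var)
    return seen


def loop_c_style(begin, end):
    """Rough model of the C loop: for (var = begin; var < end; var++)."""
    seen = []
    var = begin
    while var < end:
        seen.append(var)
        var += 1
    return seen
```

Both helpers return the same list for ordinary integer bounds, which is
the case the proposed range() conversion relies on. Overflow of a C
integer is not modelled here.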
From dalcinl at gmail.com Sat May 10 23:06:51 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Sat, 10 May 2008 18:06:51 -0300
Subject: [Cython] String interning and Python 3
In-Reply-To: <4825AFFB.6040305@behnel.de>
References: <4825AFFB.6040305@behnel.de>
Message-ID:

Stefan,

If (4) is doable without much effort, then I believe this is the right
way.

As you said, one of the points of string interning is saving memory.
But there is also another very important benefit. Look at this from the
Python 2.6 sources in stringobject.c

int
_PyString_Eq(PyObject *o1, PyObject *o2)
{
    PyStringObject *a = (PyStringObject*) o1;
    PyStringObject *b = (PyStringObject*) o2;
    return Py_SIZE(a) == Py_SIZE(b)
      && *a->ob_sval == *b->ob_sval
      && memcmp(a->ob_sval, b->ob_sval, Py_SIZE(a)) == 0;
}

As you can see the line with '*a->ob_sval == *b->ob_sval' provides a
fast path for string equality. So if o1 and o2 are the same (interned)
strings then with a pointer comparison you avoid the memcmp call
altogether. And this is very, very important in dictionary lookups to
make that operation faster in the case the keys are strings.

On 5/10/08, Stefan Behnel wrote:
> Hi,
>
> I'm wondering how to continue the support for this feature given the fact that
> identifiers are Unicode strings in Py3. We currently only intern byte strings
> that look like Python identifiers, so in Py3, they simply no longer look like
> identifiers, as they are not Unicode strings.
>
> I can see four ways how to deal with this:
>
> 1) drop string interning completely
>
> 2) disable string interning in Py3 and use normally created byte strings instead
>
> 3) keep separate sets of identifier-like byte strings and unicode strings in
> the compiler and write them into the C file. Then, depending on the Python
> version, either intern the byte strings or the unicode strings, and create the
> other set as un-interned strings.
>
> 4) keep the information if a string should be interned for all strings we deal
> with (bytes and unicode), remove the intern tab and merge it with the general
> string tab by adding an additional field "intern". Then __Pyx_InitStrings()
> would create the strings differently depending on the compile time Python
> version, i.e., it would intern Unicode identifiers in Py3 and byte string
> identifiers in Py2, and create everything else as normal strings.
>
> Personally, I favour 4) - although I could live with 1) - but since I'm not
> quite sure what the original intention of string interning was (saving
> memory?), I'd like to hear other opinions first.
>
> Stefan
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Sat May 10 23:30:51 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Sat, 10 May 2008 18:30:51 -0300
Subject: [Cython] cython add vs __add__
In-Reply-To: <482510A1.7090806@canterbury.ac.nz>
References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>
 <482510A1.7090806@canterbury.ac.nz>
Message-ID:

Greg, the (Cython) docs say this

Special Method Table
---------------------------------

This table lists all of the special methods together with their
parameter and return types. A parameter named self is of the type the
method belongs to. Other untyped parameters are generic Python objects.

What do you understand by 'A parameter named self is of the type the
method belongs to'. Anyway, don't you believe the current behavior is a
bit counter-intuitive?
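
As background to the question above, the following is a rough
plain-Python model of why the first argument of an arithmetic method on
an extension type need not be the instance itself. It assumes the
C-level behaviour described in this thread: one slot function serves
both `x + other` and `other + x`, and it receives the operands in their
original order. The names Meters, add_slot and slot_dispatch_add are
hypothetical and exist only for this illustration; this is not how
Pyrex or Cython is implemented.

```python
class Meters:
    """Stand-in for an extension type; add_slot plays the role of nb_add."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def add_slot(left, right):
        # Either operand may be the Meters instance, so both are checked.
        if isinstance(left, Meters) and isinstance(right, Meters):
            return Meters(left.value + right.value)
        if isinstance(left, Meters) and isinstance(right, (int, float)):
            return Meters(left.value + right)
        if isinstance(right, Meters) and isinstance(left, (int, float)):
            return Meters(left + right.value)
        return NotImplemented


def slot_dispatch_add(left, right):
    """Very rough model of picking a slot for `left + right`."""
    for operand in (left, right):
        slot = getattr(type(operand), "add_slot", None)
        if slot is not None:
            # The operands keep their original order, whichever type
            # supplied the slot.
            result = slot(left, right)
            if result is not NotImplemented:
                return result
    raise TypeError("unsupported operand types")
```

With this model, slot_dispatch_add(3, Meters(2)) reaches add_slot with
the integer as the first argument, which is the situation a typed
`self` declaration on `__add__` cannot assume away.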
On 5/10/08, Greg Ewing wrote:
> Lisandro Dalcin wrote:
> > I believe you are right, it seems a bug,
> >
> > > cdef class myTest
> >     def __add__(myTest self, other): # note declaration: myTest self
> >         return self.thisPtr.add(other)
> >
> It's not a bug. With extension types, the first argument
> of operator methods isn't necessarily self, so it's not
> automatically typed as such.
>
> See the section on Special Methods of Extension Types in
> the docs for a full explanation.
>
> --
>
> Greg
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Sun May 11 00:03:27 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Sat, 10 May 2008 19:03:27 -0300
Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7
In-Reply-To: <48259663.3070009@behnel.de>
References: <48243C21.2020207@canterbury.ac.nz>
 <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu>
 <48259663.3070009@behnel.de>
Message-ID:

On 5/10/08, Stefan Behnel wrote:
> I don't quite see why we shouldn't just always convert
>
> for var in range(begin, end):
>
> to
>
> for var in begin <= var < end:
>
>
> *iff* var is cdef-ed as a C integer type.

I'm definitely +1 on this!!

Regarding the overflow issue, perhaps a check can be added before
entering the loop.

And Cython should take care if the user modifies the range() arguments
inside the loop. What will you do in this case? IMHO, matching the
Python semantics is the right way.
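
To make the point about modified range() arguments concrete, here is a
small plain-Python sketch (not Cython output). Python evaluates the
arguments of range() once, before the first iteration, so rebinding
`end` inside the body does not shorten the loop. A naive C-style
translation that re-reads `end` on every test would behave differently,
while one that copies the bound first would match. All three function
names are hypothetical and chosen only for this illustration.

```python
def python_range_loop(begin, end):
    """Python semantics: the bounds are read once, when range() is called."""
    visited = []
    for var in range(begin, end):
        visited.append(var)
        end = begin  # rebinding `end` has no effect on the running loop
    return visited


def naive_c_style_loop(begin, end):
    """Naive translation: `end` is re-read on every loop test."""
    visited = []
    var = begin
    while var < end:
        visited.append(var)
        end = begin  # this now cuts the loop short
        var += 1
    return visited


def snapshot_c_style_loop(begin, end):
    """Translation that copies the bound once, matching Python."""
    visited = []
    var, stop = begin, end  # snapshot taken before the loop starts
    while var < stop:
        visited.append(var)
        end = begin  # no effect: the loop tests `stop`, not `end`
        var += 1
    return visited
```

For bounds 0 and 4 the first and third helpers visit 0, 1, 2, 3, whereas
the naive version stops after visiting 0.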
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Sun May 11 02:04:16 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 12:04:16 +1200
Subject: [Cython] ANN: Pyrex 0.9.7
In-Reply-To: <48253372.9060802@behnel.de>
References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de>
 <48250902.2020804@canterbury.ac.nz> <48253372.9060802@behnel.de>
Message-ID: <48263800.8070606@canterbury.ac.nz>

Stefan Behnel wrote:

> Wouldn't special casing a type test for builtins and extension classes be
> better here than adding new functions?

I'm not sure what you mean by that. There's no run-time distinction
between a builtin class, an extension class implemented with Pyrex, or
an extension class created some other way. They're all just types.

I could make the Pyrex version of isinstance() behave differently, but
then it wouldn't have quite the same semantics as it does in Python.
That could be confusing, especially since you aren't guaranteed to be
always getting the directly-called version.

Rather than fiddle with the semantics, I felt it was better to provide
different functions -- EIBTI and all that.

They're not really new functions, anyway -- they're just exposing
functions that are in the Python API, and they're the ones you would
use if you were writing the extension module in C.

> I mean, we already had that with getattr3(), where the right
> thing to do would have been to fix builtin functions.

That was a matter of pragmatism. I don't currently have any mechanism
for dealing with a C function that can have more than one signature.
I would have had to build a very special case into the guts of the
compiler somewhere, which I didn't feel was worth doing just for that
one function. So I did the next best thing and provided something that
works, even if it's not ideal. I can always come back and do something
else later.

I think we're coming up against those philosophical differences again.
Pyrex code is not Python code, and doesn't pretend to be. I try to make
things compatible where reasonably possible, but it's not an overriding
principle.

> So if you added doctests
> to your test modules (as we do for ours), you could still validate the
> resulting source code for Pyrex, but we could both join forces and benefit
> from a growing test suite.

Okay, so you just want me to run the Cython tests as well. That
shouldn't be too hard. Is there somewhere I can download them from?

--
Greg

From greg.ewing at canterbury.ac.nz Sun May 11 02:23:12 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 12:23:12 +1200
Subject: [Cython] String interning and Python 3
In-Reply-To: <4825AFFB.6040305@behnel.de>
References: <4825AFFB.6040305@behnel.de>
Message-ID: <48263C70.4000400@canterbury.ac.nz>

Stefan Behnel wrote:

> Personally, I favour 4) - although I could live with 1) - but since I'm not
> quite sure what the original intention of string interning was (saving
> memory?)

No, it's the same reason Python does it -- to speed up namespace
lookups.

--
Greg

From greg.ewing at canterbury.ac.nz Sun May 11 02:53:03 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 12:53:03 +1200
Subject: [Cython] cython add vs __add__
In-Reply-To:
References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com>
 <482510A1.7090806@canterbury.ac.nz>
Message-ID: <4826436F.90608@canterbury.ac.nz>

Lisandro Dalcin wrote:

> 'A parameter named self is of the type the
> method belongs to'.
I mean the documentation indicates when a parameter has the type of
'self' by using the word 'self' in the table. It *doesn't* mean that
naming it 'self' in your code will give it the type of self.

Sorry about the confusion; I'll see if I can re-word that part of the
docs to make it clearer.

> Anyway, don't you believe the current behavior is
> a bit counter-intuitive?

It's different from Python, but that's because the whole way that the
operator methods work at the C type slot level is very different. It's
something you just have to know about and deal with. The section on
"Arithmetic Methods" explains what's going on.

--
Greg

From greg.ewing at canterbury.ac.nz Sun May 11 02:55:45 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 12:55:45 +1200
Subject: [Cython] cimport'ing module with cdef public declarations
In-Reply-To: <4825D1A2.5000900@semipol.de>
References: <4825D1A2.5000900@semipol.de>
Message-ID: <48264411.7070907@canterbury.ac.nz>

Johannes Wienke wrote:

> a few weeks ago I reported a problem[1] about cimport'ing a module that
> contains cdef public declarations.

> from ship.icewing.gui cimport plug_observ_data

With extension modules that live inside packages, you need to make sure
that the compiler knows the fully qualified name of the module.

With the current version of Pyrex, you need to do this by naming the
source files with the full dotted name, e.g. ship.icewing.gui.pyx and
ship.icewing.gui.pxd.

But Cython does things differently in this area, and I'm not sure
exactly what you need to do.

--
Greg

From wstein at gmail.com Sun May 11 03:12:59 2008
From: wstein at gmail.com (William Stein)
Date: Sat, 10 May 2008 18:12:59 -0700
Subject: [Cython] Arc Riley, PyMill, and FUD
Message-ID: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com>

Hi Cython-Devel,

Mike Hansen pointed out to me that this Arc Riley guy is making a lot
of noise today about forking Pyrex and Cython.
Everyone here should be aware of this if you aren't already. Here's
Arc's blog post:

http://arcriley.blogspot.com/2008/05/radical-redirection.html

Here's a mailing list post where he outlines why and how:

http://www.pysoy.org/pipermail/pysoy-dev/2008-May/000173.html

Here's the PyMill project website:
http://pysoy.org:8000/

There are in my opinion some confused and wrong statements above about
Pyrex/Cython there. Judge for yourself. Basically he takes all the good
aspects of the many valuable recent discussions and brainstorming
threads that have happened on cython-devel and transforms them into a
bunch of FUD.

 -- William

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

From greg.ewing at canterbury.ac.nz Sun May 11 03:18:29 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 13:18:29 +1200
Subject: [Cython] cimport'ing module with cdef public declarations
In-Reply-To: <48264411.7070907@canterbury.ac.nz>
References: <4825D1A2.5000900@semipol.de> <48264411.7070907@canterbury.ac.nz>
Message-ID: <48264965.2010008@canterbury.ac.nz>

I wrote:

> With extension modules that live inside packages, you need
> to make sure that the compiler knows the fully qualified
> name of the module.

Sorry, I should have read what you wrote more carefully. The problem
isn't what I thought it was.

I'm not sure I can help you with this, because I believe Cython handles
cimporting of functions a bit differently from Pyrex. But in Pyrex you
wouldn't need the "public" declaration, as it's only for making
functions available to external C code, not other Pyrex modules.
--
Greg

From arcriley at gmail.com Sun May 11 03:37:11 2008
From: arcriley at gmail.com (Arc Riley)
Date: Sat, 10 May 2008 21:37:11 -0400
Subject: [Cython] Arc Riley, PyMill, and FUD
In-Reply-To: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com>
References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com>
Message-ID:

Take it as FUD if you want.

The truth of the matter is we have our timeline, and we need a
language that is going to work for us, and neither Pyrex nor Cython
fit the bill. In our discussions tracking Cython's continuing
changes, given the amount of work it'd take us to modify Cython to our
needs and maintain a variant given radical changes being discussed, is
completely unfeasible.

You may want to consider, however, that while you have very good
reasons for making choices, and I am not at all saying you should not
be making those exact choices, those same choices don't work for
everyone.

On Sat, May 10, 2008 at 9:12 PM, William Stein wrote:
> Hi Cython-Devel,
>
> Mike Hansen pointed out to me that this Arc Riley guy is making
> a lot of noise today about forking Pyrex and Cython. Everyone
> here should be aware of this if you aren't already. Here's Arc's
> blog post:
>
> http://arcriley.blogspot.com/2008/05/radical-redirection.html
>
> Here's a mailing list post where outlines why and how:
>
> http://www.pysoy.org/pipermail/pysoy-dev/2008-May/000173.html
>
> Here's the PyMill project website:
> http://pysoy.org:8000/
>
> There are in my opinion some confused and wrong statements
> above about Pyrex/Cython there. Judge for yourself.
> Basically he takes all the good aspects of the many valuable
> recent discussions and brainstorming threads that have happened
> on cython-devel and transforms them into a bunch of FUD.
>
> -- William
>
> --
> William Stein
> Associate Professor of Mathematics
> University of Washington
> http://wstein.org
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

From ggellner at uoguelph.ca Sun May 11 04:06:04 2008
From: ggellner at uoguelph.ca (Gabriel Gellner)
Date: Sat, 10 May 2008 22:06:04 -0400
Subject: [Cython] Comments on Documentation
In-Reply-To: <5cddadeb0805030027qe47e3e1y311c242ad2456680@mail.gmail.com>
References: <5cddadeb0805030027qe47e3e1y311c242ad2456680@mail.gmail.com>
Message-ID: <20080511020604.GA25681@basestar>

Thanks for the changes, I will apply them next week (currently getting
ready for a conference ...). If you are feeling ambitious you can try
to make a patch against:

http://hg.cython.org/cython-docs/

If not don't worry, your email was very clear. Thanks for the help.

Gabriel

On Sat, May 03, 2008 at 03:27:14AM -0400, David Roe wrote:
> Hey guys,
> I saw the post about the documentation at
> http://www.mudskipper.ca/cython-doc/ on sage-devel, and have a few
> typos to point out. It looks awesome though.
>
> 1) Indentation problem
> (http://www.mudskipper.ca/cython-doc/docs/extension_types.html):
>
> cdef Shrubbery another_shrubbery(Shrubbery sh1):
>     cdef Shrubbery sh2
>     sh2 = Shrubbery()
>     sh2.width = sh1.width
>     sh2.height = sh1.height
>     return sh2
>
> 2) Another indentation error
> (http://www.mudskipper.ca/cython-doc/docs/extension_types.html):
> cdef class CheeseShop:
>
>     cdef object cheeses
>
>     def __cinit__(self):
>         self.cheeses = []
>
>     property cheese:
>
>         def __get__(self):
>             return "We don't have: %s" % self.cheeses
>
>         def __set__(self, value):
>             self.cheeses.append(value)
>
>         def __del__(self):
>             del self.cheeses[:]
>
> 3) Again (http://www.mudskipper.ca/cython-doc/docs/extension_types.html):
> cdef class Parrot:
>
>     cdef void describe(self):
>         print "This parrot is resting."
>
> cdef class Norwegian(Parrot):
>
>     cdef void describe(self):
>         Parrot.describe(self)
>         print "Lovely plumage!"
>
> 4) This should be a header?
> (http://www.mudskipper.ca/cython-doc/docs/sharing_declarations.html):
> What a Definition File contains A definition file can contain:
>
> 5) A linebreak error in
> (http://www.mudskipper.ca/cython-doc/docs/sharing_declarations.html):
> "cdef float cube(float)" should go before "spammery.pyx"
>
> 6) The Sharing extension types example is confusing because the file
> names are mixed into the code. They should be split (and in different
> colors as in the "Using cimport to resolve naming conflicts" section).
>
> 7) The highlighting on "The cdef extern from clause does three
> things:" is wrong
> (http://www.mudskipper.ca/cython-doc/docs/external_C_code.html)
>
> 8) There's a table missing in the "Styles of struct, union and enum
> declaration" section
> (http://www.mudskipper.ca/cython-doc/docs/external_C_code.html)
>
> 9) In "Declaring a function as callable without the GIL," I think
> there should be a with before the nogil in the final example. The
> last sentence may have a typo as well. I could be wrong though: I
> don't actually know the syntax you're explaining.
>
> 10) Capitalization problem
> (http://www.mudskipper.ca/cython-doc/docs/pyrex_differences.html)
> Cdef bint b = x
>
> David
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

From wstein at gmail.com Sun May 11 04:22:12 2008
From: wstein at gmail.com (William Stein)
Date: Sat, 10 May 2008 19:22:12 -0700
Subject: [Cython] Arc Riley, PyMill, and FUD
In-Reply-To:
References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com>
Message-ID: <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com>

On Sat, May 10, 2008 at 6:37 PM, Arc Riley wrote:
> Take it as FUD if you want.
> > The truth of the matter is we have our timeline, and we need a > language that is going to work for us, and neither Pyrex nor Cython > fit the bill. In our discussions tracking Cython's continuing > changes, given the amount of work it'd take us to modify Cython to our > needs and maintain a variant given radical changes being discussed, is > completely unfeasible. > > You may want to consider, however, that while you have very good > reasons for making choices, and I am not at all saying you should not > be making those exact choices, those same choices don't work for > everyone. > 1. You wrote: "Cython was founded with the primary goal of remaining compatible with Pyrex." I started Cython, so I can say with certainty that the above was absolutely not the primary goal for starting Cython. The goals for Cython are listed on the cython website and they are: "(1) good test suite, (2) easyinstall support, (3) good documentation, (4) make cython part of Python (like ctypes), (5) compile most python code, (6) mitigate or eliminate the need for users to invoke the Python/C API directly without sacrificing performance." 2. You write "Cython guys are obviously upset, because Greg didn't consult -anyone- for even opinions before doing so, a behavior many of us may recognize the author of Soya demonstrating frequently, and they are realizing that their goal of remaining Pyrex-compatible puts them at Greg's mercy as far as language changes." This is total FUD. The Cython developers (1) in no way have that goal, and (2) are certainly not "at Greg's mercy". And by the way, if anything Greg has in my opinion been tremendously supportive of Cython lately, having joined the Cython mailing list, answering and asking many questions, etc. Thanks Greg! 3. 
"build the main mill [cython/pyrex fork] package from scratch, using the wisdom of the Cython guys (reverse engineering in some cases), even the people working on Cython agree it needs to be rewritten since the barrier to entry for new developers is obscenely high". One person who is new to Cython made a claim on cython-devel about the barrier to entry. That is much different than "the people working on Cython agreeing the barrier to entry for new developers is obscenely high". There certainly is a barrier to entry -- Cython is a compiler of a highly nontrivial language after all. But I think describing it as obscenely high due to the particular implementation is FUD. And it is exactly the kind of FUD that will hurt the Cython project right now. Arc, I have nothing against you starting whatever C-extension generating project you want, etc. I hope the above helps clarify some misunderstandings you might have about Cython/Pyrex. -- William From arcriley at gmail.com Sun May 11 04:51:06 2008 From: arcriley at gmail.com (Arc Riley) Date: Sat, 10 May 2008 22:51:06 -0400 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> Message-ID: > 1. You wrote: "Cython was founded with the primary goal of remaining > compatible with Pyrex." Primary may be a misnomer, but it is listed in your faq. If you are changing your mind on this, it may be something to edit. IMHO, it would be a good move at this juncture. > This is total FUD. The Cython developers (1) in no way have that > goal, and (2) are certainly not "at Greg's mercy". FUD implies that I'm intentionally trying to spread Fear, Uncertainty, and Doubt. And the latter is true if you wish to remain Pyrex compatible. 
It's true whenever you provide compatibility with something: whenever Greg decides to change the language, you'll need to support his changes. > 3. "build the main mill [cython/pyrex fork] package > from scratch Your added [] is inconsistent with that sentence - mill cannot be a fork if it is being written from scratch. > One person who is new to Cython made a claim on cython-devel > about the barrier to entry. That is much different than "the > people working on Cython > agreeing the barrier to entry for new developers is obscenely high". You assume your mailing list is the only place I've gotten that impression from. We've had several discussions on IRC about the state of the codebase. You also missed part of that statement: "agree" does not mean "Cython devs all agree" but "[some] Cython devs agree [with the assessment we have made]". You're digging into our mailing list archives, and that's fine, but you have to remember when reading these that the people reading them are having discussions both on and off the mailing list. You can feel free to post a rebuttal in reply to our mailing list, since I believe I'm the only one on this one, but it really matters not to either Cython or us. It's not like people reading my blog posting on some RSS feed syndicate are going to read our list archives. > There certainly is a barrier to entry -- Cython is a compiler of > a highly nontrivial language after all. But I think describing > it as obscenely high due to the particular implementation is FUD. > And it is exactly the kind of FUD that will hurt the Cython project > right now. You seem to like the word FUD; however, you are misusing it. FUD is something that sleazy marketing types conjure up intentionally to mislead the public into not buying a product.
If you look at the blog posting itself from a neutral POV you'll see I included nothing which would lead someone to choose our solution (which is currently vaporware, as we're just starting it, and we have no intention at marketing to the larger community) over Cython or otherwise twist perceptions. When I post to my blog, knowing how many places it's syndicated, I'm usually careful in the words I use. The mailing list is an internal development list with maybe 25 people on it, we're a lot more casual there and venting is common. If there is false information in the blog posting, something specific, let me know and I'll fix it. Any fix made will get to everywhere it's posted. From wstein at gmail.com Sun May 11 05:14:26 2008 From: wstein at gmail.com (William Stein) Date: Sat, 10 May 2008 20:14:26 -0700 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> Message-ID: <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> On Sat, May 10, 2008 at 7:51 PM, Arc Riley wrote: >> 1. You wrote: "Cython was founded with the primary goal of remaining >> compatible with Pyrex." > > Primary may be a misnomer, but it is listed in your faq. If you are > changing your mind on this, it may be something to edit. IMHO, it > would be a good move at this juncture. I just read our FAQ and it does not say that. The closest I got was "It usually follows in compatibility once Pyrex implements a comparable feature, though." >> This is total FUD. The Cython developers (1) in no way have that >> goal, and (2) are certainly not "at Greg's mercy". > > FUD implies that I'm intentionally trying to spread Fear, Uncertainty, > and Doubt. > > And the latter is true if you wish to remain Pyrex compatible. It's > true whenever you provide compatibility with something, whenever Greg > decides to change the language you'll need to support his changes. 
We definitely have made no commitment to do that, haven't said we will, and historically haven't done that in a rigid manner. That said Robert and Stefan have put a lot of effort into merging most of the great work Greg does on Pyrex into Cython. >> 3. "build the main mill [cython/pyrex fork] package >> from scratch > > your added [] is inconsistent with that sentence - mill cannot be a > fork if it is being written from scratch. I wrote "fork" because that's exactly how you describe PyMill on the Pymill trac server page: http://pysoy.org:8000/ That page disappeared so I can't paste from it. > You seem to like the word FUD, however, you are misusing it. > > FUD is something that sleezy marketing types conjure up intentionally > to mislead the public into not buying a product. If you look at the It's FUD about the Cython and Pyrex projects. Anyway, I have the impression you thrive on attention so I'm not going to bother with this further. -- William -- William Stein Associate Professor of Mathematics University of Washington http://wstein.org From arcriley at gmail.com Sun May 11 05:54:52 2008 From: arcriley at gmail.com (Arc Riley) Date: Sat, 10 May 2008 23:54:52 -0400 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> Message-ID: > I just read our FAQ and it does not say that. "The intention is to make it a drop-in replacement for existing Pyrex code." If this is not the meaning you intended, then you may want to change the way the FAQ reads. > I wrote "fork" because that's exactly how you describe > PyMill on the Pymill trac server page: http://pysoy.org:8000/ > That page disappeared so I can't paste from it. 
That "page" is a development tracd that is not open to the public and nothing on it is official in any way. The numerous broken images on it should have been a clue, if the port wasn't enough. Your reposting it here was the reason it was moved to another port. The reason we're not using the name "PyMill" in any public posting is that may not even be the name we end up with. There's a good chance it isn't. We're nowhere near the point of publishing a definition of it. You'll notice my blog entry didn't use the name. In fact your entire panic mode seems to be derived from a warped sense of importance of a posting to our internal development list a link you followed from it to a draft site we're working from. Nothing you pasted, besides a link to the blog entry, was read by the general Python community until you posted it here. Again - if there's something in the blog posting that is inaccurate, specify it, and I'll correct it. That is the only referenced place where FUD (sic) could have been spread, since everything else was internal. From wstein at gmail.com Sun May 11 06:06:07 2008 From: wstein at gmail.com (William Stein) Date: Sat, 10 May 2008 21:06:07 -0700 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> Message-ID: <85e81ba30805102106u3ab17ddlc95b1b70dba2a0e9@mail.gmail.com> On Sat, May 10, 2008 at 8:54 PM, Arc Riley wrote: >> I just read our FAQ and it does not say that. > > "The intention is to make it a drop-in replacement for existing Pyrex code." > > If this is not the meaning you intended, then you may want to change > the way the FAQ reads. > You're right. I've changed that sentence to read "The intention is to make it for the most part a drop-in replacement for existing Pyrex code, though some changes to that existing code may have to be made." 
Thanks for your other clarifications as well. -- William -- William Stein Associate Professor of Mathematics University of Washington http://wstein.org From gfurnish at gfurnish.net Sun May 11 09:30:26 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 11 May 2008 01:30:26 -0600 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> Message-ID: <8f8f8530805110030l534317d7j763f1534a9a509c3@mail.gmail.com> Arc, at this point all I ever hear from you is how Cython is so horrible and has to be rewritten to be even tolerable. What exactly is so horrible about Cython? You have never given any specifics on why Cython is so horrible that it is better to start from scratch and thus violate about every good rule of programming (see Joel on Software). If this isn't intended to be FUD why don't you name some specific features that require a full rewrite as opposed to making grand movements, which are in many cases flat out wrong, that seem designed only to attack Cython and Pyrex? Are you actually going to produce code at some point? Last I heard there was some grand plan to fork Cython and license it under GPL3 that went predictably nowhere. Why are you taking on an even grander project? If your goal is to produce a simple graphics engine why are you getting into the compiler business? On Sat, May 10, 2008 at 9:54 PM, Arc Riley wrote: >> I just read our FAQ and it does not say that. > > "The intention is to make it a drop-in replacement for existing Pyrex code." > > If this is not the meaning you intended, then you may want to change > the way the FAQ reads. > > >> I wrote "fork" because that's exactly how you describe >> PyMill on the Pymill trac server page: http://pysoy.org:8000/ >> That page disappeared so I can't paste from it. 
> > That "page" is a development tracd that is not open to the public and > nothing on it is official in any way. The numerous broken images on > it should have been a clue, if the port wasn't enough. Your reposting > it here was the reason it was moved to another port. > > The reason we're not using the name "PyMill" in any public posting is > that may not even be the name we end up with. There's a good chance > it isn't. We're nowhere near the point of publishing a definition of > it. You'll notice my blog entry didn't use the name. > > In fact your entire panic mode seems to be derived from a warped sense > of importance of a posting to our internal development list a link you > followed from it to a draft site we're working from. Nothing you > pasted, besides a link to the blog entry, was read by the general > Python community until you posted it here. > > Again - if there's something in the blog posting that is inaccurate, > specify it, and I'll correct it. That is the only referenced place > where FUD (sic) could have been spread, since everything else was > internal. > From languitar at semipol.de Sun May 11 10:07:55 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sun, 11 May 2008 10:07:55 +0200 Subject: [Cython] cimport'ing module with cdef public declarations In-Reply-To: <48264965.2010008@canterbury.ac.nz> References: <4825D1A2.5000900@semipol.de> <48264411.7070907@canterbury.ac.nz> <48264965.2010008@canterbury.ac.nz> Message-ID: <4826A95B.40509@semipol.de> On 05/11/2008 03:18 AM, Greg Ewing wrote: > But in Pyrex you wouldn't need the "public" > declaration, as it's only for making functions > available to external C code, not other Pyrex > modules. But that's what I need, too.
The module contains functions that must be available to external C code and others that are needed for Cython to work. Johannes From dagss at student.matnat.uio.no Sun May 11 10:10:23 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 11 May 2008 10:10:23 +0200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <8f8f8530805110030l534317d7j763f1534a9a509c3@mail.gmail.com> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <8f8f8530805110030l534317d7j763f1534a9a509c3@mail.gmail.com> Message-ID: <4826A9EF.7090305@student.matnat.uio.no> Gary Furnish wrote: > Arc, at this point all I ever hear from you is how Cython is so > horrible and has to be rewritten to be even tolerable. What exactly > is so horrible about Cython? You have never given any specifics on According to the blog post, it is "recent developments in their community" (I wonder if it's me, and he's just being sensitive and trying to protect me :-) ) > why Cython is so horrible that it is better to start from scratch and > thus violate about every good rule of programming (see Joel on > Software). If this isn't intended to be FUD why don't you name some I second that (strongly!)
with the exact URL: http://www.joelonsoftware.com/articles/fog0000000069.html Still, while the premise might seem strange to us ("Cython codebase is moving so quickly and they are so aware of possible problems with their codebase that I should rather start from scratch"), there's an important aspect to this rebuild: While whatever they build will be moving quickly as well, whatever they build will also be under their control, and only move when it is fitting for them. (I'm not jesting here: That can be a valid reason! As Arc says: """ You may want to consider, however, that while you have very good reasons for making choices, and I am not at all saying you should not be making those exact choices, those same choices don't work for everyone. """ ). I can't argue with that, even if Arc's conclusion seems strange to me. I don't know how many developer resources he has available, after all. (But I think it would be great if Arc would want to discuss any problems with the Cython release process on the Cython list; as far as I can see he only raised the point on the Pyrex list). And he does raise an important point that should be discussed: Pledging to language stability. But I'll start another thread on that. -- Dag Sverre From dagss at student.matnat.uio.no Sun May 11 10:14:32 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 11 May 2008 10:14:32 +0200 Subject: [Cython] Language stability Message-ID: <4826AAE8.6010706@student.matnat.uio.no> Given recent developments (Pyrex release with new syntax and Arc's new project) I think Cython should make some public website statements on language policy and stability. I think these are already present in the "community spirit", but we should make them very explicit. Rough suggestions aimed at starting this thread: - "Cython wants to be a dynamic work in progress and so new language features will happen quickly over the next year.
*However*, each change will be related to previously unavailable features and will have a backwards-compatible syntax."

- If we need to break syntax backwards-incompatibly, it will be done by the scheme discussed earlier: A "#lang: cython-ver" comment header. Files without such a header will be compiled with the current ("v0.9") syntax.

- Some decision should be made on Pyrex compatibility. I think it should either be flat out "the language looks like Pyrex because of its roots but is *not* Pyrex", or, we should aim to correctly compile Pyrex (including future changes) but do so only when "#lang: pyrex" is specified. (Of course, any "lang" instruction would be command-line-overridable so that existing Pyrex code could be built with a "--pyrex-mode" flag)

- We need a consistent keyword policy. I'm going to be controversial and suggest that this will be "figure it out from context", because we aim to compile any Python code. I.e.:

  * Any Python keywords are also Cython keywords (which exact keywords depends on the Python language level we are compiling)

  * Any non-Python keywords are only keywords when used in a context where they can mean a keyword instruction.

That is, we should start supporting Cython code like this:

    cdef = 4 # Created cdef variable, does not alter behaviour of cdef keyword

We already support

    final = 4

anyway... I think any new Cython keywords (over Python) are always used in contexts where they could not be a variable name so I think this is possible.

This is unusual, but because our aim is to go towards full Python support, but still support Cython code, I think such a policy is a possible compromise (the important thing is to have a solid, dependable rule we follow).

Another possibility is to only allow cdef = 4 when we are in (also on the idea-stage, from the #lang-thread) Python mode (compiling a .py-file or having a "#lang: python" header).
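
As a rough illustration of the "#lang:" header idea above, the following is a minimal Python sketch of how a front end might pick the language mode for a source file. Only the header form ("#lang: name") and the fall-back to the current syntax come from the proposal; the function name detect_lang, the default label "cython-0.9" and the override argument (standing in for a command-line switch such as the suggested "--pyrex-mode") are assumptions made for this example, not an existing Cython or Pyrex API.

```python
import re

# Hypothetical header form, following the proposal: "#lang: <name>".
_LANG_HEADER = re.compile(r"^#\s*lang:\s*(\S+)\s*$")


def detect_lang(source, default="cython-0.9", override=None):
    """Return the language mode for a source string (illustrative only).

    An explicit override (e.g. from a command-line flag) wins. Otherwise
    the leading comment lines are scanned for a "#lang:" header, and the
    default is used when no such header is found.
    """
    if override is not None:
        return override
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines may precede the header
        if not stripped.startswith("#"):
            break  # first line of real code: stop looking
        match = _LANG_HEADER.match(stripped)
        if match:
            return match.group(1)
    return default


pyrex_mode = detect_lang("#lang: pyrex\ncdef int x\n")
no_header = detect_lang("cdef int x\n")
forced = detect_lang("#lang: python\nx = 1\n", override="pyrex")
```

In this sketch a header that appears after the first line of code is ignored, so a stray "#lang:" comment deep inside a module cannot silently change how the file is compiled; whether the real scheme should behave that way is an open design choice.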
-- Dag Sverre From robertwb at math.washington.edu Sun May 11 10:23:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 01:23:12 -0700 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <48263800.8070606@canterbury.ac.nz> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> <48250902.2020804@canterbury.ac.nz> <48253372.9060802@behnel.de> <48263800.8070606@canterbury.ac.nz> Message-ID: On May 10, 2008, at 5:04 PM, Greg Ewing wrote: > Stefan Behnel wrote: >> Wouldn't special casing a type test for builtins and extension >> classes be >> better here than adding new functions? > > I'm not sure what you mean by that. There's no run-time > distinction between a builtin class, an extension class > implemented with Pyrex, or an extension class created some > other way. They're all just types. > > I could make the Pyrex version of isinstance() behave > differently, but then it wouldn't have quite the same > semantics as it does in Python. That could be confusing, > especially since you aren't guaranteed to be always > getting the directly-called version. > > Rather than fiddle with the semantics, I felt it was > better to provide different functions -- EIBTI and > all that. > > They're not really new functions, anyway -- they're > just exposing functions that are in the Python API, > and they're the ones you would use if you were writing > the extension module in C. > >> I mean, we already had that with getattr3(), where the right >> thing to do would have been to fix builtin functions. > > That was a matter of pragmatism. I don't currently have > any mechanism for dealing with a C function that can have > more than one signature. I would have had to build a very > special case into the guts of the compiler somewhere, > which I didn't feel was worth doing just for that one > function. So I did the next best thing and provided > something that works, even if it's not ideal. 
I can > always come back and do something else later. > > I think we're coming up against those philosophical > differences again. Pyrex code is not Python code, and > doesn't pretend to be. I try to make things compatible > where reasonably possible, but it's not an overriding > principle. > >> So if you added doctests >> to your test modules (as we do for ours), you could still validate >> the >> resulting source code for Pyrex, but we could both join forces and >> benefit >> from a growing test suite. > > Okay, so you just want me to run the Cython tests as > well. That shouldn't be too hard. Yes. When we add new features, we try and write tests for them. > Is there somewhere I can download them from? You can get them all here: http://hg.cython.org/cython/file/0927890724ab/tests/ - Robert From greg.ewing at canterbury.ac.nz Sun May 11 10:18:35 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 20:18:35 +1200 Subject: [Cython] cimport'ing module with cdef public declarations In-Reply-To: <4826A95B.40509@semipol.de> References: <4825D1A2.5000900@semipol.de> <48264411.7070907@canterbury.ac.nz> <48264965.2010008@canterbury.ac.nz> <4826A95B.40509@semipol.de> Message-ID: <4826ABDB.9080906@canterbury.ac.nz> Johannes Wienke wrote: > But that's what I need, too. The module contains functions that must be > available to external C code and others that are needed for Cython to work. That's fine, then. I don't think it's the cause of your problem -- it shouldn't do any harm, whether it's needed or not. Someone who knows more about the way function cimporting is supposed to work in Cython will be needed to sort this out.
-- Greg From gfurnish at gfurnish.net Sun May 11 10:26:47 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 11 May 2008 02:26:47 -0600 Subject: [Cython] Some small phase refactorings In-Reply-To: <48233D62.7000200@student.matnat.uio.no> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> <48233D62.7000200@student.matnat.uio.no> Message-ID: <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> Efficiency does matter to Sage though, and an O(1) overhead is very far from negligible. Cython uses real C vtables, so there are absolutely no dict-based lookups involved with cdef class polymorphism. -Infinity to any proposal that makes code slower. On Thu, May 8, 2008 at 11:50 AM, Dag Sverre Seljebotn wrote: > >> any replies than I guess that's an indication. I am actually a bit >> worried about how the use of decorators will impact the ability of >> Cython to compile itself into ultra-efficient C. >> > Efficiency doesn't bother me at all, and I have reasons! What the > decorators would do (does, in prototype code) is stuff the class name > and function into a dictionary at *class construction time* (or, without > metaclass support, at object construction time), i.e. a O(1) overhead > that is negligible even when running in the Python interpreter. During > tree traversal, it's just a dict lookup on class id and a dispatch to > the resulting function. > > If one were generating C++ objects then the classical visitor pattern > would be faster because real vtables are faster than dict lookups, > however Cython polymorphism is (if I'm not wrong, haven't looked at this > closely) dict-based anyway so it shouldn't make a difference. 
> > Here's another way to use 2.3 that might be acceptable: > > class WithStatementHandler(VisitorTransform): > def handle_with(self, node): ... > > matches = [ > class_match(WithStatementNode, handle_with) > ] > > > It would end up as a dict of class -> function like the other approach > (but written like a list of objects because I'd like to leave a way open > for other types of matches though). > >> It just didn't seem to add anything except another level of >> indirection, but that will of course eventually be needed. There is a >> lot of re-factoring that will need to be done. I've just got other >> obligations (general exam) to be able to put much time/thought into >> this for the next two weeks or so :-(. >> > What it added was the capability to plug in new transforms after type > analysis, but before generation, on a module-wide basis, rather than > having the transform run once for each function. > > I didn't make that very clear; I suppose because you're right, its real > value is in another layer of indirection -- but one that will be needed > and is useful during refactoring, and can allow "in-production" > refactoring, rather than applying it all at once.
> > " The only problem that cannot be solved by another layer of indirection > is too many layers of indirection " :-) > > -- > Dag Sverre > > From robertwb at math.washington.edu Sun May 11 10:28:02 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 01:28:02 -0700 Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7 In-Reply-To: References: <48243C21.2020207@canterbury.ac.nz> <205284A8-3452-4FEE-BDCD-D5DECAD61CE9@bryant.edu> <48259663.3070009@behnel.de> Message-ID: <4FBCBADF-F795-455F-9BA4-B4EF45FF212A@math.washington.edu> On May 10, 2008, at 3:03 PM, Lisandro Dalcin wrote: > On 5/10/08, Stefan Behnel wrote: >> I don't quite see why we shouldn't just always convert >> >> for var in range(begin, end): >> >> to >> >> for var from begin <= var < end: >> >> >> *iff* var is cdef-ed as a C integer type. > > I'm definitely +1 on this!! This is actually enabled in the current version of Cython: http://hg.cython.org/cython/file/0927890724ab/Cython/Compiler/Options.py > Regarding the overflow issue, perhaps > a check can be added before entering the loop. This is the semantic difference--if there will be an overflow then Cython throws an error before entering the loop (rather than looping until an overflow would occur). I think this is acceptable. > And Cython should take > care if the user modifies the range() arguments inside the loop. What > will you do in this case? IMHO, matching the Python semantics is the > right way. One can't do that (in Python or Cython) as range() is evaluated exactly once, before entering the loop, so this is a non-issue.
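
The point that range() is evaluated exactly once can be checked in plain Python. The snippet below is only an illustrative sketch (the function name count_iterations is made up for this example): it rebinds the variable used as the upper bound inside the loop body and shows that the number of iterations does not change.

```python
def count_iterations(n):
    """Loop over range(0, end) while rebinding `end` inside the body."""
    end = n
    iterations = 0
    for i in range(0, end):
        # Rebinding `end` here cannot affect the loop: range(0, end) was
        # evaluated once, before the first iteration, using the old value.
        end += 10
        iterations += 1
    return iterations, end


result = count_iterations(3)  # three iterations; end finishes at 3 + 3 * 10
```

A C-style loop that snapshots its bound before the first iteration therefore matches the Python behaviour in this respect.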
- Robert From gfurnish at gfurnish.net Sun May 11 10:29:51 2008 From: gfurnish at gfurnish.net (Gary Furnish) Date: Sun, 11 May 2008 02:29:51 -0600 Subject: [Cython] Some small phase refactorings In-Reply-To: <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> <48233D62.7000200@student.matnat.uio.no> <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> Message-ID: <8f8f8530805110129w395c93a9r535f36615fbf365@mail.gmail.com> Well, clarifying what I meant a bit more, the *biggest* speed loss anywhere is dictionary lookups. If you're going to use dictionary lookups at object time, you lose most of the advantage of using Cython to compile Cython. On Sun, May 11, 2008 at 2:26 AM, Gary Furnish wrote: > Efficiency does matter to Sage though, and an O(1) overhead is very > far from negligible. Cython uses real C vtables, so there are > absolutely no dict-based lookups involved with cdef class > polymorphism. -Infinity to any proposal that makes code slower. > > On Thu, May 8, 2008 at 11:50 AM, Dag Sverre Seljebotn > wrote: >> >>> any replies than I guess that's an indication. I am actually a bit >>> worried about how the use of decorators will impact the ability of >>> Cython to compile itself into ultra-efficient C. >>> >> Efficiency doesn't bother me at all, and I have reasons! What the >> decorators would do (does, in prototype code) is stuff the class name >> and function into a dictionary at *class construction time* (or, without >> metaclass support, at object construction time), i.e. a O(1) overhead >> that is negligible even when running in the Python interpreter.
During >> tree traversal, it's just a dict lookup on class id and a dispatch to >> the resulting function. >> >> If one were generating C++ objects then the classical visitor pattern >> would be faster because real vtables are faster than dict lookups, >> however Cython polymorphism is (if I'm not wrong, haven't looked at this >> closely) dict-based anyway so it shouldn't make a difference. >> >> Here's another way to use 2.3 that might be acceptable: >> >> class WithStatementHandler(VisitorTransform): >> def handle_with(self, node): ... >> >> matches = [ >> class_match(WithStatementNode, handle_with) >> ] >> >> >> It would end up as a dict of class -> function like the other approach >> (but written like a list of objects because I'd like to leave a way open >> up for other types of matches though). >> >>> It just didn't seem to add anything except another level of >>> indirection, but that will of course eventually be needed. There is a >>> lot of re-factoring that will need to be done. I've just got other >>> obligations (general exam) to be able to put much time/though into >>> this for the next two weeks or so :-(. >>> >> What it added was the capability to plug in new transforms after type >> analysis, but before generation, on a module-wide basis, rather than >> having the transform run once for each function. >> >> I didn't make that very clear; I suppose because you're right, it's real >> value is in another layer of indirection -- but one that will be needed >> and is useful during refactoring, and can allow "in-production" >> refactoring, rather than applying it all at once. 
>> >> " The only problem that cannot be solved by another layer of indirection >> is too many layers of indirection " :-) >> >> -- >> Dag Sverre >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > From stefan_ml at behnel.de Sun May 11 10:34:19 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 May 2008 10:34:19 +0200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <48263800.8070606@canterbury.ac.nz> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> <48250902.2020804@canterbury.ac.nz> <48253372.9060802@behnel.de> <48263800.8070606@canterbury.ac.nz> Message-ID: <4826AF8B.9010208@behnel.de> Hi Greg, Greg Ewing wrote: > Stefan Behnel wrote: >> Wouldn't special casing a type test for builtins and extension classes be >> better here than adding new functions? > > I'm not sure what you mean by that. There's no run-time > distinction between a builtin class, an extension class > implemented with Pyrex, or an extension class created some > other way. They're all just types. I didn't mean at runtime. The compiler knows how to distinguish them. > I could make the Pyrex version of isinstance() behave > differently, but then it wouldn't have quite the same > semantics as it does in Python. That could be confusing, > especially since you aren't guaranteed to be always > getting the directly-called version. I could live with different semantics even in this case: allowed_type = ExtType print isinstance(obj, allowed_type) # Python semantics and print isinstance(obj, ExtType) # non-Python semantics The Pyrex/Cython semantics would be: - testing for an explicitly named extension type or builtin type would give you a direct (sub-)type check (maybe even for tuples of subtypes? It would be nice to optimise that as well) - testing for a runtime type will fall back to Python semantics.
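To make the difference concrete, here is a minimal sketch in plain Python (not Pyrex/Cython output). The names Base, Impostor and strict_isinstance are made up for illustration, and the strict check only stands in for what a compiler could emit when the extension type is named explicitly:

```python
class Base(object):
    """Stands in for an extension type with a fixed struct layout."""


class Impostor(object):
    """Unrelated class that claims to be a Base through __class__."""

    # Overriding __class__ is the "type override" discussed above.
    __class__ = property(lambda self: Base)


def strict_isinstance(obj, cls):
    # Looks only at the real type of the object and ignores any
    # __class__ override, so a positive answer implies the real layout.
    return issubclass(type(obj), cls)


obj = Impostor()
python_semantics = isinstance(obj, Base)         # also consults obj.__class__
strict_semantics = strict_isinstance(obj, Base)  # real type only
```

With Python semantics the impostor passes the test; with the direct type check it does not, which is what you want before relying on a struct layout.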
I think those are the perfect semantics for the context of Pyrex/Cython. Besides, the actual difference of allowing or disallowing a type override through ".__class__" is so tiny that almost no user will actually notice, and those who need that distinction can well be asked to use a somewhat less straight forward syntax. That's much better than forcing users to use a non-standard function for the much more likely case that they do the type test to make sure you can rely on a specific struct layout. One of the major use cases of Pyrex/Cython is wrapping external libraries with Python classes, after all. > Rather than fiddle with the semantics, I felt it was > better to provide different functions -- EIBTI and > all that. I would be fine with that if the new function was for the less common case. But I would say it's for the most common case, which would better be handled by the normal function that is named as in Python. >> So if you added doctests >> to your test modules (as we do for ours), you could still validate the >> resulting source code for Pyrex, but we could both join forces and >> benefit >> from a growing test suite. > > Okay, so you just want me to run the Cython tests as > well. That shouldn't be too hard. Sort of. I just want to avoid doubling the work on both sides. After all, I prefer investing more time into the compiler than into the test suite. > Is there somewhere I can download them from? Yes, as Robert just pointed out. Note that most of them are actually based on your own test suite, but most of the tests (those in the tests/run/ directory) have an added doctest that is executed by the test runner script after building the module. You will also notice that the tests/errors/ directory has modules that produce compiler errors, and the expected error messages are part of the modules themselves. 
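Roughly, the idea for the tests/run/ modules looks like the following. This is a plain-Python sketch under assumptions, not the actual runner: square is a made-up function, and the real runner first builds the module with the compiler and only then executes its doctests.

```python
import doctest


def square(x):
    """
    The expected results live right next to the code under test.

    >>> square(3)
    9
    >>> square(-2)
    4
    """
    return x * x


# Collect and run the docstring examples, much as a test runner would
# do after building and importing the module.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(square, "square", module=False,
                        globs={"square": square}):
    runner.run(test)
results = runner.summarize(verbose=False)
```

results.failed and results.attempted then tell the runner whether the module behaved as documented.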
When I wrote the test runner, my main goal was to keep the expected results in the same file as the test code itself, so that you only have one file to edit and look at (ok, minus .pxd and .h files). Doctests are just perfect for this. Stefan From robertwb at math.washington.edu Sun May 11 10:38:30 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 01:38:30 -0700 Subject: [Cython] String interning and Python 3 In-Reply-To: References: <4825AFFB.6040305@behnel.de> Message-ID: <31D60ED7-91BE-435B-A04B-AF14E733FFAB@math.washington.edu> On May 10, 2008, at 2:06 PM, Lisandro Dalcin wrote: > Stefan , If (4) is doable with no much effort, then I believe this is > the right way. > > As you said, one of the points of string interning is saving memory. > But there is also another very important benefit. Look at this for > Python 2.6 sources in stringobject.c > > int > _PyString_Eq(PyObject *o1, PyObject *o2) > { > PyStringObject *a = (PyStringObject*) o1; > PyStringObject *b = (PyStringObject*) o2; > return Py_SIZE(a) == Py_SIZE(b) > && *a->ob_sval == *b->ob_sval > && memcmp(a->ob_sval, b->ob_sval, Py_SIZE(a)) == 0; > } > > > As you can see the line with '*a->ob_sval == *b->ob_sval' provides a > fast path for string equality. So if o1 and o2 are the same (interned) > strings then with a pointer comparison you avoid at all the memcmp > call. And this is very, very important in dictionary lookups to make > that operation faster in the case the keys are strings. I want to second this, we want to keep interned strings if at all possible for the above reason. Making everything a dictionary lookup is one of the ways Python is so dynamic, but it means that the speed of dictionary lookups is extremely important (in fact I would say this is one of the primary bottlenecks of Python). It also offers the advantage that the lookup strings don't need to be re-allocated each time they're needed. 
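The effect of interning can be seen from plain Python. This is only a sketch for Python 3, where the function is spelled sys.intern() (in Python 2 it is the builtin intern()), and the string used is arbitrary. As far as I can tell, the `*a->ob_sval == *b->ob_sval` test quoted above compares the first byte of the two strings; the identity shortcut for interned keys sits in the dict lookup itself, which checks for the very same key object before falling back to a full comparison.

```python
import sys

# Build the same text twice at run time so that the two objects are
# not merged by the compiler's constant handling.
first = "".join(["expansion", "_iter"])
second = "".join(["expansion", "_iter"])

# Equal content; comparing two distinct objects has to look at the bytes.
equal_content = (first == second)

# Interning maps equal strings to one shared object, so an identity
# (pointer) comparison is enough to know that they match.
interned_first = sys.intern(first)
interned_second = sys.intern(second)
same_object = interned_first is interned_second

# A dict keyed by the interned string can then hit on identity alone.
table = {interned_first: "found"}
lookup_result = table[interned_second]
```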
> On 5/10/08, Stefan Behnel wrote: >> Hi, >> >> I'm wondering how to continue the support for this feature given >> the fact that >> identifiers are Unicode strings in Py3. We currently only intern >> byte strings >> that look like Python identifiers, so in Py3, they simply no >> longer look like >> identifiers, as they are not Unicode strings. >> >> I can see four ways how to deal with this: >> >> 1) drop string interning completely >> >> 2) disable string interning in Py3 and use normally created byte >> strings instead >> >> 3) keep separate sets of identifier-like byte strings and unicode >> strings in >> the compiler and write them into the C file. Then, depending on >> the Python >> version, either intern the byte strings or the unicode strings, >> and create the >> other set as un-interned strings. >> >> 4) keep the information if a string should be interned for all >> strings we deal >> with (bytes and unicode), remove the intern tab and merge it with >> the general >> string tab by adding an additional field "intern". Then >> __Pyx_InitStrings() >> would create the strings differently depending on the compile >> time Python >> version, i.e., it would intern Unicode identifiers in Py3 and >> byte string >> identifiers in Py2, and create everything else as normal strings. >> >> Personally, I favour 4) - although I could live with 1) - but >> since I'm not >> quite sure what the original intention of string interning was >> (saving >> memory?), I'd like to hear other opinions first. 
>> >> Stefan >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > > -- > Lisandro Dalcín > --------------- > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From dagss at student.matnat.uio.no Sun May 11 10:55:42 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 11 May 2008 10:55:42 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> Message-ID: <4826B48E.3060708@student.matnat.uio.no> > ---------a.pxd---------- > cdef class A: > cdef int len > cdef int* data > > cdef inline [final?] int __getitem__(A a, int i): > """ > Note that subtypes can't override this. > """ > if i < 0 or i > a.len: > raise IndexError > return data[i] > > ---------b.pyx-------- > > from a cimport A > cdef A(len=10) a = a([1,2,3,4,5,6,7,8,9,10]) # I'll leave the init > function to your imagination > print a[9] # the code from __getitem__ gets inlined here, and since > len is known the a.len is resolved to 10 at compile time. > > (Here a.len tries to do a lookup first on the compile time type of a, > and that failing the runtime type of a. The compile time types need > not be struct members, but if they're not then they must be specified > because the "runtime" lookup would fail.)
On the type arguments solution: My first thought is that I don't think it will cover all cases -- if native C++ template support is added, for instance, then the type arguments must probably behave differently there. So type arguments in themselves are a more generic thing (related to the parser and type handling etc.), and this specifies one concrete use of them (i.e., what happens to "cdef class"es when given type parameters). At first it struck me as way too magic. Then it grew on me. But then I dislike it again :-) So these are some non-conclusive thoughts: 1) A "disadvantage" is that it looks like one has to break down the type specification vs. run-time parsing context separation that we've talked about earlier? -- how would you specify that a type parameter "T" should take "unsigned short int*"? So to have consistency in any sane way one needs to make "all compile-time types also available as run-time types" 2) It seems to leave the way open for some confusing (if not impossible to solve compiler-wise) results: cdef A(len=10) a = ... cdef A(len=8) b = a # What does this mean? Compile-time error? Note, for instance, that while assigning an ndarray(2, int8) to an ndarray(3, int8) or ndarray(2, float32) should not be allowed, it might be wanted behaviour for it to be legal to assign ndarray(2, uint8, flat=True) to ndarray(2, uint8, flat=False); where flat is a flag to toggle multi-dimensional indexing. Also consider: cdef A(len=10) a = ... a.len = 8 # legal or not? cdef A(len=8) b = a # Is this legal now? Such things would have to be described in more detail, and since it is hard to "guess" what the semantics should be (at least for me) it might be an indication that this is too magic.
-- Dag Sverre From stefan_ml at behnel.de Sun May 11 10:53:39 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 May 2008 10:53:39 +0200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> Message-ID: <4826B413.4080608@behnel.de> Hi, Arc Riley wrote: >> I just read our FAQ and it does not say that. > > "The intention is to make it a drop-in replacement for existing Pyrex code." Right, "existing" Pyrex code. We are not necessarily talking about newly developed code that uses new Pyrex features, neither are we talking about code that uses Cython features that are not supported by Pyrex. Cython is not Pyrex, otherwise we wouldn't have two code bases. That aside, language compatibility between Cython and Pyrex *is* a goal. It would be stupid to have two compilers that understand almost the same language, and allow them to diverge in subtleties that make it impossible for users to change between the two if they need to. The latest discussion with Greg on the integer-loop syntax should make it clear that Cython and Pyrex are not fundamentally diverging projects. And I really don't think they should be. Stefan From stefan_ml at behnel.de Sun May 11 10:58:57 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 May 2008 10:58:57 +0200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> Message-ID: <4826B551.2010004@behnel.de> Hi, William Stein wrote: > On Sat, May 10, 2008 at 6:37 PM, Arc Riley wrote: > 3. 
"build the main mill [cython/pyrex fork] package from scratch, using the > wisdom of the Cython guys (reverse engineering in some cases), even the > people working on Cython agree it needs to be rewritten since the barrier > to entry for new developers is obscenely high". > > One person who is new to Cython made a claim on cython-devel about the > barrier to entry. That is much different than "the people working on > Cython agreeing the barrier to entry for new developers is obscenely high". > There certainly is a barrier to entry -- Cython is a compiler of a highly > nontrivial language after all. But I think describing it as obscenely high > due to the particular implementation is FUD. And it is exactly the kind of > FUD that will hurt the Cython project right now. This sounds like Arc doesn't know that code is actually harder to read than to write. http://www.joelonsoftware.com/articles/fog0000000069.html Stefan From greg.ewing at canterbury.ac.nz Sun May 11 11:26:07 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 21:26:07 +1200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <4826AF8B.9010208@behnel.de> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> <48250902.2020804@canterbury.ac.nz> <48253372.9060802@behnel.de> <48263800.8070606@canterbury.ac.nz> <4826AF8B.9010208@behnel.de> Message-ID: <4826BBAE.4060603@canterbury.ac.nz> Stefan Behnel wrote: > Greg Ewing wrote: > >>There's no run-time >>distinction between a builtin class, an extension class >>implemented with Pyrex, or an extension class created some >>other way. > > I didn't mean at runtime. The compiler knows how to distinguish them. Well, there's no compile-time distinction either, really. They're all the same thing. Treating a type differently depending on whether it was implemented using a Pyrex extension class would be bizarre and confusing. 
> allowed_type = ExtType > print isinstance(obj, allowed_type) # Python semantics That's not the only possibility. There's also f = isinstance if f(ob, MyExtensionClass): ... You're testing against an explicitly-named class -- but using the Python version of isinstance rather than the Pyrex one. > Besides, the actual difference of allowing or disallowing a type override > through ".__class__" is so tiny that almost no user will actually notice, Possibly you're right. However, I do have a guideline of my own concerning Python compatibility: if a function has the same name as a Python function, then you can rely on it to have the same semantics. There's another consideration as well: What do you do if you really *do* want exactly the same semantics as the Python isinstance()? There would have to be another function for that, which *is* exactly the same as isinstance(), but has a different name. That situation could be considered somewhat farcical. > Note that most of them are actually based on > your own test suite, but most of the tests (those in the tests/run/ directory) > have an added doctest that is executed by the test runner script after > building the module. Hmmm. I don't really want to include a lot of tests that are more or less duplicates of the existing ones. If you have any new, Cython-specific tests that you think ought to work in Pyrex as well, I'd be happy to run those. But I don't see how much will be gained by just including all the Cython tests as-is. I'm not going to be changing my own test setup, so you're not going to be able to share any new tests that I add without converting them to your format. And new tests for Cython-specific features probably won't apply to Pyrex anyway. If you find a bug and come up with a new test to guard it against regression, I think I would actually rather convert it to my test format and make it one of my tests. So I would be happy for you to bring any such tests to my attention. 
> When I wrote the test runner, my main goal was to keep the expected results in > the same file as the test code itself, so that you only have one file to edit > and look at There is one advantage to keeping them separate -- after I've fixed everything so that the expected and actual results are near enough to being in sync, I can run a script that automatically updates the expected results from the last actual results. That would be hard to do if the expected results were embedded in the test files. This is probably more important for the compilation tests, where the "expected result" is large and can change in ways that are ignorable. But having built the framework to cope with that, it was easiest to just use it for everything. -- Greg From greg.ewing at canterbury.ac.nz Sun May 11 11:34:39 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 21:34:39 +1200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <4826B413.4080608@behnel.de> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <4826B413.4080608@behnel.de> Message-ID: <4826BDAF.9050106@canterbury.ac.nz> Stefan Behnel wrote: > The latest discussion with Greg on the integer-loop syntax should make it > clear that Cython and Pyrex are not fundamentally diverging projects. And I > really don't think they should be. That's good to hear. However, if one of the priorities of Cython is really going to be to compile pure Python code, there are likely to be problems at some point, because I'm going to want to make Pyrex do things that you won't like, and vice versa. BTW, the discussion we've been having about isinstance vs. typecheck is a bit ironic in this light. I'm arguing for *more* compatibility with Python in this area, and you're arguing for *less*. 
:-) -- Greg From robertwb at math.washington.edu Sun May 11 11:41:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 02:41:12 -0700 Subject: [Cython] Language stability In-Reply-To: <4826AAE8.6010706@student.matnat.uio.no> References: <4826AAE8.6010706@student.matnat.uio.no> Message-ID: <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> On May 11, 2008, at 1:14 AM, Dag Sverre Seljebotn wrote: > Given recent developments (Pyrex release with new syntax and Arc's new > project) I think Cython should make some public website statements on > language policy and stability. I think these are already present in > the > "community spirit", but we should make them very explicit. This is a very good idea. > Rough suggestions aimed for starting this thread: > > - "Cython want to be a dynamic work in progress and so new language > features will happen quickly over the next year. *However*, each > change > will be related to previously unavailable features and will have a > backwards-compatible syntax." I think there is a place for backwards-incompatibility. For example, the __new__() being renamed to __cinit__() (as __new__() a very different meaning in Python). However, if there is ever backwards- compatibility issues, it should be for a *very* good reason. Also, we might want to add syntax candy, so I'm not sure about the "previously unavailable features" clause. > - If we need to break syntax backwards-incompatibly, it will be > done by > the scheme discussed earlier: A "#lang: cython-ver" comment header. > Files without such a header will be compiled with the current ("v0.9") > syntax. > > - Some decision should be made on Pyrex compatability. I think it > should > either be flat out "the language looks like Pyrex because of its roots > but is *not* Pyrex", or, we should aim to correctly compile Pyrex > (including future changes) but do so only when "#lang: pyrex" is > specified. 
> > (Of course, any "lang" instruction would be command-line- > overridable so > that existing Pyrex code could be built with a "--pyrex-mode" flag) I don't think supporting multiple versions of the language, and having a specific Pyrex flag, is a good idea. If one wants a specific version of Cython, one can download and use that one. If one wants *exactly* Pyrex syntax, use Pyrex itself. Any difference should be clearly documented, and be for good reason, but having different "modes" that we continue supporting seems cumbersome. > - We need a consistent keyword policy. I'm going to be > controversial and > suggest that this will be "figure it out from context", because we aim > to compile any Python code. I.e.: > > * Any Python keywords are also Cython keywords (which exact keywords > depends on the Python language level we are compiling) > * Any non-Python keywords are only keywords when used in a context > where they can mean a keyword instruction. That is, we should start > supporting Cython code like this: > > cdef = 4 > # Created cdef variable, does not alter behaviour of cdef keyword I would much rather this be a syntax error. It also will make the parser a lot more complicated (and the code a lot harder to read). > We already support > > final = 4 > > anyway... Final is not a keyword, and I don't think it will ever be. It could still have a special meaning, but being a keyword is a lot stronger. > I think any new Cython keywords (over Python) are always used > in contexts where they could not be a variable name so I think this is > possible. > > This is unusual, but because our aim is to go towards full Python > support, but still support Cython code, I think such a policy is a > possible compromise (the important thing is to have a solid, > dependable > rule we follow).
> > Another possibility is to only allow > > cdef = 4 > > when we are in (also on the idea-stage, from the #lang-thread) Python > mode (compiling a .py-file or having a "#lang: python" header). Maybe... I really don't like the idea of mixing the two modes, e.g. cdef cdef x = cdef(3) - Robert From greg.ewing at canterbury.ac.nz Sun May 11 11:50:28 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 21:50:28 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826B48E.3060708@student.matnat.uio.no> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> Message-ID: <4826C164.1090206@canterbury.ac.nz> Another thing to consider is how this would interact with subclassing. Will it be legal to do something like cdef class B(A(len=10)): ... and if so, how does the initialisation of B ensure that the length constraint is satisfied? -- Greg From stefan_ml at behnel.de Sun May 11 12:08:33 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 May 2008 12:08:33 +0200 Subject: [Cython] ANN: Pyrex 0.9.7 In-Reply-To: <4826BBAE.4060603@canterbury.ac.nz> References: <48243C21.2020207@canterbury.ac.nz> <48246C10.6050703@behnel.de> <48250902.2020804@canterbury.ac.nz> <48253372.9060802@behnel.de> <48263800.8070606@canterbury.ac.nz> <4826AF8B.9010208@behnel.de> <4826BBAE.4060603@canterbury.ac.nz> Message-ID: <4826C5A1.9050308@behnel.de> Hi, Greg Ewing wrote: > Stefan Behnel wrote: > >> Greg Ewing wrote: >> >>> There's no run-time >>> distinction between a builtin class, an extension class >>> implemented with Pyrex, or an extension class created some >>> other way. >> >> I didn't mean at runtime. The compiler knows how to distinguish them. > > Well, there's no compile-time distinction either, really. > They're all the same thing. 
Treating a type differently > depending on whether it was implemented using a Pyrex > extension class would be bizarre and confusing. Why? If the docs stated "explicitly testing isinstance(obj, ExtType) for a Pyrex extension type will make sure it's really that specific type with the expected field structure", then that's just another Pyrex/Cython specific enhancement that deals with the grey area between Python and C, and that most (should I say all?) users would expect anyway when calling isinstance(). This does in no way interfere with Python semantics, as testing for (non-cdefed) Python types will not imply a specific struct layout anyway. Make it a feature: tell people that they can assert the C type structure by calling isinstance(obj, MyExtType) explicitly, and that all other calls will use the normal Python semantics. Then, you could still use the following to do a non-explicit test: >> allowed_type = ExtType >> print isinstance(obj, allowed_type) # Python semantics This could even become an example in the docs, right next to the paragraph that explains the lack of "security" in the Python semantics. > I'm not going to be changing > my own test setup, so you're not going to be able to share > any new tests that I add without converting them to your > format. Ok, then we'll continue to do that. Stefan From greg.ewing at canterbury.ac.nz Sun May 11 12:05:59 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 22:05:59 +1200 Subject: [Cython] Language stability In-Reply-To: <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> Message-ID: <4826C507.7040602@canterbury.ac.nz> Robert Bradshaw wrote: > Maybe... I really don't like the idea of mixing the two modes, e.g. 
> > cdef cdef x = cdef(3) This puts me in mind of PL/I, where it's perfectly legal to write things like IF IF = THEN THEN THEN = ELSE ELSE ELSE = END END But anyone even contemplating writing something like that should be promptly shipped off to Guantanamo Bay for a spot of waterboarding practice. Also I'm not sure how the PL/I parser coped with that and retained its sanity. -- Greg From robertwb at math.washington.edu Sun May 11 12:25:21 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 03:25:21 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826C164.1090206@canterbury.ac.nz> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> Message-ID: On May 11, 2008, at 2:50 AM, Greg Ewing wrote: > Another thing to consider is how this would > interact with subclassing. Will it be legal > to do something like > > cdef class B(A(len=10)): > ... > > and if so, how does the initialisation of B > ensure that the length constraint is satisfied? No, I don't think the above would be allowed. One would do cdef class B(A): ... cdef B(len=1) b It should also be noted that some of the stuff being discussed here circumvents subclassing--the prototypical example is if x is a NumPy array with float entries then x[i] can be accessed directly via internal knowledge of the internal layout of the type of x. If x is an instance of a subclass that overrides __getitem__ to do something different this will break. This is bad, but so is manually hard-coding against the internal structure of x, and accessing arrays via __getitem__ is just way too slow for serious numerical algorithms. (Yes, you could say "just use a float*", but NumPy provides a lot of niceties, among them interfacing well with pure Python code and memory management).
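A plain-Python sketch of the subclassing problem described above. The class and function names are invented for illustration, and fast_get merely stands in for the inlined access a compiler would generate from its knowledge of the base type's layout:

```python
class Array(object):
    def __init__(self, values):
        self.data = list(values)

    def __getitem__(self, i):
        return self.data[i]


class ReversedArray(Array):
    # A subclass that changes what indexing means.
    def __getitem__(self, i):
        return self.data[len(self.data) - 1 - i]


def fast_get(arr, i):
    # Reads the underlying storage directly, as inlined code would,
    # so it never sees an overridden __getitem__.
    return arr.data[i]


x = ReversedArray([10.0, 20.0, 30.0])
via_getitem = x[0]              # dispatches to ReversedArray.__getitem__
via_fast_path = fast_get(x, 0)  # bypasses the override
```

The two accesses disagree for the subclass instance, which is exactly the breakage the optimisation trades for speed.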
- Robert From stefan_ml at behnel.de Sun May 11 12:34:05 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 11 May 2008 12:34:05 +0200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <4826BDAF.9050106@canterbury.ac.nz> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <4826B413.4080608@behnel.de> <4826BDAF.9050106@canterbury.ac.nz> Message-ID: <4826CB9D.4080309@behnel.de> Hi, Greg Ewing wrote: > Stefan Behnel wrote: > >> The latest discussion with Greg on the integer-loop syntax should make it >> clear that Cython and Pyrex are not fundamentally diverging projects. And I >> really don't think they should be. > > That's good to hear. And I really mean it. We even adapted our versioning scheme to Pyrex and put a great deal of work into merging the changes in Pyrex 0.9.6. > However, if one of the priorities > of Cython is really going to be to compile pure Python code, > there are likely to be problems at some point, because > I'm going to want to make Pyrex do things that you won't > like, and vice versa. As long as there is a way to discuss these things and decide that the difference is worth it, that's fine. > BTW, the discussion we've been having about isinstance > vs. typecheck is a bit ironic in this light. I'm arguing > for *more* compatibility with Python in this area, > and you're arguing for *less*. :-) Read my answer. :) This is not about Python compatibility, as it deals with the semantics of C types. 
Stefan From greg.ewing at canterbury.ac.nz Sun May 11 12:41:25 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 22:41:25 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> Message-ID: <4826CD55.9000103@canterbury.ac.nz> Robert Bradshaw wrote: > On May 11, 2008, at 2:50 AM, Greg Ewing wrote: > >> Will it be legal to do something like >> >> cdef class B(A(len=10)): > > No, I don't think the above would be allowed. That would sidestep some problems, I suppose, although it would make it less of a general parameterised-types mechanism. Another thing bothers me a bit. Your simple little __getitem__ example is all very well, but a full-blown NumPy array with multiple dimensions and strides and whatnot is a rather complicated beast. Are you sure it will be possible to write a __getitem__ that deals with all that in its full generality while still providing the efficiencies you're after? What would such a __getitem__ implementation look like? Also, as well as accessing single items, there's slicing to consider. Is it going to be feasible to write a __getitem__ and a __setitem__ that work together such that things like a[i:j, k:l] = b[x, y, i:j, k:l] do the right thing efficiently? 
-- Greg From robertwb at math.washington.edu Sun May 11 12:50:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 03:50:12 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826B48E.3060708@student.matnat.uio.no> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> Message-ID: <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> On May 11, 2008, at 1:55 AM, Dag Sverre Seljebotn wrote: >> ---------a.pxd---------- >> cdef class A: >> cdef int len >> cdef int* data >> >> cdef inline [final?] int __getitem__(A a, int i): >> """ >> Note that subtypes can't override this. >> """ >> if i < 0 or i > a.len: >> raise IndexError >> return data[i] >> >> ---------b.pyx-------- >> >> from a cimport A >> cdef A(len=10) a = a([1,2,3,4,5,6,7,8,9,10]) # I'll leave the init >> function to your imagination >> print a[9] # the code from __getitem__ gets inlined here, and since >> len is known the a.len is resolved to 10 at compile time. >> >> (Here a.len tries to do a lookup first on the compile time type of a, >> and that failing the runtime type of a. The compile time types need >> not be struct members, but if they're not then they must be specified >> because the "runtime" lookup would fail.) > > On the type arguments solution: > > My first thought is that I don't think it will cover all cases -- if > native C++ template support is added, for instance, then the type > arguments must probably behave differently there. So type arguments in > itself is a more generic thing (related to the parser and type > handling > etc.), and this specifies one concrete use of them (ie, what > happens to > "cdef class"es when given type parameters). > > At first it struck me as way too magic. Then it grew on me. 
But then I > dislike it again :-) So these are some non-conclusive thoughts: > > 1) A "disadvantage" is that it looks like one has to break down the > type > specification vs. run-time parsing context seperation that we've > talked > about earlier? -- how would you specify that a type parameter "T" > should > take "unsigned short int*"? So to have consistency in any sane way one > needs to make "all compile-time types also available run-time types" No matter what mechanism is used, the issue of specifying types will take some work. I think the question of having types as parameters in other types is a difficult but somewhat orthogonal issue that needs to be dealt with, and having runtime equivalents of types will play into this (does ctypes already provide a mechanism?). or perhaps A (type=(cdef unsigned short int*)) as a way to bind a parameter to a type. Here "(cdef ...)" would be an allowable expression resolving to some (dynamically generated?) runtype variable that represents this type. > 2) It seems to leave the way open for some confusing (if not > impossible > to solve compiler-wise) results: > > cdef A(len=10) a = ... > cdef A(len=8) b = a # What does this mean? Compile-time error? Yep, compile time error. If a was a plain "cdef A" then it may be a runtime error. Of course if one is feeling risky one can cast ;-). > Note, for instance, that while assigning a ndarray(2, int8) to a > ndarray(3, int8) or ndarray(2, float32); I assume you're trying to say this is not allowed... > it should (or rather, might be > wanted behaviour) for it to be legal to assign ndarray(2, uint8, > flat=True) to ndarray(2, uint8, flat=False); where flat is flag to > toggle multi-dimensional indexing. Perhaps, or one can cast. I think it would also be useful (especially in the context of C++ support) to have the notion of "assignable from." > Also consider: > > cdef A(len=10) a = ... > a.len = 8 # legal or not? 
Compile-time error--when it does a lookup on a.len, it will note it's trying to assign to a "constant." > cdef A(len=8) b = a # Is this legal now? Nope. The compile-time type of a is fixed. > Such things would have to be described in more detail, and since it is > hard to "guess" what the semantics should be (at least for me) it > might > be an indication that this is too magic. Think of len as an (compile-time fixed) attribute of a which has precedence over, and is used in place of a the (hypothetical runtime) attribute of a. This should answer the semantics questions. Again, it need not actually correspond to a actual attribute of A (in which case trying to access it would be a compile-time error when no compile-time value is given), though often it would. One thing I like about this syntax is that it allows one to write a single piece of code that handles the generic (value only known at runtime) case and can be optimized if the value is known at compile time. - Robert From robertwb at math.washington.edu Sun May 11 13:08:24 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 04:08:24 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826CD55.9000103@canterbury.ac.nz> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> <4826CD55.9000103@canterbury.ac.nz> Message-ID: <94D210AA-9967-4D0A-AEC5-0F84017555C2@math.washington.edu> On May 11, 2008, at 3:41 AM, Greg Ewing wrote: > Robert Bradshaw wrote: >> On May 11, 2008, at 2:50 AM, Greg Ewing wrote: >> >>> Will it be legal to do something like >>> >>> cdef class B(A(len=10)): >> >> No, I don't think the above would be allowed. > > That would sidestep some problems, I suppose, although > it would make it less of a general parameterised-types > mechanism. Yes, that is true. 
We are not trying to re-invent C++ here (which is powerful but at the expense of a much steeper learning curve). > Another thing bothers me a bit. Your simple little > __getitem__ example is all very well, but a full-blown > NumPy array with multiple dimensions and strides and > whatnot is a rather complicated beast. > > Are you sure it will be possible to write a __getitem__ > that deals with all that in its full generality while > still providing the efficiencies you're after? What > would such a __getitem__ implementation look like? I would imagine it would look a lot like the actual implementation of NumPy's current __getitem__ (well, after inlining some of the function calls, the current code does the actual work several levels down). Actually, probably more realistically, it would handle some of the simpler cases (one-dimensional for sure), and have an "else" clause that calls the current very-generic code (which stands the least to gain from an overhead-reducing perspective anyways). One hitch would be coming up with a nice syntax for handling a variable number of dimensions. If that wasn't feasible, hardcoding the method for 1, 2, and 3 dimensions would handle (I bet) most of cases, and it would call of to more generic code if the dimension was larger. Related to this I am pretty sure we want to implement polymorphism for cdef (and perhaps special) methods. > Also, as well as accessing single items, there's slicing > to consider. Is it going to be feasible to write a > __getitem__ and a __setitem__ that work together such > that things like > > a[i:j, k:l] = b[x, y, i:j, k:l] > > do the right thing efficiently? Ah, that certainly is quite the challenge :-). Unwrapping it into a single loop that does the copy would be tricky (for one thing, the analysis would have to happen on a much higher level) but I think the right hand side can do things with strides to avoid any copying, and the assignment would do the efficient loop. 
This may actually already be quite efficient--the big performance hits come when one wants to manipulate elements one-by-one (rather than as blocks, for which NumPy has efficient code).

- Robert

From greg.ewing at canterbury.ac.nz Sun May 11 13:05:45 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 11 May 2008 23:05:45 +1200
Subject: [Cython] Arc Riley, PyMill, and FUD
In-Reply-To: <4826CB9D.4080309@behnel.de>
References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <4826B413.4080608@behnel.de> <4826BDAF.9050106@canterbury.ac.nz> <4826CB9D.4080309@behnel.de>
Message-ID: <4826D309.1050307@canterbury.ac.nz>

Stefan Behnel wrote:

> Read my answer. :) This is not about Python compatibility, as it deals with
> the semantics of C types.

But C types are Python types too. There's no way of reading the programmer's mind to find out whether he's thinking of his extension type as a C type or a Python type when he passes it to isinstance(). So there is a semantic difference that could affect the outcome in some cases, albeit rare ones.

Another consideration is that while I try to make it so that you don't need to know about the C API in order to use Pyrex, I also want to make it so that it seems natural to someone who *does* know something about the C API. The C API manual documents PyObject_IsInstance as having semantics that correspond to the Python isinstance(). Someone who knows that is likely to expect them to correspond in Pyrex.
-- Greg From greg.ewing at canterbury.ac.nz Sun May 11 13:23:07 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 23:23:07 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> Message-ID: <4826D71B.4020306@canterbury.ac.nz> On May 11, 2008, at 1:55 AM, Dag Sverre Seljebotn wrote: > So to have consistency in any sane way one > needs to make "all compile-time types also available run-time types" I don't understand what you're getting at here. I thought that all these type parameters -- whether they're "values" like ints, or other types -- would be resolved at compile time. In other words, in cdef A(len = x) a the x would have to be a constant expression. There's no problem with constant expressions, as they're already used in C array declarations. -- Greg From greg.ewing at canterbury.ac.nz Sun May 11 13:28:55 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 23:28:55 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <94D210AA-9967-4D0A-AEC5-0F84017555C2@math.washington.edu> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> <4826CD55.9000103@canterbury.ac.nz> <94D210AA-9967-4D0A-AEC5-0F84017555C2@math.washington.edu> Message-ID: <4826D877.3050500@canterbury.ac.nz> Robert Bradshaw wrote: > Yes, that is true. We are not trying to re-invent C++ here (which is > powerful but at the expense of a much steeper learning curve). 
If you were going to reinvent some parameterised type system, I'd suggest reinventing Eiffel instead -- it has quite a nice one, without any of the bizarrities of C++ templates. Another nice model to follow would be Haskell (although I'd excuse you for skipping the typeclasses if you weren't feeling up to it. :-) -- Greg From robertwb at math.washington.edu Sun May 11 13:36:29 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 11 May 2008 04:36:29 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826D71B.4020306@canterbury.ac.nz> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> <4826D71B.4020306@canterbury.ac.nz> Message-ID: On May 11, 2008, at 4:23 AM, Greg Ewing wrote: > On May 11, 2008, at 1:55 AM, Dag Sverre Seljebotn wrote: > >> So to have consistency in any sane way one >> needs to make "all compile-time types also available run-time types" > > I don't understand what you're getting at here. I thought > that all these type parameters -- whether they're "values" > like ints, or other types -- would be resolved at compile > time. In other words, in > > cdef A(len = x) a > > the x would have to be a constant expression. There's no > problem with constant expressions, as they're already > used in C array declarations. The issue here is handling something like cdef A x cdef A(type=cdef int) y = x # this needs to be able to do type checking at runtime it also is inline with writing code like if A.type == cdef int: ... else: .... to be able to "easily" handle types with type parameters. 
- Robert From greg.ewing at canterbury.ac.nz Sun May 11 13:45:41 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 11 May 2008 23:45:41 +1200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> <4826D71B.4020306@canterbury.ac.nz> Message-ID: <4826DC65.7020807@canterbury.ac.nz> Robert Bradshaw wrote: > The issue here is handling something like > > cdef A x > cdef A(type=cdef int) y = x # this needs to be able to do type > checking at runtime Hmmm. I think I would be happy if you could only do things like that for type parameters which are Python types, not C types. It may help to disallow writing a bare declaration like cdef A x if A has any unspecified type parameters. In the case of C types, you would have to commit yourself to some concrete type, and then the compiler can perform type checking. In the case of Python types, you could write things like cdef A(object) x cdef A(Foo) y y = x and a run-time type test would be done on the type parameters. If you went on to say cdef A(int) z y = z this would be a compile-time error, since a C int can never be a subclass of Foo. Or perhaps, as you say, something like ctypes could be used to provide a runtime representation of C types. 
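As a rough illustration of the ctypes idea, here is a minimal sketch in plain Python. `TypedArray` and `require_elem_type` are hypothetical names invented for this example, not part of Pyrex, Cython or NumPy; only the `ctypes` calls are real. The ctypes type object serves as the runtime representation of the C element type, so the check can run at runtime.

```python
import ctypes


class TypedArray:
    """Hypothetical container carrying its C element type as a ctypes object."""

    def __init__(self, elem_type, values):
        self.elem_type = elem_type
        # (elem_type * n) builds a ctypes array type of n elements.
        self.data = (elem_type * len(values))(*values)


def require_elem_type(arr, expected):
    """Runtime stand-in for assigning a bare 'A' to an 'A(type=...)' variable."""
    if arr.elem_type is not expected:
        raise TypeError("element type mismatch")
    return arr


x = TypedArray(ctypes.c_int, [1, 2, 3])
y = require_elem_type(x, ctypes.c_int)  # passes: same element type

rejected = None
try:
    require_elem_type(x, ctypes.c_double)
except TypeError as exc:
    rejected = exc  # the mismatch is only detected at runtime
```

Whether a compiler could also resolve such a check at compile time when both sides are declared is exactly the open question in this thread; the sketch only shows the runtime half.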
-- Greg

From robertwb at math.washington.edu Sun May 11 14:01:03 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sun, 11 May 2008 05:01:03 -0700
Subject: [Cython] Python type optimizations (NumPy GSoC-related)
In-Reply-To: <4826DC65.7020807@canterbury.ac.nz>
References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <9EA11E81-35A0-418D-AD47-5C9BEA51368B@math.washington.edu> <4826D71B.4020306@canterbury.ac.nz> <4826DC65.7020807@canterbury.ac.nz>
Message-ID:

On May 11, 2008, at 4:45 AM, Greg Ewing wrote:

> Robert Bradshaw wrote:
>
>> The issue here is handling something like
>>
>>     cdef A x
>>     cdef A(type=cdef int) y = x # this needs to be able to do type
>> checking at runtime
>
> Hmmm. I think I would be happy if you could only do things
> like that for type parameters which are Python types, not
> C types.
>
> It may help to disallow writing a bare declaration like
>
>     cdef A x
>
> if A has any unspecified type parameters. In the case of
> C types, you would have to commit yourself to some concrete
> type, and then the compiler can perform type checking.
>
> In the case of Python types, you could write things like
>
>     cdef A(object) x
>     cdef A(Foo) y
>     y = x
>
> and a run-time type test would be done on the type parameters.
>
> If you went on to say
>
>     cdef A(int) z
>     y = z
>
> this would be a compile-time error, since a C int can never
> be a subclass of Foo.
>
> Or perhaps, as you say, something like ctypes could be used to
> provide a runtime representation of C types.

The difficulty is that we want to be able to support things like NumPy arrays (whose element types are literally C doubles, ints, etc.) and hopefully at least some support for wrapping C++ templated types (such as vector).

- Robert

From dagss at student.matnat.uio.no Sun May 11 14:48:01 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sun, 11 May 2008 14:48:01 +0200
Subject: [Cython] Python type optimizations (NumPy GSoC-related)
In-Reply-To: <4826CD55.9000103@canterbury.ac.nz>
References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> <4826CD55.9000103@canterbury.ac.nz>
Message-ID: <4826EB01.4050406@student.matnat.uio.no>

> Another thing bothers me a bit. Your simple little
> __getitem__ example is all very well, but a full-blown
> NumPy array with multiple dimensions and strides and
> whatnot is a rather complicated beast.

Robert's answered already but here's my thoughts on the matter.

If you haven't read it already, I've written about this in my GSoC application:

http://wiki.cython.org/DagSverreSeljebotn/soc/details

Highlights:

* It does rely on some pretty "advanced" compile-time inlining (which I've called unrolling in a couple of places). To handle multiple dimensions one must ideally also inline loops of compile-time values (but *only* in functions marked with @cython.unroll; you don't normally want loops unrolled). I've written about my thoughts on this here: http://wiki.cython.org/enhancements/inlining

* If there are slices, simply hand them to the existing NumPy code (i.e. return (self)[index]). In particular, we should *not* inline slice copying, as that can potentially be handed to special MMX instructions etc. and is very CPU-specific. Multiple dimensions, negative indices etc. will probably be handled though.

* It is possible that the above is too utopian and that one can say it won't work. But this is pretty much critical functionality for numerical Python use, and will be done anyway.
If nothing else it will have to be implemented as a custom plugin to Cython, treating NumPy arrays as a language primitive type; but we really hope to find a way that is more in line with general Cython development. -- Dag Sverre From dagss at student.matnat.uio.no Sun May 11 14:48:57 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 11 May 2008 14:48:57 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <4826EB01.4050406@student.matnat.uio.no> References: <482423C4.9040005@student.matnat.uio.no> <9CA6BD87-F619-4E3B-9C0E-3EF0B552FAD9@math.washington.edu> <4826B48E.3060708@student.matnat.uio.no> <4826C164.1090206@canterbury.ac.nz> <4826CD55.9000103@canterbury.ac.nz> <4826EB01.4050406@student.matnat.uio.no> Message-ID: <4826EB39.6060909@student.matnat.uio.no> Dag Sverre Seljebotn wrote: >> Another thing bothers me a bit. Your simple little >> __getitem__ example is all very well, but a full-blown >> NumPy array with multiple dimensions and strides and >> whatnot is a rather complicated beast. > > Robert's answered already but here's my thoughts on the matter. > > If you haven't read it already, I've written about this in my GSoC > application: > > http://wiki.cython.org/DagSverreSeljebotn/soc/details BTW, scroll to the very bottom of that page and you'll find a full __getitem__ example. -- Dag Sverre From dagss at student.matnat.uio.no Sun May 11 14:58:57 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 11 May 2008 14:58:57 +0200 Subject: [Cython] Language stability In-Reply-To: <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> Message-ID: <4826ED91.70706@student.matnat.uio.no> > I think there is a place for backwards-incompatibility. For example, > the __new__() being renamed to __cinit__() (as __new__() a very > different meaning in Python). 
However, if there is ever backwards- > compatibility issues, it should be for a *very* good reason. Also, we > might want to add syntax candy, so I'm not sure about the "previously > unavailable features" clause. I'll agree to the last sentence, but not what is before (see below). > I don't think supporting multiple versions of the language, and > having a specific Pyrex flag, is a good idea. If one wants a specific > version of Cython, one can download and use that one. If one wants > *exatly* Pyrex syntax, use Pyrex itself. Any difference should be > clearly documented, and be for good reason, but having different > "modes" that we continue supporting seems cumbersome. The sum of this is, I think, exactly why some people are sceptical to Pyrex and Cython as software projects. The thing is, if you are responsible for a big project and wondering if Cython is for you, there's *two* needs: - Is it stable in the sense that I don't have to keep up with new developments in the languages? (Doesn't matter how trivial the change is as long as you cannot compile previously working code -- that's essentially wasted time, no matter how nice-looking the new syntax is.) - Bugfixes! You can't simply say download an old revision and use that one, because when you stumble upon bugs in Cython (which there are!), you have to consistently work around them because you cannot upgrade for the bugfixes without also upgrading your syntax. Look to Python and how they promise to keep 2.6 alive for a long time even if 3.0 arrives... I think there's basically two models: - The big project way: Fork off new development branches but keep backporting bugfixes to older releases. - Keep supporting multiple language revisions in the same branch. (This is not too hard in many cases; one can probably often simply stick a backwards-compatability transform for the syntax in the pipeline if the right conditions are met. Though there's the danger of a growingly complex parser.). 
I don't think the current model (keep adding bugfixes to newest release, which also reserves the right to change the language without being backwards-compatible) is a good idea. > Maybe... I really don't like the idea of mixing the two modes, e.g. > > cdef cdef x = cdef(3) The idea is really only that Cython should be able to compile *all* Python code, and that includes Python code with the cdef used as variable names. If Cython cannot compile Python code containing "cdef" as a variable it could really, as you've said put it sometimes, be considered a bug since it's possible to write Python code that's not Cython-compilable. But a mode flag probably fixes that in a good way without allowing stuff like above. -- Dag Sverre From dagss at student.matnat.uio.no Sun May 11 15:50:00 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: 11 May 2008 15:50:00 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) Message-ID: <3293365801.1640468@smtp.netcom.no> Robert wrote: >One thing I like about this syntax is that it allows one to write a >single piece of code that handles the generic (value only known at >runtime) case and can be optimized if the value is known at compile >time. Indeed, that's what I liked about it. What I dislike is the exact way the type parameters are put into fields of the same name. (It basically means that the caller decides what is compile-time values and not, and things can easily get "out of sync", if method implementations depends on them being runtime?) What about this modification: cdef A(len=10) a then "a" is considered assignable "any instance of A which had len=10 passed to the constructor" (or instances which are willing to convert to that). The constructor can then decide how to assign the values to fields (or entirely drop them for compile-time treatment). Might require smart inlining of the constructor in perhaps impossible ways though, I'm waving my hands and will think more... 
That also solves the overloaded syntax problems -- () always do in fact refer to the constructor...

Don't take this post too seriously, have to think more...

Dag Sverre

From jim-crow at rambler.ru Sun May 11 20:07:43 2008
From: jim-crow at rambler.ru (Anatoly A. Kazantsev)
Date: Mon, 12 May 2008 01:07:43 +0700
Subject: [Cython] defining module constants
Message-ID: <20080512010743.5a583b3c.jim-crow@rambler.ru>

Hello.

I have the same problem. Are there any answers or advice? :-)

--
Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm

From dagss at student.matnat.uio.no Sun May 11 23:00:33 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sun, 11 May 2008 23:00:33 +0200
Subject: [Cython] Python type optimizations (NumPy GSoC-related)
In-Reply-To: <3293365801.1640468@smtp.netcom.no>
References: <3293365801.1640468@smtp.netcom.no>
Message-ID: <48275E71.2030109@student.matnat.uio.no>

Ok, been thinking some more. To sum up, what we're after is some way of having the following work (A is a class with a value field, printme is a method somehow selected for inlining and optimization for Cython):

    cdef A a = get_a()
    a.printme()
    a.printme()

turns into

    cdef A a = get_a()
    print a.value
    print a.value

, while this:

    cdef A(value=23) a = get_a()
    a.printme()
    a.printme()

would be turned into

    cdef A a = get_a()
    if a.value != 23: raise TypeError(...) # line (a)
    print 23 # line (b)
    print 23

(This is the final result, not saying it is a recipe for how it happens magically).

So, what is happening is that we want to set up some assumptions about an object (without actually changing the object), and have the code that follows generated by making use of those assumptions.

I have some separate proposals building further on Robert's ideas here.

1) An explicit method for the assumptions bit. I.e.:

    cdef class A:
        cdef int value
        cdef __assume__(self, int value):
            if self.value != value: raise AssumptionError(...)

The argument list to this function is basically a more explicit declaration of type arguments.

The reason I want this so explicitly (rather than using a more general __coerce__, which must probably also be added) is that it hints strongly to the writer of the class A to treat self.value as "const". Once an assumption is made by having __assume__ called, you cannot "back out" and change self.value. This seems to make it explicit that value should be treated as a const after construction; and it leaves the contract for the name, number and types of "type arguments" on the class creator side rather than the caller side.

In addition to just checking assumptions, this method can also put general constraints on the arguments (e.g., "if value < 0: raise ValueError(...)").

Important: This does *not* address how one can make optimizations later on, line (b); it is simply a way to insert the assumption line, line (a). Also, __assume__ can simply be called in the normal way.

2) With this in place, it seems ok to follow Robert's proposal and automatically treat fields having the same name as type arguments as known at compile time. The parameter list to __assume__ restricts which fields can be used.
I am still thinking about something more explicit like cdef class A: cdef int value cdef int not_possible_typearg __typearguments__ = ["value", "compiletimeonlyarg"] but it doesn't seem strictly necesarry as the argument list to __assume__ can serve the same role; and in a possibly more dynamic way, ie cdef A(constant_alpha=True, alpha=4) a = x # ok print a.alpha # ok to optimize... cdef A(constant_alpha=False, alpha=4) a = x # __assume__ might raise... # .. run-time error because of invalid combination, # .. so code below will never run, which is lucky because # .. alpha is now for some reason changing constantly. print a.alpha # Will be optimized but never run 3) I still want to throw __init__ into the mix. The main reason: For type inference, it would be nice if A = ndarray(shape=(4, 4), dtype=float64, buffer=arr) would automatically (because type arguments are somehow interlinked with constructor arguments) be type-inferred to cdef ndarray(shape=(4,4), dtype=float64) A A = ndarray(shape=(4,4), dtype=float64, buffer=arr) If so, keeping () for the type argument syntax would also make more sense and be less confusing. Perhaps all that is needed is this rule at the type-inference stage: * A call to a known constructor (of a type which is a candidate for typing) obviously leads to typing it explicitly * At the same time, any arguments passed to the constructor is checked for a match with the __assume__ signature -- the arguments that __assume__ can take are then put into the type arguments list. (If so, we can pretty much ignore this for now, but we have a "defense" for using the () syntax.) 
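To make proposal 1) concrete, here is a minimal sketch of the runtime half in plain Python. The `__assume__` name comes from the proposal above, but `AssumptionError`, `bind` and the class body are assumptions made up for this example. It only emulates line (a), the inserted check, and says nothing about how line (b), the constant substitution, would be done.

```python
class AssumptionError(TypeError):
    """Hypothetical error raised when an object violates a declared assumption."""


class A:
    def __init__(self, value):
        self.value = value

    def __assume__(self, value):
        # General constraint on the type argument itself.
        if value < 0:
            raise ValueError("value must be non-negative")
        # The actual assumption check, i.e. line (a).
        if self.value != value:
            raise AssumptionError("expected value=%r, got %r" % (value, self.value))


def bind(obj, **type_args):
    """Emulates 'cdef A(value=23) a = obj': check first, then hand the object on."""
    obj.__assume__(**type_args)
    return obj


a = bind(A(23), value=23)  # passes; code after this may treat a.value as 23

failed = False
try:
    bind(A(5), value=23)
except AssumptionError:
    failed = True  # mismatch is rejected before any "optimized" code runs
```

Because the parameter list of `__assume__` fixes the names that `bind` accepts, an unknown type argument fails with an ordinary `TypeError` from the call itself, which matches the idea that the class author, not the caller, owns the contract.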
Dag Sverre -- Dag Sverre From stefan_ml at behnel.de Mon May 12 10:13:46 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 10:13:46 +0200 Subject: [Cython] Arc Riley, PyMill, and FUD In-Reply-To: <4826D309.1050307@canterbury.ac.nz> References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <4826B413.4080608@behnel.de> <4826BDAF.9050106@canterbury.ac.nz> <4826CB9D.4080309@behnel.de> <4826D309.1050307@canterbury.ac.nz> Message-ID: <4827FC3A.90109@behnel.de> Hi, Greg Ewing wrote: > But C types are Python types too. There's no way > of reading the programmer's mind to find out whether > he's thinking of his extension type as a C type or > a Python type when he passes it to isinstance(). But Pyrex types are already optimised in a couple of ways. If you add a struct field, Pyrex will not use Python calls to access it but plain C calls. If you assign it to a variable, Pyrex will not look it up in the module dict. However, if you test its type, you expect the user to assume that Pyrex does not handle this its own way? I think most users will not agree with you here, and I feel comfortable saying so from the (although unrelated) feedback we get on the mailing list. At least Cython users expect optimised code that special cases common use patterns internally (such as "for ... in range()"), and they are surprised when they find that Cython/Pyrex gives them normal Python semantics in a "sub-optimal" way. > Another consideration is that while I try to make > it so that you don't need to know about the C > API in order to use Pyrex, I also want to make it > so that it seems natural to someone who *does* > know something about the C API. Taking this a bit out of context, this is exactly my point. 
If you don't know much about the C-API, you a) won't notice the difference and b) will expect isinstance(obj, ExtType) to make sure that the type has the expected type struct. However, if you know about the C-API and about the possible problems with isinstance() and PyObject_IsInstance(), you will either write the type check in an explicit C-API call or check the Pyrex/Cython docs for an easy way to do it for you, and then find the section on the semantics difference between Python and Pyrex here. In this case, you will most likely appreciate the fact that isinstance() has semantics in Pyrex that make sure isinstance(obj, ExtType) will give you the expected struct, and that you can easily get back the original Python semantics by not testing for the extension type explicitly. There is only one case where things are unexpected, which is if you rely on isinstance() being overridable by type.__class__. In this case, your code will fail and you will have to debug it - but only if you have written it under the expectation that extension types behave exactly like Python types. And I think that is a very unlikely expectation to have because users who found their path all the way through the language docs to figure out how to write an extension type will already have lost this expectation. For me, that's not the path of highest Python language compatibility, but it's the clear path of least surprises - as long as it's documented. 
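The one surprising case can be shown in plain Python. This is a minimal sketch on current CPython, with `ExtType` and `Impostor` as hypothetical stand-ins: `ExtType` plays the role of an extension type with a known struct layout, and `Impostor` overrides `__class__` the way a proxy object might.

```python
class ExtType:
    """Stand-in for an extension type whose instances carry a 'field' slot."""

    def __init__(self):
        self.field = 42


class Impostor:
    """Claims to be an ExtType through __class__ but has a different layout."""

    @property
    def __class__(self):
        return ExtType


obj = Impostor()

python_check = isinstance(obj, ExtType)  # Python semantics: consults obj.__class__
exact_check = type(obj) is ExtType       # layout-level check: looks at the real type
has_field = hasattr(obj, "field")

print(python_check, exact_check, has_field)  # → True False False
```

Code that went on to read the struct field directly after the first check would be reading memory that is not there, which is the reason for wanting the stricter semantics when the tested type is an extension type.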
Stefan

From stefan_ml at behnel.de Mon May 12 10:23:05 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 12 May 2008 10:23:05 +0200
Subject: [Cython] Language stability
In-Reply-To: <4826ED91.70706@student.matnat.uio.no>
References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no>
Message-ID: <4827FE69.7020804@behnel.de>

Hi,

Dag Sverre Seljebotn wrote:

> - Is it stable in the sense that I don't have to keep up with new
> developments in the languages?

It should be (and stay) that stable. The latest syntax change regarding the for loop is not required for Cython, where the (IMHO much more obvious) cdef + range() syntax is optimised, which also works in Pyrex. We should just take care not to remove the older for-from syntax for compatibility reasons (which I don't remember considering for Cython anyway). And I'm still fighting against the two new functions that Greg introduced for type testing.

To make the rest a bit shorter: I'm strongly -1 on supporting different language versions and even -1 on bug-fixing multiple branches. That's just two reasons more to keep the language itself stable. And if we really have to change the syntax, that's another two reasons more why we should consider this very, very carefully and only let syntax changes in that are very well backed by arguments.

BTW, I would find it somewhat ironic if the Cython project, which I found to be a lot more agile than Pyrex lately, could agree on keeping the language more stable than Pyrex itself.
;) Stefan From dagss at student.matnat.uio.no Mon May 12 10:34:15 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 10:34:15 +0200 Subject: [Cython] Language stability In-Reply-To: <4827FE69.7020804@behnel.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> Message-ID: <48280107.204@student.matnat.uio.no> > To make the rest a bit shorter: I'm stronly -1 on supporting different > language versions and even -1 on bug-fixing multiple branches. That's just two Yes, yes, I'm not saying we do! I really hope we stay backwards-compatible. What I am saying is that we should have a policy (and a website statement) saying that *if* we break backwards-compatability, *then* we will commit to either backporting bugfixes or support multiple language versions (and *if* that happens, I think the latter one is going to be less of a pain, but one can discuss that if that happens). Without that, we can't really give people the security they need, they're at our mercy that we do not break the compatability, so to speak. I think "from __future__ import division" is an excellent example of supporting multiple language levels in a nice way, it is not that horrible to do. But let's hope not. -- Dag Sverre From stefan_ml at behnel.de Mon May 12 10:36:36 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 10:36:36 +0200 Subject: [Cython] cython add vs __add__ In-Reply-To: References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> <482510A1.7090806@canterbury.ac.nz> Message-ID: <48280194.5080708@behnel.de> Hi, Lisandro Dalcin wrote: > On 5/10/08, Greg Ewing wrote: >> Lisandro Dalcin wrote: >> > I believe you are right, it seems a bug, >> > >> >>> cdef class myTest >> > def __add__(myTest self, other): # note declaration: myTest self >> > return self.thisPtr.add(other) >> >> >> It's not a bug. 
With extension types, the first argument >> of operator methods isn't necessarily self, so it's not >> automatically typed as such. > > What do you understand for 'A parameter named self is of the type the > method belongs to'. Anyway, do not you believe the current behavior is > a bit counter-intuitive? I agree that it is counter-intuitive for Python users, but these things are also non-trivial in Python. The Python API actually distinguishes between __add__ etc. and __radd__ etc., which are merged into a single function in Pyrex. Maybe that should be mentioned in the docs. http://docs.python.org/ref/numeric-types.html I think anyone who proves to be able to implement a complete numeric type in Python should be able to implement it in Pyrex - possibly minus some fixes in the Pyrex/Cython documentation. But you definitely will have to read the docs to get this right in Python *and* in Pyrex. Stefan From dagss at student.matnat.uio.no Mon May 12 10:41:31 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 10:41:31 +0200 Subject: [Cython] cython add vs __add__ In-Reply-To: <48280194.5080708@behnel.de> References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> <482510A1.7090806@canterbury.ac.nz> <48280194.5080708@behnel.de> Message-ID: <482802BB.80004@student.matnat.uio.no> > I agree that it is counter-intuitive for Python users, but these things are > also non-trivial in Python. The Python API actually distinguishes between > __add__ etc. and __radd__ etc., which are merged into a single function in > Pyrex. Maybe that should be mentioned in the docs. Wouldn't it be an idea (well, I suppose it's too late now, just wondering what you think) to call it __cadd__ instead then? Having custom functionality is ok, but I think the expectations of Python compliancy gets bigger when it has the same name. 
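The merging of `__add__` and `__radd__` discussed here can be mimicked in plain Python. This is a sketch only: `Money` and `_merged_add` are hypothetical names, and a real Pyrex/Cython extension type gets this behaviour from the C-level slot rather than from code like this. The single function receives the operands in source order, so it has to check which of them is actually an instance of the class:

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    @staticmethod
    def _merged_add(left, right):
        # Like a merged operator method: either operand may be the Money
        # instance, so neither can be assumed to be "self".
        def to_cents(value):
            if isinstance(value, Money):
                return value.cents
            if isinstance(value, int):
                return value
            return None

        a, b = to_cents(left), to_cents(right)
        if a is None or b is None:
            return NotImplemented
        return Money(a + b)

    # Plain Python needs two methods; both pass the operands on in source order.
    def __add__(self, other):
        return Money._merged_add(self, other)

    def __radd__(self, other):
        return Money._merged_add(other, self)


left_result = Money(150) + 50   # Money is the left operand
right_result = 50 + Money(150)  # Money is the right operand
print(left_result.cents, right_result.cents)
```

In the second expression the `Money` instance arrives as the second argument of `_merged_add`, which is why typing the first parameter as the extension type would be wrong in the merged form.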
-- Dag Sverre From dagss at student.matnat.uio.no Mon May 12 10:43:02 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 10:43:02 +0200 Subject: [Cython] Visitors and speed issues In-Reply-To: <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> <48233D62.7000200@student.matnat.uio.no> <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> Message-ID: <48280316.7090503@student.matnat.uio.no> Gary Furnish wrote: > Efficiency does matter to Sage though, and an O(1) overhead is very > far from negligible. Cython uses real C vtables, so there are > absolutely no dict-based lookups involved with cdef class > polymorphism. -Infinity to any proposal that makes code slower. This is interesting; if there's a real-world, noticeable speed penalty to dict-based visitors then I agree they should be made vtable-based instead. I'd like to discuss it a bit more first though: Just to have more data I benchmarked this on my own computer; 10^7 visitor lookups were done in a loop. Doing everything as typed as possible (all arguments typed, all classes and methods cdef-ed) it took 1.7 seconds, while using a full-blown dict-lookup (with 100 items in the dict) it took 8.4 seconds. Assuming an "average" file has 10 000 nodes syntax tree nodes in it, and that there will be 40 visitors passing through the tree in a compilation, that gives ~0,07 seconds for typed and ~0,34 seconds for dict-based, ie a difference in 0,27 seconds per file. So on 100 files that is half a minute wasted. I don't know if that is much or little, how long does Cython spend on compiling 100 files of 10 000 nodes today? So far all is well. 
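The estimate above can be re-derived from the quoted measurements. This is a quick sketch; the variable names are made up here, and the node, visitor and file counts are the same assumptions stated in the message:

```python
# Measured: time for 10^7 visitor lookups in each style.
LOOKUPS_BENCHMARKED = 10**7
TYPED_SECONDS = 1.7
DICT_SECONDS = 8.4

# Assumed workload, as in the message.
NODES_PER_FILE = 10000
VISITORS = 40
FILES = 100

lookups_per_file = NODES_PER_FILE * VISITORS

typed_per_file = TYPED_SECONDS / LOOKUPS_BENCHMARKED * lookups_per_file
dict_per_file = DICT_SECONDS / LOOKUPS_BENCHMARKED * lookups_per_file
extra_per_file = dict_per_file - typed_per_file
extra_total = extra_per_file * FILES

print("typed: %.3f s/file" % typed_per_file)
print("dict:  %.3f s/file" % dict_per_file)
print("extra: %.3f s/file, %.1f s for %d files" % (extra_per_file, extra_total, FILES))
```

This gives roughly 0.07 s and 0.34 s per file, and a bit under 27 s of extra time over 100 files, in line with the "half a minute" figure.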
However, the problem is that Cython code is not using cdef classes, nor typing the method arguments! Nor will "cdef class" be likely to hit the source, because even if we'd like to be able to compile it, it is a must that it can use the Python interpreter in order to compile Cython with Cython (we don't want to bother with having a bootstrapping chain for Cython I think...). So in order to cash out on this one must first make Python code produce cdef classes with typed method signatures. Of course full-blown inference of this is not needed, decorator support (or some comment format support, if we want it compilable in Python 2.3) could be used instead. Thoughts? (BTW, when I said O(1) I was referring to class construction, which happens once per invocation of Cython, not dict lookups (in this context I consider dict lookups as giving an "O(n) penalty", since there would be a constant number of them per node, and the number of nodes is proportional to input file size). If the invocation time of Cython is a problem then one better just have Cython compile many files in one process session, the Python interpreter starts up relatively slowly anyway...) -- Dag Sverre From dagss at student.matnat.uio.no Mon May 12 10:57:18 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 10:57:18 +0200 Subject: [Cython] Visitors and compiling Cython in Cython Message-ID: <4828066E.6040307@student.matnat.uio.no> Gary recently voiced concern over dict-based visitor lookups perhaps giving a speed penalty Now, I might be wrong, but vtables can only be used if classes or methods are declared as "cdef", right? And on the other hand, I think we definitely want Cython to be runnable in Python, so that we don't have to ship Cython in .c-form to people who don't already have Cython installed, not to mention having to "bootstrap" any new Cython feature using the current Cython features (agreed or not?) 
Is there going to be a way to have this (i.e., vtable-based Cython) in the foreseeable future? Using decorators certainly comes to mind, however that breaks 2.3 support. Perhaps special comments? #cython: cdef Visitor: class Visitor: #cython: cdef object pre_FuncDefNode(FuncDefNode node): def pre_FuncDefNode(node): ... Because if this is never going to happen, I think we might as well go for nice, dict-based visitors. If it is going to happen though, it is a case for vtable-based visitors. Dag Sverre Appendix 1: For people who haven't followed that discussion, dict-based visitors refers to writing visitors in a way similar to this: class MyVisitor(Visitor): def make_function_scope(node): ... matches = [ pre(FuncDefNode, make_function_scope) ] with no changes to existing node classes, while vtable-bases would be like this: class MyVisitor(Visitor): def pre_FuncDefNode(self, node): # make function scope .... and each node class must have something like this added (*no*, this can't be done magically some other place like you're used to, that would break the vtable-ness of it): class FuncDefNode(Node): def accept(self, v): v.visit_FuncDefNode(self) From stefan_ml at behnel.de Mon May 12 11:18:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 11:18:44 +0200 Subject: [Cython] Language stability In-Reply-To: <48280107.204@student.matnat.uio.no> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <48280107.204@student.matnat.uio.no> Message-ID: <48280B74.1030106@behnel.de> Hi, Dag Sverre Seljebotn wrote: > I think "from __future__ import division" is an excellent example of > supporting multiple language levels in a nice way, it is not that > horrible to do. BTW, I actually think we need to support that anyway if we want to compile Python code. 
Things like >>> from __future__ import unicode_literals >>> from __future__ import print_function provide a simple upgrade path for Py3 syntax support. Both would just be enabled for plain Python code (.py) when the compiler runs in a Py3 environment. Now that you mention it, maybe changes in Cython and Pyrex could be enabled by >>> from __future__ cimport some_new_pyrex_syntax_feature (mind the "cimport") ? We would then still need a roadmap for a versioned upgrade path if we add a syntax change. Stefan From stefan_ml at behnel.de Mon May 12 11:49:55 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 11:49:55 +0200 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <4828066E.6040307@student.matnat.uio.no> References: <4828066E.6040307@student.matnat.uio.no> Message-ID: <482812C3.1040908@behnel.de> Hi, Dag Sverre Seljebotn wrote: > Gary recently voiced concern over dict-based visitor lookups perhaps > giving a speed penalty Why? You could use lazy initialisation. Just look up a node type in the dict (which is fast) and if it's not in there yet, walk it's base types (adding each one to the dict) until there is one that already is in the dict, which then determines the result for the lookup and for the newly added base types. That way, you'd quickly end up with a complete flattened type hierarchy in the dict. The total number of dict updates will be <= the total number of node types and the time for a node type lookup is still O(1). The type registration would follow the same algorithm as a lookup, but the new callbacks would be added to the results of the found base types. > I think we > definitely want Cython to be runnable in Python, so that we don't have > to ship Cython in .c-form to people who don't already have Cython > installed, not to mention having to "bootstrap" any new Cython feature > using the current Cython features (agreed or not?) 
Running it in C will only ever be an optimisation, possibly triggered by an option to setup.py that will compile Cython using Cython itself before installing it as binary module. > Is there going to be a way to have this (i.e., vtable-based Cython) in > the foreseeable future? Using decorators certainly comes to mind, > however that breaks 2.3 support. Perhaps special comments? > > #cython: cdef Visitor: > class Visitor: Robert already stated his opinion on this a couple of times and I second it: hiding code semantics in Python comments is a bad idea. We should only do this if we really find we can't do something any other way. I don't see that here. Stefan From dagss at student.matnat.uio.no Mon May 12 12:06:15 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 12:06:15 +0200 Subject: [Cython] Visitors and compiling Cython in Cython Message-ID: <48281697.2000907@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> Gary recently voiced concern over dict-based visitor lookups perhaps >> giving a speed penalty > > Why? You could use lazy initialisation. Just look up a node type in the dict > (which is fast) and if it's not in there yet, walk it's base types (adding > each one to the dict) until there is one that already is in the dict, which > then determines the result for the lookup and for the newly added base types. Sure, sure, that's already being done and in fact you'll see that *exact* procedure in my patch! :-) The issue Gary has is indeed with raw dict lookup performance. (Consider that this will be between 10 and 50 times to every single node in the tree of a file.) I'll quote Gary so you have the context: """ Efficiency does matter to Sage though, and an O(1) overhead is very far from negligible. Cython uses real C vtables, so there are absolutely no dict-based lookups involved with cdef class polymorphism. -Infinity to any proposal that makes code slower. 
Well, clarifying what I meant a bit more, the *biggest* speed loss anywhere is dictionary lookups. If you're going to use dictionary lookups at object time, you lose most of the advantage of using cython to compile cython. """ Apparently, you don't agree with Gary here :-) Since this will have a big effect on all of Cython I thought I'd investigate it properly before truncating Gary's -Infinity to -1 :-) -- Dag Sverre From stefan_ml at behnel.de Mon May 12 12:54:41 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 12:54:41 +0200 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <48281697.2000907@student.matnat.uio.no> References: <48281697.2000907@student.matnat.uio.no> Message-ID: <482821F1.8060808@behnel.de> Hi, Dag Sverre Seljebotn wrote: > I'll quote Gary so you have the context: > > """ > Efficiency does matter to Sage though, and an O(1) overhead is very > far from negligible. Cython uses real C vtables, so there are > absolutely no dict-based lookups involved with cdef class > polymorphism. -Infinity to any proposal that makes code slower. > > > > Well, clarifying what I meant a bit more, the *biggest* speed loss > anywhere is dictionary lookups. If you're going to use dictionary > lookups at object time, you lose most of the advantage of using cython > to compile cython. > """ I don't think the intention to compile Cython should impact our design decisions. If a Cython-compiled Cython compiler is not any faster than a Python-based one, then that's a reason to improve Cython, not its code.
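For reference, the dict-based lookup being debated, with the lazy flattening of the type hierarchy described earlier in the thread, might look roughly like this in plain Python. This is a sketch under assumptions: `VisitorTable`, `register` and `lookup` are hypothetical names and are not taken from the actual patch, and the node classes are stand-ins:

```python
class Node(object):
    pass


class StatNode(Node):
    pass


class FuncDefNode(StatNode):
    pass


class VisitorTable(object):
    def __init__(self):
        self._handlers = {}

    def register(self, node_type, handler):
        # Assumption: all registration happens before the first lookup,
        # otherwise entries cached by lookup() could become stale.
        self._handlers[node_type] = handler

    def lookup(self, node_type):
        try:
            return self._handlers[node_type]  # common case: one dict lookup
        except KeyError:
            pass
        # Walk the base types until one is already in the dict.
        missing = []
        handler = None
        for base in node_type.__mro__:
            if base in self._handlers:
                handler = self._handlers[base]
                break
            missing.append(base)
        # Cache the result for every type passed on the way (None if
        # nothing was registered for any base type).
        for skipped in missing:
            self._handlers[skipped] = handler
        return handler


def visit_any_node(node):
    return "visited %s" % type(node).__name__


table = VisitorTable()
table.register(Node, visit_any_node)
handler = table.lookup(FuncDefNode)
print(handler(FuncDefNode()))
```

After the first `lookup(FuncDefNode)`, both `FuncDefNode` and `StatNode` are in the dict, so later lookups for either are a single dict access. That per-node dict access is the cost the thread is arguing about.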
Stefan From stefan_ml at behnel.de Mon May 12 12:58:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 12:58:02 +0200 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <48281697.2000907@student.matnat.uio.no> References: <48281697.2000907@student.matnat.uio.no> Message-ID: <482822BA.8070800@behnel.de> Hi, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> You could use lazy initialisation. Just look up a node type in the dict >> (which is fast) and if it's not in there yet, walk it's base types (adding >> each one to the dict) until there is one that already is in the dict, which >> then determines the result for the lookup and for the newly added base types. > > Sure, sure, that's already being done and in fact you'll see that > *exact* procedure in my patch! :-) Obviously. :) > The issue Gary has is indeed with raw > dict lookup performance. (Consider that this will be between 10 and 50 > times to every single node in the tree of a file.) Unless you annotate a parse tree with all its transformers in one traversal step. Stefan From stefan_ml at behnel.de Mon May 12 13:04:49 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 12 May 2008 13:04:49 +0200 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <4828066E.6040307@student.matnat.uio.no> References: <4828066E.6040307@student.matnat.uio.no> Message-ID: <48282451.1000202@behnel.de> Hi again, Dag Sverre Seljebotn wrote: > #cython: cdef Visitor: > class Visitor: > > #cython: cdef object pre_FuncDefNode(FuncDefNode node): > def pre_FuncDefNode(node): What about just using plain old "cdef" to make it valid Cython code, and then run a cy2py tool over it in setup.py that strips it down into valid Python code? That's how 2to3 works. The tool could just bail out if it finds anything that's not pythonisable, such as cimports and C pointer types. 
The down side is obviously that this would prevent us from running Cython from the source directory ... Stefan From dalcinl at gmail.com Mon May 12 14:29:34 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 12 May 2008 09:29:34 -0300 Subject: [Cython] String interning and Python 3 In-Reply-To: <31D60ED7-91BE-435B-A04B-AF14E733FFAB@math.washington.edu> References: <4825AFFB.6040305@behnel.de> <31D60ED7-91BE-435B-A04B-AF14E733FFAB@math.washington.edu> Message-ID: On 5/11/08, Robert Bradshaw wrote: > It also offers the > advantage that the lookup strings don't need to be re-allocated each > time they're needed. Thats very true on Cython as it has the table. Interestingly, Python sources uses PyString_InternFromString() in many places, but this function do actually create a new tmp string, but in the end you get the interned string. > > On 5/10/08, Stefan Behnel wrote: > >> Hi, > >> > >> I'm wondering how to continue the support for this feature given > >> the fact that > >> identifiers are Unicode strings in Py3. We currently only intern > >> byte strings > >> that look like Python identifiers, so in Py3, they simply no > >> longer look like > >> identifiers, as they are not Unicode strings. > >> > >> I can see four ways how to deal with this: > >> > >> 1) drop string interning completely > >> > >> 2) disable string interning in Py3 and use normally created byte > >> strings instead > >> > >> 3) keep separate sets of identifier-like byte strings and unicode > >> strings in > >> the compiler and write them into the C file. Then, depending on > >> the Python > >> version, either intern the byte strings or the unicode strings, > >> and create the > >> other set as un-interned strings. > >> > >> 4) keep the information if a string should be interned for all > >> strings we deal > >> with (bytes and unicode), remove the intern tab and merge it with > >> the general > >> string tab by adding an additional field "intern". 
Then > >> __Pyx_InitStrings() > >> would create the strings differently depending on the compile > >> time Python > >> version, i.e., it would intern Unicode identifiers in Py3 and > >> byte string > >> identifiers in Py2, and create everything else as normal strings. > >> > >> Personally, I favour 4) - although I could live with 1) - but > >> since I'm not > >> quite sure what the original intention of string interning was > >> (saving > >> memory?), I'd like to hear other opinions first. > >> > >> Stefan > >> _______________________________________________ > >> Cython-dev mailing list > >> Cython-dev at codespeak.net > >> http://codespeak.net/mailman/listinfo/cython-dev > >> -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From languitar at semipol.de Mon May 12 15:29:45 2008 From: languitar at semipol.de (Johannes Wienke) Date: Mon, 12 May 2008 15:29:45 +0200 Subject: [Cython] assigning to struct member Message-ID: <48284649.9030709@semipol.de> Hi again, maybe I'm blind or I don't know but how do I assign a value to a
struct member? This code: cdef plugData *data = malloc(sizeof(plugData)) data.ident = "foo" with this definition of plugData: ctypedef struct plugData: plugDefinition *plug char *ident void *data generates an error: "Object of type 'plugData' has no attribute 'ident'" What's wrong with this? Thanks Johannes -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/9c4cbaae/attachment.pgp From dagss at student.matnat.uio.no Mon May 12 16:12:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 16:12:25 +0200 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <482821F1.8060808@behnel.de> References: <48281697.2000907@student.matnat.uio.no> <482821F1.8060808@behnel.de> Message-ID: <48285049.5060404@student.matnat.uio.no> > I don't think the intention to compile Cython should impact our design > decisions. If a Cython compiled Cython compiler is not any faster than a > Python based one, then that's a reason to improve Cython, not its code. The issue here is that using the "classical visitor pattern" is theoretically a much more likely candidate for optimization than dict-based visitors. Consider them different algorithms which currently perform at equal speed, but one of them has a much higher potential for hypothetical future optimization. No matter how smart Cython becomes it cannot switch to faster algorithms... >> The issue Gary has is indeed with raw >> dict lookup performance. (Consider that this will be between 10 and 50 >> times to every single node in the tree of a file.) > > Unless you annotate a parse tree with all its transformers in one traversal step. This is highly non-trivial though -- the output of one transform is changed and new nodes, which are the input to the next transform. 
And, the output of one "node transform" is potentially dependant on anything in child nodes, parent nodes, sibling nodes etc. To get anywhere one would at least have to put in restrictions on what a node visitors was allowed to read from the tree (likely inspect children and ancestors but not following-siblings). And still it is not simple. Let's do that when everything else is working :-) > What about just using plain old "cdef" to make it valid Cython code, and then > run a cy2py tool over it in setup.py that strips it down into valid Python > code? That's how 2to3 works. The tool could just bail out if it finds anything > that's not pythonisable, such as cimports and C pointer types. > > The down side is obviously that this would prevent us from running Cython from > the source directory ... Good idea. (I actually hope nothing like this is attempted for a while, but what our possibilities are affects how visitors are written...). Still my own summary is: This doesn't matter that much now -- the important thing is getting visitors at all. No matter what solution is picked, going from one to the other will take about a day of conceptually very simple work. So I'm still for dict-based; allows us to move faster, until something like the above is actually in place. -- Dag Sverre From dalcinl at gmail.com Mon May 12 16:12:56 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 12 May 2008 11:12:56 -0300 Subject: [Cython] assigning to struct member In-Reply-To: <48284649.9030709@semipol.de> References: <48284649.9030709@semipol.de> Message-ID: Do you also have a 'cdef class plugData' defined somewhere ? On 5/12/08, Johannes Wienke wrote: > Hi again, > > maybe I'm blind or I don't know but how do I assign a value to a struct > member? 
> > This code: > cdef plugData *data = malloc(sizeof(plugData)) > data.ident = "foo" > > with this definition of plugData: > ctypedef struct plugData: > plugDefinition *plug > char *ident > void *data > > generates an error: > "Object of type 'plugData' has no attribute 'ident'" > > What's wrong with this? > > Thanks > > Johannes > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Mon May 12 19:05:20 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 10:05:20 -0700 Subject: [Cython] defining module constants In-Reply-To: <20080512010743.5a583b3c.jim-crow@rambler.ru> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> Message-ID: <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> Could you clarify? I did add something that lets you do cdef public enum foo: a b c = 10 ... and these will be exported into the public module dict. On May 11, 2008, at 11:07 AM, Anatoly A. Kazantsev wrote: > Hello. > > I have same problem. > Will be any answers or advices? :-) > > -- > Protect your digital freedom and privacy, eliminate DRM, learn more at > http://www.defectivebydesign.org/what_is_drm -------------- next part -------------- A non-text attachment was scrubbed...
Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/fbe6445f/attachment.pgp From robertwb at math.washington.edu Mon May 12 19:30:58 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 10:30:58 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <48275E71.2030109@student.matnat.uio.no> References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> Message-ID: <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> On May 11, 2008, at 2:00 PM, Dag Sverre Seljebotn wrote: > Ok, been thinking some more. To sum up, what we're after is some > way of > having the following work (A is a class with a value field, printme > is a > method somehow selected for inlining and optimization for Cython): > > cdef A a = get_a() > a.printme() > a.printme() > > turns into > > cdef A a = get_a() > print a.value > print a.value > > , while this: > > cdef A(value=23) a = get_a() > a.printme() > a.printme() > > would be turned into > > cdef A a = get_a() > if a.value != 23: raise TypeError(...) # line (a) > print 23 # line (b) > print 23 > > (This is the final result, not saying it is a recipe for how it > happens > magically). > > So, what is happening is that we want to set up some assumptions about > an object (without actually changing the object); and have the code > after generated by making use of those assumptions. Yes, this is exactly what I was saying, and I think it'll be very easy to implement. sorry if it wasn't totally clear. > > I have some seperate proposals building further on Robert's ideas > here. > > 1) An explicit method for the assumptions bit. I.e: > > cdef class A: > cdef int value > > cdef __assume__(self, int value): > if self.value != value: raise AssumptionError(...) 
> > The argument list to this function is basically a more explicit > declaration of type arguments. The reason I want this, so explicitly > (rather than using a more general __coerce__ which must probably > also be > added) is that it hints strongly to the writer of the class A to treat > self.value as "const". Once an assumption is made by having __assume__ > called, you cannot "back out" and change self.value. This seem to make > it explicit that value should be treated as a const after > construction; > and it leaves the contract for the name, number and types of "type > arguments" on the class creator side rather than the caller side. > > In addition to just checking assumptions, this method can also put > general constraints on the arguments (i.e., "if value < 0: raise > ValueError(...)"). > > Important: This does *not* address how one can make optimizations > later > on, line (b), it is simply a way to insert the assumption line, line > (a). Also, __assume__ can simply be called in the normal way. > > 2) With this in place, it seems ok to follow Robert's proposal and > automatically treat fields having the same name as type arguments as > known compile-time. The parameter list to __assume__ restricts which > fields can be used. > > I am still thinking about something more explicit like > > cdef class A: > cdef int value > cdef int not_possible_typearg > > __typearguments__ = ["value", "compiletimeonlyarg"] > > but it doesn't seem strictly necesarry as the argument list to > __assume__ can serve the same role; and in a possibly more dynamic > way, ie I'm generally opposed to adding extra keywords like __typearguments__, that's why I've been writing A(len=11) rather than A(11). > cdef A(constant_alpha=True, alpha=4) a = x # ok > print a.alpha # ok to optimize... > > cdef A(constant_alpha=False, alpha=4) a = x # __assume__ might > raise... > # .. run-time error because of invalid combination, > # .. so code below will never run, which is lucky because > # .. 
alpha is now for some reason changing constantly. > print a.alpha # Will be optimized but never run > > 3) I still want to throw __init__ into the mix. The main reason: For > type inference, it would be nice if > > A = ndarray(shape=(4, 4), dtype=float64, buffer=arr) > > would automatically (because type arguments are somehow interlinked > with > constructor arguments) be type-inferred to > > cdef ndarray(shape=(4,4), dtype=float64) A > A = ndarray(shape=(4,4), dtype=float64, buffer=arr) > > If so, keeping () for the type argument syntax would also make more > sense and be less confusing. > > Perhaps all that is needed is this rule at the type-inference stage: > > * A call to a known constructor (of a type which is a candidate for > typing) obviously leads to typing it explicitly > * At the same time, any arguments passed to the constructor is > checked > for a match with the __assume__ signature -- the arguments that > __assume__ can take are then put into the type arguments list. > > (If so, we can pretty much ignore this for now, but we have a > "defense" > for using the () syntax.) I had actually thought about the "assume" perspective too. The two issues I have with it is that it adds an extra "special" method __assume__ which could just as well be written cdef A a = get_a() assume(a.len = 11) and also that it requires the use of full control flow to do any reasoning (e.g. there's the variable before assumption, the variable after, and the variable which (depending on branching) may or may not have been certified to have a given property. Then further __assumes__ would be illegal? Or just ones that contradict?) It just gets a lot messier than simply adding the data to the compile-time type of the object. Another problem is that is specifically requires one to declare ahead of time what compile-time assumptions can be made, rather than letting the user of the .pxd file specify things ahead of time for explicit optimization. 
The mapping of __init__ parameters to type parameters (for use with type inference) could be arbitrarily complicated, and I don't know how to do that without having the compiler actually execute code at compile time. - Robert From languitar at semipol.de Mon May 12 19:33:42 2008 From: languitar at semipol.de (Johannes Wienke) Date: Mon, 12 May 2008 19:33:42 +0200 Subject: [Cython] assigning to struct member In-Reply-To: References: <48284649.9030709@semipol.de> Message-ID: <48287F76.6050002@semipol.de> Am 05/12/2008 04:12 PM schrieb Lisandro Dalcin: > Do you also have a 'cdef class plugData' defined somewhere ? No, do I need that? > On 5/12/08, Johannes Wienke wrote: >> Hi again, >> >> maybe I'm blind or I don't know but how do I assign a value to a struct >> member? >> >> This code: >> cdef plugData *data = malloc(sizeof(plugData)) >> data.ident = "foo" >> >> with this definition of plugData: >> ctypedef struct plugData: >> plugDefinition *plug >> char *ident >> void *data >> >> generates an error: >> "Object of type 'plugData' has no attribute 'ident'" >> >> What's wrong with this? >> >> Thanks >> >> Johannes >> >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> >> >> > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/4034c49a/attachment-0001.pgp From robertwb at math.washington.edu Mon May 12 19:52:55 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 10:52:55 -0700 Subject: [Cython] Visitors and speed issues In-Reply-To: <48280316.7090503@student.matnat.uio.no> References: <481AF97B.8030305@student.matnat.uio.no> <481BFDDA.8060007@behnel.de> <1078.195.159.185.117.1209803236.squirrel@webmail.uio.no> <452383C0-47A5-4E0D-840E-0158348C5755@math.washington.edu> <48215041.7070805@student.matnat.uio.no> <5303982D-51B8-4D6F-B363-BDC8C9729892@math.washington.edu> <48233D62.7000200@student.matnat.uio.no> <8f8f8530805110126v697012a2pd5770fa877b89dbd@mail.gmail.com> <48280316.7090503@student.matnat.uio.no> Message-ID: On May 12, 2008, at 1:43 AM, Dag Sverre Seljebotn wrote: > Gary Furnish wrote: >> Efficiency does matter to Sage though, and an O(1) overhead is very >> far from negligible. Cython uses real C vtables, so there are >> absolutely no dict-based lookups involved with cdef class >> polymorphism. -Infinity to any proposal that makes code slower. > > This is interesting; if there's a real-world, noticeable speed penalty > to dict-based visitors then I agree they should be made vtable-based > instead. I'd like to discuss it a bit more first though: > > Just to have more data I benchmarked this on my own computer; 10^7 > visitor lookups were done in a loop. Doing everything as typed as > possible (all arguments typed, all classes and methods cdef-ed) it > took > 1.7 seconds, while using a full-blown dict-lookup (with 100 items > in the > dict) it took 8.4 seconds. I am very surprised that the results are so close--I'm getting a 14x speedup for cdef vs. def methods in my tests. 
> Assuming an "average" file has 10 000 nodes syntax tree nodes in > it, and > that there will be 40 visitors passing through the tree in a > compilation, that gives ~0,07 seconds for typed and ~0,34 seconds for > dict-based, ie a difference in 0,27 seconds per file. So on 100 files > that is half a minute wasted. > > I don't know if that is much or little, how long does Cython spend on > compiling 100 files of 10 000 nodes today? > > So far all is well. However, the problem is that Cython code is not > using cdef classes, nor typing the method arguments! Nor will "cdef > class" be likely to hit the source, because even if we'd like to be > able > to compile it, it is a must that it can use the Python interpreter in > order to compile Cython with Cython (we don't want to bother with > having > a bootstrapping chain for Cython I think...). > > So in order to cash out on this one must first make Python code > produce > cdef classes with typed method signatures. Of course full-blown > inference of this is not needed, decorator support (or some comment > format support, if we want it compilable in Python 2.3) could be used > instead. > > Thoughts? My hope is that Cython will have an option intelligently turn some classes/methods into cdef classes/cpdef methods without forcing the user to be explicit (as part of making it a good Python compiler). > (BTW, when I said O(1) I was referring to class construction, which > happens once per invocation of Cython, not dict lookups (in this > context > I consider dict lookups as giving an "O(n) penalty", since there would > be a constant number of them per node, and the number of nodes is > proportional to input file size). If the invocation time of Cython > is a > problem then one better just have Cython compile many files in one > process session, the Python interpreter starts up relatively slowly > anyway...) 
> > > -- > Dag Sverre > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Mon May 12 20:03:27 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 11:03:27 -0700 Subject: [Cython] Language stability In-Reply-To: <48280B74.1030106@behnel.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <48280107.204@student.matnat.uio.no> <48280B74.1030106@behnel.de> Message-ID: On May 12, 2008, at 2:18 AM, Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> I think "from __future__ import division" is an excellent example of >> supporting multiple language levels in a nice way, it is not that >> horrible to do. > > BTW, I actually think we need to support that anyway if we want to > compile > Python code. Things like > >>>> from __future__ import unicode_literals >>>> from __future__ import print_function > > provide a simple upgrade path for Py3 syntax support. Both would > just be > enabled for plain Python code (.py) when the compiler runs in a Py3 > environment. > > Now that you mention it, maybe changes in Cython and Pyrex could be > enabled by > >>>> from __future__ cimport some_new_pyrex_syntax_feature > > (mind the "cimport") ? We would then still need a roadmap for a > versioned > upgrade path if we add a syntax change. The from __future__ is a good way of going about doing things. Backwards compatibility is a high priority for us, but I would like to note that we are at the mercy of Python in terms of syntax changes (though they are in general very stable). 
- Robert From dagss at student.matnat.uio.no Mon May 12 20:10:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 12 May 2008 20:10:25 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> Message-ID: <48288811.9040602@student.matnat.uio.no> (Rearranged email to order of urgency.) > type of the object. Another problem is that is specifically requires > one to declare ahead of time what compile-time assumptions can be > made, rather than letting the user of the .pxd file specify things > ahead of time for explicit optimization. This "problem" was the exact reason I did this, and is a feature!! Perhaps there's more I don't understand. Tell me then how you prevent something like this: cdef timer(seconds=0) t = timer(0) print t.seconds # prints 0 sleep(10) print t.seconds # prints 0 (!) The whole point was to leave what one can safely (!) make assumptions about up to the class designer (and presumably pxd author). Otherwise this just becomes some dangerous shoestring feature, not something you give to engineers using NumPy without deep programming knowledge... Possible assumptions should be part of the published API of the class! (Is this something we will simply never agree on? I really hope I am misunderstanding something, would make all of this much easier.) (About __assume__:) > and also that it requires the use of full control flow to do any > reasoning (e.g. there's the variable before assumption, the variable > after, and the variable which (depending on branching) may or may not > have been certified to have a given property. Then further > __assumes__ would be illegal? Or just ones that contradict?) It just > gets a lot messier than simply adding the data to the compile-time You misunderstood me here! 
I specifically noted that __assume__ only did the checking part, and
does *not* magically constitute the assumptions themselves (that would
be insane :-), probably challenging halting and NP-completeness and
whatnot).

If we want to up the ante from there, I think __assume__ could return a
dict like this:

    cdef __assume__(self, len):
        ...
        return { "_len" : len }

in order to provide renames etc. This *does* have the problems you
mention though, but it is somewhat easier to arbitrarily raise errors
for complex expressions. But I'm not advocating it this time around.

> The mapping of __init__ parameters to type parameters (for use with
> type inference) could be arbitrarily complicated, and I don't know
> how to do that without having the compiler actually execute code at
> compile time.

I don't see why. What I meant is simply:

- Take the parameters passed to the constructor.
- Take the intersection of their names with the __assume__ method
  signature.
- Pass the same parameters (set to the same expressions) to __assume__.

OK, I suppose if you have non-trivial expressions as parameters this
fails, so add this rule then:

- However, if the expression of a parameter is not a compile-time
  value, don't pass it to __assume__ anyway.

If we know enough to attempt type inference, we'll know enough to do
this, I think.

If this is not accepted, I'm leaning against using the () syntax,
because you'll want to do stuff like (using [] syntax):

    x = ndarray[nd=2, dtype=float64](shape=(2,2), dtype=float64)

in a type-inferred environment, to ensure that x has efficient access,
and using () is very ambiguous in the expression above.
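
The name-intersection rule described in this message can be sketched in plain Python. This is only an illustration under stated assumptions: `__assume__` is the hypothetical hook from the proposal (not an existing Cython feature), the class `A` and the helper `infer_type_arguments` are made-up names, and "compile-time value" is approximated here by "a Python literal".

```python
import ast
import inspect


class A:
    # Hypothetical hook from the proposal; not a real Cython special method.
    def __assume__(self, shape, nd):
        pass


def infer_type_arguments(cls, call_source):
    """Return the constructor keywords that could become type arguments.

    Keeps a keyword only if (1) __assume__ accepts a parameter of that
    name and (2) its value is a literal, standing in for a compile-time
    value.
    """
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a constructor call")
    allowed = set(inspect.signature(cls.__assume__).parameters) - {"self"}
    type_args = {}
    for kw in call.keywords:
        if kw.arg not in allowed:
            continue  # not part of the __assume__ signature
        try:
            type_args[kw.arg] = ast.literal_eval(kw.value)
        except (ValueError, TypeError):
            pass  # not a literal: leave it as a run-time-only argument
    return type_args


print(infer_type_arguments(A, "A(shape=(4, 4), nd=2, buffer=arr)"))
print(infer_type_arguments(A, "A(shape=(n, n), nd=2)"))
```

In the first call `buffer` is dropped because `__assume__` does not name it; in the second, `shape` is dropped because `(n, n)` is not a literal.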
-- Dag Sverre From robertwb at math.washington.edu Mon May 12 20:12:27 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 11:12:27 -0700 Subject: [Cython] Visitors and compiling Cython in Cython In-Reply-To: <482821F1.8060808@behnel.de> References: <48281697.2000907@student.matnat.uio.no> <482821F1.8060808@behnel.de> Message-ID: <3A26A77E-2D4F-430C-9194-61E4558ECFDE@math.washington.edu> On May 12, 2008, at 3:54 AM, Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> I'll quote Gary so you have the context: >> >> """ >> Efficiency does matter to Sage though, and an O(1) overhead is very >> far from negligible. Cython uses real C vtables, so there are >> absolutely no dict-based lookups involved with cdef class >> polymorphism. -Infinity to any proposal that makes code slower. >> >> >> >> Well, clarifying what I meant a bit more, the *biggest* speed loss >> anywhere is dictionary lookups. If your going to use dictionary >> lookups at object time, you lose most of the advantage of using >> cython >> to compile cython. >> """ > > I don't think the intention to compile Cython should impact our design > decisions. If a Cython compiled Cython compiler is not any faster > than a > Python based one, then that's a reason to improve Cython, not its > code. I fully agree here. I think that when Cython compiles a py file, it should intelligently (optionally) decide to cdef and cpdef classes and methods. Just having the self parameter typed is a huge gain, even if nothing else is. Between that and type inference, there should be significant gains. In terms of adding a visit() function to each node, that is tiny compared to the work of implementing the actual visitor functions for a single phase, so if this can be optimized by using vtables then I think we should probably do it (perhaps eventually, no urgency here). 
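
The two dispatch styles under discussion can be put side by side. The following is only an illustrative Python sketch: the node classes, the `accept` method and both visitor classes are hypothetical names rather than Cython's actual compiler classes, and it makes no claim about the timings quoted in the thread.

```python
class NameNode:
    def __init__(self, name):
        self.name = name

    def accept(self, visitor):
        # Per-node method: the node itself picks the handler, which is the
        # kind of call a cdef class could route through a C vtable.
        return visitor.visit_name(self)


class AddNode:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def accept(self, visitor):
        return visitor.visit_add(self)


class DictVisitor:
    """Dispatch by looking the node's class up in a dict."""

    def __init__(self):
        self.handlers = {NameNode: self.visit_name, AddNode: self.visit_add}

    def visit(self, node):
        return self.handlers[type(node)](node)

    def visit_name(self, node):
        return node.name

    def visit_add(self, node):
        return "(%s + %s)" % (self.visit(node.left), self.visit(node.right))


class MethodVisitor(DictVisitor):
    """Same handlers, but dispatch goes through the node's own method."""

    def visit(self, node):
        return node.accept(self)


tree = AddNode(NameNode("a"), AddNode(NameNode("b"), NameNode("c")))
print(DictVisitor().visit(tree))
print(MethodVisitor().visit(tree))
```

Both visitors produce the same result; they differ only in how the handler for a node is found.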
- Robert From robertwb at math.washington.edu Mon May 12 20:15:18 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 11:15:18 -0700 Subject: [Cython] assigning to struct member In-Reply-To: <48287F76.6050002@semipol.de> References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> Message-ID: <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> On May 12, 2008, at 10:33 AM, Johannes Wienke wrote: > Am 05/12/2008 04:12 PM schrieb Lisandro Dalcin: >> Do you also have a 'cdef class plugData' defined somewhere ? > > No, do I need that? No, you don't. > >> On 5/12/08, Johannes Wienke wrote: >>> Hi again, >>> >>> maybe I'm blind or I don't know but how do I assign a value to a >>> struct >>> member? >>> >>> This code: >>> cdef plugData *data = malloc(sizeof(plugData)) >>> data.ident = "foo" >>> >>> with this definition of plugData: >>> ctypedef struct plugData: >>> plugDefinition *plug >>> char *ident >>> void *data >>> >>> generates an error: >>> "Object of type 'plugData' has no attribute 'ident'" >>> >>> What's wrong with this? I just tried ctypedef struct plugData: # plugDefinition *plug char *ident void *data cdef plugData *data = malloc(sizeof(plugData)) data.ident = "foo" and it works fine for me. Perhaps you're redefining plugData somewhere else in your header files? - Robert -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/b4b916f5/attachment.pgp

From languitar at semipol.de Mon May 12 21:31:28 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Mon, 12 May 2008 21:31:28 +0200
Subject: [Cython] assigning to struct member
In-Reply-To: <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu>
References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu>
Message-ID: <48289B10.9080808@semipol.de>

On 05/12/2008 08:15 PM, Robert Bradshaw wrote:
> I just tried
>
> ctypedef struct plugData:
>     # plugDefinition *plug
>     char *ident
>     void *data
>
> cdef plugData *data = malloc(sizeof(plugData))
> data.ident = "foo"
>
> and it works fine for me. Perhaps you're redefining plugData somewhere
> else in your header files?

Hm, I don't see a problem. Here's a complete test case that won't
compile:

include "stdlib.pxi"
include "plugin.pxi"

cdef plugData *foo():
    cdef plugData *data = malloc(sizeof(plugData))
    data.ident = "foo"

plugin.pxi:
-----------
include "grab.pxi"

cdef extern from "main/plugin.h":
    cdef struct plugData:
        pass

    ctypedef struct plugDefinition:
        char *name
        int abi_version
        void (*init) (plugDefinition *plug, grabParameter *para, int argc, char **argv) except *
        int (*init_options) (plugDefinition *plug) except *
        void (*cleanup) (plugDefinition *plug) except *
        bint (*process) (plugDefinition *plug, char *id, plugData *data) except *
        void *reserved1
        void *reserved2
        void *reserved3

    ctypedef plugDefinition* (*plugGetInfoFunc) (int, bint*)

cdef extern from "main/plugin_comm.h":
    ctypedef void (*plugDataDestroyFunc) (void *data)
    ctypedef void (*plugFunc) ()

    ctypedef struct plugData:
        plugDefinition *plug
        char *ident
        void *data

    ctypedef struct plugDataFunc:
        pass

    ctypedef enum:
        PLUG_PRIORITY_DEFAULT
        PLUG_PRIORITY_MIN
        PLUG_PRIORITY_MAX

Anything wrong
with this?

Thanks
Johannes

From dalcinl at gmail.com Mon May 12 21:37:12 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 12 May 2008 16:37:12 -0300
Subject: [Cython] assigning to struct member
In-Reply-To: <48289B10.9080808@semipol.de>
References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de>
Message-ID:

That's what Robert warned you about!! Indeed, you have two definitions
of plugData:

On 5/12/08, Johannes Wienke wrote:
> plugin.pxi:
> -----------
> include "grab.pxi"
> cdef extern from "main/plugin.h":
>     cdef struct plugData:
>         pass
>
> cdef extern from "main/plugin_comm.h":
>     ctypedef struct plugData:
>         plugDefinition *plug
>         char *ident
>         void *data

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From languitar at semipol.de Mon May 12 21:42:18 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Mon, 12 May 2008 21:42:18 +0200
Subject: [Cython] assigning to struct member
In-Reply-To:
References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de>
Message-ID: <48289D9A.9020905@semipol.de>

On 05/12/2008 09:37 PM, Lisandro Dalcin wrote:
> That's what Robert warned you about!! Indeed, you have two definitions
> of plugData:

Must be blind... thanks!
But how do I resolve the problem in the pxi file, that both structs
depend on each other, without duplicating the struct?

> On 5/12/08, Johannes Wienke wrote:
>> plugin.pxi:
>> -----------
>> include "grab.pxi"
>> cdef extern from "main/plugin.h":
>>     cdef struct plugData:
>>         pass
>>
>> cdef extern from "main/plugin_comm.h":
>>     ctypedef struct plugData:
>>         plugDefinition *plug
>>         char *ident
>>         void *data

From dalcinl at gmail.com Mon May 12 22:00:47 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 12 May 2008 17:00:47 -0300
Subject: [Cython] assigning to struct member
In-Reply-To: <48289D9A.9020905@semipol.de>
References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> <48289D9A.9020905@semipol.de>
Message-ID:

The example above is working for me (note I'm following the cython-devel
repo). Give it a try:

cdef extern from *:
    ctypedef struct A # like a forward declaration ?
    ctypedef struct B # like a forward declaration ?
    ctypedef struct A:
        B* b
    ctypedef struct B:
        A* a

cdef A tmp_a
cdef B tmp_b
tmp_a.b = &tmp_b
tmp_b.a = &tmp_a

On 5/12/08, Johannes Wienke wrote:
> On 05/12/2008 09:37 PM, Lisandro Dalcin wrote:
>
> > That's what Robert warned you about!! Indeed, you have two definitions
> > of plugData:
>
> Must be blind... thanks!
>
> But how do I resolve the problem in the pxi file, that both structs
> depend on each other, without duplicating the struct?
>
> > On 5/12/08, Johannes Wienke wrote:
> >> plugin.pxi:
> >> -----------
> >> include "grab.pxi"
> >> cdef extern from "main/plugin.h":
> >>     cdef struct plugData:
> >>         pass
> >>
> >> cdef extern from "main/plugin_comm.h":
> >>     ctypedef struct plugData:
> >>         plugDefinition *plug
> >>         char *ident
> >>         void *data

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Tue May 13 00:09:15 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 12 May 2008 19:09:15 -0300
Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -)
Message-ID:

I saw some previous posts complaining about using numpy and some name
clashes about 'strides', and other stuff like that...

Well, as time passes, I like Cython more and more, because nasty names
and name clashes can be easily fixed. For example, let's see some code
that uses the convention 'cxxxx' for accessing, at the Cython C level,
what in Python is an 'xxxx' member (e.g. 'cshape' and 'shape'). This is
so easy for me that I will never mind about requiring the numpy
headers to change.

Have fun! I am posting this directly to F. Perez and B. Granger; I would
love to hear their opinion about this hackery. Any chance of it going
into the 'official' numpy.pxy, or a variant shipped with numpy ??
# --------------------------------------------------------------------
cdef extern from "numpy/arrayobject.h":
    int import_numpy "_import_array" () except -1
    ctypedef int npy_intp
    ctypedef extern class numpy.ndarray [object PyArrayObject]:
        cdef char *cdata "data"
        cdef int cndim "nd"
        cdef int *cshape "dimensions"
        cdef int *cstrides "strides"
        cdef int cflags "flags"
# --------------------------------------------------------------------
import_numpy()
# --------------------------------------------------------------------
def prn(ndarray a):
    cdef int i=0
    # C-level access
    print 'ndim: ', a.cndim
    print 'shape: ', [a.cshape[i] for i from 0 <= i < a.cndim]
    print 'strides: ', [a.cstrides[i] for i from 0 <= i < a.cndim]
    print 'flags: ', a.cflags
    # print
    # Python-level access
    print 'ndim: ', a.ndim
    print 'shape: ', a.shape
    print 'strides: ', a.strides
    print 'flags: ', a.flags
# --------------------------------------------------------------------

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Tue May 13 01:56:48 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 13 May 2008 11:56:48 +1200
Subject: [Cython] Arc Riley, PyMill, and FUD
In-Reply-To: <4827FC3A.90109@behnel.de>
References: <85e81ba30805101812r15979a9dk8c178123ade0b347@mail.gmail.com> <85e81ba30805101922m158e9614s5f01fddc68e48fc5@mail.gmail.com> <85e81ba30805102014n7b0f779fm2ac27a6b915ddbae@mail.gmail.com> <4826B413.4080608@behnel.de> <4826BDAF.9050106@canterbury.ac.nz> <4826CB9D.4080309@behnel.de> <4826D309.1050307@canterbury.ac.nz> <4827FC3A.90109@behnel.de>
Message-ID: <4828D940.9060509@canterbury.ac.nz>

Stefan Behnel wrote:
> But Pyrex types are already optimised in a couple of ways.
Yes, but those things are all internal to the type itself. When you call a method or access an attribute, you expect these things to be handled in a way that depends on the type. But isinstance() isn't like that -- it's a function that operates from outside the object. You don't expect functions like that to behave differently depending on the type, unless it's delegating to some method, such as __len__. That's not the case here -- it's not calling any kind of type-specific class-testing method, it's doing everything from outside. > Taking this a bit out of context, this is exactly my point. If you don't know > much about the C-API, you a) won't notice the difference and b) will expect > isinstance(obj, ExtType) to make sure that the type has the expected type struct. Perhaps I can put some context back in here. The places where this matters are inside special methods such as __add__. These methods are already wildly different from their Python counterparts, so you have to know you're in a different land and that different rules apply. Basically you have to RTM (where M is the Pyrex manual, not the Python/C API manual). Actually, it's just occurred to me that using isinstance where you should have used typecheck might not be as dangerous a problem as I thought. The reason is that in order to access any C attributes, you still need to cast it to the appropriate type, and Pyrex will generate a strict type test when you do that. So the worst that can happen is that you get a TypeError from a slightly unexpected place. 
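
The gap between isinstance() and a strict type test can be seen in plain Python. This is a minimal sketch, not Pyrex code: `ExtType`, `Impostor` and `strict_typecheck` are hypothetical names, and the helper only assumes that a strict check looks at the object's real type, in the manner of the C-level PyObject_TypeCheck.

```python
class ExtType:
    """Stand-in for an extension type; a real one would carry C fields."""


class Impostor:
    """Unrelated class that merely claims to be an ExtType."""

    @property
    def __class__(self):
        return ExtType


def strict_typecheck(obj, cls):
    # Looks only at the object's real type, ignoring __class__ overrides.
    return issubclass(type(obj), cls)


obj = Impostor()
print(isinstance(obj, ExtType))        # True: isinstance consults __class__
print(strict_typecheck(obj, ExtType))  # False: the real type is Impostor
```

An object like `obj` passes isinstance() yet has none of the C-level layout of the real type, which is why a strict test at the point of the cast matters.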
-- Greg From greg.ewing at canterbury.ac.nz Tue May 13 02:17:55 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 May 2008 12:17:55 +1200 Subject: [Cython] Language stability In-Reply-To: <4827FE69.7020804@behnel.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> Message-ID: <4828DE33.3060303@canterbury.ac.nz> Stefan Behnel wrote: > The latest syntax change regarding the > for loop is not required for Cython Concerning that, I've decided to remove the deprecation warning, and continue supporting the old syntax. This will be done in the next release; in the meantime, you can apply the following patch: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/undeprecate-for-from.patch > where the (IMHO much more obvious) cdef + > range() syntax is optimised Even in the presence of this optimisation, I don't consider that the integer for-loop syntax is entirely redundant. It concisely and clearly expresses all the possible combinations of including or excluding the lower and upper bound, together with iteration direction. This is something that I don't think is obvious at all with range() once you get beyond the simplest cases. It's for this reason -- notational clarity -- that I introduced the integer for-loop syntax, at least as much as optimisation. This is also the reason I want to simplify the syntax. The 'for i from...' version was a compromise -- I was originally thinking of it as a possible addition to Python itself, and I didn't want to do anything that would cause undue difficulties for Python's parser. But the Pyrex parser is more flexible, and it turns out not to be very difficult at all to do it the way I originally envisaged. 
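
The bound and direction combinations mentioned above can be spelled out against range(). The following is a minimal Python sketch rather than Pyrex code: the helper `integer_for` and its parameters are made up for illustration, and it models only the `for i from lo <op> i <op> hi` form with a step of one.

```python
import operator

# Hypothetical helper: models the integer for-loop
#     for i from lo <op1> i <op2> hi
# Each bound is inclusive ("<=", ">=") or exclusive ("<", ">"); the
# direction of the relations also fixes the direction of iteration.
_OPS = {"<": operator.lt, "<=": operator.le,
        ">": operator.gt, ">=": operator.ge}


def integer_for(lo, op1, op2, hi):
    """Yield the values i takes in 'for i from lo op1 i op2 hi'."""
    if (op1[0], op2[0]) not in (("<", "<"), (">", ">")):
        raise ValueError("both relations must point the same way")
    step = 1 if op1[0] == "<" else -1
    # An exclusive first bound skips lo itself.
    i = lo if op1.endswith("=") else lo + step
    while _OPS[op2](i, hi):
        yield i
        i += step


# The same loops written with range(); the +1/-1 adjustments are what the
# dedicated syntax states directly.
assert list(integer_for(0, "<=", "<", 5)) == list(range(0, 5))
assert list(integer_for(0, "<=", "<=", 5)) == list(range(0, 6))
assert list(integer_for(0, "<", "<=", 5)) == list(range(1, 6))
assert list(integer_for(5, ">=", ">", 0)) == list(range(5, 0, -1))
assert list(integer_for(5, ">", ">=", 0)) == list(range(4, -1, -1))
```

The downward cases show the contrast most clearly: `5 > i >= 0` becomes `range(4, -1, -1)`.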
-- Greg From greg.ewing at canterbury.ac.nz Tue May 13 02:32:11 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 May 2008 12:32:11 +1200 Subject: [Cython] cython add vs __add__ In-Reply-To: <482802BB.80004@student.matnat.uio.no> References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> <482510A1.7090806@canterbury.ac.nz> <48280194.5080708@behnel.de> <482802BB.80004@student.matnat.uio.no> Message-ID: <4828E18B.5070800@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > Wouldn't it be an idea (well, I suppose it's too late now, just > wondering what you think) to call it __cadd__ instead then? If I were designing Pyrex over again, I might do something like that. But then people would expect __add__/__radd__ etc. to be emulated as well, and that would be very complicated to support. It would also be rather inefficient. -- Greg From greg.ewing at canterbury.ac.nz Tue May 13 05:49:49 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 May 2008 15:49:49 +1200 Subject: [Cython] assigning to struct member In-Reply-To: References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> Message-ID: <48290FDD.5040407@canterbury.ac.nz> Lisandro Dalcin wrote: > That's what Robert warned you about!! Indeed, you have two definitions > of plugData: Although there appears to be a compiler bug there somewhere, as it *should* have complained about a redefinition of plugData. -- Greg From jim-crow at rambler.ru Tue May 13 07:55:23 2008 From: jim-crow at rambler.ru (Anatoly A. 
Kazantsev) Date: Tue, 13 May 2008 12:55:23 +0700 Subject: [Cython] defining module constants In-Reply-To: <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> Message-ID: <20080513125523.b320584f.jim-crow@rambler.ru> On Mon, 12 May 2008 10:05:20 -0700 Robert Bradshaw wrote: > Could you clarify? I did add something that lets you do > > cdef public enum foo: > a > b > c = 10 > ... > > and theses will be exported into the public module dict. For the testing purposes I created a new project with one test.pyx file which contains: cdef public enum FOO: BAR = 3 Then build and install it with distutils. In python shell run next commands: >>> import test >>> dir(test) And got next output: ['__builtins__', '__doc__', '__file__', '__name__', '__pyx_capi__'] I have python 2.5.2 and cython 0.9.6.13.1 -- Anatoly A. Kazantsev Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080513/7e07f76c/attachment.pgp From robertwb at math.washington.edu Tue May 13 08:05:30 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 23:05:30 -0700 Subject: [Cython] cython add vs __add__ In-Reply-To: <4828E18B.5070800@canterbury.ac.nz> References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> <482510A1.7090806@canterbury.ac.nz> <48280194.5080708@behnel.de> <482802BB.80004@student.matnat.uio.no> <4828E18B.5070800@canterbury.ac.nz> Message-ID: <7940F278-6536-471A-BD27-5764438F8123@math.washington.edu> On May 12, 2008, at 5:32 PM, Greg Ewing wrote: > Dag Sverre Seljebotn wrote: >> Wouldn't it be an idea (well, I suppose it's too late now, just >> wondering what you think) to call it __cadd__ instead then? > > If I were designing Pyrex over again, I might do > something like that. But then people would expect > __add__/__radd__ etc. to be emulated as well, and > that would be very complicated to support. > > It would also be rather inefficient. I think at this point it would be too hard of a break with backwards compatibility to change. This does give support to supporting a "pure python" mode (e.g. for .py files) that would emulate the __add__/ __radd__ as well as allow stuff like "cdef = 3." - Robert From robertwb at math.washington.edu Tue May 13 08:05:38 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 23:05:38 -0700 Subject: [Cython] defining module constants In-Reply-To: <20080513125523.b320584f.jim-crow@rambler.ru> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> Message-ID: <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu> On May 12, 2008, at 10:55 PM, Anatoly A. 
Kazantsev wrote: > On Mon, 12 May 2008 10:05:20 -0700 > Robert Bradshaw wrote: > >> Could you clarify? I did add something that lets you do >> >> cdef public enum foo: >> a >> b >> c = 10 >> ... >> >> and theses will be exported into the public module dict. > > For the testing purposes I created a new project with one test.pyx > file which contains: > > cdef public enum FOO: > BAR = 3 > > Then build and install it with distutils. > > In python shell run next commands: > >>>> import test >>>> dir(test) > > And got next output: > > ['__builtins__', '__doc__', '__file__', '__name__', '__pyx_capi__'] > > I have python 2.5.2 and cython 0.9.6.13.1 Cython 0.9.6.14. Sorry, very new feature. - Robert -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/48068123/attachment-0001.pgp From robertwb at math.washington.edu Tue May 13 08:11:11 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 12 May 2008 23:11:11 -0700 Subject: [Cython] assigning to struct member In-Reply-To: <48290FDD.5040407@canterbury.ac.nz> References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> <48290FDD.5040407@canterbury.ac.nz> Message-ID: On May 12, 2008, at 8:49 PM, Greg Ewing wrote: > Lisandro Dalcin wrote: >> That's what Robert warned you about!! Indeed, you have two >> definitions >> of plugData: > > Although there appears to be a compiler bug there somewhere, > as it *should* have complained about a redefinition of > plugData. IIRC, something like this was disabled way, way back with SageX. I can't recall the reason but it allowed for "redefinition" of certain functions. Yep: http://hg.cython.org/cython/rev/5e793a8b4ca0 In retrospect, I'm not sure why this is a good idea. 
(Was it due to PyObject* vs object?) - Robert From greg.ewing at canterbury.ac.nz Tue May 13 09:30:48 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 May 2008 19:30:48 +1200 Subject: [Cython] cython add vs __add__ In-Reply-To: <7940F278-6536-471A-BD27-5764438F8123@math.washington.edu> References: <4A9A59A6-7CFF-4F93-A91C-8F22804DC0F6@gmail.com> <482510A1.7090806@canterbury.ac.nz> <48280194.5080708@behnel.de> <482802BB.80004@student.matnat.uio.no> <4828E18B.5070800@canterbury.ac.nz> <7940F278-6536-471A-BD27-5764438F8123@math.washington.edu> Message-ID: <482943A8.30206@canterbury.ac.nz> Robert Bradshaw wrote: > This does give support to supporting a "pure > python" mode (e.g. for .py files) that would emulate the __add__/ > __radd__ as well as allow stuff like "cdef = 3." Pure python code isn't going to be defining any classes with cdef, so there shouldn't be any confusion. Certainly __add__/__radd__ should work as they do in Python in a pure Python class. If you're going to optimise pure Python classes by implementing them as types, then you would need to emulate __add__/__radd__ for them, whether in pure python mode or not. -- Greg From greg.ewing at canterbury.ac.nz Tue May 13 09:36:48 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 May 2008 19:36:48 +1200 Subject: [Cython] ANN: Pyrex 0.9.7.1 Message-ID: <48294510.5060407@canterbury.ac.nz> Pyrex 0.9.7.1 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ This version fixes a bug in the new integer indexing optimisation which causes indexing of a non-sequence type with a C int to fail with a TypeError. What is Pyrex? -------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. From jim-crow at rambler.ru Tue May 13 09:49:11 2008 From: jim-crow at rambler.ru (Anatoly A. 
Kazantsev)
Date: Tue, 13 May 2008 14:49:11 +0700
Subject: [Cython] defining module constants
In-Reply-To: <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu>
References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu>
Message-ID: <20080513144911.ac42579e.jim-crow@rambler.ru>

On Mon, 12 May 2008 23:05:38 -0700
Robert Bradshaw wrote:

> Cython 0.9.6.14. Sorry, very new feature.
>
> - Robert

OK. In 0.9.6.14 it works :-) Thanks a lot!

--
Anatoly A. Kazantsev

Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm

From dagss at student.matnat.uio.no Tue May 13 10:00:15 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 13 May 2008 10:00:15 +0200
Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -)
In-Reply-To:
References:
Message-ID: <48294A8F.1030807@student.matnat.uio.no>

Lisandro Dalcin wrote:
> I saw some previous posts complaining about using numpy and some name
> clashes about 'strides', and other stuff like that...

Yes, that was me, but it was mostly a philosophical point :-)

> Well, as time passes, I like Cython more and more, because nasty names
> and name clashes can be easily fixed. For example, let's see some code
> that uses the convention 'cxxxx' for accessing at the Cython C-level
> what in python is a 'xxxxx' member (e.g. 'cshape' and 'shape'). This is
> so easy for me that I will never mind about requiring the numpy
> headers to change.

Great! I like the convention and the names you picked.
I'll use the same names in the pxd that I will end up signing off after this summer's Summer of Code [1]. Still, it would be nice to ship this as an "official" numpy.pxd before that. I can then make a promise that any code written using it will be forwards-compatible with my pxd after the summer (which I can't do with the current one that the NumPy project ships). One way of going about it is discussed in CEP 106 (http://wiki.cython.org/declaration). I personally prefer this over having the NumPy project ship it for now, as numpy.pxd will be in heavy development tightly coupled with the latest Cython releases. What do you think? It basically boils down to putting it in the Includes directory in the Cython repo (though Cython doesn't seem to automatically add this to the include path, perhaps that can be done?) [1]: http://wiki.cython.org/DagSverreSeljebotn/soc -- Dag Sverre From robertwb at math.washington.edu Tue May 13 10:04:04 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 13 May 2008 01:04:04 -0700 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <48288811.9040602@student.matnat.uio.no> References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> <48288811.9040602@student.matnat.uio.no> Message-ID: On May 12, 2008, at 11:10 AM, Dag Sverre Seljebotn wrote: > (Rearranged email to order of urgency.) > >> type of the object. Another problem is that is specifically requires >> one to declare ahead of time what compile-time assumptions can be >> made, rather than letting the user of the .pxd file specify things >> ahead of time for explicit optimization. > > This "problem" was the exact reason I did this, and is a feature!! > > Perhaps there's more I don't understand. 
Tell me then how you prevent > something like this: > > cdef timer(seconds=0) t = timer(0) > print t.seconds # prints 0 > sleep(10) > print t.seconds # prints 0 (!) > > The whole point was to leave what one can safely (!) make assumptions > about up to the class designer (and presumably pxd author). Otherwise > this just becomes some dangerous shoestring feature, not something you > give to engineers using NumPy without deep programming knowledge... > > Possible assumptions should be part of the published API of the class! > > (Is this something we will simply never agree on? I really hope I am > misunderstanding something, would make all of this much easier.) No, you have a very good point here that I missed. It should probably be an attribute of the attributes themselves, e.g. cdef class A: cdef int can_be_set_at_compile_time len # No, I'm not seriously suggesting that name ... perhaps there could be keywords designating type parameters too... (These could then be used in function signatures? Maybe) They're attributes, they just belong to the type rather than (necessarily) to an instance. > (About __assume__:) >> and also that it requires the use of full control flow to do any >> reasoning (e.g. there's the variable before assumption, the variable >> after, and the variable which (depending on branching) may or may not >> have been certified to have a given property. Then further >> __assumes__ would be illegal? Or just ones that contradict?) It just >> gets a lot messier than simply adding the data to the compile-time > > You misunderstood me here! I specifically noted that __assume__ > only did > the checking part, and does *not* magically constitute the assumptions > themselves (that would be insane :-), probably challenging halting and > NP-completeness and whatnot). > > If we want to up the bets from there, I think __assume__ could > return a > dict like this: > > cdef __assume__(self, len): > ... 
> return { "_len" : len } > > in order to provide renames etc. This *does* have the problems you > mention though, but it is somewhat easier to arbitrarily raise errors > for complex expressions. But I'm not advocating it this time around. The body of __assume__ gets executed at compile time? Is it checking or setting the object's parameters? It's more like assert I guess. I guess it's unclear what __assume__ really is--it's not really a cdef, def, or special function... >> The mapping of __init__ parameters to type parameters (for use with >> type inference) could be arbitrarily complicated, and I don't know >> how to do that without having the compiler actually execute code at >> compile time. > > I don't see why. What I meant is simply: > > - Take parameters passed to constructor. > - Take intersection of the names of these with __assume__ method > signature. > - Pass same parameters (set to same expressions) to __assume__. > > OK, I suppose if you have non-trivial expressions as parameters this > fails, so add this rule then: > > - However, if the expression of a parameter is not a compile-time- > value, > don't pass it to __assume__ anyway. > > If we know enough to attempt type inference, we'll know enough to do > this I think. > > If this is not accepted, I'm leaning against using the () syntax, > because you'll want to do stuff like (using [] syntax): > > x = ndarray[nd=2, dtype=float64](shape=(2,2), dtype=float64) > > in a type-inferred environment, to ensure that x has efficient access, > and using () is very ambigous in the expression above. I have a slight preference for the () notation, but I can see (especially in the above case) how [] is much clearer. Being able to do type inference based on the init parameters seems desirable, but one can't even be sure that A() returns an object of type A (if A implements a python __new__ method). 
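As a small illustration of that last caveat, here is a plain-Python sketch in which calling A() does not give back an instance of A. The class names A and B are hypothetical and only serve the example; nothing here is taken from NumPy or Cython.

```python
class B(object):
    """Unrelated type that A's constructor will hand back."""
    pass


class A(object):
    def __new__(cls, *args, **kwargs):
        # Return an instance of a different class. Python then skips
        # A.__init__, because the result is not an instance of cls.
        return B()

    def __init__(self):
        self.initialised = True


obj = A()
```

So any inference of the form "x = A(...) therefore x is an A" would need to know that A does not override __new__ in this way.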
- Robert

From robertwb at math.washington.edu Tue May 13 10:29:34 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 13 May 2008 01:29:34 -0700
Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -)
In-Reply-To:
References:
Message-ID:

On May 12, 2008, at 3:09 PM, Lisandro Dalcin wrote:

> I saw some previous posts complaining about using numpy and some name
> clashes about 'strides', and other stuff like that...
>
> Well, as time passes, I like Cython more and more, because nasty names
> and name clashes can be easily fixed. For example, let's see some code
> that uses the convention 'cxxxx' for accessing at the Cython C-level
> what in python is a 'xxxxx' member (e.g. 'cshape' and 'shape'). This is
> so easy for me that I will never mind about requiring the numpy
> headers to change.
>
> Have fun! I post this directly to F.Perez and B. Granger, I would love
> to hear their opinion about this hackery. Any chances to go in for the
> 'official' numpy.pxy, or a variant shipped with numpy ??
> # --------------------------------------------------------------------
>
> cdef extern from "numpy/arrayobject.h":
>     int import_numpy "_import_array" () except -1
>
>     ctypedef int npy_intp
>
>     ctypedef extern class numpy.ndarray [object PyArrayObject]:
>         cdef char *cdata "data"
>         cdef int cndim "nd"
>         cdef int *cshape "dimensions"
>         cdef int *cstrides "strides"
>         cdef int cflags "flags"
>
> # --------------------------------------------------------------------
>
> import_numpy()
>
> # --------------------------------------------------------------------
>
> def prn(ndarray a):
>     cdef int i=0
>     # C-level access
>     print 'ndim: ', a.cndim
>     print 'shape: ', [a.cshape[i] for i from 0 <= i < a.cndim]
>     print 'strides: ', [a.cstrides[i] for i from 0 <= i < a.cndim]
>     print 'flags: ', a.cflags
>     #
>     print
>     # Python-level access
>     print 'ndim: ', a.ndim
>     print 'shape: ', a.shape
>     print 'strides: ', a.strides
>     print 'flags: ', a.flags
>
> # --------------------------------------------------------------------

I am having trouble understanding exactly what you're trying to accomplish here--what is the correspondence between strides and cstrides, and how will they be kept in sync? Why not let a.ndim be the c attribute that gets coerced to a python attribute when needed? For some types where coercion is not possible (e.g. the int* shape) then it could transform the above to (<object>a).shape and be implemented via the property: __get__ mechanism as at http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/extension_types.html#Properties (if it exists?).

Not having two names that point to the same thing is one of the whole points of cpdef functions, this seems to be going backwards with attributes. Also, taking Python code and having to change all the names to get Cython to optimize it (thus rendering it incompatible with pure Python) seems like an undesirable thing too.

Maybe I'm missing something...
- Robert From dagss at student.matnat.uio.no Tue May 13 11:05:18 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 13 May 2008 11:05:18 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> <48288811.9040602@student.matnat.uio.no> Message-ID: <482959CE.6060305@student.matnat.uio.no> (Note: Perhaps we should just postpone this until after your exam and then have an IRC session about it? The communication overhead seems bigger here than in other matters, perhaps because of more complexity...) > No, you have a very good point here that I missed. It should probably > be an attribute of the attributes themselves, e.g. > > cdef class A: > cdef int can_be_set_at_compile_time len # No, I'm not seriously > suggesting that name > ... > > perhaps there could be keywords designating type parameters too... > (These could then be used in function signatures? Maybe) They're > attributes, they just belong to the type rather than (necessarily) to > an instance. Yes, this seems ok (for clarity, note this as an alternative proposal to __assume__). >> in order to provide renames etc. This *does* have the problems you >> mention though, but it is somewhat easier to arbitrarily raise errors >> for complex expressions. But I'm not advocating it this time around. > > The body of __assume__ gets executed at compile time? Is it checking > or setting the object's parameters? It's more like assert I guess. I > guess it's unclear what __assume__ really is--it's not really a cdef, > def, or special function... In my original proposal for __assume__ (simply forget about the dict return), it gets executed run-time. That is, you could just compile it in a different pyx and only declare it in a pxd if you want to. 
As long as the signature is available compile-time you are fine (so, it would be a cdef method, predeclared or inline-implemented in pxd file). Reasons: * It provides a place to specify an argument order, which I'd prefer to have available. (Ie ndarray(2, float64), rather than "ndim=2, dtype=float64)). What we have here is essentially a function call, and it seems nice to be able to use a function signature to declare it. * It provides a place to specify "arguments which are not fields", ie arr.pxd: cdef class MyArray: # no dtype declared! cdef MyArray cdef __assume__(self, dtype) client.pyx: cdef MyArray(dtype=int) arr = ... cdef arr.dtype x = arr[23] But a declarative syntax (like you suggest) can be used for this as well. * It gives a place to also check arbitrarily complex boundaries for the parameters. Ie, if "use_foo=False" is specified as one assumption then also assuming "foo=3" can raise an exception. (There's a weakness here: Errors from inconsistent assumptions in this way would give a run-time error!, which could in theory be known compile-time. However it would "work" -- that run-time error prevents the inconsistently generated code from being run, even if it *is* being generated.) * Checking the assumptions for an object could be done automatically, but requesting that it is done manually has some advantages: a) The class coder is made very responsible and conscious about not changing the fields after an assumption is made. This kind of unenforced "run-time contract" feels very Pythonic. b) Having an assumption match might not always mean "==". Consider C booleans for instance, where "attr=True" should map to checking whether "self.attr != 0". However one could argue that one should then create a Python property that converts attr to a Python boolean object; so this might not hold; but having manual run-time-code just seems more flexible... Still this, aspect can be dropped from assume, it can still play the other roles. 
* It makes it possible to have side-effects on assumptions. One thing is that in special situations the object might be able to change to accommodate the assumption (though I cannot think of examples where this would make sense, it is much better to be explicit in such cases). A much better use case is that an object might want to freeze itself, provide CopyOnWrite-semantics, and so on. In fact, perhaps __assume__ could simply return what should be assigned to the "assumed" variable, so that it can return a "frozen proxy" in special cases?

Example (requires that an __unassume__ is added as well):

arr.pxd:

cdef class Arr:
    cdef int len
    cdef int assumption_refcount
    cdef __assume__(self, int len)
    cdef __unassume__(self, int len)
    cdef append(self, int value)

arr.pyx:

cdef class Arr:
    cdef int len
    cdef int assumption_refcount

    cdef __assume__(self, int len):
        if self.len != len: raise AssumptionError(...)
        self.assumption_refcount += 1

    # same arguments passed for convenience
    cdef __unassume__(self, int len):
        self.assumption_refcount -= 1

    cdef append(self, int value):
        if self.assumption_refcount > 0: raise RuntimeError(...)
client.pyx: cimport arr o = get_arr() print o.assumption_refcount # => 0 cdef Arr(len=3) x = o # calls __assume__ on new x print o.assumption_refcount # => 1 try: o.append(3) # => exception except RuntimeError: pass x = some_other_object # ^^^ calls __unassume__ on old x and __assume__ on new x print o.assumption_refcount # => 0 o.append(3) # ok -- Dag Sverre From dagss at student.matnat.uio.no Tue May 13 11:27:50 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 13 May 2008 11:27:50 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <482959CE.6060305@student.matnat.uio.no> References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> <48288811.9040602@student.matnat.uio.no> <482959CE.6060305@student.matnat.uio.no> Message-ID: <48295F16.2070404@student.matnat.uio.no> >>> in order to provide renames etc. This *does* have the problems you >>> mention though, but it is somewhat easier to arbitrarily raise errors >>> for complex expressions. But I'm not advocating it this time around. >> The body of __assume__ gets executed at compile time? Is it checking >> or setting the object's parameters? It's more like assert I guess. I >> guess it's unclear what __assume__ really is--it's not really a cdef, >> def, or special function... To answer this more directly: __assume__ is simply called, and expected to generate a run-time error if the assumption was wrong. After that, the assumptions (about object attributes) are simply made (by the compiler, and __assume__ doesn't come into it.) 
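To make the run-time side concrete, here is a rough plain-Python emulation of the reference-counting idea, with the calls the compiler would insert written out by hand. The names Arr, AssumptionError, assume and unassume are stand-ins for the proposed __assume__/__unassume__ hooks, which are only a proposal and not an existing Cython feature; treat this as a sketch of the intended behaviour, nothing more.

```python
class AssumptionError(Exception):
    """Hypothetical error raised when a requested assumption does not hold."""


class Arr(object):
    def __init__(self, values):
        self.values = list(values)
        self.assumption_refcount = 0

    def assume(self, length):
        # Stand-in for the proposed __assume__: check first, then pin.
        if len(self.values) != length:
            raise AssumptionError("expected len == %d, got %d"
                                  % (length, len(self.values)))
        self.assumption_refcount += 1

    def unassume(self):
        # Stand-in for the proposed __unassume__.
        self.assumption_refcount -= 1

    def append(self, value):
        # Refuse to change the length while some variable relies on it.
        if self.assumption_refcount > 0:
            raise RuntimeError("length is pinned by an active assumption")
        self.values.append(value)


o = Arr([1, 2, 3])
o.assume(3)        # what "cdef Arr(len=3) x = o" would trigger
try:
    o.append(4)
    blocked = False
except RuntimeError:
    blocked = True
o.unassume()       # what rebinding x to another object would trigger
o.append(4)        # allowed again once no assumption is active
```

The check itself runs at run time; only the decision to rely on len == 3 afterwards would belong to the compiler.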
-- Dag Sverre From dagss at student.matnat.uio.no Tue May 13 11:49:20 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 13 May 2008 11:49:20 +0200 Subject: [Cython] Python type optimizations (NumPy GSoC-related) In-Reply-To: <482959CE.6060305@student.matnat.uio.no> References: <3293365801.1640468@smtp.netcom.no> <48275E71.2030109@student.matnat.uio.no> <3169E86B-8A2F-4763-A158-63F142E05523@math.washington.edu> <48288811.9040602@student.matnat.uio.no> <482959CE.6060305@student.matnat.uio.no> Message-ID: <48296420.6060602@student.matnat.uio.no> > b) Having an assumption match might not always mean "==". Consider C > booleans for instance, where "attr=True" should map to checking whether > "self.attr != 0". However one could argue that one should then create a > Python property that converts attr to a Python boolean object; so this > might not hold; but having manual run-time-code just seems more > flexible... Still this, aspect can be dropped from assume, it can still > play the other roles. I found a better example. Consider this: cdef ndarray(c_contiguous=True) arr = x Here, c_contiguous is a "virtual field" which is not present in itself, only present for the benefit of inlineable code. However, you cannot check whether an array is c_contiguous just by checking the contents of a field, rather you want to do if x.flags.c_contiguous is not True: raise AssumptionError(...) Ok, so you could have modified the syntax slightly, to do cdef ndarray(flags.c_contiguous=True) arr = x but this quickly gets hairy... The point here is: - You can do arbitrarily complex assumptions... - and stuff the "result" of the assumption into a simple "virtual field" that is only available when operating under the assumption. 
I.e:

cdef Arr(does_very_complex_assumption_hold=True) arr = x
# __assume__ is called under the hood, and notices that
# it should check a very complex assumption and raise
# exception otherwise
arr.do_something()
# This is inlined to very fast code because the
# inlineable implementation of do_something simply
# has a check for whether arr.does_very_complex_assumption_hold
# is True

--
Dag Sverre

From dagss at student.matnat.uio.no Tue May 13 13:28:40 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 13 May 2008 13:28:40 +0200
Subject: [Cython] Language feature: Assumptions
Message-ID: <48297B68.5020302@student.matnat.uio.no>

Robert and I have been discussing this over the last few days. That discussion has been a bit disorganized and probably hard to follow for the rest (it is for myself anyway).

And so I drafted my latest thoughts on this here (it is relatively short, simple and clean compared to the discussion on the list):

http://wiki.cython.org/enhancements/assumptions

This represents my own proposal now and I'm sure Robert will disagree on a few points when he gets time for it :-) I don't expect this to reach any conclusion for a few weeks; there's no hurry with this.

In summary: This is a draft for a new language feature in Cython. It loosely fills the same role as "templates" do in other languages; however, it does not result in separate types in a type system, but rather constitutes assumptions made on a /per-variable/ basis.
Dag Sverre

From stefan_ml at behnel.de Tue May 13 14:15:00 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 13 May 2008 14:15:00 +0200 (CEST)
Subject: [Cython] Language stability
In-Reply-To: <4828DE33.3060303@canterbury.ac.nz>
References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz>
Message-ID: <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Hi Greg,

Greg Ewing wrote:
>> where the (IMHO much more obvious) cdef +
>> range() syntax is optimised
>
> Even in the presence of this optimisation, I don't consider
> that the integer for-loop syntax is entirely redundant.

Not redundant, no. But less calling for improvement.

> The 'for i from...' version was a compromise

I understand that. Still, having two spellings for "for ... in ...", one for Python, one for C, looks better than a completely different syntax that just starts with "for". So I vote for

    for x in iterable:

and

    for x in 1 < x <= 5:

instead of the new

    for 1 < x <= 5:

purely for readability (and obviously keeping the old "from" spelling for compatibility).

Stefan

From languitar at semipol.de Tue May 13 15:16:10 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Tue, 13 May 2008 15:16:10 +0200
Subject: [Cython] assigning to struct member
In-Reply-To:
References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> <48289D9A.9020905@semipol.de>
Message-ID: <4829949A.2080301@semipol.de>

On 05/12/2008 10:00 PM, Lisandro Dalcin wrote:
> The example above is working for me (note I'm following cython-devel
> repo). Give it a try
>
> cdef extern from *:
>
>     ctypedef struct A # like a forward declaration ?
>     ctypedef struct B # like a forward declaration ?
>
>     ctypedef struct A:
>         B* b
>
>     ctypedef struct B:
>         A* a
>
> cdef A tmp_a
> cdef B tmp_b
>
> tmp_a.b = &tmp_b
> tmp_b.a = &tmp_a

That works. Thank you!

From dagss at student.matnat.uio.no Tue May 13 16:28:54 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 13 May 2008 16:28:54 +0200
Subject: [Cython] Int looping
In-Reply-To: <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <4829A5A6.2010901@student.matnat.uio.no>

>> The 'for i from...' version was a compromise
>>
>
> I understand that. Still, having two spellings for "for ... in ...", one
> for Python, one for C, looks better than a completely different syntax
> that just starts with "for". So I vote for
>
>     for x in iterable:
>
> and
>
>     for x in 1 < x <= 5:
>

Is this (int looping) something you tend to do? When writing Python code I almost never end up doing it, rather I end up using "enumerate" (or for NumPy, "ndenumerate") when I need the indices. I'd rather we worked on improved high-level looping than inventing new syntax for low-level looping.
For instance, one could implement optimizations for "enumerate" in Cython as well as "range":

    for idx, value in enumerate(a):
        BLOCK

could be turned into the much more efficient

    for idx from 0 <= len(a):
        value = a[idx]
        BLOCK

I suppose operating with C pointers increases the need for int looping, but when working with C code I see a much larger need for

    for (char* ch = start; *ch != 0; ++ch) ...

which could be done simply by letting our builtin "enumerate" know that char* should stop looping when hitting 0.

The more general iteration pattern (especially for C++):

    for (T* iter = start; iter != v.end(); ++iter) ...

could be done by something like

    for iterator in c_iteration(a.begin(), a.end()):
        print iterator.get()
        iterator.set(3)

and so on. There are better alternatives to int iteration :-)

If that's not good enough, one should start with writing a Cython-centric summary of the PEPs suggesting alternative looping constructs for Python (of which there is no shortage), e.g.

http://www.python.org/dev/peps/pep-0284/

and

http://www.python.org/dev/peps/pep-0212/

My personal favorite would be "for x in [:10]:" so that one is consistent with the slice notation (of course, this has the same drawbacks as range does). But really I'm -1 on anything that deals with a specific run-time syntax; I think all Cython-specific syntax should somehow be connected with the fact that we have a compilation phase.
Dag Sverre

From dagss at student.matnat.uio.no Tue May 13 16:35:59 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 13 May 2008 16:35:59 +0200
Subject: [Cython] Int looping
In-Reply-To: <4829A5A6.2010901@student.matnat.uio.no>
References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4829A5A6.2010901@student.matnat.uio.no>
Message-ID: <4829A74F.6050004@student.matnat.uio.no>

*sigh* .. slow down... Corrections:

> could be turned into the much more efficient
>
>     for idx from 0 <= len(a):
>

This should be:

    for idx from 0 <= idx < len(a):

>     for (T* iter = start; iter != v.end(); ++iter) ...
>
> could be done by something like
>
>     for iterator in c_iteration(a.begin(), a.end()):
>         print iterator.get()
>         iterator.set(3)
>

This should be:

    for (T* iter = start; iter != end; ++iter) ...

could be done by something like:

    for iterator in c_iteration(start, end):
        print iterator.get()
        iterator.set(3)

(When used with STL containers, start and end would typically be a.begin() and a.end() though.) Of course, c_iteration would also be a Cython builtin (for speed).

BTW the enumerate feature would be a prime candidate for a simple demo transform so that's a reason for me to take it on if it gets support. (Without char* support for now, can extend to char* support after the phase separation.)

If enumerate already *is* supported (didn't check...) then I apologize...
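For reference, the rewrite being discussed corresponds roughly to the following in plain Python. This is only a sketch: the function names are made up for the example, and the second form assumes that `a` supports len() and integer indexing, which is exactly the restriction any such transform would have to respect.

```python
def with_enumerate(a):
    """The loop as the user writes it."""
    out = []
    for idx, value in enumerate(a):
        out.append((idx, value))
    return out


def with_index_loop(a):
    """What the rewrite would produce: an integer loop plus indexing.

    In Cython the loop header would read
    "for idx from 0 <= idx < len(a):".
    """
    out = []
    n = len(a)          # raises TypeError for objects without a length
    for idx in range(n):
        value = a[idx]
        out.append((idx, value))
    return out
```

For a list both functions return the same pairs; for a generator the second one fails at len(a), so the rewrite is only safe when the type of `a` is known.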
Dag Sverre From stefan_ml at behnel.de Tue May 13 17:06:48 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 13 May 2008 17:06:48 +0200 (CEST) Subject: [Cython] Int looping In-Reply-To: <4829A5A6.2010901@student.matnat.uio.no> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4829A5A6.2010901@student.matnat.uio.no> Message-ID: <43017.194.114.62.67.1210691208.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Dag Sverre Seljebotn wrote: > >>> The 'for i from...' version was a compromise >> >> I understand that. Still, having two spellings for "for ... in ...", one >> for Python, one for C, looks better than a completely different syntax >> that just starts with "for". So I vote for >> >> for x in iterable: >> >> and >> >> for x in 1 < x <= 5: >> > Is this (int looping) something you tend to do? It happens. > When writing Python code Not in Python code, but in Cython code, especially in low-level C-ish functions. > I almost never end up doing it, rather I end up using "enumerate" (or > for NumPy, "ndenumerate") when I need the indices. I'd rather we worked > on improved high-level looping than inventing new syntax for low-level > looping. As Greg pointed out, it's there because it's convenient. > For instance, one could implement optimizations for "enumerate" in > Cython as well as "range": > > for idx, value in enumerate(a): > BLOCK > > could be turned into the much more efficient > > for idx from 0 <= len(a): > value = a[idx] > BLOCK Provided that a is indexable, which is getting less likely in recent Python code. > I suppose operating with C pointers increases the need for int looping, > but when working with C code I see a much larger need for > > for (char* ch = start; *ch != 0; ++ch) ... 
That's a while loop.

> which could be done simply by letting our builtin "enumerate" know that
> char* should stop looping when hitting 0.

No, please. You are trying to optimise an extremely special case here, that is best dealt with in a straight C loop.

> The more general iteration pattern (especially for C++):
>
>     for (T* iter = start; iter != v.end(); ++iter) ...
>
> could be done by something like
>
>     for iterator in c_iteration(a.begin(), a.end()):
>         print iterator.get()
>         iterator.set(3)

-1 on this and other syntax proposals.

Stefan

From jim-crow at rambler.ru Tue May 13 17:15:48 2008
From: jim-crow at rambler.ru (Anatoly A. Kazantsev)
Date: Tue, 13 May 2008 22:15:48 +0700
Subject: [Cython] Export None to module under different name
Message-ID: <20080513221548.f64cc950.jim-crow@rambler.ru>

Hello!

Is it possible to export None to module under different name?

For example:

cdef public enum:
    FOO = None

--
Anatoly A. Kazantsev

Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm

From stefan_ml at behnel.de Tue May 13 17:31:10 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 13 May 2008 17:31:10 +0200 (CEST)
Subject: [Cython] Export None to module under different name
In-Reply-To: <20080513221548.f64cc950.jim-crow@rambler.ru>
References: <20080513221548.f64cc950.jim-crow@rambler.ru>
Message-ID: <60847.194.114.62.67.1210692670.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Anatoly A. Kazantsev wrote:
> Hello!
>
> Is it possible to export None to module under different name?
> > For example: > > cdef public enum: > FOO = None Try FOO = None Stefan From dalcinl at gmail.com Tue May 13 17:35:54 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 13 May 2008 12:35:54 -0300 Subject: [Cython] assigning to struct member In-Reply-To: References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> <48290FDD.5040407@canterbury.ac.nz> Message-ID: Not sure if allowing redefinitions is a good idea, but at least something like a 'forward declaration' have to be valid. If not, the example I posted would be unworkable... On 5/13/08, Robert Bradshaw wrote: > On May 12, 2008, at 8:49 PM, Greg Ewing wrote: > > > Lisandro Dalcin wrote: > >> That's what Robert warned you about!! Indeed, you have two > >> definitions > >> of plugData: > > > > Although there appears to be a compiler bug there somewhere, > > as it *should* have complained about a redefinition of > > plugData. > > > IIRC, something like this was disabled way, way back with SageX. I > can't recall the reason but it allowed for "redefinition" of certain > functions. Yep: > > http://hg.cython.org/cython/rev/5e793a8b4ca0 > > In retrospect, I'm not sure why this is a good idea. (Was it due to > PyObject* vs object?) 
> > > - Robert > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Tue May 13 17:35:59 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 13 May 2008 17:35:59 +0200 Subject: [Cython] Int looping In-Reply-To: <43017.194.114.62.67.1210691208.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4829A5A6.2010901@student.matnat.uio.no> <43017.194.114.62.67.1210691208.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4829B55F.5030408@student.matnat.uio.no> >> for idx, value in enumerate(a): >> BLOCK >> >> could be turned into the much more efficient >> >> for idx from 0 <= len(a): >> value = a[idx] >> BLOCK >> > > Provided that a is indexable, which is getting less likely in recent > Python code. > Ahh. True. In fact, even when it is indexable, it would still be changing behaviour. Still, the following gives about a small speedup (10-30% depending on the idx type and conversions required) in some simple benchmarks: cdef int idx for idx, value in enumerate(a): BLOCK versus cdef int idx = 0 for value in a: BLOCK idx += 1 Just thought I'd mention it (I realize it is kind of irrelevant and not really a priority).
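As a rough illustration of the two loop shapes compared above, here is a minimal sketch in plain Python rather than Cython. The function names and the `letters` generator are made up for this example; the sketch only shows that the manual-counter form yields the same (index, value) pairs as `enumerate`, even for an iterable that cannot be indexed. It says nothing about the speed difference reported in the message.

```python
def pairs_with_enumerate(iterable):
    # Loop shape 1: let enumerate() build (idx, value) tuples.
    out = []
    for idx, value in enumerate(iterable):
        out.append((idx, value))
    return out


def pairs_with_manual_counter(iterable):
    # Loop shape 2: keep the counter by hand, no tuple per iteration.
    out = []
    idx = 0
    for value in iterable:
        out.append((idx, value))
        idx += 1
    return out


def letters():
    # A generator: iterable, but not indexable (letters()[0] would fail).
    yield "a"
    yield "b"
    yield "c"


if __name__ == "__main__":
    print(pairs_with_enumerate(letters()))
    print(pairs_with_manual_counter(letters()))
```

One difference that a compiler doing this rewrite would probably have to account for: after the loop, the manual counter holds the number of items seen, while with `enumerate` the variable holds the last index (and stays unassigned for an empty iterable).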
Dag Sverre From jim-crow at rambler.ru Tue May 13 18:13:22 2008 From: jim-crow at rambler.ru (Anatoly A. Kazantsev) Date: Tue, 13 May 2008 23:13:22 +0700 Subject: [Cython] Export None to module under different name In-Reply-To: <60847.194.114.62.67.1210692670.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <20080513221548.f64cc950.jim-crow@rambler.ru> <60847.194.114.62.67.1210692670.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <20080513231322.33a73df3.jim-crow@rambler.ru> On Tue, 13 May 2008 17:31:10 +0200 (CEST) "Stefan Behnel" wrote: > Try > > FOO = None > > Stefan It works :-) Thanks for the help -- Anatoly A. Kazantsev Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From jim-crow at rambler.ru Tue May 13 18:28:22 2008 From: jim-crow at rambler.ru (Anatoly A. Kazantsev) Date: Tue, 13 May 2008 23:28:22 +0700 Subject: [Cython] defining module constants In-Reply-To: <20080513144911.ac42579e.jim-crow@rambler.ru> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu> <20080513144911.ac42579e.jim-crow@rambler.ru> Message-ID: <20080513232822.b3589d24.jim-crow@rambler.ru> On Tue, 13 May 2008 14:49:11 +0700 "Anatoly A. Kazantsev" wrote: I have the following code and it's not working properly. In foo.h somebody wrote: #define BAR 1 Then I want to define a module constant with the same name and value: cdef export from "foo.h": ctypedef enum: _BAR "BAR" cdef public enum: BAR = _BAR In Python, BAR will be 0, not 1. And 0 is actually the index of BAR in the enum.
It looks like the value of _BAR is not applied to BAR in the enum. Maybe I am doing something wrong. -- Anatoly A. Kazantsev Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm From stefan_ml at behnel.de Tue May 13 18:41:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 13 May 2008 18:41:22 +0200 (CEST) Subject: [Cython] defining module constants In-Reply-To: <20080513232822.b3589d24.jim-crow@rambler.ru> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu> <20080513144911.ac42579e.jim-crow@rambler.ru> <20080513232822.b3589d24.jim-crow@rambler.ru> Message-ID: <62385.194.114.62.67.1210696882.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Anatoly A.
Kazantsev wrote: > In foo.h somebody wrote: > > #define BAR 1 > > than I want to define module constant with same name and value: > > cdef export from "foo.h": > ctypedef enum: > _BAR "BAR" You can just use cdef extern from "foo.h": cdef int BAR Stefan From robertwb at math.washington.edu Tue May 13 19:45:00 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 13 May 2008 10:45:00 -0700 Subject: [Cython] [Pyrex] Language stability In-Reply-To: <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On May 13, 2008, at 5:15 AM, Stefan Behnel wrote: > Hi Greg, > > Greg Ewing wrote: >>> where the (IMHO much more obvious) cdef + >>> range() syntax is optimised >> >> Even in the presence of this optimisation, I don't consider >> that the integer for-loop syntax is entirely redundant. > > Not redundant, no. But less calling for improvement. > > >> The 'for i from...' version was a compromise > > I understand that. Still, having two spellings for "for ... > in ...", one > for Python, one for C, looks better than a completely different syntax > that just starts with "for". So I vote for > > for x in iterable: > > and > > for x in 1 < x <= 5: > > instead of the new > > for 1 < x <= 5: > > purely for readability (and obviously keeping the old "from" > spelling for > compatibility). I'm -1 for having lots of multiple ways to do for loops (including that list of PEPs--we're already up to 3).
Also, "from" makes it clear that this is a special cython loop--consider the following: x = 1 class A: def __gt__(self, other): return range(3,7) for x in 0 <= x < A(): print x This is valid Python (prints 3, 4, 5, 6), and would act completely differently under your proposal. I do think optimizing enumerate/zip/etc is feasible and probably worthwhile. - Robert From stefan_ml at behnel.de Tue May 13 21:07:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 13 May 2008 21:07:08 +0200 (CEST) Subject: [Cython] [Pyrex] Language stability In-Reply-To: References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <15867.194.114.62.67.1210705628.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Bradshaw wrote: > On May 13, 2008, at 5:15 AM, Stefan Behnel wrote: >> for x in iterable: >> >> and >> >> for x in 1 < x <= 5: >> >> instead of the new >> >> for 1 < x <= 5: >> >> purely for readability (and obviously keeping the old "from" >> spelling for >> compatibility). > > I'm -1 for having lots of multiple ways to do for loops I agree, but since Greg brought up the third way of doing it because he didn't like the integer loop syntax, I wanted to discuss what a good syntax would be here *iff* he wants to change it. > Also, "from" makes it > clear that this is a special cython loop--consider the following: > > x = 1 > > class A: > def __gt__(self, other): > return range(3,7) > > for x in 0 <= x < A(): > print x > > > This is valid Python (prints 3, 4, 5, 6), and would act completely > differently under your proposal. What an ugly example. ;) But I see that this syntax cannot be accepted in that case. Which leaves us with the question if Cython should support the shorted integer loop syntax which Pyrex now has. 
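For readers who want to check the chained-comparison example quoted above, here is a small sketch of it written for a current Python 3 interpreter (the original uses the Python 2 print statement). The helper name `chained_comparison_values` is invented for this sketch. It relies on two standard rules: `0 <= x < A()` is evaluated as `(0 <= x) and (x < A())`, and when `int` cannot compare itself with an `A` instance, Python falls back to the reflected method `A.__gt__`.

```python
class A:
    def __gt__(self, other):
        # Deliberately returns an iterable instead of a bool.
        return range(3, 7)


def chained_comparison_values():
    x = 1
    # (0 <= x) is True, so the result of the whole chain is (x < A()),
    # which is dispatched to A().__gt__(x) and yields range(3, 7).
    iterable = 0 <= x < A()
    return [value for value in iterable]


if __name__ == "__main__":
    print(chained_comparison_values())
```

So the expression after `in` really is an ordinary Python expression that can produce an iterable, which is why giving it a second, integer-loop meaning would change the behaviour of existing valid code.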
I guess supporting it and not promoting it would be ok. > I do think optimizing enumerate/zip/etc is feasible and probably > worthwhile. I'm just concerned about too many special cases in the optimiser. If we start optimising these (and I would prefer giving range(), zip() & friends their Py3 iterator semantics in this case), we should come up with a generic way to support iterator chaining in C code rather than making the looping code even more complicated and special casing. When we loop over a chain of iterators, this example for i,k in enumerate(range(100)): l += i*k could become something like this: c_range_state = c_range_new(100) c_enumerate_state = c_enumerate_new() while 1: temp1 = c_range_next(c_range_state); if (!temp1) ... temp2 = c_enumerate_next(c_enumerate_state, temp1); if (!temp2) ... i,k = temp2 l += i*k c_range_free(c_range_state) c_enumerate_free(c_enumerate_state) Maybe a subclassable ForLoopNode class with a way to generate the start, body and end code of a loop might be an idea. That way, we could chain the subclasses that know how to loop over a list, range(), enumerate(), etc., and we wouldn't even need malloc as everything could live in local variables. Stefan From stefan_ml at behnel.de Tue May 13 23:50:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 13 May 2008 23:50:08 +0200 Subject: [Cython] status update on Py3 support Message-ID: <482A0D10.1090603@behnel.de> Hi, I managed to compile lxml against Py3, although I couldn't test it so far as the PyFile_* functions are gone due to the new I/O system. Also, Cython's test runner didn't work right away, as it (obviously) tries to compile the tests using Cython, and Cython itself does not yet run under Py3. So I added a work around. 
If you run python2.5 runtests.py --no-cleanup python3.0 runtests.py --no-cython it will run Cython from Python 2.5 to compile and run the tests, but not delete the generated .c files, and then run Python3 to C-compile and run the tests without importing Cython at all. This allows testing the generated sources with different Python versions in general. However, for Py3 in particular, all tests go straight red (as expected ;) The problem is (again) identifiers, which are unicode strings in Py3. So we will have to find a way to make every one of them a unicode string, from class names to keyword arguments, but only on Py3. A good idea might be to actually remove the string interning Option and *always* intern identifiers, so that we have more control over how they enter the runtime environment. I'll look into that when I find the time... Regarding Cython's source itself, I tried running the 2to3 tool on Cython like this: python3.0 -m lib2to3.refactor Cython which generates a 3000 lines patch, but being only a static analysis tool, it breaks the ".next()" method on the scanner, thinking that what we actually meant was calling the new built-in "next()" function on an iterator. We might even work around that by actually making it an iterator, not sure if that will work. Other than I thought, the 2to3 tool does not touch string literals, so if you use a byte string literal in Py2 code, it will become a unicode literal after running 2to3. I think this is good for Cython where most byte string literals are really meant to be unicode literals. We might still want to take a bit of caution here, as there might be places where this is not the case. You can find the current Py3 branch of Cython here: http://hg.cython.org/cython-devel-py3 It will (or at least should) compile things nicely on Py2 without a noticeable difference, but it will not yet compile any code correctly for Py3. I'll keep working on it, any help is appreciated. 
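The `.next()` problem mentioned above can be sketched in a few lines of plain Python. The `Scanner` class below is a hypothetical stand-in, not Cython's real scanner: its `next()` method advances internal state and returns nothing. The sketch assumes a call site has been rewritten from `scanner.next()` to the built-in `next(scanner)` while the method itself kept its name; under that assumption the built-in call fails, because the object does not implement the iterator protocol.

```python
class Scanner:
    """Hypothetical tokenizer stand-in; names are illustrative only."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0
        self.sy = None  # the "current token", updated by next()

    def next(self):
        # Advances the scanner; intentionally has no return value.
        if self._pos < len(self._tokens):
            self.sy = self._tokens[self._pos]
            self._pos += 1
        else:
            self.sy = "EOF"


def builtin_next_fails(scanner):
    # The built-in next() looks for __next__, which Scanner lacks.
    try:
        next(scanner)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    s = Scanner(["def", "f"])
    s.next()
    print(s.sy, builtin_next_fails(s))
```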
Have fun, Stefan From greg.ewing at canterbury.ac.nz Wed May 14 01:43:57 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 May 2008 11:43:57 +1200 Subject: [Cython] Language stability In-Reply-To: <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <482A27BD.6010308@canterbury.ac.nz> Stefan Behnel wrote: > for x in iterable: > > and > > for x in 1 < x <= 5: That won't work, because it's ambiguous -- they're both instances of 'for x in '. In any case, simply changing 'from' to 'in' doesn't address the reason I made the change. -- Greg From greg.ewing at canterbury.ac.nz Wed May 14 01:51:11 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 May 2008 11:51:11 +1200 Subject: [Cython] ANN: Pyrex 0.9.7.2 Message-ID: <482A296F.5090205@canterbury.ac.nz> Pyrex 0.9.7.2 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Seems I didn't quite eradicate all of the integer indexing bugs. Here's a fix for the other half. What is Pyrex? -------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. 
From greg.ewing at canterbury.ac.nz Wed May 14 01:56:19 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 May 2008 11:56:19 +1200 Subject: [Cython] assigning to struct member In-Reply-To: References: <48284649.9030709@semipol.de> <48287F76.6050002@semipol.de> <2D530025-4FDE-4F6D-9DA9-45F9C5D503EF@math.washington.edu> <48289B10.9080808@semipol.de> <48290FDD.5040407@canterbury.ac.nz> Message-ID: <482A2AA3.6030502@canterbury.ac.nz> Lisandro Dalcin wrote: > Not sure if allowing redefinitions is a good idea, but at least > something like a 'forward declaration' have to be valid. Forward declarations are fine, at least in Pyrex. You just have to be careful not to give the declaration a body, otherwise it's taken as a definition, i.e. cdef struct Spam is a forward declaration, but cdef struct Spam: pass is a definition. -- Greg From greg.ewing at canterbury.ac.nz Wed May 14 02:20:33 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 14 May 2008 12:20:33 +1200 Subject: [Cython] status update on Py3 support In-Reply-To: <482A0D10.1090603@behnel.de> References: <482A0D10.1090603@behnel.de> Message-ID: <482A3051.2090104@canterbury.ac.nz> Stefan Behnel wrote: > A good idea might be to actually > remove the string interning Option and *always* intern identifiers That option will probably disappear from Pyrex at some point anyway. It's only there because I wasn't sure if string interning was going to work out, and I didn't want to remove the non-interning code completely until I had all the interning stuff working. But it seems to be okay, and there seems little point in keeping the old code around. > it > breaks the ".next()" method on the scanner, thinking that what we actually > meant was calling the new built-in "next()" function on an iterator.
We might > even work around that by actually making it an iterator That wouldn't really be the right thing to do, because the next() method doesn't return a token -- rather it updates the instance variables of the scanner that hold information about the current token. However, it might be usable as a workaround for the 2to3 problem. -- Greg From robertwb at math.washington.edu Wed May 14 07:43:11 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 13 May 2008 22:43:11 -0700 Subject: [Cython] status update on Py3 support In-Reply-To: <482A3051.2090104@canterbury.ac.nz> References: <482A0D10.1090603@behnel.de> <482A3051.2090104@canterbury.ac.nz> Message-ID: <1F8A84E9-AB03-4F2F-8811-89ED525F3DAD@math.washington.edu> On May 13, 2008, at 5:20 PM, Greg Ewing wrote: > Stefan Behnel wrote: >> A good idea might be to actually >> remove the string interning Option and *always* intern identifiers > > That option will probably disappear from Pyrex at some > point anyway. It's only there because I wasn't sure if > string interning was going to work out, and I didn't > want to remove the non-interning code completely until > I had all the interning stuff working. But it seems to > be okay, and there seems little point in keeping the > old code around. One advantage it may have is reducing the amount of noise when doing memory profiling (has anyone tried this). Other than that, it would sure clean up a lot of code. 
- Robert From robertwb at math.washington.edu Wed May 14 08:14:49 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 13 May 2008 23:14:49 -0700 Subject: [Cython] [Pyrex] Language stability In-Reply-To: <15867.194.114.62.67.1210705628.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <15867.194.114.62.67.1210705628.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <5EB864AE-F540-4909-AA9A-165E97EDFBF0@math.washington.edu> On May 13, 2008, at 12:07 PM, Stefan Behnel wrote: > >> I do think optimizing enumerate/zip/etc is feasible and probably >> worthwhile. > > I'm just concerned about too many special cases in the optimiser. > > If we start optimising these (and I would prefer giving range(), zip > () & > friends their Py3 iterator semantics in this case), we should come > up with > a generic way to support iterator chaining in C code rather than > making > the looping code even more complicated and special casing. Yes, I certainly agree. This is an area where the "visitor" paradigm makes much more sense. > When we loop over a chain of iterators, this example > > for i,k in enumerate(range(100)): > l += i*k > > could become something like this: > > c_range_state = c_range_new(100) > c_enumerate_state = c_enumerate_new() > > while 1: > temp1 = c_range_next(c_range_state); if (!temp1) ... > temp2 = c_enumerate_next(c_enumerate_state, temp1); if (! > temp2) ... > i,k = temp2 > l += i*k > c_range_free(c_range_state) > c_enumerate_free(c_enumerate_state) > > Maybe a subclassable ForLoopNode class with a way to generate the > start, > body and end code of a loop might be an idea. 
That way, we could > chain the > subclasses that know how to loop over a list, range(), enumerate(), > etc., > and we wouldn't even need malloc as everything could live in local > variables. An interesting idea. I'm not sure how much this would help (it would some for sure), as I believe many of these iterators are already written in C and it's a series of C calls. I think the biggest savings is that cdef int i for i, k in enumerate(L): ... could increment i as a c int, and avoid all the tuple packing/ unpacking. Likewise, the tuple packing/unpacking could be avoided for zip (assuming the number of targets is equal to the number of arguments). - Robert From stefan_ml at behnel.de Wed May 14 15:27:45 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 14 May 2008 15:27:45 +0200 (CEST) Subject: [Cython] [Pyrex] Language stability In-Reply-To: <5EB864AE-F540-4909-AA9A-165E97EDFBF0@math.washington.edu> References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <15867.194.114.62.67.1210705628.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <5EB864AE-F540-4909-AA9A-165E97EDFBF0@math.washington.edu> Message-ID: <60286.194.114.62.39.1210771665.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Bradshaw wrote: > I'm not sure how much this would help (it would > some for sure), as I believe many of these iterators are already > written in C and it's a series of C calls. I think the biggest > savings is that > > cdef int i > for i, k in enumerate(L): > ... > > could increment i as a c int, and avoid all the tuple packing/ > unpacking. ... which is a rather big overhead for such a trivial iterator. 
The thing is, many iterators really do extremely simple things, but require all the tuple packing and function call indirection *on each iteration*. That's different from functions like max() or the Py2 zip(), which are called once and the do loads of stuff in C. Just look at these: http://docs.python.org/lib/itertools-functions.html The really simple ones are: enumerate, range, chain, count, islice, repeat count() and range() are actually equivalent in their iterable versions. But I think enumerate() and islice(), and maybe chain() might be worth optimising. > Likewise, the tuple packing/unpacking could be avoided for > zip (assuming the number of targets is equal to the number of > arguments). Which would be part of the compiler decision if this should run in plain C code or not. One thing to note: zip() is an iterator in Py3, which changes its semantics in that you must now call list(zip()) if you want to modify its arguments in place. And zip() is actually less trivial than the above mentioned ones. Maybe zip() is not worth being optimised. Stefan From dagss at student.matnat.uio.no Wed May 14 17:09:22 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 14 May 2008 17:09:22 +0200 Subject: [Cython] Transforms tests and source directory structure Message-ID: <482B00A2.6010707@student.matnat.uio.no> This is about some small test utils I'm writing for transform code. I'd like some advice in structuring it before attempting to submit patches... Questions: Is it ok to put doctests in Cython sources? I haven't seen anything so I'm wondering whether you have a policy on it or not. The tests I have are kind of "unit-test" based, more isolated than the tests going in the tests directory (in particular, they abort compilation right after the transform I'm testing, i.e. long before C serialization). Is it ok if I just: - Have a Cython/Testing directory for the test utils... 
- Have a Transforms directory in Cython/Compiler containing transforms - Have doctests directly in module docstrings in Cython/Compiler/Transforms? Background: Basically, I now have the following running in my ForInOptimizations.py [1]: if __name__ == "__main__": from Cython.Testing.PipelineTesting import test_transform t = ForInOptimizations() test_transform("parsed", t, code = u""" for idx, value in enumerate(iterable): print idx, value """, expected = u""" idx = 0 for value in iterable: print idx, value idx += 1 """) This test-case is now running successfully, but placing the tests in a main section like that is perhaps not ideal? (In case you are wondering what is going on; yes, I am reserializing the modified parse tree to Python code. It's not difficult to do and allows for very easy, isolated tests -- though I won't go the entire way to full unit-testing). [1]: The ForInOptimizations.py only handles enumerate -- I don't think I will submit it before phase seperation is in place, because it doesn't look like I can know whether I should insert a check for int overflow before then, nor whether enumerate really is __builtin__.enumerate. But I hope I'll have the the test framework accepted. -- Dag Sverre From stefan_ml at behnel.de Wed May 14 18:04:59 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 14 May 2008 18:04:59 +0200 (CEST) Subject: [Cython] Transforms tests and source directory structure In-Reply-To: <482B00A2.6010707@student.matnat.uio.no> References: <482B00A2.6010707@student.matnat.uio.no> Message-ID: <64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Dag Sverre Seljebotn wrote: > Is it ok to put doctests in Cython sources? I haven't seen anything so > I'm wondering whether you have a policy on it or not. I wouldn't object if you come up with meaningful doctests, but I see little interest in running things in an interactive session, so a doctest doesn't seem the appropriate measure to me. 
> The tests I have are kind of "unit-test" based, more isolated than the > tests going in the tests directory (in particular, they abort > compilation right after the transform I'm testing, i.e. long before C > serialization). The tests in "tests/compile/" only compile, the tests in "tests/run/" compile and run doctests. Why not add a directory "tests/transforms/" ? > - Have a Cython/Testing directory for the test utils... Are the test utilities so large that you really need a new source file or even an entire package? And, since you have a Python serialiser in place - do you really think that belongs into a test package? > - Have a Transforms directory in Cython/Compiler containing transforms Sounds like the right place to me. > Basically, I now have the following running in my ForInOptimizations.py > [1]: > > if __name__ == "__main__": > from Cython.Testing.PipelineTesting import test_transform > t = ForInOptimizations() > test_transform("parsed", t, code = u""" > for idx, value in enumerate(iterable): > print idx, value > """, expected = u""" > idx = 0 > for value in iterable: > print idx, value > idx += 1 > """) > > This test-case is now running successfully, but placing the tests in a > main section like that is perhaps not ideal? No, not a good idea. Add a test class to the runtests.py that only transforms the code, and then provide each test file with the expected result in a string, just like the error tester does. That way, you can still add a doctest to the source file and run the unchanged code and the transformed code through the normal test runner to see if both work as expected. 
Something like this might work as a test layout: ------ tests/transforms/somefile.pyx ----- __doc__ = """ >>> print CODE['func'] idx = 0 for value in iterable: print idx, value idx += 1 >>> func() 0 0 1 1 2 2 3 3 """ def func(): iterable = range(4) for idx, value in enumerate(iterable): print idx, value ------ tests/transforms/somefile.pyx ----- And then you parse the source file (see the error test class for this) and append a serialisation of a dictionary that maps the names of all Python functions found in the file to their transformed body, before you compile and run the file in the test runner. Do you think that would work? Stefan From dalcinl at gmail.com Wed May 14 18:34:17 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 14 May 2008 13:34:17 -0300 Subject: [Cython] status update on Py3 support In-Reply-To: <482A0D10.1090603@behnel.de> References: <482A0D10.1090603@behnel.de> Message-ID: Stefan, I cannot get it working in the current state of the repo... I have to patch like below (hope you have a fixed font in your mail reader). At this point, I'm a bit confused about the Cython-internal string table. In the Py3 context, what would be the point of having byte strings in this table? Just for the case of byte-string literals? IMHO, it seems that in Py3 byte strings are poor cousins of unicode strings (not used for identifiers, cannot be interned and no point of doing it), and the use cases of byte strings are going to be rather small (passing them to C funcs expecting ascii-encoded strings, internet protocols expecting ascii). So, should Cython save (in the Py3 case) byte strings in its internal table? At this point, I'm not sure if merging the string tables was a good idea. It seems that Cython should still use two tables: * one table lists identifiers. In the py2 case, their entries are interned (unless disabled) and their types are the py2 builtin str object C-level PyString (or does py2 accept unicode identifiers?).
In the py3 case, the table entries are interned (unless disabled) and their types are the py3 builtin str object C-level PyUnicode. * other table list string literals, and they can be either byte strings or unicode string, depending on the Python version and the u/b prefix in the literal expresion... Perhaps again I completelly confused about all this. In such a case, just ignore my coments. --- a/Cython/Compiler/Nodes.py Tue May 13 23:41:11 2008 +0200 +++ b/Cython/Compiler/Nodes.py Wed May 14 12:45:16 2008 -0300 @@ -4325,8 +4325,14 @@ static int __Pyx_InitStrings(__Pyx_Strin if (t->intern) *t->p = PyString_InternFromString(t->s); else + *t->p = PyString_FromStringAndSize(t->s, t->n - 1); + #else + if (t->intern) { + *t->p = PyUnicode_InternFromString(t->s); + } else { + *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1); + } #endif - *t->p = PyString_FromStringAndSize(t->s, t->n - 1); } if (!*t->p) return -1; On 5/13/08, Stefan Behnel wrote: > Hi, > > I managed to compile lxml against Py3, although I couldn't test it so far as > the PyFile_* functions are gone due to the new I/O system. Also, Cython's test > runner didn't work right away, as it (obviously) tries to compile the tests > using Cython, and Cython itself does not yet run under Py3. So I added a work > around. If you run > > python2.5 runtests.py --no-cleanup > python3.0 runtests.py --no-cython > > it will run Cython from Python 2.5 to compile and run the tests, but not > delete the generated .c files, and then run Python3 to C-compile and run the > tests without importing Cython at all. This allows testing the generated > sources with different Python versions in general. > > However, for Py3 in particular, all tests go straight red (as expected ;) The > problem is (again) identifiers, which are unicode strings in Py3. So we will > have to find a way to make every one of them a unicode string, from class > names to keyword arguments, but only on Py3. 
A good idea might be to actually > remove the string interning Option and *always* intern identifiers, so that we > have more control over how they enter the runtime environment. I'll look into > that when I find the time... > > Regarding Cython's source itself, I tried running the 2to3 tool on Cython like > this: > > python3.0 -m lib2to3.refactor Cython > > which generates a 3000 lines patch, but being only a static analysis tool, it > breaks the ".next()" method on the scanner, thinking that what we actually > meant was calling the new built-in "next()" function on an iterator. We might > even work around that by actually making it an iterator, not sure if that will > work. > > Other than I thought, the 2to3 tool does not touch string literals, so if you > use a byte string literal in Py2 code, it will become a unicode literal after > running 2to3. I think this is good for Cython where most byte string literals > are really meant to be unicode literals. We might still want to take a bit of > caution here, as there might be places where this is not the case. > > You can find the current Py3 branch of Cython here: > > http://hg.cython.org/cython-devel-py3 > > It will (or at least should) compile things nicely on Py2 without a noticeable > difference, but it will not yet compile any code correctly for Py3. I'll keep > working on it, any help is appreciated. 
>
> Have fun,
> Stefan
>
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu  Wed May 14 18:41:47 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 14 May 2008 09:41:47 -0700
Subject: [Cython] [Pyrex] Language stability
In-Reply-To: <60286.194.114.62.39.1210771665.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <4826AAE8.6010706@student.matnat.uio.no>
	<91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu>
	<4826ED91.70706@student.matnat.uio.no>
	<4827FE69.7020804@behnel.de>
	<4828DE33.3060303@canterbury.ac.nz>
	<25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
	<15867.194.114.62.67.1210705628.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
	<5EB864AE-F540-4909-AA9A-165E97EDFBF0@math.washington.edu>
	<60286.194.114.62.39.1210771665.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID:

On May 14, 2008, at 6:27 AM, Stefan Behnel wrote:

> Robert Bradshaw wrote:
>> I'm not sure how much this would help (it would
>> some for sure), as I believe many of these iterators are already
>> written in C and it's a series of C calls. I think the biggest
>> savings is that
>>
>>     cdef int i
>>     for i, k in enumerate(L):
>>         ...
>>
>> could increment i as a c int, and avoid all the tuple packing/
>> unpacking.
>
> ... which is a rather big overhead for such a trivial iterator. The thing
> is, many iterators really do extremely simple things, but require all the
> tuple packing and function call indirection *on each iteration*.
> That's
> different from functions like max() or the Py2 zip(), which are called
> once and then do loads of stuff in C.
>
> Just look at these:
>
> http://docs.python.org/lib/itertools-functions.html
>
> The really simple ones are: enumerate, range, chain, count, islice, repeat
>
> count() and range() are actually equivalent in their iterable versions.
> But I think enumerate() and islice(), and maybe chain() might be worth
> optimising.

I agree with you that this is worth doing. I was just pointing out that
your nesting proposal didn't save the tuple packing/unpacking step, which
I believe is the most significant overhead.

>> Likewise, the tuple packing/unpacking could be avoided for
>> zip (assuming the number of targets is equal to the number of
>> arguments).
>
> Which would be part of the compiler decision if this should run in plain C
> code or not. One thing to note: zip() is an iterator in Py3, which changes
> its semantics in that you must now call list(zip()) if you want to modify
> its arguments in place. And zip() is actually less trivial than the above
> mentioned ones. Maybe zip() is not worth being optimised.

I use zip more than anything but range... However, I don't think there's
much if any savings to optimize these except in the context of a loop, and
in that case zip is pretty easy to do.
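
As a rough illustration of the rewrite being discussed, here is a minimal
sketch in plain Python (not Cython's generated code). The function names
are made up for this example. The second function is what a compiler
could emit for the first: the index is kept in a local counter instead of
being packed into a tuple by enumerate() and unpacked again on every
iteration. In Cython, with "cdef int i", that counter could be a C int.

```python
def pairs_with_enumerate(seq):
    """Reference version: enumerate() builds an (index, item) tuple per step."""
    result = []
    for i, k in enumerate(seq):
        result.append((i, k))
    return result


def pairs_with_counter(seq):
    """Hand-expanded version: same pairs, but the index is a plain counter."""
    result = []
    i = 0
    for k in seq:
        result.append((i, k))
        i += 1  # stands in for a C-level increment in the compiled loop
    return result
```

One difference a real optimisation would have to respect: after the
enumerate() loop, i holds the last index (or is unbound for an empty
sequence), whereas the naive counter ends one past it.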

- Robert

From kirr at mns.spb.ru  Wed May 14 18:46:32 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Wed, 14 May 2008 20:46:32 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place (newbie approach)
Message-ID: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>

# HG changeset patch
# User Kirill Smelkov
# Date 1210783363 -14400
# Node ID 9107d71e527446535929608e5003744caceb6238
# Parent 9a731464ea49650627ac6a988518aee93e6991aa
RFC: constify Cython output all over the place (newbie approach)

You know, when developing code, it is very tedious to look for meaningful
errors and warnings in the presence of tons of noise like

    warning: deprecated conversion from string constant to 'char*'

And you know what? It seems in the next version of gcc, these deprecation
warnings will be turned into errors.

----

Python sources already use the 'const' keyword freely, so I think it's time
to add constify bits all over the place.

diff --git a/Cython/Compiler/ModuleNode.py b/Cython/Compiler/ModuleNode.py
--- a/Cython/Compiler/ModuleNode.py
+++ b/Cython/Compiler/ModuleNode.py
@@ -399,9 +399,9 @@ class ModuleNode(Nodes.Node, Nodes.Block
         code.putln('static PyObject *%s;' % Naming.preimport_cname)
         code.putln('static int %s;' % Naming.lineno_cname)
         code.putln('static int %s = 0;' % Naming.clineno_cname)
-        code.putln('static char * %s= %s;' % (Naming.cfilenm_cname, Naming.file_c_macro))
-        code.putln('static char *%s;' % Naming.filename_cname)
-        code.putln('static char **%s;' % Naming.filetable_cname)
+        code.putln('static const char * %s= %s;' % (Naming.cfilenm_cname, Naming.file_c_macro))
+        code.putln('static const char *%s;' % Naming.filename_cname)
+        code.putln('static const char **%s;' % Naming.filetable_cname)
         if env.doc:
             code.putln('')
             code.putln('static char %s[] = "%s";' % (env.doc_cname, env.doc))
@@ -425,7 +425,7 @@ class ModuleNode(Nodes.Node, Nodes.Block

     def generate_filename_table(self, code):
         code.putln("")
-        code.putln("static char *%s[] = {" % Naming.filenames_cname)
+        code.putln("static const char *%s[] = {" % Naming.filenames_cname)
         if code.filename_list:
             for filename in code.filename_list:
                 filename = os.path.basename(filename)
diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py
--- a/Cython/Compiler/Nodes.py
+++ b/Cython/Compiler/Nodes.py
@@ -1515,7 +1515,8 @@ class DefNode(FuncDefNode):
             reqd_kw_flags = []
             has_reqd_kwds = False
             code.put(
-                "static char *%s[] = {" %
+                #"static /*const*/ char *%s[] = {" %
+                "static const char *%s[] = {" %
                     Naming.kwdlist_cname)
             for arg in self.args:
                 if arg.is_generic:
@@ -1668,7 +1669,7 @@ class DefNode(FuncDefNode):
             argformat = '"%s"' % string.join(arg_formats, "")
             pt_arglist = [Naming.args_cname, Naming.kwds_cname,
                 argformat,
-                Naming.kwdlist_cname] + arg_addrs
+                '(char **)/*temp.hack*/'+Naming.kwdlist_cname] + arg_addrs
             pt_argstring = string.join(pt_arglist, ", ")
             code.putln(
                 'if (unlikely(!PyArg_ParseTupleAndKeywords(%s))) %s' % (
@@ -3729,8 +3730,8 @@ utility_function_predeclarations = \
 #define INLINE
 #endif

-typedef struct {PyObject **p; char *s;} __Pyx_InternTabEntry; /*proto*/
-typedef struct {PyObject **p; char *s; long n; int is_unicode;} __Pyx_StringTabEntry; /*proto*/
+typedef struct {PyObject **p; const char *s;} __Pyx_InternTabEntry; /*proto*/
+typedef struct {PyObject **p; const char *s; long n; int is_unicode;} __Pyx_StringTabEntry; /*proto*/

 """ + """

@@ -4141,9 +4142,9 @@ missing_kwarg:

 unraisable_exception_utility_code = [
 """
-static void __Pyx_WriteUnraisable(char *name); /*proto*/
-""","""
-static void __Pyx_WriteUnraisable(char *name) {
+static void __Pyx_WriteUnraisable(const char *name); /*proto*/
+""","""
+static void __Pyx_WriteUnraisable(const char *name) {
     PyObject *old_exc, *old_val, *old_tb;
     PyObject *ctx;
     PyErr_Fetch(&old_exc, &old_val, &old_tb);
@@ -4159,13 +4160,13 @@ static void __Pyx_WriteUnraisable(char *

 traceback_utility_code = [
 """
-static void __Pyx_AddTraceback(char *funcname); /*proto*/
+static void __Pyx_AddTraceback(const char *funcname); /*proto*/
 ""","""
 #include "compile.h"
 #include "frameobject.h"
 #include "traceback.h"

-static void __Pyx_AddTraceback(char *funcname) {
+static void __Pyx_AddTraceback(const char *funcname) {
     PyObject *py_srcfile = 0;
     PyObject *py_funcname = 0;
     PyObject *py_globals = 0;

From kirr at mns.spb.ru  Wed May 14 18:50:45 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Wed, 14 May 2008 20:50:45 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place ( newbie approach )
In-Reply-To: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
Message-ID: <200805142050.45450.kirr@mns.spb.ru>

On Wednesday 14 May 2008, Kirill Smelkov wrote:
> # HG changeset patch
> # User Kirill Smelkov
> # Date 1210783363 -14400
> # Node ID 9107d71e527446535929608e5003744caceb6238
> # Parent 9a731464ea49650627ac6a988518aee93e6991aa
> RFC: constify Cython output all over the place (newbie approach)
>
> You know, when developing code, it is very tedious to look for meaningful
> errors and warnings in the presence of tons of noise like
>
>     warning: deprecated conversion from string constant to 'char*'
>
> And you know what? It seems in the next version of gcc, these deprecation
> warnings will be turned into errors.
>
> ----
>
> Python sources already use the 'const' keyword freely, so I think it's time
> to add constify bits all over the place.

Also, please forgive me if I'm doing something wrong -- I just don't know
what the full workflow cycle is with

http://trac.cython.org/cython_trac/

Kirill.
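
To make the effect of the patch more concrete, here is a minimal sketch in
plain Python of the kind of C declaration the patched
generate_filename_table() is meant to emit. Only the format string
'static const char *%s[] = {' is taken from the patch; the helper name
filename_table_decl and the table name "__pyx_f" used in the example are
assumptions for illustration, not Cython's actual code.

```python
import os


def filename_table_decl(cname, filenames):
    """Return the lines of a C declaration for a table of source file names.

    Hypothetical helper: it mimics the shape of the patched code generator,
    declaring the table as `const char *` so that initialising it from
    string literals does not trigger the "deprecated conversion" warning.
    """
    lines = ["static const char *%s[] = {" % cname]
    for filename in filenames:
        # Like the original loop, keep only the base name of each file.
        lines.append('    "%s",' % os.path.basename(filename))
    lines.append("};")
    return lines
```

For example, "\n".join(filename_table_decl("__pyx_f", ["src/foo.pyx"]))
gives a three-line declaration whose first line starts with
"static const char *".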

From dagss at student.matnat.uio.no  Wed May 14 18:58:11 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 14 May 2008 18:58:11 +0200
Subject: [Cython] Transforms tests and source directory structure
In-Reply-To: <64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <482B00A2.6010707@student.matnat.uio.no>
	<64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <482B1A23.60904@student.matnat.uio.no>

>> - Have a Cython/Testing directory for the test utils...
>
> Are the test utilities so large that you really need a new source file or
> even an entire package? And, since you have a Python serialiser in place -
> do you really think that belongs into a test package?

The short answer to this is: Just tell me where to put it.

The Cython serializer (it is at 50% though) can easily go somewhere else
(Cython/Compiler/? Or Cython/?), but it should stay a separate file. The
potential uses outside of debugging and testing are a bit limited though
(theoretically it could make a Cython source prettifier, but then the
parser must insert comment and whitespace nodes...)

As for the test utils, wherever. Can I put it in "Cython/TestUtils.py", so
that any more test utils can be added later?

(Out of curiosity, does a package have more overhead than a few inodes on
my filesystem? Or is it just that you don't like to have many of them?)

> No, not a good idea. Add a test class to the runtests.py that only
> transforms the code, and then provide each test file with the expected
> result in a string, just like the error tester does. That way, you can
> still add a doctest to the source file and run the unchanged code and the
> transformed code through the normal test runner to see if both work as
> expected.
>
> Something like this might work as a test layout:
>
> And then you parse the source file (see the error test class for this) and
> append a serialisation of a dictionary that maps the names of all Python
> functions found in the file to their transformed body, before you compile
> and run the file in the test runner. Do you think that would work?

I can sympathize with the goals, but it needs some changes. Issues:

- I don't want to test the entire pipeline. I.e., if another transform
comes along later, it might do further optimizations to that loop which I
do not want to interfere with the test. (This is all about the "brittle
interdependency" stuff -- I don't want to care about what is happening
elsewhere when I'm working on one part of the compiler.) I.e. there must be
a mechanism for specifying exactly which transforms to run.

BTW, the reasons for these tests are to set up checkable contracts between
different stages of the compilation, not to check that the end-result is
working. For the end-result, regular tests like what we have will also be
needed. ("unit tests" vs. "integration tests")

- Transforms generally don't work on such a small level that they can be
tested within function definitions. Closures will need full module-scope
comparison, inlineable functions (NumPy __getitem__ ...) will need
cross-cimport-tests. With a runtime syntax I could simply do

    test_transform(..., code="""
    cimport test
    ...
    """, pxds={"test": """
    cdef class A:
        ...
    """})

and so on. It would be horrible to have to go make different files in
order to test simple cross-import functionality.

So, in conclusion, perhaps I could fill testing/transforms with xml-files
like this?:

    cdef class A:
        cdef int i

    cimport test
    cdef A a

    cimport test
    cdef A a

(I'll implement that nice and cleanly BTW, have the parser read StringIO
objects from strings in the DOM...)

BTW, the output is not always going to stay valid Cython, but I can just
not add doctests for those...
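
The idea of running only an explicitly chosen list of transforms on a code
string and comparing the result with an expected tree can be sketched with
Python's own ast module as a stand-in. This is only an analogy: the helper
check_transform and the toy FoldIntAddition transform below are
hypothetical and are not part of Cython's transform framework.

```python
import ast


class FoldIntAddition(ast.NodeTransformer):
    """Toy transform: fold `<int> + <int>` into a single constant."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested additions first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(value=node.left.value + node.right.value,
                                  kind=None)
            return ast.copy_location(folded, node)
        return node


def check_transform(transforms, code, expected):
    """Run exactly the given transforms on `code` and compare the resulting
    tree with the tree parsed from `expected` (positions are ignored)."""
    tree = ast.parse(code)
    for transform in transforms:
        tree = transform().visit(tree)
    return ast.dump(tree) == ast.dump(ast.parse(expected))
```

Because the caller passes the list of transforms, a transform added to the
pipeline later cannot change the outcome of this check, which is the "unit
test" contract described above.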

On a related note, this doesn't cover some other tests I might want to do
that aren't related to straight input/output. Perhaps one could have
"testing/pyunit" for PyUnit tests or similar?

--
Dag Sverre

From dagss at student.matnat.uio.no  Wed May 14 19:03:29 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 14 May 2008 19:03:29 +0200
Subject: [Cython] Transforms tests and source directory structure
In-Reply-To: <482B1A23.60904@student.matnat.uio.no>
References: <482B00A2.6010707@student.matnat.uio.no>
	<64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
	<482B1A23.60904@student.matnat.uio.no>
Message-ID: <482B1B61.7030707@student.matnat.uio.no>

> So, in conclusion, perhaps I could fill testing/transforms with
> xml-files like this?:

BTW, xml is not important, *any* format giving what I want would do (and I
notice that a list of simple Python declarations would work too... having
the tested code in strings though). Having < everywhere (or even CDATA
declarations...) doesn't sound too intriguing.

Also it should be a relatively simple transform (XSLT or Python script,
respectively) to turn it into a pyx for the cases where it *can* be used as
an integration test as well (adding a doctest section or similar).

--
Dag Sverre

From robertwb at math.washington.edu  Wed May 14 19:09:41 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 14 May 2008 10:09:41 -0700
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place ( newbie approach )
In-Reply-To: <200805142050.45450.kirr@mns.spb.ru>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
	<200805142050.45450.kirr@mns.spb.ru>
Message-ID: <07264EDA-A554-4F07-AEC5-0E9A71BB8E7C@math.washington.edu>

On May 14, 2008, at 9:50 AM, Kirill Smelkov wrote:

> On Wednesday 14 May 2008, Kirill Smelkov wrote:
>> # HG changeset patch
>> # User Kirill Smelkov
>> # Date 1210783363 -14400
>> # Node ID 9107d71e527446535929608e5003744caceb6238
>> # Parent 9a731464ea49650627ac6a988518aee93e6991aa
>> RFC: constify Cython output all over the place (newbie approach)
>>
>> You know, when developing code, it is very tedious to look for
>> meaningful errors and warnings in the presence of tons of noise like
>>
>>     warning: deprecated conversion from string constant to 'char*'
>>
>> And you know what? It seems in the next version of gcc, these
>> deprecation warnings will be turned into errors.
>>
>> ----
>>
>> Python sources already use the 'const' keyword freely, so I think it's
>> time to add constify bits all over the place.
>
> Also, please forgive me if I'm doing something wrong -- I just don't know
> what the full workflow cycle is with
>
> http://trac.cython.org/cython_trac/

The thing to do here is make a new ticket (the description you give here
would be great) and attach the patch. Prefix the subject with
[with patch, needs review] until we get a clearer workflow going.

- Robert

From dalcinl at gmail.com  Wed May 14 19:11:23 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 14 May 2008 14:11:23 -0300
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place (newbie approach)
In-Reply-To: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
Message-ID:

On 5/14/08, Kirill Smelkov wrote:
> You know, when developing code, it is very tedious to look for meaningful
> errors and warnings in the presence of tons of noise like
>     warning: deprecated conversion from string constant to 'char*'

Indeed!

> And you know what? It seems in the next version of gcc, these deprecation
> warnings will be turned into errors.
> Python sources already use the 'const' keyword freely, so I think it's time
> to add constify bits all over the place.

Still, the Python sources do not use const in all the places they should.
If that's the future of GCC, then the Python 2.X series is going to need
special compiler flags to disable this error. And I have not reviewed the
status of Python 3.

If you feel confident and have the time, try to build Python 3 from
sources using the -Wwrite-strings option for GCC. And if you are
completely sure that this will be an error in future GCCs, I believe you
should warn about this on the Python-Dev list...

Some time ago I tried to patch python2.6 to fix the 'const' issue. But it
turned out that this was going to introduce heavy source code compatibility
problems in extension modules written for the 2.X series, so I gave up.

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From mabshoff at googlemail.com  Wed May 14 18:41:26 2008
From: mabshoff at googlemail.com (Michael Abshoff)
Date: Wed, 14 May 2008 18:41:26 +0200
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place (newbie approach)
In-Reply-To:
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
Message-ID: <482B1636.7040100@googlemail.com>

Lisandro Dalcin wrote:
> On 5/14/08, Kirill Smelkov wrote:
>
>> You know, when developing code, it is very tedious to look for meaningful
>> errors and warnings in the presence of tons of noise like
>>     warning: deprecated conversion from string constant to 'char*'
>>
>
> Indeed!
>
>
>> And you know what? It seems in the next version of gcc, these deprecation
>> warnings will be turned into errors.
>>

Nope. At least none of the gcc 4.4 snapshots have done so.
It will come with certainty at some point, but I doubt it will be in
gcc 4.4 ;)

I have constified enough code to know it is a pain to do, so the earlier
Cython forces this on people the better.

Cheers,

Michael

>> Python sources already use the 'const' keyword freely, so I think it's
>> time to add constify bits all over the place.
>>
>
> Still, the Python sources do not use const in all the places they should.
> If that's the future of GCC, then the Python 2.X series is going to need
> special compiler flags to disable this error. And I have not reviewed
> the status of Python 3.
>
> If you feel confident and have the time, try to build Python 3 from
> sources using the -Wwrite-strings option for GCC. And if you are
> completely sure that this will be an error in future GCCs, I
> believe you should warn about this on the Python-Dev list...
>
> Some time ago I tried to patch python2.6 to fix the 'const' issue.
> But it turned out that this was going to introduce heavy source code
> compatibility problems in extension modules written for the 2.X
> series, so I gave up.
>
>
>

From kirr at mns.spb.ru  Wed May 14 19:21:27 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Wed, 14 May 2008 21:21:27 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place ( newbie approach )
In-Reply-To: <07264EDA-A554-4F07-AEC5-0E9A71BB8E7C@math.washington.edu>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
	<200805142050.45450.kirr@mns.spb.ru>
	<07264EDA-A554-4F07-AEC5-0E9A71BB8E7C@math.washington.edu>
Message-ID: <200805142121.27899.kirr@mns.spb.ru>

On Wednesday 14 May 2008, Robert Bradshaw wrote:
> On May 14, 2008, at 9:50 AM, Kirill Smelkov wrote:
> > Also, please forgive me if I'm doing something wrong -- I just
> > don't know what the full workflow cycle is with
> >
> > http://trac.cython.org/cython_trac/
>
> The thing to do here is make a new ticket (the description you
> give here would be great) and attach the patch. Prefix the subject with
> [with patch, needs review] until we get a clearer workflow going.

Ok, here you go:

http://trac.cython.org/cython_trac/ticket/9

However, I have some questions (forgive me, but I have not used trac
before):

- how do I subscribe to some mailing list to monitor ticket activity?
- or do I have to use the rss feed only?
- how would I apply the patch attached there - do I need to manually
  download it, or is there a better, automated way?
- how do I register?

Thanks,
Kirill.

From kirr at mns.spb.ru  Wed May 14 19:41:19 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Wed, 14 May 2008 21:41:19 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place (newbie approach)
In-Reply-To: <482B1636.7040100@googlemail.com>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
	<482B1636.7040100@googlemail.com>
Message-ID: <200805142141.19872.kirr@mns.spb.ru>

On Wednesday 14 May 2008, Michael Abshoff wrote:
> Lisandro Dalcin wrote:
> > On 5/14/08, Kirill Smelkov wrote:
> >
> >> You know, when developing code, it is very tedious to look for meaningful
> >> errors and warnings in the presence of tons of noise like
> >>     warning: deprecated conversion from string constant to 'char*'
> >>
> >
> > Indeed!
> >
> >
> >> And you know what? It seems in the next version of gcc, these deprecation
> >> warnings will be turned into errors.
> >>
>
> Nope. At least none of the gcc 4.4 snapshots have done so. It will
> come with certainty at some point, but I doubt it will be in gcc 4.4 ;)

You are right :) By next version I meant "some future version".
It seems I need to be less ambiguous.

> I have constified enough code to know it is a pain to do, so the earlier
> Cython forces this on people the better.

Then let's do it ASAP!

Kirill.

From kirr at mns.spb.ru  Wed May 14 19:46:00 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Wed, 14 May 2008 21:46:00 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place (newbie approach)
In-Reply-To:
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
Message-ID: <200805142146.00134.kirr@mns.spb.ru>

On Wednesday 14 May 2008, Lisandro Dalcin wrote:
> On 5/14/08, Kirill Smelkov wrote:
> > You know, when developing code, it is very tedious to look for meaningful
> > errors and warnings in the presence of tons of noise like
> >     warning: deprecated conversion from string constant to 'char*'
>
> Indeed!

Absolutely! :)

> > And you know what? It seems in the next version of gcc, these deprecation
> > warnings will be turned into errors.
> > Python sources already use the 'const' keyword freely, so I think it's
> > time to add constify bits all over the place.
>
> Still, the Python sources do not use const in all the places they should.
> If that's the future of GCC, then the Python 2.X series is going to need
> special compiler flags to disable this error. And I have not reviewed
> the status of Python 3.

I have the whole of Python as an HG repo here, and recently I investigated
the release25-maint and py3k branches regarding const usage. I can say
const is there (more in py3k), but a lot of places need conversion too.

> If you feel confident and have the time, try to build Python 3 from
> sources using the -Wwrite-strings option for GCC. And if you are
> completely sure that this will be an error in future GCCs, I
> believe you should warn about this on the Python-Dev list...

I'm not confident :), and unfortunately my time is very limited. Also I
don't know for sure about gcc, but at least it is their practice to remove
deprecated features in a near release.
Look e.g. here:

http://gcc.gnu.org/gcc-4.2/changes.html

"""
* The command-line option -fconst-strings, deprecated in previous GCC
  releases, has been removed.
"""

> Some time ago I tried to patch python2.6 to fix the 'const' issue.
> But it turned out that this was going to introduce heavy source code
> compatibility problems in extension modules written for the 2.X
> series, so I gave up.

Here are 'const'-relevant Python issues:

http://bugs.python.org/issue2758
http://bugs.python.org/issue1772673
http://bugs.python.org/issue1699259
http://bugs.python.org/issue1866

So maybe let's discuss this topic here, come to some consensus, and then
maybe try to push python-dev again?

Kirill.

From robertwb at math.washington.edu  Wed May 14 20:03:10 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 14 May 2008 11:03:10 -0700
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place ( newbie approach )
In-Reply-To: <200805142121.27899.kirr@mns.spb.ru>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru>
	<200805142050.45450.kirr@mns.spb.ru>
	<07264EDA-A554-4F07-AEC5-0E9A71BB8E7C@math.washington.edu>
	<200805142121.27899.kirr@mns.spb.ru>
Message-ID: <953F4D24-454E-4818-A719-218696301712@math.washington.edu>

On May 14, 2008, at 10:21 AM, Kirill Smelkov wrote:

> On Wednesday 14 May 2008, Robert Bradshaw wrote:
>> On May 14, 2008, at 9:50 AM, Kirill Smelkov wrote:
>>> Also, please forgive me if I'm doing something wrong -- I just
>>> don't know what the full workflow cycle is with
>>>
>>> http://trac.cython.org/cython_trac/
>>
>> The thing to do here is make a new ticket (the description you
>> give here would be great) and attach the patch. Prefix the subject with
>> [with patch, needs review] until we get a clearer workflow going.
>
> Ok, here you go:
>
> http://trac.cython.org/cython_trac/ticket/9
>
> However, I have some questions (forgive me, but I have not used trac
> before):
>
> - how do I subscribe to some mailing list to monitor ticket activity?

This isn't set up yet, but will be. There will be a separate list you can
subscribe to, as well as the option to receive notifications on (only) the
tickets you are associated with (either by editing them, or adding your
name to the cc).

> - or do I have to use the rss feed only?
> - how would I apply the patch attached there - do I need to manually
>   download it, or is there a better, automated way?

You can pull directly from the url of the patch online. Would that work?

> - how do I register?

Currently, send me an htpasswd file.

- Robert

From stefan_ml at behnel.de  Wed May 14 22:06:31 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 May 2008 22:06:31 +0200
Subject: [Cython] status update on Py3 support
In-Reply-To:
References: <482A0D10.1090603@behnel.de>
Message-ID: <482B4647.6040306@behnel.de>

Hi,

Lisandro Dalcin wrote:
> Stefan, I cannot get it working in the current state of the repo... I
> have to patch like above

[patch moved here]

> --- a/Cython/Compiler/Nodes.py Tue May 13 23:41:11 2008 +0200
> +++ b/Cython/Compiler/Nodes.py Wed May 14 12:45:16 2008 -0300
> @@ -4325,8 +4325,14 @@ static int __Pyx_InitStrings(__Pyx_Strin
>          if (t->intern)
>              *t->p = PyString_InternFromString(t->s);
>          else
> +            *t->p = PyString_FromStringAndSize(t->s, t->n - 1);
> +        #else
> +        if (t->intern) {
> +            *t->p = PyUnicode_InternFromString(t->s);
> +        } else {
> +            *t->p = PyUnicode_FromStringAndSize(t->s, t->n - 1);
> +        }
>          #endif
> -        *t->p = PyString_FromStringAndSize(t->s, t->n - 1);
>      }
>      if (!*t->p)
>          return -1;

Can you tell me why? This actually creates strings as unicode strings that
were byte strings in the source code.

> At this point, I'm a bit confused about the Cython-internal string
> table.
> In the Py3 context, what would be the point of having byte strings in
> this table?

The byte strings are interned in Py2, where identifiers are byte strings,
and the unicode strings are interned in Py3, where identifiers are unicode
strings.

> should Cython save (in the Py3 case)
> byte strings in their internal table?

We can't recognise the Py3 case at Cython compile time. Only the C compiler
knows what the target environment is.

> It seems that Cython should still use two tables:
>
> * one table lists identifiers. [...]
> * the other table lists string literals [...]

That's what I was considering, too, although not quite as you describe. The
difference between the two is that the real identifiers must become either
byte strings or unicode, depending on the compile time Python version. The
normal strings must be created as they appeared in the source code, and
either unicode strings or byte strings can be interned based on the compile
time Python version. So I now added another field to the string tab that
states if the string was interned as identifier, which will then make it
pop up as either unicode or byte string depending on the C compile time
Python version.

This (plus a fix for importing based on unicode module names) even gets
almost all test cases green, just a few to go. :)

One big remaining problem is the PyFile_* functions - as used in "print". I
guess we'll have to wait for 3.0b1 here to provide a fix.

Another thing is the removal of __setslice__ and __delslice__, i.e. the
sq_ass_slice slot. This means that code that uses these will no longer
compile. I added a warning, but there are two test cases that depend on it.
We might want to remove them.

I tested the generated code under Py2.3.6, Py2.4.4, Py2.5.1 and Py3.0a5
now. Except for a couple of remaining problems in Py3, it still works in
all of them.
:]

Stefan

From dagss at student.matnat.uio.no  Wed May 14 22:13:43 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 14 May 2008 22:13:43 +0200
Subject: [Cython] Fix for raising exceptions from iterator bug
Message-ID: <482B47F7.3030407@student.matnat.uio.no>

Stupid question perhaps: Should stuff like this go to the mailing list
until the other list is up, or can the committers be considered notified
by simply adding the ticket?

Fix for raising exceptions from iterator bug:

http://trac.cython.org/cython_trac/ticket/10

--
Dag Sverre

From robertwb at math.washington.edu  Wed May 14 22:24:28 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 14 May 2008 13:24:28 -0700
Subject: [Cython] Fix for raising exceptions from iterator bug
In-Reply-To: <482B47F7.3030407@student.matnat.uio.no>
References: <482B47F7.3030407@student.matnat.uio.no>
Message-ID: <8DDD9A1F-F490-448E-9528-BE8EF81EDAD7@math.washington.edu>

Pinging the list is probably a good idea until I get the other mailing
list set up--which won't happen until next week at the earliest.

BTW, thanks for the fix.

On May 14, 2008, at 1:13 PM, Dag Sverre Seljebotn wrote:

> Stupid question perhaps: Should stuff like this go to the mailing list
> until the other list is up, or can the committers be considered notified
> by simply adding the ticket?
>
> Fix for raising exceptions from iterator bug:
>
> http://trac.cython.org/cython_trac/ticket/10
>
>
> --
> Dag Sverre

From joost at cassee.net  Wed May 14 22:25:09 2008
From: joost at cassee.net (Joost Cassee)
Date: Wed, 14 May 2008 22:25:09 +0200
Subject: [Cython] Profiling a Cython module
Message-ID: <482B4AA5.7020201@cassee.net>

Hi all,

After having profiled a Python app I decided to convert some modules to
Cython. Is it possible to profile these modules that are now Cython?
I guess it is not quite a Cython question because it will probably involve
gprof, but I thought some people on this list might have done this before.

Regards,

Joost

--
Joost Cassee
http://joost.cassee.net

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 544 bytes
Desc: OpenPGP digital signature
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080514/50127b43/attachment.pgp

From dalcinl at gmail.com  Wed May 14 23:15:08 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 14 May 2008 18:15:08 -0300
Subject: [Cython] status update on Py3 support
In-Reply-To: <482B4647.6040306@behnel.de>
References: <482A0D10.1090603@behnel.de>
	<482B4647.6040306@behnel.de>
Message-ID:

On 5/14/08, Stefan Behnel wrote:
> [patch moved here]
>
> > --- a/Cython/Compiler/Nodes.py Tue May 13 23:41:11 2008 +0200
> > +++ b/Cython/Compiler/Nodes.py Wed May 14 12:45:16

> Can you tell me why? This actually creates strings as unicode strings that
> were byte strings in the source code.

It was just a try, because I could not import my extension at the time I
wrote that.

> The byte strings are interned in Py2, where identifiers are byte strings,
> and the unicode strings are interned in Py3, where identifiers are unicode
> strings.

OK,

> > should Cython save (in the Py3 case)
> > byte strings in their internal table?
>
> We can't recognise the Py3 case at Cython compile time. Only the C compiler
> knows what the target environment is.

I'll reformulate the question. Why are byte/unicode string literals not
going to be managed inside the same table as the strings for identifier
names? Why not treat them in the same way as integer literals?

Wait a minute!! Now I see, you are clever... if we do

    D = {}
    D["abc"] = 1
    v = D["abc"]

then iff the "abc" literal was interned, the lookup in the last line will
benefit from the interning.
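
The effect described here can be observed from plain Python. The following
is a minimal sketch, assuming Python 3, where the function is sys.intern()
(in Python 2 it was the builtin intern()); the function name
lookup_with_interned_key is made up for illustration. Two equal strings
that have both been interned are the same object, and CPython's dict lookup
checks key identity before falling back to a full string comparison.

```python
import sys


def lookup_with_interned_key():
    """Show that interned equal strings share one object and work as keys."""
    # Built at runtime, so it is not automatically the same object as a literal.
    runtime_key = sys.intern("".join(["ab", "c"]))
    literal_key = sys.intern("abc")

    table = {}
    table[literal_key] = 1
    value = table[runtime_key]  # identity match, no character comparison needed

    return runtime_key is literal_key, value
```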
So yes, you are right, it DO make sense to intern string literals as much as identifier names... > That's what I was considering, too, although not quite as you describe. The > difference between the two is that the real identifiers must become either > byte strings or unicode, depending on the compile time Python version. The > normal strings must be created as they appeared in the source code, and either > unicode strings or byte strings can be interned based on the compile time > Python version. So I now added another field to the string tab that states if > the string was interned as identifier, which will then make it pop up as > either unicode or byte string depending on the C compile time Python version. OK, this seems now to me the right way. > This (plux a fix for importing based on unicode module names) even gets almost > all test cases green, just a few to go. :) A now all is working for me with current cython-devel-py3 repo!!!, except in parts that are my fault because of poor string handling. > One big remaining problem are the PyFile_* functions - as used in "print". I > guess we'll have to wait for 3.0b1 here to provide a fix. I have not looked at this yet. > Another thing is the removal of __setslice__ and __delslice__, i.e. the > sq_ass_slice slot. This means that code that uses these will no longer > compile. I added a warning, but there are two test cases that depend on it. We > might want to remove them. I would not worry too much about them, Look at http://docs.python.org/ref/sequence-methods.html. They are deprecated since Python release 2.0 . Or perhaps Cython should transform them and define __getitem__/__setitem__/__delitem___ . And if both variant are implemented, generate a warning (perhaps an error?) EVEN in the Py2 case, as there is no point on defining both. > I tested the generated code under Py2.3.6, Py2.4.4, Py2.5.1 and Py3.0a5 now. > Except for a couple of remaining problems in Py3, it still works in all of > them. 
> :]

Almost all is working for me now on Py2/Py3, too. However, I suspect that the new method cache in type objects (Py2.6 and Py3.0) is playing bad with the code Cython generates, but only in the cases Cython play games with the 'tp_dict' field of type objects...

Regards, and let me say you have done pretty good work on all this...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Wed May 14 22:32:55 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 14 May 2008 22:32:55 +0200
Subject: [Cython] Transforms tests and source directory structure
In-Reply-To: <482B1A23.60904@student.matnat.uio.no>
References: <482B00A2.6010707@student.matnat.uio.no> <64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482B1A23.60904@student.matnat.uio.no>
Message-ID: <482B4C77.3030105@behnel.de>

Hi,

maybe it's even simpler than I proposed, you could just pass the dictionary into the doctest environment (somehow ;) instead of modifying the source file.

Dag Sverre Seljebotn wrote:
>>> - Have a Cython/Testing directory for the test utils...
>> Are the test utilities so large that you really need a new source file or
>> even an entire package? And, since you have a Python serialiser in place -
>> do you really think that belongs into a test package?
>
> The short answer to this is: Just tell me where to put it.

not sure, maybe "Cython/CodeWriter/CythonCodeWriter.py" ?

> The potential uses outside of debugging and testing are a
> bit limited though (theoretically it could make a Cython source
> prettifier but then the parser must insert comment and whitespace nodes...)

or a plain Python serialiser (i.e. cdef remover).

> As for the test utils, wherever.
Can I put it in "Cython/TestUtils.py", > so that any more test utils can be added later? > > (Out of curiosity, does a package have more overhead than a few inodes > on my filesystem? Or is it just that you don't like to have many of them?) I was just trying to find out if it's too big to fit into the test runner, but the stuff you mentioned makes it at least a module. > Issues: > - I don't want to test that the entire pipeline. I.e., if another > transform comes along later it might do further optimizations to that > loop which I do not want to interfer with the test. (This is all about > the "brittle interdependency" stuff -- I don't want to care about what > is happening elsewhere when I'm working on one part of the compiler.) Then write a good test case. ;) > I.e there must be a mechanism for specifying exactly which transforms to > run. the compilation options maybe? > BTW, the reasons for these tests are to set up checkable contracts > between different stages of the compilation, not to check that the > end-result is working. For the end-result regular tests like what we > have will also be needed. ("unit tests" vs. "integration tests") Fine. > - Transforms generally don't work on such a small level that they can be > tested within function definitions. Closures will need full module-scope > comparison, inlineable functions (NumPy __getitem__ ...) will need > cross-cimport-tests. With a runtime syntax I could simply do > > test_transform(..., code=""" > cimport test > ... > """, pxds={"test": """ > cdef class A: ... > """}) > > and so on. It would be horrible to have to go make different files in > order to test simple cross-import functionality. Why? You'd have a couple of simple, reusable .pxd files that provide a single thing and they would be used by multiple tests. That's how it works already. > BTW, the output is not always going to stay valid Cython, but I can just > not add doctests for those... That's not a problem if you run the compiler completely. 
> On a related note, this doesn't cover some other tests I might want to > do that isn't related to straight input/output. Perhaps one could have > "testing/pyunit" for PyUnit tests or similar? If it's a "real" unit test, Cython/Compiler/Tests would be ok with me. Stefan From greg.ewing at canterbury.ac.nz Thu May 15 02:26:09 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 15 May 2008 12:26:09 +1200 Subject: [Cython] status update on Py3 support In-Reply-To: References: <482A0D10.1090603@behnel.de> Message-ID: <482B8321.8020307@canterbury.ac.nz> Lisandro Dalcin wrote: > So, should Cython save (in the Py3 case) > byte strings in their internal table? There are two separate things going on here: 1) The reason for keeping string literals in a table is so that you don't have to create a new string object every time they're used. This would seem to apply to bytes just as much as strings. 2) Interning of strings that are likely to be used in dynamic name lookups. Since all names are strings, this only applies to strings, not bytes. > At this point, I'm not sure if meging the string tables was a good > idea. I don't even see how you *can* merge them, since strings and bytes are completely different types in py3k. -- Greg From greg.ewing at canterbury.ac.nz Thu May 15 02:34:33 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 15 May 2008 12:34:33 +1200 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <482B1636.7040100@googlemail.com> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> Message-ID: <482B8519.7000207@canterbury.ac.nz> There's something I've been meaning to ask people about. Until I find a way of handling const properly, I'm considering having Pyrex put this at the top of its generated files: #define const so that all Pyrex-generated code will be a totally const-free zone. 
But I don't want to do this if there's a chance it could break something. Can anyone think of any problems it could cause?

--
Greg

From robertwb at math.washington.edu Thu May 15 03:28:53 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 14 May 2008 18:28:53 -0700
Subject: [Cython] defining module constants
In-Reply-To: <20080513232822.b3589d24.jim-crow@rambler.ru>
References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu> <20080513144911.ac42579e.jim-crow@rambler.ru> <20080513232822.b3589d24.jim-crow@rambler.ru>
Message-ID: <4B56CD7D-C7F1-4284-8060-47355D5EF338@math.washington.edu>

On May 13, 2008, at 9:28 AM, Anatoly A. Kazantsev wrote:

> On Tue, 13 May 2008 14:49:11 +0700
> "Anatoly A. Kazantsev" wrote:
>
> I have next code and it's not working properly.
>
> In foo.h somebody wrote:
>
> #define BAR 1
>
> than I want to define module constant with same name and value:
>
> cdef export from "foo.h":
>     ctypedef enum:
>         _BAR "BAR"
>
> cdef public enum:
>     BAR = _BAR
>
> In python BAR will have 0 not 1. And actually 0 is index number of
> BAR in the enum.
> Looks like value of _BAR is not applyed to BAR in the enum.
>
> Maybe I do something wrong.

Hmm... I am unable to reproduce this--it works for me. Anyone else?

- Robert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080514/13233af8/attachment.pgp From robertwb at math.washington.edu Thu May 15 03:35:37 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 14 May 2008 18:35:37 -0700 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: References: Message-ID: On May 13, 2008, at 8:28 AM, Lisandro Dalcin wrote: > On 5/13/08, Robert Bradshaw > >> I am having trouble understanding exactly what you're trying to >> accomplish >> here--what is the correspondence between strides and cstrides, and >> how will >> they be kept in sync? > > I'm just introducing a naming convention for faster access to C-level > atributes that is simple to remind. And the sync between cshape and > shape is in charge of numpy. Of course, if you modify 'cshape' and do > not update yourself 'cstrides', then you get what you deserve (and a > good point for the future Cython to support something like 'const') > >> Why not let a.ndim be the c attribute that gets >> coerced to a python attribute when needed? > > Well, that is easy, a ndim is just an integer! > > For some types where coercion is >> not possible (e.g. the int* shape) then it could transform the >> above to >> (a).shape and be implemented via the property: __get__ >> mechanism as >> at >> http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/ >> Doc/Manual/extension_types.html#Properties >> (if it exists?). > > And then you get a tuple, not a plain C array... You get a tuple *only* if you try and use it as a python object. Otherwise it's C access as usual. >> Not having two names that point to the same thing is one of >> the whole points of cpdef functions, this seems to be going >> backwards with >> attributes. 
> > Agreed, but I believe that in the case of 'cshape'/'shape', they are > not actually similar, one is a pointer, and other is a tuple. Of > course, if we can make Cython to coerce the 'cshape' to a tuple ONLY > when it is requiered, ie, ' cdef object s = ary.cshape ', then that > would be great. Exactly. > > Also, taking Python code and having to change all the names to >> get Cython to optimize it (thus rendering it incompatible with >> pure Python) >> seems like an undesirable thing too. Maybe I'm missing something... > > You are talking about optimization, but you are proposing that > 'a.shape' being accessed to the descriptor mechanism, which will > always return a tuple. At this point, I'm a bit confused about your > comments. The descriptor saves you the attribute lookup, but still get > a Python-level object, and their access is not as fast as a plain > C-array. > > Furthermore, you commented about 'cpdef' functions. Well, If we want > to take advantage of them, we DO HAVE to change pure-python source > code to take advantage of it. However, I remember that there were > comments about making this automatic. Yes. But the point was before there were (in our codebase at least) starting to be lots of instances of def foo(self, ...): return foo_c(self, ...) def foo_c(self, ...): ... and one would have to remember which one to call from python vs. cython. > However, I do not see why my proposal interferes with the yours. For > numpy arrays, 'shape' could be still looked up via the descriptor way, > returning a tuple, but 'cshape' can be still there for power users > that want to get even faster access. My proposal is that for someone coming from using NumPy from Python, they only need declare their object (say x) as being a numpy array, and then all access to x.shape is suddenly faster rather than having to remember a new way to access shape, ndim, etc. 
- Robert From dagss at student.matnat.uio.no Thu May 15 09:33:43 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 09:33:43 +0200 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: References: Message-ID: <482BE757.9000500@student.matnat.uio.no> > My proposal is that for someone coming from using NumPy from Python, > they only need declare their object (say x) as being a numpy array, > and then all access to x.shape is suddenly faster rather than having > to remember a new way to access shape, ndim, etc. I've been ranting about this before, so I'll be brief [edit: I failed], but I have strong feelings (and a strong investment in the summer too) in this. The problem is that typical usecases want to either use the Python API (though *invisibly* optimized), or the C API. I don't see a need for the in-between crossbreed. Explicitly, I'd love for the following to fail, hard: cdef int* s = arr.shape cdef object s = raw_arr.dimensions ("cdef object s = arr.shape" should work though, but I consider that GSoC-stuff, not something you can fake at this stage.) We might just have to agree that we disagree on this one. Though remember: Explicit is better than implicit. Renaming the fields at least keep the APIs relatively seperate. Even if the Python API is provided, there are still usecases for the native API. For instance, how can one implement the ndarray inlines if there's no raw access? If there's no way to access the raw C API but one relies on "magic" then they will create infinite loops. Perhaps the following can be a compromise?: cdef extern ... 
    ctypedef struct ndarray_extension_struct "PyArrayObject":
        int nd
        Py_intptr_t *dimensions

cdef class numpy.ndarray [object PyArrayObject]:
    property shape:
        cdef inline final object __get__(self):
            cdef ndarray_extension_struct* raw = \
                <ndarray_extension_struct*>self
            return make_tuple_from(raw.dimensions, raw.nd)

        # And if you want auto-conversion to int-array,
        # add a type-overload like this:
        cdef inline final Py_intptr_t* __get__(self):
            cdef ndarray_extension_struct* raw = \
                <ndarray_extension_struct*>self
            return raw.dimensions

So if you really want access to the raw struct, there's a predefined way.

Lisandro's proposal is a lot easier on the fingers though ("return make_tuple_from(self.cshape, self.cndim)").

--
Dag Sverre

From kirr at mns.spb.ru Thu May 15 10:12:19 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Thu, 15 May 2008 12:12:19 +0400
Subject: [Cython] [PATCH] RFC: constify Cython output all over the place ( newbie approach )
In-Reply-To: <953F4D24-454E-4818-A719-218696301712@math.washington.edu>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <200805142121.27899.kirr@mns.spb.ru> <953F4D24-454E-4818-A719-218696301712@math.washington.edu>
Message-ID: <200805151212.19311.kirr@mns.spb.ru>

On Wednesday 14 May 2008, Robert Bradshaw wrote:
> On May 14, 2008, at 10:21 AM, Kirill Smelkov wrote:
>
> > On Wednesday 14 May 2008, Robert
> > Bradshaw wrote:
> >> On May 14, 2008, at 9:50 AM, Kirill Smelkov wrote:
> >>> Also, please forgive me, if I'm doing something wrong -- I just
> >>> don't know
> >>> what the full workflow cycle is with
> >>>
> >>> http://trac.cython.org/cython_trac/
> >>
> >> The thing to do here is make a new ticket (with the description you
> >> give would be great) and attach the patch. Prefix the subject with
> >> [with patch, needs review] until we get a clearer workflow going.
> >
> > Ok, here you go:
> >
> > http://trac.cython.org/cython_trac/ticket/9
> >
> > I however have some questions (forgive me, but I had not used trac
> > before):
> >
> > - how do I subscribe to some mailing list to monitor tickets activity?
>
> This isn't set up yet, but will be. There will be a separate list you
> can subscribe to, as well as the option to receive notifications on
> (only) the tickets you are associated with (either by editing them,
> or adding your name to the cc).

I see. Looking forward to this infrastructure being set up, and thanks in advance for doing the maintenance work.

> > - or do I have to use rss feed only?
> > - how would I apply attached there patch - do I need to manually
> > download
> > it, or is there a better, automated way?
>
> You can pull directly from the url of the patch online. Would that work?

Do you mean

$ hg import http://trac.cython.org/cython_trac/attachment/ticket/9/constify.patch?format=raw

Yes, this works, thanks. Unfortunately, I think there are ~5-10 clicks before this command can be run, and I'm too used to just pressing one key to apply a patch :) Anyway, I think this is not a showstopper.

> > - how to register?
>
> Currently, send me an htpasswd file.

Sent via private mail.

Thanks,
Kirill.

P.S. what about the patch itself?

From kirr at mns.spb.ru Thu May 15 10:11:12 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Thu, 15 May 2008 12:11:12 +0400
Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach))
In-Reply-To: <482B8519.7000207@canterbury.ac.nz>
References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz>
Message-ID: <200805151211.12584.kirr@mns.spb.ru>

Hi Greg,

On Thursday 15 May 2008, Greg Ewing wrote:
> There's something I've been meaning to ask people
> about.
> Until I find a way of handling const properly,
> I'm considering having Pyrex put this at the top of
> its generated files:
>
> #define const
>
> so that all Pyrex-generated code will be a totally
> const-free zone.
>
> But I don't want to do this if there's a chance
> it could break something. Can anyone think of any
> problems it could cause?

I think this is going to be trouble at least when interfacing with external C++ code and libraries - at least at link time.

All C++ compilers I know put the function signature into its name (there are even C++ ABI rules for this), and 'const' affects this too, look:

$ cat c++const.cpp
void dosmth_1(int *p) {}
void dosmth_2(const int *p) {}

$ g++ -c c++const.cpp
$ nm c++const.o
00000000 T _Z8dosmth_1Pi
00000006 T _Z8dosmth_2PKi
         U __gxx_personality_v0

So you see, "void dosmth_1(int *)" was mangled into _Z8dosmth_1Pi, and "void dosmth_2(const int *p)" into _Z8dosmth_2PKi. Note the extra 'K' character.

Now, if we use an empty "#define const ", we'll fool the C++ compiler into trying to link with a different signature for functions where const was present, and this will fail at link time.

This is what first comes to my mind -- there are probably other issues as well.

With this explanation, I'm -1 on the "#define const " idea, and I also think that it would be better to spend our energy and time on adding proper const support instead.

What do you think?

Thanks,
Kirill.
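
The mangling difference in the nm output above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the Itanium C++ ABI encoding that g++ uses, and it covers only the narrow case from the example: a free function taking a single pointer-to-int parameter. The helper name `mangle_int_ptr_func` is hypothetical and not part of any tool.

```python
def mangle_int_ptr_func(name: str, const_pointee: bool) -> str:
    """Mangle `void name(int *)` or `void name(const int *)`.

    Assumption: Itanium C++ ABI encoding as used by g++, where
    `_Z` starts a mangled name, the identifier is length-prefixed,
    `P` means "pointer to", `K` means "const" and `i` means "int".
    Only this one parameter shape is handled.
    """
    qualifier = "K" if const_pointee else ""
    return f"_Z{len(name)}{name}P{qualifier}i"


# The two functions from c++const.cpp.
plain = mangle_int_ptr_func("dosmth_1", const_pointee=False)
constified = mangle_int_ptr_func("dosmth_2", const_pointee=True)

print(plain)       # _Z8dosmth_1Pi
print(constified)  # _Z8dosmth_2PKi

# Dropping const on the caller's side (as an empty "#define const"
# would do) changes the symbol the linker is asked to find.
stripped = mangle_int_ptr_func("dosmth_2", const_pointee=False)
print(stripped == constified)  # False
```

The stripped name, `_Z8dosmth_2Pi`, is not the symbol the library exports, which is why the mismatch would only show up at link time as an undefined reference.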
From robertwb at math.washington.edu Thu May 15 10:17:33 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 01:17:33 -0700 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <482BE757.9000500@student.matnat.uio.no> References: <482BE757.9000500@student.matnat.uio.no> Message-ID: <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> On May 15, 2008, at 12:33 AM, Dag Sverre Seljebotn wrote: > >> My proposal is that for someone coming from using NumPy from Python, >> they only need declare their object (say x) as being a numpy array, >> and then all access to x.shape is suddenly faster rather than having >> to remember a new way to access shape, ndim, etc. > > I've been ranting about this before, so I'll be brief [edit: I > failed], > but I have strong feelings (and a strong investment in the summer too) > in this. > > The problem is that typical usecases want to either use the Python API > (though *invisibly* optimized), or the C API. Having two separate APIs to the same library is suboptimal, but is (currently) sometimes forced by namespace conflicts. What I think people want is to use the Python API that gets compiled (by Cython) to using the C API directly. I see nothing gained by trying to make the two more distinct--in fact the more they overlap the less "translation" Cython will have to do (which would reduce your burden, right?). > I don't see a need for the in-between crossbreed. Cpdef is the ultimite crossbreed, and was very welcome. In fact, I would say the whole point of Python/Cython is an in-between crossbreed between C and Python. Or maybe I'm misinterpreting you here. > > Explicitly, I'd love for the following to fail, hard: > > cdef int* s = arr.shape > cdef object s = raw_arr.dimensions As defined in the current NumPy library, I agree (though it's a little unclear what you mean by "arr" vs. "raw_arr"). 
However, in the ideal world, I think if arr has a cdef attribute named "shape" and a property named "shape" then it should use the best one for the task at hand (e.g. printing vs indexing). > ("cdef object s = arr.shape" should work though, but I consider that > GSoC-stuff, not something you can fake at this stage.) Doesn't that work now? > > We might just have to agree that we disagree on this one. Though > remember: Explicit is better than implicit. > > Renaming the fields at least keep the APIs relatively seperate. > Even if > the Python API is provided, there are still usecases for the native > API. > > For instance, how can one implement the ndarray inlines if there's no > raw access? If there's no way to access the raw C API but one > relies on > "magic" then they will create infinite loops. No, I'm seeing things going the other way. obj.attr is viewed as a cdef attribute (if it exists) any only as a python attribute if it is used in a python context and can't be coerced. (One would implement the shape property to return the appropriate object.) The "raw access" is always available (assuming obj is typed correctly) and there is no need or motivation for a separate API. > > Perhaps the following can be a compromise?: > > cdef extern ... > ctypedef struct ndarray_extension_struct "PyArrayObject": > int nd > Py_intptr_t *dimensions > > cdef class numpy.ndarray [object PyArrayObject]: > property shape: > cdef inline final object __get__(self): > cdef ndarray_extension_struct* raw = \ > self > return make_tuple_from(raw.dimensions, raw.nd) > > # And if you want auto-conversion to int-array, > # add a type-overload like this: > cdef inline final Py_intptr_t* __get__(self): > cdef ndarray_extension_struct* raw = \ > self > return raw.dimensions > > > So if you really want access to the raw struct, there's a > predefined way. > > Lisandro's proposal is a lot easier on the fingers though ("return > make_tuple_from(self.cshape, self.cndim)"). 
I think "make_tuple_from(self.shape, self.ndim)" is even easier than that, especially if one has a Python program using NumPy and wants to start using Cython to make it faster. - Robert From dagss at student.matnat.uio.no Thu May 15 10:26:23 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 10:26:23 +0200 Subject: [Cython] Transforms tests and source directory structure In-Reply-To: <482B4C77.3030105@behnel.de> References: <482B00A2.6010707@student.matnat.uio.no> <64733.194.114.62.39.1210781099.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <482B1A23.60904@student.matnat.uio.no> <482B4C77.3030105@behnel.de> Message-ID: <482BF3AF.7070101@student.matnat.uio.no> I've gone full circle. The more I think about it, the more these tests seems like unit tests (and I'm sorry that I was unclear about that). They are unit tests because: - They try to excersize only isolated functionality within the transform - Invoking the full Cython compiler (which I currently do not do) would be a disadvantage rather than an advantage They may not be considered unit tests because: - I "cheat" (but this is only to save time) and rather than constructing a real mock input from scratch I make use of the Cython parser (and potentially a limited subset of pre-transforms) to get the input tree. However, I really use the parser (rather directly) as a test utility, not as part of a compilation as such. Invoking cython.py is out of the question. In some cases, it might be impossible to cheat (the code tree needs info that can't be written in syntax) and in those case I'll be constructing trees manually. So I'll put them in Cython/Compiler/Transforms/Tests (using Python functions and doctests I imagine), if that is ok. 
-- Dag Sverre From dagss at student.matnat.uio.no Thu May 15 10:49:39 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 10:49:39 +0200 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> Message-ID: <482BF923.9090902@student.matnat.uio.no> >> ("cdef object s = arr.shape" should work though, but I consider that >> GSoC-stuff, not something you can fake at this stage.) > > Doesn't that work now? Well, it works because the C-version is called "dimensions". But "cdef object s = arr.strides" does not work ("Cannot convert 'numpy.Py_intptr_t *' to Python object"); so it's "namespace-fragile" behaviour. > No, I'm seeing things going the other way. obj.attr is viewed as a > cdef attribute (if it exists) any only as a python attribute if it is > used in a python context and can't be coerced. (One would implement > the shape property to return the appropriate object.) The "raw > access" is always available (assuming obj is typed correctly) and > there is no need or motivation for a separate API. Yes, I can see the benefits. And again my problem is that this can only really work if raw C API is similar to the Python API. And since this is a /language design/ issue, I think that is too big an assumption for something to build it so heavily into the language core. (But, it is a matter of taste; and it has a certain appeal that might change my mind after thinking more about it.) Consider a "FsAsObj" extension class: >>> a = FsAsObj() >>> print a.usr.include On the other hand, you might have something like an "fshandle" attribute in the extension class, C-side (not meant for Python users to see). 
So perhaps you'd like to do: cdef FsAsObj d = FsAsObj() d = d.usr.include call_native_c_function(d.fshandle) Then I do $ sudo mkdir /fshandle and want to access it... *grin* (ok, contrived, but again, this is language design, and for good language design issues the criteria for stability and lack of exceptions and ad-hoc-ness should be pretty high. Let's see if we can find a (new, creative) way to satisfy us both... I'll sign off this discussion now though, perhaps I'm convinced after a week of it sinking in. -- Dag Sverre From greg.ewing at canterbury.ac.nz Thu May 15 10:52:30 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 15 May 2008 20:52:30 +1200 Subject: [Cython] ANN: Pyrex 0.9.8 Message-ID: <482BF9CE.2070100@canterbury.ac.nz> Pyrex 0.9.8 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ This version has a number of new features: * In-place operators (+= etc.) are now supported. * The Pyrex compiler now has built-in facilities for automatically tracking down and compiling all the modules a given module depends on, and for only compiling modules whose source has changed. * Some restrictions on nogil functions have been relaxed. In particular, it's now possible to declare a C method as nogil. * A feature has been added that makes it easier to manage circular imports between .pxd files. It's now possible to declare two extension types in different .pxd files that refer to each other. What is Pyrex? -------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. 
From dagss at student.matnat.uio.no Thu May 15 11:05:28 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 11:05:28 +0200 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> Message-ID: <482BFCD8.5030208@student.matnat.uio.no> >> I don't see a need for the in-between crossbreed. > > Cpdef is the ultimite crossbreed, and was very welcome. In fact, I > would say the whole point of Python/Cython is an in-between > crossbreed between C and Python. Or maybe I'm misinterpreting you here. One final remark: Don't get me wrong, I love cpdef! The reason I don't mind cpdef is that it doesn't have any *interface* issues. In principle, any cpdef will guarantee a 1:1 match between the Python interface and C interface. The NumPy ndarray object is written in C, outside of Cython control, and their contract with the users are in the Python interface, so the C struct is free to diverge as much or little as they like from the Python interface. In principle, you cannot know anything about the C interface of such extension types. I like to consider my NumPy project as an inlineable reimplementation of (a small subset of) the Python interface for speed purposes; it is in turn a *user* of the C interface. I think it is cleaner and will create less confusion and more readable numpy.pxd code if that layering is explicit. Your approach will only work in-so-far as the C implementation lies very close to the Python interface. Which is true for NumPy (just a few renames needed), but not in principle. 
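
The layering argued for here can be sketched in plain Python: a Python-level `shape` built on top of raw `nd` and `dimensions` fields, in the spirit of the `make_tuple_from(raw.dimensions, raw.nd)` idea earlier in this thread. This is only an illustration under stated assumptions; `RawArray` and `NDArray` are hypothetical stand-ins for `PyArrayObject` and `numpy.ndarray`, and real code would be Cython working on the C struct rather than Python objects.

```python
class RawArray:
    """Hypothetical stand-in for the C-level PyArrayObject struct."""

    def __init__(self, dimensions):
        # Plays the role of `Py_intptr_t *dimensions`.
        self.dimensions = list(dimensions)
        # Plays the role of `int nd`.
        self.nd = len(self.dimensions)


class NDArray:
    """Hypothetical Python-level interface layered on RawArray."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def shape(self):
        # The Python interface is a *user* of the raw fields: it builds
        # a tuple from (dimensions, nd) instead of exposing them directly.
        raw = self._raw
        return tuple(raw.dimensions[:raw.nd])


arr = NDArray(RawArray([3, 4, 5]))
print(arr.shape)  # (3, 4, 5)
```

In this arrangement the raw names (`dimensions`, `nd`) and the Python names (`shape`) never have to match, which is the point being made about C implementations that diverge from their Python interface.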
-- Dag Sverre From robertwb at math.washington.edu Thu May 15 11:09:22 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 02:09:22 -0700 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <482BF923.9090902@student.matnat.uio.no> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> <482BF923.9090902@student.matnat.uio.no> Message-ID: <6066F20D-BEFB-4C2D-A72C-62E2E173B778@math.washington.edu> On May 15, 2008, at 1:49 AM, Dag Sverre Seljebotn wrote: > >>> ("cdef object s = arr.shape" should work though, but I consider that >>> GSoC-stuff, not something you can fake at this stage.) >> >> Doesn't that work now? > > Well, it works because the C-version is called "dimensions". But "cdef > object s = arr.strides" does not work ("Cannot convert > 'numpy.Py_intptr_t *' to Python object"); so it's "namespace-fragile" > behaviour. > >> No, I'm seeing things going the other way. obj.attr is viewed as a >> cdef attribute (if it exists) any only as a python attribute if it is >> used in a python context and can't be coerced. (One would implement >> the shape property to return the appropriate object.) The "raw >> access" is always available (assuming obj is typed correctly) and >> there is no need or motivation for a separate API. > > Yes, I can see the benefits. > > And again my problem is that this can only really work if raw C API is > similar to the Python API. And since this is a /language design/ > issue, > I think that is too big an assumption for something to build it so > heavily into the language core. (But, it is a matter of taste; and it > has a certain appeal that might change my mind after thinking more > about > it.) You are right that this is a language design issue. If library authors want completely separate API and names for C and Python, they are still free to do so. 
Right now they are forced to do so, which I think is a deficiency. > > Consider a "FsAsObj" extension class: > >>>> a = FsAsObj() >>>> print a.usr.include > > > On the other hand, you might have something like an "fshandle" > attribute > in the extension class, C-side (not meant for Python users to see). So > perhaps you'd like to do: > > cdef FsAsObj d = FsAsObj() > d = d.usr.include > call_native_c_function(d.fshandle) > > Then I do > > $ sudo mkdir /fshandle > > and want to access it... *grin* As contrived as this is, it doesn't contradict the model above at all. > (ok, contrived, but again, this is > language design, and for good language design issues the criteria for > stability and lack of exceptions and ad-hoc-ness should be pretty > high. > Let's see if we can find a (new, creative) way to satisfy us both... +1 > > I'll sign off this discussion now though, perhaps I'm convinced > after a > week of it sinking in. > > -- > Dag Sverre > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From greg.ewing at canterbury.ac.nz Thu May 15 11:03:31 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 15 May 2008 21:03:31 +1200 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <200805151211.12584.kirr@mns.spb.ru> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> Message-ID: <482BFC63.2050804@canterbury.ac.nz> Kirill Smelkov wrote: > I think this is going to be trouble at least when interfacing with external > C++ code and libraries - at least at link time. You're right, it would break C++ interfacing in a big way. So I guess I'll just have to bite the bullet and provide proper const support. 
A pity, because it would have been nice to spare Pyrex users from the tyranny of constness. -- Greg From dagss at student.matnat.uio.no Thu May 15 11:20:52 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 11:20:52 +0200 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <482BF923.9090902@student.matnat.uio.no> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> <482BF923.9090902@student.matnat.uio.no> Message-ID: <482C0074.6040606@student.matnat.uio.no> > cdef FsAsObj d = FsAsObj() > d = d.usr.include > call_native_c_function(d.fshandle) > > Then I do > > $ sudo mkdir /fshandle No, actually you're right. With the addition of your type behaviour suggestions one could do cdef FsAsObj mydir = d.fshandle # gets "/fshandle" cdef int myhandle = d.fshandle # gets underlying file handle which, when I look harder at it, isn't all that horrible. "Attribute access overloading"... Perhaps such type-dependant behaviour would be a default mode; while features could be added like (when needed/if worth it/if I get time): cdef cython.python_interface(FsAsObj) mydir... cdef cython.c_interface(FsAsObj) mydir... as additional access-modes for purists and contrived examples. (Heh, promising to shut up is a promise that's hard to keep..) 
-- Dag Sverre From robertwb at math.washington.edu Thu May 15 11:22:24 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 02:22:24 -0700 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <482C0074.6040606@student.matnat.uio.no> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> <482BF923.9090902@student.matnat.uio.no> <482C0074.6040606@student.matnat.uio.no> Message-ID: <9AEAD986-FA08-4163-840B-33F21B115698@math.washington.edu> On May 15, 2008, at 2:20 AM, Dag Sverre Seljebotn wrote: >> cdef FsAsObj d = FsAsObj() >> d = d.usr.include >> call_native_c_function(d.fshandle) >> >> Then I do >> >> $ sudo mkdir /fshandle > > No, actually you're right. With the addition of your type behaviour > suggestions one could do > > cdef FsAsObj mydir = d.fshandle # gets "/fshandle" > cdef int myhandle = d.fshandle # gets underlying file handle > > which, when I look harder at it, isn't all that horrible. "Attribute > access overloading"... > > Perhaps such type-dependant behaviour would be a default mode; while > features could be added like (when needed/if worth it/if I get time): > > cdef cython.python_interface(FsAsObj) mydir... > cdef cython.c_interface(FsAsObj) mydir... > > as additional access-modes for purists and contrived examples. Well, (foo) is always the Python interface, so we've already got that covered in case one really wants it. > (Heh, promising to shut up is a promise that's hard to keep..) For me too :). 
- Robert From greg.ewing at canterbury.ac.nz Thu May 15 11:16:29 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 15 May 2008 21:16:29 +1200 Subject: [Cython] about numpy, c/python namespaces and name clashes, and some tricks ; -) In-Reply-To: <482BFCD8.5030208@student.matnat.uio.no> References: <482BE757.9000500@student.matnat.uio.no> <772C53FA-5C52-4D37-96D2-80321DCC097F@math.washington.edu> <482BFCD8.5030208@student.matnat.uio.no> Message-ID: <482BFF6D.6070109@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > Your approach will only work in-so-far as the C implementation lies very > close to the Python interface. Which is true for NumPy (just a few > renames needed), but not in principle. Keep in mind that anything in the C interface can be renamed using C name declarations, so it's possible to provide both Pythonic and C-level interfaces at the same time, with whatever combination of naming is desired. -- Greg From kirr at mns.spb.ru Thu May 15 13:13:53 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Thu, 15 May 2008 15:13:53 +0400 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <482BF9CE.2070100@canterbury.ac.nz> References: <482BF9CE.2070100@canterbury.ac.nz> Message-ID: <200805151513.54003.kirr@mns.spb.ru> On Thursday 15 May 2008, Greg Ewing wrote: > Pyrex 0.9.8 is now available: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Greg, I'd like to ask: why don't you host a normal Mercurial repository for Pyrex? Is that intentional, or just because you don't have a place to host it? With a normal Mercurial repository up and running, like http://hg.sympy.org/sympy/ anyone could see what the development status is, monitor its changes, try the latest version, write patches against the bleeding edge (rather than the latest release), etc. It is just convenient, isn't it? Kirill.
From kirr at mns.spb.ru Thu May 15 13:16:10 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Thu, 15 May 2008 15:16:10 +0400 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <482BFC63.2050804@canterbury.ac.nz> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> Message-ID: <200805151516.10382.kirr@mns.spb.ru> On Thursday 15 May 2008, Greg Ewing wrote: > Kirill Smelkov wrote: > > > I think this is going to be trouble at least when interfacing with external > > C++ code and libraries - at least at link time. > > You're right, it would break C++ interfacing in a big > way. > So I guess I'll just have to bite the bullet and provide > proper const support. A pity, because it would have been > nice to spare Pyrex users from the tyranny of constness. A pity indeed. On the other hand, let's see what it will be, and what would be the effort to adapt existing sources to const-aware Pyrex/Cython. So, looking forward! Thanks, Kirill. From ravi_lanka at acusim.com Thu May 15 15:08:37 2008 From: ravi_lanka at acusim.com (Ravi Lanka) Date: Thu, 15 May 2008 09:08:37 -0400 Subject: [Cython] [Pyrex] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <482BFC63.2050804@canterbury.ac.nz> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> Message-ID: <482C35D5.8050005@acusim.com> Greg Ewing wrote: > Kirill Smelkov wrote: > > >> I think this is going to be trouble at least when interfacing with external >> C++ code and libraries - at least at link time. >> > > You're right, it would break C++ interfacing in a big > way. > > So I guess I'll just have to bite the bullet and provide > proper const support.
A pity, because it would have been > nice to spare Pyrex users from the tyranny of constness. > > I have been using pyrex to interface C as well as C++ code. I use the following trick to handle const (mostly because of the cpp code):

foo.h

    void foo( const int* p) { ... }

mydefines.h

    #define ConstInt const int

foo.pyx

    cdef extern from "mydefines.h":
        ctypedef int ConstInt

    cdef extern from "foo.h":
        void foo(ConstInt * p)

Without the above trick, I had to manually edit the cpp file that got generated by pyrex. hope this helps, Ravi From dalcinl at gmail.com Thu May 15 16:31:27 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 11:31:27 -0300 Subject: [Cython] status update on Py3 support In-Reply-To: <482B8321.8020307@canterbury.ac.nz> References: <482A0D10.1090603@behnel.de> <482B8321.8020307@canterbury.ac.nz> Message-ID: On 5/14/08, Greg Ewing wrote: > Lisandro Dalcin wrote: > > > So, should Cython save (in the Py3 case) > > byte strings in their internal table? > > > There are two separate things going on here: > > 1) The reason for keeping string literals in a table is so > that you don't have to create a new string object every > time they're used. This would seem to apply to bytes just > as much as strings. > > 2) Interning of strings that are likely to be used in > dynamic name lookups. Since all names are strings, this > only applies to strings, not bytes. > > > > At this point, I'm not sure if merging the string tables was a good > > idea. > > > I don't even see how you *can* merge them, since strings > and bytes are completely different types in py3k. Indeed, I agree with you. Then, tell me your opinion about this (in the Py3 case): - Identifiers and (unicode) string literals can (and should) be managed in the same table. This way, 'a.foo' will be as efficient as 'getattr(a, "foo")' - Byte strings are completely different guys, so they should be managed in a way similar to how integer literals are.
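The distinction drawn above (in Py3, names are always text strings, while byte strings are a separate type) can be checked in plain Python 3. This is only a sketch: the class `Holder` and the two helper functions are hypothetical names made up for illustration, and it says nothing about how Cython's string tables are actually laid out.

```python
import sys


class Holder:
    """Hypothetical object with one attribute to look up by name."""
    foo = 42


a = Holder()

# Attribute names are text: a.foo and getattr(a, "foo") find the same value.
same_value = (a.foo == getattr(a, "foo"))

# Interning is a text-string facility; sys.intern returns the canonical copy.
# The name is assembled at run time so it is not a compile-time literal.
runtime_name = "".join(["f", "o", "o"])
interned_name = sys.intern(runtime_name)


def lookup_fails_with_bytes(obj, name):
    """Return True if a bytes name is rejected for attribute lookup."""
    try:
        getattr(obj, name)
    except TypeError:
        return True
    return False


def intern_fails_with_bytes(value):
    """Return True if sys.intern refuses a bytes object."""
    try:
        sys.intern(value)
    except TypeError:
        return True
    return False


bytes_rejected = lookup_fails_with_bytes(a, b"foo")
bytes_not_internable = intern_fails_with_bytes(b"foo")
print(same_value, bytes_rejected, bytes_not_internable)
```

Under Python 3 both the lookup and the interning reject the bytes value, which is consistent with keeping interning for text names only.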
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu May 15 16:33:28 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 11:33:28 -0300 Subject: [Cython] Profiling a Cython module In-Reply-To: <482B4AA5.7020201@cassee.net> References: <482B4AA5.7020201@cassee.net> Message-ID: I believe Python core, if configured appropriately, can still profile C functions in extension modules. I have never done that myself, though. On 5/14/08, Joost Cassee wrote: > Hi all, > After having profiled a Python app I decided to convert some modules to > Cython. Is it possible to profile these modules that are now Cython? I guess > it is not quite a Cython question because it will probably involve gprof, > but I thought some people on this list might have done this before.
> > > Regards, > > Joost > > -- > Joost Cassee > http://joost.cassee.net > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > > > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From jek-gmane at kleckner.net Thu May 15 17:12:14 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Thu, 15 May 2008 08:12:14 -0700 Subject: [Cython] [Pyrex] Language stability In-Reply-To: References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: Robert Bradshaw wrote: > I'm -1 for having lots of multiple ways to do for loops (including > that list of PEPs--we're already up to 3). Also, "from" makes it > clear that this is a special cython loop--consider the following:
>
> x = 1
>
> class A:
>     def __gt__(self, other):
>         return range(3,7)
>
> for x in 0 <= x < A():
>     print x
>
> This is valid Python (prints 3, 4, 5, 6), and would act completely > differently under your proposal. This seems to me to demonstrate quite well that the newer syntax is less desirable than the old.
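The quoted example can be checked under Python 3 once the print statement is adapted. A minimal sketch, assuming nothing beyond the class `A` from the quote; the wrapper function `run_loop` and the list `seen` are added here purely for illustration:

```python
class A:
    # A rich comparison may return any object, not only a bool.
    def __gt__(self, other):
        return range(3, 7)


def run_loop():
    x = 1
    seen = []
    # The chained comparison means (0 <= x) and (x < A()).
    # 0 <= 1 is True, so the result is the value of `1 < A()`.
    # int does not know how to compare with A, so Python falls back
    # to the reflected method A.__gt__, which returns range(3, 7).
    for x in 0 <= x < A():
        seen.append(x)
    return seen


print(run_loop())
```

The iterable is evaluated once, before the loop rebinds `x`, so the loop simply walks the range returned by `__gt__`.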
From dalcinl at gmail.com Thu May 15 17:23:19 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 12:23:19 -0300 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <482BFC63.2050804@canterbury.ac.nz> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> Message-ID: On 5/15/08, Greg Ewing wrote: > Kirill Smelkov wrote: > So I guess I'll just have to bite the bullet and provide > proper const support. A pity, because it would have been > nice to spare Pyrex users from the tyranny of constness. I do not agree with you here. Constness is not a tyranny for me (nor are immutable objects like tuples and strings in Python). The real issue is that Python does not use 'const' in a consistent way. Moreover, I believe that Cython/Pyrex should gain support for 'const' in a primitive way and, beyond that, generate an error if user code tries to modify const data. So if you do

    cdef const char *p = "abc"
    p[0] = 0

Cython/Pyrex complains as a C compiler would. When I have time, I'll try to look at all this. We should even handle the casting from const to non-const in the C++ case with const_cast<> instead of (type*) using macros. This is the way SWIG did it, and it works just fine. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From jim-crow at rambler.ru Thu May 15 17:24:44 2008 From: jim-crow at rambler.ru (Anatoly A.
Kazantsev) Date: Thu, 15 May 2008 22:24:44 +0700 Subject: [Cython] defining module constants In-Reply-To: <4B56CD7D-C7F1-4284-8060-47355D5EF338@math.washington.edu> References: <20080512010743.5a583b3c.jim-crow@rambler.ru> <07C7DAC6-38D0-43DE-8145-A82366BBD2F2@math.washington.edu> <20080513125523.b320584f.jim-crow@rambler.ru> <2F792A93-8B06-4F00-8299-4D11C580DAA4@math.washington.edu> <20080513144911.ac42579e.jim-crow@rambler.ru> <20080513232822.b3589d24.jim-crow@rambler.ru> <4B56CD7D-C7F1-4284-8060-47355D5EF338@math.washington.edu> Message-ID: <20080515222444.d60cb60e.jim-crow@rambler.ru> On Wed, 14 May 2008 18:28:53 -0700 Robert Bradshaw wrote: > On May 13, 2008, at 9:28 AM, Anatoly A. Kazantsev wrote: > > > On Tue, 13 May 2008 14:49:11 +0700 > > "Anatoly A. Kazantsev" wrote: > > > > I have next code and it's not working properly. > > > > In foo.h somebody wrote:
> >
> > #define BAR 1
> >
> > then I want to define module constant with same name and value:
> >
> > cdef export from "foo.h":
> >     ctypedef enum:
> >         _BAR "BAR"
> >
> > cdef public enum:
> >     BAR = _BAR
> >
> > In python BAR will have 0 not 1. And actually 0 is index number of > > BAR in the enum. > > Looks like value of _BAR is not applied to BAR in the enum. > > > > Maybe I do something wrong. > > Hmm... I am unable to reproduce this--it works for me. Anyone else? > > - Robert Sorry, that's my fault. I tried to extern from the wrong .h file. Sorry. All works fine! -- Anatoly A. Kazantsev Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080515/7f477723/attachment.pgp From stefan_ml at behnel.de Thu May 15 18:55:58 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 15 May 2008 18:55:58 +0200 Subject: [Cython] from __future__ import ... Message-ID: <482C6B1E.10003@behnel.de> Hi, just a quick note that I added support for from __future__ import ... in the Py3 branch. Currently, there is only one supported feature "unicode_literals", which changes all string literals in a source file that do not have a 'b' prefix (or 'r', although that differs from Python) into unicode strings. Stefan From dalcinl at gmail.com Thu May 15 20:22:53 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 15:22:53 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482C6B1E.10003@behnel.de> References: <482C6B1E.10003@behnel.de> Message-ID: Stefan, then tell me something.. Iff the 'from __future__ ...' is NOT used, and a in a pxy file I do a = "abc" What the type of 'a' will be if I compile and run the generated C source in a Python 3 runtime environment? On 5/15/08, Stefan Behnel wrote: > Hi, > > just a quick note that I added support for > > from __future__ import ... > > in the Py3 branch. Currently, there is only one supported feature > "unicode_literals", which changes all string literals in a source file that do > not have a 'b' prefix (or 'r', although that differs from Python) into unicode > strings. 
> > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Thu May 15 20:41:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 15 May 2008 20:41:26 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> Message-ID: <482C83D6.4030401@behnel.de> Hi, Lisandro Dalcin wrote: > Iff the 'from __future__ ...' is NOT used, and in a pyx file I do > > a = "abc" > > What will the type of 'a' be if I compile and run the generated C > source in a Python 3 runtime environment? I'm currently reconsidering that, and I'm still not sure. We could simply enable all __future__ features by default that the runtime environment enables (and that we support in Cython, obviously). That's a straightforward thing to do and it wouldn't be any different from the portability of Python code... What do the others think? Stefan From robertwb at math.washington.edu Thu May 15 20:48:35 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 11:48:35 -0700 Subject: [Cython] Profiling a Cython module In-Reply-To: References: <482B4AA5.7020201@cassee.net> Message-ID: <77810601-E054-4EF1-A71E-CACBAAB73F58@math.washington.edu> On May 15, 2008, at 7:33 AM, Lisandro Dalcin wrote: > I believe Python core, if configured appropriately, can still profile > C functions in extension modules. But I never did that. > > On 5/14/08, Joost Cassee wrote: >> Hi all, >> After having profiled a Python app I decided to convert some >> modules to >> Cython.
Is it possible to profile these modules that are now >> Cython? I guess >> it is not quite a Cython question because it will probably involve >> gprof, >> but I though some people on this might have done this before. You can still use the same profiling tools, it just won't break them down the cdef functions individually (though the total times should still be correct). - Robert From joost at cassee.net Thu May 15 21:04:34 2008 From: joost at cassee.net (Joost Cassee) Date: Thu, 15 May 2008 21:04:34 +0200 Subject: [Cython] Profiling a Cython module In-Reply-To: <77810601-E054-4EF1-A71E-CACBAAB73F58@math.washington.edu> References: <482B4AA5.7020201@cassee.net> <77810601-E054-4EF1-A71E-CACBAAB73F58@math.washington.edu> Message-ID: <482C8942.9070406@cassee.net> On 15-05-08 20:48, Robert Bradshaw wrote: > On May 15, 2008, at 7:33 AM, Lisandro Dalcin wrote: > >> I believe Python core, if configured appropriatelly, can still profile >> C functions in extension modules. But I never did that. > > You can still use the same profiling tools, it just won't break them > down the cdef functions individually (though the total times should > still be correct). The cdef functions are just what I am interested in. I found a short thread on the Python mailing list on profiling C extensions that says it should work: http://mail.python.org/pipermail/python-list/2005-October/345192.html Now I'm going to find out whether shared objects can be profiled (some webpages indicate 'no', but the manpage for my current gprof does not say anything about the subject). Regards, Joost -- Joost Cassee http://joost.cassee.net -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 544 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080515/bfebdfde/attachment.pgp From dagss at student.matnat.uio.no Thu May 15 21:16:33 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 21:16:33 +0200 (CEST) Subject: [Cython] from __future__ import ... In-Reply-To: <482C83D6.4030401@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> Message-ID: <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> Stefan wrote: > Hi, > > Lisandro Dalcin wrote: >> Iff the 'from __future__ ...' is NOT used, and a in a pxy file I do >> >> a = "abc" >> >> What the type of 'a' will be if I compile and run the generated C >> source in a Python 3 runtime environment? > > I'm currently reconsidering that, and I'm still not sure. We could simply > enable all __future__ features by default that the runtime environment > enables > (and that we support in Cython, obviously). That's a straight forward > thing to > do and it wouldn't be any different from the portability of Python code... > > What do the others think? I think: - The default "Python language level" for the pyx source when Cython is run should be that of the interpreter Cython is launched within. I.e, if cython.py is launched in Python 2.6, then "from __future__ import with" is enabled by default (to take one example that I know about). I think this is most likely to match user expectations. Command-line switches to Cython should be able to override this though. - Once compiled to C...the nice, explicit thing to do would be that once it hits C level, the module acts externally in the same way (same types of objects are created) no matter the C compilation environment. 
This means that if you have a pyx file, and you want the exact same non-conditional source to give old-style str in Python 2 and new-style str in Python 3, you have to invoke cython twice with different command-line arguments and create two C files (basically, the pyx means different things each time it is compiled, as different language levels are enabled for it each time). I think one can live with that, people should just use unicode everywhere anyway. But this is something I'm not sure of :-) Dag Sverre From stefan_ml at behnel.de Thu May 15 21:39:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 15 May 2008 21:39:08 +0200 Subject: [Cython] status update on Py3 support In-Reply-To: References: <482A0D10.1090603@behnel.de> <482B8321.8020307@canterbury.ac.nz> Message-ID: <482C915C.2070802@behnel.de> Hi, Lisandro Dalcin wrote: > - Indentifiers and (unicode) string literals can (and should) managed > in the same table. This way, 'a.foo' will be as efficient as > 'getattr(a, "foo")' > > - Byte strings are completelly different guys, so they should be > managed in a way similar to integer literals are. Cython now uses one list, but keeps the information if the string was - a unicode string - an identifier - decided to be interned and then the runtime setup code 'does the right thing' depending on the compile time environment. Stefan From dalcinl at gmail.com Thu May 15 21:42:50 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 16:42:50 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> Message-ID: On 5/15/08, Dag Sverre Seljebotn wrote: > I think: > > - The default "Python language level" for the pyx source when Cython is > run should be that of the interpreter Cython is launched within. I definitelly disagree with you. 
Cython 'pyx' has the chance of being more backward compatible even than source 'py' files. I would instead propose the following: * Iff 'from __future__ import unicode_literals' is issued, then Cython should generate unicode strings REGARDLESS of the Python version the generated C is compiled against. In Py2.X, that would be 'unicode' type, and in Py3, 'str' type. * Iff 'from __future__ import unicode_literals' is NOT issued, then at compile time, Cython should create strings as is the default in the compile-time Python version, that is, (byte) 'str' type as in Py2, or (unicode) 'str' type as in Py3. * Iff string literals are prefixed with 'b' as in b"abc", then Cython should create at compile time a (byte) 'str' type object in the case of Py2.X (note: Python2.6 already does this), and a 'bytes' instance in the case of Py3. Now the question is: can this be done? Does this make sense? Perhaps I'm still confused about the whole thing. In that case, please ignore me. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu May 15 21:57:02 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 16:57:02 -0300 Subject: [Cython] status update on Py3 support In-Reply-To: <482C915C.2070802@behnel.de> References: <482A0D10.1090603@behnel.de> <482B8321.8020307@canterbury.ac.nz> <482C915C.2070802@behnel.de> Message-ID: On 5/15/08, Stefan Behnel wrote: > Cython now uses one list, but keeps the information if the string was > > - a unicode string > - an identifier > - decided to be interned So in the Py3 case the list can have either 'str' instances or 'bytes' instances, right?
At first glance, I have no strong objections to this, though (for some not-easy-to-explain reason) my mind believes Py3 byte strings should be managed separately. Greg, what do you think? You said that Cython/Pyrex string tables cannot contain a mix of 'str' and 'bytes' instances, but I do not see where the actual problem is. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Thu May 15 22:03:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 15 May 2008 22:03:29 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> Message-ID: <482C9711.7000207@behnel.de> Hi, Lisandro Dalcin wrote: > I definitely disagree with you. Cython 'pyx' has the chance of being > more backward compatible even than source 'py' files. That's why I'm asking. There are reasons for and against this idea. > I would instead propose the following: > > * Iff 'from __future__ import unicode_literals' is issued, then Cython > should generate unicode strings REGARDLESS of the C-generated compile > time Python version. In Py2.X, that would be 'unicode' type, and in > Py3, 'str' type. Sure. > * Iff 'from __future__ import unicode_literals' is NOT issued, then at > compile time, Cython should create strings as is the default in the > compile-time Python version, that is, (byte) 'str' type as in Py2, or > (unicode) 'str' type as in Py3. This contradicts what you said above.
If you want source compatibility, you can't change the semantics based on the compile time environment - except for the cases where the runtime environments really differ (such as byte/unicode identifiers). Imagine you had some latin-1 encoded XML byte literal in your code. In Py2, under your proposal, this would become a byte string that can be parsed. In Py3, however, this would suddenly become a unicode string and the parser would refuse to handle it, as it's no longer ISO encoded. > * Iff string literals are prefixed with 'b' as in b"abc", then Cython > sould create at compile time a (byte) 'str' type object in the case of > Py2.X (note: Python2.6 already does this), and a 'bytes' instance in > the case of Py3. That's the right thing to do. Stefan From dagss at student.matnat.uio.no Thu May 15 22:29:56 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 15 May 2008 22:29:56 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> Message-ID: <482C9D44.8070007@student.matnat.uio.no> Lisandro Dalcin wrote: > On 5/15/08, Dag Sverre Seljebotn wrote: >> I think: >> >> - The default "Python language level" for the pyx source when Cython is >> run should be that of the interpreter Cython is launched within. > > I definitelly disagree with you. Cython 'pyx' has the chance of being > more backward compatible even that source 'py' files. Do you have a real example where this would be genuinely useful? 1) If you are only creating a wrapper around simple C source, then you need to know very well whether you are dealing with unicode or bytes anyway (and encode explicitly to the right representation). 2) If you are doing anything more advanced, odds are you are using the standard library anyway, and then your code will more than likely break between Python 2 and 3 anyway (from what I hear). 
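Point 1 above (knowing whether a value is text or bytes, and encoding explicitly) can be illustrated in plain Python 3. A sketch only: the sample document, the variable names, and the use of the standard library's ElementTree are assumptions chosen to echo the latin-1 XML example earlier in the thread, not code from any project discussed here.

```python
import xml.etree.ElementTree as ET

# Text form of a small document that declares ISO-8859-1 as its encoding.
xml_text = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Dalc\u00edn</name>'

# Explicit encoding step: this is the byte representation a C library sees.
xml_bytes = xml_text.encode("iso-8859-1")

# The accented letter is one byte in latin-1 but two bytes in UTF-8,
# so the two byte representations are not interchangeable.
latin1_length = len(xml_bytes)
utf8_length = len(xml_text.encode("utf-8"))

# A parser given the bytes honours the declared encoding.
parsed_name = ET.fromstring(xml_bytes).text

# Decoding with the same codec recovers the original text.
round_trip = xml_bytes.decode("iso-8859-1")

print(latin1_length, utf8_length, ascii(parsed_name))
```

The same literal therefore means different things depending on whether it is treated as bytes or as text, which is why the choice has to be made explicitly in the source.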
My very subjective and unfounded feelings: Creating some super-language which supports both 2 and 3 completely transparently is likely going to either have too few features to be useful, or be such an effort that it's not worth it. Having Cython output Python 3 compileable C code (from current Cython/Python 2) is however a great way of enabling people to write Python 3-specific Cython code before we get to the stage where there is Python 3 syntax support in Cython itself. (BTW, can we find some standard terminology for talking about this? Something like "Cython": pyx -> c stage; then it is a c-file and not Cython, and then "C compilation": c -> executable stage.) -- Dag Sverre From dalcinl at gmail.com Thu May 15 22:28:00 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 17:28:00 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482C9711.7000207@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9711.7000207@behnel.de> Message-ID: On 5/15/08, Stefan Behnel wrote: > > * Iff 'from __future__ import unicode_literals' is NOT issued, then at > > compile time, Cython should create strings as is the default in the > > compile-time Python version, that is, (byte) 'str' type as in Py2, or > > (unicode) 'str' type as in Py3. > > This contradicts what you said above. If you want source compatibility, you > can't change the semantics based on the compile time environment - except for > the cases where the runtime environments really differ (such as byte/unicode > identifiers). Imagine you had some latin-1 encoded XML byte literal in your > code. In Py2, under your proposal, this would become a byte string that can be > parsed. In Py3, however, this would suddenly become a unicode string and the > parser would refuse to handle it, as it's no longer ISO encoded. > Well, depite being a Spanish speaker, my mind still lives in the ASCII world. 
I was actually thinking about string literals with pure-ASCII characters. My intention is simply that if I write this in a 'pyx' file meth = "__call__" if hasattr(someobj, meth): do_something(...) then this code should work as expected in both Py2.X and Py3. Does this make sense? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu May 15 23:09:20 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 18:09:20 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482C9D44.8070007@student.matnat.uio.no> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> Message-ID: On 5/15/08, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: > > On 5/15/08, Dag Sverre Seljebotn wrote: > >> I think: > >> > >> - The default "Python language level" for the pyx source when Cython is > >> run should be that of the interpreter Cython is launched within. > > > > I definitely disagree with you. Cython 'pyx' has the chance of being > > more backward compatible even than source 'py' files. > > > Do you have a real example where this would be genuinely useful? Perhaps I'm just asking for too much magic from Cython's side.
But I already have a working project wrapping the MPI specification (a C API with more than 500 C definitions, counting functions and constants), and in the current state of cython-devel-py3k I am able to generate C source files that compile and run against Python 2.3 through Python 3.0. Moreover, I can distribute a single set of generated C sources (I do not want users to have to install Cython to build my package), and they build and the final extension module works just fine. I would definitely love for this degree of compatibility to be maintained, not only for the ease of maintaining my project; I am also pretty sure that other developers using Cython for serious work would REALLY appreciate it. > My very subjective and unfounded feelings: Creating some super-language > which supports both 2 and 3 completely transparently is likely going to > either have too few features to be useful, or be such an effort that > it's not worth it. Well, I would love to see the Cython language move closer to the Python 3 language, simply because Py3 is a better language. Once Cython can parse Py3 syntax, I will readily update my code to that syntax, but I would still require, and work for, the generated C sources to work on at least Python 2.4, and preferably on 2.3 as well. I have users of my code running parallel applications on clusters with dated systems providing Py2.3; I would really like to avoid asking those users to build Python from source (plus the other packages they depend on) just because I like to be on a bleeding-edge Cython that does not support C-level compatibility with Python 2.X. > Having Cython output Python 3 compileable C code > (from current Cython/Python 2) is however a great way of enabling people > to write Python 3-specific Cython code before we get to the stage where > there is Python 3 syntax support in Cython itself. Indeed. Furthermore, IMHO Cython support for Python 3 syntax could be even extended.
Python 3 does not let you write u"abc" for a unicode literal. Should we really have this restriction in Cython? Unless there is a good reason against it, I would ask for this to be accepted. > (BTW, can we find some standard terminology for talking about this? > Something like "Cython": pyx -> c stage; then it is a c-file and not > Cython, and then "C compilation": c -> executable stage.) Agreed. Let's use "Cython compilation" for the pyx -> c stage, and "C compilation" for the c -> so stage (or whatever file extension extension modules have) -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri May 16 01:00:21 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 15 May 2008 20:00:21 -0300 Subject: [Cython] cython-devel-py3: __future__ in Future.py Message-ID: Stefan, I really do not understand why you are implementing Future.py in terms of the __future__ module; such an approach will prevent running the Cython compiler with Python versions older than 2.6. As I'm not sure what your original intention was, I tried this (note that I used 0 (zero) for the flag, just in case). Attached is my version of Future.py -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed...
Name: Future.py Type: text/x-python Size: 391 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080515/96486969/attachment.py From greg.ewing at canterbury.ac.nz Fri May 16 01:58:45 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 11:58:45 +1200 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <200805151513.54003.kirr@mns.spb.ru> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> Message-ID: <482CCE35.8030900@canterbury.ac.nz> Kirill Smelkov wrote: > Greg, I'd like to ask you about > > Why don't you host normal Mercurial repository for Pyrex? For a number of reasons, it wouldn't be convenient for me to have the definitive repository on a remote site. I'm open to the idea of having a mirror site somewhere that is kept up to date, although I don't have anywhere to put one at the moment. In the meantime, I'm making Mercurial bundles available to anyone who wants to maintain their own repository. Something I'd like to point out is that there isn't usually any "bleeding edge" that's significantly different from the last released version. I have bursts of working on Pyrex, and any changes I make almost always turn up as a new release very soon after I've made them. It's not as if I'm sitting on things for months before releasing them, if that's what anyone is thinking. 
-- Greg From greg.ewing at canterbury.ac.nz Fri May 16 02:14:52 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 12:14:52 +1200 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> Message-ID: <482CD1FC.6070703@canterbury.ac.nz> Lisandro Dalcin wrote: > The > real issue is that Python does not properly use 'const' in a > consistent way. The reason I don't like const is that it requires one to go to a lot of effort, and the benefits, if any, don't seem to come anywhere near justifying it. I would have preferred to just leave const out of the Pyrex language, but that doesn't look like it will be possible. > Moreover, I believe that Cython/Pyrex should gain support for > 'const' in a primitive way Pyrex will support const; I've made that decision now. And it won't be primitive, it'll be full and correct support. -- Greg From jek-gmane at kleckner.net Fri May 16 02:24:04 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Thu, 15 May 2008 17:24:04 -0700 Subject: [Cython] Small patch for Win32 platform Message-ID: When trying out the native M$FT compiler with Cython I found that all declarations are required to come ahead of any executable code. Please consider the trivial patch below. === This code fails because the declaration is not at the top: PyMODINIT_FUNC init2calendar(void); /*proto*/ PyMODINIT_FUNC init2calendar(void) { static int __Pyx_unique = 0; if (__Pyx_unique==1) return; __Pyx_unique = 1; //// V Can't be declared after execution code. PyObject *__pyx_1 = 0; /*--- Execution code ---*/ === This code succeeds: PyMODINIT_FUNC init2calendar(void); /*proto*/ PyMODINIT_FUNC init2calendar(void) { static int __Pyx_unique = 0; //// V Moving this up fixes it.
PyObject *__pyx_1 = 0; if (__Pyx_unique==1) return; __Pyx_unique = 1; /*--- Execution code ---*/ === Trivial patch: --- ModuleNode.py.orig 2008-05-15 17:18:08.363704000 -0700 +++ ModuleNode.py 2008-05-15 17:18:32.506829000 -0700 @@ -1443,10 +1443,10 @@ header = "PyMODINIT_FUNC init2%s(void)" % env.module_name code.putln("%s; /*proto*/" % header) code.putln("%s {" % header) + code.put_var_declarations(env.temp_entries) code.putln("static int __Pyx_unique = 0;") code.putln("if (__Pyx_unique==1) return;") code.putln("__Pyx_unique = 1;") - code.put_var_declarations(env.temp_entries) code.putln("/*--- Execution code ---*/") code.mark_pos(None) self.body.generate_execution_code(code) From wstein at gmail.com Fri May 16 02:27:16 2008 From: wstein at gmail.com (William Stein) Date: Thu, 15 May 2008 17:27:16 -0700 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <482CCE35.8030900@canterbury.ac.nz> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> Message-ID: <85e81ba30805151727t247d13a4re9da4bd230fb514a@mail.gmail.com> On Thu, May 15, 2008 at 4:58 PM, Greg Ewing wrote: > Kirill Smelkov wrote: > >> Greg, I'd like to ask you about >> >> Why don't you host normal Mercurial repository for Pyrex? > > For a number of reasons, it wouldn't be convenient for me > to have the definitive repository on a remote site. > > I'm open to the idea of having a mirror site somewhere > that is kept up to date, although I don't have anywhere > to put one at the moment. I'm happy to give you an account on cython.org if you would like. You could put a pyrex mirror site at pyrex.cython.org. Email me (wstein at gmail.com) off list with your desired login. > In the meantime, I'm making Mercurial bundles available > to anyone who wants to maintain their own repository. > > Something I'd like to point out is that there isn't > usually any "bleeding edge" that's significantly > different from the last released version. 
I have bursts > of working on Pyrex, and any changes I make almost > always turn up as a new release very soon after I've > made them. It's not as if I'm sitting on things for > months before releasing them, if that's what anyone > is thinking. > > -- > Greg > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- William Stein Associate Professor of Mathematics University of Washington http://wstein.org From robertwb at math.washington.edu Fri May 16 03:02:03 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 18:02:03 -0700 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> Message-ID: On May 15, 2008, at 2:09 PM, Lisandro Dalcin wrote: > On 5/15/08, Dag Sverre Seljebotn wrote: >> Lisandro Dalcin wrote: >>> On 5/15/08, Dag Sverre Seljebotn >>> wrote: >>>> I think: >>>> >>>> - The default "Python language level" for the pyx source when >>>> Cython is >>>> run should be that of the interpreter Cython is launched within. >>> >>> I definitely disagree with you. Cython 'pyx' has the chance of >>> being >>> more backward compatible even than source 'py' files. >> >> >> Do you have a real example where this would be genuinely useful? > > Perhaps I'm just asking for too much magic from Cython's side.
But I > already have a working project, wrapping the MPI specification (C API > with more than 500 C definition summing up functions and constants), > and in the current status of cython-devel-py3k I'm able to generate C > source files that are going to compile and run against Python 2.3 to > Python 3.0, > > Moreover, I can distribute a single set of generated C sources (I do > not want users have to install Cython to build my package), and they > build and the final extension module work just fine. > > I would definitely love that this degree of compatibility is > maintained, not only for the easy of maintaining of my project, but > also I'm pretty sure that other developers using Cython for serious > work would also REALLY appreciate that. If this is possible (and I think it is), I would *very* much like to support this. The behavior of the module should depend mostly on the Cython compilation environment/flags. However, for (unmarked) string literals I think we should decide based on the C compilation stage. This makes things like hasattr much more compatible, as well as accepting/returning string literals objects. >> My very subjective and unfounded feelings: Creating some super- >> language >> which supports both 2 and 3 completely transparently is likely >> going to >> either have too few features to be useful, or be such an effort that >> it's not worth it. > > Well, I would love to see Cython language to be more near Python 3 > language, just because Py3 it is a better languaje. > > Once Cython can parse Py3 sintax, I will readily update my code to > that sintax, but I still would require and work for the generate C > sources work at least on Python 2.4, better if also in 2.3 . 
I have > users of my code running parallel applications on clusters with dated > systems providing Py2.3; I would really avoid those users to ask them > for building Python from sources (plus other packages they depend on) > just because I like to be in the bleeding-edge Cython, who do not > support C-level compatibility with Python 2.X. Agreed. The only thing one would have to watch out for is using, for example, builtings that have different meanings/don't exist in earlier versions of Python. Related to this, I would like Cython to have good support for the buffer interface, which of course (mostly) won't be available pre Python 2.5. >> Having Cython output Python 3 compileable C code >> (from current Cython/Python 2) is however a great way of enabling >> people >> to write Python 3-specific Cython code before we get to the stage >> where >> there is Python 3 syntax support in Cython itself. > > Indeed. Furthermore, IMHO Cython support for Python 3 syntax could be > even extended. Python 3 does not let you to do u"abc" for an unicode > literal. Should we really have this restriction in Cython? Unless > there is a good reason, I would ask this for being accepted. +1 >> (BTW, can we find some standard terminology for talking about this? >> Something like "Cython": pyx -> c stage; then it is a c-file and not >> Cython, and then "C compilation": c -> executable stage.) > > Agreed, Let's use "Cython compilation" for the pyx -> c stage, and "C > compilation" for the c -> so stage (or whatever extension extension > modules have) Works for me. - Robert From jasone at canonware.com Fri May 16 03:08:42 2008 From: jasone at canonware.com (Jason Evans) Date: Thu, 15 May 2008 18:08:42 -0700 Subject: [Cython] Generators (closures) to be supported? Message-ID: <482CDE9A.2060807@canonware.com> I see conflicting information on whether there is any intention to support generators. 
http://wiki.cython.org/Unsupported claims that work is in progress, and I see email in the April 2008 archives from Dag Sverre Seljebotn about a prototype, but no responses. However, http://www.mudskipper.ca/cython-doc/docs/limitations.html claims that this limitation is likely to remain, though this looks to be nearly identical to http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/Limitations.html . Is someone actively working on generator support? Thanks, Jason Evans From robertwb at math.washington.edu Fri May 16 03:11:03 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 18:11:03 -0700 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <482CCE35.8030900@canterbury.ac.nz> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> Message-ID: On May 15, 2008, at 4:58 PM, Greg Ewing wrote: > Kirill Smelkov wrote: > >> Greg, I'd like to ask you about >> >> Why don't you host normal Mercurial repository for Pyrex? > > For a number of reasons, it wouldn't be convenient for me > to have the definitive repository on a remote site. > > I'm open to the idea of having a mirror site somewhere > that is kept up to date, although I don't have anywhere > to put one at the moment. > > In the meantime, I'm making Mercurial bundles available > to anyone who wants to maintain their own repository. I'd be happy to post one right next to the others at http:// hg.cython.org/ , just based on your bundles. I can make it so you can push to it too. > > Something I'd like to point out is that there isn't > usually any "bleeding edge" that's significantly > different from the last released version. I have bursts > of working on Pyrex, and any changes I make almost > always turn up as a new release very soon after I've > made them. It's not as if I'm sitting on things for > months before releasing them, if that's what anyone > is thinking. 
> > -- > Greg From robertwb at math.washington.edu Fri May 16 03:12:56 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 18:12:56 -0700 Subject: [Cython] Generators (closures) to be supported? In-Reply-To: <482CDE9A.2060807@canonware.com> References: <482CDE9A.2060807@canonware.com> Message-ID: In the very near term, I am swamped with non-Cython-related stuff, but I plan to do generator support (and closures in general) before or during dev days this June. - Robert On May 15, 2008, at 6:08 PM, Jason Evans wrote: > I see conflicting information on whether there is any intention to > support generators. http://wiki.cython.org/Unsupported claims that > work > is in progress, and I see email in the April 2008 archives from Dag > Sverre Seljebotn about a prototype, but no responses. However, > http://www.mudskipper.ca/cython-doc/docs/limitations.html claims that > this limitation is likely to remain, though this looks to be nearly > identical to > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/Limitations.html > . > > Is someone actively working on generator support? > > Thanks, > Jason Evans From greg.ewing at canterbury.ac.nz Fri May 16 03:20:20 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 13:20:20 +1200 Subject: [Cython] status update on Py3 support In-Reply-To: References: <482A0D10.1090603@behnel.de> <482B8321.8020307@canterbury.ac.nz> Message-ID: <482CE154.2090206@canterbury.ac.nz> Lisandro Dalcin wrote: > - Identifiers and (unicode) string literals can (and should) be managed > in the same table.
This way, 'a.foo' will be as efficient as > 'getattr(a, "foo")' I can't answer that without knowing a bit more about the C API in Py3k. If the Py3k equivalents of PyObject_GetAttr etc. take unicode string objects, then yes, that would be correct. > - Byte strings are completely different guys, so they should be > managed in a way similar to how integer literals are. Yes. -- Greg From greg.ewing at canterbury.ac.nz Fri May 16 03:22:02 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 13:22:02 +1200 Subject: [Cython] [Pyrex] Language stability In-Reply-To: References: <4826AAE8.6010706@student.matnat.uio.no> <91E6D358-EAB0-4322-BF5D-9964594C90B0@math.washington.edu> <4826ED91.70706@student.matnat.uio.no> <4827FE69.7020804@behnel.de> <4828DE33.3060303@canterbury.ac.nz> <25237.194.114.62.67.1210680900.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <482CE1BA.2060308@canterbury.ac.nz> Jim Kleckner wrote: > Robert Bradshaw wrote: > >> for x in 0 <= x < A(): >> print x >> >> This is valid Python (prints 3, 4, 5, 6), and would act completely >> differently under your proposal. > > This seems to me to demonstrate quite well that the newer syntax > is less desirable than the old. Not sure I follow -- the new syntax isn't ambiguous. It can't be an ordinary for-loop, because it doesn't have an 'in' in it. -- Greg From greg.ewing at canterbury.ac.nz Fri May 16 03:44:30 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 13:44:30 +1200 Subject: [Cython] Division (from __future__ import ...) In-Reply-To: <482C6B1E.10003@behnel.de> References: <482C6B1E.10003@behnel.de> Message-ID: <482CE6FE.6070707@canterbury.ac.nz> Stefan Behnel wrote: > just a quick note that I added support for > > from __future__ import ... Speaking of that, I realised the other day that Pyrex is currently still using the old Python semantics for division, and there's no way of getting the new ones.
What I'd really like to do is make new-style division be on all the time, but in light of the recent integer for-loop fiasco, I'm reluctant to do anything that might break old code. So the obvious thing to do is support importing division from __future__. My current thought is that this would apply to all division, including C types. What do people think of that? -- Greg From greg.ewing at canterbury.ac.nz Fri May 16 04:05:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 14:05:56 +1200 Subject: [Cython] status update on Py3 support In-Reply-To: References: <482A0D10.1090603@behnel.de> <482B8321.8020307@canterbury.ac.nz> <482C915C.2070802@behnel.de> Message-ID: <482CEC04.1010909@canterbury.ac.nz> Lisandro Dalcin wrote: > Greg, what do you think? You said that Cython/Pyrex string tables > cannot contain a mix of 'str' and 'bytes' instances, but I do not > realize where the actual problem is. There may not be one, depending on how Cython handles things internally. I think what I really meant to say is that I didn't see any *need* to combine the tables, since "xxx" and b"xxx" are clearly different things right from the beginning. But I can see that it might be different if you're trying to support both py2 and py3 interpretations of the same source. If you have some reason for and way of combining them into a single table, that's fine. The end result is all that really matters. -- Greg From jek-gmane at kleckner.net Fri May 16 04:15:06 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Thu, 15 May 2008 19:15:06 -0700 Subject: [Cython] Small patch for Win32 platform In-Reply-To: References: Message-ID: Jim Kleckner wrote: > When trying out the native M$FT compiler with Cython > I found that the declarations are required to all be > ahead of any executable code. I found when updating to 0.9.6.14 that a *lot* had changed. So as Emily Latella would say, "Never mind" that patch... Something different did pop up with "static inline". 
I note that every other declaration is "static INLINE" elsewhere other than here. And changing that makes VS2008 compile: --- ExprNodes.py.orig 2008-05-15 19:12:51.289457400 -0700 +++ ExprNodes.py 2008-05-15 19:13:03.620422200 -0700 @@ -4124,7 +4124,7 @@ append_utility_code = [ """ -static inline PyObject* __Pyx_PyObject_Append(PyObject* L, PyObject* x) { +static INLINE PyObject* __Pyx_PyObject_Append(PyObject* L, PyObject* x) { if (likely(PyList_CheckExact(L))) { if (PyList_Append(L, x) < 0) return NULL; Py_INCREF(Py_None); @@ -4135,4 +4135,4 @@ } } ""","" From greg.ewing at canterbury.ac.nz Fri May 16 04:57:31 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 16 May 2008 14:57:31 +1200 Subject: [Cython] Generators (closures) to be supported? In-Reply-To: References: <482CDE9A.2060807@canonware.com> Message-ID: <482CF81B.1070908@canterbury.ac.nz> Robert Bradshaw wrote: > In the very near term, I am swamped with non-cython related stuff, > but plan to do generator support (and closures in general) before or > during dev days this June. Do you have any ideas on how you intend to implement closures? -- Greg From robertwb at math.washington.edu Fri May 16 07:48:25 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 15 May 2008 22:48:25 -0700 Subject: [Cython] Generators (closures) to be supported? In-Reply-To: <482CF81B.1070908@canterbury.ac.nz> References: <482CDE9A.2060807@canonware.com> <482CF81B.1070908@canterbury.ac.nz> Message-ID: <09042718-6552-4B3A-B032-C232F5B14C0A@math.washington.edu> On May 15, 2008, at 7:57 PM, Greg Ewing wrote: > Robert Bradshaw wrote: >> In the very near term, I am swamped with non-cython related stuff, >> but plan to do generator support (and closures in general) before or >> during dev days this June. > > Do you have any ideas on how you intend to implement > closures? Yes. 
A closure will be represented by a special cdef class (specific to each closure), which will have its state in member variables (greatly simplifying garbage collection, for instance). - Robert From stefan_ml at behnel.de Fri May 16 10:27:49 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 10:27:49 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: References: Message-ID: <482D4585.3060305@behnel.de> Hi, Lisandro Dalcin wrote: > Stefan, I really do not understand why you are implementing Future.py > in terms of the __future__ module; such an approach will prevent running > the Cython compiler with Python versions older than 2.6. Sorry, that was a bug. It should have been done the way you proposed (those are the moments when I'm happy to work on a branch and not on cython-devel). Stefan From stefan_ml at behnel.de Fri May 16 10:30:00 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 10:30:00 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9711.7000207@behnel.de> Message-ID: <482D4608.6040904@behnel.de> Hi, Lisandro Dalcin wrote: > I was actually thinking about string literals with pure-ascii > characters. I hope you're not proposing to handle them differently from other string literals. > My intention is just if I write this in a 'pyx' file > > meth = "__call__" > if hasattr(someobj, meth): > do_something(...) > > then this code should work as expected in Py2.X and Py3. Does this make sense? If you want it to run on both Py2 and Py3, make that meth = u"__call__" Stefan From stefan_ml at behnel.de Fri May 16 10:43:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 10:43:21 +0200 Subject: [Cython] from __future__ import ...
In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> Message-ID: <482D4929.1000501@behnel.de> Hi, Lisandro Dalcin wrote: > Moreover, I can distribute a single set of generated C sources (I do > not want users to have to install Cython to build my package), and they > build and the final extension module works just fine. That's exactly my intention. Cython language specifics must be handled by the Cython parser, and specifics of the runtime environment must be handled by the C compiler. In the context of unicode literals, this means: once the parser has determined that a string is a unicode literal or a bytes literal in the source code, later steps in the compiler run (Cython or C compiler) must not change this. Stefan From dagss at student.matnat.uio.no Fri May 16 10:47:21 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 10:47:21 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D4585.3060305@behnel.de> References: <482D4585.3060305@behnel.de> Message-ID: <482D4A19.8020007@student.matnat.uio.no> > Sorry, that was a bug. It should have been done the way you proposed (that's > the moments when I'm happy to work on a branch and not on cython-devel). (This is not rhetorical; I want to ask and learn about how the repos should be used:) Isn't the "cython" branch where releases are pulled from? Would a temporary bug (fixed one day later) in cython-devel matter? Isn't that just a consequence of it being a devel branch? And what are the differences between devel-py3 and devel in this respect? Also, are there any plans for how one will deal with the split in the future? Will everything be merged at once, or could the pieces that are not py3-specific be merged back earlier? I'm asking because the __future__ part will be useful in the devel branch too -- i.e.
Greg's plans for supporting the division statement which would be useful for Cython as well (if his changes doesn't match Cython code I'll volunteer for porting it, it is quite essential for nice NumPy usage...). -- Dag Sverre From stefan_ml at behnel.de Fri May 16 10:51:40 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 10:51:40 +0200 Subject: [Cython] Division (from __future__ import ...) In-Reply-To: <482CE6FE.6070707@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <482CE6FE.6070707@canterbury.ac.nz> Message-ID: <482D4B1C.7020408@behnel.de> Hi, Greg Ewing wrote: > Stefan Behnel wrote: > >> just a quick note that I added support for >> >> from __future__ import ... > > Speaking of that, I realised the other day that Pyrex is > currently still using the old Python semantics for division, > and there's no way of getting the new ones. > > What I'd really like to do is make new-style division be > on all the time, but in light of the recent integer for-loop > fiasco, I'm reluctant to do anything that might break old > code. > > So the obvious thing to do is support importing division > from __future__. Since Py2-style division is currently broken in the Py3 branch anyway (as Py3 doesn't provide it), a future import makes sense even for Py2, yes. > My current thought is that this would apply to all division, > including C types. So that cdef int a = 5, b = 2 print a / b would print 2.5 ? I think that would be a surprising result. We should at least split the future import into "division" and "cdivision" in that case. 
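For readers following the division discussion, the Python-level behaviour that the future import selects can be sketched in plain Python. This only illustrates Python objects; whether the same rule should apply to C integer types such as the `cdef int` operands above is precisely the open question in this thread. The variable names are illustrative and not taken from Pyrex or Cython.

```python
# Minimal sketch, assuming plain Python ints (not Cython C types).
# On Python 2 this import switches "/" to true division; on Python 3 it is a no-op.
from __future__ import division

a, b = 5, 2

true_quotient = a / b    # true division: always a float for ints
floor_quotient = a // b  # floor division: stays an integer

print(true_quotient)   # 2.5
print(floor_quotient)  # 2
```

Without the import, Python 2 evaluates `5 / 2` as `2`, which is the "old semantics" referred to above.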
Stefan From robertwb at math.washington.edu Fri May 16 10:54:22 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 01:54:22 -0700 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D4A19.8020007@student.matnat.uio.no> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> Message-ID: <3146FBD3-73E1-444B-9D3C-93C5ED41AB04@math.washington.edu> On May 16, 2008, at 1:47 AM, Dag Sverre Seljebotn wrote: >> Sorry, that was a bug. It should have been done the way you >> proposed (that's >> the moments when I'm happy to work on a branch and not on cython- >> devel). > > (This is not retorical, I want to ask and learn about how the repos > should be used:) > > Isn't the "cython" branch where releases are pulled from? Would a > temporary bug (fixed one day later) in cython-devel matter, isn't that > just a consequence of it being a devel branch? And what are the > differences between devel-py3 and devel in this respect? View it as stable, unstable, and experimental. Also, I view "cython" as where one wants to look to see what's in the latest release. "cython-devel" is new stuff, I try not to commit anything that breaks stuff (to my knowledge) so that there's not much fear of pulling when multiple people are trying to collaborate. "cython-py3" is because it was expected to be broken much(?) of the time. > Also, is there any plans for how one will deal with the split in the > future? Will everything be merged at once, or could the pieces that > are > not py3-specific be merged back before? I think the goal is to try and not let them diverge to much. When py3 is relatively stable, it will get merged into cython-devel. > I'm asking because the > __future__ part will be useful in the devel branch too -- i.e. 
Greg's > plans for supporting the division statement which would be useful for > Cython as well (if his changes doesn't match Cython code I'll > volunteer > for porting it, it is quite essential for nice NumPy usage...). Yes, certainly. The only reason they're split up is so the (potentially) very broken cython-py3 doesn't interfere with working on cython-devel. - Robert From stefan_ml at behnel.de Fri May 16 10:56:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 10:56:14 +0200 Subject: [Cython] Small patch for Win32 platform In-Reply-To: References: Message-ID: <482D4C2E.3020109@behnel.de> Jim Kleckner wrote: > Something different did pop up with "static inline". > I note that every other declaration is "static INLINE" > elsewhere other than here. And changing that makes > VS2008 compile: Yep, that's in already. Stefan From dagss at student.matnat.uio.no Fri May 16 11:00:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 11:00:25 +0200 Subject: [Cython] Division (from __future__ import ...) In-Reply-To: <482D4B1C.7020408@behnel.de> References: <482C6B1E.10003@behnel.de> <482CE6FE.6070707@canterbury.ac.nz> <482D4B1C.7020408@behnel.de> Message-ID: <482D4D29.4050407@student.matnat.uio.no> > So that > > cdef int a = 5, b = 2 > print a / b > > would print 2.5 ? I think that would be a surprising result. We should at > least split the future import into "division" and "cdivision" in that case. Well, it has to be explicitly enabled, and once that is done, it is very easy to simply do print a // b to get C behaviour. If you know how to import division, you know enough for this behaviour to not be surprising I think? 
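A minimal sketch of the operators under discussion, in plain Python rather than Cython, assuming true division is in effect (Python 3, or Python 2 with the division future import). The helper `c_div` is hypothetical and only mimics C99 integer division, which truncates toward zero. It suggests that `//` matches C behaviour when both operands have the same sign, but not for mixed signs:

```python
# Plain Python illustration, not Cython output.
# Assumes true division is in effect (Python 3, or Python 2 with
# "from __future__ import division" at the top of the module).


def c_div(a, b):
    """Hypothetical helper: integer division truncating toward zero, as in C99."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


true_quotient = 5 / 2      # true division keeps the fractional part
floor_quotient = 5 // 2    # floor division on positive operands
neg_floor = -7 // 2        # floor division rounds toward negative infinity
neg_trunc = c_div(-7, 2)   # C-style division rounds toward zero

print(true_quotient, floor_quotient, neg_floor, neg_trunc)
```

So a split between "division" and "cdivision" would matter mainly for negative operands, where floor and truncating division disagree.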
-- Dag Sverre From stefan_ml at behnel.de Fri May 16 11:08:12 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 11:08:12 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D4A19.8020007@student.matnat.uio.no> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> Message-ID: <482D4EFC.8040109@behnel.de> Hi, Dag Sverre Seljebotn wrote: >> Sorry, that was a bug. It should have been done the way you proposed (that's >> the moments when I'm happy to work on a branch and not on cython-devel). > > (This is not retorical, I want to ask and learn about how the repos > should be used:) > > Isn't the "cython" branch where releases are pulled from? Would a > temporary bug (fixed one day later) in cython-devel matter, isn't that > just a consequence of it being a devel branch? And what are the > differences between devel-py3 and devel in this respect? The cython-devel-py3 branch was very unstable over the last few days. Since cython-devel is the main development branch, it's ok for it to be "under development", but it's not ok if it's broken. (That said, changes that get pushed to the public server should never break whatever branch they appear in. My bad...) > Also, is there any plans for how one will deal with the split in the > future? I'm planning to merge the changes ASAP, but I'd like to fix the "print" statement for Py3 first, as there are currently a lot of test cases that fail because of that, so this may actually hide real bugs. BTW, in Py3, "print" will simply call the print function. The tricky thing is to generate the cleanup code correctly for the two code branches that the C preprocessor selects in Py2 and Py3... Stefan From dagss at student.matnat.uio.no Fri May 16 11:16:11 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 11:16:11 +0200 Subject: [Cython] from __future__ import ... 
In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> Message-ID: <482D50DB.6070400@student.matnat.uio.no> Robert Bradshaw wrote: > On May 15, 2008, at 2:09 PM, Lisandro Dalcin wrote: >> Moreover, I can distribute a single set of generated C sources (I do >> not want users have to install Cython to build my package), and they >> build and the final extension module work just fine. >> > If this is possible (and I think it is), I would *very* much like to > support this. The behavior of the module should depend mostly on the > Cython compilation environment/flags. However, for (unmarked) string > literals I think we should decide based on the C compilation stage. > This makes things like hasattr much more compatible, as well as > accepting/returning string literals objects. If it will work then I won't argue any more (though Stefan is also resisting it). I'll just mention this: My fear is that with this kind of thinking we get a combinatorial explosion in testing scenarios. I.e., if the exact behaviour of the syntax both depend on the pyx source language level [1] and the C compilation environment, then one could argue that one should test for every combination of these. However, if there is a rule that "behaviour is fixed" after Cython compilation then one only has to write tests that only consider the language levels on each side seperately, so no combinatorial explosion. This is all in principle though and in practical reality, less complete testing might be sufficient. In this case, that practical solution would be to think of the ""-literal as being a third magic string type, that behaves differently depending on the runtime environment. Then one only has to test the behaviour of this magic type, so only the C environment levels has to be tested, and no combinatorial explosion... 
[1] Pyx source language level == switches for which Python version to emulate, "from __future__ import ..." and so on -- Dag Sverre From dagss at student.matnat.uio.no Fri May 16 11:19:58 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 11:19:58 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D4EFC.8040109@behnel.de> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> <482D4EFC.8040109@behnel.de> Message-ID: <482D51BE.70802@student.matnat.uio.no> > BTW, in Py3, "print" will simply call the print function. The tricky thing is > to generate the cleanup code correctly for the two code branches that the C > preprocessor selects in Py2 and Py3... I'm probably missing something. But it appears that one could simply provide an implementation of the "print" function for Py2 (as verbatim C code pasted in) with the same signature and call it in the same way... -- Dag Sverre From stefan_ml at behnel.de Fri May 16 11:31:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 11:31:22 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D51BE.70802@student.matnat.uio.no> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> <482D4EFC.8040109@behnel.de> <482D51BE.70802@student.matnat.uio.no> Message-ID: <482D546A.7080202@behnel.de> Hi, Dag Sverre Seljebotn wrote: >> BTW, in Py3, "print" will simply call the print function. The tricky thing is >> to generate the cleanup code correctly for the two code branches that the C >> preprocessor selects in Py2 and Py3... > > I'm probably missing something. But it appears that one could simply > provide an implementation of the "print" function for Py2 (as verbatim C > code pasted in) with the same signature and call it in the same way... In any case, that would be a rewrite of the current implementation. 
Also, I don't think the Py2 version would work (not having looked at it) with the new I/O system. Stefan From stefan_ml at behnel.de Fri May 16 11:41:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 11:41:09 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> Message-ID: <482D56B5.8050706@behnel.de> Hi, Robert Bradshaw wrote: > The behavior of the module should depend mostly on the > Cython compilation environment/flags. However, for (unmarked) string > literals I think we should decide based on the C compilation stage. > This makes things like hasattr much more compatible, as well as > accepting/returning string literals objects. The thing is that if you write getattr(o, u"attr") in Cython, it will work in both Py2 and Py3. However, getattr(o, "attr") will only work in Py2, unless you do the future import. I don't mind requiring some effort from users to get their code ready for Py3, especially if it's broken because of (now) wrong assumptions about byte strings and Unicode. >>> Having Cython output Python 3 compileable C code >>> (from current Cython/Python 2) is however a great way of enabling >>> people >>> to write Python 3-specific Cython code before we get to the stage >>> where >>> there is Python 3 syntax support in Cython itself. >> Indeed. Furthermore, IMHO Cython support for Python 3 syntax could be >> even extended. Python 3 does not let you to do u"abc" for an unicode >> literal. Should we really have this restriction in Cython? Unless >> there is a good reason, I would ask this for being accepted. > > +1 Yes, I wouldn't mind accepting u"abc, b"abc" and "abc" at the same time. The meaning of the latter would depend on the future import. >>> (BTW, can we find some standard terminology for talking about this? 
>>> Something like "Cython": pyx -> c stage; then it is a c-file and not >>> Cython, and then "C compilation": c -> executable stage.) >> Agreed, Let's use "Cython compilation" for the pyx -> c stage, and "C >> compilation" for the c -> so stage (or whatever extension extension >> modules have) > > Works for me. Sure. Stefan From robertwb at math.washington.edu Fri May 16 11:56:11 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 02:56:11 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482D56B5.8050706@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> Message-ID: <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> On May 16, 2008, at 2:41 AM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> The behavior of the module should depend mostly on the >> Cython compilation environment/flags. However, for (unmarked) string >> literals I think we should decide based on the C compilation stage. >> This makes things like hasattr much more compatible, as well as >> accepting/returning string literals objects. > > The thing is that if you write > > getattr(o, u"attr") > > in Cython, it will work in both Py2 and Py3. However, > > getattr(o, "attr") > > will only work in Py2, unless you do the future import. I don't > mind requiring > some effort from users to get their code ready for Py3, especially > if it's > broken because of (now) wrong assumptions about byte strings and > Unicode. I would rather that string literals be interpreted according to the C library they're linked against. I'm also thinking about code that, say, returns string literals. I would much rather it returns str in Py2 and unicode in Py3. 
Note, this is not something that needs to be done to get ready for Py3--it's an assumption that unqualified string literals are the same type as Python identifiers. The latter must be dependent on the version it is linked against. I was doing some playing around with str and unicode in Python, and I noticed that it will automatically convert between the two (no explicit encoding needed) as long as the data in question is pure ASCII. Is this a case for allowing string <-> char* conversions as long as the data in question is pure ASCII? - Robert From kirr at mns.spb.ru Fri May 16 12:02:38 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Fri, 16 May 2008 14:02:38 +0400 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <482CCE35.8030900@canterbury.ac.nz> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> Message-ID: <200805161402.38996.kirr@mns.spb.ru> In a message of Friday 16 May 2008, Greg Ewing wrote: > Kirill Smelkov wrote: > > > Greg, I'd like to ask you about > > > > Why don't you host normal Mercurial repository for Pyrex? > > For a number of reasons, it wouldn't be convenient for me > to have the definitive repository on a remote site. > > I'm open to the idea of having a mirror site somewhere > that is kept up to date, although I don't have anywhere > to put one at the moment. Greg, I see your points. Yes, please, let's have a Pyrex mirror somewhere. Today I noticed that one is already set up at http://hg.cython.org/pyrex/ So I think everyone would be grateful to you if you kept it up to date. As to your wish to keep the definitive repository on your side, I'd like to say that, as I understand it, distributed development usually works in the following way: people clone your repository, make their improvements and send patches back to the maintainer (in the Pyrex case this is you). The maintainer can then see whether a patch is ok, or wrong, or semi-ok and needs more work, etc.
This is all general terms, and of course one could do the same based on Pyrex-x.y.z.tar.gz, but everything is easier and more convenient, if the main repository is visible to all. And you know, sometimes it positively affects development ... > In the meantime, I'm making Mercurial bundles available > to anyone who wants to maintain their own repository. This are all appreciated steps in positive direction, thanks! > Something I'd like to point out is that there isn't > usually any "bleeding edge" that's significantly > different from the last released version. I have bursts > of working on Pyrex, and any changes I make almost > always turn up as a new release very soon after I've > made them. It's not as if I'm sitting on things for > months before releasing them, if that's what anyone > is thinking. I see. I respect you very much as Pyrex language designer and creator, and I'm *very* thankful to you because I was fascinated by the idea when I first saw it, and when I used Pyrex I had *very* positive experience. So BIG "THANK YOU, GREG!" But look, isn't there is no bleeding edge because mainly *you* work on it? Greg, everything, somehow someday *have* to become team work because one person can't scale that much. Just look at it: Python has a bunch of core development guys and PEPS Why don't we do the same? Yes I know, it could be somewhat scary and uncomfortable to let others to play in not-trivial way with your baby, but this just has to be done, sooner or later! You say you still design Pyrex as a language, but look at your recent quote taken from http://article.gmane.org/gmane.comp.python.cython.devel/1642 > What I'd really like to do is make new-style division be > on all the time, but in light of the recent integer for-loop > fiasco, I'm reluctant to do anything that might break old > code. So Pyrex is already took off, and we have to gradually improve it and care not to break existing deployments as well. This is not a one-man task. 
So why can't we do it all together?! Yes, you'll loose precise control on what Pyrex is, but given there is a shell of motivated people around, I think Pyrex and you will gain much more in return ... That's how I think ... Please let me know *your* thoughts. Thanks, Kirill. From stefan_ml at behnel.de Fri May 16 12:23:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 12:23:21 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> Message-ID: <482D6099.8000308@behnel.de> Hi, Robert Bradshaw wrote: > I would rather that string literals be interpreted according to the C > library they're linked against. I'm also thinking about code that, > say, returns string literals. I would much rather it returns str in > Py2 and unicode in Py3. That would be unexpected, especially in Py3 where the two are distinct types. As I said to Lisandro in another post: S> If you want source compatibility, you can't change the semantics based on S> the compile time environment - except for the cases where the runtime S> environments really differ (such as byte/unicode identifiers). Imagine you S> had some latin-1 encoded XML byte literal in your code. In Py2, under your S> proposal, this would become a byte string that can be parsed. In Py3, S> however, this would suddenly become a unicode string and the parser would S> refuse to handle it, as it's no longer ISO encoded. > Note, this is not something that needs to be done to get ready for > Py3--it's an assumption that unqualified string literals are the same > type as python identifiers. This happens to be a correct assumption in Py2 and Py3, but I don't see the link. 
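The latin-1 concern quoted above can be sketched in plain Python 3. This is an illustration only, not Cython output; the XML snippet and all variable names are made up for the example. It assumes nothing beyond the standard `str`/`bytes` types:

```python
# Plain Python 3 illustration; the XML snippet and all names are made up.
xml_text = "<?xml version='1.0' encoding='iso-8859-1'?><name>Dalc\u00edn</name>"

# What a byte-string literal would hold: the document as latin-1 bytes.
xml_bytes = xml_text.encode("iso-8859-1")

# The declared encoding matches the bytes, so decoding round-trips.
round_trips = xml_bytes.decode("iso-8859-1") == xml_text

# Once the literal is text, the encoding declaration no longer describes
# the data: re-encoding as UTF-8 yields different (longer) bytes.
utf8_bytes = xml_text.encode("utf-8")
bytes_differ = utf8_bytes != xml_bytes

# Python 3 refuses to mix the two types implicitly.
try:
    xml_bytes + xml_text
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(type(xml_bytes).__name__, type(xml_text).__name__, bytes_differ, mixing_allowed)
```

The same literal therefore means two different things depending on whether it ends up as bytes or as text, which is the source-compatibility problem being argued here.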
> I was doing some playing around with str and unicode in Python, and I > noticed that it will automatically convert between the two (no > explicit encoding needed) as long as the data in question is pure > ASCII. That would be Py2. Py3 will never attempt any kind of automatic conversion between bytes and str. And I am convinced that Cython shouldn't do that either. Stefan From ggellner at uoguelph.ca Fri May 16 16:11:49 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Fri, 16 May 2008 10:11:49 -0400 Subject: [Cython] Generators (closures) to be supported? In-Reply-To: References: <482CDE9A.2060807@canonware.com> Message-ID: <20080516141149.GA9557@basestar> On Thu, May 15, 2008 at 06:12:56PM -0700, Robert Bradshaw wrote: > In the very near term, I am swamped with non-cython related stuff, > but plan to do generator support (and closures in general) before or > during dev days this June. > > - Robert > I will change the docs to make this distinction clear. Gabriel From dalcinl at gmail.com Fri May 16 16:12:19 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 11:12:19 -0300 Subject: [Cython] Division (from __future__ import ...) In-Reply-To: <482CE6FE.6070707@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <482CE6FE.6070707@canterbury.ac.nz> Message-ID: On 5/15/08, Greg Ewing wrote: > My current thought is that this would apply to all division, > including C types. What do people think of that? Im not sure at all, we could have cdef int a = 5, b = 2 cdef int p = 5/2 cdef float q = 5/2 cdef object r = 5/2 print p, q, r then we will get p->2, q->2.0, r->float(2.0) . As Cython/Pyrex are a mix of Python and C, and IMHO the language is much more on the Python side, perhaps matching more closelly the Python way (even if the involved datatypes are all C) do really make sense. 
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri May 16 16:35:52 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 11:35:52 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482D56B5.8050706@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> Message-ID: On 5/16/08, Stefan Behnel wrote: > The thing is that if you write > > getattr(o, u"attr") > > in Cython, it will work in both Py2 and Py3. However, > > getattr(o, "attr") > > will only work in Py2, unless you do the future import. Stefan, I understood that one of the targets of Cython is to efficiently compile Python code. Please note that getattr(o, u"attr") is not valid Python 3 code at all !! You are proposing that if I do "def foo(): ..." then the identifier 'foo' will be implicitly treated as unicode for Py3, but a string literal "abc" will not !! I realize that this can be fixed with just a future import, but it seems really ugly to me. Future directives are just for that, for the future. In a Py3 runtime, we are in the present, so why should Cython rely on the future import to match the semantics of the present?
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri May 16 18:16:46 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 18:16:46 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> Message-ID: <482DB36E.6030400@behnel.de> Hi Lisandro, please read what I post. You've been stating repeatedly that your experience with Unicode is limited, and I think I have told you a lot of what I know about the subject. I appreciate discussion, but please consider that I might have more reasons for my opinions about the subject than I state in each single post. Lisandro Dalcin wrote: > On 5/16/08, Stefan Behnel wrote: >> The thing is that if you write >> >> getattr(o, u"attr") >> >> in Cython, it will work in both Py2 and Py3. However, >> >> getattr(o, "attr") >> >> will only work in Py2, unless you do the future import. > > Stefan, I understood that one of the targets of Cython is to > efficiently compile Python code. Please note that > > getattr(o, u"attr") > > is not valid Python 3 code at all !! That's why I said "in Cython". > You are proposing that if I do "def foo(): ..." then the identifier > 'foo' will be implicitly treated as unicode for Py3, Sure. You didn't state in your source that you wanted the identifier name to be a byte string, did you? (which was obviously because Python doesn't allow you to do that). > but a string literal "abc" will not !! Because the syntax of Python 2, which Cython currently implements, dictates that "abc" is a byte string.
This is explicit in Python2, as the unicode string would be u"abc". The difference between identifiers and strings is that one is a name and the other is a piece of data. The language can do whatever it likes with the names (it can even strip them from the compiled result completely), but it must *never* corrupt data. Py3 has come a long way since the initial Unicode support in Py 2.0, almost eight years back. We shouldn't throw all lessons learned away and think we can do better. Stefan From stefan_ml at behnel.de Fri May 16 18:22:10 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 18:22:10 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482D4EFC.8040109@behnel.de> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> <482D4EFC.8040109@behnel.de> Message-ID: <482DB4B2.7010104@behnel.de> Hi, Stefan Behnel wrote: > I'm planning to merge the changes ASAP, but I'd like to fix the "print" > statement for Py3 first, as there are currently a lot of test cases that fail > because of that, so this may actually hide real bugs. > > BTW, in Py3, "print" will simply call the print function. The tricky thing is > to generate the cleanup code correctly for the two code branches that the C > preprocessor selects in Py2 and Py3... This is implemented now. The remaining tests that fail under Py3 are all due to byte literals being used where text would be appropriate for printing. I'll continue to fix those, and if there are no objections, I will merge the changes back into cython-devel tomorrow. This is a good time to test your code with the cython-devel-py3 branch. 
:) Stefan From dalcinl at gmail.com Fri May 16 18:34:51 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 13:34:51 -0300 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: <482CD1FC.6070703@canterbury.ac.nz> References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> <482CD1FC.6070703@canterbury.ac.nz> Message-ID: On 5/15/08, Greg Ewing wrote: > Pyrex will support const, I've made that decision now. > And it won't be primitive, it'll be full and correct > support. Just fantastic! But when you say full and correct, do you mean that Pyrex will match the behavior of a C/C++ compiler like gcc in the example below (try it if you have never coded something like this)?

    int main() {
        int a[10];
        const int * p = a;         /* pointer to const int */
        int * const q = a;         /* const pointer to int */
        const int * const r = a;   /* const pointer to const int */
        p[0]=1;   /* error: the pointed-to int is read-only */
        q[0]=1;   /* ok */
        r[0]=1;   /* error: the pointed-to int is read-only */
        p=0;      /* ok */
        q=0;      /* error: the pointer itself is read-only */
        r=0;      /* error: the pointer itself is read-only */
        return 0;
    }

Well, regarding the above code, your comment about the tyranny of 'const' makes more sense :-) I am not really asking you to support the 'int* const' form, only the 'const int*' form. However, the first form can still appear when trying to wrap some C++ API :-( -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Fri May 16 18:41:39 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 18:41:39 +0200 Subject: [Cython] Status update: Transform utilities Message-ID: <482DB943.3090501@student.matnat.uio.no> I'll create a ticket shortly containing my local branch so far.
(The phase refactorings I've posted earlier are *not* included, this is all useful stuff :-) ). If anybody wants to start a process of applying it or discussing it then fine, but there's no hurry; if nothing happens I'll just ping the list again when Robert has more time available. A summary: I'll get to the main feature first so that you have a reason for applying this :-) It should now be possible to do stuff like:

    class WithTransform(VisitorTransform):
        # from with transform PEP...
        with_fragment = TreeFragment(u"""
            _mgr = (EXPR)
            _exit = _mgr.__exit__
            _value = _mgr.__enter__()
            _exc = True
            try:
                try:
                    VAR = _value
                    BLOCK
                ...
            ...
        """)

        def process_WithStatementNode(self, node):
            return self.with_fragment.substitute({
                "EXPR" : node.expr,
                "VAR" : node.var,
                "BLOCK" : node.body
            })

:-) (The above is simplified; there's not always a VAR. It also needs another feature before it can be completely streamlined: automatic "temporaries" that won't clash in the namespace, basically with_fragment.substitute(..., temps=("_mgr", ...)).) When that is done, supporting the with statement is about as much work as extending the parser; the transform/implementation comes for free.
- I've already discussed the CodeWriter. It only supports a limited subset (with some holes, ~30% perhaps) at this time, but it's what I need for now (for unit tests). I might work further on that too.
- Some changes to Transform.py which I hope go through... there's a Visitor object there, using the "process_ClassName" pattern (I think that was the conclusion for future performance reasons).
- A clone_node method on Node for proper node copying (shallow object copy except child node lists, which are also copied).
- Here's the controversial bit: In order to be able to provide proper error messages for string-based code snippets like the above (which are passed to Parsing.py...), I've changed the pointer to the source code (used as the first element in the position tuples found everywhere...)
from being a string filename to being a SourceDescriptor object. A SourceDescriptor can currently be a FileSourceDescriptor, in which case things work like before (it gives the filename on __str__, so much code did not need to change), or a StringSourceDescriptor, which I use for my new code... I hope you see the advantages of this from the above code. (There are less intrusive ways to do this, but they would only be hacky and postpone the problem. Better do it properly...? BTW this pattern is rather common, consider for instance Source in the XML Transform APIs/TrAX.) -- Dag Sverre From dalcinl at gmail.com Fri May 16 18:53:42 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 13:53:42 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482DB36E.6030400@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> Message-ID: On 5/16/08, Stefan Behnel wrote: > Hi Lisandro, > > please read what I post. You've been stating repeatedly that your experience > with Unicode is limited, and I think I have told you a lot of what I know > about the subject. I appreciate discussion, but please consider that I might > have more reasons for my opinions about the subject than I state in each > single post. You are completely right, and take for granted that at any point I will accept what you decide is better. It's just that I believe, perhaps because of lack of knowledge, that "abc" should always be the builtin 'str' type in Py2 (byte strings) and Py3 (unicode strings). If we want to represent literals being data, then we have to use b"abc"; in Py2 that would be the 'str' (byte string) type, and in Py3, the new 'bytes' type. Note that in Python 2.6 "abc" and b"abc" both return a byte string 'str' type, and the 'bytes' type is an alias for the 'str' type.
What's wrong with the Python 2.6 way? What you believe was the whole point in 2.6 of add support for b"abc" literals and aliasing 'bytes' to str? > > Lisandro Dalcin wrote: > > On 5/16/08, Stefan Behnel wrote: > >> The thing is that if you write > >> > >> getattr(o, u"attr") > >> > >> in Cython, it will work in both Py2 and Py3. However, > >> > >> getattr(o, "attr") > >> > >> will only work in Py2, unless you do the future import. > > > > Stefan, I understood that one of the traget of Cython is to > > efficiently compile Python code. Please note that > > > > getattr(o, u"attr") > > > > is not valid Python 3 code at all !! > > > That's why I said "in Cython". > > > > > You are proposing that if I do "def foo(): ..." the the identifier > > 'foo' will be implicitely treated as unicode for Py3, > > > Sure. You didn't state in your source that you wanted the identifier name to > be a byte string, did you? (which was obviously because Python doesn't allow > you to do that). > > > > > but a string literal "abc" do not !!. > > > Because the syntax of Python2, which Cython currently implements, dictates > that "abc" is a byte string. This is explicit in Python2, as the unicode > string would be u"abc". > > The difference between identifiers and strings is that one is a name and the > other is a piece of data. The language can do whatever it likes with the names > (it can even strip them from the compiled result completely), but it must > *never* corrupt data. > > Py3 has come a long way since the initial Unicode support in Py 2.0, almost > eight years back. We shouldn't throw all lessons learned away and think we can > do better. 
> > > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Fri May 16 18:56:38 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 18:56:38 +0200 Subject: [Cython] Status update: Transform utilities In-Reply-To: <482DB943.3090501@student.matnat.uio.no> References: <482DB943.3090501@student.matnat.uio.no> Message-ID: <482DBCC6.9060108@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > I'll create a ticket shortly containing my local branch so far. (The > phase refactorings I've posted earlier is *not* included, this is all > useful stuff :-) ). It is ticket 11: http://trac.cython.org/cython_trac/ticket/11 I was hoping Trac would show the diff in a better way (with hg commit comments) -- there are many individual commits. I uploaded a bundle as well... The last commit is rather large; as I've noted: It is a rather big commit, however separating it is non-trivial. The tests for these features all rely on each other, so there's a circular dependency in the tests and I wanted to commit the tests and features at the same time. (However, the non-test code does not have a circular dependency.) About unit tests: - I've put unit tests in Tests/ subdirectories next to the files they test - They are not (yet) automatically run from runtests.py. -- Dag Sverre From stefan_ml at behnel.de Fri May 16 19:04:25 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 19:04:25 +0200 Subject: [Cython] from __future__ import ...
In-Reply-To: References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> Message-ID: <482DBE99.3090908@behnel.de> Hi, Lisandro Dalcin wrote: > in Python 2.6 "abc" and b"abc" both return a byte string > 'str' type, and 'bytes' type is an alias for 'str' type. As I said before, that's fine with me. Stefan From dalcinl at gmail.com Fri May 16 19:36:27 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 14:36:27 -0300 Subject: [Cython] from __future__ import ... In-Reply-To: <482DBE99.3090908@behnel.de> References: <482C6B1E.10003@behnel.de> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> Message-ID: OK, Stefan. I now realize that all the noise I generated was actually related to the Python language version Cython is targeting to parse, which is Python 2. And then all is fine: "abc" has to be (unless __future__ is used) a byte string, matching the semantics of the Py2.X series. The point that the generated C sources can be compiled against a Py3 runtime is just an 'extension' ... But when Cython can parse Python language version 3, if the user passes a -3 or -py3 flag to cython (or in any other way), then and only then, a string literal like "abc" is going to be unicode by default. If the above scenario is the present and the future of Cython, I now completely agree with your directions, and please disregard all the noise. On 5/16/08, Stefan Behnel wrote: > Hi, > > > Lisandro Dalcin wrote: > > in Python 2.6 "abc" and b"abc" both return a byte string > > 'str' type, and 'bytes' type is an alias for 'str' type. > > > As I said before, that's fine with me.
> > > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From wstein at gmail.com Fri May 16 20:07:53 2008 From: wstein at gmail.com (William Stein) Date: Fri, 16 May 2008 11:07:53 -0700 Subject: [Cython] How Cython Works Message-ID: <85e81ba30805161107l553873a1wbdff1b9c9ed6ade0@mail.gmail.com> Hi, You might find this Cython talk with slides and video interesting: http://wiki.wstein.org/2008/sageseminar/kantor Title: How Cython Works Speaker: Josh Kantor Abstract: The cython language is an extension of the python language which can be used to write highly optimized low level (relative to python) code. The cython compiler, which is itself written in python, compiles the cython language to C code. The goal of this talk is to give an overview of how the cython compiler is implemented, and to give ideas for how to possibly modify or improve cython. -- William Stein Associate Professor of Mathematics University of Washington http://wstein.org From stefan_ml at behnel.de Fri May 16 21:55:31 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 16 May 2008 21:55:31 +0200 Subject: [Cython] cython-devel-py3: __future__ in Future.py In-Reply-To: <482DB4B2.7010104@behnel.de> References: <482D4585.3060305@behnel.de> <482D4A19.8020007@student.matnat.uio.no> <482D4EFC.8040109@behnel.de> <482DB4B2.7010104@behnel.de> Message-ID: <482DE6B3.4040001@behnel.de> Stefan Behnel wrote: > The remaining tests that fail under Py3 are all due > to byte literals being used where text would be appropriate for printing. I'll > continue to fix those Done.
:) cython-devel-py3 is ready for public testing. Stefan From robertwb at math.washington.edu Fri May 16 21:58:59 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 12:58:59 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482D6099.8000308@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> Message-ID: <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> On May 16, 2008, at 3:23 AM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> I would rather that string literals be interpreted according to the C >> library they're linked against. I'm also thinking about code that, >> say, returns string literals. I would much rather it returns str in >> Py2 and unicode in Py3. > > That would be unexpected, especially in Py3 where the two are > distinct types. They are two distinct types in Py2 as well. (Well, str is being renamed to bytes, and unicode to str?) > As I said to Lisandro in another post: > > S> If you want source compatibility, you can't change the semantics > based on > S> the compile time environment - except for the cases where the > runtime > S> environments really differ (such as byte/unicode identifiers). > Imagine you > S> had some latin-1 encoded XML byte literal in your code. In Py2, > under your > S> proposal, this would become a byte string that can be parsed. In > Py3, > S> however, this would suddenly become a unicode string and the > parser would > S> refuse to handle it, as it's no longer ISO encoded. Yes, I saw this. I don't see how this is an issue--due to your work implementing PEP 263 any string literal has a well-defined unicode meaning. 
>> Note, this is not something that needs to be done to get ready for >> Py3--it's an assumption that unqualified string literals are the same >> type as python identifiers. > > This happens to be a correct assumption in Py2 and Py3, but I don't > see the link. > > >> I was doing some playing around with str and unicode in Python, and I >> noticed that it will automatically convert between the two (no >> explicit encoding needed) as long as the data in question is pure >> ASCII. > > That would be Py2. Py3 will never attempt any kind of automatic > conversion > between bytes and str. And I am convinced that Cython shouldn't do > that either. I haven't played with Py3. I was actually (pleasantly) surprised it worked in Py2, but if it's being discontinued then it doesn't help my case. But it does mean that stuff like --- a.pyx --- def foo(x): if x > 0: return "good" else: return "bad" ------------- import a print "3 is %s" % a.foo(3) won't work in both Py2 and Py3, which I think it should. "Principle of least surprise." - Robert From dalcinl at gmail.com Fri May 16 22:43:38 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 17:43:38 -0300 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> Message-ID: On 5/16/08, Robert Bradshaw wrote: > I haven't played with Py3. I was actually (pleasantly) surprised it > worked in Py2, but if it's being discontinued then it doesn't help my > case. 
But it does mean that stuff like > > --- a.pyx --- > > def foo(x): > if x > 0: > return "good" > else: > return "bad" > ------------- > > import a > print "3 is %s" % a.foo(3) > > won't work in both Py2 and Py3, which I think it should. "Principle > of least surprise." > Yep, that's the reason I noisily complained about this... But still, I can live with it provided that Cython has an easy, non-programmatic way, like a command line flag, to create "abc" literals as unicode. By easy I mean a command line flag; by non-programmatic, I mean I do not have to use __future__ imports. Regarding your example, it will not fail in Py3, but you will get the following: >>> print("3 is %s" % b"good") 3 is b'good' >>> -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Fri May 16 22:47:26 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 13:47:26 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> Message-ID: <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> On May 16, 2008, at 1:43 PM, Lisandro Dalcin wrote: > On 5/16/08, Robert Bradshaw wrote: >> I haven't played with Py3. I was actually (pleasantly) surprised it >> worked in Py2, but if it's being discontinued then it doesn't >> help my >> case.
But it does mean that stuff like >> >> --- a.pyx --- >> >> def foo(x): >> if x > 0: >> return "good" >> else: >> return "bad" >> ------------- >> >> import a >> print "3 is %s" % a.foo(3) >> >> won't work in both Py2 and Py3, which I think it should. "Principle >> of least surprise." >> > > Yep, that's the reason I noisily complained about this... But still, I > can live with it provided that Cython has an easy, non-programmatic > way, like a command line flag, to create "abc" literals as unicode. > By easy I mean a command line flag; by non-programmatic, I mean I > do not have to use __future__ imports. What it does mean is that you have to ship two separate sets of C files, which is why I'm pushing for this specific change to be decided at C compile time. To be clear, I think it makes sense for a string literal to be a str. > Regarding your example, it will not fail in Py3, but you will get > the following: > >>>> print("3 is %s" % b"good") > 3 is b'good' OK, that's almost worse (imagine this getting inserted into some database rather than being displayed to the user). - Robert From dalcinl at gmail.com Fri May 16 23:04:14 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 18:04:14 -0300 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> References: <482C6B1E.10003@behnel.de> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> Message-ID: On 5/16/08, Robert Bradshaw > What it does mean is that you have to ship two separate sets of C > files, which is why I'm pushing for this specific change to be > decided at C compile time. Indeed. That would be a bit painful for large projects with many generated C files.
It would not affect my two current projects much, as I only generate a single but big 2MB C source. > To be clear, I think it makes sense for a > string literal to be a str. I'm on your side. Because of that, I pushed for the other way, I mean, iff one really wants to use a byte string literal with DATA (for example, for sending ASCII data through a wire), then explicitly use b"abc". But perhaps I'm still missing some point here... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Fri May 16 23:27:50 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 16 May 2008 23:27:50 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> Message-ID: <482DFC56.9030608@student.matnat.uio.no> Robert Bradshaw wrote: > .. > case. But it does mean that stuff like > > --- a.pyx --- > > def foo(x): > if x > 0: > return "good" > else: > return "bad" > ------------- > > import a > print "3 is %s" % a.foo(3) > > won't work in both Py2 and Py3, which I think it should. "Principle > of least surprise." If you import unicode_literals in that pyx (or, presumably, include an equivalent command line parameter to Cython), it will generate unicode strings. The latter example will then work in Py3 (%s is unicode), and in Py2 (%s auto-converts from unicode).
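A minimal sketch of the difference discussed above, assuming a Python 3 interpreter. The names foo_text and foo_bytes are hypothetical stand-ins for the a.foo example; explicitly prefixed literals are used here instead of the future import, so the snippet shows what each kind of literal does when formatted into text:

```python
def foo_text(x):
    # Text (unicode) literals: what unicode_literals would give for "good"/"bad"
    return u"good" if x > 0 else u"bad"


def foo_bytes(x):
    # Byte string literals: what an unprefixed literal means under Py2 semantics
    return b"good" if x > 0 else b"bad"


# Formatting text into text behaves as the caller expects.
text_message = u"3 is %s" % foo_text(3)

# Formatting bytes into text does not fail in Python 3; it embeds the repr.
bytes_message = u"3 is %s" % foo_bytes(3)

print(text_message)
print(bytes_message)
```

Under those assumptions the second message carries the b'...' wrapper that was pointed out earlier in the thread, which is why returning text from foo is the safer choice for callers that print or format the result.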
-- Dag Sverre From dalcinl at gmail.com Sat May 17 00:01:03 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 16 May 2008 19:01:03 -0300 Subject: [Cython] cython-devel-py3: problem with keywords Message-ID: Stefan, I've just pulled from cython-devel-py3, and I got the following error: Traceback (most recent call last): File "../../test/test_environ.py", line 1, in from mpi4py import MPI File "ExceptionP.pyx", line 58, in mpi4py.MPI (src/MPI.c:45238) error_code = property(Get_error_code, doc="error code") TypeError: keywords must be strings Diving inside the generated sources, it seems that keyword names in a call statement are being created as byte strings. Does it make sense to treat them as identifiers, putting the appropriate flag in the string table? I've tried hard to provide the patch following that idea, but I got lost ;-( -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Sat May 17 00:23:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 15:23:12 -0700 Subject: [Cython] Status update: Transform utilities In-Reply-To: <482DB943.3090501@student.matnat.uio.no> References: <482DB943.3090501@student.matnat.uio.no> Message-ID: <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> On May 16, 2008, at 9:41 AM, Dag Sverre Seljebotn wrote: > I'll create a ticket shortly containing my local branch so far. (The > phase refactorings I've posted earlier is *not* included, this is all > useful stuff :-) ). > > If anybody wants to start a process of applying it or discuss it then > fine, but there's no hurry, if nothing happens I'll just ping the list > again when Robert has more time available.
:) My first impression: I always feel funny putting blocks of code inside string literals--but this looks more sane (compared to, say doing string substitutions!). Temps can be done by using a ExprNodes.TempNode. > A summary: > > I'll get to the main feature first so that you have a reason for > applying this :-) It should now be possible to do stuff like: > > class WithTransform(VisitorTransform): > # from with transform PEP... > with_fragment = TreeFragment(u""" > _mgr = (EXPR) > _exit = mgr.__exit__ > _value = mgr.__enter__() > _exc = True > try: > try: > VAR = _value > BLOCK > ... > > ... > """) > > def process_WithStatementNode(self, node): > return self.with_fragment.substitute({ > "EXPR" : node.expr, > "VAR" : node.var, > "BLOCK" : node.body > }) > > :-) > > (The above is simplified, there's not always a VAR. Also it needs > another feature before it can be completely streamlined (automatic > "temporaries" that won't clash in the namespace; basically, > "with_fragment.substitute(..., temps=("_mgr", ...)). When that is > done, > supporting the with statement is about as much work as extending the > parser, the transform/implementation comes for free. > > - I've already discussed the CodeWriter. It only supports a limited > subset (with some holes, ~30% perhaps) at this time; but it's what I > need for now (for unit tests). I might work further on that too. > > - Some changes to Transform.py which I hope goes through... there's a > Visitor object there; using the "process_ClassName" pattern (I think > that was the conclusion for future performance reasons). > > - A clone_node method on Node for proper node copying (shallow object > copy except child node lists, which are also copied). 
> > - Here's the controversial bit: > > In order to be able to provide proper error messages for string-based > code snippets like the above (which are passed to Parsing.py...); I've > changed the pointer to the source code (used as the first element > in the > position tuples found everywhere...) from being a string filename to > being a SourceDescriptor object. > > A SourceDescriptor can currently be a FileSourceDescriptor, in which > case things work like before (it gives the filename on __str__ so much > code needed not change), or a StringSourceDestriptor which I use > for my > new code... > > I hope you see the advantages to this from the above code. (There are > less intrusive ways to do this, but they would only be hacky and > postpone the problem. Better do it properly...? BTW this pattern is > rather common, consider for instance Source in the XML Transform > APIs/TrAX.) > > > > -- > Dag Sverre > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Sat May 17 00:24:02 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 15:24:02 -0700 Subject: [Cython] Status update: Transform utilities In-Reply-To: <482DBCC6.9060108@student.matnat.uio.no> References: <482DB943.3090501@student.matnat.uio.no> <482DBCC6.9060108@student.matnat.uio.no> Message-ID: On May 16, 2008, at 9:56 AM, Dag Sverre Seljebotn wrote: > Dag Sverre Seljebotn wrote: >> I'll create a ticket shortly containing my local branch so far. (The >> phase refactorings I've posted earlier is *not* included, this is all >> useful stuff :-) ). > > It is ticket 11: > > http://trac.cython.org/cython_trac/ticket/11 > > I was hoping Trac would show the diff in a better way (with hg commit > comments) -- there are many individual commits. I uploaded a bundle as > well... Yeah. That can be changed. 
I'll make it so bundles are expandable as they are for sage as well. > The last commit is rather large; as I've noted: > > It is a rather big commit, however seperating it is non-trivial. > The tests > for all of these features all rely on using each other, so there's a > circular dependency in the tests and I wanted to commit the tests and > features at the same time. (However, the non-test-code does not have a > circular > dependency.) > > About unit tests: > - I've put unit tests in Tests/ subdirectories to the files they > test > - They are not (yet) automatically run from runtests.py. > > -- > Dag Sverre > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From dagss at student.matnat.uio.no Sat May 17 01:53:29 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 17 May 2008 01:53:29 +0200 (CEST) Subject: [Cython] Status update: Transform utilities In-Reply-To: <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> References: <482DB943.3090501@student.matnat.uio.no> <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> Message-ID: <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> > My first impression: I always feel funny putting blocks of code > inside string literals--but this looks more sane (compared to, say > doing string substitutions!). Temps can be done by using a > ExprNodes.TempNode. Ahh, didn't notice such a thing was already there, thanks! (Ideally I'd like for it to contain a reference to a handle for the temporary rather than "being" the temporary though, so that one could more easily clone and chop and glue the tree without destroying it. But we'll deal with this later...) If it helps with the funny feeling, the string is only parsed directly in the constructor for TreeFragment (and it accepts a more directly created node structure too). 
So the string disappears from the story after module load time; from there it is only tree node manipulation (the TreeFragment acts like a template which is cloned while substituting nodes on the clone). So there's nothing that requires the string but notational convenience. Dag Sverre From stefan_ml at behnel.de Sat May 17 06:45:30 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 06:45:30 +0200 Subject: [Cython] from __future__ import ... In-Reply-To: <482DBE99.3090908@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> Message-ID: <482E62EA.9010500@behnel.de> Hi, Stefan Behnel wrote: > Lisandro Dalcin wrote: >> in Python 2.6 "abc" and b"abc" both return a byte string >> 'str' type, and 'bytes' type is an alias for 'str' type. > > As I said before, that's fine with me. Done. Cython can now parse "abc" u"abc" b"abc" in the same source, and also ur"abc" br"abc" which were previously unavailable. However, it doesn't have a "bytes" type alias yet. I'm not sure where to put that. Maybe Robert or Greg have an idea? Stefan From robertwb at math.washington.edu Sat May 17 06:57:05 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 21:57:05 -0700 Subject: [Cython] from __future__ import ... 
In-Reply-To: <482E62EA.9010500@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> <482E62EA.9010500@behnel.de> Message-ID: <6BF72241-4EBF-41B4-B842-9D7943351E49@math.washington.edu> On May 16, 2008, at 9:45 PM, Stefan Behnel wrote: > Hi, > > Stefan Behnel wrote: >> Lisandro Dalcin wrote: >>> in Python 2.6 "abc" and b"abc" both return a byte string >>> 'str' type, and 'bytes' type is an alias for 'str' type. >> >> As I said before, that's fine with me. > > Done. Cython can now parse > > "abc" > u"abc" > b"abc" > > in the same source, and also > > ur"abc" > br"abc" > > which were previously unavailable. What do ur and br mean? > > However, it doesn't have a "bytes" type alias yet. I'm not sure > where to put > that. Maybe Robert or Greg have an idea? > > Stefan > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From stefan_ml at behnel.de Sat May 17 07:29:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 07:29:02 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> Message-ID: <482E6D1E.9050306@behnel.de> Hi, Robert Bradshaw wrote: >>> --- a.pyx --- >>> >>> def foo(x): >>> if x > 0: >>> return "good" >>> else: >>> return "bad" >>> ------------- >>> >>> import a >>> print "3 is %s" % 
a.foo(3) >>> >>> won't work in both Py2 and Py3, which I think it should. "Principle >>> of least surprise." > > What it does mean is that you have to ship two separate sets of C > files, Not at all. You just have to state what you mean *in your source file*, i.e. in your Cython source. If you say "I want this literal to be a byte string", you will get a byte string in both Py2 and Py3. If you say "I want this literal to be a unicode string", you will get a unicode string in both Py2 and Py3. How is that a surprise? Just because your code assumes that a byte string is the same as a unicode string does not mean Cython has to take measures to fix this for you, especially in a way that you might or might not have intended. Be explicit. You now have a number of ways to say what a literal should be, based on the Py2 syntax + the 'b' prefix of Py3: u"abc" # a unicode string "abc" # a byte string b"abc" # a byte string You can do from __future__ import unicode_literals u"abc" # a unicode string "abc" # a unicode string b"abc" # a byte string This is actually different from Py3 (and I think in line with 2.6) in that both the 'u' prefix and the 'b' prefix are allowed at the same time. I think that's ok, you will just have to use them wisely. Some style guide policies will save your project here. ;) Stefan From stefan_ml at behnel.de Sat May 17 07:34:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 07:34:38 +0200 Subject: [Cython] from __future__ import ... 
In-Reply-To: <6BF72241-4EBF-41B4-B842-9D7943351E49@math.washington.edu> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> <482E62EA.9010500@behnel.de> <6BF72241-4EBF-41B4-B842-9D7943351E49@math.washington.edu> Message-ID: <482E6E6E.5040407@behnel.de> Hi, Robert Bradshaw wrote: > On May 16, 2008, at 9:45 PM, Stefan Behnel wrote: >> ur"abc" >> br"abc" >> >> which were previously unavailable. > > What do ur and br mean? Unescaped raw unicode and raw byte strings. Py2 accepts them, but Pyrex doesn't. Example: http://hg.cython.org/cython-devel-py3/file/tip/tests/run/strliterals.pyx Stefan From robertwb at math.washington.edu Sat May 17 07:40:31 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 22:40:31 -0700 Subject: [Cython] from __future__ import ... In-Reply-To: <482E6E6E.5040407@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> <482E62EA.9010500@behnel.de> <6BF72241-4EBF-41B4-B842-9D7943351E49@math.washington.edu> <482E6E6E.5040407@behnel.de> Message-ID: On May 16, 2008, at 10:34 PM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> On May 16, 2008, at 9:45 PM, Stefan Behnel wrote: >>> ur"abc" >>> br"abc" >>> >>> which were previously unavailable. >> >> What do ur and br mean? > > Unescaped raw unicode and raw byte strings. Py2 accepts them, but > Pyrex > doesn't. Example: > > http://hg.cython.org/cython-devel-py3/file/tip/tests/run/ > strliterals.pyx Oh, I should have known that--I use r"abc" all the time. 
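A small sketch of what the raw prefixes change, written for plain Python rather than Cython. It uses r"..." and br"..." only, on the assumption that it may be run under Python 3, which does not accept the ur"..." spelling; the variable names are illustrative:

```python
# In a normal literal, backslash-n is a single newline character.
escaped = "a\nb"

# In a raw literal the backslash is kept, so this is four characters.
raw_text = r"a\nb"

# The same applies to a raw byte string: four bytes, no newline.
raw_bytes = br"a\nb"

lengths = (len(escaped), len(raw_text), len(raw_bytes))
print(lengths)
```

The raw forms are mainly a convenience for regular expressions and similar data where backslashes should reach the consumer unchanged.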
- Robert From robertwb at math.washington.edu Sat May 17 07:40:35 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 16 May 2008 22:40:35 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482E6D1E.9050306@behnel.de> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> Message-ID: <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> On May 16, 2008, at 10:29 PM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >>>> --- a.pyx --- >>>> >>>> def foo(x): >>>> if x > 0: >>>> return "good" >>>> else: >>>> return "bad" >>>> ------------- >>>> >>>> import a >>>> print "3 is %s" % a.foo(3) >>>> >>>> won't work in both Py2 and Py3, which I think it should. >>>> "Principle >>>> of least surprise." >> >> What it does mean is that you have to ship two separate sets of C >> files, > > Not at all. You just have to state what you mean *in your source > file*, i.e. > in your Cython source. If you say "I want this literal to be a byte > string", > you will get a byte string in both Py2 and Py3. If you say "I want > this > literal to be a unicode string", you will get a unicode string in > both Py2 and > Py3. How is that a surprise? I think we're all OK on being able to specify byte string or unicode string. It's a question of what happens when you don't specify one or the other. > > Just because your code assumes that a byte string is the same as a > unicode > string does not mean Cython has to take measures to fix this for you, > especially in a way that you might or might not have intended. Be > explicit. 
> > You now have a number of ways to say what a literal should be, > based on the > Py2 syntax + the 'b' prefix of Py3: > > u"abc" # a unicode string > "abc" # a byte string > b"abc" # a byte string I'm suggesting "abc" is a byte string when linked against Py2, and a unicode string when linked against Py3. This way string literals from the module have the same type as string literals in the ambient python environment. There is no way to say that in the above proposal. > > You can do > > from __future__ import unicode_literals > > u"abc" # a unicode string > "abc" # a unicode string > b"abc" # a byte string > > > This is actually different from Py3 (and I think in line with 2.6) > in that > both the 'u' prefix and the 'b' prefix are allowed at the same > time. I think > that's ok, you will just have to use them wisely. Some style guide > policies > will save your project here. ;) > > Stefan > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From stefan_ml at behnel.de Sat May 17 07:47:00 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 07:47:00 +0200 Subject: [Cython] cython-devel-py3: problem with keywords In-Reply-To: References: Message-ID: <482E7154.4050705@behnel.de> Hi, Lisandro Dalcin wrote: > I've just pulled from cython-devel-py3, and I got the following error: > > Traceback (most recent call last): > File "../../test/test_environ.py", line 1, in > from mpi4py import MPI > File "ExceptionP.pyx", line 58, in mpi4py.MPI (src/MPI.c:45238) > error_code = property(Get_error_code, doc="error code") > TypeError: keywords must be strings > > Diving inside the generated sources, it seems that keyword names in a > call statement are being created as byte strings. > > Does it make sense to treat them as identifiers, putting the > appropriate flag in the string table? That's actually a tricky problem. 
Py2 does not accept unicode as keyword arguments and Py3 requires them, so we have to distinguish between Py2 and Py3 here. However, keywords are not really identifiers. You could pass any byte string or unicode string in Py2 using the **dict syntax. To make things worse, Cython stores keyword arguments as a StringNode in a generic DictItemNode of a DictNode. So I find it difficult to figure out the right place to store the information that the string must behave like an identifier. BTW, Py3a5 has a bug here. It crashes when you pass byte strings as keyword arguments to ParseTupleAndKeywords. I filed a bug report and this has been fixed, so I guess Py3b1 will raise a TypeError. Stefan From stefan_ml at behnel.de Sat May 17 08:06:13 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 08:06:13 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> Message-ID: <482E75D5.3060806@behnel.de> Hi, Robert Bradshaw wrote: > I'm suggesting "abc" is a byte string when linked against Py2, and a > unicode string when linked against Py3. This way string literals from > the module have the same type as string literals in the ambient > python environment. There is no way to say that in the above proposal. Admittedly, that's how the 2to3 tool behaves, but that's different as it works on the source level. 
So once you have invested the work to port your code to Py2.6 and appended all your byte string literals with a 'b', 2to3 will do the right thing when porting your code to Py3. So what you suggest is that users who want to port their Cython code to Py3 are required to append all their byte literals with 'b'? Then what would be so bad in also requiring them to add the future import at the top of their source file to be explicit about the requested semantics? Remember, automatic conversion between the two is gone in Py3. Code that currently assumes an equality of unicode strings and byte strings is fundamentally broken, beyond the simple printing of strings. It cannot be fixed by simply transmogrifying all byte strings into unicode strings. Plus, what would you do about this kind of code: cdef char* s = _some_c_call() result = s + "abc" There's no "automatic" way to heal this. People *will* have to invest work to port their code. Stefan From greg.ewing at canterbury.ac.nz Sat May 17 10:31:40 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 20:31:40 +1200 Subject: [Cython] Division (from __future__ import ...) In-Reply-To: References: <482C6B1E.10003@behnel.de> <482CE6FE.6070707@canterbury.ac.nz> Message-ID: <482E97EC.1090702@canterbury.ac.nz> Lisandro Dalcin wrote: > cdef int p = 5/2 Actually if 5/2 were a float, that would be an error in current Pyrex, as it disallows assigning a float directly to an int (so as not to provoke warnings from C++ compilers). 
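A minimal sketch of the distinction at stake here, written in plain Python 3 rather than Pyrex (the helper names are made up for illustration, and C-level division in Pyrex or Cython is not shown): under true division 5/2 is a float, so an integer result has to be requested with // instead.

```python
# Illustrative only: Python 3 division semantics, not Pyrex/Cython C arithmetic.
# true_div and floor_div are hypothetical helper names.

def true_div(a, b):
    # "/" applied to two ints yields a float in Python 3
    return a / b

def floor_div(a, b):
    # "//" keeps an integer result for int operands
    return a // b

print(type(true_div(5, 2)).__name__, true_div(5, 2))
print(type(floor_div(5, 2)).__name__, floor_div(5, 2))
```

Under these semantics, a declaration like the quoted one would presumably have to use // or an explicit conversion to stay an integer.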
-- Greg From greg.ewing at canterbury.ac.nz Sat May 17 11:02:44 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 21:02:44 +1200 Subject: [Cython] Pyrex and const (Re: constify Cython output all over the place (newbie approach)) In-Reply-To: References: <9107d71e527446535929.1210783592@tugrik2.mns.mnsspb.ru> <482B1636.7040100@googlemail.com> <482B8519.7000207@canterbury.ac.nz> <200805151211.12584.kirr@mns.spb.ru> <482BFC63.2050804@canterbury.ac.nz> <482CD1FC.6070703@canterbury.ac.nz> Message-ID: <482E9F34.90703@canterbury.ac.nz> Lisandro Dalcin wrote: > But when you see full and correct, do you mean that > Pyrex will match the behavior of a C/C++ compiler like gcc in the > example above Below? > const int * p = a; > int * const q = a; > const int * const r = a; That's the intention, yes. If I'm going to do it at all, I might as well do it properly. -- Greg From greg.ewing at canterbury.ac.nz Sat May 17 11:47:36 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 21:47:36 +1200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> Message-ID: <482EA9B8.1000809@canterbury.ac.nz> Robert Bradshaw wrote: > I think we're all OK on being able to specify byte string or unicode > string. It's a question of what happens when you don't specify one or > the other. From what I've seen so far, the answer appears to be that it gets Damned Confusing (tm). 
-- Greg From greg.ewing at canterbury.ac.nz Sat May 17 12:00:54 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 22:00:54 +1200 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <200805161402.38996.kirr@mns.spb.ru> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> <200805161402.38996.kirr@mns.spb.ru> Message-ID: <482EACD6.5070107@canterbury.ac.nz> Kirill Smelkov wrote: > http://hg.cython.org/pyrex/ > > So, I think everyone would be grateful to you if you'll keep it up-to-date. I've just added an hg push command to my upload script, so it should happen fairly automatically now. > Yes, you'll lose precise control over what Pyrex is, but given there is a shell of > motivated people around, I think Pyrex and you will gain much more in return ... > > That's how I think ... > Please let me know *your* thoughts. What worries me is that if multiple people are hacking on the Pyrex source at once, I'm going to lose track of how it works, and then I won't be able to contribute to it myself any more. Python has many people working on it, but it's a lot simpler in structure. It's fairly easy to work on things like adding a new object or library module without fear of disturbing anything else. The various parts of Pyrex are much more closely coupled than that. I have to think long and hard before changing anything, and I'm the one who wrote it. Also, I'm a bit overwhelmed by the pace of change going on in the Cython project. It seems to be heading off in directions rather different from Pyrex, such as turning into a Python compiler, and/or a NumPy compiler, and folks are rushing into making changes for py3k, when I've hardly begun to think about what I want to do about that. If all that were happening to Pyrex itself, I wouldn't be able to keep up. So it would either race ahead of me, or I would be holding it back.
So I think it's probably a good thing having Cython as a separate project where people are free to try out all their wild ideas. Then when the dust has settled I can take the best ideas and fold them back into Pyrex. Think of it as the "Pyrex-Unstable" branch if you like. :-) -- Greg From greg.ewing at canterbury.ac.nz Sat May 17 12:14:01 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 22:14:01 +1200 Subject: [Cython] ANN: Pyrex 0.9.8.1 Message-ID: <482EAFE9.2080207@canterbury.ac.nz> Pyrex 0.9.8.1 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Base classes no longer need to be specified in a forward declaration of an extension type, or in the implementation part of an extension type defined in a .pxd file. Also, I've come up with an even better way of handling circular cimports involving structs, unions and extension types. You can now say things like from spam import struct Foo, union Blarg, class Ham This simultaneously imports the name and forward-declares it in the other module. What is Pyrex? -------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. From greg.ewing at canterbury.ac.nz Sat May 17 12:16:07 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 17 May 2008 22:16:07 +1200 Subject: [Cython] from __future__ import ... In-Reply-To: <482E62EA.9010500@behnel.de> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> <482E62EA.9010500@behnel.de> Message-ID: <482EB067.9040808@canterbury.ac.nz> Stefan Behnel wrote: > However, it doesn't have a "bytes" type alias yet. I'm not sure where to put > that. Maybe Robert or Greg have an idea? 
I don't know about Cython, but in Pyrex it would go in Pyrex.Compiler.Builtin, in the builtin_type_table. -- Greg From dagss at student.matnat.uio.no Sat May 17 15:04:26 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 17 May 2008 15:04:26 +0200 (CEST) Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <482EACD6.5070107@canterbury.ac.nz> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> <200805161402.38996.kirr@mns.spb.ru> <482EACD6.5070107@canterbury.ac.nz> Message-ID: <54653.193.157.243.12.1211029466.squirrel@webmail.uio.no> > Python has many people working on it, but it's a lot simpler > in structure. It's fairly easy to work on things like adding > a new object or library module without fear of disturbing > anything else. > > The various parts of Pyrex are much more closely coupled than > that. I have to think long and hard before changing anything, > and I'm the one who wrote it. We notice that in Cython too, and I've been thinking a lot about this aspect. If you haven't read it already I'd love to hear your comments some time on this thread: http://thread.gmane.org/gmane.comp.python.cython.devel/1384 (BTW my brain was on pause when I wrote the "stagnant part", I agree with Robert's correction). (Also, rereading it, it is not completely clear in some parts, I apologize for this, just ask...) When I push for these changes, I don't want it to be taken as criticism of your work -- the current structure seems to be proper for Pyrex, and was probably more ideal for launching a project like this too (Keep It Simple etc.). It's not a bad structure as such, the problem is that it doesn't seem to scale up to the things we'd like to do with Cython, and the level of developer independence we need. > So I think it's probably a good thing having Cython as a > separate project where people are free to try out all their > wild ideas.
Then when the dust has settled I can take the > best ideas and fold them back into Pyrex. I know that you talk mostly about ideas here, but a note about code: Some new features in Cython might be rather difficult to port over, because of the stuff in the thread above. (But probably these might be the features you're least interested in.) > directions rather different from Pyrex, such as turning > into a Python compiler, and/or a NumPy compiler, and folks > are rushing into making changes for py3k, when I've hardly > begun to think about what I want to do about that. As for NumPy, you might be right :-) the goal is pretty far from Pyrex' goal. I think that is a testimony to the ideas in Pyrex ... they are useful far outside of Pyrex' goals, and I'd like to build on that base for numerical computation use. However, I'll make sure there are no explicit references to NumPy within Cython, except that it might ship a "numpy.pxd" file; it's all about adding generic compiler features that are necessary for such a pxd file and will also be useful for similar libraries. Dag Sverre From languitar at semipol.de Sat May 17 19:02:24 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sat, 17 May 2008 19:02:24 +0200 Subject: [Cython] default parameter values in python function with class variables Message-ID: <482F0FA0.5030806@semipol.de> Hi, there seems to be a semantic difference between cython and native python in the use of defaults and class variables. Consider the following example: >>> class Bar(object): ... CONST = 25 ... def foo(self, value=CONST): ... print value ... 
>>> b = Bar() >>> b.foo() 25 Compiling the same code in cython results in this exception: Error converting Pyrex file to C: ------------------------------------------------------------ class Bar(object): CONST = 25 def foo(self, value=CONST): ^ ------------------------------------------------------------ test.pyx:6:29: undeclared name not builtin: CONST Cython will compile the code like this: class Bar(object): CONST = 25 def foo(self, value=Bar.CONST): print value But importing a module compiled like this results in: >>> import test Traceback (most recent call last): File "", line 1, in ? File "test.pyx", line 3, in test NameError: Bar So how to solve this? Is this a bug or a so-called feature? ;) Johannes From dagss at student.matnat.uio.no Sat May 17 19:25:47 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 17 May 2008 19:25:47 +0200 Subject: [Cython] default parameter values in python function with class variables In-Reply-To: <482F0FA0.5030806@semipol.de> References: <482F0FA0.5030806@semipol.de> Message-ID: <482F151B.2010408@student.matnat.uio.no> Johannes Wienke wrote: > Hi, > > there seems to be a semantic difference between cython and native python > in the use of defaults and class variables. Consider the following example: > >>>> class Bar(object): > ... CONST = 25 > ... def foo(self, value=CONST): > ... print value > So how to solve this? Is this a bug or a so-called feature? ;) > I think we can consider it a "missing feature". That is, I don't think it is something that slipped in by a mistake but it is wanted functionality.
I don't really know for sure though, I just know that there's no lack of examples like this, see: http://wiki.cython.org/enhancements/scope I've added your example now... -- Dag Sverre From robertwb at math.washington.edu Sat May 17 21:05:52 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 17 May 2008 12:05:52 -0700 Subject: [Cython] Status update: Transform utilities In-Reply-To: <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> References: <482DB943.3090501@student.matnat.uio.no> <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> Message-ID: <559AF361-1085-499F-9514-6865A634AA7C@math.washington.edu> On May 16, 2008, at 4:53 PM, Dag Sverre Seljebotn wrote: > >> My first impression: I always feel funny putting blocks of code >> inside string literals--but this looks more sane (compared to, say >> doing string substitutions!). Temps can be done by using a >> ExprNodes.TempNode. > > Ahh, didn't notice such a thing was already there, thanks! > > (Ideally I'd like for it to contain a reference to a handle for the > temporary rather than "being" the temporary though, so that one > could more > easily clone and chop and glue the tree without destroying it. But > we'll > deal with this later...) > > If it helps with the funny feeling, the string is only parsed > directly in > the constructor for TreeFragment (and it accepts a more directly > created > node structure too). So the string disappears from the story after > module > load time; from there it is only tree node manipulation (the > TreeFragment > acts like a template which is cloned while substituting nodes on the > clone). So there's nothing that requires the string but notational > convenience. Yes--the TreeFragment idea makes things much cleaner. Are EXPR et al. special words then? 
- Robert From robertwb at math.washington.edu Sat May 17 21:09:19 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 17 May 2008 12:09:19 -0700 Subject: [Cython] from __future__ import ... In-Reply-To: <482EB067.9040808@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <482C83D6.4030401@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <482DB36E.6030400@behnel.de> <482DBE99.3090908@behnel.de> <482E62EA.9010500@behnel.de> <482EB067.9040808@canterbury.ac.nz> Message-ID: On May 17, 2008, at 3:16 AM, Greg Ewing wrote: > Stefan Behnel wrote: >> However, it doesn't have a "bytes" type alias yet. I'm not sure >> where to put >> that. Maybe Robert or Greg have an idea? > > I don't know about Cython, but in Pyrex it would go > in Pyrex.Compiler.Builtin, in the builtin_type_table. Similar in Cython. - Robert From robertwb at math.washington.edu Sat May 17 21:13:07 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 17 May 2008 12:13:07 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482E75D5.3060806@behnel.de> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> Message-ID: <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> On May 16, 2008, at 11:06 PM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> I'm suggesting "abc" is a byte string when linked against Py2, and a >> unicode string when linked against Py3. 
This way string literals from >> the module have the same type as string literals in the ambient >> python environment. There is no way to say that in the above >> proposal. > > Admittedly, that's how the 2to3 tool behaves, but that's different > as it works > on the source level. So once you have invested the work to port > your code to > Py2.6 and appended all your byte string literals with a 'b', 2to3 > will do the > right thing when porting your code to Py3. > > So what you suggest is that users who want to port their Cython > code to Py3 > are required to append all their byte literals with 'b'? Then what > would be so > bad in also requiring them to add the future import at the top of > their source > file to be explicit about the requested semantics? > > Remember, automatic conversion between the two is gone in Py3. Code > that > currently assumes an equality of unicode strings and byte strings is > fundamentally broken, beyond the simple printing of strings. It > cannot be > fixed by simply transmogrifying all byte strings into unicode > strings. Plus, > what would you do about this kind of code: > > cdef char* s = _some_c_call() > result = s + "abc" This would still work using Py2 semantics (unicode -> str is fine as long as it's ascii). > There's no "automatic" way to heal this. People *will* have to > invest work to > port their code. We can mitigate the pain. If someone writes a module in Cython for Py2, and then someone wants to use it from Py3, they should just be able to compile the C source file (not require the original author to do something special in their Cython source). 
- Robert From stefan_ml at behnel.de Sat May 17 21:28:28 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 21:28:28 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> Message-ID: <482F31DC.3030505@behnel.de> Hi, Robert Bradshaw wrote: >> cdef char* s = _some_c_call() >> result = s + "abc" > > This would still work using Py2 semantics (unicode -> str is fine as > long as it's ascii). But it would not work in Py3. And I'm very much against porting Py2 quirks to Cython, just because "it worked in Py2". >> There's no "automatic" way to heal this. People *will* have to >> invest work to >> port their code. > > We can mitigate the pain. If someone writes a module in Cython for > Py2, and then someone wants to use it from Py3, they should just be > able to compile the C source file (not require the original author to > do something special in their Cython source). Sure, the original author is required to make the source ready for Py3 either way. It's very unlikely that this will work out of the box if the author relied on Py2 semantics of byte strings and unicode strings. 
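The disputed char* case can be sketched in plain Python 3, where the bytes object stands in for whatever a char* would be converted to (this is only an illustration of Py3 semantics, not Cython's generated code, and both function names are hypothetical):

```python
# Illustrative only: how Python 3 treats mixing bytes and text.
# concat_mixed and concat_explicit are hypothetical names.

def concat_mixed(raw, suffix):
    # raw stands in for a char* result, i.e. bytes in Python 3;
    # adding a str suffix raises TypeError because nothing is coerced
    return raw + suffix

def concat_explicit(raw, suffix, encoding="ascii"):
    # decoding explicitly states which text the bytes are meant to be
    return raw.decode(encoding) + suffix

try:
    concat_mixed(b"xyz", "abc")
except TypeError as exc:
    print("mixed concatenation failed:", type(exc).__name__)

print(concat_explicit(b"xyz", "abc"))
```

Concatenating two bytes objects, or decoding first, both work; it is only the implicit mix that Py3 rejects.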
Stefan From dagss at student.matnat.uio.no Sat May 17 21:36:30 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 17 May 2008 21:36:30 +0200 Subject: [Cython] Status update: Transform utilities In-Reply-To: <559AF361-1085-499F-9514-6865A634AA7C@math.washington.edu> References: <482DB943.3090501@student.matnat.uio.no> <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> <559AF361-1085-499F-9514-6865A634AA7C@math.washington.edu> Message-ID: <482F33BE.7050402@student.matnat.uio.no> >> If it helps with the funny feeling, the string is only parsed >> directly in >> the constructor for TreeFragment (and it accepts a more directly >> created >> node structure too). So the string disappears from the story after >> module >> load time; from there it is only tree node manipulation (the >> TreeFragment >> acts like a template which is cloned while substituting nodes on the >> clone). So there's nothing that requires the string but notational >> convenience. > > Yes--the TreeFragment idea makes things much cleaner. Are EXPR et al. > special words then? No -- they are inserted into the trees as a regular NameNode (so any name resolving done by the parser will happen to them -- if this bites us, it is a case for refactoring such functionality to a post-parse transform; although in practice we'll just use another name :-) ). However, since they just sit there in the tree, they can be replaced by much more complex nodes. Hmm.. that's something! I should probably "replace the enclosing ExprStatNode instead, if any". Thanks for putting me on track of a bug :-) The context of the string is a module-level pyx-file without any imports but __builtin__. This can be altered though. (For instance, no file-lookup happens in the case of "cimport" or "include" in such a string, instead an exception is raised.) 
If this simple use of NameNode is not sufficient, one can extend the parser to parse special nodes if in a special mode ("${BLOCK}" parses to "CustomNode(contents="BLOCK")"); however, I don't think that will be necessary. -- Dag Sverre From robertwb at math.washington.edu Sat May 17 22:05:44 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 17 May 2008 13:05:44 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482F31DC.3030505@behnel.de> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> <482F31DC.3030505@behnel.de> Message-ID: On May 17, 2008, at 12:28 PM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >>> cdef char* s = _some_c_call() >>> result = s + "abc" >> >> This would still work using Py2 semantics (unicode -> str is fine as >> long as it's ascii). > > But it would not work in Py3. And I'm very much against porting Py2 > quirks to > Cython, just because "it worked in Py2". It is already there, and we're not going to disable it for Py2 code. >>> There's no "automatic" way to heal this. People *will* have to >>> invest work to >>> port their code. >> >> We can mitigate the pain. If someone writes a module in Cython for >> Py2, and then someone wants to use it from Py3, they should just be >> able to compile the C source file (not require the original author to >> do something special in their Cython source). > > Sure, the original author is required to make the source ready for > Py3 either > way.
It's very unlikely that this will work out of the box if the > author > relied on Py2 semantics of byte strings and unicode strings. My point is that if they're not required to do anything to be ready for Py3, why force them to do so? If I want to use someone else's (Py2) module in Py3, I should be able to just compile their C file. I think it is extremely likely that it will just work out of the box-- Py3 is not that incompatible. - Robert From dagss at student.matnat.uio.no Sat May 17 22:10:42 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 17 May 2008 22:10:42 +0200 Subject: [Cython] Note on variable scope Message-ID: <482F3BC2.1090504@student.matnat.uio.no> Just for coordination: I've played around with getting correct variable scoping for some hours, so don't start on that without asking me what I've got first. (Estimated time left for good implementation: ~1 day). (Yes, I'm everywhere and never finish anything. My excuse: I'd like to tune transforms and their utilities through many widely different practical use cases now, so that the actual results (like the with statement and proper variable scope) can break through to the surface quickly at dev1.)
-- Dag Sverre From stefan_ml at behnel.de Sat May 17 22:11:40 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 22:11:40 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482EA9B8.1000809@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482EA9B8.1000809@canterbury.ac.nz> Message-ID: <482F3BFC.6050300@behnel.de> Greg Ewing wrote: > Robert Bradshaw wrote: > >> I think we're all OK on being able to specify byte string or unicode >> string. It's a question of what happens when you don't specify one or >> the other. > > From what I've seen so far, the answer appears to > be that it gets Damned Confusing (tm). That's the main reason why I am voting for clear semantics, instead of "depends on where you run it". The semantics of source code should be fixed at the time the parser reads it. 
Stefan From stefan_ml at behnel.de Sat May 17 22:31:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 17 May 2008 22:31:38 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> <482F31DC.3030505@behnel.de> Message-ID: <482F40AA.1060903@behnel.de> Hi, Robert Bradshaw wrote: > On May 17, 2008, at 12:28 PM, Stefan Behnel wrote: >> Robert Bradshaw wrote: >>> We can mitigate the pain. If someone writes a module in Cython for >>> Py2, and then someone wants to use it from Py3, they should just be >>> able to compile the C source file (not require the original author to >>> do something special in their Cython source). >> Sure, the original author is required to make the source ready for >> Py3 either >> way. It's very unlikely that this will work out of the box if the >> author >> relied on Py2 semantics of byte strings and unicode strings. > > My point is that if they're not required to do anything to be ready > for Py3, why force them to do so? If I want to use someone else's > (Py2) module in Py3, I should be able to just compile their C file. I > think it is extremely likely that it will just work out of the box-- > Py3 is not that incompatible. This whole discussion sounds a lot like bike shedding to me. I say it would be better if people fixed their code instead of doing guess-work, and I have come up with a couple of examples where current Cython code breaks under the semantics you propose. 
You say it would make more users happy if we adopted these semantics. Why don't we just leave it the way it currently is implemented? That way, we start off with correct but strict semantics that match the way this whole ambiguity problem was fixed in Py3, and then users can decide if they want to fix their code that proves broken under Py3, or if they want to complain on the list that their broken code should be fixed by Cython. That would give us some real-life examples to debate instead of having an undermotivated "I think we have more users who would vote for x" discussion. Stefan From dalcinl at gmail.com Sat May 17 23:08:02 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 17 May 2008 18:08:02 -0300 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482F40AA.1060903@behnel.de> References: <482C6B1E.10003@behnel.de> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> <482F31DC.3030505@behnel.de> <482F40AA.1060903@behnel.de> Message-ID: On 5/17/08, Stefan Behnel wrote: > This whole discussion sounds a lot like bike shedding to me. I say it would be > better if people fixed their code instead of doing guess-work, The problem is what you define as 'broken code'. The following valid Cython (and valid Python) code does not look broken to me, but perhaps it is broken to you? data = [1,2,3] a = getattr(data, "append") a(4) > and I have come > up with a couple of examples where current Cython code breaks under the > semantics you propose. You say it would make more users happy if we adopted > these semantics. > Why don't we just leave it the way it currently is implemented? Because if in the future many people complain, this behavior will perhaps have to change, and that is not good regarding the stability of Cython The Language.
> That way, we > start off with correct but strict semantics that match the way this whole > ambiguity problem was fixed in Py3, and then users can decide if they want to > fix their code that proves broken under Py3, or if they want to complain on > the list that their broken code should be fixed by Cython. That would give us > some real-life examples to debate instead of having an undermotivated "I think > we have more users who would vote for x" discussion. Well, I'm a user, and I cannot easily buy that the three-line snippet I wrote above is going to fail in Py3. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sat May 17 23:27:35 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 17 May 2008 18:27:35 -0300 Subject: [Cython] cython-devel-py3: problem with keywords In-Reply-To: <482E7154.4050705@behnel.de> References: <482E7154.4050705@behnel.de> Message-ID: On 5/17/08, Stefan Behnel wrote: > To make things worse, Cython stores keyword arguments as a StringNode in a > generic DictItemNode of a DictNode. Does it make sense to add a new StringKeywordNode class to handle this? > So I find it difficult to figure out the > right place to store the information that the string must behave like an > identifier. I had the same problem! > BTW, Py3a5 has a bug here. It crashes when you pass byte strings as keyword > arguments to ParseTupleAndKeywords. I filed a bug report and this has been > fixed, so I guess Py3b1 will raise a TypeError. Good to know. I did not notice this because I test with the py3 SVN repo directly, and I update and rebuild it every time I try something.
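Both places where a string acts as a name in this thread, keyword arguments and getattr(), can be tried in plain Python 3. The sketch below only illustrates interpreter behaviour and says nothing about what Cython emits; takes_keywords and call_with_key are hypothetical helpers.

```python
# Illustrative only: Python 3 requires text strings wherever a string
# is used as a name. takes_keywords and call_with_key are hypothetical.

def takes_keywords(**kwargs):
    # return the keyword names that were received
    return sorted(kwargs)

def call_with_key(key):
    # pass a single keyword whose name is supplied at run time
    return takes_keywords(**{key: "error code"})

data = [1, 2, 3]
getattr(data, "append")(4)      # a text-string name works
print(data)
print(call_with_key("doc"))     # a text-string keyword works

for description, action in [
    ("bytes keyword", lambda: call_with_key(b"doc")),
    ("bytes attribute name", lambda: getattr(data, b"append")),
]:
    try:
        action()
    except TypeError as exc:
        print(description, "rejected:", type(exc).__name__)
```

So if an unprefixed literal were compiled as a byte string under Py3, both the property(..., doc=...) call and the getattr snippet would fail in this way, which is why identifier-like strings need separate handling.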
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no Sat May 17 23:45:56 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 17 May 2008 23:45:56 +0200
Subject: [Cython] string literals in Py2 vs Py3
In-Reply-To: <482F40AA.1060903@behnel.de>
References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> <482F31DC.3030505@behnel.de> <482F40AA.1060903@behnel.de>
Message-ID: <482F5214.7080804@student.matnat.uio.no>

Stefan Behnel wrote:
> Hi,
>
> Robert Bradshaw wrote:
>> On May 17, 2008, at 12:28 PM, Stefan Behnel wrote:
>> My point is that if they're not required to do anything to be ready
>> for Py3, why force them to do so? If I want to use someone else's
>> (Py2) module in Py3, I should be able to just compile their C file. I
>> think it is extremely likely that it will just work out of the box--
>> Py3 is not that incompatible.
>
> This whole discussion sounds a lot like bike shedding to me. I say it would be
> better if people fixed their code instead of doing guess-work, and I have come
> up with a couple of examples where current Cython code breaks under the
> semantics you propose. You say it would make more users happy if we adopted
> these semantics.

+1.
This discussion is getting silly. Everything has been reiterated many times over by now; and it is clear that it's not a matter about understanding one another better, it's a matter of disagreeing. Let me add: It seems like a good idea with a forced break in the discussion at this point, and if Stefan's suggestion with waiting for the real world isn't heeded then a nice summary on the wiki listing pros and cons and trying to break up the question into smaller pieces in a more systematic way should be written before starting again. (For new users: Bikeshedding can be read about here: http://green.bikeshed.com/) -- Dag Sverre From greg.ewing at canterbury.ac.nz Sun May 18 00:09:19 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 18 May 2008 10:09:19 +1200 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <54653.193.157.243.12.1211029466.squirrel@webmail.uio.no> References: <482BF9CE.2070100@canterbury.ac.nz> <200805151513.54003.kirr@mns.spb.ru> <482CCE35.8030900@canterbury.ac.nz> <200805161402.38996.kirr@mns.spb.ru> <482EACD6.5070107@canterbury.ac.nz> <54653.193.157.243.12.1211029466.squirrel@webmail.uio.no> Message-ID: <482F578F.8060104@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > If you haven't read it already I'd love to hear your comments some > time on this thread: > > http://thread.gmane.org/gmane.comp.python.cython.devel/1384 Yes, I've been keeping an eye on that, although I haven't been following all the details. There's no doubt that such a structure could be useful for some things. There's already a hint of it in Pyrex in one or two places, where the parser implements a feature by assembling existing parse tree nodes rather than introducing a new node type. More generally, attempting to decouple things where possible is certainly a worthy goal, and I'm open to ideas on how to do that. However, I'm wary of any plan that would involve making sweeping structural changes all at once. 
The current structure may seem overly complex and intertwined, but it's solving a problem with inherently complex and intertwined features. I suspect there's a limit to how much it can be simplified. > It's not as bad a structure as such, the problem is that it doesn't > seem to scale up to the things we'd like to do with Cython, and the level > of developer independence we need. I understand that. I hope you can get it into a state that lets you do what you want. > Some > new features in Cython might be rather difficult to port over, because of > the stuff in the thread above. I know -- I'm not expecting to be able to use much of the actual code, perhaps none. It's the ideas I'm mainly interested in. -- Greg From greg.ewing at canterbury.ac.nz Sun May 18 01:29:39 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 18 May 2008 11:29:39 +1200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482F3BFC.6050300@behnel.de> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482EA9B8.1000809@canterbury.ac.nz> <482F3BFC.6050300@behnel.de> Message-ID: <482F6A63.9050206@canterbury.ac.nz> Stefan Behnel wrote: > That's the main reason why I am voting for clear semantics, instead of > "depends on where you run it". The semantics of source code should be fixed at > the time the parser reads it. The problem with that is that the semantics aren't confined to just the .pyx file -- the end result depends on the environment in which the resulting module is used. 
If the semantics of strings in the module are fixed when the source is Cython-compiled, then you end up with a module which can only be used in one environment, either py2 or py3. Given that, I see little point in trying to make it C-compilable to either a py2 or py3 extension. -- Greg From greg.ewing at canterbury.ac.nz Sun May 18 01:55:26 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 18 May 2008 11:55:26 +1200 Subject: [Cython] default parameter values in python function with class variables In-Reply-To: <482F0FA0.5030806@semipol.de> References: <482F0FA0.5030806@semipol.de> Message-ID: <482F706E.8090708@canterbury.ac.nz> Johannes Wienke wrote: >>>>class Bar(object): > > ... CONST = 25 > ... def foo(self, value=CONST): > ... print value > ... > > test.pyx:6:29: undeclared name not builtin: CONST This seems to work fine with the current Pyrex. -- Greg From robertwb at math.washington.edu Sun May 18 08:49:55 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 17 May 2008 23:49:55 -0700 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482F5214.7080804@student.matnat.uio.no> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482E75D5.3060806@behnel.de> <2C263693-0F42-4528-BFA8-7D81EBB8638D@math.washington.edu> <482F31DC.3030505@behnel.de> <482F40AA.1060903@behnel.de> <482F5214.7080804@student.matnat.uio.no> Message-ID: On May 17, 2008, at 2:45 PM, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Hi, >> >> Robert Bradshaw wrote: >>> On May 17, 2008, at 12:28 PM, Stefan Behnel wrote: >>> My point is that if 
they're not required to do anything to be ready >>> for Py3, why force them to do so? If I want to use someone else's >>> (Py2) module in Py3, I should be able to just compile their C >>> file. I >>> think it is extremely likely that it will just work out of the box-- >>> Py3 is not that incompatible. >> >> This whole discussion sounds a lot like bike shedding to me. I say >> it would be >> better if people fixed their code instead of doing guess-work, and >> I have come >> up with a couple of examples where current Cython code breaks >> under the >> semantics you propose. You say it would make more users happy if >> we adopted >> these semantics. > > +1. This discussion is getting silly. Everything has been reiterated > many times over by now; and it is clear that it's not a matter about > understanding one another better, it's a matter of disagreeing. Yes. I've was sitting on a long response to one of the emails for a while, and I'm glad I finally decided not to send it 'cause it just wasn't going to help. The opinions range from "if you're not already using unicode, your code is already broken" to "if I only ever use 7- bit ASCII, it should just work everywhere." > Let me add: It seems like a good idea with a forced break in the > discussion at this point, and if Stefan's suggestion with waiting for > the real world isn't heeded then a nice summary on the wiki listing > pros > and cons and trying to break up the question into smaller pieces in a > more systematic way should be written before starting again. I don't think we can just wait, this is something that needs to be decided before supporting Py3 (which itself is still in alpha, so we have time). Trying to change later would break backwards compatibility. A wiki on this topic would be very good--discussing the options and pros/cons of each. 
> > (For new users: Bikeshedding can be read about here: > http://green.bikeshed.com/) > > -- > Dag Sverre > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From greg.ewing at canterbury.ac.nz Sun May 18 13:25:45 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 18 May 2008 23:25:45 +1200 Subject: [Cython] ANN: Pyrex 0.9.8.2 Message-ID: <48301239.5090302@canterbury.ac.nz> Pyrex 0.9.8.2 is now available: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ A block of external functions can now be declared nogil at once. cdef extern from "somewhere.h" nogil: ... Also some minor nogil-related bugs have been fixed. What is Pyrex? -------------- Pyrex is a language for writing Python extension modules. It lets you freely mix operations on Python and C data, with all Python reference counting and error checking handled automatically. From stefan_ml at behnel.de Sun May 18 09:13:04 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 18 May 2008 09:13:04 +0200 Subject: [Cython] cython-devel-py3: problem with keywords In-Reply-To: <482E7154.4050705@behnel.de> References: <482E7154.4050705@behnel.de> Message-ID: <482FD700.6000607@behnel.de> Hi, Stefan Behnel wrote: > That's actually a tricky problem. Py2 does not accept unicode as keyword > arguments and Py3 requires them, so we have to distinguish between Py2 and Py3 > here. However, keywords are not really identifiers. You could pass any byte > string or unicode string in Py2 using the **dict syntax. > > To make things worse, Cython stores keyword arguments as a StringNode in a > generic DictItemNode of a DictNode. So I find it difficult to figure out the > right place to store the information that the string must behave like an > identifier. Here is a patch. It adds a new KeywordNameNode class that makes keyword names behave just like identifiers and also interns them. 
The way this is implemented would allow non-ASCII keyword names in Py3, although we can't currently enable that in the parser, as the resulting C code would not work in Py2. You'd end up with UTF-8 encoded byte string names, and you can't pass these directly from Py2, neither can ParseTupleAndKeywords handle them. Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: kwcall-py3.patch Type: text/x-patch Size: 6114 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080518/368a771d/attachment.bin From stefan_ml at behnel.de Sun May 18 19:50:49 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 18 May 2008 19:50:49 +0200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <482F6A63.9050206@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482EA9B8.1000809@canterbury.ac.nz> <482F3BFC.6050300@behnel.de> <482F6A63.9050206@canterbury.ac.nz> Message-ID: <48306C79.70509@behnel.de> Hi, Greg Ewing wrote: > If the semantics of strings in the module are fixed when > the source is Cython-compiled, then you end up with a > module which can only be used in one environment This is proven wrong by the current implementation in Cython. 
Stefan From stefan_ml at behnel.de Sun May 18 23:17:35 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 18 May 2008 23:17:35 +0200 Subject: [Cython] first lessons learned while porting lxml to Py3 Message-ID: <48309CEF.2010102@behnel.de> Hi, since we had a lengthy discussion on whether or not non-prefixed byte strings should automatically mutate into unicode strings when compiled for Py3, here are some initial lessons from my first attempt to port lxml. My first approach was (obviously) to import unicode_literals from __future__. This failed miserably, and even showed a couple of further bugs in Cython. :) I then chose the route to explicitly prepend unicode strings with 'u', as I wanted to keep my source compilable with older Cython versions that do not support the 'b' prefix. Currently, I have changed about 700 lines this way in a quick walk-through, and now I'm searching the places where this was the wrong thing to do. :) Most important evidence found: it's definitely non-trivial in a lot of places to decide what has to be unicode and what doesn't. It's non-trivial for me, and definitely not easier for Cython. One important place where I ended up with a lot of trivial changes are docstrings. Here, I would give an almost 100% chance that the user meant a unicode string if it's not prefixed. The remaining cases, e.g. where some external tool may require binary data for some kind of configuration or analysis are rare enough to just ignore them. For exactly this reason (I think), the doctest module in Py3 ignores docstrings that are not unicode. This might be a place where an automatic conversion might make sense (although, if it's the only place, that would be some funny string semantics...) Another important place are exception messages. Here, I'd give a real 100% for string literals, as their only purpose is to be human readable. A field where I really had to take care is when working with byte sequences. 
For example, lxml has a couple of places where strings are converted into UTF-8 and then passed into re.findall() or re.sub(). When substituting, the replacement string obviously has to be a byte string, too. I also found a bug in the Py3 re module when working with byte strings in one specific case. There are actually quite a number of places where strings are built as byte strings by combining and formatting literals, and then converted to a char*. Another place where automatic conversion must not happen. So, while still on the way, my first real-world impression meets my original opinion. There are definitely a lot of unprefixed strings in my own code that are meant to be unicode strings. Simply switching their type in Py3 will fix a lot of them, but at the same time break many others. The things that it fixes are the trivial parts: docstrings and exceptions. Almost everything else really were byte strings, and some were non-trivial things that need real work. If I can choose, I opt for going through this once and then having code that correctly distinguishes between byte strings and unicode strings in *both* Py2 and Py3, instead of additionally having to deal with changing string semantics for identical code in different environments. We might think about a way to simplify the transition from unprefixed docstrings and exception messages to unicode strings. As it currently stands, everything else is definitely out of scope for any automatism. 
Stefan From greg.ewing at canterbury.ac.nz Mon May 19 01:53:11 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 19 May 2008 11:53:11 +1200 Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <48306C79.70509@behnel.de> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482EA9B8.1000809@canterbury.ac.nz> <482F3BFC.6050300@behnel.de> <482F6A63.9050206@canterbury.ac.nz> <48306C79.70509@behnel.de> Message-ID: <4830C167.10306@canterbury.ac.nz> Stefan Behnel wrote: > Greg Ewing wrote: > >>If the semantics of strings in the module are fixed when >>the source is Cython-compiled, then you end up with a >>module which can only be used in one environment > > This is proven wrong by the current implementation in Cython. So what happens when you compile your module so that "foo" is a unicode string, and then use it in 2.x in some situation where it chokes on unicode? 
-- Greg From stefan_ml at behnel.de Mon May 19 13:55:13 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 19 May 2008 13:55:13 +0200 (CEST) Subject: [Cython] string literals in Py2 vs Py3 In-Reply-To: <4830C167.10306@canterbury.ac.nz> References: <482C6B1E.10003@behnel.de> <51801.193.157.243.12.1210878993.squirrel@webmail.uio.no> <482C9D44.8070007@student.matnat.uio.no> <482D56B5.8050706@behnel.de> <999E563C-2B1E-4329-AA13-10FF47D81A5F@math.washington.edu> <482D6099.8000308@behnel.de> <6D61F575-31C4-4120-8803-8B2AE6A3F51B@math.washington.edu> <84FC9B13-DB61-4266-88FA-079ECEBF6315@math.washington.edu> <482E6D1E.9050306@behnel.de> <3F94124F-62AA-46DC-AFE7-DF7B1D024528@math.washington.edu> <482EA9B8.1000809@canterbury.ac.nz> <482F3BFC.6050300@behnel.de> <482F6A63.9050206@canterbury.ac.nz> <48306C79.70509@behnel.de> <4830C167.10306@canterbury.ac.nz> Message-ID: <4093.194.114.62.39.1211198113.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Greg Ewing wrote: > Stefan Behnel wrote: > >> Greg Ewing wrote: >> >>>If the semantics of strings in the module are fixed when >>>the source is Cython-compiled, then you end up with a >>>module which can only be used in one environment >> >> This is proven wrong by the current implementation in Cython. > > So what happens when you compile your module so that > "foo" is a unicode string, and then use it in 2.x in > some situation where it chokes on unicode? Simple: don't write broken code. If you want a byte string, use a byte string. Or use a unicode string and encode it before you use it as byte string. Using a unicode string where a byte string is required is like passing the string "1" where the number 1 is required. I don't think you'd want Pyrex to convert between those two either. If you really think the current implementation has a problem, please provide a code example. 
Stefan

From dalcinl at gmail.com Mon May 19 17:11:16 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 19 May 2008 12:11:16 -0300
Subject: [Cython] first lessons learned while porting lxml to Py3
In-Reply-To: <48309CEF.2010102@behnel.de>
References: <48309CEF.2010102@behnel.de>
Message-ID:

After all your comments, my conclusion is the following:

- Any code dealing with string processing should be written in such a way that byte and unicode strings have to be prefixed, like b"abc" and u"abc". This way, the code is not only semantically correct, but also explicit about the intended meaning of a string literal. Unfortunately, after long discussions, Guido decided that Python 3 would not support the u"abc" form. And now, from a user/developer perspective, although this contradicts the "only one way...", I'm not sure at all if that was a good idea in practice.

- I still think that unprefixed forms should match the builtin 'str' type in Py2 and Py3. This way, things like docstrings, exception messages, and calls like getattr([], "append") will just work. And of course, casting to a raw 'char*' pointer should only accept bytes (by using PyString_AsString). This is going to be safe in Py3, but not in Py2. But Py2 is broken anyway, right?

In my current understanding of the problem, the evil thing is automatic conversion. I'm completely convinced of this, I believe Robert and Greg are also convinced, and Stefan and Dag are definitely sure. Then perhaps a way to make all of us happy is the following:

- Add a '-py3' command line flag to Cython (this will be needed in the future anyway, right?). When this flag is active, then the following will happen at runtime:

  * Something like 'cdef char *p = obj' will only accept a byte string ('str' type in Py2, 'bytes' in Py3). If 'obj' is a unicode string, the generated code raises a TypeError both on a Py2 and a Py3 runtime.
  * A new C pseudo-type has to be added, let's call it 'uchar' (a better name would be needed, as it can be confused with unsigned char). Then something like 'cdef uchar *p = obj' will only accept a unicode string ('unicode' type in Py2 and 'str' in Py3). If 'obj' is a byte string, the generated code raises TypeError both on a Py2 and a Py3 runtime.

I want to remark that the above behavior should be enabled ONLY through a command line switch. This proposal just tries to make Cython/Pyrex stricter, more explicit, even in a Py2 runtime. That's all.

Stefan, please do not get angry if all this is nonsense. All this stuff is going to be a real pain for all Python users; all of us have to be ready to handle this, and to answer questions from confused end-users.

About a week ago, I asked Fernando Perez and Brian Granger for suggestions about how they think I should handle the byte/unicode stuff in mpi4py. I paste Fernando's answer below:

"""
Wow, this one's going to be a *huge* thorn in everyone's side, if I understand the problem correctly. What has been the policy of other projects?
"""

And I believe this is the situation of many, many Python users and developers out there, even of the very smart and productive ones like Fernando. If Cython takes a good direction on all this, then this will be beneficial for the Cython project itself, but also for other people and other projects to follow the right path.

On 5/18/08, Stefan Behnel wrote:
> Hi,
>
> since we had a lengthy discussion on whether or not non-prefixed byte strings
> should automatically mutate into unicode strings when compiled for Py3, here
> are some initial lessons from my first attempt to port lxml.
>
> My first approach was (obviously) to import unicode_literals from __future__.
> This failed miserably, and even showed a couple of further bugs in Cython.
> :)
>
> I then chose the route to explicitly prepend unicode strings with 'u', as I
> wanted to keep my source compilable with older Cython versions that do not
> support the 'b' prefix. Currently, I have changed about 700 lines this way in
> a quick walk-through, and now I'm searching the places where this was the
> wrong thing to do. :)
>
> Most important evidence found: it's definitely non-trivial in a lot of places
> to decide what has to be unicode and what doesn't. It's non-trivial for me,
> and definitely not easier for Cython.
>
> One important place where I ended up with a lot of trivial changes are
> docstrings. Here, I would give an almost 100% chance that the user meant a
> unicode string if it's not prefixed. The remaining cases, e.g. where some
> external tool may require binary data for some kind of configuration or
> analysis are rare enough to just ignore them. For exactly this reason (I
> think), the doctest module in Py3 ignores docstrings that are not unicode.
> This might be a place where an automatic conversion might make sense
> (although, if it's the only place, that would be some funny string semantics...)
>
> Another important place are exception messages. Here, I'd give a real 100% for
> string literals, as their only purpose is to be human readable.
>
> A field where I really had to take care is when working with byte sequences.
> For example, lxml has a couple of places where strings are converted into
> UTF-8 and then passed into re.findall() or re.sub(). When substituting, the
> replacement string obviously has to be a byte string, too. I also found a bug
> in the Py3 re module when working with byte strings in one specific case.
>
> There are actually quite a number of places where strings are built as byte
> strings by combining and formatting literals, and then converted to a char*.
> Another place where automatic conversion must not happen.
>
> So, while still on the way, my first real-world impression meets my original
> opinion.
> There are definitely a lot of unprefixed strings in my own code that
> are meant to be unicode strings. Simply switching their type in Py3 will fix a
> lot of them, but at the same time break many others. The things that it fixes
> are the trivial parts: docstrings and exceptions. Almost everything else
> really were byte strings, and some were non-trivial things that need real work.
>
> If I can choose, I opt for going through this once and then having code that
> correctly distinguishes between byte strings and unicode strings in *both* Py2
> and Py3, instead of additionally having to deal with changing string semantics
> for identical code in different environments. We might think about a way to
> simplify the transition from unprefixed docstrings and exception messages to
> unicode strings. As it currently stands, everything else is definitely out of
> scope for any automatism.
>
> Stefan
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From languitar at semipol.de Mon May 19 17:59:16 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Mon, 19 May 2008 17:59:16 +0200
Subject: [Cython] reference counting for objects stored in C code
Message-ID: <4831A3D4.5050103@semipol.de>

I remember there was something about increasing the reference count of Python objects that are stored by C code without Python knowing about it. But how do I achieve this? I thought there was a section in the Cython manual about that, but I can't find it.
What I tried is:

    cdef extern from "Python.h":
        Py_INCREF(obj)
        Py_DECREF(obj)

    Py_INCREF(dataElement)
    g_hash_table_insert(mapping, dataWrapper, dataElement)

where dataElement is a Python object. But this code results in a segmentation fault in the Py_INCREF statement.

Thanks for the help
Johannes

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080519/66e8603b/attachment.pgp

From arcriley at gmail.com Mon May 19 18:20:28 2008
From: arcriley at gmail.com (Arc Riley)
Date: Mon, 19 May 2008 12:20:28 -0400
Subject: [Cython] reference counting for objects stored in C code
In-Reply-To: <4831A3D4.5050103@semipol.de>
References: <4831A3D4.5050103@semipol.de>
Message-ID:

> What I tried is:
> cdef extern from "Python.h":
>     Py_INCREF(obj)
>     Py_DECREF(obj)
>

It may not make a difference, but try (object) instead.

To use Py_INCREF you need to be holding the GIL. Same for DECREF.

You also need to call these /before/ the respective object can be garbage collected. This means that if the object in question is passed via a property, you have to INCREF it before that property's __set__ returns. If you're putting it in a hash table from a nogil function, then however the Python function got passed in to be stored for use by that function, you have to INCREF it before that setting function returns.
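The ownership rule above can be illustrated from plain Python. This is only a sketch, assuming CPython: it uses `ctypes.pythonapi` to call `Py_IncRef` and `Py_DecRef`, the function forms of the `Py_INCREF`/`Py_DECREF` macros. The `external_store` dict, `store`, `release` and `Payload` are hypothetical names; the dict stands in for a C container such as a GHashTable that keeps only the object's address.

```python
import ctypes
import sys

# Function forms of the Py_INCREF / Py_DECREF macros, exported by CPython.
# ctypes.pythonapi calls them with the GIL held.
incref = ctypes.pythonapi.Py_IncRef
incref.argtypes = [ctypes.py_object]
incref.restype = None
decref = ctypes.pythonapi.Py_DecRef
decref.argtypes = [ctypes.py_object]
decref.restype = None


class Payload:
    """Hypothetical Python object handed over to 'C' storage."""


# Hypothetical stand-in for a C hash table: it keeps raw addresses only,
# so Python's reference counting does not know about the stored object.
external_store = {}


def store(key, obj):
    incref(obj)                    # take ownership before the setter returns
    external_store[key] = id(obj)  # only the address is kept


def release(key):
    address = external_store.pop(key)
    obj = ctypes.cast(address, ctypes.py_object).value
    decref(obj)                    # give up the reference taken in store()


element = Payload()
before = sys.getrefcount(element)
store("k", element)
held = sys.getrefcount(element)
release("k")
after = sys.getrefcount(element)
```

Here `held` is one higher than `before`, and `after` is back at `before`. Without the `incref` call in `store`, the address in `external_store` could outlive the object it points to.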
I can't give much more advice without seeing how you're using it in context, but some sample code from PySoy's use of glib merged with your line of code:

You probably have this already:

    ctypedef void* gpointer
    cdef void g_hash_table_insert ( GHashTable*, gpointer, gpointer )

A typical use case would be as follows:

    def foo(key, dataElement) :
        cdef glib.gchar* _key
        _key = glib.g_strdup(key)
        Py_INCREF(dataElement)
        g_hash_table_insert(mapping, _key, dataElement)

From stefan_ml at behnel.de Mon May 19 18:55:58 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 19 May 2008 18:55:58 +0200 (CEST)
Subject: [Cython] first lessons learned while porting lxml to Py3
In-Reply-To:
References: <48309CEF.2010102@behnel.de>
Message-ID: <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Hi Lisandro,

Lisandro Dalcin wrote:
> After all your comments, my conclusion is the following:
>
> - Any code dealing with string processing should be written in a way
> were byte and unicode strings should have to be prefixed like b"abc"
> and u"abc".

I actually like the way it's in Py3. Unicode is the right thing most of the time - except when you deal with C-APIs as in Cython, where the best place to handle unicode is right below the API level, and nowhere else in your code. :)

> Guido decided Python 3 not to
> support the u"abc" form. And now, from a user/developer perspective,
> althoug this contradict the "only one way...", I'm not sure at all if
> that was a good idea in practice.

It makes writing portable code very hard, just think of code that must support Python 2.3-3.0. I'm currently wrapping all string literals in the test cases in lxml with a function call _bytes() or _str(), which then does the right thing depending on the runtime environment. But it's a whole bunch of work to manually put this all over the place...

> - I still think that unprefixed forms should match the builtin 'str'
> type in Py2 and Py3.
> This way, things like docstrings, exception
> messages, and calls like getattr([], "append") will just work.

That's the problem: those are simple. Simple to find and simple to change, even with a script. All other places where data handling is involved are actually likely to break if we make it a general switch.

I could agree on automatic promotion of docstrings and maybe even exception messages to unicode strings, but such a selective automatism would be somewhat surprising to users. And I'm a big fan of "explicit is better than implicit".

Actually, shipping Cython with a simple script that prefixes all docstrings and "raise" messages with a 'u' would get us a lot of relief here. Maybe someone could write such a beast?

> And of
> course, casting to a raw 'char*' pointer should only accept bytes (by
> using PyString_AsString). This is going to be safe in Py3, but not in
> Py2.

Even in my current implementation, the semantics are not entirely clean here. There are still cases where an explicit string literal gets coerced to the other type, in the DEF statement, for example.

> But Py2 is broken anyway, right?

It's still an important platform, though. Much more important than Py3. :)

> - Add a '-py3' command line flag to Cython (this will be needed in the
> future anyway, right?).

I've always seen that as a way to handle plain Python 3 source code, not Py3-esque Cython code. I think the latter would be better served with one or more explicit __future__ imports. Configuring source semantics outside of the source is something that is hard to keep track of and that can get difficult to manage when you combine code from different sources - which is not so rare as it might seem.

> When this flag is active, then the following will happen at runtime:
>
> * Something like 'cdef char *p = obj' will only accept a byte string
> ('str' type in Py2, 'bytes' in Py3). If 'obj' is an unicode string,
> the generated code raises a TypeError both on a Py2 and a Py3 runtime.
Right, this actually currently works (sort-of) in Py2:

    cdef char* val
    uval = u"abc"
    val = uval
    print repr(val)

prints 'abc' in Py2 and raises a TypeError in Py3. If you use non-ASCII letters, however, this fails with a UnicodeDecodeError in Py2. It would really be better if Cython caught that for the literal case and raised at least a runtime TypeError in the case above. And I mean: always, not just with a command line switch, as this will really help users by showing them where work has to be done.

> * A new C pseudo-type have to be added, lets call it 'uchar' (better
> name would be needed, it can be confused with unsigned char). Then
> something like 'cdef uchar *p = obj' will only accept an unicode
> string ('unicode' type in Py2 and 'str' in Py3). If 'obj' is a byte
> string, the generated code raises TypeError both on a Py2 and a Py3
> runtime.

I assume you mean a conversion to UTF-8 here, in which case "utf8char" would be appropriate IMHO. Still, I find

    s.encode("UTF-8")

so short and explicit that I don't see a major need for a special type name here. And in many, many cases, you will even be able to say

    def dostuff(text):
        cdef char* c_s
        text = text.encode("UTF-8")
        c_s = text
        ...

so you don't even need to care about GC or anything, as "text" will stay alive during the function call.

Regarding the TypeError, this would do the trick:

    def dostuff(unicode text):
        ...

> This proposal just tries to make Cython/Pyrex
> stricter, more explicit, even in a Py2 runtime.

I for one would appreciate such strict semantics even without a command line switch. lxml has grown over a couple of years. If Cython had told me that some things don't work that way and will stop working in Py3, I wouldn't have to walk through the hassle of migrating my code now.

> Stefan, please do not get angry if all this is a non-sense.

I rarely bite. ;) And it's not nonsense at all.
> All this stuff is going to be real pain to all Python users Yep, this really is work, especially if you do not have a handy (though not even reliable) 2to3 tool at hand. > About a week ago, I asked Fernando Perez and Brian Granger about > suggestion about how they think I should handle the byte/unicode stuff > in mpi4py. I paste below Fernando's anwer: > > """ > Wow, this one's going to be a *huge* thorn in everyone's side, if I > understand the problem correctly. What has been the policy of other > projects? > """ Regarding a policy, I have decided to get lxml's Cython code clean and the Python code portable without 2to3. That's more work than trying to find ways to cheat, but it's the right thing to do, and the safest option. > And I believe this is the situation of many, many Python users and > developers out there, even of the very smart and productive ones like > Fernando. If Cython takes a good direction on all this, then this will > be benefical for the Cython project itself, but also for other people > and other projects to follow the right path. I wouldn't mind finding ways to make the migration easier for users. I'm just very, very reluctant to changes that can end up breaking correct code or that keep people from thinking about the implications of the code they write. For example, automatic conversion of plain ASCII byte strings to unicode strings has a great potential of animating people to write code that breaks right the first time someone passes a non-ASCII string in. Getting code right at the time of its writing is very important to keep the maintenance overhead low, and the tools should help here instead of hiding potential bugs. 
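A minimal sketch of that last point, in plain Python 3 rather than Cython: an automatic promotion that assumes ASCII works until the first non-ASCII byte arrives, while an explicit decode makes the author state what the bytes mean. The helper names `promote_ascii` and `promote_explicit` are hypothetical and only stand in for what such a conversion might do; they are not part of Cython or lxml.

```python
def promote_ascii(data):
    # Hypothetical stand-in for an automatic bytes -> unicode promotion
    # that silently assumes the input is plain ASCII.
    return data.decode("ascii")


def promote_explicit(data, encoding):
    # Explicit decoding: the caller has to say what the bytes mean.
    return data.decode(encoding)


def fails_on(data):
    """Return True if the ASCII-only promotion rejects the given bytes."""
    try:
        promote_ascii(data)
    except UnicodeDecodeError:
        return True
    return False
```

With these definitions, `fails_on(b"abc")` is false, but the same call on the UTF-8 bytes of a word with an accented letter is true, which is exactly the kind of bug that only shows up once real-world data is passed in.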
Stefan From stefan_ml at behnel.de Mon May 19 19:03:04 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 19 May 2008 19:03:04 +0200 (CEST) Subject: [Cython] reference counting for objects stored in C code In-Reply-To: <4831A3D4.5050103@semipol.de> References: <4831A3D4.5050103@semipol.de> Message-ID: <55284.194.114.62.39.1211216584.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Johannes Wienke wrote: > I remember there was something about increasing the reference count of > python objects that are stored by C code without without the notice of > python. But how to achieve this? I thought there was a section in the > cython manual about that but I can't find it. > > What I tried is: > cdef extern from "Python.h": > Py_INCREF(obj) > Py_DECREF(obj) > > Py_INCREF(dataElement) If you define these two functions (or macros) that way, Pyrex/Cython will assume they are functions that work on an object (which is ok) and that will return an object (which is not ok), and it will check the result for NULL etc. You can define them this way: cdef extern from "Python.h": cdef void Py_INCREF(object o) cdef void Py_DECREF(object o) Stefan From dalcinl at gmail.com Mon May 19 19:32:44 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 19 May 2008 14:32:44 -0300 Subject: [Cython] first lessons learned while porting lxml to Py3 In-Reply-To: <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On 5/19/08, Stefan Behnel wrote: > I actually like the way it's in Py3. Unicode is the right thing most of > the time - except when you deal with C-APIs as in Cython, where the best > place to handle unicode is right below the API level, and nowhere else in > your code. :) Yep, that make a lot of sense, you are definitely right here... > > Guido decided Python 3 not to > > support the u"abc" form. 
>
> It makes writing portable code very hard, just think of code that must
> support Python 2.3-3.0. I'm currently wrapping all string literals in the
> test cases in lxml with a function call _bytes() or _str(), which then
> does the right thing depending on the runtime environment. But it's a
> whole bunch of work to manually put this all over the place...

Indeed. And I'll probably use the same 'trick' you are using (but perhaps implement it using the Python C-API).

> I could agree on automatic promotion of docstrings and maybe even
> exception messages to unicode strings, but such a selective automatism
> would be somewhat surprising to users. And I'm a big fan of "explicit is
> better than implicit".

You already convinced me that automatic promotion is a really bad idea, even in those 'special' cases.

> Right, this actually currently works (sort-of) in Py2:
>
>     cdef char* val
>     uval = u"abc"
>     val = uval
>     print repr(val)
>
> prints 'abc' in Py2 and raises a TypeError in Py3. If you use non-ASCII
> letters, however, this fails with a UnicodeDecodeError in Py2. It would
> really be better if Cython caught that for the literal case and raised at
> least a runtime TypeError in the case above. And I mean: always, not just
> with a command line switch. As this will really help users by showing them
> where work has to be done.

Perhaps this stricter way will be really helpful. So I'm +1 on it. Still, a non-programmatic way to DISABLE this check would also be needed, just for backward compatibility, targeting Cython users who are not ready to fix their Py2.X code.

> > * A new C pseudo-type have to be added, lets call it 'uchar' (better
> > name would be needed, it can be confused with unsigned char).
> I assume you mean a conversion to UTF-8 here, in which case "utf8char"
> would be appropriate IMHO.

Yes, I meant a conversion to UTF-8.

> Still, I find
>
>     s.encode("UTF-8")
>
> so short and explicit, that I don't see a major need for a special type
> name here. And in many, many cases, you will even be able to say
>
>     def dostuff(text):
>         cdef char* c_s
>         text = text.encode("UTF-8")
>         c_s = text
>         ...
>
> so you don't even need to care about GC or anything, as "text" will stay
> alive during the function call.

It's shorter, but 'cdef utf8char* c_s = text' is even shorter and explicit as well, and it can be implemented with pure Python C-API calls. But

> Regarding a policy, I have decided to get lxml's Cython code clean and the
> Python code portable without 2to3. That's more work than trying to find
> ways to cheat, but it's the right thing to do, and the safest option.

I second you here. I hope at some point something like a 3to2 (note the reversed number) tool is provided. As Python 3 is cleaner and stricter, perhaps going from Python 3 syntax and semantics to those of Python 2 will be easier, and then, in such a scenario, we can write code directly for Python 3 and make it backward compatible with the Py2.X series.
Regards,

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From languitar at semipol.de Mon May 19 21:15:05 2008
From: languitar at semipol.de (Johannes Wienke)
Date: Mon, 19 May 2008 21:15:05 +0200
Subject: [Cython] reference counting for objects stored in C code
In-Reply-To: <55284.194.114.62.39.1211216584.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <4831A3D4.5050103@semipol.de> <55284.194.114.62.39.1211216584.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <4831D1B9.2040201@semipol.de>

On 05/19/2008 07:03 PM, Stefan Behnel wrote:
> If you define these two functions (or macros) that way, Pyrex/Cython will
> assume they are functions that work on an object (which is ok) and that
> will return an object (which is not ok), and it will check the result for
> NULL etc.
>
> You can define them this way:
>
>     cdef extern from "Python.h":
>         cdef void Py_INCREF(object o)
>         cdef void Py_DECREF(object o)

Thanks, that was the problem.

Shouldn't this be mentioned in the documentation?

Johannes

From stefan_ml at behnel.de Mon May 19 21:47:43 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 19 May 2008 21:47:43 +0200
Subject: [Cython] reference counting for objects stored in C code
In-Reply-To: <4831D1B9.2040201@semipol.de>
References: <4831A3D4.5050103@semipol.de> <55284.194.114.62.39.1211216584.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4831D1B9.2040201@semipol.de>
Message-ID: <4831D95F.5060006@behnel.de>

Hi,

Johannes Wienke wrote:
> On 05/19/2008 07:03 PM, Stefan Behnel wrote:
>> If you define these two functions (or macros) that way, Pyrex/Cython will
>> assume they are functions that work on an object (which is ok) and that
>> will return an object (which is not ok), and it will check the result for
>> NULL etc.
>>
>> You can define them this way:
>>
>>     cdef extern from "Python.h":
>>         cdef void Py_INCREF(object o)
>>         cdef void Py_DECREF(object o)
>
> Thanks, that was the problem.
>
> Shouldn't this be mentioned in the documentation?

Sure, please add it to the wiki.

Stefan

From stefan_ml at behnel.de Mon May 19 22:03:35 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 19 May 2008 22:03:35 +0200
Subject: [Cython] first lessons learned while porting lxml to Py3
In-Reply-To:
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <4831DD17.7030409@behnel.de>

Hi,

Lisandro Dalcin wrote:
> On 5/19/08, Stefan Behnel wrote:
>> > Guido decided Python 3 not to
>> > support the u"abc" form.
>>
>> It makes writing portable code very hard, just think of code that must
>> support Python 2.3-3.0. I'm currently wrapping all string literals in the
>> test cases in lxml with a function call _bytes() or _str(), which then
>> does the right thing depending on the runtime environment. But it's a
>> whole bunch of work to manually put this all over the place...
>
> Indeed. And I'll probably use the same 'trick' you are using (but
> perhaps implement it using Python C-API)

Hmmm, but you don't have to do that in Cython code (at least the way it's currently implemented). String literals do not change semantics there, so if you use the correct string types everywhere, the generated C code will just work unchanged in Py2 and Py3. They do, however, change in unmodified Python code - which is a pity if you really have to test byte strings and unicode strings, and can't prefix the first with 'b' as your code has to run in 2.3...

>> Still, I find
>>     s.encode("UTF-8")
>>
>> so short and explicit, that I don't see a major need for a special type
>> name here. And in many, many cases, you will even be able to say
>>
>>     def dostuff(text):
>>         cdef char* c_s
>>         text = text.encode("UTF-8")
>>         c_s = text
>>         ...
>>
>> so you don't even need to care about GC or anything, as "text" will stay
>> alive during the function call.
>
> It's shorter, but 'cdef utf8char* c_s = text' is even shorter and
> explicit as well, and it can be implemented with pure Python C-API
> calls.

But it only works in Py3 as is. In Py2, Cython will have to do the entire magic, including the automatic cleanup of the UTF-8 encoded string. This is comparable to:

    cdef char* s = "abc" + some_string

for which Cython currently raises a compiler error, as you take the pointer to a temporary variable. So this will have to be changed first, before the coercion feature can be enabled for unicode strings in both Py2 and Py3.

>> Regarding a policy, I have decided to get lxml's Cython code clean and the
>> Python code portable without 2to3.
That's more work than trying to find >> ways to cheat, but it's the right thing to do, and the safest option. > > I second you here. I hope at some point something like a 3to2 (note > the reversed number) tool is provided. As Python 3 is cleaner and > stricter, perhaps gowing from 3 sintax and semantics to the 2 one will > be easier, and then, in such scenario, we can writte code directly for > Python 3 and make it backward compatible with Py2.X series. Yep, there was some discussion on the Py3k list on this. No idea what became of it... Stefan From stefan_ml at behnel.de Tue May 20 00:04:19 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 May 2008 00:04:19 +0200 Subject: [Cython] py3 branch merged back Message-ID: <4831F963.70807@behnel.de> Hi, just a quick note that I merged the Py3 branch back into cython-devel. It's stable enough by now, so this will hopefully give it some more testing, and allow others to work on bigger changes without the fear of impacting future merges. Have fun, Stefan From greg.ewing at canterbury.ac.nz Tue May 20 03:25:32 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 20 May 2008 13:25:32 +1200 Subject: [Cython] first lessons learned while porting lxml to Py3 In-Reply-To: References: <48309CEF.2010102@behnel.de> Message-ID: <4832288C.8030401@canterbury.ac.nz> Lisandro Dalcin wrote: > In my current understanding of the problem, the evil thing is > automatic conversion. I'm completelly convinced of this, I believe > Robert and Greg are also convinced I'm convinced that unrestricted automatic conversion between char * and unicode would be a bad idea. I'm not yet totally convinced that Pyrex shouldn't allow it under certain conditions, such as the string containing only ascii code points (checked at run time). For Pyrex, I'm also thinking about not trying to make the language match py3 at all, at least not in every way. For example, I may decide to keep the 'u' prefix for Python unicode literals. 
This probably isn't the right thing for Cython to do if it wants to be a pure-Python compiler, but Pyrex has a different goal -- it's meant to be a half-way house between Python and C. Currently in Pyrex, "xxx" is not a Python type at all -- it's a C type (i.e. char *). It only becomes a Python type when used in a Python context, forcing conversion to a Python string object. I don't think it's necessarily wrong to keep it that way, i.e. "xxx" is a C string, and if you want a Python string object as a literal, you have to say which kind you want with a "b" or "u" prefix. That way, the Pyrex language itself can stay much the same, and you just have to write code that takes care to accept unicode strings if you intend to use it in a py3 environment. > * A new C pseudo-type have to be added, lets call it 'uchar' (better > name would be needed, it can be confused with unsigned char). Then > something like 'cdef uchar *p = obj' will only accept an unicode > string What would it actually point to -- utf8 encoded chars? How would it interact with char *? -- Greg From greg.ewing at canterbury.ac.nz Tue May 20 03:54:05 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 20 May 2008 13:54:05 +1200 Subject: [Cython] first lessons learned while porting lxml to Py3 In-Reply-To: <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <48322F3D.4040309@canterbury.ac.nz> Stefan Behnel wrote: > I actually like the way it's in Py3. Unicode is the right thing most of > the time - except when you deal with C-APIs as in Cython, where the best > place to handle unicode is right below the API level, and nowhere else in > your code. :) This is one of the reasons I think that, in Pyrex, having "xxx" always mean unicode may *not* be the right thing to do. 
In the midst of a Pyrex module, you're just as likely to be dealing with C data and calls as Python ones. -- Greg From greg.ewing at canterbury.ac.nz Tue May 20 04:06:03 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 20 May 2008 14:06:03 +1200 Subject: [Cython] Unicode issues In-Reply-To: References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4832320B.2050900@canterbury.ac.nz> One of the arguments being used against automatic unicode->char * using ascii as the encoding seems to be that it can cause your module to fail at run time. But how is this different from using an explicit encoding operation? It can *still* fail at run time if the unicode string passed can't be represented in the chosen encoding. -- Greg From stefan_ml at behnel.de Tue May 20 09:19:48 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 May 2008 09:19:48 +0200 (CEST) Subject: [Cython] first lessons learned while porting lxml to Py3 In-Reply-To: <4832288C.8030401@canterbury.ac.nz> References: <48309CEF.2010102@behnel.de> <4832288C.8030401@canterbury.ac.nz> Message-ID: <16690.194.114.62.65.1211267988.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Hi Greg, Greg Ewing wrote: > I'm convinced that unrestricted automatic conversion between > char * and unicode would be a bad idea. I'm not yet totally > convinced that Pyrex shouldn't allow it under certain > conditions, such as the string containing only ascii code > points (checked at run time). That would be the way Py2 behaves. > For Pyrex, I'm also thinking about not trying to make the > language match py3 at all, at least not in every way. For > example, I may decide to keep the 'u' prefix for Python > unicode literals. I agree that this would not hurt much. Cython currently allows it. 
> This probably isn't the right thing for Cython to do if it > wants to be a pure-Python compiler, but Pyrex has a different > goal -- it's meant to be a half-way house between Python > and C. Cython has the same goal, meaning that it tries to simplify the work between Python code and C code. But additionally, it wants to support as much of the Python language itself as possible, to lower the entry level for Python programmers, which are the main target audience after all. I think the targeted work-flow will always be: write it in Python, add the C calls to connect to external libraries, optimise by adding type declarations to the Python code. And one of the main goals of Cython is to reduce the need for the last step as much as possible. I think that's what will eventually make it a "pure Python compiler". > Currently in Pyrex, "xxx" is not a Python type at all -- > it's a C type (i.e. char *). It only becomes a Python type > when used in a Python context, forcing conversion to a > Python string object. > > I don't think it's necessarily wrong to keep it that way, > i.e. "xxx" is a C string, and if you want a Python string > object as a literal, you have to say which kind you want > with a "b" or "u" prefix. Makes sense to me, although cdef char* s = b"..." would still be possible and done at compile time, so it's not quite as simple. > That way, the Pyrex language itself can stay much the same, > and you just have to write code that takes care to accept > unicode strings if you intend to use it in a py3 environment. I would say: regardless of the environment. Not checking string input is a bug IMHO. >> * A new C pseudo-type have to be added, lets call it 'uchar' (better >> name would be needed, it can be confused with unsigned char). Then >> something like 'cdef uchar *p = obj' will only accept an unicode >> string > > What would it actually point to -- utf8 encoded chars? I guess so. > How would it interact with char *? Good question. 
:) That raises the question of what a uchar* is good for if you can just assign it to a char* variable. Then you'd have to use this quirk to assign a unicode string as UTF-8 encoded data to a char*:

    cdef char* s = u_str

Vicious! :)

Stefan

From stefan_ml at behnel.de Tue May 20 09:39:45 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 20 May 2008 09:39:45 +0200 (CEST)
Subject: [Cython] Unicode issues
In-Reply-To: <4832320B.2050900@canterbury.ac.nz>
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz>
Message-ID: <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Greg Ewing wrote:
> One of the arguments being used against automatic
> unicode->char * using ascii as the encoding seems
> to be that it can cause your module to fail at run
> time.
>
> But how is this different from using an explicit
> encoding operation? It can *still* fail at run time
> if the unicode string passed can't be represented
> in the chosen encoding.

I agree that that's not an argument. The main arguments IMHO are that

a) you can't be sure that you are actually looking at an ASCII-compatible string (i.e. ISO or UTF-8 encoded)

and

b) this makes it very easy to write buggy code that works perfectly until someone passes non-ASCII characters.

I find it helpful to prevent writing such code right from the beginning, rather than requiring manual fixing when the problem comes up. I think that was one of the main reasons why the types were separated for Py3.
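The separation of the two string types in Py3 can be seen in plain Python 3 code. This is only a small sketch, and the function names `join_parts` and `mixing_is_rejected` are made up for illustration:

```python
def join_parts(a, b):
    # Python 3 never coerces between str and bytes implicitly,
    # so both arguments must already be of the same string type.
    return a + b


def mixing_is_rejected():
    """Return True if adding text to bytes raises a TypeError."""
    try:
        join_parts("abc", b"def")
    except TypeError:
        return True
    return False
```

The mixed call is rejected immediately, even for pure ASCII content, so the author has to write the `.encode(...)` or `.decode(...)` step and pick the encoding at the time the code is written.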
Stefan

From stefan_ml at behnel.de Tue May 20 10:37:07 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 20 May 2008 10:37:07 +0200 (CEST)
Subject: [Cython] Unicode issues
In-Reply-To: <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <55166.194.114.62.65.1211272627.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Hi Greg,

Stefan Behnel wrote:
> Greg Ewing wrote:
>> One of the arguments being used against automatic
>> unicode->char * using ascii as the encoding seems
>> to be that it can cause your module to fail at run
>> time.
>>
>> But how is this different from using an explicit
>> encoding operation? It can *still* fail at run time
>> if the unicode string passed can't be represented
>> in the chosen encoding.
>
> you can't be sure that you are actually looking at an ASCII-compatible
> string (i.e. ISO or UTF-8 encoded)

Sorry, you were actually talking about the unicode->char* case, in which case it can easily be checked that only ASCII characters are used.

    c_ptr = PyString_AsString(PyUnicode_AsASCIIString(s))

would do the right thing. The opposite case would be this, then?

    s = PyUnicode_DecodeASCII(c_ptr, strlen(c_ptr), NULL)

How would you deal with null bytes in the string? (Although I guess that's not a valid use case anyway.)

But there is still the argument that Py3 no longer does this for unicode->bytes coercion... And this:

> and b) this makes it very easy to write
> buggy code that works perfectly until someone passes non-ASCII characters.

isn't really helped either.

> I find it helpful to prevent writing such code right from the beginning,
> rather than requiring manual fixing when the problem comes up. I think
> that was one of the main reasons why the types were separated for Py3.

Stefan

From kirr at mns.spb.ru Tue May 20 11:58:08 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Tue, 20 May 2008 13:58:08 +0400
Subject: [Cython] ANN: Pyrex 0.9.8
In-Reply-To: <482EACD6.5070107@canterbury.ac.nz>
References: <482BF9CE.2070100@canterbury.ac.nz> <200805161402.38996.kirr@mns.spb.ru> <482EACD6.5070107@canterbury.ac.nz>
Message-ID: <200805201358.08696.kirr@mns.spb.ru>

On Saturday 17 May 2008, Greg Ewing wrote:
> Kirill Smelkov wrote:
> > > http://hg.cython.org/pyrex/
> >
> > So, I think everyone would be grateful to you if you'll keep it up-to-date.
>
> I've just added an hg push command to my upload script,
> so it should happen fairly automatically now.
>
> > Yes, you'll lose precise control on what Pyrex is, but given there is a shell of
> > motivated people around, I think Pyrex and you will gain much more in return ...
> >
> > That's how I think ...
> > Please let me know *your* thoughts.
>
> What worries me is that if multiple people are hacking on the
> Pyrex source at once, I'm going to lose track of how it works,
> and then I won't be able to contribute to it myself any more.
>
> Python has many people working on it, but it's a lot simpler
> in structure. It's fairly easy to work on things like adding
> a new object or library module without fear of disturbing
> anything else.
>
> The various parts of Pyrex are much more closely coupled than
> that. I have to think long and hard before changing anything,
> and I'm the one who wrote it.
>
> Also, I'm a bit overwhelmed by the pace of change going on
> in the Cython project. It seems to be heading off in
> directions rather different from Pyrex, such as turning
> into a Python compiler, and/or a NumPy compiler, and folks
> are rushing into making changes for py3k, when I've hardly
> begun to think about what I want to do about that.
> > If all that were happening to Pyrex itself, I wouldn't be > able to keep up. So it would either race ahead of me, or > I would be holding it back. > > So I think it's probably a good thing having Cython as a > separate project where people are free to try out all their > wild ideas. Then when the dust has settled I can take the > best ideas and fold them back into Pyrex. > > Think of it as the "Pyrex-Unstable" branch if you like. :-) Greg, thanks for your reply. Yes, I'll think of Cython as Pyrex-Unstable (or Pyrex-Experimental :) from now on. Thanks again, Kirill. From stefan_ml at behnel.de Tue May 20 13:10:40 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 May 2008 13:10:40 +0200 (CEST) Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <200805201358.08696.kirr@mns.spb.ru> References: <482BF9CE.2070100@canterbury.ac.nz> <200805161402.38996.kirr@mns.spb.ru> <482EACD6.5070107@canterbury.ac.nz> <200805201358.08696.kirr@mns.spb.ru> Message-ID: <38226.194.114.62.65.1211281840.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Kirill Smelkov wrote: > Greg Ewing wrote: >> So I think it's probably a good thing having Cython as a >> separate project where people are free to try out all their >> wild ideas. Then when the dust has settled I can take the >> best ideas and fold them back into Pyrex. >> >> Think of it as the "Pyrex-Unstable" branch if you like. :-) > > Yes, I'll think of Cython as Pyrex-Unstable (or Pyrex-Experimental :) from > now on. I don't think that's consistent with the way the Cython project sees itself. 
Stefan

From bblais at gmail.com Tue May 20 13:27:50 2008
From: bblais at gmail.com (Brian Blais)
Date: Tue, 20 May 2008 07:27:50 -0400
Subject: [Cython] some advice on pointers
Message-ID: <4CD2C0B9-3ACB-434C-9532-DE241BC66D53@gmail.com>

Hello,

I have a pointer to a double, like

    cdef double *d

and I'd like to pass to a function a pointer to the data contained in d, with some offset, like:

    val=myfun(&d[10])

Can I do something like that in Cython? The only other alternative I could think of is to pass the offset as a parameter, like

    val=myfun(d,10)

but that makes the function itself uglier, and less efficient. Am I thinking about this wrong?

thanks,

bb

From f.guerrieri at gmail.com Tue May 20 13:46:24 2008
From: f.guerrieri at gmail.com (Francesco Guerrieri)
Date: Tue, 20 May 2008 13:46:24 +0200
Subject: [Cython] How Cython Works
In-Reply-To: <85e81ba30805161107l553873a1wbdff1b9c9ed6ade0@mail.gmail.com>
References: <85e81ba30805161107l553873a1wbdff1b9c9ed6ade0@mail.gmail.com>
Message-ID: <79b79e730805200446w74abd756gb26f9aa63087e4f3@mail.gmail.com>

On Fri, May 16, 2008 at 8:07 PM, William Stein wrote:
> Hi,
>
> You might find this Cython talk with slides and video interesting:
>
> http://wiki.wstein.org/2008/sageseminar/kantor
>
> Title: How Cython Works
>
>

Thanks for the link, William. The slides are interesting and furthermore it's always nice to look at a presentation done with Beamer :)

bye,
francesco

From martin at martincmartin.com Tue May 20 14:24:20 2008
From: martin at martincmartin.com (Martin C. Martin)
Date: Tue, 20 May 2008 08:24:20 -0400
Subject: [Cython] Recursive vs. visitor pattern
Message-ID: <4832C2F4.1090605@martincmartin.com>

Hi all,

Recently there was some mention of using the visitor pattern to visit every node, versus a recursive overloaded function.
What do people see as the advantage of one over the other? The recursive way seems simpler, but recursion in general is confusing to many, so the visitor pattern might be easier to understand. Is there any other trade off? Best, Martin From greg.ewing at canterbury.ac.nz Tue May 20 14:26:05 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 21 May 2008 00:26:05 +1200 Subject: [Cython] Unicode issues In-Reply-To: <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4832C35D.8080107@canterbury.ac.nz> Stefan Behnel wrote: > a) you can't be sure that you are actually looking at an ASCII-compatible > string (i.e. ISO or UTF-8 encoded) By ascii, I don't mean utf8, I mean all code points are in the range 0..127. That much can be checked accurately, I think. > b) this makes it very easy to write > buggy code that works perfectly until someone passes non-ASCII characters. That's what I don't follow. Code such as cdef char *p p = s.encode('ascii') has exactly the same property, as far as I can see -- it works until someone passes it non-ascii characters. I would call it a limitation rather than a bug. -- Greg From greg.ewing at canterbury.ac.nz Tue May 20 14:34:55 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 21 May 2008 00:34:55 +1200 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <4832C2F4.1090605@martincmartin.com> References: <4832C2F4.1090605@martincmartin.com> Message-ID: <4832C56F.7030907@canterbury.ac.nz> Martin C. Martin wrote: > The recursive way seems > simpler, but recursion in general is confusing to many Anyone who can't handle recursion is going to have a hard time working on any kind of compiler, it seems to me... 
-- Greg From martin at martincmartin.com Tue May 20 14:44:36 2008 From: martin at martincmartin.com (Martin C. Martin) Date: Tue, 20 May 2008 08:44:36 -0400 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <4832C56F.7030907@canterbury.ac.nz> References: <4832C2F4.1090605@martincmartin.com> <4832C56F.7030907@canterbury.ac.nz> Message-ID: <4832C7B4.6030604@martincmartin.com> Me too, and the visitor pattern has a lot of the properties of recursion, so might be just as confusing. The recursive approach is also more flexible, because the node can decide what order to visit it & its children. Is the visitor pattern typically fixed at depth first? Best, Martin Greg Ewing wrote: > Martin C. Martin wrote: >> The recursive way seems >> simpler, but recursion in general is confusing to many > > Anyone who can't handle recursion is going to have > a hard time working on any kind of compiler, it > seems to me... > From stefan_ml at behnel.de Tue May 20 14:58:53 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 May 2008 14:58:53 +0200 (CEST) Subject: [Cython] Unicode issues In-Reply-To: <4832C35D.8080107@canterbury.ac.nz> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> Message-ID: <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Greg Ewing wrote: > Stefan Behnel wrote: >> b) this makes it very easy to write >> buggy code that works perfectly until someone passes non-ASCII >> characters. > > That's what I don't follow. Code such as > > cdef char *p > p = s.encode('ascii') Currently, this would rather be cdef char *p b = s.encode('ascii') p = b > has exactly the same property, as far as I can see -- it > works until someone passes it non-ascii characters. 
I > would call it a limitation rather than a bug. The difference to this code

    cdef [u]char* p
    p = s

is that the code above does an explicit conversion to a user-defined encoding and makes clear what happens when, whereas it is not immediately visible from the code below that it 1) allocates memory for unicode strings but not for byte strings, 2) garbage collects a temporary string at some non-user-configurable point, 3) converts characters to bytes and thus may fail for some unicode strings and byte strings. This is neither symmetric to the bytes->char* coercion process (which never fails for any kind of byte string), nor is it transparent that there are non-trivial things happening. "Explicit is better than implicit" definitely holds for everything that involves non-trivial magic and memory allocation. Stefan From stefan_ml at behnel.de Tue May 20 15:06:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 20 May 2008 15:06:09 +0200 (CEST) Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <4832C2F4.1090605@martincmartin.com> References: <4832C2F4.1090605@martincmartin.com> Message-ID: <19419.194.114.62.65.1211288769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Martin C. Martin wrote: > Recently there was some mention of using the visitor pattern to visit > every node, versus a recursive overloaded function. What do people see > as the advantage of one over the other? The recursive way seems > simpler, but recursion in general is confusing to many, so the visitor > pattern might be easier to understand. Is there any other trade off? One of the differences is that one is currently in place while the other requires a large refactoring. Apart from that, I don't see a major (dis-)advantage of either of the two, although I might find the recursive approach simpler as well (or at least more obvious).
Stefan From dalcinl at gmail.com Tue May 20 16:49:58 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 20 May 2008 11:49:58 -0300 Subject: [Cython] some advice on pointers In-Reply-To: <4CD2C0B9-3ACB-434C-9532-DE241BC66D53@gmail.com> References: <4CD2C0B9-3ACB-434C-9532-DE241BC66D53@gmail.com> Message-ID: Just try! &d[0] works for me (i'm using cython-devel repo) On 5/20/08, Brian Blais wrote: > Hello, > > I have a pointer to a double, like > > cdef double *d > > and I'd like to pass to a function a pointer to the data contained in > d, with some offset, like: > > val=myfun(&d[10]) > > can I do something like that in cython? the only other alternative I > could think of is to pass the offset as a parameter, like > > val=myfun(d,10) > > but that makes the function itself uglier, and less efficient. Am I > thinking about this wrong? > > thanks, > > > bb > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bblais at gmail.com Tue May 20 18:45:50 2008 From: bblais at gmail.com (Brian Blais) Date: Tue, 20 May 2008 12:45:50 -0400 Subject: [Cython] some advice on pointers In-Reply-To: References: <4CD2C0B9-3ACB-434C-9532-DE241BC66D53@gmail.com> Message-ID: <2844940E-AE3D-46CA-900D-4758365FDD49@gmail.com> On May 20, 2008, at May 20:10:49 AM, Lisandro Dalcin wrote: > Just try! &d[0] works for me (i'm using cython-devel repo) doh! thanks for pointing out what I really should have done myself....
sorry for the noise, bb From kirr at mns.spb.ru Wed May 21 13:18:54 2008 From: kirr at mns.spb.ru (Kirill Smelkov) Date: Wed, 21 May 2008 15:18:54 +0400 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <38226.194.114.62.65.1211281840.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <482BF9CE.2070100@canterbury.ac.nz> <200805201358.08696.kirr@mns.spb.ru> <38226.194.114.62.65.1211281840.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <200805211518.54235.kirr@mns.spb.ru> On Tuesday 20 May 2008, Stefan Behnel wrote: > Kirill Smelkov wrote: > > Greg Ewing wrote: > >> So I think it's probably a good thing having Cython as a > >> separate project where people are free to try out all their > >> wild ideas. Then when the dust has settled I can take the > >> best ideas and fold them back into Pyrex. > >> > >> Think of it as the "Pyrex-Unstable" branch if you like. :-) > > > > Yes, I'll think of Cython as Pyrex-Unstable (or Pyrex-Experimental :) from > > now on. > > I don't think that's consistent with the way the Cython project sees itself. Oops :) From dagss at student.matnat.uio.no Wed May 21 17:35:20 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 21 May 2008 17:35:20 +0200 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <4832C2F4.1090605@martincmartin.com> References: <4832C2F4.1090605@martincmartin.com> Message-ID: <48344138.6090405@student.matnat.uio.no> Martin C. Martin wrote: > Hi all, > > Recently there was some mention of using the visitor pattern to visit > every node, versus a recursive overloaded function. What do people see > as the advantage of one over the other? The recursive way seems > simpler, but recursion in general is confusing to many, so the visitor > pattern might be easier to understand. Is there any other trade off? > Thanks for asking :-) I'm the one who's been pushing for this, and I believe I have good reasons. I'll need to break the question up in pieces.
(I've been moving so I'll be unresponsive, but when I answer I'll write proper answers). First off, this is NOT about recursive vs. non-recursive -- when using the visitor pattern, one does of course have to care deeply about recursiveness. That doesn't matter at all; if recursion is something one actually has to stop and think about, one shouldn't touch Cython code (like Greg says). So this is just confusion coming from me using imprecise language. Sorry. An answer to Stefan: The big refactoring you refer to is related to separating the type analysis and type coercion phases (both happening in analyse_phase currently). Do you have any other, concrete suggestions about how to do this? How can one implement, say, NumPy support or type inference, without it (in ANY way -- i.e. there's a "logical proof" that any language feature which would allow the NumPy functionality I'll be implementing will need this refactoring). Of course, how the refactoring happens depends on how *new* code should be written. Back to the approach I'm pushing for. It actually has two orthogonal parts: 1) More "phases", and "deeper cuts" in the recursive process (i.e. have many recursive processes with simpler flow, rather than one big one with complex flow). More phases is the important part; as noted above, a lot of stuff simply cannot be done until type analysis and type coercion are split into separate phases. This is the only nasty refactoring job in my proposals. As for deeper cuts: Parts of what the current recursive call chain does can be sketched out like this:

    analyse_module_level
    generate_module_code
    for each function:
        analyse_function_code
        generate_function_code

It would be better (see below) to have "deeper" cuts, i.e. so that one could say "now the entire tree has been analysed", "now the entire tree is ready for code generation", rather than some parts (functions etc.) being in a separate state.
It's just easier conceptually to always know what state one is in and be sure that what you need is generated before you need it. I'll give an example: Consider for instance trivial inner functions. One natural way to implement this is just "throwing the function out", i.e. extract it and put it in module scope (with name mangling etc.), and then let the rest of the system deal with it. That way, you don't need to have any specific code for inner functions anywhere but one place. However, if you detect the types etc. you need for the inner functions in line 4 in the above pseudo-code, then you cannot depend on analyse_module_level to do anything about your new global function, because that's already been run; so you *somehow* need an ugly kludge to get around this, and spend time thinking about that (I have no idea how one would do it) rather than doing "real work" which doesn't come as a penalty from how the program is structured. (In reality there'll be full closures instead, which just means generating a "cdef class" (with state) at module scope rather than a function. I'd like to see code doing that in less than 150 lines without using any of my "new structure" proposals...) 2) Transforms. Since it's you Martin, think of them as internal Cython macros on the tree. Rather than implementing everything from the parser end to the C generation end, it should be possible to implement one feature by writing a simple macro to transform it into other things. With the current approach, the big problem is that the structure of the tree is fixed (of course it is possible to work around it; but this is not done in practice because it is so kludgy). And the problem with this is: Python code and C do not map 1:1 in structure. So the result is rather lots of kludges to get around this [1], and trivial things suddenly become non-trivial. Consider for instance the "with" statement. Its PEP specification is given as an equivalent Python code fragment!
Still, the only "natural" way to implement it within the current structure is to implement "WithStatementNode" and keep it around in the tree (in some form) all the way until C code generation. With transforms, you'd rather write a (recursive!) macro to handle with statements (simply turn it into the equivalent Python fragment). With my proposals there'd be a top-level "pipeline" list like this:

    pipeline = [parse, type_analysis, coerce, generate_C] # very simplified

Now, you can simply do

    if should_support_with_statements:
        pipeline.insert(1, with_statement)
    else:
        pipeline.insert(1, raise_error_on_with_statement)

And 1) the remaining phases (type analysis etc. etc.) never even need to know that there exists a WithStatementNode, 2) the parser doesn't need to care about it either, beyond simply "turning the Cython code into an object representation" (it doesn't need any "business logic" for the with statement). What does the visitor pattern have to do with any of this? It's the most natural way to implement stuff like this. But important: This is not the important part. Visitors are really "low-level", it's sort of like "do you prefer for-loops or while-loops for this kind of task", while the stuff mentioned above is more like "do you prefer a microkernel or a monolithic kernel"... :-) Example: Without visitors, in order to implement a Cython code serializer (to reserialize the tree) one would have to add a method "serialize_to_cython" to every node class. (Imagine then coming along to add "serialize_to_xml" later...). With visitors, one simply creates an isolated CodeWriter.py file, and nothing else knows anything about it -- it allows clear dependencies and compartmentalization. I can write more on this later but this should do for now, tell me which parts you're interested in if any... [1] See ParallelAssignmentNode in Nodes.py for an example -- excellent kludge that works nicely and comes naturally from the current code structure, but a kludge nonetheless.
In particular, the parser has to output stuff in *one certain way* (and in a way which is unnatural for the parser, with a "local" view) because of what happens many miles down the road. Design problems like this slow down development and make the probability of bugs higher. (Rewrites have an even higher bug probability, though! So I'm not advocating that -- I'm simply talking about how I'd like to write *new* code!) Dag Sverre From robertwb at math.washington.edu Wed May 21 21:11:02 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 21 May 2008 12:11:02 -0700 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <200805211518.54235.kirr@mns.spb.ru> References: <482BF9CE.2070100@canterbury.ac.nz> <200805201358.08696.kirr@mns.spb.ru> <38226.194.114.62.65.1211281840.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <200805211518.54235.kirr@mns.spb.ru> Message-ID: On May 21, 2008, at 4:18 AM, Kirill Smelkov wrote: > On Tuesday 20 May 2008, Stefan Behnel wrote: >> Kirill Smelkov wrote: >>> Greg Ewing wrote: >>>> So I think it's probably a good thing having Cython as a >>>> separate project where people are free to try out all their >>>> wild ideas. Then when the dust has settled I can take the >>>> best ideas and fold them back into Pyrex. >>>> >>>> Think of it as the "Pyrex-Unstable" branch if you like. :-) >>> >>> Yes, I'll think of Cython as Pyrex-Unstable (or Pyrex-Experimental :) from >>> now on. >> >> I don't think that's consistent with the way the Cython project >> sees itself. > > Oops :) To clarify, I think the main difference between Cython and Pyrex is that Cython aims to be a good Python compiler, not just a way to create extension types. I think most of the differences in philosophy can be traced back to this point. Also, in terms of development, Greg wants to review and understand every line that goes into Pyrex, which makes it slower to accept new features.
We still do quality review on the code contributed to Cython but the process is much more open. In this case it is more experimental, but I would say it's developing into more than just another branch of Pyrex. In terms of stability, I wouldn't say Cython is any worse than Pyrex in terms of finding bugs in releases, and both tend to fix bugs relatively quickly. - Robert From robertwb at math.washington.edu Wed May 21 21:35:25 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 21 May 2008 12:35:25 -0700 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <19419.194.114.62.65.1211288769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4832C2F4.1090605@martincmartin.com> <19419.194.114.62.65.1211288769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On May 20, 2008, at 6:06 AM, Stefan Behnel wrote: > Martin C. Martin wrote: >> Recently there was some mention of using the visitor pattern to visit >> every node, versus a recursive overloaded function. What do >> people see >> as the advantage of one over the other? The recursive way seems >> simpler, but recursion in general is confusing to many, so the >> visitor >> pattern might be easier to understand. Is there any other trade off? > > One of the differences is that one is currently in place while the > other > requires a large refactoring. > > Apart from that, I don't see a major (dis-)advantage of either of > the two, > although I might find the recursive approach simpler as well (or at > least > more obvious). I don't think either is easier to understand, but here's my take on the advantages of visitors: 1) Adding a phase doesn't require adding a new function to every node in the tree. This is especially nice for making more granular phases (such as separating type analysis from coercion from temp variable allocation), and things like optimization "phases" that may only need to deal with a small subset of the nodes. 
2) The visitor pattern allows more flexibility in mutating the tree. For example, the mutate_into_name_node function of AttributeNode could be handled more naturally. More complex transformations (such as handling the with statement) could be handled as well. Neither of these are compelling enough reasons to throw out the current model, but they are worth considering as we go to add new code. (I just now read Dag's email, which is in general agreement with the above.) -Robert From robertwb at math.washington.edu Wed May 21 22:33:23 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 21 May 2008 13:33:23 -0700 Subject: [Cython] Unicode issues In-Reply-To: <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu> On May 20, 2008, at 5:58 AM, Stefan Behnel wrote: > Greg Ewing wrote: >> Stefan Behnel wrote: >>> b) this makes it very easy to write >>> buggy code that works perfectly until someone passes non-ASCII >>> characters. >> >> That's what I don't follow. Code such as >> >> cdef char *p >> p = s.encode('ascii') > > Currently, this would rather be > > cdef char *p > b = s.encode('ascii') > p = b > >> has exactly the same property, as far as I can see -- it >> works until someone passes it non-ascii characters. I >> would call it a limitation rather than a bug. 
> > The difference to this code > > cdef [u]char* p > p = s > > is that the code above does an explicit conversion to a user-defined > encoding and makes clear what happens when, whereas it is not > immediately > visible from the code below that it 1) allocates memory for unicode > strings but not for byte strings, 2) garbage collects a temporary > string > at some non-user-configurable point, 3) converts characters to > bytes and > thus may fail for some unicode strings and byte strings. > > This is neither symmetric to the bytes->char* coercion process (which > never fails for any kind of byte string), nor is it transparent > that there > are non-trivial things happening. > > "Explicit is better than implicit" definitely holds for everything > that > involves non-trivial magic and memory allocation. I think points (1) and (2) are non-issues--both Python and Cython/Pyrex implicitly allocate temporary objects all over the place so the user doesn't have to be bothered with it. That leaves (3), the question of whether to allow any implicit string <--> char* conversions (because it's so convenient, especially given the ubiquitous nature of ASCII) or not (because if it's not explicit, it's a bug). - Robert From greg.ewing at canterbury.ac.nz Thu May 22 00:57:58 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2008 10:57:58 +1200 Subject: [Cython] ANN: Pyrex 0.9.8 In-Reply-To: <200805211518.54235.kirr@mns.spb.ru> References: <482BF9CE.2070100@canterbury.ac.nz> <200805201358.08696.kirr@mns.spb.ru> <38226.194.114.62.65.1211281840.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <200805211518.54235.kirr@mns.spb.ru> Message-ID: <4834A8F6.6070802@canterbury.ac.nz> Kirill Smelkov wrote: > On Tuesday 20 May 2008, Stefan Behnel wrote: > >>Kirill Smelkov wrote: >> >>>Yes, I'll think of Cython as Pyrex-Unstable (or Pyrex-Experimental :) from >>>now on. >> >>I don't think that's consistent with the way the Cython project sees itself.
Just to clarify, I didn't mean to imply that users of Cython should regard it as unreliable, only that its feature set and internal structure can be expected to change rapidly. -- Greg From greg.ewing at canterbury.ac.nz Thu May 22 02:30:10 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2008 12:30:10 +1200 Subject: [Cython] Unicode issues In-Reply-To: <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu> References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu> Message-ID: <4834BE92.7070900@canterbury.ac.nz> Robert Bradshaw wrote: > That leaves (3), the > question of whether to allow any implicit string <--> char* > conversions (because it's so convenient, especially given the > ubiquitous nature of ASCII) or not (because if it's not explicit, > it's a bug). Some more thoughts on that: Keeping unicode and bytes clearly separated makes good sense in py3, because you're in a high-level world that's firmly isolated from the outside. It's a viable strategy to convert all your data to unicode as soon as it comes in, and not have to worry about the issue otherwise. But the inside of a Pyrex module isn't such an isolated environment. At every turn, you're dealing with C code that doesn't make such a clear distinction between bytes and unicode. I'm not sure that trying to maintain the distinction rigidly for Python data, when there is all this C data around that doesn't maintain any such distinction, is worth the effort. 
-- Greg From greg.ewing at canterbury.ac.nz Thu May 22 04:16:53 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2008 14:16:53 +1200 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <48344138.6090405@student.matnat.uio.no> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> Message-ID: <4834D795.1000401@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > The big refactoring you refer to is related to > seperating the type analysis and type coercion phases I'm not sure whether you would gain much by doing this. The place where you discover that a coercion is needed is during type analysis, where you look at the types of the operations, decide whether they're compatible, and see whether they need to be converted to a common type, etc. Also you work out what the result type will be and annotate the node with it. To move the creation of coercion nodes into a separate pass, you would have to leave an annotation on the node saying that a coercion is needed. But if some phase between then and doing the actual coercions rearranges the parse tree, these annotations may no longer be correct. So you would have to disallow any other phases between type analysis and coercion, or at least put some restrictions on what they can do, such as not altering the structure of the parse tree, or doing anything else that could invalidate the calculated types. What sort of things were you intending to do in between type analysis and coercion? Could they still be done under these restrictions? More generally, sometimes trying to split things up into more phases can make things more complicated rather than simpler. As an example of this, I'm currently thinking about eliminating the allocate_temps subphase of expression analysis and combining it with the code generation phase. The reason is that there's currently a rather non-obvious dependency between these phases. 
The order in which temp variables are allocated and released during allocate_temps has to exactly match the order in which code is generated that creates and disposes the references that will be put in those temps. This makes it rather tricky to both write and maintain code for these two phases. The reason they're separate phases at the moment is that I was initially writing the generated code directly to the output file, so I had to know what temp variable declarations would be needed before starting to write any of the body code for a function. However, I'm currently writing the declaration and executable code to separate buffers and combining them afterwards, so there's probably no need for a separate allocate_temps pass any more, and combining it with code generation is likely to simplify quite a lot of things. > It would be better (see below) to have "deeper" cuts, i.e. so that one > could say "now the entire tree has been analysed", "now the entire tree > is ready for code generation", rather than some parts (functions etc.) > being in a separate state. The reason for doing functions that way is that it seemed wasteful to keep all the symbol tables for the local scopes around longer than necessary. That decision was probably influenced by an earlier project in which I wrote a compiler for a Modula-like language that ran on machines much smaller than we have today. It used a 3-pass arrangement that kept all the symbol tables for everything between passes, with the result that it could only compile a module a few hundred lines long before running out of memory. That experience gave me an appreciation of why Wirth prefers to write single-pass compilers. Although the memory issue probably isn't a concern nowadays, the aforementioned experience led me to approach the problem with the mindset of using as few passes as possible.
The only reason I used separate analysis and generation phases at all in the beginning was so that you can refer to C functions that are defined further down without needing forward declarations. > Consider for instance trivial inner functions. One > natural way to implement this is just "throwing the function out" > ... > so you *somehow* need an ugly kludge to > get around this, and spend time thinking about that I don't think it would be all that difficult to make a pass over the function body just before generating code for it, that generates code for any nested functions. In fact, I have a suspicion that for the special case you're talking about (no references to intermediate scopes) it would "just work", because the analysis phase will already have generated an appropriately-mangled C name for the function. On the other hand, "throwing the function out" presents difficulties of its own. As you mentioned, some kind of name mangling would need to be done, which requires knowing which names are supposed to refer to that function, so you need some kind of symbol table functionality available in the pass where you do the throwing-out. The obvious thing is to use the real symbol table, but that means doing it after the declaration analysis phase, by which time the only problem remaining is how to get the C code generated in the right place -- which as I've said isn't really all that hard. > (In reality there'll be full closures instead, which just means > generating a "cdef class" (with state) at module scope rather than a > function. I'd like to see code doing that in less than 150 lines without > using any of my "new structure" proposals...) I'll be impressed if you can do it in 150 lines whatever structure you're using. 
But in any case, I suspect that the hardest part of this will be coping with all the nested scope issues, and that it will be easiest to do that while the parse tree still reflects the lexical structure of the original code. > Consider for instance the "with" statement... the only "natural" > way to implement it within the current structure is to implement > "WithStatementNode" Not necessarily. It may well be feasible to implement it by assembling existing nodes, and I'll be looking into that. > [1] See ParallelAssignmentNode in Nodes.py for an example -- excellent > kludge How so? It seems like exactly the sort of transformation you're advocating, the only difference being that it's implemented by code in Parser.py rather than in a separate pass. > the parser has to output stuff in *one certain way* I'm not sure what you mean by that. The parser doesn't *have* to perform that transformation -- it's just an optimisation. It could just leave parallel assignments alone and let the back end generate a tuple packing and unpacking operation. It would work, albeit less efficiently. If you're referring to the fact that the subnodes of a ParallelAssignmentNode have to be simple assignments, that's the whole purpose of the transformation -- to reduce a parallel assignment to a set of non-parallel ones. If one of the sub-assignments is itself a parallel assignment, it gets expanded as well, until only non-parallel ones are left. If you're wondering why the ParallelAssignmentNode exists at all, it's because you can't just do the assignments one after another -- you have to evaluate all the right hand sides before assigning to any of the left hand sides. It's possible that a more general set of primitive nodes can be found that enables this to be expressed without needing a special ParallelAssignmentNode. I may revisit this after seeing what needs to be done for the with statement.
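The point about evaluating all right hand sides first can be illustrated with a small plain-Python sketch. This is not taken from Pyrex; the function names and temporaries below are made up for illustration, and the sketch only mimics by hand what an expansion of "a, b = b, a" would have to do.

```python
def swap_sequential(a, b):
    # Naive expansion of "a, b = b, a": one assignment after another.
    a = b
    b = a  # reads the value of a that was just overwritten
    return a, b


def swap_parallel(a, b):
    # Evaluate every right hand side into a temporary first,
    # then assign to the left hand sides.
    tmp0 = b
    tmp1 = a
    a = tmp0
    b = tmp1
    return a, b


if __name__ == "__main__":
    print(swap_sequential(1, 2))
    print(swap_parallel(1, 2))
```

The sequential version loses the original value of a, which is why the sub-assignments cannot simply be emitted in order without temporaries.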
> and in a way which is unnatural for the parser, with a "local" view Not sure what you mean by that, either. The parser has to know that the ParallelAssignmentNode exists, and what kind of subnodes it expects. But whatever phase performed the transformation would need the same knowledge. Neither of them have to know how those nodes work internally. Sorry this reply was so long, I didn't have time to write a shorter one. :-) -- Greg From greg.ewing at canterbury.ac.nz Thu May 22 04:28:01 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2008 14:28:01 +1200 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: References: <4832C2F4.1090605@martincmartin.com> <19419.194.114.62.65.1211288769.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4834DA31.6030505@canterbury.ac.nz> Robert Bradshaw wrote: > 1) Adding a phase doesn't require adding a new function to every node > in the tree. That needn't be necessary anyway. The ExprNodes currently have a list of names of attributes that refer to subnodes. This allows a default implementation to be written for a pass that simply recurses into its subnodes. Then you only have to override it for nodes that need to do something different. Statement nodes don't currently have anything like that, but they could. > 2) The visitor pattern allows more flexibility in mutating the tree. > For example, the mutate_into_name_node function of AttributeNode > could be handled more naturally. I'll concede that one's a bit of a hack. It would be better to replace it with an actual NameNode, but the need for this becomes apparent too late in the process the way it currently works. 
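The default-recursion idea described earlier in this message (a per-class list of attribute names that refer to subnodes, plus a default pass that recurses into them) might look roughly like the sketch below. The classes are toy stand-ins, not the real Pyrex node classes, and the attribute name `subnode_attrs` and the pass name `count_names` are hypothetical.

```python
class Node:
    # Hypothetical attribute: names of the attributes that hold child nodes.
    subnode_attrs = []

    def count_names(self):
        # Default implementation of a pass: just recurse into the subnodes
        # and combine their results.
        total = 0
        for attr in self.subnode_attrs:
            child = getattr(self, attr)
            if child is not None:
                total += child.count_names()
        return total


class NameNode(Node):
    # Leaf node; overrides the pass because it does something different.
    def __init__(self, name):
        self.name = name

    def count_names(self):
        return 1


class BinopNode(Node):
    # Interior node; only declares its subnodes and inherits the default pass.
    subnode_attrs = ["operand1", "operand2"]

    def __init__(self, operand1, operand2):
        self.operand1 = operand1
        self.operand2 = operand2


if __name__ == "__main__":
    tree = BinopNode(NameNode("a"), BinopNode(NameNode("b"), NameNode("c")))
    print(tree.count_names())
```

With this arrangement a new pass needs one default method on the base class and overrides only on the node types that behave differently.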
-- 
Greg

From stefan_ml at behnel.de Thu May 22 09:27:05 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 09:27:05 +0200
Subject: [Cython] Unicode issues
In-Reply-To: <4834BE92.7070900@canterbury.ac.nz>
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu> <4834BE92.7070900@canterbury.ac.nz>
Message-ID: <48352049.1070008@behnel.de>

Hi,

Greg Ewing wrote:
> Keeping unicode and bytes clearly separated makes good
> sense in py3, because you're in a high-level world that's
> firmly isolated from the outside. It's a viable strategy
> to convert all your data to unicode as soon as it comes
> in, and not have to worry about the issue otherwise.

lxml works exactly the other way round. All unicode strings that come in are converted to UTF-8 before doing anything else with them.

> But the inside of a Pyrex module isn't such an isolated
> environment. At every turn, you're dealing with C code
> that doesn't make such a clear distinction between bytes
> and unicode.

It usually won't know anything about Unicode anyway, at least not about Python unicode strings. C code usually cares about bytes, in which case the inverse of the approach you sketched above is exactly the right thing to do.

> I'm not sure that trying to maintain the
> distinction rigidly for Python data, when there is all
> this C data around that doesn't maintain any such
> distinction, is worth the effort.

I think you will always have to find some kind of lingua franca that is used throughout your program.
You may decide to convert all data coming from C directly into a Python unicode string and work with that, or it may be more suitable to convert all Python unicode input into bytes and work with those. But it's just bad design to keep converting back and forth all over the place, so I still think we are discussing a mixture of a non-issue and a potential source of bugs here.

Stefan

From stefan_ml at behnel.de Thu May 22 10:38:02 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 10:38:02 +0200
Subject: [Cython] Developer snapshot of Py3 capable Cython
Message-ID: <483530EA.9020103@behnel.de>

Hi,

just a quick note that I put up a developer snapshot of the Py3 capable Cython (cython-devel revision 582).

http://codespeak.net/lxml/dev/Cython-0.9.6.14-3k.tar.gz

I also announced it on c.l.python to get some more feedback from people who want to try it.

Stefan

From stefan_ml at behnel.de Thu May 22 12:51:17 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 12:51:17 +0200
Subject: [Cython] How to deal with the new buffer interface?
Message-ID: <48355025.4090003@behnel.de>

Hi,

I wonder how to best deal with the new buffer interface.

http://www.python.org/dev/peps/pep-3118/

It's actually pretty simple, it exposes two straightforward functions through the PyBufferProcs struct using the old Py2 "tp_as_buffer" slot (but different content).

    typedef int (*getbufferproc)(PyObject *obj, Py_buffer *view, int flags);
    typedef void (*releasebufferproc)(PyObject *, Py_buffer *);

    typedef struct {
        getbufferproc bf_getbuffer;
        releasebufferproc bf_releasebuffer;
    } PyBufferProcs;

    typedef struct bufferinfo {
        void *buf;
        Py_ssize_t len;
        int readonly;
        const char *format;
        int ndim;
        Py_ssize_t *shape;
        Py_ssize_t *strides;
        Py_ssize_t *suboffsets;
        Py_ssize_t itemsize;
        void *internal;
    } Py_buffer;

This would normally call for two special functions __getbuffer__ and __releasebuffer__.
To me, however, this looks like an extremely C-ish interface that does not fit Cython at all. So I'm wondering what others think about this approach:

1) define a pseudo-builtin PyxBuffer extension type that mimics a Py_buffer struct without the ".internal" field

2) support a new special method __getbuffer__(self, int flags) that allows users to create, fill and return an instance of PyxBuffer or a subtype (maybe the PyxBuffer class could also contain a public enum that defines the flags?)

3) generate glue code for the bf_getbuffer() call that
   a) calls __getbuffer__() and raises an exception if it does not return a PyxBuffer instance
   b) copies the Py_buffer struct content over from the returned PyxBuffer object to the Py_buffer struct that was passed into bf_getbuffer()
   c) stores an incref-ed reference to the PyxBuffer object in the ".internal" field

4) generate generic code for the bf_releasebuffer slot that decrefs the reference in the ".internal" field (and sets it to NULL), thus eventually calling __dealloc__() on the PyxBuffer object. User defined subtypes of PyxBuffer can then do The Right Thing here.

This has several advantages:

- it would only require a single special method
- it matches the way GC and normal extension types work in Cython
- it relieves the user from having to cast around with "internal" pointers possibly even to hand-refcounted Python objects (if used that way)
- the PyxBuffer instance could easily be reused in __getbuffer__() and would only be garbage collected when all references are gone, leaving the user full control over the GC process
- the "shapes", "strides" and "suboffsets" field could be normal Py_ssize_t values in PyxBuffer that the Py_buffer fields would just point to
- all generated functions would be wrapped by a conditional "#if Py3 ..."
- the user code would never rely on Py_buffer directly, thus nicely compiling to unused code in Py2

A disadvantage I see is that PyxBuffer cannot directly inherit the "official" struct from the Python header files, so changes to the struct would have to be reflected in Cython. But given the usual stability of Python APIs, I think that's a non-problem.

Opinions?

Stefan

From kirr at mns.spb.ru Thu May 22 13:52:08 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Thu, 22 May 2008 15:52:08 +0400
Subject: [Cython] ANN: Pyrex 0.9.8
In-Reply-To:
References: <482BF9CE.2070100@canterbury.ac.nz> <200805211518.54235.kirr@mns.spb.ru>
Message-ID: <200805221552.08309.kirr@mns.spb.ru>

On Wednesday 21 May 2008, Robert Bradshaw wrote:
> On May 21, 2008, at 4:18 AM, Kirill Smelkov wrote:
>
> > On Tuesday 20 May 2008, Stefan Behnel wrote:
> >> Kirill Smelkov wrote:
> >>> Greg Ewing wrote:
> >>>> So I think it's probably a good thing having Cython as a
> >>>> separate project where people are free to try out all their
> >>>> wild ideas. Then when the dust has settled I can take the
> >>>> best ideas and fold them back into Pyrex.
> >>>>
> >>>> Think of it as the "Pyrex-Unstable" branch if you like. :-)
> >>>
> >>> Yes, I'll think of Cython as Pyrex-Unstable (or
> >>> Pyrex-Experimental :) from now on.
> >>
> >> I don't think that's consistent with the way the Cython project
> >> sees itself.
> >
> > Oops :)
>
> To clarify, I think the main difference between Cython and Pyrex is
> that Cython aims to be a good Python compiler, not just a way to
> create extension types. I think most of the differences in philosophy
> can be traced back to this point.
>
> Also, in terms of development, Greg wants to review and understand
> every line that goes into Pyrex, which makes it slower to accept new
> features. We still do quality review on the code contributed to
> Cython but the process is much more open.

Sounds good.
Would you or some other Cython developer then please review my constify.patch, submitted more than one week ago:

http://trac.cython.org/cython_trac/ticket/9 ?

It is just an RFC (Request For Comments), but still, I've got no response regarding the patch itself.

Thanks,
Kirill.

From greg.ewing at canterbury.ac.nz Thu May 22 14:30:12 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 May 2008 00:30:12 +1200
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <48355025.4090003@behnel.de>
References: <48355025.4090003@behnel.de>
Message-ID: <48356754.5060709@canterbury.ac.nz>

Stefan Behnel wrote:
> This would normally call for two special functions __getbuffer__ and
> __releasebuffer__. To me, however, this looks like an extremely C-ish
> interface that does not fit Cython at all.

That's because it's designed for use by C code, not by Python code. Trying to make it more "Python-like" would defeat the purpose of having it in the first place.

-- 
Greg

From dagss at student.matnat.uio.no Thu May 22 14:59:33 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 22 May 2008 14:59:33 +0200 (CEST)
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <48355025.4090003@behnel.de>
References: <48355025.4090003@behnel.de>
Message-ID: <51339.193.157.229.67.1211461173.squirrel@webmail.uio.no>

Stefan wrote:
>
> I wonder how to best deal with the new buffer interface.
>
> http://www.python.org/dev/peps/pep-3118/
>

I like your general idea. I'm weak in the "C side of things" but I'll comment on some other aspects.

Some more background: IIRC, the buffers interface comes from repeated attempts by Travis Oliphant of NumPy to get some kind of native support for data exchange of arrays (prime example being loading images with PIL and manipulating the images with NumPy). It therefore looks *very* similar to the NumPy data structures.

This means that this stuff is potentially relevant for my summer of code.
(I.e., any multidimensional array indexing that I create for NumPy might potentially be made to work for these buffers as well, keep that in mind below.)

> This would normally call for two special functions __getbuffer__ and
> __releasebuffer__. To me, however, this looks like an extremely C-ish
> interface that does not fit Cython at all. So I'm wondering what others
> think about this approach:
>
> 1) define a pseudo-builtin PyxBuffer extension type that mimics a
>    Py_buffer struct without the ".internal" field
> 2) support a new special method __getbuffer__(self, int flags) that allows
>    users to create, fill and return an instance of PyxBuffer or a subtype
>    (maybe the PyxBuffer class could also contain a public enum that
>    defines the flags?)
> 3) generate glue code for the bf_getbuffer() call that
>    a) calls __getbuffer__() and raises an exception if it does not return
>       a PyxBuffer instance
>    b) copies the Py_buffer struct content over from the returned PyxBuffer
>       object to the Py_buffer struct that was passed into bf_getbuffer()
>    c) stores an incref-ed reference to the PyxBuffer object in the
>       ".internal" field
> 4) generate generic code for the bf_releasebuffer slot that decrefs the
>    reference in the ".internal" field (and sets it to NULL), thus
>    eventually calling __dealloc__() on the PyxBuffer object. User defined
>    subtypes of PyxBuffer can then do The Right Thing here.

I'd like to mention two variations on this (without implying I think they are better, I just think they should be mentioned):

One is more native support for buffers. I'm not sure how it could be made to work, but perhaps some kind of union with "CEP 512 - C arrays as first class types"; i.e., the dynamically allocated/deallocated C arrays would support the buffer interface. (I haven't thought much about this and whether there would be any overhead).
If so, __getbuffer__ could simply return such a primitive array object with native syntax support; or one could drop __getbuffer__ and use a more generic coercion operator overload (and have coercion to a native array type be your __getbuffer__).

The other idea I had is the complete opposite: Do provide support for the direct, low-level C manipulation, but also create a pxd-file that ships with Cython with a base class that implements __getbuffer__ in the way stated above, using Cython code. (It looks like inlineable code in pxd files will be part of my summer of code in some form.) This could potentially make the Cython compiler less complex as well as provide some more flexibility. (But as I said, I don't know the C side of this and so this may be off the mark.)

Dag Sverre

From dagss at student.matnat.uio.no Thu May 22 15:07:57 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 22 May 2008 15:07:57 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
Message-ID: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>

In my SoC project, I sketched implementing NumPy support through adding some limited template support. Then Robert's proposal about object assumptions came along and as a result, things are (probably) going to look a lot less C++-like and a lot more Python, dynamically typed-like.

The following represents my attempt to do the same for the other features I proposed in my SoC project plan, namely function templates and function overloading. I.e. it allows you to do the same things as C++ overloading and function templates allow, but in a much more Pythonic and much less strongly-typed way. I've called it "compile-time duck-typing":

http://wiki.cython.org/enhancements/compiledducktyping

Comments welcome. I'd very much like to focus only on the user and language development perspective for now (i.e. not so much how this would actually be implemented in the Cython compiler.
BTW this would probably be part of my SoC if approved).

Dag Sverre

From stefan_ml at behnel.de Thu May 22 15:18:47 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 15:18:47 +0200
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <48356754.5060709@canterbury.ac.nz>
References: <48355025.4090003@behnel.de> <48356754.5060709@canterbury.ac.nz>
Message-ID: <483572B7.2070503@behnel.de>

Hi,

Greg Ewing wrote:
> Stefan Behnel wrote:
>> This would normally call for two special functions __getbuffer__ and
>> __releasebuffer__. To me, however, this looks like an extremely C-ish
>> interface that does not fit Cython at all.
>
> That's because it's designed for use by C code, not by
> Python code. Trying to make it more "Python-like" would
> defeat the purpose of having it in the first place.

That sounds a bit like you are objecting to my proposal. What would you see as a better solution then?

I mean, we will have to support the buffer interface in one way or another, and we will have to do it in a way that works on Py3 and compiles on Py2.

Cython is all about writing C code without writing C code, so I would really prefer a solution that does not require going all the way down to C-ish code, just to implement a Python object protocol.
Stefan

From stefan_ml at behnel.de Thu May 22 17:26:41 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 17:26:41 +0200
Subject: [Cython] Unicode issues
In-Reply-To: <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu>
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu>
Message-ID: <483590B1.1050609@behnel.de>

Hi,

Robert Bradshaw wrote:
> On May 20, 2008, at 5:58 AM, Stefan Behnel wrote:
>> it is not immediately
>> visible from the code below that it 1) allocates memory for unicode
>> strings but not for byte strings, 2) garbage collects a temporary
>> string at some non users configurable point, 3) converts characters to
>> bytes and thus may fail for some unicode strings and byte strings.
>>
>> This is neither symmetric to the bytes->char* coercion process (which
>> never fails for any kind of byte string), nor is it transparent
>> that there are non-trivial things happening.
>>
>> "Explicit is better than implicit" definitely holds for everything
>> that involves non-trivial magic and memory allocation.
>
> I think points (1) and (2) are non-issues--both Python and Cython/
> Pyrex implicitly allocate temporary object all over the place so the
> user doesn't have to be bothered with it.

I think you are referring to things like adding C numbers to Python numbers in Python space. That's a trivial case where little memory is involved, and these objects will be cleaned up almost immediately. Here, we are talking about duplicating data in memory (a potentially large string) where the user only asked for a C pointer to it. I find that *very* intransparent.
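
[Editorial illustration, not part of the original message.] Points 1) and 3) quoted above can be observed from plain Python 3, independently of how Cython or Pyrex would implement the coercion. The sample string is an arbitrary assumption; the sketch only shows that turning a unicode string into bytes builds a separate object and that an ASCII-only conversion can fail at run time.

```python
text = "caf\u00e9"  # arbitrary example string with one non-ASCII character

# Encoding allocates a new bytes object; the unicode string is left untouched.
utf8 = text.encode("utf-8")
char_count = len(text)
byte_count = len(utf8)   # the non-ASCII character needs two bytes in UTF-8

# An ASCII-only conversion may fail, depending on the data.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

An implicit unicode to char* coercion would have to do the equivalent of the `encode` call behind the scenes, which is the hidden allocation and potential failure being objected to.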
Currently, this code

    cdef some_c_type var = some_py_value

involves very straight forward coercion code for all types I can think of. If we allow automatic conversion from unicode to char* in the ASCII case, the simple statement

    cdef char* s = some_py_value

will require special handling of different Python types (str and bytes, maybe buffer?) and copying and converting the data in one but not in all cases. That's really not the same as reading the long value out of a PyInt.

Stefan

From stefan_ml at behnel.de Thu May 22 17:35:35 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 22 May 2008 17:35:35 +0200
Subject: [Cython] Compile-time duck typing
In-Reply-To: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
Message-ID: <483592C7.3040602@behnel.de>

Hi,

Dag Sverre Seljebotn wrote:
> http://wiki.cython.org/enhancements/compiledducktyping

Isn't the overloading stuff what PEP 3124 is heading for?

http://www.python.org/dev/peps/pep-3124/

I think parameter annotations would make sense here.

Stefan

From dalcinl at gmail.com Thu May 22 17:42:32 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 22 May 2008 12:42:32 -0300
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <48356754.5060709@canterbury.ac.nz>
References: <48355025.4090003@behnel.de> <48356754.5060709@canterbury.ac.nz>
Message-ID:

On 5/22/08, Greg Ewing wrote:
> Stefan Behnel wrote:
> > This would normally call for two special functions __getbuffer__ and
> > __releasebuffer__. To me, however, this looks like an extremely C-ish
> > interface that does not fit Cython at all.
>
> That's because it's designed for use by C code, not by
> Python code. Trying to make it more "Python-like" would
> defeat the purpose of having it in the first place.

As I reviewed in the past the whole PEP and even suggested modification to Travis that went in, I believe that I have a rough idea of the beast.
I believe Stefan's proposal is a high-level wrapper API, more convenient for writing Cython code but as capable as the native C one. The only point where I'm not sure is about hiding the 'internal' field. Apart from that, it seems to me that Stefan is going in the right direction.

However, if I have not misinterpreted Greg's comments, Stefan's proposal has a problem: if the new PyxBuffer is going to be a full-featured, true type object, then that is going to impose runtime overheads. Travis designed the interface that way in order to avoid any kind of Python object allocations. So perhaps __getbuffer__ should still receive a pointer to Py_buffer, or a pure C-struct wrapper around it, in order to fill the buffer data.

Perhaps we should disturb Travis and consult him about this.

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no Thu May 22 20:35:50 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 22 May 2008 20:35:50 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To: <483592C7.3040602@behnel.de>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de>
Message-ID: <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>

Stefan wrote:
> Hi,
>
> Dag Sverre Seljebotn wrote:
>> http://wiki.cython.org/enhancements/compiledducktyping
>
> Isn't the overloading stuff what PEP 3124 is heading for?
>
> http://www.python.org/dev/peps/pep-3124/
>
> I think parameter annotations would make sense here.

Compile-time duck typing is not really just overloading, it is overloaded template methods, i.e.
like this in C++:

    template <typename A, typename B>
    A max(A a, B b) {...}

(Unlike overloading, compile-time ducktyping has no Java equivalent).

It is *implemented* by having multiple overloaded functions internally, but making overloading possible is an (intended) side-effect. PEP 3124 can very easily be implemented in addition on top of my proposal, but I believe it too is orthogonal to what I'm proposing.

The idea was to kill many birds with one rock:

- Real Cython macros

- Possible to drop typing in a lot of places (brings us closer to type inference), leading to very convenient and easy optimizations. This will just work:

      def max(a, b): return a if a >= b else b

  and work "like in Python" without any performance penalties or change in behaviour from coercing back and forth to object (if you just skimmed the proposal you might have missed the controversial bit where I suggested that "def" should just change behaviour (in a backwards-compatible way though), rather than introducing a "duckdef").

- Overloading (as in, different behaviour for different object types) is sort-of made possible through the use of "isinstance" like in (current!) Python, so there is less to learn. If we want to, PEP 3124 can be implemented in addition (with the same spec, i.e. map it to a list of isinstance tests).

Dag Sverre

From dagss at student.matnat.uio.no Thu May 22 22:00:11 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 22 May 2008 22:00:11 +0200 (CEST)
Subject: [Cython] Recursive vs. visitor pattern
In-Reply-To: <4834D795.1000401@canterbury.ac.nz>
References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz>
Message-ID: <52087.193.157.229.67.1211486411.squirrel@webmail.uio.no>

Greg Ewing wrote:
> Dag Sverre Seljebotn wrote:
>> The big refactoring you refer to is related to
>> separating the type analysis and type coercion phases
>
> I'm not sure whether you would gain much by doing this.
> The place where you discover that a coercion is needed
> is during type analysis, where you look at the types of
> the operations, decide whether they're compatible, and
> see whether they need to be converted to a common type,
> etc. Also you work out what the result type will be and
> annotate the node with it.
>
> To move the creation of coercion nodes into a separate
> pass, you would have to leave an annotation on the node
> saying that a coercion is needed. But if some phase
> between then and doing the actual coercions rearranges
> the parse tree, these annotations may no longer be
> correct.

You're right, there's a problem even with annotations in this case. I'd just like to note that passing such annotations is common with transforms (as long as it is well-defined information that is added to the tree, keeping information in such annotations allows for easier separate testing and more concise documentation than if the state lives in temporary variables somewhere on the stack).

> So you would have to disallow any other phases between
> type analysis and coercion, or at least put some
> restrictions on what they can do, such as not altering
> the structure of the parse tree, or doing anything
> else that could invalidate the calculated types.
>
> What sort of things were you intending to do in between
> type analysis and coercion? Could they still be done under
> these restrictions?

Thanks, this is going to be very helpful to me and will make me able to express myself much clearer.

My main problem was simply that for variables for which the type is explicitly given, as well as , the type is not resolved. I.e., I believe there should be two steps, with a possibility for transforms in-between:

a) Mark up all the types of declared variables.
b) Expressions have their type resolved (which might involve coercions).

Alternatively, one could separate between easy, nonoverloaded behaviour (a) + calling functions, dereferencing pointers etc.)
and overloaded behaviour (the rest, especially binary operators). Though this distinction seems more arbitrary.

The goal was simply to have transforms that were able to act based on variable types; mostly in order to implement assumptions (http://wiki.cython.org/enhancements/assumptions). For instance,

    cdef MyArray[len=10] arr = ...
    cdef int last_idx = arr.len - 1

Here, "arr.len" gets a different value, and potentially different type, because of behaviour explicitly attached to the arr variable (not the "arr" expression).

I'm still digesting your input though. It made me rethink some cases (they could be done independent of coercions, but then I realized that this was dependent on the combination of type inference and overloaded coercion operators not becoming a reality. Not such a big danger (in fact, I'll have to think on whether such a combination is internally inconsistent, I have a hunch it is) but I'd like to revalidate my assumptions at this point. Long story...)

The rest of your email was very interesting too but I'll mail this now or I won't ever finish it, might answer the rest later. (But yes, I did not think about memory usage too much, and I'll get back to the 150 lines closures implementation, either to prove it or admit that I was wrong :-) ).

Dag Sverre

From dalcinl at gmail.com Thu May 22 23:51:28 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 22 May 2008 18:51:28 -0300
Subject: [Cython] cython-devel: Python 2.6 issues
Message-ID:

Stefan, my mpi4py project is working very well on Python 3.0. Stefan, I'm in your debt for such good work!

I'm still running into trouble with Python 2.6. I believe this is related to the new method cache for type objects. Interestingly enough, this feature is also in Python 3.0, but in that version all works fine!!

Could someone on this list try to test some simple pyx file with a 'cdef class' defining some standard methods and some 'classmethod', and confirm to me whether this works or not??
If someone can confirm the failure, I'll take the step to figure out what's really going on... Please help me if any of you have a bit of time, IMHO Python 2.6 should have a higher priority for Cython/Pyrex, right?

Regards,

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Fri May 23 00:09:25 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 22 May 2008 19:09:25 -0300
Subject: [Cython] Compile-time duck typing
In-Reply-To: <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
Message-ID:

Dag, I definitely like the whole idea... Have you considered supporting the equivalent to C++ template specialization?

On 5/22/08, Dag Sverre Seljebotn wrote:
> Stefan wrote:
> > Hi,
> >
> > Dag Sverre Seljebotn wrote:
> >> http://wiki.cython.org/enhancements/compiledducktyping
> >
> > Isn't the overloading stuff what PEP 3124 is heading for?
> >
> > http://www.python.org/dev/peps/pep-3124/
> >
> > I think parameter annotations would make sense here.
>
> Compile-time duck typing is not really just overloading, it is overloaded
> template methods, i.e. like this in C++:
>
>     template <typename A, typename B>
>     A max(A a, B b) {...}
>
> (Unlike overloading, compile-time ducktyping has no Java equivalent).
>
> It is *implemented* by having multiple overloaded functions internally,
> but making overloading possible is an (intended) side-effect. PEP 3124 can
> very easily be implemented in addition on top of my proposal, but I
> believe it too is orthogonal to what I'm proposing.
>
> The idea was to kill many birds with one rock:
>
> - Real Cython macros
>
> - Possible to drop typing in a lot of places (brings us closer to type
>   inference), leading to very convenient and easy optimizations. This will
>   just work:
>
>       def max(a, b): return a if a >= b else b
>
>   and work "like in Python" without any performance penalties or change in
>   behaviour from coercing back and forth to object (if you just skimmed the
>   proposal you might have missed the controversial bit where I suggested
>   that "def" should just change behaviour (in a backwards-compatible way
>   though), rather than introducing a "duckdef").
>
> - Overloading (as in, different behaviour for different object types) is
>   sort-of made possible through the use of "isinstance" like in (current!)
>   Python, so there is less to learn. If we want to, PEP 3124 can be
>   implemented in addition (with the same spec, i.e. map it to a list of
>   isinstance tests).
>
> Dag Sverre

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Fri May 23 00:25:02 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 May 2008 10:25:02 +1200
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <483572B7.2070503@behnel.de>
References: <48355025.4090003@behnel.de> <48356754.5060709@canterbury.ac.nz> <483572B7.2070503@behnel.de>
Message-ID: <4835F2BE.3040900@canterbury.ac.nz>

Stefan Behnel wrote:
> That sounds a bit like you are objecting to my proposal. What would you see as
> a better solution then?
Make the relevant C struct a built-in Cython type (a plain C struct, not an extension type) and expose the buffer slots exactly the way they are. The purpose of the buffer interface is to provide a direct C-level window into the internals of the object, with as little in the way as possible.

> I mean, we will have to support the buffer interface in one way or another,
> and we will have to do it in a way that works on Py3 and compiles on Py2.

It'll be difficult to have the same Cython source work for both. It might be doable by giving the new buffer slots different names in Cython, and generating a wrapper that emulates the old buffer interface using the new one when compiling in py2 mode. (A purely C-level wrapper, not involving any extra Python objects.)

That might enable code that provides an old-style buffer interface to work with py3, although you may run into problems with the need for locking.

I don't think going the other way will be feasible, since the new buffer interface has capabilities that you can't emulate using the old one.

> Cython is all about writing C code without writing C code. I would really
> prefer a solution that does not require going all the way down to C-ish code,
> just to implement a Python object protocol.

But the buffer interface is *not* a Python protocol, it's a C protocol. You can't avoid C-level issues when you're dealing with it.
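
[Editorial illustration, not part of the original message.] For readers who want to see what the C-level Py_buffer fields discussed in this thread carry, present-day CPython 3 exposes most of them read-only through the built-in `memoryview` type. The sketch below is not a Cython or Pyrex API and says nothing about how either project implements the protocol; it only shows the field values for a small `bytearray`, and the acquire/release pairing that corresponds to bf_getbuffer and bf_releasebuffer.

```python
data = bytearray(b"abcdef")
view = memoryview(data)   # acquires a buffer export from the bytearray

# These attributes mirror fields of the C-level Py_buffer struct.
fields = {
    "format": view.format,
    "itemsize": view.itemsize,
    "ndim": view.ndim,
    "shape": view.shape,
    "strides": view.strides,
    "readonly": view.readonly,
}

# While the export is alive, the owner refuses to be resized.
try:
    data.extend(b"gh")
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

# Releasing the view drops the export, after which resizing works again.
view.release()
data.extend(b"gh")
```

The refusal to resize while a view is outstanding is one concrete form of the "locking" concern mentioned above.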
-- 
Greg

From greg.ewing at canterbury.ac.nz Fri May 23 00:32:51 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 23 May 2008 10:32:51 +1200
Subject: [Cython] Unicode issues
In-Reply-To: <483590B1.1050609@behnel.de>
References: <48309CEF.2010102@behnel.de> <43685.194.114.62.39.1211216158.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832320B.2050900@canterbury.ac.nz> <12216.194.114.62.65.1211269185.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4832C35D.8080107@canterbury.ac.nz> <29610.194.114.62.65.1211288333.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4F41AEC2-FE07-45FA-86F3-82633DF6897F@math.washington.edu> <483590B1.1050609@behnel.de>
Message-ID: <4835F493.9020602@canterbury.ac.nz>

Stefan Behnel wrote:
> I think you are referring to things like adding C numbers to Python numbers in
> Python space. That's a trivial case where little memory is involved, and these
> objects will be cleaned up almost immediately. Here, we are talking about
> duplicating data in memory

We're all getting off on a wild tangent here. The code snippet I wrote was erroneous. It was meant to be an illustration of explicit decoding, not to imply that automatic memory allocation should be done if you wrote something like that.

-- 
Greg

From viyer at facebook.com Fri May 23 01:16:22 2008
From: viyer at facebook.com (Venky Iyer)
Date: Thu, 22 May 2008 16:16:22 -0700
Subject: [Cython] Casting python objects to void* and back
In-Reply-To:
Message-ID:

> (cross-posted to pyrex/cython, apologies if this is bad behavior)
>
> I've run into a number of issues with casting python objects to void* and
> back.
>
> Ultimately the reason I want to do this is to allow C functions to call python
> functions as callbacks, passing python callables as void* to C is a pattern
> that has been discussed on these lists multiple times.
> > However, in addition to python callables, I want to figure out how I can pass > other python objects to the python callback through the void* trick. > > Here is some test code for this: > > --------- > void.pyx > --------- > > cdef void* pack_tuple( int i, int j ): > t = ( i, j ) > return t > > cdef void* pack_dict( int i, int j ): > d = { 'i': i, 'j': j } > return d > > cdef unpack_tuple( void* t ): > u = t > > print u > print u[0] > print u[1] > > # (, 10) > # (, ) > # 10 > > > cdef unpack_dict( void* d ): > u = d > > # print u > > # {'Py_Repr': [{...}, [...]]} > # Fatal Python error: GC object already tracked > # Aborted > > print u['i'] > print u['j'] > > # 5 > # 10 > > cpdef run_tuple(int i, int j): > print "__Tuple__" > cdef void* packed > packed = pack_tuple(i,j) > unpack_tuple(packed) > print "Done" > > cpdef run_dict(int i, int j): > print "__Dict__" > cdef void* packed > packed = pack_dict(i,j) > unpack_dict(packed) > print "Done" > > ------------ > void_test.py > ------------ > > import void > > if __name__ == "__main__": > > void.run_tuple(5,10) > void.run_dict(5,10) > > > --x-x-x--- > > Here are the problems I'm encountering: > > 1) packing as a tuple just doesn't work. The first argument (5) cannot be > retrieved. See output in comments above. > > 2) If I put in a dummy first argument in pack_tuple, like ('', 5, 10), I get > (, 5, 10), but accessing index 0 segfaults. I can access index 1 and 2 > (containing 5 and 10 respectively) just fine though. What is going on here? > > 3) Packing as a dict seems to work well, except that if I try to print this > dict, it aborts. > > 4) Calling run_tuple causes the program to hang on exit. > > 5) Lists behave similar to tuples. > > Any help would be greatly appreciated. > For reference, I'm using > > $ cython -v > Cython version 0.9.6.14 > > $ uname -a > Linux greyice 2.6.24-17-generic #1 SMP Thu May 1 14:31:33 UTC 2008 i686 > GNU/Linux > > $ gcc -v > Using built-in specs. 
> Target: i486-linux-gnu > Configured with: ../src/configure -v > --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr > --enable-shared --with-system-zlib --libexecdir=/usr/lib > --without-included-gettext --enable-threads=posix --enable-nls > --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 > --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr > --enable-targets=all --enable-checking=release --build=i486-linux-gnu > --host=i486-linux-gnu --target=i486-linux-gnu > Thread model: posix > gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7) > > $ python -V > Python 2.5.2 > > Compilation and running: > > $ cython void.pyx > $ gcc -c -fPIC -I/usr/include/python2.5/ void.c > $ gcc -shared void.o -o void.so > $ python void_test.py > > Thanks > Venky Iyer From robertwb at math.washington.edu Fri May 23 01:52:38 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 22 May 2008 16:52:38 -0700 Subject: [Cython] [Pyrex] Casting python objects to void* and back In-Reply-To: References: Message-ID: <8A8DB4B5-AD6D-42AF-AE9D-7B6366C1173F@math.washington.edu> It looks like your problems boil down to garbage collection. For example, you wrote > cdef void* pack_tuple( int i, int j ): > t = ( i, j ) > return t At the end of the function, t is deallocated... You need to manually Py_INCREF(t) so it knows it needs to keep t around after the function is done, and then figure out where (and when) to deallocate it. - Robert On May 22, 2008, at 4:16 PM, Venky Iyer wrote: > >> (cross-posted to pyrex/cython, apologies if this is bad behavior) >> >> I've run into a number of issues with casting python objects to >> void* and >> back. >> >> Ultimately the reason I want to do this is to allow C functions to >> call python >> functions as callbacks, passing python callables as void* to C is >> a pattern >> that has been discussed on these lists multiple times.
>> >> However, in addition to python callables, I want to figure out how >> I can pass >> other python objects to the python callback through the void* trick. >> >> Here is some test code for this: >> >> --------- >> void.pyx >> --------- >> >> cdef void* pack_tuple( int i, int j ): >> t = ( i, j ) >> return t >> >> cdef void* pack_dict( int i, int j ): >> d = { 'i': i, 'j': j } >> return d >> >> cdef unpack_tuple( void* t ): >> u = t >> >> print u >> print u[0] >> print u[1] >> >> # (, 10) >> # (, ) >> # 10 >> >> >> cdef unpack_dict( void* d ): >> u = d >> >> # print u >> >> # {'Py_Repr': [{...}, [...]]} >> # Fatal Python error: GC object already tracked >> # Aborted >> >> print u['i'] >> print u['j'] >> >> # 5 >> # 10 >> >> cpdef run_tuple(int i, int j): >> print "__Tuple__" >> cdef void* packed >> packed = pack_tuple(i,j) >> unpack_tuple(packed) >> print "Done" >> >> cpdef run_dict(int i, int j): >> print "__Dict__" >> cdef void* packed >> packed = pack_dict(i,j) >> unpack_dict(packed) >> print "Done" >> >> ------------ >> void_test.py >> ------------ >> >> import void >> >> if __name__ == "__main__": >> >> void.run_tuple(5,10) >> void.run_dict(5,10) >> >> >> --x-x-x--- >> >> Here are the problems I'm encountering: >> >> 1) packing as a tuple just doesn't work. The first argument (5) >> cannot be >> retrieved. See output in comments above. >> >> 2) If I put in a dummy first argument in pack_tuple, like ('', 5, >> 10), I get >> (, 5, 10), but accessing index 0 segfaults. I can access >> index 1 and 2 >> (containing 5 and 10 respectively) just fine though. What is going >> on here? >> >> 3) Packing as a dict seems to work well, except that if I try to >> print this >> dict, it aborts. >> >> 4) Calling run_tuple causes the program to hang on exit. >> >> 5) Lists behave similar to tuples. >> >> Any help would be greatly appreciated. 
>> For reference, I'm using >> >> $ cython -v >> Cython version 0.9.6.14 >> >> $ uname -a >> Linux greyice 2.6.24-17-generic #1 SMP Thu May 1 14:31:33 UTC 2008 >> i686 >> GNU/Linux >> >> $ gcc -v >> Using built-in specs. >> Target: i486-linux-gnu >> Configured with: ../src/configure -v >> --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr >> --enable-shared --with-system-zlib --libexecdir=/usr/lib >> --without-included-gettext --enable-threads=posix --enable-nls >> --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 >> --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc -- >> enable-mpfr >> --enable-targets=all --enable-checking=release --build=i486-linux-gnu >> --host=i486-linux-gnu --target=i486-linux-gnu >> Thread model: posix >> gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7) >> >> $ python -V >> Python 2.5.2 >> >> Compilation and running: >> >> $ cython void.pyx >> $ gcc -c -fPIC -I/usr/include/python2.5/ void.c >> $ gcc -shared void.o -o void.so >> $ python void_test.py >> >> Thanks >> Venky Iyer > > > _______________________________________________ > Pyrex mailing list > Pyrex at lists.copyleft.no > http://lists.copyleft.no/mailman/listinfo/pyrex From jek-gmane at kleckner.net Fri May 23 04:00:40 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Thu, 22 May 2008 19:00:40 -0700 Subject: [Cython] cython-devel: Python 2.6 issues In-Reply-To: References: Message-ID: Lisandro Dalcin wrote: [snip] > Please help me if any of you have a bit of time, IMHO Python 2.6 > should have a higher priority for Cython/Pyrex, right?. I agree that 2.6 should be more important than 3.0. Comments? 
From greg.ewing at canterbury.ac.nz Fri May 23 04:04:54 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 23 May 2008 14:04:54 +1200 Subject: [Cython] [Pyrex] Casting python objects to void* and back In-Reply-To: <8A8DB4B5-AD6D-42AF-AE9D-7B6366C1173F@math.washington.edu> References: <8A8DB4B5-AD6D-42AF-AE9D-7B6366C1173F@math.washington.edu> Message-ID: <48362646.5040409@canterbury.ac.nz> Robert Bradshaw wrote: > At the end of the function, t is deallocated... You need to manually > Py_INCREF(t) Rather than manually increfing and decrefing, I'd recommend keeping an ordinary reference somewhere that you know will stay alive long enough, as it will be much less error prone. Manual refcounting should only be used as a last resort. -- Greg From robertwb at math.washington.edu Fri May 23 04:38:39 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 22 May 2008 19:38:39 -0700 Subject: [Cython] [Pyrex] Casting python objects to void* and back In-Reply-To: <48362646.5040409@canterbury.ac.nz> References: <8A8DB4B5-AD6D-42AF-AE9D-7B6366C1173F@math.washington.edu> <48362646.5040409@canterbury.ac.nz> Message-ID: <49BEE1CB-8676-4DD8-B22A-44EA3A6553EB@math.washington.edu> On May 22, 2008, at 7:04 PM, Greg Ewing wrote: > Robert Bradshaw wrote: >> At the end of the function, t is deallocated... You need to manually >> Py_INCREF(t) > > Rather than manually increfing and decrefing, I'd recommend > keeping an ordinary reference somewhere that you know will > stay alive long enough, as it will be much less error prone. > Manual refcounting should only be used as a last resort. Yes, this is a much better way of keeping track of things. - Robert From robertwb at math.washington.edu Fri May 23 05:16:07 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 22 May 2008 20:16:07 -0700 Subject: [Cython] Recursive vs. 
visitor pattern In-Reply-To: <4834D795.1000401@canterbury.ac.nz> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz> Message-ID: <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> On May 21, 2008, at 7:16 PM, Greg Ewing wrote: > Dag Sverre Seljebotn wrote: >> The big refactoring you refer to is related to >> seperating the type analysis and type coercion phases > > I'm not sure whether you would gain much by doing this. > The place where you discover that a coercion is needed > is during type analysis, where you look at the types of > the operations, decide whether they're compatible, and > see whether they need to be converted to a common type, > etc. Also you work out what the result type will be and > annotate the node with it. > > To move the creation of coercion nodes into a separate > pass, you would have to leave an annotation on the node > saying that a coercion is needed. But if some phase > between then and doing the actual coercions rearranges > the parse tree, these annotations may no longer be > correct. > > So you would have to disallow any other phases between > type analysis and coercion, or at least put some > restrictions on what they can do, such as not altering > the structure of the parse tree, or doing anything > else that could invalidate the calculated types. One would require that transformations done at this point leave the tree in a correct state. As for the separation of coercion, rather than mark a node as needing coercion on the first pass, I would decide whether or not it needs coercion (by looking at types) right before actually creating the coerce nodes. > What sort of things were you intending to do in between > type analysis and coercion? Could they still be done under > these restrictions? The phase that I'd like to stick here is type inference. 
The type analysis would type all declared variables, and in some cases assign types that are dependent on other (yet unknown) types. One would then run a type-resolution algorithm on the data of the symbol table, which could be used to actually resolve all the types. > More generally, sometimes trying to split things up into > more phases can make things more complicated rather than > simpler. As an example of this, I'm currently thinking > about eliminating the allocate_temps subphase of expression > analysis and combining it with the code generation phase. > > The reason is that there's currently a rather non-obvious > dependency between these phases. The order in which temp > variables are allocated and released during allocate_temps > has to exactly match the order in which code is generated > that creates and disposes the references that will be > put in those temps. This makes it rather tricky to both > write and maintain code for these two phases. > > The reason they're separate phases at the moment is that > I was initially writing the generated code directly to the > output file, so I had to know what temp variable declarations > would be needed before starting to write any of the > body code for a function. However, I'm currently writing > the declaration and executable code to separate buffers and > combining them afterwards, so there's probably no need for a > separate allocate_temps pass any more, and combining it > with code generation is likely to simplify quite a lot > of things. This actually sounds like a very good idea. > >> It would be better (see below) to have "deeper" cuts, i.e. so that >> one >> could say "now the entire tree has been analysed", "now the entire >> tree >> is ready for code generation", rather than some parts (functions >> etc.) >> being in a seperate state. > > The reason for doing functions that way is that it seemed > wasetful to keep all the symbol tables for the local > scopes around longer than necessary. 
> > That decision was probably influenced by an earlier project > in which I wrote a compiler for a Modula-like language that ran > on machines much smaller than we have today. It used a 3-pass > arrangement that kept all the symbol tables for everything > between passes, with the result that it could only compile > a module a few hundred lines long before running out of memory. > > That experience gave me an appreciation of why Wirth prefers > to write single-pass compilers. > > Although the memory issue probably isn't a concern nowadays, the > aforementioned experience led me to approach the problem with the > mindset of using as few passes as possible. The only reason I used > separate analysis and generation phases at all in the beginning > was so that you can refer to C functions that are defined further > down without needing forward declarations. Certainly memory (at this level) isn't near as tight as it once was. One advantage of lots of passes is the ability to more easily insert (optional) optimization passes. >> Consider for instance trivial inner functions. One >> natural way to implement this is just "throwing the function out" >> ... >> so you *somehow* need an ugly kludge to >> get around this, and spend time thinking about that > > I don't think it would be all that difficult to make a pass > over the function body just before generating code for it, > that generates code for any nested functions. > > In fact, I have a suspicion that for the special case you're > talking about (no references to intermediate scopes) it would > "just work", because the analysis phase will already have > generated an appropriately-mangled C name for the function. > > On the other hand, "throwing the function out" presents > difficulties of its own. 
As you mentioned, some kind of name > mangling would need to be done, which requires knowing which > names are supposed to refer to that function, so you need > some kind of symbol table functionality available in the > pass where you do the throwing-out. > > The obvious thing is to use the real symbol table, but > that means doing it after the declaration analysis phase, > by which time the only problem remaining is how to get the > C code generated in the right place -- which as I've > said isn't really all that hard. +1 I think "throwing the function out" presents more difficulties than handling it right before the code generation phase. Conceptually it would be easier to say things like "all declarations in the tree have now been processed" though. >> (In reality there'll be full closures instead, which just means >> generating a "cdef class" (with state) at module scope rather than a >> function. I'd like to see code doing that in less than 150 lines >> without >> using any of my "new structure" proposals...) > > I'll be impressed if you can do it in 150 lines whatever > structure you're using. But in any case, I suspect that > the hardest part of this will be coping with all the > nested scope issues, and that it will be easiest to do > that while the parse tree still reflects the lexical > structure of the original code > >> Consider for instance the "with" statement... the only "natural" >> way to implement it within the current structure is to implement >> "WithStatementNode" > > Not necessarily. It may well be feasible to implement it > by assembling existing nodes, and I'll be looking into that. Yes, this would basically be manually implementing the transformation in the parser module.
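A minimal sketch, not from the original messages, of the two-buffer idea discussed earlier in this thread: declarations and executable code go to separate buffers, so temps can be allocated while the body is being generated rather than in a separate allocate_temps pass. The FunctionWriter class and the emitted C names (make_value, use_value) are hypothetical and do not reflect the actual Pyrex or Cython code generator.

```python
class FunctionWriter:
    """Hypothetical helper: collects declarations and body code separately."""

    def __init__(self, name):
        self.name = name
        self.declarations = []   # temp variable declarations
        self.body = []           # executable statements
        self.free_temps = []     # released temps available for reuse
        self.temp_count = 0

    def allocate_temp(self):
        """Hand out a temp, declaring it the first time it is needed."""
        if self.free_temps:
            return self.free_temps.pop()
        self.temp_count += 1
        temp = "__tmp%d" % self.temp_count
        self.declarations.append("PyObject *%s = 0;" % temp)
        return temp

    def release_temp(self, temp):
        self.free_temps.append(temp)

    def put(self, line):
        self.body.append(line)

    def render(self):
        """Combine the two buffers only after the whole body is generated."""
        lines = ["static void %s(void) {" % self.name]
        lines += ["  " + d for d in self.declarations]
        lines += ["  " + s for s in self.body]
        lines.append("}")
        return "\n".join(lines)


writer = FunctionWriter("f")
t1 = writer.allocate_temp()
writer.put("%s = make_value();" % t1)
writer.put("use_value(%s);" % t1)
writer.put("Py_DECREF(%s);" % t1)
writer.release_temp(t1)
t2 = writer.allocate_temp()       # reuses the released temp
writer.put("%s = make_value();" % t2)
generated = writer.render()
```

Because allocation and code emission happen in the same place, the order of allocate/release calls cannot drift out of step with the generated code. That drift is the maintenance hazard described for the separate-pass design.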
- Robert From robertwb at math.washington.edu Fri May 23 07:03:05 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 22 May 2008 22:03:05 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> Message-ID: <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> On May 22, 2008, at 11:35 AM, Dag Sverre Seljebotn wrote: > Stefan wrote: >> Hi, >> >> Dag Sverre Seljebotn wrote: >>> http://wiki.cython.org/enhancements/compiledducktyping >> >> Isn't the overloading stuff what PEP 3124 is heading for? >> >> http://www.python.org/dev/peps/pep-3124/ >> >> I think parameter annotations would make sense here. > > Compile-time duck typing is not really just overloading, it is > overloaded > template methods, i.e. like this in C++: > > template > A max(A a, B b) {...} > > (Unlike overloading, compile-time ducktyping has no Java equivalent). The last thing I want to do is head down a path of trying to reimplement C++. It has a lot of very powerful features, but they come at a cost. Perhaps if it can be pulled off smoothly enough though... > It is *implemented* by having multiple overloaded functions > internally, > but making overloading possible is an (intended) side-effect. PEP > 3124 can > very easily be implemented in addition on top of my proposal, but I > believe it too is orthogonal to what I'm proposing. > > The idea was to kill many birds with one rock: > > - Real Cython macros > > - Possible to drop typing in a lot of places (brings us closer to type > inference), leading to very convenient and easy optimizations. 
This > will > just work: > > def max(a, b): return a if a >= b else b > > and work "like in Python" without any performance penalties or > change in > behaviour from coercing back and forth to object (if you just > skimmed the > proposal you might have missed the controversial bit where I suggested > that "def" should just change behaviour (in a backwards-compatible way > though), rather than introducing a "duckdef"). It should come as no surprise that I don't like the keyword "duckdef." On the other hand, automatically treating all def functions in this way is scary (imagine several copies of a 1000 line def function, and picking an arbitrary cutoff is, well, arbitrary). It also seems a bit too magic for def functions--if anything this should be applied to cdef functions instead. The "two kinds of C types" has me worried too, and another concern is how to properly deduce the return type. (Maybe given the inputs, it would re-analyze the function and pick the type of the return statement?) Another option (just throwing it out, not convinced it's better) is being explicit like cdef max(generic a, generic b): return a if a >= b else b where "generic" is a special type. Or a keyword could be given as in cdef generic max(a, b): return a if a >= b else b or it could be restricted to inline functions (where it would obviously have the most use). > - Overloading (as in, different behaviour for different object > types) is > sort-of made possible through the use of "isinstance" like in > (current!) > Python, so there is less to learn. If we want to, PEP 3124 can be > implemented in addition (with the same spec, i.e. map it to a list of > isinstance tests). The overloading mechanisms of PEP 3124 seem much cleaner than a list of isinstances, and I would suggests perhaps that cdef functions could be overloadable by default. But, as you said, this is somewhat orthogonal. 
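A minimal sketch, not from the original message, of the "several copies" concern: with a generic argument type, one specialization would exist per distinct combination of argument types actually used. The decorator, its bookkeeping, and the mangled-name scheme below are hypothetical illustrations in plain Python. A real compiler would emit a separately typed C function for each entry instead of recording a name.

```python
def generic(func):
    """Hypothetical decorator: one 'specialization' per argument-type tuple."""
    specializations = {}

    def dispatch(*args):
        signature = tuple(type(a) for a in args)
        if signature not in specializations:
            # A compiler would emit a separately typed C function here; this
            # sketch only records a mangled name for that copy.
            mangled = "_".join(
                ["", "generic"] + [t.__name__ for t in signature] + [func.__name__]
            )
            specializations[signature] = mangled
        return func(*args)

    dispatch.specializations = specializations
    return dispatch


@generic
def maximum(a, b):
    return a if a >= b else b


r1 = maximum(1, 2.5)
r2 = maximum(3, 2)
r3 = maximum(4, 0.5)   # same (int, float) signature as the first call
names = sorted(maximum.specializations.values())
```

Three calls produce two specializations here, which is the code-size trade-off in miniature: the number of copies grows with the number of distinct type combinations, not with the number of call sites.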
- Robert From stefan_ml at behnel.de Fri May 23 08:57:12 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2008 08:57:12 +0200 Subject: [Cython] How to deal with the new buffer interface? In-Reply-To: References: <48355025.4090003@behnel.de> <48356754.5060709@canterbury.ac.nz> Message-ID: <48366AC8.7020407@behnel.de> Hi, Lisandro Dalcin wrote: > Travis designed the interface that way in order to avoid > any kind of Python object allocations. Ok, I buy that. I had already tried making Py_buffer a builtin struct type before writing my proposal, but that's trickier than you might think. I found a way to do that now, so I'll see that I can provide the interface exactly the way it is. Stefan From stefan_ml at behnel.de Fri May 23 09:35:38 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2008 09:35:38 +0200 Subject: [Cython] cython-devel: Python 2.6 issues In-Reply-To: References: Message-ID: <483673CA.1050100@behnel.de> Hi, Lisandro Dalcin wrote: > Stefan, my mpi4py projects is working very well on Python3.0. Stefan, > I'm in debt with you for such a good work! :) > I'm still running in trouble with Python2.6 . I believe this is > related to the new method cache for type objects. Interestingly > enough, this feature is also in Python 3.0, but in such version all > works fine!! Py2.6 behaves in part like 3.0 and otherwise like 2.x, so we may have to do some fine tuning here. > Could someone in this list try to test some simple pyx file with a > 'cdef class' defining some standard methods and a some 'classmethod' > and confirm me iff this works or not?? Iff someone can confirm the > failure, I'll take the step to figure out what's really going on... Can you come up with a test case? Look at the tests/run/ directory to see how it's done. http://wiki.cython.org/HackerGuide > Please help me if any of you have a bit of time, IMHO Python 2.6 > should have a higher priority for Cython/Pyrex, right?. 
If Py3 works fine, I don't see the need to set priorities. :) Stefan From robertwb at math.washington.edu Fri May 23 10:58:40 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 23 May 2008 01:58:40 -0700 Subject: [Cython] Status update: Transform utilities In-Reply-To: <482F33BE.7050402@student.matnat.uio.no> References: <482DB943.3090501@student.matnat.uio.no> <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> <559AF361-1085-499F-9514-6865A634AA7C@math.washington.edu> <482F33BE.7050402@student.matnat.uio.no> Message-ID: <72098516-C4C9-45E4-A4AD-6503CCDB6DA3@math.washington.edu> On May 17, 2008, at 12:36 PM, Dag Sverre Seljebotn wrote: >>> If it helps with the funny feeling, the string is only parsed >>> directly in >>> the constructor for TreeFragment (and it accepts a more directly >>> created >>> node structure too). So the string disappears from the story after >>> module >>> load time; from there it is only tree node manipulation (the >>> TreeFragment >>> acts like a template which is cloned while substituting nodes on the >>> clone). So there's nothing that requires the string but notational >>> convenience. >> >> Yes--the TreeFragment idea makes things much cleaner. Are EXPR et al. >> special words then? > > No -- they are inserted into the trees as a regular NameNode (so any > name resolving done by the parser will happen to them -- if this bites > us, it is a case for refactoring such functionality to a post-parse > transform; although in practice we'll just use another name :-) ). > > However, since they just sit there in the tree, they can be > replaced by > much more complex nodes. > > Hmm.. that's something! I should probably "replace the enclosing > ExprStatNode instead, if any". Thanks for putting me on track of a > bug :-) > > The context of the string is a module-level pyx-file without any > imports > but __builtin__. This can be altered though. 
(For instance, no > file-lookup happens in the case of "cimport" or "include" in such a > string, instead an exception is raised.) > > If this simple use of NameNode is not sufficient, one can extend the > parser to parse special nodes if in a special mode ("${BLOCK}" > parses to > "CustomNode(contenst="BLOCK")", however I don't think that will be > necesarry. As long as the fragments are replaced all at once (rather than sequentially) then it should work just fine. Otherwise you could have (using your example again) the block EXPR containing a NameNode VAR, which would be bad. - Robert From dagss at student.matnat.uio.no Fri May 23 14:11:28 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 23 May 2008 14:11:28 +0200 (CEST) Subject: [Cython] Compile-time duck typing In-Reply-To: <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> Message-ID: <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> Robert wrote: > On May 22, 2008, at 11:35 AM, Dag Sverre Seljebotn wrote: >> >> Compile-time duck typing is not really just overloading, it is >> overloaded >> template methods, i.e. like this in C++: >> >> template >> A max(A a, B b) {...} >> >> (Unlike overloading, compile-time ducktyping has no Java equivalent). > > The last thing I want to do is head down a path of trying to > reimplement C++. It has a lot of very powerful features, but they > come at a cost. Perhaps if it can be pulled off smoothly enough > though... Well, this was an attempt to take a step away from reimplementing C++, and do something more Pythonic. Anyway, all I really want is a way to built my __getitem__. This seemed less static and C++-ish than other alternatives I considered. 
The only real alternatives I see to some sort of compile-time ducktyping are: - Drop the __getitem__ syntax and rather add more fine-grained control (at the very least, reintroduce __getslice__ which Python is in the process of deprecating, probably add a whole bunch of other special functions too, or invent a fully new syntax). This is so that [] will be able to have different type signatures depending on the usecase. - From Stefan's thoughts on the buffer interface... I might also backport the buffer interface to 2.0 (i.e. we build in the structs etc. needed in the C file if needed) and implement a Cython native [] operator for buffers (and declare ndarray as exporting a buffer interface somehow etc.) Any other ideas? > It should come as no surprise that I don't like the keyword > "duckdef." On the other hand, automatically treating all def Indeed, I purposefully selected a distasteful keyword for the sake of the discussion, so that it wouldn't stick. As long as the feature is there somehow for __getitem__ to use I don't particularily mind syntax. One can make it very explicit first, then benchmark it etc. and see how it would fare if "def" was made automatically duck-typed, and perhaps make that change way, way later (when the stability of the scheme is proven). > functions in this way is scary (imagine several copies of a 1000 line > def function, and picking an arbitrary cutoff is, well, arbitrary). Well, C++ and Boost++ easily has 1000-line templates (or much more), containing multiple methods, being instantiated for multiple combinations of types (and passed to each other etc., creating very many combinations of types that must be expanded). The difference is that it is more explicit in C++. OTOH, in C++ there is not even an option to cut off anywhere (which we would have, if wanted). > It also seems a bit too magic for def functions--if anything this > should be applied to cdef functions instead. 
The "two kinds of C > types" has me worried too, and another concern is how to properly > deduce the return type. (Maybe given the inputs, it would re-analyze > the function and pick the type of the return statement?) I think having C types which behave in a way so that coercion back and forth to object would be a lossless, transparent operations would be progress in itself. Connecting it with a "ctypes" type hierarchy would not be unnatural. But I guess this is a seperate issue (which must be dealt with before duckdef can assume the role of def, but doesn't block duckdef through other, more explicit means before then). > Another option (just throwing it out, not convinced it's better) is > being explicit like > > cdef max(generic a, generic b): > return a if a >= b else b > > where "generic" is a special type. Or a keyword could be given as in Yes, I really, really like this one. If "generic" is not clear enough, "auto" is another option. Java has "?" for sort of the same job some places, however even if that is very clear it is also very ugly :-) BTW the future C++ specs will have "auto" IIRC, i.e. auto it = mycollection.begin() would declare it to be of the return type of begin(). Allowing "auto" in this way could be one incremental way towards type inference (allowing very easy development; one simply tries to use "auto" in more and more situations...). Function signatures would simply be the first place "auto" is allowed. > or it could be restricted to inline functions (where it would > obviously have the most use). -1. This would mean that def inline add(a, b): return a + b would handle overflowing conditions differently depending on whether the inline keyword is present or not. This would not exactly be obvious to the user. > The overloading mechanisms of PEP 3124 seem much cleaner than a list > of isinstances, and I would suggests perhaps that cdef functions > could be overloadable by default. But, as you said, this is somewhat > orthogonal. 
Keep in mind that with compile-time ducktyping, the only usecases for explicit overloading that remains are the exact same cases which currently shows up in type-less Python code, and where one currently has to use isinstance. This could be an argument for waiting with supporting explicit overloading until the time that Python decides to do so. However, these cases would tend to appear more often when writing wrapper code around C libraries (which use different names for different types, which you want to wrap using a single function name in Cython). Which is an argument in favor of explicit overloading. Dag Sverre From stefan_ml at behnel.de Fri May 23 17:45:17 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2008 17:45:17 +0200 Subject: [Cython] Compile-time duck typing In-Reply-To: <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> Message-ID: <4836E68D.4000105@behnel.de> Hi, Dag Sverre Seljebotn wrote: > Robert wrote: >> cdef max(generic a, generic b): >> return a if a >= b else b >> >> where "generic" is a special type. This is called "parametric polymorphism", a concept from ML and one of the reasons why compiled ML code is so freaking fast. http://en.wikipedia.org/wiki/Polymorphism_(computer_science)#Parametric_polymorphism > Yes, I really, really like this one. So do I. Very explicit. How would that map to C space? Would you have functions called ..._generic_int_float_max(int a, float b) ? What about extension types? Imagine a stupid type called "int_float" ... > If "generic" is not clear enough, "auto" is another option. I prefer "generic". 
"auto" sounds more like "find the correct type yourself" and doesn't make it clear that this can lead to code duplication and actually works for multiple type candidates. > def inline add(a, b): return a + b > > would handle overflowing conditions differently depending on whether the > inline keyword is present or not. This would not exactly be obvious to the > user. Agreed. > This could be an argument for waiting with supporting explicit overloading > until the time that Python decides to do so. Overloading and parametric polymorphism are orthogonal concepts. Both have their niche. Intuitively, I would expect both to be similarly complex to implement, but PP looks much more pythonic to me and can be done without caring about Python compatibility, so I would propose to go for that one first. Stefan From stefan_ml at behnel.de Fri May 23 18:11:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2008 18:11:21 +0200 Subject: [Cython] How to deal with the new buffer interface? In-Reply-To: <4835F2BE.3040900@canterbury.ac.nz> References: <48355025.4090003@behnel.de> <48356754.5060709@canterbury.ac.nz> <483572B7.2070503@behnel.de> <4835F2BE.3040900@canterbury.ac.nz> Message-ID: <4836ECA9.6070602@behnel.de> Hi, Greg Ewing wrote: > Stefan Behnel wrote: >> That sounds a bit like you are objecting to my proposal. What would you see as >> a better solution then? > > Make the relevant C struct a built-in Cython type > (a plain C struct, not an extension type) and expose > the buffer slots exactly the way they are. That's what I ended up doing. It's implemented in rev 586. 

Stefan

From dagss at student.matnat.uio.no  Fri May 23 18:56:15 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 23 May 2008 18:56:15 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To: <4836E68D.4000105@behnel.de>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
 <4836E68D.4000105@behnel.de>
Message-ID: <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>

Seems that we will be able to reach some kind of consensus for an initial
subset of my proposal?:

- The "name of the feature" will be parametric polymorphism.
- Introduced through the "generic" type specifier in "cdef" functions only.
- Such functions are not exportable across module boundaries (only same
  pyx, or as inline pxd functions)
- Allow the "typeof" operator in variable declarations in function bodies
  to resolve the actual type of the generic.

This should be forward-compatible to anything done with extending to def,
extending to cross-module support, making "generic" the default argument
"type" and so on. Such things can be dealt with and discussed when all the
above is in place and stable.

Agreed/not agreed?

If agreed upon, I'll strip down the doc and make it a CEP. And then if
Robert agrees it's the way to go for my project, I'll implement this as
part of my GSoC for the benefit of NumPy __getitem__ and __setitem__.
(Given enough time of course.)

Stefan wrote:
> Hi,
>
> Dag Sverre Seljebotn wrote:
>> Robert wrote:
>>> cdef max(generic a, generic b):
>>>     return a if a >= b else b
>>>
>>> where "generic" is a special type.
>
> This is called "parametric polymorphism", a concept from ML and one of the
> reasons why compiled ML code is so freaking fast.
>
> http://en.wikipedia.org/wiki/Polymorphism_(computer_science)#Parametric_polymorphism
>

Thanks, I'll use that term from now on.

>> Yes, I really, really like this one.
>
> So do I. Very explicit.
>
> How would that map to C space? Would you have functions called
>
> ..._generic_int_float_max(int a, float b)
>
> ? What about extension types? Imagine a stupid type called "int_float" ...

Well, obviously one would escape the _ character in type names if that was
used as a separator, but I know that's not the point here :-)

I wrote some lines about that in my doc as well, "cross-module duckdefs".

The cases are:

1) You know the implementation (same pyx or inline-code-in-pxd). In this
case you can create your own instantiations, and you simply mangle them
incrementally and keep record of them during the same Cython compilation
process. "pyx_pp1_max", "pyx_pp2_max"...

2) You do not know the implementation. But then any instantiations must
have been made when the module was compiled, one cannot make new ones, and
a lot of the utility for them disappears. I'd be happy to just deny
calling such functions across modules at first (and one can look at name
mangling and pre-instantiation later...or read the section in my doc).

There is another option though (also outlined in my doc) involving
replacing "generic" with "object" across module boundaries, so that only
speed is impacted, but it requires a new, parallel type system for native
types (making them coerce to Python objects that preserve overflow
semantics).

>> If "generic" is not clear enough, "auto" is another option.
>
> I prefer "generic". "auto" sounds more like "find the correct type
> yourself"
> and doesn't make it clear that this can lead to code duplication and
> actually
> works for multiple type candidates.

Ok, you convinced me.

>> This could be an argument for waiting with supporting explicit
>> overloading
>> until the time that Python decides to do so.
>
> Overloading and parametric polymorphism are orthogonal concepts. Both have
> their niche. Intuitively, I would expect both to be similarly complex to
> implement, but PP looks much more pythonic to me and can be done without
> caring about Python compatibility, so I would propose to go for that one
> first.

Sounds good.

Dag Sverre

From dagss at student.matnat.uio.no  Fri May 23 19:05:12 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 23 May 2008 19:05:12 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To: <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
 <4836E68D.4000105@behnel.de>
 <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
Message-ID: <50647.193.157.229.67.1211562312.squirrel@webmail.uio.no>

Need to make a correction, guess this wasn't ready anyway:

Dag Sverre wrote:
> Seems that we will be able to reach some kind of consensus for an initial
> subset of my proposal?:
>
> - The "name of the feature" will be parametric polymorphism.
> - Introduced through the "generic" type specifier in "cdef" functions
> only.

- Also allow "generic" in cdef methods in "cdef class"es under the same
restrictions (same pyx or inline-in-pxd). (Should one require that such
methods are also declared "final" or somesuch and cannot be overridden?
Combining parametric polymorphism and OOP polymorphism doesn't seem like a
good idea initially.)

> - Such functions are not exportable across module boundaries (only same
> pyx, or as inline pxd functions)
> - Allow the "typeof" operator in variable declarations in function bodies
> to resolve the actual type of the generic.
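
To picture case 1) from my previous mail (instantiations mangled
incrementally within one compilation), here is a rough plain-Python model.
It is only a sketch: GenericFunction and its "instances" dict are made-up
names, and a real implementation would emit one C function per type
combination at compile time rather than record types at call time.

```python
class GenericFunction(object):
    """Toy model of a 'generic' cdef function (hypothetical helper)."""

    def __init__(self, name, body):
        self.name = name
        self.body = body
        self.instances = {}  # tuple of argument types -> mangled name

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        if key not in self.instances:
            # New type combination: "instantiate" it under the next
            # incremental name, e.g. pyx_pp1_max, pyx_pp2_max, ...
            mangled = "pyx_pp%d_%s" % (len(self.instances) + 1, self.name)
            self.instances[key] = mangled
        return self.body(*args)


generic_max = GenericFunction("max", lambda a, b: a if a >= b else b)

generic_max(1, 2)      # first (int, int) call creates an instantiation
generic_max(1.5, 0.5)  # first (float, float) call creates another
generic_max(3, 4)      # reuses the (int, int) instantiation
```

Since the record lives only inside one compilation run, this also shows
why new instantiations cannot be made from another module.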

Dag Sverre

From stefan_ml at behnel.de  Fri May 23 19:05:40 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 19:05:40 +0200
Subject: [Cython] cython-devel: Python 2.6 issues
In-Reply-To:
References:
Message-ID: <4836F964.6010200@behnel.de>

Hi,

Lisandro Dalcin wrote:
> Could someone in this list try to test some simple pyx file with a
> 'cdef class' defining some standard methods and a some 'classmethod'
> and confirm me iff this works or not??

I'm not sure Pyrex/Cython ever supported class methods, but the following
works for me, also in Py2.6.

Stefan


__doc__ = u"""
>>> class1.plus1(1)
2
>>> class2.plus1(1)
2
>>> class3.plus1(1)
2
"""

def f_plus(a):
    return a + 1

class class1:
    plus1 = f_plus

class class2(object):
    plus1 = f_plus

cdef class class3:
    plus1 = f_plus

From stefan_ml at behnel.de  Fri May 23 19:10:56 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 19:10:56 +0200
Subject: [Cython] Compile-time duck typing
In-Reply-To: <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
 <4836E68D.4000105@behnel.de>
 <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
Message-ID: <4836FAA0.9020907@behnel.de>

Hi,

Dag Sverre Seljebotn wrote:
> - Allow the "typeof" operator in variable declarations in function bodies
> to resolve the actual type of the generic.

What's the "typeof" operator?

Stefan

From cwitty at newtonlabs.com  Fri May 23 19:14:16 2008
From: cwitty at newtonlabs.com (Carl Witty)
Date: Fri, 23 May 2008 10:14:16 -0700
Subject: [Cython] Compile-time duck typing
In-Reply-To: <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
Message-ID:

On Thu, May 22, 2008 at 11:35 AM, Dag Sverre Seljebotn wrote:
> Stefan wrote:
>> Hi,
>>
>> Dag Sverre Seljebotn wrote:
>>> http://wiki.cython.org/enhancements/compiledducktyping
>>
>> Isn't the overloading stuff what PEP 3124 is heading for?
>>
>> http://www.python.org/dev/peps/pep-3124/
>>
>> I think parameter annotations would make sense here.
>
> Compile-time duck typing is not really just overloading, it is overloaded
> template methods, i.e. like this in C++:
>
> template <typename A, typename B>
> A max(A a, B b) {...}
>
> (Unlike overloading, compile-time ducktyping has no Java equivalent).

I was hoping this would be powerful enough to write generic optimized
mathematics code for Sage, but it doesn't do enough for that.

For Sage, a function that takes "double" arguments might include:
  a = b + c
the corresponding function that takes "mpz_t" arguments would have:
  mpz_add(a, b, c)
the corresponding function with "mpfr_t" arguments would have:
  mpfr_add(a, b, c, GMP_RNDN)
for integers mod p (for sufficiently small p), it would be:
  a = (b + c) % p

On the other hand, different subclasses of object (like Integer and
RealNumber) should probably not have separate versions of the function
compiled for them.

Maybe it's not worth even trying to think about supporting this use
case with this proposal--anything that works would end up being
significantly more complicated than what you're talking about,
probably. But it would be nice to at least avoid making it harder to
implement this complicated thing in the future.
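
In rough plain-Python terms, the same addition step comes in at least
three shapes. This is a sketch only: Ref, add_value, add_into and
make_add_mod are names made up for illustration, not Sage or GMP API.

```python
class Ref(object):
    """Stand-in for a reference type such as mpz_t (hypothetical)."""

    def __init__(self, value=0):
        self.value = value


def add_value(b, c):
    # Value types such as double: a = b + c
    return b + c


def add_into(a, b, c):
    # Reference types: the result is written into an out-parameter,
    # mimicking the call shape of mpz_add(a, b, c).
    a.value = b.value + c.value


def make_add_mod(p):
    # Integers mod p: a = (b + c) % p
    def add_mod(b, c):
        return (b + c) % p
    return add_mod
```

A generic function body would have to choose between these shapes per
type, which is more than substituting a type name.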

Carl

From dagss at student.matnat.uio.no  Fri May 23 19:29:16 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 23 May 2008 19:29:16 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To: <4836FAA0.9020907@behnel.de>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
 <4836E68D.4000105@behnel.de>
 <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
 <4836FAA0.9020907@behnel.de>
Message-ID: <50652.193.157.229.67.1211563756.squirrel@webmail.uio.no>

> Hi,
>
> Dag Sverre Seljebotn wrote:
>> - Allow the "typeof" operator in variable declarations in function
>> bodies
>> to resolve the actual type of the generic.
>
> What's the "typeof" operator?

Just skim http://wiki.cython.org/enhancements/compiledducktyping once more
:-)

It is used in type specification context and allows for this:

cdef int some_func(generic arg):
    cdef typeof(arg) tmp
    tmp = arg + 4
    ....
    return tmp

(I suppose it doesn't allow fetching the type of the result though. Is
this wanted? How to make it happen... typeof_retval or
typeof(result=True)?)

If this turns out to be complicated we can just drop it from the initial
feature set. I think I don't need it for NumPy anyway.

Dag Sverre

From dagss at student.matnat.uio.no  Fri May 23 19:36:18 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 23 May 2008 19:36:18 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To:
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
Message-ID: <50657.193.157.229.67.1211564178.squirrel@webmail.uio.no>

Carl Witty wrote:
>
> I was hoping this would be powerful enough to write generic optimized
> mathematics code for Sage, but it doesn't do enough for that.

Let me hear your opinion on this though.

If this is implemented, one could move on to compile-time optimize away
isinstance in a few trivial cases. I.e. you could do:

cdef generic gen_add(generic b, generic c):
    if isinstance(b, mpz_t):
        ...
    elif isinstance(b, ...

This would be a later addition, but a natural one and one that I would
like to have for NumPy (regardless of whether overloading on the argument
types is introduced as well). Would this solve your case?
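
As a plain-Python picture of what I mean: the isinstance test is decided
once per argument type, when the function is instantiated, so each
instance contains only its own branch. This is a sketch under my own
assumptions; instantiate_gen_add is a made-up name and the branch bodies
are placeholders.

```python
def instantiate_gen_add(arg_type):
    """Build the specialised add for one argument type (hypothetical)."""
    # The type test runs here, once, instead of on every call.
    if issubclass(arg_type, float):
        def gen_add(b, c):
            return b + c
    elif issubclass(arg_type, int):
        def gen_add(b, c):
            # Placeholder for a type-specific call such as mpz_add.
            return int(b) + int(c)
    else:
        raise TypeError("no instantiation for %r" % (arg_type,))
    return gen_add


add_ints = instantiate_gen_add(int)
add_floats = instantiate_gen_add(float)
```

Calling add_ints(2, 3) then runs only the integer branch; the other
branches are simply not part of that instance.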

Dag Sverre

From wstein at gmail.com  Fri May 23 19:48:31 2008
From: wstein at gmail.com (William Stein)
Date: Fri, 23 May 2008 10:48:31 -0700
Subject: [Cython] video of Robert Bradshaw's talk on Cython
Message-ID: <85e81ba30805231048g59fd34d2v5311733cf61b1152@mail.gmail.com>

Hi,

There's a link to the Google Video for Robert Bradshaw's talk on Cython here:

http://wiki.wstein.org/2008/sageseminar/bradshaw

William

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

From stefan_ml at behnel.de  Fri May 23 20:14:57 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 20:14:57 +0200
Subject: [Cython] cython-devel: Python 2.6 issues
In-Reply-To: <4836F964.6010200@behnel.de>
References: <4836F964.6010200@behnel.de>
Message-ID: <483709A1.4020507@behnel.de>

Hi again,

Stefan Behnel wrote:
> Lisandro Dalcin wrote:
>> Could someone in this list try to test some simple pyx file with a
>> 'cdef class' defining some standard methods and a some 'classmethod'
>> and confirm me iff this works or not??
>
> I'm not sure Pyrex/Cython ever supported class methods, but the following
> works for me, also in Py2.6.
[...]
> def f_plus(a):
>     return a + 1
>
> class class1:
>     plus1 = f_plus

:) sorry, my bad, that would be the equivalent of a "staticmethod".

For the code below I get an exception at module load time:

  File "classmethod.pyx", line 15, in classmethod (classmethod.c:507)
TypeError: Class-level classmethod() can only be called on a method_descriptor
or instance method.

Is this what you meant?

Stefan


__doc__ = u"""
>>> class1.plus(1)
6
>>> class2.plus(1)
7
>>> class3.plus(1)
8
"""

def f_plus(cls, a):
    return cls.a + a

class class1:
    a = 5
    plus = classmethod(f_plus)

class class2(object):
    a = 6
    plus = classmethod(f_plus)

cdef class class3:
    a = 7
    plus = classmethod(f_plus)

From stefan_ml at behnel.de  Fri May 23 20:30:58 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 20:30:58 +0200
Subject: [Cython] cython-devel: Python 2.6 issues
In-Reply-To:
References:
Message-ID: <48370D62.9070300@behnel.de>

Hi,

Lisandro Dalcin wrote:
> Could someone in this list try to test some simple pyx file with a
> 'cdef class' defining some standard methods and a some 'classmethod'
> and confirm me iff this works or not??

There is a special hack in Symtab.py for classmethod() calls. When I remove
it, it works from 2.3 through 3.0.

Robert, according to the hg history, this was supposed to work around a
problem back in September 2007 (rev. 162). Can you give me an idea why this
was there?

Stefan

From cwitty at newtonlabs.com  Fri May 23 20:37:40 2008
From: cwitty at newtonlabs.com (Carl Witty)
Date: Fri, 23 May 2008 11:37:40 -0700
Subject: [Cython] Compile-time duck typing
In-Reply-To: <50657.193.157.229.67.1211564178.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <50657.193.157.229.67.1211564178.squirrel@webmail.uio.no>
Message-ID:

On Fri, May 23, 2008 at 10:36 AM, Dag Sverre Seljebotn wrote:
> Carl Witty wrote:
>>
>> I was hoping this would be powerful enough to write generic optimized
>> mathematics code for Sage, but it doesn't do enough for that.
>
> Let me hear your opinion on this though.
>
> If this is implemented, one could move on to compile-time optimize away
> isinstance in a few trivial cases. I.e you could do:
>
> cdef generic gen_add(generic b, generic c):
>     if isinstance(b, mpz_t):
>         ...
>     elif isinstance(b, ...
>
> This would be a later addition, but a natural one and one that I would
> like to have for NumPy (regardless of whether overloading on the argument
> types is introduced as well). Would this solve your case?

Perhaps. What does "cdef generic gen_add" mean? (How does it figure
out what the return type really is?)

It would have to somehow split into two cases, between reference types
and value types:

if gen_is_reference(b):
    gen_ref_add(a, b, c)
else:
    a = gen_add(b, c)

Carl

From robertwb at math.washington.edu  Fri May 23 20:40:19 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 23 May 2008 11:40:19 -0700
Subject: [Cython] How to deal with the new buffer interface?
In-Reply-To: <4836ECA9.6070602@behnel.de>
References: <48355025.4090003@behnel.de>
 <48356754.5060709@canterbury.ac.nz>
 <483572B7.2070503@behnel.de>
 <4835F2BE.3040900@canterbury.ac.nz>
 <4836ECA9.6070602@behnel.de>
Message-ID: <84E9C65D-2A54-4B47-9B61-59CF9023B5FD@math.washington.edu>

On May 23, 2008, at 9:11 AM, Stefan Behnel wrote:

> Hi,
>
> Greg Ewing wrote:
>> Stefan Behnel wrote:
>>> That sounds a bit like you are objecting to my proposal. What
>>> would you see as
>>> a better solution then?
>>
>> Make the relevant C struct a built-in Cython type
>> (a plain C struct, not an extension type) and expose
>> the buffer slots exactly the way they are.
>
> That's what I ended up doing. It's implemented in rev 586.

Yes, this certainly sounds like the right approach.

- Robert

From robertwb at math.washington.edu  Fri May 23 21:12:39 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 23 May 2008 12:12:39 -0700
Subject: [Cython] Compile-time duck typing
In-Reply-To:
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
Message-ID:

On May 23, 2008, at 10:14 AM, Carl Witty wrote:

> On Thu, May 22, 2008 at 11:35 AM, Dag Sverre Seljebotn
> wrote:
>> Stefan wrote:
>>> Hi,
>>>
>>> Dag Sverre Seljebotn wrote:
>>>> http://wiki.cython.org/enhancements/compiledducktyping
>>>
>>> Isn't the overloading stuff what PEP 3124 is heading for?
>>>
>>> http://www.python.org/dev/peps/pep-3124/
>>>
>>> I think parameter annotations would make sense here.
>>
>> Compile-time duck typing is not really just overloading, it is
>> overloaded
>> template methods, i.e. like this in C++:
>>
>> template <typename A, typename B>
>> A max(A a, B b) {...}
>>
>> (Unlike overloading, compile-time ducktyping has no Java equivalent).
>
> I was hoping this would be powerful enough to write generic optimized
> mathematics code for Sage, but it doesn't do enough for that.
>
> For Sage, a function that takes "double" arguments might include:
>   a = b + c
> the corresponding function that takes "mpz_t" arguments would have:
>   mpz_add(a, b, c)
> the corresponding function with "mpfr_t" arguments would have:
>   mpfr_add(a, b, c, GMP_RNDN)
> for integers mod p (for sufficiently small p), it would be:
>   a = (b + c) % p
>
> On the other hand, different subclasses of object (like Integer and
> RealNumber) should probably not have separate versions of the function
> compiled for them.
>
> Maybe it's not worth even trying to think about supporting this use
> case with this proposal--anything that works would end up being
> significantly more complicated than what you're talking about,
> probably.
But it would be nice to at least avoid making it harder to
> implement this complicated thing in the future.

Yes, implementing such a thing would be much more complicated to do.
However, one can sort of do it now by making a .pxi file that has all
the algorithms, then including it in pyx files that would define the
types, (inline) arithmetic functions, etc.

- Robert

From dalcinl at gmail.com  Fri May 23 21:20:03 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 23 May 2008 16:20:03 -0300
Subject: [Cython] cython-devel: Python 2.6 issues
In-Reply-To: <483709A1.4020507@behnel.de>
References: <4836F964.6010200@behnel.de>
 <483709A1.4020507@behnel.de>
Message-ID:

On 5/23/08, Stefan Behnel wrote:
> For the code below I get an exception at module load time:
>
> File "classmethod.pyx", line 15, in classmethod (classmethod.c:507)
> TypeError: Class-level classmethod() can only be called on a method_descriptor
> or instance method.
>
> Is this what you meant?
> http://codespeak.net/mailman/listinfo/cython-dev
>

No, the problem is really contrived. I wrote a simple test and tried to
run it within the doctest infrastructure of Cython, but then the problem
does not show up. Why? Because the method cache is surely being
invalidated by some Python code in the doctest infrastructure, and then
all works as expected.

We have to try it manually to see the issue in action. A small example
below:

The Cython source code is this:

# file: classmeth.pyx
cdef class A:
    def foo(self):
        print(u"A.foo")
    def bar(cls):
        print(u"A.bar")
    bar = classmethod(bar)

After cythonizing and compiling:

$ python2.6
Python 2.6a3+ (trunk:63538, May 22 2008, 17:27:43)
...
>>> import classmeth
>>> classmeth.A.foo
>>> classmeth.A.bar
Segmentation fault

To check that the problem is actually in the new method cache for type
objects, test again clearing the cache:

$ python2.6
Python 2.6a3+ (trunk:63538, May 22 2008, 17:27:43)
...
>>> import classmeth
>>> import sys; sys._clear_type_cache()
>>> classmeth.A.foo
>>> classmeth.A.bar

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu  Fri May 23 21:32:57 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 23 May 2008 12:32:57 -0700
Subject: [Cython] Compile-time duck typing
In-Reply-To: <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
 <4836E68D.4000105@behnel.de>
 <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
Message-ID:

On May 23, 2008, at 9:56 AM, Dag Sverre Seljebotn wrote:

> Seems that we will be able to reach some kind of consensus for an
> initial
> subset of my proposal?:
>
> - The "name of the feature" will be parametric polymorphism.
> - Introduced through the "generic" type specifier in "cdef"
> functions only.
> - Such functions are not exportable across module boundaries (only
> same
> pyx, or as inline pxd functions)
> - Allow the "typeof" operator in variable declarations in function
> bodies
> to resolve the actual type of the generic.
>
> This should be forward-compatible to anything done with extending
> to def,
> extending to cross-module support, making "generic" the default
> argument
> "type" and so on. Such things can be dealt with and discussed when
> all the
> above is in place and stable.
>
> Agreed/not agreed?
>
> If agreed upon, I'll strip down the doc and make it a CEP.

+1 I think we're to this point.

> And then if Robert agrees it's the way to go for my project, I'll
> implement this as
> part of my GSoC for the benefit of NumPy __getitem__ and __setitem__.
> (Given enough time of course.)

Perhaps put an example of a __getitem__ method that uses this
feature. It seems mostly this could be handled with plain old
function overloading, but this would be a nice feature to have in and
of itself.

>
> Stefan wrote:
>> Hi,
>>
>> Dag Sverre Seljebotn wrote:
>>> Robert wrote:
>>>> cdef max(generic a, generic b):
>>>>     return a if a >= b else b
>>>>
>>>> where "generic" is a special type.
>>
>> This is called "parametric polymorphism", a concept from ML and
>> one of the
>> reasons why compiled ML code is so freaking fast.
>>
>> http://en.wikipedia.org/wiki/Polymorphism_(computer_science)#Parametric_polymorphism
>>
>
> Thanks, I'll use that term from now on.
>
>>> Yes, I really, really like this one.
>>
>> So do I. Very explicit.
>>
>> How would that map to C space? Would you have functions called
>>
>> ..._generic_int_float_max(int a, float b)
>>
>> ? What about extension types? Imagine a stupid type called
>> "int_float" ...
>
> Well, obviously one would escape the _ character in type names if
> that was
> used as a separator, but I know that's not the point here :-)
>
> I wrote some lines about that in my doc as well, "cross-module
> duckdefs".
>
> The cases are:
>
> 1) You know the implementation (same pyx or inline-code-in-pxd). In
> this
> case you can create your own instantiations, and you simply mangle
> them
> incrementally and keep record of them during the same Cython
> compilation
> process. "pyx_pp1_max", "pyx_pp2_max"...
>
> 2) You do not know the implementation. But then any instantiations
> must
> have been made when the module was compiled, one cannot make new
> ones, and
> a lot of the utility for them disappears.
I'd be happy to just deny
> calling such functions across modules at first (and one can look at
> name
> mangling and pre-instantiation later...or read the section in my
> doc).
>
> There is another option though (also outlined in my doc) involving
> replacing "generic" with "object" across module boundaries, so that
> only
> speed is impacted, but it requires a new, parallel type system for
> native
> types (making them coerce to Python objects that preserve overflow
> semantics).
>
>>> If "generic" is not clear enough, "auto" is another option.
>>
>> I prefer "generic". "auto" sounds more like "find the correct type
>> yourself"
>> and doesn't make it clear that this can lead to code duplication and
>> actually
>> works for multiple type candidates.
>
> Ok, you convinced me.
>
>>> This could be an argument for waiting with supporting explicit
>>> overloading
>>> until the time that Python decides to do so.
>>
>> Overloading and parametric polymorphism are orthogonal concepts.
>> Both have
>> their niche. Intuitively, I would expect both to be similarly
>> complex to
>> implement, but PP looks much more pythonic to me and can be done
>> without
>> caring about Python compatibility, so I would propose to go for
>> that one
>> first.
>
> Sounds good.
>
> Dag Sverre
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

From dalcinl at gmail.com  Fri May 23 21:57:26 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 23 May 2008 16:57:26 -0300
Subject: [Cython] please revert ASAP last 'classmethod' changes
Message-ID:

Stefan, please revert the last 'classmethod' change. That is not
going to solve anything; furthermore, I believe the old way was the
right way. You did not notice that because all your classmethods take
ONE argument.

See the attached file and please add it to 'tests/run' (I tried to push
to cython-devel, but it seems that my commit privileges were only for
the py3 branch, or something is wrong)

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
-------------- next part --------------
A non-text attachment was scrubbed...
Name: classmethod.pyx
Type: application/octet-stream
Size: 604 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080523/b79a1516/attachment.obj

From stefan_ml at behnel.de  Fri May 23 22:33:33 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 22:33:33 +0200
Subject: [Cython] please revert ASAP last 'classmethod' changes
In-Reply-To:
References:
Message-ID: <48372A1D.2040902@behnel.de>

Hi,

Lisandro Dalcin wrote:
> Stefan, please revert the last 'classmethod' change. That is not
> going to solve anything,

It did solve the problem for what I tested. :)

Rev. 591 has another fix that works for your new test case.

Stefan

From stefan_ml at behnel.de  Fri May 23 23:20:54 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 May 2008 23:20:54 +0200
Subject: [Cython] classmethod crashes in Py3
In-Reply-To: <48372A1D.2040902@behnel.de>
References: <48372A1D.2040902@behnel.de>
Message-ID: <48373536.9070500@behnel.de>

Hi again,

Stefan Behnel wrote:
> Lisandro Dalcin wrote:
>> Stefan, please revert the last 'classmethod' change. That is not
>> going to solve anything,
>
> It did solve the problem for what I tested. :)
>
> Rev. 591 has another fix that works for your new test case.

... although that now crashes in Py3. Weird.

Given my last patch and your example:

----------------------
class class1:
    a = 5
    plus = classmethod(f_plus)
    def view(cls):
        print cls.__name__
    view = classmethod(view)

class class2(object):
    a = 6
    plus = classmethod(f_plus)
    def view(cls):
        print cls.__name__
    view = classmethod(view)
----------------------

when I comment out any of the "view = classmethod()" lines, it doesn't
segfault and raises the expected exception instead. But when I leave them both
in as above, it crashes.

The difference to Py2 is that all 'methods' are PyCFunction objects and take
the new path I added to make it work in Py2...

Lisandro, could you try to look into this a bit?

Stefan

From dalcinl at gmail.com  Sat May 24 01:12:58 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 23 May 2008 20:12:58 -0300
Subject: [Cython] classmethod crashes in Py3
In-Reply-To: <48373536.9070500@behnel.de>
References: <48372A1D.2040902@behnel.de>
 <48373536.9070500@behnel.de>
Message-ID:

On 5/23/08, Stefan Behnel wrote:
> The difference to Py2 is that all 'methods' are PyCFunction objects and take
> the new path I added to make it work in Py2...
>
> Lisandro, could you try to look into this a bit?

Yes, I'll give a look, but tomorrow.

FYI, in Py2.6 my mpi4py code still does not work (but works in Py3!).
And the problem still is the method cache. And all this is because of
the (I believe unavoidable) way Cython has to play with the 'tp_dict'
slot in type object structures. Your fix was all right, but it will
not solve this nasty issue with the new method cache.
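
For anyone who wants to poke at the cache from pure Python, a small
sketch follows. It assumes CPython, where sys._clear_type_cache() is
available (newer interpreters may deprecate it, hence the guard);
clear_type_cache is my own helper name. A pure-Python class like this one
does not trigger the crash, it only shows the define-then-wrap pattern
and the manual cache reset.

```python
import sys
import warnings


def clear_type_cache():
    """Reset CPython's method cache if the hook exists (made-up helper)."""
    clear = getattr(sys, "_clear_type_cache", None)
    if clear is None:
        return False
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # the hook may be deprecated
        clear()
    return True


class A(object):
    def foo(self):
        return "A.foo"

    def bar(cls):
        return cls.__name__ + ".bar"
    bar = classmethod(bar)  # same define-then-wrap pattern as classmeth.pyx


clear_type_cache()
```

With a compiled extension type in place of A, the attribute lookups after
the reset are the ones that stopped crashing for me.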

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz  Sat May 24 02:32:20 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 24 May 2008 12:32:20 +1200
Subject: [Cython] Compile-time duck typing
In-Reply-To: <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no>
 <483592C7.3040602@behnel.de>
 <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no>
 <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu>
 <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no>
Message-ID: <48376214.3070700@canterbury.ac.nz>

Dag Sverre Seljebotn wrote:

> Well, this was an attempt to take a step away from reimplementing C++, and
> do something more Pythonic.

Seems like you're really trying to reinvent Haskell.
The term "compile-time duck typing" seems to describe
Haskell's type system quite well.

--
Greg

From stefan_ml at behnel.de  Sat May 24 10:35:18 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 24 May 2008 10:35:18 +0200
Subject: [Cython] classmethod crashes in Py3
In-Reply-To:
References: <48372A1D.2040902@behnel.de>
 <48373536.9070500@behnel.de>
Message-ID: <4837D346.2010402@behnel.de>

Hi,

Lisandro Dalcin wrote:
> FYI, in Py2.6 my mpi4py code still does not work (but works in Py3!).
> And the problem still is the method cache.

Then what about this patch? (based on what you proposed in an earlier post)

Still crashes for me on Py3, but it might work around the problem you
experience.
Stefan # HG changeset patch # User Stefan Behnel # Date 1211618008 -7200 # Node ID ac0cc8edbf55197cb7046447d59d0f8ba80c9b92 # Parent 867034806ace7bb715c9c4d5509b12075c73e270 invalidate type cache in Py2.6+ diff -r 867034806ace -r ac0cc8edbf55 Cython/Compiler/Nodes.py --- a/Cython/Compiler/Nodes.py Fri May 23 22:32:33 2008 +0200 +++ b/Cython/Compiler/Nodes.py Sat May 24 10:33:28 2008 +0200 @@ -2101,6 +2101,11 @@ class CClassDefNode(StatNode, BlockNode) # default values of method arguments. if self.body: self.body.generate_execution_code(code) + # in Py2.6+, we need to invalidate the type cache + code.putln("#if PY_VERSION_HEX >= 0x02060000") + code.putln("(%s)->tp_flags &= ~Py_TPFLAGS_VALID_VERSION_TAG;" % + self.entry.type.typeptr_cname) + code.putln("#endif") def annotate(self, code): if self.body: From robin at reportlab.com Sat May 24 14:07:34 2008 From: robin at reportlab.com (Robin Becker) Date: Sat, 24 May 2008 13:07:34 +0100 Subject: [Cython] inline error in primes.c Message-ID: <48380506.1020709@jessikat.plus.net> I tried downloading and testing 0.9.6.14 on my win32 xp (with Visual C 7) and Python 2.5. 
I get an error in the Demos folder

C:\Python\devel\Cython-0.9.6.14\Demos>python setup.py build_ext --inplace
running build_ext
building 'primes' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -Ic:\python\include -Ic:\python\PC /Tcprimes.c /Fobuild\temp.win32-2.5\Release\primes.obj
primes.c
primes.c(99) : error C2054: expected '(' to follow 'inline'
primes.c(99) : error C2085: '__Pyx_PyObject_Append' : not in formal parameter list
primes.c(99) : error C2143: syntax error : missing ';' before '{'
primes.c(287) : warning C4013: '__Pyx_PyObject_Append' undefined; assuming extern returning int
primes.c(287) : warning C4047: '=' : 'PyObject *' differs in levels of indirection from 'int'
primes.c(566) : warning C4244: 'return' : conversion from 'long' to 'unsigned char', possible loss of data
primes.c(581) : warning C4244: 'return' : conversion from 'long' to 'unsigned short', possible loss of data
primes.c(596) : warning C4244: 'return' : conversion from 'long' to 'char', possible loss of data
primes.c(611) : warning C4244: 'return' : conversion from 'long' to 'short', possible loss of data
primes.c(656) : warning C4244: 'return' : conversion from 'long' to 'char', possible loss of data
primes.c(671) : warning C4244: 'return' : conversion from 'long' to 'short', possible loss of data
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe"' failed with exit status 2

Inspection of primes.c reveals that INLINE is defined thusly

> #ifdef __GNUC__
> #define INLINE __inline__
> #elif _WIN32
> #define INLINE __inline
> #else
> #define INLINE
> #endif

and there are references to INLINE; however, on line 99 I see this

> static inline PyObject* __Pyx_PyObject_Append(PyObject* L, PyObject* x) {

Is this a bug or am I doing something wrong?

--
Robin Becker

From dagss at student.matnat.uio.no Sat May 24 19:16:36 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 24 May 2008 19:16:36 +0200 (CEST)
Subject: [Cython] Compile-time duck typing
In-Reply-To:
References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no>
Message-ID: <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no>

Robert wrote:
> Perhaps put an example of a __getitem__ method that uses this
> feature. It seems mostly this could be handled with plain old
> function overloading, but this would be a nice feature to have in and
> of itself.

Yes, will do that, but for now I'll sketch the problems with plain old
overloading and how this proposal (parameter polymorphism/compile-time duck
typing) solves them.

Problem 1: The return type is different depending on the assumptions placed
on the object (i.e. if dtype=uint8 then single-item indexing should return
uint8). Traditional overloading doesn't really solve this -- or if you
overload on method return type, you'd have a possibility but it would mean
manually copying the method for every single return type...

Problem 2: Consider these calls and their translation to more explicit
pseudo-code:

    cdef ndarray[dtype=uint8, ndim=3] arr = ...
    arr[1, 2, 3] => arr.__getitem__((1, 2, 3))
    arr[1:2, ...] => arr.__getitem__((slice(1, 2, None), Ellipsis))

Now, the first lookup should return uint8, while the second one should return
a new object (a new array view) -- yet, both arguments are of the type
"object".
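
The two translations listed under Problem 2 can be checked in plain Python. What follows is only a minimal sketch: `Probe` is a hypothetical class made up for this illustration (it is not part of NumPy or Cython), and its __getitem__ simply hands back whatever index object it receives.

```python
class Probe(object):
    """Hypothetical helper: returns the index object it was given."""

    def __getitem__(self, key):
        return key


p = Probe()

# Several scalar indices arrive packed in a single tuple argument.
scalar_key = p[1, 2, 3]

# A slice plus "..." also arrives as one tuple, holding a slice object
# and the Ellipsis singleton.
slice_key = p[1:2, ...]

print(scalar_key)
print(slice_key)
```

In both cases __getitem__ sees exactly one argument of type tuple, which is why its static type alone cannot tell the scalar lookup from the slicing one.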
In order to solve these problems, one could start to create complicated solutions like allowing "self.dtype" as a return type in the method signature if an assumption is placed on self, specify "nested types" (ie. tuple(int, int, int)), with a following combinatorial explosions in manual overloads one needs to create) etc. etc. However, just making both the return type and parameter "generic" seems a much simpler, more readable solution that will work out to the same thing. Then the type system doesn't get in the way, and one can write a single __getitem__ and leave the real issues to "optional" optimizations. (In order for this to work there's a small hitch: One must support code like this: cdef generic get_something(as_str): if as_str: return "asdf" else: return 3432 This can be fixed simply by having return type mismatches for instantiated generics converted into runtime errors rather than halt compilation, this emulates Python behaviour nicely.) Dag Sverre From dagss at student.matnat.uio.no Sat May 24 19:24:14 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 24 May 2008 19:24:14 +0200 (CEST) Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz> <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> Message-ID: <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> Robert wrote: > On May 21, 2008, at 7:16 PM, Greg Ewing wrote: > >> What sort of things were you intending to do in between >> type analysis and coercion? Could they still be done under >> these restrictions? > > The phase that I'd like to stick here is type inference. The type > analysis would type all declared variables, and in some cases assign > types that are dependent on other (yet unknown) types. 
One would then > run a type-resolution algorithm on the data of the symbol table, > which could be used to actually resolve all the types. This kind of suggest that rather than trying to take coercion out of analyse_types and put it after (as we've talked about earlier), it could be simpler and more natural to rather take the simplest "typing declared variables" out in a new phase before analyse_types? (This may be what you've concluded though, I'm just noting this thinking about earlier discussion on this.) Dag Sverre From robertwb at math.washington.edu Sat May 24 19:53:48 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 10:53:48 -0700 Subject: [Cython] inline error in primes.c In-Reply-To: <48380506.1020709@jessikat.plus.net> References: <48380506.1020709@jessikat.plus.net> Message-ID: On May 24, 2008, at 5:07 AM, Robin Becker wrote: > I tried downloading and testing 0.9.6.14 on my win32 xp (with Visual C > 7) and Python 2.5. > > I get an error in the Demos folder > > C:\Python\devel\Cython-0.9.6.14\Demos>python setup.py build_ext -- > inplace > running build_ext > building 'primes' extension > C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\cl.exe /c > /nologo /Ox /MD /W3 /GX /DNDEBUG -Ic:\python\includ > e -Ic:\python\PC /Tcprimes.c /Fobuild\temp.win32-2.5\Release > \primes.obj > primes.c > primes.c(99) : error C2054: expected '(' to follow 'inline' > primes.c(99) : error C2085: '__Pyx_PyObject_Append' : not in formal > parameter list > primes.c(99) : error C2143: syntax error : missing ';' before '{' > primes.c(287) : warning C4013: '__Pyx_PyObject_Append' undefined; > assuming extern returning int > primes.c(287) : warning C4047: '=' : 'PyObject *' differs in levels of > indirection from 'int' > primes.c(566) : warning C4244: 'return' : conversion from 'long' to > 'unsigned char', possible loss of data > primes.c(581) : warning C4244: 'return' : conversion from 'long' to > 'unsigned short', possible loss of data 
> primes.c(596) : warning C4244: 'return' : conversion from 'long' to > 'char', possible loss of data > primes.c(611) : warning C4244: 'return' : conversion from 'long' to > 'short', possible loss of data > primes.c(656) : warning C4244: 'return' : conversion from 'long' to > 'char', possible loss of data > primes.c(671) : warning C4244: 'return' : conversion from 'long' to > 'short', possible loss of data > error: command '"C:\Program Files\Microsoft Visual Studio .NET > 2003\Vc7\bin\cl.exe"' failed with exit status 2 > > inspection of primes.c reveals that INLINE is defined thusly > >> #ifdef __GNUC__ >> #define INLINE __inline__ >> #elif _WIN32 >> #define INLINE __inline >> #else >> #define INLINE >> #endif > > > > and there are references to INLINE, however on line 99 I see this > >> static inline PyObject* __Pyx_PyObject_Append(PyObject* L, >> PyObject* x) { > > is this a bug or am I doing soemthing wrong? No, this was a bug and has since been fixed: http://hg.cython.org/cython-devel/rev/aeb71e54b78a - Robert From languitar at semipol.de Sat May 24 19:59:49 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sat, 24 May 2008 19:59:49 +0200 Subject: [Cython] use of __del__ Message-ID: <48385795.7020404@semipol.de> Hi, is it possible to use __del__ in cython? A first try didn't succeed: cpdef class Foo: def __del__(self): print "Foo::__del__" >>> from ship.test import Foo >>> f = Foo() >>> del f <> Any chance to get this or something equivalent working? Johannes -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080524/b20ba0bb/attachment-0001.pgp From robertwb at math.washington.edu Sat May 24 20:01:16 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 11:01:16 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> Message-ID: <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> On May 24, 2008, at 10:16 AM, Dag Sverre Seljebotn wrote: > Robert wrote: >> Perhaps put an example of a __getitem__ method that uses this >> feature. It seems mostly this could be handled with plain old >> function overloading, but this would be a nice feature to have in and >> of itself. > > Yes, will do that, but for now I'll scetch the problems with plain old > overloading and how this proposal (parameter polymorphism/compile-time > duck typing) solves it. > > Problem 1: The return type is different depending on the assumptions > placed on the object (i.e. if dtype=uint8 then single-item indexing > should > return uint8). Traditional overloading doesn't really solve this -- > or if > you overload on method return type, you'd have a possibility but it > would > mean manually copying the method for every single return type... > > Problem 2: Consider these calls and their translation to more explicit > psuedo-code: > > cdef ndarray[dtype=uint8, ndim=3] arr = ... > arr[1, 2, 3] => arr.__getitem__((1, 2, 3)) > arr[1:2, ...] 
=> arr.__getitem__((slice(1, 2, None), Ellipsis)) > > Now, the first lookup should return uint8, while the second one should > return a new object (a new array view) -- yet, both arguments are > of the > type "object". > > In order to solve these problems, one could start to create > complicated > solutions like allowing "self.dtype" as a return type in the method > signature if an assumption is placed on self, specify "nested > types" (ie. > tuple(int, int, int)), with a following combinatorial explosions in > manual > overloads one needs to create) etc. etc. Ah, but in this case it seems much simpler (from the users perspective) to resolve arr[1,2,3] as __getitem__(1,2,3) and let function overloading handle this the normal way. BTW, in terms of focus, I think handling slicing is much lower on the priority list than a lot of other things (the relative gain here is much smaller). > However, just making both the return type and parameter "generic" > seems a > much simpler, more readable solution that will work out to the same > thing. > Then the type system doesn't get in the way, and one can write a > single > __getitem__ and leave the real issues to "optional" optimizations. > > (In order for this to work there's a small hitch: One must support > code > like this: > > cdef generic get_something(as_str): > if as_str: return "asdf" > else: return 3432 > > This can be fixed simply by having return type mismatches for > instantiated > generics converted into runtime errors rather than halt > compilation, this > emulates Python behaviour nicely.) So here "generic" would become the more general of the two, i.e. an object. For generic inline functions, would it get optimized away (i.e. if one knew as_str at compile time, it would know the return type exactly?) - Robert From robertwb at math.washington.edu Sat May 24 20:04:42 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 11:04:42 -0700 Subject: [Cython] Recursive vs. 
visitor pattern In-Reply-To: <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz> <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> Message-ID: <4F4A124B-B6AD-468E-8243-E5FEFD2AE13A@math.washington.edu> On May 24, 2008, at 10:24 AM, Dag Sverre Seljebotn wrote: > Robert wrote: >> On May 21, 2008, at 7:16 PM, Greg Ewing wrote: >> >>> What sort of things were you intending to do in between >>> type analysis and coercion? Could they still be done under >>> these restrictions? >> >> The phase that I'd like to stick here is type inference. The type >> analysis would type all declared variables, and in some cases assign >> types that are dependent on other (yet unknown) types. One would then >> run a type-resolution algorithm on the data of the symbol table, >> which could be used to actually resolve all the types. > > This kind of suggest that rather than trying to take coercion out of > analyse_types and put it after (as we've talked about earlier), it > could > be simpler and more natural to rather take the simplest "typing > declared > variables" out in a new phase before analyse_types? > > (This may be what you've concluded though, I'm just noting this > thinking > about earlier discussion on this.) Yes. Currently analyse_types does lots of things at once. I'd do 1) Tag declared types and type relations (e.g. this type is the widest of these two) 2) Infer undeclared types (this would not be a traditional visitor/ recursive phase, rather analysis on the symtab from above 3) Tag types as inferred above, and insert coercions. 
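
The three phases listed above might be sketched roughly as follows. This is a toy model only, not Cython's compiler code: the widening table, the "widest_of" relation and all function names are assumptions invented for the illustration, and every undeclared name is assumed to depend on at least one other name.

```python
from functools import reduce

# Hypothetical widening order for this sketch only; not Cython's real rules.
WIDTH = {"int": 0, "long": 1, "double": 2, "object": 3}


def widest(a, b):
    """Return the wider of two type names."""
    return a if WIDTH[a] >= WIDTH[b] else b


def tag_declared(declared, relations):
    """Phase 1: copy declared types, and record each undeclared name as
    'widest of' the names it depends on."""
    symtab = dict(declared)
    for name, deps in relations.items():
        symtab.setdefault(name, ("widest_of", tuple(deps)))
    return symtab


def infer(symtab):
    """Phase 2: resolve dependent entries by working on the symbol table
    alone, with no tree traversal."""
    resolved = {n: t for n, t in symtab.items() if isinstance(t, str)}
    pending = {n: t[1] for n, t in symtab.items() if not isinstance(t, str)}
    while pending:
        ready = [n for n, deps in pending.items()
                 if all(d in resolved for d in deps)]
        if ready:
            for n in ready:
                resolved[n] = reduce(widest, (resolved[d] for d in pending[n]))
        else:
            # Cyclic or unknown dependencies: fall back to a Python object.
            ready = list(pending)
            for n in ready:
                resolved[n] = "object"
        for n in ready:
            del pending[n]
    return resolved


def find_coercions(assignments, types):
    """Phase 3: list the (target, source) assignments whose types differ."""
    return [(dst, src) for dst, src in assignments if types[dst] != types[src]]


declared = {"i": "int", "x": "double"}
relations = {"y": ["i", "x"], "z": ["y", "i"]}
types = infer(tag_declared(declared, relations))
print(types)
print(find_coercions([("y", "i"), ("z", "y")], types))
```

The point of the split is that phase 2 never looks at the parse tree; it only needs the relations recorded in phase 1, so it can be any fixed-point style algorithm over the symbol table.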
- Robert From robertwb at math.washington.edu Sat May 24 20:07:08 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 11:07:08 -0700 Subject: [Cython] use of __del__ In-Reply-To: <48385795.7020404@semipol.de> References: <48385795.7020404@semipol.de> Message-ID: <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> No, use the __dealloc__ method instead. On May 24, 2008, at 10:59 AM, Johannes Wienke wrote: > Hi, > > is it possible to use __del__ in cython? A first try didn't succeed: > > cpdef class Foo: > def __del__(self): > print "Foo::__del__" > >>>> from ship.test import Foo >>>> f = Foo() >>>> del f > <> > > Any chance to get this or something equivalent working? > > Johannes > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://codespeak.net/pipermail/cython-dev/attachments/20080524/221410d7/attachment.pgp From robin at reportlab.com Sat May 24 20:09:47 2008 From: robin at reportlab.com (Robin Becker) Date: Sat, 24 May 2008 19:09:47 +0100 Subject: [Cython] inline error in primes.c In-Reply-To: References: <48380506.1020709@jessikat.plus.net> Message-ID: <483859EB.7070808@jessikat.plus.net> Robert Bradshaw wrote: > On May 24, 2008, at 5:07 AM, Robin Becker wrote: > ........ > > No, this was a bug and has since been fixed: > > http://hg.cython.org/cython-devel/rev/aeb71e54b78a > > - Robert ...... can I just hg up to get the corrected version? 
--
Robin Becker

From robertwb at math.washington.edu Sat May 24 20:17:51 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sat, 24 May 2008 11:17:51 -0700
Subject: [Cython] inline error in primes.c
In-Reply-To: <483859EB.7070808@jessikat.plus.net>
References: <48380506.1020709@jessikat.plus.net> <483859EB.7070808@jessikat.plus.net>
Message-ID:

On May 24, 2008, at 11:09 AM, Robin Becker wrote:
> Robert Bradshaw wrote:
>> On May 24, 2008, at 5:07 AM, Robin Becker wrote:
>>
> ........
>>
>> No, this was a bug and has since been fixed:
>>
>> http://hg.cython.org/cython-devel/rev/aeb71e54b78a
>>
>> - Robert
> ......
>
> can I just hg up to get the corrected version?

You could pull from the cython-devel branch, though I'm not promising
that it's bug free at this point. Or just apply that single change
(changing inline to INLINE). Perhaps we should have a cython-bugfix
branch with stuff like this (no new features) but so far the pace of
releases has been quick enough (and bugs rare enough) to not warrant
that.

- Robert

From robin at reportlab.com Sat May 24 20:29:40 2008
From: robin at reportlab.com (Robin Becker)
Date: Sat, 24 May 2008 19:29:40 +0100
Subject: [Cython] inline error in primes.c
In-Reply-To:
References: <48380506.1020709@jessikat.plus.net> <483859EB.7070808@jessikat.plus.net>
Message-ID: <48385E94.10001@jessikat.plus.net>

Robert Bradshaw wrote:
.......
>> can I just hg up to get the corrected version?
>
> You could pull from the cython-devel branch, though I'm not promising
> that it's bug free at this point. Or just apply that single change
> (changing inline to INLINE). Perhaps we should have a cython-bugfix
> branch with stuff like this (no new features) but so far the pace of
> releases has been quick enough (and bugs rare enough) to not warrant
> that.
>
.......
I did it handomatically eventually and stuff seems to work. I'm new to
hg and I find it strange that my repository has diffs even after I do
hg up, although hg stat seems to indicate no mods.

--
Robin Becker

From robertwb at math.washington.edu Sat May 24 20:36:56 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sat, 24 May 2008 11:36:56 -0700
Subject: [Cython] inline error in primes.c
In-Reply-To: <48385E94.10001@jessikat.plus.net>
References: <48380506.1020709@jessikat.plus.net> <483859EB.7070808@jessikat.plus.net> <48385E94.10001@jessikat.plus.net>
Message-ID: <029DA506-7556-4B2C-93A0-9D4C8DD0CB70@math.washington.edu>

On May 24, 2008, at 11:29 AM, Robin Becker wrote:
> Robert Bradshaw wrote:
> .......
>>> can I just hg up to get the corrected version?
>>
>> You could pull from the cython-devel branch, though I'm not promising
>> that it's bug free at this point. Or just apply that single change
>> (changing inline to INLINE). Perhaps we should have a cython-bugfix
>> branch with stuff like this (no new features) but so far the pace of
>> releases has been quick enough (and bugs rare enough) to not warrant
>> that.
>>
> .......
> I did it handomatically eventually and stuff seems to work. I'm new to
> hg and I find it strange that my repository has diffs even after I
> do hg
> up although hg stat seems to indicate no mods.

With mercurial, you "pull" to get changes from another repository
(e.g. the online one) and then "update" to apply the changes to your
working copy, so they have slightly different meanings than what you're
trying to use them for.
- Robert From dagss at student.matnat.uio.no Sat May 24 22:03:20 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 24 May 2008 22:03:20 +0200 Subject: [Cython] Compile-time duck typing In-Reply-To: <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> Message-ID: <48387488.5050605@student.matnat.uio.no> I'll follow up with a sample getitem implementation, so you need not follow up this thread until then. But I really wanted to explain compile-time duck typing of return types properly (see below). Robert Bradshaw wrote: > On May 24, 2008, at 10:16 AM, Dag Sverre Seljebotn wrote: >> In order to solve these problems, one could start to create >> complicated >> solutions like allowing "self.dtype" as a return type in the method >> signature if an assumption is placed on self, specify "nested >> types" (ie. >> tuple(int, int, int)), with a following combinatorial explosions in >> manual >> overloads one needs to create) etc. etc. > > Ah, but in this case it seems much simpler (from the users > perspective) to resolve arr[1,2,3] as __getitem__(1,2,3) and let > function overloading handle this the normal way. BTW, in terms of > focus, I think handling slicing is much lower on the priority list > than a lot of other things (the relative gain here is much smaller). But this makes our __getitem__ different from Python's! If we do that, we should rather make up a wholly different, new syntax (__cgetitem__, __cgetslice__, and so on); but I do not like to take this direction. 
It's OK to not do any optimization for slicing, but it's very important
that slices correctly fall back to the Python [] operator. (As long as
the Python __getitem__ interface is kept, I must fall back to the []
operator manually, and also take tuples for n-d indices.)

(Also I find the prospect of manually creating multiple overloads
depending on the number of dimensions somewhat distasteful. Of course,
I'll do it if there's not enough time, but I'd like to at least have a
path forward that *can* lead there eventually, and *then* hack it.)

>> (In order for this to work there's a small hitch: One must support
>> code
>> like this:
>>
>> cdef generic get_something(as_str):
>>     if as_str: return "asdf"
>>     else: return 3432
>>
>> This can be fixed simply by having return type mismatches for
>> instantiated
>> generics converted into runtime errors rather than halt
>> compilation, this
>> emulates Python behaviour nicely.)
>
> So here "generic" would become the more general of the two, i.e. an
> object. For generic inline functions, would it get optimized away
> (i.e. if one knew as_str at compile time, it would know the return
> type exactly?)

No, this is all wrong.

If having "generic" as the return value simply resulted in the more
general of the types, I wouldn't bother with it -- after all, the
programmer knows which types can be returned, and would be able to
specify object manually!

I'll exemplify using the function above. If you don't like what you see,
read footnote [1].

Working calling code:
(1): cdef char* chbuf = get_something(as_str=True) # chbuf = "asdf"
(2): cdef object s = get_something(as_str=True) # s = str("asdf")
(3): cdef object o_n = get_something(as_str=False) # o_n = int(3432)
(4): cdef int i_n = get_something(as_str=False) # i_n = int(3432)

I.e. this creates four different instances of get_something, each one
with different semantics because of the return type. I.e. (1)
instantiates

    cdef char* get_something(as_str): ....

which of course makes 'return "asdf"' return a string literal pointer.
(I suppose this will change into an error if that auto-coercion is
removed :-)). (2) and (3) both use the same instantiation, and their
code returns object (like your guessed behaviour). (4) turns into

    cdef int get_something(as_str): ...

OK, so obviously for (1) and (4) there will be a type mismatch in the
line of code that's not run. That's where I proposed to change it into a
run-time error (because "those spots should not be reachable"). I.e.,
suppose this call is done:

    cdef int n = get_something(as_str=True)

This uses instantiation (4) from above (the int return one). I'll now
write out the proposed body of this function:

    cdef int get_something(as_str):
        if as_str:
            "asdf" # evaluate and discard expression in the return statement
            # Then explicitly, and always, raise the coercion error.
            # The point is: Usually this place is not reached!
            raise TypeError("Cannot coerce str to C int")
            # or whatever you have now for "a"...
        else:
            return 3432

Instantiation (1) of the function is symmetric to this, raising an
exception if control reaches the place where the integer is returned.

So the end result is that the "int-return-type" instantiation of
get_something returns the proper, native C int when called with
as_str=False, and raises a coercion exception when called with
as_str=True.

[1] Even if this may seem hard to wrap one's head around, the end of the
story for the end-user is rather pleasing; one gets more or less the
same behaviour as if get_something was declared with an "object" return
type. It should be natural to use. But no object coercion is involved for
the compiler, so speed is maintained.
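
The run-time behaviour described above can be emulated in plain Python. This is only a sketch of the observable semantics, not of how a compiler would generate the code: `instantiate_get_something` is a hypothetical helper invented here, a Python type stands in for the requested C return type, and `object` models the untyped instantiation.

```python
def instantiate_get_something(return_type):
    """Emulate one compile-time instantiation of get_something.

    `return_type` stands in for the C type the caller asked for; passing
    `object` models the plain Python-object instantiation.
    """
    def get_something(as_str):
        value = "asdf" if as_str else 3432
        if return_type is not object and not isinstance(value, return_type):
            # This branch models the "should not be reachable" spot: the
            # mismatch is reported at run time instead of at compile time.
            raise TypeError("Cannot coerce %s to %s"
                            % (type(value).__name__, return_type.__name__))
        return value
    return get_something


get_int = instantiate_get_something(int)      # like instantiation (4)
get_obj = instantiate_get_something(object)   # like instantiations (2) and (3)

print(get_int(as_str=False))
print(get_obj(as_str=True))
try:
    get_int(as_str=True)
except TypeError as exc:
    print("TypeError: %s" % exc)
```

Unlike this emulation, the proposal decides which branch raises when the function is instantiated, so the branch that returns normally needs no type check at all.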
-- Dag Sverre From robertwb at math.washington.edu Sat May 24 22:43:12 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 13:43:12 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <48387488.5050605@student.matnat.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> Message-ID: <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> On May 24, 2008, at 1:03 PM, Dag Sverre Seljebotn wrote: > I'll follow up with a sample getitem implementation, so you need not > follow up this thread until then. But I really wanted to explain > compile-time duck typing of return types properly (see below). > > Robert Bradshaw wrote: >> On May 24, 2008, at 10:16 AM, Dag Sverre Seljebotn wrote: >>> In order to solve these problems, one could start to create >>> complicated >>> solutions like allowing "self.dtype" as a return type in the method >>> signature if an assumption is placed on self, specify "nested >>> types" (ie. >>> tuple(int, int, int)), with a following combinatorial explosions in >>> manual >>> overloads one needs to create) etc. etc. >> >> Ah, but in this case it seems much simpler (from the users >> perspective) to resolve arr[1,2,3] as __getitem__(1,2,3) and let >> function overloading handle this the normal way. BTW, in terms of >> focus, I think handling slicing is much lower on the priority list >> than a lot of other things (the relative gain here is much smaller). > > But this makes our __getitem__ different from Python's! 
If we do that, > we should rather make up a wholly different, new syntax (__cgetitem__, > __cgetslice__, and so on); but I do not like to take this direction. We're going to have to avoid the tuple packing/unpacking somehow if we're going for speed, so it make sense in this case to not pack them at all in this case rather than have special unpacking code on the other end (and I think it looks cleaner too). > It's OK to not do any optimization for slicing, but it's very > important > that slices correctly fall back to the Python [] operator. As long as > the Python __getitem__ interface is kept, I must fall back to the [] > operator manually, and also take tuples for n-d indices). Yep. > (Also I find the prospect of manually creating multiple overloads > depending on the number of dimensions somewhat distasteful. Of course, > I'll do it if there's not enough time, but I'd like to at least have a > path forward that *can* lead there eventually, and *then* hack it.) True. A generic n-ary unpacker that gets unrolled completely at compile time may be much more complicated to implement (and read). IMHO, not as important as making 1, 2, and 3-dimensional indexing as fast as possible (possibly falling back to a runtime loop for more). Not as powerful, but more realistic to actually get done (especially given all the other things you're planning to do). >>> (In order for this to work there's a small hitch: One must support >>> code >>> like this: >>> >>> cdef generic get_something(as_str): >>> if as_str: return "asdf" >>> else: return 3432 >>> >>> This can be fixed simply by having return type mismatches for >>> instantiated >>> generics converted into runtime errors rather than halt >>> compilation, this >>> emulates Python behaviour nicely.) >> >> So here "generic" would become the more general of the two, i.e. an >> object. For generic inline functions, would it get optimized away >> (i.e. if one knew as_str at compile time, it would know the return >> type exactly?) 
> > No, this is all wrong. > > If having "generic" as the return value simply resulted in the more > general of the types, I wouldn't bother with it -- after all, the > programmer know which types can be returned, and would be able to > specify object manually! > > I'll exemplify using the function above. If you don't like what you > see, > read footnote [1]. > > Working calling code: > (1): cdef char* chbuf = get_something(as_str=True) # chbuf = "asdf" > (2): cdef object s = get_something(as_str=True) # s = str("asdf") > (3): cdef object o_n = get_something(as_str=False) # o_n = int(3423) > (4): cdef int i_n = get_something(as_str=False) # i_n = int(3423) > > I.e. this creates four different instances of get_something, each one > with different semantics because of the return type. I.e. (1) > instantiates > > cdef char* get_something(as_str): .... > > which of course makes 'return "asdf"' return a string literal pointer. > (I suppose this will change into an error if that auto-coercion is > removed :-)). (2) and (3) both uses the same instantiation, and their > code returns object (like your guessed behaviour). (4) turns into > > cdef int get_something(as_str): ... > > OK, so obviously for (1) and (4) there will be a type mismatch in the > line of code that's not run. That's where I proposed to change it > into a > run-time error (because "those spots should not be reachable"). I.e, > suppose this call is done: > > cdef int n = get_something(as_str=True) > > This uses instantiation (4) from above (the int return one). I'll now > write out the proposed body of this function: > > cdef int get_something(as_str): > if as_str: > "asdf" # evaluate and discard expression in the return > statement > # Then explicitly, and always, raise the coercion error. > # The point is: Usually this place is not reached! > raise TypeError("Cannot coerce str to C int") > # or whatever you have now for "a"... 
> else: > return 3432 Note: one concern is keeping the size of this body down, especially if it's inlined in some tight loop. > > Instantiation (1) of the function is symmetric to this, raising an > exception if control reaches the place where the integer is returned. > > So the end result is that the "int-return-type" instantiation of > get_something returns the proper, native C int when called with > as_str=False, and raises a coercion exception when called with > as_str=True. > > [1] Even if this may seem hard to wrap ones head around, the end of > the > story for the end-user is rather pleasing; one gets more or less the > same behaviour as if get_something was declared with an "object" > return > type. It should natural to use. But no object coercion is involved for > the compiler, so speed is maintained. Interesting idea, I think Haskell has something like this. It's like type coercion going the opposite direction--one wants an int result so it changes the expression itself (perhaps after passing through several layers? How feasible is this?). I'd rather be a bit more explicit (especially for ease of doing type inference for statements like "x = arr[5]"). Essentially, what you really want is __getitem__ to return a variety of types, determined at compile time, and without coercion through an object. For inlined functions perhaps we could have a phase automatically optimizing away x where there is a direct conversion from x to type (if the wasn't explicitly requested by the user). No good solution to the generic problem is coming to mind right now though...
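For anyone following along, the behaviour of the proposed "int-return-type" instantiation quoted above can be emulated in plain Python. This is only a sketch of the run-time behaviour, not code the compiler would generate; the name get_something_int is made up for illustration.

```python
def get_something_int(as_str):
    """Plain-Python stand-in for the instantiation with a C int return type."""
    if as_str:
        # The original 'return "asdf"' cannot be coerced to int, so the
        # expression is evaluated and its result discarded...
        "asdf"
        # ...and the coercion failure is raised at run time instead of
        # halting compilation.
        raise TypeError("Cannot coerce str to C int")
    else:
        return 3432
```

Calling get_something_int(False) gives the integer back, while get_something_int(True) raises TypeError, which is the behaviour the proposal describes for a branch that "should not be reachable".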
- Robert From dagss at student.matnat.uio.no Sat May 24 23:12:27 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 24 May 2008 23:12:27 +0200 Subject: [Cython] Compile-time duck typing In-Reply-To: <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> Message-ID: <483884BB.9000506@student.matnat.uio.no> I agree about time-frame being a danger. "Always scale up, not down." So I might do a solution based on our own, custom __cgetitem__ first, and then try to generalize *that* afterwards if there's time. (But I'll still prototype __getitem__ just to see). The only problem is that __cgetitem__ requires conventional overloading, which I was *not* currently planning on doing; I intended this to be a more Pythonic alternative (don't introduce features that aren't in Python "just because we can"; compile-time duck typing is really just Python behaviour at compile time, whereas normal overloading means a control flow that cannot be had in Python). Robert Bradshaw wrote: > On May 24, 2008, at 1:03 PM, Dag Sverre Seljebotn wrote: > >> Instantiation (1) of the function is symmetric to this, raising an >> exception if control reaches the place where the integer is returned.
>> >> So the end result is that the "int-return-type" instantiation of >> get_something returns the proper, native C int when called with >> as_str=False, and raises a coercion exception when called with >> as_str=True. >> >> [1] Even if this may seem hard to wrap ones head around, the end of >> the >> story for the end-user is rather pleasing; one gets more or less the >> same behaviour as if get_something was declared with an "object" >> return >> type. It should natural to use. But no object coercion is involved for >> the compiler, so speed is maintained. > > Interesting idea, I think Haskell has something like this. It's like > type coercion going the opposite direction--one wants an int result > so it changes the expression itself (perhaps after passing through > several layers? How feasible is this?). I'd rather be a bit more > explicit (especially for ease of doing type inference for statements > like "x = arr[5]"). There's not any "magic" or unfeasability involved as far as I can see. Full algorithm: - When calling a function with one or more generic types (possibly return type!), look it up in an instantiation dict. - If the dict lookup misses, instantiate the function like this: a) mangle name, b) insert the exact types that were used in the call (void for return type if it is discarded), c) the body is copied literally, except for return statements which are changed according to (*). - If the copied body contains calls to functions with generic types, recurse. (*) If the return type is not "generic", return statements are handled like normal. If the return type is "generic", any return statements: - When the (actual, instantiated) return type ended up being void, evaluate expression but discard result (as the caller obviously discards the result). In this case, returning multiple different types will just work. - When the actual, instantiated return type is T != void, then treat return statements like normal (i.e. 
if returning from a function with return type T), except for the cases where coercion fails, in which case, rather than giving a compile-time error, one evaluates expression, discards result, and inserts instructions to raise runtime exception. > Essentially, what you really want is __getitem__ to return a variety > of types, determined at compile time, and without coercion through an > object. For inlined functions perhaps we could have a phase > automatically optimizing away x where there is a direct > conversion from x to type (if the wasn't explicitly > requested by the user). That's what I was hoping to avoid by doing something *simpler*. As long as coercion is not a perfect round-trip this can have unintended side-effects etc. etc. that I'd have to sort out, and the above seems much simpler and more elegant (to me). -- Dag Sverre From robertwb at math.washington.edu Sat May 24 23:38:50 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 14:38:50 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <483884BB.9000506@student.matnat.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> Message-ID: <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> On May 24, 2008, at 2:12 PM, Dag Sverre Seljebotn wrote: > I agree about time-frame being a danger. "Always scale up, not > down." 
So > might do a solution based on our own, custom __cgetitem__ first, and > then try to generalize *that* afterwards if there's time. (But I'll > still prototype __getitem__ just to see). No need to rename it __cgetitem__... > The only problem is that __cgetitem__ requires conventional > overloading, > which I was *not* currently planning on doing; I intended this to be a > more Pythonic alternative ("don't introduce features that's not in > Python "just because we can" -- compile time duck typing is really > just > Python behaviour at compile-time, normal overloading means a control > flow that cannot be had in Python). That is true. The compile time duck typing doesn't handle a change in the number of arguments, however, and I don't see how it would help with making unpacking not involve object coercions. > Robert Bradshaw wrote: >> On May 24, 2008, at 1:03 PM, Dag Sverre Seljebotn wrote: >> >>> Instantiation (1) of the function is symmetric to this, raising an >>> exception if control reaches the place where the integer is >>> returned. >>> >>> So the end result is that the "int-return-type" instantiation of >>> get_something returns the proper, native C int when called with >>> as_str=False, and raises a coercion exception when called with >>> as_str=True. >>> >>> [1] Even if this may seem hard to wrap ones head around, the end of >>> the >>> story for the end-user is rather pleasing; one gets more or less the >>> same behaviour as if get_something was declared with an "object" >>> return >>> type. It should natural to use. But no object coercion is >>> involved for >>> the compiler, so speed is maintained. >> >> Interesting idea, I think Haskell has something like this. It's like >> type coercion going the opposite direction--one wants an int result >> so it changes the expression itself (perhaps after passing through >> several layers? How feasible is this?). 
I'd rather be a bit more >> explicit (especially for ease of doing type inference for statements >> like "x = arr[5]"). > > There's not any "magic" or unfeasability involved as far as I can see. Feasibility in determining the instantiated return type, and being able to let "x = arr[5]" infer the type of x (based, of course, on the type of arr). > Full algorithm: > > - When calling a function with one or more generic types (possibly > return type!), look it up in an instantiation dict. > > - If the dict lookup misses, instantiate the function like this: a) > mangle name, b) insert the exact types that were used in the call > (void > for return type if it is discarded), c) the body is copied literally, > except for return statements which are changed according to (*). > > - If the copied body contains calls to functions with generic types, > recurse. > > (*) If the return type is not "generic", return statements are handled > like normal. If the return type is "generic", any return statements: > > - When the (actual, instantiated) return type ended up being void, > evaluate expression but discard result (as the caller obviously > discards > the result). In this case, returning multiple different types will > just > work. > > - When the actual, instantiated return type is T != void, then treat > return statements like normal (i.e. if returning from a function with > return type T), except for the cases where coercion fails, in which > case, rather than giving a compile-time error, one evaluates > expression, > discards result, and inserts instructions to raise runtime exception. This all makes sense, it's a question of deducing the "instantiated return type" which is ill-defined. >> Essentially, what you really want is __getitem__ to return a variety >> of types, determined at compile time, and without coercion through an >> object. 
For inlined functions perhaps we could have a phase >> automatically optimizing away x where there is a direct >> conversion from x to type (if the wasn't explicitly >> requested by the user). > > That's what I was hoping to avoid by doing something *simpler*. I guess it didn't feel simpler to me, but I'll agree that it's certainly not trivial. > As long as coercion is not a perfect round-trip this can have > unintended side-effects etc. etc. that I'd have to sort out, and the > above seems much simpler and more elegant (to me). If the coercion is compiler-inserted, then it will not have any side effects that can't safely be ignored. - Robert From dagss at student.matnat.uio.no Sat May 24 23:58:15 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 24 May 2008 23:58:15 +0200 Subject: [Cython] Compile-time duck typing In-Reply-To: <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> Message-ID: <48388F77.8030709@student.matnat.uio.no> I prototyped __getitem__ now: http://wiki.cython.org/enhancements/numpy/getitem It looks feasible, but perhaps a bit daunting for the time-frame and "value-for-investment" compared to a simpler approach. Robert Bradshaw wrote: > On May 24, 2008, at 2:12 PM, Dag Sverre Seljebotn wrote: > >> I agree about time-frame being a danger. 
"Always scale up, not >> down." So >> might do a solution based on our own, custom __cgetitem__ first, and >> then try to generalize *that* afterwards if there's time. (But I'll >> still prototype __getitem__ just to see). > > No need to rename it __cgetitem__... So you propose that one keeps the name __getitem__, yet changes the interface completely compared to Python? (Don't pass slices, unpack the tuple prior to calling...) I don't see the benefit of keeping the name then, it only creates confusion... (re: the __add__ operators which doesn't behave like in Python). > Feasibility in determining the instantiated return type, and being > able to let "x = arr[5]" infer the type of x (based, of course, on > the type of arr). Ah, got it! Yes, that is a real problem. (Sorry.) > If the coercion is compiler-inserted, then it will not have > any side effects that can't safely be ignored. Yes, I suppose. But also cdef object __getitem__(object self, object index) would be kind of hard to do type-inference on (would need to compile-time evaluate the contents and see...). What one "really" wants is something like cdef (self.dtype if (type(index) is not tuple or (not slice in [type(x) for x in index] and not Ellipsis in index) else object) __getitem__(self, item): ... Where the first (..) is the type declaration of __getitem__. I somehow don't feel too happy about that type declaration :-) So perhaps __cgetitem__ and __cgetslice__ it is (until you convince me the rename isn't needed...) 
-- Dag Sverre From robertwb at math.washington.edu Sun May 25 00:14:26 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 15:14:26 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <48388F77.8030709@student.matnat.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> Message-ID: <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> On May 24, 2008, at 2:58 PM, Dag Sverre Seljebotn wrote: > I prototyped __getitem__ now: > http://wiki.cython.org/enhancements/numpy/getitem > > It looks feasible, but perhaps a bit daunting for the time-frame and > "value-for-investment" compared to a simpler approach. > > Robert Bradshaw wrote: >> On May 24, 2008, at 2:12 PM, Dag Sverre Seljebotn wrote: >> >>> I agree about time-frame being a danger. "Always scale up, not >>> down." So >>> might do a solution based on our own, custom __cgetitem__ first, and >>> then try to generalize *that* afterwards if there's time. (But I'll >>> still prototype __getitem__ just to see). >> >> No need to rename it __cgetitem__... > > So you propose that one keeps the name __getitem__, yet changes the > interface completely compared to Python? (Don't pass slices, unpack > the > tuple prior to calling...) I don't see the benefit of keeping the name > then, it only creates confusion... 
I'm thinking of extending the semantics (note, for inline functions only, otherwise none of this optimization can happen), and not in a way that is backwards incompatible. __getitem__(self, [object] index) will always be available, and always called for non-inlined code. > (re: the __add__ operators which doesn't behave like in Python). Yes, this is a blemish... >> Feasibility in determining the instantiated return type, and being >> able to let "x = arr[5]" infer the type of x (based, of course, on >> the type of arr). > > Ah, got it! Yes, that is a real problem. (Sorry.) > >> If the coercion is compiler-inserted, then it will not have >> any side effects that can't safely be ignored. > > Yes, I suppose. But also > > cdef object __getitem__(object self, object index) > > would be kind of hard to do type-inference on (would need to > compile-time evaluate the contents and see...). What one "really" > wants > is something like > > cdef (self.dtype if (type(index) is not tuple or (not slice in [type > (x) > for x in index] and not Ellipsis in index) else object) > __getitem__(self, item): ... > > Where the first (..) is the type declaration of __getitem__. > > I somehow don't feel too happy about that type declaration :-) Me either... > So perhaps __cgetitem__ and __cgetslice__ it is (until you convince > me the > rename isn't needed...) Your whole (..) can be encoded in the argument signature (fixing the number of arguments, which I'll admit is undesirable). The "tuple" representing the indices never gets created or used in this case, which makes things much clearer. Otherwise, I still don't see how you're going to unpack the tuple argument in the generic case without resorting to Python tuples.
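A rough way to picture "encoding it in the argument signature" is a compile-time choice keyed on the number and kind of indices. The Python sketch below only models that choice; the accessor names (getitem_1d and so on), the limit of three dimensions, and the "dtype"/"object" labels are all assumptions made for illustration and are not part of any existing Cython or Pyrex interface.

```python
MAX_UNROLLED_DIMS = 3  # assumption: only 1 to 3 dimensions get a fast path


def choose_getitem(index_kinds):
    """Return (accessor name, result kind) for an indexing expression.

    index_kinds lists the kind of each index as known at compile time,
    e.g. ["int", "int"] for arr[i, j] or ["int", "slice"] for arr[i, :].
    """
    all_ints = all(kind == "int" for kind in index_kinds)
    if all_ints and 1 <= len(index_kinds) <= MAX_UNROLLED_DIMS:
        # Fixed arity: each index is passed as its own C argument, so the
        # index tuple is never built and the result is a single element.
        return ("getitem_%dd" % len(index_kinds), "dtype")
    # Slices, Ellipsis, or too many dimensions: fall back to the ordinary
    # object-based __getitem__, which returns a Python object.
    return ("__getitem__", "object")
```

With this shape, arr[i, j] would pick the two-argument fast path with an element-typed result, while arr[i, :] would go through the generic __getitem__.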
- Robert From languitar at semipol.de Sun May 25 00:22:54 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sun, 25 May 2008 00:22:54 +0200 Subject: [Cython] use of __del__ In-Reply-To: <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> Message-ID: <4838953E.8050104@semipol.de> On 05/24/2008 08:07 PM, Robert Bradshaw wrote: > No, use the __dealloc__ method instead. Thanks. Is there a reason why this isn't documented? From robertwb at math.washington.edu Sun May 25 00:36:33 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 15:36:33 -0700 Subject: [Cython] use of __del__ In-Reply-To: <4838953E.8050104@semipol.de> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> Message-ID: <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> On May 24, 2008, at 3:22 PM, Johannes Wienke wrote: > On 05/24/2008 08:07 PM, Robert Bradshaw wrote: >> No, use the __dealloc__ method instead. > > Thanks. Is there a reason why this isn't documented? It is: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html But I agree it should be clearer. - Robert
From languitar at semipol.de Sun May 25 00:42:49 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sun, 25 May 2008 00:42:49 +0200 Subject: [Cython] use of __del__ In-Reply-To: <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> Message-ID: <483899E9.4090503@semipol.de> On 05/25/2008 12:36 AM, Robert Bradshaw wrote: > On May 24, 2008, at 3:22 PM, Johannes Wienke wrote: > >> On 05/24/2008 08:07 PM, Robert Bradshaw wrote: >>> No, use the __dealloc__ method instead. >> >> Thanks. Is there a reason why this isn't documented? > > It is: > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html I've never seen that site before. To my mind it's really confusing that you have to search in at least two different manuals just to find the correct things. Is there any decision concerning the documentation at http://www.mudskipper.ca/cython-doc/ ? I think it would be a great improvement for users if something like this would become a reference documentation without the need to use the pyrex docs, too.
From robertwb at math.washington.edu Sun May 25 01:10:25 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 16:10:25 -0700 Subject: [Cython] use of __del__ In-Reply-To: <483899E9.4090503@semipol.de> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> <483899E9.4090503@semipol.de> Message-ID: On May 24, 2008, at 3:42 PM, Johannes Wienke wrote: > On 05/25/2008 12:36 AM, Robert Bradshaw wrote: >> On May 24, 2008, at 3:22 PM, Johannes Wienke wrote: >>> On 05/24/2008 08:07 PM, Robert Bradshaw wrote: >>>> No, use the __dealloc__ method instead. >>> >>> Thanks. Is there a reason why this isn't documented? >> It is: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html > > I've never seen that site before. To my mind it's really confusing > that you have to search in at least two different manuals just to > find the correct things. Yes, I agree. It used to be that Pyrex and Cython were so similar they didn't warrant their own manuals, but this seems to be changing. > Is there any decision concerning the documentation at > http://www.mudskipper.ca/cython-doc/ ? I think it would be a great > improvement for users if something like this would become a > reference documentation without the need to use the pyrex docs, too. This site is awesome, and will become (is) the "official" Cython manual. If anything is missing, we want to add it here. - Robert
From languitar at semipol.de Sun May 25 01:14:44 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sun, 25 May 2008 01:14:44 +0200 Subject: [Cython] use of __del__ In-Reply-To: References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> <483899E9.4090503@semipol.de> Message-ID: <4838A164.7030909@semipol.de> On 05/25/2008 01:10 AM, Robert Bradshaw wrote: > On May 24, 2008, at 3:42 PM, Johannes Wienke wrote: >> Is there any decision concerning the documentation at >> http://www.mudskipper.ca/cython-doc/ ? I think it would be a great >> improvement for users if something like this would become a reference >> documentation without the need to use the pyrex docs, too. > > This site is awesome, and will become (is) the "official" Cython manual. > If anything is missing, we want to add it here. Shouldn't it be moved to cython.org then? Afterwards the wiki could be cleaned from manual-like things.
From robertwb at math.washington.edu Sun May 25 01:49:21 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 16:49:21 -0700 Subject: [Cython] use of __del__ In-Reply-To: <4838A164.7030909@semipol.de> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <21288C36-231D-44D0-A96D-15A3CC5B09FE@math.washington.edu> <483899E9.4090503@semipol.de> <4838A164.7030909@semipol.de> Message-ID: <137B6401-3EA8-45F7-8243-CD5A374B70FB@math.washington.edu> On May 24, 2008, at 4:14 PM, Johannes Wienke wrote: > On 05/25/2008 01:10 AM, Robert Bradshaw wrote: >> On May 24, 2008, at 3:42 PM, Johannes Wienke wrote: >>> Is there any decision concerning the documentation at >>> http://www.mudskipper.ca/cython-doc/ ? I think it would be a great >>> improvement for users if something like this would become a >>> reference documentation without the need to use the pyrex docs, too. >> This site is awesome, and will become (is) the "official" Cython >> manual. If anything is missing, we want to add it here. > > Shouldn't it be moved to cython.org then? Afterwards the wiki could > be cleaned from manual-like things. Yes, it probably should be, and eventually will be, but is still being worked on by Gabriel (who seems happy to host it for the moment, though if not we'd be happy to do so too). - Robert
From greg.ewing at canterbury.ac.nz Sun May 25 02:47:04 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 12:47:04 +1200 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz> <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> Message-ID: <4838B708.50109@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > This kind of suggest that rather than trying to take coercion out of > analyse_types and put it after (as we've talked about earlier), it could > be simpler and more natural to rather take the simplest "typing declared > variables" out in a new phase before analyse_types? I was just thinking about that. The type analysis phase is already doing a simple form of type inference when it works out the result type of an expression from its operand types. Full-blown type inference might be better seen as a generalisation of that rather than a separate process. As for separating out coercion, I suppose the type inference phase could just take it on faith that it can treat an operand as though it had a different type if it's convenient, and leave it to a later phase to make it so. Although that might be introducing implicit dependencies between phases of the kind that you really need to avoid if you want to be able to insert new phases without worrying too much about how they interact with each other.
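As a rough illustration of what "working out the result type of an expression from its operand types" means, here is a small Python sketch. It is not the actual analyse_types code; the tuple-based expression nodes and the widening order int < double < object are assumptions chosen only to keep the example short.

```python
# Assumed widening order, for illustration only: int < double < object.
RANK = {"int": 0, "double": 1, "object": 2}


def result_type(node, declared):
    """Compute an expression's type bottom-up from its operands.

    node is a tuple such as ("int", 1), ("name", "x") or
    ("add", left, right); declared maps variable names to declared types.
    """
    kind = node[0]
    if kind == "int":
        # Integer literal.
        return "int"
    if kind == "name":
        # Undeclared variables default to a Python object.
        return declared.get(node[1], "object")
    if kind == "add":
        left = result_type(node[1], declared)
        right = result_type(node[2], declared)
        # The wider operand type wins.
        return left if RANK[left] >= RANK[right] else right
    raise ValueError("unknown node kind: %r" % (kind,))
```

For x + 1 with x declared as double this yields double; with x undeclared it yields object. Full type inference would additionally have to work out the entries of declared that the programmer left out.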
-- Greg From greg.ewing at canterbury.ac.nz Sun May 25 02:50:21 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 12:50:21 +1200 Subject: [Cython] use of __del__ In-Reply-To: <48385795.7020404@semipol.de> References: <48385795.7020404@semipol.de> Message-ID: <4838B7CD.4090001@canterbury.ac.nz> Johannes Wienke wrote: > cpdef class Foo: > def __del__(self): > print "Foo::__del__" Not sure about Cython, but in Pyrex, extension types don't have __del__ methods. On the other hand, they do have a __dealloc__ method, which is better in some ways, since in contrast to __del__ it's always called when the object is deallocated. -- Greg From robertwb at math.washington.edu Sun May 25 03:13:37 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 18:13:37 -0700 Subject: [Cython] Recursive vs. visitor pattern In-Reply-To: <4838B708.50109@canterbury.ac.nz> References: <4832C2F4.1090605@martincmartin.com> <48344138.6090405@student.matnat.uio.no> <4834D795.1000401@canterbury.ac.nz> <0E851CC6-EF76-4247-9D8F-5982BF399B54@math.washington.edu> <51008.193.157.229.67.1211649854.squirrel@webmail.uio.no> <4838B708.50109@canterbury.ac.nz> Message-ID: <433AF863-9B83-43AA-AD91-EE04FF5373E2@math.washington.edu> On May 24, 2008, at 5:47 PM, Greg Ewing wrote: > Dag Sverre Seljebotn wrote: > >> This kind of suggest that rather than trying to take coercion out of >> analyse_types and put it after (as we've talked about earlier), it >> could >> be simpler and more natural to rather take the simplest "typing >> declared >> variables" out in a new phase before analyse_types? > > I was just thinking about that. The type analysis phase is > already doing a simple form of type inference when it works > out the result type of an expression from its operand types. > Full-blown type inference might be better seen as a > generalisation of that rather than separate process. Yep. 
When we say type inference, we're talking about the broader problem of inferring the types of variables that are not explicitly declared. For example, if I write y = x+1 (say, as my only assignment to y) then the type of y would depend on the type of x. Of course there could be circular and multiple dependencies, so there needs to be one pass to figure out the declarations/dependencies, then a constraint solving algorithm to work out all the types, then another pass to actually use the deduced types to insert coercion nodes. - Robert From viyer at facebook.com Sun May 25 03:15:09 2008 From: viyer at facebook.com (Venky Iyer) Date: Sat, 24 May 2008 18:15:09 -0700 Subject: [Cython] digest Message-ID: why does this mailman daily digest show up more than once a day (thrice today)? From viyer at facebook.com Sun May 25 03:16:53 2008 From: viyer at facebook.com (Venky Iyer) Date: Sat, 24 May 2008 18:16:53 -0700 Subject: [Cython] FW: digest In-Reply-To: Message-ID: Ok, ignore me, it says that it could do that on busy lists. Sorry for the noise. ------ Forwarded Message From: Venky Iyer Date: Sat, 24 May 2008 18:15:09 -0700 To: Conversation: digest Subject: digest why does this mailman daily digest show up more than once a day (thrice today)?
------ End of Forwarded Message From greg.ewing at canterbury.ac.nz Sun May 25 03:15:04 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 13:15:04 +1200 Subject: [Cython] Compile-time duck typing In-Reply-To: <48387488.5050605@student.matnat.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> Message-ID: <4838BD98.1040204@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > (1): cdef char* chbuf = get_something(as_str=True) # chbuf = "asdf" > (2): cdef object s = get_something(as_str=True) # s = str("asdf") > (3): cdef object o_n = get_something(as_str=False) # o_n = int(3423) > (4): cdef int i_n = get_something(as_str=False) # i_n = int(3423) This is all well and good in a simple case like that where you're directly assigning the result to something of a known type. But what if the result is being passed to another function whose argument type is "generic"? If there is more than one way of instantiating that function, it can easily become ambiguous. C++ disallows overloading a function based solely on the return type, probably because of the ambiguities it can lead to. Also, in Haskell I think it's always possible to deduce the return type of a function if you know the types of its arguments. Can you relate what you have in mind to some existing well-understood parametric type system, such as Haskell's? If you could, it would help tremendously in being able to tell what it is and isn't capable of doing. 
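To make the concern concrete, here is a rough Python model of the instantiation dict described earlier in the thread, keyed on the argument types and the return type requested by the call site. The name mangling scheme and the use of the string "generic" for an undetermined type are assumptions for illustration only.

```python
instantiations = {}  # (name, argument types, return type) -> mangled name


def instantiate(name, arg_types, return_type):
    """Look up, or create, the instantiation for one call site."""
    if return_type == "generic":
        # The result feeds another generic parameter, so the call site
        # does not pin down which instantiation is meant.
        raise TypeError(
            "cannot instantiate %s: return type is not determined "
            "by the call site" % name)
    key = (name, tuple(arg_types), return_type)
    if key not in instantiations:
        # Hypothetical mangling: join the types into the function name.
        mangled = "__".join([name] + list(arg_types) + ["ret", return_type])
        instantiations[key] = mangled.replace("*", "ptr")
    return instantiations[key]
```

Assigning the result to a cdef int or a cdef char* selects exactly one entry, but a call such as f(get_something(...)), where f takes a generic argument, leaves the return type open, which is the ambiguity described above.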
-- Greg From greg.ewing at canterbury.ac.nz Sun May 25 03:39:59 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 13:39:59 +1200 Subject: [Cython] Compile-time duck typing In-Reply-To: <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> Message-ID: <4838C36F.1040805@canterbury.ac.nz> Robert Bradshaw wrote: > I think Haskell has something like this. It's like > type coercion going the opposite direction--one wants an int result > so it changes the expression itself (perhaps after passing through > several layers? How feasible is this?). I don't think that's quite the same thing. It's possible for type inference to propagate information downwards as well as upwards, but at the end of the process, Haskell decides on a *single* type for every function. That type may have parameters, but it's a single type nonetheless, within the Haskell meaning of the term. You wouldn't be able to define a single Haskell function that returns an int in some contexts and a string in others, for instance, *unless* one of its parameters was also correspondingly an int or string. In that case, you don't really have two functions returning different types -- you have a single function with type a -> a, where a can be any type. Once you know the type of the argument, then you know the type of the result. 
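As a rough illustration of the "a -> a" point above, here is a minimal sketch in Python type-hint notation rather than Haskell. The name `identity` is made up for this example, and nothing here is something Cython or Pyrex implements; it only shows one function with one parameterised type, where the result type follows from the argument type.

```python
from typing import TypeVar

A = TypeVar("A")  # stands for "any type", like Haskell's type variable a


def identity(x: A) -> A:
    """A single function of type A -> A: the result type equals the argument type."""
    return x


n = identity(3)        # argument is an int, so the result is an int
s = identity("asdf")   # argument is a str, so the result is a str
```

The return type is never chosen by the calling context alone; it is always recoverable from the argument.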
Your get_something() example would be impossible in Haskell, because it explicitly returns either a string or an int depending on run-time conditions, and there is no way of unifying those into a single return type that's parameterised with the argument types. -- Greg From greg.ewing at canterbury.ac.nz Sun May 25 03:59:02 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 13:59:02 +1200 Subject: [Cython] use of __del__ In-Reply-To: <4838953E.8050104@semipol.de> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> Message-ID: <4838C7E6.7020706@canterbury.ac.nz> Johannes Wienke wrote: > Is there a reason why this isn't documented? It's mentioned in Special Methods of Extension Types, under the section on __dealloc__. It could perhaps be made more prominent, though. -- Greg From greg.ewing at canterbury.ac.nz Sun May 25 04:08:08 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 May 2008 14:08:08 +1200 Subject: [Cython] Compile-time duck typing In-Reply-To: <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> Message-ID: <4838CA08.2020602@canterbury.ac.nz> Why do you want to 
overload based on return type anyway? I don't see any need to do that in the case of __getitem__. That fits perfectly well into the usual model of parametric polymorphism -- the return type is the same as the element type of the array. If it's used in a context where a different type is required, ordinary coercion will take care of that. -- Greg From robertwb at math.washington.edu Sun May 25 07:10:43 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 24 May 2008 22:10:43 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <4838CA08.2020602@canterbury.ac.nz> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> <4838CA08.2020602@canterbury.ac.nz> Message-ID: On May 24, 2008, at 7:08 PM, Greg Ewing wrote: > Why do you want to overload based on return type > anyway? > > I don't see any need to do that in the case of > __getitem__. That fits perfectly well into the > usual model of parametric polymorphism -- the > return type is the same as the element type of > the array. > > If it's used in a context where a different type > is required, ordinary coercion will take care > of that. The specific use case here is NumPy arrays, which may return different types for __getitem__. 
This should be determined at compile time by information attached to the ndarray type. I am sure there is a cleaner way to do this. Regarding your remarks on Haskell, that seems like a much saner approach. - Robert From dalcinl at gmail.com Sun May 25 08:57:02 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 03:57:02 -0300 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file Message-ID: After eight hours of painfully trying to get 'classmethod' working in Cython, I've found strange stuff in the Python 3.0 sources, specifically in 'typeobject.c'. Please open the file, or just follow this direct link: http://svn.python.org/projects/python/branches/py3k/Objects/typeobject.c and look at the macro definition pasted below; it is at the beginning of the file: #define MCACHE_CACHEABLE_NAME(name) \ PyString_CheckExact(name) && \ PyString_GET_SIZE(name) <= MCACHE_MAX_ATTR_SIZE Do the PyString_XXXX calls make any sense there? If this is nonsense, surely from a bad merge from Py2.6 or a cut-and-paste error, then I now definitely understand why I was having trouble with the new method cache in 2.6 but NOT in 3.0. In 3.0, the method cache is just not working because the macro definition is wrong: identifiers are never byte strings in Py3.0!
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Sun May 25 09:32:27 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 25 May 2008 00:32:27 -0700 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: Message-ID: <80898271-5D2A-44CF-989E-6C2A7BA48CB7@math.washington.edu> On May 24, 2008, at 11:57 PM, Lisandro Dalcin wrote: > After eight hours of pain try to make 'classmethod' working in Cython, > I've found 'extrange' stuff in Python 3.0 sources, specifically at > 'typeobject.c' . Please, open the file or just hit the following > direct link: > > http://svn.python.org/projects/python/branches/py3k/Objects/typeobject.c > > and look at the macro definition pasted below, it is at the begining > of the file: > > #define MCACHE_CACHEABLE_NAME(name) \ > PyString_CheckExact(name) && \ > PyString_GET_SIZE(name) <= MCACHE_MAX_ATTR_SIZE > > Does make any sense the PyString_XXXX there??? If this is a nonsense, > surelly from a bad merge from Py2.6 or cut-and-paste error, then I now > definitely understand way I was having trouble with the new method > cache in 2.6 but NOT in 3.0. In 3.0, the method cache is just not > working because the macro definition is wrong, identifiers are never > byte strings in Py3.0 !! I'm not sure what the method cache for 2.6/3.0 is; is there a good reference explaining this? It would seem, however, that you have a good point.
- Robert From stefan_ml at behnel.de Sun May 25 09:33:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 09:33:02 +0200 Subject: [Cython] Status update: Transform utilities In-Reply-To: <482DB943.3090501@student.matnat.uio.no> References: <482DB943.3090501@student.matnat.uio.no> Message-ID: <4839162E.4030504@behnel.de> Hi, Dag Sverre Seljebotn wrote: > class WithTransform(VisitorTransform): > # from with transform PEP... > with_fragment = TreeFragment(u""" > _mgr = (EXPR) > _exit = mgr.__exit__ > _value = mgr.__enter__() > _exc = True > try: > try: > VAR = _value > BLOCK > ... > > ... > """) > > def process_WithStatementNode(self, node): > return self.with_fragment.substitute({ > "EXPR" : node.expr, > "VAR" : node.var, > "BLOCK" : node.body > }) General comment: I'm not sure templating is better than creating the tree by hand. For example, you use node.body below, which can expand to anything. Since I consider writing transformers a rare thing, maybe it would be better to really write down the node creation code and stick in the bits from the original tree. To help doing that, we could have a tree dumper that writes out the class creation calls, so that you can write the Cython code, dump the tree, and just modify the tree creation code to fit your needs. I find that simpler and safer than your parsing approach. > - Some changes to Transform.py which I hope goes through... there's a > Visitor object there; using the "process_ClassName" pattern (I think > that was the conclusion for future performance reasons). Did we really reach a conclusion on this? > - A clone_node method on Node for proper node copying (shallow object > copy except child node lists, which are also copied). Fine. > In order to be able to provide proper error messages for string-based > code snippets like the above (which are passed to Parsing.py...); I've > changed the pointer to the source code (used as the first element in the > position tuples found everywhere...) 
from being a string filename to > being a SourceDescriptor object. I think being able to parse from a string is a valuable feature in itself. "pyinline" comes to mind. > A SourceDescriptor can currently be a FileSourceDescriptor, in which > case things work like before (it gives the filename on __str__ so much > code needed not change), Please make that a method on the SourceDescriptor, like "get_filename()". Calling str() on it reads like you wanted to print the source. > or a StringSourceDestriptor which I use for my new code... How do you create a filename from that one? I mean, the source calls basename() on it in some places. To me, this indicates that the two are not really the same. Stefan From stefan_ml at behnel.de Sun May 25 09:39:31 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 09:39:31 +0200 Subject: [Cython] Status update: Transform utilities In-Reply-To: <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> References: <482DB943.3090501@student.matnat.uio.no> <46A2075A-59EF-4A44-B813-27F1448940DC@math.washington.edu> <54204.193.157.243.12.1210982009.squirrel@webmail.uio.no> Message-ID: <483917B3.9010406@behnel.de> Hi again, Dag Sverre Seljebotn wrote: > If it helps with the funny feeling, the string is only parsed directly in > the constructor for TreeFragment (and it accepts a more directly created > node structure too). So the string disappears from the story after module > load time; from there it is only tree node manipulation (the TreeFragment > acts like a template which is cloned while substituting nodes on the > clone). So there's nothing that requires the string but notational > convenience. :) I guess I missed that bit on first read. But that still doesn't get me convinced that parsing the template at module load time is a good idea. Having straight forward node creation code, especially if it can be bootstrapped by a node dumper, sounds like a better trade-off to me. 
Stefan From stefan_ml at behnel.de Sun May 25 09:51:16 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 09:51:16 +0200 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: Message-ID: <48391A74.9080309@behnel.de> Hi, Lisandro Dalcin wrote: > After eight hours of pain try to make 'classmethod' working in Cython, > I've found 'extrange' stuff in Python 3.0 sources, specifically at > 'typeobject.c' . Please, open the file or just hit the following > direct link: > > http://svn.python.org/projects/python/branches/py3k/Objects/typeobject.c > > and look at the macro definition pasted below, it is at the begining > of the file: > > #define MCACHE_CACHEABLE_NAME(name) \ > PyString_CheckExact(name) && \ > PyString_GET_SIZE(name) <= MCACHE_MAX_ATTR_SIZE > > Does make any sense the PyString_XXXX there??? If this is a nonsense, > surelly from a bad merge from Py2.6 or cut-and-paste error, then I now > definitely understand way I was having trouble with the new method > cache in 2.6 but NOT in 3.0. In 3.0, the method cache is just not > working because the macro definition is wrong, identifiers are never > byte strings in Py3.0 !! Sounds wrong to me, too. Could you ask about that on the Py3k list? Stefan From dalcinl at gmail.com Sun May 25 09:55:34 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 04:55:34 -0300 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: <80898271-5D2A-44CF-989E-6C2A7BA48CB7@math.washington.edu> References: <80898271-5D2A-44CF-989E-6C2A7BA48CB7@math.washington.edu> Message-ID: On 5/25/08, Robert Bradshaw > I'm not sure what the method cache for 2.6/3.0 is, is there a good > reference explaining this? It would seem, however, that you have a > good point. Well, I do not know any reference to give you, but I realized all this just by following the actuall code ;-) ... 
is a very nice hack; it will speed up a lot of method calls like this: L = [] for i in iterable: L.append(i) The actual method is 'cached' internally in a sort of statically sized (about 1000 entries) hashtable, so you can save dict lookups, and this is especially important in the case of inheritance... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Sun May 25 10:03:26 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 25 May 2008 01:03:26 -0700 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: <80898271-5D2A-44CF-989E-6C2A7BA48CB7@math.washington.edu> Message-ID: <032ED8B0-B0E7-45E4-8A6D-3B60520F425B@math.washington.edu> On May 25, 2008, at 12:55 AM, Lisandro Dalcin wrote: > On 5/25/08, Robert Bradshaw wrote: >> I'm not sure what the method cache for 2.6/3.0 is, is there a good >> reference explaining this? It would seem, however, that you have a >> good point. > > Well, I do not know any reference to give you, but I realized all this > just by following the actuall code ;-) ... is a very nice hackery, it > will speedup a log methods calls like this: > > L = [] > for i in iterable: > L.append(i) > > The actual method is 'cached' internally in a sort of statically sized > (about 1000 entries) hashtable, so you can save dict lookups, and this > is specially important in the case of inheritance... I'm not quite sure I follow. Where is the cache (is it global, or attached to L, or to the class of L?) What if you write "L.append = len" inside the loop?
- Robert From dalcinl at gmail.com Sun May 25 10:04:35 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 05:04:35 -0300 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: <48391A74.9080309@behnel.de> References: <48391A74.9080309@behnel.de> Message-ID: On 5/25/08, Stefan Behnel wrote: > Sounds wrong to me, too. Could you ask about that on the Py3k list? Stefan, I'm afraid that if I post this, chances are high that it will be completely ignored :-( . I would prefer that someone else ask about this, so... Stefan/Robert/Greg, if you agree there is a problem, this is in your hands... You know the Python-Dev policies: 1) write a patch, 2) submit a patch, 3) wait for someone to review it, or review 5 patches to get your patch reviewed. No time... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Sun May 25 10:07:20 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 10:07:20 +0200 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: <48391A74.9080309@behnel.de> References: <48391A74.9080309@behnel.de> Message-ID: <48391E38.7050802@behnel.de> Hi, Stefan Behnel wrote: > Could you ask about that on the Py3k list? Never mind, I filed a bug report on the bug tracker.
http://bugs.python.org/issue2963 Stefan From dalcinl at gmail.com Sun May 25 10:09:33 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 05:09:33 -0300 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: <032ED8B0-B0E7-45E4-8A6D-3B60520F425B@math.washington.edu> References: <80898271-5D2A-44CF-989E-6C2A7BA48CB7@math.washington.edu> <032ED8B0-B0E7-45E4-8A6D-3B60520F425B@math.washington.edu> Message-ID: On 5/25/08, Robert Bradshaw wrote: > I'm not quite sure I follow. Where is the cache (is it global, or > attached to L, or to the class of L?) What if you write "L.append = > len" inside the loop? It is a static, global, fixed-size array; see for yourself: struct method_cache_entry { unsigned int version; PyObject *name; /* reference to exactly a str or None */ PyObject *value; /* borrowed */ }; static struct method_cache_entry method_cache[1 << MCACHE_SIZE_EXP]; The 'version' field in the entries, plus special flags and a new field added to the type object structure, are used to determine the validity of the cache... it is not so easy to explain in words... Being global, you can expect obscure bugs ;-), especially if you play with tp_dict, as Cython does...
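Since the mechanism is hard to put in words, here is a toy Python model of a version-tagged global method cache. It is only a sketch under loose assumptions: `ToyType`, `lookup`, `set_attr` and the table size are invented for illustration and do not mirror the real C code in typeobject.c beyond the general idea of a global table whose entries are validated against a per-type version tag.

```python
CACHE_SIZE_EXP = 10                     # assumed exponent; table has 1 << 10 slots
_cache = [None] * (1 << CACHE_SIZE_EXP)  # one global table shared by all types
_next_version = 0                       # source of globally unique version tags


class ToyType:
    """Hypothetical stand-in for a type object with a dict and one base."""

    def __init__(self, name, attrs, base=None):
        self.name = name
        self.attrs = dict(attrs)        # plays the role of tp_dict
        self.base = base
        self.subclasses = []
        self.version = None             # None means "no valid version tag"
        if base is not None:
            base.subclasses.append(self)

    def invalidate(self):
        # Dropping the tag makes every cached entry for this type unreachable.
        self.version = None
        for sub in self.subclasses:
            sub.invalidate()

    def set_attr(self, name, value):
        # The "official" way to modify a type: change the dict AND invalidate.
        self.attrs[name] = value
        self.invalidate()


def lookup(tp, name):
    """Find an attribute along the base chain, consulting the global cache first."""
    global _next_version
    if tp.version is None:
        _next_version += 1
        tp.version = _next_version
    slot = hash((tp.version, name)) & (len(_cache) - 1)
    entry = _cache[slot]
    if entry is not None and entry[0] == tp.version and entry[1] == name:
        return entry[2]                 # cache hit: no dict lookups at all
    value = None
    t = tp
    while t is not None:                # slow path: walk the inheritance chain
        if name in t.attrs:
            value = t.attrs[name]
            break
        t = t.base
    _cache[slot] = (tp.version, name, value)
    return value


base = ToyType("Base", {"append": "original append"})
derived = ToyType("Derived", {}, base)
first = lookup(derived, "append")       # slow path, then cached
base.set_attr("append", "replaced")     # invalidates Base and Derived
second = lookup(derived, "append")      # sees the new value
base.attrs["append"] = "sneaky"         # bypasses invalidation, like poking tp_dict
third = lookup(derived, "append")       # stale cached value is returned
```

The last two lines show the kind of obscure bug mentioned above: a write that goes straight to the dict without invalidating the version tag leaves a stale entry in the global table.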
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sun May 25 10:15:11 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 05:15:11 -0300 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: <48391E38.7050802@behnel.de> References: <48391A74.9080309@behnel.de> <48391E38.7050802@behnel.de> Message-ID: Well, I have another one for you then ;-), again in Py3.0. Please look at 'Objects/classobject.c' and tell me what you think about this patch. This was causing the segfaults with Cython's 'classmethod' implementation: Index: Objects/classobject.c =================================================================== --- Objects/classobject.c (revision 63598) +++ Objects/classobject.c (working copy) @@ -502,7 +502,7 @@ instancemethod_descr_get(PyObject *descr, PyObject *obj, PyObject *type) { register PyObject *func = PyInstanceMethod_GET_FUNCTION(descr); if (obj == NULL) - return func; + return Py_INCREF(func), func; else return PyMethod_New(func, obj); } On 5/25/08, Stefan Behnel wrote: > Hi, > > > Stefan Behnel wrote: > > Could you ask about that on the Py3k list? > > > Never mind, I filed a bug report on the bug tracker.
> > http://bugs.python.org/issue2963 > > > Stefan -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sun May 25 10:19:21 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 05:19:21 -0300 Subject: [Cython] py3k patch, please try for the classmethod issue Message-ID: Stefan, please try this patch for Python 3.0 and let me know if this resolves the 'classmethod' issues for you... it seems to work for me... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: py3k.patch Type: application/octet-stream Size: 1299 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080525/bd1c50c3/attachment.obj From dalcinl at gmail.com Sun May 25 10:38:16 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 25 May 2008 05:38:16 -0300 Subject: [Cython] and now, a patch for Cython regarding method cache Message-ID: Stefan, this is what I believe is the safe way of invalidating the cache, at least for 'cdef class' classes. For normal classes, I'll have to think a bit more. I'm going to sleep, my brain is exhausted ...
Regards, -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: cython.patch Type: application/octet-stream Size: 781 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080525/149b4f94/attachment-0001.obj From stefan_ml at behnel.de Sun May 25 10:41:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 10:41:21 +0200 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: <48391A74.9080309@behnel.de> <48391E38.7050802@behnel.de> Message-ID: <48392631.1010606@behnel.de> Hi, Lisandro Dalcin wrote: > Please look at 'Objects/classobject.c' , and tell me what do you think > about this patch. This guy was causing the segfaults with Cython's > 'classmethod' implementation > > Index: Objects/classobject.c > =================================================================== > --- Objects/classobject.c (revision 63598) > +++ Objects/classobject.c (working copy) > @@ -502,7 +502,7 @@ > instancemethod_descr_get(PyObject *descr, PyObject *obj, PyObject *type) { > register PyObject *func = PyInstanceMethod_GET_FUNCTION(descr); > if (obj == NULL) > - return func; > + return Py_INCREF(func), func; > else > return PyMethod_New(func, obj); > } To comment on this, I would have to look through all occurrences of instancemethod_descr_get() and see if this corresponds to the way the result is handled... ... but given that PyMethod_New() returns a new reference also, I would suspect that the above is the right thing to do - although Py_INCREF(func); return func; looks better to me... :) Thanks for looking through all this, BTW.
Should I file the bug report or do you want to do it yourself? Stefan From stefan_ml at behnel.de Sun May 25 10:54:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 10:54:44 +0200 Subject: [Cython] py3k patch, please try for the classmethod issue In-Reply-To: References: Message-ID: <48392954.2030607@behnel.de> Lisandro Dalcin wrote: > Stefan, please try this patch for Python 3.0 and let me know if this > resolves the 'classmethod' issues for you...it seems to work for me... Lets the test case pass for Py3 without crash. Stefan From stefan_ml at behnel.de Sun May 25 11:01:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 11:01:02 +0200 Subject: [Cython] and now, a patch for Cython regarding method cache In-Reply-To: References: Message-ID: <48392ACE.7080405@behnel.de> Hi, Lisandro Dalcin wrote: > Stefan, this is what I believe is the safe way for invalidating the > cache, at least for 'cdef class' classes. I did the same, just after executing the body of a cdef class. I'm not sure, but I think this is the simplest thing to do. Also, I assume that the same thing has to be done after deleting an attribute, so it works in that case, too. The intuition is that whenever there is a cdef class body, there will be changes to the attributes in one way or another... Is the cache rebuilt at any point or will it remain invalid once we invalidate it? > For normal clases, I'll have to think a bit more. I don't think it hits us in that case at all, does it? > I go to sleep, my brain is exausted ... Thanks for all the work you put into this. I'll file the second bug report on Py3 then. 
Stefan From dagss at student.matnat.uio.no Sun May 25 11:34:27 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 25 May 2008 11:34:27 +0200 Subject: [Cython] Status update: Transform utilities In-Reply-To: <4839162E.4030504@behnel.de> References: <482DB943.3090501@student.matnat.uio.no> <4839162E.4030504@behnel.de> Message-ID: <483932A3.6000804@student.matnat.uio.no> Thanks for taking the time to review. Stefan Behnel wrote: > General comment: I'm not sure templating is better than creating the tree by > hand. For example, you use node.body below, which can expand to anything. > > Since I consider writing transformers a rare thing, maybe it would be better > to really write down the node creation code and stick in the bits from the > original tree. To help doing that, we could have a tree dumper that writes out > the class creation calls, so that you can write the Cython code, dump the > tree, and just modify the tree creation code to fit your needs. I find that > simpler and safer than your parsing approach. (I saw your other comment so I'll disregard the "safer" bit.) This is the kind of question I don't think there's much point discussing in length: - It doesn't affect stability/chance of introducing bugs (except, I'd argue, for an advantage in code readability for the string notation...) - It doesn't affect any interfaces or APIs or anything anywhere - It can be switched to another approach in 20 minutes at any time without affecting anything else. And TreeFragment can actually take a manually constructed "template" node tree as a constructor argument too! So you wouldn't need to change anything but replace the string itself. It's like discussing whether one should use a for loop or while loop! We can't always discuss things at this level IMO. My stance on this is simply that if you implement the with statement, you can create the nodes however you want, and if I write it, I'm free to use strings and TreeFragment. What do you think? 
(Like I said, the reason I wrote TreeFragment is to have easy unit testing. But once it's there, I'd use it for the with statement too myself.) >> - Some changes to Transform.py which I hope goes through... there's a >> Visitor object there; using the "process_ClassName" pattern (I think >> that was the conclusion for future performance reasons). > > Did we really reach a conclusion on this? Well, we had two in favor of vtable-friendliness (and you and me against); so I considered that if I switched my vote, it was 3/1. I really don't care that much either way, I just wanted to move forward without getting hung up into trivial details. Bike sheds and so on. (As you don't think transforms will be that useful, I guess you wouldn't care that much anyway? I think they will be critical though...and better safe than sorry, i.e. better vtable-friendly than not.) >> A SourceDescriptor can currently be a FileSourceDescriptor, in which >> case things work like before (it gives the filename on __str__ so much >> code needed not change), > > Please make that a method on the SourceDescriptor, like "get_filename()". > Calling str() on it reads like you wanted to print the source. The real problem here was my email message. A Source***Descriptor*** is an abstraction for the filename, not its contents, so str() does not read like you want to print the source. str() is called when you want to inform a human user (in an error message) about which source the code you're displaying comes from. It could say "disk file: /home/..." rather than just "/home/...", but I didn't want to change behaviour for files. "get_description()" would be ok if str() is considered too ambigious though. 
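To make the descriptor idea above concrete, here is a minimal Python sketch. Only the class names come from this thread; the constructor arguments, the `get_description()` method and the `format_error()` helper are assumptions made for illustration and may not match the actual Cython source.

```python
class SourceDescriptor:
    """Abstract description of where a piece of source code came from."""

    def get_description(self):
        raise NotImplementedError

    def __str__(self):
        # str() is what error messages use to name the source.
        return self.get_description()


class FileSourceDescriptor(SourceDescriptor):
    def __init__(self, filename):
        self.filename = filename

    def get_description(self):
        # Unchanged behaviour for files: just the path.
        return self.filename


class StringSourceDescriptor(SourceDescriptor):
    def __init__(self, name, code):
        self.name = name      # hypothetical label for the snippet
        self.code = code      # the source text itself, never printed by str()

    def get_description(self):
        return "string source: %s" % self.name


def format_error(position, message):
    """Format an error from a (source_desc, line, column) position tuple."""
    source_desc, line, column = position
    return "%s:%d:%d: %s" % (source_desc, line, column, message)


file_pos = (FileSourceDescriptor("/home/user/foo.pyx"), 3, 7)
frag_pos = (StringSourceDescriptor("with_fragment", "_mgr = (EXPR)"), 1, 0)
file_msg = format_error(file_pos, "undeclared name")
frag_msg = format_error(frag_pos, "undeclared name")
```

In this sketch str() yields a human-readable label in both cases, while only the file variant carries a `filename` attribute.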
In the cases where an actual filename is needed, I did (at least I hope I did) always insert something to the effect of: if not isinstance(source_desc, FileSourceDescriptor): raise AssertionError("Expected a file disk source here") do_something(source_desc.filename) (For instance, Main.Context does this a lot, but Main.Context also always passes in a FileSourceDescriptor to begin with. The Context that the string parser uses instead implements its methods in a way that drops this assumption about file sources.) -- Dag Sverre From dagss at student.matnat.uio.no Sun May 25 11:38:46 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 25 May 2008 11:38:46 +0200 Subject: [Cython] Compile-time duck typing In-Reply-To: <4838CA08.2020602@canterbury.ac.nz> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> <4838CA08.2020602@canterbury.ac.nz> Message-ID: <483933A6.3010302@student.matnat.uio.no> Greg Ewing wrote: > Why do you want to overload based on return type > anyway? > > I don't see any need to do that in the case of > __getitem__. That fits perfectly well into the > usual model of parametric polymorphism -- the > return type is the same as the element type of > the array. To be even more specific: Slices.
arr[1:4] should return a Python object, arr[1] should return a single item of native type. Both calling methods uses __getitem__ in Python. -- Dag Sverre From stefan_ml at behnel.de Sun May 25 11:40:48 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 25 May 2008 11:40:48 +0200 Subject: [Cython] py3k patch, please try for the classmethod issue In-Reply-To: <48392954.2030607@behnel.de> References: <48392954.2030607@behnel.de> Message-ID: <48393420.5040602@behnel.de> Hi, Stefan Behnel wrote: > Lisandro Dalcin wrote: >> Stefan, please try this patch for Python 3.0 and let me know if this >> resolves the 'classmethod' issues for you...it seems to work for me... > > Lets the test case pass for Py3 without crash. Both patches were committed to the current Py3 SVN. Stefan From dagss at student.matnat.uio.no Sun May 25 12:07:16 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 25 May 2008 12:07:16 +0200 Subject: [Cython] getitem operator In-Reply-To: <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> Message-ID: <48393A54.90408@student.matnat.uio.no> After getting a night sleep (and Greg's emails too!) 
I've definitely come around now: having "generic" on the return type
complicates the issue way too much. Won't say I've put it totally away,
but it should not be anything close to a priority and I won't think more
about it until after summer at least.

Robert Bradshaw wrote:
> On May 24, 2008, at 2:58 PM, Dag Sverre Seljebotn wrote:
>
>> So you propose that one keeps the name __getitem__, yet changes the
>> interface completely compared to Python? (Don't pass slices, unpack the
>> tuple prior to calling...) I don't see the benefit of keeping the name
>> then, it only creates confusion...
>
> I'm thinking of extending the semantics (note, for inline functions
> only, otherwise none of this optimization can happen), and not in a
> way that is backwards incompatible. __getitem__(self, [object] index)
> will always be available, and always called for non-inlined code.

I don't like binding this to the question of inline vs. non-inline. (It
might well work that way if overloading is only enabled for inline
functions at first, but that would be a side-effect.) It is perfectly
reasonable to want to write container classes as Cython "cdef class"
wholly in Cython, and want Cython client code to avoid tuple
packing/unpacking when indexing.

If the __getitem__ name is kept I don't think slices should be specially
treated -- they're really just object keys. This is what I think we
should do:

- Whenever the []-operator is encountered, an overload of __getitem__
  with the same number of arguments as there are dimensions in the
  operator is looked for first. So a[x,y,i:j:k] would look first for
  a.__getitem__(x, y, slice(i,j,k)), and if that can't be called, will
  call a.__getitem__((x, y, slice(i,j,k))).

- If you type the arguments with "int" the overload will fail matching
  when slices are used. However, this means there's no way to disallow
  slices but allow, say, skipping tuple packing on ND hash lookups using
  object keys.
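To make the first rule concrete, here is a small run-time emulation of it. It is a sketch only: the proposal is about a decision the Cython compiler would make at compile time, whereas this uses `inspect.signature` at run time, and `dispatch_getitem`, `Grid` and `Plain` are hypothetical names used for illustration.

```python
import inspect


def dispatch_getitem(obj, *keys):
    """Emulate the proposed lookup for obj[k1, k2, ...].

    First try a __getitem__ that takes one parameter per index; if the
    object's __getitem__ cannot be called that way, fall back to packing
    the indices into a single tuple, as plain Python does.
    """
    method = obj.__getitem__
    try:
        inspect.signature(method).bind(*keys)
    except TypeError:
        return method(keys)      # a.__getitem__((x, y, ...))
    return method(*keys)         # a.__getitem__(x, y, ...)


class Grid:
    # Hypothetical container with a two-argument __getitem__.
    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, row, col):
        return self.rows[row][col]


class Plain:
    # Ordinary one-argument __getitem__; simply echoes its key.
    def __getitem__(self, key):
        return key
```

With this, `dispatch_getitem(Grid(...), 1, 0)` reaches the two-argument method without building a tuple, while `dispatch_getitem(Plain(), x, y, slice(i, j, k))` falls back to the usual tuple key. A single index is passed through unchanged in both cases.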
However, this makes the one-argument case still ambiguous (may be passed
a tuple or not). So I'm really still in favor of being very explicit, and
stating that in the case of exact, non-slice, non-Ellipsis indexing
*only* do we also support a __cgetitem__ operator (or perhaps
__getsingleitem__) which always takes the exact same number of arguments
as passed to []. (One could even drop overloading, and have
__getsingleitem1__, __getsingleitem2__, ...)

On the return types: I suppose having "object" and relying on inlining to
remove it again is preferable. There's still an alternative: Allow
"self.dtype" as a return type. This is more explicit and would mean that
type inference could be done *before* the inlining/optimization happens
(not sure if that is helpful or not). It doesn't look nice syntactically
but it's not such a problem for the compiler ("self.dtype" would be
"under assumption", and if "self.dtype" is _not_ under assumption that
method is simply considered nonexistent).

(I think we'll end up with "object", but I'm mentioning it anyway. You
seem to have thought a lot more about type inference than myself.)

> Your whole (..) can be encoded in the argument signature (fixing
> the number of arguments, which I'll admit is undesirable). The
> "tuple" representing the indices never gets created or used in this
> case, which makes things much clearer. Otherwise, I still don't see
> how you're going to unpack the tuple argument in the generic case
> without resorting to Python tuples.

No, it would be Python tuples, but evaluated at compile time through the
whole inlining/unrolling machinery I've dreamed up. But I've put that
away now.
-- Dag Sverre From languitar at semipol.de Sun May 25 12:04:44 2008 From: languitar at semipol.de (Johannes Wienke) Date: Sun, 25 May 2008 12:04:44 +0200 Subject: [Cython] use of __del__ In-Reply-To: <4838C7E6.7020706@canterbury.ac.nz> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <4838C7E6.7020706@canterbury.ac.nz> Message-ID: <483939BC.3090204@semipol.de> Am 05/25/2008 03:59 AM schrieb Greg Ewing: > Johannes Wienke wrote: > >> Is there a reason why this isn't documented? > > It's mentioned in Special Methods of Extension Types, > under the section on __dealloc__. It could perhaps > be made more prominent, though. > If I got to http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ I am unable to find that document (or am I blind?). Maybe a link from there would help? Johannes -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080525/a22fd761/attachment-0001.pgp From jim-crow at rambler.ru Sun May 25 14:27:57 2008 From: jim-crow at rambler.ru (Anatoly A. Kazantsev) Date: Sun, 25 May 2008 19:27:57 +0700 Subject: [Cython] cdef extern struct from .h file Message-ID: <20080525192757.9e84b037.jim-crow@rambler.ru> Hello! Can't find right way to cdef extern struct from bar.h In foo.h defined struct: struct some_struct { int member; } Then in bar.h: #include typedef struct another_struct some_struct -- Anatoly A. Kazantsev Protect your digital freedom and privacy, eliminate DRM, learn more at http://www.defectivebydesign.org/what_is_drm -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080525/80cf4c9d/attachment.pgp From cwitty at newtonlabs.com Sun May 25 18:28:27 2008 From: cwitty at newtonlabs.com (Carl Witty) Date: Sun, 25 May 2008 09:28:27 -0700 Subject: [Cython] Compile-time duck typing In-Reply-To: <4838BD98.1040204@canterbury.ac.nz> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <4838BD98.1040204@canterbury.ac.nz> Message-ID: On Sat, May 24, 2008 at 6:15 PM, Greg Ewing wrote: > But what if the result is being passed to another function > whose argument type is "generic"? If there is more than > one way of instantiating that function, it can easily > become ambiguous. > > C++ disallows overloading a function based solely on the > return type, probably because of the ambiguities it can > lead to. > > Also, in Haskell I think it's always possible to deduce > the return type of a function if you know the types of > its arguments. Actually, that's not quite true; for instance, the function "read" has type (Read a) => String -> a which means that it can return any type such that Haskell knows how to "read" it (parse it from a string): Prelude> (read "3") :: Int 3 Prelude> (read "3") :: Double 3.0 Then a construction like show (read "x") is ambiguous, By default, ambiguous types like this default to type Integer, but a particular module can specify a different default. 
I could describe Haskell typing in a lot more detail, but I don't really think it's likely to be relevant for Cython; more relevant would be HaXe or the various soft typing systems for Scheme (where "soft typing" means that you don't necessarily try to assign a precise type to every expression, you just go "as precise as you can"). Carl From greg.ewing at canterbury.ac.nz Mon May 26 02:48:22 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 26 May 2008 12:48:22 +1200 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: Message-ID: <483A08D6.3070100@canterbury.ac.nz> Lisandro Dalcin wrote: > #define MCACHE_CACHEABLE_NAME(name) \ > PyString_CheckExact(name) && \ > PyString_GET_SIZE(name) <= MCACHE_MAX_ATTR_SIZE > > Does make any sense the PyString_XXXX there??? This looks like something you ought to bring to the attention of the py3k developers list. -- Greg From greg.ewing at canterbury.ac.nz Mon May 26 02:56:22 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 26 May 2008 12:56:22 +1200 Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file In-Reply-To: References: <48391A74.9080309@behnel.de> Message-ID: <483A0AB6.9070305@canterbury.ac.nz> Lisandro Dalcin wrote: > You know Python-Dev policies: 1) write a patch, 2) > submit a patch, 3) wait someone review it, or review 5 patches for > getting your patch reviewed. That's only for new features and other such non-essential things. I'm sure they're always open to reports of actual bugs in important areas like this. 
-- Greg From greg.ewing at canterbury.ac.nz Mon May 26 03:18:15 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 26 May 2008 13:18:15 +1200 Subject: [Cython] Compile-time duck typing In-Reply-To: <483933A6.3010302@student.matnat.uio.no> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> <483884BB.9000506@student.matnat.uio.no> <0D75485B-9309-42F4-BCC9-A1F38128D948@math.washington.edu> <48388F77.8030709@student.matnat.uio.no> <84BE93B0-3B44-406B-8ABC-08D61B4418A4@math.washington.edu> <4838CA08.2020602@canterbury.ac.nz> <483933A6.3010302@student.matnat.uio.no> Message-ID: <483A0FD7.4060102@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > To be even more specific: Slices. arr[1:4] should return a Python > object, arr[1] should return a single item of native type. Both calling > methods uses __getitem__ in Python. That's something of a problem, but I don't see how trying to sort it out based on what's being done with the return value will work in general. What happens if the return value is being passed to another function as a parameter of type 'generic'? However, this may not be as much of a problem as it looks. When the slice indices are written directly in the indexing expression like that, you know that the argument to __getitem__ isn't just any kind of object, it's an object of type 'slice'. 
So you have at least three possible signatures for __getitem__:

    __getitem__(int) -> element type of array   # optimised
    __getitem__(slice) -> python object         # optimised
    __getitem__(object) -> python object        # unoptimised

You could go further and have finer specialisations of type 'slice'
according to the number of indices, etc., allowing different code to be
generated for each case.

--
Greg

From greg.ewing at canterbury.ac.nz Mon May 26 03:20:37 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 26 May 2008 13:20:37 +1200
Subject: [Cython] use of __del__
In-Reply-To: <483939BC.3090204@semipol.de>
References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <4838C7E6.7020706@canterbury.ac.nz> <483939BC.3090204@semipol.de>
Message-ID: <483A1065.8020906@canterbury.ac.nz>

Johannes Wienke wrote:
> Am 05/25/2008 03:59 AM schrieb Greg Ewing:
>
>>It's mentioned in Special Methods of Extension Types,
>
> I am unable to find that document (or am I blind?).
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html -- Greg From greg.ewing at canterbury.ac.nz Mon May 26 03:46:38 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 26 May 2008 13:46:38 +1200 Subject: [Cython] Compile-time duck typing In-Reply-To: References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <4838BD98.1040204@canterbury.ac.nz> Message-ID: <483A167E.8020908@canterbury.ac.nz> Carl Witty wrote: > Then a construction like > show (read "x") > is ambiguous, By default, ambiguous types like this default to type > Integer, but a particular module can specify a different default. Hmmm, that's interesting -- I didn't know that it would actually allow ambiguous constructs like that. I'm not sure this is a feature that Cython ought to emulate -- seems to me it would be better to be explicit about *how* you want to read the string. 
But maybe EIBTI isn't considered part of the Zen of Haskell.:-) -- Greg From languitar at semipol.de Mon May 26 10:56:30 2008 From: languitar at semipol.de (Johannes Wienke) Date: Mon, 26 May 2008 10:56:30 +0200 Subject: [Cython] use of __del__ In-Reply-To: <483A1065.8020906@canterbury.ac.nz> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <4838C7E6.7020706@canterbury.ac.nz> <483939BC.3090204@semipol.de> <483A1065.8020906@canterbury.ac.nz> Message-ID: <483A7B3E.7010303@semipol.de> Am 05/26/2008 03:20 AM schrieb Greg Ewing: > Johannes Wienke wrote: >> Am 05/25/2008 03:59 AM schrieb Greg Ewing: > > >>> It's mentioned in Special Methods of Extension Types, > > I am unable to find that document (or am I blind?). > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html But how do I get there? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://codespeak.net/pipermail/cython-dev/attachments/20080526/39ebe032/attachment.pgp From rho at fisheggs.name Mon May 26 12:27:54 2008 From: rho at fisheggs.name (Nigel Rowe) Date: Mon, 26 May 2008 20:27:54 +1000 Subject: [Cython] use of __del__ In-Reply-To: <483A7B3E.7010303@semipol.de> References: <48385795.7020404@semipol.de> <483A1065.8020906@canterbury.ac.nz> <483A7B3E.7010303@semipol.de> Message-ID: <200805262027.56777@fisheggs+neverbox.com> On Mon, 26 May 2008 10:56:30 +0200, Johannes Wienke wrote in a message with the id <483A7B3E.7010303 at semipol.de>: > > Am 05/26/2008 03:20 AM schrieb Greg Ewing: > > Johannes Wienke wrote: > >> Am 05/25/2008 03:59 AM schrieb Greg Ewing: > >>> It's mentioned in Special Methods of Extension Types, > >>> > > > I am unable to find that document (or am I blind?). 
> > > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Do > >c/Manual/special_methods.html > > But how do I get there? Main Pyrex page --> Language Overview --> Extension Types --> Special Methods -- Nigel Rowe rho \N{COMMERCIAL AT} fisheggs \N{FULL STOP} name From languitar at semipol.de Mon May 26 12:33:40 2008 From: languitar at semipol.de (Johannes Wienke) Date: Mon, 26 May 2008 12:33:40 +0200 Subject: [Cython] use of __del__ In-Reply-To: <200805262027.56777@fisheggs+neverbox.com> References: <48385795.7020404@semipol.de> <483A1065.8020906@canterbury.ac.nz> <483A7B3E.7010303@semipol.de> <200805262027.56777@fisheggs+neverbox.com> Message-ID: <483A9204.5080409@semipol.de> Am 05/26/2008 12:27 PM schrieb Nigel Rowe: > On Mon, 26 May 2008 10:56:30 +0200, > Johannes Wienke wrote in a message > with the id <483A7B3E.7010303 at semipol.de>: >> Am 05/26/2008 03:20 AM schrieb Greg Ewing: >>> Johannes Wienke wrote: >>>> Am 05/25/2008 03:59 AM schrieb Greg Ewing: >>>>> It's mentioned in Special Methods of Extension Types, >>>>> >>> > I am unable to find that document (or am I blind?). >>> >>> http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Do >>> c/Manual/special_methods.html >> But how do I get there? > > > Main Pyrex page > > > --> Language Overview > > > --> Extension Types > > > --> Special Methods > Ah, I see, but if you overread the tiny text that says the special methods are one a separate page, you won't find it as the navigation on top of that page pretends all th content will be directly on this page. Johannes -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080526/63049b14/attachment.pgp

From dalcinl at gmail.com Mon May 26 16:13:40 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 26 May 2008 11:13:40 -0300
Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file
In-Reply-To: <48392631.1010606@behnel.de>
References: <48391A74.9080309@behnel.de> <48391E38.7050802@behnel.de> <48392631.1010606@behnel.de>
Message-ID:

On 5/25/08, Stefan Behnel wrote:
> To comment on this, I would have to look through all occurrences of
> instancemethod_descr_get() and see if this corresponds to the way the result
> is handled...

Well, given that xxx_descr_get is the equivalent, in Cython terms, of a
__get__() inside a 'property' block, it should never return borrowed
references...

> ... but given that PyMethod_New() returns a new reference also, I would
> suspect that the above is the right thing to do - although
>
> Py_INCREF(func);
> return func;
>
> looks better to me... :)

Of course, an abuse of comma expressions on my part...

> Thanks for looking through all this, BTW. Should I file the bug report or do
> you want to do it yourself?

Go ahead, file the report, and many thanks. But I would really prefer
that you first apply the patch to the Python 3.0 sources and confirm that
the Cython tests then pass.

BTW, how do you run Cython's test suite in the Py3 case? The obvious way
is failing for me ...
$ python3.0 runtests.py
Traceback (most recent call last):
  File "runtests.py", line 232, in
    from Cython.Compiler.Main import \
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", line 159
    except UnicodeDecodeError, msg:
                             ^
SyntaxError: invalid syntax

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Mon May 26 16:39:31 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 26 May 2008 11:39:31 -0300
Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file
In-Reply-To: <483A0AB6.9070305@canterbury.ac.nz>
References: <48391A74.9080309@behnel.de> <483A0AB6.9070305@canterbury.ac.nz>
Message-ID:

On 5/25/08, Greg Ewing wrote:
>
> That's only for new features and other such non-essential
> things. I'm sure they're always open to reports of actual
> bugs in important areas like this.
>

Well, Greg, see for yourself my last post to the py3k list:

http://mail.python.org/pipermail/python-3000/2008-May/013594.html

Perhaps that's not essential, but if anyone had commented on this, and,
further, a decision on what to do were available, I could even contribute
the full patch for fixing this. But I'll not write a single line for a
patch that is probably going to be dormant forever in a bug tracker...

So I really believe that in case I ever find a real issue, I should
discuss it first with smart and well-known people (like you, Greg), and
ask them to post to the devel lists. That way, the chances are much
higher of the issue being reviewed by a core developer, and solved fast.
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Mon May 26 17:15:20 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 26 May 2008 12:15:20 -0300
Subject: [Cython] and now, a patch for Cython regarding method cache
In-Reply-To: <48392ACE.7080405@behnel.de>
References: <48392ACE.7080405@behnel.de>
Message-ID:

On 5/25/08, Stefan Behnel wrote:
> Hi,
> Lisandro Dalcin wrote:
> > Stefan, this is what I believe is the safe way for invalidating the
> > cache, at least for 'cdef class' classes.
>
> I did the same, just after executing the body of a cdef class. I'm not sure,
> but I think this is the simplest thing to do. Also, I assume that the same
> thing has to be done after deleting an attribute, so it works in that case,
> too. The intuition is that whenever there is a cdef class body, there will be
> changes to the attributes in one way or another...

In short, as Cython has to play with the 'tp_dict' field of type objects,
that type object (and actually all its subclasses) has to be flagged as
'touched', and that should occur at EVERY point where the 'tp_dict' is
changed (i.e., every time you set or delete an attribute, method, etc.)

> Is the cache rebuilt at any point or will it remain invalid once we invalidate it?

The cache itself is not actually invalidated, but the type object (and
all its subclasses) has to be flagged as 'touched'; then the next
attribute lookup will ignore the cache, look for the attribute the normal
way, put it in the cache, and finally return the result. So, after every
attribute lookup, the type object will again be taking advantage of the
cache.
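The "touch, then refill on the next lookup" behaviour described above can be pictured with a toy model. This is a hedged sketch in plain Python, not CPython's implementation: `ToyType`, `ToyMethodCache` and `touch()` are hypothetical names, the `valid` flag merely stands in for Py_TPFLAGS_VALID_VERSION_TAG, and subclasses are not modelled.

```python
import itertools

_next_version = itertools.count(1)


class ToyType:
    """Stand-in for a C type object; `attrs` plays the role of tp_dict."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self.version = 0
        self.valid = False   # stands in for Py_TPFLAGS_VALID_VERSION_TAG


class ToyMethodCache:
    def __init__(self):
        self.entries = {}
        self.slow_lookups = 0

    def lookup(self, tp, name):
        if not tp.valid:
            # A touched type gets a fresh version tag, so entries stored
            # under the old tag can never match again.
            tp.version = next(_next_version)
            tp.valid = True
        key = (tp.version, name)
        if key not in self.entries:
            self.slow_lookups += 1
            self.entries[key] = tp.attrs[name]   # the "normal" lookup
        return self.entries[key]


def touch(tp):
    """What must happen after writing to tp.attrs behind the cache's back."""
    tp.valid = False


cache = ToyMethodCache()
t = ToyType({"f": "old"})
first = cache.lookup(t, "f")   # slow path, fills the cache
t.attrs["f"] = "new"           # direct write, like poking tp_dict
stale = cache.lookup(t, "f")   # the cache still answers with the old value
touch(t)
fresh = cache.lookup(t, "f")   # slow path again, cache refilled
again = cache.lookup(t, "f")   # served from the cache
```

The `stale` lookup is the failure mode under discussion: a direct write that nobody reports leaves the cache answering with the old attribute until the type is touched.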
For that reason, Cython should flag the type object as 'touched' after
modifying the 'tp_dict' field. For normal Python classes, I believe this
is not necessary, as the normal Python mechanisms for getting/setting
attributes are being used; in that case Python takes care of 'touching'
types.

Finally, I believe the Python 3.0 API should provide a function for
automatically touching a type object and all its subclasses. This is
currently implemented in the sources, but it is not a public function.
What is available is the PyType_ClearCache() function (accessible at the
Python level as sys._clear_type_cache()), but that function works by
'touching' the base type object (that is, the 'type' type at the Python
level) and all its subclasses, which in Py3 would mean that the call will
visit ALL classes ever defined and alive.

Finally, Python 3.0 should provide an option for disabling the method
cache code altogether at configure or compile time. Why?
Users/distributors of real-time operating systems will probably take
advantage of that.

> > For normal classes, I'll have to think a bit more.
>
> I don't think it hits us in that case at all, does it?

Indeed. As the standard Python mechanisms are being used for normal
classes, I now believe you should not take any action here.

Stefan, if I was not clear enough about all this, sorry, it is not easy
to explain. Furthermore, I learned all this by diving into the actual
source code (I hope I really understood it well! Perhaps I'm missing some
point).

In short, every time a 'tp_dict' is changed, Cython should generate code
with the line 'xxx->tp_flags &= ~Py_TPFLAGS_VALID_VERSION_TAG'. The patch
I sent before solved all the issues of my mpi4py project in both Py2.6
and Py2.3 (after fixing the bug in the method cache to make it actually
be used). So, I really think you should apply my patch for Cython for
now. If we can get Python 3.0 to gain a new C-API level function for
doing this the right way, then we can update Cython to use it.
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Mon May 26 17:16:37 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 26 May 2008 12:16:37 -0300
Subject: [Cython] py3k patch, please try for the classmethod issue
In-Reply-To: <48393420.5040602@behnel.de>
References: <48392954.2030607@behnel.de> <48393420.5040602@behnel.de>
Message-ID:

On 5/25/08, Stefan Behnel wrote:
>
> Both patches were committed to the current Py3 SVN.
>

Great!! Many thanks, Stefan...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Mon May 26 17:19:26 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 May 2008 17:19:26 +0200 (CEST)
Subject: [Cython] PLEASE HELP for review Python 3.0 'typeobject.c' source file
In-Reply-To:
References: <48391A74.9080309@behnel.de> <48391E38.7050802@behnel.de> <48392631.1010606@behnel.de>
Message-ID: <39397.194.114.62.37.1211815166.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Lisandro Dalcin wrote:
> Go ahead, file the report, and many thanks. But I would really prefer
> that first you apply the patch in Python 3.0 sources and confirm that
> then Cython test then pass.

I did. :)

> BTW, How do you run Cython's test suite in the Py3 case? The obvious
> way is failing for me ...
> > $ python3.0 runtests.py > Traceback (most recent call last): > File "runtests.py", line 232, in > from Cython.Compiler.Main import \ > File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", > line 159 > except UnicodeDecodeError, msg: > ^ > SyntaxError: invalid syntax Look at the Makefile, there is a "make test3". python2.5 runtests.py --no-cleanup python3.0 runtests.py -vv --no-cython generally does the trick for me. I also changed the test runner to use optparse, so "python runtests.py -h" should work (although it looks like I forgot to add the parameter documentation :) Stefan From dalcinl at gmail.com Mon May 26 17:25:26 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 26 May 2008 12:25:26 -0300 Subject: [Cython] cdef extern struct from .h file In-Reply-To: <20080525192757.9e84b037.jim-crow@rambler.ru> References: <20080525192757.9e84b037.jim-crow@rambler.ru> Message-ID: I do not figure out what do you really want do do at the Cython level. Could you elaborate a bit more? On 5/25/08, Anatoly A. Kazantsev wrote: > Hello! > > Can't find right way to cdef extern struct from bar.h > > In foo.h defined struct: > > struct some_struct { > int member; > } > > Then in bar.h: > > #include > typedef struct another_struct some_struct > > > -- > Anatoly A. 
Kazantsev
>
> Protect your digital freedom and privacy, eliminate DRM, learn more at
> http://www.defectivebydesign.org/what_is_drm
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>
>
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From jek-gmane at kleckner.net Mon May 26 18:20:57 2008
From: jek-gmane at kleckner.net (Jim Kleckner)
Date: Mon, 26 May 2008 09:20:57 -0700
Subject: [Cython] use of __del__
In-Reply-To: <483A9204.5080409@semipol.de>
References: <48385795.7020404@semipol.de> <483A1065.8020906@canterbury.ac.nz> <483A7B3E.7010303@semipol.de> <200805262027.56777@fisheggs+neverbox.com> <483A9204.5080409@semipol.de>
Message-ID:

Johannes Wienke wrote:
> Am 05/26/2008 12:27 PM schrieb Nigel Rowe:
> Ah, I see, but if you overlook the tiny text that says the special
> methods are on a separate page, you won't find it as the navigation on
> top of that page pretends all the content will be directly on this page.

Yes, a bit more navigation help would be valuable.
Jim From stefan_ml at behnel.de Mon May 26 18:26:45 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 May 2008 18:26:45 +0200 (CEST) Subject: [Cython] and now, a patch for Cython regarding method cache In-Reply-To: References: <48392ACE.7080405@behnel.de> Message-ID: <43977.194.114.62.37.1211819205.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Lisandro Dalcin wrote: > In short, as Cython has to play with the 'tpdict' field of type > objects, the that type object (and actually all its subclasses) have > to be flaged as 'touched', and that should occur at EVERY point were > the 'tp_dict' is changed (ie, every time you set or delete an > attribute, method, etc.) [...] > 'xxx->tp_flags &= ~Py_TPFLAGS_VALID_VERSION_TAG'. I was just considering that extension types are fixed after executing their body, so there is not much that can happen to their attributes outside of the body execution. You (currently) can't even delete any attribute from inside the body. That's why I think resetting the flag once is enough, and that generating code that resets that flag all over the place is unnecessary. Here's what I did: http://hg.cython.org/cython-devel/rev/ac0cc8edbf55 Admittedly, your patch feels a bit safer... > Finally, I believe Python 3.0 API should provide a function for > automatically touch a type object and all subclasses. Feels like it, yes. > Finally, Python3.0 should provide a configure option for disabling at > all the method cache code at configure or compile time. Why? > Users/distributors of Real-time operating systems will probably take > advantage of that. Maybe. But you should let "them" ask that. :) Stefan From martin at martincmartin.com Mon May 26 19:15:42 2008 From: martin at martincmartin.com (Martin C. 
Martin) Date: Mon, 26 May 2008 13:15:42 -0400 Subject: [Cython] Compile-time duck typing In-Reply-To: <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> References: <51345.193.157.229.67.1211461677.squirrel@webmail.uio.no> <483592C7.3040602@behnel.de> <52056.193.157.229.67.1211481350.squirrel@webmail.uio.no> <41EA570A-816D-401A-A146-78FE30B59DE3@math.washington.edu> <50088.193.157.229.67.1211544688.squirrel@webmail.uio.no> <4836E68D.4000105@behnel.de> <50642.193.157.229.67.1211561775.squirrel@webmail.uio.no> <51005.193.157.229.67.1211649396.squirrel@webmail.uio.no> <556C298D-0C72-43D5-AE9D-AFCB54AC6F3A@math.washington.edu> <48387488.5050605@student.matnat.uio.no> <44537E3A-6EE2-4D5C-90CD-0BBB4E667311@math.washington.edu> Message-ID: <483AF03E.3070401@martincmartin.com> Robert Bradshaw wrote: > On May 24, 2008, at 1:03 PM, Dag Sverre Seljebotn wrote: >> >> Robert Bradshaw wrote: > True. A generic n-ary unpacker that gets unrolled completely at > compile time may be much more complicated to implement (and read). > IMHO, not as important as making 1, 2, and 3-dimensional indexing as > fast as possible (possibly falling back to a runtime loop for more). > Not as powerful, but more realistic to actually get done (especially > given all the other things you're planning to do). Also, a generic n-ary unpacker may be horribly inefficient when n is large. It may even be inefficient when n is small. The loop takes up more memory, and hence kicks other things out of the instruction cache. And on x86, instructions in a small loop are only decoded once. When you unroll, each iteration needs to be decoded separately. It's really, really hard to predict what's faster than what at the per-instruction level these days. I see three levels of performance worries: 1. Not worried at all, e.g. outside inner loop, or on quick scripts. Can work in Python subset and use C typing simply to wrap functionality in C libraries that you don't have in Python. 2. Care a little. 
Put in C types and other notations to the compiler to get a speed boost with little programmer effort. 3. Care a lot. Then you want to control e.g. whether or not things get unrolled by trying them & seeing what actually speeds things up. Nowhere in here is there much scope for the compiler to try to be really smart and optimize aggressively. You could leave that to the C compiler, e.g. emit a loop and compile with -funroll-loops (or -fpeel-loops, which uses the profile guided optimization.) Dag, would you be interested in doing a quick test, writing hand-unrolled C vs. a loop in C, and seeing where the cutoff for "n" is? Best, Martin From dalcinl at gmail.com Mon May 26 20:22:06 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 26 May 2008 15:22:06 -0300 Subject: [Cython] and now, a patch for Cython regarding method cache In-Reply-To: <43977.194.114.62.37.1211819205.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <48392ACE.7080405@behnel.de> <43977.194.114.62.37.1211819205.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On 5/26/08, Stefan Behnel wrote: > I was just considering that extension types are fixed after executing > their body, so there is not much that can happen to their attributes > outside of the body execution. You (currently) can't even delete any > attribute from inside the body. That's why I think resetting the flag once > is enough, and that generating code that resets that flag all over the > place is unnecessary. But I believe that's not true in the case of 'classmethod' support for the special case of 'cdef' classes. Cython first uses PyObject_GetAttr() (called inside __Pyx_GetName) , but next updates 'tp_dict'. Playing with 'tp_dict' directly is a sort of hack, Python does never notice that, and cannot manage the method cache appropriatelly. Stefan, let's do that. I'll try again with your proposal (just reseting the flag once). 
If this does not work for me in my project, then I'll try to reproduce with simpler a simpler test case. Then, and only then, iff I can confirm bad behavior of your proposal, I'll ask again for doing things my way. > > Finally, I believe Python 3.0 API should provide a function for > > automatically touch a type object and all subclasses. > > Feels like it, yes. We just need a guy brave enough for asking for this feature on python-dev ;-). But I guess that this could be rejected, because its only pourpose is to support Cython hackery ... Iff you grep at Py3K sources, 'tp_dict' is almost never changed, except in 'typeobject.c' > > Finally, Python3.0 should provide a configure option for disabling at > > all the method cache code at configure or compile time. Why? > > Users/distributors of Real-time operating systems will probably take > > advantage of that. > > Maybe. But you should let "them" ask that. :) > Of couse, just forget my comment. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Mon May 26 21:04:20 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 26 May 2008 16:04:20 -0300 Subject: [Cython] Py3K: recent rename PyString -> PyBytes Message-ID: Stefan, how should we handle this? I ask because this is a really fresh change, not available in latest Py3K source distribution, but commited in SVN. IMHO, we would follow SVN as closely as possible, and no wait for alphas. Or perhaps we can again branch to a new repo, until the next Py3K alfa is released? 
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Mon May 26 21:57:39 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 26 May 2008 16:57:39 -0300
Subject: [Cython] final conclusion about how to handle the method cache issue
Message-ID:

Simple example:

cdef class A:
    ATTR1 = 1
    ATTR2 = ATTR1
    ATTR1 = 2
    ATTR2 = ATTR1

So in the end, 'A.ATTR1' and 'A.ATTR2' should both be '2'.

Trying this with current cython-devel, I get:

$ python3.0
Python 3.0a5+ (py3k:63695, May 26 2008, 12:24:48)
.....
>>> import qq
>>> qq.A.ATTR1
2
>>> qq.A.ATTR2
1
>>>

The method cache in action!!! Actually, you can see that the cache is
for more than methods: it covers any attribute inside every type
dictionary ('tp_dict' field).

Now, I try the same but with the attached patch (it is the same as an
old one I posted, plus the removal of Stefan's code). Type objects are
now marked as 'touched' every time Cython sets something in 'tp_dict'
using a direct PyDict_SetItem call.

$ python3.0
Python 3.0a5+ (py3k:63695, May 26 2008, 12:24:48)
...
>>> import qq
[36640 refs]
>>> qq.A.ATTR1
2
>>> qq.A.ATTR2
2
>>>

And then all works as expected.

Stefan, if you are still not convinced that this is the right and the
only safe way of handling this issue, then I invite you to review the
Python 3.0 sources, file Objects/typeobject.c, function
_PyType_Lookup(). Then you will easily figure out that if Cython updates
sometype->tp_dict, then we have to invalidate the type version tag
immediately.
Furthermore, we should invalidate the version tag of all subclasses, but
if we are going to do it the right way, we have to implement the
equivalent of the private function 'type_modified()' in 'typeobject.c',
or ask Python-Dev to make it available to the public.

So, Stefan, I really think you should push the attached patch. Hope I
convinced you! Please review my hackery carefully. I'm not at all sure
that my way of finding the C name of the type object is good enough.

Regards,

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcache.patch
Type: application/octet-stream
Size: 1495 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080526/75f84e68/attachment.obj

From stefan_ml at behnel.de Mon May 26 21:39:34 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 May 2008 21:39:34 +0200
Subject: [Cython] and now, a patch for Cython regarding method cache
In-Reply-To: <43977.194.114.62.37.1211819205.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
References: <48392ACE.7080405@behnel.de> <43977.194.114.62.37.1211819205.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <483B11F6.4040305@behnel.de>

Hi again,

Stefan Behnel wrote:
> Lisandro Dalcin wrote:
>> In short, as Cython has to play with the 'tpdict' field of type
>> objects, the that type object (and actually all its subclasses) have
>> to be flaged as 'touched', and that should occur at EVERY point were
>> the 'tp_dict' is changed (ie, every time you set or delete an
>> attribute, method, etc.)
> [...]
>> 'xxx->tp_flags &= ~Py_TPFLAGS_VALID_VERSION_TAG'.
> > Here's what I did: > > http://hg.cython.org/cython-devel/rev/ac0cc8edbf55 > > Admittedly, your patch feels a bit safer... ... and it has the advantage that it only invalidates the cache if it's really necessary, i.e. if there really are assignments. So I changed it according to your patch. Stefan From stefan_ml at behnel.de Mon May 26 23:02:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 May 2008 23:02:09 +0200 Subject: [Cython] final conclusion about how to handle the method cache issue In-Reply-To: References: Message-ID: <483B2551.4020001@behnel.de> Hi, Lisandro Dalcin wrote: > the attached patch (is its the same as an > old one I posted, plus the removal of Stefan's code). he he, I'm first: http://hg.cython.org/cython-devel/rev/501d399af9cf (although you might argue that you were the first who sent the patch to the list...) Stefan From stefan_ml at behnel.de Mon May 26 23:08:00 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 26 May 2008 23:08:00 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: Message-ID: <483B26B0.6060409@behnel.de> Lisandro Dalcin wrote: > Stefan, how should we handle this? I ask because this is a really > fresh change, not available in latest Py3K source distribution, but > commited in SVN. IMHO, we would follow SVN as closely as possible, and > no wait for alphas. Or perhaps we can again branch to a new repo, > until the next Py3K alfa is released? Yep, I noticed that discussion on the py3k list. I don't think we should follow SVN, that's too unstable as a target. It should be enough to import the string compat header from 3beta1 on (once it's decided how they actually call the file...). 
Stefan From greg.ewing at canterbury.ac.nz Tue May 27 02:43:38 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 May 2008 12:43:38 +1200 Subject: [Cython] use of __del__ In-Reply-To: <483A7B3E.7010303@semipol.de> References: <48385795.7020404@semipol.de> <1155D85E-6AF7-4AF0-AE29-AE179287DAA4@math.washington.edu> <4838953E.8050104@semipol.de> <4838C7E6.7020706@canterbury.ac.nz> <483939BC.3090204@semipol.de> <483A1065.8020906@canterbury.ac.nz> <483A7B3E.7010303@semipol.de> Message-ID: <483B593A.7000009@canterbury.ac.nz> Johannes Wienke wrote: > Am 05/26/2008 03:20 AM schrieb Greg Ewing: > > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html > > But how do I get there? That link should take you straight to the page I'm talking about. The relevant section is called "Finalization method: __dealloc__". If you mean how to get to the top level of the Pyrex documentation, it's called "Language Overview" on the main Pyrex page. Main Pyrex page: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Language Overview: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/LanguageOverview.html I should probably give it a clearer name on the main page. If you mean how to get to it from the Cython site, I don't know. 
-- Greg From greg.ewing at canterbury.ac.nz Tue May 27 02:54:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 May 2008 12:54:56 +1200 Subject: [Cython] use of __del__ In-Reply-To: <483A9204.5080409@semipol.de> References: <48385795.7020404@semipol.de> <483A1065.8020906@canterbury.ac.nz> <483A7B3E.7010303@semipol.de> <200805262027.56777@fisheggs+neverbox.com> <483A9204.5080409@semipol.de> Message-ID: <483B5BE0.7050000@canterbury.ac.nz> Johannes Wienke wrote: > Ah, I see, but if you overread the tiny text that says the special > methods are one a separate page, you won't find it as the navigation on > top of that page pretends all th content will be directly on this page. Oh, I see. I'm not sure what's the best thing to do about that. Logically it's part of the Extension Types topic, but it wouldn't be appropriate to insert in the middle of that page. It's not something you need to look at in detail on a first reading, but at the same time it's important not to miss it when you do need to know about it. I'll see if I can find a better place to put the link. -- Greg From dagss at student.matnat.uio.no Tue May 27 13:55:56 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 27 May 2008 13:55:56 +0200 Subject: [Cython] Info on current dagss branch and my GSoC workflow Message-ID: <483BF6CC.4000904@student.matnat.uio.no> Some info: During the summer there'll be a -dagss branch at the usual location. I make no guarantees as to the state of that branch (I think it will usually be "backwards compatible" and at least pass old tests before I push, but new features I introduce might be under development and broken). I'll pull in the main repository now and then to keep it in sync. Once I consider something ready for inclusion in the main branch I'll ping the mailing list. Robert will have main responsibility for review and pulling what I do there over to the main repository. 
If I misrepresented something here then please fill in, Robert :-)

Anyway, my current state is now up; I think it can be merged into
cython-devel now (in order to get the SourceDescriptor thing into the
main branch, so that more people than me can see if it breaks anything).

I've talked about the contents in "Status update: Transform utilities";
further additions:

- Some corrections in the visitor/transform relationship (Robert, take a
  look and see if you like it, Cython/Compiler/Visitor.py)
- Some fixes that Stefan pointed me in the direction of (for
  SourceDescriptors)
- Some fixes Robert pointed me to (for TreeFragment / ExprStatNode)
- Not using the over-designed get_child_accessors for visitors (though it
  is still used in code I didn't write so it's not removed)

Back to GSoC: I have a "status page" here:

http://wiki.cython.org/DagSverreSeljebotn/status

I think it will be updated about weekly. (GSoC students are also
encouraged to keep a blog; so perhaps I'll get around to doing that.
Though I do feel my writing vs. coding ratio is high enough already...)

(Mercurial question: Is there any way to get a tree view rather than a
flattened view of the history in the web browser changelog? Or simply
sort by time? It appears kind of strange now, with my recent changes
way back in the history.)

--
Dag Sverre

From kirr at mns.spb.ru Tue May 27 15:01:46 2008
From: kirr at mns.spb.ru (Kirill Smelkov)
Date: Tue, 27 May 2008 17:01:46 +0400
Subject: [Cython] Info on current dagss branch and my GSoC workflow
In-Reply-To: <483BF6CC.4000904@student.matnat.uio.no>
References: <483BF6CC.4000904@student.matnat.uio.no>
Message-ID: <200805271701.46328.kirr@mns.spb.ru>

On Tuesday 27 May 2008, Dag Sverre Seljebotn wrote:
> (Mercurial question: Is there any way to get a tree view rather than a
> flattened view of the history in the web browser changelog? Or simply
> sort by time? It appears kind of strange now, with my recent changes
> way back in the history.)
May I suggest that "hg view" and "hg glog" are your friends. Kirill. From jonas at MIT.EDU Tue May 27 15:09:36 2008 From: jonas at MIT.EDU (Eric Jonas) Date: Tue, 27 May 2008 09:09:36 -0400 Subject: [Cython] Cython, pickle, inheritance Message-ID: <1211893776.11042.4.camel@convolution> Hello! I've been successfully using cython for the root objects in a class hierarchy to make some common operations much faster. I eventually figured out how to implement __reduce__ in these objects to support pickling. Unfortunately, when I inherit (in python) from my cython classes, and pickle an instance of the resulting derived object, the unpickle only appears to restore the base (cython) object. Does anyone know of any workarounds for this pickle? (bad pun) Thanks, ...Eric From dalcinl at gmail.com Tue May 27 15:43:08 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 10:43:08 -0300 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483B26B0.6060409@behnel.de> References: <483B26B0.6060409@behnel.de> Message-ID: OK, I'm following the SVN Py3 trunk closely, then I'll be updating a local cython-devel repo, patched to follow SVN. This way, when the Py3 C-API finally stabilizes, we will have the patches for Cython ready. On 5/26/08, Stefan Behnel wrote: > > Lisandro Dalcin wrote: > > Stefan, how should we handle this? I ask because this is a really > > fresh change, not available in latest Py3K source distribution, but > > commited in SVN. IMHO, we would follow SVN as closely as possible, and > > no wait for alphas. Or perhaps we can again branch to a new repo, > > until the next Py3K alfa is released? > > > Yep, I noticed that discussion on the py3k list. I don't think we should > follow SVN, that's too unstable as a target. It should be enough to import the > string compat header from 3beta1 on (once it's decided how they actually call > the file...). 
> > Stefan > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Tue May 27 15:53:26 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 10:53:26 -0300 Subject: [Cython] final conclusion about how to handle the method cache issue In-Reply-To: <483B2551.4020001@behnel.de> References: <483B2551.4020001@behnel.de> Message-ID: On 5/26/08, Stefan Behnel wrote: > Lisandro Dalcin wrote: > > the attached patch (is its the same as an > > old one I posted, plus the removal of Stefan's code). > > he he, I'm first: > > http://hg.cython.org/cython-devel/rev/501d399af9cf Sorry! I just did not notiched that commit. > (although you might argue that you were the first who sent the patch to the > list...) Mmm, it seems you will get all the credits from my work ;-) . Stefan, many, many thanks. I'm very hapy that Cython-based projects can be Py3K-ready and still supporting Py 2.3. That's just fantastic. All us in debt with you forever... 
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Tue May 27 16:06:16 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 27 May 2008 11:06:16 -0300
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211893776.11042.4.camel@convolution>
References: <1211893776.11042.4.camel@convolution>
Message-ID:

Eric, could you send your reduce code to me or the list? I guess you
are returning a bad callable as the first element of the returned
tuple, which should be something like 'type(self)', like below:

class A:
    def __init__(self):
        pass
    def __reduce__(self):
        return ( type(self), () )

On 5/27/08, Eric Jonas wrote:
> Hello! I've been successfully using cython for the root objects in a
> class hierarchy to make some common operations much faster. I eventually
> figured out how to implement __reduce__ in these objects to support
> pickling. Unfortunately, when I inherit (in python) from my cython
> classes, and pickle an instance of the resulting derived object, the
> unpickle only appears to restore the base (cython) object. Does anyone
> know of any workarounds for this pickle?
(bad pun)
>
> Thanks,
> ...Eric
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From jonas at MIT.EDU Tue May 27 16:21:49 2008
From: jonas at MIT.EDU (Eric Jonas)
Date: Tue, 27 May 2008 10:21:49 -0400
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution>
Message-ID: <1211898109.11042.14.camel@convolution>

On Tue, 2008-05-27 at 11:06 -0300, Lisandro Dalcin wrote:
> Eric, could you send your reduce code to me or the list? I guess you
> are returning a bad callable as the first element of the returned
> tuple, which should be something like 'type(self)', like below:
>
> class A:
>     def __init__(self):
>         pass
>     def __reduce__(self):
>         return ( type(self), () )
>

My apologies, I should have done this to start. I'm a bit confused why
you suggest that the reduce method should return the type as the first
element of the tuple -- I thought it had to be a function?
Thanks,
...Eric

cdef class Node:

    def __init__(self):
        self.inboundkey = 0
        self.inboundlinks = {}
        self.tags = {}

    def __reduce__(self):
        return Node_restore, (self.inboundkey, self.inboundlinks,
                              self.tags)


def Node_restore(inboundkey, inboundlinks, tags):
    s = Node()
    s.inboundkey = inboundkey
    s.inboundlinks = inboundlinks
    s.tags = tags
    return s

From dalcinl at gmail.com Tue May 27 16:29:07 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 27 May 2008 11:29:07 -0300
Subject: [Cython] final conclusion about how to handle the method cache issue
In-Reply-To: <483B2551.4020001@behnel.de>
References: <483B2551.4020001@behnel.de>
Message-ID:

Stefan, I reviewed your last commit, and that's definitely the way to
go. Hope your request on Python-Dev is accepted.

Just a note: in your latest commit, you introduced the call like this,
doing an explicit cast to PyTypeObject*

    __Pyx_TypeModified((PyTypeObject*)%s)

That cast is a bit superfluous, as in the current state of the code the
expansion of '%s' will be a type object pointer, i.e. PyTypeObject*. I
would prefer that the cast be removed. This way, if any of us ever calls
__Pyx_TypeModified in the future passing a PyObject* (that would be
dangerous, as __Pyx_TypeModified does not check that the input is really
a type object), then the compiler will complain, and we have the chance
to quickly notice such a dangerous call.

In short, I ask you to remove the cast, or add an explicit
PyType_Check() at the beginning of __Pyx_TypeModified(). This is just
to save us from a segfault in the future and from having to use GDB
(something that I never do, because I just never had the time to learn
it).

On 5/26/08, Stefan Behnel wrote:
> Hi,
>
>
> Lisandro Dalcin wrote:
> > the attached patch (is its the same as an
> > old one I posted, plus the removal of Stefan's code).
>
>
> he he, I'm first:
>
> http://hg.cython.org/cython-devel/rev/501d399af9cf
>
> (although you might argue that you were the first who sent the patch to the
> list...)
> > Stefan > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Tue May 27 16:37:09 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 11:37:09 -0300 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: <1211898109.11042.14.camel@convolution> References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> Message-ID: Indeed, it seems the problem is in your code, 'Node_restore' creates a new base 'Node' obejct, and never a subclass of it!!. Try to implement your '__reduce__' like this and let me know if it works (as I never actually implemented pickle protocols): class Node: ..... def __reduce__(self): return type(self), (self.inboundkey, self.inboundlinks, self.tags) But note that then all the subclasses of Node should be able to be created with no arguments, that, a call like 'SubNode()' should just work and create a new 'SubNode' instance On 5/27/08, Eric Jonas wrote: > > On Tue, 2008-05-27 at 11:06 -0300, Lisandro Dalcin wrote: > > Eric, could you send to me or the list of you reduce code? I guess you > > are returning a bad callable for the fist argument of the returned > > tuple , which should be something like 'type(self)', like below > > > > class A: > > def __init__(self): > > pass > > def __reduce__(self): > > return ( type(self), () ) > > > > > > > > My apologies, I should have done this to start. I'm a bit confused why > you suggest that the reduce method should return the type as the first > element of the tuple -- I thought it had to be a function? 
> > Thanks, > ...Eric > > > cdef class Node: > > def __init__(self): > self.inboundkey = 0 > self.inboundlinks = {} > self.tags = {} > > def __reduce__(self): > return Node_restore, (self.inboundkey, self.inboundlinks, > self.tags) > > > def Node_restore(inboundkey, inboundlinks, tags): > s = Node() > s.inboundkey = inboundkey > s.inboundlinks = inboundlinks > s.tags = tags > return s > > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From jonas at MIT.EDU Tue May 27 16:55:13 2008 From: jonas at MIT.EDU (Eric Jonas) Date: Tue, 27 May 2008 10:55:13 -0400 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> Message-ID: <1211900113.11042.21.camel@convolution> I apologize again, but I just don't see how that would work (and indeed, I tried it and it didn't work). On Tue, 2008-05-27 at 11:37 -0300, Lisandro Dalcin wrote: > Try to implement your '__reduce__' like this and let me know if it > works (as I never actually implemented pickle protocols): > > class Node: > ..... > def __reduce__(self): > return type(self), (self.inboundkey, self.inboundlinks, self.tags) > This will, upon unpickling, call the constructor for whatever object self is, but pass it inboundkey, etc. So I need a constructor for self that takes these arguments. But then: 1. subclasses must always have an __init__ that can take node's arguments? 2. how do any sublcass-specific attributes get pickled? 
I worry I might just be fundamentally misunderstanding some part of the pickle process. Thanks again, ...Eric > But note that then all the subclasses of Node should be able to be > created with no arguments, that, a call like 'SubNode()' should just > work and create a new 'SubNode' instance > > > On 5/27/08, Eric Jonas wrote: > > > > On Tue, 2008-05-27 at 11:06 -0300, Lisandro Dalcin wrote: > > > Eric, could you send to me or the list of you reduce code? I guess you > > > are returning a bad callable for the fist argument of the returned > > > tuple , which should be something like 'type(self)', like below > > > > > > class A: > > > def __init__(self): > > > pass > > > def __reduce__(self): > > > return ( type(self), () ) > > > > > > > > > > > > > > My apologies, I should have done this to start. I'm a bit confused why > > you suggest that the reduce method should return the type as the first > > element of the tuple -- I thought it had to be a function? > > > > Thanks, > > ...Eric > > > > > > cdef class Node: > > > > def __init__(self): > > self.inboundkey = 0 > > self.inboundlinks = {} > > self.tags = {} > > > > def __reduce__(self): > > return Node_restore, (self.inboundkey, self.inboundlinks, > > self.tags) > > > > > > def Node_restore(inboundkey, inboundlinks, tags): > > s = Node() > > s.inboundkey = inboundkey > > s.inboundlinks = inboundlinks > > s.tags = tags > > return s > > > > > > > > _______________________________________________ > > Cython-dev mailing list > > Cython-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/cython-dev > > > > From stefan_ml at behnel.de Tue May 27 17:20:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 27 May 2008 17:20:14 +0200 (CEST) Subject: [Cython] final conclusion about how to handle the method cache issue In-Reply-To: References: <483B2551.4020001@behnel.de> Message-ID: <60026.194.114.62.39.1211901614.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Lisandro Dalcin wrote: > Stefan, I 
reviewed your last commit, and that's definitely the way to > go. Hope your request in Python-Dev is accepted. I actually posted it to the Py3 list, but if I don't get a response by tonight (when I have real Internet access), I will cross-post it to python-dev, where it actually belongs... > Just a note, in your latest commit, you introduced the call like this, > doing an explicit cast to PyTypeObject > > __Pyx_TypeModified((PyTypeObject*)%s) > > That cast is a bit superfluous, as in the current state of the code, > the expansion '%s' will a type object pointer, ie, PyTypeObject. True. > In short, I ask you to remove the cast, or add and explicit > PyType_Check() at the begining of __Pyx_TypeModified(). No type check. Removing the cast and accepting a crash if you manage to trick the compiler into passing bogus stuff is ok with me. > This is just > to save us a segfault in the future and have use GDB (something that I > never do, because I just never had the time to learn it). Never too late to learn using it (at least a bit), although I tend to use valgrind a lot more often than gbd. And I even use "print" (and sometimes "printf") a lot more often than gdb. Compilers are fast these days, so adding a "print" at the right place and re-running a test is often faster than running the same test in gdb, putting a breakpoint at the right spot, stepping along, checking values by hand, ... Stefan From dalcinl at gmail.com Tue May 27 17:20:25 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 12:20:25 -0300 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: <1211900113.11042.21.camel@convolution> References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> Message-ID: On 5/27/08, Eric Jonas wrote: > I tried it and it didn't work). > This will, upon unpickling, call the constructor for whatever object > self is, but pass it inboundkey, etc. 
So I need a constructor for self
> that takes these arguments. But then:

Sorry, my fault. My suggestion was completely wrong. Let's try again:

class Node:
    def __reduce__(self):
        return Node_create, (type(self), self.arg1, self.arg2, self.arg3)

def Node_create(klass, arg1, arg2, arg3):
    n = klass()
    n.arg1 = arg1
    n.arg2 = arg2
    n.arg3 = arg3
    return n

But there are definitely better ways. You should look at the docs of the
pickle module, particularly the __getinitargs__/__getnewargs__
and the __getstate__/__setstate__ stuff. If I were to implement the
pickle protocol, I would explicitly use that.

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From jonas at MIT.EDU Tue May 27 18:42:27 2008
From: jonas at MIT.EDU (Eric Jonas)
Date: Tue, 27 May 2008 12:42:27 -0400
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
Message-ID: <1211906548.11042.24.camel@convolution>

> But there are definitely better ways. You should look at the docs of the
> pickle module, particularly the __getinitargs__/__getnewargs__
> and the __getstate__/__setstate__ stuff. If I were to implement the
> pickle protocol, I would explicitly use that.

The Python pickle docs, which I've been struggling to understand, seem
to suggest that all extensions need either __reduce__ or the copy_reg
module. But it seems that __getstate__/__setstate__ are not
applicable to C extensions. Is this correct?
Thanks, ...Eric From dalcinl at gmail.com Tue May 27 20:47:13 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 15:47:13 -0300 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: <1211906548.11042.24.camel@convolution> References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> Message-ID: Well, I admit that the Python docs are a bit hard to follow, but let's solve your issue. Look at the attached files, * mod.pyx: a simple 'cdef' class pair, like the C++ one * test.py: a full test, all pickle protocols, includes subclassing * cy2py: my lovely distutils based script for the process *.pyx->*.so (no need of makefiles, improvements welcome!) run like $ python cy2py mod.pyx . This way, you get mod.so. If you find this code useful, you can donate some dollars ... Just a joke! But if you can make it a really good example, It would be great to have this included in Cython docs, in order to help other doing this, I just do not have the time. Enjoy! On 5/27/08, Eric Jonas wrote: > > > But there is definitelly better ways. You sould look at the docs of > > pickle module, and particularly to the __getinitargs__/__getnewargs__ > > and the __getstate__/__setstate__ stuff. Iff I were to implement > > pickle protocol, I would explicitely use that. > > > > The python pickle docs, which I've been struggling to understand, seem > to suggest that __reduce__ is necessary for all extensions, or to use > the copy_reg module. But it seems that __getstate__/__setstate__ are not > applicable for c-extensions. Is this correct? 
> > Thanks, > > ...Eric > > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: cy2py Type: application/octet-stream Size: 1416 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080527/bdfc2257/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: mod.pyx Type: application/octet-stream Size: 968 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080527/bdfc2257/attachment-0001.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: test.py Type: text/x-python Size: 948 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080527/bdfc2257/attachment.py From jonas at MIT.EDU Tue May 27 22:56:38 2008 From: jonas at MIT.EDU (Eric Jonas) Date: Tue, 27 May 2008 16:56:38 -0400 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> Message-ID: <1211921798.17841.6.camel@convolution> To start with, this is awesome. Thank you so much! Next question: What if I want MyPair to have a zero-argument constructor? (Well, besides self, that is). If I do: class MyPair(pair): def __init__(self): pass then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)". 
Does this mean that derived types must always have constructors with the same number (or more) than their base type? Thanks, ...Eric On Tue, 2008-05-27 at 15:47 -0300, Lisandro Dalcin wrote: > Well, I admit that the Python docs are a bit hard to follow, but let's > solve your issue. Look at the attached files, > > * mod.pyx: a simple 'cdef' class pair, like the C++ one > * test.py: a full test, all pickle protocols, includes subclassing > * cy2py: my lovely distutils based script for the process *.pyx->*.so > (no need of makefiles, improvements welcome!) > run like $ python cy2py mod.pyx . This way, you get mod.so. > > If you find this code useful, you can donate some dollars ... Just a > joke! But if you can make it a really good example, It would be great > to have this included in Cython docs, in order to help other doing > this, I just do not have the time. > > Enjoy! > > > On 5/27/08, Eric Jonas wrote: > > > > > But there is definitelly better ways. You sould look at the docs of > > > pickle module, and particularly to the __getinitargs__/__getnewargs__ > > > and the __getstate__/__setstate__ stuff. Iff I were to implement > > > pickle protocol, I would explicitely use that. > > > > > > > > The python pickle docs, which I've been struggling to understand, seem > > to suggest that __reduce__ is necessary for all extensions, or to use > > the copy_reg module. But it seems that __getstate__/__setstate__ are not > > applicable for c-extensions. Is this correct? 
> >
> > Thanks,
> >
> > ...Eric

From dalcinl at gmail.com Tue May 27 23:27:47 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 27 May 2008 18:27:47 -0300
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211921798.17841.6.camel@convolution>
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
 <1211906548.11042.24.camel@convolution>
 <1211921798.17841.6.camel@convolution>
Message-ID:

On 5/27/08, Eric Jonas wrote:
> To start with, this is awesome. Thank you so much!
>
> Next question: What if I want MyPair to have a zero-argument
> constructor? (Well, besides self, that is). If I do:
>
> class MyPair(pair):
>     def __init__(self):
>         pass
>
> then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)".
> Does this mean that derived types must always have constructors with the
> same number of arguments (or more) as their base type?

Yes and no. The problem is that different pickle protocols assume and do
different things. Give me a few minutes, I'll try to figure out what's
going on. And yes, you are right, this is real crap if you want to
support all pickle protocols, plus class inheritance, in an easy way.
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Tue May 27 23:32:47 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 27 May 2008 14:32:47 -0700
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
 <1211906548.11042.24.camel@convolution>
 <1211921798.17841.6.camel@convolution>
Message-ID:

On May 27, 2008, at 2:27 PM, Lisandro Dalcin wrote:

> On 5/27/08, Eric Jonas wrote:
>> To start with, this is awesome. Thank you so much!
>>
>> Next question: What if I want MyPair to have a zero-argument
>> constructor? (Well, besides self, that is). If I do:
>>
>> class MyPair(pair):
>>     def __init__(self):
>>         pass
>>
>> then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)".
>> Does this mean that derived types must always have constructors
>> with the same number of arguments (or more) as their base type?
>
> Yes and no. The problem is that different pickle protocols assume and
> do different things. Give me a few minutes, I'll try to figure out
> what's going on. And yes, you are right, this is real crap if you
> want to support all pickle protocols, plus class inheritance, in an
> easy way.

On this note, I think Cython should be able to provide a default
__reduce__ that will "just work" for cdef classes, the way normal
pickling works in Python without having to think about it. Of course,
one should always be able to override it manually, and if the class
has specially-typed members then it may take a bit more work.
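For readers following the constructor problem discussed in this thread, here is a minimal sketch in plain Python rather than Cython. `Pair` and `MyPair` are hypothetical stand-ins for the classes in the attached mod.pyx and test.py (whose contents are not shown in the archive), and `rebuild` only imitates the unpickling steps. Returning `(type(self), (), state)` from `__reduce__` means the subclass constructor is called with no arguments, and everything else, including attributes set after construction, travels in the state:

```python
import pickle


class Pair:
    """Hypothetical plain-Python stand-in for the 'pair' extension type."""

    def __init__(self, first=0, second=0):
        self.first = first
        self.second = second

    def __getstate__(self):
        # Copy of every instance attribute, including ones added later.
        return dict(self.__dict__)

    def __setstate__(self, state):
        for key, value in state.items():
            setattr(self, key, value)

    def __reduce__(self):
        # (callable, args, state): the callable is invoked as
        # type(self)(), then __setstate__(state) restores the attributes.
        return type(self), (), self.__getstate__()


class MyPair(Pair):
    """Subclass whose constructor takes no arguments besides self."""

    def __init__(self):
        Pair.__init__(self, 1, 2)
        self.extra = 1234


def rebuild(obj):
    """Imitate the unpickling steps for the tuple returned by __reduce__."""
    function, args, state = obj.__reduce__()
    newobj = function(*args)
    newobj.__setstate__(state)
    return newobj


if __name__ == "__main__":
    p = MyPair()
    p.extra = 5678  # set after construction
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        q = pickle.loads(pickle.dumps(p, protocol))
        assert (q.first, q.second, q.extra) == (1, 2, 5678)
```

This relies on every subclass being constructible with no arguments. If `__reduce__` instead returned `(type(self), (self.first, self.second))`, unpickling a `MyPair` would call `MyPair(1, 2)` and fail with a TypeError of the kind Eric reports.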
- Robert From robertwb at math.washington.edu Tue May 27 23:42:13 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 27 May 2008 14:42:13 -0700 Subject: [Cython] Info on current dagss branch and my GSoC workflow In-Reply-To: <483BF6CC.4000904@student.matnat.uio.no> References: <483BF6CC.4000904@student.matnat.uio.no> Message-ID: <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu> On May 27, 2008, at 4:55 AM, Dag Sverre Seljebotn wrote: > Some info: During the summer there'll be a -dagss branch at the usual > location. I make no guarantees as to the state of that branch (I think > it will usually be "backwards compatible" and at least pass old tests > before I push, but new features I introduce might be under development > and broken). > > I'll pull in the main repository now and then to keep it in sync. > Once I > consider something ready for inclusion in the main branch I'll ping > the > mailing list. Robert will have main responsibility for review and > pulling what I do there over to the main repository. > > If I misrepresented something here then please fill in, Robert :-) That sounds about right to me. > Anyway, my current state is now up; I think it can be merged into > cython-devel now (in order to get the SourceDescriptor thing into the > main branch, so that more people than me can see if it breaks > anything). Wasn't able to compile Sage with it, broke on line 42 of Nodes.py (should be an easy fix). > I've talked about the contents in "Status update: Transform > utilities"; > further additions: > - Some corrections in the visitor/transform relationship (Robert, > take a > look and see if you like it, Cython/Compiler/Visitor.py) Yep, this looks a lot cleaner to me. > - Some fixes that Stefan pointed me in the direction of (for > SourceDescriptors) > - Some fixes Robert pointed me to (for TreeFragment / ExprStatNode) The SubstitutionTransform still needs to be able to handle multiple substitutions in one pass--e.g. 
to do stuff like a -> b, b -> c, and c -> a.

> - Not using the over-designed get_child_accessors for visitors (though
> it is still used in code I didn't write so it's not removed)

I'll do that.

>
> Back to GSoC: I have a "status page" here:
>
> http://wiki.cython.org/DagSverreSeljebotn/status
>
> I think it will be updated about weekly. (GSoC students are also
> encouraged to keep a blog; so perhaps I'll get around to doing that.
> Though I do feel my writing vs. coding ratio is high enough
> already...)

That wiki page will be really helpful. Thanks.

- Robert

From dalcinl at gmail.com Tue May 27 23:57:14 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 27 May 2008 18:57:14 -0300
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211921798.17841.6.camel@convolution>
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
 <1211906548.11042.24.camel@convolution>
 <1211921798.17841.6.camel@convolution>
Message-ID:

Eric, sorry for the confusion, I just had a bug in __reduce__.
Look at the new files attached; your last use case is included in
the test script.

__reduce__ returns a 3-tuple: the first item is the type object, which
pickle sees as a callable; the second is a tuple of arguments for that
callable; and the third is the state.

In general, if you implement __reduce__ like this

    def __reduce__(self):
        return function, args, state

then at unpickle time, Python (2.5) does

    function, args, state = unpickle(stream)
    newobj = function(*args)
    newobj.__setstate__(state)

and then 'newobj' is the final recovered object.

I hope I was clear enough. If you still need more complicated
stuff, please just keep asking. All this discussion will surely help
in the future, when Cython tries to implement something to
manage this whole beast automatically.

On 5/27/08, Eric Jonas wrote:
> To start with, this is awesome. Thank you so much!
> > Next question: What if I want MyPair to have a zero-argument > constructor? (Well, besides self, that is). If I do: > > class MyPair(pair): > def __init__(self): > pass > > then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)". > Does this mean that derived types must always have constructors with the > same number (or more) than their base type? > > Thanks, > > ...Eric > > > > > On Tue, 2008-05-27 at 15:47 -0300, Lisandro Dalcin wrote: > > Well, I admit that the Python docs are a bit hard to follow, but let's > > solve your issue. Look at the attached files, > > > > * mod.pyx: a simple 'cdef' class pair, like the C++ one > > * test.py: a full test, all pickle protocols, includes subclassing > > * cy2py: my lovely distutils based script for the process *.pyx->*.so > > (no need of makefiles, improvements welcome!) > > run like $ python cy2py mod.pyx . This way, you get mod.so. > > > > If you find this code useful, you can donate some dollars ... Just a > > joke! But if you can make it a really good example, It would be great > > to have this included in Cython docs, in order to help other doing > > this, I just do not have the time. > > > > Enjoy! > > > > > > On 5/27/08, Eric Jonas wrote: > > > > > > > But there is definitelly better ways. You sould look at the docs of > > > > pickle module, and particularly to the __getinitargs__/__getnewargs__ > > > > and the __getstate__/__setstate__ stuff. Iff I were to implement > > > > pickle protocol, I would explicitely use that. > > > > > > > > > > > > The python pickle docs, which I've been struggling to understand, seem > > > to suggest that __reduce__ is necessary for all extensions, or to use > > > the copy_reg module. But it seems that __getstate__/__setstate__ are not > > > applicable for c-extensions. Is this correct? 
> > > > > > Thanks, > > > > > > ...Eric > > > > > > > > > > > > _______________________________________________ > > > Cython-dev mailing list > > > Cython-dev at codespeak.net > > > http://codespeak.net/mailman/listinfo/cython-dev > > > > > > > > > _______________________________________________ > > Cython-dev mailing list > > Cython-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: mod.pyx Type: application/octet-stream Size: 1083 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080527/1a6486ea/attachment-0001.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: test.py Type: text/x-python Size: 1036 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080527/1a6486ea/attachment-0001.py From dalcinl at gmail.com Wed May 28 00:02:25 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 27 May 2008 19:02:25 -0300 Subject: [Cython] Cython, pickle, inheritance In-Reply-To: References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> <1211921798.17841.6.camel@convolution> Message-ID: Robert, this is definitely a good idea, but I would object it being enabled by default. 
If it is enabled in a way like the current '__weakref__' way, then that
would be fine (IMHO, and once again, explicit is better than implicit).
And finally, if a user wants to write the full implementation himself, as
in my example, then Cython should just ignore its 'default' way and
respect the user's implementation.

On 5/27/08, Robert Bradshaw wrote:
> On May 27, 2008, at 2:27 PM, Lisandro Dalcin wrote:
>
> > On 5/27/08, Eric Jonas wrote:
> >> To start with, this is awesome. Thank you so much!
> >>
> >> Next question: What if I want MyPair to have a zero-argument
> >> constructor? (Well, besides self, that is). If I do:
> >>
> >> class MyPair(pair):
> >>     def __init__(self):
> >>         pass
> >>
> >> then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)".
> >> Does this mean that derived types must always have constructors
> >> with the same number of arguments (or more) as their base type?
> >
> > Yes and no. The problem is that different pickle protocols assume and
> > do different things. Give me a few minutes, I'll try to figure out
> > what's going on. And yes, you are right, this is real crap if you
> > want to support all pickle protocols, plus class inheritance, in an
> > easy way.
>
>
> On this note, I think Cython should be able to provide a default
> __reduce__ that will "just work" for cdef classes, the way normal
> pickling works in Python without having to think about it. Of course,
> one should always be able to override it manually, and if the class
> has specially-typed members then it may take a bit more work.
>
>
> - Robert

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no Wed May 28 00:20:12 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 28 May 2008 00:20:12 +0200 (CEST)
Subject: [Cython] Info on current dagss branch and my GSoC workflow
In-Reply-To: <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu>
References: <483BF6CC.4000904@student.matnat.uio.no>
 <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu>
Message-ID: <2789.193.90.3.70.1211926812.squirrel@webmail.uio.no>

>> - Some fixes Robert pointed me to (for TreeFragment / ExprStatNode)
>
> The SubstitutionTransform still needs to be able to handle multiple
> substitutions in one pass--e.g. to do stuff like a -> b, b -> c, and
> c -> a.

Ahh... I didn't have that goal at all; in fact I want to avoid this. (Or
am I missing something here? Please elaborate, it's not clear to me what
you want to achieve.) If my guess is right, perhaps the fix is a rename to
TemplateTransform, and then we can have another SubstitutionTransform
doing what you want if needed?

Remember that the use case is for TreeFragment:

def complicated_create_assignment(varname, expr):
    result = TreeFragment("LEFT = RIGHT").substitute({"LEFT" : NameNode(varname),
                                                     "RIGHT" : expr})

So, if "RIGHT" is passed as varname, your suggestion would make "expr"
appear on both sides, *not* what was intended!
(And what happens if "LEFT" is passed as varname -- infinite recursion?)

(However, it should properly assign pos to substituted nodes, which it
currently does not; I'll try to remember to do that next time I'm coding.)

Dag Sverre

From dagss at student.matnat.uio.no Wed May 28 00:32:30 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 28 May 2008 00:32:30 +0200 (CEST)
Subject: [Cython] Info on current dagss branch and my GSoC workflow
In-Reply-To: <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu>
References: <483BF6CC.4000904@student.matnat.uio.no>
 <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu>
Message-ID: <2942.193.90.3.70.1211927550.squirrel@webmail.uio.no>

Robert wrote:
>> Anyway, my current state is now up; I think it can be merged into
>> cython-devel now (in order to get the SourceDescriptor thing into the
>> main branch, so that more people than me can see if it breaks
>> anything).
>
> Wasn't able to compile Sage with it, broke on line 42 of Nodes.py
> (should be an easy fix).

Ah, thanks. I'll remember to add Sage to my test cases (though I suppose
every time something like this happens one should in principle add
something to the Cython test cases too (generating docstrings in this
case)... will have to see if I get time though...)

I won't be at my computer before Thursday or Friday, but I'll fix it then.
(The fix will probably be

-return (pos[0][absolute_path_length+1:], pos[1])
+return (pos[0].get_filenametable_entry()[absolute_path_length+1:], pos[1])

but I might as well get used to compiling Sage...)
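As an aside on the substitution question raised in this thread: the difference between substituting in a single pass and substituting repeatedly can be shown on a toy tree. This is only a sketch using nested tuples and strings; `substitute` is a hypothetical helper, not the SubstitutionTransform or TreeFragment API. Because replacements are returned without being visited again, a cyclic mapping such as a -> b, b -> c, c -> a is applied simultaneously, and a replacement that happens to contain a placeholder name is left alone:

```python
def substitute(tree, mapping):
    """Return a copy of tree with names replaced in a single pass.

    Leaves are strings, inner nodes are tuples. A replacement taken from
    mapping is inserted as-is and never searched again.
    """
    if isinstance(tree, str):
        return mapping.get(tree, tree)
    return tuple(substitute(child, mapping) for child in tree)


# Cyclic renaming applied simultaneously.
rotated = substitute(("add", "a", ("mul", "b", "c")),
                     {"a": "b", "b": "c", "c": "a"})
# rotated == ("add", "b", ("mul", "c", "a"))

# Template filling where the variable happens to be called "RIGHT".
template = ("assign", "LEFT", "RIGHT")
filled = substitute(template, {"LEFT": ("name", "RIGHT"), "RIGHT": ("expr",)})
# filled == ("assign", ("name", "RIGHT"), ("expr",))
```

In this toy model both concerns are met at once: the cyclic renaming works, and a variable named after a placeholder does not get substituted a second time.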
Dag Sverre

From jonas at MIT.EDU Wed May 28 01:31:19 2008
From: jonas at MIT.EDU (Eric Jonas)
Date: Tue, 27 May 2008 19:31:19 -0400
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
 <1211906548.11042.24.camel@convolution>
 <1211921798.17841.6.camel@convolution>
Message-ID: <1211931079.17841.17.camel@convolution>

_amazing_

Thank you so much.

So one final question regarding additional state in derived classes. To
simplify I've written a new bit of code for test.py:

class MyPairExtra(pair):
    def __init__(self, f=0, s=0):
        self.first = f
        self.second = s
        self.extra = 1234
        self.extra2 = 0

p6 = MyPairExtra()
p6.extra2 = 5678

s6p2 = pkl.dumps(p6, 2)

s6p2load = pkl.loads(s6p2)
assert s6p2load.extra == 1234
assert s6p2load.extra2 == 5678

The last assertion currently fails. You'll note that the "extra"
attribute is correctly recovered, but the "extra2" attribute, which we set
after construction to 5678, is not. Is there any way to pickle arbitrary
extra object state?

Thanks again,

...Eric

On Tue, 2008-05-27 at 18:57 -0300, Lisandro Dalcin wrote:
> Eric, sorry for the confusion, I just had a bug in __reduce__.
> Look at the new files attached; your last use case is included in
> the test script.
>
> __reduce__ returns a 3-tuple: the first item is the type object, which
> pickle sees as a callable; the second is a tuple of arguments for that
> callable; and the third is the state.
>
> In general, if you implement __reduce__ like this
>
>     def __reduce__(self):
>         return function, args, state
>
> then at unpickle time, Python (2.5) does
>
>     function, args, state = unpickle(stream)
>     newobj = function(*args)
>     newobj.__setstate__(state)
>
> and then 'newobj' is the final recovered object.
>
> I hope I was clear enough. If you still need more complicated
> stuff, please just keep asking.
All this discussion will surelly help > in the future at the time Cython tries to implement something to > automatically manage all this beast. > > > > > On 5/27/08, Eric Jonas wrote: > > To start with, this is awesome. Thank you so much! > > > > Next question: What if I want MyPair to have a zero-argument > > constructor? (Well, besides self, that is). If I do: > > > > class MyPair(pair): > > def __init__(self): > > pass > > > > then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)". > > Does this mean that derived types must always have constructors with the > > same number (or more) than their base type? > > > > Thanks, > > > > ...Eric > > > > > > > > > > On Tue, 2008-05-27 at 15:47 -0300, Lisandro Dalcin wrote: > > > Well, I admit that the Python docs are a bit hard to follow, but let's > > > solve your issue. Look at the attached files, > > > > > > * mod.pyx: a simple 'cdef' class pair, like the C++ one > > > * test.py: a full test, all pickle protocols, includes subclassing > > > * cy2py: my lovely distutils based script for the process *.pyx->*.so > > > (no need of makefiles, improvements welcome!) > > > run like $ python cy2py mod.pyx . This way, you get mod.so. > > > > > > If you find this code useful, you can donate some dollars ... Just a > > > joke! But if you can make it a really good example, It would be great > > > to have this included in Cython docs, in order to help other doing > > > this, I just do not have the time. > > > > > > Enjoy! > > > > > > > > > On 5/27/08, Eric Jonas wrote: > > > > > > > > > But there is definitelly better ways. You sould look at the docs of > > > > > pickle module, and particularly to the __getinitargs__/__getnewargs__ > > > > > and the __getstate__/__setstate__ stuff. Iff I were to implement > > > > > pickle protocol, I would explicitely use that. 
> > > >
> > > > The Python pickle docs, which I've been struggling to understand, seem
> > > > to suggest that all extensions need either __reduce__ or
> > > > the copy_reg module. But it seems that __getstate__/__setstate__ are not
> > > > applicable to C extensions. Is this correct?
> > > >
> > > > Thanks,
> > > >
> > > > ...Eric

From dalcinl at gmail.com Wed May 28 01:45:44 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 27 May 2008 20:45:44 -0300
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211931079.17841.17.camel@convolution>
References: <1211893776.11042.4.camel@convolution>
 <1211898109.11042.14.camel@convolution>
 <1211900113.11042.21.camel@convolution>
 <1211906548.11042.24.camel@convolution>
 <1211921798.17841.6.camel@convolution>
 <1211931079.17841.17.camel@convolution>
Message-ID:

Tomorrow I'll give the details, but yes, you could return a dict from
__getstate__, and then in __setstate__ loop over the dict items and
call setattr(self, key, value). Perhaps that's enough for you; give it
a try!

On 5/27/08, Eric Jonas wrote:
> _amazing_
>
> Thank you so much.
>
> So one final question regarding additional state in derived classes.
> To simplify I've written a new bit of code for test.py:
>
> class MyPairExtra(pair):
>     def __init__(self, f=0, s=0):
>         self.first = f
>         self.second = s
>         self.extra = 1234
>         self.extra2 = 0
>
> p6 = MyPairExtra()
> p6.extra2 = 5678
>
> s6p2 = pkl.dumps(p6, 2)
>
> s6p2load = pkl.loads(s6p2)
> assert s6p2load.extra == 1234
> assert s6p2load.extra2 == 5678
>
> The last assertion currently fails. You'll note that the "extra"
> attribute is correctly recovered but the "extra2" attribute, which we set
> after construction to 5678, is not. Is there any way to pickle arbitrary
> extra object state?
>
> Thanks again,
>
> ...Eric
>
> On Tue, 2008-05-27 at 18:57 -0300, Lisandro Dalcin wrote:
> > Eric, sorry for the confusion, I just had a bug in __reduce__.
> > Look at the new files attached, your last use case is included in
> > the test script.
> >
> > __reduce__ returns a 3-tuple: the first item is the type object, which
> > pickle sees as a callable, the second is a tuple of arguments for that
> > callable, and the third is the state.
> >
> > In general, if you implement __reduce__ like this
> >
> >     def __reduce__(self):
> >         return function, args, state
> >
> > then at un-pickle time, Python (2.5) does
> >
> >     function, args, state = unpickle(stream)
> >     newobj = function(*args)
> >     newobj.__setstate__(state)
> >
> > and then 'newobj' is the final recovered object.
> >
> > I hope I was clear enough. If you still need more complicated
> > stuff, please just keep coming. All this discussion will surely help
> > in the future at the time Cython tries to implement something to
> > automatically manage all this beast.
> >
> > On 5/27/08, Eric Jonas wrote:
> > > To start with, this is awesome. Thank you so much!
> > >
> > > Next question: What if I want MyPair to have a zero-argument
> > > constructor? (Well, besides self, that is). If I do:
> > >
> > >     class MyPair(pair):
> > >         def __init__(self):
> > >             pass
> > >
> > > then I get "TypeError: ('__init__() takes exactly 1 argument (3 given)".
> > > Does this mean that derived types must always have constructors with the
> > > same number of arguments (or more) as their base type?
> > >
> > > Thanks,
> > >
> > > ...Eric
> > >
> > > On Tue, 2008-05-27 at 15:47 -0300, Lisandro Dalcin wrote:
> > > > Well, I admit that the Python docs are a bit hard to follow, but let's
> > > > solve your issue. Look at the attached files,
> > > >
> > > > * mod.pyx: a simple 'cdef' class pair, like the C++ one
> > > > * test.py: a full test, all pickle protocols, includes subclassing
> > > > * cy2py: my lovely distutils based script for the process *.pyx->*.so
> > > >   (no need of makefiles, improvements welcome!)
> > > >   run like $ python cy2py mod.pyx . This way, you get mod.so.
> > > >
> > > > If you find this code useful, you can donate some dollars ... Just a
> > > > joke! But if you can make it a really good example, it would be great
> > > > to have this included in the Cython docs, in order to help others doing
> > > > this, I just do not have the time.
> > > >
> > > > Enjoy!
> > > >
> > > > On 5/27/08, Eric Jonas wrote:
> > > > > > But there are definitely better ways. You should look at the docs of
> > > > > > the pickle module, and particularly at the __getinitargs__/__getnewargs__
> > > > > > and the __getstate__/__setstate__ stuff. If I were to implement
> > > > > > the pickle protocol, I would explicitly use that.
> > > > >
> > > > > The python pickle docs, which I've been struggling to understand, seem
> > > > > to suggest that __reduce__ is necessary for all extensions, or to use
> > > > > the copy_reg module. But it seems that __getstate__/__setstate__ are not
> > > > > applicable for c-extensions. Is this correct?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > ...Eric
> > > > >
> > > > > _______________________________________________
> > > > > Cython-dev mailing list
> > > > > Cython-dev at codespeak.net
> > > > > http://codespeak.net/mailman/listinfo/cython-dev

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Wed May 28 03:03:50 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 28 May 2008 13:03:50 +1200
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211898109.11042.14.camel@convolution>
References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution>
Message-ID: <483CAF76.8010106@canterbury.ac.nz>

Eric Jonas wrote:
> I'm a bit confused why
> you suggest that the reduce method should return the type as the first
> element of the tuple -- I thought it had to be a function?

I don't think it has to be strictly a function, just a callable object.
Probably the reason the docs call it a function is that before the
type/class unification, builtin types weren't classes, so you had to
define a function to create one.

However I agree with Lisandro that there are probably better ways than
using __reduce__ if you want it to work well with subclassing.

--
Greg

From greg.ewing at canterbury.ac.nz Wed May 28 04:02:04 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 28 May 2008 14:02:04 +1200
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> <1211921798.17841.6.camel@convolution>
Message-ID: <483CBD1C.5040804@canterbury.ac.nz>

Robert Bradshaw wrote:
> On this note, I think Cython should be able to provide a default
> __reduce__ that will "just work" for cdef classes, the way normal
> pickling works in Python without having to think about it.

This might be possible if the class's __cinit__ can be called with no
arguments and there are no C attributes without automatic conversions
to and from Python types. It would be dangerous to attempt it in any
other situation, I think.

--
Greg

From robertwb at math.washington.edu Wed May 28 04:29:09 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 27 May 2008 19:29:09 -0700
Subject: [Cython] Info on current dagss branch and my GSoC workflow
In-Reply-To: <2789.193.90.3.70.1211926812.squirrel@webmail.uio.no>
References: <483BF6CC.4000904@student.matnat.uio.no> <9A6E11A4-9E8B-4789-ABF3-B2457620959C@math.washington.edu> <2789.193.90.3.70.1211926812.squirrel@webmail.uio.no>
Message-ID:

On May 27, 2008, at 3:20 PM, Dag Sverre Seljebotn wrote:

>
>>> - Some fixes Robert pointed me to (for TreeFragment / ExprStatNode)
>>
>> The SubstitutionTransform still needs to be able to handle multiple
>> substitutions in one pass--e.g. to do stuff like a -> b, b -> c, and
>> c -> a.
>
> Ahh... I didn't have that goal at all, in fact I want to avoid this.
> (Or am I missing something here? Please elaborate, it's not clear to me
> what you want to achieve.)
>
> If my guess is right, perhaps the fix is a rename to TemplateTransform,
> and then we can have another SubstitutionTransform doing what you want
> if needed?
>
> Remember that the usecase is for TreeFragment:
>
> def complicated_create_assignment(varname, expr):
>     result = TreeFragment("LEFT = RIGHT").substitute({"LEFT" :
>         NameNode(varname), "RIGHT" : expr})
>
> So, if "RIGHT" is passed as varname, your suggestion would make "expr"
> appear on both sides, *not* what was intended! (and what happens if
> passing "LEFT" as varname -- infinite recursion?)

Looking at your code again, I can see that I totally mis-read your
code. For some reason it looked like you were only doing one
substitution at a time, but "substitute" is supposed to be a
dictionary so that's not the case at all. Sorry. On that note,
however, this should probably be renamed to "substitutions" to make it
clearer, and there should be at least some documentation. Other than
that, it looks good.

> (However, it should properly assign pos to substituted nodes (which it
> does not), I'll try to remember to do that next time I'm coding.)

Yes, for sure.
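
The difference between one-pass and one-at-a-time substitution discussed above can be sketched with a toy model. This is only an illustration on plain token lists; it does not use Cython's TreeFragment or SubstitutionTransform, and both function names are made up for the example.

```python
def substitute_simultaneous(tokens, mapping):
    """Replace every token once, looking it up in the original mapping only."""
    return [mapping.get(tok, tok) for tok in tokens]


def substitute_sequential(tokens, mapping):
    """Apply the rules one after another; later rules see earlier results."""
    result = list(tokens)
    for old, new in mapping.items():
        result = [new if tok == old else tok for tok in result]
    return result


# The cyclic case: a -> b, b -> c, c -> a.
cycle = {"a": "b", "b": "c", "c": "a"}
print(substitute_simultaneous(["a", "b", "c"], cycle))  # ['b', 'c', 'a']
print(substitute_sequential(["a", "b", "c"], cycle))    # ['a', 'a', 'a']

# The template case where the variable name happens to be "RIGHT".
template = ["LEFT", "=", "RIGHT"]
rules = {"LEFT": "RIGHT", "RIGHT": "expr"}
print(substitute_simultaneous(template, rules))  # ['RIGHT', '=', 'expr']
print(substitute_sequential(template, rules))    # ['expr', '=', 'expr']
```

With a dictionary applied in a single pass, a replacement is never itself substituted again, so neither the cycle nor the name clash causes trouble.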
- Robert

From jonas at MIT.EDU Wed May 28 14:47:02 2008
From: jonas at MIT.EDU (Eric Jonas)
Date: Wed, 28 May 2008 08:47:02 -0400
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To:
References: <1211893776.11042.4.camel@convolution> <1211898109.11042.14.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> <1211921798.17841.6.camel@convolution> <1211931079.17841.17.camel@convolution>
Message-ID: <1211978822.25646.5.camel@convolution>

My solution, which is currently working, was:

    def __reduce__(self):
        return type(self), (), self.__getstate__()

    def __getstate__(self):
        if hasattr(self, "__dict__"):
            return (self.inboundkey, self.inboundlinks, self.tags,
                    self.__dict__)
        else:
            return (self.inboundkey, self.inboundlinks, self.tags, {})

    def __setstate__(self, state):
        (self.inboundkey, self.inboundlinks, self.tags, tgtdict) = state
        for k, v in tgtdict.iteritems():
            setattr(self, k, v)

That is, if an object has a dict, take that dict and return it as part
of the state. Then when we unpickle just restore everything in that dict.

I'd love to hear if someone has a better way, but this is working great
right now. Thanks again to everyone who helped.

...Eric

On Tue, 2008-05-27 at 20:45 -0300, Lisandro Dalcin wrote:
> Tomorrow I'll give the details, but yes, you could return a dict from
> __getstate__, and then in __setstate__ loop over the dict items,
> calling setattr(self, key, value). Perhaps that's enough for you, give
> it a try!
>
> On 5/27/08, Eric Jonas wrote:
> > _amazing_
> >
> > Thank you so much.
> >
> > So one final question regarding additional state in derived classes.

From dalcinl at gmail.com Wed May 28 15:50:26 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 28 May 2008 10:50:26 -0300
Subject: [Cython] Cython, pickle, inheritance
In-Reply-To: <1211978822.25646.5.camel@convolution>
References: <1211893776.11042.4.camel@convolution> <1211900113.11042.21.camel@convolution> <1211906548.11042.24.camel@convolution> <1211921798.17841.6.camel@convolution> <1211931079.17841.17.camel@convolution> <1211978822.25646.5.camel@convolution>
Message-ID:

That's the way, yes. You did it.
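
The pattern Eric describes can be tried out in plain Python. The sketch below is only an approximation: `Pair` is a pure-Python stand-in that uses `__slots__` to imitate the C-level attributes of a cdef class, and the `rebuild` helper is a hypothetical name that replays by hand the three steps quoted earlier in the thread.

```python
import pickle


class Pair(object):
    # __slots__ keeps first/second out of __dict__, loosely imitating the
    # C attributes of a cdef class (an assumption made for this sketch).
    __slots__ = ("first", "second")

    def __init__(self, first=0, second=0):
        self.first = first
        self.second = second

    def __reduce__(self):
        # (callable, args, state): the callable is the actual (sub)class,
        # which must therefore accept a call with no arguments.
        return type(self), (), self.__getstate__()

    def __getstate__(self):
        # Plain Pair instances have no __dict__; Python subclasses do.
        return (self.first, self.second, getattr(self, "__dict__", {}))

    def __setstate__(self, state):
        self.first, self.second, extras = state
        for key, value in extras.items():
            setattr(self, key, value)


class MyPairExtra(Pair):
    def __init__(self, first=0, second=0):
        Pair.__init__(self, first, second)
        self.extra = 1234
        self.extra2 = 0


def rebuild(obj):
    """Replay by hand what the unpickler does with a reduce 3-tuple."""
    function, args, state = obj.__reduce__()
    newobj = function(*args)
    newobj.__setstate__(state)
    return newobj


if __name__ == "__main__":
    p = MyPairExtra(1, 2)
    p.extra2 = 5678
    q = pickle.loads(pickle.dumps(p, 2))
    print(q.first, q.second, q.extra, q.extra2)
```

Because the extra attributes travel in the state rather than in the constructor arguments, a value set after construction (such as `extra2`) survives the round trip.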
However, I would write __getstate__ like this, but that is just a matter
of personal style:

    def __getstate__(self):
        return (self.inboundkey, self.inboundlinks, self.tags,
                getattr(self, "__dict__", {}))

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dg at pnylab.com Wed May 28 19:23:38 2008
From: dg at pnylab.com (Dan Gindikin)
Date: Wed, 28 May 2008 17:23:38 +0000 (UTC)
Subject: [Cython] Cython, pickle, inheritance
References: <1211893776.11042.4.camel@convolution>
Message-ID:

Eric Jonas writes:

>
> Hello!
> I've been successfully using cython for the root objects in a
> class hierarchy to make some common operations much faster. I eventually
> figured out how to implement __reduce__ in these objects to support
> pickling. Unfortunately, when I inherit (in python) from my cython
> classes, and pickle an instance of the resulting derived object, the
> unpickle only appears to restore the base (cython) object. Does anyone
> know of any workarounds for this pickle? (bad pun)
>
> Thanks,
> ...Eric
>

Hi, I had to walk up this hill for pex, a preprocessor for cython, that
among other things generates pickling code for cdef classes. If you have
the following code:

------------
cdef class base:
    cdef int i
    cdef double d

cdef class derived(base):
    cdef char c
    cdef object ob
------------

here is an excerpt of what pex generates (sorry, it's machine generated, so
maybe not too readable). The type signature you can ignore, it is there for
paranoia in case the class definition has changed since pickling - at some
point I convinced myself it could lead to memory corruption.
pex_create_uninitialized() is the most salient bit - it allocates the right
type of object (base classes included), without actually calling __init__:

------------
cdef pex_create_uninitialized(type_obj):
    if not isinstance(type_obj,type):
        raise TypeError("Expecting a type object (eg 'item' from 'cdef class item'), not '%s' of %s"%(type_obj,type(type_obj)))
    cdef PyTypeObject *typeptr = type_obj
    h=typeptr[0].tp_new(typeptr,NULL,NULL)
    Py_DECREF(h) # need for some reason
    return h

def __px__pex_create_uninitialized_def_wrap(type):
    return pex_create_uninitialized(type)

cdef class base: ## 181_pickle_play.px,1
    def _typesig_(me):
        return (('double', 'd'), ('int', 'i'))
    def _todict_(me):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_dictcoercion=False"
        d = {
            'd': me.d,
            'i': me.i,
            }
        return d
    def _fromdict_(me,dict):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_dictcoercion=False"
        me.d = dict['d']
        me.i = dict['i']
    def __setstate__(me,state):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_pickle=False"
        type_signature,fields = state
        unpickled=type_signature
        expected=(('double', 'd'), ('int', 'i'))
        if unpickled <> expected: me._fromdict_(fields)
    def __reduce__(me):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_pickle=False"
        type_signature = \
            __px__type_signature_memoization_for_pickling.setdefault('base', (('double','d'), ('int', 'i')))
        return (__px__pex_create_uninitialized_def_wrap, (type(me),), (type_signature,me._todict_()) )

cdef class derived(base): ## 181_pickle_play.px,5
    def _typesig_(me):
        return base._typesig_(me) + (('char', 'c'), ('object', 'ob'))
    def _todict_(me):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_dictcoercion=False"
        d = {
            '_baseclass_':base._todict_(me),
            'c': me.c,
            'ob': me.ob,
            }
        return d
    def _fromdict_(me,dict):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_dictcoercion=False"
        base._fromdict_(me,dict['_baseclass_'])
        me.c = dict['c']
        me.ob = dict['ob']
    def __setstate__(me,state):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_pickle=False"
        type_signature,fields = state
        unpickled=type_signature
        expected=(('char', 'c'), ('object', 'ob'))
        if unpickled <> expected: me._fromdict_(fields)
    def __reduce__(me):
        ## made by pex, turn off generation with
        ## "%whencompiling: scope.pragma_gen_pickle=False"
        type_signature = \
            __px__type_signature_memoization_for_pickling.setdefault('derived', (('char','c'), ('object', 'ob')))
        return (__px__pex_create_uninitialized_def_wrap, (type(me),), (type_signature,me._todict_()) )

From whycode at gmail.com Thu May 29 06:50:21 2008
From: whycode at gmail.com (rahul garg)
Date: Wed, 28 May 2008 22:50:21 -0600
Subject: [Cython] new Python-to-C compiler announcement
Message-ID:

Hi.

As some of you already know, I have started a new Python-to-C compiler.
The new project is called unPython (earlier was called Spyke but people
confused it too much with skype).
A very prelim .. and very broken :) .. release is now at
http://www.cs.ualberta.ca/~garg1/unpython/
You are encouraged to download and play with the compiler. It's very
prelim and I would be very surprised if it compiles anything successfully.
If you do not want to play with very broken software, I plan to do a
saner release around mid June.
A mailing list has been set up at
http://groups.google.com/group/unpython-discuss/

thanks,
rahul
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080528/81a258a6/attachment.htm

From stefan_ml at behnel.de Thu May 29 08:55:43 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 29 May 2008 08:55:43 +0200
Subject: [Cython] Py3K: recent rename PyString -> PyBytes
In-Reply-To:
References: <483B26B0.6060409@behnel.de>
Message-ID: <483E536F.2080005@behnel.de>

Hi,

Lisandro Dalcin wrote:
> OK, I'm following the SVN Py3 trunk closely, then I'll be updating a
> local cython-devel repo, patched to follow SVN. This way, when the Py3
> C-API finally stabilizes, we will have the patches for Cython ready.

Actually, given that Py3a5 is broken in a couple of ways, and that the
first beta is supposed to be out in about a week, I don't feel obliged to
keep compatibility with the alpha version. So following SVN up to beta1 is
fine with me.

Do you have anything bundled yet that I can push?

Stefan

From martin at martincmartin.com Thu May 29 13:21:54 2008
From: martin at martincmartin.com (Martin C. Martin)
Date: Thu, 29 May 2008 07:21:54 -0400
Subject: [Cython] new Python-to-C compiler announcement
In-Reply-To:
References:
Message-ID: <483E91D2.1050806@martincmartin.com>

Hi Rahul,

Thanks for letting us know. How is it similar or different from Cython?

Best,
Martin

From whycode at gmail.com Thu May 29 13:58:18 2008
From: whycode at gmail.com (rahul garg)
Date: Thu, 29 May 2008 05:58:18 -0600
Subject: [Cython] new Python-to-C compiler announcement
In-Reply-To: <483E91D2.1050806@martincmartin.com>
References: <483E91D2.1050806@martincmartin.com>
Message-ID:

Hi.

Cython and unPython are actually fairly similar. Both produce C extension
modules.

unPython takes annotated python source as input and produces code for a C
extension module. You have to manually add type annotations to
functions/classes/methods etc. For functions, you add decorators with type
information.

Currently the difference is that the unPython annotations are pure Python
and all your code runs straight on the Python interpreter too if you want.
Cython has some constructs which are not Python, thus in general Cython
code does not run on the interpreter. (However I am told that in the long
term Cython also intends to be 100% compatible.)

Also the main objective of unPython is numpy so I will likely concentrate
on thorough coverage and compiler optimizations for numpy. Note that in
unPython, all ndarrays are typed, i.e. you have to specify the element type
and I translate all accesses into low level array access code when
compiling to C. Indexing, slicing and basic operators like +,-,* etc are
somewhat working. In the next 1-2 months, I should have support for slices,
looping over arrays, creating and using ufuncs and most ndarray methods.
Then focus will be on loop optimizations and loop directives.
Note also that unPython does not support arbitrary length integers and all
the numeric variables are converted to low level C counterparts such as
int, float, double etc. (long is missing currently).

Currently Cython is definitely way ahead of unPython in features as well as
stability but I hope to implement the most critical features soon. For
example, I don't even have support for lists, dict, tuple etc so it's very
early days for the project but it should be usable in 1-2 months.

Note on implementation: unPython is written in Java. Hopefully that will
not turn people away.

thanks,
rahul
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080529/656b3655/attachment.htm

From martin at martincmartin.com Thu May 29 14:12:21 2008
From: martin at martincmartin.com (Martin C. Martin)
Date: Thu, 29 May 2008 08:12:21 -0400
Subject: [Cython] new Python-to-C compiler announcement
In-Reply-To:
References: <483E91D2.1050806@martincmartin.com>
Message-ID: <483E9DA5.8040705@martincmartin.com>

Thanks Rahul. Is it true that in unPython, everything needs a type? Or are
type annotations optional?

It sounds like you're duplicating a lot of the work that's already gone
into Cython. Would it be quicker if you took the Cython source code and
modified it to accept types as annotations, and added your optimizations?
One approach that comes to mind is to add annotations to Cython, and
deprecate the cdef stuff.

It seems like your job would be a lot easier, and you'd help advance Cython
at the same time. What do you think?

Best,
Martin

From stefan_ml at behnel.de Thu May 29 14:23:19 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 29 May 2008 14:23:19 +0200
Subject: [Cython] new Python-to-C compiler announcement
In-Reply-To:
References: <483E91D2.1050806@martincmartin.com>
Message-ID: <483EA037.3060505@behnel.de>

Hi,

rahul garg wrote:
> Currently the difference is that the unPython annotations are pure Python

Does that mean Python 3 function parameter annotations? Or do you hide
these annotations in comments somehow?

Stefan

From whycode at gmail.com Thu May 29 14:42:33 2008
From: whycode at gmail.com (rahul garg)
Date: Thu, 29 May 2008 06:42:33 -0600
Subject: [Cython] new Python-to-C compiler announcement
In-Reply-To: <483EA037.3060505@behnel.de>
References: <483E91D2.1050806@martincmartin.com> <483EA037.3060505@behnel.de>
Message-ID:

> rahul garg wrote:
> > Currently the difference is that the unPython annotations are pure Python
>
> Does that mean Python 3 function parameter annotations? Or do you hide
> these annotations in comments somehow?

For functions, decorators are used. For classes, types of members need to
be specified in a string. For Python 3, I plan to move to function
annotations.
I believe Python 3 also has class decorators, but I haven't looked into it. thanks, rahul From whycode at gmail.com Thu May 29 14:42:50 2008 From: whycode at gmail.com (rahul garg) Date: Thu, 29 May 2008 06:42:50 -0600 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: <483E9DA5.8040705@martincmartin.com> References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> Message-ID: Hi. On Thu, May 29, 2008 at 6:12 AM, Martin C. Martin wrote: > Thanks Rahul. Is it true that in unPython, everything needs a type? Or > are type annotations optional? Function type signatures are mandatory. Local variable types are inferred. (There are a few cases where local type inference fails and I need to investigate them). > It sounds like you're duplicating a lot of the work that's already gone > into Cython. Would it be quicker if you took the Cython source code and > modified it to accept types as annotations, and add your optimizations? > One approach that comes to mind is to add annotations to Cython, and > deprecate the cdef stuff. It seems like your job would be a lot easier, > and you'd help advance Cython at the same time. I will look into it. thanks, rahul From stefan_ml at behnel.de Thu May 29 14:58:53 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 May 2008 14:58:53 +0200 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> Message-ID: <483EA88D.2020804@behnel.de> Hi, rahul garg wrote: > On Thu, May 29, 2008 at 6:12 AM, Martin C. 
Martin > wrote: >> It sounds like you're duplicating a lot of the work that's already gone >> into Cython. Would it be quicker if you took the Cython source code and >> modified it to accept types as annotations, and add your optimizations? > > I will look into it. Looking into this might also be a good idea: http://www.joelonsoftware.com/articles/fog0000000069.html Stefan From dalcinl at gmail.com Thu May 29 17:06:43 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 29 May 2008 12:06:43 -0300 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483E536F.2080005@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> Message-ID: On 5/29/08, Stefan Behnel wrote: > Actually, given that Py3a5 is broken in a couple of ways, and that the first > beta is supposed to be out in about a week, I don't feel obliged to keep > compatibility with the alpha version. So following SVN up to beta1 is fine > with me. Do you have anything bundled yet that I can push? Yes, just give me 4 or 5 hours, and I'll send you a patch. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Thu May 29 17:29:39 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 May 2008 17:29:39 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> Message-ID: <483ECBE3.6010200@behnel.de> Hi, Lisandro Dalcin wrote: > On 5/29/08, Stefan Behnel wrote: >> Actually, given that Py3a5 is broken in a couple of ways, and that the first >> beta is supposed to be out in about a week, I don't feel obliged to keep >> compatibility with the alpha version. 
So following SVN up to beta1 is fine >> with me. Do you have anything bundled yet that I can push? > > Yes, just give me 4 or 5 hours, and I'll send you a patch. Hmm, I noticed that the "stringobject.h" compatibility header is not yet in Py3, so it won't currently build against the SVN branch at all. Guess we'll still have to wait a couple of days until the debate on the Py3k list settles... Stefan From dalcinl at gmail.com Thu May 29 18:13:20 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 29 May 2008 13:13:20 -0300 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483ECBE3.6010200@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> Message-ID: Mmm.. look at my patch, I do not believe we need to rely on compatibility headers. I just introduced __Pyx_PyBytes_{As|From}String(). Of course, feel free to change those names if you find better ones. On 5/29/08, Stefan Behnel wrote: > Hi, > > > Lisandro Dalcin wrote: > > On 5/29/08, Stefan Behnel wrote: > >> Actually, given that Py3a5 is broken in a couple of ways, and that the first > >> beta is supposed to be out in about a week, I don't feel obliged to keep > >> compatibility with the alpha version. So following SVN up to beta1 is fine > >> with me. Do you have anything bundled yet that I can push? > > > > Yes, just give me 4 or 5 hours, and I'll send you a patch. > > > Hmm, I noticed that the "stringobject.h" compatibility header is not yet in > Py3, so it won't currently build against the SVN branch at all. Guess we'll > still have to wait a couple of days until the debate on the Py3k list settles... 
> > > Stefan -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: cython-str2bytes.patch Type: application/octet-stream Size: 2604 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20080529/f364aa35/attachment.obj From robertwb at math.washington.edu Thu May 29 20:46:19 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 29 May 2008 11:46:19 -0700 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> Message-ID: On May 29, 2008, at 9:13 AM, Lisandro Dalcin wrote: > Mmm.. look at my patch, I do not believe we need to rely on > compatibility headers. I just introduced > __Pyx_PyBytes_{As|From}String(). Of course, feel free to change those > names if you find better ones. > > On 5/29/08, Stefan Behnel wrote: >> Hi, >> >> >> Lisandro Dalcin wrote: >>> On 5/29/08, Stefan Behnel wrote: >>>> Actually, given that Py3a5 is broken in a couple of ways, and >>>> that the first >>>> beta is supposed to be out in about a week, I don't feel >>>> obliged to keep >>>> compatibility with the alpha version. So following SVN up to >>>> beta1 is fine >>>> with me. Do you have anything bundled yet that I can push? >>> >>> Yes, just give me 4 or 5 hours, and I'll send you a patch. 
>> >> >> Hmm, I noticed that the "stringobject.h" compatibility header is >> not yet in >> Py3, so it won't currently build against the SVN branch at all. >> Guess we'll >> still have to wait a couple of days until the debate on the Py3k >> list settles... >> I've been pretty quiet on the P3k stuff, but it looks like things are coming a long way! I'd like to push out another release sometime soon (next week?)--do you think things will be stable enough to do so by then? (I'd also like to get Dag's new code in.) Sage compiles and runs fine with the current devel branch. - Robert From stefan_ml at behnel.de Thu May 29 21:53:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 May 2008 21:53:14 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> Message-ID: <483F09AA.8020802@behnel.de> Hi, Lisandro Dalcin wrote: > Mmm.. look at my patch, I do not believe we need to rely on > compatibility headers. I just introduced > __Pyx_PyBytes_{As|From}String(). Of course, feel free to change that > name if you found better ones Ah, you're right. I was thinking too much in terms of lxml, which does a lot of different PyString API calls. I'm wondering, though: once the header file is in place, I think users might or might not want to #include it, and if they do, it will conflict with the definitions in Cython, which are currently generated before the #includes. So we might still end up including it completely to simplify the transition for user code. 
Stefan From stefan_ml at behnel.de Thu May 29 22:07:43 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 May 2008 22:07:43 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> Message-ID: <483F0D0F.2080002@behnel.de> Hi Robert, Robert Bradshaw wrote: > I've been pretty quiet on the P3k stuff, but it looks like things are > coming a long way! They definitely are. > I'd like to push out another release sometime soon > (next week?)--do you think things will be stable enough to do so by > then? I was already thinking about a new release, too. Having a release out before dev1 would remove the need to keep things releasable while everyone is hacking away. However, there is currently a bit of instability in the C-APIs of Py 2.6 and 3.0. So it would be good, on the other hand, to wait for the final release of the first beta versions, which is scheduled for next week (4th) anyway. That would give us a clean version that compiles code unchanged for Py2.3 through 3.0beta1. If things settle before that, we'll report back to the list. Here's the Py3 release schedule: http://www.python.org/dev/peps/pep-0361/ > (I'd also like to get Dag's new code in.) That shouldn't conflict. Things are pretty much finished, except for the unclear parts of the Py3 C-API. But that's a couple of #defines away. > Sage compiles and runs fine with the current devel branch. Nice. If you feel ambitious, you can try to build it with the last 2.6 alpha. It might not compile with 3.0 right away, but it should not be too hard to get it running on 2.6 for a start. 
Stefan From stefan_ml at behnel.de Thu May 29 22:17:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 29 May 2008 22:17:02 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483F09AA.8020802@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F09AA.8020802@behnel.de> Message-ID: <483F0F3E.102@behnel.de> Hi, Stefan Behnel wrote: > Lisandro Dalcin wrote: >> Mmm.. look at my patch, I do not believe we need to rely on >> compatibility headers. I just introduced >> __Pyx_PyBytes_{As|From}String(). Of course, feel free to change that >> name if you found better ones > > Ah, you're right. I was thinking too much in terms of lxml, which does a lot > of different PyString API calls. > > I'm wondering, though: once the header file is in place, I think users might > or might not want to #include it, and if they do, it will conflict with the > definitions in Cython, which are currently generated before the #includes. > So we might still end up including it completely to simplify the transition > for user code. Hrmpf, sorry, I should have read your patch more closely (pretty late over here...) You're not actually redefining the names, just using new names. That's the right way of doing it. I'll apply it tomorrow. Thanks, Stefan From wstein at gmail.com Thu May 29 23:41:40 2008 From: wstein at gmail.com (William Stein) Date: Thu, 29 May 2008 14:41:40 -0700 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: <483EA88D.2020804@behnel.de> References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> <483EA88D.2020804@behnel.de> Message-ID: <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> On Thu, May 29, 2008 at 5:58 AM, Stefan Behnel wrote: > Hi, > > rahul garg wrote: >> On Thu, May 29, 2008 at 6:12 AM, Martin C. Martin >> wrote: >>> It sounds like you're duplicating a lot of the work that's already gone >>> into Cython. 
Would it be quicker if you took the Cython source code and >>> modified it to accept types as annotations, and add your optimizations? >> >> I will look into it. > > Looking into this might also be a good idea: > > http://www.joelonsoftware.com/articles/fog0000000069.html Also, looking into this might also be a good idea too: http://blog.ianbicking.org/unzen-of-unpython.html William From robertwb at math.washington.edu Thu May 29 23:45:34 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 29 May 2008 14:45:34 -0700 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483F0D0F.2080002@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F0D0F.2080002@behnel.de> Message-ID: On May 29, 2008, at 1:07 PM, Stefan Behnel wrote: > Hi Robert, > > Robert Bradshaw wrote: >> I've been pretty quiet on the P3k stuff, but it looks like things are >> coming a long way! > > They definitely are. > > >> I'd like to push out another release sometime soon >> (next week?)--do you think things will be stable enough to do so by >> then? > > I was already thinking about a new release, too. Having a release > out before > dev1 would remove the need to keep things releasable while everyone > is hacking > away. My thoughts exactly. > However, there is currently a bit of instability in the C-APIs of > Py 2.6 and > 3.0. So it would be good, on the other hand, to wait for the final > release of > the first beta versions, which is scheduled for next week (4th) > anyway. That > would give us a clean version that compiles code unchanged for > Py2.3 through > 3.0beta1. If things settle before that, we'll report back to the list. Very cool. Right after the beta releases sounds like a good target. > Here's the Py3 release schedule: > > http://www.python.org/dev/peps/pep-0361/ > > >> (I'd also like to get Dag's new code in.) > > That shouldn't conflict. Yes, I just meant I'm waiting on that too. 
> Things are pretty much finished, except for the > unclear parts of the Py3 C-API. But that's a couple of #defines away. > > >> Sage compiles and runs fine with the current devel branch. > > Nice. If you feel ambitious, you can try to build it with the last > 2.6 alpha. > It might not compile with 3.0 right away, but it should not be too > hard to get > it running on 2.6 for a start. Porting Sage to 2.6 will likely be more than a one-week project... (though I hope not). - Robert From dalcinl at gmail.com Thu May 29 23:51:08 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 29 May 2008 18:51:08 -0300 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483F0F3E.102@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F09AA.8020802@behnel.de> <483F0F3E.102@behnel.de> Message-ID: On 5/29/08, Stefan Behnel wrote: > I just introduced > __Pyx_PyBytes_{As|From}String(). > > Hrmpf, sorry, I should have read your patch more closely (pretty late over > here...) Now you are the one not reading what I post ;-) -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From greg.ewing at canterbury.ac.nz Fri May 30 00:19:44 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 30 May 2008 10:19:44 +1200 Subject: [Cython] cdef extern struct from .h file In-Reply-To: <20080525192757.9e84b037.jim-crow@rambler.ru> References: <20080525192757.9e84b037.jim-crow@rambler.ru> Message-ID: <483F2C00.3080305@canterbury.ac.nz> Anatoly A. 
Kazantsev wrote: > In foo.h defined struct: > > struct some_struct { > int member; > } > > Then in bar.h: > > #include > typedef struct another_struct some_struct Assuming you meant typedef struct some_struct another_struct; then cdef extern from "bar.h": ctypedef struct another_struct: int member -- Greg From whycode at gmail.com Fri May 30 04:10:42 2008 From: whycode at gmail.com (rahul garg) Date: Thu, 29 May 2008 20:10:42 -0600 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> <483EA88D.2020804@behnel.de> <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> Message-ID: I was looking around inside Cython. I am not very clear on what Cython does internally. Some questions: a) Which parser is used? I use Python 2.5's compiler module and then dump ASTs into a file which is then read back in Java. b) Cython does not construct ASTs. Parse trees are used as the internal representation and Cython appears to be generating code directly from parse trees currently. Is this correct? I also see that there are proposals to add ASTs, construct a control flow graph, and add an explicit symbol table. c) When is type checking done? d) Does Cython have well-defined compiler passes? thanks, rahul On 5/29/08, William Stein wrote: > > On Thu, May 29, 2008 at 5:58 AM, Stefan Behnel > wrote: > > Hi, > > > > rahul garg wrote: > >> On Thu, May 29, 2008 at 6:12 AM, Martin C. Martin < > martin at martincmartin.com> > >> wrote: > >>> It sounds like you're duplicating a lot of the work that's already gone > >>> into Cython. Would it be quicker if you took the Cython source code > and > >>> modified it to accept types as annotations, and add your optimizations? > >> > >> I will look into it. 
> > > > Looking into this might also be a good idea: > > > > http://www.joelonsoftware.com/articles/fog0000000069.html > > > Also, looking into this might also be a good idea too: > > http://blog.ianbicking.org/unzen-of-unpython.html > > > William From robertwb at math.washington.edu Fri May 30 04:51:21 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 29 May 2008 19:51:21 -0700 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> <483EA88D.2020804@behnel.de> <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> Message-ID: On May 29, 2008, at 7:10 PM, rahul garg wrote: > I was looking around inside Cython. I am not very clear on what Cython > does internally. Some questions: > > a) Which parser is used? I use Python 2.5's compiler module and > then dump ASTs into a file which is then read back in Java. > It comes with its own custom parser. > b) Cython does not construct ASTs. Parse trees are used as the > internal representation and Cython appears to be generating code > directly from parse trees currently. Is this correct? I also see > that there are proposals to add ASTs, construct a control flow graph, > and add an explicit symbol table. > That is correct, though the resulting tree is not strictly the parse tree--the parser does some analysis, many of the leaf nodes (e.g. declarator nodes) get recorded as node attributes, and there is the possibility of tree transformations. > c) When is type checking done? I am not sure what you mean by this. On conversion (e.g. 
from a Python object to a C int) code is emitted that does runtime type checking. > d) Does Cython have well defined compiler passes? Yes. See the extensive comments at the top of Nodes.py and ExprNodes.py. - Robert From stefan_ml at behnel.de Fri May 30 07:11:34 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 30 May 2008 07:11:34 +0200 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: References: <483E91D2.1050806@martincmartin.com> Message-ID: <483F8C86.8030802@behnel.de> Hi, rahul garg wrote: > Note on implementation : unPython is written in Java. Hopefully that will > not turn people away. Regarding this bit: any reason you're using Java and not Jython? I remember the Jython project working on AST support a while ago, so that might be a good link for you. Might be easier than making sense out of the Python AST from Java. Stefan From whycode at gmail.com Fri May 30 08:06:25 2008 From: whycode at gmail.com (rahul garg) Date: Fri, 30 May 2008 00:06:25 -0600 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: <483F8C86.8030802@behnel.de> References: <483E91D2.1050806@martincmartin.com> <483F8C86.8030802@behnel.de> Message-ID: Hi. On Thu, May 29, 2008 at 11:11 PM, Stefan Behnel wrote: > Hi, > > rahul garg wrote: > > Note on implementation : unPython is written in Java. Hopefully that > will > > not turn people away. > > Regarding this bit: any reason you're using Java and not Jython? I remember > the Jython project working on AST support a while ago, so that might be a > good > link for you. Might be easier than making sense out of the Python AST from > Java. Yes, I will be looking at Jython's AST construction. For now, reading back Python's ASTs has not been too difficult actually. I simply wrote an ANTLR grammar for reading the dumped AST files. Fortunately ANTLR returns ASTs so my job became much simpler. 
Currently this grammar only deals with a subset of the actual Python grammar, but adding the missing stuff to the grammar is not hard. So AST construction is not a worry at the moment. I do intend to replace the frontend with something hopefully derived from Jython's frontend, but this is not the top priority. Side note on Java: Actually I started writing the compiler in Scala, but that didn't work out very well and I just wrote it in Java. Going forward, the plan is to replace the backend code generator with Jython. Currently for code generation I use the FreeMarker template library + some ad-hoc Java code, but I think the task is better suited for Jython. I have also been playing with Clojure and it's pretty cool too. The ability to play with different languages easily was one motivation for writing it in Java. thanks, rahul From dagss at student.matnat.uio.no Fri May 30 11:11:02 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 30 May 2008 11:11:02 +0200 Subject: [Cython] Source descriptor comparison Message-ID: <483FC4A6.6070809@student.matnat.uio.no> I noticed that filenames (now source descriptors in my branch) are used for dict lookups etc. -- rather than relying on object identity to hold I implemented this for source descriptors too. Currently I'll go ahead and emulate the old behaviour, i.e. just do string comparison on filenames (for FileSourceDescriptor). I'm kind of questioning this though -- one can refer to the same file in many ways (paths are not relative, however they can contain ".." and "."; and then come symlinks etc.). Is this something that could cause bugs, or should I just let it fly by? One possibility that comes to mind is refactoring all the absolute/relative path business into FileSourceDescriptor; i.e. 
FileSourceDescriptor can be passed relative filenames etc. and asked for both relative and non-relative filenames. This would eliminate the need for first making the path absolute and then strip it off again afterwards in relative_position (a scheme which would break if I canonicalized the filenames (i.e. process ".." and ".") for proper comparison). -- Dag Sverre From dagss at student.matnat.uio.no Fri May 30 11:46:54 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 30 May 2008 11:46:54 +0200 Subject: [Cython] Test framework questions Message-ID: <483FCD0E.3030809@student.matnat.uio.no> 1) What's the best way to integrate my unit tests for auto-running? I'd like something to the effect of for file in */Tests/*.py */*/Tests/*.py ...: do Python import of module corresponding to file call unittest to extract testcases and append to suite Surely there must be some utilities for doing this? But I didn't find anything. (If not I can write it up of course; I suppose I'll run them by default but add a flag for not running them). 2) Robert recently pointed out a bug in my code which was only present when Options.embed_pos_in_docstring was set to 1. Therefore, runtests.py did not catch it. Inserting: if WITH_CYTHON: ... + import Cython.Compiler.Options + Cython.Compiler.Options.embed_pos_in_docstring = 1 in runtests.py did allow me to work out the bug, however this did of course make all the docstring testcases fail. (If there's already code present to deal with testing this then just inform me about it and stop reading.) So what I'm thinking is that there should be some kind of mechanism for allowing integration tests to specify their compilation options. Something like: mytest.options: embed_pos_in_docstring = 1 Or perhaps better/more stable: mytest.options: --embed-positions Any thoughts? (Of course, any implementation burden would fall on me, the implementor.) 
(The latter would, to make it clean, require a refactoring of CmdLine.py to separate command line argument parsing from command line argument "execution". But CmdLine.py should probably be using optparse anyway, and that gives such a separation for free and one could simply pull the OptionParser into runtests.py for parsing the options-file.) -- Dag Sverre From dagss at student.matnat.uio.no Fri May 30 12:02:16 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 30 May 2008 12:02:16 +0200 Subject: [Cython] big_indices.pyx broken on 64 bit systems Message-ID: <483FD0A8.2040506@student.matnat.uio.no> Title says it all: File "/home/dagss/cython/may30/BUILD/run/big_indices.so", line 108, in big_indices Failed example: test() Expected: neg -1 pos 4294967294 neg pos neg pos Got: neg -1 pos 18446744073709551614 neg pos neg pos (I'd rather not fix test cases I didn't write and don't know in detail.) -- Dag Sverre From dagss at student.matnat.uio.no Fri May 30 12:14:09 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 30 May 2008 12:14:09 +0200 Subject: [Cython] Status update + how to build SAGE? In-Reply-To: <483FCD0E.3030809@student.matnat.uio.no> References: <483FCD0E.3030809@student.matnat.uio.no> Message-ID: <483FD371.1070402@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Any thoughts? (Of course, any implementation burden would fall on me, > the implementor.) (Heh, this didn't make sense. I meant "me, the proposer".) On the issue of the release: SourceDescriptors are a pretty bug-prone change which one could argue for merging right after doing a release rather than right before. Since no actual features depend on these for now, there's no real need for merging them now either. So I'm actually happy for my branch to be merged right after the release rather than right before. Though I suppose that if you consider compiling SAGE testing good enough it shouldn't matter (runtests.py is not sufficient). 
Anyway, the issues Robert's pointed out should be fixed: - Clearer TreeFragment.py code, and it now copies "pos" (and also copies the substitution arguments, in case they're used in more than one place. This was the simple solution, can optimize/refcount later if needed.) - Fixed the problem pointed out with Sage compilation. I didn't test compiling Sage yet though, I'd like some help with that (see below). - Other SourceDescriptor bugs and fixes. Which brings me to: Could anyone write a sentence or two (not much, just a few pointers) as to where the Cython code in SAGE is located, and the best way to plug in a custom Cython into its build system? I have absolutely no experience with SAGE, and at least the prospect of building all the spkgs seems wrong... -- Dag Sverre From stefan_ml at behnel.de Fri May 30 12:26:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 30 May 2008 12:26:08 +0200 Subject: [Cython] Source descriptor comparison In-Reply-To: <483FC4A6.6070809@student.matnat.uio.no> References: <483FC4A6.6070809@student.matnat.uio.no> Message-ID: <483FD640.4040300@behnel.de> Hi, Dag Sverre Seljebotn wrote: > FileSourceDescriptor can be passed relative filenames etc. and asked for > both relative and non-relative filenames. This would eliminate the need > for first making the path absolute and then strip it off again > afterwards in relative_position (a scheme which would break if I > canonicalized the filenames (i.e. process ".." and ".") for proper > comparison). We are caching files in a couple of places, so it would actually be correct to use normalised absolute file paths in general (independent of any FSDs). 
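A minimal sketch of the path normalisation discussed above, assuming a hypothetical helper named `canonical_path` (it is not a Cython function). It only illustrates how differently spelled paths to the same file can collapse to a single cache key:

```python
import os


def canonical_path(filename):
    # Hypothetical helper, not part of Cython.
    # realpath() makes the path absolute, collapses "." and ".." and
    # resolves symlinks; normcase() additionally folds case and
    # separators on Windows and leaves POSIX paths unchanged.
    return os.path.normcase(os.path.realpath(filename))


# Two spellings of the same (possibly non-existent) file share one key.
cache = {}
cache[canonical_path("pkg/sub/../module.pyx")] = "parsed tree"
print(canonical_path("./pkg/module.pyx") in cache)
```

Whether symlinks should be resolved (`realpath`) or only "." and ".." collapsed (`abspath`) is a design choice the thread leaves open.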
Stefan From stefan_ml at behnel.de Fri May 30 12:44:30 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 30 May 2008 12:44:30 +0200 Subject: [Cython] Test framework questions In-Reply-To: <483FCD0E.3030809@student.matnat.uio.no> References: <483FCD0E.3030809@student.matnat.uio.no> Message-ID: <483FDA8E.5070307@behnel.de> Hi, Dag Sverre Seljebotn wrote: > 1) What's the best way to integrate my unit tests for auto-running? We use a unit test runner script in lxml: http://codespeak.net/svn/lxml/trunk/test.py It's even Py3 clean by now (although that currently doesn't matter to Cython unit tests...). :) > 2) Robert recently pointed out a bug in my code which was only present > when Options.embed_pos_in_docstring was set to 1. Therefore, runtests.py > did not catch it. I thought about this issue, too, but didn't get around to work on it. Currently, there is no way to specify compiler options from inside a test case. Also, note that Cython.Compiler.Options is a global thing, so you'd have to reset the options after running the test. Calling dict(vars(Options)) in setUp() and resetting all attributes in tearDown() should take care of that, though. Something like __doc__ = u""" ... """ def test(): ... _COMPILER_OPTIONS=u""" embed_pos_in_docstring = 1 """ could be handled the same way _ERRORS is currently special cased. > (Of course, any implementation burden would fall on me, Of cause. :) Stefan From stefan_ml at behnel.de Fri May 30 13:24:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 30 May 2008 13:24:08 +0200 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F0D0F.2080002@behnel.de> Message-ID: <483FE3D8.7020203@behnel.de> Hi Robert, I'm currently fixing up your latest changes ("import *" etc.). You have to be a bit careful now with identifiers (including imported names). 
They are no longer byte strings in Py3, so calling PyString_AsString() on them will not work (besides the function not being available in Py3 anymore :) Also, code like code.putln('"%s",' % name) may not work in all cases. If we can't be sure it's a pure ASCII identifier (i.e. it was parsed as an identifier by the current Cython parser), the name has to be encoded as UTF-8 first and escaped using Utils.escape_byte_string(). I think this was ok in your source, so I'm just mentioning it in general. I'll try to write up a short howto in the Wiki when I get to it (unless someone else wants to start it first ...) Stefan From martin at martincmartin.com Fri May 30 13:48:32 2008 From: martin at martincmartin.com (Martin C. Martin) Date: Fri, 30 May 2008 07:48:32 -0400 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> <483EA88D.2020804@behnel.de> <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> Message-ID: <483FE990.8040700@martincmartin.com> Robert Bradshaw wrote: > On May 29, 2008, at 7:10 PM, rahul garg wrote: >> b) Cython does not construct ASTs Parse trees are used as the >> internal representation and Cython appears to be generating code >> directly from parse trees currently. Is this correct? I also see >> that there are proposals to add ASTs , construct control flow graph >> as well as for an explicit symbol table. >> > That is correct, though the resulting tree is not strictly the parse > tree--the parser does some analysis, many of the leaf nodes (e.g. > decelerator nodes) get recorded as node attributes, and there is the > possibility of tree transformations. How does it differ from an AST then? 
Best, Martin From dagss at student.matnat.uio.no Fri May 30 14:24:44 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 30 May 2008 14:24:44 +0200 Subject: [Cython] new Python-to-C compiler announcement In-Reply-To: <483FE990.8040700@martincmartin.com> References: <483E91D2.1050806@martincmartin.com> <483E9DA5.8040705@martincmartin.com> <483EA88D.2020804@behnel.de> <85e81ba30805291441g5c4f2de3i71faf624b63b9cd0@mail.gmail.com> <483FE990.8040700@martincmartin.com> Message-ID: <483FF20C.50202@student.matnat.uio.no> Martin C. Martin wrote: > Robert Bradshaw wrote: > >> On May 29, 2008, at 7:10 PM, rahul garg wrote: >> >>> b) Cython does not construct ASTs Parse trees are used as the >>> internal representation and Cython appears to be generating code >>> directly from parse trees currently. Is this correct? I also see >>> that there are proposals to add ASTs , construct control flow graph >>> as well as for an explicit symbol table. >>> >>> >> That is correct, though the resulting tree is not strictly the parse >> tree--the parser does some analysis, many of the leaf nodes (e.g. >> decelerator nodes) get recorded as node attributes, and there is the >> possibility of tree transformations. >> > > How does it differ from an AST then? > The Cython tree comes close, right after the parsing stage (at which point it is possible to do pseudo-AST transformations, though none is done currently). The difference is this: The tree that the Cython parser generates has a structure that is determined by how it is going to be output to C, and is not a direct tree representation of the Cython code. For instance, "x, y, z = a, b, c" will show up in the tree structure as (pseudo-code) "ParallellAssignment([{x = a}, {y = b}, {z = c}])" rather than what an AST would have, i.e. "ParallellAssignment(lhs=[{x}, {y}, {z}], rhs=[{a}, {b}, {c}])".
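The two tree shapes contrasted above can be sketched in plain Python. This is only an illustration under stated assumptions: the class names below are hypothetical stand-ins and are not Cython's actual node classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SingleAssignment:
    lhs: str
    rhs: str


@dataclass
class OutputOrientedParallelAssignment:
    """Shape driven by code generation: a list of pairwise assignments."""
    stats: List[SingleAssignment]


@dataclass
class AstParallelAssignment:
    """Shape mirroring the source text: one lhs list and one rhs list."""
    lhs: List[str]
    rhs: List[str]


def lower(node: AstParallelAssignment) -> OutputOrientedParallelAssignment:
    """Turn the source-shaped node into the output-shaped one."""
    if len(node.lhs) != len(node.rhs):
        raise ValueError("unbalanced parallel assignment")
    pairs = [SingleAssignment(l, r) for l, r in zip(node.lhs, node.rhs)]
    return OutputOrientedParallelAssignment(pairs)


# "x, y, z = a, b, c" in both representations
ast_node = AstParallelAssignment(lhs=["x", "y", "z"], rhs=["a", "b", "c"])
lowered = lower(ast_node)
```

In this sketch the pairing step is an explicit function applied to a source-shaped tree, which is roughly the parser/post-parse split the message goes on to suggest; a tree that is built directly in the output-oriented shape has that step baked into parsing.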
There are small cases of variations like this all over the place, so you generally cannot guess things about the tree structure from how the corresponding Cython code would look. Another example is that the Cython parser does some processing, for instance pulls in any "cimport"ed files/modules etc directly. A purer AST approach would just leave the "cimport" statements in place and leave it to code further down to do something about them (allowing transformations affecting cimports and so on). I guess it all boils down to the fact that the Cython parser isn't really a "parser", it's a parser + a lot of other stuff. Getting a real AST in Cython is not such a hard task (refactor Parsing.py into two components, putting all the non-parsing stuff in a new post-parse transform -- this can happen gradually and part by part), but it likely won't happen until there's a real benefit. When working on parsing Py3 code it could be a good time to look into doing some of this to streamline Parsing.py somewhat. Dag Sverre From stefan_ml at behnel.de Fri May 30 15:01:24 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 30 May 2008 15:01:24 +0200 Subject: [Cython] Test framework questions In-Reply-To: <483FDA8E.5070307@behnel.de> References: <483FCD0E.3030809@student.matnat.uio.no> <483FDA8E.5070307@behnel.de> Message-ID: <483FFAA4.7030109@behnel.de> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: >> (Of course [...] > > Of cause. :) Hmmm, looks like Gerald stoke again. http://www.yourdictionary.com/library/tough.html Stefan From wstein at gmail.com Fri May 30 15:23:47 2008 From: wstein at gmail.com (William Stein) Date: Fri, 30 May 2008 06:23:47 -0700 Subject: [Cython] Status update + how to build SAGE? 
In-Reply-To: <483FD371.1070402@student.matnat.uio.no> References: <483FCD0E.3030809@student.matnat.uio.no> <483FD371.1070402@student.matnat.uio.no> Message-ID: <85e81ba30805300623g4c0a5e7bg3bfd02610965913e@mail.gmail.com> On 5/30/08, Dag Sverre Seljebotn wrote: > - Other SourceDescriptor bugs and fixes. > > Which brings me to: Could anyone write a sentence or two (not much, just > a few pointers) as to where the Cython code in SAGE is located, and the > best way to plug in a custom Cython into its build system? I have > absolutely no experience with SAGE, and at least the prospect of > building all the spkgs seems wrong... If you want to test a custom Cython with Sage, do the following. 1. Download and install Sage from source or a binary. http://sagemath.org/download.html If from source, download the tarball here, extract it, and type make, then wait 2 hours: http://sagemath.org/dist/src/ 2. After Sage is done building, go to your personal Cython, and type sage -python setup.py install This will install *your* Cython into the Python that ships with Sage. 3. From the Sage root directory type ./sage -ba This will rebuild all the Cython code in Sage (well over 50,000 lines of hand-written Cython code), which should take about 15-20 minutes. 4. Assuming 3 finishes, from the Sage root directory type make check This will run the entire Sage test suite (over 50,000 input examples), and report the result. -- William From dalcinl at gmail.com Fri May 30 16:40:00 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 30 May 2008 11:40:00 -0300 Subject: [Cython] about let users have more control on garbage collection Message-ID: I've started the process of migrating another of my public projects (petsc4py) from the SWIG way to the Cython way. And again, as happened with mpi4py, this was easy and fast, and the final result is far, far better. But in this case I have a problem. I need Cython to let me hook into the calls to 'tp_visit' and 'tp_clear' supporting garbage collection.
In short, I want to be able to add some special methods to my objects, and then Cython would always generate gc support (regardless of whether my cdef class has Python attributes or not). In the generated tp_visit and tp_clear, Cython would work as it does currently, visiting/clearing Python attributes, but near the end of these functions it would call the user-defined special method. Would such an addition be accepted? Any suggestions on how to name such special methods? Perhaps the obvious ones: __visit__ and __clear__? Or perhaps better, __gc_visit__ and __gc_clear__? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From wstein at gmail.com Fri May 30 20:14:46 2008 From: wstein at gmail.com (William Stein) Date: Fri, 30 May 2008 11:14:46 -0700 Subject: [Cython] [Pyrex] How to prove better Pyrex performance? In-Reply-To: <66a8c45d0805301006se154dabwd0b5418bb089c930@mail.gmail.com> References: <66a8c45d0805301006se154dabwd0b5418bb089c930@mail.gmail.com> Message-ID: <85e81ba30805301114s79402617t3e9ca61fc77bb26b@mail.gmail.com> On Fri, May 30, 2008 at 10:06 AM, Daniele Pianu wrote: > I've to write a Python script that shows the better Pyrex performance > against Swig performance. I've to highlight the overhead due to > different layers of Swig wrappers, instead of Pyrex wrappers which > have only one layer made up of Python/C Api calls. I think I need to > write something like a wrapper of a C library that does mathematical > stuff. Could it be a good example? Every advice is welcome. > I think it is VERY important that you explain who your target audience is. Pyrex is vastly better than Swig for *certain* types of wrapping, but I think for other types of wrapping it doesn't matter much.
So, who exactly do you want to show this benchmark to? By the way, my personal experience with this was four years ago, when I wrote a C++ class to do arbitrary precision integer arithmetic using the GMP library. I then wrapped it in Swig, which meant that every object creation (e.g., multiplication) meant creating a pure python object that wrapped a C library object, etc. I then wrote code to accomplish the same thing in Pyrex, and that's when I realized Pyrex is brilliant. - William From dalcinl at gmail.com Fri May 30 20:58:46 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 30 May 2008 15:58:46 -0300 Subject: [Cython] about SWIG versus Cython/Pyrex Message-ID: I saw a mail from you, forwarded by W. Stein to the Cython-Devel list, saying that you are trying to measure performance for SWIG vs. Cython/Pyrex. Well, I'm a rather heavy SWIG user, and now, after about a month, I believe I'm a rather good Cython user. My main interest is parallel distributed computing with Python. For some time, I've been developing two very important (at least for me!) packages, mpi4py and petsc4py (just google for the link). For your objectives, I believe petsc4py could be interesting. The C API of PETSc is wrapped with SWIG, but then this stuff is used inside Python code to implement a fully OO, pythonic API. I've heavily hacked the way SWIG implements its infamous 'this'. I do not rely on this mechanism for passing objects between the Python and C layers; instead, I've implemented a full base type object in C; and all this crap is because the normal SWIG way was not enough for me. But now, after falling in love with Cython, I've started to port (in fact, write from scratch) all my work to use Cython. The petsc4py project is not yet ready for the public, but it is in good shape for testing. And now, the interesting part.
I've written some tests for my thesis presentation (it will be in about a month), where I've crafted some numerical tests in order to stress the overhead of passing objects back and forth from the C layer to the Python layer and vice versa. All the actual numerical computing is done in C or Fortran 90. And all runs are sequential, not parallel. This testing is done, with code and results available, but only for the former SWIG-based version of petsc4py. The whole problem is about using Krylov iterative methods for solving a linear system of equations arising from a finite-difference discretization of the 3D Poisson problem. BUT note that I never build a sparse matrix; instead I use a 'matrix-free' version which implements the matrix-vector product A*x -> y as a function F(x) -> y, thus being crude computing with Fortran 90 arrays. If you are interested in taking a look at this, I can send you some PDF pages describing all this testing. I can also send you the code implementing this testing. And if you want to go further, I can perhaps find some spare time to help you implement all this testing with the new Cython-based version of petsc4py that I'm writing. I expect that my SWIG implementation will run faster than a normal SWIG wrapper (as I've hacked it in many ways to get it faster). And I expect that the new Cython-based version is faster still. I really believe that this is a very good test case for comparing SWIG vs. Cython: as all the numerical computing is actually done in C, what you finally measure is the overhead of passing objects around.
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Fri May 30 23:03:19 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 30 May 2008 14:03:19 -0700 Subject: [Cython] Py3K: recent rename PyString -> PyBytes In-Reply-To: <483FE3D8.7020203@behnel.de> References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F0D0F.2080002@behnel.de> <483FE3D8.7020203@behnel.de> Message-ID: <1D4FE830-2596-4050-8FE4-4F5753893A57@math.washington.edu> On May 30, 2008, at 4:24 AM, Stefan Behnel wrote: > Hi Robert, > > I'm currently fixing up your latest changes ("import *" etc.). You > have to be > a bit careful now with identifiers (including imported names). They > are no > longer byte strings in Py3, so calling PyString_AsString() on them > will not > work (besides the function not being available in Py3 anymore :) Good point. > > Also, code like > > code.putln('"%s",' % name) > > may not work in all cases. If we can't be sure it's a pure ASCII > identifier > (i.e. it was parsed as an identifier by the current Cython parser), > the name > has to be encoded as UTF-8 first and escaped using > Utils.escape_byte_string(). On this note, what happens when a cdef variable (like above) has non-ASCII characters in it? > I think this was ok in your source, so I'm just mentioning it in > general. I'll > try to write up a short howto in the Wiki when I get to it (unless > someone > else wants to start it first ...) Yes, that'd be good. I knew there was some unicode stuff I might have to worry about here, but wasn't sure what at that point.
- Robert From robertwb at math.washington.edu Fri May 30 23:04:13 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 30 May 2008 14:04:13 -0700 Subject: [Cython] big_indices.pyx broken on 64 bit systems In-Reply-To: <483FD0A8.2040506@student.matnat.uio.no> References: <483FD0A8.2040506@student.matnat.uio.no> Message-ID: That's a 32 vs 64 bit issue...I'm fine with you fixing it (or I could do it myself). - Robert On May 30, 2008, at 3:02 AM, Dag Sverre Seljebotn wrote: > Title says it all: > > File "/home/dagss/cython/may30/BUILD/run/big_indices.so", line 108, in > big_indices > Failed example: > test() > Expected: > neg -1 > pos 4294967294 > neg > pos > neg > pos > Got: > neg -1 > pos 18446744073709551614 > neg > pos > neg > pos > > > (I rather not fix test cases I didn't write and know in detail.) > > > -- > Dag Sverre > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Fri May 30 23:10:33 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 30 May 2008 14:10:33 -0700 Subject: [Cython] big_indices.pyx broken on 64 bit systems In-Reply-To: References: <483FD0A8.2040506@student.matnat.uio.no> Message-ID: <94349E99-CC1C-4EC2-A24A-2E16A8F5D5A5@math.washington.edu> On May 30, 2008, at 2:04 PM, Robert Bradshaw wrote: > That's a 32 vs 64 bit issue...I'm fine with you fixing it (or I could > do it myself). What I mean by this is that the behavior is varying as expected on both platforms, the test should be re-written to be platform agnostic (e.g. by not printing pos itself, just pos > 0). 
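A minimal sketch of what such a platform-agnostic test could look like, written as a plain Python doctest. It is not the actual big_indices.pyx test: `ctypes.c_ulong` is used here only as a stand-in for a C unsigned long whose width differs between platforms, and `describe_indices` is a hypothetical name.

```python
import ctypes
import doctest


def describe_indices():
    """
    >>> describe_indices()
    neg -1
    pos > 0: True
    """
    neg = ctypes.c_long(-1).value
    # Wraps to 2**N - 2, where N is the platform's width of unsigned long,
    # so the exact value must not appear in the expected output.
    pos = ctypes.c_ulong(-2).value
    print("neg", neg)
    print("pos > 0:", pos > 0)


# Run the docstring example explicitly, without relying on module discovery.
parser = doctest.DocTestParser()
example = parser.get_doctest(
    describe_indices.__doc__,
    {"describe_indices": describe_indices},
    "describe_indices",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(example)
```

Printing only the comparison keeps the expected output identical whether the unsigned value is 4294967294 or 18446744073709551614.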
> > - Robert > > On May 30, 2008, at 3:02 AM, Dag Sverre Seljebotn wrote: > >> Title says it all: >> >> File "/home/dagss/cython/may30/BUILD/run/big_indices.so", line >> 108, in >> big_indices >> Failed example: >> test() >> Expected: >> neg -1 >> pos 4294967294 >> neg >> pos >> neg >> pos >> Got: >> neg -1 >> pos 18446744073709551614 >> neg >> pos >> neg >> pos >> >> >> (I rather not fix test cases I didn't write and know in detail.) >> >> >> -- >> Dag Sverre >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From dalcinl at gmail.com Fri May 30 23:43:36 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 30 May 2008 18:43:36 -0300 Subject: [Cython] SWIG and Cython In-Reply-To: <96de71860805301247t5c6e53e8x953734e26c1340de@mail.gmail.com> References: <96de71860805301247t5c6e53e8x953734e26c1340de@mail.gmail.com> Message-ID: Well, in the case of wrapping large, legacy C++ libraries, the choice is not easy; it depends on many factors. I'll try to elaborate. For something like MPI and PETSc, in the case of mpi4py and petsc4py, which are both being converted to Cython, the option is really clear for me: just use Cython, do not bother with SWIG. Why use Cython? Their native APIs are just C (well, MPI has a standard C++ API, but it is not much more featured than the C one). Wrapping a C API and providing just C-ish functions at the Python level is unpythonic and as painful to use as plain C. You really need a truly OO Python API; if not, your Python-level API is just as much crap as the C one, hard to use, and perhaps the only thing you alleviate is error checking.
Building an OO API targeting Python by using Cython and calling your legacy library's C API is just easier, faster, and more convenient than using SWIG and then using the SWIG-generated code to implement a better-looking, high-level, pythonic API for Python-level consumption. In the case of a large C++ API, it depends on how much of that API it makes sense to expose to Python. Let's suppose we have an API with a lot of classes and a lot of methods (let's say 10, 20, 30 classes, with 10, 10, 30 methods each). Then wrapping the full API with Cython in its current state is going to be a real pain, with too much manual work. BUT, if you are going to use only a few of your classes, and only expose a few of your methods, or even want to expose new methods that are more convenient to use on the Python side, then, again, I would definitely choose Cython. My personal experience in going from SWIG to Cython is worth taking into account: implementing really good, robust, full-featured, pythonic access to the full PETSc functionality for petsc4py took me about two years of careful writing of typemaps, implementing Python extension types BY HAND, and writing a lot of C code inside custom typemaps, with Python and Numpy C API calls plus PETSc calls. Then, after the success with Cython in the mpi4py case, I started the process of rewriting petsc4py from scratch, working at night, let's say from 9:00 PM until 1:00 AM or 2:00 AM. In about 10 days, yes, JUST 10 DAYS of night work, I'm close to having implemented all the features of the previous implementation, and the new implementation is actually better and has more features... In short, if you really, really, really want to wrap the FULL C++ library, then the Cython way will be harder than the SWIG way. But I really doubt that you need to access the FULL C++ API. But one thing you should take for granted: the Cython way will let you write better-looking, more easily maintainable source code. That's all, folks!
Regards, On 5/30/08, Fernando Perez wrote: > [ Lisandro, I'm cc'ing you here because we've just had some > discussions with folks from Los Alamos NL, Berkeley and UC Santa Cruz > on the issue of SWIG vs. other options for wrapping a large C++ > library (a high performance computer vision library, in this case). I > suggested SWIG due to its maturity for automatic wrapping (despite the > pain of typemaps) but your recent experience in transitioning away > from it is in my view a critically useful piece of information at this > point] > > Hi folks, > > > just to give you a counterpoint from what I said yesterday. While > SWIG has the advantage of allowing automatic wrapping of large > libraries, it is also true that it is unwieldy in some ways, and the > two-layer overhead is impossible to escape. Cython continues to > improve for this type of tasks, and I think the message below is worth > reading carefully before you embark in any long SWIG exercise. > Lisandro (CC'd here) is an extremely knowledgeable developer and both > Petsc and MPI are large, complex projects (Petsc comes from Argonne > NL). The fact that his recent experiences with Cython wrapping have > been so good is in my view an important data point. > > If you begin looking in this direction, the Cython mailing list is a > very helpful resource. You can also download the mpi4py sources to > have a look at what Lisandro has done. > > Best, > > f > > ---------- Forwarded message ---------- > From: Lisandro Dalcin > Date: Fri, May 30, 2008 at 11:58 AM > Subject: [Cython] about SWIG versus Cython/Pyrex > To: Daniele Pianu > Cc: Cython-dev > > > I saw a mail from you, forwarded by W. Stein to Cython-Devel list, > about you are trying to measure performance for SWIG vs. Cython/Pyrex. > > Well, I'm a rather power SWIG user, and now, after about a month I > believe I'm a rather good Cython user. > > My main insterest is parallel distributed computing with Python. 
From > some time, I've been developing two very important (at least for me!) > packages, mpi4py and petsc4py (just google for the link). > > For your objectives, I believe petsc4py could be interesting. The C > API of PETSc is wrapped with SWIG, but then this stuff is used inside > a Python code to implement a fully OO, pythonic API. I've heavily > hacked in the way SWIG implements its infamous 'this'. I do not rely > on this mechanism for passing object between Python and C layers, > instead, I've implemented a full base type object in C; and all this > crap is because the normal SWIG way was not enough for me. > > But now, after fall in love with Cython, I've started to port (in > fact, write from scratch) all my work to use Cython. The petsc4py > project is not yet ready for the public, but it is in good shape for > testing. > > And now, the interesting part. I've wrote some testing for my thesis > presentation (it will be in about a month), where I've crafted some > numerical tests in order to stress the overhead of passing back a > forth objects from the C layer to the Python layer and viceversa. All > the actual numericall computing is done in C or Fortran 90. And all > runs are sequential, not parallel. > > This testing is done, code and results available, but only for the > former SWIG based version of petsc4py. The whole problem is related to > use Krylov iterative methods for solving a linear system of equations > arising from a finite-diferences discretization of the 3D Poisson > problem. BUT note that I never build a sparse matrix, instead I use a > 'matrix-free' version with implements the matrix-vector product A*x -> > y as a function F(x) -> y, thus being crude computing with Fortran 90 > arrays. > > If you are interested in take a look at this, I can send you some PDF > pages describing all this testing. I can also send you the code > implementing this testing. 
And if you want to go further, I perhaps > can find some spare time to help you implement all this testing with > the new Cython-based version of petsc4py that I'm writting. > > I expect that my SWIG implementation will run faster that a normal > SWIG wrapper (as I've hacked on many ways to ge it faster). And I > expect that the new Cython based version is still faster. > > I really believe that this is a very good test case for comparing SWIG > vs. Cython, as all the numerical computing is actully done in C, then > you finally measure the overhead of passing object around. > > > -- > Lisandro Dalcín > --------------- > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From whycode at gmail.com Fri May 30 23:51:46 2008 From: whycode at gmail.com (rahul garg) Date: Fri, 30 May 2008 15:51:46 -0600 Subject: [Cython] SWIG and Cython In-Reply-To: References: <96de71860805301247t5c6e53e8x953734e26c1340de@mail.gmail.com> Message-ID: I am a little curious about ctypes. Have you looked at ctypes and what advantages/disadvantages do you think it has over SWIG and Cython for wrapping the libraries you are interested in? Of course ctypes doesn't work for C++ but any other specific problems? rahul -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080530/9957c35e/attachment.htm From dalcinl at gmail.com Sat May 31 00:20:28 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 30 May 2008 19:20:28 -0300 Subject: [Cython] SWIG and Cython In-Reply-To: References: <96de71860805301247t5c6e53e8x953734e26c1340de@mail.gmail.com> Message-ID: On 5/30/08, rahul garg wrote: > I am a little curious about ctypes. Have you looked at ctypes and what > advantages/disadvantages do you think it has over SWIG and Cython for > wrapping the libraries you are interested in? Of course ctypes doesnt work > for C++ but any other specific problems? Well, a C library API is not just functions and data exported by the linker. It is also #define'd macros, typedef'd stuff, custom C structs with many slots. If you want to write a full-featured wrapper for a complex library, the ctypes way is going to be even more painful than SWIG or Cython. Suppose you want to access, in the MPI case, MPI_COMM_WORLD. Look at the header files of the major MPIs out there (MPICH2, OpenMPI, HP MPI, IBM MPI): in all cases MPI_COMM_WORLD is a #define'd thing, in some cases an integer, in others a pointer. And that applies to all MPI predefined handles (operations, datatypes, etc.) and other stuff. You just cannot access and manage a full C API in ctypes; at least, I do not know how to handle that. For example, the new Cython-based mpi4py can be built against Python versions ranging from 2.3 to 3.0 (yes, it is Py3K ready!), and with all the major MPIs out there, whether they are MPI-1 or MPI-2 implementations, on Linux, Mac OS X, and Windows. I'll never buy the ctypes way until someone can show me that this degree of portability is possible. Even if someone can do this with ctypes, I guess such code would not be easy to understand and maintain. For the case of petsc4py, the same comments apply. The PETSc C API does have macros.
Furthermore, its API changes from release to release (the PETSc developers do not bother with backward compatibility issues). But petsc4py, either the former SWIG-based or the newer Cython-based version, can be used with petsc-2.3.2, petsc-2.3.3, or the latest development copy. All this with a layer of compatibility functions and macros. How in the hell is a serious developer going to handle all that with ctypes? Please note that I have nothing against ctypes. It is a wonderful and very good tool, but you cannot use it in all scenarios. However, if the libraries are all C functions, and they do not use macros in any way, then ctypes is still a good option: no compilation needed, pure Python code. Of course, be prepared to get a segfault from time to time, not because of a ctypes bug, but because of your own bugs. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From tgrav at mac.com Sat May 31 03:22:05 2008 From: tgrav at mac.com (Tommy Grav) Date: Fri, 30 May 2008 21:22:05 -0400 Subject: [Cython] Problems with example Message-ID: Hi everyone. I am on a Powerbook running 10.5.3 and trying to go through the example on http://www.perrygeo.net/wordpress/?p=116 , which fails. Can anyone tell me what I am doing wrong?
I create the c1.pxd and run cython c1.pyx gcc -c -fPIC -I/usr/include/python2.5 c1.c gcc -dynamiclib -o c1.dylib -dylib c1.o Undefined symbols: "_PyDict_New", referenced from: ___Pyx_Import in c1.o "_PyNumber_Multiply", referenced from: ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o "_PyObject_SetAttrString", referenced from: _initc1 in c1.o "_PyObject_SetAttr", referenced from: _initc1 in c1.o "_PyFloat_AsDouble", referenced from: ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o . . . "_PyTuple_New", referenced from: ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o ___pyx_pf_2c1_great_circle in c1.o _initc1 in c1.o ld: symbol(s) not found collect2: ld returned 1 exit status From tgrav at mac.com Sat May 31 17:43:35 2008 From: tgrav at mac.com (Tommy Grav) Date: Sat, 31 May 2008 11:43:35 -0400 Subject: [Cython] Problems with example In-Reply-To: References: Message-ID: I found the solution: gcc -dynamiclib -undefined dynamic_lookup -single_module -o c1.so c1.o Note that python will not find libraries called *.dylib in your work directory, while *.so works fine. Does anyone know why? Cheers Tommy On May 30, 2008, at 9:22 PM, Tommy Grav wrote: > Hi everyone. > > I am on a Powerbook running 10.5.3 and trying to go through the > example on > http://www.perrygeo.net/wordpress/?p=116 , which fails. Can anyone > tell me what I am doing wrong?
>
> I create the c1.pxd
> and run
>
> cython c1.pyx
> gcc -c -fPIC -I/usr/include/python2.5 c1.c
> gcc -dynamiclib -o c1.dylib -dylib c1.o
> Undefined symbols:
>   "_PyDict_New", referenced from:
>       ___Pyx_Import in c1.o
>
> ld: symbol(s) not found
> collect2: ld returned 1 exit status
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20080531/09c18449/attachment.htm

From stefan_ml at behnel.de Sat May 31 19:29:34 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 31 May 2008 19:29:34 +0200
Subject: [Cython] Py3K: recent rename PyString -> PyBytes
In-Reply-To: <1D4FE830-2596-4050-8FE4-4F5753893A57@math.washington.edu>
References: <483B26B0.6060409@behnel.de> <483E536F.2080005@behnel.de> <483ECBE3.6010200@behnel.de> <483F0D0F.2080002@behnel.de> <483FE3D8.7020203@behnel.de> <1D4FE830-2596-4050-8FE4-4F5753893A57@math.washington.edu>
Message-ID: <48418AFE.2080303@behnel.de>

Hi,

Robert Bradshaw wrote:
> On May 30, 2008, at 4:24 AM, Stefan Behnel wrote:
>> Also, code like
>>
>>     code.putln('"%s",' % name)
>>
>> may not work in all cases. If we can't be sure it's a pure ASCII
>> identifier (i.e. it was parsed as an identifier by the current
>> Cython parser), the name has to be encoded as UTF-8 first and
>> escaped using Utils.escape_byte_string().
>
> On this note, what happens when a cdef variable (like above) has
> non-ASCII characters in it?

Not sure what you mean here. Do you mean in its name? That can't
currently happen, as the scanner only allows pure ASCII alphanumeric
identifiers. I don't think we should support PEP 3131 in Cython; the
mapping to C identifiers would become too obscure.
http://www.python.org/dev/peps/pep-3131/

Keyword arguments are a different thing, but allowing non-ASCII
keywords in function signatures will require us to write our own
ParseTupleAndKeywords() to keep up compatibility with Py2, so I don't
find that worth supporting either (for now). Although having our own
PTAK(), where you could simply pass Python strings as the list of
acceptable keywords, would be helpful in general...

In case you were asking about string values: any non-ASCII or \0
characters in the encoded byte string are backslash-escaped in the C
source, so that the C compiler creates the correct byte sequence in
any case. That byte sequence is then passed to the respective Python
API functions at runtime to decode it into a Python string.

If you meant something else, feel free to clarify. :)

Stefan

From stefan_ml at behnel.de Sat May 31 19:37:13 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 31 May 2008 19:37:13 +0200
Subject: [Cython] about let users have more control on garbage collection
In-Reply-To:
References:
Message-ID: <48418CC9.2030103@behnel.de>

Hi,

Lisandro Dalcin wrote:
> In short, I want to be able to add some special method to my objects,
> and then Cython will always generate gc support (regardless of whether
> my cdef class has Python attributes or not). In the generated
> tp_visit and tp_clear, Cython works as it currently does,
> visiting/clearing Python attributes, but near the end of these
> functions it calls the user-defined special method.

Could you explain a bit what you are going to do in these methods?

Stefan

From dalcinl at gmail.com Sat May 31 23:49:48 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Sat, 31 May 2008 18:49:48 -0300
Subject: [Cython] about let users have more control on garbage collection
In-Reply-To: <48418CC9.2030103@behnel.de>
References: <48418CC9.2030103@behnel.de>
Message-ID:

Sure, I'll elaborate a bit more...

PETSc objects are refcounted; however, there are no gc features in
PETSc.
I have classes that are proxies to the actual PETSc objects, and
the methods of those classes internally call the PETSc C API. From now
on, when I say 'PETSc object', I mean a pointer to an opaque C
structure; when I say 'proxy class', I mean a 'cdef' class holding a
PETSc object, like this:

cdef class Mat: # <- proxy class
    cdef PetscMat # <- PETSc object, a pointer to a refcounted struct.

I need to store arbitrary Python objects inside the PETSc object; I'm
actually storing a dictionary. This dictionary has to survive until
the PETSc object is deallocated, but that deallocation does not
necessarily occur when the instance of my proxy class is deallocated,
because the PETSc object can still 'live' (that is, a reference to it
is still owned) inside other PETSc objects.

Then, as the PETSc object contains a Python dictionary, I need to
'visit' that dictionary in the 'tp_visit' of my proxy class (of
course, I first need to make a call to get the dict from the PETSc
object, then visit it). Furthermore, I need to release ownership of
(decref) my PETSc object inside the 'tp_clear' of my proxy class.

I implemented this machinery some time ago for my SWIG-based
implementation, in a hand-written C type object, and it all works just
fine. I do not know if there is any other way of doing this. Without
it, I end up 'leaking' PETSc objects, some of them with big memory
footprints.

Perhaps you think that my approach is over-complicated. But my intent
is that in the near future petsc4py will be more than a PETSc wrapper
for Python-level consumption. I want Python to become an extension
language for PETSc, where a user with codes in C/C++/Fortran can
implement some stuff, like the loops of a custom nonlinear solver, in
Python, and then use it from C/C++/Fortran by just passing a command
line flag. This feature is working in the former SWIG-based version;
now I want the new Cython-based version to become even more featured.

I hope I was clear enough about my needs; it's not easy to explain.
Perhaps, in short, we should think about this scenario:

* We have an object-oriented C/C++ library which lets you put
  arbitrary data inside some refcounted C structs or C++ classes.

* Then you can put any Python object in the C/C++ level objects; in
  particular, you can put mutable containers, like a list or a dict.

* Now you want to wrap that library with Cython, by implementing
  classes that are proxies to the C/C++ objects.

Then, the question is: how do you avoid circular references? Short
answer: you cannot. You need gc to mitigate the problem, and for that
your proxies need to be able to traverse the containers stored inside
the C/C++ level object.

Is there any other way?

On 5/31/08, Stefan Behnel wrote:
> Hi,
>
>
> Lisandro Dalcin wrote:
> > In short, I want to be able to add some special method to my objects,
> > and then Cython will always generate gc support (regardless of whether
> > my cdef class has Python attributes or not). In the generated
> > tp_visit and tp_clear, Cython works as it currently does,
> > visiting/clearing Python attributes, but near the end of these
> > functions it calls the user-defined special method.
>
>
> Could you explain a bit what you are going to do in these methods?
>
> Stefan
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dss at fredtun.no Mon May 12 12:03:51 2008
From: dss at fredtun.no (Dag Sverre Seljebotn)
Date: Mon, 12 May 2008 10:03:51 -0000
Subject: [Cython] Visitors and compiling Cython in Cython
In-Reply-To: <482812C3.1040908@behnel.de>
References: <4828066E.6040307@student.matnat.uio.no> <482812C3.1040908@behnel.de>
Message-ID: <48281634.9060200@fredtun.no>

Stefan Behnel wrote:
> Hi,
>
> Dag Sverre Seljebotn wrote:
>> Gary recently voiced concern over dict-based visitor lookups perhaps
>> giving a speed penalty
>
> Why? You could use lazy initialisation. Just look up a node type in
> the dict (which is fast) and if it's not in there yet, walk its base
> types (adding each one to the dict) until there is one that already
> is in the dict, which then determines the result for the lookup and
> for the newly added base types.

Sure, sure, that's already being done and in fact you'll see that
*exact* procedure in my patch! :-) The issue Gary has is indeed with
raw dict lookup performance. (Consider that this will happen between
10 and 50 times for every single node in the tree of a file.)

I'll quote Gary so you have the context:

"""
Efficiency does matter to Sage though, and an O(1) overhead is very
far from negligible. Cython uses real C vtables, so there are
absolutely no dict-based lookups involved with cdef class
polymorphism.

-Infinity to any proposal that makes code slower.

Well, clarifying what I meant a bit more, the *biggest* speed loss
anywhere is dictionary lookups.
If you're going to use dictionary lookups at object time, you lose
most of the advantage of using cython to compile cython.
"""

Apparently, you don't agree with Gary here :-) Since this will have a
big effect on all of Cython, I thought I'd investigate it properly
before truncating Gary's -Infinity to -1 :-)

--
Dag Sverre

From wstein at gmail.com Mon May 12 16:46:24 2008
From: wstein at gmail.com (William Stein)
Date: Mon, 12 May 2008 14:46:24 -0000
Subject: [Cython] assigning to struct member
In-Reply-To: <48284649.9030709@semipol.de>
References: <48284649.9030709@semipol.de>
Message-ID: <85e81ba30805120744o69a8dfeci8c9908152538af83@mail.gmail.com>

On Mon, May 12, 2008 at 6:29 AM, Johannes Wienke wrote:
> Hi again,
>
> maybe I'm blind or I don't know but how do I assign a value to a struct
> member?
>
> This code:
>     cdef plugData *data = <plugData *>malloc(sizeof(plugData))
>     data.ident = "foo"
>
> with this definition of plugData:
>     ctypedef struct plugData:
>         plugDefinition *plug
>         char *ident
>         void *data
>
> generates an error:
> "Object of type 'plugData' has no attribute 'ident'"
>
> What's wrong with this?
>

Nothing? This code compiles and runs fine in Cython-0.9.6.13 (see
attached screenshot). Note that I deleted plugDefinition since you
didn't define that above.

ctypedef struct plugData:
    char *ident
    void *data

def foo():
    cdef plugData *data = <plugData *>malloc(sizeof(plugData))
    data.ident = "foo"

 -- William

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture 2.png
Type: image/png
Size: 37358 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20080512/095cda01/attachment-0001.png

From sable at users.sourceforge.net Tue May 13 11:00:15 2008
From: sable at users.sourceforge.net (Sébastien Sablé)
Date: Tue, 13 May 2008 09:00:15 -0000
Subject: [Cython] [Pyrex] ANN: Pyrex 0.9.7.1
In-Reply-To: <48294510.5060407@canterbury.ac.nz>
References: <48294510.5060407@canterbury.ac.nz>
Message-ID: <4829582C.4000905@users.sourceforge.net>

Hi Greg,

I tried both Pyrex 0.9.7 and 0.9.7.1, but I get an error which didn't
occur with previous versions:

TypeError: 'dict' object does not support item assignment

The code where the problem happens looks like this:

cdef reflect_bind(wrapped, void *addr, object o):
    wrapped[addr] = o

By the way, thanks for Pyrex, it is a great tool; we are using it to
migrate a big C application to Python and Pyrex is making our life a
lot easier.

Thanks in advance

--
Sébastien Sablé

Greg Ewing wrote:
> Pyrex 0.9.7.1 is now available:
>
> http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
>
> This version fixes a bug in the new integer indexing
> optimisation which causes indexing of a non-sequence type
> with a C int to fail with a TypeError.
>
> What is Pyrex?
> --------------
>
> Pyrex is a language for writing Python extension modules.
> It lets you freely mix operations on Python and C data, with
> all Python reference counting and error checking handled
> automatically.
>
> _______________________________________________
> Pyrex mailing list
> Pyrex at lists.copyleft.no
> http://lists.copyleft.no/mailman/listinfo/pyrex

From venky at facebook.com Fri May 23 01:15:43 2008
From: venky at facebook.com (Venky Iyer)
Date: Thu, 22 May 2008 23:15:43 -0000
Subject: [Cython] Casting python objects to void* and back
Message-ID:

(cross-posted to pyrex/cython, apologies if this is bad behavior)

I've run into a number of issues with casting Python objects to void*
and back. Ultimately the reason I want to do this is to allow C
functions to call Python functions as callbacks; passing Python
callables as void* to C is a pattern that has been discussed on these
lists multiple times. However, in addition to Python callables, I want
to figure out how I can pass other Python objects to the Python
callback through the void* trick.

Here is some test code for this:

---------
void.pyx
---------

cdef void* pack_tuple( int i, int j ):
    t = ( i, j )
    return <void*>t

cdef void* pack_dict( int i, int j ):
    d = { 'i': i, 'j': j }
    return <void*>d

cdef unpack_tuple( void* t ):
    u = <object>t
    print u
    print u[0]
    print u[1]
    # (, 10)
    # (, )
    # 10

cdef unpack_dict( void* d ):
    u = <object>d
    # print u
    # {'Py_Repr': [{...}, [...]]}
    # Fatal Python error: GC object already tracked
    # Aborted
    print u['i']
    print u['j']
    # 5
    # 10

cpdef run_tuple(int i, int j):
    print "__Tuple__"
    cdef void* packed
    packed = pack_tuple(i,j)
    unpack_tuple(packed)
    print "Done"

cpdef run_dict(int i, int j):
    print "__Dict__"
    cdef void* packed
    packed = pack_dict(i,j)
    unpack_dict(packed)
    print "Done"

------------
void_test.py
------------

import void

if __name__ == "__main__":
    void.run_tuple(5,10)
    void.run_dict(5,10)

--x-x-x---

Here are the problems I'm encountering:

1) Packing as a tuple just doesn't work. The first argument (5) cannot
   be retrieved. See output in comments above.

2) If I put in a dummy first argument in pack_tuple, like ('', 5, 10),
   I get (, 5, 10), but accessing index 0 segfaults.
   I can access index 1 and 2 (containing 5 and 10 respectively) just
   fine though. What is going on here?

3) Packing as a dict seems to work well, except that if I try to print
   this dict, it aborts.

4) Calling run_tuple causes the program to hang on exit.

5) Lists behave similarly to tuples.

Any help would be greatly appreciated.

For reference, I'm using

$ cython -v
Cython version 0.9.6.14

$ uname -a
Linux greyice 2.6.24-17-generic #1 SMP Thu May 1 14:31:33 UTC 2008 i686 GNU/Linux

$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)

$ python -V
Python 2.5.2

Compilation and running:

$ cython void.pyx
$ gcc -c -fPIC -I/usr/include/python2.5/ void.c
$ gcc -shared void.o -o void.so
$ python void_test.py

Thanks
Venky Iyer
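
One possible explanation for symptoms like these (an assumption, not
something confirmed in this thread): a void* does not own a reference,
so the tuple or dict created inside pack_tuple/pack_dict may be
deallocated as soon as the function returns, leaving the pointer
dangling. The usual remedy is to take an extra reference before
handing the pointer out and to release it once the object has been
recovered. The sketch below shows that ownership rule in plain Python
with ctypes rather than in Cython, so that it can be run directly. The
helper names pack and unpack are hypothetical, and the sketch relies
on CPython, where id() returns the object's address.

```python
import ctypes

# Py_IncRef / Py_DecRef are the function forms of Py_INCREF / Py_DECREF.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None

_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def pack(obj):
    """Return obj's address as an integer (a stand-in for void*).

    One extra reference is taken so the object stays alive while only
    the raw address is being held.
    """
    _incref(obj)
    return id(obj)  # CPython-specific: id() is the object's address


def unpack(addr):
    """Recover the object from its address and drop the extra reference."""
    obj = ctypes.cast(addr, ctypes.py_object).value  # takes its own reference
    _decref(obj)  # release the reference taken in pack()
    return obj


# The dict is a temporary: without the extra reference taken in pack()
# it would be freed before unpack() runs.
addr = pack({'i': 5, 'j': 10})
recovered = unpack(addr)
print(recovered['i'])
print(recovered['j'])
```

In Cython the corresponding steps would be a Py_INCREF on the object
before the cast to void* and a Py_DECREF after casting back to object.
Each packed address must be unpacked exactly once, otherwise the
object is either leaked or released too early.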