From jek-gmane at kleckner.net Sat Nov 1 23:56:11 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Sat, 01 Nov 2008 15:56:11 -0700 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> Message-ID: Robert Bradshaw wrote: > There are lots of new things in cython-devel, and we're overdue for a > release. I have posed a beta up at http://cython.org/ > Cython-0.9.8.2.beta.tar.gz based on the current devel. Sage compiles > and passes all tests, please try it out on your own projects. I gave it a try with Python 2.6 on Windows. 8 regression tests fail on Python 2.6 release using Visual Studio 9.0 on Windows XP. This was run using the changeset 1291:16fc9454a2e5 The output is stored here: http://trac.cython.org/cython_trac/ticket/106 From jek-gmane at kleckner.net Sun Nov 2 00:51:08 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Sat, 01 Nov 2008 16:51:08 -0700 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> Message-ID: This simple case causes the parser to fail: cdef int doublePointer(double* inOutArray): return 1 See: http://trac.cython.org/cython_trac/ticket/107 From robertwb at math.washington.edu Sun Nov 2 01:44:16 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 1 Nov 2008 17:44:16 -0700 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> Message-ID: <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> Thanks for your reports. On Nov 1, 2008, at 3:56 PM, Jim Kleckner wrote: > I gave it a try with Python 2.6 on Windows. > > 8 regression tests fail on Python 2.6 release using Visual Studio > 9.0 on > Windows XP. 
> > This was run using the changeset 1291:16fc9454a2e5 > > The output is stored here: > http://trac.cython.org/cython_trac/ticket/106 I don't have access to a windows build platform, but it seems that these errors are for "fake" extern definitions to test that correct code is generated. I wonder why windows complains but other platforms don't... Not sure the best way to proceed here, any Windows experts out there? Do we have to make a .h file that actually defines all the filler stuff? > On Nov 1, 2008, at 4:51 PM, Jim Kleckner wrote: > > This simple case causes the parser to fail: > > cdef int doublePointer(double* inOutArray): > return 1 This is bad, surprised we haven't run into it before. This should certainly be fixed before release. - Robert From stephane.drouard at st.com Mon Nov 3 12:44:54 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Mon, 3 Nov 2008 12:44:54 +0100 Subject: [Cython] Intra-package reference not working with cimport Message-ID: <002e01c93da9$999361b0$f600fb0a@gnb.st.com> Hello, Intra-package reference (http://docs.python.org/tutorial/modules.html#intra-package-references) does not work with cimport. foo.py: def foo(): pass bar.py: import foo foo.foo() If I move foo.py and bar.py into a package and import pkg.bar, it works under Python as well as under Cython. But if I "cimport foo" (cdef foo), it works when not in a package, but fails to import foo when in pkg. Looking at the generated code, I see that "import foo" is mapped to PyObject_CallFunction(__import__, ... using the module as globals (why not using PyImport_ImportModuleEx()?), whereas "cimport foo" uses PyImport_Import(), that lets globals to NULL, so the issue. Is there a reason that cimport is mapped differently? 
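The difference described above can be illustrated in plain Python, without Cython. What follows is only a minimal sketch in present-day Python 3, not the C code Cython generated in 2008: the package and module names are hypothetical, and Python 3 needs an explicit level=1 where Python 2 tried the package-relative lookup implicitly. It shows that the same module name resolves inside the package only when the importer's globals are supplied.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical names, chosen so they cannot clash with real modules.
PKG = "demo_pkg_cimport"
MOD = "demo_foo_cimport"


def build_package(root):
    """Create root/PKG/__init__.py and root/PKG/MOD.py on disk."""
    pkg_dir = os.path.join(root, PKG)
    os.mkdir(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, MOD + ".py"), "w") as f:
        f.write("def foo():\n    return 'foo called'\n")


def import_without_context(name):
    """No globals are passed, so only a top-level lookup is possible."""
    try:
        return __import__(name)
    except ImportError:
        return None


def import_with_context(name, package):
    """Pass globals that identify the importing module's package.

    level=1 asks for a lookup relative to that package (Python 3 form).
    """
    # A package is always loaded before any of its modules run;
    # here it has to be loaded by hand because there is no real 'bar'.
    importlib.import_module(package)
    caller_globals = {"__name__": package + ".bar", "__package__": package}
    return __import__(name, caller_globals, None, [], 1)


root = tempfile.mkdtemp()
build_package(root)
sys.path.insert(0, root)
importlib.invalidate_caches()

top_level = import_without_context(MOD)   # lookup fails: MOD only exists inside PKG
relative = import_with_context(MOD, PKG)  # lookup succeeds inside PKG
```

With the package on sys.path, the context-free lookup finds nothing because the module exists only inside the package, while the lookup that carries the caller's globals finds it.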
Cheers, Stephane From stefan_ml at behnel.de Mon Nov 3 12:58:52 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 3 Nov 2008 12:58:52 +0100 (CET) Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <002e01c93da9$999361b0$f600fb0a@gnb.st.com> References: <002e01c93da9$999361b0$f600fb0a@gnb.st.com> Message-ID: <62273.213.61.181.86.1225713532.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Stephane DROUARD wrote: > Intra-package reference > (http://docs.python.org/tutorial/modules.html#intra-package-references) > does not work with cimport. > > foo.py: > def foo(): > pass > > bar.py: > import foo > foo.foo() > > If I move foo.py and bar.py into a package and import pkg.bar, it works > under Python as well as under Cython. > > But if I "cimport foo" (cdef foo), it works when not in a package, but > fails to import foo when in pkg. Did you read the documentation on sharing declarations between extension modules? Did you declare foo() public? Stefan From stephane.drouard at st.com Mon Nov 3 13:48:28 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Mon, 3 Nov 2008 13:48:28 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <62273.213.61.181.86.1225713532.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <000001c93db2$7b358d70$f600fb0a@gnb.st.com> Stefan Behnel wrote: > Did you read the documentation on sharing declarations between extension modules? > Did you declare foo() public? I assume it's through cpdef, right? Then it does not solve the issue. Unless I missed something, declaring cpdef only makes functions callable from Python. Here, the problem comes from the generated code of bar, which imports foo through PyImport_Import() and so leaves globals set to NULL, preventing Python from first trying to import foo from the package where bar resides.
Cheers, Stephane From robertwb at math.washington.edu Mon Nov 3 18:28:49 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 3 Nov 2008 09:28:49 -0800 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <000001c93db2$7b358d70$f600fb0a@gnb.st.com> References: <000001c93db2$7b358d70$f600fb0a@gnb.st.com> Message-ID: <7ACB705B-E5BB-4040-8DE5-01CDCD2DF8E2@math.washington.edu> On Nov 3, 2008, at 4:48 AM, Stephane DROUARD wrote: > Stefan Behnel wrote: > >> Did you read the documentation on sharing declarations between >> extension > modules? > Did you declare foo() public? > > I assume it's through cpdef, right? Then it does not solve the issue. > > Unless I missed something, declaring cpdef allows functions to be > callable > by Python. > Here, the problem comes from the genererated code of bar which > imports foo > through PyImport_Import(), so letting globals to NULL, avoiding > Python to > first try importing foo from the package bar resides. cimport imports c definitions, which must be declared in .pxd files. Do you have a .pxd file? If not, that's probably the issue. - Robert From stephane.drouard at st.com Mon Nov 3 21:59:38 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Mon, 3 Nov 2008 21:59:38 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <7ACB705B-E5BB-4040-8DE5-01CDCD2DF8E2@math.washington.edu> Message-ID: <004d01c93df7$16a10e80$f600fb0a@gnb.st.com> Robert Bradshaw wrote: > cimport imports c definitions, which must be declared in .pxd files. > Do you have a .pxd file? If not, that's probably the issue. OK let me detail what I'm doing ;-) foo.pxd cdef foo() foo.pyx cdef foo(): pass bar.pyx cimport foo foo.foo() Once both compiled, importing bar is OK. Now I move foo.pyd and bar.pyd into a package (pkg). Then I import pkg.bar. It fails importing foo. 
As previously mentioned, this is because "cimport foo" is mapped to PyImport_Import(), which leaves the globals parameter set to NULL. The issue can be solved by replacing, in static PyObject *__Pyx_ImportModule(char *name) py_module = PyImport_Import(py_name); by py_module = __Pyx_Import(py_name, 0); Cheers, Stephane From robertwb at math.washington.edu Mon Nov 3 22:36:00 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 3 Nov 2008 13:36:00 -0800 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <004d01c93df7$16a10e80$f600fb0a@gnb.st.com> References: <004d01c93df7$16a10e80$f600fb0a@gnb.st.com> Message-ID: <396942E3-FCB3-45DD-BA47-7BA782161B7C@math.washington.edu> On Nov 3, 2008, at 12:59 PM, Stephane DROUARD wrote: > Robert Bradshaw wrote: > >> cimport imports c definitions, which must be declared in .pxd files. >> Do you have a .pxd file? If not, that's probably the issue. > > OK let me detail what I'm doing ;-) > > foo.pxd > cdef foo() > > foo.pyx > cdef foo(): > pass > > bar.pyx > cimport foo > foo.foo() > > Once both compiled, importing bar is OK. > > Now I move foo.pyd and bar.pyd into a package (pkg). > Then I import pkg.bar. It fails importing foo. Ah hah, that's where the error is getting introduced. You can't just move compiled files around, as their absolute (rather than relative) location is needed and used at compile time. > As previously mentioned, this is because "cimport foo" is mapped > to PyImport_Import(), which leaves the globals parameter set to NULL. > > The issue can be solved by replacing, in > > static PyObject *__Pyx_ImportModule(char *name) > > py_module = PyImport_Import(py_name); > by > py_module = __Pyx_Import(py_name, 0); There was a recent thread on whether or not the full module path is needed at compile time. It would be nice if one were able to just do stuff like this, but it seems the issue is more subtle. (I'll be happy to be proven wrong.)
- Robert From simon at arrowtheory.com Tue Nov 4 02:27:55 2008 From: simon at arrowtheory.com (Simon Burton) Date: Tue, 4 Nov 2008 12:27:55 +1100 Subject: [Cython] py.test + cython Message-ID: <20081104122755.8d140430.simon@arrowtheory.com> I have problems with py.test (all untraceable thanks to py.test's builtin magic) Looks like it is picking up pyx source files (is this from the module's __file__ ?) and choking. Does anyone here use py.test ? Simon. From dalcinl at gmail.com Tue Nov 4 04:52:25 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 4 Nov 2008 00:52:25 -0300 Subject: [Cython] py.test + cython In-Reply-To: <20081104122755.8d140430.simon@arrowtheory.com> References: <20081104122755.8d140430.simon@arrowtheory.com> Message-ID: Perhaps this is related to the way Cython implements tracebacks?. Tracebacks contain references to the pyx files, and then py.test surely get confused. On Mon, Nov 3, 2008 at 10:27 PM, Simon Burton wrote: > > I have problems with py.test (all untraceable thanks to py.test's builtin magic) > Looks like it is picking up pyx source files (is this from the module's __file__ ?) and choking. > Does anyone here use py.test ? > > Simon. 
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Tue Nov 4 06:18:07 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 3 Nov 2008 21:18:07 -0800 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> Message-ID: On Nov 1, 2008, at 4:51 PM, Jim Kleckner wrote: > This simple case causes the parser to fail: > > cdef int doublePointer(double* inOutArray): > return 1 > > > See: http://trac.cython.org/cython_trac/ticket/107 I've been unable to reproduce this with my copy, either with the beta release or the current cython-devel. Has anyone else seen this? Or perhaps there's more context that one needs to reproduce this error. - Robert From greg.ewing at canterbury.ac.nz Tue Nov 4 09:01:45 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 04 Nov 2008 21:01:45 +1300 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <000001c93db2$7b358d70$f600fb0a@gnb.st.com> References: <000001c93db2$7b358d70$f600fb0a@gnb.st.com> Message-ID: <49100169.1030103@canterbury.ac.nz> Stephane DROUARD wrote: > Here, the problem comes from the generated code of bar which imports foo > through PyImport_Import(), so letting globals to NULL, avoiding Python to > first try importing foo from the package bar resides. The cimport statement uses the Pyrex/Cython compiler's idea of the module namespace at compile time. If you move things around after that, it will get confused.
-- Greg From jek-gmane at kleckner.net Tue Nov 4 17:10:19 2008 From: jek-gmane at kleckner.net (Jim Kleckner) Date: Tue, 04 Nov 2008 08:10:19 -0800 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> Message-ID: Robert Bradshaw wrote: > Thanks for your reports. ... >> On Nov 1, 2008, at 4:51 PM, Jim Kleckner wrote: >> >> This simple case causes the parser to fail: >> >> cdef int doublePointer(double* inOutArray): >> return 1 > > This is bad, surprised we haven't run into it before. This should > certainly be fixed before release. In reducing the example, I guess I reduced it too far. I coulda sworn I ran that specific file! Try this, which is bad code but shouldn't cause an exception. And has nothing to do with platforms. cdef int foo((double*)inOutArray,): cdef int i i = inOutArray return i Separately, gmane seems to be broken and isn't getting the list feed. I've reported the problem to gmane.discuss. I hope this gets through... From stephane.drouard at st.com Tue Nov 4 18:58:03 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Tue, 4 Nov 2008 18:58:03 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <49100169.1030103@canterbury.ac.nz> Message-ID: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> Robert Bradshaw wrote: > Ah hah, that's where the error is getting introduced. You can't just move compiled > files around, as their absolute (rather than relative) location is needed and used > at compile time. > There was a recent thread on whether or not the full module path is needed at > compile time. It would be nice if one was able to just do stuff like this, but it > seems the issue is more subtle. (I'll be happy to be proven wrong.) 
Greg Ewing wrote: > The cimport statement uses the Pyrex/Cython compiler's idea of the module > namespace at compile time. If you move things around after that, it will get > confused. This is indeed what I was trying to "get round" ;-) Today our source code database structure does not care at all about the final package structure. So the fact that PXD files need to be located within the same directory structure as the PYD/SO files is clearly an issue for us. This could be seen like any C code that needs a dynamic library: - the source code includes a .h file that the compiler has to find, possibly through a -I option, - the linker has to find the static library (.a/.lib), possibly through -L/-l options, - the application has to find the dynamic library (.so/.dll), possibly through LD_LIBRARY_PATH/PATH environment variables. Among these 3 steps (compiler/linker/execution), none requires that the files be located in the same directory (and they usually aren't). So I'm "pushing" to get this behaviour, considering that PXD files have to be found at Cython'ization stage (through -I options), while SO/PYD files have to be found at execution stage (Python stuff). No need for a strong relationship between them. Meanwhile, I'm patching the generated C code to get round the few issues I have around that. I'm reporting these issues, hoping they will be considered. Now if you tell me that it is intended behaviour, and you don't plan to modify it (or support both?), I will stop annoying you with my requests. This also means that I might reconsider the use of Cython for our needs. Which would be a pity, I like the philosophy. If there's still a chance that my request(s) may be supported one day, it might also be nice to support a syntax like: cimport foo "pkg.foo" That would load foo.pxd from the path list (-I) but would generate the equivalent import of pkg.foo at runtime.
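The two separate lookups proposed above can be sketched as follows. This is only an illustration of the idea, assuming a hypothetical helper named find_pxd that walks a list of include directories the way a C compiler walks its -I options; it is not how Cython actually searches for .pxd files.

```python
import os
import tempfile


def find_pxd(module_name, include_dirs):
    """Return the first '<dir>/<module_name>.pxd' that exists, else None.

    Hypothetical helper: mimics a C compiler walking its -I directories.
    It is not Cython's actual lookup code.
    """
    for directory in include_dirs:
        candidate = os.path.join(directory, module_name + ".pxd")
        if os.path.isfile(candidate):
            return candidate
    return None


# Two throwaway include directories; only the second one holds foo.pxd.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
with open(os.path.join(second_dir, "foo.pxd"), "w") as f:
    f.write("cdef foo()\n")

# Compile-time side: the declaration file is found by its short name.
found = find_pxd("foo", [first_dir, second_dir])
missing = find_pxd("baz", [first_dir, second_dir])

# Run-time side: the name to import is chosen independently of where
# the .pxd was found (this is the "pkg.foo" part of the proposed syntax).
runtime_name = "pkg.foo"
```

The point of the sketch is that the compile-time search result and the run-time module name are two independent pieces of information, which is the decoupling being requested.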
Cheers, Stephane From robertwb at math.washington.edu Tue Nov 4 19:26:08 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 4 Nov 2008 10:26:08 -0800 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> References: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> Message-ID: On Nov 4, 2008, at 9:58 AM, Stephane DROUARD wrote: > Robert Bradshaw wrote: > >> Ah hah, that's where the error is getting introduced. You can't >> just move > compiled > files around, as their absolute (rather than relative) > location > is needed and used > at compile time. > >> There was a recent thread on whether or not the full module path >> is needed > at >> compile time. It would be nice if one was able to just do stuff >> like this, > but it >> seems the issue is more subtle. (I'll be happy to be proven wrong.) > > Greg Ewing wrote: > >> The cimport statement uses the Pyrex/Cython compiler's idea of the >> module >> namespace at compile time. If you move things around after that, >> it will > get >> confused. > > This is indeed what I was trying to "get round" ;-) > > Today our source code database structure does not care at all of > the final > package structure. So the fact that PXD files need to be located > withing the > same directory structure as the PYD/SO files clearly is an issue > for us. > > This could be seen like any C code that needs a dynamic library: > - the source code includes a .h file that the compiler has to find > possibly > through a -I option, > - the linker has to find the static library (.a/.lib) possibly > through > -L/-l options, > - the application has to find the dynamic library (.so/.dll) possibly > through LD_LIBRARY_PATH/PATH environment variables. > > Among these 3 steps (compiler/linker/execution), none requires that > files > are located in the same directory (and they usually don't). 
> > So I'm "pushing" to get this behaviour, considering that PXD files > have to > be found at Cython'ization stage (through -I options), while SO/PYD > files at > execution stage (Python stuff). No need for a strong relationship > between > them. > > Meanwhile, I'm patching the generated C code to get round the few > issues I > have around that. I'm reporting these issues, hoping they will be > considered. > > Now if you tell me that it is an intented behaviour, and you don't > plan to > modify it (or support both?), I will stop ennoying you with my > requests. > This also means that I could reconsider the use of Cython for our > needs. > Which would be a pity, I like the philosophy. > > If there's still a chance that my request(s) may be supported one > day, it > might also be nice to support a syntax like: > cimport foo "pkg.foo" > That would load foo.pxd from the path list (-I) but would generate the > equivalent import of pkg.foo at runtime. To clarify, I would like to see the ability to move compiled (.pyd/.so) files around and have it just work. I am just worried that it will be easy to break things (in unexpected ways) by doing this, and I don't want a feature that only sometimes works. However, I think I'll be in a much more "experimental" mood after this next release. If you can produce patches that does what you want, with tests that show it does, then I would certainly want to include it! - Robert From stefan_ml at behnel.de Tue Nov 4 19:31:00 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 04 Nov 2008 19:31:00 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> References: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> Message-ID: <491094E4.60807@behnel.de> Hi, Stephane DROUARD wrote: > Today our source code database structure does not care at all of the final > package structure. 
So the fact that PXD files need to be located withing the > same directory structure as the PYD/SO files clearly is an issue for us. That is clearly an unusual requirement. I had never heard of anyone who doesn't keep Python modules in their package. > This could be seen like any C code that needs a dynamic library: C isn't comparable, as it doesn't have a notion of packages. > - the application has to find the dynamic library (.so/.dll) possibly > through LD_LIBRARY_PATH/PATH environment variables. You can always use a flat module setup and require users to set their PYTHONPATH to achieve the same result. > So I'm "pushing" to get this behaviour, considering that PXD files have to > be found at Cython'ization stage (through -I options), while SO/PYD files at > execution stage (Python stuff). No need for a strong relationship between > them. Well, yes, there is a relationship, even in Python. For example, your exception doctests will break if you import the tested module from a different package. Python3 even has a dedicated syntax for relative imports to separate them clearly from top-level imports. This means that moving modules out of or into packages can break your code. > Now if you tell me that it is an intented behaviour, and you don't plan to > modify it (or support both?), I will stop ennoying you with my requests. Did you actually describe your use case anywhere? I don't see why users should be allowed to move modules all over the place. That would result in import code that will fail miserably when moved to other systems. Why do you use packages in the first place, if your users don't care about them anyway? 
Stefan From stephane.drouard at st.com Tue Nov 4 22:10:32 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Tue, 4 Nov 2008 22:10:32 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <491094E4.60807@behnel.de> Message-ID: <007701c93ec1$c92587a0$8e40fb0a@gnb.st.com> Stefan Behnel wrote: > Stephane DROUARD wrote: >> Today our source code database structure does not care at all of the >> final package structure. So the fact that PXD files need to be located >> withing the same directory structure as the PYD/SO files clearly is an issue for us. > > That is clearly an unusual requirement. I had never heard of anyone who doesn't keep > Python modules in their package. This structure does not cause any issues for pure Python modules as well as SWIG'ed modules. And honestly, I don't see why the source code "storage" would have to follow the delivery structure. I understand that people may have this guideline, but it's not mandatory. >> So I'm "pushing" to get this behaviour, considering that PXD files >> have to be found at Cython'ization stage (through -I options), while >> SO/PYD files at execution stage (Python stuff). No need for a strong >> relationship between them. > > Well, yes, there is a relationship, even in Python. For example, your exception doctests > will break if you import the tested module from a different package. I don't use doctest. But I assume we can test a module from a location and use this same module (without testing it) in another location. This is our testing strategy. >> Now if you tell me that it is an intented behaviour, and you don't >> plan to modify it (or support both?), I will stop ennoying you with my requests. > > Did you actually describe your use case anywhere? I don't see why users should be > allowed to move modules all over the place. That would result in import code that will > fail miserably when moved to other systems. 
Why do you use packages in the first place, > if your users don't care about them anyway? The patches/ideas I've proposed just (try to) get Cython modules to behave like Python ones. The intra-package reference is a behaviour that is clearly described in Python. Cython does not behave like that. Stephane From stephane.drouard at st.com Tue Nov 4 22:56:24 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Tue, 4 Nov 2008 22:56:24 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: Message-ID: <007801c93ec8$32955ac0$8e40fb0a@gnb.st.com> Robert Bradshaw wrote: > To clarify, I would like to see the ability to move compiled (.pyd/.so) files around > and have it just work. > I am just worried that it will be easy to break things (in unexpected ways) by doing > this, and I don't want a feature that only sometimes works. I fully agree with that too. My aim is clearly not to provide patches that only solve my own issues, but to get Cython modules to behave as much like Python modules as possible. > However, I think I'll be in a much more "experimental" mood after this next release. If > you can produce patches that does what you want, with tests that show it does, then I > would certainly want to include it! I first tried to see if I could patch Cython itself. But the structure is quite complex, so I finally abandoned this idea and decided to patch the generated C code. This is why I've posted patches relative to the generated C code, hoping that someone who knows the compiler structure could integrate them. In particular, the patches I've proposed are really simple. Would that be acceptable? But first, you need to be convinced that this goes in the right direction (Python compatibility) and that it really solves the problem in all situations. And I understand your doubts.
Cheers, Stephane From greg.ewing at canterbury.ac.nz Wed Nov 5 01:55:00 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Nov 2008 13:55:00 +1300 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> Message-ID: <4910EEE4.5000707@canterbury.ac.nz> Jim Kleckner wrote: > cdef int foo((double*)inOutArray,): By the way, the fact that Pyrex accepts this form of parameter declaration is probably an accident -- you can't write that in C, and I wouldn't blame Cython if it refused to support it either. -- Greg From greg.ewing at canterbury.ac.nz Wed Nov 5 02:08:04 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Nov 2008 14:08:04 +1300 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> References: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> Message-ID: <4910F1F4.2060004@canterbury.ac.nz> Stephane DROUARD wrote: > it might also be nice to support a syntax like: > cimport foo "pkg.foo" > That would load foo.pxd from the path list (-I) but would generate the > equivalent import of pkg.foo at runtime. If this is purely a matter of source code layout, then there's an alternative available in Pyrex: if you name your source files pkg.foo.pxd and pkg.foo.pyx then they can be anywhere in the source tree (as long as the .pxd can be found on the -I path) and the module will be assumed to be named pkg.foo at run time. I don't know whether Cython still supports this, though.
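The naming convention described above can be sketched in a few lines. This is only an illustration of the rule "the file name carries the dotted module name"; the helper names are hypothetical and this is not the code Pyrex or Cython uses internally.

```python
import os


def module_name_from_source(path):
    """Derive the dotted module name from a file such as 'pkg.foo.pyx'.

    Hypothetical helper: the directory part is ignored on purpose,
    since the convention says the file may live anywhere in the tree.
    """
    base = os.path.basename(path)
    name, ext = os.path.splitext(base)
    if ext not in (".pyx", ".pxd"):
        raise ValueError("not a .pyx/.pxd source file: %r" % path)
    return name


def split_package(module_name):
    """Split 'pkg.foo' into ('pkg', 'foo'); a bare 'foo' has no package."""
    package, _, short_name = module_name.rpartition(".")
    return package, short_name


full_name = module_name_from_source(os.path.join("src", "misc", "pkg.foo.pyx"))
package, short_name = split_package(full_name)
```

Under this convention the directory layout of the sources says nothing about the package; only the dotted file name does.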
-- Greg From robertwb at math.washington.edu Wed Nov 5 03:46:47 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 4 Nov 2008 18:46:47 -0800 Subject: [Cython] Cython 0.9.8.2 beta In-Reply-To: <4910EEE4.5000707@canterbury.ac.nz> References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> <4910EEE4.5000707@canterbury.ac.nz> Message-ID: <4A3B4890-8FEF-4940-B21F-72F527898A45@math.washington.edu> On Nov 4, 2008, at 4:55 PM, Greg Ewing wrote: > Jim Kleckner wrote: > >> cdef int foo((double*)inOutArray,): > > By the way, the fact that Pyrex accepts this form of > parameter declaration is probably an accident -- you > can't write that in C, and I wouldn't blame Cython > if it refused to support it either. Unfortunately, it made the compiler crash, which is worse than flagging a syntax error. I've fixed it now. - Robert From stefan_ml at behnel.de Wed Nov 5 07:28:54 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 05 Nov 2008 07:28:54 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <4910F1F4.2060004@canterbury.ac.nz> References: <007601c93ea6$e5614460$8e40fb0a@gnb.st.com> <4910F1F4.2060004@canterbury.ac.nz> Message-ID: <49113D26.7090100@behnel.de> Hi, Greg Ewing wrote: > Stephane DROUARD wrote: > >> it might also be nice to support a syntax like: >> cimport foo "pkg.foo" >> That would load foo.pxd from the path list (-I) but would generate the >> equivalent import of pkg.foo at runtime. > > If this is purely a matter of source code layout, then > there's an alternative availalable in Pyrex: if you > name your source files > > pkg.foo.pxd > > and > > pkg.foo.pyx > > then they can be anywhere in the source tree (as long as > the .pxd can be found on the -I path) and the module will > be assumed to be named pkg.foo at run time. > > I don't know whether Cython still supports this, though. 
It does, I use this in lxml (although only for legacy reasons). Stefan From stefan_ml at behnel.de Wed Nov 5 07:35:35 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 05 Nov 2008 07:35:35 +0100 Subject: [Cython] Intra-package reference not working with cimport In-Reply-To: <007701c93ec1$c92587a0$8e40fb0a@gnb.st.com> References: <007701c93ec1$c92587a0$8e40fb0a@gnb.st.com> Message-ID: <49113EB7.3090709@behnel.de> Hi, Stephane DROUARD wrote: > And honestly, I don't see why the source code "storage" would have to follow > the delivery structure. I understand that people may have this guideline, > but it's not mandatory. If you build your modules with distutils, Cython will pick up the correct package name from what you provide in your Extension setup. This allows you to keep your sources in one place and build them into packaged modules purely for shipping. >>> So I'm "pushing" to get this behaviour, considering that PXD files >>> have to be found at Cython'ization stage (through -I options), while >>> SO/PYD files at execution stage (Python stuff). No need for a strong >>> relationship between them. >> Well, yes, there is a relationship, even in Python. For example, your > exception doctests >> will break if you import the tested module from a different package. > > I don't use doctest. But I assume we can test a module from a location and > use this same module (without testing it) in another location. > This is our testing strategy. Then you're lucky that you are not using doctests, as your users would not be able to run the test suite if they encounter any problems. >> Did you actually describe your use case anywhere? I don't see why users > should be >> allowed to move modules all over the place. That would result in import > code that will >> fail miserably when moved to other systems. Why do you use packages in the > first place, if your users don't care about them anyway? 
> > The patches/ideas I've proposed just (try to) get Cython modules behave like > Python ones. > The intra-package reference is a behaviour that is clearly described in > Python. Cython does not behave like that. That does not answer my question about a use case. You mentioned different things so far, one being source code layout versus shipping in packages, one being users moving modules around as they see fit. I understand that some people require the first for legacy reasons ("we always kept our sources in that one directory!"). I don't see a reason for the latter. Stefan From stefan_ml at behnel.de Wed Nov 5 09:10:17 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 5 Nov 2008 09:10:17 +0100 (CET) Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <4910AB32.7020309@v.loewis.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> Message-ID: <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Looks like we've gained a supporter here... Martin v. Löwis wrote: > Stefan Behnel wrote: >> The project has made inclusion into Python's stdlib a goal right from >> the beginning. > > Ah, that changes my view of it significantly. If the authors want to > contribute it to Python some day, I'm looking forward to that (assuming > that they then close their official branch, and make the version inside > Python the maintained one). > > That is also independent of whether standard library modules get written > in Cython.
I would expect that some may (in particular, if they focus on > wrapping an external library), whereas others might stay what they are > (in particular, when they are in the real core of the interpreter). > >> ctypes makes sense for projects that do not require a high-speed >> interface, >> i.e. if you do major things behind the interface and only call into it >> from >> time to time, choosing ctypes will keep your code more portable without >> requiring a C compiler. However, if speed matters then it's hard to beat >> Cython even with hand-written C code. > > I would personally prefer a Cython integration over a ctypes one, for > the standard library (and supported inclusion of ctypes into Python > regardless). > > Regards, > Martin I'm still for moving Cython to the stdlib one day, but I would prefer a somewhat closer "one day". What do the others think? If you agree, it would be good to make that an official milestone, and to collect and flag bugs in trac as relevant or blockers for this goal. That would allow us to see when we get closer, and to go back to python-dev when we think it's time. Stefan From aaron.devore at gmail.com Wed Nov 5 10:00:49 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Wed, 5 Nov 2008 01:00:49 -0800 Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <2ead2fb0811050100g48511aa3h9fdf36c7e63f28f9@mail.gmail.com> Wouldn't Cython be a bit big for the stdlib?
It would be the largest single piece of the standard library, with the possible exception of Tkinter. On Wed, Nov 5, 2008 at 12:10 AM, Stefan Behnel wrote: > Looks like we've gained a supporter here... > > Martin v. Löwis wrote: >> Stefan Behnel wrote: >>> The project has made inclusion into Python's stdlib a goal right from >>> the beginning. >> >> Ah, that changes my view of it significantly. If the authors want to >> contribute it to Python some day, I'm looking forward to that (assuming >> that they then close their official branch, and make the version inside >> Python the maintained one). >> >> That is also independent of whether standard library modules get written >> in Cython. I would expect that some may (in particular, if they focus on >> wrapping an external library), whereas others might stay what they are >> (in particular, when they are in the real core of the interpreter). >> >>> ctypes makes sense for projects that do not require a high-speed >>> interface, >>> i.e. if you do major things behind the interface and only call into it >>> from >>> time to time, choosing ctypes will keep your code more portable without >>> requiring a C compiler. However, if speed matters then it's hard to beat >>> Cython even with hand-written C code. >> >> I would personally prefer a Cython integration over a ctypes one, for >> the standard library (and supported inclusion of ctypes into Python >> regardless). >> >> Regards, >> Martin > > I'm still for moving Cython to the stdlib one day, but I would prefer a > somewhat closer "one day". What do the others think? > > If you agree, it would be good to make that an official milestone, and to > collect and flag bugs in trac as relevant or blockers for this goal. That > would allow us to see when we get closer, and to go back to python-dev > when we think it's time.
> From ondrej at certik.cz Wed Nov 5 10:04:36 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 5 Nov 2008 10:04:36 +0100 Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <4910AB32.7020309@v.loewis.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> Message-ID: <85b5c3130811050104v54ec1dbfy9f800f7a94e33d38@mail.gmail.com> On Tue, Nov 4, 2008 at 9:06 PM, "Martin v. Löwis" wrote: >> The project has made inclusion into Python's stdlib a goal right from the >> beginning. > > Ah, that changes my view of it significantly. If the authors want to > contribute it to Python some day, I'm looking forward to that (assuming > that they then close their official branch, and make the version inside > Python the maintained one). > > That is also independent of whether standard library modules get written > in Cython. I would expect that some may (in particular, if they focus on > wrapping an external library), whereas others might stay what they are > (in particular, when they are in the real core of the interpreter). I think it is also a good idea to write things using pure Python syntax in Cython, so that all other Python implementations, like Jython, Pypy, IronPython can just take it and run it in pure Python mode. Pure Python syntax means that the code runs in Python unmodified, but can also be compiled with Cython. Pure Python syntax was only recently added to Cython, so I guess it should be well tested first. What do you think? Ondrej From dagss at student.matnat.uio.no Wed Nov 5 10:10:41 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 05 Nov 2008 10:10:41 +0100 Subject: [Cython] [Python-Dev] Using Cython for standard library?
In-Reply-To: <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <49116311.8000103@student.matnat.uio.no> Stefan Behnel wrote: > Looks like we've gained a supporter here... > > Martin v. Löwis wrote: > >> Stefan Behnel wrote: >> >>> The project has made inclusion into Python's stdlib a goal right from >>> the beginning. >>> >> Ah, that changes my view of it significantly. If the authors want to >> contribute it to Python some day, I'm looking forward to that (assuming >> that they then close their official branch, and make the version inside >> Python the maintained one). ... > I'm still for moving Cython to the stdlib one day, but I would prefer a > somewhat closer "one day". What do the others think? > The only concern I have is that of release cycles. I.e. would it be possible to live within the Python repositories etc., but still do our own independent releases? (And when a new version of Python is released, and Cython with it, simply release the "cython" branch, i.e. latest stable with critical bugfixes). If so, then there's no problem. Dag Sverre From stefan_ml at behnel.de Wed Nov 5 10:47:35 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 5 Nov 2008 10:47:35 +0100 (CET) Subject: [Cython] Using Cython for standard library?
In-Reply-To: <2ead2fb0811050100g48511aa3h9fdf36c7e63f28f9@mail.gmail.com> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <2ead2fb0811050100g48511aa3h9fdf36c7e63f28f9@mail.gmail.com> Message-ID: <60877.213.61.181.86.1225878455.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Aaron DeVore wrote: > Wouldn't Cython be a bit big for the stdlib? It would be the largest > single piece of the standard library, with the possible exception of > Tkinter. Counting the .py files from the Cython package gives me 1.1MB for 69 source files, the tests are about 600K, so that's less than 2MB in total. I don't think that makes the Python distro fat, given that Py3.0rc1 is about 95MB and Py2.6 is about 56MB unpacked. Python 2.6 comes with 1975 .py files (of which 1164 get installed on my side). They take up about 20MB in the distro and 15MB installed. Adding Cython to that would mean increasing the amount taken up by installed Python files to 16MB instead. My hard drive is capable of handling that, Linux distributions will split the stdlib anyway, as will embedded devices. On the other hand, if we can manage to bootstrap modules in the stdlib that get written in Cython so that they can be shipped as Cython code instead of C code, that would actually decrease the size of the shipped stdlib, so that we can end up better than before. :-) Although it's likely that the installed binary modules become bigger than what we have today... ;-) Stefan From stefan_ml at behnel.de Wed Nov 5 11:16:15 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 5 Nov 2008 11:16:15 +0100 (CET) Subject: [Cython] [Python-Dev] Using Cython for standard library? 
In-Reply-To: <49116311.8000103@student.matnat.uio.no> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <49116311.8000103@student.matnat.uio.no> Message-ID: <61332.213.61.181.86.1225880175.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> I'm still for moving Cython to the stdlib one day, but I would prefer a >> somewhat closer "one day". What do the others think? >> > The only concern I have is that of release cycles. I.e. would it be > possible to live within the Python repositories etc., but still do our > own independent releases? (And when a new version of Python is released, > and Cython with it, simply release the "cython" branch, i.e. latest > stable with critical bugfixes). If so, then there's no problem. That is a concern I have, too. As Martin noted, the main release would have to happen in the stdlib, which means: adapt to the Python release cycle. Currently, that's one major release every 1-2 years and a minor release about every 6-10 months. I doubt that users would honour such a slowdown in the current state, but I'm sure that we will think different about this once we reach a freezable language state. We'll also have to check how other projects handle this. Last time I asked, I was told that ElementTree was an "externally maintained project in the stdlib", whatever that means. If we can adopt a release scheme as you indicated above, i.e. stabilize the language, have a stable release included, and then keep releasing improvements and optimisations separately (e.g. 
in our own hg repo) before they get included in the next Python release - that sounds acceptable to me. Language extensions would then still have to wait for a major Python release, IMHO, but users could benefit from other improvements earlier if they feel they really need to. But we will have to take care which improvements go into which Python minor release, and which should wait for a major release. Older Python branches usually only receive bug fixes, but new minor releases can well receive general improvements for a while. Stefan From ondrej at certik.cz Wed Nov 5 14:37:48 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 5 Nov 2008 14:37:48 +0100 Subject: [Cython] Pure python mode In-Reply-To: <48EE7E60.7040502@student.matnat.uio.no> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810051546h1d953077n3becd7231525448f@mail.gmail.com> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> Message-ID: <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> On Thu, Oct 9, 2008 at 10:57 PM, Dag Sverre Seljebotn wrote: > Robert Bradshaw wrote: >> I'd like to release this soon. I've changed declare so it can work as >> below (i.e. declare(var=type, ..)). For now there's no "cpdef/ >> cdefclass/earlybind/??" decorator, we'll hammer that one out later. >> Any other comments before we commit to it? > > Hmm.. Well done! :-) > > Comments about release schedule though: > > For NumPy complex numbers are now in place which was the big lacking > feature (of course, that makes native complex float support the next big > lacking feature for numerical use). 
Structs/record arrays are incredibly
> near the surface (less than a day, easily), OTOH I'm very pressed on
> time at the moment, so I suppose they should not block a release.
>
> There definitely needs to be betas this time around too, the result_code
> refactoring is more dangerous than anything we did in summer. (In fact
> the issue Lisandro just posted may be related to this...)
>
> BTW, are we now anywhere near keeping a feature sync with Pyrex? I'm
> wondering if we could perhaps think again about the versioning scheme --
> the version number is getting very long to refer to, and the link to
> Pyrex seems less important than it used to. (Obviously it needs to not
> mess with packaging systems, but both 0.10 and 0.9.9 could work, rather
> than 0.9.8.2).

So the pure python mode seems to be working very nicely for me. I was
following the documentation here:

http://wiki.cython.org/pure

However, one drawback is that I still need to use .pxd files to
declare cdef functions. Example:

@cython.locals(n=cython.int)
def fact(n):
    ...

in order for this to become a "cdef int foo()" function, I need to
have this line in foo.pxd:

cdef int fact(int n)

(And then the @cython decorator is not necessary). I'd like to have
everything in the .py file. Could that be approached by for example
something like:

@cython.locals(n=cython.int, _return=cython.int, cdef=True)
def fact(n):
    ...

?
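
A minimal sketch, not taken from the thread, of the "runs in Python
unmodified" half of this idea: when the module is not compiled, the
decorator only has to accept the type declarations and hand the function
back unchanged. The `cython` object below is a hypothetical stand-in built
for illustration (it is not the module Cython ships), and the body of
fact() is an assumption, since the message elides it.

```python
import types


def _locals(**declared_types):
    """Accept C type declarations and ignore them (illustrative no-op)."""
    def decorate(func):
        # Uncompiled, the declarations carry no meaning, so the
        # function is returned exactly as it was defined.
        return func
    return decorate


# Hypothetical stand-in for the real `cython` module.
cython = types.SimpleNamespace(int=int, locals=_locals)


@cython.locals(n=cython.int)
def fact(n):
    # Assumed body: an iterative factorial.
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```

Under this sketch fact() stays an ordinary Python function; a compiler
that understands the decorator is free to use the declarations instead of
ignoring them.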
Ondrej From stefan_ml at behnel.de Wed Nov 5 14:52:28 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 5 Nov 2008 14:52:28 +0100 (CET) Subject: [Cython] Pure python mode In-Reply-To: <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810051546h1d953077n3becd7231525448f@mail.gmail.com> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> Message-ID: <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Ondrej Certik wrote: > one drawback is that I still need to use .pxd files to > declare cdef functions. Example: > > @cython.locals(n=cython.int) > def fact(n): > ... > > in order for this to become a "cdef int foo()" function, I need to > have this line in foo.pxd: > > cdef int fact(int n) > > (And then the @cython decorator is not necessary). I'd like to have > everything in the .py file. Could that be approached by for example > something like: > > @cython.locals(n=cython.int, _return=cython.int, cdef=True) > def fact(n): > ... That would block calling a local variable "cdef", which would be valid Python. Calling it _cdef might solve this, but having "_return" blocked already is bad enough. How about a generic decorator @cython.cdef ? @cython.cdef @cython.locals(n=cython.int, _return=cython.int) def fact(n): ... would become cdef int fact(int n): ... Stefan From dalcinl at gmail.com Wed Nov 5 15:48:42 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 5 Nov 2008 11:48:42 -0300 Subject: [Cython] [Python-Dev] Using Cython for standard library? 
In-Reply-To: <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On Wed, Nov 5, 2008 at 5:10 AM, Stefan Behnel wrote: > Looks like we've gained a supporter here... > > Martin v. Löwis wrote: >> >> Ah, that changes my view of it significantly. If the authors want to >> contribute it to Python some day, I'm looking forward to that (assuming >> that they then close their official branch, and make the version inside >> Python the maintained one). >> > > I'm still for moving Cython to the stdlib one day, but I would prefer a > somewhat closer "one day". What do the others think? > Stefan, could you please explain me what all we (we==Cython developers and users) will gain with this inclusion of Cython in Python's core?
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From ondrej at certik.cz Wed Nov 5 15:52:56 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 5 Nov 2008 15:52:56 +0100 Subject: [Cython] Pure python mode In-Reply-To: <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> On Wed, Nov 5, 2008 at 2:52 PM, Stefan Behnel wrote: > Ondrej Certik wrote: >> one drawback is that I still need to use .pxd files to >> declare cdef functions. Example: >> >> @cython.locals(n=cython.int) >> def fact(n): >> ... >> >> in order for this to become a "cdef int foo()" function, I need to >> have this line in foo.pxd: >> >> cdef int fact(int n) >> >> (And then the @cython decorator is not necessary). I'd like to have >> everything in the .py file. Could that be approached by for example >> something like: >> >> @cython.locals(n=cython.int, _return=cython.int, cdef=True) >> def fact(n): >> ... > > That would block calling a local variable "cdef", which would be valid > Python. Calling it _cdef might solve this, but having "_return" blocked > already is bad enough.
>
> How about a generic decorator @cython.cdef ?
>
> @cython.cdef
> @cython.locals(n=cython.int, _return=cython.int)
> def fact(n):
>     ...
>
> would become
>
> cdef int fact(int n):
>     ...

Yes, that looks perfectly ok. And the same for extension classes.

Ondrej

From stephane.drouard at st.com Wed Nov 5 16:03:09 2008
From: stephane.drouard at st.com (Stephane DROUARD)
Date: Wed, 5 Nov 2008 16:03:09 +0100
Subject: [Cython] Intra-package reference not working with cimport
In-Reply-To: <49113EB7.3090709@behnel.de>
Message-ID: <004e01c93f57$a1f0b060$3a00fb0a@gnb.st.com>

Stefan Behnel wrote:
> That does not answer my question about a use case. You mentioned different things so
> far, one being source code layout versus shipping in packages, one being users moving
> modules around as they see fit. I understand that some people require the first for
> legacy reasons ("we always kept our sources in that one directory!"). I don't see a
> reason for the latter.

Here is our module directory structure:

module/
    src/
        *.c, *.h      [common source code]
    standalone/
        *.c           [contain at least main() and the command line parser]
        Makefile
        build/
            Linux/    [intermediate files + binary executable]
            Win32/    [intermediate files + binary executable]
        test/         [test files as shell scripts]
    python/
        [needed wrapper files]
        Makefile
        build/
            Linux/    [intermediate files + python wrapper]
            Win32/    [intermediate files + python wrapper]
        test/         [test files as Python scripts]
    [other platforms]
    test/             [test files as Python scripts]

Makefiles build everything in the corresponding build//

test/ within platform directories are dedicated to their platform. They get
access to the "executable" by ../build//

Concerning python tests, they start with:

import sys, os, platform
sys.path.append(os.path.join("..", "build", platform.system().replace("Windows", "Win32")))

The global test/ contains tests that validate everything between releases and platforms.
They don't rely on ..//build/ directories, but have their own build structure,
because they have to download previous releases from the database and build
them.

Note that, when everything is validated, python wrappers are put into
packages within site-packages for final users. OK, I previously mentioned
that users may move them, but in fact they don't, it was just to avoid having
to detail the complex test suite structure. Sorry if this ended up confusing
you.

Anyway, all those mechanisms worked well until I decided to use Cython in
replacement of SWIG. This is because the test suites don't care about the
destination package. They just set the path to the built modules (through
sys.path for Python) and load/launch them (import for Python).

For sure the structure could be modified to support a package hierarchy, and
I asked for it, but to summarize the answer I got from other developers: "you
want to use Cython, why not, but you have to ensure this is backward
compatible. If not, stop using Cython."

This is why I'm patching the generated C code for the moment.

Cheers,
Stephane

From stefan_ml at behnel.de Wed Nov 5 16:47:25 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 5 Nov 2008 16:47:25 +0100 (CET)
Subject: [Cython] Using Cython for standard library?
In-Reply-To:
References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de>
Message-ID: <55158.213.61.181.86.1225900045.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Lisandro Dalcin wrote:
> Stefan, could you please explain me what all we (we==Cython developers
> and users) will gain with this inclusion of Cython in Python's core?

I see a couple of advantages in general, and for us in particular.
1) Cython would become an official part of CPython, and shipped with the
standard Python distribution. This would open it to everyone who uses
Python, not only those who feel the need to do things in "C, but not quite
C", and look for a way to do so. Looking for a way to speed up your code?
"import Cython".

2) The integration of Cython with distutils would become better, as stock
distutils would be enabled to build Cython extensions without
monkey-patching and the like. The same applies to pyximport, which would
likely become part of the stdlib as well.

3) There would be a stable implementation of the Cython language as shipped
by the CPython standard library Version X.Y. This would allow Cython users
to target a specific language level, which (should I say it?) is rather
futile currently.

4) It would allow stdlib modules to be written in Cython, thus lowering the
entry level for stdlib contributors, and consequently broadening the Cython
user base. Classical win-win situation.

5) The Cython project would benefit from a broader set of contributors. The
CPython developers already show their incentives in writing Cython code,
and would likely start improving Cython for their own needs, simply because
it's so much easier to improve a code generator than to improve your (or
somebody else's) code in 139 places. As I already said elsewhere, I
consider the reduced maintenance cost through module portability over
CPython versions a pretty strong incentive, especially during the 2.6+ and
3.0+ life cycles.

6) Other Python implementations (Jython, IronPython, PyPy) could provide a
(partial) implementation of Cython's pure Python mode to support the
efficient execution of Cython implemented extension modules directly in
their runtime environment. They could do that now, but the incentive would
be much higher if it would allow them to copy stdlib modules over instead
of reimplementing them.
I bet I forgot some more, but I think the main theme is that Cython would benefit from being "just there" and "ready-to-be-improved" when you install Python. Stefan From lists at cheimes.de Wed Nov 5 16:57:22 2008 From: lists at cheimes.de (Christian Heimes) Date: Wed, 05 Nov 2008 16:57:22 +0100 Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <49116311.8000103@student.matnat.uio.no> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <49116311.8000103@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn wrote: > The only concern I have is that of release cycles. I.e. would it be > possible to live within the Python repositories etc., but still do our > own independent releases? (And when a new version of Python is released, > and Cython with it, simply release the "cython" branch, i.e. latest > stable with critical bugfixes). If so, then there's no problem. The Cython developers must support a version of Cython through the entire cycle of a minor release. You don't have to stop working on Cython but the version included in a release must be maintained without adding new features. Example: If we were to include Cython with Python 3.1 then you would have to provide a working, (mostly) bug free and feature complete version of Cython for the first beta release of 3.1. This version of Cython must be supported until the end of the 3.1 release cycle a couple of years later. You are allowed to fix minor bugs, you must fix major and security bugs, but you are not allowed to introduce new features.
In the meantime you can work on Cython 1.1, 1.2 etc. Christian From robertwb at math.washington.edu Wed Nov 5 17:43:31 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 5 Nov 2008 08:43:31 -0800 Subject: [Cython] Pure python mode In-Reply-To: <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> Message-ID: On Nov 5, 2008, at 6:52 AM, Ondrej Certik wrote: > On Wed, Nov 5, 2008 at 2:52 PM, Stefan Behnel > wrote: >> Ondrej Certik wrote: >>> one drawback is that I still need to use .pxd files to >>> declare cdef functions. Example: >>> >>> @cython.locals(n=cython.int) >>> def fact(n): >>> ... >>> >>> in order for this to become a "cdef int foo()" function, I need to >>> have this line in foo.pxd: >>> >>> cdef int fact(int n) >>> >>> (And then the @cython decorator is not necessary). I'd like to have >>> everything in the .py file. Certainly this is a goal of mine too. It was (IMHO) less clear what the best syntax would be, and I wanted to get the other stuff out there for testing/trial first. Until we have autogenerated .pxd files, or the equivalent, this won't allow one to share cdef classes/methods. >>> Could that be approached by for example >>> something like: >>> >>> @cython.locals(n=cython.int, _return=cython.int, cdef=True) >>> def fact(n): >>> ...
>> >> That would block calling a local variable "cdef", which would be >> valid >> Python. Calling it _cdef might solve this, but having "_return" >> blocked >> already is bad enough. >> >> How about a generic decorator @cython.cdef ? >> >> @cython.cdef >> @cython.locals(n=cython.int, _return=cython.int) >> def fact(n): >> ... >> >> would become >> >> cdef int fact(int n): >> ... > > Yes, that looks perfectly ok. And the same for extension classes. Dag mentioned an "earlybind" rather than "cdef" as it would have more meaning to newcomers. There's also the issue of cpdef vs. cdef: should that be a flag to the decorator, or a new decorator? I'm not a fan of the _return parameter, I think it belongs in the "cdef" decorator as it doesn't make sense without it. It would be nice if one could declare argument types there as well (though that introduces some redundancy with locals). Perhaps the first (positional) argument would be the return type. There's also Py3 syntax for annotating arguments, though I think we'll be wanting to support Py2 for quite a while. - Robert From robertwb at math.washington.edu Wed Nov 5 19:25:02 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 5 Nov 2008 10:25:02 -0800 Subject: [Cython] [Python-Dev] Using Cython for standard library?
In-Reply-To: <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <44CEE8BF-DEB2-4545-BCEB-A32293076B22@math.washington.edu> On Nov 5, 2008, at 12:10 AM, Stefan Behnel wrote: > Looks like we've gained a supporter here... > > Martin v. Löwis wrote: >> Stefan Behnel wrote: >>> The project has made inclusion into Python's stdlib a goal right >>> from >>> the beginning. >> >> Ah, that changes my view of it significantly. If the authors want to >> contribute it to Python some day, I'm looking forward to that >> (assuming >> that they then close their official branch, and make the version >> inside >> Python the maintained one). >> >> That is also independent of whether standard library modules get >> written >> in Cython. I would expect that some may (in particular, if they >> focus on >> wrapping an external library), whereas others might stay what they >> are >> (in particular, when they are in the real core of the interpreter). >> >>> ctypes makes sense for projects that do not require a high-speed >>> interface, >>> i.e. if you do major things behind the interface and only call >>> into it >>> from >>> time to time, choosing ctypes will keep your code more portable >>> without >>> requiring a C compiler. However, if speed matters then it's hard >>> to beat >>> Cython even with hand-written C code. >> >> I would personally prefer a Cython integration over a ctypes one, for >> the standard library (and supported inclusion of ctypes into Python >> regardless).
>> >> Regards, >> Martin > > I'm still for moving Cython to the stdlib one day, but I would > prefer a > somewhat closer "one day". What do the others think? > > If you agree, it would be good to make that an official milestone, > and to > collect and flag bugs in trac as relevant or blockers for this > goal. That > would allow us to see when we get closer, and to go back to python-dev > when we think it's time. I think the most immediate official milestone is being able to compile 100% of Cython code. This is the target for 1.0. Of course there are several other things (e.g. the constantly improving buffer support) that are not part of this goal but extremely valuable. I imagine once we hit that target things will still be in flux enough that inclusion into the Python stdlib will be premature at that point, but worth moving towards. Also, hopefully by then things will stabilize enough that the slower release cycle won't be as much of a burden. (I'm used to the 1-3 week release cycle of Sage). We will still have the less-stable development branch going for those who want/need to be on the cutting edge. One thing that is not clear is if a (sufficiently advanced?) user would be able to use the newer Cython with an older version of Python (say, if some projects/modules he needed weren't ported yet.) There is also the long term question of the parser which is very redundant with the one shipped with Python. (Would we try to migrate over to Python's ASTs? Could that be done without losing the old-style cdef syntax?) - Robert From dalcinl at gmail.com Wed Nov 5 19:37:51 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 5 Nov 2008 15:37:51 -0300 Subject: [Cython] adding more constification Message-ID: I'm working on a patch adding 'const' to 'char *' arguments, and a few explicit (char *) casts. The final idea is that for Py >= 2.5, we can then pass -Wwrite-strings to GCC and no warnings should appear.
Moreover, perhaps we can make 'runtests.py' a bit smarter by pushing the -Wwrite-strings flag for Py >= 2.5 and GCC. I did not take the Py2/3/2.4 case just because that would require the generated C code to have many explicit, ugly (char *) casts everywhere. But if this is not a problem, it could be done in the near future. Any objections? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Wed Nov 5 20:18:34 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 05 Nov 2008 20:18:34 +0100 Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <44CEE8BF-DEB2-4545-BCEB-A32293076B22@math.washington.edu> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <44CEE8BF-DEB2-4545-BCEB-A32293076B22@math.washington.edu> Message-ID: <4911F18A.3040002@behnel.de> Hi, Robert Bradshaw wrote: > I think the most immediate official milestone is being able to > compile 100% of Cython code. This is the target for 1.0. Did you really mean Cython here or Python? I think 100% Python is somewhat hard to prove. I would expect that we are pretty close to compiling Py2.4 code, and a bit further from compiling 2.5 code (which mainly includes generator expressions). Not sure about 2.6, which was a heavily moving target last time I checked.
If you meant Cython, this requires a stable definition of what the Cython language actually is. > there are several other things (e.g. the constantly improving buffer > support) that are not part of this goal but extremely valuable. I > imagine once we hit that target things will still be in flux I think that's somewhat orthogonal. I can imagine a stable language core with optional, advanced features. Both can well have different levels of maturity, both can mature in different release cycles - the more stable ones in the Python stdlib, and the more experimental or younger ones in independent Cython releases. > that inclusion into the Python stdlib will be premature at that > point, but worth moving towards. I'm more thinking in terms of stabilising the core language and major features. If we shift a bit of our current focus towards that goal, a core Cython release could even go into Py2.7. (Note that inclusion in Py3.1 would require migrating the source code to Py3 first, which is still a bit of additional work, but definitely worth it). > One thing that is not clear is > if a (sufficiently advanced?) user would be able to use the newer > Cython with an older version of Python (say, if some projects/modules > he needed weren't ported yet.) I'm not sure where you see the problem here. A user wouldn't replace the core Cython (i.e. everything else would keep working), but import a separate Cython install instead. Just an idea: I would imagine that we will (finally) have to lower-case the package/module names when we move into the stdlib, but we may keep (at least) the upper-case Cython package for the stand-alone release. We could then install with a lower-case package (or both) under Python 2.3-2.6, and with an upper-case package name on Python versions that already have a packaged Cython. > There is also the long term question > of the parser which is very redundant with the one shipped with > Python. (Would we try to migrate over to Python's ASTs? 
Could that be > done without losing the old-style cdef syntax?) That's really a long-term question and something that we should discuss with the CPython developers. If we become an official part of CPython, I imagine that there will be ways to hook into the existing parser /somehow/. Stefan From robertwb at math.washington.edu Wed Nov 5 22:43:01 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 5 Nov 2008 13:43:01 -0800 Subject: [Cython] [Python-Dev] Using Cython for standard library? In-Reply-To: <4911F18A.3040002@behnel.de> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <44CEE8BF-DEB2-4545-BCEB-A32293076B22@math.washington.edu> <4911F18A.3040002@behnel.de> Message-ID: <8288F5F6-2C86-409F-86EF-51A917290CCE@math.washington.edu> On Nov 5, 2008, at 11:18 AM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> I think the most immediate official milestone is being able to >> compile 100% of Cython code. This is the target for 1.0. > > Did you really mean Cython here or Python? I think 100% Python is > somewhat > hard to prove. I would expect that we are pretty close to compiling > Py2.4 > code, and a bit further from compiling 2.5 code (which mainly includes > generator expressions). Not sure about 2.6, which was a heavily > moving target > last time I checked. Sorry, I meant Python here. I'd say the goal is at a minimum to run the entire Python regression suite for 2.x, probably 2.4 or 2.5. > If you meant Cython, this requires a stable definition of what the > Cython > language actually is.
Yeah, I'd certainly like to eventually move to a more stable (and precise) definition than "whatever the compiler accepts at the moment" :). >> there are several other things (e.g. the constantly improving buffer >> support) that are not part of this goal but extremely valuable. I >> imagine once we hit that target things will still be in flux > > I think that's somewhat orthogonal. I can imagine a stable language > core with > optional, advanced features. Both can well have different levels of > maturity, > both can mature in different release cycles - the more stable ones > in the > Python stdlib, and the more experimental or younger ones in > independent Cython > releases. It *is* orthogonal, my point was that I think pursuing this stuff is at least as important as the other goals we're talking about here. >> >> that inclusion into the Python stdlib will be premature at that >> point, but worth moving towards. > > I'm more thinking in terms of stabilising the core language and major > features. If we shift a bit of our current focus towards that goal, > a core > Cython release could even go into Py2.7. (Note that inclusion in > Py3.1 would > require migrating the source code to Py3 first, which is still a > bit of > additional work, but definitely worth it). What is the timeline for Py2.7? Also, if we migrate our source to Py3, will that exclude running it under Py2? Will we have to maintain two separate codebases? > >> One thing that is not clear is >> if a (sufficiently advanced?) user would be able to use the newer >> Cython with an older version of Python (say, if some projects/modules >> he needed weren't ported yet.) > > I'm not sure where you see the problem here. A user wouldn't > replace the core > Cython (i.e. everything else would keep working), but import a > separate Cython > install instead. If it could hook in and "hijack" the builtin one, especially if there's tighter integration of the latter, then that would be very cool.
> Just an idea: I would imagine that we will (finally) have to lower- > case the > package/module names when we move into the stdlib, but we may keep > (at least) > the upper-case Cython package for the stand-alone release. We could > then > install with a lower-case package (or both) under Python 2.3-2.6, > and with an > upper-case package name on Python versions that already have a > packaged Cython. The lower/upper case thing could be confusing to some, but might be worth it. > >> There is also the long term question >> of the parser which is very redundant with the one shipped with >> Python. (Would we try to migrate over to Python's ASTs? Could that be >> done without losing the old-style cdef syntax?) > > That's really a long-term question and something that we should > discuss with > the CPython developers. If we become an official part of CPython, I > imagine > that there will be ways to hook into the existing parser /somehow/. Yep. "cdef int foo(int y)" is just so convenient to write. It'd be nice to not have to duplicate 90% of the parser code, but I'm sure we'll be able to cross that bridge when we come to it. - Robert From stefan_ml at behnel.de Thu Nov 6 09:12:54 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 06 Nov 2008 09:12:54 +0100 Subject: [Cython] [Python-Dev] Using Cython for standard library?
In-Reply-To: <8288F5F6-2C86-409F-86EF-51A917290CCE@math.washington.edu> References: <392528d30810300855g33b8130flc3098f81700bab08@mail.gmail.com> <392528d30810300913w723271d2i855b4d78fc2cd9b5@mail.gmail.com> <4909EF67.5010104@trueblade.com> <4909FECF.7000703@voidspace.org.uk> <490E0747.8050401@behnel.de> <490EE0A0.2060700@ghaering.de> <76fd5acf0811030546y1984fd8fx467661351de22975@mail.gmail.com> <490F0784.4050408@ghaering.de> <4910AB32.7020309@v.loewis.de> <47682.213.61.181.86.1225872617.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <44CEE8BF-DEB2-4545-BCEB-A32293076B22@math.washington.edu> <4911F18A.3040002@behnel.de> <8288F5F6-2C86-409F-86EF-51A917290CCE@math.washington.edu> Message-ID: <4912A706.1050001@behnel.de> Hi, Robert Bradshaw wrote: > On Nov 5, 2008, at 11:18 AM, Stefan Behnel wrote: >> Robert Bradshaw wrote: >>> I think the most immediate official milestone is being able to >>> compile 100% of Cython code. This is the target for 1.0. >> Did you really mean Cython here or Python? I think 100% Python is >> somewhat >> hard to prove. I would expect that we are pretty close to compiling >> Py2.4 >> code, and a bit further from compiling 2.5 code (which mainly includes >> generator expressions). Not sure about 2.6, which was a heavily >> moving target >> last time I checked. > > Sorry, I meant Python here. I'd say the goal is at a minimum to run > the entire Python regression suite for 2.x, probably 2.4 or 2.5. Then runtests.py --sys-pyregr is your friend. :) Note that this requires the Python test suite to be installed, which isn't the case in a normal Python install (at least on most Linux distros). The best is to build your own clean Python somewhere apart and use that. If compiling Python is our first goal, I suggest that people get "sys-pyregr" running on a Python 2.5 installation and fix tests. When a test fails unexpectedly, I'd also suggest to check it again with Python 2.6, just to be sure it didn't change behaviour. 
>> If you meant Cython, this requires a stable definition of what the >> Cython language actually is. > > Yeah, I'd certainly like to eventually move to a more stable (and > precise) definition than "whatever the compiler accepts at the > moment" :). Yep, that's the state of the art. :) IMHO, the definition is something along the lines of Python 2.5 - some unsupportable features + some well defined C features + some additional features: buffer syntax, pure Python syntax, ... Some of that is in the Wiki already. > What is the timeline for Py2.7? I don't think there is one yet, but I'd expect a beta by next summer, given that 2.6 final is out, the developers are focussing on Py3.0 now, and 2.7/3.1 are expected to be cleanup releases. > Also, if we migrate our source to > Py3, will that exclude running it under Py2? Will we have to maintain > two separate codebases? I still hope that the 2to3 tool (or maybe a future 3to2 tool, as often discussed) will help us here once we get the code clean enough. Plex is a problem, for example, but not a major one either. I think the test suite is good enough by now to give these things another try. >>> One thing that is not clear is >>> if a (sufficiently advanced?) user would be able to use the newer >>> Cython with an older version of Python (say, if some projects/modules >>> he needed weren't ported yet.) >> I'm not sure where you see the problem here. A user wouldn't >> replace the core >> Cython (i.e. everything else would keep working), but import a >> separate Cython >> install instead. > > If it could hook in and "hijack" the builtin one, especially if > there's tighter integration of the latter, then that would be very cool. PyXML did that with the stock Python XML support - although not very beautifully. So, yes, there are ways to do these things. We will just have to find one that users can live with.
Stefan From metawilm at gmail.com Thu Nov 6 11:38:38 2008 From: metawilm at gmail.com (Willem Broekema) Date: Thu, 6 Nov 2008 11:38:38 +0100 Subject: [Cython] Overflow semantics in pure Python mode Message-ID: In Pure Python mode, code compiled (using annotations) to C should run equivalent to running (ignoring annotations) in the Python interpreter. That implies that a variable declared as cython.int may not rely on overflow behavior, as Python would automatically make a long object for such a result. And the same also holds for the other numeric types, from char to longlong. Right? - Willem From stefan_ml at behnel.de Thu Nov 6 14:14:10 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 6 Nov 2008 14:14:10 +0100 (CET) Subject: [Cython] Overflow semantics in pure Python mode In-Reply-To: References: Message-ID: <33046.213.61.181.86.1225977250.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Willem Broekema wrote: > In Pure Python mode, code compiled (using annotations) to C should run > equivalent to running (ignoring annotations) in the Python > interpreter. That implies that a variable declared as cython.int may > not rely on overflow behavior, as Python would automatically make a > long object for such a result. And the same also holds for the other > numeric types, from char to longlong. Right? Yep, developers should avoid writing broken code. 
:) Stefan From robertwb at math.washington.edu Thu Nov 6 17:17:10 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 6 Nov 2008 08:17:10 -0800 Subject: [Cython] Overflow semantics in pure Python mode In-Reply-To: <33046.213.61.181.86.1225977250.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <33046.213.61.181.86.1225977250.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <5944D4E7-51F4-4E57-8120-D32B0E23282E@math.washington.edu> On Nov 6, 2008, at 5:14 AM, Stefan Behnel wrote: > Willem Broekema wrote: >> In Pure Python mode, code compiled (using annotations) to C should >> run >> equivalent to running (ignoring annotations) in the Python >> interpreter. That implies that a variable declared as cython.int may >> not rely on overflow behavior, as Python would automatically make a >> long object for such a result. And the same also holds for the other >> numeric types, from char to longlong. Right? > > Yep, developers should avoid writing broken code. :) Um, No. If you declare a variable to be int then it *will* overflow when compiled but not (currently) when interpreted. - Robert From metawilm at gmail.com Thu Nov 6 17:37:45 2008 From: metawilm at gmail.com (Willem Broekema) Date: Thu, 6 Nov 2008 17:37:45 +0100 Subject: [Cython] Overflow semantics in pure Python mode In-Reply-To: <5944D4E7-51F4-4E57-8120-D32B0E23282E@math.washington.edu> References: <33046.213.61.181.86.1225977250.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <5944D4E7-51F4-4E57-8120-D32B0E23282E@math.washington.edu> Message-ID: On Thu, Nov 6, 2008 at 5:17 PM, Robert Bradshaw wrote: > Um, No. If you declare a variable to be int then it *will* overflow > when compiled but not (currently) when interpreted. The question was not intended as "will overflow happen?" but "may a developer make use of it?" The answer to the second question must be "no" otherwise the equivalence is gone. 
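[Editorial illustration, not part of the original thread.] The difference Robert and Willem are discussing can be sketched in plain Python. This is only a rough model: wrap32() is a hypothetical helper, not a Cython or CPython API, and it assumes a 32-bit two's-complement int, which is what typical platforms give you (signed overflow is formally undefined in C).

```python
# Illustrative sketch only. wrap32() is a hypothetical helper that mimics
# the wraparound of a 32-bit two's-complement C int. Signed overflow is
# undefined behaviour in C; this models what typical platforms do.

def wrap32(value):
    """Reduce an arbitrary Python integer to the signed 32-bit range."""
    value &= 0xFFFFFFFF          # keep only the low 32 bits
    if value >= 0x80000000:      # high bit set means negative
        value -= 0x100000000
    return value

def interpreted_increment(n):
    # Plain Python semantics: the result simply grows into a big integer.
    return n + 1

def compiled_increment(n):
    # Model of what a variable declared as a 32-bit C int would hold.
    return wrap32(n + 1)

INT_MAX = 2**31 - 1
print(interpreted_increment(INT_MAX))  # 2147483648
print(compiled_increment(INT_MAX))     # -2147483648
```

Code that depends on the second result behaves differently once the annotations are ignored, which is the equivalence Willem is asking about.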
- Willem From robertwb at math.washington.edu Fri Nov 7 02:07:56 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 6 Nov 2008 17:07:56 -0800 Subject: [Cython] Cython 0.10 release candidate In-Reply-To: <4A3B4890-8FEF-4940-B21F-72F527898A45@math.washington.edu> References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> <4910EEE4.5000707@canterbury.ac.nz> <4A3B4890-8FEF-4940-B21F-72F527898A45@math.washington.edu> Message-ID: Given the number of new features, I'm thinking it's time to bump a more major version number. (Probably should have done so last version, but got to start sometime.) Try the package up at http://cython.org/Cython-0.10.rc.tar.gz . Unless there are major issues, let's release. - Robert From robertwb at math.washington.edu Fri Nov 7 02:11:23 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 6 Nov 2008 17:11:23 -0800 Subject: [Cython] adding more constification In-Reply-To: References: Message-ID: <672EA171-79BA-4EBA-BE35-7406C8E13022@math.washington.edu> On Nov 5, 2008, at 10:37 AM, Lisandro Dalcin wrote: > I'm working on a patch adding 'const' to 'char *' arguments, and a few > explicit (char *) casts. > > The final idea is that for Py >= 2.5, we can then pass -Wwrite-strings > to GCC and no warnings should appear. Moreover, perhaps we can make > 'runtests.py' a bit smarter by pushing the -Wwrite-strings flag for > Py >= 2.5 and GCC. > > I did not take the Py2/3/2.4 case just because that would require the > generated C code to have many explicit, ugly (char *) casts everywhere. > But if this is not a problem, it could be done in the near future. > > Any objections? This sounds like a good idea. I haven't had the chance to test it on Sage, so it didn't make it into the next release which I want to be really safe. (Though the patch looks fine.)
- Robert From jasone at canonware.com Fri Nov 7 02:23:49 2008 From: jasone at canonware.com (Jason Evans) Date: Thu, 06 Nov 2008 17:23:49 -0800 Subject: [Cython] Minor issue with switch conversion Message-ID: <491398A5.3040001@canonware.com> The switch conversion, as described at: http://wiki.cython.org/enhancements/switch causes a compilation error for the following code: =============== cdef int x = 42 if x == 1: print x elif x == 2: print x else: pass =============== This is due to generating an empty default: clause. Jason From aaron.devore at gmail.com Fri Nov 7 04:21:17 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Thu, 6 Nov 2008 19:21:17 -0800 Subject: [Cython] constant Py_UNICODE arrays Message-ID: <2ead2fb0811061921y56fb1fb7r3dd8caf7c6ff068b@mail.gmail.com> I'm working with serializing an XML tree. To make it as fast as possible I am using a raw array of Py_UNICODE characters instead of "".join(l) or something of the sort. One problem I am running into is that I need to append certain strings many times (e.g. '<', ''). That incurs significant overhead in calls to the Python API. Here is a basic idea of what is happening right now (the real code is more complex): cdef void append(UnicodeBuffer *ubuffer, unicode un): ----Py_UNICODE *string = PyUnicode_AsUnicode(un) ----... append string to buffer ... cdef void render(UnicodeBuffer *ubuffer, Tag tag): ----append(buffer, u"<") ----append(buffer, tag.name) ----...render attributes with more append calls... ----append(buffer, u">") I'm trying to find something where I could also append with a function like this one: cdef void appendArray(UnicodeBuffer *ubuffer, Py_UNICODE *string, int length): ----...append string to buffer... One possibility is having several arrays of commonly used unicode strings sitting around. 
In that case render() from above might look like this: cdef void render(UnicodeBuffer *ubuffer, Tag tag): ----appendArray(buffer, ustring_lt, 1) ----append(buffer, tag.name) ----...render attributes with more append calls... ----append(buffer, ustring_gt, 1) What would be the best way to go about this? -Aaron DeVore From stefan_ml at behnel.de Fri Nov 7 06:57:31 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 06:57:31 +0100 Subject: [Cython] Minor issue with switch conversion In-Reply-To: <491398A5.3040001@canonware.com> References: <491398A5.3040001@canonware.com> Message-ID: <4913D8CB.9040809@behnel.de> Hi, Jason Evans wrote: > The switch conversion, as described at: > > http://wiki.cython.org/enhancements/switch > > causes a compilation error for the following code: > > =============== > cdef int x = 42 > > if x == 1: > print x > elif x == 2: > print x > else: > pass > =============== > > This is due to generating an empty default: clause. Thanks for this very complete bug report. Fixed. http://hg.cython.org/cython-devel/rev/1f4f07a63dba Stefan From stefan_ml at behnel.de Fri Nov 7 07:26:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 07:26:26 +0100 Subject: [Cython] constant Py_UNICODE arrays In-Reply-To: <2ead2fb0811061921y56fb1fb7r3dd8caf7c6ff068b@mail.gmail.com> References: <2ead2fb0811061921y56fb1fb7r3dd8caf7c6ff068b@mail.gmail.com> Message-ID: <4913DF92.7070403@behnel.de> Hi, Aaron DeVore wrote: > cdef void append(UnicodeBuffer *ubuffer, unicode un): > Py_UNICODE *string = PyUnicode_AsUnicode(un) > ... append string to buffer ... > > I'm trying to find something where I could also append with a function > like this one: > > cdef void appendArray(UnicodeBuffer *ubuffer, Py_UNICODE *string, int length): > ...append string to buffer... Why? 
Just use the PyUnicode_AS_UNICODE(un) macro for accessing the Py_UNICODE buffer, and PyUnicode_AS_DATA(un) to get the length, that avoids any redundant type checks or conversions. That way, you can just keep your original function without changing its signature. I may be biased since I've been working on the lxml XML library for quite a while now, but may I ask why you use unicode strings and Py_UNICODE internally, instead of a UTF-8 encoded byte buffer? > One possibility is having several arrays of commonly used unicode > strings sitting around. In that case render() from above might look > like this: > > cdef void render(UnicodeBuffer *ubuffer, Tag tag): > ----appendArray(buffer, ustring_lt, 1) > ----append(buffer, tag.name) > ----...render attributes with more append calls... > ----append(buffer, ustring_gt, 1) > > What would be the best way to go about this? Note that both unicode and byte strings are interned by Cython, so I'd just write u"<" and u">", which I find the most readable. The less short-term solution would actually be to make Py_UNICODE a known numeric type in Cython, and to do the conversion on the fly, as in cdef unicode u = u"some text" cdef Py_UNICODE* buffer = u # calls PyUnicode_AS_UNICODE(u) cdef Py_UNICODE ch = u # raise an exception as u is too long That would be in line with the current byte string <-> char* conversion. (not sure about the exception, BTW). Stefan From stefan_ml at behnel.de Fri Nov 7 07:34:43 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 07:34:43 +0100 Subject: [Cython] Overflow semantics in pure Python mode In-Reply-To: References: <33046.213.61.181.86.1225977250.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <5944D4E7-51F4-4E57-8120-D32B0E23282E@math.washington.edu> Message-ID: <4913E183.1090602@behnel.de> Hi, Willem Broekema wrote: > On Thu, Nov 6, 2008 at 5:17 PM, Robert Bradshaw > wrote: >> Um, No. 
If you declare a variable to be int then it *will* overflow >> when compiled but not (currently) when interpreted. > > The question was not intended as "will overflow happen?" but "may a > developer make use of it?" That's how I read it. It's exactly like writing cdef int i and relying on an overflow at 32bits, even on a 16 bit or 64 bit platform. I really think this is a non-problem. Few people would write pure Python code for Cython without the intention to run it in Python, i.e. with Python semantics. Stefan From stefan_ml at behnel.de Fri Nov 7 07:53:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 07:53:22 +0100 Subject: [Cython] constant Py_UNICODE arrays In-Reply-To: <4913DF92.7070403@behnel.de> References: <2ead2fb0811061921y56fb1fb7r3dd8caf7c6ff068b@mail.gmail.com> <4913DF92.7070403@behnel.de> Message-ID: <4913E5E2.4060009@behnel.de> Hi again, Stefan Behnel wrote: > PyUnicode_AS_DATA(un) to get the length I (obviously) meant PyUnicode_GET_SIZE(un) ... Stefan From zhen_qing_he_456 at 163.com Fri Nov 7 10:01:54 2008 From: zhen_qing_he_456 at 163.com (zhen_qing_he_456) Date: Fri, 7 Nov 2008 17:01:54 +0800 Subject: [Cython] how to initialize my c array in a quick way? Message-ID: <200811071701538906185@163.com> Hello: If in my function I declare a C array like: cdef int a[5] and I now want to initialize my array a, I can do the following: a[0] = 1 a[1] = 10 : : a[4] = 3 The values in my array are not consecutive. Is there a quicker way in Cython to do it, like in C: a[5] = {1,3,28,5,3} 2008-11-07 zhen_qing_he_456 From stefan_ml at behnel.de Fri Nov 7 10:14:27 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 10:14:27 +0100 Subject: [Cython] how to initialize my c array in a quick way?
In-Reply-To: <200811071701538906185@163.com> References: <200811071701538906185@163.com> Message-ID: <491406F3.3040708@behnel.de> Hi, zhen_qing_he_456 wrote: > If in my function I declare a C array like: > cdef int a[5] > and I now want to initialize my array a, I can do the following: > a[0] = 1 > a[1] = 10 > : > : > a[4] = 3 > The values in my array are not consecutive. Is there a quicker way in Cython to do it, like in C: > a[5] = {1,3,28,5,3} In the current Cython release candidate http://cython.org/Cython-0.10.rc.tar.gz you can do def test(): cdef int* a = [1,2,3,4,5] print a[0] Apparently, this currently only works inside functions, but that should be fixable. Stefan From dalcinl at gmail.com Fri Nov 7 13:44:59 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 7 Nov 2008 09:44:59 -0300 Subject: [Cython] adding more constification In-Reply-To: <672EA171-79BA-4EBA-BE35-7406C8E13022@math.washington.edu> References: <672EA171-79BA-4EBA-BE35-7406C8E13022@math.washington.edu> Message-ID: OK, now I'm building my own projects with -Wwrite-strings. As warnings appear, I'll be pushing fixes. On Thu, Nov 6, 2008 at 10:11 PM, Robert Bradshaw wrote: > On Nov 5, 2008, at 10:37 AM, Lisandro Dalcin wrote: > >> I'm working on a patch adding 'const' to 'char *' arguments, and a few >> explicit (char *) casts. >> >> The final idea is that for Py >= 2.5, we can then pass -Wwrite-strings >> to GCC and no warnings should appear. Moreover, perhaps we can make >> 'runtests.py' a bit smarter by pushing the -Wwrite-strings flag for >> Py >= 2.5 and GCC. >> >> I did not take the Py2/3/2.4 case just because that would require the >> generated C code to have many explicit, ugly (char *) casts everywhere. >> But if this is not a problem, it could be done in the near future. >> >> Any objections? > > This sounds like a good idea. I haven't had the chance to test it on > Sage, so it didn't make it into the next release which I want to be > really safe. (Though the patch looks fine.)
> > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Nov 7 19:58:51 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 07 Nov 2008 19:58:51 +0100 Subject: [Cython] Cython 0.10 release candidate In-Reply-To: References: <1B9C1C49-B178-4A67-AE75-A97E38901131@math.washington.edu> <4A94F2AF-6A59-4854-8E14-239CCFCA24D5@math.washington.edu> <4910EEE4.5000707@canterbury.ac.nz> <4A3B4890-8FEF-4940-B21F-72F527898A45@math.washington.edu> Message-ID: <49148FEB.3040706@behnel.de> Hi, Robert Bradshaw wrote: > Given the number of new features, I'm thinking it's time to bump a > more major version number. (Probably should have done so last > version, but got to start sometime.) > > Try the package up at http://cython.org/Cython-0.10.rc.tar.gz . Good call. I think it makes sense to go away from the 0.9-so-close-to-1.0 scheme. I was always thinking of Linux 1.0 the closer we got... It's fine with me if 1.0 follows 0.12 or whatever. > Unless there are major issues, let's release. Yay, go Cython 0.X!
:) Stefan From aaron.devore at gmail.com Sat Nov 8 01:02:01 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Fri, 7 Nov 2008 16:02:01 -0800 Subject: [Cython] constant Py_UNICODE arrays In-Reply-To: <4913DF92.7070403@behnel.de> References: <2ead2fb0811061921y56fb1fb7r3dd8caf7c6ff068b@mail.gmail.com> <4913DF92.7070403@behnel.de> Message-ID: <2ead2fb0811071602h2fed5f62t95a8c3ced47401d3@mail.gmail.com> On Thu, Nov 6, 2008 at 10:26 PM, Stefan Behnel wrote: > I may be biased since I've been working on the lxml XML library for quite a > while now, but may I ask why you use unicode strings and Py_UNICODE > internally, instead of a UTF-8 encoded byte buffer? The tree is built using a pure Python parser, even though the tree itself is in Cython. The strings are already passed from the parser as unicode objects so it's easier for me to just store a pointer to the PyObject. I don't know if there's a performance hit (or gain?) but I've found that method more convenient. If there's a better way then I would be happy to hear it. I'll take a look this evening at how Cython deals with interning strings. -Aaron From tav at espians.com Sun Nov 9 04:56:26 2008 From: tav at espians.com (tav) Date: Sun, 9 Nov 2008 03:56:26 +0000 Subject: [Cython] Optimising dict manipulation in extension types Message-ID: Hey all, Apologies if this is the wrong list to post to -- I couldn't find a cython-users list... I've been using Cython to speed up instantiation of a Namespace object written in Python: http://paste.lisp.org/display/69989 Fundamentally, what I am trying to do is: class Namespace: def __init__(self, **env): global id, store id += 1 keys = store[id] = sorted(env) new_env = [] for key in keys: obj = env[key] if PyFunction_Check(obj): obj = staticmethod(obj) new_env.append(obj) self.env = new_env The sort() and looping (for key in keys) seems to take up most of the time... how can I do this better so that it takes less time? 
The Cython version so far is at http://paste.lisp.org/display/69988 Please forgive my Cython newbie errors... Thanks! -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From tav at espians.com Sun Nov 9 05:13:55 2008 From: tav at espians.com (tav) Date: Sun, 9 Nov 2008 04:13:55 +0000 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: Message-ID: Whilst I'm at it, is there any advantage into turning the various lists into C arrays? I won't be accessing them from Python and thus was wondering if it would also be possible to speed up the __getattribute__ implementation in Namespace? Thanks! Do you guys hang out on IRC at all? On Sun, Nov 9, 2008 at 3:56 AM, tav wrote: > Hey all, > > Apologies if this is the wrong list to post to -- I couldn't find a > cython-users list... > > I've been using Cython to speed up instantiation of a Namespace object > written in Python: http://paste.lisp.org/display/69989 > > Fundamentally, what I am trying to do is: > > class Namespace: > def __init__(self, **env): > global id, store > id += 1 > keys = store[id] = sorted(env) > new_env = [] > for key in keys: > obj = env[key] > if PyFunction_Check(obj): > obj = staticmethod(obj) > new_env.append(obj) > self.env = new_env > > The sort() and looping (for key in keys) seems to take up most of the > time... how can I do this better so that it takes less time? > > The Cython version so far is at http://paste.lisp.org/display/69988 > > Please forgive my Cython newbie errors... Thanks! 
> > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From aaron.devore at gmail.com Sun Nov 9 05:38:28 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Sat, 8 Nov 2008 20:38:28 -0800 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: Message-ID: <2ead2fb0811082038q3d826e2fh85f01ede15fd7c2e@mail.gmail.com> I'm not sure if this solves the problem but it would help if you declared your variables with an actual type (list, tuple, dict, etc.) instead of just 'object'. That allows Cython to do optimizations like substituting PyList_Append(new_env, obj) for the more expensive new_env.append(obj). On Sat, Nov 8, 2008 at 8:13 PM, tav wrote: > Whilst I'm at it, is there any advantage into turning the various > lists into C arrays? I won't be accessing them from Python and thus > was wondering if it would also be possible to speed up the > __getattribute__ implementation in Namespace? Thanks! > > Do you guys hang out on IRC at all? > > On Sun, Nov 9, 2008 at 3:56 AM, tav wrote: >> Hey all, >> >> Apologies if this is the wrong list to post to -- I couldn't find a >> cython-users list... >> >> I've been using Cython to speed up instantiation of a Namespace object >> written in Python: http://paste.lisp.org/display/69989 >> >> Fundamentally, what I am trying to do is: >> >> class Namespace: >> def __init__(self, **env): >> global id, store >> id += 1 >> keys = store[id] = sorted(env) >> new_env = [] >> for key in keys: >> obj = env[key] >> if PyFunction_Check(obj): >> obj = staticmethod(obj) >> new_env.append(obj) >> self.env = new_env >> >> The sort() and looping (for key in keys) seems to take up most of the >> time... how can I do this better so that it takes less time? >> >> The Cython version so far is at http://paste.lisp.org/display/69988 >> >> Please forgive my Cython newbie errors... Thanks! 
>> >> -- >> love, tav >> >> plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 >> > > > > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From robertwb at math.washington.edu Sun Nov 9 06:20:26 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 8 Nov 2008 21:20:26 -0800 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: Message-ID: <6DDCC650-31C7-40D4-995B-9015A9E20517@math.washington.edu> On Nov 8, 2008, at 7:56 PM, tav wrote: > Hey all, > > Apologies if this is the wrong list to post to -- I couldn't find a > cython-users list... This is the right list. > I've been using Cython to speed up instantiation of a Namespace object > written in Python: http://paste.lisp.org/display/69989 > > Fundamentally, what I am trying to do is: > > class Namespace: > def __init__(self, **env): > global id, store > id += 1 > keys = store[id] = sorted(env) > new_env = [] > for key in keys: > obj = env[key] > if PyFunction_Check(obj): > obj = staticmethod(obj) > new_env.append(obj) > self.env = new_env > > The sort() and looping (for key in keys) seems to take up most of the > time... how can I do this better so that it takes less time? Short of writing your own sort algorithm there's not much of a way to speed up sort(). If you're just sorting strings here, that could probably be done a lot faster. However, depending on the size of env I might ask how important it is to have them sorted. As mentioned earlier, declaring new_env and keys as a list could help the .append method go faster. The "for key in keys" should be relatively fast, much faster than it is in Python. > The Cython version so far is at http://paste.lisp.org/display/69988 > > Please forgive my Cython newbie errors... Thanks! 
> > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Sun Nov 9 06:46:04 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 8 Nov 2008 21:46:04 -0800 Subject: [Cython] Cython 0.10 released Message-ID: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> Cython 0.10 is now officially released. There are lots of new features, most notably pure python compatible syntax for cython declarations, improved buffer support, better Python 3.0 support, and some major re-factoring of temporary allocation and code generation. There are also many bug fixes and optimizations. A (by no means complete) listing can be found at http://trac.cython.org/cython_trac/query?group=component&milestone=0.10 . Thanks to all of you who helped with this release, including those who submitted bug reports. - Robert From robertwb at math.washington.edu Sun Nov 9 07:42:21 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 8 Nov 2008 22:42:21 -0800 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: <491406F3.3040708@behnel.de> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> Message-ID: On Nov 7, 2008, at 1:14 AM, Stefan Behnel wrote: > Hi, > > zhen_qing_he_456 wrote: >> if in my fuction i declare a c array like: >> cdef int a[5] >> now if want to initialize my array a, i can do follow: >> a[0] = 1 >> a[1] = 10 >> : >> : >> a[4] = 3 >> the value to my array is not continuous.
Is there a quicker >> way in Cython to do it like in c: >> a[5] = {1,3,28,5,3} > > In the current Cython release candidate > > http://cython.org/Cython-0.10.rc.tar.gz > > you can do > > def test(): > cdef int* a = [1,2,3,4,5] > print a[0] > > Apparently, this currently only works inside functions, but that > should be > fixable. Hmm... it worked before the "assignment on declaration" restriction. http://trac.cython.org/cython_trac/ticket/113 - Robert From stefan_ml at behnel.de Sun Nov 9 08:06:07 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Nov 2008 08:06:07 +0100 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: Message-ID: <49168BDF.9020502@behnel.de> Hi, tav wrote: > Whilst I'm at it, is there any advantage into turning the various > lists into C arrays? I won't be accessing them from Python and thus > was wondering if it would also be possible to speed up the > __getattribute__ implementation in Namespace? Thanks! This is what you have: def __getattribute__(Namespace self, char *attr): cdef int i cdef object j, v i = 0 for j in store[self.id]: if j == attr: v = self.env[i] if hasattr(v, '__get__'): return v.__get__(None, self) return v i = i + 1 raise AttributeError("'Namespace' object has no attribute %r" % attr) If you change the "char* attr" into a plain "attr", this will speed up things considerably. In your code, Cython has to convert "attr" to a Python string on each loop to convert it to the Python string "j" (which is a really bad name for a Python string, BTW). I also don't quite understand the hasattr(v, "__get__"). Is that supposed to access a property? > On Sun, Nov 9, 2008 at 3:56 AM, tav wrote: >> Apologies if this is the wrong list to post to -- I couldn't find a >> cython-users list... I extended the list description a bit, so that it becomes clearer that this *is* the right list. 
>> I've been using Cython to speed up instantiation of a Namespace object >> written in Python: http://paste.lisp.org/display/69989 >> >> Fundamentally, what I am trying to do is: >> >> class Namespace: >> def __init__(self, **env): >> global id, store >> id += 1 >> keys = store[id] = sorted(env) >> new_env = [] >> for key in keys: >> obj = env[key] >> if PyFunction_Check(obj): >> obj = staticmethod(obj) >> new_env.append(obj) >> self.env = new_env >> >> The sort() and looping (for key in keys) seems to take up most of the >> time... how can I do this better so that it takes less time? Why is this important? Do you create them very often, or is the size of env the problem? We need to know what you want to optimise in order to help you. Note, BTW, that the speed of sort() is not influenced by Cython. Stefan From tav at espians.com Sun Nov 9 09:13:59 2008 From: tav at espians.com (tav) Date: Sun, 9 Nov 2008 08:13:59 +0000 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: <49168BDF.9020502@behnel.de> References: <49168BDF.9020502@behnel.de> Message-ID: Thanks for the fast responses -- it's given me a lot of confidence in using Cython! Are there plans to champion it to be included as part of Python's standard lib and how can I help with that? aaron> I'm not sure if this solves the problem but it would help aaron> if you declared your variables with an actual type aaron> (list, tuple, dict, etc.) instead of just 'object'. Thanks -- although the effects seem to be relatively minimal. But every little helps! =) robertwb> Short of writing your own sort algorithm there's not robertwb> much of a way to speed up sort(). If you're just robertwb> sorting strings here, that could probably be done a robertwb> lot faster. Hmz, I am sorting strings. Is there a special subset/subroutine of timsort that I can use for just sorting strings? stefan> If you change the "char* attr" into a plain "attr", this stefan> will speed up things considerably.
In your code, stefan> Cython has to convert "attr" to a Python string on stefan> each loop to convert it to the Python string "j" (which stefan> is a really bad name for a Python string, BTW). Ah thanks. I mistakenly thought that typing everything would help speed things up. Not really understanding the subtleties of when types are converted between Python and C types... stefan> I also don't quite understand the hasattr(v, "__get__"). stefan> Is that supposed to access a property? My intention was to mimic the behaviour of type.__getattribute__ in order to support Python's descriptor protocols... stefan> I extended the list description a bit, so that it becomes stefan> clearer that this *is* the right list. Thanks! tav> The sort() and looping (for key in keys) seems to take up tav> most of the time... how can I do this better so that it takes tav> less time? stefan> Why is this important? Do you create them very often, stefan> or is the size of env the problem? We need to know stefan> what you want to optimise in order to help you. Sorry, I should have been more explicit in my original email. Instead of using Python's class statements to define objects, I use nested function definitions (and closures) to define objects, e.g. class This: pass def Point(x, y): this = This() this.x = x this.y = y def getX(): return this.x def setX(value): this.x = value return Namespace() The advantage of this is that it lends itself to a secure way of programming. For example, untrusted code can be given access to getX: p = Point(1, 2) p.setX(56) untrusted_code(p.getX) And, unlike with class-based objects, the untrusted code will not be able to call setX(). This is all inspired by http://www.erights.org/elib/capability/ode/ode-capabilities.html Now, the idea behind Namespace() is to return an object that mimics the semantics of a traditional class, so that you can access attributes/methods using the traditional .dot syntax.
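A minimal plain-Python sketch of the closure-based object pattern described above may make it easier to follow. It is only an illustration under stated assumptions: the `Namespace` here is a hypothetical stand-in that takes its members as keyword arguments (matching the `__init__(self, **env)` shape quoted earlier) rather than the zero-argument `Namespace()` call in the message, and it makes no attempt at immutability or descriptor support.

```python
class This:
    """Plain mutable holder for the object's private state."""


class Namespace:
    """Hypothetical stand-in, not the pasted implementation.

    Stores sorted keys and a parallel list of values, as in the thread.
    """

    def __init__(self, **env):
        self._keys = sorted(env)
        self._values = [env[key] for key in self._keys]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            return self._values[self._keys.index(name)]
        except ValueError:
            raise AttributeError(name) from None


def Point(x, y):
    this = This()
    this.x = x
    this.y = y

    def getX():
        return this.x

    def setX(value):
        this.x = value

    # Assumption: members are passed explicitly instead of being
    # collected from the enclosing scope.
    return Namespace(getX=getX, setX=setX)


p = Point(1, 2)
p.setX(56)
getter = p.getX      # hand out only the getter
print(getter())
```

Holding only `getter` gives a caller read access without a reference to `setX`, which is the capability-style property the message is after.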
Unfortunately, this means that there will be millions of Namespace objects being created. As such I am trying to optimise: * Initialisation speed * Memory footprint * Attribute access speed Right now, in comparison to normal Python class-style objects, the Cython-based Namespace-style objects cost factors of 7x, 3x, 2x for those 3 metrics. Ideally, I'd like to bring those metrics down to just 1.5-2x for all 3 metrics. Fundamentally, Namespace just needs to be immutable and mimic the behaviour of: class Namespace: def __init__(self, **env): self.__dict__ = env # should also bound methods appropriately freeze(self) # make this instance immutable As an additional memory optimisation I decided to try and borrow a trick from the PyPy guys: http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#sharing-dicts Namespace() objects created from the same constructor are likely to have the same scope env dict. As such, space could be saved by sharing the keys amongst them and then using an array lookup to find the specific values. This also being the same reason why I was trying to sort() the keys -- so as to minimise the number of different combinations of keys() being stored. Sorry for going into so much detail -- hope it lends a bit of clarity. Any further insight and help would be greatly appreciated -- thanks! -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From tav at espians.com Sun Nov 9 09:19:45 2008 From: tav at espians.com (tav) Date: Sun, 9 Nov 2008 08:19:45 +0000 Subject: [Cython] Cython 0.10 released In-Reply-To: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> References: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> Message-ID: Thanks and well done for the release guys! A small note -- the following harmless errors pop up when easy_install'ing... 
warning: no files found matching 'CHANGES.txt' warning: no files found matching 'bin/update_references' warning: no files found matching 'Demos/*.pxd' Perhaps they can be removed from the MANIFEST.in for the next release? On Sun, Nov 9, 2008 at 5:46 AM, Robert Bradshaw wrote: > Cython 0.10 is now officially released. > > There are lots of new features, most notably pure python compatible > syntax for cython declarations, improved buffer support, better > Python 3.0 support, and some major re-factoring of temporary > allocation and code generation. There are also many bug fixes and > optimizations. A (by no means complete) listing can be found at > http://trac.cython.org/cython_trac/query? > group=component&milestone=0.10 . Thanks to all of you who helped with > this release, including those who submitted bug reports. > > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From stefan_ml at behnel.de Sun Nov 9 11:06:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Nov 2008 11:06:21 +0100 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: <49168BDF.9020502@behnel.de> Message-ID: <4916B61D.6090307@behnel.de> Hi, tav wrote: > robertwb> Short of writing your own sort algorithm there's no > robertwb> much of a way to speed up sort(). If you're just > robertwb> sorting strings here, that could probably be done a > robertwb> lot faster. > > Hmz, I am sorting strings. Is there a special subset/subrouting of > timsort that I can use for just sorting strings? > > stefan> If you change the "char* attr" into a plain "attr", this > stefan> will speed up things considerably. 
In your code, > stefan> Cython has to convert "attr" to a Python string on > stefan> each loop to convert it to the Python string "j" (which I meant: "to compare it to j". String comparison necessarily happens in Python space here. > stefan> is a really bad name for a Python string, BTW). > > Ah thanks. I mistakenly thought that typing everything would help > speed things up. > > Not really understanding the subtleties of when types are converted > between Python and C types... Whenever it's necessary. For a comparison, for example; on assignments, or when passing parameters to a function. You can see it in the generated C code, which is actually designed to be readable by humans. http://behnel.de/cgi-bin/weblog_basic/index.php?p=17 A good way to use timeit with Cython code is pyximport, BTW. Something like python -m timeit -s 'import pyximport; pyximport.install(); \ from mycythonmodule import myfunction' 'myfunction(somearg)' will show you how fast your code is without the hassle of recompiling your .pyx file manually after each change. It just compiles and imports the .pyx file on the fly. > stefan> I also don't quite understand the hasattr(v, "__get__"). > stefan> Is that supposed to access a property? > > My intention was to mimic the behaviour of type.__getattribute__ in > order to support Python's descriptor protocols... Try to restrict your code to handle the cases you really need. More general code should not be inside the critical path or a fast loop. > Instead of using Python's class statements to define objects, I use nested > function definitions (and closures) to define objects, e.g. > > class This: > pass > > def Point(x, y): > this = This() > this.x = x > this.y = y > def getX(): > return this.x > def setX(value): > this.x = value > return Namespace() > > The advantage of this is that lends itself to a secure way of > programming. 
For example, untrusted code can be given access to getX: > > p = Point(1, 2) > p.setX(56) > untrusted_code(p.getX) > > And, unlike with class-based objects, the untrusted code will not be > able to call setX(). Who needs setters when you can modify "this" directly? >>> def test(): ... a = [] ... def get(i): ... return a[i] ... return get >>> get = test() >>> get.func_closure[0].cell_contents.append(123) >>> get(0) 123 > Unfortunately, this means that there will be millions of Namespace > objects being created. As such I am trying to optimise: > > * Initialisation speed > * Memory footprint > * Attribute access speed Use a "cdef class" in Cython instead. It's implemented in C, so you need to modify the class instance memory at the C level in order to change the object in other ways than you allow. That's not secure, untrusted code may do that, but it's not trivial and therefore pretty unlikely at least. Stefan From tav at espians.com Sun Nov 9 11:40:46 2008 From: tav at espians.com (tav) Date: Sun, 9 Nov 2008 10:40:46 +0000 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: <4916B61D.6090307@behnel.de> References: <49168BDF.9020502@behnel.de> <4916B61D.6090307@behnel.de> Message-ID: > Whenever it's necessary. For a comparison, for example; on assignments, or > when passing parameters to a function. You can see it in the generated C code, > which is actually designed to be readable by humans. I've been using the -a flag and squinting at the generated C code -- it really helps with understanding what's going on! > python -m timeit -s 'import pyximport; pyximport.install(); \ > from mycythonmodule import myfunction' 'myfunction(somearg)' Thx! > Who needs setters when you can modify "this" directly? > > >>> def test(): > ... a = [] > ... def get(i): > ... return a[i] > ... 
return get > >>> get = test() > >>> get.func_closure[0].cell_contents.append(123) > >>> get(0) > 123 Hehe =) This bit of code protects against that: http://paste.lisp.org/display/70003 > Use a "cdef class" in Cython instead. It's implemented in C, so you need to > modify the class instance memory at the C level in order to change the object > in other ways than you allow. That's not secure, untrusted code may do that, > but it's not trivial and therefore pretty unlikely at least. Untrusted code is pure Python only =) On a separate note, I am having trouble with dynamically returning a method (function) via __getattribute__. I'd like to be able to dynamically return attributes like __str__/__len__ -- and for them to be used by the likes of str(), len(), etc. Is there some way of doing this? At the moment, the __str__ returned by my __getattribute__ is just ignored by str(). And in the only google result exploring the problem, http://mail.python.org/pipermail/python-list/2004-April/259971.html -- Greg suggests Pyrex (this was years ago), but does not explain how it could be used in a dynamic setting... Thanks! -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 From stefan_ml at behnel.de Sun Nov 9 11:56:56 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 09 Nov 2008 11:56:56 +0100 Subject: [Cython] Optimising dict manipulation in extension types In-Reply-To: References: <49168BDF.9020502@behnel.de> <4916B61D.6090307@behnel.de> Message-ID: <4916C1F8.4040303@behnel.de> Hi, tav wrote: > Stefan Behnel wrote: >> Who needs setters when you can modify "this" directly? >> >> >>> def test(): >> ... a = [] >> ... def get(i): >> ... return a[i] >> ... return get >> >>> get = test() >> >>> get.func_closure[0].cell_contents.append(123) >> >>> get(0) >> 123 > > Hehe =) > > This bit of code protects against that: http://paste.lisp.org/display/70003 Nice. Having PJE in there makes me suspect that it might actually work. 
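The interpreter session quoted above uses the Python 2 attribute name `func_closure`. As a small sketch of the same probe, assuming a Python 3 interpreter where the attribute is spelled `__closure__`, the closed-over list can be reached like this:

```python
def test():
    a = []

    def get(i):
        return a[i]

    return get


get = test()

# Each free variable of `get` lives in a cell; `a` is the only one here.
cell = get.__closure__[0]

# The cell exposes the captured object, so it can be mutated from outside.
cell.cell_contents.append(123)

print(get(0))
```

This only shows that closures alone do not hide state from code that can inspect function objects; it says nothing about the protection offered by the pasted code linked above.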
>> Use a "cdef class" in Cython instead. It's implemented in C, so you need to >> modify the class instance memory at the C level in order to change the object >> in other ways than you allow. That's not secure, untrusted code may do that, >> but it's not trivial and therefore pretty unlikely at least. > > Untrusted code is pure Python only =) ctypes? > On a separate note, I am having trouble with dynamically returning a > method (function) via __getattribute__. I'd like to be able to > dynamically return attributes like __str__/__len__ -- and for them to > be used by the likes of str(), len(), etc. My guess is that this is because __str__ and friends are exposed at the C level as part of the type struct. This likely overrides the __getattribute__ lookup mechanism. Maybe you can just implement them and let them delegate to the right functions instead. Stefan From ggellner at uoguelph.ca Sun Nov 9 15:09:28 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Sun, 9 Nov 2008 09:09:28 -0500 Subject: [Cython] Cython 0.10 released In-Reply-To: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> References: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> Message-ID: <20081109140928.GA32346@encolpuis> On Sat, Nov 08, 2008 at 09:46:04PM -0800, Robert Bradshaw wrote: > Cython 0.10 is now officially released. > > There are lots of new features, most notably pure python compatible > syntax for cython declarations, improved buffer support, better > Python 3.0 support, and some major re-factoring of temporary > allocation and code generation. There are also many bug fixes and > optimizations. A (by no means complete) listing can be found at > http://trac.cython.org/cython_trac/query? > group=component&milestone=0.10 . Thanks to all of you who helped with > this release, including those who submitted bug reports. > Wow great work everybody! 
Gabriel From zhen_qing_he_456 at 163.com Mon Nov 10 02:49:33 2008 From: zhen_qing_he_456 at 163.com (jay) Date: Mon, 10 Nov 2008 01:49:33 +0000 (UTC) Subject: [Cython] how to initialize my c array in a quick way? References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> Message-ID: > > zhen_qing_he_456 wrote: > >> if in my fuction i declare a c array like: > >> cdef int a[5] > >> now if want to initialize my array a, i can do follow: > >> a[0] = 1 > >> a[1] = 10 > >> : > >> : > >> a[4] = 3 > >> the value to my array is not continuous. Is there a quicker > >> way in Cython to do it like in c: > >> a[5] = {1,3,28,5,3} > > I don't think it's what i want... I want declare an array with a shape at the beginning of program,then in my function i will assign value to it by calculation. Is there a good way to do it? From greg.ewing at canterbury.ac.nz Mon Nov 10 05:38:41 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 10 Nov 2008 17:38:41 +1300 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> Message-ID: <4917BAD1.8080802@canterbury.ac.nz> jay wrote: > I want declare an array with a shape at the beginning of program,then in my > function i will assign value to it by calculation. > Is there a good way to do it? You can do a[0], a[1], a[2], a[3], a[4] = 1, 3, 28, 5, 3 which will get turned into a series of assignments. Ideally, what you *should* be able to do is a[:] = 1, 3, 28, 5, 3 but unless Cython has grown a feature I don't know about, that will only work if a is a Python sequence (or possibly a numpy array, although I'm not sure how efficient it will be in that case). -- Greg From stefan_ml at behnel.de Mon Nov 10 09:07:34 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 10 Nov 2008 09:07:34 +0100 (CET) Subject: [Cython] how to initialize my c array in a quick way? 
In-Reply-To: <4917BAD1.8080802@canterbury.ac.nz> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> <4917BAD1.8080802@canterbury.ac.nz> Message-ID: <33942.213.61.181.86.1226304454.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Greg Ewing wrote: > jay wrote: >> I want declare an array with a shape at the beginning of program,then >> in my function i will assign value to it by calculation. >> Is there a good way to do it? > > Ideally, what you *should* be able to do is > > a[:] = 1, 3, 28, 5, 3 > > but unless Cython has grown a feature I don't know about, As I said, this works already: cdef int* a = [1, 3, 28, 5, 3] However, a later assignment will not work. Your example above is absolutely the way to go, and it would even work in the case where a is declared as a plain pointer, not just a fixed size array. Stefan From dagss at student.matnat.uio.no Mon Nov 10 11:46:05 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 10 Nov 2008 11:46:05 +0100 Subject: [Cython] External typedefs and pointers Message-ID: <491810ED.8090904@student.matnat.uio.no> A discussion recently came up on the NumPy mailing list, and it inspired me to focus on this usability issue: cdef extern from "test.h": ctypedef int type_a ctypedef int type_b cdef type_a* ptr1 = NULL cdef type_b* ptr2 = ptr1 Now, it might happen (like with NumPy) that type_a and type_b are defined as separate types on some platforms and the same on others (through #ifdefs). Partly this is relied upon currently, making the situation a bit confusing, especially for new users. Possible solutions: 1) Make all external types pointer-incompatible. So "ptr2=ptr1" above will always fail, and an explicit cast is needed. 2) Introduce a new keyword, something like unknown_size: cdef extern from "test.h": ctypedef unknown_size int type_a ctypedef long type_b # we know this is always "long" ... which triggers pointer-incompatibility with anything. NB!
At the same time, all external primitive types are checked for the right declaration in Cython at module startup time: if (sizeof(type_b) != sizeof(long) || ((type_a)-1) != ((type_b)-1) || ((type_a)0.5) != ((type_b)0.5)) { /*raise exception*/ } (Not sure if that float check will work though, might need to do a division instead.) 3) Complex interaction with the C compiler so that Cython and the C compiler work "in one step". The Cython core is simplified so that it never checks pointer assignment compatibility, but rather takes any error the C compiler gives and translates it back to an error to the Cython user. Obviously not a short-term solution but if this is a long-term solution it might be enough reason not to bother with this for now. This seems to be the most stable solution, but it does give away the possibility to separate the Cython and C compilation stages as some people are fond of doing. Dag Sverre From dagss at student.matnat.uio.no Mon Nov 10 11:49:05 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 10 Nov 2008 11:49:05 +0100 Subject: [Cython] External typedefs and pointers In-Reply-To: <491810ED.8090904@student.matnat.uio.no> References: <491810ED.8090904@student.matnat.uio.no> Message-ID: <491811A1.2050001@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > if (sizeof(type_b) != sizeof(long) || ((type_a)-1) != ((type_b)-1) || > ((type_a)0.5) != ((type_b)0.5)) { /*raise exception*/ } > Sorry, this should be if (sizeof(type_b) != sizeof(long) || ((type_b)-1) != ((long)-1) || ((type_b)0.5) != ((long)0.5)) { /*raise exception*/ } i.e. it is checked that "type_b" really is exactly a "long" at C compilation time.
Dag Sverre From robertwb at math.washington.edu Mon Nov 10 20:05:40 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 10 Nov 2008 11:05:40 -0800 Subject: [Cython] Cython 0.10 released In-Reply-To: References: <8A6BFA6E-2FD0-44EC-AF2C-ACEBB45CF0AC@math.washington.edu> Message-ID: On Nov 9, 2008, at 12:19 AM, tav wrote: > Thanks and well done for the release guys! > > A small note -- the following harmless errors pop up when > easy_install'ing... > > warning: no files found matching 'CHANGES.txt' > warning: no files found matching 'bin/update_references' > warning: no files found matching 'Demos/*.pxd' > > Perhaps they can be removed from the MANIFEST.in for the next release? Done. http://hg.cython.org/cython-devel/rev/a916d70ed6f7 > > On Sun, Nov 9, 2008 at 5:46 AM, Robert Bradshaw > wrote: >> Cython 0.10 is now officially released. >> >> There are lots of new features, most notably pure python compatible >> syntax for cython declarations, improved buffer support, better >> Python 3.0 support, and some major re-factoring of temporary >> allocation and code generation. There are also many bug fixes and >> optimizations. A (by no means complete) listing can be found at >> http://trac.cython.org/cython_trac/query? >> group=component&milestone=0.10 . Thanks to all of you who helped with >> this release, including those who submitted bug reports. 
>> >> - Robert >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > > > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From stefan_ml at behnel.de Tue Nov 11 06:54:11 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 11 Nov 2008 06:54:11 +0100 Subject: [Cython] Pure python mode In-Reply-To: References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> Message-ID: <49191E03.70001@behnel.de> Hi, Robert Bradshaw wrote: > On Nov 5, 2008, at 6:52 AM, Ondrej Certik wrote: >> On Wed, Nov 5, 2008 at 2:52 PM, Stefan Behnel wrote: >>> @cython.cdef >>> @cython.locals(n=cython.int, _return=cython.int) >>> def fact(n): >>> ... >>> >>> would become >>> >>> cdef int fact(int n): >>> ... >> Yes, that looks perfectly ok. And the same for extension classes. Extension types are a different thing and behave different from cdef functions. I'd prefer having a separate decorator for them, something like @cython.cexttype(attr1=type, ...) class A: ... (BTW, should such a class - be allowed to or automatically - inherit from object in Python?) I'm not sure the attributes are necessary, though. 
It might be enough to write @cython.cexttype class A: attr1 = cython.declare(cython.int) ... But the problem is that this has different semantics in Python... Given that class annotations are not really a current Python thing anyway, I might also consider a) a special Cython metaclass class A: __metaclass__ = cython.cexttype or b) simply use inheritance class A(..., cython.CExtType): where cython.CExtType 'is' object when running in Python. > Dag mentioned an "earlybind" rather than "cdef" as it would have more > meaning to newcomers. I actually like @cython.cdef, but only for functions. In Python, it conflicts too much with the 'def' keyword. OTOH, people who know C might be happier with @cython.static or @cython.cfunc. That would also make it clear why the function disappears from the module namespace when compiled. > There's also the issue of cpdef vs. cdef, > should that be a flag to the decorator, or a new decorator. Hmm, good question. What about @cython.def(fast_c_call=True) def myfunc(): as an equivalent to cpdef? I think that makes it perfectly clear what cpdef does. Maybe this is enough: @cython.fastccall def myfunc(): > I'm not a fan of the _return parameter, I think it belongs in the > "cdef" decorator as it doesn't make sense without it. It would be > nice if one could declare argument types there as well (though that > introduces some redundancy with locals). Perhaps the first > (prepositional) argument would be the return type. The idea of a first positional argument for the return type totally makes sense to me. The only problem here is that this makes @cython.locals() a non-intuitive name, as the return type is everything but local. @cython.locals(cython.int, a=cython.float, ...) def myfunc(a): Why not use a separate decorator @cython.returns(cython.int) @cython.locals(a=cython.float, ...) def myfunc(a): Saving one line of code isn't always worth it, and declaring return types is not as common as declaring local variables. 
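A rough way to picture the two-decorator proposal above in plain Python: the decorators only need to record the declarations and hand the function back unchanged. The sketch below is an assumption for illustration only. `typed_locals` and `returns` are hypothetical stand-ins written for this example, not Cython's actual decorators, and the attribute names they set are invented.

```python
# Hypothetical no-op shims, NOT Cython's API: they store the declared types
# on the function object and otherwise leave the function untouched.

def typed_locals(**types):
    """Stand-in for a locals-style decorator: remember declared variable types."""
    def decorate(func):
        func.declared_locals = dict(types)
        return func
    return decorate


def returns(return_type):
    """Stand-in for a returns-style decorator: remember the declared return type."""
    def decorate(func):
        func.declared_return = return_type
        return func
    return decorate


@returns(int)
@typed_locals(n=int, result=int)
def fact(n):
    # Ordinary Python body; the decorators above do not change its behaviour.
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


print(fact(5))
print(fact.declared_return, fact.declared_locals)
```

Under plain Python the function behaves exactly as if it were undecorated, which is the property the thread wants to keep; a compiler could read the same declarations statically.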
> There's also Py3 > syntax for annotating arguments, though I think we'll be wanting to > support Py2 for quite a while. That's more than a nice-to-have, though. I think if we make it into the stdlib, it's most likely to get there somewhere in the Py3 series. So supporting this would be consistent with the language level by the time. Note that this would only impact @cython.locals (and @cython.returns), the rest would still be required. So the Cython Py2 annotations and the official Py3 function annotations are actually orthogonal, and we should support both in Py3, as people might want to use their function annotations for other stuff as well. Stefan From robertwb at math.washington.edu Tue Nov 11 09:08:47 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 11 Nov 2008 00:08:47 -0800 Subject: [Cython] Pure python mode In-Reply-To: <49191E03.70001@behnel.de> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> Message-ID: <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> On Nov 10, 2008, at 9:54 PM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> On Nov 5, 2008, at 6:52 AM, Ondrej Certik wrote: >>> On Wed, Nov 5, 2008 at 2:52 PM, Stefan Behnel wrote: >>>> @cython.cdef >>>> @cython.locals(n=cython.int, _return=cython.int) >>>> def fact(n): >>>> ... >>>> >>>> would become >>>> >>>> cdef int fact(int n): >>>> ... >>> Yes, that looks perfectly ok. 
And the same for extension classes. > > Extension types are a different thing and behave different from cdef > functions. I'd prefer having a separate decorator for them, > something like > > @cython.cexttype(attr1=type, ...) > class A: > ... I would prefer cython.cclass(...) or cython.extension_type(...) > (BTW, should such a class - be allowed to or automatically - > inherit from > object in Python?) Don't all types inherit from object already? > I'm not sure the attributes are necessary, though. It might be > enough to write > > @cython.cexttype > class A: > attr1 = cython.declare(cython.int) > ... > > But the problem is that this has different semantics in Python... The semantics are not different enough to be a bad syntax. I like being able to specify the attributes in the decorator as well though. > Given that class annotations are not really a current Python thing > anyway, I > might also consider > > a) a special Cython metaclass > > class A: > __metaclass__ = cython.cexttype > > >> > or b) simply use inheritance > > class A(..., cython.CExtType): > > where cython.CExtType 'is' object when running in Python. Both of these are interesting, but I have to say I prefer the decorator form, but it's not available < Py3. The __metaclass__ seems to fit what's happening better (and could also take attribute/other arguments). >> Dag mentioned an "earlybind" rather than "cdef" as it would have more >> meaning to newcomers. > > I actually like @cython.cdef, but only for functions. In Python, it > conflicts > too much with the 'def' keyword. Yeah, it doesn't make as much sense for classes. > OTOH, people who know C might be happier with @cython.static or > @cython.cfunc. > That would also make it clear why the function disappears from the > module > namespace when compiled. Static means to many (different) things in too many languages. I like cfunc. @cdef def seems rather redundant and meaningless to non-cython people. >> There's also the issue of cpdef vs. 
cdef, >> should that be a flag to the decorator, or a new decorator. > > Hmm, good question. What about > > @cython.def(fast_c_call=True) > def myfunc(): > > as an equivalent to cpdef? I think that makes it perfectly clear > what cpdef > does. Maybe this is enough: > > @cython.fastccall > def myfunc(): Or maybe even "ccall" as the fast should be obvious. I think passing in a keyword argument gets too verbose, and I'd like to use that to type arguments as well. I'm still in brainstorming mode, as I'm not totally happy with any of the options we have come up with so far. >> I'm not a fan of the _return parameter, I think it belongs in the >> "cdef" decorator as it doesn't make sense without it. It would be >> nice if one could declare argument types there as well (though that >> introduces some redundancy with locals). Perhaps the first >> (prepositional) argument would be the return type. > > The idea of a first positional argument for the return type totally > makes > sense to me. The only problem here is that this makes @cython.locals > () a > non-intuitive name, as the return type is everything but local. > > @cython.locals(cython.int, a=cython.float, ...) > def myfunc(a): > > Why not use a separate decorator > > @cython.returns(cython.int) > @cython.locals(a=cython.float, ...) > def myfunc(a): +1 > Saving one line of code isn't always worth it, and declaring return > types is > not as common as declaring local variables. locals is for typing all local variables; it just so happens that function arguments are local variables as well. I think the same decorator used to declare the function c[p]def should take an (optional) argument specifying the return type, as well as be able to type arguments if desired. >> There's also Py3 >> syntax for annotating arguments, though I think we'll be wanting to >> support Py2 for quite a while. > > That's more than a nice-to-have, though.
I think if we make it into > the > stdlib, it's most likely to get there somewhere in the Py3 series. So > supporting this would be consistent with the language level by the > time. > > Note that this would only impact @cython.locals (and > @cython.returns), the > rest would still be required. So the Cython Py2 annotations and the > official > Py3 function annotations are actually orthogonal, and we should > support both > in Py3, as people might want to use their function annotations for > other stuff > as well. Yep. - Robert From stefan_ml at behnel.de Tue Nov 11 10:22:03 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 11 Nov 2008 10:22:03 +0100 (CET) Subject: [Cython] Pure python mode In-Reply-To: <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> Message-ID: <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Bradshaw wrote: > On Nov 10, 2008, at 9:54 PM, Stefan Behnel wrote: >> @cython.cexttype(attr1=type, ...) >> class A: >> ... > > I would prefer cython.cclass(...) or cython.extension_type(...) > [...] >> people who know C might be happier with @cython.static or >> @cython.cfunc. > [...] >> @cython.fastccall >> def myfunc(): > > Or maybe even "ccall" as the fast should be obvious. 
I like how simple "cclass" is, and how well it matches "cfunc" and "ccall". So what about @cython.cclass(attr1=cython.int, ...) # Py3 only @cython.cfunc(cython.int, arg1=cython.int, ...) @cython.ccall(cython.int, arg1=cython.int, ...) I think they target nicely distinct use cases and make it perfectly clear what's happening. Additionally, you can use @cython.locals(var1=..., ...) with @cython.cfunc, @cython.ccall or plain functions. Not sure if we should allow declaring function parameters in the cfunc/ccall cases. It makes sense in the plain function case, though. Now I don't even see a use case for @cython.returns(cython.int) anymore, as all functions where it makes sense would either be ccall or cfunc. >> (BTW, should such a class - be allowed to or automatically - >> inherit from object in Python?) > > Don't all types inherit from object already? No, only in Py3. In Py2, if you write class A: pass you will get an old-style class. I would love to rule that out for classes that Cython will compile to cdef classes. >> I'm not sure the attributes are necessary, though. It might be >> enough to write >> >> @cython.cexttype >> class A: >> attr1 = cython.declare(cython.int) >> ... >> >> But the problem is that this has different semantics in Python... > > The semantics are not different enough to be a bad syntax. Well, the difference is that the above gives you a class attribute in Python, which users usually (but not necessarily!) overwrite by instance attribute setting in the constructor. In any case, it stays alive in the class and can be accessed there, be it through the instance or the class. However, for extension types, the attribute is /always/ per instance. > I like > being able to specify the attributes in the decorator as well though. Yes, I think the above syntax gets more readable the more attributes you have, but the decorator/metaclass syntax allows you to keep Python semantics. 
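To make the class-attribute point concrete, here is a minimal plain-Python sketch. The names `A` and `attr1` are taken from the example above; nothing in it is Cython-specific, and the initial value 0 is just a placeholder assumption.

```python
class A:
    attr1 = 0  # class attribute: stored once, on the class object


a = A()
b = A()
a.attr1 = 5  # creates an instance attribute on `a` that shadows the class one

print(A.attr1)  # the class attribute is still there, unchanged
print(a.attr1)  # the instance attribute wins on lookup through `a`
print(b.attr1)  # `b` has no instance attribute, so lookup falls back to the class
print("attr1" in vars(a), "attr1" in vars(b))
```

In an extension type there would be no such fallback slot on the class, which is why the declaration-in-the-body spelling does not mean quite the same thing in the two worlds.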
>> Given that class annotations are not really a current Python thing >> anyway, I might also consider >> >> a) a special Cython metaclass >> >> class A: >> __metaclass__ = cython.cexttype >> >> or b) simply use inheritance >> >> class A(..., cython.CExtType): >> >> where cython.CExtType 'is' object when running in Python. > > Both of these are interesting, but I have to say I prefer the > decorator form, but it's not available < Py3. The __metaclass__ seems > to fit what's happening better (and could also take attribute/other > arguments). Yes, it would match the decorator exactly. Stefan From hoytak at cs.ubc.ca Wed Nov 12 02:04:04 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Tue, 11 Nov 2008 17:04:04 -0800 Subject: [Cython] help tracking down TypeError Message-ID: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> Hello, I've got a fairly large cython module that I've just made a lot of changes to. It compiles fine, but when I try to import it, it just gives this cryptic error message: TypeError: Item in ``from list'' not a string Here's the traceback: In [1]: import geometry --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/hoytak/workspace/gravimetrics/spatial/ in () /home/hoytak/workspace/gravimetrics/spatial/geometry.pyx in spatial.geometry (spatial/geometry.c:6001)() 2 3 ----> 4 import numpy.random as rn 5 from copy import copy 6 TypeError: Item in ``from list'' not a string > /home/hoytak/workspace/gravimetrics/spatial/geometry.pyx(4)spatial.geometry (spatial/geometry.c:6001)() 3 ----> 4 import numpy.random as rn 5 from copy import copy I'm pretty sure that it doesn't have anything to do with the line in question, as changing/removing it doesn't change things. If anyone has a clue where I should look, please let me know. I've attached the pyx/pxd files in question just in case. I'm using the latest cython version from the mercurial repo. Thanks!!! 
--Hoyt ++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ -------------- next part -------------- A non-text attachment was scrubbed... Name: geometry.tar.gz Type: application/x-gzip Size: 2513 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20081111/c8dd6096/attachment-0001.bin From robert.kern at gmail.com Wed Nov 12 02:19:44 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 11 Nov 2008 19:19:44 -0600 Subject: [Cython] help tracking down TypeError In-Reply-To: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> Message-ID: Hoyt Koepke wrote: > Hello, > > I've got a fairly large cython module that I've just made a lot of > changes to. It compiles fine, but when I try to import it, it just > gives this cryptic error message: > > > TypeError: Item in ``from list'' not a string > > > Here's the traceback: > > > In [1]: import geometry > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > > /home/hoytak/workspace/gravimetrics/spatial/ in () > > /home/hoytak/workspace/gravimetrics/spatial/geometry.pyx in > spatial.geometry (spatial/geometry.c:6001)() > 2 > 3 > ----> 4 import numpy.random as rn > 5 from copy import copy > 6 > > TypeError: Item in ``from list'' not a string >> /home/hoytak/workspace/gravimetrics/spatial/geometry.pyx(4)spatial.geometry (spatial/geometry.c:6001)() > 3 > ----> 4 import numpy.random as rn > 5 from copy import copy > > > > I'm pretty sure that it doesn't have anything to do with the line in > question, as changing/removing it doesn't change things. Are you sure? I changed the line to this: from numpy import random as rn And it worked with Cython 0.10. 
Possibly the code that handles "import . as " is incorrect. It's a tricky bit of syntax. Stylistically, though, I would suggest just doing "from numpy import random". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From hoytak at cs.ubc.ca Wed Nov 12 02:38:01 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Tue, 11 Nov 2008 17:38:01 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> Message-ID: <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> > Are you sure? I changed the line to this: > > from numpy import random as rn > > And it worked with Cython 0.10. Possibly the code that handles "import > . as " is incorrect. It's a tricky bit of syntax. You're right; this does work. I think in trying a bunch of combinations to figure it out, I must have missed recompiling something and so missed this. Sorry for the noise! > Stylistically, though, I would suggest just doing "from numpy import random". Okay, good tip. Style is important :-) --Hoyt ++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From hoytak at cs.ubc.ca Wed Nov 12 02:48:28 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Tue, 11 Nov 2008 17:48:28 -0800 Subject: [Cython] contiguous array access Message-ID: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> Hello, I'm trying to use the new contiguous array support and I've hit a bit of a problem. I'm curious if this is a bug or is intentional. 
When I do cdef ndarray[float, mode="c"] Xc = X where X is a 2 dimensional numpy array, it raises an exception: File "/home/hoytak/workspace/gravimetrics/spatial/gravity.pyx", line 101, in spatial.gravity._setGFFromSamples (spatial/gravity.c:1242) cdef ndarray[float, mode="c"] Xc = X ValueError: Buffer has wrong number of dimensions (expected 1, got 2) However, doing cdef ndarray[float, mode="c"] Xc = X.reshape(-1) seems to work fine. If this behavior is intentional, what's the best way of using contiguous mode with 2+ dim arrays? Thanks!!! --Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From dagss at student.matnat.uio.no Wed Nov 12 11:19:07 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 12 Nov 2008 11:19:07 +0100 Subject: [Cython] contiguous array access In-Reply-To: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> References: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> Message-ID: <491AAD9B.5080500@student.matnat.uio.no> Hoyt Koepke wrote: > Hello, > > I'm trying to use the new contiguous array support and I've hit a bit > of a problem. I'm curious if this is a bug or is intentional. When I > do > > cdef ndarray[float, mode="c"] Xc = X > > where X is a 2 dimensional numpy array, it raises an exception: > > File "/home/hoytak/workspace/gravimetrics/spatial/gravity.pyx", line > 101, in spatial.gravity._setGFFromSamples (spatial/gravity.c:1242) > cdef ndarray[float, mode="c"] Xc = X > ValueError: Buffer has wrong number of dimensions (expected 1, got 2) Yes, this is intentional. The "mode" parameter doesn't affect how you access the array, so you still need to specify the "ndim" parameter (which defaults to 1). 
So:

    cdef np.ndarray[float, mode="c", ndim=2] Xc = X
    print Xc[2,3]

The only difference is that the second line will be very slightly faster than if mode were set to the default "strided", but at the cost that an exception is raised if the array is not contiguous.

If you want to access the underlying buffer, you can do it like this:

    cdef np.ndarray[float, mode="c", ndim=2] Xc = X
    cdef float* buf = Xc.data

Or if you do not know the number of dimensions:

    cdef np.ndarray Xc = X
    if Xc.dtype != np.float or not Xc.flags["C_CONTIGUOUS"]:
        raise something
    cdef float* buf = Xc.data

-- 
Dag Sverre

From dagss at student.matnat.uio.no Wed Nov 12 11:25:26 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 12 Nov 2008 11:25:26 +0100
Subject: [Cython] contiguous array access
In-Reply-To: <491AAD9B.5080500@student.matnat.uio.no>
References: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> <491AAD9B.5080500@student.matnat.uio.no>
Message-ID: <491AAF16.4020005@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> Or if you do not know the number of dimensions:
>
> cdef np.ndarray Xc = X
> if Xc.dtype != np.float or not Xc.flags["C_CONTIGUOUS"]:
>     raise something
> cdef float* buf = Xc.data
>

I'm changing my recommendation to this instead:

    if X.dtype != np.float or not X.flags["C_CONTIGUOUS"]:
        raise something
    cdef float* buf = np.PyArray_DATA(X)

(I suppose the "X.data" could be removed in the future because it might be potentially confusing that it does something different if X is typed as ndarray rather than if not:

    cdef ndarray Xc = X
    X.data # is a Python buffer instance
    Xc.data # is a char*

PyArray_DATA can always be used if you want the latter one. Opinions?)
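For anyone who wants to poke at the "ndim and contiguity are checked when the buffer is acquired" idea without compiling anything, the standard-library memoryview exposes the same buffer-protocol fields. The following is only a sketch under assumptions: `require_c_buffer` is a hypothetical helper written for this example (it is not part of Cython or NumPy), and its error text merely imitates the message quoted earlier in the thread.

```python
import struct


def require_c_buffer(obj, ndim=1):
    """Hypothetical helper: return a memoryview of obj or raise ValueError.

    Mirrors the two checks discussed above: number of dimensions first,
    then C contiguity.
    """
    view = memoryview(obj)
    if view.ndim != ndim:
        raise ValueError(
            "Buffer has wrong number of dimensions (expected %d, got %d)"
            % (ndim, view.ndim))
    if not view.c_contiguous:
        raise ValueError("Buffer is not C contiguous")
    return view


# Six doubles packed into bytes, then viewed as a 2x3 C-contiguous buffer.
raw = struct.pack("6d", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
X = memoryview(raw).cast("d", shape=[2, 3])

Xc = require_c_buffer(X, ndim=2)
print(Xc.shape, Xc.strides, Xc[1, 2])

try:
    require_c_buffer(X)  # ndim defaults to 1, so a 2-D buffer is rejected
except ValueError as exc:
    print(exc)
```

The point of the design is that the dimension count is part of the declared type, so it has to be stated up front even when the memory layout is known to be contiguous.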
-- Dag Sverre From hoytak at cs.ubc.ca Wed Nov 12 17:16:43 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Wed, 12 Nov 2008 08:16:43 -0800 Subject: [Cython] contiguous array access In-Reply-To: <491AAF16.4020005@student.matnat.uio.no> References: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> <491AAD9B.5080500@student.matnat.uio.no> <491AAF16.4020005@student.matnat.uio.no> Message-ID: <4db580fd0811120816i27b77d44x841ab8ea7b5d9f48@mail.gmail.com> Hey Dag, Thanks a bunch for your helpful answer. This clears things up. The most common use of this for me will be to do the same thing to every element, regardless of whether or not it's contiguous. In my case, it's for a type of Monte Carlo simulation. I still use the array in python so it doesn't make sense to create a low-level c array with malloc/free. As a quick thought: I don't know how difficult it would be to implement, but a mode for this kind of use would be nice. Something like mode="unordered" to emphasize the use case. I guess you already have that with the above code, but I've found the new buffer syntax so nice that I don't particularly like going back to the direct method. I would be quite happy to try and implement this, however, I'm in the first year of my phd program so between classes and research time is quite scarce. I haven't worked on the cython code yet, though I've long wanted to and plan to start contributing as soon as time permits. 
--Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From dagss at student.matnat.uio.no Wed Nov 12 17:40:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 12 Nov 2008 17:40:25 +0100 Subject: [Cython] contiguous array access In-Reply-To: <4db580fd0811120816i27b77d44x841ab8ea7b5d9f48@mail.gmail.com> References: <4db580fd0811111748j5be82580vef7bdc6130f98a0@mail.gmail.com> <491AAD9B.5080500@student.matnat.uio.no> <491AAF16.4020005@student.matnat.uio.no> <4db580fd0811120816i27b77d44x841ab8ea7b5d9f48@mail.gmail.com> Message-ID: <491B06F9.5050405@student.matnat.uio.no> Hoyt Koepke wrote: > As a quick thought: I don't know how difficult it would be to > implement, but a mode for this kind of use would be nice. Something > like mode="unordered" to emphasize the use case. I guess you already > have that with the above code, but I've found the new buffer syntax so > nice that I don't particularly like going back to the direct method. > Yes, "ndim-generic" syntax would certainly be nice. However it is almost there already (sorry I forgot about this earlier): cdef np.ndarray[mode="c", ndim=1] arr = otherarr.ravel() The ravel() method gives a 1D view, and if the data was contiguous originally then no copy is made and modifications to arr will also be made to otherarr (better make an explicit check for contiguousness and copy back if needed). There's a related issue though: Unfortunately I don't have time myself to work much on Cython these days; but what I thought would be nice in this area is to have some kind of magic ndenumerate function, much like NumPy's (but works with all buffers): # Multiply array with 2 np.ndarray[np.float, ndim=None] X = ...
for index, value in cython.ndenumerate(X): X[index] = value * 2 Here "index" would be kind of magic, as it could contain multiple dimensions (a tuple, but if it is never unpacked it is never actually constructed as a tuple). The order of traversal would always be the most efficient. If mode="c" extra optimizations could then be made. Until that happens (if it ever will), you can use NumPy's generic ndim-iterator objects (which are available in the NumPy C interface). I haven't exposed them in numpy.pxd but would be very happy for a patch to that file and an example for use of the ndim-iterators in Cython. Dag Sverre From robert.kern at gmail.com Thu Nov 13 01:16:01 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 12 Nov 2008 18:16:01 -0600 Subject: [Cython] help tracking down TypeError In-Reply-To: <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> Message-ID: Hoyt Koepke wrote: >> Are you sure? I changed the line to this: >> >> from numpy import random as rn >> >> And it worked with Cython 0.10. Possibly the code that handles "import >> . as " is incorrect. It's a tricky bit of syntax. > > You're right; this does work. I think in trying a bunch of > combinations to figure it out, I must have missed recompiling > something and so missed this. Sorry for the noise! It's not noise. This does appear to be a bug. 
/* "/Users/rkern/today/geometry/geometry.pyx":4
 *
 *
 * import numpy.random as rn             # <<<<<<<<<<<<<<
 * from copy import copy
 *
 */
  __pyx_1 = PyList_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  Py_INCREF(__pyx_kp_29);
  PyList_SET_ITEM(__pyx_1, 0, __pyx_kp_29);
  __pyx_2 = __Pyx_Import(__pyx_kp_28, ((PyObject *)__pyx_1)); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0;
  if (PyObject_SetAttr(__pyx_m, __pyx_kp_rn, __pyx_2) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  Py_DECREF(__pyx_2); __pyx_2 = 0;

__pyx_kp_29 is never given a value in the generated code so the from_list is invalid. I am guessing that the "as" is fooling the code generator into thinking that there is a from_list.

If that gets fixed, there's more stuff that's broken. The module that gets returned from __Pyx_Import() as __pyx_2 will be numpy, not numpy.random, so there must be one or more PyObject_GetAttr()s to get the actual module that should be assigned to the name "rn".

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

From hoytak at cs.ubc.ca Thu Nov 13 03:04:51 2008
From: hoytak at cs.ubc.ca (Hoyt Koepke)
Date: Wed, 12 Nov 2008 18:04:51 -0800
Subject: [Cython] bug in generated cython -a html output
Message-ID: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com>

Hello,

In trying to understand what is going on at the c level with some cython code, I found that the html code generated with cython -a doesn't always match the code in the c file. I attached a rather confusing case I found. It's line 125 in the html file and line 1720 in the c code.
In essence, I'm trying to do buffer access; the cython code is p1.y = y_edges[yi] where y_edges is a buffer and p1 is an extension type with y defined as an attribute. The (sensible) c code generated in gridworld.c is: __pyx_t_9 = __pyx_v_yi; __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges However, the html file shows this plus a ton of junk: 125: p1.y = y_edges[yi] __pyx_t_9 = __pyx_v_yi; __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges)); __pyx_2 = PyObject_GetAttr(__pyx_v_iterator, __pyx_kp_next); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} __pyx_3 = PyObject_Call(__pyx_2, ((PyObject *)__pyx_empty_tuple), NULL); if (unlikely(!__pyx_3)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_2); __pyx_2 = 0; __pyx_2 = __Pyx_GetItemInt(__pyx_3, 1, 0); if (!__pyx_2) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_3); __pyx_3 = 0; __pyx_3 = __Pyx_GetItemInt(__pyx_2, 0, 0); if (!__pyx_3) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_2); __pyx_2 = 0; if (!(__Pyx_TypeTest(__pyx_3, __pyx_ptype_5numpy_dtype))) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(((PyObject *)__pyx_v_descr)); __pyx_v_descr = ((PyArray_Descr *)__pyx_3); __pyx_3 = 0; } goto __pyx_L17_try; __pyx_L13_error:; Py_XDECREF(__pyx_2); __pyx_2 = 0; Py_XDECREF(__pyx_3); __pyx_3 = 0; 126: .... If you'd like me to file a bug report, I'd be happy to. I'm using the latest version of cython from the repo, 1321:6e8c09631af4. Thanks! 
--Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ -------------- next part -------------- A non-text attachment was scrubbed... Name: gwgenerated.tar.bz2 Type: application/x-bzip2 Size: 27776 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20081112/997bc49d/attachment-0001.bin From robertwb at math.washington.edu Thu Nov 13 04:05:10 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 12 Nov 2008 19:05:10 -0800 Subject: [Cython] bug in generated cython -a html output In-Reply-To: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com> References: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com> Message-ID: On Nov 12, 2008, at 6:04 PM, Hoyt Koepke wrote: > Hello, > > In trying to understand what is going on at the c level with some > cython code, I found that the html code generated with cython -a > doesn't always match the code in the c file. It should always match, but sometimes there's some C extra code hanging off. > I attached a rather > confusing case I found. It's line 125 in the html file and line 1720 > in the c code. > > In essence, I'm trying to do buffer access; the cython code is > > p1.y = y_edges[yi] > > where y_edges is a buffer and p1 is an extension type with y defined > as an attribute. 
The (sensible) c code generated in gridworld.c is: > > __pyx_t_9 = __pyx_v_yi; > __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, > __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges > > However, the html file shows this plus a ton of junk: > > 125: p1.y = y_edges[yi] > __pyx_t_9 = __pyx_v_yi; > __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, > __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges)); > > __pyx_2 = PyObject_GetAttr(__pyx_v_iterator, > __pyx_kp_next); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[1]; > __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} > __pyx_3 = PyObject_Call(__pyx_2, ((PyObject > *)__pyx_empty_tuple), NULL); if (unlikely(!__pyx_3)) {__pyx_filename = > __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto > __pyx_L13_error;} > Py_DECREF(__pyx_2); __pyx_2 = 0; > __pyx_2 = __Pyx_GetItemInt(__pyx_3, 1, 0); if (!__pyx_2) > {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = > __LINE__; goto __pyx_L13_error;} > Py_DECREF(__pyx_3); __pyx_3 = 0; > __pyx_3 = __Pyx_GetItemInt(__pyx_2, 0, 0); if (!__pyx_3) > {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = > __LINE__; goto __pyx_L13_error;} > Py_DECREF(__pyx_2); __pyx_2 = 0; > if (!(__Pyx_TypeTest(__pyx_3, __pyx_ptype_5numpy_dtype))) > {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = > __LINE__; goto __pyx_L13_error;} > Py_DECREF(((PyObject *)__pyx_v_descr)); > __pyx_v_descr = ((PyArray_Descr *)__pyx_3); > __pyx_3 = 0; > } > goto __pyx_L17_try; > __pyx_L13_error:; > Py_XDECREF(__pyx_2); __pyx_2 = 0; > Py_XDECREF(__pyx_3); __pyx_3 = 0; > > 126: .... It looks like your line is just before a loop? What is going on here is that the compiler periodically emmits something that says "I'm at this spot now" and the annotator tries to make correspondences between the Cython and C code based on that. 
If you finish a line, but the compiler doesn't explicitly say it's working on the next line, then all the extra stuff gets appended to the end. This also happens at the end of functions, etc. where there is cleanup code that gets tagged onto the last line. > If you'd like me to file a bug report, I'd be happy to. I'm using the > latest version of cython from the repo, 1321:6e8c09631af4. Yes, please do. Try and get it down to the smallest example that generates this behavior, if possible. - Robert From robertwb at math.washington.edu Thu Nov 13 04:05:38 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 12 Nov 2008 19:05:38 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> Message-ID: <20F54975-55E8-4ABB-9059-6B6A3AF4F091@math.washington.edu> Could you file a bug report? On Nov 12, 2008, at 4:16 PM, Robert Kern wrote: > Hoyt Koepke wrote: >>> Are you sure? I changed the line to this: >>> >>> from numpy import random as rn >>> >>> And it worked with Cython 0.10. Possibly the code that handles >>> "import >>> . as " is incorrect. It's a tricky bit of syntax. >> >> You're right; this does work. I think in trying a bunch of >> combinations to figure it out, I must have missed recompiling >> something and so missed this. Sorry for the noise! > > It's not noise. This does appear to be a bug. 
> > /* "/Users/rkern/today/geometry/geometry.pyx":4 > * > * > * import numpy.random as rn # <<<<<<<<<<<<<< > * from copy import copy > * > */ > __pyx_1 = PyList_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = > __pyx_f[0]; __pyx_lineno = 4; __pyx_c > lineno = __LINE__; goto __pyx_L1_error;} > Py_INCREF(__pyx_kp_29); > PyList_SET_ITEM(__pyx_1, 0, __pyx_kp_29); > __pyx_2 = __Pyx_Import(__pyx_kp_28, ((PyObject *)__pyx_1)); if > (unlikely(!__pyx_2)) {__pyx_filename = __ > pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto > __pyx_L1_error;} > Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0; > if (PyObject_SetAttr(__pyx_m, __pyx_kp_rn, __pyx_2) < 0) > {__pyx_filename = > __pyx_f[0]; __pyx_lineno = 4; > __pyx_clineno = __LINE__; goto __pyx_L1_error;} > Py_DECREF(__pyx_2); __pyx_2 = 0; > > > _pyx_kp_29 is never given a value in the generated code so the > from_list is > invalid. I am guessing that the "as" is fooling the code generator > into thinking > that there is a from_list. > > If that gets fixed, there's more stuff that's broken. The module > that gets > returned from __Pyx_Import() as __pyx_2 will be numpy, not > numpy.random, so > there must be one or more PyObject_GetAttr()s to get the actual > module that > should be assigned to the name "rn". > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a > harmless enigma > that is made terrible by our own mad attempt to interpret it as > though it had > an underlying truth." 
> -- Umberto Eco > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From michael.abshoff at googlemail.com Thu Nov 13 04:15:36 2008 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Wed, 12 Nov 2008 19:15:36 -0800 Subject: [Cython] bug in generated cython -a html output In-Reply-To: References: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com> Message-ID: <491B9BD8.6080002@gmail.com> Robert Bradshaw wrote: > On Nov 12, 2008, at 6:04 PM, Hoyt Koepke wrote: > It looks like your line is just before a loop? What is going on here > is that the compiler periodically emits something that says "I'm at > this spot now" and the annotator tries to make correspondences > between the Cython and C code based on that. If you finish a line, > but the compiler doesn't explicitly say it's working on the next > line, then all the extra stuff gets appended to the end. This also > happens at the end of functions, etc. where there is cleanup code > that gets tagged onto the last line. > >> If you'd like me to file a bug report, I'd be happy to. I'm using the >> latest version of cython from the repo, 1321:6e8c09631af4. > > Yes, please do. Try and get it down to the smallest example that > generates this behavior, if possible. I have something in the same direction: some of the "boilerplate" code that Cython generates is appended at the end of the generated file. On occasion I run into issues in Sage for example that I debug and the issue points to that area of the code. It is not always clear initially that I am in that area of the code and at least the first time it happened it did cause some confusion until I figured out that the code I was staring at did not come from the pyx file. So can we change the code generation so that it adds some extra information to those generated code sections?
It is certainly not something life-threatening, so feel free to ignore this feature request. > - Robert Cheers, Michael From robertwb at math.washington.edu Thu Nov 13 04:19:52 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 12 Nov 2008 19:19:52 -0800 Subject: [Cython] bug in generated cython -a html output In-Reply-To: <491B9BD8.6080002@gmail.com> References: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com> <491B9BD8.6080002@gmail.com> Message-ID: On Nov 12, 2008, at 7:15 PM, Michael Abshoff wrote: > Robert Bradshaw wrote: >> On Nov 12, 2008, at 6:04 PM, Hoyt Koepke wrote: > > > > >> It looks like your line is just before a loop? What is going on here >> is that the compiler periodically emits something that says "I'm at >> this spot now" and the annotator tries to make correspondences >> between the Cython and C code based on that. If you finish a line, >> but the compiler doesn't explicitly say it's working on the next >> line, then all the extra stuff gets appended to the end. This also >> happens at the end of functions, etc. where there is cleanup code >> that gets tagged onto the last line. >> >>> If you'd like me to file a bug report, I'd be happy to. I'm >>> using the >>> latest version of cython from the repo, 1321:6e8c09631af4. >> >> Yes, please do. Try and get it down to the smallest example that >> generates this behavior, if possible. > > I have something in the same direction: some of the "boilerplate" code > that Cython generates is appended at the end of the generated file. On > occasion I run into issues in Sage for example that I debug and the > issue points to that area of the code.
It is not always clear > initially > that I am in that area of the code and at least the first time it > happened it did cause some confusion until I figured out that the > code I was staring at did not come from the pyx file. So can we change > the code generation so that it adds some extra information to those > generated code sections? It is certainly not something life > threatening, > so feel free to ignore this feature request. Yes, what we need to do is have the compiler emit a position marker that says it's entering boilerplate code, just like the "real" positions it uses. I think a lot of people would benefit from this. - Robert From stefan_ml at behnel.de Thu Nov 13 08:09:57 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Nov 2008 08:09:57 +0100 Subject: [Cython] help tracking down TypeError In-Reply-To: References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> Message-ID: <491BD2C5.9070000@behnel.de> Hi, Robert Kern wrote: > Hoyt Koepke wrote: >>> Are you sure? I changed the line to this: >>> >>> from numpy import random as rn Hmm, I added this as tests/run/importas.pyx, and it works perfectly: =========================== __doc__ = u""" >>> import sys as sous >>> import distutils.core as corey >>> from copy import copy as copey >>> sous is _sous True >>> corey is _corey True >>> copey is _copey True >>> _sous is not None True >>> _corey is not None True >>> _copey is not None True >>> print(_sous.__name__) sys >>> print(sous.__name__) sys >>> print(_corey.__name__) distutils.core >>> print(corey.__name__) distutils.core >>> print(_copey.__name__) deepcopy >>> print(copey.__name__) deepcopy """ import sys as _sous import distutils.core as _corey from copy import copy as _copey =========================== Is numpy.random a module or a module attribute? >>> And it worked with Cython 0.10. Possibly the code that handles "import >>> . as " is incorrect.
It's a tricky bit of syntax. >> You're right; this does work. I think in trying a bunch of >> combinations to figure it out, I must have missed recompiling >> something and so missed this. Sorry for the noise! > > It's not noise. This does appear to be a bug. > > /* "/Users/rkern/today/geometry/geometry.pyx":4 > * > * > * import numpy.random as rn # <<<<<<<<<<<<<< > * from copy import copy > * > */ > __pyx_1 = PyList_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = > __pyx_f[0]; __pyx_lineno = 4; __pyx_c > lineno = __LINE__; goto __pyx_L1_error;} > Py_INCREF(__pyx_kp_29); > PyList_SET_ITEM(__pyx_1, 0, __pyx_kp_29); > __pyx_2 = __Pyx_Import(__pyx_kp_28, ((PyObject *)__pyx_1)); if > (unlikely(!__pyx_2)) {__pyx_filename = __ > pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} > Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0; > if (PyObject_SetAttr(__pyx_m, __pyx_kp_rn, __pyx_2) < 0) {__pyx_filename = > __pyx_f[0]; __pyx_lineno = 4; > __pyx_clineno = __LINE__; goto __pyx_L1_error;} > Py_DECREF(__pyx_2); __pyx_2 = 0; > > > _pyx_kp_29 is never given a value in the generated code so the from_list is > invalid. I am guessing that the "as" is fooling the code generator into thinking > that there is a from_list. It's not a general bug in any case. The constant _kp_29 is assigned its value from _k_29 in __Pyx_InitStrings() by walking __pyx_string_tab, so if _k_29 has a value in the code and appears in the string table, _kp_29 will initialised as well. > If that gets fixed, there's more stuff that's broken. The module that gets > returned from __Pyx_Import() as __pyx_2 will be numpy, not numpy.random, so > there must be one or more PyObject_GetAttr()s to get the actual module that > should be assigned to the name "rn". Hmmm, could you try to extend the test case above so that it shows the problem? 
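Editorial sketch, not part of the original message: the concern quoted above (getting numpy back instead of numpy.random) mirrors how the builtin __import__ treats dotted names. The snippet below shows that behaviour in plain Python, using the standard-library package email.mime as a stand-in for numpy.random. It assumes that __Pyx_Import ends up calling __import__ with the from-list built in the quoted C code, which the generated code suggests but this thread does not confirm.

```python
import importlib

# Without a from-list, __import__ on a dotted name returns the
# TOP-LEVEL package, not the submodule that was asked for.
top = __import__("email.mime")

# With a non-empty from-list (here ["*"], as in the quoted C code),
# __import__ returns the module named by the full dotted path.
sub = __import__("email.mime", globals(), locals(), ["*"])

# importlib.import_module always returns the submodule itself.
same = importlib.import_module("email.mime")

print(top.__name__)
print(sub.__name__)
print(same is sub)
```

If that assumption holds, the from-list is what makes "import a.b as c" bind the submodule rather than the top-level package, so no extra attribute lookups would be needed.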
Stefan From robertwb at math.washington.edu Thu Nov 13 10:23:28 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 13 Nov 2008 01:23:28 -0800 Subject: [Cython] Pure python mode In-Reply-To: <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On Nov 11, 2008, at 1:22 AM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> On Nov 10, 2008, at 9:54 PM, Stefan Behnel wrote: >>> @cython.cexttype(attr1=type, ...) >>> class A: >>> ... >> >> I would prefer cython.cclass(...) or cython.extension_type(...) >> [...] >>> people who know C might be happier with @cython.static or >>> @cython.cfunc. >> [...] >>> @cython.fastccall >>> def myfunc(): >> >> Or maybe even "ccall" as the fast should be obvious. > > I like how simple "cclass" is, and how well it matches "cfunc" and > "ccall". > > So what about > > @cython.cclass(attr1=cython.int, ...) # Py3 only > > @cython.cfunc(cython.int, arg1=cython.int, ...) > > @cython.ccall(cython.int, arg1=cython.int, ...) > > I think they target nicely distinct use cases and make it perfectly > clear > what's happening. Additionally, you can use > > @cython.locals(var1=..., ...) 
> > with @cython.cfunc, @cython.ccall or plain functions. Not sure if we > should allow declaring function parameters in the cfunc/ccall > cases. It > makes sense in the plain function case, though. I think that's fine, there's going to be a lot of cases where one only cares about typing the arguments (especially once we have type inference). > Now I don't even see a use case for > > @cython.returns(cython.int) > > anymore, as all functions where it makes sense would either be > ccall or > cfunc. Yep. >>> (BTW, should such a class - be allowed to or automatically - >>> inherit from object in Python?) >> >> Don't all types inherit from object already? > > No, only in Py3. In Py2, if you write > > class A: pass > > you will get an old-style class. I would love to rule that out for > classes > that Cython will compile to cdef classes. Ah, classes (not extension classes). I think we should retain the Py2 behavior in Py2. But @cython.cclass class A: pass would necessarily inherit from object. >>> I'm not sure the attributes are necessary, though. It might be >>> enough to write >>> >>> @cython.cexttype >>> class A: >>> attr1 = cython.declare(cython.int) >>> ... >>> >>> But the problem is that this has different semantics in Python... >> >> The semantics are not different enough to be a bad syntax. > > Well, the difference is that the above gives you a class attribute in > Python, which users usually (but not necessarily!) overwrite by > instance > attribute setting in the constructor. In any case, it stays alive > in the > class and can be accessed there, be it through the instance or the > class. > However, for extension types, the attribute is /always/ per instance. Yep. I also like how it's a direct analogue of the "cdef int attr1" that we use now (in case anyone's translating old code to be use in pure Python mode). >> I like >> being able to specify the attributes in the decorator as well though. 
> > Yes, I think the above syntax gets more readable the more > attributes you > have, but the decorator/metaclass syntax allows you to keep Python > semantics. > > >>> Given that class annotations are not really a current Python thing >>> anyway, I might also consider >>> >>> a) a special Cython metaclass >>> >>> class A: >>> __metaclass__ = cython.cexttype >>> >>> or b) simply use inheritance >>> >>> class A(..., cython.CExtType): >>> >>> where cython.CExtType 'is' object when running in Python. >> >> Both of these are interesting, but I have to say I prefer the >> decorator form, but it's not available < Py3. The __metaclass__ seems >> to fit what's happening better (and could also take attribute/other >> arguments). > > Yes, it would match the decorator exactly. OK, let's do that then. - Robert From robertwb at math.washington.edu Thu Nov 13 10:40:18 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 13 Nov 2008 01:40:18 -0800 Subject: [Cython] External typedefs and pointers In-Reply-To: <491810ED.8090904@student.matnat.uio.no> References: <491810ED.8090904@student.matnat.uio.no> Message-ID: On Nov 10, 2008, at 2:46 AM, Dag Sverre Seljebotn wrote: > A discussion recently came up on the NumPy mailing list, and it > inspired > me to focus on this usability issue: > > cdef extern from "test.h": > ctypedef int type_a > ctypedef int type_b > cdef type_a* ptr1 = NULL > cdef type_b* ptr2 = ptr1 > > Now, it might happen (like with NumPy) that type_a and type_b are > defined as separate types on some platforms and the same on others > (through #ifdefs). Partly this is relied upon currently, making the > situation a bit confusing, especially for new users. > > Possible solutions: > > 1) Make all external types pointer-incompatible. So "ptr2=ptr1" above > will always fail, and an explicit cast is needed. I think this is probably the way to go. Presumably there's a reason to have two separate types.
> 2) Introduce a new keyword, something like unknown_size: > > cdef extern from "test.h": > ctypedef unknown_size int type_a > ctypedef long type_b # we know this is always "long" > ... > > which triggers pointer-incompatibility with anything. NB! At the same > time, all external primitive types are checked for the right > declaration > in Cython at module startup time: > > if (sizeof(type_b) != sizeof(long) || ((type_a)-1) != ((type_b)-1) || > ((type_a)0.5) != ((type_b)0.5)) { /*raise exception*/ } > > (Not sure if that float check will work though, might need to do a > division instead.) I have to admit I'm not a fan of either the new keyword or waiting until runtime to throw the error. > 3) Complex interaction with the C compiler so that Cython and the C > compiler work "in one step". The Cython core is simplified so that it > never checks pointer assignment compatibility, but rather takes any > error > the C compiler gives and translates it back to an error to the Cython > user. Obviously not a short-term solution but if this is a long-term > solution it might be enough reason not to bother with this for now. > This > seems to be the most stable solution, but it does give away the > possibility to separate the Cython and C compilation stages as some > people are fond of doing. The biggest problem with 2 and 3 is that people often ship the .c files, which are then compiled and run on the end user's machine. Much worse to delay the error to this point, and it may work on some machines and not on others. Better to raise an error at Cython compile time and force the programmer to think about what to do.
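Editorial sketch, not part of the original message: the startup check quoted under option 2 compares size, signedness and integer-versus-float kind of two scalar types. The same comparison can be tried out in Python with ctypes. The helper names describe and pointer_compatible are hypothetical and exist only for this illustration; nothing here is Cython API.

```python
import ctypes

def describe(ctype):
    """Return (size in bytes, is_signed, is_float) for a ctypes scalar type."""
    is_float = isinstance(ctype(0).value, float)
    # For integer types, -1 wraps around to a large positive value
    # when the type is unsigned (ctypes does no overflow checking).
    is_signed = is_float or ctype(-1).value < 0
    return ctypes.sizeof(ctype), is_signed, is_float

def pointer_compatible(a, b):
    """Hypothetical check: same size, same signedness, same kind."""
    return describe(a) == describe(b)

print(pointer_compatible(ctypes.c_int32, ctypes.c_uint32))
print(pointer_compatible(ctypes.c_float, ctypes.c_int32))
# The answer for int versus long depends on the platform, which is
# exactly why deferring the check to the end user's machine is risky.
print(pointer_compatible(ctypes.c_int, ctypes.c_long))
```

The last call can print True on one platform and False on another, which illustrates the objection to discovering the mismatch only at build or run time on the end user's machine.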
- Robert From stefan_ml at behnel.de Thu Nov 13 14:29:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Nov 2008 14:29:29 +0100 (CET) Subject: [Cython] Pure python mode In-Reply-To: References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061025n696ea13xe81af4231eb04ff0@mail.gmail.com> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <53562.213.61.181.86.1226582969.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Robert Bradshaw wrote: > On Nov 11, 2008, at 1:22 AM, Stefan Behnel wrote: >> @cython.cclass(attr1=cython.int, ...) # Py3 only This actually works in Py2.6 also, where a @cython.cclass class A: pass would still be an old-style class. So, using a class decorator here will not guard us from decorated classes becoming old-style classes in CPython. The decorator actually gets the readily created class as argument, which is an old-style class already. Fixing that into a new-style class will not work. So the only chance we have is really a metaclass, or just an enforced requirement that any decorated class must inherit from object, i.e. it must be a type. Not only in this case would it be nice to allow extension classes to inherit from object in Cython without any additional setup. Currently, you have to ctypedef the object type yourself to do that. 
We could override an "object" base class and just emit a plain extension type instead. BTW, Py3 (not Py2.x) would also allow this syntax when using a metaclass: class A(metaclass=cython.cclass): pass Stefan From uschmitt at mineway.de Thu Nov 13 15:24:34 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Thu, 13 Nov 2008 15:24:34 +0100 Subject: [Cython] problems with external data In-Reply-To: References: <491810ED.8090904@student.matnat.uio.no> Message-ID: <491C38A2.4060804@mineway.de> Hi, I got a problem in the following situation when wrapping some C code: file svdlib.h contains: extern char *SVDVersion; extern long SVDVerbosity; file svlib.pyx begins with: cdef extern from "svdlib.h" : cdef extern long SVDVerbosity cdef extern char * SVDVersion If I study the generated C code, Cython handles SVDVersion as a char*, but SVDVerbosity as a PyObject*. If I try to reduce this problem to a simple test case, everything works fine :-( Any hints why Cython does not recognize SVDVerbosity? Greetings, Uwe From uschmitt at mineway.de Thu Nov 13 15:35:20 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Thu, 13 Nov 2008 15:35:20 +0100 Subject: [Cython] construction numpy arrays In-Reply-To: <491C38A2.4060804@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> Message-ID: <491C3B28.1030200@mineway.de> Hi, is there a fast and simple way to convert a given chunk of memory to a corresponding numpy array? Or do I have to construct the array entry by entry? Greetings, Uwe -- Dr. rer. nat. Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing.
Mathias Bauer Amtsgericht Saarbrücken HRB 12339 From robertwb at math.washington.edu Thu Nov 13 19:01:35 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 13 Nov 2008 10:01:35 -0800 Subject: [Cython] construction numpy arrays In-Reply-To: <491C3B28.1030200@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C3B28.1030200@mineway.de> Message-ID: On Nov 13, 2008, at 6:35 AM, Uwe Schmitt wrote: > Hi, > > is there a fast and simple way to convert a given > chunk of memory to a corresponding numpy array ? > Or do I have to construct the array entry by entry ? You can grab the NumPy array's underlying data as a void* and do a memcpy. Perhaps you can even avoid the memcpy, but I'm not sure. - Robert From stefan_ml at behnel.de Thu Nov 13 19:27:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Nov 2008 19:27:22 +0100 Subject: [Cython] problems with external data In-Reply-To: <491C38A2.4060804@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> Message-ID: <491C718A.9050805@behnel.de> Hi, please don't hijack other people's threads; start a new one for a new topic. Uwe Schmitt wrote: > I got a problem in the following situation when wrapping > some c code: > > file svdlib.h contains: > > extern char *SVDVersion; > extern long SVDVerbosity; > > file svlib.pyx begins with: > > cdef extern from "svdlib.h" : > > cdef extern long SVDVerbosity > cdef extern char * SVDVersion > > > If I study the generated C code, Cython > handles SVDVersion as a char*, but SVDVerbosity > as a PyObject*. Could you forward a section of your code and the corresponding generated C code that shows this behaviour?
Stefan From jasone at canonware.com Thu Nov 13 19:56:33 2008 From: jasone at canonware.com (Jason Evans) Date: Thu, 13 Nov 2008 10:56:33 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: <20F54975-55E8-4ABB-9059-6B6A3AF4F091@math.washington.edu> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <20F54975-55E8-4ABB-9059-6B6A3AF4F091@math.washington.edu> Message-ID: <491C7861.5000400@canonware.com> Robert Bradshaw wrote: > Could you file a bug report? How does one file a bug report these days? The last time I tried (a couple of weeks ago), it looked like the wiki had been locked up so that an administrator-created account is necessary. Thanks, Jason From robertwb at math.washington.edu Thu Nov 13 20:00:05 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 13 Nov 2008 11:00:05 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: <491C7861.5000400@canonware.com> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <20F54975-55E8-4ABB-9059-6B6A3AF4F091@math.washington.edu> <491C7861.5000400@canonware.com> Message-ID: On Nov 13, 2008, at 10:56 AM, Jason Evans wrote: > Robert Bradshaw wrote: >> Could you file a bug report? > > How does one file a bug report these days? The last time I tried (a > couple of weeks ago), it looked like the wiki had been locked up so > that > an administrator-created account is necessary. The page is here http://trac.cython.org/cython_trac/ I had to lock it down a bit because we were being deluged by spam. Email me a htpasswd file and I'll add it as an account. 
- Robert From dagss at student.matnat.uio.no Thu Nov 13 20:14:52 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 13 Nov 2008 20:14:52 +0100 Subject: [Cython] construction numpy arrays In-Reply-To: References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C3B28.1030200@mineway.de> Message-ID: <491C7CAC.4070009@student.matnat.uio.no> Robert Bradshaw wrote: > On Nov 13, 2008, at 6:35 AM, Uwe Schmitt wrote: > >> Hi, >> >> is there a fast and simple way to convert a given >> chunk of memory to a coresponding numpy array ? >> Or do I have to construct the array entry by entry ? > > You can grab the numpy underlying data as a void* and do a memcpy. > Perhaps you can even avoid the memcpy, but I'm not sure. > There are ways of handing off the memory to NumPy arrays directly, but it is not very trivial. This question is better asked on the NumPy mailing list; as it's been answered in various forms there before and they will know it a lot better (I don't know the details myself). -- Dag Sverre From robert.kern at gmail.com Thu Nov 13 23:25:45 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 13 Nov 2008 16:25:45 -0600 Subject: [Cython] help tracking down TypeError In-Reply-To: <491BD2C5.9070000@behnel.de> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> Message-ID: Stefan Behnel wrote: > Hi, > > Robert Kern wrote: >> Hoyt Koepke wrote: >>>> Are you sure? 
I changed the line to this: >>>> >>>> from numpy import random as rn > > Hmm, I added this as tests/run/importas.pyx, and it works perfectly: > > =========================== > __doc__ = u""" >>>> import sys as sous >>>> import distutils.core as corey >>>> from copy import copy as copey > >>>> sous is _sous > True >>>> corey is _corey > True >>>> copey is _copey > True > >>>> _sous is not None > True >>>> _corey is not None > True >>>> _copey is not None > True > >>>> print(_sous.__name__) > sys >>>> print(sous.__name__) > sys >>>> print(_corey.__name__) > distutils.core >>>> print(corey.__name__) > distutils.core >>>> print(_copey.__name__) > deepcopy >>>> print(copey.__name__) > deepcopy > """ > > import sys as _sous > import distutils.core as _corey > from copy import copy as _copey > =========================== > > Is numpy.random a module or a module attribute? It's a package. Add import distutils.command as _commie and you will see the failure. >>>> And it worked with Cython 0.10. Possibly the code that handles "import >>>> . as " is incorrect. It's a tricky bit of syntax. >>> You're right; this does work. I think in trying a bunch of >>> combinations to figure it out, I must have missed recompiling >>> something and so missed this. Sorry for the noise! >> It's not noise. This does appear to be a bug. 
>> >> /* "/Users/rkern/today/geometry/geometry.pyx":4 >> * >> * >> * import numpy.random as rn # <<<<<<<<<<<<<< >> * from copy import copy >> * >> */ >> __pyx_1 = PyList_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = >> __pyx_f[0]; __pyx_lineno = 4; __pyx_c >> lineno = __LINE__; goto __pyx_L1_error;} >> Py_INCREF(__pyx_kp_29); >> PyList_SET_ITEM(__pyx_1, 0, __pyx_kp_29); >> __pyx_2 = __Pyx_Import(__pyx_kp_28, ((PyObject *)__pyx_1)); if >> (unlikely(!__pyx_2)) {__pyx_filename = __ >> pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} >> Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0; >> if (PyObject_SetAttr(__pyx_m, __pyx_kp_rn, __pyx_2) < 0) {__pyx_filename = >> __pyx_f[0]; __pyx_lineno = 4; >> __pyx_clineno = __LINE__; goto __pyx_L1_error;} >> Py_DECREF(__pyx_2); __pyx_2 = 0; >> >> >> _pyx_kp_29 is never given a value in the generated code so the from_list is >> invalid. I am guessing that the "as" is fooling the code generator into thinking >> that there is a from_list. > > It's not a general bug in any case. The constant _kp_29 is assigned its value > from _k_29 in __Pyx_InitStrings() by walking __pyx_string_tab, so if _k_29 has > a value in the code and appears in the string table, _kp_29 will initialised > as well. Ah, I see. I think the problem is that _kp_29 becomes a unicode string u'*'. The function that raises the exception explicitly checks for exactly str. if (!PyString_Check(item)) { PyErr_SetString(PyExc_TypeError, "Item in ``from list'' not a string"); If I manually go into the importas.c source (after adding my failing line) and set is_unicode to 0 in the string_tab, then it can import. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From stefan_ml at behnel.de Thu Nov 13 23:29:41 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Nov 2008 23:29:41 +0100 Subject: [Cython] help tracking down TypeError In-Reply-To: References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> Message-ID: <491CAA55.9070209@behnel.de> Hi, Robert Kern wrote: > I think the problem is that _kp_29 becomes a unicode string u'*'. The > function that raises the exception explicitly checks for exactly str. > > if (!PyString_Check(item)) { > > PyErr_SetString(PyExc_TypeError, > "Item in ``from list'' not a string"); > > If I manually go into the importas.c source (after adding my failing line) and > set is_unicode to 0 in the string_tab, then it can import. Ok, then it's a Py2-only issue. Py3 requires a unicode string here. Setting the "is_identifier" bit for the specific string should do the trick. Stefan From stefan_ml at behnel.de Thu Nov 13 23:30:25 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 13 Nov 2008 23:30:25 +0100 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: <4917BAD1.8080802@canterbury.ac.nz> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> <4917BAD1.8080802@canterbury.ac.nz> Message-ID: <491CAA81.9000108@behnel.de> Hi, Greg Ewing wrote: > You can do > > a[0], a[1], a[2], a[3], a[4] = 1, 3, 28, 5, 3 > > which will get turned into a series of assignments. > > Ideally, what you *should* be able to do is > > a[:] = 1, 3, 28, 5, 3 In Cython, you can now do cdef int a[5] a[:] = [1, 3, 28, 5, 3] or cdef int a[20] a[1:6] = [1, 3, 28, 5, 3] or even cdef int a[20] start = 1 end = 6 a[start:end] = [1, 3, 28, 5, 3] However, the way it's currently implemented does no bounds checking, so it's pretty easy to shoot yourself in the foot when you assign non-existing slices. 
It's actually not so easy to determine at compile time how long the lhs slice is, and how long the assigned sequence is. There's definitely more room for improvements. :) Stefan From robertwb at math.washington.edu Thu Nov 13 23:37:34 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 13 Nov 2008 14:37:34 -0800 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: <491CAA81.9000108@behnel.de> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> <4917BAD1.8080802@canterbury.ac.nz> <491CAA81.9000108@behnel.de> Message-ID: <28F95193-37D9-40C0-B3EC-58CB454B4EC9@math.washington.edu> On Nov 13, 2008, at 2:30 PM, Stefan Behnel wrote: > Hi, > > Greg Ewing wrote: >> You can do >> >> a[0], a[1], a[2], a[3], a[4] = 1, 3, 28, 5, 3 >> >> which will get turned into a series of assignments. >> >> Ideally, what you *should* be able to do is >> >> a[:] = 1, 3, 28, 5, 3 > > In Cython, you can now do > > cdef int a[5] > a[:] = [1, 3, 28, 5, 3] > > or > > cdef int a[20] > a[1:6] = [1, 3, 28, 5, 3] > > or even > > cdef int a[20] > start = 1 > end = 6 > a[start:end] = [1, 3, 28, 5, 3] Very cool. > However, the way it's currently implemented does no bounds > checking, so it's > pretty easy to shoot yourself in the foot when you assign non- > existing slices. > It's actually not so easy to determine at compile time how long the > lhs slice > is, and how long the assigned sequence is. > > There's definitely more room for improvements. :) Ouch. We can make sure (at runtime) that the rhs has the right size, but there's not much we can do for the lhs. I guess this is no different than indexing though... 
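Editorial sketch, not part of the original message: the two checks discussed above (that the target slice lies inside the array, and that the right-hand side has as many items as the slice) can be written out explicitly in Python. The helper assign_slice is hypothetical. A Python list knows its own length, whereas a C array reached through a pointer does not, which is why the left-hand-side check is the hard one for generated C code.

```python
def assign_slice(buf, start, stop, values):
    """Assign values to buf[start:stop], refusing out-of-range or mis-sized input."""
    # Left-hand-side check: the slice must lie inside the buffer.
    if not 0 <= start <= stop <= len(buf):
        raise IndexError(
            "slice %d:%d is outside a buffer of length %d" % (start, stop, len(buf))
        )
    # Right-hand-side check: the item count must match the slice length.
    if len(values) != stop - start:
        raise ValueError(
            "expected %d items, got %d" % (stop - start, len(values))
        )
    buf[start:stop] = values

a = [0] * 20
assign_slice(a, 1, 6, [1, 3, 28, 5, 3])
print(a[:7])
```

Because both checks run before the assignment, a failed call leaves the buffer unchanged and its length never varies.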
- Robert From hoytak at cs.ubc.ca Fri Nov 14 00:58:56 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Thu, 13 Nov 2008 15:58:56 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: <491CAA55.9070209@behnel.de> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> Message-ID: <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> > Setting the "is_identifier" bit for the specific string should do the trick. Nice. Would you still like me to file the bug report? --Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From hoytak at cs.ubc.ca Fri Nov 14 01:05:45 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Thu, 13 Nov 2008 16:05:45 -0800 Subject: [Cython] bug in generated cython -a html output In-Reply-To: References: <4db580fd0811121804q7897593auef1a703a61191064@mail.gmail.com> Message-ID: <4db580fd0811131605v3306bfbep7b036e419cba3e6f@mail.gmail.com> > It should always match, but sometimes there's some C extra code > hanging off. Yes, I've seen some of that. In my case, this definitely isn't true as the next 3 lines are more of the same. 
To continue on, the code in the .c file is the beautiful __pyx_t_9 = __pyx_v_yi; __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges)); /* "/home/hoytak/workspace/gravimetrics/spatial/gridworld.pyx":126 * p2.x = x_edges[xi+1] * p1.y = y_edges[yi] * p2.y = y_edges[yi+1] # <<<<<<<<<<<<<< * p1.z = z_edges[zi] * p2.z = z_edges[zi+1] */ __pyx_t_10 = (__pyx_v_yi + 1); __pyx_v_p2->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_10, __pyx_bstride_0_y_edges)); /* "/home/hoytak/workspace/gravimetrics/spatial/gridworld.pyx":127 * p1.y = y_edges[yi] * p2.y = y_edges[yi+1] * p1.z = z_edges[zi] # <<<<<<<<<<<<<< * p2.z = z_edges[zi+1] * */ __pyx_t_11 = __pyx_v_zi; __pyx_v_p1->z = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_z_edges.buf, __pyx_t_11, __pyx_bstride_0_z_edges)); /* "/home/hoytak/workspace/gravimetrics/spatial/gridworld.pyx":128 * p2.y = y_edges[yi+1] * p1.z = z_edges[zi] * p2.z = z_edges[zi+1] # <<<<<<<<<<<<<< * * curbox.setCoordsDirect(p1, p2) */ __pyx_t_12 = (__pyx_v_zi + 1); __pyx_v_p2->z = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_z_edges.buf, __pyx_t_12, __pyx_bstride_0_z_edges)); /* "/home/hoytak/workspace/gravimetrics/spatial/gridworld.pyx":130 * p2.z = z_edges[zi+1] * * .................. 
However, the same code in the .html shows up as: 125: p1.y = y_edges[yi] __pyx_t_9 = __pyx_v_yi; __pyx_v_p1->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_9, __pyx_bstride_0_y_edges)); __pyx_2 = PyObject_GetAttr(__pyx_v_iterator, __pyx_kp_next); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} __pyx_3 = PyObject_Call(__pyx_2, ((PyObject *)__pyx_empty_tuple), NULL); if (unlikely(!__pyx_3)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_2); __pyx_2 = 0; __pyx_2 = __Pyx_GetItemInt(__pyx_3, 1, 0); if (!__pyx_2) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_3); __pyx_3 = 0; __pyx_3 = __Pyx_GetItemInt(__pyx_2, 0, 0); if (!__pyx_3) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(__pyx_2); __pyx_2 = 0; if (!(__Pyx_TypeTest(__pyx_3, __pyx_ptype_5numpy_dtype))) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 125; __pyx_clineno = __LINE__; goto __pyx_L13_error;} Py_DECREF(((PyObject *)__pyx_v_descr)); __pyx_v_descr = ((PyArray_Descr *)__pyx_3); __pyx_3 = 0; } goto __pyx_L17_try; __pyx_L13_error:; Py_XDECREF(__pyx_2); __pyx_2 = 0; Py_XDECREF(__pyx_3); __pyx_3 = 0; 126: p2.y = y_edges[yi+1] __pyx_t_10 = (__pyx_v_yi + 1); __pyx_v_p2->y = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_y_edges.buf, __pyx_t_10, __pyx_bstride_0_y_edges)); __pyx_4 = PyErr_ExceptionMatches(__pyx_builtin_StopIteration); if (__pyx_4) { __Pyx_AddTraceback("numpy.__getbuffer__"); if (__Pyx_GetException(&__pyx_2, &__pyx_3, &__pyx_5) < 0) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 126; __pyx_clineno = __LINE__; goto __pyx_L15_except_error;} 127: p1.z = z_edges[zi] __pyx_t_11 = __pyx_v_zi; __pyx_v_p1->z = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_z_edges.buf, __pyx_t_11, __pyx_bstride_0_z_edges)); __pyx_6 = 
PyObject_GetAttr(((PyObject *)__pyx_v_stack), __pyx_kp_pop); if (unlikely(!__pyx_6)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 127; __pyx_clineno = __LINE__; goto __pyx_L15_except_error;} __pyx_7 = PyObject_Call(__pyx_6, ((PyObject *)__pyx_empty_tuple), NULL); if (unlikely(!__pyx_7)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 127; __pyx_clineno = __LINE__; goto __pyx_L15_except_error;} Py_DECREF(__pyx_6); __pyx_6 = 0; Py_DECREF(__pyx_7); __pyx_7 = 0; 128: p2.z = z_edges[zi+1] __pyx_t_12 = (__pyx_v_zi + 1); __pyx_v_p2->z = (*__Pyx_BufPtrCContig1d(float *, __pyx_bstruct_z_edges.buf, __pyx_t_12, __pyx_bstride_0_z_edges)); __pyx_8 = PyObject_Length(((PyObject *)__pyx_v_stack)); if (unlikely(__pyx_8 == -1)) {__pyx_filename = __pyx_f[1]; __pyx_lineno = 128; __pyx_clineno = __LINE__; goto __pyx_L15_except_error;} __pyx_1 = (__pyx_8 > 0); if (__pyx_1) { 129: So it's definitely showing stuff that isn't there (unless I'm reading things way wrong). I don't have time for a few hours here, but I will play around with it and try to get some other small test cases here in a bit and file a bug report. Thanks! 
--Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From greg.ewing at canterbury.ac.nz Fri Nov 14 05:16:10 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Nov 2008 17:16:10 +1300 Subject: [Cython] Pure python mode In-Reply-To: <53562.213.61.181.86.1226582969.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <1648675C-9FE0-4D45-B9A9-0D70214D6CCC@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <53562.213.61.181.86.1226582969.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <491CFB8A.805@canterbury.ac.nz> Stefan Behnel wrote: > Not only in this case would it be nice to allow extension classes to > inherit from object in Cython without any additional setup. What do you mean by that? Extension classes already inherit from object implicitly. 
-- Greg From uschmitt at mineway.de Fri Nov 14 09:57:11 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 09:57:11 +0100 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491C718A.9050805@behnel.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> Message-ID: <491D3D67.30105@mineway.de> Stefan Behnel wrote: > Hi, > > please don't hijack other people's threads, start a new one for a new topic. > > Uwe Schmitt wrote: >> y >> as a PyObject*. >> > > Could you forward a section of your code and the corresponding generated C > code that shows this behaviour? > Good news: the problem disappeared after upgrading to Cython 0.10. Greetings, Uwe -- Dr. rer. nat. Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing. Mathias Bauer Amtsgericht Saarbrücken HRB 12339 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/cython-dev/attachments/20081114/02b67b40/attachment.htm From stefan_ml at behnel.de Fri Nov 14 10:13:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 10:13:09 +0100 (CET) Subject: [Cython] Pure python mode In-Reply-To: <491CFB8A.805@canterbury.ac.nz> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <53562.213.61.181.86.1226582969.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <491CFB8A.805@canterbury.ac.nz> Message-ID: <40453.213.61.181.86.1226653989.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Greg Ewing wrote: > Stefan Behnel wrote: > >> Not only in this case would it be nice to allow extension classes to >> inherit from object in Cython without any additional setup. > > What do you mean by that? Extension classes already > inherit from object implicitly. Yes, but you can't currently write cdef class A(object): pass without declaring 'object' as a C class first. 
Stefan From stefan_ml at behnel.de Fri Nov 14 10:27:04 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 10:27:04 +0100 (CET) Subject: [Cython] help tracking down TypeError In-Reply-To: <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> Message-ID: <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Hoyt Koepke wrote: >> Setting the "is_identifier" bit for the specific string should do the >> trick. > > Nice. Would you still like me to file the bug report? Sure, I can't fix it right away (so that will keep it from getting lost), and a bug report is very helpful for documentation purposes to show which bugs were fixed in a release. Thanks! Stefan From uschmitt at mineway.de Fri Nov 14 12:23:34 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 12:23:34 +0100 Subject: [Cython] buffer interface problem Message-ID: <491D5FB6.10806@mineway.de> Hi, I got a problem with the following typedef ctypedef np.ndarray[np.double_t, ndim=1, negative_indices=False, mode="c"] ndarray results in Traceback (most recent call last): File "setup.py", line 24, in ext_modules= [ ext1,], File "C:\Python25\lib\distutils\core.py", line 151, in setup dist.run_commands() File "C:\Python25\lib\distutils\dist.py", line 974, in run_commands self.run_command(cmd) File "C:\Python25\lib\distutils\dist.py", line 994, in run_command cmd_obj.run() File "C:\Python25\lib\distutils\command\build_ext.py", line 290, in run self.build_extensions() File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Distuti ls\build_ext.py", line 81, in build_extensions ext.sources = self.cython_sources(ext.sources, ext) File 
"c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Distuti ls\build_ext.py", line 196, in cython_sources full_module_name=module_name) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 690, in compile return compile_single(source, options, full_module_name) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 635, in compile_single return run_pipeline(source, options, full_module_name) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 524, in run_pipeline err, enddata = context.run_pipeline(pipeline, source) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 183, in run_pipeline data = phase(data) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 482, in parse tree = context.parse(source_desc, scope, pxd = 0, full_module_name = full_mo dule_name) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Main.py", line 414, in parse tree = Parsing.p_module(s, pxd, full_module_name) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Parsing.py", line 2414, in p_module body = p_statement_list(s, Ctx(level = level), first_statement = 1) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Parsing.py", line 1480, in p_statement_list stats.append(p_statement(s, ctx, first_statement = first_statement)) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Parsing.py", line 1411, in p_statement return p_ctypedef_statement(s, ctx) File "c:\python25\lib\site-packages\cython-0.10-py2.5-win32.egg\Cython\Compile r\Parsing.py", line 2182, in p_ctypedef_statement if base_type.name is None: AttributeError: 'CBufferAccessTypeNode' object has no attribute 'name' Is this a bug ? Greetings, Uwe -- Dr. rer. nat. 
Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing. Mathias Bauer Amtsgericht Saarbrücken HRB 12339 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20081114/886a424b/attachment.htm From dagss at student.matnat.uio.no Fri Nov 14 13:08:22 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 14 Nov 2008 13:08:22 +0100 Subject: [Cython] buffer interface problem In-Reply-To: <491D5FB6.10806@mineway.de> References: <491D5FB6.10806@mineway.de> Message-ID: <491D6A36.1070609@student.matnat.uio.no> Uwe Schmitt wrote: > > Hi, > > I got a problem with the following typedef > > ctypedef np.ndarray[np.double_t, ndim=1, negative_indices=False, > mode="c"] ndarray > > Is this a bug? Yes, Cython producing an exception during compilation is always a bug. I must shamefully admit that this reminds me that I put "buffer typedefs" on my TODO list in summer, never got around to doing it, and then completely forgot about it. This should work of course, though an acceptable first step is to disallow it in a more graceful fashion. Unfortunately I don't have much time for either now... 
This is now ticket http://trac.cython.org/cython_trac/ticket/117 Dag Sverre From uschmitt at mineway.de Fri Nov 14 13:10:59 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 13:10:59 +0100 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491D3D67.30105@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> Message-ID: <491D6AD3.3070202@mineway.de> Uwe Schmitt wrote: > Stefan Behnel wrote: >> Uwe Schmitt wrote: >>> > I got a problem in the following situation when wrapping >>> > some C code: >>> > >>> > file svdlib.h contains: >>> > >>> > extern char *SVDVersion; >>> > extern long SVDVerbosity; >>> > >>> > file svlib.pyx begins with: >>> > >>> > cdef extern from "svdlib.h" : >>> > >>> > cdef extern long SVDVerbosity >>> > cdef extern char * SVDVersion >>> > >>> > >>> > If I study the generated C code, Cython >>> > handles SVDVersion as a char*, but SVDVerbosity >>> > as a PyObject*. >> >> Could you forward a section of your code and the corresponding generated C >> code that shows this behaviour? >> >> Stefan > > Good news: the problem disappeared after upgrading to Cython 0.10, Bad news: the error occurred again, and I do not know what is happening. 
I'm using "print SVDVersion" and "print SVDVerbosity" which are translated as follows: /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":157 * cdef SMat As * * SVDVerbosity = verbosity # <<<<<<<<<<<<<< * print SVDVerbosity * print SVDVersion */ Py_INCREF(__pyx_v_verbosity); Py_DECREF(__pyx_v_SVDVerbosity); __pyx_v_SVDVerbosity = __pyx_v_verbosity; /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":158 * * SVDVerbosity = verbosity * print SVDVerbosity # <<<<<<<<<<<<<< * print SVDVersion * */ __pyx_1 = PyTuple_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 158; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_INCREF(__pyx_v_SVDVerbosity); PyTuple_SET_ITEM(__pyx_1, 0, __pyx_v_SVDVerbosity); if (__Pyx_Print(((PyObject *)__pyx_1), 1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 158; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0; /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":159 * SVDVerbosity = verbosity * print SVDVerbosity * print SVDVersion # <<<<<<<<<<<<<< * * if isinstance(A, scipy.sparse.spmatrix): */ __pyx_1 = __Pyx_PyBytes_FromString(SVDVersion); if (unlikely(!__pyx_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 159; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __pyx_2 = PyTuple_New(1); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 159; __pyx_clineno = __LINE__; goto __pyx_L1_error;} PyTuple_SET_ITEM(__pyx_2, 0, __pyx_1); __pyx_1 = 0; if (__Pyx_Print(((PyObject *)__pyx_2), 1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 159; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_DECREF(((PyObject *)__pyx_2)); __pyx_2 = 0; SVDVerbosity is declared as "PyObject *__pyx_v_SVDVerbosity;" SVDVersion is not declared at all, which is the behavior I expected. Greetings, Uwe -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/cython-dev/attachments/20081114/e5715364/attachment.htm From stefan_ml at behnel.de Fri Nov 14 13:26:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 13:26:09 +0100 (CET) Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491D6AD3.3070202@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> Message-ID: <49462.213.61.181.86.1226665569.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Uwe Schmitt wrote: > Uwe Schmitt wrote: >> Stefan Behnel wrote: >>> Uwe Schmitt wrote: >>>> > I got a problem in the following situation when wrapping >>>> > some C code: >>>> > >>>> > file svdlib.h contains: >>>> > >>>> > extern char *SVDVersion; >>>> > extern long SVDVerbosity; >>>> > >>>> > file svlib.pyx begins with: >>>> > >>>> > cdef extern from "svdlib.h" : >>>> > >>>> > cdef extern long SVDVerbosity >>>> > cdef extern char * SVDVersion > > I'm using "print SVDVersion" > and "print SVDVerbosity" which are translated as follows: > > /* > "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":157 > * cdef SMat As > * > * SVDVerbosity = verbosity # <<<<<<<<<<<<<< > * print SVDVerbosity > * print SVDVersion > */ > Py_INCREF(__pyx_v_verbosity); > Py_DECREF(__pyx_v_SVDVerbosity); > __pyx_v_SVDVerbosity = __pyx_v_verbosity; Hmm, your code example is rather incomplete. An educated guess is that the above code occurs inside a function, and the assignment to SVDVerbosity makes it a local variable (normal Python behaviour). Since it hasn't been declared, it has the normal Python object type. I have no idea why you do this assignment, though. As I said, you could help us help you by providing a relevant section of your code. 
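The scoping rule at work here is ordinary Python and can be seen without Cython. A minimal sketch, assuming a module-level name verbosity that stands in for the C global SVDVerbosity; all names below are hypothetical and none of this is taken from the original svdlibc.pyx:

```python
verbosity = 1  # module-level name, standing in for the external C global


def current_verbosity():
    # Reading a name without assigning to it looks the name up globally.
    return verbosity


def set_without_global(level):
    # Assignment makes 'verbosity' a new local variable inside this function;
    # the module-level name is left untouched.
    verbosity = level
    return verbosity


def set_with_global(level):
    # The 'global' statement tells the compiler that the assignment
    # targets the module-level name instead of creating a local.
    global verbosity
    verbosity = level
    return verbosity


set_without_global(5)
after_local_assignment = current_verbosity()

set_with_global(5)
after_global_assignment = current_verbosity()
```

The first call leaves the module-level value alone, which mirrors the generated C code above where the assignment produced an untyped local object instead of writing to the C variable.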
Stefan From uschmitt at mineway.de Fri Nov 14 13:28:21 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 13:28:21 +0100 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491D6AD3.3070202@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> Message-ID: <491D6EE5.8030400@mineway.de> Uwe Schmitt wrote: > Uwe Schmitt wrote: >> >> Good news: the problem disappeared after upgrading to Cython 0.10, > > Bad news: the error occurred again, I do not know what happens.. > > I'm using "print SVDVersion" > and "print SVDVerbosity" which are translated as follows: > It depends on the usage of SVDVerbosity: If I just read the variable I get: /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":157 * cdef SMat As * #SVDVerbosity = 0 * print SVDVerbosity # <<<<<<<<<<<<<< * * if isinstance(A, scipy.sparse.spmatrix): */ __pyx_1 = PyInt_FromLong(SVDVerbosity); if (unlikely(!__pyx_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 157; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __pyx_2 = PyTuple_New(1); if (unlikely(!__pyx_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 157; __pyx_clineno = __LINE__; goto __pyx_L1_error;} PyTuple_SET_ITEM(__pyx_2, 0, __pyx_1); __pyx_1 = 0; if (__Pyx_Print(((PyObject *)__pyx_2), 1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 157; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_DECREF(((PyObject *)__pyx_2)); __pyx_2 = 0; As soon as I add one line setting SVDVerbosity, I get: /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":158 * cdef SMat As * * SVDVerbosity = 0 # <<<<<<<<<<<<<< * print SVDVerbosity * */ Py_INCREF(__pyx_int_0); Py_DECREF(__pyx_v_SVDVerbosity); __pyx_v_SVDVerbosity = __pyx_int_0; /* "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":159 * * SVDVerbosity = 0 * 
print SVDVerbosity # <<<<<<<<<<<<<< * * */ __pyx_1 = PyTuple_New(1); if (unlikely(!__pyx_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 159; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_INCREF(__pyx_v_SVDVerbosity); PyTuple_SET_ITEM(__pyx_1, 0, __pyx_v_SVDVerbosity); if (__Pyx_Print(((PyObject *)__pyx_1), 1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 159; __pyx_clineno = __LINE__; goto __pyx_L1_error;} Py_DECREF(((PyObject *)__pyx_1)); __pyx_1 = 0; Greetings, Uwe PS: Thanks for developing Cython, it is a great tool! -- Dr. rer. nat. Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing. Mathias Bauer Amtsgericht Saarbrücken HRB 12339 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20081114/9e101faf/attachment.htm From uschmitt at mineway.de Fri Nov 14 13:33:27 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 13:33:27 +0100 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <49462.213.61.181.86.1226665569.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> <49462.213.61.181.86.1226665569.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <491D7017.8060101@mineway.de> Stefan Behnel wrote: > Hmm, your code example is not very complete. An educated guess is that the > above code occurs inside a function, and the assignment to SVDVerbosity > makes it a local variable (normal Python behaviour). Since it hasn't been > declared, it has the normal Python object type. Ok, that is the reason. Changing SVDVerbosity = 0 to cdef extern long SVDVerbosity = 0 helps. 
I posted more information before I read your answer, so please disregard that other mail. > > I have no idea why you do this assignment, though. As I said, you could > help us in helping you by providing a relevant section of your code. The reason is that I am interfacing external C code where the verbosity of its output is controlled by this global variable. Not nice, but I have to live with it. Greetings & Thanks, Uwe > > Stefan > -- Dr. rer. nat. Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing. Mathias Bauer Amtsgericht Saarbrücken HRB 12339 From stefan_ml at behnel.de Fri Nov 14 13:59:47 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 13:59:47 +0100 (CET) Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491D7017.8060101@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> <49462.213.61.181.86.1226665569.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <491D7017.8060101@mineway.de> Message-ID: <57662.213.61.181.86.1226667587.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Uwe Schmitt wrote: > Stefan Behnel wrote: >> Hmm, your code example is not very complete. An educated guess is that >> the >> above code occurs inside a function, and the assignment to SVDVerbosity >> makes it a local variable (normal Python behaviour). Since it hasn't >> been declared, it has the normal Python object type. > Ok, that is the reason. Changing > > SVDVerbosity = 0 > > to > > cdef extern long SVDVerbosity = 0 > > helps. 
> > The reason is that I am interfacing external C code where > the verbosity of its output is controlled by this global > variable. Ah, now I get it. A "global" statement would also do the trick, BTW. Stefan From uschmitt at mineway.de Fri Nov 14 14:11:29 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Fri, 14 Nov 2008 14:11:29 +0100 Subject: [Cython] buffer interface -- boundscheck Message-ID: <491D7901.5020806@mineway.de> Hi, I'm using the buffer interface and have a question concerning bounds checks: The only way I know to avoid them is with a Cython decorator like this: @cython.boundscheck(False) def fun().... But this does not work before a "cdef"-declared function, which I understand. Is there another way to configure this behaviour inside the code? I found a hint about a command-line option, which is not what I am interested in. Greetings, Uwe -- Dr. rer. nat. Uwe Schmitt F&E Mathematik mineway GmbH Science Park 2 D-66123 Saarbrücken Telefon: +49 (0)681 8390 5334 Telefax: +49 (0)681 830 4376 uschmitt at mineway.de www.mineway.de Geschäftsführung: Dr.-Ing. Mathias Bauer Amtsgericht Saarbrücken HRB 12339 From dagss at student.matnat.uio.no Fri Nov 14 14:39:14 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 14 Nov 2008 14:39:14 +0100 Subject: [Cython] buffer interface -- boundscheck In-Reply-To: <491D7901.5020806@mineway.de> References: <491D7901.5020806@mineway.de> Message-ID: <491D7F82.50808@student.matnat.uio.no> Uwe Schmitt wrote: > Hi, > > I'm using the buffer interface and have a question concerning > bounds checks: The only way I know to avoid them is with > a Cython decorator like this: > > @cython.boundscheck(False) > def fun().... > > But this does not work before a "cdef"-declared function, > which I understand. Ideally, one could fix the parser to support this on cdef functions (and just disallow them if they weren't compiler directives). However, you can also write: with cython.boundscheck(False): ... 
Dag Sverre From stefan_ml at behnel.de Fri Nov 14 17:43:33 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 17:43:33 +0100 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: <28F95193-37D9-40C0-B3EC-58CB454B4EC9@math.washington.edu> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> <4917BAD1.8080802@canterbury.ac.nz> <491CAA81.9000108@behnel.de> <28F95193-37D9-40C0-B3EC-58CB454B4EC9@math.washington.edu> Message-ID: <491DAAB5.1010106@behnel.de> Hi, Robert Bradshaw wrote: > On Nov 13, 2008, at 2:30 PM, Stefan Behnel wrote: >> Greg Ewing wrote: >>> You can do >>> >>> a[0], a[1], a[2], a[3], a[4] = 1, 3, 28, 5, 3 >>> >>> which will get turned into a series of assignments. >>> >>> Ideally, what you *should* be able to do is >>> >>> a[:] = 1, 3, 28, 5, 3 >> In Cython, you can now do >> >> cdef int a[5] >> a[:] = [1, 3, 28, 5, 3] >> >> or >> >> cdef int a[20] >> a[1:6] = [1, 3, 28, 5, 3] >> >> or even >> >> cdef int a[20] >> start = 1 >> end = 6 >> a[start:end] = [1, 3, 28, 5, 3] > > Very cool. You did most of the work already. >> However, the way it's currently implemented does no bounds >> checking, so it's >> pretty easy to shoot yourself in the foot when you assign non- >> existing slices. >> It's actually not so easy to determine at compile time how long the >> lhs slice >> is, and how long the assigned sequence is. >> >> There's definitely more room for improvements. :) > > Ouch. We can make sure (at runtime) that the rhs has the right size, That's what I thought, too. There's a couple of cases that we can handle efficiently, and some where we can add runtime checks. 
Stefan From robertwb at math.washington.edu Fri Nov 14 18:45:54 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 14 Nov 2008 09:45:54 -0800 Subject: [Cython] Pure python mode In-Reply-To: <40453.213.61.181.86.1226653989.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <20CF4396-1D84-404F-9BCC-2A2CC986F259@math.washington.edu> <85b5c3130810061110i674c87cfx1fe427d8cde1ddb3@mail.gmail.com> <78CB683C-F680-469E-BB4C-F1C771CA4E12@math.washington.edu> <33197.88.90.248.62.1223329162.squirrel@webmail.uio.no> <48EE7E60.7040502@student.matnat.uio.no> <85b5c3130811050537w7932c395h71549dca57f61f51@mail.gmail.com> <53005.213.61.181.86.1225893148.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <85b5c3130811050652y5249b3aaudeb1d4bcabf71827@mail.gmail.com> <49191E03.70001@behnel.de> <3BBE241F-7ACA-447D-8033-81831072D545@math.washington.edu> <46301.213.61.181.86.1226395323.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <53562.213.61.181.86.1226582969.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <491CFB8A.805@canterbury.ac.nz> <40453.213.61.181.86.1226653989.! squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <5D7D2ABD-3CC2-451B-B6DD-1239B837F453@math.washington.edu> On Nov 14, 2008, at 1:13 AM, Stefan Behnel wrote: > Greg Ewing wrote: >> Stefan Behnel wrote: >> >>> Not only in this case would it be nice to allow extension classes to >>> inherit from object in Cython without any additional setup. >> >> What do you mean by that? Extension classes already >> inherit from object implicitly. > > Yes, but you can't currently write > > cdef class A(object): pass > > without declaring 'object' as a C class first. Yes, that should change. 
- Robert From hoytak at cs.ubc.ca Fri Nov 14 21:31:08 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Fri, 14 Nov 2008 12:31:08 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4db580fd0811141231hee5bf6dsf0b12b9648b00b5e@mail.gmail.com> Okay, I'll do that right away. Could someone give me an account on the bug tracker? Thanks! --Hoyt On Fri, Nov 14, 2008 at 1:27 AM, Stefan Behnel wrote: > Hoyt Koepke wrote: >>> Setting the "is_identifier" bit for the specific string should do the >>> trick. >> >> Nice. Would you still like me to file the bug report? > > Sure, I can't fix it right away (so that will keep it from getting lost), > and a bug report is very helpful for documentation purposes to show which > bugs were fixed in a release. > > Thanks! 
> > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From jasone at canonware.com Fri Nov 14 21:35:21 2008 From: jasone at canonware.com (Jason Evans) Date: Fri, 14 Nov 2008 12:35:21 -0800 Subject: [Cython] 0.9.8.1.1 and .pxd files In-Reply-To: References: <48BA6509.6080905@behnel.de> <48BB43FD.7000907@canterbury.ac.nz> <490A4F6C.8050701@canonware.com> Message-ID: <491DE109.4020300@canonware.com> Lisandro Dalcin wrote: > Sorry, forgot to attach... > > On Thu, Oct 30, 2008 at 11:28 PM, Lisandro Dalcin wrote: >> If you all can accept some vile hackery, see the attached tarball ;-) . >> >> Python recognizes a PKGNAME directory as a package if '__init__.py' is >> there, if not, no way. BUT if '__init__.so' is also there, it loads >> '__init__.so' !!! But then the exported module init function in the >> dynlib needs to be named initPKGNAME, and "PKGNAME" needs to be passed >> to Py_InitModule() >> >> So, Greg, here you have the rules if you want to implement it ;-). >> >> PS: tried only in Py 2.5, this is surely undocumented, and probably it >> is in fact some buggy code in CPython's 'import.c' . I finally got around to playing with your example, and after working past a couple of issues, I got it to work for my software.
For the record, I had to use the following __init__.h file: --------------------------------------------------------------------- #if SIZEOF_SIZE_T != SIZEOF_INT # define Py_InitModule4_64(name,a,b,c,d) Py_InitModule4_64("Crux",a,b,c,d) #else # define Py_InitModule4(name,a,b,c,d) Py_InitModule4("Crux",a,b,c,d) #endif #define init__init__(a) initCrux(a) --------------------------------------------------------------------- After I got this working, I took a look at the generated __init__.c, and realized that there is no simple way to extend the approach to Python 3, since "__init__" is directly embedded in a data structure used for module initialization. The attached patch modifies Cython to handle the __init__ module name specially. This works correctly for my software. Does it look like a general solution? Thanks, Jason -------------- next part -------------- A non-text attachment was scrubbed... Name: __init__.pyx.patch Type: text/x-diff Size: 812 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20081114/840f0d0b/attachment.bin From stefan_ml at behnel.de Fri Nov 14 21:57:22 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 21:57:22 +0100 Subject: [Cython] help tracking down TypeError In-Reply-To: <4db580fd0811141231hee5bf6dsf0b12b9648b00b5e@mail.gmail.com> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4db580fd0811141231hee5bf6dsf0b12b9648b00b5e@mail.gmail.com> Message-ID: <491DE632.9050800@behnel.de> Hi, Hoyt Koepke wrote: > Okay, I'll do that right away. Could someone give me an account on > the bug tracker? Please send a htpasswd file/line to Robert Bradshaw. I agree that this situation is not optimal, though... 
Anyway, I applied a fix for this bug. Please give the latest cython-devel version a try to check if it works for you. Stefan From stefan_ml at behnel.de Fri Nov 14 22:17:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 14 Nov 2008 22:17:29 +0100 Subject: [Cython] help tracking down TypeError In-Reply-To: <491DE632.9050800@behnel.de> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <4db580fd0811111738s1cbf6655m229b68cf03d5fe60@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4db580fd0811141231hee5bf6dsf0b12b9648b00b5e@mail.gmail.com> <491DE632.9050800@behnel.de> Message-ID: <491DEAE9.3020700@behnel.de> Stefan Behnel wrote: > Hoyt Koepke wrote: >> Okay, I'll do that right away. Could someone give me an account on >> the bug tracker? > > Please send a htpasswd file/line to Robert Bradshaw. I agree that this > situation is not optimal, though... I also filed a bug report. http://trac.cython.org/cython_trac/ticket/118 Stefan From hoytak at cs.ubc.ca Fri Nov 14 22:23:06 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Fri, 14 Nov 2008 13:23:06 -0800 Subject: [Cython] help tracking down TypeError In-Reply-To: <491DEAE9.3020700@behnel.de> References: <4db580fd0811111704p3b1dc75fhdb262a9d0c8ae1a7@mail.gmail.com> <491BD2C5.9070000@behnel.de> <491CAA55.9070209@behnel.de> <4db580fd0811131558m4a9a5bd3ke63de31de9ae4c8e@mail.gmail.com> <59905.213.61.181.86.1226654824.squirrel@groupware.dvs.informatik.tu-darmstadt.de> <4db580fd0811141231hee5bf6dsf0b12b9648b00b5e@mail.gmail.com> <491DE632.9050800@behnel.de> <491DEAE9.3020700@behnel.de> Message-ID: <4db580fd0811141323j4c369697pc7f2400c37561ba2@mail.gmail.com> > I also filed a bug report. > > http://trac.cython.org/cython_trac/ticket/118 > > Stefan Okay, excellent. I just tried it with the latest release, and it works fine. 
Thanks for your work -- you guys are awesome!!! I will send the .htpasswd file to Robert ASAP. Thanks! --Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From dalcinl at gmail.com Fri Nov 14 23:52:49 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 14 Nov 2008 19:52:49 -0300 Subject: [Cython] broken array declarators when size is an expression Message-ID: This changeset changeset: 1333:34aca76e1b9d parent: 1326:cf41fa30ad5f user: Stefan Behnel date: Fri Nov 14 19:19:55 2008 +0100 summary: array size must be set as int, not numeric string broke mpi4py and petsc4py, where I use stack-allocated arrays where the size comes from an (external) enumeration, for example: cdef char name[MPI_MAX_OBJECT_NAME+1] I had to remove an 'assert' in Cython/Compiler/PyrexTypes.py at __init__ of class CArrayType. Additionally, I've added a few tests for all this. I really do not know how to make the assert smarter by analyzing expressions. Stefan, please review. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sat Nov 15 00:05:28 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 14 Nov 2008 20:05:28 -0300 Subject: [Cython] External typedefs and pointers In-Reply-To: References: <491810ED.8090904@student.matnat.uio.no> Message-ID: I'm on Robert's side on all this for the same reasons. In short, +1 for (1) and -1 for (2) and (3).
On Thu, Nov 13, 2008 at 6:40 AM, Robert Bradshaw wrote: > On Nov 10, 2008, at 2:46 AM, Dag Sverre Seljebotn wrote: > >> A discussion recently came up on the NumPy mailing list, and it >> inspired >> me to focus on this usability issue: >> >> cdef extern from "test.h": >> ctypedef int type_a >> ctypedef int type_b >> cdef type_a* ptr1 = NULL >> cdef type_b* ptr2 = ptr1 >> >> Now, it might happen (like with NumPy) that type_a and type_b are >> defined as separate types on some platforms and the same on others >> (through #ifdefs). Partly this is relied upon currently, making the >> situation a bit confusing, especially for new users. >> >> Possible solutions: >> >> 1) Make all external types pointer-incompatible. So "ptr2=ptr1" above >> will always fail, and an explicit cast is needed. > > I think this is probably the way to go. Presumably there's a reason > to have two separate types. > >> 2) Introduce a new keyword, something like unknown_size: >> >> cdef extern from "test.h": >> ctypedef unknown_size int type_a >> ctypedef long type_b # we know this is always "long" >> ... >> >> which triggers pointer-incompatibility with anything. NB! At the same >> time, all external primitive types are checked for the right >> declaration >> in Cython at module startup time: >> >> if (sizeof(type_b) != sizeof(long) || ((type_a)-1) != ((type_b)-1) || >> ((type_a)0.5) != ((type_b)0.5)) { /*raise exception*/ } >> >> (Not sure if that float check will work though, might need to do a >> division instead.) > > I have to admit I'm not a fan either of the new keyword or waiting > 'till runtime to throw the error. > >> 3) Complex interaction with the C compiler so that Cython and the C >> compiler work "in one step". The Cython core is simplified so that it >> never checks pointer assignment compatibility, but rather takes any >> error >> the C compiler gives and translates it back to an error to the Cython >> user.
Obviously not a short-term solution but if this is a long-term >> solution it might be enough reason not to bother with this for now. >> This >> seems to be the most stable solution, but it does give away the >> possibility to separate the Cython and C compilation stages as some >> people are fond of doing. > > The biggest problem with 2 and 3 is that people often ship the .c > files and it is compiled/run on the (end-users) machine. Much worse > to delay the error to this point, and it may work on some machines > and not on others. Better to raise an error at Cython compile time and > force the programmer to think about what to do. > > - Robert > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From greg.ewing at canterbury.ac.nz Sat Nov 15 00:31:49 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Nov 2008 12:31:49 +1300 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491D6AD3.3070202@mineway.de> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> Message-ID: <491E0A65.2050304@canterbury.ac.nz> Uwe Schmitt wrote: > /* > "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src\svdlibc.pyx":157 > * cdef SMat As > * > * SVDVerbosity = verbosity # <<<<<<<<<<<<<< > * print SVDVerbosity > * print SVDVersion If this is inside a function, have you declared SVDVerbosity as a global within that function? If not, the assignment is implicitly declaring it as a local.
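
(A minimal sketch of the scoping rule described in the reply above, written in plain Python rather than Cython, on the assumption that Cython follows the same rule for assignments inside functions, as the reply indicates. The names `verbosity_level`, `set_without_global`, and `set_with_global` are illustrative stand-ins and are not taken from SVDLIBC.)

```python
# Module-level variable standing in for an external global such as SVDVerbosity.
verbosity_level = 0


def set_without_global(value):
    # No 'global' declaration: this assignment creates a new local name
    # and leaves the module-level variable untouched.
    verbosity_level = value
    return verbosity_level


def set_with_global(value):
    # The 'global' declaration makes the assignment rebind the module-level name.
    global verbosity_level
    verbosity_level = value
    return verbosity_level


set_without_global(3)
after_local = verbosity_level    # module-level value is unchanged here

set_with_global(3)
after_global = verbosity_level   # module-level value has now been updated
```

In the function from the quoted traceback, the equivalent change would presumably be a `global SVDVerbosity` line before the assignment.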
-- Greg From stefan_ml at behnel.de Sat Nov 15 09:14:06 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 15 Nov 2008 09:14:06 +0100 Subject: [Cython] broken array declarators when size is an expression In-Reply-To: References: Message-ID: <491E84CE.1010801@behnel.de> Hi, Lisandro Dalcin wrote: > changeset: 1333:34aca76e1b9d > summary: array size must be set as int, not numeric string > > broke mpi4py and petsc4py, where I use stack-allocated arrays where > the size comes from an (external) enumeration, for example: > > cdef char name[MPI_MAX_OBJECT_NAME+1] Sorry for that. The problem is that we don't currently have a way to say "give me the compile-time result for this subtree, but don't complain if it's a runtime value". I already needed that in a couple of places when working on Cython, as it can lead to different code when you know the result of an expression. I just never got around to implement this. I think it's wrong that compile_time_value() raises a compiler error. It should rather return the result with a hint if it was determined completely or if part of it is runtime-determined. Then the caller can decide what to do with this information. Stefan From stefan_ml at behnel.de Sat Nov 15 10:55:15 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 15 Nov 2008 10:55:15 +0100 Subject: [Cython] how to initialize my c array in a quick way? In-Reply-To: <491DAAB5.1010106@behnel.de> References: <200811071701538906185@163.com> <491406F3.3040708@behnel.de> <4917BAD1.8080802@canterbury.ac.nz> <491CAA81.9000108@behnel.de> <28F95193-37D9-40C0-B3EC-58CB454B4EC9@math.washington.edu> <491DAAB5.1010106@behnel.de> Message-ID: <491E9C83.4040709@behnel.de> Stefan Behnel wrote: > Robert Bradshaw wrote: >> On Nov 13, 2008, at 2:30 PM, Stefan Behnel wrote: >>> However, the way it's currently implemented does no bounds >>> checking, so it's >>> pretty easy to shoot yourself in the foot when you assign non- >>> existing slices. 
>>> It's actually not so easy to determine at compile time how long the >>> lhs slice is, and how long the assigned sequence is. >>> >>> There's definitely more room for improvements. :) >> Ouch. We can make sure (at runtime) that the rhs has the right size, > > That's what I thought, too. There's a couple of cases that we can handle > efficiently, and some where we can add runtime checks. Done. http://hg.cython.org/cython-devel/file/tip/tests/run/arrayassign.pyx What's still missing is a way to unpack arbitrary iterables into an array, as in cdef int a[5] a = range(5) Stefan From uschmitt at mineway.de Sat Nov 15 10:57:39 2008 From: uschmitt at mineway.de (Uwe Schmitt) Date: Sat, 15 Nov 2008 10:57:39 +0100 Subject: [Cython] [mailinglist] Re: problems with external data In-Reply-To: <491E0A65.2050304@canterbury.ac.nz> References: <491810ED.8090904@student.matnat.uio.no> <491C38A2.4060804@mineway.de> <491C718A.9050805@behnel.de> <491D3D67.30105@mineway.de> <491D6AD3.3070202@mineway.de> <491E0A65.2050304@canterbury.ac.nz> Message-ID: <8FDBF71D-4948-434C-9979-B15B0E7542B6@mineway.de> Am 15.11.2008 um 00:31 schrieb Greg Ewing: > Uwe Schmitt wrote: > >> /* >> "C:\cygwin\home\uschmitt\workspace_eclipse_ganymede\PySVDLIB\src >> \svdlibc.pyx":157 >> * cdef SMat As >> * >> * SVDVerbosity = verbosity # <<<<<<<<<<<<<< >> * print SVDVerbosity >> * print SVDVersion > > If this is inside a function, have you declared SVDVerbosity > as a global within that function? If not, the assignment is > implicitly declaring it as a local. > That was the reason ! 
Greetings, Uwe > -- > Greg > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From aaron.devore at gmail.com Sat Nov 15 22:17:59 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Sat, 15 Nov 2008 13:17:59 -0800 Subject: [Cython] Using PyDict_Next Message-ID: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> I'm trying to use PyDict_Next to iterate over a dict in a way that is identical to this statement: for k, v in d.items(): # do stuff with key and value PyDict_Next has the signature: int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject **key, PyObject **value) The basic idea is that the dict uses pos to track which key it is on. The key is then assigned to the key pointer and the corresponding value is assigned to the value pointer. It can really help efficiency because it doesn't involve iterators, tuples, etc. The most obvious code is the following: cdef int pos = 0 cdef object key, value while PyDict_Next(d, &pos, &key, &value): # do stuff with key and value However, apparently &python_object is not legal and I'm running into odd type issues with &pos. Is there a way to get around those limitations? 
-Aaron From robertwb at math.washington.edu Sat Nov 15 22:53:28 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 15 Nov 2008 13:53:28 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> Message-ID: <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: > I'm trying to use PyDict_Next to iterate over a dict in a way that is > identical to this statement: > for k, v in d.items(): > # do stuff with key and value > > PyDict_Next has the signature: > int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject **key, > PyObject **value) > > > The basic idea is that the dict uses pos to track which key it is on. > The key is then assigned to the key pointer and the corresponding > value is assigned to the value pointer. It can really help efficiency > because it doesn't involve iterators, tuples, etc. > > The most obvious code is the following: > > cdef int pos = 0 > cdef object key, value > while PyDict_Next(d, &pos, &key, &value): > # do stuff with key and value > > However, apparently &python_object is not legal and I'm running into > odd type issues with &pos. Is there a way to get around those > limitations? Yes, declare them to be PyObject* rather than object. Then you'll have to do all refcounting manually (as you would have had to do anyways, as PyDict_Next doesn't decref its input). However, I doubt iterating over the dict manually like that will be a significant speed increase over the basic Python way of doing it. - Robert From musiccomposition at gmail.com Sun Nov 16 00:39:58 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 15 Nov 2008 17:39:58 -0600 Subject: [Cython] __add__ type declaration Message-ID: <1afaf6160811151539g3affb9a0tbfd8d58df6dd5f79@mail.gmail.com> Hi! For fun, I have been implementing a Python binding to GMP.
I have a class like this: cdef class GMPInt: cdef mpz_t ob_val .... def __add__(self, GMPInt other): cdef mpz_t temp mpz_init(temp) mpz_add(temp, self.ob_val, other.ob_val) mpz_clear(temp) .... When I try to compile it I get this error: mpz_add(temp, self.ob_val, other.ob_val) ^ ------------------------------------------------------------ /temp/sandbox/pygmp/src/_gmp.pyx:49:27: Cannot convert Python object to 'mpz_t' It compiles correctly when I define a declaration for self in __add__ like this, though: def __add__(GMPInt self, GMPInt other) I thought type declarations were supposed to be implicit on self in classes. Is this just an anomaly? I have Cython 0.10. Thanks for the help! -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner." From robertwb at math.washington.edu Sun Nov 16 00:47:52 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 15 Nov 2008 15:47:52 -0800 Subject: [Cython] __add__ type declaration In-Reply-To: <1afaf6160811151539g3affb9a0tbfd8d58df6dd5f79@mail.gmail.com> References: <1afaf6160811151539g3affb9a0tbfd8d58df6dd5f79@mail.gmail.com> Message-ID: <5531678A-2B80-4835-8BB2-F48EF91D6DF3@math.washington.edu> On Nov 15, 2008, at 3:39 PM, Benjamin Peterson wrote: > Hi! > > For fun, I have been implementing a Python binding to GMP. I have a > class like > this: > > cdef class GMPInt: > > cdef mpz_t ob_val > .... > def __add__(self, GMPInt other): > cdef mpz_t temp > mpz_init(temp) > mpz_add(temp, self.ob_val, other.ob_val) > mpz_clear(temp) > .... 
> > When I try to compile it I get this error: > > mpz_add(temp, self.ob_val, other.ob_val) > ^ > ------------------------------------------------------------ > > /temp/sandbox/pygmp/src/_gmp.pyx:49:27: Cannot convert Python > object to 'mpz_t' > > > It compiles correctly when I define a declaration for self in __add__ > like this, though: > > def __add__(GMPInt self, GMPInt other) > > I thought type declarations were supposed to be implicit on self in > classes. Is this just an anomaly? > > I have Cython 0.10. > > Thanks for the help! > -- That is correct most of the time. What you're observing here is due to the fact that the __add__ method may take self in either the left or right parameter. (This is how the slot works in extension classes, rather than having a separate __radd__ method.) We really need to add that to http://docs.cython.org/docs/extension_types.html#special-methods - Robert From greg.ewing at canterbury.ac.nz Sun Nov 16 01:14:20 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 16 Nov 2008 13:14:20 +1300 Subject: [Cython] __add__ type declaration In-Reply-To: <5531678A-2B80-4835-8BB2-F48EF91D6DF3@math.washington.edu> References: <1afaf6160811151539g3affb9a0tbfd8d58df6dd5f79@mail.gmail.com> <5531678A-2B80-4835-8BB2-F48EF91D6DF3@math.washington.edu> Message-ID: <491F65DC.60107@canterbury.ac.nz> Robert Bradshaw wrote: > What you're observing here is due to > the fact that the __add__ method may take self in either the left or > right parameter.
> > We really need to add that to > > http://docs.cython.org/docs/extension_types.html#special-methods In the meantime, you can read about it here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html -- Greg From aaron.devore at gmail.com Sun Nov 16 08:28:32 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Sat, 15 Nov 2008 23:28:32 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> Message-ID: <2ead2fb0811152328x44ada170r481193a8a3a5190b@mail.gmail.com> So would this be better? for k in d: v = d[k] # Do stuff with k and v -Aaron On Sat, Nov 15, 2008 at 1:53 PM, Robert Bradshaw wrote: > On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: > >> I'm trying to use PyDict_Next to iterate over a dict in a way that is >> identical to this statement: >> for k, v in d.items(): >> # do stuff with key and value >> >> PyDict_Next has the signature: >> int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject **key, >> PyObject **value) >> >> >> The basic idea is that the dict uses pos to track which key it is on. >> The key is then assigned to the key pointer and the corresponding >> value is assigned to the value pointer. It can really help efficiency >> because it doesn't involve iterators, tuples, etc. >> >> The most obvious code is the following: >> >> cdef int pos = 0 >> cdef object key, value >> while PyDict_Next(d, &pos, &key, &value): >> # do stuff with key and value >> >> However, apparently &python_object is not legal and I'm running into >> odd type issues with &pos. Is there a way to get around those >> limitations? > > Yes, declare them to be PyObject* rather than object. Then you'll > have to do all refcounting manually (as you would have had to do > anyways, as PyDict_Next doesn't decref its input). 
However, I doubt > iterating over the dict manually like that will be a significant > speed increase than the basic Python way of doing it. > > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From robertwb at math.washington.edu Sun Nov 16 08:40:54 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 15 Nov 2008 23:40:54 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <2ead2fb0811152328x44ada170r481193a8a3a5190b@mail.gmail.com> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <2ead2fb0811152328x44ada170r481193a8a3a5190b@mail.gmail.com> Message-ID: <891CED65-EA1C-42C2-AB4D-7614D85FA187@math.washington.edu> On Nov 15, 2008, at 11:28 PM, Aaron DeVore wrote: > So would this be better? > > for k in d: > v = d[k] > # Do stuff with k and v No, the for k, v in d.items(): ... is best. - Robert > > On Sat, Nov 15, 2008 at 1:53 PM, Robert Bradshaw > wrote: >> On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: >> >>> I'm trying to use PyDict_Next to iterate over a dict in a way >>> that is >>> identical to this statement: >>> for k, v in d.items(): >>> # do stuff with key and value >>> >>> PyDict_Next has the signature: >>> int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject >>> **key, >>> PyObject **value) >>> >>> >>> The basic idea is that the dict uses pos to track which key it is >>> on. >>> The key is then assigned to the key pointer and the corresponding >>> value is assigned to the value pointer. It can really help >>> efficiency >>> because it doesn't involve iterators, tuples, etc. 
>>> >>> The most obvious code is the following: >>> >>> cdef int pos = 0 >>> cdef object key, value >>> while PyDict_Next(d, &pos, &key, &value): >>> # do stuff with key and value >>> >>> However, apparently &python_object is not legal and I'm running into >>> odd type issues with &pos. Is there a way to get around those >>> limitations? >> >> Yes, declare them to be PyObject* rather than object. Then you'll >> have to do all refcounting manually (as you would have had to do >> anyways, as PyDict_Next doesn't decref its input). However, I doubt >> iterating over the dict manually like that will be a significant >> speed increase than the basic Python way of doing it. >> >> - Robert >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From aaron.devore at gmail.com Sun Nov 16 09:31:44 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Sun, 16 Nov 2008 00:31:44 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <891CED65-EA1C-42C2-AB4D-7614D85FA187@math.washington.edu> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <2ead2fb0811152328x44ada170r481193a8a3a5190b@mail.gmail.com> <891CED65-EA1C-42C2-AB4D-7614D85FA187@math.washington.edu> Message-ID: <2ead2fb0811160031t3383de9cnccc9aaeac9986695@mail.gmail.com> Doesn't that give the overhead of a tuple? I would check this myself but I don't have access to a Cython-enabled computer at the moment. :( On Sat, Nov 15, 2008 at 11:40 PM, Robert Bradshaw wrote: > On Nov 15, 2008, at 11:28 PM, Aaron DeVore wrote: > >> So would this be better? >> >> for k in d: >> v = d[k] >> # Do stuff with k and v > > No, the > > for k, v in d.items(): > ... > > is best. 
> > - Robert > > > >> >> On Sat, Nov 15, 2008 at 1:53 PM, Robert Bradshaw >> wrote: >>> On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: >>> >>>> I'm trying to use PyDict_Next to iterate over a dict in a way >>>> that is >>>> identical to this statement: >>>> for k, v in d.items(): >>>> # do stuff with key and value >>>> >>>> PyDict_Next has the signature: >>>> int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject >>>> **key, >>>> PyObject **value) >>>> >>>> >>>> The basic idea is that the dict uses pos to track which key it is >>>> on. >>>> The key is then assigned to the key pointer and the corresponding >>>> value is assigned to the value pointer. It can really help >>>> efficiency >>>> because it doesn't involve iterators, tuples, etc. >>>> >>>> The most obvious code is the following: >>>> >>>> cdef int pos = 0 >>>> cdef object key, value >>>> while PyDict_Next(d, &pos, &key, &value): >>>> # do stuff with key and value >>>> >>>> However, apparently &python_object is not legal and I'm running into >>>> odd type issues with &pos. Is there a way to get around those >>>> limitations? >>> >>> Yes, declare them to be PyObject* rather than object. Then you'll >>> have to do all refcounting manually (as you would have had to do >>> anyways, as PyDict_Next doesn't decref its input). However, I doubt >>> iterating over the dict manually like that will be a significant >>> speed increase than the basic Python way of doing it. 
>>> >>> - Robert >>> >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >>> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From robertwb at math.washington.edu Sun Nov 16 09:39:08 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 16 Nov 2008 00:39:08 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <2ead2fb0811160031t3383de9cnccc9aaeac9986695@mail.gmail.com> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <2ead2fb0811152328x44ada170r481193a8a3a5190b@mail.gmail.com> <891CED65-EA1C-42C2-AB4D-7614D85FA187@math.washington.edu> <2ead2fb0811160031t3383de9cnccc9aaeac9986695@mail.gmail.com> Message-ID: <03742100-7FB6-4156-9A9E-6947B8F5C055@math.washington.edu> On Nov 16, 2008, at 12:31 AM, Aaron DeVore wrote: > Doesn't that give the overhead of a tuple? I would check this myself > but I don't have access to a Cython-enabled computer at the moment. :( Yes, it does, but it is relatively cheap. Much cheaper than the overhead of a dictionary lookup, for instance. And also much less painful than manually reference counting using PyDict_Next. > On Sat, Nov 15, 2008 at 11:40 PM, Robert Bradshaw > wrote: >> On Nov 15, 2008, at 11:28 PM, Aaron DeVore wrote: >> >>> So would this be better? >>> >>> for k in d: >>> v = d[k] >>> # Do stuff with k and v >> >> No, the >> >> for k, v in d.items(): >> ... >> >> is best.
>> >> - Robert >> >> >> >>> >>> On Sat, Nov 15, 2008 at 1:53 PM, Robert Bradshaw >>> wrote: >>>> On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: >>>> >>>>> I'm trying to use PyDict_Next to iterate over a dict in a way >>>>> that is >>>>> identical to this statement: >>>>> for k, v in d.items(): >>>>> # do stuff with key and value >>>>> >>>>> PyDict_Next has the signature: >>>>> int PyDict_Next(PyObject *dictionary, Py_ssize_t *pos, PyObject >>>>> **key, >>>>> PyObject **value) >>>>> >>>>> >>>>> The basic idea is that the dict uses pos to track which key it is >>>>> on. >>>>> The key is then assigned to the key pointer and the corresponding >>>>> value is assigned to the value pointer. It can really help >>>>> efficiency >>>>> because it doesn't involve iterators, tuples, etc. >>>>> >>>>> The most obvious code is the following: >>>>> >>>>> cdef int pos = 0 >>>>> cdef object key, value >>>>> while PyDict_Next(d, &pos, &key, &value): >>>>> # do stuff with key and value >>>>> >>>>> However, apparently &python_object is not legal and I'm running >>>>> into >>>>> odd type issues with &pos. Is there a way to get around those >>>>> limitations? >>>> >>>> Yes, declare them to be PyObject* rather than object. Then you'll >>>> have to do all refcounting manually (as you would have had to do >>>> anyways, as PyDict_Next doesn't decref its input). However, I doubt >>>> iterating over the dict manually like that will be a significant >>>> speed increase than the basic Python way of doing it. 
>>>> >>>> - Robert >>>> >>>> _______________________________________________ >>>> Cython-dev mailing list >>>> Cython-dev at codespeak.net >>>> http://codespeak.net/mailman/listinfo/cython-dev >>>> >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From stefan_ml at behnel.de Sun Nov 16 10:38:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Nov 2008 10:38:29 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> Message-ID: <491FEA15.2020207@behnel.de> Hi, Robert Bradshaw wrote: > On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: > >> I'm trying to use PyDict_Next to iterate over a dict in a way that is >> identical to this statement: >> for k, v in d.items(): >> # do stuff with key and value > > Yes, declare them to be PyObject* rather than object. Then you'll > have to do all refcounting manually (as you would have had to do > anyways, as PyDict_Next doesn't decref its input). However, I doubt > iterating over the dict manually like that will be a significant > speed increase than the basic Python way of doing it. 
I ran timeit on this:
---------------------------------------
# Declarations assumed here; the original post did not show them.
cdef extern from "Python.h":
    ctypedef struct PyObject
    int PyDict_Next(object d, Py_ssize_t* pos, PyObject** key, PyObject** value)

def items(d):
    for k,v in d.items():
        pass

def iteritems(d):
    for k,v in d.iteritems():
        pass

def dictnext(d):
    cdef Py_ssize_t pos = 0
    cdef PyObject* pk = NULL
    cdef PyObject* pv = NULL

    while PyDict_Next(d, &pos, &pk, &pv):
        # pk and pv are borrowed references; the casts make Cython
        # take owned references for the Python variables k and v.
        k = <object>pk
        v = <object>pv
---------------------------------------
With a 1000 item dict in Py2.5, it gives me this:
$ python2.5 -m timeit -s '...' 'items(d)'
10000 loops, best of 3: 98.3 usec per loop
$ python2.5 -m timeit -s '...' 'iteritems(d)'
10000 loops, best of 3: 50.8 usec per loop
$ python2.5 -m timeit -s '...' 'dictnext(d)'
10000 loops, best of 3: 26.8 usec per loop
So, yes, there is a speedup. It may not be worth it for small dicts, but the code is not so bad that it's not worth a 50% speedup for larger dicts. Stefan
From robertwb at math.washington.edu Sun Nov 16 10:47:34 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 16 Nov 2008 01:47:34 -0800 Subject: [Cython] Using PyDict_Next In-Reply-To: <491FEA15.2020207@behnel.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> Message-ID: <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> On Nov 16, 2008, at 1:38 AM, Stefan Behnel wrote: > Hi, > > Robert Bradshaw wrote: >> On Nov 15, 2008, at 1:17 PM, Aaron DeVore wrote: >> >>> I'm trying to use PyDict_Next to iterate over a dict in a way >>> that is >>> identical to this statement: >>> for k, v in d.items(): >>> # do stuff with key and value >> >> Yes, declare them to be PyObject* rather than object. Then you'll >> have to do all refcounting manually (as you would have had to do >> anyways, as PyDict_Next doesn't decref its input). However, I doubt >> iterating over the dict manually like that will be a significant >> speed increase than the basic Python way of doing it.
> > I ran timeit on this: > > --------------------------------------- > def items(d): > for k,v in d.items(): > pass > > def iteritems(d): > for k,v in d.iteritems(): > pass > > def dictnext(d): > cdef Py_ssize_t pos = 0 > cdef PyObject* pk = NULL > cdef PyObject* pv = NULL > > while PyDict_Next(d, &pos, &pk, &pv): > k = pk > v = pv > --------------------------------------- > > With a 1000 item dict in Py2.5, it gives me this: > > $ python2.5 -m timeit -s '...' 'items(d)' > 10000 loops, best of 3: 98.3 usec per loop > $ python2.5 -m timeit -s '...' 'iteritems(d)' > 10000 loops, best of 3: 50.8 usec per loop > $ python2.5 -m timeit -s '...' 'dictnext(d)' > 10000 loops, best of 3: 26.8 usec per loop > > So, yes, there is a speedup. It may not be worth it for small > dicts, but > the code is not so bad that it's not worth a 50% speedup for larger > dicts. I stand happily corrected. I thought there would be some improvement, but nowhere near that much. I am particularly surprised at the poor performance of items vs iteritems; I guess the allocation is killing us here.
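The items() versus iteritems() gap in the figures quoted above is consistent with Python 2's items() building a complete list of tuples before the loop body runs. A rough way to look at the same effect on a current interpreter is sketched below. This is only an approximation under stated assumptions: Python 3 no longer has iteritems() and its items() is already lazy, so the eager Python 2 behaviour is imitated with list(d.items()). The function names, the 1000-item dict and the loop count are illustrative, the PyDict_Next variant is not covered because it needs Cython, and the timings will not match the Py2.5 numbers.

```python
import timeit


def items_eager(d):
    # Imitates Python 2's d.items(): every (key, value) pair is
    # materialised in a list before iteration starts.
    for k, v in list(d.items()):
        pass


def items_lazy(d):
    # Imitates Python 2's d.iteritems(): pairs are produced one at a time.
    for k, v in d.items():
        pass


def compare(n_items=1000, number=2000):
    """Return (eager_seconds, lazy_seconds) for `number` runs of each loop."""
    d = {i: str(i) for i in range(n_items)}
    eager = timeit.timeit(lambda: items_eager(d), number=number)
    lazy = timeit.timeit(lambda: items_lazy(d), number=number)
    return eager, lazy


if __name__ == "__main__":
    runs = 2000
    eager, lazy = compare(number=runs)
    print("eager list of pairs: %.1f usec per loop" % (eager / runs * 1e6))
    print("lazy view:           %.1f usec per loop" % (lazy / runs * 1e6))
```

Any difference between the two figures is the cost of allocating the intermediate list, which is the part a direct PyDict_Next loop also avoids.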
- Robert From stefan_ml at behnel.de Sun Nov 16 11:33:34 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Nov 2008 11:33:34 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> Message-ID: <491FF6FE.8080901@behnel.de> Hi, Robert Bradshaw wrote: > On Nov 16, 2008, at 1:38 AM, Stefan Behnel wrote: >> I ran timeit on this: >> >> --------------------------------------- >> def items(d): >> for k,v in d.items(): >> pass >> >> def iteritems(d): >> for k,v in d.iteritems(): >> pass >> >> def dictnext(d): >> cdef Py_ssize_t pos = 0 >> cdef PyObject* pk = NULL >> cdef PyObject* pv = NULL >> >> while PyDict_Next(d, &pos, &pk, &pv): >> k = pk >> v = pv >> --------------------------------------- >> >> With a 1000 item dict in Py2.5, it gives me this: >> >> $ python2.5 -m timeit -s '...' 'items(d)' >> 10000 loops, best of 3: 98.3 usec per loop >> $ python2.5 -m timeit -s '...' 'iteritems(d)' >> 10000 loops, best of 3: 50.8 usec per loop >> $ python2.5 -m timeit -s '...' 'dictnext(d)' >> 10000 loops, best of 3: 26.8 usec per loop >> >> So, yes, there is a speedup. It may not be worth it for small dicts, >> but the code is not so bad that it's not worth a 50% speedup for >> larger dicts. > > I stand happily corrected. I though there would be some improvement, > but not near that much. I am particularly surprised at the poor > performance of items vs iteritems, I guess the allocation is killing > us here. That makes me wonder if it's not worth using such an implementation for cdef dict d ... for (key|value|key,value) in d.iter(keys|values|items)(): ... internally - but *only* for the ".iter*()" variants to avoid introducing problems with dict modification during iteration. 
That would be a pretty cool optimisation. The generic loop code we currently generate is huge compared to a straight call to PyDict_Next(). Maybe a tree transformation could replace the for-loops above with a new DictLoopNode that would generate the respective code. Stefan From dagss at student.matnat.uio.no Sun Nov 16 12:11:36 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 16 Nov 2008 12:11:36 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <491FF6FE.8080901@behnel.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> Message-ID: <491FFFE8.7090906@student.matnat.uio.no> Stefan Behnel wrote: > That makes me wonder if it's not worth using such an implementation for > > cdef dict d > ... > for (key|value|key,value) in d.iter(keys|values|items)(): > ... > > internally - but *only* for the ".iter*()" variants to avoid introducing > problems with dict modification during iteration. That would be a pretty > cool optimisation. The generic loop code we currently generate is huge > compared to a straight call to PyDict_Next(). > > Maybe a tree transformation could replace the for-loops above with a new > DictLoopNode that would generate the respective code. +1, good idea -- though I'd definitely not have a DictLoopNode; instead one should use a regular WhileStatNode containing a SimpleCallNode calling PyDict_Next. The direction things should be taking is less loop node types, not more (e.g. transform ForInStatNode into WhileStatNode if an iterator is used). (See also: The copy&paste-style code duplication going on in the generate_..._code-implementations of the current looping nodes.) 
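To make the "for-loop becomes a plain while-loop" idea concrete, here is a small sketch of such a tree transformation. It is an analogy only: it uses Python's standard ast module rather than Cython's compiler tree, next() on a dict iterator stands in for the PyDict_Next() call, and the class name DictLoopToWhile, the helper rewrite() and the temporary name _dict_iter_N are made up for illustration. Loops with an else clause, and anything other than a bare .iteritems() call, are deliberately left untouched. ast.unparse() needs Python 3.9 or later.

```python
import ast


class DictLoopToWhile(ast.NodeTransformer):
    """Rewrite `for T in X.iteritems(): BODY` into an explicit while loop."""

    def __init__(self):
        self.counter = 0  # used to generate distinct temporary names

    def visit_For(self, node):
        self.generic_visit(node)  # handle nested loops first
        call = node.iter
        matches = (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and call.func.attr == "iteritems"
            and not call.args
            and not call.keywords
        )
        if not matches or node.orelse:
            return node  # not the pattern we optimise; keep the generic loop

        tmp = "_dict_iter_%d" % self.counter
        self.counter += 1
        # Build the replacement from a template, then splice in the
        # original dict expression (for D) and loop target (for T).
        template = ast.parse(
            "{tmp} = iter(D.items())\n"
            "while True:\n"
            "    try:\n"
            "        T = next({tmp})\n"
            "    except StopIteration:\n"
            "        break\n".format(tmp=tmp)
        )
        setup, loop = template.body
        setup.value.args[0].func.value = call.func.value
        loop.body[0].body[0].targets[0] = node.target
        loop.body.extend(node.body)
        return [setup, loop]


def rewrite(source):
    """Return `source` with matching dict loops turned into while loops."""
    tree = DictLoopToWhile().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(rewrite("for k, v in d.iteritems():\n    print(k, v)\n"))
```

The point of the sketch is the shape of the result: one generic while-loop node with an ordinary call in it, instead of a dedicated loop node type for dicts.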
-- Dag Sverre From stefan_ml at behnel.de Sun Nov 16 12:22:00 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Nov 2008 12:22:00 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <491FFFE8.7090906@student.matnat.uio.no> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> Message-ID: <49200258.1050809@behnel.de> Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> That makes me wonder if it's not worth using such an implementation for >> >> cdef dict d >> ... >> for (key|value|key,value) in d.iter(keys|values|items)(): >> ... >> >> internally - but *only* for the ".iter*()" variants to avoid introducing >> problems with dict modification during iteration. That would be a pretty >> cool optimisation. The generic loop code we currently generate is huge >> compared to a straight call to PyDict_Next(). >> >> Maybe a tree transformation could replace the for-loops above with a new >> DictLoopNode that would generate the respective code. > > +1, good idea -- though I'd definitely not have a DictLoopNode; instead > one should use a regular WhileStatNode containing a SimpleCallNode > calling PyDict_Next. The direction things should be taking is less loop > node types, not more (e.g. transform ForInStatNode into WhileStatNode if > an iterator is used). (See also: The copy&paste-style code duplication > going on in the generate_..._code-implementations of the current looping > nodes.) Very true. I just noticed that when I looked through the current implementations. There's also the range() optimisation which could be done with a tree transform now that we have them in place. 
Stefan From greg.ewing at canterbury.ac.nz Sun Nov 16 22:08:47 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Nov 2008 10:08:47 +1300 Subject: [Cython] Using PyDict_Next In-Reply-To: <491FFFE8.7090906@student.matnat.uio.no> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> Message-ID: <49208BDF.9030708@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > I'd definitely not have a DictLoopNode; instead > one should use a regular WhileStatNode containing a SimpleCallNode > calling PyDict_Next. Except that you can't express a call to PyDict_Next using a standard SimpleCallNode because of its weird signature. -- Greg From stefan_ml at behnel.de Sun Nov 16 22:09:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Nov 2008 22:09:44 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <49208BDF.9030708@canterbury.ac.nz> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49208BDF.9030708@canterbury.ac.nz> Message-ID: <49208C18.6050203@behnel.de> Hi, Greg Ewing wrote: > Dag Sverre Seljebotn wrote: >> I'd definitely not have a DictLoopNode; instead >> one should use a regular WhileStatNode containing a SimpleCallNode >> calling PyDict_Next. > > Except that you can't express a call to PyDict_Next using > a standard SimpleCallNode because of its weird signature. Yep, I noticed that. My work-around was not to officially declare it at all, but to design a hand-built function type that is only used in the transform. Seems to work for me so far. 
Stefan From greg.ewing at canterbury.ac.nz Sun Nov 16 22:19:54 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Nov 2008 10:19:54 +1300 Subject: [Cython] Using PyDict_Next In-Reply-To: <49208C18.6050203@behnel.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49208BDF.9030708@canterbury.ac.nz> <49208C18.6050203@behnel.de> Message-ID: <49208E7A.3060706@canterbury.ac.nz> Stefan Behnel wrote: > My work-around was not to officially declare it at > all, but to design a hand-built function type that is only used in the > transform. Another possibility is to enhance the builtin function signature mechanism with a pointer-to-object-reference type and make the call nodes understand it. That would be a first step towards allowing you to declare such functions yourself. -- Greg From dagss at student.matnat.uio.no Sun Nov 16 22:40:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 16 Nov 2008 22:40:25 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <49200258.1050809@behnel.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49200258.1050809@behnel.de> Message-ID: <49209349.7030206@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: >> Stefan Behnel wrote: >>> That makes me wonder if it's not worth using such an implementation for >>> >>> cdef dict d >>> ... >>> for (key|value|key,value) in d.iter(keys|values|items)(): >>> ... 
>>> >>> internally - but *only* for the ".iter*()" variants to avoid introducing >>> problems with dict modification during iteration. That would be a pretty >>> cool optimisation. The generic loop code we currently generate is huge >>> compared to a straight call to PyDict_Next(). >>> >>> Maybe a tree transformation could replace the for-loops above with a new >>> DictLoopNode that would generate the respective code. >> +1, good idea -- though I'd definitely not have a DictLoopNode; instead >> one should use a regular WhileStatNode containing a SimpleCallNode >> calling PyDict_Next. The direction things should be taking is less loop >> node types, not more (e.g. transform ForInStatNode into WhileStatNode if >> an iterator is used). (See also: The copy&paste-style code duplication >> going on in the generate_..._code-implementations of the current looping >> nodes.) > > Very true. I just noticed that when I looked through the current > implementations. There's also the range() optimisation which could be done > with a tree transform now that we have them in place. > Be aware of the TempsBlockNode I wrote if you look at this, for any temporary variables needed. 
(It is currently unused so may have bugs too...ask if you have problems with it) -- Dag Sverre From stefan_ml at behnel.de Sun Nov 16 23:00:21 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Nov 2008 23:00:21 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <49209349.7030206@student.matnat.uio.no> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49200258.1050809@behnel.de> <49209349.7030206@student.matnat.uio.no> Message-ID: <492097F5.6040800@behnel.de> Hi Dag, Dag Sverre Seljebotn wrote: > Be aware of the TempsBlockNode I wrote if you look at this, for any > temporary variables needed. (It is currently unused so may have bugs > too...ask if you have problems with it) Yep, I found it already. :) The current implementation works for me in all cases I found relevant. It supports all three .iter*() methods and .iteritems() works with both a tuple and two separate variables as targets. However, the code (in Optimize.py) looks huge and feels a bit clumsy. Could you review it and try to fix it up? 
Thanks, Stefan From dagss at student.matnat.uio.no Mon Nov 17 10:27:59 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 17 Nov 2008 10:27:59 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <492097F5.6040800@behnel.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49200258.1050809@behnel.de> <49209349.7030206@student.matnat.uio.no> <492097F5.6040800@behnel.de> Message-ID: <4921391F.4020108@student.matnat.uio.no> Stefan Behnel wrote: > Hi Dag, > > Dag Sverre Seljebotn wrote: > >> Be aware of the TempsBlockNode I wrote if you look at this, for any >> temporary variables needed. (It is currently unused so may have bugs >> too...ask if you have problems with it) >> > > Yep, I found it already. :) > > The current implementation works for me in all cases I found relevant. It > supports all three .iter*() methods and .iteritems() works with both a > tuple and two separate variables as targets. > > However, the code (in Optimize.py) looks huge and feels a bit clumsy. Could > you review it and try to fix it up? > I think it looks OK to me, mostly, but see below. These kind of long code strings is what you get when you manually construct nodes -- one could work at making node construction more readable, or you could try to use a TreeFragment (with three different cases depending on the unpacking). It would be ok to simply have a FOO call-node and then replace the SimpleCallNode's entry after the TreeFragment construction. But within the current framework seems to be the kind of code length you get. There's one thing that worries me: Calling TupleNode's allocate_temps. 
The reason is that "old" temp allocation is tightly connected to the order things happen in (a temp is "locked" for the duration of the recursive call), and must NOT be interfered with from any "external" code path (which is why I want to get rid of it and move to the new one). TupleNode doesn't allocate temps directly, but one also needs to be sure that the children do not do any temp allocation. (Passing in "None" as the env parameter certainly helps enforce this. I.e. if allocate_temps does something that accesses the "env" parameter, then you probably cannot do it at this stage without problems.) These things lead to extremely subtle bugs (e.g. reuse of temporary variables in wrong places), so care is needed. Ideally, one should find a way of not having to call analyse_types and allocate_temps on TupleNode; instead one should pass the different variables that are set during analysis directly to the TupleNode constructor (i.e. the TupleNode must be constructed in an "analysed state"). I don't have time for fixing this up right now though. Dag Sverre From stefan_ml at behnel.de Mon Nov 17 10:49:33 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 17 Nov 2008 10:49:33 +0100 (CET) Subject: [Cython] Using PyDict_Next In-Reply-To: <4921391F.4020108@student.matnat.uio.no> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49200258.1050809@behnel.de> <49209349.7030206@student.matnat.uio.no> <492097F5.6040800@behnel.de> <4921391F.4020108@student.matnat.uio.no> Message-ID: <57322.213.61.181.86.1226915373.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Dag Sverre Seljebotn wrote: > There's one thing that worries me: Calling TupleNode's allocate_temps. > [...]
> Ideally, one should find a way of not having to call analyse_types and > allocate_temps on TupleNode, instead one should pass the different > variables that is set during analysis directly to the TupleNode > constructor (i.e. the TupleNode must be constructed in an "analysed > state"). That's what I started off with, until I noticed that I was only doing stuff that analyse_types() and allocate_temps() were doing already. Initialising the TupleNode manually leads to tons of stuff that you have to build/set up first. I'll take another look at it. Thanks for the review. Stefan From dagss at student.matnat.uio.no Mon Nov 17 11:03:54 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 17 Nov 2008 11:03:54 +0100 Subject: [Cython] Using PyDict_Next In-Reply-To: <57322.213.61.181.86.1226915373.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <2ead2fb0811151317sce4a662iaf5eadfe42e0fa48@mail.gmail.com> <7E3E9341-33BC-4553-A58A-5816BBE41A6F@math.washington.edu> <491FEA15.2020207@behnel.de> <40BA29E4-8D08-4AF2-883B-16C55623AF9E@math.washington.edu> <491FF6FE.8080901@behnel.de> <491FFFE8.7090906@student.matnat.uio.no> <49200258.1050809@behnel.de> <49209349.7030206@student.matnat.uio.no> <492097F5.6040800@behnel.de> <4921391F.4020108@student.matnat.uio.no> <57322.213.61.181.86.1226915373.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: <4921418A.9060301@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: > >> There's one thing that worries me: Calling TupleNode's allocate_temps. >> [...] >> Ideally, one should find a way of not having to call analyse_types and >> allocate_temps on TupleNode, instead one should pass the different >> variables that is set during analysis directly to the TupleNode >> constructor (i.e. the TupleNode must be constructed in an "analysed >> state"). 
>> > > That's what I started off with, until I noticed that I was only doing > stuff that analyse_types() and allocate_temps() were doing already. > > Initialising the TupleNode manually leads to tons of stuff that you have > to build/set up first. > > I'll take another look at it. Thanks for the review. > I'd add a static member method "create_analysed" in TupleNode which does all the setup stuff, but which doesn't need an "env" passed in. Then there should be much potential in code reuse in helper methods within TupleNode. Dag Sverre From dalcinl at gmail.com Mon Nov 17 14:40:59 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 17 Nov 2008 10:40:59 -0300 Subject: [Cython] build cython-devel on Windows fails if MSVC not available Message-ID: I tried to build && install cython-devel on Windows XP; I do not have MSVC installed. All went fine, but compiling the extension modules (Scanners, Scanning, Parsing, Visitor) failed badly because the MSVC compiler was not available. Does it make sense to fix this by printing a warning and going on if the extension building fails for whatever reason? Unless a user actually opens and takes a look at setup.py, there is no way to discover the '--no-cython-compile' flag. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Mon Nov 17 18:05:29 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 17 Nov 2008 18:05:29 +0100 (CET) Subject: [Cython] build cython-devel on Windows fails if MSVC not available In-Reply-To: References: Message-ID: <40007.213.61.181.86.1226941529.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Lisandro Dalcin wrote: > I tried to build && install cython-devel on Windows XP, I do not have > MSVC installed.
All went fine, but compiling extension modules > (Scanners, Scanning, Parsing, Visitor) failed bad because the MSVC > compiler was not available. > > Do it make sense to fix this by printing a warning and go on if the > extension building fails for ever reason? We already gracefully handle the case that the Cython compilation fails, so, yes, that would be nice. It's harder to do, though, as it happens inside the setup() call. Maybe we can figure out before-hand if a compiler is available, or just default to not compiling on Windows. > Unless a user actually open and take a look at setup.py, there is no > way to discover the '--no-cython-compile' flag. Yes, a good help message would be good. Stefan From robertwb at math.washington.edu Mon Nov 17 19:33:58 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 17 Nov 2008 10:33:58 -0800 Subject: [Cython] build cython-devel on Windows fails if MSVC not available In-Reply-To: <40007.213.61.181.86.1226941529.squirrel@groupware.dvs.informatik.tu-darmstadt.de> References: <40007.213.61.181.86.1226941529.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On Nov 17, 2008, at 9:05 AM, Stefan Behnel wrote: > Lisandro Dalcin wrote: >> I tried to build && install cython-devel on Windows XP, I do not have >> MSVC installed. All went fine, but compiling extension modules >> (Scanners, Scanning, Parsing, Visitor) failed bad because the MSVC >> compiler was not available. >> >> Do it make sense to fix this by printing a warning and go on if the >> extension building fails for ever reason? > > We already gracefully handle the case that the Cython compilation > fails, > so, yes, that would be nice. It's harder to do, though, as it happens > inside the setup() call. Maybe we can figure out before-hand if a > compiler > is available, or just default to not compiling on Windows. 
I'm not sure of the best way to check for compiler presence, but it seems compiling should be the default as 99% of the users who use Cython will be using it with a compiler (otherwise, what's the point?). >> Unless a user actually open and take a look at setup.py, there is no >> way to discover the '--no-cython-compile' flag. > > Yes, a good help message would be good. +1 Robert From stefan_ml at behnel.de Mon Nov 17 20:27:47 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 17 Nov 2008 20:27:47 +0100 Subject: [Cython] Exception leaking in Cython 0.10 Message-ID: <4921C5B3.5080503@behnel.de> Hi, an lxml user discovered that Cython code currently leaks exception references when entering a try-except block while an exception is set in sys.exc_*(). This was introduced in my change to the try-except handling a while ago. It's already fixed in current cython-devel. It would therefore be good to release a 0.10.1 soon to officially fix this. Stefan From dagss at student.matnat.uio.no Mon Nov 17 20:48:12 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 17 Nov 2008 20:48:12 +0100 Subject: [Cython] Exception leaking in Cython 0.10 In-Reply-To: <4921C5B3.5080503@behnel.de> References: <4921C5B3.5080503@behnel.de> Message-ID: <4921CA7C.7040108@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > an lxml user discovered that Cython code currently leaks exception > references when entering a try-except blocks while an exception is set in > sys.exc_*(). > > This was introduced in my change to the try-except handling a while ago. > It's already fixed in current cython-devel. It would therefore be good to > release a 0.10.1 soon to officially fix this. If you commit the same change to the cython-branch, one can release a bugfix release immediately with much less extensive release checking (i.e. there might be bugs introduced by some of the new features since the release?).
My opinion: I'd like a system where releases of cython-devel always enter a period of freezing and beta stage and lead to incremental "major" version (i.e. 0.11), while cython can be released at will and increments minor version (i.e. 0.10.1). The logic is thus that a) when there is a new feature, the first number is incremented, b) never commit new features to cython, c) changes in cython-devel don't need to "slow down" or freeze (so that beta-testing can happen without introducing new bugs in the meantime) just because we need to get a critical bugfix out the door. -- Dag Sverre From stefan_ml at behnel.de Mon Nov 17 20:50:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 17 Nov 2008 20:50:44 +0100 Subject: [Cython] Exception leaking in Cython 0.10 In-Reply-To: <4921CA7C.7040108@student.matnat.uio.no> References: <4921C5B3.5080503@behnel.de> <4921CA7C.7040108@student.matnat.uio.no> Message-ID: <4921CB14.9090606@behnel.de> Hi, Dag Sverre Seljebotn wrote: > If you commit the same change to the cython-branch, Done. > one can release a > bugfix release immediatily with much less extensive release checking Fine with me. > I'd like a system where releases of cython-devel always enter a period > of freezing and beta stage and lead to incremental "major" version (i.e. > 0.11), while cython can be released at will and increments minor version > (i.e. 0.10.1). Sure. Stefan From dalcinl at gmail.com Tue Nov 18 00:00:58 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 17 Nov 2008 20:00:58 -0300 Subject: [Cython] build cython-devel on Windows fails if MSVC not available In-Reply-To: References: <40007.213.61.181.86.1226941529.squirrel@groupware.dvs.informatik.tu-darmstadt.de> Message-ID: On Mon, Nov 17, 2008 at 3:33 PM, Robert Bradshaw wrote: > On Nov 17, 2008, at 9:05 AM, Stefan Behnel wrote: > > I'm not sure the best way to check for compiler presence, I'll not follow that path. Checking for compilers is going to be a nightmare.
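An alternative to probing for a compiler is to attempt the build and downgrade any failure to a warning. A minimal sketch of that idea follows, under stated assumptions: make_optional and optional_build_ext are names invented here and are not taken from any patch in this thread, the broad `except Exception` is a simplification, and the base class is passed in so that the same wrapper can be applied to the build_ext command of distutils or of setuptools.

```python
import warnings


def make_optional(base_build_ext):
    """Return a build_ext subclass whose failures only produce a warning.

    `base_build_ext` is the build_ext command class to wrap
    (hypothetical helper; the name is made up for this sketch).
    """

    class optional_build_ext(base_build_ext):
        def run(self):
            try:
                base_build_ext.run(self)
            except Exception as exc:  # e.g. no C compiler; kept deliberately broad
                warnings.warn(
                    "Building the extension modules failed (%s); "
                    "continuing with the pure Python modules." % exc
                )

    return optional_build_ext


# Intended use in a setup.py (left as a comment so that importing this
# sketch has no side effects):
#
#   from distutils.command.build_ext import build_ext
#   setup(..., cmdclass={"build_ext": make_optional(build_ext)})
```

Because the exception is swallowed inside run(), the remaining build and install steps still execute, which is the "go on" behaviour asked for earlier in the thread. Note that distutils is no longer part of the standard library in recent Python versions; setuptools provides a build_ext command that could be wrapped in the same way.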
I'll just try and catch errors if the C compiler fails for whatever reason on any system and emit a one-line (or perhaps a three-line banner?) warning. All this by implementing in setup.py a custom build_ext command inheriting from the distutils one and passing a 'cmdclass' dict argument to the setup() function. > but it > seems compiling should be the default as 99% of the users who use > Cython will be using it with a compiler (otherwise, what's the point?). What about cross-compiling for other platforms? No idea if this makes sense, though ;-). >>> Unless a user actually open and take a look at setup.py, there is no >>> way to discover the '--no-cython-compile' flag. >> >> Yes, a good help message would be good. > > +1 OK, I'll try to write a patch by tomorrow, and then send it for review and comments. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Tue Nov 18 02:24:32 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 17 Nov 2008 22:24:32 -0300 Subject: [Cython] about Cython author list in setup.py Message-ID: While hacking on setup.py, I noticed the author list has four members. I understand this is already a bit long, but ... I really believe Dag has made such wonderful and non-trivial contributions to the project as to deserve a place there. What do you all think?
Dag, you cannot vote unless you say +1 ;-) -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Tue Nov 18 03:07:13 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 17 Nov 2008 23:07:13 -0300 Subject: [Cython] PATH: go on if C compiler fails, plus problems with 'import cython' (lowercase) Message-ID: Here you have for review a quick hackery in setup.py to go on if the C compiler fails. However, I had problems with 'import cython' pure Python mode. On WinXP, I got failures with 'no module named cython'. No idea what's going on!! It's my second day using WinDog!!. In the Cython.diff patch, I've just commented out importing/using the (lowercase) 'cython', just because I do not know how to fix this. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: setup.diff Type: application/octet-stream Size: 1123 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20081117/3a2cd476/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed...
Name: Cython.diff Type: application/octet-stream Size: 1510 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20081117/3a2cd476/attachment-0001.obj From stefan_ml at behnel.de Tue Nov 18 07:20:48 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 18 Nov 2008 07:20:48 +0100 Subject: [Cython] about Cython author list in setup.py In-Reply-To: References: Message-ID: <49225EC0.8030602@behnel.de> Hi, Lisandro Dalcin wrote: > While, hacking on setup.py, I noticed the author list has four member. > I understand this is already a bit long, but ... > > I really believe Dag had made so wonderful and non-trivial > contributions to the project as to deserve a place there. > > What do you all think? Dag, you cannot vote unless you say +1 ;-) +1.9 Stefan From robertwb at math.washington.edu Tue Nov 18 11:05:27 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 18 Nov 2008 02:05:27 -0800 Subject: [Cython] about Cython author list in setup.py In-Reply-To: <49225EC0.8030602@behnel.de> References: <49225EC0.8030602@behnel.de> Message-ID: On Nov 17, 2008, at 10:20 PM, Stefan Behnel wrote: > Hi, > > Lisandro Dalcin wrote: >> While, hacking on setup.py, I noticed the author list has four >> member. >> I understand this is already a bit long, but ... >> >> I really believe Dag had made so wonderful and non-trivial >> contributions to the project as to deserve a place there. >> >> What do you all think? Dag, you cannot vote unless you say +1 ;-) > > +1.9 I fully approve as well. 
- Robert From robertwb at math.washington.edu Tue Nov 18 11:21:40 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 18 Nov 2008 02:21:40 -0800 Subject: [Cython] Exception leaking in Cython 0.10 In-Reply-To: <4921CB14.9090606@behnel.de> References: <4921C5B3.5080503@behnel.de> <4921CA7C.7040108@student.matnat.uio.no> <4921CB14.9090606@behnel.de> Message-ID: On Nov 17, 2008, at 11:50 AM, Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> If you commit the same change to the cython-branch, > > Done. > >> one can release a >> bugfix release immediatily with much less extensive release checking > > Fine with me. > >> I'd like a system where releases of cython-devel always enter a >> period >> of freezing and beta stage and lead to incremental "major" version >> (i.e. >> 0.11), while cython can be released at will and increments minor >> version >> (i.e. 0.10.1). > > Sure. +1 Any other fixes that need to go in before I push a 0.10.1 out? - Robert From robertwb at math.washington.edu Tue Nov 18 11:35:56 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 18 Nov 2008 02:35:56 -0800 Subject: [Cython] broken array declarators when size is an expression In-Reply-To: <491E84CE.1010801@behnel.de> References: <491E84CE.1010801@behnel.de> Message-ID: On Nov 15, 2008, at 12:14 AM, Stefan Behnel wrote: > Hi, > > Lisandro Dalcin wrote: >> changeset: 1333:34aca76e1b9d >> summary: array size must be set as int, not numeric string >> >> broke mpi4py and petsc4py, where I use stack-allocated arrays where >> the size comes from an (external) enumeration, for example: >> >> cdef char name[MPI_MAX_OBJECT_NAME+1] > > Sorry for that. The problem is that we don't currently have a way > to say "give > me the compile-time result for this subtree, but don't complain if > it's a > runtime value". 
I already needed that in a couple of places when > working on > Cython, as it can lead to different code when you know the result > of an > expression. I just never got around to implement this. > > I think it's wrong that compile_time_value() raises a compiler > error. It > should rather return the result with a hint if it was determined > completely or > if part of it is runtime-determined. Then the caller can decide > what to do > with this information. This has annoyed me too. http://trac.cython.org/cython_trac/ticket/119 - Robert From dalcinl at gmail.com Tue Nov 18 14:12:23 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 18 Nov 2008 10:12:23 -0300 Subject: [Cython] about Cython author list in setup.py In-Reply-To: References: <49225EC0.8030602@behnel.de> Message-ID: Robert, could you please make the change? On Tue, Nov 18, 2008 at 7:05 AM, Robert Bradshaw wrote: > On Nov 17, 2008, at 10:20 PM, Stefan Behnel wrote: > >> Hi, >> >> Lisandro Dalcin wrote: >>> While, hacking on setup.py, I noticed the author list has four >>> member. >>> I understand this is already a bit long, but ... >>> >>> I really believe Dag had made so wonderful and non-trivial >>> contributions to the project as to deserve a place there. >>> >>> What do you all think? Dag, you cannot vote unless you say +1 ;-) >> >> +1.9 > > I fully approve as well. 
>
> - Robert
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no Tue Nov 18 14:39:06 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 18 Nov 2008 14:39:06 +0100
Subject: [Cython] broken array declarators when size is an expression
In-Reply-To:
References: <491E84CE.1010801@behnel.de>
Message-ID: <4922C57A.7080000@student.matnat.uio.no>

Robert Bradshaw wrote:
> On Nov 15, 2008, at 12:14 AM, Stefan Behnel wrote:
>
>> Hi,
>>
>> Lisandro Dalcin wrote:
>>
>>> changeset: 1333:34aca76e1b9d
>>> summary: array size must be set as int, not numeric string
>>>
>>> broke mpi4py and petsc4py, where I use stack-allocated arrays where
>>> the size comes from an (external) enumeration, for example:
>>>
>>> cdef char name[MPI_MAX_OBJECT_NAME+1]
>>>
>> Sorry for that. The problem is that we don't currently have a way to say
>> "give me the compile-time result for this subtree, but don't complain if
>> it's a runtime value". I already needed that in a couple of places when
>> working on Cython, as it can lead to different code when you know the
>> result of an expression. I just never got around to implement this.
>>
>> I think it's wrong that compile_time_value() raises a compiler error. It
>> should rather return the result with a hint if it was determined
>> completely or if part of it is runtime-determined. Then the caller can
>> decide what to do with this information.
>>
>
> This has annoyed me too.
> http://trac.cython.org/cython_trac/ticket/119
>
>

Idea: I was thinking of perhaps a transform that ran through the tree
and replaced anything that is a compile-time expression with its literal
result, i.e.

    DEF a = 4
    print a

would be turned into

    print 4

by this transform directly (just by calling compile_time_value and
putting it back into IntNodes and StringNodes, really).

So code after that transform (which would be run right after parsing)
would not call compile_time_value at all -- if it is known compile-time,
a literal exprnode is there (and treated the same way as if a literal
was coded in by the user), otherwise it is only known runtime.

Seems simpler than extending the compile_time_value protocol with
information about "runtime-determined" etc.

Unfortunately I don't have time to do it.

Dag Sverre

From dalcinl at gmail.com Tue Nov 18 16:34:48 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 18 Nov 2008 12:34:48 -0300
Subject: [Cython] broken array declarators when size is an expression
In-Reply-To: <4922C57A.7080000@student.matnat.uio.no>
References: <491E84CE.1010801@behnel.de> <4922C57A.7080000@student.matnat.uio.no>
Message-ID:

On Tue, Nov 18, 2008 at 10:39 AM, Dag Sverre Seljebotn wrote:
> Idea: I was thinking of perhaps a transform that ran through the tree
> and replaced anything that is a compile-time expression with its literal
> result, i.e.

But this will not work with external definitions. For example, in MPI
I have to deal with C constants like MPI_MAX_OBJECT_NAME. I declare
all of them as external enumerations, and there is no way to tell Cython
the actual value (as it depends on the mpi.h header, and is expected to
be different for each MPI implementation).
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Tue Nov 18 16:59:39 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 18 Nov 2008 16:59:39 +0100 (CET)
Subject: [Cython] broken array declarators when size is an expression
In-Reply-To:
References: <491E84CE.1010801@behnel.de> <4922C57A.7080000@student.matnat.uio.no>
Message-ID: <38534.145.253.136.18.1227023979.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Lisandro Dalcin wrote:
> On Tue, Nov 18, 2008 at 10:39 AM, Dag Sverre Seljebotn
> wrote:
>> Idea: I was thinking of perhaps a transform that ran through the tree
>> and replaced anything that is a compile-time expression with its literal
>> result, i.e.
>
> But this will not work with external definitions. For example, in MPI
> I have to deal with C constants like MPI_MAX_OBJECT_NAME. I declare
> all of them as external enumerations, and there is no way to tell Cython
> the actual value (as it depends on the mpi.h header, and is expected to
> be different for each MPI implementation).

We were only talking about subtrees that can be evaluated at translation
time. C compile-time values are not affected by this transform.
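For illustration only (this is not code from Cython or from this thread): the transform discussed above can be sketched in plain Python, with the standard `ast` module standing in for Cython's parse tree. The names `FoldCompileTimeValues`, `fold_source` and the `known` mapping (playing the role of `DEF` values) are hypothetical, and only `+`, `-` and `*` are folded. Names that are not known at translation time, such as an external enum like `MPI_MAX_OBJECT_NAME`, are deliberately left untouched for the C compiler.

```python
import ast
import operator

# Binary operators this sketch knows how to evaluate at translation time.
_BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldCompileTimeValues(ast.NodeTransformer):
    """Replace translation-time-known subtrees with literal nodes."""

    def __init__(self, known):
        # Hypothetical stand-in for values bound by DEF statements.
        self.known = dict(known)

    def visit_Name(self, node):
        # Only reads of known names are folded; anything else stays a name.
        if isinstance(node.ctx, ast.Load) and node.id in self.known:
            return ast.copy_location(ast.Constant(value=self.known[node.id]), node)
        return node

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op_func = _BINOPS.get(type(node.op))
        if (op_func is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = op_func(node.left.value, node.right.value)
            except TypeError:
                return node  # e.g. str + int: leave it for a later error
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold_source(source, known):
    """Fold one expression and return the resulting source text."""
    tree = ast.parse(source, mode="eval")
    tree = FoldCompileTimeValues(known).visit(tree)
    return ast.unparse(ast.fix_missing_locations(tree))


print(fold_source("a + 1", {"a": 4}))
print(fold_source("MPI_MAX_OBJECT_NAME + 1", {"a": 4}))
```

With `a` known, the first expression collapses to a single literal, while the second is returned unchanged because its value only exists once the C header is compiled. Later passes then never need to ask whether a value is compile-time: either a literal node is present or it is not.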
Stefan From robertwb at math.washington.edu Wed Nov 19 06:05:59 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 18 Nov 2008 21:05:59 -0800 Subject: [Cython] Exception leaking in Cython 0.10 In-Reply-To: References: <4921C5B3.5080503@behnel.de> <4921CA7C.7040108@student.matnat.uio.no> <4921CB14.9090606@behnel.de> Message-ID: <88229B56-8A52-4F67-93FC-FD4FE12C4D03@math.washington.edu> On Nov 18, 2008, at 2:21 AM, Robert Bradshaw wrote: > On Nov 17, 2008, at 11:50 AM, Stefan Behnel wrote: > >> Hi, >> >> Dag Sverre Seljebotn wrote: >>> If you commit the same change to the cython-branch, >> >> Done. >> >>> one can release a >>> bugfix release immediatily with much less extensive release checking >> >> Fine with me. >> >>> I'd like a system where releases of cython-devel always enter a >>> period >>> of freezing and beta stage and lead to incremental "major" version >>> (i.e. >>> 0.11), while cython can be released at will and increments minor >>> version >>> (i.e. 0.10.1). >> >> Sure. > > +1 > > Any other fixes that need to go in before I push a 0.10.1 out? OK, our first "minor" release 0.10.1 is out. There are only a couple of fixes. - Robert From stephane.drouard at st.com Fri Nov 21 17:39:13 2008 From: stephane.drouard at st.com (Stephane DROUARD) Date: Fri, 21 Nov 2008 17:39:13 +0100 Subject: [Cython] cimport does not respect .H include order Message-ID: <009c01c94bf7$afc728c0$b9ad810a@gnb.st.com> Hello, With the following files: foo.h: typedef int foo_t; foo.pxd: cdef extern from "foo.h": ctypedef int foo_t bar.h: typedef foo_t bar_t; bar.pyx: include "foo.pxd" cdef extern from "bar.h": ctypedef foo_t bar_t Everything is fine. 
But if I now replace "include" by "cimport":

bar.pyx:
    from foo cimport *
    cdef extern from "bar.h":
        ctypedef foo_t bar_t

It fails at gcc step:
    bar.h:1: error: syntax error before "bar_t"

This is because in the latter, Cython generates the #include's in the wrong order:
    #include "bar.h"
    #include "foo.h"

Cheers,
Stephane
-------------- next part --------------
A non-text attachment was scrubbed...
Name: files.zip
Type: application/octet-stream
Size: 721 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20081121/70438703/attachment.obj

From greg.ewing at canterbury.ac.nz Sat Nov 22 01:19:39 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 22 Nov 2008 13:19:39 +1300
Subject: [Cython] cimport does not respect .H include order
In-Reply-To: <009c01c94bf7$afc728c0$b9ad810a@gnb.st.com>
References: <009c01c94bf7$afc728c0$b9ad810a@gnb.st.com>
Message-ID: <4927501B.3@canterbury.ac.nz>

Stephane DROUARD wrote:

> This is because in the latter, Cython generates the #include's in the wrong order:
> #include "bar.h"
> #include "foo.h"

Cython/Pyrex has no way of knowing what the internal dependencies
are between .h files. If one .h file depends on another, it should
#include it itself (with appropriate protection against multiple
inclusion).
-- Greg From robertwb at math.washington.edu Sat Nov 22 01:21:33 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 21 Nov 2008 16:21:33 -0800 Subject: [Cython] cimport does not respect .H include order In-Reply-To: <4927501B.3@canterbury.ac.nz> References: <009c01c94bf7$afc728c0$b9ad810a@gnb.st.com> <4927501B.3@canterbury.ac.nz> Message-ID: <036B1650-C75B-4E42-989F-56021DDF413C@math.washington.edu> On Nov 21, 2008, at 4:19 PM, Greg Ewing wrote: > Stephane DROUARD wrote: >> This is because in the latter, Cython generates the #include's in >> the wrong order: >> #include "bar.h" >> #include "foo.h" > > Cython/Pyrex has no way of knowing what the internal dependencies > are between .h files. If one .h file depends on another, it should > #include it itself (with appropriate protection against multiple > inclusion). I actually just ran into this myself in Sage. Though of course all dependancies can't be resolved, I've made it so cimported includes always come first: http://hg.cython.org/cython-devel/rev/8c0c3784245e - Robert From tim.w at hiddenworlds.org Sat Nov 22 07:31:38 2008 From: tim.w at hiddenworlds.org (Tim Wakeham) Date: Sat, 22 Nov 2008 16:31:38 +1000 Subject: [Cython] Errors in generated .c Message-ID: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org> Hi guys, I've just begun playing with Cython over the last few days, but I've come across what appears to be errors in the generated .c file. When I try to build this source file http://pastebin.com/f686e0ac I receive the output below from MingW. The entire source package is available at http://hiddenworlds.org/temp/Jupiter.zip if that helps. -Building renderdata.pyx ---------------------------------------- running build_ext building 'renderdata' extension C:\mingw\bin\gcc.exe -mno-cygwin -mdll -O -Wall -ID:\Programs\Python\include -ID :\Programs\Python\PC -c renderdata.c -o build\temp.win32-2.5\Release\renderdata. 
o renderdata.c:360: error: syntax error before "__pyx_t_10renderdata_FACE" renderdata.c:360: warning: no semicolon at end of struct or union renderdata.c:361: warning: type defaults to `int' in declaration of `next' renderdata.c:361: warning: data definition has no type or storage class renderdata.c:362: error: syntax error before '*' token renderdata.c:362: warning: type defaults to `int' in declaration of `last' renderdata.c:362: warning: data definition has no type or storage class renderdata.c:363: error: syntax error before '}' token renderdata.c:363: warning: type defaults to `int' in declaration of `__pyx_t_10r enderdata_FACE' renderdata.c:363: warning: data definition has no type or storage class renderdata.c:375: error: syntax error before "__pyx_t_10renderdata_FACE" renderdata.c:375: warning: no semicolon at end of struct or union renderdata.c:376: warning: type defaults to `int' in declaration of `__pyx_t_10r enderdata_MESH' renderdata.c:376: warning: data definition has no type or storage class renderdata.c:430: warning: '__pyx_f_10renderdata_test' defined but not used error: command 'gcc' failed with exit status 1 D:\Programming\Jupiter> From robertwb at math.washington.edu Sat Nov 22 09:16:22 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 22 Nov 2008 00:16:22 -0800 Subject: [Cython] Errors in generated .c In-Reply-To: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org> References: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org> Message-ID: <4A41293A-AEE1-44FC-9372-3954C1907C35@math.washington.edu> On Nov 21, 2008, at 10:31 PM, Tim Wakeham wrote: > Hi guys, > > I've just begun playing with Cython over the last few days, Cool. > but I've come > across what appears to be errors in the generated .c file. No promises that it's bug free yet. :) Thanks for the report. > When I try to build this source file http://pastebin.com/f686e0ac I > receive > the output below from MingW. 
The entire source package is > available at > http://hiddenworlds.org/temp/Jupiter.zip if that helps. What would really help is a minimal example that reproduce this error. (E.g. one two very short files, just keep cutting until you can't cut any more.) Then we could go in and try and figure out what's going wrong. - Robert From tim.w at hiddenworlds.org Sun Nov 23 08:08:05 2008 From: tim.w at hiddenworlds.org (Tim Wakeham) Date: Sun, 23 Nov 2008 17:08:05 +1000 Subject: [Cython] Errors in generated .c In-Reply-To: <4A41293A-AEE1-44FC-9372-3954C1907C35@math.washington.edu> References: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org> <4A41293A-AEE1-44FC-9372-3954C1907C35@math.washington.edu> Message-ID: <000001c94d3a$41c066e0$c54134a0$@w@hiddenworlds.org> Ok, I've done a bit of testing and the problem seems to be in recursive ctypedef's. Eg. ctypedef struct foo: int count foo *bar causes the bug in the outputted code. Remove the reference to itself and the problem goes away. cdef struct foo: int count foo *bar compiles fine. I'm guessing that even if a recursive type def isn't allowed, then the cython translator should complain about it before gcc is ever invoked. The easy way around it is to do the C thing and do a plain cdef first, then do the ctypedef like cdef struct foo: int count foo *bar ctypedef foo FOO Hope this helps. -Tim -----Original Message----- From: cython-dev-bounces at codespeak.net [mailto:cython-dev-bounces at codespeak.net] On Behalf Of Robert Bradshaw Sent: Saturday, 22 November 2008 6:16 PM To: cython-dev at codespeak.net Subject: Re: [Cython] Errors in generated .c On Nov 21, 2008, at 10:31 PM, Tim Wakeham wrote: > Hi guys, > > I've just begun playing with Cython over the last few days, Cool. > but I've come > across what appears to be errors in the generated .c file. No promises that it's bug free yet. :) Thanks for the report. > When I try to build this source file http://pastebin.com/f686e0ac I > receive > the output below from MingW. 
The entire source package is
> available at
> http://hiddenworlds.org/temp/Jupiter.zip if that helps.

What would really help is a minimal example that reproduce this error.
(E.g. one two very short files, just keep cutting until you can't cut
any more.) Then we could go in and try and figure out what's going
wrong.

- Robert
_______________________________________________
Cython-dev mailing list
Cython-dev at codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev

From cournape at gmail.com Sun Nov 23 16:23:46 2008
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 24 Nov 2008 00:23:46 +0900
Subject: [Cython] Cython, signals and keyboard interrupts
Message-ID: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com>

Hi,

I am using cython to wrap some relatively low level C api (ALSA). It
has worked great so far, especially for integrating numpy array and C
API. But I have a problem with signals. ALSA is the Linux API for
sound devices, and I have a cython class like the following:

cdef class AlsaDevice:
    def __init__(AlsaDevice self):
        # Set up the device

    def __dealloc__(AlsaDevice self):
        # Free-up the device and related resources

    def play(AlsaDevice self, ndarray data):
        # Play the data through the device

        # Number of buffers to play
        cdef int nbuf
        cdef int i

        for i in range(nbuf):
            # Call the C API

When a KeyboardInterrupt is raised while executing the loop, it is
blocked until the end of the loop, which is not ideal, obviously.
After having tried different things, I came up with the following:

cdef int err
....
for i in range(nbuf):
    err = python_exc.PyErr_CheckSignals()
    if err != 0:
        if python_exc.PyErr_ExceptionMatches(KeyboardInterrupt):
            raise KeyboardInterrupt()
    # Call the C API

Which seems to do what I want. But I don't really understand why it
works, in particular, is it guaranteed that every resource is freed
(assuming everything is correctly handled when no SIGINT is sent to
the python process)? Is there a better way?
David From ondrej at certik.cz Mon Nov 24 00:49:40 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 24 Nov 2008 00:49:40 +0100 Subject: [Cython] handling ctrl-C Message-ID: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> Hi, does anyone have a simple script/example how to handle the _sig_on and _sig_off that is in Sage (e.g. so that I can use ctrl-C to interrupt a code written in C)? I tried to port it to my own project, but I am getting some missing symbols: http://github.com/certik/hermes2d/commits/sig/ everything compiles nicely into an .so file, but when importing in python, I get: ImportError: /home/ondra/repos/hermes2d/python/hermes2d.so: undefined symbol: _signals So I think I need to link it with something. If someone knows, let me know. If not, no problem, I think I'll figure it out eventually. Ondrej P.S. I also tried the approach that is in numpy, see the patch before it: http://github.com/certik/hermes2d/commit/2ca243da2bda18fefe8a6a1c24e922e6361779a6 but that gives me segfaults when run from Python. Also haven't yet figured out why. From wstein at gmail.com Mon Nov 24 02:37:48 2008 From: wstein at gmail.com (William Stein) Date: Sun, 23 Nov 2008 17:37:48 -0800 Subject: [Cython] handling ctrl-C In-Reply-To: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> References: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> Message-ID: <85e81ba30811231737t5dbf8f10w6d53e65da731591c@mail.gmail.com> On Sun, Nov 23, 2008 at 3:49 PM, Ondrej Certik wrote: > Hi, > > does anyone have a simple script/example how to handle the _sig_on and > _sig_off that is in Sage (e.g. so that I can use ctrl-C to interrupt a > code written in C)? 
> > I tried to port it to my own project, but I am getting some missing symbols: > > http://github.com/certik/hermes2d/commits/sig/ > > everything compiles nicely into an .so file, but when importing in > python, I get: > > ImportError: /home/ondra/repos/hermes2d/python/hermes2d.so: undefined > symbol: _signals > > So I think I need to link it with something. If someone knows, let me > know. If not, no problem, I think I'll figure it out eventually. > The extra code is in Sage's c_lib, I think. See SAGE_ROOT/devel/sage/c_lib/src/interrupt.c SAGE_ROOT/devel/sage/c_lib/include/interrupt.h Ask on sage-devel if you don't get enough feedback here, though I think it would possibly be very good to port this signal stuff into Cython itself (why not?). I've cc'd Gonzalo Tornaria () since he co-authored the signals stuff with me. William > Ondrej > > P.S. I also tried the approach that is in numpy, see the patch before it: > > http://github.com/certik/hermes2d/commit/2ca243da2bda18fefe8a6a1c24e922e6361779a6 > > but that gives me segfaults when run from Python. Also haven't yet > figured out why. > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- William Stein Associate Professor of Mathematics University of Washington http://wstein.org From strawman at astraw.com Mon Nov 24 02:48:04 2008 From: strawman at astraw.com (Andrew Straw) Date: Sun, 23 Nov 2008 17:48:04 -0800 Subject: [Cython] Cython, signals and keyboard interrupts In-Reply-To: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com> References: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com> Message-ID: <492A07D4.7010600@astraw.com> According to the documentation for PyErr_CheckSignals() (at http://docs.python.org/c-api/exceptions.html#PyErr_CheckSignals ), the KeyboardInterrupt is already raised. So, you just need to break out of your loop and tell Cython to raise the exception. 
I don't know how to do that, but presumably it would just cause the
cython compiler to emit C code like "goto fail;".

-Andrew

David Cournapeau wrote:
> Hi,
>
> I am using cython to wrap some relatively low level C api (ALSA). It
> has worked great so far, especially for integrating numpy array and C
> API. But I have a problem with signals. ALSA is the Linux API for
> sound devices, and I have a cython class like the following:
>
> cdef class AlsaDevice:
>     def __init__(AlsaDevice self):
>         # Set up the device
>
>     def __dealloc__(AlsaDevice self):
>         # Free-up the device and related resources
>
>     def play(AlsaDevice self, ndarray data):
>         # Play the data through the device
>
>         # Number of buffers to play
>         cdef int nbuf
>         cdef int i
>
>         for i in range(nbuf):
>             # Call the C API
>
> When a KeyboardInterrupt is raised while executing the loop, it is
> blocked until the end of the loop, which is not ideal, obviously.
> After having tried different things, I came up with the following:
>
> cdef int err
> ....
> for i in range(nbuf):
>     err = python_exc.PyErr_CheckSignals()
>     if err != 0:
>         if python_exc.PyErr_ExceptionMatches(KeyboardInterrupt):
>             raise KeyboardInterrupt()
>     # Call the C API
>
> Which seems to do what I want. But I don't really understand why it
> works, in particular, is it guaranteed that every resource is freed
> (assuming everything is correctly handled when no SIGINT is sent to
> the python process)? Is there a better way?
> > David > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From stefan_ml at behnel.de Mon Nov 24 08:14:46 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Nov 2008 08:14:46 +0100 Subject: [Cython] Cython, signals and keyboard interrupts In-Reply-To: <492A07D4.7010600@astraw.com> References: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com> <492A07D4.7010600@astraw.com> Message-ID: <492A5466.6050801@behnel.de> Hi, Andrew Straw wrote: > According to the documentation for PyErr_CheckSignals() (at > http://docs.python.org/c-api/exceptions.html#PyErr_CheckSignals ), the > KeyboardInterrupt is already raised. In that case, if you need this inside a cdef function, you can declaring it as "except *" and simply return from the function if a signal was set. The caller will always check if an exception was raised and propagate the KeyboardInterrupt. If your cdef function returns a C type (like int, for example) that allows passing a dedicated exception value (like -1), you can also declare it "except -1" and then return -1 explicitly on signals. That will allow the caller to know that an error was raised without checking. http://docs.cython.org/docs/language_basics.html#error-return-values Stefan From stefan_ml at behnel.de Mon Nov 24 08:18:09 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Nov 2008 08:18:09 +0100 Subject: [Cython] handling ctrl-C In-Reply-To: <85e81ba30811231737t5dbf8f10w6d53e65da731591c@mail.gmail.com> References: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> <85e81ba30811231737t5dbf8f10w6d53e65da731591c@mail.gmail.com> Message-ID: <492A5531.8060405@behnel.de> Hi, William Stein wrote: > I think it would > possibly be very good to port this signal stuff into Cython itself (why not?). Yes, but... :) It would have to be explicit to avoid unnecessary overhead in time-critical code (e.g. 
loops may or may not be a good candidate for signal checking, depending on how tight they are), and it wouldn't necessarily work in calls to external libraries. Stefan From michael.abshoff at googlemail.com Mon Nov 24 08:29:35 2008 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 23 Nov 2008 23:29:35 -0800 Subject: [Cython] handling ctrl-C In-Reply-To: <492A5531.8060405@behnel.de> References: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> <85e81ba30811231737t5dbf8f10w6d53e65da731591c@mail.gmail.com> <492A5531.8060405@behnel.de> Message-ID: <492A57DF.3080801@gmail.com> Stefan Behnel wrote: > Hi, > > William Stein wrote: >> I think it would >> possibly be very good to port this signal stuff into Cython itself (why not?). > > Yes, but... :) > > It would have to be explicit to avoid unnecessary overhead in time-critical > code (e.g. loops may or may not be a good candidate for signal checking, > depending on how tight they are), and it wouldn't necessarily work in calls > to external libraries. I guess William's point was that the signal semantic should be available in Cython, not that it should be automatically be used by some heuristic. 
> Stefan Cheers, Michael > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From wstein at gmail.com Mon Nov 24 08:37:46 2008 From: wstein at gmail.com (William Stein) Date: Sun, 23 Nov 2008 23:37:46 -0800 Subject: [Cython] handling ctrl-C In-Reply-To: <492A57DF.3080801@gmail.com> References: <85b5c3130811231549j5233c85o2e5e3e72e55ff110@mail.gmail.com> <85e81ba30811231737t5dbf8f10w6d53e65da731591c@mail.gmail.com> <492A5531.8060405@behnel.de> <492A57DF.3080801@gmail.com> Message-ID: <85e81ba30811232337s2b136009i91213c873f0fa341@mail.gmail.com> On Sun, Nov 23, 2008 at 11:29 PM, Michael Abshoff wrote: > Stefan Behnel wrote: >> Hi, >> >> William Stein wrote: >>> I think it would >>> possibly be very good to port this signal stuff into Cython itself (why not?). >> >> Yes, but... :) >> >> It would have to be explicit to avoid unnecessary overhead in time-critical >> code (e.g. loops may or may not be a good candidate for signal checking, >> depending on how tight they are), and it wouldn't necessarily work in calls >> to external libraries. > > I guess William's point was that the signal semantic should be available > in Cython, not that it should be automatically be used by some heuristic. > Yes, exactly. And I definitely agree that it should be very explicit. 
William From cournape at gmail.com Mon Nov 24 09:34:35 2008 From: cournape at gmail.com (David Cournapeau) Date: Mon, 24 Nov 2008 17:34:35 +0900 Subject: [Cython] Cython, signals and keyboard interrupts In-Reply-To: <492A5466.6050801@behnel.de> References: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com> <492A07D4.7010600@astraw.com> <492A5466.6050801@behnel.de> Message-ID: <5b8d13220811240034u4390feddhdc531e4e0a3571d4@mail.gmail.com> On Mon, Nov 24, 2008 at 4:14 PM, Stefan Behnel wrote: > Hi, > > Andrew Straw wrote: >> According to the documentation for PyErr_CheckSignals() (at >> http://docs.python.org/c-api/exceptions.html#PyErr_CheckSignals ), the >> KeyboardInterrupt is already raised. > > In that case, if you need this inside a cdef function, you can declaring it > as "except *" and simply return from the function if a signal was set. The > caller will always check if an exception was raised and propagate the > KeyboardInterrupt. Ah, I was unaware of this feature. It is working great, thanks, David From cournape at gmail.com Mon Nov 24 10:17:39 2008 From: cournape at gmail.com (David Cournapeau) Date: Mon, 24 Nov 2008 18:17:39 +0900 Subject: [Cython] Complex number (at pure C level) Message-ID: <5b8d13220811240117r393528eas714f92673fd28970@mail.gmail.com> Hi, Is there a way to generate simple C code from complex numbers with cython. For example, something like cdef complex foo(complex a, complex b): return a - b which would become in C: complex foo(complex a, complex b) { return a - b; } And ideally, an easy way to handle complex/imaginary as members (a.imag, etc...); assuming the C compiler supports complex.h, of course. David From stefan_ml at behnel.de Mon Nov 24 11:11:03 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Nov 2008 11:11:03 +0100 Subject: [Cython] dynamic memory management Message-ID: <492A7DB7.10107@behnel.de> Hi, when re-reading an older thread about the struct syntax, I had a funny idea I wanted to share. 
Robert Bradshaw wrote: > I'm with Stefan that it is very > dangerous to hide the malloc and expect the user to explicitly > provide the free. If the user wants to manage their own memory, they > can do so with malloc and friends (understanding the associated > dangers), and any special memory-management stuff we add should be as > implicit and easy to use as Python's garbage collection (and probably > piggyback off of it). The "piggybacking" here makes me wonder if it would work to provide a dedicated "Memory" object, that would be a Python object but could be used as in def do_stuff_with_dynamic_memory(Py_ssize_t size): cdef Memory mem cdef void* ptr mem = Memory(10*size) # equiv to malloc() ptr = mem # automagic coercion to a pointer to the memory buffer # do stuff with *ptr, e.g. hand it around in C code mem = None # => Py_DECREF(mem), equiv to free() Note that the pointer coercion would not return a pointer to the object, but to the memory buffer. The way this works would be exactly as with the str() object, which allocates a variable sized memory block in one step (i.e. a PyVarObject), and accesses the buffer as the last field in the object struct. This obviously gives us the overhead of a Python object in addition to the allocated memory, but allocating tons of tiny amounts of memory using malloc() is a bad thing to do anyway, so the average overhead for a medium to large slice of memory should be acceptable given the advantage of easy and safe memory handling - especially since AFAIR PyMem_Alloc() is (often) a lot faster on Windows than malloc(). I first had my doubts if auto-coercion to a pointer is a good idea, as in a plain cdef void* ptr = mem or as an argument in a function call, so that you could pass "mem" directly into a C function. What I would like to avoid, is that people write cdef void* ptr = Memory(1000) and thus drop the Python reference immediately. 
But Cython should actually be able to see that, just as it prevents the
char* coercion of temporary Python strings.

Obviously, Memory must implement the buffer protocol so that you can write

    cdef Memory[float, ndim=2] mem = Memory(100*100*sizeof(float))

We could also allow fancy stuff like

    mem = Memory(itemsize=sizeof(int), count=1000)

or maybe

    cdef Memory[float, ndim=2] mem = \
        Memory(itemsize=sizeof(float), dimensions=(100,100))

Supporting realloc is also trivial as in

    mem.realloc(2 * len(mem))

and respectively

    mem.realloc(itemsize=sizeof(float), dimensions=(100,200))

The nice thing about this is that for the case that realloc() fails, the
method could raise a PyErr_NoMemory() immediately and return NULL to use
the normal exception propagation mechanism. Even one thing less to care
about for users.

BTW, it would be best to actually implement this in Cython, not C - except
that the custom object allocation itself isn't currently supported in Cython.

Stefan

From dagss at student.matnat.uio.no Mon Nov 24 11:22:40 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 24 Nov 2008 11:22:40 +0100
Subject: [Cython] Complex number (at pure C level)
In-Reply-To: <5b8d13220811240117r393528eas714f92673fd28970@mail.gmail.com>
References: <5b8d13220811240117r393528eas714f92673fd28970@mail.gmail.com>
Message-ID: <492A8070.6070608@student.matnat.uio.no>

David Cournapeau wrote:
> Hi,
>
> Is there a way to generate simple C code from complex numbers with
> cython. For example, something like
>
>     cdef complex foo(complex a, complex b):
>         return a - b
>
> which would become in C:
>
>     complex foo(complex a, complex b)
>     {
>         return a - b;
>     }
>
> And ideally, an easy way to handle complex/imaginary as members
> (a.imag, etc...); assuming the C compiler supports complex.h, of
> course.
>

There seems to be a positive attitude towards including complex support in
Cython natively, it's just about lack of developer time...
You can fake it, but it is a bit dangerous:

    cdef extern from *:
        ctypedef float complex

Problems to watch out for:

1) This means that Cython will think that complex numbers are
   automatically castable to float, and it will be the C compiler that
   reports any error. (Especially for float* vs. complex*.)

2) You need to declare separate functions for getting the real and imag
   parts.

3) There is no auto-coercion to/from Python complex; you must do it
   manually (and use the functions from 2) in the process).

I think this is as good as it gets; at least I can't think of anything
else that works with the Cython that's there today.

Dag Sverre

From stefan_ml at behnel.de Mon Nov 24 11:47:36 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 24 Nov 2008 11:47:36 +0100
Subject: [Cython] dynamic memory management
In-Reply-To: <492A7DB7.10107@behnel.de>
References: <492A7DB7.10107@behnel.de>
Message-ID: <492A8648.8030900@behnel.de>

replying to myself here...

Stefan Behnel wrote:
> Supporting realloc is also trivial as in
>
>     mem.realloc(2 * len(mem))

It's actually not that trivial for a PyVarObject. This would work better:

    mem = mem.realloc(2*len(mem))

but if you write

    mem2 = mem1.realloc(2*len(mem1))

then mem1 will be broken after that. That's not trivial to fix ...

Stefan

From collette at physics.ucla.edu Tue Nov 25 00:16:26 2008
From: collette at physics.ucla.edu (Andrew Collette)
Date: Mon, 24 Nov 2008 15:16:26 -0800
Subject: [Cython] Proper (ab)use of __dealloc__
Message-ID: <1227568586.17125.12.camel@tachyon-laptop>

Hi,

I'm running into problems with the __dealloc__ method of extension
types. Specifically, I need to acquire a Python lock inside a
__dealloc__ method in order to free resources which come from an
external C library. The code is something like:

    def __dealloc__(self):
        mylock.acquire()
        try:
            free_resource(self.id)
        finally:
            mylock.release()

In this code, free_resource is an external C function, and self.id is a
C attribute of the extension type (not a Python attribute).
This works fine in single-threaded code, but leads to random
segmentation faults when using threads, apparently when the garbage
collector runs. Commenting out mylock.acquire() and mylock.release()
fixes the segmentation fault.

Is the GIL guaranteed to be held when __dealloc__ is called? In other
words, is it safe to call into Python? If not, is there an approved
Cython way to explicitly acquire the GIL? There is unfortunately no way
to get rid of the lock, as the C library I'm interfacing with is not
thread-safe and if free_resource is called the wrong number of times
very bad things will happen.

Alternatively, is there another (safer) method I can use for extension
types, analogous to Python's __del__ method?

Thanks,
Andrew Collette

From greg.ewing at canterbury.ac.nz Tue Nov 25 00:32:36 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Nov 2008 12:32:36 +1300
Subject: [Cython] Cython, signals and keyboard interrupts
In-Reply-To: <492A5466.6050801@behnel.de>
References: <5b8d13220811230723s3f24a356l312f5a8d60025d1d@mail.gmail.com> <492A07D4.7010600@astraw.com> <492A5466.6050801@behnel.de>
Message-ID: <492B3994.9090606@canterbury.ac.nz>

Stefan Behnel wrote:
> you can also declare it
> "except -1" and then return -1 explicitly on signals. That will allow the
> caller to know that an error was raised without checking.

Or "except? -1" (with a question mark), which checks for an exception
only if -1 is returned (more efficient than always checking, as long as
-1 is a rare return value).
-- Greg From greg.ewing at canterbury.ac.nz Tue Nov 25 00:39:07 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Nov 2008 12:39:07 +1300 Subject: [Cython] dynamic memory management In-Reply-To: <492A7DB7.10107@behnel.de> References: <492A7DB7.10107@behnel.de> Message-ID: <492B3B1B.4050005@canterbury.ac.nz> Stefan Behnel wrote: > def do_stuff_with_dynamic_memory(Py_ssize_t size): > cdef Memory mem > cdef void* ptr > mem = Memory(10*size) # equiv to malloc() > ptr = mem # automagic coercion to a pointer to the memory buffer > # do stuff with *ptr, e.g. hand it around in C code > mem = None # => Py_DECREF(mem), equiv to free() This sounds a lot like Bytes in py3. Maybe just provide coercion between Bytes and void *? -- Greg From michael.abshoff at googlemail.com Tue Nov 25 00:56:55 2008 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Mon, 24 Nov 2008 15:56:55 -0800 Subject: [Cython] dynamic memory management In-Reply-To: <492A7DB7.10107@behnel.de> References: <492A7DB7.10107@behnel.de> Message-ID: <492B3F47.80502@gmail.com> Stefan Behnel wrote: > Hi, Hi, > when re-reading an older thread about the struct syntax, I had a funny idea > I wanted to share. > > Robert Bradshaw wrote: >> I'm with Stefan that it is very >> dangerous to hide the malloc and expect the user to explicitly >> provide the free. If the user wants to manage their own memory, they >> can do so with malloc and friends (understanding the associated >> dangers), and any special memory-management stuff we add should be as >> implicit and easy to use as Python's garbage collection (and probably >> piggyback off of it). 
>
> The "piggybacking" here makes me wonder if it would work to provide a
> dedicated "Memory" object, that would be a Python object but could be used
> as in
>
>     def do_stuff_with_dynamic_memory(Py_ssize_t size):
>         cdef Memory mem
>         cdef void* ptr
>         mem = Memory(10*size)  # equiv to malloc()
>         ptr = mem  # automagic coercion to a pointer to the memory buffer
>         # do stuff with *ptr, e.g. hand it around in C code
>         mem = None # => Py_DECREF(mem), equiv to free()
>

In general I consider this a great idea, since people will be less likely
to shoot themselves in the foot, and as you mentioned for small allocs
this should really be a win. Having memory automatically freed on
exceptions would also be great, since that is tricky to write. The less
the average user has to deal with memory allocs and deallocs in Cython
code, the better Cython will be.

But I am concerned to some extent about the success of hunting memory
leaks if this is the default allocator in some future version of Cython
and there is no way to use malloc() [currently Sage provides sagemalloc(),
which for example could be made to use do_stuff_with_dynamic_memory(),
since we only need to change one define]. Looking for heap objects in
Python that are leaked or not properly dereferenced is a giant pain right
now, and valgrind is next to useless here. muppy is starting to solve that
problem, but at least in Sage there are issues with ipython allocating
tens of thousands of heap objects at load time, which badly impacts
muppy's performance; hopefully those issues will be fixed. Anyway, no
need to rant about python memory heap debugging here :)

Since in Sage I have to deal with loads of small leaks that can add up
to large cumulative ones, I would consider it very nice if the allocator
could have some magic debug fallback to malloc(). That should certainly
not be the default behavior, just like --without-pymalloc is not the
default for Python.
Just my 0.2 cents > Stefan Cheers, Michael > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From greg.ewing at canterbury.ac.nz Tue Nov 25 01:09:24 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Nov 2008 13:09:24 +1300 Subject: [Cython] Proper (ab)use of __dealloc__ In-Reply-To: <1227568586.17125.12.camel@tachyon-laptop> References: <1227568586.17125.12.camel@tachyon-laptop> Message-ID: <492B4234.3020608@canterbury.ac.nz> Andrew Collette wrote: > Is the GIL guaranteed to be held when __dealloc__ is called? In other > words, is it safe to call into Python? Yes, the GIL is held, and you can call into Python, but don't expose the object being deallocated to any other Python code, because it may contain object references that are no longer valid. > Alternatively, is there another (safer) method I can use for extension > types, analagous to Python's __del__ method? No, there's no __del__ for extension types, unfortunately. -- Greg From aaron.devore at gmail.com Tue Nov 25 01:54:37 2008 From: aaron.devore at gmail.com (Aaron DeVore) Date: Mon, 24 Nov 2008 16:54:37 -0800 Subject: [Cython] dynamic memory management In-Reply-To: <492A7DB7.10107@behnel.de> References: <492A7DB7.10107@behnel.de> Message-ID: <2ead2fb0811241654q156ed04x4f91e69eb8b63ffd@mail.gmail.com> First of all, this is from the view of someone who has never created/implemented a programming language, so some of my ideas might be a bit n00bish. 1. Why not give Memory a keyword option type that could be used both when allocating *and* when resizing? I'm thinking something like Memory(type=float, length=i). realloc() or equivalent would instead just take a size that is automatically multiplied by sizeof(...). 2. For resizing arrays, the D programming language uses array.length = i. 
For Cython it could be:

    mem.length = i       # equivalent to realloc(ptr, sizeof(float) * i)
    mem.length *= i      # equivalent to
                         # realloc(ptr, sizeof(float) * i * len(mem))
    length = mem.length  # Get the number of bytes divided by sizeof
    size = mem.size      # Get the number of bytes
    mem.size = sizeof(a_type) * i  # Bypass the automatic sizeof
                                   # calculations when reallocating

or, for multidimensional arrays

    mem.length = i, j    # First dimension = i, second dimension = j
    mem.length *= i, j   # Multiply length by i for the first dimension,
                         # j for the second dimension
    mem.length *= i      # Multiply length of both dimensions by i
    length = mem.length  # Get a tuple of the dimensions
    size = mem.size      # The number of bytes *total*

I don't really know enough about the Cython compiler to know how those
would be implemented. Maybe properties?

- Aaron DeVore

On Mon, Nov 24, 2008 at 2:11 AM, Stefan Behnel wrote:
> Hi,
>
> when re-reading an older thread about the struct syntax, I had a funny idea
> I wanted to share.
>
> Robert Bradshaw wrote:
>> I'm with Stefan that it is very
>> dangerous to hide the malloc and expect the user to explicitly
>> provide the free. If the user wants to manage their own memory, they
>> can do so with malloc and friends (understanding the associated
>> dangers), and any special memory-management stuff we add should be as
>> implicit and easy to use as Python's garbage collection (and probably
>> piggyback off of it).
>
> The "piggybacking" here makes me wonder if it would work to provide a
> dedicated "Memory" object, that would be a Python object but could be used
> as in
>
>     def do_stuff_with_dynamic_memory(Py_ssize_t size):
>         cdef Memory mem
>         cdef void* ptr
>         mem = Memory(10*size)  # equiv to malloc()
>         ptr = mem  # automagic coercion to a pointer to the memory buffer
>         # do stuff with *ptr, e.g. hand it around in C code
>         mem = None # => Py_DECREF(mem), equiv to free()
>
> Note that the pointer coercion would not return a pointer to the object,
> but to the memory buffer.
The way this works would be exactly as with the > str() object, which allocates a variable sized memory block in one step > (i.e. a PyVarObject), and accesses the buffer as the last field in the > object struct. > > This obviously gives us the overhead of a Python object in addition to the > allocated memory, but allocating tons of tiny amounts of memory using > malloc() is a bad thing to do anyway, so the average overhead for a medium > to large slice of memory should be acceptable given the advantage of easy > and safe memory handling - especially since AFAIR PyMem_Alloc() is (often) > a lot faster on Windows than malloc(). > > I first had my doubts if auto-coercion to a pointer is a good idea, as in a > plain > > cdef void* ptr = mem > > or as an argument in a function call, so that you could pass "mem" directly > into a C function. What I would like to avoid, is that people write > > cdef void* ptr = Memory(1000) > > and thus drop the Python reference immediately. But Cython should actually > be able to see that, just as it prevents the char* coercion of temporary > Python strings. > > Obviously, Memory must implement the buffer protocol so that you can write > > cdef Memory[float, ndim=2] mem = Memory(100*100*sizeof(float)) > > We could also allow fancy stuff like > > mem = Memory(itemsize=sizeof(int), count=1000) > > or maybe > > cdef Memory[float, ndim=2] mem = \ > Memory(itemsize=sizeof(float), dimensions=(100,100)) > > Supporting realloc is also trivial as in > > mem.realloc(2 * len(mem)) > > and respectively > > mem.realloc(itemsize=sizeof(float), dimensions=(100,200)) > > The nice thing about this is that for the case that realloc() fails, the > method could raise a PyErr_NoMemory() immediately and return NULL to use > the normal exception propagation mechanism. Even one thing less to care > about for users. 
>
> BTW, it would be best to actually implement this in Cython, not C - except
> that the custom object allocation itself isn't currently supported in Cython.
>
> Stefan

From hoytak at cs.ubc.ca Tue Nov 25 09:24:56 2008
From: hoytak at cs.ubc.ca (Hoyt Koepke)
Date: Tue, 25 Nov 2008 00:24:56 -0800
Subject: [Cython] dictionary iteration generates code with illegal cast
Message-ID: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com>

Hello,

It seems that I've hit a bug in the new dictionary iteration stuff
(which is quite awesome, BTW). I'm using revision 1374:85a5fdd79388
(new at time of writing).

In iteration over a dictionary, I'm declaring both the variables of
iteration to have explicit types; i.e. my code is:

    cdef int k
    cdef double v

    for k, v in d.iteritems():
        # do things

However, in the c code, I find the following:

    int __pyx_v_k;
    double __pyx_v_v;
    ...
    void *__pyx_t_2;
    void *__pyx_t_3;
    ...
    if (!PyDict_Next(__pyx_t_1, (&__pyx_5), ((PyObject**)(&__pyx_t_2)),
                     ((PyObject **)(&__pyx_t_3)))) break;
    __pyx_v_k = ((int)__pyx_t_2);
    __pyx_v_v = ((double)__pyx_t_3);

In other words, it's trying to do a cast directly from a void* type,
which isn't allowed by gcc (The void* to int cast seems problematic as
well). Thus the code fails on compilation.

If I replace the above loop with

    for k_o, v_o in d.iteritems():
        k = k_o
        v = v_o
        # do things

It works fine.

I'll post a ticket with this as well...

Thanks!!!!
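
The workaround amounts to receiving each item as a generic object first and converting it to the target type in a second step, instead of reinterpreting the object pointer. A plain-Python sketch of that two-step pattern follows; `typed_items` is a hypothetical helper introduced only for illustration, and `items()` is the Python 3 spelling of the `iteritems()` call used above.

```python
def typed_items(d):
    """Yield (int, float) pairs, converting each key and value explicitly."""
    for k_o, v_o in d.items():  # k_o and v_o are ordinary Python objects
        k = int(k_o)            # value conversion, not a pointer cast
        v = float(v_o)
        yield k, v


pairs = list(typed_items({1: 2, 3: 4.5}))
```

Here `pairs` holds `(1, 2.0)` and `(3, 4.5)`: the conversion produces new values of the requested types rather than reusing the stored object references.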
--Hoyt -- ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From robertwb at math.washington.edu Tue Nov 25 12:39:28 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 25 Nov 2008 03:39:28 -0800 Subject: [Cython] Errors in generated .c In-Reply-To: <000001c94d3a$41c066e0$c54134a0$@w@hiddenworlds.org> References: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org> <4A41293A-AEE1-44FC-9372-3954C1907C35@math.washington.edu> <000001c94d3a$41c066e0$c54134a0$@w@hiddenworlds.org> Message-ID: On Nov 22, 2008, at 11:08 PM, Tim Wakeham wrote: > Ok, I've done a bit of testing and the problem seems to be in > recursive > ctypedef's. > > Eg. > > ctypedef struct foo: > int count > foo *bar > > causes the bug in the outputted code. Remove the reference to > itself and > the problem goes away. > > cdef struct foo: > int count > foo *bar > > compiles fine. I'm guessing that even if a recursive type def isn't > allowed, then the cython > translator should complain about it before gcc is ever invoked. I consider it a bug anytime Cython emits invalid C. One should never have to see gcc errors (assuming, of course, that all the external declarations are made correctly). We should either throw an error or separate out the typedef. http://trac.cython.org/cython_trac/ticket/136 > The easy > way around it is to > do the C thing and do a plain cdef first, then do the ctypedef like > > cdef struct foo: > int count > foo *bar > > ctypedef foo FOO > > Hope this helps. Very much so. Thanks. - Robert From ggellner at uoguelph.ca Tue Nov 25 16:05:22 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Tue, 25 Nov 2008 10:05:22 -0500 Subject: [Cython] Comments on example code? Message-ID: <20081125150522.GA17693@workerbee> So I am giving a talk to my lab about doing some fast ODE solving using python. 
Traditionally I have used f2py to define the callback function, but I think Cython is a better fit for some of the newer students that don't know fortran (though in this case it is easy to teach them). Now using the cython/numpy tutorial I can get my code to run around 50% slower than the f2py generated code (which is still a healthy order of magnitude faster than the python callback . . .). If there are any easy things I can do to make the code faster (without sacrificing readability) I would be very grateful! The callback code is (if you want the full program just ask, I am asking more for glaring errors, as opposed to subtle optimizations): cdef class Model: cdef public double a1, a2, b1, b2, d1, d2 def __call__(self, np.ndarray[np.float_t, ndim=1] y, int t): cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3) yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) yprime[1] = self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) - self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1] yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d2*y[2] return yprime thanks, Gabriel From dagss at student.matnat.uio.no Tue Nov 25 16:45:23 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 25 Nov 2008 16:45:23 +0100 (CET) Subject: [Cython] Comments on example code? In-Reply-To: <20081125150522.GA17693@workerbee> References: <20081125150522.GA17693@workerbee> Message-ID: <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> Gabriel Gellner wrote: > So I am giving a talk to my lab about doing some fast ODE solving using > python. Traditionally I have used f2py to define the callback function, > but I > think Cython is a better fit for some of the newer students that don't > know > fortran (though in this case it is easy to teach them). > > Now using the cython/numpy tutorial I can get my code to run around 50% > slower > than the f2py generated code (which is still a healthy order of magnitude > faster than the python callback . . .). 
If there are any easy things I > can do > to make the code faster (without sacrificing readability) I would be very > grateful! > > The callback code is (if you want the full program just ask, I am asking > more > for glaring errors, as opposed to subtle optimizations): > > cdef class Model: > > cdef public double a1, a2, b1, b2, d1, d2 > > def __call__(self, np.ndarray[np.float_t, ndim=1] y, int t): > cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3) > > yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + > self.b1*y[0]) > yprime[1] = self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) - > self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1] > yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - > self.d2*y[2] > > return yprime > The amount of work that is done in this function is almost nothing -- i.e. "n" is hard-coded to 3. So I think you'll find that the thing killing performance here is calling the function and passing the arguments. For starters, use typed polymorphism: Make the function "cpdef" and give it another name, have a parent class "AbstractModel" with the same function in it, and in the calling code type the callee as AbstractModel. After that it would help to pass around raw float* rather than NumPy objects in this case when n is so small (unfortunately, there's no way to pass around an acquired buffer between functions. I have ideas of course, but they are not implemented.) Without knowing the nature of the caller (where the real bottlenetck likely is) it is difficult to give better advice. Dag Sverre From dagss at student.matnat.uio.no Tue Nov 25 16:49:35 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 25 Nov 2008 16:49:35 +0100 (CET) Subject: [Cython] Comments on example code? 
In-Reply-To: <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> Message-ID: <17f2af620b59bbcb3d24de8db7bd24a7.squirrel@webmail.uio.no> Dag Sverre Seljebotn wrote: > Gabriel Gellner wrote: >> So I am giving a talk to my lab about doing some fast ODE solving using >> python. Traditionally I have used f2py to define the callback function, >> but I >> think Cython is a better fit for some of the newer students that don't >> know >> fortran (though in this case it is easy to teach them). >> >> Now using the cython/numpy tutorial I can get my code to run around 50% >> slower >> than the f2py generated code (which is still a healthy order of >> magnitude >> faster than the python callback . . .). If there are any easy things I >> can do >> to make the code faster (without sacrificing readability) I would be >> very >> grateful! >> >> The callback code is (if you want the full program just ask, I am asking >> more >> for glaring errors, as opposed to subtle optimizations): >> >> cdef class Model: >> >> cdef public double a1, a2, b1, b2, d1, d2 >> >> def __call__(self, np.ndarray[np.float_t, ndim=1] y, int t): >> cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3) >> >> yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + >> self.b1*y[0]) >> yprime[1] = self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) - >> self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1] >> yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - >> self.d2*y[2] >> >> return yprime >> > > The amount of work that is done in this function is almost nothing -- i.e. > "n" is hard-coded to 3. So I think you'll find that the thing killing > performance here is calling the function and passing the arguments. 
> > For starters, use typed polymorphism: Make the function "cpdef" and give > it another name, have a parent class "AbstractModel" with the same > function in it, and in the calling code type the callee as AbstractModel. > > After that it would help to pass around raw float* rather than NumPy > objects in this case when n is so small (unfortunately, there's no way to > pass around an acquired buffer between functions. I have ideas of course, > but they are not implemented.) > > Without knowing the nature of the caller (where the real bottlenetck > likely is) it is difficult to give better advice. However, I'll also note that the usual way to get high performance with NumPy is to reformulate the computation so that you do it all in parallell. If you are evaluating this using a grid with n=1000 or similar, then as long as you code things so that all operations happen "in parallell" then you could code it in pure Python if you wished and it would still be at least comparable to Fortran. I.e. to avoid all the call overheads etc., it pays off to rather do: gridvalues = eval_func(gridpts) than for i in range(..): gridvalues[i] = eval_func(gridpts[i]) and continue in the same fashion all the way down. Wherever your loop is, there's your problem... Dag Sverre From ggellner at uoguelph.ca Tue Nov 25 17:21:26 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Tue, 25 Nov 2008 11:21:26 -0500 Subject: [Cython] Comments on example code? 
In-Reply-To: <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no>
References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no>
Message-ID: <20081125162126.GA18850@workerbee>

> > cdef class Model:
> >
> >     cdef public double a1, a2, b1, b2, d1, d2
> >
> >     def __call__(self, np.ndarray[np.float_t, ndim=1] y, int t):
> >         cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3)
> >
> >         yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 +
> >             self.b1*y[0])
> >         yprime[1] = self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) -
> >             self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1]
> >         yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) -
> >             self.d2*y[2]
> >
> >         return yprime
> >
>
> The amount of work that is done in this function is almost nothing -- i.e.
> "n" is hard-coded to 3. So I think you'll find that the thing killing
> performance here is calling the function and passing the arguments.
>
This is generally true, which is why I use f2py, and now Cython. But
somehow f2py is being more efficient here. Maybe it is the extra np.empty
on each call, which is done in Fortran in the other code. Is there any
way to set the size of the ndarray without the Python call?

> For starters, use typed polymorphism: Make the function "cpdef" and give
> it another name, have a parent class "AbstractModel" with the same
> function in it, and in the calling code type the callee as AbstractModel.
>
The calling code is Python; does this work in this case? I am not sure
what you mean by typing then.

> After that it would help to pass around raw float* rather than NumPy
> objects in this case when n is so small (unfortunately, there's no way to
> pass around an acquired buffer between functions. I have ideas of course,
> but they are not implemented.)
>
Maybe this is what f2py does: it can pass the buffer around. float* isn't
really an option, as the calling program is Python.
> Without knowing the nature of the caller (where the real bottlenetck > likely is) it is difficult to give better advice. > But the only difference in the caller between the f2py version and this one is this file. So the speed difference must be from the function calls etc as you have said, but I don't understand how the overhead can be from the caller as the code for this is identical (and not compiled). the code is basically: for model.b1 in np.linspace(0, 2.6, 100): odeint(model, y0, t0) where model is defined in a separate file, in one case as a fortran object generated by f2py and in this case by cython. Anyway thanks for the comments, I know how hard it is to comment on partial pieces of code. I will do some more investigating, but I am starting to think Fortran is the better option as f2py seems to make the calling bridge more efficient [currently . . .] (which is generically the bottleneck in our problems). Gabriel From ggellner at uoguelph.ca Tue Nov 25 17:26:08 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Tue, 25 Nov 2008 11:26:08 -0500 Subject: [Cython] Comments on example code? In-Reply-To: <17f2af620b59bbcb3d24de8db7bd24a7.squirrel@webmail.uio.no> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> <17f2af620b59bbcb3d24de8db7bd24a7.squirrel@webmail.uio.no> Message-ID: <20081125162608.GA19183@workerbee> > However, I'll also note that the usual way to get high performance with > NumPy is to reformulate the computation so that you do it all in > parallell. If you are evaluating this using a grid with n=1000 or similar, > then as long as you code things so that all operations happen "in > parallell" then you could code it in pure Python if you wished and it > would still be at least comparable to Fortran. > > I.e. 
to avoid all the call overheads etc., it pays off to rather do: > > gridvalues = eval_func(gridpts) > > than > > for i in range(..): > gridvalues[i] = eval_func(gridpts[i]) > > and continue in the same fashion all the way down. Wherever your loop is, > there's your problem... > True, but I like loops :-) I find vectorized code quickly becomes unreadable (and awkward to write). I would rather resort to C or Fortran if this is too much of an issue. I find the best tradeoff is to use tools like f2py and Cython to decrease the call overhead, I can live with the other slowdowns. Gabriel From dagss at student.matnat.uio.no Tue Nov 25 19:21:27 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 25 Nov 2008 19:21:27 +0100 Subject: [Cython] Comments on example code? In-Reply-To: <20081125162126.GA18850@workerbee> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> <20081125162126.GA18850@workerbee> Message-ID: <492C4227.50208@student.matnat.uio.no> Gabriel Gellner wrote: > But the only difference in the caller between the f2py version and this one > is this file. So the speed difference must be from the function calls etc as > you have said, but I don't understand how the overhead can be from the caller > as the code for this is identical (and not compiled). > > the code is basically: > > for model.b1 in np.linspace(0, 2.6, 100): > odeint(model, y0, t0) > > where model is defined in a separate file, in one case as a fortran > object generated by f2py and in this case by cython. Ah, right. It's interesting (but not that surprising) to see that f2py does perform better in this area, due to both the call to np.empty and that acquiring the buffer from the NumPy array likely is slower than the NumPy-specific stuff that f2py is doing. But that Python snippet is likely going to use the majority of the time no matter how things are done, so it doesn't matter much. 
I'm interested to hear how much do you gain over pure Python in this case then -- probably not too much, perhaps 2-300%? (Moving the entire for-loop to Cython would typically give you around 800% improvement). Anyway, this is not likely to be an area where Cython is improved, so there's nothing to do about it. I don't think it is a common usecase either -- typically either one keep everything in Python, or one decide that Python only is too slow, but then one doesn't want to only leverage a small part of the speed increase that compiled code can give you... Even if you move the for-loop into Cython or Fortran you don't have to use vectorized calculations. Something like: cython_apply_odeint(model, 0, 2.6, 100) and then just code a very generic for-loop in Cython which calls odeint and model, just to move the loop on the other side of the Python/Cython-barrier. Enough ranting though. Thanks for bringing this to my attention! -- Dag Sverre From stefan_ml at behnel.de Tue Nov 25 20:25:44 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 25 Nov 2008 20:25:44 +0100 Subject: [Cython] dictionary iteration generates code with illegal cast In-Reply-To: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> References: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> Message-ID: <492C5138.3060603@behnel.de> Hi, Hoyt Koepke wrote: > In iteration over a dictionary, I'm declaring both the variables of > iteration to have explicit types; i.e. my code is: > > cdef int k > cdef double v > > for k, v in d.iteritems(): > # do things > > > However, in the c code, I find the following: > > int __pyx_v_k; > double __pyx_v_v; > ... > void *__pyx_t_2; > void *__pyx_t_3; > ... > if (!PyDict_Next(__pyx_t_1, (&__pyx_5), ((PyObject**)(&__pyx_t_2)), > ((PyObject **)(&__pyx_t_3)))) break; > __pyx_v_k = ((int)__pyx_t_2); > __pyx_v_v = ((double)__pyx_t_3); Thanks for the report. This was the right thing to do for Python objects, but not for C types. 
Should be fixed now. Stefan From robertwb at math.washington.edu Tue Nov 25 21:06:06 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 25 Nov 2008 12:06:06 -0800 Subject: [Cython] Comments on example code? In-Reply-To: <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> Message-ID: <58925B04-DEF5-4AF2-B6B0-98622B40DACC@math.washington.edu> On Nov 25, 2008, at 7:45 AM, Dag Sverre Seljebotn wrote: > Gabriel Gellner wrote: >> So I am giving a talk to my lab about doing some fast ODE solving >> using >> python. Traditionally I have used f2py to define the callback >> function, >> but I >> think Cython is a better fit for some of the newer students that >> don't >> know >> fortran (though in this case it is easy to teach them). >> >> Now using the cython/numpy tutorial I can get my code to run >> around 50% >> slower >> than the f2py generated code (which is still a healthy order of >> magnitude >> faster than the python callback . . .). If there are any easy >> things I >> can do >> to make the code faster (without sacrificing readability) I would >> be very >> grateful! >> >> The callback code is (if you want the full program just ask, I am >> asking >> more >> for glaring errors, as opposed to subtle optimizations): >> >> cdef class Model: >> >> cdef public double a1, a2, b1, b2, d1, d2 >> >> def __call__(self, np.ndarray[np.float_t, ndim=1] y, int t): >> cdef np.ndarray[np.float_t, ndim=1] yprime = np.empty(3) >> >> yprime[0] = y[0]*(1.0 - y[0]) - self.a1*y[0]*y[1]/(1.0 + >> self.b1*y[0]) >> yprime[1] = self.a1*y[0]*y[1]/(1.0 + self.b1*y[0]) - >> self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - self.d1*y[1] >> yprime[2] = self.a2*y[1]*y[2]/(1.0 + self.b2*y[1]) - >> self.d2*y[2] >> >> return yprime >> > > The amount of work that is done in this function is almost nothing > -- i.e. > "n" is hard-coded to 3. 
So I think you'll find that the thing killing > performance here is calling the function and passing the arguments. > > For starters, use typed polymorphism: Make the function "cpdef" and > give > it another name, have a parent class "AbstractModel" with the same > function in it, and in the calling code type the callee as > AbstractModel. > > After that it would help to pass around raw float* rather than NumPy > objects in this case when n is so small (unfortunately, there's no > way to > pass around an acquired buffer between functions. I have ideas of > course, > but they are not implemented.) One way to do it would be to flatten/compact the ndarray early on (so one know the entries are contiguous) and then pass the raw float* and its length (keeping the original array around so you don't have to worry about memory issues. Note that __call__, though not as slow as a normal python function call, still has Python semantics (i.e. all of its arguments and its return value have to pass through Python objects) so just using a cdef method could speed things up considerably. (I wonder how hard it would be to support cpdef __call__?) The call to empty is probably dominating things too--since the 3 seems hard coded in anyways, I would accept (and return) a cdef struct data: float x float y float z instead of a 3-element ndarray. - Robert From hoytak at cs.ubc.ca Tue Nov 25 21:22:36 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Tue, 25 Nov 2008 12:22:36 -0800 Subject: [Cython] dictionary iteration generates code with illegal cast In-Reply-To: <4db580fd0811251219kf211731l50a32a1f0df9f4aa@mail.gmail.com> References: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> <492C5138.3060603@behnel.de> <4db580fd0811251219kf211731l50a32a1f0df9f4aa@mail.gmail.com> Message-ID: <4db580fd0811251222j4d342549q62c0e1f2f1de4e40@mail.gmail.com> Hello Stefan, Thanks! I really appreciate it. 
FYI: I'm not sure if it's related, but your changeset 1375 introduced a bug that gives me the traceback: File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 323, in generate_function_definitions stat.generate_function_definitions(env, code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 2540, in generate_function_definitions self.entry.type.scope, code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 323, in generate_function_definitions stat.generate_function_definitions(env, code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 1026, in generate_function_definitions self.body.generate_execution_code(code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 329, in generate_execution_code stat.generate_execution_code(code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 2658, in generate_execution_code self.generate_rhs_evaluation_code(code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", line 2771, in generate_rhs_evaluation_code self.rhs.generate_evaluation_code(code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/ExprNodes.py", line 442, in generate_evaluation_code self.generate_result_code(code) File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/ExprNodes.py", line 4524, in generate_result_code if not dst_type.is_builtin_type: NameError: global name 'dst_type' is not defined It looks easy to fix, but I'm not sure how, and you could probably fix it in 5 seconds. Thanks again! --Hoyt On Tue, Nov 25, 2008 at 11:25 AM, Stefan Behnel wrote: > Hi, > > Hoyt Koepke wrote: >> In iteration over a dictionary, I'm declaring both the variables of >> iteration to have explicit types; i.e. 
my code is: >> >> cdef int k >> cdef double v >> >> for k, v in d.iteritems(): >> # do things >> >> >> However, in the c code, I find the following: >> >> int __pyx_v_k; >> double __pyx_v_v; >> ... >> void *__pyx_t_2; >> void *__pyx_t_3; >> ... >> if (!PyDict_Next(__pyx_t_1, (&__pyx_5), ((PyObject**)(&__pyx_t_2)), >> ((PyObject **)(&__pyx_t_3)))) break; >> __pyx_v_k = ((int)__pyx_t_2); >> __pyx_v_v = ((double)__pyx_t_3); > > Thanks for the report. This was the right thing to do for Python objects, > but not for C types. Should be fixed now. > > Stefan > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From stefan_ml at behnel.de Tue Nov 25 21:36:19 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 25 Nov 2008 21:36:19 +0100 Subject: [Cython] dictionary iteration generates code with illegal cast In-Reply-To: <4db580fd0811251222j4d342549q62c0e1f2f1de4e40@mail.gmail.com> References: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> <492C5138.3060603@behnel.de> <4db580fd0811251219kf211731l50a32a1f0df9f4aa@mail.gmail.com> <4db580fd0811251222j4d342549q62c0e1f2f1de4e40@mail.gmail.com> Message-ID: <492C61C3.5090900@behnel.de> Hi, Hoyt Koepke wrote: > NameError: global name 'dst_type' is not defined > > It looks easy to fix, but I'm not sure how, and you could probably fix > it in 5 seconds. Yep, I noticed that, too. Fixed already. 
Stefan

From robertwb at math.washington.edu Wed Nov 26 02:07:57 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 25 Nov 2008 17:07:57 -0800
Subject: [Cython] Errors in generated .c
In-Reply-To: <000001c94d3a$41c066e0$c54134a0$@w@hiddenworlds.org>
References: <002101c94c6b$fdb10be0$f91323a0$@w@hiddenworlds.org>
 <4A41293A-AEE1-44FC-9372-3954C1907C35@math.washington.edu>
 <000001c94d3a$41c066e0$c54134a0$@w@hiddenworlds.org>
Message-ID:

On Nov 22, 2008, at 11:08 PM, Tim Wakeham wrote:

> Ok, I've done a bit of testing and the problem seems to be in recursive
> ctypedef's.
>
> Eg.
>
> ctypedef struct foo:
>     int count
>     foo *bar
>
> causes the bug in the outputted code. Remove the reference to itself and
> the problem goes away.
>
> cdef struct foo:
>     int count
>     foo *bar
>
> compiles fine. I'm guessing that even if a recursive type def isn't
> allowed, then the cython translator should complain about it before gcc
> is ever invoked. The easy way around it is to do the C thing and do a
> plain cdef first, then do the ctypedef like
>
> cdef struct foo:
>     int count
>     foo *bar
>
> ctypedef foo FOO
>
> Hope this helps.

The bug has been fixed.

- Robert

From robertwb at math.washington.edu Wed Nov 26 02:48:52 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 25 Nov 2008 17:48:52 -0800
Subject: [Cython] Cython 0.10.2
Message-ID: <617CEF70-3A8E-499E-94D9-2E4F2C5CCA28@math.washington.edu>

Another minor release. Bugfix only.
See http://trac.cython.org/ cython_trac/query?milestone=0.10.2 - Robert From stefan_ml at behnel.de Wed Nov 26 09:47:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 26 Nov 2008 09:47:08 +0100 Subject: [Cython] dictionary iteration generates code with illegal cast In-Reply-To: <4db580fd0811251243u3faa6473u92d8893059bb4f00@mail.gmail.com> References: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> <492C5138.3060603@behnel.de> <4db580fd0811251219kf211731l50a32a1f0df9f4aa@mail.gmail.com> <4db580fd0811251222j4d342549q62c0e1f2f1de4e40@mail.gmail.com> <492C61C3.5090900@behnel.de> <4db580fd0811251243u3faa6473u92d8893059bb4f00@mail.gmail.com> Message-ID: <492D0D0C.2020808@behnel.de> Hoyt Koepke wrote: > Thanks! That bug is fixed. Another one has showed up, though: > [...] > line 3601, in generate_execution_code > self.body.generate_execution_code(code) > File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", > line 329, in generate_execution_code > stat.generate_execution_code(code) > AttributeError: 'PyTypeTestNode' object has no attribute > 'generate_execution_code' Hmm, I had already fixed a bug like that. Would you have some example code that shows it? Stefan From stefan_ml at behnel.de Wed Nov 26 14:03:27 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 26 Nov 2008 14:03:27 +0100 Subject: [Cython] dictionary iteration generates code with illegal cast In-Reply-To: <492D0D0C.2020808@behnel.de> References: <4db580fd0811250024x1292e071m5ba4c68fce50db2f@mail.gmail.com> <492C5138.3060603@behnel.de> <4db580fd0811251219kf211731l50a32a1f0df9f4aa@mail.gmail.com> <4db580fd0811251222j4d342549q62c0e1f2f1de4e40@mail.gmail.com> <492C61C3.5090900@behnel.de> <4db580fd0811251243u3faa6473u92d8893059bb4f00@mail.gmail.com> <492D0D0C.2020808@behnel.de> Message-ID: <492D491F.7030604@behnel.de> Hi, Stefan Behnel wrote: > Hoyt Koepke wrote: >> Thanks! That bug is fixed. Another one has showed up, though: >> [...] 
>> line 3601, in generate_execution_code >> self.body.generate_execution_code(code) >> File "/home/hoytak/sysroot/lib/python2.5/site-packages/Cython/Compiler/Nodes.py", >> line 329, in generate_execution_code >> stat.generate_execution_code(code) >> AttributeError: 'PyTypeTestNode' object has no attribute >> 'generate_execution_code' > > Hmm, I had already fixed a bug like that. ... just forgot to commit it. :) Should be fixed now. Stefan From stefan_ml at behnel.de Wed Nov 26 19:23:02 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 26 Nov 2008 19:23:02 +0100 Subject: [Cython] Cleaning up new-style temps on errors and returns Message-ID: <492D9406.9030203@behnel.de> Hi, I noticed that the new-style temps do not currently generate XDECREF code at the standard function error label. Neither are they considered for DECREF cleanup by the return statement. For testing, I let ExprNodes.SequenceNode (i.e. TupleNode and ListNode) inherit from NewTempExprNode, which works quite well, but induces the following behaviour. When I add DECREF cleanup code to the return statement, the test suite passes just like before (I'm sure 'return' inside of control structures is not backed by enough tests, though). However, when I add the temp cleanup code to the error label, it crashes the failure tests in bufaccess.pyx. I wouldn't know why this kind of error cleanup should not be needed for new-style temps, so I'm not sure if it's just plain wrong from my side to put it there because of some reason I'm not aware of, or if I stumbled over a hidden bug here. Any comments on why it might be wrong to clean up new-style temps the same way as old-style temps? Stefan From ggellner at uoguelph.ca Wed Nov 26 19:24:50 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Wed, 26 Nov 2008 13:24:50 -0500 Subject: [Cython] Comments on example code? 
In-Reply-To: <58925B04-DEF5-4AF2-B6B0-98622B40DACC@math.washington.edu> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> <58925B04-DEF5-4AF2-B6B0-98622B40DACC@math.washington.edu> Message-ID: <20081126182450.GA27424@workerbee> On Tue, Nov 25, 2008 at 12:06:06PM -0800, Robert Bradshaw wrote: > One way to do it would be to flatten/compact the ndarray early on (so > one know the entries are contiguous) and then pass the raw float* and > its length (keeping the original array around so you don't have to > worry about memory issues. > Yeah I do this with my own code at times, but I am generally dealing with a matlab audience, and this seems like dark magic. I don't need maximum speed, just as fast as is still understandable to basic programmers. > Note that __call__, though not as slow as a normal python function > call, still has Python semantics (i.e. all of its arguments and its > return value have to pass through Python objects) so just using a > cdef method could speed things up considerably. (I wonder how hard it > would be to support cpdef __call__?) The call to empty is probably > dominating things too--since the 3 seems hard coded in anyways, I > would accept (and return) a > How hard do you think it would be to write something that sets up an empty (or zeroed) ndarray do avoid the python call? Upon further thinking I imagine this is the major issue from f2py. If it is doable for mortals I would try my hand at it. I imagine it is just getting the .data and some metadata of the struct setup correctly, but I don't fully understand the magic of the ndarray in cython. If this is possible would it make sense to add it to the numpy.pyx file? > cdef struct data: > float x > float y > float z > > instead of a 3-element ndarray. > I don't have control of the calling program routine, so this won't work . . . 
Again, thanks for all the responses, I continue to think Cython is Python's
greatest `killer` feature for science.

Gabriel

From stefan_ml at behnel.de Wed Nov 26 19:57:36 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 26 Nov 2008 19:57:36 +0100
Subject: [Cython] Cleaning up new-style temps on errors and returns
In-Reply-To: <492D9406.9030203@behnel.de>
References: <492D9406.9030203@behnel.de>
Message-ID: <492D9C20.3030507@behnel.de>

[replying to myself...]

Stefan Behnel wrote:
> When I add DECREF cleanup code to the return statement, the test suite
> passes just like before

and in fact, the code that the iter-dict transform generates for ticket
#124 shows that this is required. For this code

    def spam(dict d):
        for elm in d:
            return False
        return True

Cython now generates this when I add DECREF cleanup code for new-style
temps to the return statement:

    [...]
    PyObject *__pyx_t_1 = NULL;
    [...]
    __pyx_t_2 = 0;
    Py_INCREF(((PyObject *)__pyx_v_d));
    Py_XDECREF(__pyx_t_1);
    __pyx_t_1 = ((PyObject *)__pyx_v_d);
    while (1) {
      if (!PyDict_Next(__pyx_t_1, (&__pyx_t_2), \
                       ((PyObject **)(&__pyx_t_3)), NULL)) break;
      Py_INCREF(((PyObject *)__pyx_t_3));
      Py_DECREF(__pyx_v_elm);
      __pyx_v_elm = ((PyObject *)__pyx_t_3);
      __pyx_3 = __Pyx_PyBool_FromLong(0); if () [error goto];
      __pyx_r = __pyx_3;
      __pyx_3 = 0;
      Py_DECREF(__pyx_2); __pyx_2 = 0;
      Py_DECREF(__pyx_t_1); __pyx_t_1 = 0;
      goto __pyx_L0;
    }

Note the line where it says "Py_DECREF(__pyx_t_1); __pyx_t_1 = 0;", which
wasn't there before. (BTW, the bogus "DECREF(__pyx_2)" is the reason why
ticket 124 exists in the first place).

The same applies to the error case, where Cython currently generates

    __pyx_L1_error:;
    Py_XDECREF(__pyx_2);
    Py_XDECREF(__pyx_3);
    __Pyx_AddTraceback("ticket124.spam");

This certainly lacks an XDECREF for __pyx_t_1, as the error goto inside the
loop ends up here.

One more thing: the above loop code also lacks a DECREF(__pyx_t_1) /after/
the loop. I currently have no idea how to put it there...
Stefan From ggellner at uoguelph.ca Wed Nov 26 20:13:03 2008 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Wed, 26 Nov 2008 14:13:03 -0500 Subject: [Cython] Comments on example code? In-Reply-To: <492C4227.50208@student.matnat.uio.no> References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> <20081125162126.GA18850@workerbee> <492C4227.50208@student.matnat.uio.no> Message-ID: <20081126191303.GA27524@workerbee> On Tue, Nov 25, 2008 at 07:21:27PM +0100, Dag Sverre Seljebotn wrote: > Gabriel Gellner wrote: >> But the only difference in the caller between the f2py version and this one >> is this file. So the speed difference must be from the function calls etc as >> you have said, but I don't understand how the overhead can be from the caller >> as the code for this is identical (and not compiled). >> >> the code is basically: >> >> for model.b1 in np.linspace(0, 2.6, 100): >> odeint(model, y0, t0) >> >> where model is defined in a separate file, in one case as a fortran >> object generated by f2py and in this case by cython. > > Ah, right. > > It's interesting (but not that surprising) to see that f2py does perform > better in this area, due to both the call to np.empty and that acquiring > the buffer from the NumPy array likely is slower than the NumPy-specific > stuff that f2py is doing. > > But that Python snippet is likely going to use the majority of the time > no matter how things are done, so it doesn't matter much. I'm interested > to hear how much do you gain over pure Python in this case then -- > probably not too much, perhaps 2-300%? (Moving the entire for-loop to > Cython would typically give you around 800% improvement). > The Cython callback version is 10 times faster than the pure python. > Anyway, this is not likely to be an area where Cython is improved, so > there's nothing to do about it. 
> I don't think it is a common usecase
> either -- typically either one keep everything in Python, or one decide
> that Python only is too slow, but then one doesn't want to only leverage
> a small part of the speed increase that compiled code can give you...
>
I don't think this is true. Speeding up callbacks (that I loop in my code is
not essential; odeint needs to call the provided function a lot, even without
a python loop) is very, very common. PyDSTool (a dynamical systems package)
has even invented their own text format to do this. Unless I can expect that
every python routine I use has a Cython interface that accepts cdef-like
functions (which for scipy [as far as I know] is currently not true at all)
then I am stuck with python callbacks or making my own wrappers (and not using
the vast scipy ecosystem), which would mean I should just use C or Fortran
alone. Using tools like f2py, weave, and recently cython is the only major
draw to python I have to offer fellow researchers over matlab (as it makes the
equivalent MEX construction seem extra painful). Heck, using this I can
significantly speed up solvers written in pure python that consume a callback.
I find for most iterative solvers in python the callback is the most expensive
part (even for toy examples like I have given), especially with Cython to get
rid of the loop overhead.

> Even if you move the for-loop into Cython or Fortran you don't have to
> use vectorized calculations. Something like:
>
> cython_apply_odeint(model, 0, 2.6, 100)
>
I see what you mean, yes I can do this (and often do), but it doesn't get
over the callback issue.

> and then just code a very generic for-loop in Cython which calls odeint
> and model, just to move the loop on the other side of the
> Python/Cython-barrier.
>
> Enough ranting though. Thanks for bringing this to my attention!
>
And thanks for discussing it.
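For readers following the thread, here is a rough sketch of the "generic for-loop" driver discussed above. It is plain Python so that it runs as-is; the point is only the structure that would be compiled with Cython. Everything named here is an assumption made for illustration: `euler_solve` is a fixed-step stand-in for odeint, `apply_solver` plays the role of the hypothetical `cython_apply_odeint`, and the parameter values are arbitrary. The right-hand side is the three-species model posted earlier in the thread.

```python
def euler_solve(rhs, y0, t_end, steps):
    """Integrate y' = rhs(y, t) with fixed-step forward Euler.

    Stand-in for odeint: like odeint, it calls the callback once per step.
    """
    y = list(y0)
    dt = t_end / steps
    t = 0.0
    for _ in range(steps):
        dy = rhs(y, t)
        y = [yi + dt * dyi for yi, dyi in zip(y, dy)]
        t += dt
    return y


class Model:
    """Pure-Python version of the model callback from the thread."""

    def __init__(self, a1, a2, b1, b2, d1, d2):
        self.a1, self.a2 = a1, a2
        self.b1, self.b2 = b1, b2
        self.d1, self.d2 = d1, d2

    def __call__(self, y, t):
        # The two interaction terms each appear twice in the original code.
        f1 = self.a1 * y[0] * y[1] / (1.0 + self.b1 * y[0])
        f2 = self.a2 * y[1] * y[2] / (1.0 + self.b2 * y[1])
        return [
            y[0] * (1.0 - y[0]) - f1,
            f1 - f2 - self.d1 * y[1],
            f2 - self.d2 * y[2],
        ]


def apply_solver(model, b1_start, b1_stop, count, y0, t_end=1.0, steps=100):
    """Sweep model.b1 over `count` (>= 2) evenly spaced values and solve each."""
    results = []
    for i in range(count):
        model.b1 = b1_start + (b1_stop - b1_start) * i / (count - 1)
        results.append(euler_solve(model, y0, t_end, steps))
    return results


# Arbitrary illustrative parameters, not taken from the thread.
model = Model(a1=5.0, a2=0.1, b1=3.0, b2=2.0, d1=0.4, d2=0.01)
sweep = apply_solver(model, 0.0, 2.6, 100, [0.5, 0.5, 0.5])
print(len(sweep), "solutions computed")
```

As noted in the reply above, moving the sweep loop this way does not remove the per-step call into `model` made by the solver; that cost stays unless the solver itself can call the model without going through Python.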
Gabriel From dagss at student.matnat.uio.no Wed Nov 26 20:17:19 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 26 Nov 2008 20:17:19 +0100 (CET) Subject: [Cython] Cleaning up new-style temps on errors and returns In-Reply-To: <492D9406.9030203@behnel.de> References: <492D9406.9030203@behnel.de> Message-ID: Stefan Behnel wrote: > I wouldn't know why this kind of error cleanup should not be needed for > new-style temps, so I'm not sure if it's just plain wrong from my side to > put it there because of some reason I'm not aware of, or if I stumbled > over > a hidden bug here. I must admit I felt a bit like I was using a big, blunt axe to carve out NewTempExprNode -- I only understood about half of what I did. So it is very unlikely that you fail to see something here, I think you have a much better understanding of this stuff than myself at least. (The good news is that the fixes are likely to be very small I think, I don't think there are fundamental design flaws, but in some ways it needs you to do it...at least I don't know this stuff well enough.) About the failing XDECREF in error label...hmm, nothing in particular comes to mind -- both NewTempExprNode and TempsBlockNode seem to set the temps to NULL when they are decref-ed...perhaps it is some interaction with the old temp system, that when transferring values from the new to the old temp system, the old system decides it doesn't need to incref it because it expects the reference to be transferred because the source node has is_temp set to True? But it is a wild guess. On to something related: I did find another likely memory leak though: TempsBlockNode doesn't automatically clear the temps that is used on exit of the TempsBlockNode -- so if you have two of those in a row, using temps of the same type, there will be a memory leak. Putting DECREF in the loop around line 67 in UtilNodes.py should do it. This assumes that transforms always use all the temps they ask for. 
(Can you have a look if you're at this already? Two dict-loop-optimizations
in a row in a function should trigger it...)

Perhaps one should make it more convenient to specify an initial value for
the temps, the API is kind of asymmetrical otherwise. Also that would allow
changing the XDECREF in TempRefNode.generate_assignment_code to DECREF (the
XDECREF is there so that the first SingleAssignmentNode doesn't fail).

Sorry about writing rather than coding but I have exams coming up to read
for. Which reminds me...enough procrastination, gtg :-)

Dag Sverre

From dagss at student.matnat.uio.no Wed Nov 26 20:19:26 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 26 Nov 2008 20:19:26 +0100 (CET)
Subject: [Cython] Cleaning up new-style temps on errors and returns
In-Reply-To: <492D9C20.3030507@behnel.de>
References: <492D9406.9030203@behnel.de> <492D9C20.3030507@behnel.de>
Message-ID: <4084d2c5b2d247dd9f30d2a27624bf99.squirrel@webmail.uio.no>

Stefan Behnel wrote:
> One more thing: the above loop code also lacks a DECREF(__pyx_t_1) /after/
> the loop. I currently have no idea how to put it there...

Just a short note that this is the problem I noted the solution for in the
latter half of my previous mail.

Dag Sverre

From dagss at student.matnat.uio.no Wed Nov 26 20:58:00 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 26 Nov 2008 20:58:00 +0100
Subject: [Cython] Comments on example code?
In-Reply-To: <20081126191303.GA27524@workerbee>
References: <20081125150522.GA17693@workerbee>
 <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no>
 <20081125162126.GA18850@workerbee>
 <492C4227.50208@student.matnat.uio.no>
 <20081126191303.GA27524@workerbee>
Message-ID: <492DAA48.8090804@student.matnat.uio.no>

Gabriel Gellner wrote:
> On Tue, Nov 25, 2008 at 07:21:27PM +0100, Dag Sverre Seljebotn wrote:
>> Gabriel Gellner wrote:
>>> But the only difference in the caller between the f2py version and this one
>>> is this file.
So the speed difference must be from the function calls etc as >>> you have said, but I don't understand how the overhead can be from the caller >>> as the code for this is identical (and not compiled). >>> >>> the code is basically: >>> >>> for model.b1 in np.linspace(0, 2.6, 100): >>> odeint(model, y0, t0) >>> >>> where model is defined in a separate file, in one case as a fortran >>> object generated by f2py and in this case by cython. >> Ah, right. >> >> It's interesting (but not that surprising) to see that f2py does perform >> better in this area, due to both the call to np.empty and that acquiring >> the buffer from the NumPy array likely is slower than the NumPy-specific >> stuff that f2py is doing. >> >> But that Python snippet is likely going to use the majority of the time >> no matter how things are done, so it doesn't matter much. I'm interested >> to hear how much do you gain over pure Python in this case then -- >> probably not too much, perhaps 2-300%? (Moving the entire for-loop to >> Cython would typically give you around 800% improvement). >> > The Cython callback version is 10 times faster than the pure python. Sorry about the %-numbers above, I have no idea how the confused sentence above got there :-) What I meant to say: For the simplest extreme examples, putting the loop Cython-side gives about 1000 times improvement. Your example has a lot more stuff inside the loop, but I still have a feeling that improvements in the range of 200 times are easily within range if the entire solver is written in Cython. However, you definitely raise some interesting points below. While a mere 10 times increase is not close to what I wrote e.g. the buffer stuff for, then of course, when you sit there, wondering whether to bother with Cython or use MATLAB and Fortran, 10 times faster is, well, 10 times faster. >> Anyway, this is not likely to be an area where Cython is improved, so >> there's nothing to do about it. 
I don't think it is a common usecase >> either -- typically either one keep everything in Python, or one decide >> that Python only is too slow, but then one doesn't want to only leverage >> a small part of the speed increase that compiled code can give you... >> > I don't think this is true. Speeding up callbacks (that I loop in my code is > not essential odeint needs to call the provided function a lot, even without a > python loop) is very, very common. PyDSTool (a dynamical systems package) has > even invented their own text format to do this. Unless I can expect that every > python routine I use has a Cython interface that excepts cdef-like functions > (which for scipy [as far as I know] is currently not true at all) than I am > stuck with python callbacks or making my own wrappers (and not using the vast > scipy ecosystem), which would mean I should just use C or Fortran alone. Using > tools like f2py, weave, and recently cython is the only major draw to python I > have to offer fellow researchers over matlab (as it makes the equivalent MEX > construction seem extra painful). Heck using this I can significantly speed up > solvers written in pure python that consume a callback. I find for most > iterative solvers in python the callback is the most expensive part (even for > toy examples like I have given), especially with Cython to get rid of the loop > overhead. Very interesting perspective. Unfortunately I do not have a many good ideas about what to do about it. (In some ways I am taking the perspective that SciPy may well have a transparent Cython interface for callbacks in some time, so that callbacks passed in that subclass from a special SciPy parent class would be called in a fast way from SciPy code. I know that SciPy already use Cython for some things.) But one thing I forgot to mention is trying to use the "cast" flag on the buffer. I.e. 
write

    np.ndarray[float, cast=True] arr

This will skip some checking of the dtype (and mess up the buffer rather
than raise an exception if the array passed has the wrong dtype...).

Also, if you are really interested in this then have a look at the C
source generated by Cython -- there's a lot of stuff going on in order
to acquire access to the ndarray, and if you find that a line which
checks things takes a lot of time, we could easily add a flag to
disable the check.

Also I did once have plans for optimizing "np.empty" etc. so that no
Python overhead would be necessary (through inlineable functions in
numpy.pxd), but I didn't have time for doing it.

--
Dag Sverre

From ggellner at uoguelph.ca Wed Nov 26 20:58:38 2008
From: ggellner at uoguelph.ca (Gabriel Gellner)
Date: Wed, 26 Nov 2008 14:58:38 -0500
Subject: [Cython] Comments on example code?
In-Reply-To: <492DAA48.8090804@student.matnat.uio.no>
References: <20081125150522.GA17693@workerbee>
 <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no>
 <20081125162126.GA18850@workerbee>
 <492C4227.50208@student.matnat.uio.no>
 <20081126191303.GA27524@workerbee>
 <492DAA48.8090804@student.matnat.uio.no>
Message-ID: <20081126195838.GA27964@workerbee>

> Also, if you are really interested in this then have a look at the C
> source generated by Cython -- there's a lot of stuff going on in order
> to acquire access to the ndarray, and if you find that disabling a line
> which checks things spends a lot of time we could easily add a flag to
> disable the check.
>
> Also I did once have plans for optimizing "np.empty" etc. so that no
> Python overhead would be necessary (through inlineable functions in
> numpy.pxd), but I didn't have time for doing it.
>
I think I will try to do this, I don't really understand what it will entail,
but heck, it would be sweet, basically having zeros and empty would make
Cython super natural for what I often do.
Gabriel

From dagss at student.matnat.uio.no Wed Nov 26 21:03:13 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 26 Nov 2008 21:03:13 +0100 (CET)
Subject: [Cython] Comments on example code?
In-Reply-To: <20081126195838.GA27964@workerbee>
References: <20081125150522.GA17693@workerbee>
 <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no>
 <20081125162126.GA18850@workerbee>
 <492C4227.50208@student.matnat.uio.no>
 <20081126191303.GA27524@workerbee>
 <492DAA48.8090804@student.matnat.uio.no>
 <20081126195838.GA27964@workerbee>
Message-ID: <578bcf0549a4d9d49b74fcb07db2ba15.squirrel@webmail.uio.no>

>> Also, if you are really interested in this then have a look at the C
>> source generated by Cython -- there's a lot of stuff going on in order
>> to acquire access to the ndarray, and if you find that disabling a line
>> which checks things spends a lot of time we could easily add a flag to
>> disable the check.
>>
>> Also I did once have plans for optimizing "np.empty" etc. so that no
>> Python overhead would be necessary (through inlineable functions in
>> numpy.pxd), but I didn't have time for doing it.
>>
> I think I will try to do this, I don't really understand what it will
> entail, but heck, it would be sweet, basically having zeros and empty
> would make Cython super natural for what I often do.

Note that there is currently no such thing as inlineable functions in pxd
files! I would have to implement that in Cython first, which is no easy task
(at least if you haven't looked much at Cython code already).

But it would be very good to have, for starters, benchmarks for whether
calling the C function in arrayobject.h corresponding to np.empty provides
significant improvements. (Remember to call using the exact right integer
size for ndims!)
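A benchmark of the kind asked for above could start from a harness like the following. This is only a sketch of the measurement pattern, under loud assumptions: plain Python lists stand in for ndarrays, and `rhs_alloc`, `rhs_reuse` and `time_calls` are made-up names, so the absolute numbers say nothing about np.empty or the C function in arrayobject.h. The same harness could later be pointed at the real candidates.

```python
import timeit


def rhs_alloc(y):
    """Allocate a fresh 3-element result on every call (the np.empty(3) pattern)."""
    out = [0.0] * 3
    out[0] = y[0] * (1.0 - y[0])
    out[1] = y[1] * (1.0 - y[1])
    out[2] = y[2] * (1.0 - y[2])
    return out


def rhs_reuse(y, out):
    """Do the same arithmetic, but fill a caller-supplied buffer."""
    out[0] = y[0] * (1.0 - y[0])
    out[1] = y[1] * (1.0 - y[1])
    out[2] = y[2] * (1.0 - y[2])
    return out


def time_calls(func, number=20000, repeat=5):
    """Return the best observed time per call, in seconds."""
    # min() over repeats is the usual way to reduce noise from other processes.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


y = [0.2, 0.5, 0.8]
buf = [0.0] * 3
t_alloc = time_calls(lambda: rhs_alloc(y))
t_reuse = time_calls(lambda: rhs_reuse(y, buf))
print("allocate per call: %.3e s" % t_alloc)
print("reuse buffer:      %.3e s" % t_reuse)
```

If the two timings come out close, allocation is probably not what dominates the callback, and the argument-passing overhead discussed earlier in the thread is the more likely culprit.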
Dag Sverre

From robertwb at math.washington.edu Wed Nov 26 21:03:39 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 26 Nov 2008 12:03:39 -0800
Subject: [Cython] Comments on example code?
In-Reply-To: <20081126182450.GA27424@workerbee>
References: <20081125150522.GA17693@workerbee> <11502fa0125f2acc110dfef263449eba.squirrel@webmail.uio.no> <58925B04-DEF5-4AF2-B6B0-98622B40DACC@math.washington.edu> <20081126182450.GA27424@workerbee>
Message-ID: <77FC624E-5E9E-419D-B68F-1B64A97ADAE0@math.washington.edu>

On Nov 26, 2008, at 10:24 AM, Gabriel Gellner wrote:

> On Tue, Nov 25, 2008 at 12:06:06PM -0800, Robert Bradshaw wrote:
>> One way to do it would be to flatten/compact the ndarray early on (so
>> one knows the entries are contiguous) and then pass the raw float* and
>> its length (keeping the original array around so you don't have to
>> worry about memory issues).
>>
> Yeah I do this with my own code at times, but I am generally dealing
> with a matlab audience, and this seems like dark magic. I don't need
> maximum speed, just as fast as is still understandable to basic
> programmers.

Certainly a worthy goal. One of the main motivations of Cython is to
achieve maximum speed without sacrificing understandability. There is,
of course, a lot that remains to be done in this direction :). When
you're speaking of callbacks, would it be too magic to pass (C)
function pointers around?

>
>> Note that __call__, though not as slow as a normal python function
>> call, still has Python semantics (i.e. all of its arguments and its
>> return value have to pass through Python objects) so just using a
>> cdef method could speed things up considerably. (I wonder how hard it
>> would be to support cpdef __call__?)
The call to empty is probably >> dominating things too--since the 3 seems hard coded in anyways, I >> would accept (and return) a >> > How hard do you think it would be to write something that sets up > an empty (or > zeroed) ndarray do avoid the python call? Upon further thinking I > imagine > this is the major issue from f2py. > > If it is doable for mortals I would try my hand at it. I imagine it > is just > getting the .data and some metadata of the struct setup correctly, > but I don't > fully understand the magic of the ndarray in cython. I bet there's probably a C-level function in NumPy that one could just call, if one can find it. Either that or malloc/calloc a chunk and use NumPy's C API to create a new ndarray out of that raw data. > If this is possible would it make sense to add it to the numpy.pyx > file? Yes, certainly. This is (part of) what Dag was talking about being able to put inline functions in .pxd files. > >> cdef struct data: >> float x >> float y >> float z >> >> instead of a 3-element ndarray. >> > I don't have control of the calling program routine, so this won't > work . . . Sure. > Again, thanks for all the responses, I continue to think Cython is > pythons > greatest `killer` feature for science. Thanks. From dagss at student.matnat.uio.no Thu Nov 27 10:20:18 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 27 Nov 2008 10:20:18 +0100 Subject: [Cython] Cleaning up new-style temps on errors and returns In-Reply-To: <492D9C20.3030507@behnel.de> References: <492D9406.9030203@behnel.de> <492D9C20.3030507@behnel.de> Message-ID: <492E6652.5030906@student.matnat.uio.no> Stefan Behnel wrote: > [replying to myself...] > > Stefan Behnel wrote: >> When I add DECREF cleanup code to the return statement, the test suite >> passes just like before > > and in fact, the code that the iter-dict transform generates for ticket > #124 shows that this is required. 
> For this code
>
>     def spam(dict d):
>         for elm in d:
>             return False
>         return True
>
> Cython now generates this when I add DECREF cleanup code for new-style
> temps to the return statement:
> snip
> Note the line where it says "Py_DECREF(__pyx_t_1); __pyx_t_1 = 0;", which
> wasn't there before.

OK I thought of something that might be an issue. Pseudo-code:

    tempsblocknode t1 = 3:
        try:
            try:
                tempsblocknode t2 = 4:
                    return False
            finally:
                print t1
        finally:
            print t1

Is it enough just to document that TempsBlockNode must not be used in
such a way that the temp can be accessed within a finally clause?
(Python already helps us here by disallowing "continue" within finally,
for instance). I suppose this is usually under strict control of the
transform using TempsBlockNode.

Otherwise it looks like solving this involves some (not too advanced)
kind of flow control analysis, in a transform tracking TempsBlockNodes
and try/finally-nodes.

--
Dag Sverre

From stephane.drouard at st.com Thu Nov 27 18:29:22 2008
From: stephane.drouard at st.com (Stephane DROUARD)
Date: Thu, 27 Nov 2008 18:29:22 +0100
Subject: [Cython] __del__ not called on Python exit
Message-ID: <014401c950b5$b02de430$b9ad810a@gnb.st.com>

Hi,

With the following file "foo.py":

    class Foo:
        def __init__(self):
            print "__init__"

        def __del__(self):
            print "__del__"

    foo = Foo()
    del foo
    foo = Foo()

> python -c "import foo"
__init__
__del__
__init__
__del__

When foo.py is Cythonized (0.10.2):
> python -c "import foo"
__init__
__del__
__init__

__del__ is not called when Python exits.

I tried using "--cleanup 1", but it hangs, trying to write at address 0...

Cheers,
Stephane
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: foo.py Url: http://codespeak.net/pipermail/cython-dev/attachments/20081127/4e40cab1/attachment.diff From dagss at student.matnat.uio.no Thu Nov 27 19:34:33 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 27 Nov 2008 19:34:33 +0100 Subject: [Cython] New feature: Inline functions in pxd files Message-ID: <492EE839.2000307@student.matnat.uio.no> It is now possible (in cython-devel) to have inline cdef functions in pxd files. I.e. in mymod.pxd you can write cdef inline int my_sum(int a, int b): return a + b Only functions at the top level in pxd file are supported, and it must have the "inline" modifier, and must *not* have "api" or "public" modifiers. Note: The implementation of the function is *copied* to every pyx-file using the pxd file. (Coding it went surprisingly smooth, was primarily about letting it through the parser...please try to break it :-) ). The solution is a little hacky (added special case in Entry), but then I know I don't break something else. Patches for numpy.pxd utilising this very welcome, I won't have time for that myself for some time. -- Dag Sverre From dagss at student.matnat.uio.no Thu Nov 27 19:39:29 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 27 Nov 2008 19:39:29 +0100 Subject: [Cython] pxd locals? Message-ID: <492EE961.2040101@student.matnat.uio.no> When implementing inlineable functions, I saw that there's code in PxdPostParse to enable this kind of code: cdef int foo(int a, int b): cdef int c cdef int d in a pxd file, in order to set "pxd_locals". As soon as some real code is added the code is disallowed. What is the purpose? It does conflict a bit with the inline syntax (not for real but mentally). I have a feeling Robert added this as a feature for something but what the feature is escapes me... 
-- Dag Sverre From stefan_ml at behnel.de Thu Nov 27 21:24:24 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 27 Nov 2008 21:24:24 +0100 Subject: [Cython] pxd locals? In-Reply-To: <492EE961.2040101@student.matnat.uio.no> References: <492EE961.2040101@student.matnat.uio.no> Message-ID: <492F01F8.2060507@behnel.de> Hi, Dag Sverre Seljebotn wrote: > When implementing inlineable functions, I saw that there's code in > PxdPostParse to enable this kind of code: > > cdef int foo(int a, int b): > cdef int c > cdef int d > > in a pxd file, in order to set "pxd_locals". As soon as some real code > is added the code is disallowed. > > What is the purpose? It does conflict a bit with the inline syntax (not > for real but mentally). I have a feeling Robert added this as a feature > for something but what the feature is escapes me... I think that's for overriding the code in .py files, so that you can use plain Python code without type declarations, and just add a .pxd file next to it for efficient C compilation. Stefan From dagss at student.matnat.uio.no Thu Nov 27 21:46:28 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 27 Nov 2008 21:46:28 +0100 Subject: [Cython] pxd locals? In-Reply-To: <492F01F8.2060507@behnel.de> References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> Message-ID: <492F0724.3060409@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> When implementing inlineable functions, I saw that there's code in >> PxdPostParse to enable this kind of code: >> >> cdef int foo(int a, int b): >> cdef int c >> cdef int d >> >> in a pxd file, in order to set "pxd_locals". As soon as some real code >> is added the code is disallowed. >> >> What is the purpose? It does conflict a bit with the inline syntax (not >> for real but mentally). I have a feeling Robert added this as a feature >> for something but what the feature is escapes me... 
> > I think that's for overriding the code in .py files, so that you can use > plain Python code without type declarations, and just add a .pxd file next > to it for efficient C compilation. Ahh. This kind of makes me worry, syntax-wise, as it collides with inline cdef functions. (And, I must admit, I have now accidentally, through overlaying my syntax, removed the possibility to specify that a function is "inline" in the pyx through the pxd.) inline functions in pxd files is something I think comes in *really* handy, it is part of what the "include" statement is currently used for and inline functions are much nicer than include statements. Proposal: New modifier, "proto". I.e. cdef proto int foo(int a, int b): cdef int c cdef int b will embed the signature/variables on a pure def foo(a, b): ... If the keyword proto is not present, the definition is not allowed unless it is declared inline. Pro: - Resolve the conflict. - I think this makes it more obvious what is going on. I think having the pxd definition transfer to a def in the pyx/py is a bit too magical using the current syntax anyway, and that this is an improvement, conflict or not. Con: - Breaks compatability with Cython 0.10 (but through compiler errors, no silently changed behaviour). What do you think? I could implement it in roughly five minutes if it is accepted, the function modifier part of the parser is nice and dynamic. -- Dag Sverre From dalcinl at gmail.com Thu Nov 27 23:15:30 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 27 Nov 2008 20:15:30 -0200 Subject: [Cython] pxd locals? In-Reply-To: <492F0724.3060409@student.matnat.uio.no> References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> <492F0724.3060409@student.matnat.uio.no> Message-ID: On Thu, Nov 27, 2008 at 6:46 PM, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Hi, >> > Proposal: New modifier, "proto". I.e. 
>
>     cdef proto int foo(int a, int b):
>         cdef int c
>         cdef int b
>
> will embed the signature/variables on a pure
>
>     def foo(a, b): ...
>

Definitely +1 for me. Your idea is really great. However, I would
really prefer 'prototype' as the magic keyword.

Could this be extended to class definitions and perhaps their methods?

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Fri Nov 28 07:28:24 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 27 Nov 2008 22:28:24 -0800
Subject: [Cython] pxd locals?
In-Reply-To: <492F0724.3060409@student.matnat.uio.no>
References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> <492F0724.3060409@student.matnat.uio.no>
Message-ID:

On Nov 27, 2008, at 12:46 PM, Dag Sverre Seljebotn wrote:

> Stefan Behnel wrote:
>> Hi,
>>
>> Dag Sverre Seljebotn wrote:
>>> When implementing inlineable functions, I saw that there's code in
>>> PxdPostParse to enable this kind of code:
>>>
>>>     cdef int foo(int a, int b):
>>>         cdef int c
>>>         cdef int d
>>>
>>> in a pxd file, in order to set "pxd_locals". As soon as some real
>>> code is added the code is disallowed.
>>>
>>> What is the purpose? It does conflict a bit with the inline syntax
>>> (not for real but mentally). I have a feeling Robert added this as
>>> a feature for something but what the feature is escapes me...
>>
>> I think that's for overriding the code in .py files, so that you
>> can use plain Python code without type declarations, and just add
>> a .pxd file next to it for efficient C compilation.
>
> Ahh.

Yes, that's exactly why I introduced it.
Python 2.3 doesn't have decorators, so we can't use them in bootstrapping the Cython compiler for instance. > This kind of makes me worry, syntax-wise, as it collides with inline > cdef functions. (And, I must admit, I have now accidentally, through > overlaying my syntax, removed the possibility to specify that a > function > is "inline" in the pyx through the pxd.) > > inline functions in pxd files is something I think comes in *really* > handy, it is part of what the "include" statement is currently used > for > and inline functions are much nicer than include statements. +1 to inline functions. I've wanted these too. > Proposal: New modifier, "proto". I.e. > > cdef proto int foo(int a, int b): > cdef int c > cdef int b > > will embed the signature/variables on a pure > > def foo(a, b): ... > > If the keyword proto is not present, the definition is not allowed > unless it is declared inline. > > Pro: > - Resolve the conflict. > - I think this makes it more obvious what is going on. I think having > the pxd definition transfer to a def in the pyx/py is a bit too > magical > using the current syntax anyway, and that this is an improvement, > conflict or not. > > Con: > - Breaks compatability with Cython 0.10 (but through compiler > errors, no > silently changed behaviour). > > What do you think? I could implement it in roughly five minutes if > it is > accepted, the function modifier part of the parser is nice and > dynamic. I'm generally -1 on adding new syntax, but what I had wasn't very clear either. Would it be enough to accept a locals decorator? 
- Robert

From robertwb at math.washington.edu Fri Nov 28 07:40:39 2008
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 27 Nov 2008 22:40:39 -0800
Subject: [Cython] __del__ not called on Python exit
In-Reply-To: <014401c950b5$b02de430$b9ad810a@gnb.st.com>
References: <014401c950b5$b02de430$b9ad810a@gnb.st.com>
Message-ID:

On Nov 27, 2008, at 9:29 AM, Stephane DROUARD wrote:

> Hi,
>
> With the following file "foo.py":
>
>     class Foo:
>         def __init__(self):
>             print "__init__"
>
>         def __del__(self):
>             print "__del__"
>
>     foo = Foo()
>     del foo
>     foo = Foo()
>
>> python -c "import foo"
> __init__
> __del__
> __init__
> __del__
>
> When foo.py is Cythonized (0.10.2):
>> python -c "import foo"
> __init__
> __del__
> __init__
>
> __del__ is not called when Python exits.
>
> I tried using "--cleanup 1", but it hangs, trying to write at
> address 0...

The cleanup level goes all the way up to 3, but it still doesn't seem
to catch everything. It's also (as you've noticed) a bit unsafe.
Somehow module dictionaries in extension modules don't get
deallocated. This is a known issue; if you (or anyone else) have any
ideas I'd love to hear them. I just made a ticket:
http://trac.cython.org/cython_trac/ticket/142

- Robert

From stefan_ml at behnel.de Fri Nov 28 08:06:36 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 28 Nov 2008 08:06:36 +0100
Subject: [Cython] pxd locals?
In-Reply-To:
References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> <492F0724.3060409@student.matnat.uio.no>
Message-ID: <492F987C.6070908@behnel.de>

Hi,

Robert Bradshaw wrote:
>> inline functions in pxd files is something I think comes in *really*
>> handy, it is part of what the "include" statement is currently used
>> for and inline functions are much nicer than include statements.
>
> +1 to inline functions. I've wanted these too.

Absolutely, +1.

> I'm generally -1 on adding new syntax, but what I had wasn't very
> clear either.
Would it be enough to accept a locals decorator? I think that's a very clean compromise. You can a) specify the signature in .pxd files as you would in .pyx files, just without a function body. b) write importable inline functions in .pxd files as a function with body, just as you would in a .pyx file. c) specify local variable types for a .py implemented function in a .pxd file by adding a decorator to a function signature, but without providing a function body. So, whenever a function signature in a .pxd file has a body at all, it must be a complete inline function. I actually don't think that users would intuitively expect the current way of defining local variables in .pxd files to work. Stefan From dagss at student.matnat.uio.no Fri Nov 28 10:24:40 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 10:24:40 +0100 Subject: [Cython] pxd locals? In-Reply-To: References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> <492F0724.3060409@student.matnat.uio.no> Message-ID: <492FB8D8.2040708@student.matnat.uio.no> Robert Bradshaw wrote: > On Nov 27, 2008, at 12:46 PM, Dag Sverre Seljebotn wrote: >> Proposal: New modifier, "proto". I.e. >> >> cdef proto int foo(int a, int b): >> cdef int c >> cdef int b >> >> will embed the signature/variables on a pure >> >> def foo(a, b): ... >> >> If the keyword proto is not present, the definition is not allowed >> unless it is declared inline. >> >> Pro: >> - Resolve the conflict. >> - I think this makes it more obvious what is going on. I think having >> the pxd definition transfer to a def in the pyx/py is a bit too >> magical >> using the current syntax anyway, and that this is an improvement, >> conflict or not. >> >> Con: >> - Breaks compatability with Cython 0.10 (but through compiler >> errors, no >> silently changed behaviour). >> >> What do you think? 
I could implement it in roughly five minutes if >> it is >> accepted, the function modifier part of the parser is nice and >> dynamic. > > I'm generally -1 on adding new syntax, but what I had wasn't very > clear either. Would it be enough to accept a locals decorator? That's much better; +1. Could you do it? You did this so it should be quicker for you. At any rate this is now http://trac.cython.org/cython_trac/ticket/143 Note that this may require extending decorator support (in Parsing.py) to cdef functions, currently they only support "def" (if you didn't change that already). They should all probably simply be let through no matter what they decorate, and then disallowed again (except for @cython.locals) in PostParse. -- Dag Sverre From stefan_ml at behnel.de Fri Nov 28 12:58:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 12:58:26 +0100 Subject: [Cython] ExprNodes.subexpr_nodes() Message-ID: <492FDCE2.6030904@behnel.de> Hi, the ExprNodes.subexpr_nodes() method caches the values of children referenced in the "subexpr" attribute. However, it can easily get in the way of tree transforms that replace these subexpressions after they have been requested for the first time (e.g. during temp allocation). Two ways to deal with it: a) remove the caching entirely. I'm not sure it's worth it anyway, since the list of subexpressions tends to be pretty short. b) make the tree visitor a bit smarter so that it clears the cache on any change (assuming that it can detect changes). Opinions? 
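To make the caching concern above concrete, here is a minimal Python sketch. It assumes a toy node class rather than the real ExprNodes code; the names ToyNode, cached_subexpr_nodes and fresh_subexpr_nodes are made up for illustration only.

```python
class ToyNode:
    """Toy expression node; 'subexprs' lists the child attribute names."""
    subexprs = ["operand1", "operand2"]

    def __init__(self, operand1, operand2):
        self.operand1 = operand1
        self.operand2 = operand2
        self._cache = None

    def cached_subexpr_nodes(self):
        # Hypothetical caching variant: the list is built once and reused.
        if self._cache is None:
            self._cache = [getattr(self, name) for name in self.subexprs]
        return self._cache

    def fresh_subexpr_nodes(self):
        # Option a): no caching, always read the current attributes.
        return [getattr(self, name) for name in self.subexprs]


node = ToyNode("a", "b")
node.cached_subexpr_nodes()      # first request fills the cache
node.operand1 = "replaced"       # a transform swaps out a child afterwards
stale = node.cached_subexpr_nodes()
fresh = node.fresh_subexpr_nodes()
```

In this sketch the cached list still holds the old child after the replacement, while the uncached lookup sees the new one, which is the mismatch a transform would run into.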
Stefan

From dagss at student.matnat.uio.no Fri Nov 28 13:57:59 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 28 Nov 2008 13:57:59 +0100
Subject: [Cython] ExprNodes.subexpr_nodes()
In-Reply-To: <492FDCE2.6030904@behnel.de>
References: <492FDCE2.6030904@behnel.de>
Message-ID: <492FEAD7.9040809@student.matnat.uio.no>

Stefan Behnel wrote:
> Hi,
>
> the ExprNodes.subexpr_nodes() method caches the values of children
> referenced in the "subexpr" attribute. However, it can easily get in the
> way of tree transforms that replace these subexpressions after they have
> been requested for the first time (e.g. during temp allocation). Two ways
> to deal with it:
>
> a) remove the caching entirely. I'm not sure it's worth it anyway, since
> the list of subexpressions tends to be pretty short.
>
> b) make the tree visitor a bit smarter so that it clears the cache on any
> change (assuming that it can detect changes).
>
> Opinions?

Long-term, there's an option c):

c) Use a reverse iterator pattern, i.e. a parent can be requested to
invoke a visitor on all its children, and this is written out manually
for every node class. Like this:

    class AddNode(BinopNode):
        def iterate_children(self, reverse_iterator):
            reverse_iterator.next(self.op1)
            reverse_iterator.next(self.op2)

Also you would have

    reverse_iterator.next_list(self.body)

This has about the same performance as a) currently (assuming that the
cost of getattr(self, "attr") is about the same as the cost of
self.attr), but when/if Cython itself is compiled it can be optimized
using parallel pxds and/or annotations, and then c) will be *much*
faster than either a) or b).

Given this aspect, I wonder whether it makes sense to invest lots of
time in this (presumably minor) optimization now. And a) certainly has
a simpler feel to it. So I lean towards a).

b) is certainly not difficult to pull off though. (It just makes for a
more complicated program to maintain and debug.)
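A rough, runnable Python sketch of option c) might look as follows. This is only an illustration under assumptions: the classes Collector, Leaf, Add and Block are invented stand-ins, not Cython's node classes, and the visitor here simply records node names.

```python
class Collector:
    """Stand-in for the 'reverse iterator': each parent hands it its children."""

    def __init__(self):
        self.seen = []

    def next(self, node):
        self.seen.append(node.name)
        node.iterate_children(self)   # let the node push its own children

    def next_list(self, nodes):
        for node in nodes:
            self.next(node)


class Leaf:
    def __init__(self, name):
        self.name = name

    def iterate_children(self, visitor):
        pass                          # no children to hand over


class Add(Leaf):
    def __init__(self, name, op1, op2):
        Leaf.__init__(self, name)
        self.op1 = op1
        self.op2 = op2

    def iterate_children(self, visitor):
        # Written out by hand per node class: plain attribute access only.
        visitor.next(self.op1)
        visitor.next(self.op2)


class Block(Leaf):
    def __init__(self, name, body):
        Leaf.__init__(self, name)
        self.body = body

    def iterate_children(self, visitor):
        visitor.next_list(self.body)


tree = Block("block", [Add("add", Leaf("x"), Leaf("y")), Leaf("z")])
collector = Collector()
collector.next(tree)
order = collector.seen
```

Because each iterate_children touches its attributes directly, there is no attribute-name list and no cache to invalidate when a transform replaces a child.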
-- Dag Sverre From dagss at student.matnat.uio.no Fri Nov 28 18:04:47 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 18:04:47 +0100 Subject: [Cython] Pass char literals straight to C? Message-ID: <493024AF.50602@student.matnat.uio.no> I've might have touched upon this one before, actually I cannot remember. At any rate I know I've got no answer. Can anyone think of a reason why C string literals are allocated as variables in C source? I.e. what happens now is static char __pyx_k_2[] = "AB"; static PyObject *__pyx_kp_2; ... static __Pyx_StringTabEntry __pyx_string_tab[] = { {&__pyx_kp_2, __pyx_k_2, sizeof(__pyx_k_2), 0, 1, 0}, {0, 0, 0, 0, 0, 0} }; This leads to bookkeeping etc. in Symtab.py, and IMO also makes the C source less readable. Would it be possible to simplify this and instead do static __Pyx_StringTabEntry __pyx_string_tab[] = { {&__pyx_kp_2, "AB", 2, 0, 1, 0}, {0, 0, 0, 0, 0, 0} }; ? One could argue that the literal can be reused -- however the C compiler will in most (all?) cases optimize/collapse identical string constants anyway, and also Cython currently doesn't do this for StringNode, only identifiers... (The background is that string literals etc. should also be moved to code generation, so that, say, pruning a StringNode in a transform doesn't make the string constant linger in the scope. Part of this is already in place but the final steps are not done. See http://trac.cython.org/cython_trac/ticket/99 ) -- Dag Sverre From dagss at student.matnat.uio.no Fri Nov 28 18:06:39 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 18:06:39 +0100 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <493024AF.50602@student.matnat.uio.no> References: <493024AF.50602@student.matnat.uio.no> Message-ID: <4930251F.8070701@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > This leads to bookkeeping etc. in Symtab.py, and IMO also makes the C > source less readable. 
Would it be possible to simplify this and instead do > > static __Pyx_StringTabEntry __pyx_string_tab[] = { > {&__pyx_kp_2, "AB", 2, 0, 1, 0}, > {0, 0, 0, 0, 0, 0} > }; > > ? BTW I'm aware of the fact that it should have said "3" as the length above :-) -- Dag Sverre From stefan_ml at behnel.de Fri Nov 28 18:11:42 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 18:11:42 +0100 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <493024AF.50602@student.matnat.uio.no> References: <493024AF.50602@student.matnat.uio.no> Message-ID: <4930264E.9040706@behnel.de> Hi, Dag Sverre Seljebotn wrote: > I've might have touched upon this one before, actually I cannot > remember. At any rate I know I've got no answer. > > Can anyone think of a reason why C string literals are allocated as > variables in C source? I.e. what happens now is > > static char __pyx_k_2[] = "AB"; > static PyObject *__pyx_kp_2; > ... > > static __Pyx_StringTabEntry __pyx_string_tab[] = { > {&__pyx_kp_2, __pyx_k_2, sizeof(__pyx_k_2), 0, 1, 0}, > {0, 0, 0, 0, 0, 0} > }; > > This leads to bookkeeping etc. in Symtab.py, and IMO also makes the C > source less readable. Would it be possible to simplify this and instead do > > static __Pyx_StringTabEntry __pyx_string_tab[] = { > {&__pyx_kp_2, "AB", 2, 0, 1, 0}, > {0, 0, 0, 0, 0, 0} > }; > > ? AFAIR, there may also be references to __pyx_k_2 in the C source, but that's worth checking. In any case, char* literals currently end up in the same place as the char* basis of str/unicode literals. I wouldn't mind putting them right into the string tab, though, and declaring them "const char*" there (if the C-API in Py2.3 allows it, don't remember). > One could argue that the literal can be reused -- however the C compiler > will in most (all?) cases optimize/collapse identical string constants I doubt that it will do that in many cases. It can really only do that for a const char*. 
All other char sequences are free to be mutated in place. BTW, I doubt that it's worth merging Python string references. They are either identifiers, in which case the interning hits, or longer strings, in which case it's less likely there's more than one of them. > (The background is that string literals etc. should also be moved to > code generation, so that, say, pruning a StringNode in a transform > doesn't make the string constant linger in the scope. Part of this is > already in place but the final steps are not done. See > http://trac.cython.org/cython_trac/ticket/99 ) Sounds like a bit of work, but it sounds worthwhile. Stefan From dagss at student.matnat.uio.no Fri Nov 28 18:37:53 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 18:37:53 +0100 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <4930264E.9040706@behnel.de> References: <493024AF.50602@student.matnat.uio.no> <4930264E.9040706@behnel.de> Message-ID: <49302C71.5010809@student.matnat.uio.no> Stefan Behnel wrote: > AFAIR, there may also be references to __pyx_k_2 in the C source, but > that's worth checking. In any case, char* literals currently end up in the > same place as the char* basis of str/unicode literals. > > I wouldn't mind putting them right into the string tab, though, and > declaring them "const char*" there (if the C-API in Py2.3 allows it, don't > remember). Note that I meant getting rid of string constants in Symtab.py altogether, meaning that also char literals (i.e. 'cdef char* s = "asdf"') would also be "inlined". (You once mentioned splitting up long char literals for MSVC, was this ever done? At least I wasn't able to split up a long string when testing.) >> One could argue that the literal can be reused -- however the C compiler >> will in most (all?) cases optimize/collapse identical string constants > > I doubt that it will do that in many cases. It can really only do that for > a const char*. 
All other char sequences are free to be mutated in place. OK, as that is not an issue I won't worry. But, going off-topic and conversational -- are you sure? The way I understood it, all literals are "const char*", it's just that they can be immediately assigned to a regular char* variable. I tried this with gcc: char* s = "one one one"; s[2] = 'Z'; printf("%s\n", s); At least on my gcc, I had two types of behaviour: a) With no switches, segfault. b) With -O1 or -O2, no segfault, but my change didn't show up in the output either (might have just been my testcase though, but at least when the write is optimized away it is not supported). With g++ the only apparent difference was that g++ would complain that the literal was casted from const to non-const without an explicit cast. > Sounds like a bit of work, but it sounds worthwhile. If it turns out that way I won't do it actually, but it didn't look too bad -- estimated an hour... -- Dag Sverre From stefan_ml at behnel.de Fri Nov 28 18:56:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 18:56:26 +0100 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <49302C71.5010809@student.matnat.uio.no> References: <493024AF.50602@student.matnat.uio.no> <4930264E.9040706@behnel.de> <49302C71.5010809@student.matnat.uio.no> Message-ID: <493030CA.1060306@behnel.de> Hi, Dag Sverre Seljebotn wrote: > You once mentioned splitting up long > char literals for MSVC, was this ever done? No. The problem was only with docstrings at the time, and they are commonly put into a struct field instead of a string constant. So splitting them up would have required a separate machinery from what we currently have for Python string building. This should be doable for Python strings, though, after a major cleanup in string constant handling (which you are up to, it seems). >>> One could argue that the literal can be reused -- however the C compiler >>> will in most (all?) 
cases optimize/collapse identical string constants >> I doubt that it will do that in many cases. It can really only do that for >> a const char*. All other char sequences are free to be mutated in place. > > are you sure? The way I > understood it, all literals are "const char*", it's just that they can > be immediately assigned to a regular char* variable. What I meant was: the current code doesn't allow this, as we only use char* variables. If we can put them directly into the struct as a const char* and then read them from there only inside the Python string building function, that would allow the C compiler to merge them. However, this still shouldn't apply to C strings, where merging them might break user code. Stefan From d4lojh902 at sneakemail.com Fri Nov 28 19:02:32 2008 From: d4lojh902 at sneakemail.com ({}) Date: 28 Nov 2008 18:02:32 -0000 Subject: [Cython] Can I use Cython for iPhone application development? Message-ID: <5801-43011@sneakemail.com> Hi. I'm completely new to Cython. As I understand, I can call Cython code from C, as it's said in the FAQ. Therefore, it would be possible to call Cython code from Objective-C. So I can develop an iPhone application written mostly in Cython, and then convert it to Objective-C code, right? Or am I missing anything? How? How would you write a simple Hello World Cython iPhone application? Thank you in advance. From dagss at student.matnat.uio.no Fri Nov 28 19:34:34 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 19:34:34 +0100 Subject: [Cython] Can I use Cython for iPhone application development? In-Reply-To: <5801-43011@sneakemail.com> References: <5801-43011@sneakemail.com> Message-ID: <493039BA.8040805@student.matnat.uio.no> {} wrote: > Hi. I'm completely new to Cython. > > As I understand, I can call Cython code from C, as it's said in the FAQ. > > Therefore, it would be possible to call Cython code from Objective-C. 
> So I can develop an iPhone application written mostly in Cython, and
> then convert it to Objective-C code, right? Or am I missing anything?

Cython is not a pure Python -> C translator, instead it generates
C/Python extension modules which depend on a running Python instance
(the Python version from python.org, not Jython or IronPython or similar).

From the perspective of making Cython run on the iPhone, it is exactly
the same as making C extensions for Python run on the iPhone.

Cython code cannot run standalone.

> How? How would you write a simple Hello World Cython iPhone application?

You should seek out iPhone development forums and ask again there. You
need to be able to run the official Python (from python.org), and
compile and install custom C extension modules yourself. Then come
back here and ask again if something is not working out at that stage.

--
Dag Sverre

From stefan_ml at behnel.de Fri Nov 28 19:29:36 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 28 Nov 2008 19:29:36 +0100
Subject: [Cython] Temp allocation flow
Message-ID: <49303890.7040501@behnel.de>

Hi,

to fix ticket 124, I tried migrating the for-in loop to new-style temps.
The problem I ran into is that the current support for temp (de-)allocation
in NewTempExprNode during code generation is not enough to get this to
work. Some temps are used longer, some are used shorter than the current
automatic granularity at a node boundary.

Example:

    for a,b in some_list:
        pass

needs a temp for retrieving the next item, unpacks it into two target
temps, frees the item temp and then assigns them to a and b. The problem is
that we a) pass temps across multiple nodes here, and b) currently generate
separate code paths for the tuple unpacking, so that the item temp disposal
code is also generated twice. That prevents the rhs from just releasing the
temp on a call to generate_disposal_code().
I think it would be best to keep temp allocation in the code where it needs to happen, but to make the deallocation explicit and callable from outside. This means that all nodes would need to call a "free_temps()" method on their subexpressions when they are done generating code for them. Any objections or better ideas? Stefan From robertwb at math.washington.edu Fri Nov 28 19:30:41 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 28 Nov 2008 10:30:41 -0800 Subject: [Cython] Can I use Cython for iPhone application development? In-Reply-To: <493039BA.8040805@student.matnat.uio.no> References: <5801-43011@sneakemail.com> <493039BA.8040805@student.matnat.uio.no> Message-ID: On Nov 28, 2008, at 10:34 AM, Dag Sverre Seljebotn wrote: > {} wrote: >> Hi. I'm completely new to Cython. >> >> As I understand, I can call Cython code from C, as it's said in >> the FAQ. >> >> Therefore, it would be possible to call Cython code from Objective- >> C. So I can develop an iPhone application written mostly in >> Cython, and then convert it to Objective-C code, right? Or am I >> missing anything? > > Cython is not a pure Python -> C translator, instead it genereates > C/Python extension modules which depends on a running Python instance > (the Python version from python.org, not Jython or IronPython or > similar). > > From the perspective of making Cython run on the iPhone, it is > exactly > the same as making C extensions for Python run on the iPhone. > > Cython code cannot run standalone. Note that one can embed the Python interpreter to get a standalone executable. > >> How? How would you write a simple Hello World Cython iPhone >> application? > > You should seek out iPhone development forums and ask again there. You > need to be able to run the official Python (from python.org), and > compile and install a custom C extension modules yourself. Then come > back here and ask again if something is not working out at that stage. 
Yes, I'm sure they'd know a lot more about this than any of us do :). Please report back any success. - Robert From stefan_ml at behnel.de Fri Nov 28 19:43:47 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 19:43:47 +0100 Subject: [Cython] Can I use Cython for iPhone application development? In-Reply-To: <493039BA.8040805@student.matnat.uio.no> References: <5801-43011@sneakemail.com> <493039BA.8040805@student.matnat.uio.no> Message-ID: <49303BE3.4040100@behnel.de> Hi, Dag Sverre Seljebotn wrote: > C/Python extension modules which depends on a running Python instance > (the Python version from python.org, not Jython or IronPython or similar). IronPython has Ironclad, which allows it to run a couple of CPython extensions already, namely bz2 and (most of) NumPy. I wonder how much would be missing to get Cython code running on IronPython that way. http://code.google.com/p/ironclad/ They promote it as a MSWindows-only thing, though, so unless someone gets it to run on Mono, my personal interest will stay at -0. Stefan From dagss at student.matnat.uio.no Fri Nov 28 20:22:44 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 20:22:44 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49303890.7040501@behnel.de> References: <49303890.7040501@behnel.de> Message-ID: <49304504.7040004@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > to fix ticket 124, I tried migrating the for-in loop to new-style temps. > The problem I ran into is that the current support for temp (de-)allocation > in NewTempExprNode during code generation is not enough to get this to > work. Some temps are used longer, some are used shorter than the current > automatic granularity at a node boundary. > > Example: > > for a,b in some_list: > pass > > needs a temp for retrieving the next item, unpacks it into two target > temps, frees the item temp and then assigns them to a and b. 
The problem is > that we a) pass temps across multiple nodes here, and b) currently generate > separate code paths for the tuple unpacking, so that the item temp disposal > code is also generated twice. That prevents the rhs from just releasing the > temp on a call to generate_disposal_code(). I must admit I need a little help here. At what point is it the new temps differ from the old temps? I'd prefer, if possible, to stay as close as possible to the old scheme when it comes to flow, so that one doesn't have to break everything and build it up again. I.e. try to make NewTempExprNode emulate "whatever it is" pure ExprNode with is_temp==True does. > I think it would be best to keep temp allocation in the code where it needs > to happen, but to make the deallocation explicit and callable from outside. > This means that all nodes would need to call a "free_temps()" method on > their subexpressions when they are done generating code for them. If one decides to change the temp allocation flow altogether, I have another proposal: Get rid of the idea of a "temporary expression node" altogether. The fact that an expression node must itself free its temp (which it cannot possibly itself know how long should be held on to) always struck me as messy. So: - Each node still flags whether it needs a result variable (i.e. the is_temp of today, or rather needs_target as I'll call it from here on). - Before the parent calls subexpr.generate_result_code, it has to check subexpr.needs_target. If True, the parent must call subexpr.set_target(some_cname) - If the parent doesn't have a variable handy to put the result in, it needs to allocate a temp, hand it to the child for it to store its result in, and finally release the temp when the result is no longer needed. If I'm not totally mistaken, this will both be clearer and might have the potential to remove some unecesarry INCREFs/DECREFs that's there today. 
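The three rules above can be illustrated with a toy model. This is only a sketch under stated assumptions: `TempPool`, `Name`, `Call` and `generate_assignment` are hypothetical stand-ins, not classes from Cython's ExprNodes.py, and reference counting and error handling are ignored entirely. The parent checks `needs_target`, allocates a temp when it has no variable handy, hands it to the child with `set_target()`, and releases it once the result has been consumed.

```python
class TempPool:
    """Hypothetical temp manager: hands out C temp names, reuses released ones."""

    def __init__(self):
        self.free = []
        self.count = 0  # number of distinct temps ever created

    def allocate(self):
        if self.free:
            return self.free.pop()
        self.count += 1
        return "__tmp%d" % self.count

    def release(self, name):
        self.free.append(name)


class Name:
    """A plain variable: its result is already in a C variable."""
    needs_target = False

    def __init__(self, cname):
        self.cname = cname

    def result(self):
        return self.cname

    def generate(self, pool, out):
        pass  # nothing to evaluate


class Call:
    """A call: it needs somewhere to store its result."""
    needs_target = True

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.target = None

    def set_target(self, cname):
        self.target = cname

    def result(self):
        return self.target

    def generate(self, pool, out):
        held = []
        for arg in self.args:
            if arg.needs_target:
                # The parent owns the temp for the child's result.
                tmp = pool.allocate()
                arg.set_target(tmp)
                held.append(tmp)
            arg.generate(pool, out)
        args = ", ".join(a.result() for a in self.args)
        out.append("%s = %s(%s);" % (self.target, self.func, args))
        for tmp in held:
            pool.release(tmp)  # argument results are no longer needed


def generate_assignment(lhs_cname, rhs, pool, out):
    """Toy assignment: always goes through a temp when the rhs needs a target."""
    if rhs.needs_target:
        tmp = pool.allocate()
        rhs.set_target(tmp)
        rhs.generate(pool, out)
        out.append("%s = %s;" % (lhs_cname, tmp))
        pool.release(tmp)
    else:
        out.append("%s = %s;" % (lhs_cname, rhs.result()))


pool = TempPool()
lines = []
generate_assignment("a", Call("g", Call("f"), Name("a")), pool, lines)
print("\n".join(lines))
```

For `a = g(f(), a)` this toy emits `__tmp2 = f();`, `__tmp1 = g(__tmp2, a);` and `a = __tmp1;`. The point of the design is that the lifetime of each temp is visible in the parent that allocated it, rather than being decided by the child.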
Example code: a = g(f(), a) would then have this flow: - The NameNode is a lhs, so generate_assignment_code is changed so that its target cname is returned to the SimpleAssignmentNode. - The SimpleAssignmentNode notes that self.rhs.needs_target==True, and so it calls "self.rhs.set_target(lhs_cname)", and recurses into rhs. (where lhs_cname is the variable name of a in C source). - "g", in turn, notices that the call to f() as needs_target. As g's purposes for f() isn't to store it in a variable, but just use the result, g needs to allocate a temporary and call set_target on f. However, for "a" (a NameNode) the needs_target is set to False, so g simply uses the result_code directly. -- Dag Sverre From dagss at student.matnat.uio.no Fri Nov 28 20:32:27 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 20:32:27 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49304504.7040004@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> Message-ID: <4930474B.9070406@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > If one decides to change the temp allocation flow altogether, I have > another proposal: *And*, if that is done, there might be a lot of monotone but crucial work (tear it down and build it up), so we better have a coordinated coding session with IRC chat. I can put in two or three hours tomorrow. I am a bit scared of the prospect of implementing my own proposal though. Before changing the entire flow we need to schedule developer time so it doesn't get out of hand (Robert has some scary stories to tell about the SAGE "coercion" remake :-) arguably this change is isolated to 4000 lines in ExprNodes.py). The drawback with my proposal is that it doesn't seem easy to be backwards compatible, it needs tearing out and rebuilding stuff. Then again, without all the restrictions that are there on the current temp system (i.e. 
must allocate during analysis, trying to reuse temps in ways that should be much easier in a parent-manages-temp scheme) the problem seems to become a lot smaller than the problem the current complicated code has to solve. (I'm putting the string constant stuff on hold, this is more important.) -- Dag Sverre From stefan_ml at behnel.de Fri Nov 28 20:34:26 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 20:34:26 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49304504.7040004@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> Message-ID: <493047C2.9070309@behnel.de> Hi, Dag Sverre Seljebotn wrote: > I must admit I need a little help here. At what point is it the new > temps differ from the old temps? They differ in that there is not currently a dedicated flow for temp allocation/deallocation between nodes at code generation time like there was one at pre-allocation time for the old temps. > another proposal: > > Get rid of the idea of a "temporary expression node" altogether. Yep, that doesn't lead anywhere. > - Each node still flags whether it needs a result variable (i.e. the > is_temp of today, or rather needs_target as I'll call it from here on). I never really understood what "is_temp" was supposed to mean, but "needs_target" sounds clear enough. > - Before the parent calls subexpr.generate_result_code, it has to check > subexpr.needs_target. If True, the parent must call > subexpr.set_target(some_cname) > > - If the parent doesn't have a variable handy to put the result in, it > needs to allocate a temp, hand it to the child for it to store its > result in, and finally release the temp when the result is no longer needed. That sounds like part of this could be automated with a good default. Sounds like a good proposal to me. 
Stefan From stefan_ml at behnel.de Fri Nov 28 20:51:12 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 28 Nov 2008 20:51:12 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <4930474B.9070406@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <4930474B.9070406@student.matnat.uio.no> Message-ID: <49304BB0.8060106@behnel.de> Hi, Dag Sverre Seljebotn wrote: > *And*, if that is done, there might be a lot of monotone but crucial > work (tear it down and build it up), so we better have a coordinated > coding session with IRC chat. I can put in two or three hours tomorrow. It will certainly take longer. I actually don't think there is any good default here that ExprNode() could implement, as it now depends on the subexpression *if* a temp is needed and on the node itself *when* it needs to be allocated and deallocated. We'll basically have to re-implement the entire flow that is there inside the analyse_expressions() methods, but move it to the code generation. It will be easier than doing these things in analyse_expressions(), but it certainly needs time to get it right again. Above all, the infrastructure must be at least somewhat clear before hand. Otherwise, we'll end up doing everything over and over again. > I am a bit scared of the prospect of implementing my own proposal > though. Before changing the entire flow we need to schedule developer > time so it doesn't get out of hand (Robert has some scary stories to > tell about the SAGE "coercion" remake :-) arguably this change is > isolated to 4000 lines in ExprNodes.py). + another couple of hundred lines in Nodes.py, e.g. for the ForInStatNode that started this. > The drawback with my proposal is that it doesn't seem easy to be > backwards compatible, it needs tearing out and rebuilding stuff. Then > again, without all the restrictions that are there on the current temp > system (i.e. 
must allocate during analysis, trying to reuse temps in > ways that should be much easier in a parent-manages-temp scheme) the > problem seems to become a lot smaller than the problem the current > complicated code has to solve. It's absolutely worth it, and it must be done completely. Stopping half-way means breaking everything. Stefan From dagss at student.matnat.uio.no Fri Nov 28 21:09:22 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 28 Nov 2008 21:09:22 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49304BB0.8060106@behnel.de> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <4930474B.9070406@student.matnat.uio.no> <49304BB0.8060106@behnel.de> Message-ID: <49304FF2.6040905@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > Dag Sverre Seljebotn wrote: >> *And*, if that is done, there might be a lot of monotone but crucial >> work (tear it down and build it up), so we better have a coordinated >> coding session with IRC chat. I can put in two or three hours tomorrow. > > It will certainly take longer. I actually don't think there is any good > default here that ExprNode() could implement, as it now depends on the > subexpression *if* a temp is needed and on the node itself *when* it needs > to be allocated and deallocated. > > We'll basically have to re-implement the entire flow that is there inside > the analyse_expressions() methods, but move it to the code generation. It > will be easier than doing these things in analyse_expressions(), but it > certainly needs time to get it right again. When it comes to implementing a sensible default: is_temp is already set in analyse_expressions and can just continue to be set there (so default needs_target is is_temp). ExprNode could by default iterate through all subexprs, and if needs_target is set, allocate a temp for the result. 
Then run the code generation of the node (which will assume that the result can be copied from subexpr.result()), and finally release all the temps. Then a lot more temps than needed are allocated, but it should produce working code. From there on, the extra work put in is about removing redundant temps generated on a node-by-node case. -- Dag Sverre From dalcinl at gmail.com Sat Nov 29 03:01:45 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 28 Nov 2008 23:01:45 -0300 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <49302C71.5010809@student.matnat.uio.no> References: <493024AF.50602@student.matnat.uio.no> <4930264E.9040706@behnel.de> <49302C71.5010809@student.matnat.uio.no> Message-ID: On Fri, Nov 28, 2008 at 2:37 PM, Dag Sverre Seljebotn wrote: > > char* s = "one one one"; > s[2] = 'Z'; > printf("%s\n", s); > > At least on my gcc, I had two types of behaviour: > NO NO NO!!! NEVER DO THAT!!! In C/C++ string literals should NEVER be modified!!! The actual contents of the string are placed in special sections of your executable/library, and such memory places are readonly!!!! So you hopefully get a segfault, or you just corrupted memory and generated a extremely hard to find bug!!!. In short, C string literals should be always bounded to 'const char*' pointers, and explicit cast to (char*) have to be done with great care, just because the some C API (like older Python ones, and still some calls in Py3k) do not correctly declare arguments as "const char*". Because the danger of modifying the contents of string literals, some time ago I pushed a fix in cython-devel letting Cython-generated code to be happy with -Wwrite-strings flag to GCC, explicit casts to (char*) are done only when extrictly needed. 
Regarding compiler optimizations, it seems that GCC is smart enough:

[dalcinl at botafogo tmp]$ cat strlits.c
const char *a = "abcabc";
const char *b = "abcabc";
const char *c = "abcabc";

void f(const char* tmp) { }

int main() {
  f("xyzxyz");
  f("xyzxyz");
  f("xyzxyz");
}

[dalcinl at botafogo tmp]$ gcc strlits.c
[dalcinl at botafogo tmp]$ strings a.out
/lib/ld-linux.so.2
__gmon_start__
libc.so.6
_IO_stdin_used
__libc_start_main
GLIBC_2.0
PTRh
[^_]
abcabc
xyzxyz

As you can see, it seems the literals are only emitted once (unless the
strings command does not show dupes, I do not know)

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From michael.abshoff at googlemail.com Sat Nov 29 03:07:22 2008
From: michael.abshoff at googlemail.com (Michael Abshoff)
Date: Fri, 28 Nov 2008 18:07:22 -0800
Subject: [Cython] Pass char literals straight to C?
In-Reply-To:
References: <493024AF.50602@student.matnat.uio.no> <4930264E.9040706@behnel.de> <49302C71.5010809@student.matnat.uio.no>
Message-ID: <4930A3DA.9050705@gmail.com>

Lisandro Dalcin wrote:
> On Fri, Nov 28, 2008 at 2:37 PM, Dag Sverre Seljebotn

Hi,

> As you can see, it seems the literals are only emitted once (unless the
> strings command does not show dupes, I do not know)
>

strings will show duplicates - there is no sorting or anything like that
done.

Cheers,

Michael

From dalcinl at gmail.com Sat Nov 29 03:17:32 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 28 Nov 2008 23:17:32 -0300
Subject: [Cython] Pass char literals straight to C?
In-Reply-To: <4930A3DA.9050705@gmail.com>
References: <493024AF.50602@student.matnat.uio.no> <4930264E.9040706@behnel.de> <49302C71.5010809@student.matnat.uio.no> <4930A3DA.9050705@gmail.com>
Message-ID:

Thanks for the info. So it seems GCC (at least my 4.3.0) is smart enough
and does not emit multiple entries for identical string literals...
Perhaps the optimization in Cython could be removed...

On Fri, Nov 28, 2008 at 11:07 PM, Michael Abshoff wrote:
> Lisandro Dalcin wrote:
>> On Fri, Nov 28, 2008 at 2:37 PM, Dag Sverre Seljebotn
>
> Hi,
>
>> As you can see, it seems the literals are only emitted once (unless the
>> strings command does not show dupes, I do not know)
>>
>
> strings will show duplicates - there is no sorting or anything like that
> done.
>
> Cheers,
>
> Michael

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Sat Nov 29 07:37:36 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Nov 2008 19:37:36 +1300
Subject: [Cython] Pass char literals straight to C?
In-Reply-To: <493024AF.50602@student.matnat.uio.no>
References: <493024AF.50602@student.matnat.uio.no>
Message-ID: <4930E330.3030601@canterbury.ac.nz>

Dag Sverre Seljebotn wrote:
> Can anyone think of a reason why C string literals are allocated as
> variables in C source?

I can't remember all the reasons I did it that way in Pyrex,
but one of them may have been so that I could leave calculating
the length of the string to the C compiler. It's not entirely
trivial to do that when escape sequences are involved.
Another reason is so that I can refer to the C version of the string in the generated code concisely, instead of having to insert the whole string literal at that point, making it easier to audit the generated code. I don't think there should be any great difficulty in moving all the bookkeeping to code generation time, though, since you already seem to have a mechanism for out-of-order code generation. -- Greg From greg.ewing at canterbury.ac.nz Sat Nov 29 09:05:39 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Nov 2008 21:05:39 +1300 Subject: [Cython] Temp allocation flow In-Reply-To: <49304504.7040004@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> Message-ID: <4930F7D3.3040906@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > a = g(f(), a) > > would then have this flow: > > - The NameNode is a lhs, so generate_assignment_code is changed so that > its target cname is returned to the SimpleAssignmentNode. Be careful here -- you can't just stuff the result straight into the lhs. You need to evaluate the rhs, make sure you have a new reference to it, decref the lhs, and then do the assignment, in that order. So you need a temp for the rhs anyway, if it's anything other than a bare name. I can't really see what you're expecting to gain from this change, over and above what you'll get simply by moving temp allocation to code generation time. At best you'll move some of the generic code from one place to another; at worst you'll end up duplicating it in many parent nodes. The only possible benefit I can think of is that it might eliminate the need for CloneNodes. But I'm not so sure about that -- you'll still need some way of differentiating between the "owner" of a node and nodes that simply use its value, so you don't end up trying to generate evaluation code for it more than once. 
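The assignment ordering described earlier in this message (evaluate the rhs, hold a new reference, decref the lhs, then assign) can be sketched as a tiny code emitter. This is a minimal illustration, not Pyrex or Cython code: `emit_assignment` and the temp name are hypothetical, and the error checks a real generator would emit after the rhs evaluation are left out.

```python
def emit_assignment(lhs, rhs_expr, rhs_is_new_reference, tmp="__tmp1"):
    """Return C statements for `lhs = rhs_expr`, in the order described above.

    Hypothetical helper for illustration only; error checking is omitted.
    """
    lines = []
    # 1. Evaluate the rhs into a temp first, so that if evaluation fails
    #    (in real generated code) the old value of lhs is still intact.
    lines.append("%s = %s;" % (tmp, rhs_expr))
    # 2. Make sure we own a reference to the new value.
    if not rhs_is_new_reference:
        lines.append("Py_INCREF(%s);" % tmp)
    # 3. Only now drop the reference held by the old lhs value.
    lines.append("Py_XDECREF(%s);" % lhs)
    # 4. Finally perform the assignment.
    lines.append("%s = %s;" % (lhs, tmp))
    return lines


# A call returns a new reference, so no extra INCREF is emitted.
call_lines = emit_assignment("a", "g(f(), a)", rhs_is_new_reference=True)
# A borrowed value (e.g. another variable) must be INCREF'ed first.
name_lines = emit_assignment("a", "b", rhs_is_new_reference=False)
print("\n".join(call_lines))
print("\n".join(name_lines))
```

Note that in `a = g(f(), a)` the old value of `a` is still being read while the rhs is evaluated, which is why the decref cannot be moved ahead of step 1.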
Seems to me that things will already be a lot simpler and clearer once all temp handling is moved to code generation time. Are you sure that won't be enough? -- Greg From greg.ewing at canterbury.ac.nz Sat Nov 29 09:18:27 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Nov 2008 21:18:27 +1300 Subject: [Cython] Temp allocation flow In-Reply-To: <493047C2.9070309@behnel.de> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <493047C2.9070309@behnel.de> Message-ID: <4930FAD3.7000209@canterbury.ac.nz> Stefan Behnel wrote: > I never really understood what "is_temp" was supposed to mean It simply means that the value of the expression is in (or needs to be in) a temporary variable. I'm not sure where you got the idea of a "temporary expression" from -- it's not a concept I had in mind when I was designing Pyrex. There are just expressions, some of which happen to put their results into temp vars. Whether a given expression node does so or not is an implementation detail of that node. Most of the time, other nodes don't need to be aware of it. -- Greg From dagss at student.matnat.uio.no Sat Nov 29 10:00:39 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 29 Nov 2008 10:00:39 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <4930F7D3.3040906@canterbury.ac.nz> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <4930F7D3.3040906@canterbury.ac.nz> Message-ID: <493104B7.3000501@student.matnat.uio.no> Thanks a lot for chiming in -- there's always the danger of running into the same dangers (that we don't see) that you already avoided. Greg Ewing wrote: > Be careful here -- you can't just stuff the result > straight into the lhs. You need to evaluate the rhs, > make sure you have a new reference to it, decref the > lhs, and then do the assignment, in that order. So > you need a temp for the rhs anyway, if it's anything > other than a bare name. Good point. 
It means that my scheme would likely also need "node.can_raise_exception" or similar, if one wants to optimize when the rhs is a NameNode. (If the rhs is a pure C statement it is also OK to decref the lhs before evaluation of rhs, but not sure if I'd bother with it, one can always be lazy and let can_raise_exception stay at a default True). > I can't really see what you're expecting to gain from > this change, over and above what you'll get simply by > moving temp allocation to code generation time. At best > you'll move some of the generic code from one place to > another; at worst you'll end up duplicating it in many > parent nodes. First of all, if I can get temp allocation moved to code generation time I am mostly happy, no matter how it is done. But Stefan noted some problems in doing that, and I wondered if this approach would perhaps make that move quicker and easier. Wondered, I'm not sure yet. However I also feel it might in general fit better within that framework (temps during code generation), where temps are looked at more as registers that are allocated and deallocated while outputting code statements. Case study: The snippet below (and I must say I am really thankful for the extensive comments) def allocate_temps(self, env, result_code = None): # We only ever evaluate one side, and this is # after evaluating the truth value, so we may # use an allocation strategy here which results in # this node and both its operands sharing the same # result variable. This allows us to avoid some # assignments and increfs/decrefs that would otherwise # be necessary. self.allocate_temp(env, result_code) self.test.allocate_temps(env, result_code) self.true_val.allocate_temps(env, self.result()) self.false_val.allocate_temps(env, self.result()) # We haven't called release_temp on either value, # because although they are temp nodes, they don't own # their result variable. 
And because they are temp # nodes, any temps in their subnodes will have been # released before their allocate_temps returned. # Therefore, they contain no temp vars that need to # be released. would be removed, and instead it would be put into the code generation flow itself (! marks new lines) def generate_evaluation_code(self, code): ! tmp = code.funcstate.allocate_temp(...etc...) ! self.test.set_target(tmp) self.test.generate_evaluation_code(code) code.putln("if (%s) {" % tmp ) ! self.true_val.set_target(self.target) self.true_val.generate_evaluation_code(code) code.putln("} else {") ! self.false_val.set_target(self.target) self.false_val.generate_evaluation_code(code) code.putln("}") self.test.generate_disposal_code(code) ! code.funcstate.release_temp(tmp) I find the latter (interleave the temps) more easy to write than having a seperate phase, but that could be a matter of taste. Without doing this, and *only* move the temp allocation phase to code generation time, the former code (the seperate allocate_temp phase) is still needed and in fact many of the benefits with temps at generation time disappear. > The only possible benefit I can think of is that it > might eliminate the need for CloneNodes. But I'm not > so sure about that -- you'll still need some way of > differentiating between the "owner" of a node and > nodes that simply use its value, so you don't end up > trying to generate evaluation code for it more than > once. CloneNodes-situations should now usually be dealt with by using TempsBlockNode instead. But of course, one wouldn't like to have to do everything -- rebuilding everything is a real danger and I haven't finally settled on +1-ing my own approach yet, but I want to investigate it. -- Dag Sverre From stefan_ml at behnel.de Sat Nov 29 12:04:14 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 29 Nov 2008 12:04:14 +0100 Subject: [Cython] Pass char literals straight to C? 
In-Reply-To: <4930E330.3030601@canterbury.ac.nz> References: <493024AF.50602@student.matnat.uio.no> <4930E330.3030601@canterbury.ac.nz> Message-ID: <493121AE.2020709@behnel.de> Hi, Greg Ewing wrote: > Dag Sverre Seljebotn wrote: > >> Can anyone think of a reason why C string literals are allocated as >> variables in C source? > > I can't remember all the reasons I did it that way in Pyrex, > but one of them may have been so that I could leave calculating > the length of the string to the C compiler. It's not entirely > trivial to do that when escape sequences are involved. That is very true. I put a whole lot of work into proper string escaping to make sure byte strings end up in the binary the way they were written in the source. That usually makes non-ASCII strings a bit longer than their pure byte sequence. > Another reason is so that I can refer to the C version of the > string in the generated code concisely, instead of having to > insert the whole string literal at that point, making it > easier to audit the generated code. It's also more readable, as some important information such as identifier state or unicode type are currently stored in the string tab behind the string literal. Although one could argue that it's trivial to move it before the literal... All of this makes me think that it's not a bad idea to keep them separate in the code. Stefan From dagss at student.matnat.uio.no Sat Nov 29 13:05:03 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 29 Nov 2008 13:05:03 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49304504.7040004@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> Message-ID: <49312FEF.2020606@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > So: > - Each node still flags whether it needs a result variable (i.e. the > is_temp of today, or rather needs_target as I'll call it from here on). 
> > - Before the parent calls subexpr.generate_result_code, it has to check > subexpr.needs_target. If True, the parent must call > subexpr.set_target(some_cname) > > - If the parent doesn't have a variable handy to put the result in, it > needs to allocate a temp, hand it to the child for it to store its > result in, and finally release the temp when the result is no longer needed. OK, I've played with it a bit. Here's a couple of extra points. Summary: I'm not satisfied with my proposal, and Stefan's original approach might be better. With e.g. x = ((((a + b) + c) + d) + e), and also the reverse nesting, the theoretical optimum is 1 temporary. My approach by default makes O(depth) temps, because each parent makes a new temp; while the old approach achieves 2 = O(1). Now, this was easily fixed by a manual override in BinopNode, so that the result variable (which is known to be temp) is used as a temporary inside the calculation. This got the number of needed temps down to 1. Still, that is kind of a hack (and could be broken if types were not the same etc., though I'm not sure if that ever happens). In general, it looks better to have the child allocate the temp at the place where it is needed. -- Dag Sverre From dagss at student.matnat.uio.no Sat Nov 29 13:39:26 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 29 Nov 2008 13:39:26 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49312FEF.2020606@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> Message-ID: <493137FE.9010407@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Dag Sverre Seljebotn wrote: >> So: >> - Each node still flags whether it needs a result variable (i.e. the >> is_temp of today, or rather needs_target as I'll call it from here on). >> >> - Before the parent calls subexpr.generate_result_code, it has to check >> subexpr.needs_target. 
If True, the parent must call >> subexpr.set_target(some_cname) >> >> - If the parent doesn't have a variable handy to put the result in, it >> needs to allocate a temp, hand it to the child for it to store its >> result in, and finally release the temp when the result is no longer needed. > > OK, I've played with it a bit. Here's a couple of extra points. Summary: > I'm not satisfied with my proposal, and Stefan's original approach might > be better. > > With e.g. x = ((((a + b) + c) + d) + e), and also the reverse nesting, > the theoretical optimum is 1 temporary. My approach by default makes > O(depth) temps, because each parent makes a new temp; while the old > approach achieves 2 = O(1). A way to get around this would be to split generate_result_code. So mainly my proposal, but with this modification: self.subexpr.preparation_code(code) tmp = code.funcstate.allocate_temp(...) self.subexpr.store_result_code(tmp, code) # do stuff # perhaps DECREF, the parent will know this and no # workarounds for when DECREF shouldn't happen is needed code.funcstate.release_temp(tmp) This leaves the flow of temps in control of the parent. So currently there is generate_result_code, generate_disposal_code which I propose replaced with preparation_code, store_result_code (generate_result_code mostly maps directly to store_result_code except that more efficient temp use can be had if one splits it. What generate_disposal_code does is moved to the parent -- I didn't find a situation where disposal_code does anything beyond temp handling). 
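A toy model of the split described above, for the nested-sum case. Everything here is an assumption made for illustration: `TempPool`, `Name`, `Add` and `nested_sum` are hypothetical, only the method names `preparation_code` and `store_result_code` come from the proposal, and types and reference counting are ignored. The parent first lets the child evaluate its own operands, and only then allocates the temp that will receive the child's result, so the child's operand temps can be recycled one level up.

```python
class TempPool:
    """Hypothetical temp manager: hands out C temp names, reuses released ones."""

    def __init__(self):
        self.free = []
        self.count = 0  # number of distinct temps ever created

    def allocate(self):
        if self.free:
            return self.free.pop()
        self.count += 1
        return "__tmp%d" % self.count

    def release(self, name):
        self.free.append(name)


class Name:
    needs_target = False

    def __init__(self, cname):
        self.cname = cname


class Add:
    needs_target = True

    def __init__(self, left, right):
        self.operands = [left, right]
        self.results = []  # C expressions holding the operand values
        self.held = []     # temps this node allocated for its operands

    def preparation_code(self, pool, out):
        """Evaluate the operands; the node's own result is not stored yet."""
        self.results, self.held = [], []
        for operand in self.operands:
            if operand.needs_target:
                operand.preparation_code(pool, out)
                # Allocate the target only after the operand is prepared.
                tmp = pool.allocate()
                operand.store_result_code(tmp, pool, out)
                self.held.append(tmp)
                self.results.append(tmp)
            else:
                self.results.append(operand.cname)

    def store_result_code(self, target, pool, out):
        """Store the result in `target`, then give back the operand temps."""
        out.append("%s = %s + %s;" % (target, self.results[0], self.results[1]))
        for tmp in self.held:
            pool.release(tmp)


def nested_sum(names):
    """Build ((((a + b) + c) + d) + e) from 'abcde'."""
    expr = Name(names[0])
    for name in names[1:]:
        expr = Add(expr, Name(name))
    return expr


pool = TempPool()
lines = []
expr = nested_sum("abcde")
expr.preparation_code(pool, lines)
expr.store_result_code("x", pool, lines)
print("\n".join(lines))
print("distinct temps:", pool.count)
```

In this toy the left-nested sum needs two distinct temps regardless of depth, instead of one per nesting level as in the parent-allocates-first variant. It does not reach the theoretical optimum of one, since the operand temp is still held at the moment the target temp is allocated.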
-- Dag Sverre From stefan_ml at behnel.de Sat Nov 29 14:09:51 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 29 Nov 2008 14:09:51 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <493137FE.9010407@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> Message-ID: <49313F1F.2070202@behnel.de> Dag Sverre Seljebotn wrote: > A way to get around this would be to split generate_result_code. So > mainly my proposal, but with this modification: > > self.subexpr.preparation_code(code) > tmp = code.funcstate.allocate_temp(...) > self.subexpr.store_result_code(tmp, code) That's something I thought about, too. Most of the time, within an expression, all you really want is to say "do whatever it takes to calculate the result, and then put it *here*", where 'here' may be a temp or a name. How would the DECREF handling work for the target in that case? If it's a temp, I'd expect it to be empty in any case, but if it's a variable name, it needs a DECREF-after-INCREF. Or would the parent always hand in a temp and always handle variables itself? If we call this directly, we could also pass keyword arguments like "target_needs_decref" and "make_owned_reference". Stefan From dagss at student.matnat.uio.no Sat Nov 29 14:58:00 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 29 Nov 2008 14:58:00 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49313F1F.2070202@behnel.de> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> Message-ID: <49314A68.5000200@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: >> A way to get around this would be to split generate_result_code. 
So
>> mainly my proposal, but with this modification:
>>
>> self.subexpr.preparation_code(code)
>> tmp = code.funcstate.allocate_temp(...)
>> self.subexpr.store_result_code(tmp, code)
>
> That's something I thought about, too. Most of the time, within an
> expression, all you really want is to say "do whatever it takes to
> calculate the result, and then put it *here*", where 'here' may be a temp
> or a name.
>
> How would the DECREF handling work for the target in that case? If it's a
> temp, I'd expect it to be empty in any case, but if it's a variable name,
> it needs a DECREF-after-INCREF. Or would the parent always hand in a temp
> and always handle variables itself?
>
> If we call this directly, we could also pass keyword arguments like
> "target_needs_decref" and "make_owned_reference".

I don't like those keywords as such, responsibility should be in parent.
I think a can_raise_exception flag on the node gets us a long way.

I'll have to give another example to explain what I'm thinking. Let the
code be:

a = b + (c + d)

(Currently SingleAssignmentNode evaluates rhs and then calls
lhs.generate_assignment_code, I'd change that so that
lhs.generate_assignment_code is responsible for evaluating the rhs.)

Then, when assigning to the lhs NameNode:

def generate_assignment_code(self, rhs, code):
    if not rhs.can_raise_exception:
        # No temp needed
        if self.type.is_pyobject:
            code.put_decref(self.cname, self.type)
        rhs.preparation_code(code)
        rhs.store_result_code(self.cname, code)
        # CONTRACT: store_result_code gives away reference on "result"
    else:
        # If exception is raised, value shouldn't be overwritten
        rhs.preparation_code(code)
        # Above causes rhs to allocate a tmp for the second operand
        # and fully evaluate "tmp1 = c + d". I.e both preparation_code
        # and store_result_code of "c+d" is done now, allowing
        # temps of nested expressions to be reused below.
        tmp = code.funcstate.allocate_temp(...) # returns "tmp2"
        rhs.store_result_code(tmp, code)
        # Above causes rhs to first calculate "tmp2 = b + tmp1",
        # then decref the tmp1 claimed during preparation_code
        code.put_decref(self.cname, self.type)
        code.putln("%s = %s; %s = 0" % (self.cname, tmp, tmp))
        code.funcstate.release_temp(...)

--
Dag Sverre

From dagss at student.matnat.uio.no Sat Nov 29 15:19:30 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 29 Nov 2008 15:19:30 +0100
Subject: [Cython] Temp allocation flow
In-Reply-To: <49314A68.5000200@student.matnat.uio.no>
References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no>
Message-ID: <49314F72.8050307@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> Stefan Behnel wrote:
>> Dag Sverre Seljebotn wrote:
>>> A way to get around this would be to split generate_result_code. So
>>> mainly my proposal, but with this modification:
>>>
>>> self.subexpr.preparation_code(code)
>>> tmp = code.funcstate.allocate_temp(...)
>>> self.subexpr.store_result_code(tmp, code)
>> That's something I thought about, too. Most of the time, within an
>> expression, all you really want is to say "do whatever it takes to
>> calculate the result, and then put it *here*", where 'here' may be a temp
>> or a name.
>>
>> How would the DECREF handling work for the target in that case? If it's a
>> temp, I'd expect it to be empty in any case, but if it's a variable name,
>> it needs a DECREF-after-INCREF. Or would the parent always hand in a temp
>> and always handle variables itself?
>>
>> If we call this directly, we could also pass keyword arguments like
>> "target_needs_decref" and "make_owned_reference".
>
> I don't like those keywords as such, responsibility should be in parent.
> I think a can_raise_exception flag on the node gets us a long way.

Sorry!, I didn't fully realize what you were asking. Partly because I couldn't think of situations where the subexpr would not want to incref, but of course there are plenty of examples of that (like "print x"). So yes, you'd need the "needs_target" variable, and if that is set to False, there would be another protocol. Finally if something both needs a target, but is able to return a non-incref-ed value (cannot think of any examples -- perhaps some buffer code), then I'd prefer for store_result_code to return some status about whether it returns an incref-ed result or non-incref-ed one. So that you'd do something like this for "print x": if self.operand.needs_target: # allocate tmp etc needs_decref = self.operand.store_result_code(tmp, code) # output code to print tmp if needs_decref: # decref tmp # release tmp else: # code to print self.operand.result() directly Perhaps this calls for a wiki page with these examples etc. before any implementation work is done. -- Dag Sverre From robertwb at math.washington.edu Sat Nov 29 20:02:38 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 29 Nov 2008 11:02:38 -0800 Subject: [Cython] Pass char literals straight to C? In-Reply-To: <493121AE.2020709@behnel.de> References: <493024AF.50602@student.matnat.uio.no> <4930E330.3030601@canterbury.ac.nz> <493121AE.2020709@behnel.de> Message-ID: On Nov 29, 2008, at 3:04 AM, Stefan Behnel wrote: > Hi, > > Greg Ewing wrote: >> Dag Sverre Seljebotn wrote: >> >>> Can anyone think of a reason why C string literals are allocated as >>> variables in C source? >> >> I can't remember all the reasons I did it that way in Pyrex, >> but one of them may have been so that I could leave calculating >> the length of the string to the C compiler. It's not entirely >> trivial to do that when escape sequences are involved. > > That is very true. 
I put a whole lot of work into proper string > escaping to > make sure byte strings end up in the binary the way they were > written in > the source. That usually makes non-ASCII strings a bit longer than > their > pure byte sequence. > > >> Another reason is so that I can refer to the C version of the >> string in the generated code concisely, instead of having to >> insert the whole string literal at that point, making it >> easier to audit the generated code. > > It's also more readable, as some important information such as > identifier > state or unicode type are currently stored in the string tab behind > the > string literal. Although one could argue that it's trivial to move it > before the literal... > > All of this makes me think that it's not a bad idea to keep them > separate > in the code. +1, these are enough good reasons for me (aside from "if it ain't broke, don't fix it--we've got plenty of other stuff to fix"). - Robert From robertwb at math.washington.edu Sat Nov 29 20:04:05 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 29 Nov 2008 11:04:05 -0800 Subject: [Cython] pxd locals? In-Reply-To: <492FB8D8.2040708@student.matnat.uio.no> References: <492EE961.2040101@student.matnat.uio.no> <492F01F8.2060507@behnel.de> <492F0724.3060409@student.matnat.uio.no> <492FB8D8.2040708@student.matnat.uio.no> Message-ID: On Nov 28, 2008, at 1:24 AM, Dag Sverre Seljebotn wrote: > Robert Bradshaw wrote: >> On Nov 27, 2008, at 12:46 PM, Dag Sverre Seljebotn wrote: >>> Proposal: New modifier, "proto". I.e. >>> >>> cdef proto int foo(int a, int b): >>> cdef int c >>> cdef int b >>> >>> will embed the signature/variables on a pure >>> >>> def foo(a, b): ... >>> >>> If the keyword proto is not present, the definition is not allowed >>> unless it is declared inline. >>> >>> Pro: >>> - Resolve the conflict. >>> - I think this makes it more obvious what is going on. 
I think >>> having >>> the pxd definition transfer to a def in the pyx/py is a bit too >>> magical >>> using the current syntax anyway, and that this is an improvement, >>> conflict or not. >>> >>> Con: >>> - Breaks compatability with Cython 0.10 (but through compiler >>> errors, no >>> silently changed behaviour). >>> >>> What do you think? I could implement it in roughly five minutes if >>> it is >>> accepted, the function modifier part of the parser is nice and >>> dynamic. >> >> I'm generally -1 on adding new syntax, but what I had wasn't very >> clear either. Would it be enough to accept a locals decorator? > > That's much better; +1. > > Could you do it? You did this so it should be quicker for you. At any > rate this is now http://trac.cython.org/cython_trac/ticket/143 > > Note that this may require extending decorator support (in Parsing.py) > to cdef functions, currently they only support "def" (if you didn't > change that already). They should all probably simply be let > through no > matter what they decorate, and then disallowed again (except for > @cython.locals) in PostParse. Sure, I'll try to get to this (probably next week sometime). - Robert From robertwb at math.washington.edu Sat Nov 29 21:32:02 2008 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 29 Nov 2008 12:32:02 -0800 Subject: [Cython] Temp allocation flow In-Reply-To: <49314F72.8050307@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> <49314F72.8050307@student.matnat.uio.no> Message-ID: <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu> On Nov 29, 2008, at 6:19 AM, Dag Sverre Seljebotn wrote: > Dag Sverre Seljebotn wrote: >> Stefan Behnel wrote: >>> Dag Sverre Seljebotn wrote: >>>> A way to get around this would be to split generate_result_code. 
So >>>> mainly my proposal, but with this modification: >>>> >>>> self.subexpr.preparation_code(code) >>>> tmp = code.funcstate.allocate_temp(...) >>>> self.subexpr.store_result_code(tmp, code) >>> That's something I thought about, too. Most of the time, within an >>> expression, all you really want is to say "do whatever it takes to >>> calculate the result, and then put it *here*", where 'here' may >>> be a temp >>> or a name. >>> >>> How would the DECREF handling work for the target in that case? >>> If it's a >>> temp, I'd expect it to be empty in any case, but if it's a >>> variable name, >>> it needs a DECREF-after-INCREF. Or would the parent always hand >>> in a temp >>> and always handle variables itself? >>> >>> If we call this directly, we could also pass keyword arguments like >>> "target_needs_decref" and "make_owned_reference". >> >> I don't like those keywords as such, responsibility should be in >> parent. >> I think a can_raise_exception flag on the node gets us a long way. > > Sorry!, I didn't fully realize what you were asking. Partly because I > couldn't think of situations where the subexpr would not want to > incref, > but of course there are plenty of examples of that (like "print x"). > > So yes, you'd need the "needs_target" variable, and if that is set to > False, there would be another protocol. > > Finally if something both needs a target, but is able to return a > non-incref-ed value (cannot think of any examples -- perhaps some > buffer > code), Here's an example: "print (foo[i]).c_level_attribute." > then I'd prefer for store_result_code to return some status about > whether it returns an incref-ed result or non-incref-ed one. 
So that
> you'd do something like this for "print x":
>
> if self.operand.needs_target:
>     # allocate tmp etc
>     needs_decref = self.operand.store_result_code(tmp, code)
>     # output code to print tmp
>     if needs_decref:
>         # decref tmp
>     # release tmp
> else:
>     # code to print self.operand.result() directly
>
> Perhaps this calls for a wiki page with these examples etc. before any
> implementation work is done.

I think simply moving everything over to the code generation phase
will be the most beneficial. This will make things much more natural--
when one needs a temp one can request it right there, use it, and
dispose of it when done. The "generate_disposal_code" paradigm makes
sense to me, but perhaps that's because it's what I'm used to. If
it's just used for temps, I agree the name should change. I like the
idea of passing in a target (or None) when asking for the result
code, but don't like requiring the parent to do so. It seems to make
more sense to me to have the Node handle (e.g. decref/cleanup) its
own code. All one should need to handle exceptions/function exit
correctly is for the code object (from which one requests temps) to
know which temps are in use in a given block (which are started/ended
by an explicit call in the try node). Basically, the same framework
should work, just transferring from the symbol table to the code object.

One thing that we need to keep in mind is what optimizations the C
compiler can do, and what it can't. For example, gcc can re-use the
same space on the stack for multiple (independent) temp variables,
but can't optimize away Py_DECREFs, so the latter are more important
for us to handle. They also allow us to do more things without the GIL.

I realize that this is all generalities (rather than getting into the
actual details) but figured I'd throw in my 2 cents now and then
comment again when I actually have time to think about this more.

- Robert

From dagss at student.matnat.uio.no Sat Nov 29 23:42:22 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 29 Nov 2008 23:42:22 +0100
Subject: [Cython] Temp allocation flow
In-Reply-To: <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu>
References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> <49314F72.8050307@student.matnat.uio.no> <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu>
Message-ID: <4931C54E.3030902@student.matnat.uio.no>

Hi all,

I've made progress.

First off, I'm now somewhat confident that incremental steps can be
taken anyway. If I'm right, I think that pretty much puts dead the
discussion about what the optimal solution is for now.

I've boosted NewTempExprNode and fixed ticket #124. Testsuite doesn't
complain, so I've pushed. PrimaryCmpNode has had changes which could
make it leak if everything doesn't work, but it looks ok to me.

Some points:

a) Descendants of NewTempExprNode now have the option to override
generate_evaluation_code. If they do that, they are free to decide how
the temps are allocated, juggled etc. between sub-nodes, in line with
code generation, which was the primary goal of this discussion.

The "free_temps" that Stefan wanted is essentially
"generate_post_assignment_code", from what I can see.

b) Typically, nodes that currently implement generate_evaluation_code
need to be hand-converted to the new scheme, by allocating their result
(for an example, see PrimaryCmpNode in changeset [1]).

c) Nodes that merely override generate_result_code usually convert
without pain.

d) But non-trivial allocate_temps must be dealt with manually too.
(This was the problem with IteratorNode, and I ended up removing a use
of a TempNode to make it work, see [2]).

[1] http://hg.cython.org/cython-devel/rev/c04479fdbe6d
[2] http://hg.cython.org/cython-devel/rev/15459e336874

Some comments for Robert's post:

Robert Bradshaw wrote:
> I think simply moving everything over to the code generation phase
> will be the most beneficial. This will make things much more natural--
> when one needs a temp one can request it right there, use it, and
> dispose of it when done. The "generate_disposal_code" paradigm makes
> sense to me, but perhaps that's because it's what I'm used to. If
> it's just used for temps, I agree the name should change. I like the

In ExprNode it is used only for DECREF-ing any result temp; in
NewTempExprNode it is also used for releasing temps.

> idea of passing in a target (or None) when asking for the result
> code, but don't like requiring the parent to do so. It seems to make
> more sense to me to have the Node handle (e.g. decref/cleanup) its
> own code. All one should need to handle exceptions/function exit
> correctly is for the code object (from which one requests temps) to
> know which temps are in use in a given block (which are started/ended
> by an explicit call in the try node). Basically, the same framework
> should work, just transferring from the symbol table to the code object.

Return statements and except blocks were already fixed by Stefan and
me on Thursday. You now need to pass a new manage_ref=True/False
argument to code.funcstate.allocate_temp; if it is set to True, the
temp will be freed on exceptions and the right returns. (There are
important situations where that shouldn't happen -- use cases that the
old temp system couldn't handle, but which I've used in the buffer code.)

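A rough sketch of the manage_ref idea as described in this message, under stated assumptions: FuncState below is a hypothetical toy, not Cython's class, and error_cleanup_lines is an invented name for "whatever emits the cleanup on exceptions and returns". It only shows that managed temps in use get cleaned up while unmanaged ones are left alone.

```python
# Hypothetical model of a temp pool that remembers which temps it must clean up.

class FuncState:
    def __init__(self):
        self.in_use = {}   # temp name -> manage_ref flag
        self.free = []     # released names available for reuse
        self.count = 0

    def allocate_temp(self, manage_ref):
        if self.free:
            name = self.free.pop()
        else:
            name = "tmp%d" % self.count
            self.count += 1
        self.in_use[name] = manage_ref
        return name

    def release_temp(self, name):
        del self.in_use[name]
        self.free.append(name)

    def error_cleanup_lines(self):
        # Only temps whose reference the pool manages are cleaned up here;
        # temps allocated with manage_ref=False are the caller's business.
        return ["Py_XDECREF(%s);" % name
                for name, managed in sorted(self.in_use.items())
                if managed]
```

Temp types are not modelled here; a real allocator would presumably only reuse a released temp for a request of a compatible type.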
-- Dag Sverre From greg.ewing at canterbury.ac.nz Sun Nov 30 01:18:51 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Nov 2008 13:18:51 +1300 Subject: [Cython] Temp allocation flow In-Reply-To: <49314A68.5000200@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> Message-ID: <4931DBEB.90901@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > def generate_assignment_code(self, rhs, code): > if not rhs.can_raise_exception: > # No temp needed > if self.type.is_pyobject: > code.put_decref(self.cname, self.type) > rhs.preparation_code(code) > rhs.store_result_code(self.cname, code) This isn't safe. A new reference to the rhs must be obtained first in case it's the same as the old value of the lhs, otherwise you risk losing the object. -- Greg From greg.ewing at canterbury.ac.nz Sun Nov 30 01:20:15 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Nov 2008 13:20:15 +1300 Subject: [Cython] Temp allocation flow In-Reply-To: <49312FEF.2020606@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> Message-ID: <4931DC3F.3020909@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > With e.g. x = ((((a + b) + c) + d) + e), and also the reverse nesting, > the theoretical optimum is 1 temporary. What code do you have in mind to evaluate this using only one temp? Seems to me you need two, used alternately. 
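To make the "two, used alternately" pattern concrete, here is a small illustration that prints C-like statements for a left-nested sum of Python objects. The function name emit_left_nested_sum and the temp names t0/t1 are made up for this sketch; NULL checks and the handling of the old value of the target are deliberately omitted.

```python
def emit_left_nested_sum(names, target="x"):
    """Emit C-like statements for ((n0 + n1) + n2) + ... with two alternating temps."""
    if len(names) < 2:
        raise ValueError("need at least two operands")
    temps = ("t0", "t1")
    lines = []
    prev = names[0]
    for i, name in enumerate(names[1:]):
        last = (i == len(names) - 2)
        dest = target if last else temps[i % 2]
        lines.append("%s = PyNumber_Add(%s, %s);" % (dest, prev, name))
        if prev in temps:
            # The previous partial sum may be dropped only after the
            # addition has read it, so it cannot share a slot with dest.
            lines.append("Py_DECREF(%s);" % prev)
        prev = dest
    return lines
```

Each partial sum has to stay alive until the next addition has consumed it, which is why the destination of a step cannot be the temp that holds its left operand.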
-- Greg From greg.ewing at canterbury.ac.nz Sun Nov 30 01:36:44 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Nov 2008 13:36:44 +1300 Subject: [Cython] Temp allocation flow In-Reply-To: <493104B7.3000501@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <4930F7D3.3040906@canterbury.ac.nz> <493104B7.3000501@student.matnat.uio.no> Message-ID: <4931E01C.4040901@canterbury.ac.nz> Dag Sverre Seljebotn wrote: > Without doing this, and *only* move the temp allocation phase to code > generation time, the former code (the seperate allocate_temp phase) is > still needed I don't follow that. The only reason I added a third phase for temp allocation was because it had to be done after inserting coercion nodes. I can't think of any reason it couldn't be fully merged into the code generation pass. Temps can be allocated by generate_evaluation_code and freed by generate_disposal_code. -- Greg From stefan_ml at behnel.de Sun Nov 30 05:57:36 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 30 Nov 2008 05:57:36 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49313F1F.2070202@behnel.de> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> Message-ID: <49321D40.6000602@behnel.de> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: >> A way to get around this would be to split generate_result_code. So >> mainly my proposal, but with this modification: >> >> self.subexpr.preparation_code(code) >> tmp = code.funcstate.allocate_temp(...) >> self.subexpr.store_result_code(tmp, code) > > That's something I thought about, too. Most of the time, within an > expression, all you really want is to say "do whatever it takes to > calculate the result, and then put it *here*", where 'here' may be a temp > or a name. 
How about using a dedicated Target class and writing code like this: target = code.new_temp_target(some_type) # or, when targeting a name rather than a temp target = code.new_target(some_cname, needs_xdecref=True) # or maybe even target = code.new_target(some_entry, needs_xdecref=True) and then you'd call something.generate_result_code(code, target) which would do code.put_incref("source") target.assign_from("source") # or code.put_incref("source") target.assign_from("source", source_type=some_ext_type) and let 'target' do what's needed in terms of casting the source and (x)decrefing the target before the assignment. For disposal, you would call target.free() which would free the temp (if it holds one). I could also imagine writing target.decref_clear() or something in that line. The "code.new_target()" bit could even be a bit smart and decide based on the flow control mechanism if an (x)decref is needed for a name when no needs_decref or needs_xdecref keywords are passed. What do you think? Stefan From stefan_ml at behnel.de Sun Nov 30 07:42:08 2008 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 30 Nov 2008 07:42:08 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <4931C54E.3030902@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> <49314F72.8050307@student.matnat.uio.no> <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu> <4931C54E.3030902@student.matnat.uio.no> Message-ID: <493235C0.2030200@behnel.de> Hi Dag, Dag Sverre Seljebotn wrote: > I've made progress. > > First off, I'm now somewhat confident that incremental steps can be > taken anyway. If I'm right, I think that pretty much puts dead the > discussion about what the optimal solution is for now. > > I've boosted NewTempsExprNode and fixed ticket #124. 
Testsuite doesn't > complain, so I've pushed. Cool. One question, could you tell me what this line in NewTempExprNode.allocate_target_temps() is doing? rhs.release_temp(rhs) Usually, it's either "env.release_temp(rhs)" or "rhs.release_temp(env)" (which I find funny enough...) Stefan From dagss at student.matnat.uio.no Sun Nov 30 10:58:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 30 Nov 2008 10:58:25 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <49321D40.6000602@behnel.de> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49321D40.6000602@behnel.de> Message-ID: <493263C1.6020506@student.matnat.uio.no> Stefan Behnel wrote: > > Cool. > > One question, could you tell me what this line in > NewTempExprNode.allocate_target_temps() is doing? > > rhs.release_temp(rhs) That's a polite way of putting it :-) I suppose it is never called as no LHS-nodes are currently converted. > How about using a dedicated Target class and writing code like this: ... Yes, I like it. Building further on that: There's a couple of possible behaviours in the ExprNode (needs to hand out a temp or not, has an inc-refed result or not), and a couple of possible behaviours in the parent (has target or not, needs to decref target, etc.). So introducing a "mediator" class is not a bad idea so that each side doesn't need to worry about all combinations of cases. It would also possibly be more robust (could be unit tested etc.) *But*, at this stage, when incremental conversion seems possible, I'll resist any such drastic changes, based on the amount of work involved alone. I think there may be cases where the current framework doesn't work, but hopefully one can take a good look at them and introduce new, optional parameters to generate_result_code and/or generate_disposal_code to handle those situations. (I.e. 
"do_not_release_temp" to the latter one might be needed in some situations.) -- Dag Sverre From dagss at student.matnat.uio.no Sun Nov 30 11:36:33 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 30 Nov 2008 11:36:33 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> <49314F72.8050307@student.matnat.uio.no> <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu> Message-ID: <49326CB1.1070300@student.matnat.uio.no> This is a summary post, with a couple of answers to Greg and Robert. You'll notice I've taken your side now, so this hopefully rounds off the discussion, this is just a last explanation round. Especially since this thread will be the canonical thread to look up for such matters in the future. Robert Bradshaw wrote: > > I think simply moving everything over to the code generation phase > will be the most beneficial. This will make things much more natural-- > when one needs a temp one can request it right there, use it, and > dispose of it when done. The "generate_disposal_code" paradigm makes I think the problem with generate_disposal_code in pure principle is that the things it does can't in reality be bundled into a nice disposal package: - There's actually two different disposal_code methods depending on the situation of the parent - Parents of an ExprNode now and then are seen to manually meddle; perhaps check the is_temp status and take action accordingly and so on. - When DECREF is inserted in addition by return stats and except clauses, one could look at it as disposal_code being reimplemented for that particular situation. 
So, I'd prefer a *declarative* scheme (I'm counting Stefan's Target class as such one), where the node exports both its result, and what needs to be done with it afterwards (refcount status/temp status/anything else). This makes the boundaries somewhat harder and the flow a little bit simpler -- leading to easier unit testing, it being more natural to handle things in return stats and except stats., etc. Note that this is philosophical objections to design, it works this way, and I'm not advocating a change now since I got more working. As in, I think it is likely there will never be a change, as long as the rest of the nodes can be gracefully converted. (Keep in mind that there's certain things the allocate_temps phase did, like the "result" parameter and the possibility for the parent to direct the flow of temps in creative ways, for which alternative/similar solutions must be found.) Greg Ewing wrote: > Dag Sverre Seljebotn wrote: > >> def generate_assignment_code(self, rhs, code): >> if not rhs.can_raise_exception: >> # No temp needed >> if self.type.is_pyobject: >> code.put_decref(self.cname, self.type) >> rhs.preparation_code(code) >> rhs.store_result_code(self.cname, code) > > This isn't safe. A new reference to the rhs must be obtained > first in case it's the same as the old value of the lhs, > otherwise you risk losing the object. > Ahh. I'm horrible with refcounting. Certainly makes me humble about rebuilding anything; your experience in such things which is embedded into the current system is a big reason to keep it. Greg Ewing wrote: > Dag Sverre Seljebotn wrote: > >> With e.g. x = ((((a + b) + c) + d) + e), and also the reverse nesting, >> the theoretical optimum is 1 temporary. > > What code do you have in mind to evaluate this using > only one temp? Seems to me you need two, used alternately. > Yes. I definitely thought I'd found a solution with 1, but now I think there must be 2. 
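The assignment-order point conceded earlier in this message can be shown with a toy reference-count model. Everything here (Obj, incref, decref, the two assign functions) is hypothetical and only mimics the ordering issue; it is not how Cython or CPython implement it.

```python
class Obj:
    """Toy refcounted object; 'freed' flips when the count reaches zero."""
    def __init__(self, name):
        self.name, self.refcount, self.freed = name, 1, False


def incref(o):
    if o.freed:
        raise RuntimeError("use after free: %s" % o.name)
    o.refcount += 1


def decref(o):
    o.refcount -= 1
    if o.refcount == 0:
        o.freed = True


def assign_decref_first(slot, rhs):
    # Order from the proposal: drop the old value, then take the new one.
    decref(slot["value"])
    incref(rhs)
    slot["value"] = rhs


def assign_incref_first(slot, rhs):
    # Order Greg asks for: own the new value before dropping the old one.
    incref(rhs)
    old = slot["value"]
    slot["value"] = rhs
    decref(old)
```

When the right-hand side is the very object the variable already holds, and the variable owns its only reference, the decref-first order frees it before the new reference is taken; the incref-first order leaves the count unchanged.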
Greg Ewing wrote: > Dag Sverre Seljebotn wrote: > >> Without doing this, and *only* move the temp allocation phase to code >> generation time, the former code (the seperate allocate_temp phase) is >> still needed > > I don't follow that. The only reason I added a third phase > for temp allocation was because it had to be done after > inserting coercion nodes. I can't think of any reason it > couldn't be fully merged into the code generation pass. > Temps can be allocated by generate_evaluation_code and > freed by generate_disposal_code. > Mainly I misunderstood what you said, and so my answer didn't make sense to you. You'll find that this is more or less exactly what I've done with NewTempExprNode now, and it is what I go for now. -- Dag Sverre From dagss at student.matnat.uio.no Sun Nov 30 12:45:25 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 30 Nov 2008 12:45:25 +0100 Subject: [Cython] Temp allocation flow In-Reply-To: <4931C54E.3030902@student.matnat.uio.no> References: <49303890.7040501@behnel.de> <49304504.7040004@student.matnat.uio.no> <49312FEF.2020606@student.matnat.uio.no> <493137FE.9010407@student.matnat.uio.no> <49313F1F.2070202@behnel.de> <49314A68.5000200@student.matnat.uio.no> <49314F72.8050307@student.matnat.uio.no> <1DA4A5EA-33CF-4833-9A2A-7E5CD07437C9@math.washington.edu> <4931C54E.3030902@student.matnat.uio.no> Message-ID: <49327CD5.5040807@student.matnat.uio.no> Developments: I thought I should announce a change I did to generate_disposal_code. It now takes two param