From sanne at kortec.nl Wed Sep 2 14:34:02 2009
From: sanne at kortec.nl (Sanne Korzec)
Date: Wed, 2 Sep 2009 14:34:02 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
Message-ID: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>

Hi mailing,

I've been writing a complex program in python, which I am currently scaling up. I find myself in the position now, where I run out of memory or out of time. I have been looking at alternatives like cython and ctypes. I implemented ctypes which fixes the memory problem but doubles the time problem.

Currently I am implementing a cython version and ran into a problem. I hope someone can help me out.

The main bottleneck in my code is a large dictionary / hash table which I would like to optimize. Since a dictionary is a python datatype I have no idea how to make this cython.

Currently I have tried to keep the 'keys' intact and store the 'values' as ctypes floats, but I think it might be better to do something else. Do I need to make the entire hash table c? Or is there a more simple solution like combining the python dict with cython? If so, how do I do this?

Thanks in advance.

Additional details: I use a double dict where the key of the first dict stores another dict as value.

S.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20090902/693d614b/attachment.htm

From robertwb at math.washington.edu Wed Sep 2 18:01:43 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Sep 2009 09:01:43 -0700
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
Message-ID: <1ED2DD07-F2F5-4ED7-87B7-CFE40148B96B@math.washington.edu>

On Sep 2, 2009, at 5:34 AM, Sanne Korzec wrote:

> Hi mailing,
>
> I've been writing a complex program in python, which I am currently
> scaling up. I find myself in the position now, where I run out of
> memory or out of time. I have been looking at alternatives like
> cython and ctypes. I implemented ctypes which fixes the memory
> problem but doubles the time problem.
>
> Currently I am implementing a cython version and ran into a
> problem. I hope someone can help me out.
>
> The main bottleneck in my code is a large dictionary / hash table
> which I would like to optimize. Since a dictionary is a python
> datatype I have no idea how to make this cython.
>
> Currently I have tried to keep the 'keys' intact and store the
> 'values' as ctypes floats, but I think it might be better to do
> something else. Do I need to make the entire hash table c? Or is
> there a more simple solution like combining the python dict with
> cython? If so, how do I do this?
>
> Thanks in advance.
>
> Additional details: I use a double dict where the key of the first
> dict stores another dict as value.
>
> S.
>

The short answer is yes, to avoid using the Python dictionary (which can only hold Python objects), you need to write your own hashtable. That's not very hard though--I bet only a hundred or two lines in Cython would be sufficient (and very fast). You could also look into using an external C or C++ library, though C++ support is still a work in progress.
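(Illustration, not from the original post.) The remark that a Python dictionary "can only hold Python objects" is the memory side of the problem. A minimal sketch using only the standard library, where the element count `n` is an arbitrary value chosen for illustration, compares one boxed Python float with one packed C double as stored by the `array` module:

```python
import sys
from array import array

n = 1000  # arbitrary illustration size, not taken from the thread

# A Python float is a full heap object (header plus the 8-byte payload).
boxed_bytes_per_value = sys.getsizeof(1.0)

# array('d') stores raw C doubles back to back, with no per-element object.
packed = array('d', [0.0] * n)
packed_bytes_per_value = packed.itemsize

print("boxed float object:", boxed_bytes_per_value, "bytes")
print("packed C double:   ", packed_bytes_per_value, "bytes")
```

The exact size of the boxed float depends on the interpreter build, so the sketch prints it rather than assuming a number. A dict adds its own per-entry overhead on top of the boxed values.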
- Robert

From Chris.Barker at noaa.gov Wed Sep 2 18:30:40 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 02 Sep 2009 09:30:40 -0700
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
Message-ID: <4A9E9DB0.8060707@noaa.gov>

Sanne Korzec wrote:
> The main bottleneck in my code is a large dictionary / hash table which
> I would like to optimize.

In what way do you need to optimize it? i.e. how is it used? do you have memory issues or speed issues?

python dicts are highly optimized already, so you're not likely to do much better with the look-up speed.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker at noaa.gov

From seb.binet at gmail.com Wed Sep 2 19:01:20 2009
From: seb.binet at gmail.com (Sebastien Binet)
Date: Wed, 2 Sep 2009 19:01:20 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
Message-ID: <200909021901.20923.binet@cern.ch>

On Wednesday 02 September 2009 18:01:43 Robert Bradshaw wrote:
> On Sep 2, 2009, at 5:34 AM, Sanne Korzec wrote:
> > Hi mailing,
> >
> > I've been writing a complex program in python, which I am currently
> > scaling up. I find myself in the position now, where I run out of
> > memory or out of time. I have been looking at alternatives like
> > cython and ctypes. I implemented ctypes which fixes the memory
> > problem but doubles the time problem.
> >
> > Currently I am implementing a cython version and ran into a
> > problem. I hope someone can help me out.
> >
> > The main bottleneck in my code is a large dictionary / hash table
> > which I would like to optimize. Since a dictionary is a python
> > datatype I have no idea how to make this cython.
> >
> > Currently I have tried to keep the 'keys' intact and store the
> > 'values' as ctypes floats, but I think it might be better to do
> > something else. Do I need to make the entire hash table c? Or is
> > there a more simple solution like combining the python dict with
> > cython? If so, how do I do this?
> >
> > Thanks in advance.
> >
> > Additional details: I use a double dict where the key of the first
> > dict stores another dict as value.
> >
> > S.
>
> The short answer is yes, to avoid using the Python dictionary (which
> can only hold Python objects), you need to write your own hashtable.
> That's not very hard though--I bet only a hundred or two lines in
> Cython would be sufficient (and very fast). You could also look into
> using an external C or C++ library, though C++ support is still a
> work in progress.

I'd recommend using this C library:
http://c-algorithms.sourceforge.net/

having a cython-stl sounds nice though :)

cheers,
sebastien.

--
#########################################
# Dr. Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################

From robertwb at math.washington.edu Wed Sep 2 19:22:10 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Sep 2009 10:22:10 -0700 (PDT)
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <4A9E9DB0.8060707@noaa.gov>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
 <4A9E9DB0.8060707@noaa.gov>
Message-ID:

On Wed, 2 Sep 2009, Christopher Barker wrote:

> Sanne Korzec wrote:
>> The main bottleneck in my code is a large dictionary / hash table which
>> I would like to optimize.
>
> In what way do you need to optimize it? i.e. how is it used? do you have
> memory issues or speed issues? python dicts are highly optimized
> already, so you're not likely to do much better with the look-up speed.

It could help with both memory and speed.
In particular, to do a lookup in a Python hashtable you need to

(1) Wrap the float in a Python object
(2) call __hash__ on that new object
(3) actually do the lookup
(4) unwrap the result back into a float.

Python does have a highly optimized (3), but the overhead for the rest will probably overwhelm it speedwise, so I bet a simple, unwrapped implementation would still be quite a win.

- Robert

From dagss at student.matnat.uio.no Wed Sep 2 19:52:01 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 02 Sep 2009 19:52:01 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To:
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
 <4A9E9DB0.8060707@noaa.gov>
Message-ID: <4A9EB0C1.5030509@student.matnat.uio.no>

Robert Bradshaw wrote:
> On Wed, 2 Sep 2009, Christopher Barker wrote:
>
>> Sanne Korzec wrote:
>>> The main bottleneck in my code is a large dictionary / hash table which
>>> I would like to optimize.
>> In what way do you need to optimize it? i.e. how is it used? do you have
>> memory issues or speed issues? python dicts are highly optimized
>> already, so you're not likely to do much better with the look-up speed.
>
> It could help with both memory and speed. In particular, to do a lookup in
> a Python hashtable you need to
>
> (1) Wrap the float in a Python object
> (2) call __hash__ on that new object

I hope that isn't what actually happens :-) Floating point and hashes don't seem like a good idea. (Just a note, I believe the OP was talking about floats in the values so we're good.)

> (3) actually do the lookup
> (4) unwrap the result back into a float.
>
> Python does have a highly optimized (3), but the overhead for the rest
> will probably overwhelm it speedwise, so I bet a simple, unwrapped
> implementation would still be quite a win.
>
> - Robert

--
Dag Sverre

From stefan_ml at behnel.de Wed Sep 2 20:01:00 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 02 Sep 2009 20:01:00 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <4A9EB0C1.5030509@student.matnat.uio.no>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
 <4A9E9DB0.8060707@noaa.gov>
 <4A9EB0C1.5030509@student.matnat.uio.no>
Message-ID: <4A9EB2DC.1040702@behnel.de>

Dag Sverre Seljebotn wrote:
> Robert Bradshaw wrote:
>> to do a lookup in a Python hashtable you need to
>>
>> (1) Wrap the float in a Python object
>> (2) call __hash__ on that new object
>
> I hope that isn't what actually happens :-) Floating point and hashes
> don't seem like a good idea.

That certainly depends on the hash function. A float value is not more than a bunch of bits after all, just like an int or string. It even has the advantage of being exactly a multiple of 4 bytes large, so a hash function can actually deploy extremely efficient CPU operations.

Stefan

From stefan_ml at behnel.de Wed Sep 2 20:08:12 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 02 Sep 2009 20:08:12 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <200909021901.20923.binet@cern.ch>
References: <200909021901.20923.binet@cern.ch>
Message-ID: <4A9EB48C.6030403@behnel.de>

Sebastien Binet wrote:
> I'd recommend using this C library:
> http://c-algorithms.sourceforge.net/

Interesting. That would certainly make a nice C-level standard library.

Would you have ready-made .pxd files for this library? Or even some Cython example code that you could post somewhere?
Stefan

From dagss at student.matnat.uio.no Wed Sep 2 21:21:46 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 02 Sep 2009 21:21:46 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <4A9EB2DC.1040702@behnel.de>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
 <4A9E9DB0.8060707@noaa.gov>
 <4A9EB0C1.5030509@student.matnat.uio.no>
 <4A9EB2DC.1040702@behnel.de>
Message-ID: <4A9EC5CA.7060405@student.matnat.uio.no>

Stefan Behnel wrote:
> Dag Sverre Seljebotn wrote:
>
>> Robert Bradshaw wrote:
>>
>>> to do a lookup in a Python hashtable you need to
>>>
>>> (1) Wrap the float in a Python object
>>> (2) call __hash__ on that new object
>>>
>> I hope that isn't what actually happens :-) Floating point and hashes
>> don't seem like a good idea.
>>
>
> That certainly depends on the hash function. A float value is not more than
> a bunch of bits after all, just like an int or string. It even has the
> advantage of being exactly a multiple of 4 bytes large, so a hash function
> can actually deploy extremely efficient CPU operations.
>

Yes, but the hash function to use would in the majority of real world cases depend heavily on the context. Up to what precision should two floats be compared for equality? Roundoff errors will usually lead to slightly different values for storage and retrieval, unless you're really, really careful. (Always compare floats by abs(a - b) < eps and so on.)

(I'd like to hear about actual use cases if there are any though! -- anything I can think of where store/retrieve would make sense is better represented by intervals on the real line than a single float.)

It is certainly possible to create hashes which round off floats in the right manner, but that's certainly more magic than I hope is embedded in Python's hash/eq -- I'd like to be able to specify my eps in such cases!
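(Illustration, not from the original post.) The round-off concern can be shown in a few lines of plain Python. This is only a sketch: the tolerance `EPS` and the helper `quantize` are hypothetical names introduced here, and bucketing by `eps` is just one possible way to "round off floats" before hashing.

```python
# Exact float keys: round-off makes the "same" value miss on lookup.
table = {0.3: "stored"}
computed = 0.1 + 0.2              # not bit-identical to the literal 0.3
print(computed in table)          # False

EPS = 1e-9  # assumed tolerance; has to be chosen per application

def quantize(x, eps=EPS):
    """Map a float to an integer bucket index of width eps (hypothetical helper)."""
    return round(x / eps)

# Hash the bucket index instead of the raw float.
buckets = {quantize(0.3): "stored"}
print(quantize(computed) in buckets)   # True
```

Bucketing is not equivalent to abs(a - b) < eps: two values closer than eps can still land on opposite sides of a bucket boundary, so a more careful lookup would also probe the neighbouring buckets.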
Dag Sverre

From stefan_ml at behnel.de Wed Sep 2 21:28:06 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 02 Sep 2009 21:28:06 +0200
Subject: [Cython] Thoughts after SciPy 09
In-Reply-To: <4A957D8E.7070003@student.matnat.uio.no>
References: <4A957D8E.7070003@student.matnat.uio.no>
Message-ID: <4A9EC746.1070703@behnel.de>

Hi Dag,

Dag Sverre Seljebotn wrote:
> As many of you know, Kurt and I attended SciPy 09. Four Cython-related
> events were held:
>
> - An introductory tutorial to Cython (by me)
> - A talk about Cython for numerics (by me again)
> - A talk on Fwrap (by Kurt)
> - A Cython BoF
>
> You can find links to slides and videos for the first three on
> http://conference.scipy.org.
>
> An intensive week like that makes me reflect on what Cython is good
> at, lacking, etc. etc.
>
> First of all, there seems to be quite a lot of interest in Cython, many
> think it is excellent, and many thanked me personally for our efforts.
>
> One thing that's also very interesting to me personally is that there's
> some talk of porting parts of NumPy over to Cython for easier Python 3
> support.

This sounds like Cython is still heavily growing in interest, which is a really good thing. Thanks for spreading the word so loudly.

> Beyond that, I've got a nice list of topics for further improvement. For
> instance one thing that is very possible to fix was a real dealbreaker
> that some complained about, and in one case stopped somebody from
> recommending it to co-workers. It's always nice to get the "outside"
> perspective that I get when I present Cython to lots of people.

Well, the list of requested features has been growing pretty long by now, as has the list of things that need fixing. The Cython project is definitely not lacking ideas.
> It seems to me that many have the impression that
>
> a) Cython is complicated technology which takes much work
> b) A lot of effort is put into steadily improving it
>
> BUT, I feel the reality is that
>
> a) Core developers can implement new features or fix bugs rather quickly
> b) Relatively little time is spent in total on Cython, compared to some
> other projects

I totally second this. While the amount of developer time that is currently available from the core developers is strictly limited by a variety of factors, I keep getting surprised about the amount of features and improvements that are doable with little effort. When I look at the code in lxml, for example, I constantly notice hand-written tweaks that are no longer necessary today, simply because Cython got so much better over the last year or so. You can really feel what it buys you to fix the code generator instead of the code itself.

> Or put another way: Putting relatively little in can, at least at this
> point in Cython's development, yield high returns.
>
> Example: Profiling was a feature many at SciPy were anxious about
> getting and were asking about a lot. That's in trunk now, mainly because
> Robert had an intercontinental flight (!!). (That admittedly might say a
> lot more about Robert than Cython, but still.)

One problem I see here is the current release schedule. We keep getting more and more away from the "release early, release often" principle. Getting 0.12 out in one way or another would finally make all the great trunk improvements available to regular users. Thing is: every release needs a driver, so that's the first place where we could benefit from a dedicated helping hand. I mean, Sun pays the Jython project lead, even full-time. Guido is paid half-time for CPython evolution. I wonder if there isn't enough commercial interest in Cython by now that could express itself in contributed project time or financing.
A developer day invested in Cython development can easily pay off by making your own code easier to write and/or faster to execute. Remember, we write C so you don't have to!

(maybe we should put the last two sentences on the front page ;)

> Cython can thrive without this too though! Looking at the coming
> half-to-three-quarter year, here's what I'm guessing will happen:
>
> - I might get the new memoryviews from summer finished and merged with
> trunk

I wouldn't mind putting major new features off for 0.13 and releasing 0.12 sooner.

> - Cython might run properly in Python 3 (w/ 2to3)

It almost does, except for two remaining bugs. Even the test runner works out of the box now.

> - Get -unstable stabilized and released (significant portion)

I'm all for making that the current priority. I even consider it mostly stable, except for the few failing test cases (well, and for the open bugs that lack a test case, obviously).

> - Fwrap released

Independent of Cython's own schedule, except if there's a requirement for better integration on Cython side (which would be fine for 0.13).

> - Closures
> - Better C++ support merged
> - Perhaps some pyximport improvements

Again, fine for 0.13. While I guess that much of this will get done, I honestly prefer a soon-to-be-outdated release over ever-lasting developer checkouts.

> Not bad at all! But, there's also a long list of projects we already
> badly want to have done that we can't possibly reach now, IMO:
>
> - Fix the bugs, complete the test suite

Oh, well... ;)

I split up the following list into major enhancements that need real work and will arrive as major new features:

> - SIMD
> - Control flow analysis!
> - Type inference/a better pure Python mode

(note that type inference pretty much depends on control flow analysis)

and those that will continue to improve through applied spare time:

> - Speed up compilation speed, break up compilation units/utility code
> - Convenient debugging, line-by-line profiling
> - Many rather low-hanging fruit CEPs which would make using Cython a
> nicer experience
> - Full Python semantics compatibility for untyped code

Major new features (even the "we know how to do it" ones) will certainly require more than the average spare time of the current core developers.

Stefan

From stefan_ml at behnel.de Wed Sep 2 21:29:55 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 02 Sep 2009 21:29:55 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <4A9EC5CA.7060405@student.matnat.uio.no>
References: <20090902123402.XBRY15272.viefep17-int.chello.at@edge05.upc.biz>
 <4A9E9DB0.8060707@noaa.gov>
 <4A9EB0C1.5030509@student.matnat.uio.no>
 <4A9EB2DC.1040702@behnel.de>
 <4A9EC5CA.7060405@student.matnat.uio.no>
Message-ID: <4A9EC7B3.1030709@behnel.de>

Dag Sverre Seljebotn wrote:
> the hash function to use would in the majority of real world
> cases depend heavily on the context. Up to what precision should two
> floats be compared for equality? Since roundoff errors will usually lead
> to slightly different values for storage and retrieval, unless you're
> really, really careful.

Ah, ok, that's what you meant. Yes, I agree that it's rather futile to discuss this without a real use case in mind.
Stefan

From dagss at student.matnat.uio.no Wed Sep 2 22:08:13 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 02 Sep 2009 22:08:13 +0200
Subject: [Cython] Thoughts after SciPy 09
In-Reply-To: <4A9EC746.1070703@behnel.de>
References: <4A957D8E.7070003@student.matnat.uio.no>
 <4A9EC746.1070703@behnel.de>
Message-ID: <4A9ED0AD.6050804@student.matnat.uio.no>

Stefan Behnel wrote:
> One problem I see here is the current release schedule. We keep getting
> more and more away from the "release early, release often" principle.
> Getting 0.12 out in one way or another would finally make all the great
> trunk improvements available to regular users. Thing is: every release
> needs a driver, so that's the first place where we could benefit from a
> dedicated helping hand. I mean, Sun pays the Jython project lead, even
> full-time. Guido is paid half-time for CPython evolution. I wonder if
> there isn't enough commercial interest in Cython by now that could express
> itself in contributed project time or financing. A developer day invested
> in Cython development can easily pay off by making your own code easier to
> write and/or faster to execute. Remember, we write C so you don't have to!
>
> (maybe we should put the last two sentences on the front page ;)

:-) I think they'd fit nicely.

As for commercial interest... well, if nothing else, figuring out whether anyone's employable and for what tasks (might want to take that off-list perhaps) and asking "officially" for monetary donations on the list would at least give an indication on whether it is the case.

(I guess we can always solicit for manpower as well, in particular for "simple" tasks like doc writing and Windows testing.)

Making a decision to request donations isn't a trivial issue though -- one time this was discussed earlier there was concern that having people paid (beyond GSoC money) could stifle other contributions and be seen as against the current open development nature.
It's a very valid concern, though in the exact situation we're in now I don't have issues with it myself.

Then there's fear of sending a signal that makes people afraid of Cython dying if we can't get donations (which isn't true IMO).

>> Cython can thrive without this too though! Looking at the coming
>> half-to-three-quarter year, here's what I'm guessing will happen:
>>
>> - I might get the new memoryviews from summer finished and merged with
>> trunk
>
> I wouldn't mind putting major new features off for 0.13 and releasing 0.12
> sooner.

+1. I wasn't by any means setting up a list of priorities, more a list of what people were likely to work on over the next 1/2-1 year in no particular order. I'm fine with putting off memoryviews until 0.12 is released, and definitely agree that 0.12 should be put out without waiting for merges.

>> - Fwrap released
>
> Independent of Cython's own schedule, except if there's a requirement for
> better integration on Cython side (which would be fine for 0.13).

Yes, there is, but 0.13 is fine (but again this was merely "what is likely to happen" that's Cython-related).

--
Dag Sverre

From robertwb at math.washington.edu Thu Sep 3 06:11:57 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Sep 2009 21:11:57 -0700
Subject: [Cython] Thoughts after SciPy 09
In-Reply-To: <4A9ED0AD.6050804@student.matnat.uio.no>
References: <4A957D8E.7070003@student.matnat.uio.no>
 <4A9EC746.1070703@behnel.de>
 <4A9ED0AD.6050804@student.matnat.uio.no>
Message-ID:

On Sep 2, 2009, at 1:08 PM, Dag Sverre Seljebotn wrote:

> Stefan Behnel wrote:
>> One problem I see here is the current release schedule. We keep getting
>> more and more away from the "release early, release often" principle.
>> Getting 0.12 out in one way or another would finally make all the great
>> trunk improvements available to regular users.

Yep, I'd like to see releases more often too.
>> Thing is: every release needs a driver, so that's the first place where
>> we could benefit from a dedicated helping hand. I mean, Sun pays the
>> Jython project lead, even full-time. Guido is paid half-time for CPython
>> evolution. I wonder if there isn't enough commercial interest in Cython
>> by now that could express itself in contributed project time or
>> financing. A developer day invested in Cython development can easily pay
>> off by making your own code easier to write and/or faster to execute.
>> Remember, we write C so you don't have to!
>>
>> (maybe we should put the last two sentences on the front page ;)
>
> :-) I think they'd fit nicely.

+1

> As for commercial interest... well, if nothing else, figuring out
> whether anyone's employable and for what tasks (might want to take that
> off-list perhaps) and ask "officially" for monetary donations on the
> list would at least give an indication on whether it is the case.
>
> (I guess we can always solicit for manpower as well, in particular for
> "simple" tasks like doc writing and Windows testing as well.)

I think we're short manpower, not money, but of course the latter can sometimes be used to obtain the former.

> Making a decision to request donations isn't a trivial issue though --
> one time this was discussed earlier there was concern that having people
> paid (beyond GSoC money) could stifle other contributions and be seen as
> against the current open development nature. It's a very valid concern,
> though in the exact situation we're in now I don't have issues with it
> myself.
>
> Then there's fear of sending a signal that makes people afraid of Cython
> dying if we can't get donations (which isn't true IMO).

Yep, it also sends the signal that we're trying to turn this into a money-making venture, which is not the case. It's like GSoC, sometimes funding is needed to free up/justify time that would otherwise have had to be spent elsewhere.
I also think there's a very different feel to 3rd party X paying person Y to implement/improve a specific Cython feature vs. someone giving Cython money which is then used to "employ" someone to work on the code.

Cython's going very well, and I see it continuing no matter what. The main change I would see funding having is someone getting the features they want/need now, instead of whenever we get around to it.

- Robert

From robertwb at math.washington.edu Thu Sep 3 06:41:17 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Sep 2009 21:41:17 -0700
Subject: [Cython] Next release
Message-ID:

It seems like there's consensus that we're overdue for another release, so I propose that we try to get a release out in the next couple of weeks. Before I lose all you non-coders, if anyone sees something that they really want to see in the next release, please stick/move the ticket to 0.11.3 or 0.12 (no promises, but it could help establish priority).

How about

(1) Get the current -devel branch out as 0.11.3 as soon as possible. Hopefully, this will just require testing lxml, sage, and throwing out a couple of release candidates. I've assigned out some tickets at http://trac.cython.org/cython_trac/query?status=assigned&status=new&status=reopened&group=milestone ; in the next week, let's all at least look at them, and resolve them if they're quick or bump them if they're not. I'll look at the long-overdue integer conversion review.

(2) Let's try to get -unstable out for 0.12. I actually don't even know how "unstable" it is, but I think it's not far off from being ready.

(3) Also in preparation for 0.12, let's go over all the tickets and see which, if any, are low-hanging fruit.

I also propose before 0.12 hits, we do a cython bug day. This is something we've done with Sage and it works well for getting a lot done in a little amount of time, especially those pesky little ones that no one ever looks at.
Essentially, all of us get together on IRC and try to snuff out as many bugs as possible in a 24 hour period. Maybe end of September/early October, with 0.12 coming out mid/late October?

- Robert

From robertwb at math.washington.edu Thu Sep 3 06:51:13 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Sep 2009 21:51:13 -0700
Subject: [Cython] GSoC mergeback
Message-ID: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu>

One thing you might have noticed was lacking in the last email was any talk about when the GSoC stuff will get into a release. First off, congratulations to both Kurt and Danilo for another successful GSoC summer.

I'm not sure about the Fortran project (other than that a lot of cool stuff happened), but it sounds like much of it was the independent fwrap project. We should integrate whatever Cython-side stuff was involved as soon as it's stable (which is probably now).

As for the C++ project, it doesn't yet satisfy all C++ needs, but it still makes wrapping C++ a lot nicer. A bit of cleanup is needed before merging into main (e.g. we should verify we are in C++ compiling mode before allowing C++ features). The holdup is operator syntax--currently __add__, etc. are used, but if we support references (which are not yet implemented, but should be pretty easy, at least getting enough for external declarations) we can support C++ operator+ style declarations. (This is especially relevant for how the [] operator is handled.) I think this decision needs to be settled before we push anything out, as I don't want to push the one then deprecate it a month later.

For both projects, when they get merged in we'll do a release, but we shouldn't hold back releases for them.
- Robert

From dagss at student.matnat.uio.no Thu Sep 3 09:54:49 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Sep 2009 09:54:49 +0200
Subject: [Cython] GSoC mergeback
In-Reply-To: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu>
References: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu>
Message-ID: <4A9F7649.2090100@student.matnat.uio.no>

Robert Bradshaw wrote:
> One thing you might have noticed was lacking in the last email was
> any talk about when the GSoC stuff will get into a release. First
> off, congratulations to both Kurt and Danilo for another successful
> GSoC summer.
>
> I'm not sure about the Fortran project (other than that a lot of cool
> stuff happened), but it sounds like much of it was the independent
> fwrap project. We should integrate whatever Cython-side stuff was
> involved as soon as it's stable (which is probably now).

Nope, it needs some cleanup and stabilization. As fwrap needs to stabilize as well, this will be downprioritized a bit, so likely no merge for 0.12.

--
Dag Sverre

From sanne at kortec.nl Thu Sep 3 11:02:56 2009
From: sanne at kortec.nl (Sanne Korzec)
Date: Thu, 3 Sep 2009 11:02:56 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <200909021901.20923.binet@cern.ch>
Message-ID: <20090903090256.SXSV29725.viefep14-int.chello.at@edge03.upc.biz>

Thanks for the link. But I'm a little confused again.

I thought cython is used when you do not want to write c yourself, but instead use it to write 'pythonic' c. In this case, the c code is ready and needs to be included in my python program. What I need is simplicity and c.

Should I use cython to extend my main python program with this c datatype? If so, how? Or is there a better/faster way to let these two communicate? I basically need a fast way of accessing the hash table from python.

If someone can refer me to some of the many docs, I would be very happy.
To summarize, this is when my python program needs to read or write to the hash table.

Main python program does:
- some preprocessing
- loop over a large data file
  - 1) count and calculate and do some tricks (read from hash table)
  - 2) return a float, with two indices
  - 3) store this output (write to hash table)
- write final hash table to output

Thanks.

From dagss at student.matnat.uio.no Thu Sep 3 13:51:18 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Sep 2009 13:51:18 +0200
Subject: [Cython] C99 complex behaviour
Message-ID: <4A9FADB6.1040702@student.matnat.uio.no>

I'm wondering whether we can change this for 0.11.3:

Currently (if I understand correctly) Cython decides whether you want C99 complex or Cython complex structs based on whether complex.h is included or not.

I think this is a bit too magic. Even if it works, it seems a bit confusing, and the thing is it won't work if you're interfacing with functions in C libraries taking complex numbers (which, of course, in turn include complex.h, but Cython doesn't see that).

I'm wondering if we can move to a directive-only approach, and make C99 complex the default?
-- Dag Sverre From robertwb at math.washington.edu Thu Sep 3 18:02:36 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 3 Sep 2009 09:02:36 -0700 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <20090903095852.DSSF793.viefep11-int.chello.at@edge01.upc.biz> References: <20090903095852.DSSF793.viefep11-int.chello.at@edge01.upc.biz> Message-ID: <7AC127B8-1542-4B76-AC62-E04E2B2CC571@math.washington.edu> On Sep 3, 2009, at 2:58 AM, Sanne Korzec wrote: > Personally, I would implement a simple extension class > > cdef class MyDoubleHashtable: > cdef void put(double key, double value) > cdef double get(double key) > ... > > - Robert > > > If you do this, you are still using python objects. E.g. > > Self.dict = {} > cdef void put(double key, double value): > self.dict[key] = value Yes, that wouldn't help. I was thinking of implementing the table itself on top of double pointers. > But how do I import the c code in python, so I can do something > like this: > > Import c_hash_table > > cdef void put(double key, double value): > > //call the c hash table > > I can't find any examples on how to use this. See http://docs.cython.org/docs/external_C_code.html . Once you declare these external functions, you can use them in your Cython code just like any function you defined. - Robert From sanne at kortec.nl Fri Sep 4 15:50:03 2009 From: sanne at kortec.nl (Sanne Korzec) Date: Fri, 4 Sep 2009 15:50:03 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <4A9EB48C.6030403@behnel.de> Message-ID: <20090904135005.QYYQ19290.viefep15-int.chello.at@edge01.upc.biz> Hi, I would also appreciate an example for the hash table too. I don't mind writing documentation for it, if someone can help me out getting it to work. I am unsure how and were to declare the arguments that hash_table_new takes. 
In the documentation
http://c-algorithms.sourceforge.net/doc/hash-table_8h.html#e361c4c0256ec6c741ecfeabef33d891
I can find:

    HashTable* hash_table_new(HashTableHashFunc hash_func,
                              HashTableEqualFunc equal_func)

to create a new hash table. But I can't find where HashTableHashFunc and
HashTableEqualFunc are declared. The only thing I can find is in the header
file, which states:

    typedef unsigned long(* HashTableHashFunc)(HashTableKey value)
    typedef unsigned long(* HashTableHashFunc)(HashTableKey value)

Does this mean I have to write these functions myself? In C? And how do I
then call them from Cython?

My guess:

hashtable.pyx

    cdef extern from "hash_table.h":
        object HashTable hash_table_new(object hash_func, object equal_func)

wrapper.py

    import hashtable
    HT = hashtable.hash_table_new()  # is this wrong? the C hash table is not a
                                     # class but a collection of methods, it seems

-----Original Message-----
From: cython-dev-bounces at codespeak.net
[mailto:cython-dev-bounces at codespeak.net] On Behalf Of Stefan Behnel
Sent: Wednesday 2 September 2009 20:08
To: cython-dev at codespeak.net
Subject: Re: [Cython] FW: cython and hash tables / dictionary

Sebastien Binet wrote:
> I'd recommand using this C library:
> http://c-algorithms.sourceforge.net/

Interesting. That would certainly make a nice C-level standard library.
Would you have ready-made .pxd files for this library? Or even some Cython
example code that you could post somewhere?
Stefan
_______________________________________________
Cython-dev mailing list
Cython-dev at codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev

From stefan_ml at behnel.de Fri Sep 4 16:03:22 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 04 Sep 2009 16:03:22 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <20090904135005.QYYQ19290.viefep15-int.chello.at@edge01.upc.biz>
References: <20090904135005.QYYQ19290.viefep15-int.chello.at@edge01.upc.biz>
Message-ID: <4AA11E2A.3010708@behnel.de>

Sanne Korzec wrote:
> In the documentation
> http://c-algorithms.sourceforge.net/doc/hash-table_8h.html#e361c4c0256ec6c74
> 1ecfeabef33d891 , I can find:
>
> HashTable* hash_table_new ( HashTableHashFunc hash_func,
>                             HashTableEqualFunc equal_func
>                           )
>
> To create a new hash table. But I can't find were the HashTableHashFunc and
> HashTableEqualFunc are declared. The only thing I can find is in the header
> file which state:
>
> typedef unsigned long(* HashTableHashFunc)(HashTableKey value)
> typedef unsigned long(* HashTableHashFunc)(HashTableKey value)
>
> Does this mean I have to write these functions myself?

Yes.

> In c?

You can write them in Cython:

    cdef unsigned long c_hash(HashTableKey value):
        return huge_calculation_on(value)

> And how then do I call them from cython?

You don't. Instead, you pass the function names (i.e. pointers) into
hash_table_new().

> My guess:
>
> hashtable.pyx

"hashtable.pxd", I assume?

> cdef extern from "hash_table.h":
>
>     object HashTable hash_table_new(object hash_func, object equal_func)

That won't work. You can't use Python functions as their signature won't
match the required signatures. Instead, define HashTable as a struct and
the functions as a ctypedef.
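
For readers more at home on the C side, the callback arrangement described
above (write a hash function and an equality function, then hand their
addresses to the table constructor) can be sketched roughly as follows. This
is only an illustration under stated assumptions: every Demo*/demo_* name is
hypothetical, the tiny fixed-size table merely stands in for the real
library, and the two-argument equality signature is assumed rather than
copied from the libcalg header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, modelled on the typedefs quoted in this thread.
 * The real libcalg declarations may differ. */
typedef void *DemoKey;
typedef void *DemoValue;
typedef unsigned long (*DemoHashFunc)(DemoKey key);
typedef int (*DemoEqualFunc)(DemoKey a, DemoKey b);

#define DEMO_SLOTS 64 /* fixed capacity, enough for a sketch */

typedef struct {
    DemoHashFunc hash;   /* supplied by the caller */
    DemoEqualFunc equal; /* supplied by the caller */
    int used[DEMO_SLOTS];
    DemoKey keys[DEMO_SLOTS];
    DemoValue values[DEMO_SLOTS];
} DemoTable;

/* Store the two function pointers; the table only ever calls through them. */
void demo_table_init(DemoTable *t, DemoHashFunc hash, DemoEqualFunc equal)
{
    memset(t, 0, sizeof *t);
    t->hash = hash;
    t->equal = equal;
}

/* Linear probing: return the slot holding `key`, or the first free slot
 * on its probe path, or -1 if the table is full. */
int demo_find_slot(const DemoTable *t, DemoKey key)
{
    unsigned long start = t->hash(key) % DEMO_SLOTS;
    unsigned long i;
    for (i = 0; i < DEMO_SLOTS; i++) {
        unsigned long slot = (start + i) % DEMO_SLOTS;
        if (!t->used[slot] || t->equal(t->keys[slot], key))
            return (int)slot;
    }
    return -1;
}

int demo_table_insert(DemoTable *t, DemoKey key, DemoValue value)
{
    int slot = demo_find_slot(t, key);
    if (slot < 0)
        return 0; /* table full */
    t->used[slot] = 1;
    t->keys[slot] = key;
    t->values[slot] = value;
    return 1;
}

DemoValue demo_table_lookup(const DemoTable *t, DemoKey key)
{
    int slot = demo_find_slot(t, key);
    return (slot >= 0 && t->used[slot]) ? t->values[slot] : NULL;
}

/* The two callbacks: hash and compare C strings by content (djb2 hash). */
unsigned long demo_string_hash(DemoKey key)
{
    const unsigned char *s = key;
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + *s++;
    return h;
}

int demo_string_equal(DemoKey a, DemoKey b)
{
    return strcmp(a, b) == 0;
}
```

Passing demo_string_hash and demo_string_equal to demo_table_init() is the
whole trick: the table never needs to know what a key is, and two strings
with equal content but different addresses find the same entry.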
Stefan From dalcinl at gmail.com Fri Sep 4 16:36:59 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Sep 2009 11:36:59 -0300 Subject: [Cython] C99 complex behaviour In-Reply-To: <4A9FADB6.1040702@student.matnat.uio.no> References: <4A9FADB6.1040702@student.matnat.uio.no> Message-ID: On Thu, Sep 3, 2009 at 8:51 AM, Dag Sverre Seljebotn wrote: > I'm wondering whether we can change this for 0.11.3: > > Currently (if I understand correctly) Cython decides whether you want > C99 complex or Cython complex structs based on whether complex.h is > included or not. > But how the inclusion of "complex.h" is detected ? > I think this is a bit too magic. Even if it works, it seems a bit > confusing, and the thing is it won't work if you're interfacing with > functions in C libraries taking complex numbers (which, of course, in > turn include complex.h, but Cython doesn't see that). > I agree... > I'm wondering if we can move to a directive-only approach, No, please... Use the C preprocessor for "activating" the C99 complex stuff... This way you have chance that the same generated C source could be used with compilers missing C99 complex support... Of course, you can still use a directive, where the options where { yes | no | C-compile-time}, the last based on a preprocessor definition... > and make C99 > complex the default? > I bet you use GCC as much as me... However, you always forget that we have to live with the nightmare of MSVC compilers being used out there... 
So I think no, C99 complex cannot be the default

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Fri Sep 4 18:13:27 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 4 Sep 2009 09:13:27 -0700
Subject: [Cython] C99 complex behaviour
In-Reply-To:
References: <4A9FADB6.1040702@student.matnat.uio.no>
Message-ID: <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu>

On Sep 4, 2009, at 7:36 AM, Lisandro Dalcin wrote:

> On Thu, Sep 3, 2009 at 8:51 AM, Dag Sverre
> Seljebotn wrote:
>> I'm wondering whether we can change this for 0.11.3:
>>
>> Currently (if I understand correctly) Cython decides whether you want
>> C99 complex or Cython complex structs based on whether complex.h is
>> included or not.
>>
>
> But how the inclusion of "complex.h" is detected ?

It defines a specific macro.

>
>> I think this is a bit too magic. Even if it works, it seems a bit
>> confusing, and the thing is it won't work if you're interfacing with
>> functions in C libraries taking complex numbers (which, of course, in
>> turn include complex.h, but Cython doesn't see that).
>>
>
> I agree...

If you're working with C libraries that take complex numbers, you've
included complex.h somewhere (perhaps indirectly) to use them, right?

>
>> I'm wondering if we can move to a directive-only approach,
>
> No, please... Use the C preprocessor for "activating" the C99 complex
> stuff... This way you have chance that the same generated C source
> could be used with compilers missing C99 complex support...
>
> Of course, you can still use a directive, where the options where {
> yes | no | C-compile-time}, the last based on a preprocessor
> definition...
I think there is already such a directive (though it's not nicely exposed). The whole thing should be better documented as well. >> and make C99 >> complex the default? >> > > I bet you use GCC as much as me... However, you always forget that we > have to live with the nightmare of MSVC compilers being used out > there... So I think no, C99 complex cannot be the default Yep, lets not rely on C99, even though all of us developers have it at our fingertips. Otherwise I would have implemented it this way from the start... However, are you bringing this up because something isn't working for you like it should? - Robert From dalcinl at gmail.com Fri Sep 4 19:08:55 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Sep 2009 14:08:55 -0300 Subject: [Cython] C99 complex behaviour In-Reply-To: <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu> References: <4A9FADB6.1040702@student.matnat.uio.no> <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu> Message-ID: On Fri, Sep 4, 2009 at 1:13 PM, Robert Bradshaw wrote: > On Sep 4, 2009, at 7:36 AM, Lisandro Dalcin wrote: > >> On Thu, Sep 3, 2009 at 8:51 AM, Dag Sverre >> Seljebotn wrote: >>> I'm wondering whether we can change this for 0.11.3: >>> >>> Currently (if I understand correctly) Cython decides whether you want >>> C99 complex or Cython complex structs based on whether complex.h is >>> included or not. >>> >> >> But how the inclusion of "complex.h" is detected ? > > It defines a specific macro. > OK... Now I remember... This relies in _Complex_I definition, that AFAIK is in the C99 standard, right? . >> >>> I think this is a bit too magic. Even if it works, it seems a bit >>> confusing, and the thing is it won't work if you're interfacing with >>> functions in C libraries taking complex numbers (which, of course, in >>> turn include complex.h, but Cython doesn't see that). >>> >> >> I agree... 
> > If you're working with C libraries that take complex numbers, you've > included complex.h somewhere (perhaps indirectly) to use them, right? > Of course.... >> >>> I'm wondering if we can move to a directive-only approach, >> >> No, please... Use the C preprocessor for "activating" the C99 complex >> stuff... This way you have chance that the same generated C source >> could be used with compilers missing C99 complex support... >> >> Of course, you can still use a directive, where the options where { >> yes | no | C-compile-time}, the last based on a preprocessor >> definition... > > I think there is already such a directive (though it's not nicely > exposed). The whole thing should be better documented as well. > Yes, I now see that all this is there... >>> and make C99 >>> complex the default? >>> >> >> I bet you use GCC as much as me... However, you always forget that we >> have to live with the nightmare of MSVC compilers being used out >> there... So I think no, C99 complex cannot be the default > > Yep, lets not rely on C99, even though all of us developers have it > at our fingertips. Otherwise I would have implemented it this way > from the start... > Indeed... > However, are you bringing this up because something isn't working for > you like it should? > Are you asking this to Dag or to me? Though I had no chance to try this with petsc4py, I reviewed the implementation in the past and it looked OK for me... So I do not have any objection... My mail was actually a reply to Dag's one ... 
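
The compile-time selection discussed in this thread can be sketched in C
roughly as follows. This is a minimal illustration, not Cython's generated
code: the demo_* names and the DEMO_WANT_C99_COMPLEX switch are hypothetical.
The only thing taken from the discussion is the idea of testing for the
_Complex_I macro, which C99 specifies <complex.h> must define.

```c
#include <assert.h>

/* DEMO_WANT_C99_COMPLEX is a hypothetical opt-in switch for this sketch.
 * In real code, <complex.h> would be pulled in by whatever library header
 * uses complex numbers. */
#ifdef DEMO_WANT_C99_COMPLEX
#include <complex.h>
#endif

#ifdef _Complex_I
/* <complex.h> was seen by the C compiler: use native C99 complex. */
#define DEMO_USES_C99_COMPLEX 1
typedef double _Complex demo_complex;

demo_complex demo_make(double re, double im) { return re + im * _Complex_I; }
double demo_real(demo_complex z) { return creal(z); }
double demo_imag(demo_complex z) { return cimag(z); }
demo_complex demo_mul(demo_complex a, demo_complex b) { return a * b; }

#else
/* No <complex.h>: fall back to a plain struct, which also works on
 * compilers without C99 complex support. */
#define DEMO_USES_C99_COMPLEX 0
typedef struct { double real, imag; } demo_complex;

demo_complex demo_make(double re, double im)
{
    demo_complex z;
    z.real = re;
    z.imag = im;
    return z;
}
double demo_real(demo_complex z) { return z.real; }
double demo_imag(demo_complex z) { return z.imag; }
demo_complex demo_mul(demo_complex a, demo_complex b)
{
    /* (a.re + i a.im)(b.re + i b.im) */
    return demo_make(a.real * b.real - a.imag * b.imag,
                     a.real * b.imag + a.imag * b.real);
}
#endif
```

The same source compiles either way, which is the point made above: the
choice is deferred to the C compiler instead of being fixed when the C file
is generated.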
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Fri Sep 4 21:51:48 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 4 Sep 2009 12:51:48 -0700 (PDT)
Subject: [Cython] C99 complex behaviour
In-Reply-To:
References: <4A9FADB6.1040702@student.matnat.uio.no>
 <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu>
Message-ID:

On Fri, 4 Sep 2009, Lisandro Dalcin wrote:

>> However, are you bringing this up because something isn't working for
>> you like it should?
>>
>
> Are you asking this to Dag or to me? Though I had no chance to try
> this with petsc4py, I reviewed the implementation in the past and it
> looked OK for me... So I do not have any objection... My mail was
> actually a reply to Dag's one ...

I was responding to both of you, but, yes, this question was
specifically for Dag.

- Robert

From dagss at student.matnat.uio.no Sat Sep 5 09:51:57 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 05 Sep 2009 09:51:57 +0200
Subject: [Cython] C99 complex behaviour
In-Reply-To: <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu>
References: <4A9FADB6.1040702@student.matnat.uio.no>
 <02EE8667-C618-4D4C-8569-57EA1FCBBEFF@math.washington.edu>
Message-ID: <4AA2189D.1040900@student.matnat.uio.no>

Robert Bradshaw wrote:
> On Sep 4, 2009, at 7:36 AM, Lisandro Dalcin wrote:
>
>> On Thu, Sep 3, 2009 at 8:51 AM, Dag Sverre
>> Seljebotn wrote:
>>> I'm wondering whether we can change this for 0.11.3:
>>>
>>> Currently (if I understand correctly) Cython decides whether you want
>>> C99 complex or Cython complex structs based on whether complex.h is
>>> included or not.
>>> >> But how the inclusion of "complex.h" is detected ? > > It defines a specific macro. > >>> I think this is a bit too magic. Even if it works, it seems a bit >>> confusing, and the thing is it won't work if you're interfacing with >>> functions in C libraries taking complex numbers (which, of course, in >>> turn include complex.h, but Cython doesn't see that). >>> >> I agree... > > If you're working with C libraries that take complex numbers, you've > included complex.h somewhere (perhaps indirectly) to use them, right? I'm sorry, I was confused. I really knew this and forgot .. when I posted I thought Cython actually looked for whether it included "complex.h" directly. Since it is pushed to C compilation time there's no issue I think (except getting the directives documented on the wiki page). > However, are you bringing this up because something isn't working for > you like it should? Yes. Fwrap generate pxd files which interface with the Fortran modules (which have no header files), potentially using C99 complex numbers. I.e. cdef extern: int myfortranfunc(double complex z) Just doing cdef extern from "complex.h": pass seemed hackish at the time, but now I think it makes perfect sense. So again, there's no issue here. (Yes, I think Kurt's looking at emulating complex numbers too but that's for later.) Kurt: I haven't been playing around with it and don't plan to, I just suddenly remembered that the issue had to be raised. -- Dag Sverre From dagss at student.matnat.uio.no Sat Sep 5 12:42:45 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 05 Sep 2009 12:42:45 +0200 Subject: [Cython] GSoC mergeback In-Reply-To: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu> References: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu> Message-ID: <4AA240A5.5080404@student.matnat.uio.no> Robert Bradshaw wrote: > As for the C++ project, it doesn't yet satisfy all C++ needs, but it > still makes wrapping C++ a lot nicer. 
A bit of cleanup is needed > before merging into main (e.g. we should verify we are in C++ > compiling mode before allowing C++ features). The holdup is operator > syntax--currently __add__, etc. are used, but if we support > references (which are not yet implemented, but should be pretty easy, > at least getting enough for external declarations) we can support C++ > operator+ style declarations. (This is especially relevant for how > the [] operator is handled.) I think this decision needs to be > settled before we push anything out, as I don't want to push the one > then deprecate it a month later. Not that there's any hurry, but I'm curious: Do you have a plan for how the decision is going to be made? I guess it might be CEPable, once somebody has time for a CEP. Or a Skypecon or something. The way I see it is that the options are A: One try to keep current Cython semantics as far as possible. __add__. B: Really just let C++ into Cython entirely. operator++, and: - cdef int x = 3; func(x) # can change x - ++x # since it is different from += 1 in C++ - *x # Since it is different from x[0] in C++ - and so on I am, much to my own surprise, starting to lean towards B, much because one could then see C++ auto-wrapped and usable right away. -- Dag Sverre From robertwb at math.washington.edu Sat Sep 5 19:42:16 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 5 Sep 2009 10:42:16 -0700 Subject: [Cython] GSoC mergeback In-Reply-To: <4AA240A5.5080404@student.matnat.uio.no> References: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu> <4AA240A5.5080404@student.matnat.uio.no> Message-ID: <59704279-28FE-4D69-9898-AC6F03C71C74@math.washington.edu> On Sep 5, 2009, at 3:42 AM, Dag Sverre Seljebotn wrote: > Robert Bradshaw wrote: >> As for the C++ project, it doesn't yet satisfy all C++ needs, but it >> still makes wrapping C++ a lot nicer. A bit of cleanup is needed >> before merging into main (e.g. 
we should verify we are in C++ >> compiling mode before allowing C++ features). The holdup is operator >> syntax--currently __add__, etc. are used, but if we support >> references (which are not yet implemented, but should be pretty easy, >> at least getting enough for external declarations) we can support C++ >> operator+ style declarations. (This is especially relevant for how >> the [] operator is handled.) I think this decision needs to be >> settled before we push anything out, as I don't want to push the one >> then deprecate it a month later. > > Not that there's any hurry, but I'm curious: Do you have a plan for > how > the decision is going to be made? I guess it might be CEPable, once > somebody has time for a CEP. Or a Skypecon or something. Yeah, we should at least put it up on the wiki as a CEP. > The way I see it is that the options are > > A: One try to keep current Cython semantics as far as possible. > __add__. > > B: Really just let C++ into Cython entirely. operator++, and: > > - cdef int x = 3; func(x) # can change x Ugh, this might wreck havoc with control flow analysis, but I guess that's what can happen. > - ++x # since it is different from += 1 in C++ This means something already in Python. > - *x # Since it is different from x[0] in C++ There are issues with parsing this, e.g. foo(*x). Maybe the two could be distinguished, but it's still a dangerous overloading of syntax. > - and so on > > I am, much to my own surprise, starting to lean towards B, much > because > one could then see C++ auto-wrapped and usable right away. Yep, I'm started out thinking very pro A, but now I'm leaning towards B, for declarations at least. 
- Robert From stefan_ml at behnel.de Sat Sep 5 22:22:04 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 05 Sep 2009 22:22:04 +0200 Subject: [Cython] GSoC mergeback In-Reply-To: <59704279-28FE-4D69-9898-AC6F03C71C74@math.washington.edu> References: <43B6714C-E663-4FF6-946C-0D4BD73645DE@math.washington.edu> <4AA240A5.5080404@student.matnat.uio.no> <59704279-28FE-4D69-9898-AC6F03C71C74@math.washington.edu> Message-ID: <4AA2C86C.4090408@behnel.de> Robert Bradshaw wrote: > On Sep 5, 2009, at 3:42 AM, Dag Sverre Seljebotn wrote: >> I am, much to my own surprise, starting to lean towards B, much >> because one could then see C++ auto-wrapped and usable right away. > > Yep, I'm started out thinking very pro A, but now I'm leaning towards > B, for declarations at least. Not being a C++ user, I don't care so much about the declaration syntax. In case that's wanted, I'm fine with making the C++ declaration stuff C++ like, so that C++ users (who are the only ones who would use it anyway) can express their intents more easily. But I'm sure supporting any C++ specific syntax overloading in Cython /code/ will not do any good to the language. To stay with the three examples Dag gave, there is a *very* good reason why you have to write "x[0]" in Cython instead of "*x", and that's simply that both are valid Python code, but only the first means more or less what the C version means. And giving C++ semantics to "++x" in Cython is just screaming for trouble. That may be a bit less true for passing C++ references, but I agree with Robert that that might block future developments of the Cython compiler, and it certainly doesn't make the code more readable. 
Stefan

From sanne at kortec.nl Mon Sep 7 15:15:07 2009
From: sanne at kortec.nl (Sanne Korzec)
Date: Mon, 7 Sep 2009 15:15:07 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <4AA11E2A.3010708@behnel.de>
Message-ID: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz>

OK, I now have this:

cythonHash.pxd:

    cdef extern from "hash-table.h":
        ctypedef struct HashTable
        ctypedef void *HashTableKey
        ctypedef unsigned long HashTableHashFunc(HashTableKey value)
        ctypedef unsigned long HashTableEqualFunc(HashTableKey value)
        HashTable *hash_table_new(HashTableHashFunc hash_func,
                                  HashTableEqualFunc equal_func)

    cdef inline unsigned long c_hash_func(HashTableKey value):
        return 1

    cdef inline unsigned long c_hash_equal(HashTableKey value):
        return 1

cythonPT.pyx:

    cimport cythonHash
    from cythonHash cimport HashTable

    class MY_Phrase_Table(object):

        def __init__(self):
            pp = HashTable  # error here
            print type(pp), pp

This yields the following error: 'HashTable' is not a constant, variable
or function identifier.

I don't really get how I should refer to HashTable. I thought it was
already declared by the .pxd and the original .c file.

-----Original Message-----
From: Stefan Behnel [mailto:stefan_ml at behnel.de]
Sent: Friday 4 September 2009 16:03
To: sanne at kortec.nl
Cc: cython-dev at codespeak.net
Subject: Re: [Cython] FW: cython and hash tables / dictionary

Sanne Korzec wrote:
> In the documentation
> http://c-algorithms.sourceforge.net/doc/hash-table_8h.html#e361c4c0256ec6c74
> 1ecfeabef33d891 , I can find:
>
> HashTable* hash_table_new ( HashTableHashFunc hash_func,
>                             HashTableEqualFunc equal_func
>                           )
>
> To create a new hash table. But I can't find were the HashTableHashFunc and
> HashTableEqualFunc are declared.
The only thing I can find is in the header > file which state: > > typedef unsigned long(* HashTableHashFunc)(HashTableKey value) > typedef unsigned long(* HashTableHashFunc)(HashTableKey value) > > Does this mean I have to write these functions myself? Yes. > In c? You can write them in Cython: cdef unsigned long c_hash(HashTableKey value): return huge_calculation_on(value) > And how then do I call them from cython? You don't. Instead, you pass the function names (i.e. pointers) into hash_table_new(). > My guess: > > hashtable.pyx "hashtable.pxd", I assume? > cdef extern from "hash_table.h": > > object HashTable hash_table_new(object hash_func, object equal_func) That won't work. You can't use Python functions as their signature won't match the required signatures. Instead, define HashTable as a struct and the functions as a ctypedef. Stefan From philipasmith at blueyonder.co.uk Mon Sep 7 19:51:11 2009 From: philipasmith at blueyonder.co.uk (Philip Smith) Date: Mon, 7 Sep 2009 18:51:11 +0100 Subject: [Cython] Special methods: __iadd__ Message-ID: <20090907175532.E0DCD168014@codespeak.net> Hi I am relatively new to Cython but making good progress I think. However I have the following code (skeleton) related to a library I'm wrapping: cdef class foo . cdef class foo2(object): cdef foo attribute . . def __iadd__(self, foo2 other): self.attribute+=other.attribute #This is defined in class foo and works fine there return self This compiles absolutely fine (under Mingw) but crashes at runtime. Any ideas? Thanks Phil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://codespeak.net/pipermail/cython-dev/attachments/20090907/c74f9225/attachment.htm From seb.binet at gmail.com Mon Sep 7 21:28:52 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Mon, 7 Sep 2009 21:28:52 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz> References: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz> Message-ID: <200909072128.52818.binet@cern.ch> hi there, attached is a simple cy_stl.pyx file (together with its setup.py companion) really just to get started :) to test: $ python -c 'import cy_stl as cc; cc.test()' hth, sebastien. > cythonHash.pxd: > > cdef extern from "hash-table.h": > ctypedef struct HashTable > ctypedef void *HashTableKey > ctypedef unsigned long HashTableHashFunc(HashTableKey value) > ctypedef unsigned long HashTableEqualFunc(HashTableKey value) > HashTable *hash_table_new(HashTableHashFunc hash_func, > HashTableEqualFunc equal_func) > > cdef inline unsigned long c_hash_func(HashTableKey value): > return 1 > > cdef inline unsigned long c_hash_equal(HashTableKey value): > return 1 > > > cythonPT.pyx: > > cimport cythonHash > from cythonHash cimport HashTable > > class MY_Phrase_Table(object): > > def __init__(self): > > pp = HashTable #error here > print type(pp), pp > > This yields in the following error: 'HashTable' is not a constant, variable > or function identifier. > > I don't really get how I should reference to Hashtable. I thought it was > already declared from .pxd and the original .c file. 
> > > > > -----Original Message----- > From: Stefan Behnel [mailto:stefan_ml at behnel.de] > Sent: vrijdag 4 september 2009 16:03 > To: sanne at kortec.nl > Cc: cython-dev at codespeak.net > Subject: Re: [Cython] FW: cython and hash tables / dictionary > > Sanne Korzec wrote: > > In the documentation > > http://c-algorithms.sourceforge.net/doc/hash-table_8h.html#e361c4c0256ec6c7 > 4 > > > 1ecfeabef33d891 , I can find: > > > > HashTable* hash_table_new ( HashTableHashFunc hash_func, > > HashTableEqualFunc equal_func > > ) > > > > To create a new hash table. But I can't find were the HashTableHashFunc > > and > > > HashTableEqualFunc are declared. The only thing I can find is in the > > header > > > file which state: > > > > typedef unsigned long(* HashTableHashFunc)(HashTableKey value) > > typedef unsigned long(* HashTableHashFunc)(HashTableKey value) > > > > Does this mean I have to write these functions myself? > > Yes. > > > In c? > > You can write them in Cython: > > cdef unsigned long c_hash(HashTableKey value): > return huge_calculation_on(value) > > > And how then do I call them from cython? > > You don't. Instead, you pass the function names (i.e. pointers) into > hash_table_new(). > > > My guess: > > > > hashtable.pyx > > "hashtable.pxd", I assume? > > > cdef extern from "hash_table.h": > > > > object HashTable hash_table_new(object hash_func, object equal_func) > > That won't work. You can't use Python functions as their signature won't > match the required signatures. Instead, define HashTable as a struct and > the functions as a ctypedef. > > Stefan > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- ######################################### # Dr. 
Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################
-------------- next part --------------
cdef extern from "libcalg/hash-pointer.h":
    unsigned long c_pointer_hash "pointer_hash"(void* location)

cdef extern from "libcalg/compare-pointer.h":
    int c_pointer_equal "pointer_equal" (void* loc1, void* loc2)
    int c_pointer_compare "pointer_compare" (void* loc1, void* loc2)

cdef extern from "libcalg/hash-table.h":
    ctypedef struct c_HashTable "HashTable"
    ctypedef void *HashTableKey
    ctypedef void *HashTableValue
    ctypedef unsigned long (*HashTableHashFunc)(HashTableKey value)
    ctypedef unsigned long (*HashTableEqualFunc)(HashTableKey val1, HashTableKey val2)
    c_HashTable *hash_table_new(HashTableHashFunc hash_func, HashTableEqualFunc equal_func)
    void hash_table_free(c_HashTable* ht)
    int hash_table_insert(c_HashTable* self, HashTableKey k, HashTableValue v)
    HashTableValue hash_table_lookup(c_HashTable* self, HashTableKey k)
    int hash_table_remove(c_HashTable* self, HashTableKey k)
    int hash_table_num_entries(c_HashTable* self)

# NOTE: the <...> casts in this file were stripped by the list archive;
# the cast types shown are reconstructed from context.

cdef inline unsigned long c_ptr_hash_func(HashTableKey value):
    # FIXME: use intptr_t
    return <unsigned long>value

cdef inline unsigned long c_ptr_hash_equal(HashTableKey val1, HashTableKey val2):
    return (<unsigned long>val1) == (<unsigned long>val2)

cdef class HashTable:
    cdef c_HashTable *_base

    def __cinit__(self):
        self._base = hash_table_new(<HashTableHashFunc>(&c_ptr_hash_func),
                                    <HashTableEqualFunc>(&c_ptr_hash_equal))

    def __dealloc__(self):
        hash_table_free(self._base)

    cdef int insert(self, HashTableKey k, HashTableValue v):
        return hash_table_insert(self._base, k, v)

    cdef HashTableValue lookup(self, HashTableKey k):
        return hash_table_lookup(self._base, k)

    cdef int remove(self, HashTableKey k):
        return hash_table_remove(self._base, k)

    cdef int num_entries(self):
        return hash_table_num_entries(self._base)

def test():
    cdef HashTable ht = HashTable()
    cdef char* k1 = "k"
    cdef int* v1 = [666]
    print "==> [%s]" % ht.num_entries()
    ht.insert(k1, v1)
    print "==> [%s]" % ht.num_entries()
    cdef HashTableValue vv1 = ht.lookup(k1)
    print "ht[%s]==[%s]" % (k1, (<int*>vv1)[0])

    cdef char* k2 = "cy"
    cdef int* v2 = [42]
    print "==> [%s]" % ht.num_entries()
    ht.insert(k2, v2)
    print "==> [%s]" % ht.num_entries()
    cdef HashTableValue vv2 = ht.lookup(k2)
    print "ht[%s]==[%s]" % (k2, (<int*>vv2)[0])
-------------- next part --------------
A non-text attachment was scrubbed...
Name: setup.py
Type: text/x-python
Size: 465 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20090907/e4688960/attachment.py

From robertwb at math.washington.edu Mon Sep 7 23:53:03 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 7 Sep 2009 14:53:03 -0700
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <200909072128.52818.binet@cern.ch>
References: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz>
 <200909072128.52818.binet@cern.ch>
Message-ID:

Thanks for the implementation. I noticed that your compare was only
comparing pointers, so if you had two equal strings at different memory
addresses it wouldn't find them (which may or may not be what you'd
want). Also, your pointers would go out of scope as soon as test ended
(so you couldn't return ht and use it later).

I built on what you had to get a float -> float hashtable. Note that
this technique only works since the float value fits inside a void*;
anything bigger and you'd have to allocate memory manually to stick it
into the hashtable.

- Robert

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cy_stl.pyx
Type: application/octet-stream
Size: 2844 bytes
Desc: not available
Url : http://codespeak.net/pipermail/cython-dev/attachments/20090907/69939911/attachment.obj
-------------- next part --------------
sage: from cy_stl import *
sage: time time_c_hashtable(10**5)
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01 s
sage: time time_c_hashtable(10**6)
CPU times: user 0.08 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s
sage: time time_c_hashtable(10**7)
CPU times: user 0.75 s, sys: 0.00 s, total: 0.75 s
Wall time: 0.75 s

sage: time time_py_hashtable(10**5)
CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s
Wall time: 0.02 s
sage: time time_py_hashtable(10**6)
CPU times: user 0.18 s, sys: 0.00 s, total: 0.19 s
Wall time: 0.19 s
sage: time time_py_hashtable(10**7)
CPU times: user 2.01 s, sys: 0.01 s, total: 2.02 s
Wall time: 2.02 s

Not near the speed gains I was expecting...disappointing.

- Robert

On Sep 7, 2009, at 12:28 PM, Sebastien Binet wrote:

> hi there,
>
> attached is a simple cy_stl.pyx file (together with its setup.py
> companion)
>
> really just to get started :)
>
> to test:
> $ python -c 'import cy_stl as cc; cc.test()'
>
> hth,
> sebastien.
> >> cythonHash.pxd: >> >> cdef extern from "hash-table.h": >> ctypedef struct HashTable >> ctypedef void *HashTableKey >> ctypedef unsigned long HashTableHashFunc(HashTableKey value) >> ctypedef unsigned long HashTableEqualFunc(HashTableKey value) >> HashTable *hash_table_new(HashTableHashFunc hash_func, >> HashTableEqualFunc equal_func) >> >> cdef inline unsigned long c_hash_func(HashTableKey value): >> return 1 >> >> cdef inline unsigned long c_hash_equal(HashTableKey value): >> return 1 >> >> >> cythonPT.pyx: >> >> cimport cythonHash >> from cythonHash cimport HashTable >> >> class MY_Phrase_Table(object): >> >> def __init__(self): >> >> pp = HashTable #error here >> print type(pp), pp >> >> This yields in the following error: 'HashTable' is not a constant, >> variable >> or function identifier. >> >> I don't really get how I should reference to Hashtable. I thought >> it was >> already declared from .pxd and the original .c file. >> >> >> >> >> -----Original Message----- >> From: Stefan Behnel [mailto:stefan_ml at behnel.de] >> Sent: vrijdag 4 september 2009 16:03 >> To: sanne at kortec.nl >> Cc: cython-dev at codespeak.net >> Subject: Re: [Cython] FW: cython and hash tables / dictionary >> >> Sanne Korzec wrote: >>> In the documentation >> >> http://c-algorithms.sourceforge.net/doc/hash- >> table_8h.html#e361c4c0256ec6c7 >> 4 >> >>> 1ecfeabef33d891 , I can find: >>> >>> HashTable* hash_table_new ( HashTableHashFunc hash_func, >>> HashTableEqualFunc equal_func >>> ) >>> >>> To create a new hash table. But I can't find were the >>> HashTableHashFunc >> >> and >> >>> HashTableEqualFunc are declared. The only thing I can find is in the >> >> header >> >>> file which state: >>> >>> typedef unsigned long(* HashTableHashFunc)(HashTableKey value) >>> typedef unsigned long(* HashTableHashFunc)(HashTableKey value) >>> >>> Does this mean I have to write these functions myself? >> >> Yes. >> >>> In c? 
>> >> You can write them in Cython: >> >> cdef unsigned long c_hash(HashTableKey value): >> return huge_calculation_on(value) >> >>> And how then do I call them from cython? >> >> You don't. Instead, you pass the function names (i.e. pointers) into >> hash_table_new(). >> >>> My guess: >>> >>> hashtable.pyx >> >> "hashtable.pxd", I assume? >> >>> cdef extern from "hash_table.h": >>> >>> object HashTable hash_table_new(object hash_func, object >>> equal_func) >> >> That won't work. You can't use Python functions as their signature >> won't >> match the required signatures. Instead, define HashTable as a >> struct and >> the functions as a ctypedef. >> >> Stefan >> >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > -- > ######################################### > # Dr. Sebastien Binet > # Laboratoire de l'Accelerateur Lineaire > # Universite Paris-Sud XI > # Batiment 200 > # 91898 Orsay > #########################################_______ > ________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Tue Sep 8 00:19:10 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Sep 2009 15:19:10 -0700 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: References: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz> <200909072128.52818.binet@cern.ch> Message-ID: <13680BEC-2436-4905-AD66-40DDE9A4FBF9@math.washington.edu> On Sep 7, 2009, at 2:53 PM, Robert Bradshaw wrote: > Not near the speed gains I was expecting...disappointing. 
I was timing the wrong thing

sage: time time_c_hashtable(10**5)
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01 s
sage: time time_c_hashtable(10**6)
CPU times: user 0.08 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s
sage: time time_c_hashtable(10**7)
CPU times: user 0.76 s, sys: 0.00 s, total: 0.76 s
Wall time: 0.76 s

sage: time time_py_hashtable(10**5)
CPU times: user 0.06 s, sys: 0.00 s, total: 0.06 s
Wall time: 0.06 s
sage: time time_py_hashtable(10**6)
CPU times: user 0.50 s, sys: 0.00 s, total: 0.50 s
Wall time: 0.50 s
sage: time time_py_hashtable(10**7)
CPU times: user 5.14 s, sys: 0.01 s, total: 5.15 s
Wall time: 5.15 s

still 6.7x is not much.

From seb.binet at gmail.com Tue Sep 8 08:51:50 2009
From: seb.binet at gmail.com (Sebastien Binet)
Date: Tue, 8 Sep 2009 08:51:50 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To:
References: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz> <200909072128.52818.binet@cern.ch>
Message-ID: <200909080851.51055.binet@cern.ch>

On Monday 07 September 2009 23:53:03 Robert Bradshaw wrote:
> Thanks for the implementation. I noticed that your compare was only
> comparing pointers, so if you had two equal strings at different
> memory addresses it wouldn't find them (which may or may not be what
> you'd want)

yeah, for strings, the c-alg library has a dedicated string-hash:
http://c-algorithms.sourceforge.net/doc/hash-string_8h.html#6eb697fb58d3de146a2ddd76a1900f83

(as well as a case insensitive version)

> Also, your pointers would go out of scope as soon as test
> ended (so you couldn't return ht and use it later).

well, the C-way is to malloc everything (and c-alg's hash-table provides a way
to register the proper 'free' functions)

>
> I built on what you had to get a float -> float hashtable. Note that
> this technique only works since the float value fits inside a void*,
> anything bigger and you'd have to allocate memory manually to stick
> it into the hashtable.

right.

[..snip..]
> Not near the speed gains I was expecting...disappointing.
I (probably very naively) suspect this is coming from all the type conversions void*<->float but proper profiling would tell :) cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From dagss at student.matnat.uio.no Tue Sep 8 09:00:55 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Sep 2009 09:00:55 +0200 Subject: [Cython] Next release In-Reply-To: References: Message-ID: <4AA60127.20200@student.matnat.uio.no> Robert Bradshaw wrote: > It seems like there's consensus that we're overdue for another > release, so I propose that we try to get a release out in the next > couple of weeks. Before I loose all you non-coders, if anyone sees > something that they really want to see in the next release, please > stick/move the ticket to 0.11.3 or 0.12 (no promises, but it could > help establish priority). > > How about > > (1) Get the current -devel branch out as 0.11.3 as soon as possible. > Hopefully, this will just require testing lxml, sage, and throwing > out a couple of release candidates. I've assigned out some tickets at > > http://trac.cython.org/cython_trac/query? > status=assigned&status=new&status=reopened&group=milestone > > in the next week, lets all at least look at them, and resolve them if > they're quick or bump them if they're not. I'll look at the long- > overdue integer conversion review. > > (2) Let's try to get -unstable out for 0.12. I actually don't even > know how "unstable" it is, but I think it's not far off from being > ready. > > (3) Also in preparation for 0.12, let's go over all the tickets and > see which, if any, are low-hanging fruit. > > I also propose before 0.12 hits, we do a cython bug day. 
This is > something we've done with Sage and it works well for getting a lot > done in a little amount of time, especially those pesky little ones > that no one ever looks at. Essentially, all of us get together on IRC > and try to snuff out as many bugs as possible in a 24 hour period. > Maybe end of September/early October, with 0.12 coming out mid/late > October? I think this sounds fine. I'm not too sure about the bug day but I'd like to, I'll have to see when 0.12 approaches if I feel I've had an efficient week and can slack a day... :-) -- Dag Sverre From stefan_ml at behnel.de Tue Sep 8 09:48:37 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Sep 2009 09:48:37 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <200909080851.51055.binet@cern.ch> References: <20090907131512.WLMI19290.viefep15-int.chello.at@edge05.upc.biz> <200909072128.52818.binet@cern.ch> <200909080851.51055.binet@cern.ch> Message-ID: <4AA60C55.9040208@behnel.de> Sebastien Binet wrote: > On Monday 07 September 2009 23:53:03 Robert Bradshaw wrote: >> I noticed that your compare was only comparing pointers, so if you had >> two equal strings at different memory addresses it wouldn't find them >> (which may or may not be what you'd want) > > yeah, for strings, the c-alg library has a dedicated string-hash: > http://c-algorithms.sourceforge.net/doc/hash- > string_8h.html#6eb697fb58d3de146a2ddd76a1900f83 > > (as well as a case insensitive version) Regarding string hashes, this is worth a read: http://burtleburtle.net/bob/hash/doobs.html In version 2.7.x, libxml2's tag dictionaries switched to one of those, which brought a major performance boost for large sets of XML tags in documents. 
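
To make the point about content-based string hashing concrete, here is a minimal sketch in plain Python of the "one-at-a-time" hash described in the article linked above. It is only an illustration: the function name `one_at_a_time` is chosen here, the result is truncated to 32 bits as an assumption, and it is not necessarily the function that libxml2 or the c-algorithms library uses.

```python
def one_at_a_time(key: bytes) -> int:
    """One-at-a-time hash over the bytes of key, kept within 32 bits."""
    mask = 0xFFFFFFFF  # emulate 32-bit unsigned overflow
    h = 0
    for byte in key:
        h = (h + byte) & mask
        h = (h + (h << 10)) & mask
        h ^= h >> 6
    # final avalanche steps
    h = (h + (h << 3)) & mask
    h ^= h >> 11
    h = (h + (h << 15)) & mask
    return h


if __name__ == "__main__":
    first = b"phrase table"
    second = bytes(bytearray(b"phrase table"))  # equal content, separate object
    print(hex(one_at_a_time(first)))
    # Hashing the content (not the address) makes equal strings collide on purpose.
    print(one_at_a_time(first) == one_at_a_time(second))
```

Because the hash depends only on the bytes, two equal strings stored at different addresses land in the same bucket, which is the behaviour a pointer hash cannot give.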
Stefan From stefan_ml at behnel.de Tue Sep 8 10:24:48 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Sep 2009 10:24:48 +0200 Subject: [Cython] Next release In-Reply-To: References: Message-ID: <4AA614D0.8030907@behnel.de> Hi, I almost missed your mail as my mail program folded it into an ancient thread with the same title. I was actually about to write a similar mail but didn't get to finish it up before I saw yours. :) Robert Bradshaw wrote: > It seems like there's consensus that we're overdue for another > release totally. > (1) Get the current -devel branch out as 0.11.3 as soon as possible. > Hopefully, this will just require testing lxml Works for me. AFAIR, there were no really major changes since 0.11.2 anyway, mostly fixes. > and throwing > out a couple of release candidates. I've assigned out some tickets at > > http://trac.cython.org/cython_trac/query? > status=assigned&status=new&status=reopened&group=milestone > > in the next week, lets all at least look at them, and resolve them if > they're quick or bump them if they're not. Let's do that. > (2) Let's try to get -unstable out for 0.12. I actually don't even > know how "unstable" it is, but I think it's not far off from being > ready. I consider it mostly stable, except for the few remaining test case failures. Would be great if you could take a look, Robert. > (3) Also in preparation for 0.12, let's go over all the tickets and > see which, if any, are low-hanging fruit. > > I also propose before 0.12 hits, we do a cython bug day. This is > something we've done with Sage and it works well for getting a lot > done in a little amount of time, especially those pesky little ones > that no one ever looks at. Essentially, all of us get together on IRC > and try to snuff out as many bugs as possible in a 24 hour period. > Maybe end of September/early October, with 0.12 coming out mid/late > October? Early October should work for me, and I think the overall time frame is ok. 
Stefan From sanne at kortec.nl Tue Sep 8 13:41:40 2009 From: sanne at kortec.nl (Sanne Korzec) Date: Tue, 8 Sep 2009 13:41:40 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: Message-ID: <20090908114146.WHEJ14919.viefep11-int.chello.at@edge02.upc.biz> Thanks for the implementation, can't wait to try. But again I'm running into some problems. I installed the c stl with configure --prefix='/home/me/libcalg' Changed the setup.py: #!/usr/bin/env python # python setup.py build_ext --inplace from distutils.core import setup from distutils.extension import Extension from Cython.Distutils import build_ext ext = Extension( "cy_hash", ["cy_hash.pyx"], language="c", include_dirs=['/home/me/libcalg/include/libcalg-1.0'], library_dirs=['/home/me/libcalg/lib'], libraries=['calg'], #also tried 'libcalg/calg' cmdclass = {'build_ext': build_ext} ) setup( cmdclass={'build_ext': build_ext}, ext_modules=[ext] ) When I run python setup.py build_ext --inplace I get a warning: /usr/lib/python2.5/distutils/extension.py:133: UserWarning: Unknown Extension options: 'cmdclass' warnings.warn(msg) When I run python -c 'import cy_hash as cc; cc.test()' I get: ImportError: libcalg.so.0: cannot open shared object file: No such file or directory The file is compiled and ready in '/home/me/libcalg/lib' and is rwx accessible for me. What's going on here? -----Original Message----- From: Robert Bradshaw [mailto:robertwb at math.washington.edu] Sent: maandag 7 september 2009 23:53 To: Cython-dev; Sebastien Binet Cc: sanne at kortec.nl Subject: Re: [Cython] FW: cython and hash tables / dictionary Thanks for the implementation. I noticed that your compare was only comparing pointers, so if you had two equal strings at different memory addresses it wouldn't find them (which may or may not be what you'd want) Also, your pointers would go out of scope as soon as test ended (so you couldn't return ht and use it later). I built on what you had to get a float -> float hashtable. 
Note that this technique only works since the float value fits inside a void*, anything bigger and you'd have to allocate memory manually to stick it into the hashtable. - Robert From seb.binet at gmail.com Tue Sep 8 13:45:40 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 8 Sep 2009 13:45:40 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <20090908114146.WHEJ14919.viefep11-int.chello.at@edge02.upc.biz> References: <20090908114146.WHEJ14919.viefep11-int.chello.at@edge02.upc.biz> Message-ID: <200909081345.40711.binet@cern.ch> On Tuesday 08 September 2009 13:41:40 Sanne Korzec wrote: > Thanks for the implementation, can't wait to try. But again I'm running > into some problems. > > I installed the c stl with configure --prefix='/home/me/libcalg' > > Changed the setup.py: > > #!/usr/bin/env python > # python setup.py build_ext --inplace > from distutils.core import setup > from distutils.extension import Extension > from Cython.Distutils import build_ext > > ext = Extension( > "cy_hash", > ["cy_hash.pyx"], > language="c", > include_dirs=['/home/me/libcalg/include/libcalg-1.0'], > library_dirs=['/home/me/libcalg/lib'], > libraries=['calg'], #also tried 'libcalg/calg' > cmdclass = {'build_ext': build_ext} > ) > > setup( > cmdclass={'build_ext': build_ext}, > ext_modules=[ext] > ) > > When I run python setup.py build_ext --inplace > > I get a warning: /usr/lib/python2.5/distutils/extension.py:133: > UserWarning: Unknown Extension options: 'cmdclass' > warnings.warn(msg) > > When I run python -c 'import cy_hash as cc; cc.test()' > > I get: ImportError: libcalg.so.0: cannot open shared object file: No such > file or directory you probably need to add the /home/me/libcalg/lib directory to your LD_LIBRARY_PATH environment variable. cheers, sebastien. -- ######################################### # Dr. 
Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################

From dalcinl at gmail.com Tue Sep 8 16:26:15 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 8 Sep 2009 11:26:15 -0300
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <20090908114146.WHEJ14919.viefep11-int.chello.at@edge02.upc.biz>
References: <20090908114146.WHEJ14919.viefep11-int.chello.at@edge02.upc.biz>
Message-ID:

On Tue, Sep 8, 2009 at 8:41 AM, Sanne Korzec wrote:
> Thanks for the implementation, can't wait to try. But again I'm running into
> some problems.
>
> I installed the c stl with configure --prefix='/home/me/libcalg'
>
> Changed the setup.py:
>
> #!/usr/bin/env python
> # python setup.py build_ext --inplace
> from distutils.core import setup
> from distutils.extension import Extension
> from Cython.Distutils import build_ext
>
> ext = Extension(
>     "cy_hash",
>     ["cy_hash.pyx"],
>     language="c",
>     include_dirs=['/home/me/libcalg/include/libcalg-1.0'],
>     library_dirs=['/home/me/libcalg/lib'],
>     libraries=['calg'],                 #also tried 'libcalg/calg'
>     cmdclass = {'build_ext': build_ext}
>     )
>
> setup(
>     cmdclass={'build_ext': build_ext},
>     ext_modules=[ext]
>     )
>
> When I run python setup.py build_ext --inplace
>
> I get a warning: /usr/lib/python2.5/distutils/extension.py:133: UserWarning:
> Unknown Extension options: 'cmdclass'
> warnings.warn(msg)
>

The warning is very clear, I think... 'cmdclass' is not an option for
Extension(), so remove it... 'cmdclass' is an option for the setup()
function...

> When I run python -c 'import cy_hash as cc; cc.test()'
>
> I get: ImportError: libcalg.so.0: cannot open shared object file: No such
> file or directory
>
> The file is compiled and ready in '/home/me/libcalg/lib' and is rwx
> accessible for me. What's going on here?
>

Please add the argument below to the Extension() constructor...

ext = Extension(
    ...,
    runtime_library_dirs=['/home/me/libcalg/lib'],
    ...
    )

That way, distutils will ask the C compiler to pass a special flag
("-Wl,-rpath /home/me/libcalg/lib") to the linker specifying the path
of your library...

>
>
> -----Original Message-----
> From: Robert Bradshaw [mailto:robertwb at math.washington.edu]
> Sent: maandag 7 september 2009 23:53
> To: Cython-dev; Sebastien Binet
> Cc: sanne at kortec.nl
> Subject: Re: [Cython] FW: cython and hash tables / dictionary
>
> Thanks for the implementation. I noticed that your compare was only
> comparing pointers, so if you had two equal strings at different
> memory addresses it wouldn't find them (which may or may not be what
> you'd want) Also, your pointers would go out of scope as soon as test
> ended (so you couldn't return ht and use it later).
>
> I built on what you had to get a float -> float hashtable. Note that
> this technique only works since the float value fits inside a void*,
> anything bigger and you'd have to allocate memory manually to stick
> it into the hashtable.
>
> - Robert
>
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From sanne at kortec.nl Tue Sep 8 16:28:48 2009
From: sanne at kortec.nl (Sanne Korzec)
Date: Tue, 8 Sep 2009 16:28:48 +0200
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To:
Message-ID: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz>

Thanks for both examples.
Is there a way to check the value of a float at different stages as it is
being cast from one type to another?

E.g. is there a way to write to the screen the value of a void*, void** or
float*? Currently all float values come back as 0.0 in my implementation.

-----Original Message-----
From: Robert Bradshaw [mailto:robertwb at math.washington.edu]
Sent: maandag 7 september 2009 23:53
To: Cython-dev; Sebastien Binet
Cc: sanne at kortec.nl
Subject: Re: [Cython] FW: cython and hash tables / dictionary

Thanks for the implementation. I noticed that your compare was only
comparing pointers, so if you had two equal strings at different
memory addresses it wouldn't find them (which may or may not be what
you'd want) Also, your pointers would go out of scope as soon as test
ended (so you couldn't return ht and use it later).

I built on what you had to get a float -> float hashtable. Note that
this technique only works since the float value fits inside a void*,
anything bigger and you'd have to allocate memory manually to stick
it into the hashtable.

- Robert

From dalcinl at gmail.com Tue Sep 8 16:52:39 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 8 Sep 2009 11:52:39 -0300
Subject: [Cython] FW: cython and hash tables / dictionary
In-Reply-To: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz>
References: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz>
Message-ID:

On Tue, Sep 8, 2009 at 11:28 AM, Sanne Korzec wrote:
>
> is there a way to write to the screen the value of a void* void** or
> float*, cause currently all float values return (0.0) in my implementation.
>

What about using the C stdlib printf() with the "%p" format, like this:

cdef extern from "stdio.h":
    int printf(char*,...)

cdef float *ptr = NULL
printf("%p\n", ptr)

>
>
>
>
> -----Original Message-----
> From: Robert Bradshaw [mailto:robertwb at math.washington.edu]
> Sent: maandag 7 september 2009 23:53
> To: Cython-dev; Sebastien Binet
> Cc: sanne at kortec.nl
> Subject: Re: [Cython] FW: cython and hash tables / dictionary
>
> Thanks for the implementation. I noticed that your compare was only
> comparing pointers, so if you had two equal strings at different
> memory addresses it wouldn't find them (which may or may not be what
> you'd want) Also, your pointers would go out of scope as soon as test
> ended (so you couldn't return ht and use it later).
>
> I built on what you had to get a float -> float hashtable. Note that
> this technique only works since the float value fits inside a void*,
> anything bigger and you'd have to allocate memory manually to stick
> it into the hashtable.
>
> - Robert
>
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From seb.binet at gmail.com Tue Sep 8 19:38:09 2009
From: seb.binet at gmail.com (Sebastien Binet)
Date: Tue, 8 Sep 2009 19:38:09 +0200
Subject: [Cython] difference b/w PyObject* and object ?
Message-ID: <200909081938.09161.binet@cern.ch>

hi there,

crawling my way through the cython-unstable tree and the Includes/*.pxd files
I see stuff like so (python_dict.pxd):

    object PyDict_Copy(object p)
    PyObject* PyDict_GetItem(object p, object key)

are they just syntactic sugar for the very same thing ? (I guess so)
which one is the most advised ?

cheers,
sebastien.
-- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From greg.ewing at canterbury.ac.nz Wed Sep 9 01:36:49 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Sep 2009 11:36:49 +1200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz> References: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz> Message-ID: <4AA6EA91.6060906@canterbury.ac.nz> Sanne Korzec wrote: > Is there a way to check the value of a float at different stages as it is > being cast from one type to another? If you're building a specialised hash table just for floats, why are you casting them to void * at all? Why not just store them as floats? -- Greg From greg.ewing at canterbury.ac.nz Wed Sep 9 02:00:36 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Sep 2009 12:00:36 +1200 Subject: [Cython] difference b/w PyObject* and object ? In-Reply-To: <200909081938.09161.binet@cern.ch> References: <200909081938.09161.binet@cern.ch> Message-ID: <4AA6F024.2080902@canterbury.ac.nz> Sebastien Binet wrote: > object PyDict_Copy(object p) > PyObject* PyDict_GetItem(object p, object key) > > are they just syntaxic sugar, sweeting the pot to the very same thing ? I think PyObject * has been used for PyDict_GetItem because it returns a borrowed reference. 
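
As an illustration of what "borrowed reference" means here, the following minimal sketch calls PyDict_GetItem from plain Python through ctypes and shows that it hands back the address of the value already stored in the dict. This is an assumption-laden demo rather than how Cython itself handles the declaration: it relies on CPython (where id() is the object's address), and the helper name `borrowed_address` is made up for the example.

```python
import ctypes

# PyDict_GetItem returns a *borrowed* reference, so the result is declared as a
# raw address; restype=py_object would make ctypes treat it as a new reference.
PyDict_GetItem = ctypes.pythonapi.PyDict_GetItem
PyDict_GetItem.argtypes = [ctypes.py_object, ctypes.py_object]
PyDict_GetItem.restype = ctypes.c_void_p


def borrowed_address(d, key):
    """Return the address of d[key] without owning a reference (None if missing)."""
    return PyDict_GetItem(d, key)


if __name__ == "__main__":
    value = [1.5, 2.5]
    table = {42: value}
    # CPython only: the borrowed pointer is the address of the stored object.
    print(borrowed_address(table, 42) == id(value))
    # A missing key yields NULL, which ctypes reports as None.
    print(borrowed_address(table, 7))
```

The caller receives a pointer it does not own, which is why the .pxd file spells the return type as PyObject* instead of object.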
-- Greg From robertwb at math.washington.edu Wed Sep 9 04:09:00 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 8 Sep 2009 19:09:00 -0700 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <4AA6EA91.6060906@canterbury.ac.nz> References: <20090908142850.UHOB24520.viefep12-int.chello.at@edge03.upc.biz> <4AA6EA91.6060906@canterbury.ac.nz> Message-ID: On Sep 8, 2009, at 4:36 PM, Greg Ewing wrote: > Sanne Korzec wrote: > >> Is there a way to check the value of a float at different stages >> as it is >> being cast from one type to another? > > If you're building a specialised hash table just for floats, > why are you casting them to void * at all? Why not just > store them as floats? It was an example of using an (existing) hashtable that was written to store void*. - Robert From robertwb at math.washington.edu Wed Sep 9 04:10:00 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 8 Sep 2009 19:10:00 -0700 Subject: [Cython] difference b/w PyObject* and object ? In-Reply-To: <4AA6F024.2080902@canterbury.ac.nz> References: <200909081938.09161.binet@cern.ch> <4AA6F024.2080902@canterbury.ac.nz> Message-ID: <30C563F1-0853-47A9-8A68-65562FFD41C6@math.washington.edu> On Sep 8, 2009, at 5:00 PM, Greg Ewing wrote: > Sebastien Binet wrote: >> object PyDict_Copy(object p) >> PyObject* PyDict_GetItem(object p, object key) >> >> are they just syntaxic sugar, sweeting the pot to the very same >> thing ? > > I think PyObject * has been used for PyDict_GetItem because it > returns a borrowed reference. Exactly. You probably only need to use object. - Robert From seb.binet at gmail.com Wed Sep 9 08:37:40 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Wed, 9 Sep 2009 08:37:40 +0200 Subject: [Cython] difference b/w PyObject* and object ? 
In-Reply-To: <4AA6F024.2080902@canterbury.ac.nz>
References: <200909081938.09161.binet@cern.ch> <4AA6F024.2080902@canterbury.ac.nz>
Message-ID: <200909090837.41052.binet@cern.ch>

On Wednesday 09 September 2009 02:00:36 Greg Ewing wrote:
> Sebastien Binet wrote:
> > object PyDict_Copy(object p)
> > PyObject* PyDict_GetItem(object p, object key)
> >
> > are they just syntaxic sugar, sweeting the pot to the very same thing ?
>
> I think PyObject * has been used for PyDict_GetItem because it
> returns a borrowed reference.
>

thanks, and as usual, moments after I hit the send button I found the answer
in the code (python.pxd):

# For all the declaration below, whenever the Py_ function returns
# a *new reference* to a PyObject*, the return type is "object".
# When the function returns a borrowed reference, the return
# type is PyObject*. When Cython sees "object" as a return type
# it doesn't increment the reference count. When it sees PyObject*
# in order to use the result you must explicitly cast to <object>,
# and when you do that Cython increments the reference count whether
# you want it to or not, forcing you to an explicit DECREF (or leak memory).
# To avoid this we make the above convention.
# Cython takes care of this automatically for anything of type object.

## More precisely, I think the correct convention for
## using the Python/C API from Pyrex is as follows.
##
## (1) Declare all input arguments as type "object". This way no explicit
##     casting is needed, and moreover Pyrex doesn't generate
##     any funny reference counting.
## (2) Declare output as object if a new reference is returned.
## (3) Declare output as PyObject* if a borrowed reference is returned.

sorry for the RTFM noise.

cheers,
sebastien.
--
#########################################
# Dr.
Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From robertwb at math.washington.edu Wed Sep 9 10:58:47 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 01:58:47 -0700 Subject: [Cython] ticket #333 (extern ctypedef integral <-> python object conversion), please review patch In-Reply-To: References: Message-ID: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> On Jun 19, 2009, at 9:44 AM, Lisandro Dalcin wrote: > I've finally managed to write a more or less working patch... Look at > the test, I've tried to cover many corner cases... > > 1) Dag & Kurt: I bet you will be happy. > > > 2) Robert: your eyeballs needed, please comment on this: > > a) Look at the changes outside PyrexTypes.pyx, I believe they make > sense, though I would like you to confirm that. > > b) In the past, you raised some concerns about __int32 from ILP64 > model... A possible (and suboptimal, no overflow-safe) way of handling > that is there, though "#if 0" disabled. I've tried to take advantage > of "_PyLong_{As/From}ByteArray()", but that (in particular, the "As" > one) is somewhat harder to use, as we should pass a PyLongObject type. Sorry I've taken so long to get back on this. I have read all the code, and it looks good. I've posted some comments on trac. http://trac.cython.org/cython_trac/ticket/333 - Robert From sanne at kortec.nl Wed Sep 9 16:18:06 2009 From: sanne at kortec.nl (Sanne Korzec) Date: Wed, 9 Sep 2009 16:18:06 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: Message-ID: <20090909141810.LZQD422.viefep16-int.chello.at@edge01.upc.biz> Ok, I have played around with this hash table and understand most of the basics... I now would like to create the datastructure I need for my project.
Basically what I need is two linked hash tables like this Key : int ---> value : hashtable { key: int ---> value: list(float, float) } And Key : int ---> value : hashtable { key: int ---> value: float } Since this hash table only works with void* keys and values, I am starting to wonder whether I should change the .c and .h files myself. I think I would prefer not to. Does anybody have a suggestion what would be wise? Preference goes to quick implementation, not total optimization. From robertwb at math.washington.edu Wed Sep 9 18:53:17 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 09:53:17 -0700 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> Message-ID: <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> On Jul 13, 2009, at 3:35 PM, Eric Eisner wrote: > On Tue, Jul 14, 2009 at 00:51, Robert > Bradshaw wrote: >> On Jul 13, 2009, at 6:47 AM, Eric Eisner wrote: >> >>> Hi, >>> >>> I was working on a wrapper for a c function that took an unsigned >>> char* and its length (the string could have null bytes, so it >>> needs a >>> specific length). I was having some trouble getting cython to >>> compile >>> a simple conversion of string to unsigned char*, the way I >>> eventually >>> got it to work is: >>> >>> udata = <unsigned char*><char*>pydata >>> >>> This was a surprising requirement that took me a while to figure >>> out. >>> Is it intentional that strings cannot be directly cast to unsigned >>> char? >> >> No, I don't think that's intentional. >> >>> If not, I assume this can be fixed easily...by someone who >>> understands the code of course. >> >> Yes, I think that could be fixed relatively easy.
However, note >> that casting >> Python objects directly to char* is skirting all unicode/charset >> issues. >> >> - Robert > > This application was specifically supposed to be for arbitrary data > bytes (hence needed the null bytes) and the term string was the 2.x > nomenclature. For a 3.x version, it would definitely need to take > bytes Having null bytes has nothing to do with char vs. unsigned char. I've thought about this some more, and the amount of casting it would take to get the C compiler to not complain when trying to treat unsigned char* as strings, I actually don't think it's any natural to convert strings to unsigned char*, so the double cast above seems like the right thing to do (the first cast extracts the string data, the second changes the pointer type). The same would work if you wanted to treat the contents of pydata as a void* or an int*, etc. It would, however, be worth an entry in the FAQ. - Robert From dalcinl at gmail.com Wed Sep 9 18:57:04 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 13:57:04 -0300 Subject: [Cython] removing PyrexType.is_longlong Message-ID: I'm going to remove "is_longlong" from "PyrexType"... "is_longlong" is currently used in a single place, and we can easily get rid of it; new usages should be discouraged as we have put a lot of effort in making the beast size-agnostic... 
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Wed Sep 9 19:05:12 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 14:05:12 -0300 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> Message-ID: On Wed, Sep 9, 2009 at 1:53 PM, Robert Bradshaw wrote: > On Jul 13, 2009, at 3:35 PM, Eric Eisner wrote: > >>>> udata = <unsigned char*><char*>pydata >>>> > so the double cast above seems > like the right thing to do (the first cast extracts the string data, > the second changes the pointer type). However, if 'pydata' has embedded null characters, the conversion will fail. I think that the only reasonable way to handle 'bytes' ('str' in Py2) is by using PyBytes_AsStringAndSize() (PyString_AsStringAndSize() in Py2)... without the buffer size, a buffer with embedded null characters is useless.
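[Editorial sketch, not part of the original message.] The point about embedded null bytes can be illustrated in plain C without the Python C API. The helper names `length_seen_without_size` and `sum_bytes` below are made up for this illustration: a consumer that is not told the size stops at the first NUL byte, while one that receives the size explicitly sees every byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: measure a buffer the way a length-unaware C API
   would, i.e. by scanning up to the first NUL byte. */
static size_t length_seen_without_size(const unsigned char *data)
{
    return strlen((const char *)data);
}

/* Hypothetical helper: sum every byte of a buffer whose size is passed
   explicitly, so embedded NUL bytes do not cut the data short. */
static unsigned long sum_bytes(const unsigned char *data, size_t size)
{
    unsigned long total = 0;
    size_t i;
    for (i = 0; i < size; i++)
        total += data[i];
    return total;
}
```

For the five-byte buffer {'a', 'b', 0, 'c', 'd'}, the first helper reports only the two bytes before the NUL, whereas the second, given the size, also accounts for 'c' and 'd'. This is why a pointer-plus-size call is suggested above for data that may contain null bytes.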
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Wed Sep 9 19:23:01 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 14:23:01 -0300 Subject: [Cython] C99 complex and GNU extensions Message-ID: My GCC manual page says this: To extract the real part of a complex-valued expression EXP, write `__real__ EXP'. Likewise, use `__imag__' to extract the imaginary part. This is a GNU extension; for values of floating type, you should use the ISO C99 functions `crealf', `creal', `creall', `cimagf', `cimag' and `cimagl', declared in `<complex.h>' and also provided as built-in functions by GCC. So it seems that __real__ and __imag__ are actually GNU extensions... Then these defines: #define __Pyx_REAL_PART(z) __real__(z) #define __Pyx_IMAG_PART(z) __imag__(z) could potentially not work on compilers other than GCC... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Wed Sep 9 19:41:03 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 10:41:03 -0700 Subject: [Cython] removing PyrexType.is_longlong In-Reply-To: References: Message-ID: On Sep 9, 2009, at 9:57 AM, Lisandro Dalcin wrote: > I'm going to remove "is_longlong" from "PyrexType"... "is_longlong" is > currently used in a single place, and we can easily get rid of it; new > usages should be discouraged as we have put a lot of effort in making > the beast size-agnostic...
Sounds like a good idea to me. - Robert From robertwb at math.washington.edu Wed Sep 9 19:43:45 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 10:43:45 -0700 Subject: [Cython] casting strings to unsigned char* In-Reply-To: References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> Message-ID: On Sep 9, 2009, at 10:05 AM, Lisandro Dalcin wrote: > On Wed, Sep 9, 2009 at 1:53 PM, Robert > Bradshaw wrote: >> On Jul 13, 2009, at 3:35 PM, Eric Eisner wrote: >> >>>>> udata = <unsigned char*><char*>pydata >>>>> >> so the double cast above seems >> like the right thing to do (the first cast extracts the string data, >> the second changes the pointer type). > > However, if 'pydata' has embedded null characters, the conversion will > fail. I think that the only reasonable way to handle 'bytes' ('str' in > Py2) is by using PyBytes_AsStringAndSize() (PyString_AsStringAndSize() > in Py2)... without the buffer size, a buffer with embedded null > characters is useless. Here we are going from python to C. Yes, if there are null characters then the length needs to be determined otherwise, and the conversion to Python via one of the Py*_AsAndSize methods. - Robert From robertwb at math.washington.edu Wed Sep 9 19:53:07 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 10:53:07 -0700 Subject: [Cython] C99 complex and GNU extensions In-Reply-To: References: Message-ID: <92BA423E-D16F-4146-A446-4138C9997F25@math.washington.edu> On Sep 9, 2009, at 10:23 AM, Lisandro Dalcin wrote: > My GCC manual page says this: > > To extract the real part of a complex-valued expression EXP, write > `__real__ EXP'. Likewise, use `__imag__' to extract the imaginary > part.
This is a GNU extension; for values of floating type, you > should > use the ISO C99 functions `crealf', `creal', `creall', `cimagf', > `cimag' and `cimagl', declared in `<complex.h>' and also provided as > built-in functions by GCC. > > So it seems that __real__ and __imag__ are actually GNU extensions... > Then these defines: > > #define __Pyx_REAL_PART(z) __real__(z) > #define __Pyx_IMAG_PART(z) __imag__(z) > > could potentially not work on compilers other than GCC... Hmm... true. The advantages of __real__ and __imag__ are that they work for non-floating point types (when the compiler supports them), they are the same for all types, and perhaps most importantly, they are valid lvalues. The IBM and CodeWarrior compilers seem to support this extension, MSVC isn't even c99 compliant. Is there a c99 way of getting the real and imaginary parts of a complex as an lvalue? - Robert From stefan_ml at behnel.de Wed Sep 9 20:24:41 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 Sep 2009 20:24:41 +0200 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> Message-ID: <4AA7F2E9.9010500@behnel.de> Robert Bradshaw wrote: > I've thought about this some more, and the amount of casting it would > take to get the C compiler to not complain when trying to treat > unsigned char* as strings, I actually don't think it's any natural to > convert strings to unsigned char*, so the double cast above seems > like the right thing to do Regarding the "natural" bit, libxml2 actually defines all its UTF-8 encoded byte strings as "unsigned char*". So, except for serialised XML, basically every string you get from libxml2 uses that.
This is so inconvenient to work with in Cython that the original author of lxml actually went for the simple 'solution' of declaring everything as plain char* and passing "-w" to gcc (which is still in use today, although it already bit me more than once). I don't think there's anything wrong with letting Cython do the necessary casting under the hood. Stefan From dalcinl at gmail.com Wed Sep 9 21:24:35 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 16:24:35 -0300 Subject: [Cython] change arg types for PyErr_Restore(t, v, tb) in python_exc.pxd Message-ID: I would like to change this declaration (in Cython/Includes/python_exc.pxd) void PyErr_Restore(object type, object value, object traceback) for this void PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback) 1) That call is usually used after PyErr_Fetch(), which has PyObject** args... 2) This call steals references, so IMO it is dangerous to use 'object' arguments 3) It should be valid to call PyErr_Restore(NULL,NULL,NULL) (documented to clear the error indicator) with this change, these lines (in Cython/Runtime/refnanny.pyx) PyErr_Restore(<object>type, <object>value, <object>tb) will become PyErr_Restore(type, value, tb) Any objections? Am I missing something?
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Wed Sep 9 21:31:23 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 12:31:23 -0700 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <4AA7F2E9.9010500@behnel.de> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> Message-ID: <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> On Sep 9, 2009, at 11:24 AM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> I've thought about this some more, and the amount of casting it would >> take to get the C compiler to not complain when trying to treat >> unsigned char* as strings, I actually don't think it's any natural to >> convert strings to unsigned char*, so the double cast above seems >> like the right thing to do > > Regarding the "natural" bit, libxml2 actually defines all its UTF-8 > encoded > byte strings as "unsigned char*". So, except for serialised XML, > basically > every string you get from libxml2 uses that. This is so > inconvenient to > work with in Cython that the original author of lxml actually went > for the > simple 'solution' of declaring everything as plain char* and > passing "-w" > to gcc (which is still in use today, although it already bit me > more than > once). Interesting.
One of the reasons I was so quick to discard this is because I thought the usecase was that null characters needed to be embedded, which is completely orthogonal, and I couldn't think of anywhere I'd come across unsigned char* used for strings (but clearly libxml2 is such a library). Just out of curiosity, does it use char* for ASCII and unsigned char* for utf-8 as a poor-man's typechecking for encoding? > I don't think there's anything wrong with letting Cython do the > necessary > casting under the hood. http://trac.cython.org/cython_trac/ticket/359 - Robert From robertwb at math.washington.edu Wed Sep 9 21:34:51 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 12:34:51 -0700 Subject: [Cython] change arg types for PyErr_Restore(t, v, tb) in python_exc.pxd In-Reply-To: References: Message-ID: On Sep 9, 2009, at 12:24 PM, Lisandro Dalcin wrote: > I would like to change this declaration (in Cython/Includes/ > python_exc.pxd) > > void PyErr_Restore(object type, object value, object traceback) > > for this > > void PyErr_Restore(PyObject *type, PyObject *value, PyObject > *traceback) > > 1) That call is usually used after PyErr_Fetch(), which has > PyObject** args... > 2) This call steals references, so IMO it is dangerous to use > 'object' arguments > 3) It should be valid to call PyErr_Restore(NULL,NULL,NULL) > (documented to clear the error indicator) > > with this change, these lines (in Cython/Runtime/refnanny.pyx) > > PyErr_Restore(<object>type, <object>value, <object>tb) > > will become > > PyErr_Restore(type, value, tb) > > Any objections? Am I missing something? Makes sense to me.
- Robert From dalcinl at gmail.com Wed Sep 9 22:20:09 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 17:20:09 -0300 Subject: [Cython] removing PyrexType.is_longlong In-Reply-To: References: Message-ID: On Wed, Sep 9, 2009 at 2:41 PM, Robert Bradshaw wrote: > On Sep 9, 2009, at 9:57 AM, Lisandro Dalcin wrote: > >> I'm going to remove "is_longlong" from "PyrexType"... "is_longlong" is >> currently used in a single place, and we can easily get rid of it; new >> usages should be discouraged as we have put a lot of effort in making >> the beast size-agnostic... > > Sounds like a good idea to me. > Pushed: http://hg.cython.org/cython-devel/rev/5f8653965c34 -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Wed Sep 9 22:32:59 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Sep 2009 17:32:59 -0300 Subject: [Cython] change arg types for PyErr_Restore(t, v, tb) in python_exc.pxd In-Reply-To: References: Message-ID: On Wed, Sep 9, 2009 at 4:34 PM, Robert Bradshaw wrote: > On Sep 9, 2009, at 12:24 PM, Lisandro Dalcin wrote: > >> I would like to change this declaration (in Cython/Includes/ >> python_exc.pxd) >> >> void PyErr_Restore(object type, object value, object traceback) >> >> for this >> >> void PyErr_Restore(PyObject *type, PyObject *value, PyObject >> *traceback) >> >> 1) That call is usually used after PyErr_Fetch(), which has >> PyObject** args... >> 2) This call steals references, so IMO it is dangerous to use >> 'object' arguments >> 3) It should be valid to call PyErr_Restore(NULL,NULL,NULL) >> (documented to clear the error indicator) >> >> with this change, these lines (in Cython/Runtime/refnanny.pyx) >> >>
PyErr_Restore(<object>type, <object>value, <object>tb) >> >> will become >> >> PyErr_Restore(type, value, tb) >> >> Any objections? Am I missing something? > > Makes sense to me. > Pushed: http://hg.cython.org/cython-devel/rev/8157444859b4 -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Thu Sep 10 02:57:16 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 9 Sep 2009 17:57:16 -0700 Subject: [Cython] change arg types for PyErr_Restore(t, v, tb) in python_exc.pxd In-Reply-To: References: Message-ID: <5A599515-FAA7-47D3-9818-EE0D60B91BD5@math.washington.edu> On Sep 9, 2009, at 1:32 PM, Lisandro Dalcin wrote: > Pushed: http://hg.cython.org/cython-devel/rev/8157444859b4 Thanks and Thanks. - Robert From stefan_ml at behnel.de Thu Sep 10 04:40:14 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 10 Sep 2009 04:40:14 +0200 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> Message-ID: <4AA8670E.20300@behnel.de> Robert Bradshaw wrote: > One of the reasons I was so quick to discard this is > because I thought the usecase was that null characters needed to be > embedded, which is completely orthogonal, and I couldn't think of > anywhere I'd come across unsigned char* used for strings (but clearly > libxml2 is such a library).
I actually never understood why people use plain char* in the first place (ok, apart from tradition, laziness and non-ASCII unawareness). Any 1-byte encoding table I've ever come across maps characters to the byte values 0-255 or 0x00-0xFF. I've never seen an encoded byte string represented with negative byte values. The habit of using char* for text goes so far that I wasn't even aware that char* was pointing to a signed value when I learned C. Before I was made aware of it, I just unconsciously considered 'char' a special case in the language (which it actually is when you think about it). > Just out of curiosity, does it use char* for ASCII and unsigned char* > for utf-8 as a poor-man's typechecking for encoding? It's a form of type-checking, yes, but not in that way. It uses unsigned char* for text (tag names, text values, etc.) and char* for data sequences (e.g. file names and serialised XML). It even redefines "unsigned char" as "xmlChar" for that purpose, and a macro "BAD_CAST" that does exactly what it sounds like. I guess the historical reason to do that was that you can (or could?) switch the internal text encoding in libxml2, so you could use Latin-1 instead of UTF-8, for example, and the xmlChar* would denote all strings encoded that way. Doesn't make much sense for XML nowadays and just little more for HTML, but it's still a nice way of documenting the API. And to me, it makes sense to use "unsigned char" anyway. >> I don't think there's anything wrong with letting Cython do the >> necessary casting under the hood. > > http://trac.cython.org/cython_trac/ticket/359 Thanks. Stefan From robertwb at math.washington.edu Thu Sep 10 09:04:11 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Sep 2009 00:04:11 -0700 Subject: [Cython] filetable_cname vs filenames_cname Message-ID: <9BD65EE9-13BA-4538-8FD9-C34127AE8470@math.washington.edu> Does anyone know why we have filenames_cname and filetable_cname? 
The only time the latter is actually used is static void __pyx_init_filenames(void) { __pyx_f = __pyx_filenames; } - Robert From robertwb at math.washington.edu Thu Sep 10 09:15:13 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Sep 2009 00:15:13 -0700 Subject: [Cython] casting strings to unsigned char* In-Reply-To: <4AA8670E.20300@behnel.de> References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> <4AA8670E.20300@behnel.de> Message-ID: <25EB1675-158F-49DB-8357-61145E99E572@math.washington.edu> On Sep 9, 2009, at 7:40 PM, Stefan Behnel wrote: > > Robert Bradshaw wrote: >> One of the reasons I was so quick to discard this is >> because I thought the usecase was that null characters needed to be >> embedded, which is completely orthogonal, and I couldn't think of >> anywhere I'd come across unsigned char* used for strings (but clearly >> libxml2 is such a library). > > I actually never understood why people use plain char* in the first > place > (ok, apart from tradition, laziness and non-ASCII unawareness). Any > 1-byte > encoding table I've ever come across maps characters to the byte > values > 0-255 or 0x00-0xFF. I've never seen an encoded byte string > represented with > negative byte values. The habit of using char* for text goes so far > that I > wasn't even aware that char* was pointing to a signed value when I > learned > C. It might not be signed, depends on the compiler. Unsigned char is just so long to type for something so common, and that upper bit (or the negative values) really aren't all that useful--once you leave ASCII there's a whole host of bigger issues to deal with (if only everyone used UTF-8 unicode...). 
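[Editorial sketch, not part of the original message.] The remark that plain char "might not be signed, depends on the compiler" can be checked in plain C. The helper names `plain_char_is_signed` and `byte_value` below are invented for this illustration; the sketch only relies on CHAR_MIN from <limits.h> and on converting through unsigned char to obtain a value in the range 0..255.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if plain char is a signed type on this compiler, 0 otherwise.
   CHAR_MIN is negative exactly when plain char is signed. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: read byte i of an encoded string as a value in
   0..255, independent of whether plain char is signed. */
static int byte_value(const char *s, size_t i)
{
    return (unsigned char)s[i];
}
```

With a Latin-1 encoded "é" (the single byte 0xE9), reading s[0] directly gives a negative number where plain char is signed and 233 where it is unsigned, while byte_value(s, 0) gives 233 in both cases. That matches the observation that encoding tables use 0-255 even though char* is the customary type for text.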
> Before I was made aware of it, I just unconsciously considered > 'char' a > special case in the language (which it actually is when you think > about it). > > >> Just out of curiosity, does it use char* for ASCII and unsigned char* >> for utf-8 as a poor-man's typechecking for encoding? > > It's a form of type-checking, yes, but not in that way. It uses > unsigned > char* for text (tag names, text values, etc.) and char* for data > sequences > (e.g. file names and serialised XML). It even redefines "unsigned > char" as > "xmlChar" for that purpose, and a macro "BAD_CAST" that does > exactly what > it sounds like. > > I guess the historical reason to do that was that you can (or could?) > switch the internal text encoding in libxml2, so you could use Latin-1 > instead of UTF-8, for example, and the xmlChar* would denote all > strings > encoded that way. Doesn't make much sense for XML nowadays and just > little > more for HTML, but it's still a nice way of documenting the API. > And to me, > it makes sense to use "unsigned char" anyway. Ah. 
- Robert From paxcalpt at gmail.com Thu Sep 10 17:51:09 2009 From: paxcalpt at gmail.com (Ricardo Henriques) Date: Thu, 10 Sep 2009 17:51:09 +0200 Subject: [Cython] Problem on Cython compilation Message-ID: <5AD7DB18-E451-4A85-9859-9D64A4ED32BD@gmail.com> Hi guys, I'm having a constant problem trying to compile a simple file with mingw: ------ iqdb.pyx ------- import numpy def main(): print "hello" --------------------------- By going through the usual python setup.py build_ext --inplace I get: running build_ext cythoning iqdb.pyx to iqdb.cpp building 'iqdb' extension creating build creating build\temp.win32-2.5 creating build\temp.win32-2.5\Release C:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW \bin\gcc.exe -mno-cygwin -mdll -O -Wall "-IC:\Documents and Settings \Paxcal\Desktop\iQ2\iQPython\lib\site-packages\numpy\core\include" "- IC:\Documents and Settings\Paxcal\Desktop\iQtemp10\32\ab_common" "-IC: \Documents and Settings\Paxcal\Desktop\iQ2\iQPython\include" "-IC: \Documents and Settings\Paxcal\Desktop\iQ2\iQPython\PC" -c iqdb.cpp -o build\temp.win32-2.5\Release\iqdb.o C:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW \bin\gcc.exe -mno-cygwin -mdll -O -Wall "-IC:\Documents and Settings \Paxcal\Desktop\iQ2\iQPython\lib\site-packages\numpy\core\include" "- IC:\Documents and Settings\Paxcal\Desktop\iQtemp10\32\ab_common" "-IC: \Documents and Settings\Paxcal\Desktop\iQ2\iQPython\include" "-IC: \Documents and Settings\Paxcal\Desktop\iQ2\iQPython\PC" -c iqdb.cpp -o build\temp.win32-2.5\Release\iqdb.o writing build\temp.win32-2.5\Release\iqdb.def C:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW \bin\g++.exe -mno-cygwin -mdll -static --entry _DllMain at 12 --output- lib build\temp.win32-2.5\Release\libiqdb.a --def build \temp.win32-2.5\Release\iqdb.def -s build\temp.win32-2.5\Release \iqdb.o build\temp.win32-2.5\Release\iqdb.o "-LC:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\libs" "-LC:\Documents and 
Settings \Paxcal\Desktop\iQ2\iQPython\PCBuild" -lpython25 -lmsvcr71 -o iqdb.pyd g++: build\temp.win32-2.5\Release\libiqdb.a: No such file or directory error: command 'g++' failed with exit status 1 Any ideas why the step to create the libiqdb.a is missing? Thanks for your help.. Cheers, Ricardo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Ricardo Henriques PhD Student Gene Expression and Biophysics Unit Institute of Molecular Medicine Faculty of Medicine, University of Lisbon Av. Prof. Egas Moniz 1649-028 Lisbon, Portugal Phone: + 351 217999503, Ext: 47318 Fax: + 351 217999504 E-mail: rhenriques at fm.ul.pt ~~ Or ~~ PhD Student (in collaboration with) Groupe Imagerie et Mod?lisation (Computational Imaging & Modeling Group) D?partement Biologie Cellulaire et Infections CNRS URA 2582 Institut Pasteur 25-28 rue du Docteur Roux 75015 Paris, France Phone: +33 1 40 61 31 70 Ext: 3170 E-mail: ricardoh at pasteur.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20090910/c5ebd444/attachment.htm From sanne at kortec.nl Thu Sep 10 18:54:10 2009 From: sanne at kortec.nl (Sanne Korzec) Date: Thu, 10 Sep 2009 18:54:10 +0200 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: Message-ID: <20090910165413.XQRN29725.viefep14-int.chello.at@edge02.upc.biz> Basically, I am trying to transform the following python datastructure into cython. Source_index = int Target_index = int Phrase_count = float Phrase_prob = float Phrase_table = {} Subdict = {} Subdict[s_index] = [count, prob] Phrase_table[tindex] = subdict So that a call to: Phrase_table[tindex][sindex] gives the count and prob. I described this im my previous mail as, sorry for the confusion. Key : int ---> value : hashtable { key: int ---> value: list(float, float) } My main two concerns with using this hashtable are: -Can I reference to the "subdict" hashtable from the original void *HashTableValue? 
And how do I cast this? -Can I store two floats in void *HashTableValue; In addition, I have started to create what I want, but I am still having some difficulties. Attached is my .pyx file. Some questions I have are: -Void_star_to_hashtable is obviously wrong, but why exactly? cdef c_HashTable void_star_to_hashtable(void* v): cdef void** b = [v] cdef c_HashTable* a = b return a[0] -line 86: sub_dict = void_star_to_hashtable(hash_table_lookup(self._base, int_to_void_star(tindex))) Why do I need a cast here? -----Original Message----- From: Robert Bradshaw [mailto:robertwb at math.washington.edu] Sent: woensdag 9 september 2009 18:48 To: sanne at kortec.nl Subject: Re: [Cython] FW: cython and hash tables / dictionary On Sep 9, 2009, at 7:18 AM, Sanne Korzec wrote: > Ok, I have played around with this hash table and understand most > of the > basics... > > I now would like to create the datastructure I need for my project. > > Basically what I need is two linked hash tables like this > > Key : int ---> value : hashtable { key: int ---> value: list(float, > float) } > > And > > Key : int ---> value : hashtable { key: int ---> value: float } I'm not quite sure exactly what your notation means here. You need a hashtable from ints to floats, and another one from ints to pairs of floats? > I am starting to wonder since this hash table works with voids for > key and > value only if I should change the .c and .h files myself. I think I > would > prefer not to. The way this hashtable is intended to be used is that you malloc some room for your keys/values, and then pass the pointers into the table itself. Of course this is a bit of overhead, both in terms of runtime (all the malloc/free calls) and manual labor. > Does anybody have a suggestion what would be wise? Preference goes > to quick > implementation, not total optimization. First, I'd see how fast Python hashtables work for you. That might be good enough. 
You could look into creating a special PairOfFloats cdef class to try to cut down on the list/tuple overhead. If that doesn't work, the next thing would be to use the structure above (manually malloc-ing room for all the stuff, though if your ints fit into a void* you could use the same casting trick). That failing, you could write your own custom hash table. As has been discovered, Python hashtables are pretty good, so expect at most a 10x (?) improvement writing your own. - Robert -------------- next part -------------- A non-text attachment was scrubbed... Name: cy_hash.pyx Type: application/octet-stream Size: 10552 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20090910/98edf6d1/attachment-0001.obj From robertwb at math.washington.edu Thu Sep 10 20:20:23 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Sep 2009 11:20:23 -0700 Subject: [Cython] FW: cython and hash tables / dictionary In-Reply-To: <20090910165413.XQRN29725.viefep14-int.chello.at@edge02.upc.biz> References: <20090910165413.XQRN29725.viefep14-int.chello.at@edge02.upc.biz> Message-ID: On Sep 10, 2009, at 9:54 AM, Sanne Korzec wrote: > Basically, I am trying to transform the following python > datastructure into > cython. > > Source_index = int > Target_index = int > Phrase_count = float > Phrase_prob = float > > Phrase_table = {} > Subdict = {} > Subdict[s_index] = [count, prob] > Phrase_table[tindex] = subdict > > So that a call to: > > Phrase_table[tindex][sindex] gives the count and prob. You could probably get a twofold speedup by just implementing this as a hashtable (int, int) -> (float, float) rather than nested hashtables (unless that doesn't work for your algorithm). > I described this im my previous mail as, sorry for the confusion.
>
> Key : int ---> value : hashtable { key: int ---> value: list(float, float) }
>
> My main two concerns with using this hashtable are:
>
> -Can I reference to the "subdict" hashtable from the original void
> *HashTableValue? And how do I cast this?
> -Can I store two floats in void *HashTableValue;

This hashtable implementation is a map from pointers to pointers, and the pointers can refer to anything you want. The drawback of course is that you have to manually manage the memory those pointers point to.

> In addition,
>
> I have started to create what I want, but I am still having some
> difficulties. Attached is my .pyx file.
>
> Some questions I have are:
>
> -Void_star_to_hashtable is obviously wrong, but why exactly?
>
> cdef c_HashTable void_star_to_hashtable(void* v):
>     cdef void** b = [v]
>     cdef c_HashTable* a = b
>     return a[0]
>
> -line 86: sub_dict =
> void_star_to_hashtable(hash_table_lookup(self._base,
> int_to_void_star(tindex)))
>
> Why do I need a cast here?

The code, as written, is returning a c_HashTable, not a c_HashTable*. You should just cast between c_HashTable* and void*--the only reason I had a conversion function is because I was storing the float as if it were a pointer to avoid manually managing the memory (which you will have to do as a c_HashTable struct is larger than a pointer).

I think it's pertinent to point out that this library was designed to be used from C, which means it's totally usable from Cython, but the interface is very C-like so the only way to use it is like you would in C. If you're not comfortable with C and pointers and malloc, etc. then I would do the following, which will still be a lot faster than what you have:

Implement a cdef class that wraps a pair of ints, and another that wraps a pair of floats. Create methods to instantiate them very quickly (avoiding all Python calls and argument passing) and give them fast __hash__ and __cmp__ methods.
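
[Editorial sketch, not part of the original message.] The pair-wrapping idea can be tried out in plain Python before committing to Cython. The names IntPair and FloatPair, the field names, and the hash multiplier are assumptions made for illustration. In Cython both classes would be cdef classes with typed attributes, and on Python 3 equality is spelled __eq__ rather than the __cmp__ mentioned above.

```python
class IntPair(object):
    """Hashable (source, target) key. Hypothetical name; a cdef class in Cython."""
    __slots__ = ("s", "t")

    def __init__(self, s, t):
        self.s = s
        self.t = t

    def __hash__(self):
        # Cheap integer mix; the multiplier is an arbitrary odd constant.
        return (self.s * 1000003) ^ self.t

    def __eq__(self, other):
        return (isinstance(other, IntPair)
                and self.s == other.s and self.t == other.t)


class FloatPair(object):
    """Mutable (count, prob) value. Hypothetical name; a cdef class in Cython."""
    __slots__ = ("count", "prob")

    def __init__(self, count, prob):
        self.count = count
        self.prob = prob


# One flat dictionary replaces the nested dict-of-dicts.
phrase_table = {}
phrase_table[IntPair(3, 7)] = FloatPair(2.0, 0.25)

entry = phrase_table[IntPair(3, 7)]   # a different key object, same contents
entry.count += 1.0                    # members are updated in place
```

Using __slots__ keeps each instance free of a per-object __dict__, which is the closest plain-Python analogue to the compact layout a cdef class provides.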
Now use the first as keys and the latter as values in a Python dictionary, and access their members directly. This should be much faster than what you have, and probably within a factor of 2 of using c_HashTable, as well as being much easier to code (let alone avoiding the pitfalls of segfaults and memory leaks). - Robert From dalcinl at gmail.com Thu Sep 10 22:12:48 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 10 Sep 2009 17:12:48 -0300 Subject: [Cython] Problem on Cython compilation In-Reply-To: <5AD7DB18-E451-4A85-9859-9D64A4ED32BD@gmail.com> References: <5AD7DB18-E451-4A85-9859-9D64A4ED32BD@gmail.com> Message-ID: Are you using a recent MinGW? Can you show us the contents of your setup.py file? On Thu, Sep 10, 2009 at 12:51 PM, Ricardo Henriques wrote: > Hi guys, > I'm having a constant problem trying to compile a simple file with mingw: > ------ iqdb.pyx ?------- > import numpy > def main(): > ?? ?print "hello" > --------------------------- > By going through the usual python setup.py build_ext --inplace I get: > running build_ext > cythoning iqdb.pyx to iqdb.cpp > building 'iqdb' extension > creating build > creating build\temp.win32-2.5 > creating build\temp.win32-2.5\Release > C:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW\bin\gcc.exe -mno-cygwin > -mdll -O -Wall "-IC:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\lib\site-packages\numpy\core\include" > "-IC:\Documents and Settings\Paxcal\Desktop\iQtemp10\32\ab_common" > "-IC:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\include" > "-IC:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\PC" -c iqdb.cpp -o > build\temp.win32-2.5\Release\iqdb.o > C:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW\bin\gcc.exe -mno-cygwin > -mdll -O -Wall "-IC:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\lib\site-packages\numpy\core\include" > "-IC:\Documents and Settings\Paxcal\Desktop\iQtemp10\32\ab_common" > "-IC:\Documents and 
Settings\Paxcal\Desktop\iQ2\iQPython\include" > "-IC:\Documents and Settings\Paxcal\Desktop\iQ2\iQPython\PC" -c iqdb.cpp -o > build\temp.win32-2.5\Release\iqdb.o > writing build\temp.win32-2.5\Release\iqdb.def > C:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\Programs\MinGW\bin\g++.exe -mno-cygwin > -mdll -static --entry _DllMain at 12 --output-lib > build\temp.win32-2.5\Release\libiqdb.a --def > build\temp.win32-2.5\Release\iqdb.def -s build\temp.win32-2.5\Release\iqdb.o > build\temp.win32-2.5\Release\iqdb.o "-LC:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\libs" "-LC:\Documents and > Settings\Paxcal\Desktop\iQ2\iQPython\PCBuild" -lpython25 -lmsvcr71 -o > iqdb.pyd > g++: build\temp.win32-2.5\Release\libiqdb.a: No such file or directory > error: command 'g++' failed with exit status 1 > > Any ideas why the step to create the libiqdb.a is missing? > Thanks for your help.. > Cheers, > Ricardo > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Ricardo Henriques > PhD Student Gene Expression and Biophysics Unit > Institute of Molecular Medicine > Faculty of Medicine, University of Lisbon Av. Prof. 
Egas Moniz > 1649-028 Lisbon, Portugal > Phone: + 351 217999503, > Ext: 47318 > Fax: + 351 217999504 > E-mail: rhenriques at fm.ul.pt > ~~ Or ~~ > PhD Student (in collaboration with) Groupe Imagerie et Mod?lisation > (Computational Imaging & Modeling Group) > D?partement Biologie Cellulaire et Infections > CNRS URA 2582 > Institut Pasteur > 25-28 rue du Docteur Roux > 75015 Paris,?France > Phone: +33 1 40 61 31 70 > Ext: 3170 > E-mail: ricardoh at pasteur.fr > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From cb at pdos.csail.mit.edu Thu Sep 10 22:24:29 2009 From: cb at pdos.csail.mit.edu (Chuck Blake) Date: Thu, 10 Sep 2009 16:24:29 -0400 Subject: [Cython] Cython-dev Digest, Vol 21, Issue 5 In-Reply-To: <20090910165413.XQRN29725.viefep14-int.chello.at@edge02.upc.biz> Message-ID: <20090910202429.GA15507@pdos.lcs.mit.edu> >As has been discovered, Python hashtables are pretty good, so expect >at most a 10x (?) improvement writing your own. Just to tighten this claim up, I have a fairly optimal C string hash table that is only about 2.8X faster (in a Cython environment) than Python's (also in the same Cython environment), at least at L2 cache and main memory scales. { My table is also about 3X faster than the kind of lame one in g++-4 series STL, which means for strings anyway, the Py one is "about as good as g++" - that may not be saying much, but g++'s is one widely used, well-known alternative, anyway. 
}

For hashing, an intrinsically "non local" sort of computation structure, once tables no longer fit in core RAM, the actual time profile changes drastically, with a multiplier more like 10ms/60ns = 166000X, or maybe, if you have 50us latency SSD drives for swap, only 50us/60ns =~ 833X.

In memory usage, Python objects are huge. Robert points out one way you may be able to save memory. On an x86-64, Python string objects are typically >64 B, for example. So, if you were actually keeping short 4-byte strings as a memory optimization you might get something like a 16X reduction in memory, or be able to reach problem sizes 16X larger before falling over the cliff of external memory slow-downs. So, either a very C-like Cython or a full C solution could allow you to do problem sizes 16X bigger or so. (A 10X memory optimization is one of the few cases lately where I've had to go to a full C solution instead of a Cython solution in my own work). When everything does fit in the same level of the memory hierarchy, 3X is likely to be the best you will improve over Python dictionaries.

--

FWIW, Bob Jenkins's hash function is good, but 6..7x slower than what you might need. Paul Hsieh has a slightly faster one, and I have one still faster:

                core2_cycles     i7_cycles         (n=num of key bytes)
  Bob Jenkins   27 + 2.4*n       30.3 + 2.26*n
  Paul Hsieh    24 + 1.9*n       20.0 + 1.69*n
  cb2            9 + 0.41*n       8.4 + 0.32*n

Numbers above are from robust linear regressions over many randomized trials and are very stable. { Mine is really just a Knott-style (as attributed by Knuth circa 1970) hash, modified for word-wise/8byte loops over byte-wise aligned data and with some constants selected for English language ASCII. }

I've done a lot of different distributional checking on dozens of real world key sets. The extra expense of the others does not seem to generally make any difference in either average or maximum collision rates, though I'm sure in pathological cases they can perform better.
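
[Editorial sketch, not part of the original message.] The linear fits quoted above can be evaluated directly to see how the relative cost of the three hash functions depends on key length. The coefficients are copied from the message; the dictionary layout and the function names are assumptions made for illustration.

```python
# (intercept, slope) in cycles, as quoted in the message; n = number of key bytes.
FITS = {
    "Bob Jenkins": {"core2": (27.0, 2.4), "i7": (30.3, 2.26)},
    "Paul Hsieh": {"core2": (24.0, 1.9), "i7": (20.0, 1.69)},
    "cb2": {"core2": (9.0, 0.41), "i7": (8.4, 0.32)},
}


def cycles(name, cpu, n):
    """Predicted cycles to hash an n-byte key, from the fitted line."""
    intercept, slope = FITS[name][cpu]
    return intercept + slope * n


def slowdown_vs_cb2(name, cpu, n):
    """How many times slower `name` is than cb2 for n-byte keys."""
    return cycles(name, cpu, n) / cycles("cb2", cpu, n)


if __name__ == "__main__":
    for n in (4, 16, 64, 1024):
        print(n,
              round(slowdown_vs_cb2("Bob Jenkins", "core2", n), 2),
              round(slowdown_vs_cb2("Bob Jenkins", "i7", n), 2))
```

For long keys the ratio tends to the ratio of the slopes, 2.4/0.41 (about 5.9) on Core 2 and 2.26/0.32 (about 7.1) on i7, which matches the "6..7x" figure. For short keys the intercepts dominate and the gap is smaller.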
It sounds like Sanne wants integer keys, anyway. It may make sense to copy Python's integer hash function, though you should be careful about how that interacts with Python's reduction from a hash code to a table address via modulus primes. Any real distribution checking has likely only been done in an end-to-end sense.

--

All these follow-up points on the interesting discussion so far being made, looking at the kind of structure Sanne seems to actually be building, is it possible that he should not be using giant hash tables at all? I'm just guessing from his variable names... Could he perhaps use 2D NumPy arrays for his two floats -- the count and probability -- and only use two separate small hash tables to convert between integer or string keys and "dense index coordinates"? Then he could use Numpy_count[s,t] where s=sindex[sub]; t=tindex[tgt], and a parallel Numpy_prob[s,t]. Or even, perhaps, a 3D numpy array, Numpy_countProb[s, t, slot] where slot 0 could be count and slot 1 could be Prob (or a recarray). He may need to maintain the reverse mapping, sub=subspelling[ix] as well, but that can be a simple Python list. You see a new sub, you set sindex[sub] = len(subList) and then append it to the subList, and analogously for targetList and tindex.

It may be necessary to pre-allocate the address space in the Numpy array with some kind of bound for how many tindex and sindex values can happen, but without this bound if things blow out RAM the algo will come to a screeching halt anyway. So this kind of bounding may be intrinsic to a problem this big for that machine.

Whether this sort of thing might work may depend upon how "dynamic" the container in question is during the course of his processing. E.g., whether he needs to dynamically 'del' columns or rows and rely on them being actually "gone" or is just growing up an indexing structure and allocating new rows and new columns as he goes.
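
[Editorial sketch, not part of the original message.] The "translator" layout described above, for the grow-only case, might look roughly like this. To keep the sketch runnable with the standard library alone, flat array('d') buffers stand in for the 2D NumPy arrays; the class names DenseIndex and DensePhraseTable and the bounds max_s and max_t are assumptions made for illustration.

```python
from array import array


class DenseIndex(object):
    """Maps arbitrary hashable keys to consecutive integers 0, 1, 2, ..."""

    def __init__(self):
        self.index = {}   # key -> dense coordinate
        self.keys = []    # dense coordinate -> key (the reverse mapping)

    def get(self, key):
        ix = self.index.get(key)
        if ix is None:
            ix = len(self.keys)      # next free coordinate, zero-based
            self.index[key] = ix
            self.keys.append(key)
        return ix


class DensePhraseTable(object):
    """Pre-allocated max_s x max_t grids of counts and probabilities."""

    def __init__(self, max_s, max_t):
        self.max_s, self.max_t = max_s, max_t
        self.sindex = DenseIndex()
        self.tindex = DenseIndex()
        # With NumPy these would be two 2D float arrays of shape (max_s, max_t).
        self.count = array("d", [0.0]) * (max_s * max_t)
        self.prob = array("d", [0.0]) * (max_s * max_t)

    def add(self, sub, tgt, count, prob):
        s = self.sindex.get(sub)
        t = self.tindex.get(tgt)
        # A fuller version would check the bound before registering the key.
        if s >= self.max_s or t >= self.max_t:
            raise MemoryError("pre-allocated bound exceeded")
        off = s * self.max_t + t
        self.count[off] += count
        self.prob[off] = prob

    def lookup(self, sub, tgt):
        s = self.sindex.index.get(sub)
        t = self.tindex.index.get(tgt)
        if s is None or t is None:
            return (0.0, 0.0)        # never seen: nothing stored
        off = s * self.max_t + t
        return (self.count[off], self.prob[off])


table = DensePhraseTable(max_s=4, max_t=3)
table.add("huis", "house", 2.0, 0.5)
table.add("huis", "house", 1.0, 0.75)
table.add("boom", "tree", 1.0, 1.0)
```

Only the two small dictionaries hold Python objects per key; each (s, t) cell costs two 8-byte doubles. The trade-off is that every cell of the grid is allocated up front, so this only pays off when the table is reasonably dense.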
It seems conceivable that he may only be doing the latter and then not need any external hash library mojo and, honestly, very little C-style Cython to save on memory since Numpy objects are already quite dense. It's been my experience that the ease Python (and Perl) expose for dicts leads a lot of people to use nested hash table containers rather than as "large address space-to-small address space" translators. Often the translator approach is much more efficient. Hopefully at least some of this has been useful commentary. From cb at pdos.csail.mit.edu Thu Sep 10 23:33:01 2009 From: cb at pdos.csail.mit.edu (Chuck Blake) Date: Thu, 10 Sep 2009 17:33:01 -0400 Subject: [Cython] FW: cython and hash tables / dictionary (was Re: Cython-dev Digest, Vol 21, Issue 5) Message-ID: <20090910213301.GA7177@pdos.lcs.mit.edu> D'oh! Sorry about the subject-line mixup, guys. :( Also, Robert has also already pointed out maybe a "two-key sparse -> dense map" may be all that is needed. Another maybe-easier memory management possibility would be to have a bit of Cython code to interpet a NumPy array *as a hash table*, with dead slots and deleted keys and all. So, if the association is a triple -- (int s, int t, int ix) you could have a NumPy array that had as many 12-byte rows as you need (assuming ints are 4 bytes), resized on demand, etc. That would take up basically as little space as you are going to get with C, with the 'ix' being an index into to the NumPy array of floats. This would also probably yield very little cost in malloc/free (or even just malloc) cycles. Really, only when the whole table grew would any memory mgmt happen. It may well be more space efficient than this external library with its 8 byte void pointers and such taking up extra space. I'm not sure how easy it would be to make a table of this style "key generic" in Cython, but the actual hash-insert/lookup code is probably a mere 15..25 lines of Cython as a layer of interpretation of the NumPy memory. 
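
[Editorial sketch, not part of the original message.] A flat array interpreted as an open-addressed, linear-probe, no-deletion table of (s, t, ix) triples could be sketched as below. A standard-library array('i') stands in for the NumPy array so the code runs without extra dependencies; the class name PairIndexTable, the EMPTY sentinel, the hash mix, and the load-factor limit of one half are assumptions made for illustration. Keys are assumed to be non-negative and to fit in a C int.

```python
from array import array

EMPTY = -1  # sentinel stored in the ix field of an unused slot


class PairIndexTable(object):
    """Maps (s, t) to a dense index ix, stored as triples in one flat int array."""

    def __init__(self, capacity=8):
        self.capacity = capacity          # number of slots, kept a power of two
        self.used = 0                     # number of stored triples
        self.slots = array("i", [EMPTY]) * (3 * capacity)

    def _probe(self, s, t):
        """Return the array offset of the slot holding (s, t), or of the
        first empty slot on its probe sequence."""
        mask = self.capacity - 1
        i = ((s * 1000003) ^ t) & mask    # arbitrary integer mix
        slots = self.slots
        while True:
            base = 3 * i
            if slots[base + 2] == EMPTY:
                return base
            if slots[base] == s and slots[base + 1] == t:
                return base
            i = (i + 1) & mask            # linear probing

    def lookup(self, s, t):
        """Return ix for (s, t), or EMPTY (-1) if the pair is absent."""
        return self.slots[self._probe(s, t) + 2]

    def insert(self, s, t):
        """Return the ix for (s, t), assigning the next dense index if new."""
        base = self._probe(s, t)
        if self.slots[base + 2] != EMPTY:
            return self.slots[base + 2]
        if 2 * (self.used + 1) > self.capacity:   # keep the table at most half full
            self._grow()
            base = self._probe(s, t)
        ix = self.used
        self.slots[base] = s
        self.slots[base + 1] = t
        self.slots[base + 2] = ix
        self.used += 1
        return ix

    def _grow(self):
        """Double the table and re-insert every stored triple."""
        old, old_capacity = self.slots, self.capacity
        self.capacity *= 2
        self.slots = array("i", [EMPTY]) * (3 * self.capacity)
        for i in range(old_capacity):
            base = 3 * i
            if old[base + 2] != EMPTY:
                new_base = self._probe(old[base], old[base + 1])
                self.slots[new_base] = old[base]
                self.slots[new_base + 1] = old[base + 1]
                self.slots[new_base + 2] = old[base + 2]


pairs = PairIndexTable()
first = pairs.insert(3, 7)    # new pair, gets dense index 0
again = pairs.insert(3, 7)    # existing pair, same index
second = pairs.insert(5, 1)   # next new pair, gets dense index 1
```

The returned ix would index into separate arrays of floats holding the count and probability. Memory is allocated only when the whole table doubles, and with 4-byte ints each slot costs 12 bytes, as estimated in the message.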
I may code up a little example of an open-addressed linear probe no-deletion table tonight if I have some time. From greg.ewing at canterbury.ac.nz Fri Sep 11 00:43:08 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 11 Sep 2009 10:43:08 +1200 Subject: [Cython] filetable_cname vs filenames_cname In-Reply-To: <9BD65EE9-13BA-4538-8FD9-C34127AE8470@math.washington.edu> References: <9BD65EE9-13BA-4538-8FD9-C34127AE8470@math.washington.edu> Message-ID: <4AA980FC.4090808@canterbury.ac.nz> Robert Bradshaw wrote: > Does anyone know why we have filenames_cname and filetable_cname? In Pyrex I defined constants for all the identifiers used in generated code so that I could change them easily if I needed to. The definitions are all in one file in alphabetical order to help with keeping them unique. -- Greg From robertwb at math.washington.edu Fri Sep 11 00:47:10 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Sep 2009 15:47:10 -0700 Subject: [Cython] filetable_cname vs filenames_cname In-Reply-To: <4AA980FC.4090808@canterbury.ac.nz> References: <9BD65EE9-13BA-4538-8FD9-C34127AE8470@math.washington.edu> <4AA980FC.4090808@canterbury.ac.nz> Message-ID: <938B16F3-D5ED-4D73-8FF9-9AB6EC033656@math.washington.edu> On Sep 10, 2009, at 3:43 PM, Greg Ewing wrote: > Robert Bradshaw wrote: >> Does anyone know why we have filenames_cname and filetable_cname? > > In Pyrex I defined constants for all the identifiers > used in generated code so that I could change them > easily if I needed to. The definitions are all in > one file in alphabetical order to help with keeping > them unique. Yes, I understand the purpose of Naming.py (and I think it's a good idea). My question was why do we have two identifiers that essentially point to the same thing, with a function that does nothing but initialize one to the other in the initmodule. (And I'm guessing you have a good answer that I just haven't thought of yet...) 
- Robert

From rwest at MIT.EDU Fri Sep 11 01:53:49 2009
From: rwest at MIT.EDU (Richard West)
Date: Thu, 10 Sep 2009 19:53:49 -0400
Subject: [Cython] Cython on MacOS 10.6 Snow Leopard
Message-ID:

Hi,

I recently upgraded to Mac OS X 10.6 Snow Leopard, which means I am now using gcc version 4.2.1 (Apple Inc. build 5646).

When I first tried to use Cython after the upgrade I was getting errors like

    cc1: error: unrecognized command line option "-Wno-long-double"

presumably because the deprecated -Wno-long-double option was removed from gcc.

When trying to build Cython itself on the default Python 2.6 installation, I was also getting a lot of warnings like

    /usr/include/AvailabilityMacros.h:108:14: warning: #warning Building for Intel with Mac OS X Deployment Target < 10.4 is invalid.

My workaround, which seems to work OK so far, is as follows:

First run

    $ easy_install -eb temporary_folder Cython

to download but not install Cython.

On line 32 of temporary_folder/cython/Cython/Mac/DarwinSystem.py change

    os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.3"

to

    os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.4"

And on line 36 of Cython/Mac/DarwinSystem.py remove the "-Wno-long-double" option.

Then run

    $ sudo easy_install temporary_folder/cython/

to build and install the modified Cython.

Hope this saves someone a few minutes.
Richard From greg.ewing at canterbury.ac.nz Fri Sep 11 04:05:28 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 11 Sep 2009 14:05:28 +1200 Subject: [Cython] filetable_cname vs filenames_cname In-Reply-To: <938B16F3-D5ED-4D73-8FF9-9AB6EC033656@math.washington.edu> References: <9BD65EE9-13BA-4538-8FD9-C34127AE8470@math.washington.edu> <4AA980FC.4090808@canterbury.ac.nz> <938B16F3-D5ED-4D73-8FF9-9AB6EC033656@math.washington.edu> Message-ID: <4AA9B068.9070201@canterbury.ac.nz> Robert Bradshaw wrote: > My question was why do we have two identifiers that > essentially point to the same thing, with a function that does > nothing but initialize one to the other in the initmodule. The reason for the function is that Pyrex had to be able to generate code referring to the table before the code that initializes the table, and Windows appears to be incapable of handling static initialization of pointers in a DLL. If Cython can arrange for the table init code to be emitted before the code that references the table, the function and the dual naming could probably be eliminated. -- Greg From dagss at student.matnat.uio.no Fri Sep 11 05:55:00 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: 11 Sep 2009 05:55:00 +0200 Subject: [Cython] filetable_cname vs filenames_cname Message-ID: <3335493342.1718282@smtp.netcom.no> In case anyone's forgot, this is trivial in Cython, see Code.py/GlobalState. Dag Sverre Seljebotn -----Original Message----- From: Greg Ewing Date: Friday, Sep 11, 2009 4:06 am Subject: Re: [Cython] filetable_cname vs filenames_cname To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net Robert Bradshaw wrote: > My question was why do we have two identifiers that > essentially point to the same thing, with a function that does > nothing but initialize one to the other in the initmodule. 
> >The reason for the function is that Pyrex had to >be able to generate code referring to the table >before the code that initializes the table, and >Windows appears to be incapable of handling >static initialization of pointers in a DLL. > >If Cython can arrange for the table init code >to be emitted before the code that references >the table, the function and the dual naming >could probably be eliminated. > >-- >Greg > >_______________________________________________ >Cython-dev mailing list >Cython-dev at codespeak.net >http://codespeak.net/mailman/listinfo/cython-dev > From dalcinl at gmail.com Fri Sep 11 17:00:43 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 11 Sep 2009 12:00:43 -0300 Subject: [Cython] ticket #333 (extern ctypedef integral <-> python object conversion), please review patch In-Reply-To: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> References: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> Message-ID: What about implementing Pyx_PyNumber_Int in such a way that we can be sure that it returns a EXACT PyInt/PyLong (or NULL in case of failure) ? Then we could safely use CheckExact in the other conversor functions (and perhaps a "goto begin" in order to avoid the recursion?) On Wed, Sep 9, 2009 at 5:58 AM, Robert Bradshaw wrote: > On Jun 19, 2009, at 9:44 AM, Lisandro Dalcin wrote: > >> I've finally managed to write a more or less working patch... Look at >> the test, I've tried to cover many corner cases... >> >> 1) Dag & Kurt: I bet you will be happy. >> >> >> 2) Robert: your eyeballs needed, please comment on this: >> >> a) Look at the changes outside PyrexTypes.pyx, I believe they make >> sense, though I would like you to confirm that. >> >> b) In the past, you raised some concerns about __int32 from ILP64 >> model... A possible (and suboptimal, no overflow-safe) way of handling >> that is there, though "#if 0" disabled. 
I've tried to take advantage >> of "_PyLong_{As/From}ByteArray()", but that (in particular, the "As" >> one) is somewhat harder to use, as we should pass a PyLongObject type. > > Sorry I've taken so long to get back on this. I have read all the > code, and it looks good. I've posted some comments on trac. http:// > trac.cython.org/cython_trac/ticket/333 > > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Sep 11 17:13:14 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 Sep 2009 17:13:14 +0200 Subject: [Cython] ticket #333 (extern ctypedef integral <-> python object conversion), please review patch In-Reply-To: References: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> Message-ID: <4AAA690A.8050404@behnel.de> Lisandro Dalcin wrote: > (and perhaps a "goto begin" in order to avoid the recursion?) Note that gcc has some support for tail recursion elimination, which might actually apply here. Stefan From dsdale24 at gmail.com Fri Sep 11 17:54:41 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Fri, 11 Sep 2009 11:54:41 -0400 Subject: [Cython] how to make cython definitions available to external C code? Message-ID: Hello, I am just learning cython, please bear with me. This is maybe a common question, but I didn't recognize it in the documentation or the FAQs. How do you make cython definitions available to external C code? For example, converting some of numpy's code in numpy/core/src/multiarray to cython without affecting the C API? 
Thanks, Darren From dsdale24 at gmail.com Fri Sep 11 18:12:21 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Fri, 11 Sep 2009 12:12:21 -0400 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: On Fri, Sep 11, 2009 at 11:54 AM, Darren Dale wrote: > Hello, > > I am just learning cython, please bear with me. This is maybe a common > question, but I didn't recognize it in the documentation or the FAQs. > How do you make cython definitions available to external C code? For > example, converting some of numpy's code in numpy/core/src/multiarray > to cython without affecting the C API? Somehow I overlooked: http://docs.cython.org/docs/external_C_code.html#using-cython-declarations-from-c From seb.binet at gmail.com Fri Sep 11 18:15:55 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Fri, 11 Sep 2009 18:15:55 +0200 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: <200909111815.56044.binet@cern.ch> Darren, > I am just learning cython, please bear with me. This is maybe a common > question, but I didn't recognize it in the documentation or the FAQs. > How do you make cython definitions available to external C code? For > example, converting some of numpy's code in numpy/core/src/multiarray > to cython without affecting the C API? that's here: http://docs.cython.org/docs/external_C_code.html#using-cython-declarations- from-c hth, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From dagss at student.matnat.uio.no Fri Sep 11 18:26:17 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 11 Sep 2009 18:26:17 +0200 Subject: [Cython] how to make cython definitions available to external C code? 
In-Reply-To: References: Message-ID: Hi Darren, > Hello, > > I am just learning cython, please bear with me. This is maybe a common > question, but I didn't recognize it in the documentation or the FAQs. > How do you make cython definitions available to external C code? For > example, converting some of numpy's code in numpy/core/src/multiarray > to cython without affecting the C API? Doing just this is something I've been eager to try myself, please don't hesitate to ask any questions you might have in the process. I hope it's OK that I write down some thoughts I have about this, even if they're not really related to your question. One thing that complicates this process is that NumPy has a requirement that both modes of building the multiarray module works: - Building all C files seperately, then linking -- But this won't work with Cython out of the box because the module initialization code won't get called, so that globals etc. are not initialized properly. - Including all C files in a single master C file, then compile only that one file -- This won't work with Cython as it inserts module initialization code which will conflict with the module initialization code already present in multiarray. What I've thought about though is starting with module initialization code, i.e. rewrite NumPy's module initialization code in Cython as the first thing. This might solve both modes above (with some more hacks, no doubt). Module initialization is a major hurdle with regards to Python 3 anyway, so it's as good a place to start as any. Dag Sverre From dagss at student.matnat.uio.no Fri Sep 11 18:30:15 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 11 Sep 2009 18:30:15 +0200 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: > On Fri, Sep 11, 2009 at 11:54 AM, Darren Dale wrote: >> Hello, >> >> I am just learning cython, please bear with me. 
This is maybe a common >> question, but I didn't recognize it in the documentation or the FAQs. >> How do you make cython definitions available to external C code? For >> example, converting some of numpy's code in numpy/core/src/multiarray >> to cython without affecting the C API? > > Somehow I overlooked: > http://docs.cython.org/docs/external_C_code.html#using-cython-declarations-from-c I'm not sure if this is useful in the context of NumPy though, because Cython expects to create an entire module, and symbols from one module can't be seen from another. So this is only useful if you have C code which is statically linked into your Cython module. However declaring stuff "public" will keep Cython from mangling it, which is a must. So yes, you need to do this, but it is not sufficient. Dag Sverre From dsdale24 at gmail.com Fri Sep 11 18:36:03 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Fri, 11 Sep 2009 12:36:03 -0400 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: Hi Dag, On Fri, Sep 11, 2009 at 12:26 PM, Dag Sverre Seljebotn wrote: > Hi Darren, > >> Hello, >> >> I am just learning cython, please bear with me. This is maybe a common >> question, but I didn't recognize it in the documentation or the FAQs. >> How do you make cython definitions available to external C code? For >> example, converting some of numpy's code in numpy/core/src/multiarray >> to cython without affecting the C API? > > Doing just this is something I've been eager to try myself, please don't > hesitate to ask any questions you might have in the process. > > I hope it's OK that I write down some thoughts I have about this, even if > they're not really related to your question. It is related. I'm trying to get a view of the numpy/py3 landscape from the bottom of an apparently steep slope. 
> One thing that complicates this process is that NumPy has a requirement > that both modes of building the multiarray module works: > - Building all C files seperately, then linking > ?-- But this won't work with Cython out of the box because the module > initialization code won't get called, so that globals etc. are not > initialized properly. > - Including all C files in a single master C file, then compile only that > one file > ?-- This won't work with Cython as it inserts module initialization code > which will conflict with the module initialization code already present > in multiarray. > > What I've thought about though is starting with module initialization > code, i.e. rewrite NumPy's module initialization code in Cython as the > first thing. This might solve both modes above (with some more hacks, no > doubt). Module initialization is a major hurdle with regards to Python 3 > anyway, so it's as good a place to start as any. From dsdale24 at gmail.com Fri Sep 11 18:56:25 2009 From: dsdale24 at gmail.com (Darren Dale) Date: Fri, 11 Sep 2009 12:56:25 -0400 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: On Fri, Sep 11, 2009 at 12:26 PM, Dag Sverre Seljebotn wrote: > Hi Darren, > >> Hello, >> >> I am just learning cython, please bear with me. This is maybe a common >> question, but I didn't recognize it in the documentation or the FAQs. >> How do you make cython definitions available to external C code? For >> example, converting some of numpy's code in numpy/core/src/multiarray >> to cython without affecting the C API? > > Doing just this is something I've been eager to try myself, please don't > hesitate to ask any questions you might have in the process. > > I hope it's OK that I write down some thoughts I have about this, even if > they're not really related to your question. 
> > One thing that complicates this process is that NumPy has a requirement > that both modes of building the multiarray module works: > - Building all C files seperately, then linking > ?-- But this won't work with Cython out of the box because the module > initialization code won't get called, so that globals etc. are not > initialized properly. > - Including all C files in a single master C file, then compile only that > one file > ?-- This won't work with Cython as it inserts module initialization code > which will conflict with the module initialization code already present > in multiarray. Where can I learn more about these requirements? From robertwb at math.washington.edu Fri Sep 11 18:56:59 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 11 Sep 2009 09:56:59 -0700 Subject: [Cython] ticket #333 (extern ctypedef integral <-> python object conversion), please review patch In-Reply-To: References: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> Message-ID: <432AFA98-A3B4-40CC-B496-FEA0FB820226@math.washington.edu> On Sep 11, 2009, at 8:00 AM, Lisandro Dalcin wrote: > What about implementing Pyx_PyNumber_Int in such a way that we can be > sure that it returns a EXACT PyInt/PyLong (or NULL in case of failure) > ? Then we could safely use CheckExact in the other conversor functions > (and perhaps a "goto begin" in order to avoid the recursion?) Well, it could be more expensive to construct the new object if all we want to do is extract out the long value. Also, I'm worried that Pyx_PyNumber_Int checks for PyInt/PyLong, even if it's only called on non PyInt/PyLong values. Given that this whole thing gets inlined all over the place, we also need to be concerned about code size (not for the .so size, but it can make a difference for instruction caches). Note that I pushed your #333, though we can continue working on top of it. 
- Robert

> On Wed, Sep 9, 2009 at 5:58 AM, Robert Bradshaw wrote:
>> On Jun 19, 2009, at 9:44 AM, Lisandro Dalcin wrote:
>>
>>> I've finally managed to write a more or less working patch... Look at
>>> the test, I've tried to cover many corner cases...
>>>
>>> 1) Dag & Kurt: I bet you will be happy.
>>>
>>> 2) Robert: your eyeballs needed, please comment on this:
>>>
>>> a) Look at the changes outside PyrexTypes.pyx, I believe they make
>>> sense, though I would like you to confirm that.
>>>
>>> b) In the past, you raised some concerns about __int32 from ILP64
>>> model... A possible (and suboptimal, not overflow-safe) way of
>>> handling that is there, though "#if 0" disabled. I've tried to take
>>> advantage of "_PyLong_{As/From}ByteArray()", but that (in particular,
>>> the "As" one) is somewhat harder to use, as we should pass a
>>> PyLongObject type.
>>
>> Sorry I've taken so long to get back on this. I have read all the
>> code, and it looks good. I've posted some comments on trac.
>> http://trac.cython.org/cython_trac/ticket/333
>>
>> - Robert
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

From dalcinl at gmail.com Fri Sep 11 19:24:18 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 11 Sep 2009 14:24:18 -0300
Subject: [Cython] ticket #333 (extern ctypedef integral <-> python object conversion), please review patch
In-Reply-To: <432AFA98-A3B4-40CC-B496-FEA0FB820226@math.washington.edu>
References: <4AD48DBC-D376-43DD-8E5C-B4174C42E6B5@math.washington.edu> <432AFA98-A3B4-40CC-B496-FEA0FB820226@math.washington.edu>
Message-ID:

On Fri, Sep 11, 2009 at 1:56 PM, Robert Bradshaw wrote:
> On Sep 11, 2009, at 8:00 AM, Lisandro Dalcin wrote:
>
> Well, it could be more expensive to construct the new object if all
> we want to do is extract out the long value.

But this path will be followed only if
Py_TYPE(x)->tp_as_number->nb_{int|long} did not return an exact
int/long... in Python terms (taking into account only PyInt for the sake
of keeping it simple):

    def PyNumber_Int(x):
        if type(x) is int:
            return x
        res = x.__int__()
        if type(res) is int:
            return res
        return int.__new__(int, x)

So the last line is executed in the (unlikely?) case an "int" subclass
returns an "int" subclass when calling __int__()... (perhaps this is the
case for NumPy scalars??)
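
The pseudocode above can be tried out as a small runnable sketch. This is
only an illustration in current Python, where there is a single `int` type
instead of PyInt/PyLong; `to_exact_int` and `MyInt` are hypothetical names
and are not part of Cython or CPython. As an assumption of this sketch, the
final copy uses `int.__int__(res)` rather than `int.__new__(int, x)`, so
that the subclass override is not invoked a second time.

```python
class MyInt(int):
    """Hypothetical int subclass whose __int__ does not return an exact int."""

    def __int__(self):
        return self  # still a MyInt, not an exact int


def to_exact_int(x):
    """Return an object whose type is exactly int (sketch of the idea above)."""
    if type(x) is int:       # fast path: already an exact int
        return x
    res = x.__int__()        # let the object convert itself
    if type(res) is int:     # usual case: the conversion gave an exact int
        return res
    # Rare case: __int__ returned an int subclass. Copy the value into a
    # fresh exact int by calling the base-class slot directly; this is the
    # extra object construction whose cost is being discussed.
    return int.__int__(res)
```

For example, `to_exact_int(MyInt(5))` goes through the rare last branch,
while `to_exact_int(3.9)` returns from the second check.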
> Also, I'm worried that
> Pyx_PyNumber_Int checks for PyInt/PyLong, even if it's only called on
> non PyInt/PyLong values. Given that this whole thing gets inlined all
> over the place, we also need to be concerned about code size (not for
> the .so size, but it can make a difference for instruction caches).

Very good point...

>
> Note that I pushed your #333, though we can continue working on top
> of it.
>

Perhaps we can move right now to use a chain
"likely(PyInt_CheckExact(x) || PyInt_Check(x))" for making it more or
less fast in Py 2.5 and below? ... Still, IIRC numpy scalars are/were
broken with the Py2.6 fast subclass check... In that case, it would be
nice to implement PyNumber_Int in a way that does not suffer from those
issues...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no Fri Sep 11 19:36:56 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 11 Sep 2009 19:36:56 +0200
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To:
References:
Message-ID:

> On Fri, Sep 11, 2009 at 12:26 PM, Dag Sverre Seljebotn wrote:
>> Hi Darren,
>>
>>> Hello,
>>>
>>> I am just learning cython, please bear with me. This is maybe a common
>>> question, but I didn't recognize it in the documentation or the FAQs.
>>> How do you make cython definitions available to external C code? For
>>> example, converting some of numpy's code in numpy/core/src/multiarray
>>> to cython without affecting the C API?
>>
>> Doing just this is something I've been eager to try myself, please don't
>> hesitate to ask any questions you might have in the process.
>>
>> I hope it's OK that I write down some thoughts I have about this, even
>> if they're not really related to your question.
>>
>> One thing that complicates this process is that NumPy has a requirement
>> that both modes of building the multiarray module work:
>>  - Building all C files separately, then linking
>>    -- But this won't work with Cython out of the box because the module
>>       initialization code won't get called, so that globals etc. are not
>>       initialized properly.
>>  - Including all C files in a single master C file, then compiling only
>>    that one file
>>    -- This won't work with Cython as it inserts module initialization
>>       code which will conflict with the module initialization code
>>       already present in multiarray.
>
> Where can I learn more about these requirements?

It's just how the build works; there are two separate ways of building
things: one "old" and one new which David introduced recently. I asked
whether the old one would be dropped, but it appears not:

http://thread.gmane.org/gmane.comp.python.numeric.general/32385

I'm not sure if this is all that well documented anywhere, I think the
information has to be extracted from the mailing lists. Please ask again
whenever you're stuck.

If you can get Cython code into the multiarray module using any of the
build modes it would be a great first step -- I don't expect it to be easy
(because Cython is made for writing entire modules, not for mixing with
other C source also making up the module), but I think it is doable.

Dag Sverre

From dsdale24 at gmail.com Fri Sep 11 21:24:21 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Fri, 11 Sep 2009 15:24:21 -0400
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To:
References:
Message-ID:

On Fri, Sep 11, 2009 at 1:36 PM, Dag Sverre Seljebotn wrote:
> If you can get Cython code into the multiarray module using any of the
> build modes it would be a great first step -- I don't expect it to be easy
> (because Cython is made for writing entire modules, not for mixing with
> other C source also making up the module), but I think it is doable.

So let me see if I understood you: Cython is not currently designed to
let you build a module up from several submodule sources, be they C or
Cython, because it will not expose the symbols in those submodules.
Isn't it possible to make these available by treating them as external
C sources, using the "cdef extern from "spam.h":" discussed at
http://docs.cython.org/docs/external_C_code.html#referencing-c-header-files
?

Although I am very interested in diving into cython, if it is not
currently a good fit for helping with the numpy py3 transition then I
should probably focus on getting up to speed on what needs to be done
with the existing numpy C code.

Darren

From robertwb at math.washington.edu Fri Sep 11 21:52:06 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 11 Sep 2009 12:52:06 -0700
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To:
References:
Message-ID:

On Sep 11, 2009, at 12:24 PM, Darren Dale wrote:

> So let me see if I understood you: Cython is not currently designed to
> let you build a module up from several submodule sources, be they C or
> Cython, because it will not expose the symbols in those submodules.
> Isn't it possible to make these available by treating them as external
> C sources, using the "cdef extern from "spam.h":" discussed at
> http://docs.cython.org/docs/external_C_code.html#referencing-c-header-files
> ?
>
> Although I am very interested in diving into cython, if it is not
> currently a good fit for helping with the numpy py3 transition then I
> should probably focus on getting up to speed on what needs to be done
> with the existing numpy C code.

Yes, you can use the include keyword to build up a module out of
several other files. The disadvantage of this is that you couldn't
compile pieces independently--the whole thing would end up being one
huge .c file. I don't think any of the obstacles for using Cython for
the core of NumPy are insurmountable, but they may require some
modification of Cython (e.g. to produce output .c files that are not
independent modules, made to be linked using the C linker). It's not
clear what the best way to go about this is (though someone who knows
more about the NumPy build system would know better than I).

- Robert

From dagss at student.matnat.uio.no Fri Sep 11 21:56:49 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Fri, 11 Sep 2009 21:56:49 +0200
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To:
References:
Message-ID: <4ce93773aa70d997e58c8aa5b1195b87.squirrel@webmail.uio.no>

> So let me see if I understood you: Cython is not currently designed to
> let you build a module up from several submodule sources, be they C or

Correct this far. (But see below.)
> Cython, because it will not expose the symbols in those submodules.

Not quite. The problem is that each Cython-generated source file will
always contain a module initialization function, which Python is expected
(and required) to call.

multiarray also contains one such function already, and so the two are in
conflict.

> Isn't it possible to make these available by treating them as external
> C sources, using the "cdef extern from "spam.h":" discussed at
> http://docs.cython.org/docs/external_C_code.html#referencing-c-header-files
> ?

Yes. Basically, what you can do is:

 a) Make a Cython .pyx the "top-level" multiarray module
 b) Drop the module initialization code currently written in C (that is,
    turn it from a function called by Python into a C function called by
    the Cython module in a)).
 c) Call into the existing C source by using "cdef extern" etc.

When it comes to the build, decide which of the two mechanisms you want to
support first. It is probably easiest to tackle the
multiple-compiled-sources build first; that basically only requires
declaring functions in Cython code "public", and importing functions from
C through "cdef extern".

Single-unit compilation requires, I think, that you include all the
relevant C files into the master top-level pyx file. Then all bets are off
concerning whether things will work regarding inclusion order of API
header files etc., but it might work.

>
> Although I am very interested in diving into cython, if it is not
> currently a good fit for helping with the numpy py3 transition then I
> should probably focus on getting up to speed on what needs to be done
> with the existing numpy C code.

There are obstacles, but we're talking about things which can be fixed (if
necessary in Cython itself) in a day or two of work.
NumPy developers estimate that porting NumPy to Python 3 is an effort of
several full-time months for the people who are already up to speed on
the codebase -- in comparison with that, the time it would take to deal
with this particular issue with Cython (i.e. mixing C and Cython into
one module) pales.

However, these are complicated issues, and the learning curve might be a
bit high, as it requires knowing the internals of both the NumPy build
system and the Cython-generated C source.

Dag Sverre

From dsdale24 at gmail.com Fri Sep 11 22:20:25 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Fri, 11 Sep 2009 16:20:25 -0400
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To:
References:
Message-ID:

On Fri, Sep 11, 2009 at 3:52 PM, Robert Bradshaw wrote:
> Yes, you can use the include keyword to build up a module out of
> several other files. The disadvantage of this is that you couldn't
> compile pieces independently--the whole thing would end up being one
> huge .c file. I don't think any of the obstacles for using Cython for
> the core of NumPy are insurmountable, but they may require some
> modification of Cython (e.g. to produce output .c files that are not
> independent modules, made to be linked using c linker). It's not
> clear the best way to go about this (though someone who knows more
> about the NumPy build system would know better than I).

Could cython recognize a .pxc file as intended to generate a c file
that is not a module?

Darren

From dsdale24 at gmail.com Fri Sep 11 22:31:42 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Fri, 11 Sep 2009 16:31:42 -0400
Subject: [Cython] how to make cython definitions available to external C code?
In-Reply-To: <4ce93773aa70d997e58c8aa5b1195b87.squirrel@webmail.uio.no>
References: <4ce93773aa70d997e58c8aa5b1195b87.squirrel@webmail.uio.no>
Message-ID:

On Fri, Sep 11, 2009 at 3:56 PM, Dag Sverre Seljebotn wrote:
> When it comes to the build, decide which of the two mechanisms you want to
> support first. It is probably easiest to tackle the
> multiple-compiled-sources build first; that basically only requires
> declaring functions in Cython code "public", and importing functions from
> C through "cdef extern".
>
> Single-unit compilation requires, I think, that you include all the
> relevant C files into the master top-level pyx file. Then all bets are off
> concerning whether things will work regarding inclusion order of API
> header files etc., but it might work.

Please bear with me, I didn't understand the distinction you made
between multiple-compiled-sources and single-unit compilation. Could you
please illustrate with some simple pseudocode?
Darren

From cb at pdos.csail.mit.edu Fri Sep 11 22:39:47 2009
From: cb at pdos.csail.mit.edu (Chuck Blake)
Date: Fri, 11 Sep 2009 16:39:47 -0400
Subject: [Cython] FW: cython and hash tables / dictionary (was Re: Cython-dev Digest, Vol 21, Issue 5)
Message-ID: <20090911203947.GA13759@pdos.lcs.mit.edu>

Here is a fully developed no-delete (OR ZERO KEYS) integer keyed table in
pure Cython using NumPy as a backing store allocator for the table space.
Included is enough to benchmark it relative to Python's dict. In my
timing, it is about 3X faster when the table is "right sized", but almost
5X faster when the table has to grow significantly.

It is neither pretty nor general, nor even very good example code for
Cython (or even C). It needs 4 byte ints and 4 byte floats (with the size
equality mattering), and it does (almost) all indexing via under the hood
C-isms. The ambitious Cythoner could probably convert most if not all of
that to Buffer API mojo. Since exact integer keys seemed necessary, it
does not even print out correctly unless you do under the covers casts
{ converting keys to floats would be trivial, but would start failing for
keys > 2**23 or so }. Indeed, about the only thing about the table that
IS general is which of the 3 array slots in a row holds the KEY field.

All I really wanted to do was exhibit that you only really need two
13-line functions, slot() and registerAdd(), to have a usable hash table
system, though perhaps not a user friendly one. I've seen a great many
people go to herculean efforts to import those 26 lines of logic from
elsewhere and still be unrewarded because of the back-end overheads and
layering. (In this case it would be all kinds of malloc/free cycles...)

Anyway, with right-sized tables that fit in L1 it hits about 37 cycles
per count update (including the overhead of looping over the input keys)
on my i7. The hash part alone is probably about 30 cycles.
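
The probing scheme described here (a multiplicative hash, an odd probe
increment, zero as the "unused" marker, and growth once the table is more
than 7/8 full) can be sketched in plain Python. This is only a slow,
illustrative model of the idea, not the posted Cython code: the class name
`U4Table`, the method `add_count`, and the use of two Python lists instead
of a NumPy array are assumptions made for the example, and the start index
is taken straight from the high hash bits.

```python
MULT = 0x46F7C2CB  # odd multiplier: low 32 bits of the constant in the post


class U4Table:
    """No-delete open-addressing table for nonzero 32-bit integer keys."""

    def __init__(self, lg_size=1):
        self.lg_size = max(1, lg_size)            # log2(number of rows)
        self.keys = [0] * (1 << self.lg_size)     # 0 marks an unused slot
        self.vals = [0.0] * (1 << self.lg_size)   # one counter per slot
        self.pop = 0                              # number of stored keys

    def slot(self, key):
        """Return (index, missing): where key lives, or where to insert it."""
        mask = (1 << self.lg_size) - 1
        h = (MULT * key) & 0xFFFFFFFF             # 32-bit multiplicative hash
        i = (h >> (32 - self.lg_size)) & mask     # start from the high bits
        incr = ((h >> 1) & mask) | 1              # odd step visits every slot
        while self.keys[i] != 0:
            if self.keys[i] == key:
                return i, False
            i = (i + incr) & mask
        return i, True

    def register_add(self):
        """Count an insertion; double the table when more than 7/8 full."""
        self.pop += 1
        if self.pop * 8 > 7 << self.lg_size:
            bigger = U4Table(self.lg_size + 1)
            for k, v in zip(self.keys, self.vals):
                if k != 0:
                    j, _ = bigger.slot(k)
                    bigger.keys[j], bigger.vals[j] = k, v
            self.keys, self.vals = bigger.keys, bigger.vals
            self.lg_size += 1

    def add_count(self, key):
        """Increment the counter for key, inserting it on first sight."""
        i, missing = self.slot(key)
        if missing:
            self.keys[i], self.vals[i] = key, 1.0
            self.register_add()
        else:
            self.vals[i] += 1.0
```

Because the table size is a power of two and the step is odd, the probe
sequence reaches every slot, and the 7/8 load limit means a lookup always
ends at either the key or an empty slot.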
It is possible that a better hash function/multiplier choice could do a
little bit better, though I expect 15 cycles or so to be a hard floor.

Chuck

# cython --embed u4Dict.pyx && gcc -g3 -O9 -combine -fipa-cp -march=native -mfpmath=sse,387 -fno-strict-aliasing -I/usr/include/python2.6 -I/usr/lib64/python2.6/site-packages/numpy/core/include u4Dict.c -L/usr/lib64 -L/usr/lib -lpython2.6 -lutil -lpthread -ldl -o u4Dict
-------------------------- u4Dict.pyx --------------------------
# NOTE: pointer casts such as <u4_t*> and <float*> appear to have been
# stripped by the list archive; they are restored below from context.
cimport numpy as np
import  numpy as np
cdef extern from "string.h" nogil:
    void *memcpy(void*, void*, unsigned long)
ctypedef unsigned int u4_t

cdef class u4Dict:
    cdef np.ndarray table       # table space
    cdef u4_t       key         # column index of u4 key
    cdef u4_t       lgSz        # log_2(number of rows)
    cdef u4_t       pop         # number of members

    def __init__(u4Dict o, u4_t key, u4_t lgSz):
        o.key   = key           # remember which column holds the key
        o.lgSz  = max(1, lgSz)
        o.table = np.zeros(( (1 << o.lgSz), 3), 'f')
        o.pop   = 0

    cdef u4_t slot(u4Dict o, u4_t key, u4_t *missing):
        cdef u4_t h    = 0xe7cadfcc46f7c2cb * key               # primary hash
        cdef u4_t i    = ( h >> (32 - o.lgSz)) & ~1             # Hi bits(even)
        cdef u4_t incr = ((h >> 1) & ((1 << o.lgSz) - 1)) | 1   # Lo bits (odd)
        cdef u4_t MASK = (1 << o.lgSz) - 1                      # Fast mod tab sz
        while (<u4_t*>o.table.data)[3*i + o.key] != 0:          # 0 ==> UNUSED SLOT
            if (<u4_t*>o.table.data)[3*i + o.key] == key:       # FOUND; yield slot
                missing[0] = 0
                return i
            i = (i + incr) & MASK                               # incr mod table size
        missing[0] = 1                                          # NOT FOUND
        return i                                                # yield insert slot

    cdef void registerAdd(u4Dict o):    # update pop; maybe resize
        cdef u4_t i, j, dum
        cdef u4Dict n
        o.pop += 1
        if o.pop * 8 > 7 << o.lgSz:     # more than 7/8 full: double the table
            n = u4Dict(o.key, o.lgSz + 1)
            for 0 <= i < (1 << o.lgSz):
                if (<u4_t*>o.table.data)[3*i + o.key] != 0:
                    j = n.slot((<u4_t*>o.table.data)[3 * i + o.key], &dum)
                    memcpy(&(<u4_t*>n.table.data)[3 * j],
                           &(<u4_t*>o.table.data)[3 * i], 3 * 4)
            o.table = n.table
            o.lgSz += 1

from time import time
from sys import stdin, argv

def build(np.ndarray keys):
    cdef u4_t i, j, k, nKey = len(keys), missing, KEY=0, CNT=1, PROB=2
    cdef u4Dict idict = u4Dict(KEY, 1)
    cdef double t0 = time()
    cdef np.ndarray row
    for 0 <= j < nKey:
        k = (<u4_t*>keys.data)[j]
        i = idict.slot(k, &missing)
        if missing:                                         # insert
            (<u4_t*>idict.table.data)[3*i + KEY]  = k
            (<float*>idict.table.data)[3*i + CNT] = 1.0
            idict.registerAdd()
        else:                                               # update
            (<float*>idict.table.data)[3*i + CNT] += 1.0
    cdef double t1 = time()
    if len(argv) > 1:
        for row in idict.table[ idict.table[ : , KEY] != 0 ]:
            print row[CNT], (<u4_t*>row.data)[KEY]
    print (t1 - t0)*1e6 / len(keys), 'microseconds/key'

def Build(np.ndarray keys):  # Python dicts are 2.5..4.5X slower than u4Dict
    cdef u4_t j, k, nKey = len(keys), KEY=0, CNT=1
    cdef dict idict = {}
    cdef double t0 = time()
    for 0 <= j < nKey:
        k = (<u4_t*>keys.data)[j]
        try:
            idict[k][CNT] += 1.0
        except KeyError:
            idict[k] = [k, 1.0]         # [KEY, CNT]; first sighting counts as 1
    cdef double t1 = time()
    if len(argv) > 1:
        for key, val in idict.items():  # a dict has no .table; print its items
            print val[CNT], key
    print (t1 - t0)*1e6 / len(keys), 'microseconds/key'

stuff = np.array([ int(line) for line in stdin ], 'I')
build(stuff)
Build(stuff)

From stefan_ml at behnel.de Fri Sep 11 23:13:38 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 11 Sep 2009 23:13:38 +0200
Subject: [Cython] casting strings to unsigned char*
In-Reply-To: <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu>
References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu>
Message-ID: <4AAABD82.4050506@behnel.de>

Robert Bradshaw wrote:
> On Sep 9, 2009, at 11:24 AM, Stefan Behnel wrote:
>> I don't think there's anything wrong with letting Cython do the
>> necessary casting under the hood.
>
> http://trac.cython.org/cython_trac/ticket/359

Ok, I pushed a patch to both cython-devel and cython-unstable.
http://hg.cython.org/cython-devel/rev/d273a3dc784b

I hope these casts pass as expected...

Stefan

From dalcinl at gmail.com Fri Sep 11 23:23:39 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 11 Sep 2009 18:23:39 -0300
Subject: [Cython] casting strings to unsigned char*
In-Reply-To: <4AAABD82.4050506@behnel.de>
References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> <4AAABD82.4050506@behnel.de>
Message-ID:

Well, I guess it is too late to complain... But I think that explicit is better than implicit here, then I do not like this... I do not even like the automatic casting to bare char* !! It is almost impossible to be 100% sure that your code is bytes/unicode clean!

Could we have at least a compile directive (or perhaps better/easier a global option?) to DISABLE these automatic castings to char*/uchar* ?

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Sat Sep 12 00:17:55 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 11 Sep 2009 15:17:55 -0700
Subject: [Cython] casting strings to unsigned char*
In-Reply-To:
References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> <4AAABD82.4050506@behnel.de>
Message-ID: <6AC1927A-F41E-4182-8C90-6DA84B94D9D8@math.washington.edu>

On Sep 11, 2009, at 2:23 PM, Lisandro Dalcin wrote:

> Well, I guess it is too late to complain... But I think that explicit
> is better than implicit here, then I do not like this... I do not even
> like the automatic casting to bare char* !! It is almost impossible to
> be 100% sure that you code is bytes/unicode clean!
>
> Could we have at least a compile directive (or perhaps better/easier a
> global option?) to DISABLE these automatic castings to char*/uchar* ?

It's certainly too late to disable it now, but I would be up for an option that gives errors/warnings. (Maybe -W or lint flags of some kind.)

Just out of curiosity, what would the explicit method be? Python/C API?
- Robert

From dalcinl at gmail.com Sat Sep 12 00:46:35 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 11 Sep 2009 19:46:35 -0300
Subject: [Cython] casting strings to unsigned char*
In-Reply-To: <6AC1927A-F41E-4182-8C90-6DA84B94D9D8@math.washington.edu>
References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> <4AAABD82.4050506@behnel.de> <6AC1927A-F41E-4182-8C90-6DA84B94D9D8@math.washington.edu>
Message-ID:

On Fri, Sep 11, 2009 at 7:17 PM, Robert Bradshaw wrote:
> On Sep 11, 2009, at 2:23 PM, Lisandro Dalcin wrote:
>
>> Well, I guess it is too late to complain... But I think that explicit
>> is better than implicit here, then I do not like this... I do not even
>> like the automatic casting to bare char* !! It is almost impossible to
>> be 100% sure that you code is bytes/unicode clean!
>>
>> Could we have at least a compile directive (or perhaps better/easier a
>> global option?) to DISABLE these automatic castings to char*/uchar* ?
>
> It's certainly too late to disable it now, but I would be up for an
> option that gives errors/warnings. (Maybe -W or lint flags of some
> kind.)
>

I understand... Anyway, the implicit cast would not bother me at all if I were able to disable it...

> Just out of curiosity, what would the explicit method be?

No idea... perhaps a manual cast? char *p = pystring ...

>
> Python/C API?
>

That's more or less what I'm currently doing in mpi4py...

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dominic.sacre at gmx.de Sat Sep 12 02:36:45 2009
From: dominic.sacre at gmx.de (Dominic Sacré)
Date: Sat, 12 Sep 2009 02:36:45 +0200
Subject: [Cython] String types with Python 2.x and 3.x
Message-ID: <200909120236.45400.dominic.sacre@gmx.de>

Hi,

I'm trying to make a Pyrex/Cython module that was originally written for Python 2.x work with Python 3.x, while at the same time keeping it compatible with older versions.

It seems like when using Python 3.x, Cython will automatically replace 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals are interpreted as 'bytes' unless prefixed with 'u'.
However, 'bytes' is not really useful in a context where an actual string is expected, and causes problems for example when working with strings passed from Python.
(One of many issues I have run into is the fact that b"foo" != "foo"...)

The only solution I've found to at least get most of my code working is basically to use unicode for almost everything, but if possible I'd like to avoid unicode strings in the 2.x version.

Is there a sane way to use the native string type (i.e. 'str') in either Python version?

Thanks,
Dominic

From joschu at caltech.edu Sat Sep 12 06:44:33 2009
From: joschu at caltech.edu (John Schulman)
Date: Sat, 12 Sep 2009 00:44:33 -0400
Subject: [Cython] Cython on MacOS 10.6 Snow Leopard
Message-ID: <185761440909112144u15bc7001l74f423596834e5bb@mail.gmail.com>

OK, the problem I am having apparently has nothing to do with Cython. I got the error when building Cython, and the -Wno-long-double flag comes from lib/python2.5/config/Makefile

From joschu at caltech.edu Sat Sep 12 06:49:11 2009
From: joschu at caltech.edu (John Schulman)
Date: Sat, 12 Sep 2009 00:49:11 -0400
Subject: [Cython] Cython on MacOS 10.6 Snow Leopard
In-Reply-To: <185761440909112134o50110485y898aa4c8336a2768@mail.gmail.com>
References: <185761440909112134o50110485y898aa4c8336a2768@mail.gmail.com>
Message-ID: <185761440909112149y3c65a2a0sb19848b964d35150@mail.gmail.com>

By the way, my python is EPD 5.0, which I just installed.

On Sat, Sep 12, 2009 at 12:34 AM, John Schulman wrote:
> Thanks for posting, but this does not work for me.
> I grepped the cython directory and removed every instance of
> -Wno-long-double, but I still get the same error, which totally
> baffles me. (I did this right after downloading the package, so
> there's no build stuff sitting around)
>
>
> On Thu, Sep 10, 2009 at 7:53 PM, Richard West wrote:
>> Hi,
>> I recently upgraded to Mac OS X 10.6 Snow Leopard, which means I am
>> now using gcc version 4.2.1 (Apple Inc. build 5646)
>> When I first tried to use Cython after the upgrade I was getting
>> errors like
>>   cc1: error: unrecognized command line option "-Wno-long-double"
>> presumably because the deprecated -Wno-long-double option was removed
>> from gcc.
>>
>> When trying to build Cython itself on the default Python 2.6
>> installation, I was also getting a lot of warnings like
>>   /usr/include/AvailabilityMacros.h:108:14: warning: #warning
>> Building for Intel with Mac OS X Deployment Target < 10.4 is invalid.
>>
>>
>> My workaround, which seems to work OK so far, is as follows:
>>
>> First run
>>    $ easy_install -eb temporary_folder Cython
>> to download but not install Cython
>>
>> On line 32 of temporary_folder/cython/Cython/Mac/DarwinSystem.py change
>>     os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.3"
>> to
>>     os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.4"
>>
>>
>> And on line 36 of Cython/Mac/DarwinSystem.py remove the
>> "-Wno-long-double" option.
>>
>> Then run
>>    $ sudo easy_install temporary_folder/cython/
>> to build and install the modified Cython.
>>
>> Hope this saves someone a few minutes.
>>
>> Richard

From joschu at caltech.edu Sat Sep 12 06:34:24 2009
From: joschu at caltech.edu (John Schulman)
Date: Sat, 12 Sep 2009 00:34:24 -0400
Subject: [Cython] Cython on MacOS 10.6 Snow Leopard
In-Reply-To:
References:
Message-ID: <185761440909112134o50110485y898aa4c8336a2768@mail.gmail.com>

Thanks for posting, but this does not work for me. I grepped the cython directory and removed every instance of -Wno-long-double, but I still get the same error, which totally baffles me. (I did this right after downloading the package, so there's no build stuff sitting around)

From stefan_ml at behnel.de Sat Sep 12 07:32:36 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Sep 2009 07:32:36 +0200
Subject: [Cython] casting strings to unsigned char*
In-Reply-To:
References: <42ef4ee0907130643l6d00acb0o90aaf0589dc98483@mail.gmail.com> <42ef4ee0907130647g661e9095x5cc9bbbf3239e0ca@mail.gmail.com> <42ef4ee0907131535sbf30106p8a4a024b9fb2a77c@mail.gmail.com> <3A86A2B4-5498-4BBF-94EB-775BF8991A81@math.washington.edu> <4AA7F2E9.9010500@behnel.de> <7EE20DE1-46F2-4773-AC56-B02F35CFAD9C@math.washington.edu> <4AAABD82.4050506@behnel.de> <6AC1927A-F41E-4182-8C90-6DA84B94D9D8@math.washington.edu>
Message-ID: <4AAB3274.5030107@behnel.de>

Lisandro Dalcin wrote:
> On Fri, Sep 11, 2009 at 7:17 PM, Robert Bradshaw wrote:
>> On Sep 11, 2009, at 2:23 PM, Lisandro Dalcin wrote:
>>
>>> Well, I guess it is too late to complain... But I think that explicit
>>> is better than implicit here, then I do not like this... I do not even
>>> like the automatic casting to bare char* !! It is almost impossible to
>>> be 100% sure that you code is bytes/unicode clean!

Sounds like FUD to me. This has nothing to do with Unicode. Cython will continue to give you compile time errors if you do this:

    cdef char* s = some_unicode_string

or this:

    cdef char* s = some_unicode_string.encode('UTF-8')

>>> Could we have at least a compile directive (or perhaps better/easier a
>>> global option?) to DISABLE these automatic castings to char*/uchar* ?
>> It's certainly too late to disable it now, but I would be up for an
>> option that gives errors/warnings. (Maybe -W or lint flags of some
>> kind.)
>
> In understand ... Anyway, the implicit cast would not bother me at all
> I I would be able to disable it...

Would you also want to disable automatic casts between Python int and C int?
I do see the difference that string handling involves a bit of care about keeping references. But that would still be the case when you use explicit casts. In many, many cases, all you need to do is to pass a byte string into a C function, where the lifetime of the Python string is automatically assured by the function call lifetime. That's such a common use case that I can't imagine requiring more code than

    some_c_function(some_python_byte_string)

>> Just out of curiosity, what would the explicit method be?
>
> No idea... perhaps a manual cast? char *p = pystring ...

I find it *very* convenient that Cython allows you to get the pointer to Python's byte string buffer with a simple assignment. I honestly doubt that an explicit cast would serve anyone.

>> Python/C API?
>
> That's more or less what I'm currently doing in mpi4py...

I keep thinking about safe ways to make it easier for users to convert between Python unicode strings and C byte strings. Making it harder to convert between Python byte strings and C byte strings certainly wouldn't help here.

Stefan

From robertwb at math.washington.edu Sat Sep 12 08:20:04 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Fri, 11 Sep 2009 23:20:04 -0700
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <200909120236.45400.dominic.sacre@gmx.de>
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID:

On Sep 11, 2009, at 5:36 PM, Dominic Sacré wrote:

> Hi,
>
> I'm trying to make a Pyrex/Cython module that was originally
> written for
> Python 2.x work with Python 3.x, while at the same time keeping it
> compatible with older versions.
>
> It seems like when using Python 3.x, Cython will automatically replace
> 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals
> are
> interpreted as 'bytes' unless prefixed with 'u'.
> However, 'bytes' is not really useful in a context where an actual
> string is expected, and causes problems for example when working with
> strings passed from Python.
> (One of many issues I have run into is the fact that b"foo" !=
> "foo"...)
>
> The only solution I've found to at least get most of my code
> working is
> basically to use unicode for almost everything,

I think this is (unfortunately) by design.

> but if possible I'd like
> to avoid unicode strings in the 2.x version.
>
> Is there a sane way to use the native string type (i.e. 'str') in
> either
> Python version?

How to handle strings/unicode, especially in Python 3, has been a huge area of debate on the list. However, I'm surprised that str is mapped to bytes in Python 3. What was the justification for this, or is it just a bug? I think if

    def foo():
        return str, isinstance("abc", str)

has different behavior in Cython and Python, then there's a bug (unless there's a *very* good reason to do so). I'm not trying to re-advocate automatic char* <-> unicode conversions.

- Robert

From stefan_ml at behnel.de Sat Sep 12 08:27:20 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Sep 2009 08:27:20 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <200909120236.45400.dominic.sacre@gmx.de>
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID: <4AAB3F48.4010907@behnel.de>

Hi,

Dominic Sacré wrote:
> I'm trying to make a Pyrex/Cython module that was originally written for
> Python 2.x work with Python 3.x, while at the same time keeping it
> compatible with older versions.
>
> It seems like when using Python 3.x, Cython will automatically replace
> 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals are
> interpreted as 'bytes' unless prefixed with 'u'.

Correct.

> However, 'bytes' is not really useful in a context where an actual
> string is expected

You mean "text", I suppose?
"string" is ambiguous as it can refer to C strings, Python byte strings and Python Unicode strings. > and causes problems for example when working with > strings passed from Python. > (One of many issues I have run into is the fact that b"foo" != "foo"...) Yep, and that's a really good thing. I fixed loads of those in Cython lately, and tons of them in the test suite. > The only solution I've found to at least get most of my code working is > basically to use unicode for almost everything That's the way to go anyway. To make the code Unicode aware, you have to make it distinguish between text, encoded text and data. > but if possible I'd like to avoid unicode strings in the 2.x version. That's not impossible, but it certainly is some work and the benefit is rather questionable, as it can easily bite you if you do not take care about the three-fold separation above. I do this in lxml as the API dictates that under Py2, ASCII compatible byte strings are accepted and returned as ASCII encoded byte strings. I actually work completely with UTF-8 encoded strings inside of lxml and use dedicated functions for checking and encoding everything that comes through the API or that goes back to the user. The main theme is to decide if you want to work with unicode internally or with encoded byte strings. Choose one or the other, not both. And make sure you check byte strings that contain text on the way in and reject them in the face of encoding ambiguity. In any case, data byte strings should remain unchanged, although you may run into all sorts of problems with file names (which are really text but that won't necessarily help you when trying to find them in an encoded file system, or when a user passes you an encoded URL that came from whatever source). > Is there a sane way to use the native string type (i.e. 'str') in either > Python version? ... and have Cython automatically encode and decode the byte strings for you? No, certainly not. 
Encoding is an explicit operation and it will make your code safer to make it explicit.

Stefan

From stefan_ml at behnel.de Sat Sep 12 08:39:55 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Sep 2009 08:39:55 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To:
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID: <4AAB423B.50501@behnel.de>

Hi,

Robert Bradshaw wrote:
> I'm surprised that str is
> mapped to bytes in Python 3. What was the justification for this, or
> is it just a bug? I think if
>
> def foo():
>     return str, isinstance("abc", str)
>
> have different behavior in Cython and Python that there's a bug

Define Python: Py2 or Py3?

Python 2 is still the defining context for Cython, so "str" is a byte string and "unicode" is a Unicode string. If you do the above in Py2, it will return True for a byte string and False for a unicode string. If you do it in Cython, you will get the same result. And that's still true when you compile your code for Py3. The same applies for the Cython code

    isinstance(u"abc", unicode)

which will return True in both environments. Given that "abc" is a byte string in Cython, I'd be rather surprised to have

    isinstance("abc", str)

return True in Py2 and False in Py3, and

    isinstance("abc", unicode)

return False in Py2 and True in Py3.

Stefan

From robertwb at math.washington.edu Sat Sep 12 09:00:26 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sat, 12 Sep 2009 00:00:26 -0700
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <4AAB423B.50501@behnel.de>
References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de>
Message-ID:

On Sep 11, 2009, at 11:39 PM, Stefan Behnel wrote:

> Hi,
>
> Robert Bradshaw wrote:
>> I'm surprised that str is
>> mapped to bytes in Python 3. What was the justification for this, or
>> is it just a bug? I think if
>>
>> def foo():
>>     return str, isinstance("abc", str)
>>
>> have different behavior in Cython and Python that there's a bug
>
> Define Python: Py2 or Py3?

Both. If I compile the module against Py2, it should behave as if it was a .py file under Py2, and if I compile the module under Py3, it should behave as if it were a .py file under Py3. Moving code from .py to .pyx should not change its behavior. If I have this function in a module, in both Python 2 and Python 3 I have

    >>> foo() == str, True
    True

> Python 2 is still the defining context for Cython, so "str" is a byte
> string and "unicode" is a Unicode string. If you do the above in
> Py2, it
> will return True for a byte string and False for a unicode string.
> If you
> do it in Cython, you will get the same result. And that's still
> true when
> you compile your code for Py3. The same applies for the Cython code
>
> isinstance(u"abc", unicode)
>
> which will return True in both environments. Given that "abc" is a
> byte
> string in Cython, I'd be rather surprised to have
>
> isinstance("abc", str)
>
> return True in Py2 and False in Py3, and

This returns True in both.

>
> isinstance("abc", unicode)
>
> return False in Py2 and True in Py3.

This is an error in Py3.

I don't see "abc" as a byte string, I see it as a string literal. If it's used in a C context it's a byte string, and if used as a Python object it's a Python str. This is how we handle all other literals (e.g. large integer literals used as Python objects are not the same as large integer literals truncated to an int then used as a Python object).

- Robert

From robertwb at math.washington.edu Sat Sep 12 09:00:55 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sat, 12 Sep 2009 00:00:55 -0700
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <200909120236.45400.dominic.sacre@gmx.de>
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID:

On Sep 11, 2009, at 5:36 PM, Dominic Sacré wrote:

> Hi,
>
> I'm trying to make a Pyrex/Cython module that was originally
> written for
> Python 2.x work with Python 3.x, while at the same time keeping it
> compatible with older versions.
>
> It seems like when using Python 3.x, Cython will automatically replace
> 'unicode' with 'str', and 'str' with 'bytes'. Also, string literals
> are
> interpreted as 'bytes' unless prefixed with 'u'.
> However, 'bytes' is not really useful in a context where an actual
> string is expected, and causes problems for example when working with
> strings passed from Python.
> (One of many issues I have run into is the fact that b"foo" !=
> "foo"...)
>
> The only solution I've found to at least get most of my code
> working is
> basically to use unicode for almost everything, but if possible I'd
> like
> to avoid unicode strings in the 2.x version.
>
> Is there a sane way to use the native string type (i.e. 'str') in
> either
> Python version?

Not really, but you can get it:

    >>> type(list(object.__dict__.keys())[0])

- Robert

From stefan_ml at behnel.de Sat Sep 12 09:12:35 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Sep 2009 09:12:35 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To:
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID: <4AAB49E3.8010609@behnel.de>

Robert Bradshaw wrote:
> On Sep 11, 2009, at 5:36 PM, Dominic Sacré wrote:
>> Is there a sane way to use the native string type (i.e. 'str') in
>> either Python version?
>
> Not really, but you can get it:
>
> >>> type(list(object.__dict__.keys())[0])
>

Is there a use case for this?
Stefan

From stefan_ml at behnel.de Sat Sep 12 09:35:05 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Sep 2009 09:35:05 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To:
References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de>
Message-ID: <4AAB4F29.1080902@behnel.de>

Robert Bradshaw wrote:
> If I compile the module against Py2, it should behave as if it
> was a .py file under Py2, and if I compile the module under Py3, it
> should behave as if it were a .py file under Py3. Moving code
> from .py to .pyx should not change its behavior.

Well, when you run a Py2 script in Py3, the semantics change. So it doesn't make sense to say "moving code from .py to .pyx should not change its behavior", as the same .py file can already have different behaviour.

I'm fine with providing a separate front-end for compiling Python 3 code ("cython3" ?), so I'm also fine with providing a separate front-end for compiling Python 2 code. Simply seeing the .py extension isn't enough anymore. I'm also fine with a command line option "-3"/"-2" that defines the semantics when compiling a .py file. However, once the compilation is done, I think the semantics of literals should be fixed and should not change depending on the platform.

>> isinstance("abc", unicode)
>>
>> return False in Py2 and True in Py3.
>
> This is an error in Py3.

Correct, but it is not an error in Python 2 or in Cython, which currently uses the Py2 builtin names.

> I don't see "abc" as a byte string, I see it as a string literal. If
> it's used in a C context it's a byte string, and if used as a Python
> object it's a Python str. This is how we handle all other literals
> (e.g. large integer literals used as Python objects are not the same
> as large integer literals truncated to an int then used as a Python
> object).

So your proposal is to make

    cdef char* s
    s = "?????fs#dfsjdf?asjf"

a C byte string encoded in source encoding, and

    s = "?????fs#dfsjdf?asjf"

a byte string in source-encoding when run in Python 2 and a decoded unicode string when run in Python 3?

Note that this means that

    s = "?????fs#dfsjdf?asjf"
    cdef char* cs = s

will work in Py2 and fail in Py3, whereas it currently works identically in both. This means that you'd have to prefix basically all Python string literals with either 'b' or 'u' if you want a fixed type/semantics, whereas now you only have to prefix Python unicode strings with a 'u', following Python 2 syntax.

Given that this is more code overhead, do you have a real use case for literals that behave that way? The only place I've seen this so far are keyword argument dicts that you fill with literal string names. A rather rare thing, IMHO, and easy to fix using e.g. the dict() factory.

Stefan

From dagss at student.matnat.uio.no Sat Sep 12 11:26:41 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 12 Sep 2009 11:26:41 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To:
References: <200909120236.45400.dominic.sacre@gmx.de>
Message-ID:

Robert Bradshaw wrote:
> How to handle strings/unicode, especially in Python 3, has been a
> huge area of debate on the list. However, I'm surprised that str is
> mapped to bytes in Python 3. What was the justification for this, or
> is it just a bug? I think if
>
> def foo():
>     return str, isinstance("abc", str)
>
> have different behavior in Cython and Python that there's a bug
> (unless there's a *very* good reason to do so). I'm not trying to
> re-advocate automatic char* <-> unicode conversions.

Are you sure? What about this:

    def foo():
        return 4 / 5

Should this have the same behaviour in Cython and Python regardless of Python version as well?

I'm with Stefan, a -3 flag which turns on from __future__ import division, unicode_literals, etc seems like the right mechanism.
Changing semantics based on the Python version used to compile the C source can't be a good thing. Dag Sverre From dalcinl at gmail.com Sat Sep 12 17:56:16 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 12 Sep 2009 12:56:16 -0300 Subject: [Cython] Cython on MacOS 10.6 Snow Leopard In-Reply-To: <185761440909112149y3c65a2a0sb19848b964d35150@mail.gmail.com> References: <185761440909112134o50110485y898aa4c8336a2768@mail.gmail.com> <185761440909112149y3c65a2a0sb19848b964d35150@mail.gmail.com> Message-ID: Could you grep in the whole lib/python2.5 for the offending flags? On Sat, Sep 12, 2009 at 1:49 AM, John Schulman wrote: > By the way, my python is EPD 5.0, which I just installed. > > On Sat, Sep 12, 2009 at 12:34 AM, John Schulman wrote: >> Thanks for posting, but this does not work for me. >> I grepped the cython directory and removed every instance of >> -Wno-long-double, but I still get the same error, which totally >> baffles me. (I did this right after downloading the package, so >> there's no build stuff sitting around) >> >> >> On Thu, Sep 10, 2009 at 7:53 PM, Richard West wrote: >>> Hi, >>> I recently upgraded to Mac OS X 10.6 Snow Leopard, which means I am >>> now using gcc version 4.2.1 (Apple Inc. build 5646) >>> When I first tried to use Cython after the upgrade I was getting >>> errors like >>> ? cc1: error: unrecognized command line option "-Wno-long-double" >>> presumably because the deprecated -Wno-long-double option was removed >>> from gcc. >>> >>> When trying to build Cython itself on the default Python 2.6 >>> installation, I was also getting a lot of warnings like >>> ? /usr/include/AvailabilityMacros.h:108:14: warning: #warning >>> Building for Intel with Mac OS X Deployment Target < 10.4 is invalid. >>> >>> >>> My workaround, which seems to work OK so far, is as follows: >>> >>> First run >>> ? 
$ easy_install -eb temporary_folder Cython >>> to download but not install Cython >>> >>> On line 32 of temporary_folder/cython/Cython/Mac/DarwinSystem.py change >>> os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.3" >>> to >>> os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.4" >>> >>> >>> And on line 36 of Cython/Mac/DarwinSystem.py remove the "-Wno-long-double" option. >>> >>> Then run >>> $ sudo easy_install temporary_folder/cython/ >>> to build and install the modified Cython. >>> >>> Hope this saves someone a few minutes. >>> >>> Richard >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >>> >> > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dominic.sacre at gmx.de Sat Sep 12 18:41:12 2009 From: dominic.sacre at gmx.de (Dominic Sacré) Date: Sat, 12 Sep 2009 18:41:12 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AAB3F48.4010907@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB3F48.4010907@behnel.de> Message-ID: <200909121841.12135.dominic.sacre@gmx.de> On Saturday 12 of September 2009 08:27:20 Stefan Behnel wrote: > > However, 'bytes' is not really useful in a context where an actual > > string is expected > > You mean "text", I suppose? "string" is ambiguous as it can refer to > C strings, Python byte strings and Python Unicode strings. Yes, that's what I meant.
> > The only solution I've found to at least get most of my code > > working is basically to use unicode for almost everything > > That's the way to go anyway. To make the code Unicode aware, you have > to make it distinguish between text, encoded text and data. Well, actually my module only needs to be able to handle ASCII, because the protocol it implements doesn't support anything else. So it seems weird and in many cases very cumbersome to use unicode internally, especially with Py2, where usually all strings coming from Python will not be unicode in the first place. > The main theme is to decide if you want to work with unicode > internally or with encoded byte strings. Using byte strings internally seems to make much more sense to me in this case. In fact that was my first attempt, though not deliberately, but simply because that's what happened when I tried to use my unmodified code with Py3. I think I'll try to go back to that approach again, and insert encoding/decoding wherever necessary to make sure that no unicode strings get in, and no byte strings get out... By the way, another issue I've stumbled upon: With Py3, str(42) does not work as one would expect, because it actually creates a bytes object of length 42, filled with zeroes. Should this be considered a bug, or is it just one of the awkward consequences of 'str' meaning 'bytes' with Py3? Thanks, Dominic From stefan_ml at behnel.de Sat Sep 12 18:59:42 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 Sep 2009 18:59:42 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <200909121841.12135.dominic.sacre@gmx.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB3F48.4010907@behnel.de> <200909121841.12135.dominic.sacre@gmx.de> Message-ID: <4AABD37E.70707@behnel.de> Dominic Sacré wrote: > Well, actually my module only needs to be able to handle ASCII, because > the protocol it implements doesn't support anything else.
> So it seems weird and in many cases very cumbersome to use unicode > internally, especially with Py2, where usually all strings coming from > Python will not be unicode in the first place. Then this sounds like a case for using ASCII-encoded byte strings internally. > I think I'll try to go back to that approach again, and insert > encoding/decoding wherever necessary to make sure that no unicode > strings get in, and no byte strings get out... You should write a little input normalisation function that does a quick check with PyString_CheckExact() as a fast path. You might also want to check the strings for stuff like \0 bytes and values >= 0x80 in that case. Users will usually be happy to get an exception, instead of having to chase weird bugs due to dirty data. > By the way, another issue I've stumbled upon: > With Py3, str(42) does not work as one would expect, because it actually > creates a bytes object of length 42, filled with zeroes. Should this be > considered a bug, or is it just one of the awkward consequences of 'str' > meaning 'bytes' with Py3? It's just like how the bytes type in Py 2.6 isn't quite what you'd expect. You'll notice similar differences for other builtins across Python versions, e.g. Py3's zip(). Those are things you have to live with. Note that this is still different from *literals* meaning different things in different Python versions. The exact behaviour of builtins (and any other objects) can naturally change when running in different environments.
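A minimal sketch of the input normalisation function suggested above, written in plain Python rather than against the C-API. The name normalise_ascii and its error messages are made up for illustration; the exact-type test only stands in for the PyString_CheckExact() fast path, and the loop rejects \0 bytes and values >= 0x80.

```python
def normalise_ascii(value):
    """Return `value` as an ASCII-only byte string, or raise an exception.

    Hypothetical helper: accepts a byte string or a text string and
    guarantees that the result contains no NUL bytes and no byte >= 0x80.
    """
    if type(value) is bytes:
        # Fast path: already an exact byte string, no conversion needed.
        data = value
    elif isinstance(value, type(u"")):
        # Text input (unicode on Py2, str on Py3): encode strictly.
        try:
            data = value.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("non-ASCII character in input: %r" % (value,))
    else:
        raise TypeError("expected a string, got %s" % type(value).__name__)

    # Reject dirty data early instead of passing it on to C code.
    for byte in bytearray(data):
        if byte == 0:
            raise ValueError("NUL byte in input")
        if byte >= 0x80:
            raise ValueError("non-ASCII byte in input")
    return data
```

Calling it at every public entry point keeps the module's internals on byte strings only, so char* conversions further down never see unicode objects.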
Stefan From robertwb at math.washington.edu Sat Sep 12 19:29:48 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Sep 2009 10:29:48 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AAB4F29.1080902@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> Message-ID: <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> On Sep 12, 2009, at 12:35 AM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> If I compile the module against Py2, it should behave as if it >> was a .py file under Py2, and if I compile the module under Py3, it >> should behave as if it were a .py file under Py3. Moving code >> from .py to .pyx should not change its behavior. > > Well, when you run a Py2 script in Py3, the semantics change. So it > doesn't > make sense to say "moving code from .py to .pyx should not change its > behavior", as the same .py file can already have different behaviour. > > I'm fine with providing a separate front-end for compiling Python 3 > code > ("cython3" ?), so I'm also fine with providing a separate front-end > for > compiling Python 2 code. Simply seeing the .py extension isn't > enough anymore. > > I'm also fine with a command line option "-3"/"-2" that defines the > semantics when compiling a .py file. I think investigating something along these lines would be good. > However, once the compilation is done, > I think the semantics of literals should be fixed and should not > change > depending on the platform. It already does out of necessity. a = 10 b = 1000000000000000000000000000 type(a) == type(b) # depends on the environment > >>> isinstance("abc", unicode) >>> >>> return False in Py2 and True in Py3. >> >> This is an error in Py3. > > Correct, but neither in Python 2 nor in Cython, which currently > uses the > Py2 builtin names. > > >> I don't see "abc" as a byte string, I see it as a string literal. 
If >> it's used in a C context it's a byte string, and if used as a Python >> object it's a Python str. This is how we handle all other literals >> (e.g. large integer literals used as Python objects are not the same >> as large integer literals truncated to an int then used as a Python >> object). > > So your proposal is to make > > cdef char* s > > s = "?????fs#dfsjdf?asjf" > > a C byte string encoded in source encoding, and > > s = "?????fs#dfsjdf?asjf" > > a byte string in source-encoding when run in Python 2 and a decoded > unicode > string when run in Python 3? Yep. > Note that this means that > > s = "?????fs#dfsjdf?asjf" > > cdef char* cs = s > > will work in Py2 and fail in Py3, whereas it currently works > identically in > both. Several things will change, e.g. range will return an iterator, not a list. (We could have a mode where Cython emulates the Py2 builtins even when compiled against Py3, but probably not by default). These are all things that are easy consequences of how py3 differs from py2. I think "strings are different in Py3" is much easier to explain, and reference, than "cython string literals are no longer strings" (where by strings here I mean str, the type any programmer gets when they type a string literal into the prompt). As for the double assignment being different, again, using integers as an example cdef int a = 1000000000000000000000000000 behaves differently than a = 1000000000000000000000000000 cdef int ca = a > This means that you'd have to prefix basically all Python string > literals > with either 'b' or 'u' if you want a fixed type/semantics, whereas > now you > only have to prefix Python unicode strings with a 'u', following > Python 2 > syntax. If one always wants bytes, one can do b"something." If one always wants unicode, one can do u"something." There's currently no (obvious, clean) way to get str.
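To illustrate what "get str" means here, a small plain-Python sketch (not Cython output). The helper as_native_str is hypothetical; it only assumes that b"..." always yields bytes and u"..." always yields the text type, and it coerces either one to whatever str is in the running interpreter.

```python
import sys

PY3 = sys.version_info[0] >= 3


def as_native_str(value, encoding="ascii"):
    """Hypothetical helper: return `value` as the runtime's own `str` type.

    Under Py2 `str` is the byte string type, under Py3 it is the text type,
    so the conversion direction depends on the interpreter.
    """
    if isinstance(value, str):
        # Already the native type (bytes on Py2, text on Py3).
        return value
    if PY3:
        # Py3: the only other string type is bytes, so decode it.
        return value.decode(encoding)
    # Py2: the only other string type is unicode, so encode it.
    return value.encode(encoding)
```

With such a helper, as_native_str(b"abc") and as_native_str(u"abc") both compare equal to the unprefixed literal "abc" on either Python version, which is the behaviour an unprefixed literal would have under the proposal.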
> Given that this is more code overhead, I don't think this is more code overhead--most people don't prefix their string literals with anything at all, they just think of them as "strings." Now you can say "it forces users who want Py2 and Py3 compatibility to explicitly use unicode everywhere, which they should have been doing anyways, or their code is broken" but I think this artificially raises the barrier for using Cython by imposing an independent presumption (despite any validity). Put another way, it's extra overhead (and probably incorrect) to deal with the bytes object when using a cython module, and extra overhead to prefix all literals in the cython module with 'u.' It makes mixing strings from the environment with those from the module more cumbersome. Unless you're dealing with char* <-> object conversions you shouldn't have to think or care about encodings IMHO. > do you have a real use case for > literals that behave that way? The only place I've seen this so far > are > keyword argument dicts that you fill with literal string names. A > rather > rare thing, IMHO, and easy to fix using e.g. the dict() factory. Clearly Dominic has a use case. I have a simple use case too. Often in Sage one has functions like def charpoly(self, algorithm='default'): if algorithm == 'a': ... elif algorithm == 'b': ... else: raise ValueError("Unknown algorithm: %s" % algorithm) This will break if I run it in Python 3. You could say that we should be prefixing these with 'u', but frankly, I don't see the benefit. (I do like unicode in general, it's just not worth the overhead here.) Specifically, it would mean we have to get everyone who writes code of the above form to use 'u' despite the fact that the existence of unicode is *completely* irrelevant to the task at hand. These are just strings, I don't want to have to think about (or, more to the point, explain) byte strings, encodings, unicode, etc. unless one is actually dealing with byte strings, encodings, etc.
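The breakage described above can be reproduced in plain Python 3 by spelling the literals as bytes explicitly. This is a sketch under that assumption, with charpoly reduced to a hypothetical free function and call_with_text added purely for demonstration; it is not the Sage code.

```python
def charpoly(algorithm=b"default"):
    # The b"..." literals stand in for unprefixed literals that end up
    # as bytes objects when the compiled module runs under Py3.
    if algorithm == b"default" or algorithm == b"a":
        return "algorithm a"
    elif algorithm == b"b":
        return "algorithm b"
    else:
        raise ValueError("Unknown algorithm: %r" % (algorithm,))


def call_with_text(name):
    """Call charpoly() the way a Py3 user would, with an ordinary str."""
    try:
        return charpoly(name)
    except ValueError as exc:
        return "error: %s" % exc
```

Under Python 3 a str never compares equal to a bytes object, so call_with_text("a") falls through every branch and ends in the ValueError, even though the caller passed what looks like a valid algorithm name. Under Python 2 the same call succeeds, because b"a" and "a" are the same type there.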
Perhaps the difference in opinion comes from my perspective that, at a high level, str just got changed (for the better) in Py3. - Robert From robertwb at math.washington.edu Sat Sep 12 19:48:52 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Sep 2009 10:48:52 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: References: <200909120236.45400.dominic.sacre@gmx.de> Message-ID: <82767F17-5740-4970-8FC6-63A32578E278@math.washington.edu> On Sep 12, 2009, at 2:26 AM, Dag Sverre Seljebotn wrote: > Robert Bradshaw wrote: >> How to handle strings/unicode, especially in Python 3, has been a >> huge area of debate on the list. However, I'm surprised that str is >> mapped to bytes in Python 3. What was the justification for this, or >> is it just a bug? I think if >> >> def foo(): >> return str, isinstance("abc", str) >> >> have different behavior in Cython and Python that there's a bug >> (unless there's a *very* good reason to do so). I'm not trying to re- >> advocate automatic char* <-> unicode conversions. > > Are you sure? What about this: > > def foo(): > return 4 / 5 > > Should this have the same behaviour in Cython and Python regardless of > Python version as well? I'm talking about Python object literals. (4) / 5 will already have Py3 semantics no matter what we do. I'm arguing that "literal" should be the native "str" type. > I'm with Stefan, a -3 flag which turns on > > from __future__ import division, unicode_literals, etc > > seems like the right mechanism. Changing semantics based on the Python > version used to compile the C source can't be a good thing. We already do for the rest of the builtins. The Py2 str object is gone in Py3. Bytes do not support the % operator (probably one of the most common operations on strings) and, as pointed out, bytes(x) does not give the string representation of x (str(5) -> "\0\0\0\0\0" is rather unsettling). 
Semantically, the str type of Py2 is closer to the str type of Py3 than it is to the bytes type of Py3, and is meant to be used in its place. The fact that it's unicode rather than bytes under the hood is an implementation detail that the user need be bothered with *only* when they are trying to get at the underlying char*. - Robert From dalcinl at gmail.com Sat Sep 12 20:27:25 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 12 Sep 2009 15:27:25 -0300 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> Message-ID: I agree on almost all points with Robert On Sat, Sep 12, 2009 at 2:29 PM, Robert Bradshaw wrote: > > It already does out of necessity. > > a = 10 > b = 1000000000000000000000000000 > type(a) == type(b) # depends on the environment > Nice example to make your point... > > If one always wants bytes, one can do b"something." If one always > wants unicode, one can do u"something." There's currently no > (obvious, clean) way to get str. > Long ago I asked for an s"abc" prefix, where the type matches the Python-side 'str' type.. And I would like to see this working that way despite any -2 or -3 flags... > Clearly Dominic has a usecase. > > I have a simple usecase too. Often in Sage one has functions like > > def charpoly(self, algorithm='default'): > > This will break if I run it in Python 3. The same happens to me in, for example, petsc4py..
In order to create a linear solver and select the iterative method, I have to write (in Python code): from petsc4py import PETSc solver = PETSc.KSP().create() solver.setType("gmres") # this calls C function KSPSetType(KSP ksp, const char *type_name) Having to handle the bytes/unicode thing by hand (I mean, using Python C-API) for such a simple thing is REALLY annoying... > Specifically would mean we have to get everyone who writes code of > the above form to use 'u' despite the fact that the existence of > unicode is *completely* irrelevant to the task at hand. That's the case in all my projects... These codes are not related to string handling; strings are just used for setting a few options and selecting algorithms, and ALL of them fit in ASCII... > These are > just strings, I don't want to have to think about (or, more to the > point, explain) byte strings, encodings, unicode, etc. unless one is > actually dealing with byte strings, encodings, etc. > Indeed. > Perhaps the difference in opinion comes from my perspective that, at > a high level, str just got changed (for the better) in Py3. > I second this view. I see Python 3 as an improved language, but not so radically different from Python 2 as to justify a stricter adherence to the semantics of one or the other language version... Moreover, I do consider Cython/Pyrex a new language targeting two different runtimes: Py2 and Py3. How to handle the slightly different semantics of these two runtimes? We use a sensible default blessed by our BDFL, but let USERS decide which semantics to follow when their Cython code ends up running in a Py2 or Py3 runtime... >>> import this ... In the face of ambiguity, refuse the temptation to guess. Why should Cython enforce which of the Py2/Py3 semantics a *.pyx file has? That is just a VILE guess from the Cython team... In short, Cython has to deal with Py2/Py3 runtimes, and they are different. Please, do not ENFORCE semantics..
Let's use sensible defaults but let end users decide what they want/need for getting their job done... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From joschu at caltech.edu Sat Sep 12 20:37:13 2009 From: joschu at caltech.edu (John Schulman) Date: Sat, 12 Sep 2009 14:37:13 -0400 Subject: [Cython] Cython on MacOS 10.6 Snow Leopard In-Reply-To: References: <185761440909112134o50110485y898aa4c8336a2768@mail.gmail.com> <185761440909112149y3c65a2a0sb19848b964d35150@mail.gmail.com> Message-ID: <185761440909121137p673ae47i64ccd8cb4c9c689b@mail.gmail.com> Yep. But then I get some other errors related to int size. My solution is to move all of my development to linux, at least until all of the incompatibilities of snow leopard get fixed. On Sat, Sep 12, 2009 at 11:56 AM, Lisandro Dalcin wrote: > Could you grep in the whole lib/python2.5 for the offending flags? > > > On Sat, Sep 12, 2009 at 1:49 AM, John Schulman wrote: >> By the way, my python is EPD 5.0, which I just installed. >> >> On Sat, Sep 12, 2009 at 12:34 AM, John Schulman wrote: >>> Thanks for posting, but this does not work for me. >>> I grepped the cython directory and removed every instance of >>> -Wno-long-double, but I still get the same error, which totally >>> baffles me. (I did this right after downloading the package, so >>> there's no build stuff sitting around) >>> >>> >>> On Thu, Sep 10, 2009 at 7:53 PM, Richard West wrote: >>>> Hi, >>>> I recently upgraded to Mac OS X 10.6 Snow Leopard, which means I am >>>> now using gcc version 4.2.1 (Apple Inc. build 5646) >>>> When I first tried to use Cython after the upgrade I was getting >>>> errors like >>>>
cc1: error: unrecognized command line option "-Wno-long-double" >>>> presumably because the deprecated -Wno-long-double option was removed >>>> from gcc. >>>> >>>> When trying to build Cython itself on the default Python 2.6 >>>> installation, I was also getting a lot of warnings like >>>> ? /usr/include/AvailabilityMacros.h:108:14: warning: #warning >>>> Building for Intel with Mac OS X Deployment Target < 10.4 is invalid. >>>> >>>> >>>> My workaround, which seems to work OK so far, is as follows: >>>> >>>> First run >>>> ? ?$ easy_install -eb temporary_folder Cython >>>> to download but not install Cython >>>> >>>> On line 32 of temporary_folder/cython/Cython/Mac/DarwinSystem.py change >>>> ? ? os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.3" >>>> to >>>> ? ? os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.4" >>>> >>>> >>>> And on line 36 of Cython/Mac/DarwinSystem.py remove the ?"-Wno-long- >>>> double" option. >>>> >>>> Then run >>>> ? ?$ sudo easy_install temporary_folder/cython/ >>>> to build and install the modified Cython. >>>> >>>> Hope this saves someone a few minutes. 
>>>> >>>> Richard >>>> _______________________________________________ >>>> Cython-dev mailing list >>>> Cython-dev at codespeak.net >>>> http://codespeak.net/mailman/listinfo/cython-dev >>>> >>> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > > > -- > Lisandro Dalcín > --------------- > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From dominic.sacre at gmx.de Sat Sep 12 21:00:43 2009 From: dominic.sacre at gmx.de (Dominic Sacré) Date: Sat, 12 Sep 2009 21:00:43 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <82767F17-5740-4970-8FC6-63A32578E278@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <82767F17-5740-4970-8FC6-63A32578E278@math.washington.edu> Message-ID: <200909122100.43350.dominic.sacre@gmx.de> On Saturday 12 of September 2009 19:48:52 Robert Bradshaw wrote: > On Sep 12, 2009, at 2:26 AM, Dag Sverre Seljebotn wrote: > > I'm with Stefan, a -3 flag which turns on > > > > from __future__ import division, unicode_literals, etc > > > > seems like the right mechanism. Changing semantics based on the > > Python version used to compile the C source can't be a good thing. > > We already do for the rest of the builtins. > > The Py2 str object is gone in Py3. Bytes do not support the % > operator (probably one of the most common operations on strings) and, > as pointed out, bytes(x) does not give the string representation of > x (str(5) -> "\0\0\0\0\0" is rather unsettling).
Semantically, the > str type of Py2 is closer to the str type of Py3 than it is to the > bytes type of Py3, and is meant to be used in its place. The fact > that it's unicode rather than bytes under the hood is an > implementation detail that the user need not be bothered with only > when they are trying to get at the underlying char*. I agree. In most of the places I used str and unprefixed literals in my original Py2/Pyrex based code, it simply means "I want text". Except for the few places where I actually need to convert to char*, all my code would still work fine with Py3's unicode str. bytes, on the other hand, seems to be a very bad replacement for str. I ran into both of the issues mentioned above (% operator and str(n)) when I tried my code with Py3. Also, 'foo'[0] equals 102, and even a simple print 'foo' doesn't work as expected (it prints b'foo'). I'm fairly new to Cython, so please excuse my ignorance, but even after reading many of the mails in the list archive about this topic, I still don't understand why the str -> bytes replacement is necessary. Why not just let 'unicode' always denote a unicode string, 'bytes' always a byte string, and let 'str' be 'str' in any Python version? Dominic From stefan_ml at behnel.de Sat Sep 12 21:51:49 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 Sep 2009 21:51:49 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> Message-ID: <4AABFBD5.7080308@behnel.de> Robert Bradshaw wrote: > If one always wants bytes, one can do b"something." If one always > wants unicode, one can do u"something." There's currently no > (obvious, clean) way to get str. I get your point, although to me, "get str" smells like there isn't a clean way anyway. 
Compared to CPython, we actually have the advantage of supporting the 'b' prefix for all Py2 versions. So you can write portable code that is explicit about this prefix. That's certainly not the case for plain Python code if you need to support Python versions before 2.6. What would be the plan for a switch then? I think if we do this now, there will be two kinds of users: those who already changed their code to explicit string semantics to adapt it to Py3, and those who didn't care (yet). I actually think that such a switch would break both kinds of user code. The first one, as it wasn't necessary before to prefix byte strings with 'b', so it most likely wasn't done, and the second one because the existing code is likely not portable anyway (so it won't break more than it already is). The second group has the advantage of not having invested time, and the first (and likely smaller) group will have to fix up their code again. I'm certainly in the first group, but I guess the second group clearly outweighs the first one. Robert and/or Lisandro, would you write up a CEP that sums up and describes the proposed semantics for C strings and unprefixed byte strings? I would want to see a couple of examples in there that show in what cases code will break or not break, what changes will be required to fix broken code up, and what the change will simplify for code migration to Py3. That would make it quite clear how big the advantage actually is. From the top of my head, I can think of docstrings, for example, where this would be helpful, and I already mentioned the keywords example. 
Stefan From robertwb at math.washington.edu Sat Sep 12 22:16:56 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Sep 2009 13:16:56 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AABFBD5.7080308@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> Message-ID: On Sep 12, 2009, at 12:51 PM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> If one always wants bytes, one can do b"something." If one always >> wants unicode, one can do u"something." There's currently no >> (obvious, clean) way to get str. > > I get your point, although to me, "get str" smells like there isn't > a clean > way anyway. Compared to CPython, we actually have the advantage of > supporting the 'b' prefix for all Py2 versions. So you can write > portable > code that is explicit about this prefix. That's certainly not the > case for > plain Python code if you need to support Python versions before 2.6. > > What would be the plan for a switch then? I'd say do a warning for un-prefixed literals for 0.11.3, and then the actual switch for 0.12. > I think if we do this now, there > will be two kinds of users: those who already changed their code to > explicit string semantics to adapt it to Py3, and those who didn't > care > (yet). I actually think that such a switch would break both kinds > of user > code. The first one, as it wasn't necessary before to prefix byte > strings > with 'b', so it most likely wasn't done, and the second one because > the > existing code is likely not portable anyway (so it won't break more > than it > already is). The second group has the advantage of not having invested > time, and the first (and likely smaller) group will have to fix up > their > code again. I'm certainly in the first group, but I guess the > second group > clearly outweighs the first one. 
> > Robert and/or Lisandro, would you write up a CEP that sums up and > describes > the proposed semantics for C strings and unprefixed byte strings? I > would > want to see a couple of examples in there that show in what cases > code will > break or not break, what changes will be required to fix broken > code up, > and what the change will simplify for code migration to Py3. That > would > make it quite clear how big the advantage actually is. From the top > of my > head, I can think of docstrings, for example, where this would be > helpful, > and I already mentioned the keywords example. Sure, I'll do that. I initially hesitated bring this volatile topic up again, but I think this is a solution that will make life less cumbersome for most Cython users without letting them be sloppy about encodings. - Robert From greg.ewing at canterbury.ac.nz Sun Sep 13 03:22:41 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Sep 2009 13:22:41 +1200 Subject: [Cython] how to make cython definitions available to external C code? In-Reply-To: References: Message-ID: <4AAC4961.8040305@canterbury.ac.nz> Darren Dale wrote: > So let me see if I understood you: Cython is not currently designed to > let you build a module up from several submodule sources, be they C or > Cython, because it will not expose the symbols in those submodules. Symbol visibility isn't the problem (that can be fixed with public/extern declarations). The problem is that Pyrex/Cython compiles each .pyx file into a module with its own module init function, tables of strings, etc. etc. It's not designed to merge those things together from multiple separately-compiled .pyx files. There's no problem with linking multiple separately-compiled C sources with a single .pyx. It's also possible for a number of modules, each one generated from a .pyx, to import extension types and C functions from each other at run time, so you can divide the functionality of a package across .pyx sources that way. 
You can also use the 'include' statement to glue a number of source files together into a single module, but it will all be compiled together, which may take a while. -- Greg From robertwb at math.washington.edu Sun Sep 13 10:51:39 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 13 Sep 2009 01:51:39 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> Message-ID: <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> On Sep 12, 2009, at 1:16 PM, Robert Bradshaw wrote: >> Robert and/or Lisandro, would you write up a CEP that sums up and >> describes the proposed semantics for C strings and unprefixed byte >> strings? I >> would want to see a couple of examples in there that show in what >> cases >> code will break or not break, what changes will be required to fix >> broken >> code up, and what the change will simplify for code migration to >> Py3. That >> would make it quite clear how big the advantage actually is. From >> the top >> of my head, I can think of docstrings, for example, where this >> would be >> helpful, and I already mentioned the keywords example. > > Sure, I'll do that. I have a draft up at http://wiki.cython.org/enhancements/ stringliterals , please all feel free to add and edit. 
- Robert From stefan_ml at behnel.de Sun Sep 13 12:55:55 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Sep 2009 12:55:55 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> Message-ID: <4AACCFBB.1050508@behnel.de> Robert Bradshaw wrote: > I have a draft up at http://wiki.cython.org/enhancements/stringliterals > please all feel free to add and edit. Thanks, Robert. One thing I see missing is how this would be handled: cdef str s = "some string" cdef char* cs = s Should this simply result in a runtime error under Py3? Or would you forbid this and just raise a Cython compiler error (or warning), stating that "bytes" should be used instead? Although, this might actually appear inside of a Py2-only or try-except block, so I guess a warning would be the most we can do. BTW, we shouldn't forget to adapt the .pxd files in Cython/Includes accordingly, so that they return either "bytes" or "unicode", but *never* "str" (or "object", if we know it's a string type). And "str", "bytes" and "unicode" wouldn't be assignable to each other, right? Or would you also leave that to runtime? 
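For comparison, a minimal sketch in plain Python 3 (not Cython) of what a runtime failure looks like when a text string is handed to a C char* slot. It uses the standard ctypes.c_char_p as a stand-in for the "cdef char* cs = s" coercion; the helper name to_c_string is hypothetical, and how Cython itself should report this case is exactly what the question above leaves open.

```python
import ctypes


def to_c_string(value):
    """Wrap `value` as a C char* (hypothetical helper for illustration)."""
    return ctypes.c_char_p(value)


# A byte string maps directly onto char*.
cs = to_c_string(b"some string")
print(cs.value)

# A text (unicode) string does not: on Python 3 ctypes raises TypeError at runtime.
try:
    to_c_string("some string")
    rejected = False
except TypeError:
    rejected = True
print(rejected)

# Encoding explicitly makes both the conversion and the chosen codec visible.
encoded = to_c_string("some string".encode("UTF-8"))
print(encoded.value)
```

The explicit encode step is the portable form under these assumptions: it behaves the same whether the source value started life as text or was already bytes-compatible.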
Stefan From dalcinl at gmail.com Sun Sep 13 20:56:42 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 13 Sep 2009 15:56:42 -0300 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AACCFBB.1050508@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> Message-ID: On Sun, Sep 13, 2009 at 7:55 AM, Stefan Behnel wrote: > > Robert Bradshaw wrote: >> I have a draft up at http://wiki.cython.org/enhancements/stringliterals >> please all feel free to add and edit. > > Thanks, Robert. > > One thing I see missing is how this would be handled: > >     cdef str s = "some string" >     cdef char* cs = s > > Should this simply result in a runtime error under Py3? > > Or would you forbid this and just raise a Cython compiler error (or > warning), stating that "bytes" should be used instead? Although, this might > actually appear inside of a Py2-only or try-except block, so I guess a > warning would be the most we can do. > I'm inclined for a warning... and that warning would not be generated in this case: "cdef char*cs = s" , right? > BTW, we shouldn't forget to adapt the .pxd files in Cython/Includes > accordingly, so that they return either "bytes" or "unicode", but *never* > "str" (or "object", if we know it's a string type). > Could you point to a couple of the C-API calls you are talking about? > And "str", "bytes" and "unicode" wouldn't be assignable to each other, > right? Or would you also leave that to runtime? > "bytes" <-> "unicode" (obviously?) would not be assignable, though for the case of "bytes" <-> "str" or "str" <-> "unicode", we could generate similar Cython compile warnings as for the "[unsigned ]char *" conversions.
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Sun Sep 13 21:39:11 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Sep 2009 21:39:11 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> Message-ID: <4AAD4A5F.7000806@behnel.de> Lisandro Dalcin wrote: > On Sun, Sep 13, 2009 at 7:55 AM, Stefan Behnel wrote: >> One thing I see missing is how this would be handled: >> >> cdef str s = "some string" >> cdef char* cs = s >> >> Should this simply result in a runtime error under Py3? >> >> Or would you forbid this and just raise a Cython compiler error (or >> warning), stating that "bytes" should be used instead? Although, this might >> actually appear inside of a Py2-only or try-except block, so I guess a >> warning would be the most we can do. > > I'm inclined for a warning... and that warning would not be generated > in this case: "cdef char*cs = s" , right? Sure. >> BTW, we shouldn't forget to adapt the .pxd files in Cython/Includes >> accordingly, so that they return either "bytes" or "unicode", but *never* >> "str" (or "object", if we know it's a string type). > > Could you point to a couple of the C-API calls you are talking about? Things like the encoding functions, for example, that convert between bytes and unicode. Using str here would be wrong.
Plus, changing the argument/return value types from "object" to the right types will allow Cython to do actual type checking. >> And "str", "bytes" and "unicode" wouldn't be assignable to each other, >> right? Or would you also leave that to runtime? > > "bytes" <-> "unicode" (obviously?) would not be assignable, tough for > the case of "bytes" <-> "str" or "str" <-> "unicode", we could > generate similar Cython compile warnings as for the "[unsigned ]char > *" conversions. Yes, I guess that's a similar case. Stefan From dagss at student.matnat.uio.no Mon Sep 14 13:27:16 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 14 Sep 2009 13:27:16 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> Message-ID: <4AAE2894.7080307@student.matnat.uio.no> Robert Bradshaw wrote: > On Sep 12, 2009, at 1:16 PM, Robert Bradshaw wrote: > > >>> Robert and/or Lisandro, would you write up a CEP that sums up and >>> describes the proposed semantics for C strings and unprefixed byte >>> strings? I >>> would want to see a couple of examples in there that show in what >>> cases >>> code will break or not break, what changes will be required to fix >>> broken >>> code up, and what the change will simplify for code migration to >>> Py3. That >>> would make it quite clear how big the advantage actually is. From >>> the top >>> of my head, I can think of docstrings, for example, where this >>> would be >>> helpful, and I already mentioned the keywords example. >>> >> Sure, I'll do that. >> > > I have a draft up at http://wiki.cython.org/enhancements/ > stringliterals , please all feel free to add and edit. 
That did it for me, here's a +1 to this change in general. Dag Sverre From dalcinl at gmail.com Mon Sep 14 17:47:19 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 14 Sep 2009 12:47:19 -0300 Subject: [Cython] Py2.3 & eval.h: move include to proto section in pyexec_utility_code Message-ID: Stefan, I think this is the proper way... If you agree, please push the fix...

diff -r fdf71a6bed70 Cython/Compiler/Builtin.py
--- a/Cython/Compiler/Builtin.py Sat Sep 12 18:37:01 2009 +0200
+++ b/Cython/Compiler/Builtin.py Mon Sep 14 12:24:51 2009 -0300
@@ -163,14 +163,14 @@

 pyexec_utility_code = UtilityCode(
 proto = """
-static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*);
-""",
-impl = '''
 #if PY_VERSION_HEX < 0x02040000
 #ifndef Py_EVAL_H
 #include "eval.h"
 #endif
 #endif
+static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*);
+""",
+impl = '''
 static PyObject* __Pyx_PyRun(PyObject* o, PyObject* globals, PyObject* locals) {
     PyObject* result;
     PyObject* s = 0;

-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Mon Sep 14 17:55:35 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Sep 2009 17:55:35 +0200 Subject: [Cython] Py2.3 & eval.h: move include to proto section in pyexec_utility_code In-Reply-To: References: Message-ID: <4AAE6777.3030300@behnel.de> Lisandro Dalcin wrote: > Stefan, I think this is the proper way... If you agree, please push the fix... I don't mind either way. Since you wrote the patch, please push it to -devel and -unstable.
Stefan From robertwb at math.washington.edu Mon Sep 14 19:08:51 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Sep 2009 10:08:51 -0700 Subject: [Cython] Py2.3 & eval.h: move include to proto section in pyexec_utility_code In-Reply-To: References: Message-ID: <39195600-CAAC-42F8-8A20-2999F37D9F8D@math.washington.edu> Looks right to me. We don't need eval.h anywhere else, do we? On Sep 14, 2009, at 8:47 AM, Lisandro Dalcin wrote: > Stefan, I think this is the proper way... If you agree, please push > the fix... > > diff -r fdf71a6bed70 Cython/Compiler/Builtin.py > --- a/Cython/Compiler/Builtin.py Sat Sep 12 18:37:01 2009 +0200 > +++ b/Cython/Compiler/Builtin.py Mon Sep 14 12:24:51 2009 -0300 > @@ -163,14 +163,14 @@ > > pyexec_utility_code = UtilityCode( > proto = """ > -static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*); > -""", > -impl = ''' > #if PY_VERSION_HEX < 0x02040000 > #ifndef Py_EVAL_H > #include "eval.h" > #endif > #endif > +static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*); > +""", > +impl = ''' > static PyObject* __Pyx_PyRun(PyObject* o, PyObject* globals, > PyObject* locals) { > PyObject* result; > PyObject* s = 0; > > > -- > Lisandro Dalcín > --------------- > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Mon Sep 14 19:36:03 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Sep 2009 10:36:03 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AAD4A5F.7000806@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de>
<4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> Message-ID: <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> On Sep 13, 2009, at 12:39 PM, Stefan Behnel wrote: > > Lisandro Dalcin wrote: >> On Sun, Sep 13, 2009 at 7:55 AM, Stefan Behnel wrote: >>> One thing I see missing is how this would be handled: >>> >>> cdef str s = "some string" >>> cdef char* cs = s >>> >>> Should this simply result in a runtime error under Py3? >>> >>> Or would you forbid this and just raise a Cython compiler error (or >>> warning), stating that "bytes" should be used instead? Although, >>> this might >>> actually appear inside of a Py2-only or try-except block, so I >>> guess a >>> warning would be the most we can do. >> >> I'm inclined for a warning... and that warning would not be generated >> in this case: "cdef char*cs = s" , right? > > Sure. That could be bad, s doesn't actually do a typecheck, especially if the bytes -> char* is eventually optimized. One should do s or s (neither of which generate a warning). > >>> BTW, we shouldn't forget to adapt the .pxd files in Cython/Includes >>> accordingly, so that they return either "bytes" or "unicode", but >>> *never* >>> "str" (or "object", if we know it's a string type). >> >> Could you point to a couple of the C-API calls you are talking about? > > Things like the encoding functions, for example, that convert > between bytes > and unicode. Using str here would be wrong. > > Plus, changing the argument/return value types from "object" to the > right > types will allow Cython to do actual type checking. Often the type checking will be redundant with the type checking that happens inside the method, so I'm not so sure this is a good idea. 
> >>> And "str", "bytes" and "unicode" wouldn't be assignable to each >>> other, >>> right? Or would you also leave that to runtime? >> >> "bytes" <-> "unicode" (obviously?) would not be assignable, tough for >> the case of "bytes" <-> "str" or "str" <-> "unicode", we could >> generate similar Cython compile warnings as for the "[unsigned ]char >> *" conversions. > > Yes, I guess that's a similar case. I'd be inclined to outright disallow them, favoring requiring or or cast. Currently, though, I can't think of any reason to type str/bytes/unicode variables at all. - Robert From dalcinl at gmail.com Mon Sep 14 19:53:57 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 14 Sep 2009 14:53:57 -0300 Subject: [Cython] Py2.3 & eval.h: move include to proto section in pyexec_utility_code In-Reply-To: <39195600-CAAC-42F8-8A20-2999F37D9F8D@math.washington.edu> References: <39195600-CAAC-42F8-8A20-2999F37D9F8D@math.washington.edu> Message-ID: On Mon, Sep 14, 2009 at 2:08 PM, Robert Bradshaw wrote: > Looks right to me. Pushed to -devel and -unstable > > We don't need eval.h anywhere else, do we? > No. BTW, I really do not understand why the calls in eval.h are not in ceval.h ... > On Sep 14, 2009, at 8:47 AM, Lisandro Dalcin wrote: > >> Stefan, I think this is the proper way... If you agree, please push >> the fix... >> >> diff -r fdf71a6bed70 Cython/Compiler/Builtin.py >> --- a/Cython/Compiler/Builtin.py ? ? ?Sat Sep 12 18:37:01 2009 +0200 >> +++ b/Cython/Compiler/Builtin.py ? ? ?Mon Sep 14 12:24:51 2009 -0300 >> @@ -163,14 +163,14 @@ >> >> ?pyexec_utility_code = UtilityCode( >> ?proto = """ >> -static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*); >> -""", >> -impl = ''' >> ?#if PY_VERSION_HEX < 0x02040000 >> ?#ifndef Py_EVAL_H >> ?#include "eval.h" >> ?#endif >> ?#endif >> +static PyObject* __Pyx_PyRun(PyObject*, PyObject*, PyObject*); >> +""", >> +impl = ''' >> ?static PyObject* __Pyx_PyRun(PyObject* o, PyObject* globals, >> PyObject* locals) { >> ? ? 
PyObject* result; >>     PyObject* s = 0; >> >> >> -- >> Lisandro Dalcín >> --------------- >> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) >> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) >> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) >> PTLC - Güemes 3450, (3000) Santa Fe, Argentina >> Tel/Fax: +54-(0)342-451.1594 >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Mon Sep 14 20:47:01 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Sep 2009 20:47:01 +0200 Subject: [Cython] Py2.3 & eval.h: move include to proto section in pyexec_utility_code In-Reply-To: References: <39195600-CAAC-42F8-8A20-2999F37D9F8D@math.washington.edu> Message-ID: <4AAE8FA5.7090909@behnel.de> Lisandro Dalcin wrote: > I really do not understand why the calls in eval.h are not in ceval.h ... Why bother. Both are included by Python.h starting with 2.4, so I don't really care what is defined where.
Stefan From dalcinl at gmail.com Mon Sep 14 20:48:15 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 14 Sep 2009 15:48:15 -0300 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> Message-ID: On Mon, Sep 14, 2009 at 2:36 PM, Robert Bradshaw wrote: >> Lisandro Dalcin wrote: >>> I'm inclined for a warning... and that warning would not be generated >>> in this case: "cdef char*cs = s" , right? > > That could be bad, s doesn't actually do a typecheck, > especially if the bytes -> char* is eventually optimized. One should > do s or s (neither of which generate a warning). > Of course, is as dangerous as any cast (and these casts defaulting to no-typecheck are the root of the evil)... Still, as casts are "blindly" honoured, I'm not sure if we should special case this usage ... >> >>>> BTW, we shouldn't forget to adapt the .pxd files in Cython/Includes >>>> accordingly, so that they return either "bytes" or "unicode", but >>>> *never* >>>> "str" (or "object", if we know it's a string type). >>> >>> Could you point to a couple of the C-API calls you are talking about? >> >> Things like the encoding functions, for example, that convert >> between bytes >> and unicode. Using str here would be wrong. >> >> Plus, changing the argument/return value types from "object" to the >> right >> types will allow Cython to do actual type checking. > > Often the type checking will be redundant with the type checking that > happens inside the method, so I'm not so sure this is a good idea. 
> Perhaps it is time to generate a Cython compile-time warning when calling with an untyped arg value? Again, an explicit cast would suppress the warning... I mean, suppose this code:

cdef void foo(tuple t):
    # REMEMBER!!, foo() does not typecheck that "t" is actually a tuple!!!
    pass

cdef object a = 1
foo(a)

cdef list b = [1]
foo(b)

cdef tuple c = (1,)
foo(c)

1) When calling foo(a), we could emit a compile-time warning and generate C code with runtime type-check... What's the point of duck typing here iff foo() actually requires a tuple?
2) When calling foo(b), we could generate compile-time ERROR... What's the point of letting this Cython-compile silently and making it fail at runtime (like currently happens)?
3) When calling foo(c), no warnings and no C runtime type-check (as currently is being done)

BTW, perhaps all this stuff is important enough as to start a new thread? >> >>>> And "str", "bytes" and "unicode" wouldn't be assignable to each >>>> other, >>>> right? Or would you also leave that to runtime? >>> >>> "bytes" <-> "unicode" (obviously?) would not be assignable, though for >>> the case of "bytes" <-> "str" or "str" <-> "unicode", we could >>> generate similar Cython compile warnings as for the "[unsigned ]char >>> *" conversions. >> >> Yes, I guess that's a similar case. > > > I'd be inclined to outright disallow them, favoring requiring ?> or or cast. Currently, though, I can't think > of any reason to type str/bytes/unicode variables at all. > In the future, you could optimize item access for "bytes" ? Additionally, you could need to use typed variables because you want to enforce automatic compile-time/runtime type-checking on assignment or when calling functions ...
Or you want to save the type-check before call, as in foo(c) in my example above -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Mon Sep 14 21:05:17 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Sep 2009 21:05:17 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> Message-ID: <4AAE93ED.8020704@behnel.de> Robert Bradshaw wrote: > On Sep 13, 2009, at 12:39 PM, Stefan Behnel wrote: >>>> cdef str s = "some string" >>>> cdef char* cs = s >>>> >>> I'm inclined for a warning... and that warning would not be generated >>> in this case: "cdef char*cs = s" , right? >> Sure. > > That could be bad, s doesn't actually do a typecheck, > especially if the bytes -> char* is eventually optimized. One should > do s or s (neither of which generate a warning). To me, that's just like casting an int to a void*. I don't see a reason to special case some casts while we already allow all that dangerous C stuff. If nothing else, a cast is a clear way to say "I know better!". And if you actually do not know better, you'll see where that gets you. Not Cython's problem. >> changing the argument/return value types from "object" to the >> right types will allow Cython to do actual type checking.
> > Often the type checking will be redundant with the type checking that > happens inside the method, so I'm not so sure this is a good idea. I meant compile time type checking, which won't hurt performance but helps in making the C-API safer and also allows Cython to do some optimisations. For example, I only noticed recently that literal Python strings were always treated as "object" in Cython. So things like u"".join() were never associated with the unicode type. >>>> And "str", "bytes" and "unicode" wouldn't be assignable to each >>>> other, >>>> right? Or would you also leave that to runtime? >>> "bytes" <-> "unicode" (obviously?) would not be assignable, tough for >>> the case of "bytes" <-> "str" or "str" <-> "unicode", we could >>> generate similar Cython compile warnings as for the "[unsigned ]char >>> *" conversions. >> Yes, I guess that's a similar case. > > I'd be inclined to outright disallow them, favoring requiring > or or cast. Perfectly fine with me. > Currently, though, I can't think > of any reason to type str/bytes/unicode variables at all. You should take a look at the call optimisations for builtin types. I've been adding to them for a while now, and they really make a huge difference. For example, this: cdef unicode u = some_unicode_string s = u.encode('UTF-8') will now result in a straight C call to the UTF-8 encoder, instead of looking up the method, calling it, and having it look up the codec internally. I find that pretty cool. 
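As a rough analogy at the Python level (not the C code Cython generates, which this message does not show), the difference between the generic method call and a direct call to the encoder can be sketched like this. It assumes Python 3 and uses the standard codecs.utf_8_encode function; the variable names are made up for the illustration.

```python
import codecs

u = "gr\u00fc\u00dfe"  # a text string with non-ASCII characters

# Generic route: look up the method, which in turn looks up the codec by name.
via_method = u.encode("UTF-8")

# Direct route: call the UTF-8 encoder itself, skipping the lookup by name.
# It returns the encoded bytes plus the number of characters consumed.
via_codec, consumed = codecs.utf_8_encode(u)

print(via_method == via_codec)
print(type(via_method).__name__, consumed)
```

Both routes yield the same byte string; the point of typing the variable as unicode is that the compiler can pick the direct route on its own once the type is known.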
Stefan From stefan_ml at behnel.de Mon Sep 14 21:13:39 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Sep 2009 21:13:39 +0200 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> Message-ID: <4AAE95E3.3010301@behnel.de> Lisandro Dalcin wrote: > Perhaps it is time to generate Cython compile-time warning when > calling with an untyped arg value? Again, an explicit cast would > suppress the warning... > [...] > BTW, perhaps all this stuff is important enough as to start a new thread? Yep, I guess so. > In the future, you could optimize item access for "bytes" ? That's worth a feature request in trac, I'd say. Although you'd normally cast a byte string to a char* anyway if you want to do performance critical operations on it... > Additionally, you could need to use typed variables because you want > to enforce automatic compile-time/runtime type-checking on assignement > or when calling functions ... Or you want to save the type-check > before call, as in foo(c) in my example above When are we finally getting the tiny bit of type inference that's necessary to do this internally? Stefan From robertwb at math.washington.edu Tue Sep 15 03:47:53 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Sep 2009 18:47:53 -0700 Subject: [Cython] Type checking Message-ID: On Sep 14, 2009, at 11:48 AM, Lisandro Dalcin wrote: > On Mon, Sep 14, 2009 at 2:36 PM, Robert Bradshaw > wrote: >>> Lisandro Dalcin wrote: >>>> I'm inclined for a warning... and that warning would not be >>>> generated >>>> in this case: "cdef char*cs = s" , right? 
>> >> That could be bad, s doesn't actually do a typecheck, >> especially if the bytes -> char* is eventually optimized. One should >> do s or s (neither of which generate a warning). > > Of course, is as dangerous as any cast (and > these casts defaulting to no-typecheck are the root of the evil)... I was considering for a no-typecheck cast at one point, but changing the default to do typechecking is super backwards incompatible. > Still, as casts are "blindly" honoured, I'm not sure if > we should special case this usage ... No, I don't think we should special case it--I'm saying that we shouldn't encourage this as it could produce a hard crash if s is not bytes, e.g. input from in Py3. >>>>> BTW, we shouldn't forget to adapt the .pxd files in Cython/ >>>>> Includes >>>>> accordingly, so that they return either "bytes" or "unicode", but >>>>> *never* >>>>> "str" (or "object", if we know it's a string type). >>>> >>>> Could you point to a couple of the C-API calls you are talking >>>> about? >>> >>> Things like the encoding functions, for example, that convert >>> between bytes >>> and unicode. Using str here would be wrong. >>> >>> Plus, changing the argument/return value types from "object" to the >>> right >>> types will allow Cython to do actual type checking. >> >> Often the type checking will be redundant with the type checking that >> happens inside the method, so I'm not so sure this is a good idea. >> > > Perhaps it is time to generate Cython compile-time warning when > calling with an untyped arg value? Again, an explicit cast would > suppress the warning... > > I mean, suppose this code: > > cdef void foo(tuple t): # REMEMBER!!, foo() does not typecheck that > "t" is actually a tuple!!! > pass > > cdef object a = 1 > foo(a) > > cdef list b = [1] > foo(b) > > cdef tuple c = (1,) > foo(c) > > 1) When calling foo(a), we could emit a compile-time warning and > generate C code with runtime type-check... 
What's the point of duck > typing here iff foo() actually requires a tuple? We don't want to generate so many warnings they become useless, I personally wouldn't want that on by default. It's nice because a lot of the inputs, return results, etc. are not typed, e.g. I can do foo (bar(x)) where bar is any old Python function. > 2) When calling foo(b), we could generate compile-time ERROR... What's > the point of letting this Cython-compile silently and making it fail > as runtime (like currently happens) ? Yes, that would probably be a good idea. > 3) When calling foo(c), no warnigns and no C runtime type-check (as > currently is being done) Yep. > BTW, perhaps all this stuff is important enough as to start a new > thread? Done :) - Robert From robertwb at math.washington.edu Tue Sep 15 03:56:33 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Sep 2009 18:56:33 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AAE93ED.8020704@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> <4AAE93ED.8020704@behnel.de> Message-ID: <818CB079-037C-4252-BBB1-48A7715F5240@math.washington.edu> On Sep 14, 2009, at 12:05 PM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> On Sep 13, 2009, at 12:39 PM, Stefan Behnel wrote: >>>>> cdef str s = "some string" >>>>> cdef char* cs = s >>>>> >>>> I'm inclined for a warning... and that warning would not be >>>> generated >>>> in this case: "cdef char*cs = s" , right? >>> Sure. >> >> That could be bad, s doesn't actually do a typecheck, >> especially if the bytes -> char* is eventually optimized. One should >> do s or s (neither of which generate a warning). 
> > To me, that's just like casting an int to a void*. I don't see a > reason to > special case some casts while we already allow all that dangerous C > stuff. > If nothing else, a cast is a clear way to say "I know better!". And > if you > actually do not know better, you'll see where that gets you. Not > Cython's > problem. Yes, as I said I was just saying that we shouldn't encourage *this* solution, as it doesn't do type checking. >>> changing the argument/return value types from "object" to the >>> right types will allow Cython to do actual type checking. >> >> Often the type checking will be redundant with the type checking that >> happens inside the method, so I'm not so sure this is a good idea. > > I meant compile time type checking, which won't hurt performance > but helps > in making the C-API safer and also allows Cython to do some > optimisations. Sometimes. For example, PyUnicode_GetSize in principle take a unicode object, but is only typed to take a object. It performs its own typecheck, so we should just define it as taking an object and not do the redundant type check ourselves. > For example, I only noticed recently that literal Python strings were > always treated as "object" in Cython. So things like u"".join() > were never > associated with the unicode type. Yes, if u"" is typed, we should be able to optimize on it. >>>>> And "str", "bytes" and "unicode" wouldn't be assignable to each >>>>> other, >>>>> right? Or would you also leave that to runtime? >>>> "bytes" <-> "unicode" (obviously?) would not be assignable, >>>> tough for >>>> the case of "bytes" <-> "str" or "str" <-> "unicode", we could >>>> generate similar Cython compile warnings as for the "[unsigned ] >>>> char >>>> *" conversions. >>> Yes, I guess that's a similar case. >> >> I'd be inclined to outright disallow them, favoring requiring >> or or cast. > > Perfectly fine with me. > > >> Currently, though, I can't think >> of any reason to type str/bytes/unicode variables at all. 
> > You should take a look at the call optimisations for builtin types. > I've > been adding to them for a while now, and they really make a huge > difference. > > For example, this: > > cdef unicode u = some_unicode_string > s = u.encode('UTF-8') > > will now result in a straight C call to the UTF-8 encoder, instead of > looking up the method, calling it, and having it look up the codec > internally. I find that pretty cool. Hmm, not for me (at least not in the -devel branch), but I could see this being very nice. - Robert From robertwb at math.washington.edu Tue Sep 15 03:57:39 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Sep 2009 18:57:39 -0700 Subject: [Cython] String types with Python 2.x and 3.x In-Reply-To: <4AAE95E3.3010301@behnel.de> References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> <4AAE95E3.3010301@behnel.de> Message-ID: On Sep 14, 2009, at 12:13 PM, Stefan Behnel wrote: > When are we finally getting the tiny bit of type inference that's > necessary > to do this internally? If no one gets around to it, I'll do it (ideas have been churning in the back of my mind for a while), but no promises as to when I'll find the time... 
- Robert

From stefan_ml at behnel.de Tue Sep 15 08:01:45 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 15 Sep 2009 08:01:45 +0200
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <818CB079-037C-4252-BBB1-48A7715F5240@math.washington.edu>
References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> <4AAE93ED.8020704@behnel.de> <818CB079-037C-4252-BBB1-48A7715F5240@math.washington.edu>
Message-ID: <4AAF2DC9.4020709@behnel.de>

Hi Robert,

Robert Bradshaw wrote:
> On Sep 14, 2009, at 12:05 PM, Stefan Behnel wrote:
>>>>> cdef str s = "some string"
>>>>> cdef char* cs = s
>>>>>
>> I don't see a reason to
>> special case some casts while we already allow all that dangerous C
>> stuff. If nothing else, a cast is a clear way to say "I know better!".
>> And if you actually do not know better, you'll see where that gets you.
>> Not Cython's problem.
>
> Yes, as I said I was just saying that we shouldn't encourage *this*
> solution, as it doesn't do type checking.

We can always try to improve the error message. Users who are unaware of
the resulting problems will first try

    cdef char* cs = s

and only when Cython barfs at them they'd consider casting. But if Cython
says "Coercing str to char* is not portable to Python 3, please use the
bytes type instead", I think that might do the trick.

I added that to the CEP.

>>>> changing the argument/return value types from "object" to the
>>>> right types will allow Cython to do actual type checking.
>>> Often the type checking will be redundant with the type checking that
>>> happens inside the method, so I'm not so sure this is a good idea.
>> I meant compile time type checking, which won't hurt performance but
>> helps in making the C-API safer and also allows Cython to do some
>> optimisations.
>
> Sometimes. For example, PyUnicode_GetSize in principle takes a unicode
> object, but is only typed to take an object. It performs its own
> typecheck, so we should just define it as taking an object and not do
> the redundant type check ourselves.

Right, input parameters are a different thing. Currently, we only do type
checks for C-API calls we inject internally. There definitely shouldn't be
any type checks in runtime code for C-API calls, and I agree that they make
little sense at compile time as well.

I'm more concerned about return types, which are almost always known in the
C-API.

>> For example, this:
>>
>>     cdef unicode u = some_unicode_string
>>     s = u.encode('UTF-8')
>>
>> will now result in a straight C call to the UTF-8 encoder, instead of
>> looking up the method, calling it, and having it look up the codec
>> internally. I find that pretty cool.
>
> Hmm, not for me (at least not in the -devel branch), but I could see
> this being very nice.

Yes, cython-unstable has quite some improvements in that regard, but I
don't think backporting them is really worth bothering if we get out 0.12
any time soon.

Stefan

From stefan_ml at behnel.de Tue Sep 15 08:19:51 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 15 Sep 2009 08:19:51 +0200
Subject: [Cython] Type checking
In-Reply-To:
References:
Message-ID: <4AAF3207.9040403@behnel.de>

Robert Bradshaw wrote:
> On Sep 14, 2009, at 11:48 AM, Lisandro Dalcin wrote:
>> On Mon, Sep 14, 2009 at 2:36 PM, Robert Bradshaw wrote:
>>>> Lisandro Dalcin wrote:
>>>>> I'm inclined for a warning... and that warning would not be generated
>>>>> in this case: "cdef char*cs = s" , right?
>>> That could be bad, s doesn't actually do a typecheck,
>>> especially if the bytes -> char* is eventually optimized.
>>> One should do s or s (neither of which generate a warning).
>> Of course, is as dangerous as any cast (and
>> these casts defaulting to no-typecheck are the root of the evil)...
>
> I was considering for a no-typecheck cast at one point, but
> changing the default to do typechecking is super backwards incompatible.

I'm ok with a runtime safe cast. After all, casting between Python objects
is still a rather rare thing, and I only really do it when I know the exact
type and Cython will clearly drop loads of generic code for it.

>> Still, as casts are "blindly" honoured, I'm not sure if
>> we should special case this usage ...
>
> No, I don't think we should special case it--I'm saying that we
> shouldn't encourage this as it could produce a hard crash if s is not
> bytes, e.g. input from in Py3.

How hard it would crash depends on our optimisations, but I agree
otherwise. See my change in the CEP.

>> Perhaps it is time to generate Cython compile-time warning when
>> calling with an untyped arg value? Again, an explicit cast would
>> suppress the warning...
>>
>> I mean, suppose this code:
>>
>>     cdef void foo(tuple t): # REMEMBER!!, foo() does not typecheck that "t" is actually a tuple!!!
>>         pass
>>
>>     cdef object a = 1
>>     foo(a)
>>
>>     cdef list b = [1]
>>     foo(b)
>>
>>     cdef tuple c = (1,)
>>     foo(c)
>>
>> 1) When calling foo(a), we could emit a compile-time warning and
>> generate C code with runtime type-check... What's the point of duck
>> typing here iff foo() actually requires a tuple?
>
> We don't want to generate so many warnings they become useless, I
> personally wouldn't want that on by default. It's nice because a lot
> of the inputs, return results, etc. are not typed, e.g. I can do
> foo(bar(x)) where bar is any old Python function.

-1 on "untyped" warnings. We still /discourage/ explicit typing in most cases.

>> 2) When calling foo(b), we could generate compile-time ERROR...
>> What's the point of letting this Cython-compile silently and making it
>> fail at runtime (like currently happens)?
>
> Yes, that would probably be a good idea.

Note that the value might be None, which might still pass. Although if the
user knows that the value is None, she/he'd probably also spell it that
way, so an error should be ok. After all, in the current state, the user
most likely provided the type explicitly anyway.

BTW, for the specific case of list/tuple, there's a PyList_AsTuple()
function and the PySequence_List/Tuple() functions that might become useful
- although I'm certainly not saying we should do the required copying
automatically behind the back of the user.

>> 3) When calling foo(c), no warnings and no C runtime type-check (as
>> currently is being done)
>
> Yep.

Sure.

Stefan

From robertwb at math.washington.edu Tue Sep 15 08:59:46 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 14 Sep 2009 23:59:46 -0700
Subject: [Cython] String types with Python 2.x and 3.x
In-Reply-To: <4AAF2DC9.4020709@behnel.de>
References: <200909120236.45400.dominic.sacre@gmx.de> <4AAB423B.50501@behnel.de> <4AAB4F29.1080902@behnel.de> <31CFF574-B5B5-464D-B308-2D6250ACC114@math.washington.edu> <4AABFBD5.7080308@behnel.de> <47F93490-CA7B-4AE5-BAE5-392C019F3A54@math.washington.edu> <4AACCFBB.1050508@behnel.de> <4AAD4A5F.7000806@behnel.de> <3B527A6E-B37B-4146-A469-3C63A7F078EF@math.washington.edu> <4AAE93ED.8020704@behnel.de> <818CB079-037C-4252-BBB1-48A7715F5240@math.washington.edu> <4AAF2DC9.4020709@behnel.de>
Message-ID: <50951C5E-FC10-4E6D-B4CE-C6A1EE35694C@math.washington.edu>

On Sep 14, 2009, at 11:01 PM, Stefan Behnel wrote:

> Yes, cython-unstable has quite some improvements in that regard, but I
> don't think backporting them is really worth bothering if we get
> out 0.12 any soon.

Cool. Yes, 0.11.3 soon (basically, as soon as we get this nailed down),
and then 0.12 not long after that.
- Robert

From robertwb at math.washington.edu Tue Sep 15 09:49:46 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 15 Sep 2009 00:49:46 -0700
Subject: [Cython] Type checking
In-Reply-To: <4AAF3207.9040403@behnel.de>
References: <4AAF3207.9040403@behnel.de>
Message-ID:

On Sep 14, 2009, at 11:19 PM, Stefan Behnel wrote:

> BTW, for the specific case of list/tuple, there's a PyList_AsTuple()
> function and the PySequence_List/Tuple() functions that might become
> useful - although I'm certainly not saying we should do the required
> copying automatically behind the back of the user.

In the little bit of testing I've done with the PySequence... functions,
I've been disappointed with the speed. They are just a tuple vs. list type
check, then dispatch to the actual function.

- Robert

From sjparry88 at hotmail.co.uk Tue Sep 15 16:14:33 2009
From: sjparry88 at hotmail.co.uk (Sam Parry)
Date: Tue, 15 Sep 2009 14:14:33 +0000
Subject: [Cython] vcvarsall.bat
Message-ID:

Hi guys,

Not sure if I'm emailing to the correct place so apologies if I am
spamming you...

I am having problems with Cython compiling. I am following the tutorial on
the main website (from the Users Guide) and when I type "python setup.py
build_ext --inplace" I get an error saying "unable to find vcvarsall.bat".
I am using MinGW as my compiler and running on windows XP. I have managed
to find a way around this: typing "python setup.py build_ext
--compiler=mingw32 --inplace" works for the first 'hello world' tutorial
part. However, I get the vcvarsall error when trying the pyximport method.
Adding the "--compiler=mingw32" does not work for any of the examples using
any form of numpy import.

I would be grateful for any insights provided that could help me run
cython! I am new to using the command line, c and cython (and not all that
experienced with python either!) so forgive me if I need more detail than
the average user!
Thanks,

Sam

From dalcinl at gmail.com Tue Sep 15 18:34:14 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 15 Sep 2009 13:34:14 -0300
Subject: [Cython] Type checking
In-Reply-To: <4AAF3207.9040403@behnel.de>
References: <4AAF3207.9040403@behnel.de>
Message-ID:

On Tue, Sep 15, 2009 at 3:19 AM, Stefan Behnel wrote:
>
>>>
>>> 1) When calling foo(a), we could emit a compile-time warning and
>>> generate C code with runtime type-check... What's the point of duck
>>> typing here iff foo() actually requires a tuple?
>>
>> We don't want to generate so many warnings they become useless, I
>> personally wouldn't want that on by default. It's nice because a lot
>> of the inputs, return results, etc. are not typed, e.g. I can do
>> foo(bar(x)) where bar is any old Python function.
>
> -1 on "untyped" warnings. We still /discourage/ explicit typing in most cases.
>

But the current status is far from optimal... cdef functions do not
internally do any typecheck on typed arguments (for performance
reasons?), so you can call them with anything, and you can easily
end up with a segfault...

I understand that we /discourage/ explicit typing and agree with that.
However, I think that if a user INSISTS on using explicit typing for
some args in a cdef function, she should be prepared to make sure that
when calling that function, the argument has a proper, known type at
Cython compile-time... In fact, this is a way to discourage explicit
typing even more... And this will help catch nasty bugs when explicit
typing is used.
In short, I think that moving/enforcing type-checks from Python-runtime
to Cython-compile-time is a very good deal... Far better to get a
complaint from Cython, before you release your code, than a nasty,
embarrassing Python exception when end-users try your code... I know,
unittesting should catch these problems, but honestly... how much code
out there has 100% code coverage?

Final note: cdef functions with no argument typecheck at all are as
dangerous as Fortran 77... IMHO, Cython should do better, just because
Python/Cython users are not real programmers ;-) ...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Tue Sep 15 18:52:18 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 15 Sep 2009 13:52:18 -0300
Subject: [Cython] vcvarsall.bat
In-Reply-To:
References:
Message-ID:

IIRC, there are some patches in http://trac.cython.org/cython_trac/ to
make pyximport MinGW aware... Unfortunately, I did not have any chance
to review this, and Windows is always low in my priorities... You
know... the Windows OS has a lot of users, fans, and strong defenders
(we already had some of these "fights" here in this list!!!)... but
very few of them make any useful code contribution/testing/review for
their platform...

A fast workaround for your issue is to add a file named
"distutils.cfg" in C:\Python2.6\Lib\distutils (DISCLAIMER: do not
remember right now if this is the actual full path of distutils!)
with the contents below:

    [build_ext]
    compiler=mingw32

Alternatively, you can add a "pydistutils.cfg" file with the same
contents in %HOME% or %UserProfile% or whatever your "home" directory is
in your Windows system (tip: use os.path.expanduser('~') in a Python
prompt to figure out the right place)

Hope this helps...

BTW, if you can elaborate a bit more on this and contribute all this
stuff to the Cython wiki, it would be great....

Regards,

On Tue, Sep 15, 2009 at 11:14 AM, Sam Parry wrote:
> Hi guys,
>
> Not sure if I'm emailing to the correct place so apologies if I am spamming
> you...
>
> I am having problems with Cython compiling. I am following the tutorial on
> the main website (from the Users Guide) and when I type "python setup.py
> build_ext --inplace" I get an error saying "unable to find vcvarsall.bat". I
> am using MinGW as my compiler and running on windows XP. I have managed to
> find a way around this: typing "python setup.py build_ext --compiler=mingw32
> --inplace" works for the first 'hello world' tutorial part. However, I get
> the vcvarsall error when trying the pyximport method. Adding the
> "--compiler=mingw32" does not work for any of the examples using any form of
> numpy import.
>
> I would be grateful for any insights provided that could help me run cython!
> I am new to using the command line, c and cython (and not all that
> experienced with python either!) so forgive me if I need more detail than
> the average user!
>
> Thanks,
>
> Sam
>

--
Lisandro Dalcín

From dagss at student.matnat.uio.no Tue Sep 15 20:13:43 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 15 Sep 2009 20:13:43 +0200
Subject: [Cython] Type checking
In-Reply-To:
References: <4AAF3207.9040403@behnel.de>
Message-ID: <4AAFD957.2020203@student.matnat.uio.no>

Lisandro Dalcin wrote:
> On Tue, Sep 15, 2009 at 3:19 AM, Stefan Behnel wrote:
>>>> 1) When calling foo(a), we could emit a compile-time warning and
>>>> generate C code with runtime type-check... What's the point of duck
>>>> typing here iff foo() actually requires a tuple?
>>> We don't want to generate so many warnings they become useless, I
>>> personally wouldn't want that on by default. It's nice because a lot
>>> of the inputs, return results, etc. are not typed, e.g. I can do
>>> foo(bar(x)) where bar is any old Python function.
>> -1 on "untyped" warnings. We still /discourage/ explicit typing in most cases.
>>
>
> Final note: cdef functions with no argument typecheck at all are as
> dangerous as Fortran 77... IMHO, Cython should do better, just because
> Python/Cython users are not real programmers ;-) ...
>
> But the current status is far from optimal... cdef functions do not do
> internally any typecheck on typed arguments (for performance
> reasons?), then you can call it with anything, and you can easily
> end-up with a segfault...

This got confusing to me. I always had the impression that:

a) def foo(MyType arg): ...

means that the foo function itself checks the type on entry.
b) cdef foo(MyType arg): ...

means that foo does not check the type, but the caller does! (Except, of
course, if the caller already knows the type, which is a useful
optimization.)

Is this understanding (of current behaviour) true or false?

Whatever the does is something else entirely and much
less important IMO.

> I understand that we /discourage/ explicit typing and agree with that.
> However, I think that if a user INSIST in using explicit typing for
> some args in a cdef function, she should be prepared to make sure that
> when calling that function, the argument have a proper, known type at
> Cython compile-time... In fact, this is a way to discourage even more
> explicit typing... And this will help catch nasty bugs when explicit
> typing is used.

I think the behaviour described above is perfect, that is: If you do

    cdef foo(MyType arg): ...

    def caller(x):
        foo(x)

then "caller" has code inserted to check that x is indeed MyType before
passing it. It is exactly the same case as

    cdef MyType y = x

However, of course, if you do

    def other_caller(MyType x):
        print "entered"
        foo(x)

then there is (of course) no need to do another type check beyond the
one that is done before "entered" is printed.

> In short, I think that moving/enforcing type-checks from
> Python-runtime to Cython-compile-time is a very good deal... Far
> better to get a complain from Cython, before you release your code,
> than a nasty, embarrassing Python exception when end-users try your
> code... I know, unittesting should catch these problems, but
> honestly... how many codes out there have 100% code coverage?

-1. This thinking is a dangerous slope. Where do you stop? Why would one
stop short of Java or C++ or Haskell? They all have different fully
typed systems which can help in *some* situations, but then the whole
language is designed around it.

Python is just not designed for it.
If we try to use types to "encourage good programming practice" then we
essentially need to develop a new language from the ground up, which may or
may not look like Python. I don't have the time for that, and wouldn't use
the result (I'd rather teach myself Haskell to be honest).

I even had a separate slide in my SciPy 09 presentation telling people
that static type checking (as a programming model) was not a valid
usecase for Cython; "Cython is typed because it has to, not because it
wants to." :-)

--
Dag Sverre

From stefan_ml at behnel.de Tue Sep 15 20:20:43 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 15 Sep 2009 20:20:43 +0200
Subject: [Cython] Type checking
In-Reply-To: <4AAFD957.2020203@student.matnat.uio.no>
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no>
Message-ID: <4AAFDAFB.7080007@behnel.de>

Dag Sverre Seljebotn wrote:
> I always had the impression that:
>
> a) def foo(MyType arg): ...
>
> means that the foo function itself checks the type on entry.
>
> b) cdef foo(MyType arg): ...
>
> means that foo does not check the type, but the caller does! (Except, of
> course, if the caller already know the type, which is a useful
> optimization.)
>
> Is this understanding (of current behaviour) true or false?

Both correct, and that's exactly how it must work.

Stefan

From dagss at student.matnat.uio.no Tue Sep 15 20:24:20 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 15 Sep 2009 20:24:20 +0200
Subject: [Cython] Type checking
In-Reply-To: <4AAFD957.2020203@student.matnat.uio.no>
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no>
Message-ID: <4AAFDBD4.8090005@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> Lisandro Dalcin wrote:
>> On Tue, Sep 15, 2009 at 3:19 AM, Stefan Behnel wrote:
>>>>> 1) When calling foo(a), we could emit a compile-time warning and
>>>>> generate C code with runtime type-check...
>>>>> What's the point of duck typing here iff foo() actually requires a
>>>>> tuple?
>>>> We don't want to generate so many warnings they become useless, I
>>>> personally wouldn't want that on by default. It's nice because a lot
>>>> of the inputs, return results, etc. are not typed, e.g. I can do
>>>> foo(bar(x)) where bar is any old Python function.
>>> -1 on "untyped" warnings. We still /discourage/ explicit typing in most cases.
>>>
>> Final note: cdef functions with no argument typecheck at all are as
>> dangerous as Fortran 77... IMHO, Cython should do better, just because
>> Python/Cython users are not real programmers ;-) ...
>>
>> But the current status is far from optimal... cdef functions do not do
>> internally any typecheck on typed arguments (for performance
>> reasons?), then you can call it with anything, and you can easily
>> end-up with a segfault...
>
> This got confusing to me. I always had the impression that:
>
> a) def foo(MyType arg): ...
>
> means that the foo function itself checks the type on entry.
>
> b) cdef foo(MyType arg): ...
>
> means that foo does not check the type, but the caller does! (Except, of
> course, if the caller already know the type, which is a useful
> optimization.)
>
> Is this understanding (of current behaviour) true or false?

Answering my own question: It is true. The testcase below works, you do
NOT end up with a segfault, because a check is inserted in "caller".
Perfect IMO.

"""
>>> caller(4)
Traceback (most recent call last):
    ...
TypeError: Cannot convert int to dagss.MyType
"""

cdef class MyType:
    cdef int value
    def __init__(self):
        self.value = 20

cdef foo(MyType p):
    print p.value

def caller(o):
    foo(o)

>
> Whatever the does is something else entirely and much
> less important IMO.
>
>> I understand that we /discourage/ explicit typing and agree with that.
>> However, I think that if a user INSIST in using explicit typing for
>> some args in a cdef function, she should be prepared to make sure that
>> when calling that function, the argument have a proper, known type at
>> Cython compile-time... In fact, this is a way to discourage even more
>> explicit typing... And this will help catch nasty bugs when explicit
>> typing is used.
>
> I think the behaviour described above is perfect, that is: If you do
>
>     cdef foo(MyType arg): ...
>
>     def caller(x):
>         foo(x)
>
> then "caller" has code inserted to check that x is indeed MyType before
> passing it. It is exactly the same case as
>
>     cdef MyType y = x
>
> However, of course, if you do
>
>     def other_caller(MyType x):
>         print "entered"
>         foo(x)
>
> then there is (of course) no need to do another type check beyond the
> one that is done before "entered" is printed.

--
Dag Sverre

From robertwb at math.washington.edu Tue Sep 15 20:34:00 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 15 Sep 2009 11:34:00 -0700
Subject: [Cython] Type checking
In-Reply-To: <4AAFD957.2020203@student.matnat.uio.no>
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no>
Message-ID: <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu>

On Sep 15, 2009, at 11:13 AM, Dag Sverre Seljebotn wrote:

> Lisandro Dalcin wrote:
>> On Tue, Sep 15, 2009 at 3:19 AM, Stefan Behnel wrote:
>>>>> 1) When calling foo(a), we could emit a compile-time warning and
>>>>> generate C code with runtime type-check... What's the point of duck
>>>>> typing here iff foo() actually requires a tuple?
>>>> We don't want to generate so many warnings they become useless, I
>>>> personally wouldn't want that on by default. It's nice because a lot
>>>> of the inputs, return results, etc. are not typed, e.g. I can do
>>>> foo(bar(x)) where bar is any old Python function.
>>> -1 on "untyped" warnings. We still /discourage/ explicit typing
>>> in most cases.
>>>
>>
>> Final note: cdef functions with no argument typecheck at all are as
>> dangerous as Fortran 77... IMHO, Cython should do better, just because
>> Python/Cython users are not real programmers ;-) ...
>>
>> But the current status is far from optimal... cdef functions do not do
>> internally any typecheck on typed arguments (for performance
>> reasons?), then you can call it with anything, and you can easily
>> end-up with a segfault...
>
> This got confusing to me. I always had the impression that:
>
> a) def foo(MyType arg): ...
>
> means that the foo function itself checks the type on entry.
>
> b) cdef foo(MyType arg): ...
>
> means that foo does not check the type, but the caller does! (Except, of
> course, if the caller already know the type, which is a useful
> optimization.)
>
> Is this understanding (of current behaviour) true or false?

That is exactly what happens, and is as it should be. This is an
interesting asymmetry between cdef and def functions that I don't
think is pointed out clearly anywhere.

[...]

>> In short, I think that moving/enforcing type-checks from
>> Python-runtime to Cython-compile-time is a very good deal... Far
>> better to get a complain from Cython, before you release your code,
>> than a nasty, embarrassing Python exception when end-users try your
>> code... I know, unittesting should catch these problems, but
>> honestly... how many codes out there have 100% code coverage?
>
> -1. This thinking is a dangerous slope. Where do you stop? Why would one
> stop short of Java or C++ or Haskell? They all have different fully
> typed systems which can help in *some* situations, but then the whole
> language is designed around it.
>
> Python is just not designed for it. If we try to use types to "encourage
> good programming practice" then we essentially need to develop a new
> language from ground up, which may or may not look like Python.
> I don't have the time for that, and wouldn't use the result (I'd rather
> learn myself Haskell to be honest).
>
> I even had a separate slide in my SciPy 09 presentation telling people
> that static type checking (as a programming model) was not a valid
> usecase for Cython; "Cython is typed because it has to, not because it
> wants to." :-)

Well said.

- Robert

From dalcinl at gmail.com Tue Sep 15 23:52:03 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 15 Sep 2009 18:52:03 -0300
Subject: [Cython] Type checking
In-Reply-To: <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu>
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu>
Message-ID:

Sorry for the noise... Yes, the caller does the typecheck for cdef
functions, so no danger at all...

>> Python is just not designed for it. If we try to use types to "encourage
>> good programming practice" then we essentially need to develop a new
>> language from ground up, which may or may not look like Python.

I've NEVER said that using types encourages good programming
practice... All I commented is that if you write a cdef function
with a typed arg, and next call it with an untyped value, it would be
NICE if Cython let me know that... just because that WARNING could
spot a real bug, or I could even save the runtime typecheck at the
caller by using a typed value. And I just asked for a Cython warning,
and I would not even mind if that warning is not enabled by default...

From my own code, I have two primary use cases for cdef functions/methods:

(1) internal, private helper routines, that should handle C types and
cdef classes
(2) public C-API for consumption in external C code.

For (1), when I use typed arguments I have VERY good reasons for it,
and would like Cython to point out to me any call with untyped args
(because an untyped arg is either a bug or an unnecessary runtime
typecheck in the caller)...
For (2), the no-typecheck on the callee is unsafe, and I have to
manually do the typecheck....

>>
>> I even had a separate slide in my SciPy 09 presentation telling people
>> that static type checking (as a programming model) was not a valid
>> usecase for Cython; "Cython is typed because it has to, not because it
>> wants to." :-)
>

I do not buy that as an absolute truth... If some people use Cython to
write/speedup Python code, then they will likely share that view...
But if you use Cython closer to the C/C++ part, where you have to
interact with C types and opaque handles, and typed cdef classes and
cdef methods, and all that for maximum speed and calling external lib
functions, then you start to think differently and will appreciate it if
Cython helps you to save unnecessary runtime typechecking by some sort
of "static typing" features (even if all that Cython does is
generating warnings).

--
Lisandro Dalcín

From robertwb at math.washington.edu Wed Sep 16 00:09:26 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 15 Sep 2009 15:09:26 -0700
Subject: [Cython] Type checking
In-Reply-To:
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu>
Message-ID:

On Sep 15, 2009, at 2:52 PM, Lisandro Dalcin wrote:

> Sorry for the noise... Yes, the caller does the typecheck for cdef
> functions, so no danger at all...
>
>>> Python is just not designed for it. If we try to use types to
>>> "encourage good programming practice" then we essentially need to
>>> develop a new language from ground up, which may or may not look
>>> like Python.
>
> I've NEVER said that using types is encourages good programming
> practice... All as I commented if that if you write a cdef function
> with a typed arg, and next call it with an untyped value, it would be
> NICE that Cython let me know that... just because that WARNING could
> spot a real bug, or I could even save the runtime typecheck at the
> caller by using a typed value. And I just asked for a Cython warning,
> and I do not ever bother if that warning is not enabled by default...
>
>> From my own code, I have two primary use cases for cdef functions/
>> methods:
>
> (1) internal, private helper routines, that should handle C types and
> cdef classes
> (2) public C-API for consumption in external C code.
>
> For (1), when I use typed arguments I have VERY good reasons for it,
> and would like Cython to point me any call with untyped args (because
> an untyped arg is either a bug or an unnecessary runtime typecheck in
> the caller)...
>
> For (2), the no-typecheck on the callee is unsafe, and I have to
> manually do the typecheck....
>
>>> I even had a seperate slide in my SciPy 09 presentation telling people
>>> that static type checking (as a programming model) was not a valid
>>> usecase for Cython; "Cython is typed because it has to, not because it
>>> wants to." :-)
>>
>
> I do not buy that as an absolute truth... If some people use Cython to
> write/speedup Python code, then they will likely share that view...
> But if you use Cython closer to the C/C++ part, where you have to
> interact with C types and opaque handles, and typed cdef classes and
> cdef methods, and all that for maximum speed and calling external lib
> functions, then you start to think different and will appreciate if
> Cython help you to save unnecessary runtime typechecking by some sort
> or "static typing" features (even if all what Cython does is
> generating warnings).
This is one thing that cython -a is good for, it lets you easily spot cases where unnecessarily type checking, etc. is performed. - Robert From dalcinl at gmail.com Wed Sep 16 01:42:33 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 15 Sep 2009 20:42:33 -0300 Subject: [Cython] Type checking In-Reply-To: References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> Message-ID: On Tue, Sep 15, 2009 at 7:09 PM, Robert Bradshaw wrote: > > This is one thing that cython -a is good for, it lets you easily spot > cases where unnecessarily type checking, etc. is performed. > No, Robert. Let's see: mpi4py has 7 K lines of Python code. This generates 82 K lines (x12) of C code... I really do not have the time to review all that output. Anyway, let's stop here... Perhaps some day I'll convince you about adding a "-pedantic" option to Cython :-) ... But now we have more important things to work on... -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Wed Sep 16 06:52:02 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 16 Sep 2009 06:52:02 +0200 Subject: [Cython] Type checking In-Reply-To: References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> Message-ID: <4AB06EF2.4050606@behnel.de> Lisandro Dalcin wrote: > On Tue, Sep 15, 2009 at 7:09 PM, Robert Bradshaw > wrote: >> This is one thing that cython -a is good for, it lets you easily spot >> cases where unnecessarily type checking, etc. is performed. > > No, Robert. Let's see: mpi4py has 7 K lines of Python code. 
This > generates 82 K lines (x12) of C code... I really do not have the time > to review all that output. Well, yes, that's what "-a" is for. You don't have to review "all that output", you just have to look for dark yellow lines in the Cython source at points that you consider performance critical. Only when you find those, you can open the underlying C source and check why there's so much Python stuff going on. Stefan From dagss at student.matnat.uio.no Wed Sep 16 10:16:55 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 Sep 2009 10:16:55 +0200 Subject: [Cython] Type checking In-Reply-To: References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> Message-ID: <4AB09EF7.70902@student.matnat.uio.no> Lisandro Dalcin wrote: > On Tue, Sep 15, 2009 at 7:09 PM, Robert Bradshaw > wrote: > >> This is one thing that cython -a is good for, it lets you easily spot >> cases where unnecessarily type checking, etc. is performed. >> >> > > No, Robert. Let's see: mpi4py has 7 K lines of Python code. This > generates 82 K lines (x12) of C code... I really do not have the time > to review all that output. > > Anyway, let's stop here... Perhaps some day I'll convince you about > adding a "-pedantic" option to Cython :-) ... But now we have more > important things to work on... > I'm sorry, I need to post this, then I'll stop. I understand you better now and think it's been useful; and like you I don't consider cython -a the final solution here. Myself I think that the bar for compiler directives which people find useful (and which default to off) should be very low, so I'm not opposed. But what about this instead: A directive to warn/give error on any undeclared variables? 
I.e.:

@cython.warning_undeclared(True)
def foo():
    cdef object a
    cdef int b
    a = b = c = 3 # warns that "c" is not declared and auto-typed to object

This would also help me when I forget to type the variables of a loop in a function I'm speeding up. And I think this would also catch the case you described and help avoid unnecessary type-checking?

Both Visual Basic and Fortran have this BTW (respectively "Option Explicit" and "implicit none"). Not that I consider either of those a pinnacle of good language design :-) But I think this could be a useful directive to have for those who want to use it.

Dag Sverre

From stefan_ml at behnel.de Wed Sep 16 10:35:55 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 16 Sep 2009 10:35:55 +0200
Subject: [Cython] Type checking
In-Reply-To: <4AB09EF7.70902@student.matnat.uio.no>
References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no>
Message-ID: <4AB0A36B.3040905@behnel.de>

Dag Sverre Seljebotn wrote:
> A directive to warn/give error on any undeclared variables? I.e.:
>
> @cython.warning_undeclared(True)
> def foo():
>     cdef object a
>     cdef int b
>     a = b = c = 3 # warns that "c" is not declared and auto-typed to object

Fine with me. There are really cases where you want a function to be "all C", and you want to be sure it stays that way.

Regarding the infrastructure, what about an abstract transform that intercepts specific Cython compiler directives? The reason is that I think we'll need this more often, e.g. for a test directive that asserts certain features of the parsed/transformed/optimised source tree. It should just delegate the respective subtree to another transform (or even tree visitor) and let that do the rest.
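
For readers following the thread, the two ideas here (a directive that flags undeclared variables, and a dispatcher that hands a decorated subtree to a separate tree visitor) can be imitated in plain Python with the standard `ast` module. This is only a rough analogy, not Cython's compiler code: the decorator name `warn_undeclared`, the classes `UndeclaredNames` and `DirectiveDispatcher`, and the use of bare annotations (`a: object`) as a stand-in for `cdef` declarations are all assumptions made for the illustration.

```python
import ast

# Hypothetical input: annotations stand in for "cdef" declarations.
SOURCE = '''
@warn_undeclared
def foo():
    a: object
    b: int
    a = b = c = 3

def bar():
    x = 1
'''


class UndeclaredNames(ast.NodeVisitor):
    """Collect assignment targets that were never declared by an annotation."""

    def __init__(self):
        self.declared = set()
        self.undeclared = []

    def visit_AnnAssign(self, node):
        # "a: object" declares the name "a".
        if isinstance(node.target, ast.Name):
            self.declared.add(node.target.id)
        self.generic_visit(node)

    def visit_Name(self, node):
        # Only names being assigned to (Store context) are of interest.
        if isinstance(node.ctx, ast.Store) and node.id not in self.declared:
            self.undeclared.append((node.id, node.lineno))


class DirectiveDispatcher(ast.NodeVisitor):
    """Delegate the body of each decorated function to a separate visitor."""

    def __init__(self, directive):
        self.directive = directive
        self.warnings = []

    def visit_FunctionDef(self, node):
        decorators = [d.id for d in node.decorator_list if isinstance(d, ast.Name)]
        if self.directive in decorators:
            checker = UndeclaredNames()
            for statement in node.body:
                checker.visit(statement)
            for name, lineno in checker.undeclared:
                self.warnings.append(
                    "%s:%d: '%s' is not declared" % (node.name, lineno, name)
                )
        self.generic_visit(node)  # keep looking for nested functions


def find_undeclared(source, directive="warn_undeclared"):
    """Return warning strings for functions carrying the given decorator."""
    dispatcher = DirectiveDispatcher(directive)
    dispatcher.visit(ast.parse(source))
    return dispatcher.warnings


if __name__ == "__main__":
    for warning in find_undeclared(SOURCE):
        print(warning)
```

Only `foo` carries the decorator, so `bar` is left alone even though `x` is undeclared there, which mirrors the opt-in, off-by-default behaviour discussed in the thread.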
Stefan From dagss at student.matnat.uio.no Wed Sep 16 10:44:44 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 Sep 2009 10:44:44 +0200 Subject: [Cython] Type checking In-Reply-To: <4AB0A36B.3040905@behnel.de> References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> Message-ID: <4AB0A57C.4040202@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: > >> A directive to warn/give error on any undeclared variables? I.e.: >> >> @cython.warning_undeclared(True) >> def foo(): >> cdef object a >> cdef int b >> a = b = c = 3 # warns that "c" is not declared and auto-typed to object >> > > Fine with me. There are really cases where you want a function to be "all > C", and you want to be sure it stays that way. > > Regarding the infrastructure, what about an abstract transform hat > intercepts on specific Cython compiler directives. The reason is that I > think we'll need this more often, e.g. for a test directive that asserts > certain features of the parsed/transformed/optimised source tree. It should > just delegate the respective subtree to another transform (or even tree > visitor) and let that do the rest. In this specific case I think I'd tend to add a few lines in analyse_target_type (or whatever) in NameNode I think, without using a transform (where the directives in effect are available in env.directives or similar -- which is kind of a hack as the scope info is mutated as the tree is traversed, but it is a solution until we start passing "ctx" instead of "env"). I like the general idea though (only concern is that combining multiple compiler directives etc. into one pass is sometimes going to be faster, although less clean). 
Dag Sverre From dagss at student.matnat.uio.no Wed Sep 16 10:49:13 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 Sep 2009 10:49:13 +0200 Subject: [Cython] Type checking In-Reply-To: <4AB0A36B.3040905@behnel.de> References: <4AAF3207.9040403@behnel.de> <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> Message-ID: <4AB0A689.5090109@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn wrote: > >> A directive to warn/give error on any undeclared variables? I.e.: >> >> @cython.warning_undeclared(True) >> def foo(): >> cdef object a >> cdef int b >> a = b = c = 3 # warns that "c" is not declared and auto-typed to object >> > > Fine with me. There are really cases where you want a function to be "all > C", and you want to be sure it stays that way. > This is now http://trac.cython.org/cython_trac/ticket/369 Dag Sverre From dalcinl at gmail.com Wed Sep 16 16:45:56 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 16 Sep 2009 11:45:56 -0300 Subject: [Cython] Type checking In-Reply-To: <4AB0A689.5090109@student.matnat.uio.no> References: <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no> Message-ID: On Wed, Sep 16, 2009 at 5:49 AM, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Dag Sverre Seljebotn wrote: >> >>> A directive to warn/give error on any undeclared variables? I.e.: >>> >>> @cython.warning_undeclared(True) >>> def foo(): >>> ? ? cdef object a >>> ? ? cdef int b >>> ? ? a = b = c ?= 3 # warns that "c" is not declared and auto-typed to object >>> >> Dag, take for granted that anything that Cython adds in this direction, I'm going to use once it is ready... and likely these directives will be globally enabled by default. 
However, there is something that makes me uncomfortable about this... This is not quite similar to other compiler directives... I mean, it has no effect on Cython semantics or generated C code. Just a comment, not a big deal...

In the same spirit, Cython could have a "directive" to emit warnings when a bare "object" is implicitly cast to a builtin/cdef type.

Final note: we could potentially have many warning-related directives, right? Having a lot will not harm, right? Then perhaps we should have a dict-like "warning" directive, and you use it like this:

@cython.warning(undeclared=True,untyped=True,cast_implicit=True)
cdef void foo(a): # "a" is untyped
    cdef list b
    cdef object c
    b = a # implicit downcast object -> list
    c = b
    d = c # 'd' undeclared

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Wed Sep 16 16:56:47 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 16 Sep 2009 16:56:47 +0200
Subject: [Cython] Type checking
In-Reply-To:
References: <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no>
Message-ID: <4AB0FCAF.5050606@behnel.de>

Lisandro Dalcin wrote:
> cdef void foo(a): # "a" is untyped
>     cdef list b
>     b = a # implicit downcast object -> list

I didn't try, but given that you consider this a problem, I assume that Cython currently allows this and you get a runtime error here if a is not a list, right? I would expect users to wrap that code by a type test anyway.

If we start disallowing similar things for str&friends, would it make sense to require an explicit cast in the case above?
I don't think a warning quite fits here. Stefan From dalcinl at gmail.com Wed Sep 16 18:58:11 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 16 Sep 2009 13:58:11 -0300 Subject: [Cython] Type checking In-Reply-To: <4AB0FCAF.5050606@behnel.de> References: <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no> <4AB0FCAF.5050606@behnel.de> Message-ID: On Wed, Sep 16, 2009 at 11:56 AM, Stefan Behnel wrote: > > Lisandro Dalcin wrote: >> cdef void foo(a): # "a" is untyped >> ? ? cdef list b >> ? ? b = a # implicit downcast object -> list > > I didn't try, but given that you consider this a problem, I assume that > Cython currently allows this and you get a runtime error here if a is not a > list, right? Yes, see yourself: $ cat tryme.pyx cdef int foo(a) except -1: cdef tuple b b = a return 0 def test(): foo([]) $ python -c 'import pyximport;pyximport.install(); import tryme; tryme.test()' Traceback (most recent call last): File "", line 1, in File "tryme.pyx", line 7, in tryme.test (/u/dalcinl/.pyxbld/temp.linux-i686-2.6/pyrex/tryme.c:464) foo([]) File "tryme.pyx", line 3, in tryme.foo (/u/dalcinl/.pyxbld/temp.linux-i686-2.6/pyrex/tryme.c:414) b = a TypeError: Expected tuple, got list > > I would expect users to wrap that code by a type test anyway. > Yes, I would also expect users (and myself!!) to do that... But users (and me!) are not real programers: we DO make mistakes, introduce bugs, and write inefficient code. Could Cython help in such cases? Yes, it could.. Should it? Yes, it should, at least if asked explicitly (the last sentences of course just my humble opinion) > If we start disallowing similar things for str&friends, would it make sense > to require an explicit cast in the case above? I don't think a warning > quite fits here. > Well, I always asked for warnings, and even these warns not enabled by default. 
If you want to be stricter, then I'm +1. You know, "Explicit is better than implicit" and "Errors should never pass silently" and "Unless explicitly silenced" and all that. Stefan, perhaps the bytes/unicode issues are making you realize that (despite Cython has to but do not want to be typed) there are cases where strict typing REALLY do make sense? At some point, Cython let you interact with C, C is (you like it or not) a typed world, and strict typechecking in C saves you so many times from shooting yourself in the foot (have you ever tried Fortran 77?) -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Wed Sep 16 19:49:26 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 16 Sep 2009 10:49:26 -0700 Subject: [Cython] Type checking In-Reply-To: References: <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no> Message-ID: <1B863E29-35AA-4E3F-914D-A942DCCDDDF3@math.washington.edu> On Sep 16, 2009, at 7:45 AM, Lisandro Dalcin wrote: > On Wed, Sep 16, 2009 at 5:49 AM, Dag Sverre Seljebotn > wrote: >> Stefan Behnel wrote: >>> Dag Sverre Seljebotn wrote: >>> >>>> A directive to warn/give error on any undeclared variables? I.e.: >>>> >>>> @cython.warning_undeclared(True) >>>> def foo(): >>>> cdef object a >>>> cdef int b >>>> a = b = c = 3 # warns that "c" is not declared and auto- >>>> typed to object >>>> >>> > > Dag, take for granted that anything that Cython adds in this > direction, I'm going to use once it is ready... and likely these > directives will be globally enabled by default. 
> > However, there is something that makes me uncomfortable about this... > This is not quite similar to other compiler directives... I mean, it > has no effect in Cython semantics or generated C code. Just a coment, > not a big deal... > > In the same spirit, Cython could have a "directive" to emit warnings > when a bare "object" is implicitly cast to a builtin/cdef type. > > Final note: we could potentially have many warning-related directives, > right? Having a lot will not harm, right? Then perhaps we should have > a dict-like "warning" directive, and you use like this: > > @cython.warning(undeclared=True,untyped=True,cast_implicit=True) > cdef void foo(a): # "a" is untyped > cdef list b > cdef object c > b = a # implicit downcast object -> list > c = b > d = c # 'd' undeclared +1 I like this warning decorator. Perhaps we could also take a -Wxxx flags like gcc. I'd rather it not be on by default however. - Robert From robertwb at math.washington.edu Wed Sep 16 19:50:38 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 16 Sep 2009 10:50:38 -0700 Subject: [Cython] Type checking In-Reply-To: <4AB0FCAF.5050606@behnel.de> References: <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no> <4AB0FCAF.5050606@behnel.de> Message-ID: <4D8C3D69-2CF7-44DA-8D74-F37467F8E8A3@math.washington.edu> On Sep 16, 2009, at 7:56 AM, Stefan Behnel wrote: > Lisandro Dalcin wrote: >> cdef void foo(a): # "a" is untyped >> cdef list b >> b = a # implicit downcast object -> list > > I didn't try, but given that you consider this a problem, I assume > that > Cython currently allows this and you get a runtime error here if a > is not a > list, right? I would expect users to wrap that code by a type test > anyway. The way I see it, if the user doesn't wrap it by a type test, Cython does it for you. 
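
The implicit check described here can be pictured in plain Python. The sketch below is an approximation for illustration only: `assign_typed` is a hypothetical helper, not anything Cython exposes, and it uses `isinstance`, whereas the test Cython generates may be stricter for builtin types. It also assumes, as I understand Cython's behaviour, that `None` is accepted for typed object variables. The error text follows the `TypeError: Expected tuple, got list` traceback quoted earlier in the thread.

```python
def assign_typed(value, expected_type):
    """Approximate the check behind `cdef tuple b; b = a` (illustration only)."""
    # Assumption: None passes the check for typed object variables.
    if value is not None and not isinstance(value, expected_type):
        raise TypeError(
            "Expected %s, got %s" % (expected_type.__name__, type(value).__name__)
        )
    return value


if __name__ == "__main__":
    b = assign_typed((1, 2), tuple)  # passes the check
    try:
        assign_typed([], tuple)      # same failure as foo([]) in the thread
    except TypeError as exc:
        print(exc)
```

Writing the test by hand in user code gives control over the error handling, while leaving it out means the same check still runs at the assignment.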
- Robert From dagss at student.matnat.uio.no Wed Sep 16 20:02:35 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 Sep 2009 20:02:35 +0200 Subject: [Cython] Type checking In-Reply-To: <1B863E29-35AA-4E3F-914D-A942DCCDDDF3@math.washington.edu> References: <4AAFD957.2020203@student.matnat.uio.no> <852C15AF-9BD2-47DD-8E82-DDDC2E3581D1@math.washington.edu> <4AB09EF7.70902@student.matnat.uio.no> <4AB0A36B.3040905@behnel.de> <4AB0A689.5090109@student.matnat.uio.no> <1B863E29-35AA-4E3F-914D-A942DCCDDDF3@math.washington.edu> Message-ID: <4AB1283B.7000808@student.matnat.uio.no> Robert Bradshaw wrote: > On Sep 16, 2009, at 7:45 AM, Lisandro Dalcin wrote: > >> On Wed, Sep 16, 2009 at 5:49 AM, Dag Sverre Seljebotn >> wrote: >>> Stefan Behnel wrote: >>>> Dag Sverre Seljebotn wrote: >>>> >>>>> A directive to warn/give error on any undeclared variables? I.e.: >>>>> >>>>> @cython.warning_undeclared(True) >>>>> def foo(): >>>>> cdef object a >>>>> cdef int b >>>>> a = b = c = 3 # warns that "c" is not declared and auto- >>>>> typed to object >>>>> >> Dag, take for granted that anything that Cython adds in this >> direction, I'm going to use once it is ready... and likely these >> directives will be globally enabled by default. >> >> However, there is something that makes me uncomfortable about this... >> This is not quite similar to other compiler directives... I mean, it >> has no effect in Cython semantics or generated C code. Just a coment, >> not a big deal... >> >> In the same spirit, Cython could have a "directive" to emit warnings >> when a bare "object" is implicitly cast to a builtin/cdef type. >> >> Final note: we could potentially have many warning-related directives, >> right? Having a lot will not harm, right? 
Then perhaps we should have >> a dict-like "warning" directive, and you use like this: >> >> @cython.warning(undeclared=True,untyped=True,cast_implicit=True) >> cdef void foo(a): # "a" is untyped >> cdef list b >> cdef object c >> b = a # implicit downcast object -> list >> c = b >> d = c # 'd' undeclared > > +1 I like this warning decorator. Perhaps we could also take a -Wxxx > flags like gcc. I'd rather it not be on by default however. OK. I'd rather prefer this to be treated as "warning_undeclared" and so on internally though. Yet another layer of manually handled dictionary stack is a bit too much for me personally :-) And also I suppose @cython.error should be provided as well just for fun. So that makes it (internally) something like: if is_undeclared and env.directives["errlevel_undeclared"]: Errors.report(MY_MSG, env.directives["errlevel_undeclared"]) errlevel would be None, Errors.ERROR or Errors.WARNING. Sounds ok? (This partially overlaps with the existing warning level system for "seriousness" of warnings, which I think can just be ripped out in favour of this eventually.) -- Dag Sverre From magnus at hetland.org Wed Sep 16 22:50:23 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Wed, 16 Sep 2009 22:50:23 +0200 Subject: [Cython] Compiling pure Python mode with distutils Message-ID: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> Hi! Sorry if this is an FAQ, but I haven't really found any discussion of it... My problem is that I'm trying to compile Cython code written in pure Python mode using Distutils. And somehow, it seems to ignore the .pxd file... If I run $ cython foo.py the file foo.pxd in the same directory is evidently used. However, if I build an extension with either just foo.py or both foo.py and foo.pxd as sources, using Distutils, I get results that seem to indicate that the pxd isn't used. I see there's some handling of .py files in Cython.Distutils.build_ext, but it seems a bit limited... 
> if ext == ".py": > # FIXME: we might want to special case this some more > ext = '.pyx' BTW: I wouldn't mind naming my files .pyx, but it seems that this interferes with cython's interpretation of the file. (I.e., it's no longer treated as a "pure Python mode" file, and I get warnings of conflicts with the pxd file.) Also, I've used the standard package_dir argument to setup() ... but because I've got all the Cython source files in there, the .py files get installed alongside the .so files. What's the recommended setup for this? Basically, I'm using a setup.py quite similar to what I've been using with my .pyx files, where it has worked just fine, but now I can't get it to work -- and I'm just wondering if I'm missing some way of configuring it, or if I have to write the compilation code myself. (Or if, perhaps, this is a bug? If that's the case, I could look into writing a patch, I guess.) Thanks, - M -- Magnus Lie Hetland http://hetland.org From magnus at hetland.org Thu Sep 17 01:55:45 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu, 17 Sep 2009 01:55:45 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> Message-ID: <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> On Sep 16, 2009, at 22:50 , Magnus Lie Hetland wrote: > If I run > > $ cython foo.py > > the file foo.pxd in the same directory is evidently used. However, if > I build an extension with either just foo.py or both foo.py and > foo.pxd as sources, using Distutils, I get results that seem to > indicate that the pxd isn't used. After some fiddling, I've come to the conclusion that this was, most likely, a PEBKAC :-> I started playing around with Cython's build_ext.py, I've find that the difference is between the following two versions of the compile command... 
This worked:

result = cython_compile(source, options=options)

This, for some reason, didn't:

result = cython_compile(source, options=options,
                        full_module_name=module_name)

Now, I'd put the code in a package -- except I'd left out the __init__.py for now (which I'd gotten some warnings about, but I wasn't planning on doing any importing yet, so I thought it wouldn't matter; I was using a glob in my setup.py and didn't want to compile the init file). But as the module name was of the form foo.foo, I suspected this might be the problem ... and indeed it was.

So, I guess, if I'd done it "by the book", and included the __init__.py file, it would have worked from the start. However I'd say the compile failed in a rather non-obvious way... (I.e., it actually did compile -- it just ignored the .pxd file...)

Maybe I'm still missing something, but at least it seems to be working now :)

(I'd still be interested in hints on how people organize the code for projects using pure Python mode, though -- so that some .py files get compiled while others get installed, and so forth.)

--
Magnus Lie Hetland
http://hetland.org

From robertwb at math.washington.edu Thu Sep 17 06:44:24 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 16 Sep 2009 21:44:24 -0700
Subject: [Cython] Compiling pure Python mode with distutils
In-Reply-To: <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org>
References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org>
Message-ID: <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu>

On Sep 16, 2009, at 4:55 PM, Magnus Lie Hetland wrote:
> On Sep 16, 2009, at 22:50 , Magnus Lie Hetland wrote:
>
>> If I run
>>
>> $ cython foo.py
>>
>> the file foo.pxd in the same directory is evidently used. However, if
>> I build an extension with either just foo.py or both foo.py and
>> foo.pxd as sources, using Distutils, I get results that seem to
>> indicate that the pxd isn't used.
> > > After some fiddling, I've come to the conclusion that this was, most > likely, a PEBKAC :-> > > I started playing around with Cython's build_ext.py, I've find that > the difference is between the following two versions of the compile > command... > > This worked > > result = cython_compile(source, options=options) > > This, for some reason, didn't: > > result = cython_compile(source, options=options > full_module_name=module_name) > > Now, I'd put the code in a package -- except I'd left out the > __init__.py for now (which I'd gotten some warnings about, but I > wasn't planning on doing any importing yet, so I thought it wouldn't > matter; I was using a glob in my setup.py and didn't want to compile > the init file). But as the module name was of the form foo.foo, I > suspected this might be the problem ... and indeed it was. > > So, I guess, if I'd done it "by the book", and included the > __init__.py file, it would have worked from the start. However I'd say > the compile failed in a rather non-obvious way... (I.e., it actually > did compile -- it just ignored the .pxd file...) Glad you were able to figure it out. I'm not sure how we should detect this kind of error... > Maybe I'm still missing something, but at least it seems to be working > now :) > > (I'd still be interested in hints on how people organize the code for > projects using pure Python mode, though -- so that some .py files get > compiled while others get installed, and so forth.) Me too. (In Sage we use .pyx files, and actually have a massive, custom setup.py.) 
- Robert From stefan_ml at behnel.de Thu Sep 17 08:41:56 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 Sep 2009 08:41:56 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> Message-ID: <4AB1DA34.4030103@behnel.de> Robert Bradshaw wrote: > On Sep 16, 2009, at 4:55 PM, Magnus Lie Hetland wrote: >> Now, I'd put the code in a package -- except I'd left out the >> __init__.py for now (which I'd gotten some warnings about, but I >> wasn't planning on doing any importing yet, so I thought it wouldn't >> matter; I was using a glob in my setup.py and didn't want to compile >> the init file). But as the module name was of the form foo.foo, I >> suspected this might be the problem ... and indeed it was. >> >> So, I guess, if I'd done it "by the book", and included the >> __init__.py file, it would have worked from the start. However I'd say >> the compile failed in a rather non-obvious way... (I.e., it actually >> did compile -- it just ignored the .pxd file...) > > Glad you were able to figure it out. I'm not sure how we should > detect this kind of error... It's impossible to detect the case where an existing .pxd file is not found (since there might not actually be one), but we could detect the case where the module name does not reflect the package structure, i.e. exactly the case where the module code is expected to be inside of a package, but does not lie next to an __init__.py file. 
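
The check suggested here, comparing the dotted module name against the directories on disk, might look roughly like the following. This is a sketch under stated assumptions, not what Cython does: `missing_init_dirs` is a hypothetical helper, and it only looks for `__init__.py`, ignoring any other file name a compiler might accept as a package marker.

```python
import os


def missing_init_dirs(source_path, full_module_name):
    """Return package directories implied by the dotted name that lack __init__.py.

    For "foo.foo" compiled from ".../foo/foo.py", the directory ".../foo"
    is expected to contain an __init__.py file.
    """
    package_depth = len(full_module_name.split(".")) - 1
    directory = os.path.dirname(os.path.abspath(source_path))
    missing = []
    # Walk upwards from the source file, innermost package first.
    for _ in range(package_depth):
        if not os.path.isfile(os.path.join(directory, "__init__.py")):
            missing.append(directory)
        directory = os.path.dirname(directory)
    return missing


if __name__ == "__main__":
    for directory in missing_init_dirs("foo/foo.py", "foo.foo"):
        print("warning: %s is treated as a package but has no __init__.py" % directory)
```

A build script could run such a check before compiling and emit a warning, which would have pointed at the missing `__init__.py` in the case reported above instead of a silently ignored .pxd file.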
Stefan From robertwb at math.washington.edu Thu Sep 17 09:18:52 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Sep 2009 00:18:52 -0700 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <4AB1DA34.4030103@behnel.de> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> <4AB1DA34.4030103@behnel.de> Message-ID: On Sep 16, 2009, at 11:41 PM, Stefan Behnel wrote: > Robert Bradshaw wrote: >> On Sep 16, 2009, at 4:55 PM, Magnus Lie Hetland wrote: >>> Now, I'd put the code in a package -- except I'd left out the >>> __init__.py for now (which I'd gotten some warnings about, but I >>> wasn't planning on doing any importing yet, so I thought it wouldn't >>> matter; I was using a glob in my setup.py and didn't want to compile >>> the init file). But as the module name was of the form foo.foo, I >>> suspected this might be the problem ... and indeed it was. >>> >>> So, I guess, if I'd done it "by the book", and included the >>> __init__.py file, it would have worked from the start. However >>> I'd say >>> the compile failed in a rather non-obvious way... (I.e., it actually >>> did compile -- it just ignored the .pxd file...) >> >> Glad you were able to figure it out. I'm not sure how we should >> detect this kind of error... > > It's impossible to detect the case where an existing .pxd file is > not found > (since there might not actually be one), but we could detect the > case where > the module name does not reflect the package structure, i.e. > exactly the > case where the module code is expected to be inside of a package, > but does > not lie next to an __init__.py file. We should probably be at least emitting a warning in this case. 
- Robert From magnus at hetland.org Thu Sep 17 18:14:17 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu, 17 Sep 2009 18:14:17 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <4AB1DA34.4030103@behnel.de> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> <4AB1DA34.4030103@behnel.de> Message-ID: <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> On Sep 17, 2009, at 08:41, Stefan Behnel wrote: > Robert Bradshaw wrote: >> On Sep 16, 2009, at 4:55 PM, Magnus Lie Hetland wrote: >>> So, I guess, if I'd done it "by the book", and included the >>> __init__.py file, it would have worked from the start. However I'd >>> say the compile failed in a rather non-obvious way... (I.e., it >>> actually did compile -- it just ignored the .pxd file...) >> >> Glad you were able to figure it out. I'm not sure how we should >> detect this kind of error... > > It's impossible to detect the case where an existing .pxd file is > not found (since there might not actually be one), but we could > detect the case where the module name does not reflect the package > structure, i.e. exactly the case where the module code is expected > to be inside of a package, but does not lie next to an __init__.py > file. Hmm. OK. I still don't understand why the missing __init__.py file means that the .pxd file is ignored... At the moment, I'm trying to split my code into two separate dirs -- src/ and lib/ -- with src containing the Cython source (in pure Python mode .py files, along with .pxd files, as needed) and lib containing the Python code that is to be installed. However, it would seem I can't compile the pure Python mode Cython files properly without adding an unused __init__.py file to the source directory? Or am I misunderstanding that part? 
-- Magnus Lie Hetland http://hetland.org From stefan_ml at behnel.de Thu Sep 17 20:03:31 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 Sep 2009 20:03:31 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> <4AB1DA34.4030103@behnel.de> <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> Message-ID: <4AB279F3.2040909@behnel.de> Magnus Lie Hetland wrote: > Hmm. OK. I still don't understand why the missing __init__.py file > means that the .pxd file is ignored... Because you instruct Cython to compile a file package/module.pyx and it looks for a corresponding package/module.pxd. Since it cannot find the package, it can't find the .pxd file. > At the moment, I'm trying to split my code into two separate dirs -- > src/ and lib/ -- with src containing the Cython source (in pure Python > mode .py files, along with .pxd files, as needed) and lib containing > the Python code that is to be installed. Why do you do that? Just keep your source files in a correct package tree, regardless if you compile them or not. You can make that a deployment detail. 
Stefan From dagss at student.matnat.uio.no Thu Sep 17 20:11:04 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 17 Sep 2009 20:11:04 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> <4AB1DA34.4030103@behnel.de> <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> Message-ID: <4AB27BB8.5000104@student.matnat.uio.no> Magnus Lie Hetland wrote: > On Sep 17, 2009, at 08:41, Stefan Behnel wrote: > >> Robert Bradshaw wrote: >>> On Sep 16, 2009, at 4:55 PM, Magnus Lie Hetland wrote: >>>> So, I guess, if I'd done it "by the book", and included the >>>> __init__.py file, it would have worked from the start. However I'd >>>> say the compile failed in a rather non-obvious way... (I.e., it >>>> actually did compile -- it just ignored the .pxd file...) >>> Glad you were able to figure it out. I'm not sure how we should >>> detect this kind of error... >> It's impossible to detect the case where an existing .pxd file is >> not found (since there might not actually be one), but we could >> detect the case where the module name does not reflect the package >> structure, i.e. exactly the case where the module code is expected >> to be inside of a package, but does not lie next to an __init__.py >> file. > > > Hmm. OK. I still don't understand why the missing __init__.py file > means that the .pxd file is ignored... > > At the moment, I'm trying to split my code into two separate dirs -- > src/ and lib/ -- with src containing the Cython source (in pure Python > mode .py files, along with .pxd files, as needed) and lib containing > the Python code that is to be installed. 
However, it would seem I > can't compile the pure Python mode Cython files properly without > adding an unused __init__.py file to the source directory? Or am I > misunderstanding that part? Note that Python doesn't allow this either -- you can't spread a Python package across several directories. -- Dag Sverre From magnus at hetland.org Thu Sep 17 21:02:24 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu, 17 Sep 2009 21:02:24 +0200 Subject: [Cython] Compiling pure Python mode with distutils In-Reply-To: <4AB279F3.2040909@behnel.de> References: <5DF7C4A2-0FD3-44F0-AC93-6AB1DFCC39EA@hetland.org> <026BA7D9-4C2E-4998-AEB2-328899EBA48A@hetland.org> <25DADA9C-D630-438F-957B-6B29D02E11A4@math.washington.edu> <4AB1DA34.4030103@behnel.de> <14307F2D-8951-4BBF-863B-78A3C7F0A2D9@hetland.org> <4AB279F3.2040909@behnel.de> Message-ID: On Sep 17, 2009, at 20:03 , Stefan Behnel wrote: > > Magnus Lie Hetland wrote: >> Hmm. OK. I still don't understand why the missing __init__.py file >> means that the .pxd file is ignored... > > Because you instruct Cython to compile a file package/module.pyx and > it looks for a corresponding package/module.pxd. Right. And that file is right there. > Since it cannot find the package, it can't find the .pxd file. I guess that's what I don't understand -- why would it need to find the package to do the compilation? If there's a .pxd file in the same directory as the .py(x) file, couldn't that be enough? (I understand that it currently isn't.) After all, no other requirements of the Python modules/packages are made, right? Cython doesn't check that various Python modules/packages can be imported, for example. Why check that the .pyx file is, in fact, in a real package? On Sep 17, 2009, at 20:11 , Dag Sverre Seljebotn wrote: > > Magnus Lie Hetland wrote: >> However, it would seem I >> can't compile the pure Python mode Cython files properly without >> adding an unused __init__.py file to the source directory? 
Or am I >> misunderstanding that part? > > Note that Python doesn't allow this either -- you can't spread a > Python > package across several directories. No, but I don't, really -- not after the Cython files have been compiled. The .so files are in the actual package hierarchy. I guess there's a part of the Cython semantics I'm not getting here -- i.e., why it cares about the surrounding Python infrastructure while compiling. On Sep 17, 2009, at 20:03 , Stefan Behnel wrote: > Magnus Lie Hetland wrote: >> At the moment, I'm trying to split my code into two separate dirs -- >> src/ and lib/ -- with src containing the Cython source (in pure >> Python >> mode .py files, along with .pxd files, as needed) and lib containing >> the Python code that is to be installed. > > Why do you do that? Because of a few practical problems (that I'd be happy to solve differently). 1. I'd like to automate the process of finding compilable files, so I don't have to manually update my setup.py file all the time. Previously, I globbed for .pyx files; now (because I'm forced to use .py files to get cython to compile pure Python mode correctly, it seems), I can't do that. So I need some way of discerning Cython source files from Python files. I thought I'd keep the Python files in the actual package hierarchy (where they belong), and the Cython source somewhere else, putting the .so files in the package hierarchy (where they, too, belong). 2. I'm just using standard mechanisms for installing, and Distutils will happily install the Cython source files (with .py endings) alongside the corresponding .so files. I don't really want this, although it doesn't hurt ... except it makes me a bit uneasy to think that there's a chance that the .py file will be imported instead of the .so file. Not sure if there's a clearly documented Python behavior here? 3. 
I'd like to be able to run tests and so forth in two ways -- one interpreting the Cython .py files (for coverage, among other things) and one using the .so files. Haven't thought through exactly how I'll implement this, but it seems it'll be easier to tell Python what to import if the (compiled) .py files and their .so files aren't in the same directory. > Just keep your source files in a correct package tree, regardless if > you compile them or not. You can make that a deployment detail. Well, that's what I used to do. And I'd be happy to keep doing it, if I can get the wrinkles mentioned above ironed out. I guess I could, e.g., explicitly list the compiled files, or grep for files importing cython, or something. And be more explicit about what should be installed (could generate that list based on the same info). Not sure how I'd handle the third issue, but I could kludge together something (e.g., "installing" two temporary versions for the tests, possibly using symlinks...). Suggestions for better solutions would be welcome. (At the moment, I'm keeping the Cython source in a fake package, with a dummy __init__.py, to appease Cython, with a body that simply raises an exception, to avoid importing it by chance. I've dropped the --inplace for build_ext, and the .so files appear where they should.) > Stefan -- Magnus Lie Hetland http://hetland.org From magnus at hetland.org Thu Sep 17 23:18:01 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Thu, 17 Sep 2009 23:18:01 +0200 Subject: [Cython] Inheritance bug in pure Python mode? Message-ID: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> I've encountered some behavior that I don't quite understand in Cython pure Python mode. I've got three classes, Foo, Bar and Baz. Both Bar and Baz inherit from Foo. All three implement the empty method quux(). (The problem also occurs if Bar inherits this from Foo -- but *not* if Baz does.)
The .pxd file looks like this:

> cdef class Foo:
>
>     cpdef quux(self)
>
> cdef class Bar(Foo):
>
>     cpdef quux(self)
>
> cdef class Baz(Foo):
>
>     cpdef quux(self)

Now, I compile this, and try the following:

> >>> bar = Bar()
> >>> bar.quux()
> Traceback (most recent call last):
>   ...
> TypeError: Cannot convert some.path.Bar to some.path.Baz

For some reason, there's a type check on the self argument, requiring it to be of the Baz class, even though the method is in the Bar class... This also happens when more classes are involved. At first, I thought it consistently chose the last class in the pxd file (so it seemed in my real repo), but when I copied all the files to a new directory to try to construct a minimal example, it consistently (before I started whittling away) used the *next to last* class. The only difference was the path to the current directory... Is this a bug, or am I doing something wrong here? (If anyone's interested, I've got a tarball with a makefile ready to demonstrate the behavior.)
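For reference, a pure Python mode module matching the .pxd declarations might look like the sketch below. The real source is in the tarball and is not reproduced in this message, so the empty method bodies and the helper `call_all` are assumptions made for illustration only. Run uncompiled, each call dispatches on the instance's own class, which is the behavior the compiled version is expected to match.

```python
# Hypothetical module matching the .pxd declarations; the method bodies are
# assumed to be empty, as described in the text.
class Foo(object):
    def quux(self):
        pass


class Bar(Foo):
    def quux(self):
        pass


class Baz(Foo):
    def quux(self):
        pass


def call_all():
    """Call quux() on one instance of each class and collect the results."""
    return [cls().quux() for cls in (Foo, Bar, Baz)]
```

In plain Python there is no conversion between the sibling classes Bar and Baz at any point, so a TypeError mentioning Baz when calling a Bar method can only come from the generated C code.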
-- Magnus Lie Hetland http://hetland.org From michael at susens-schurter.com Fri Sep 18 03:25:44 2009 From: michael at susens-schurter.com (Michael Schurter) Date: Thu, 17 Sep 2009 18:25:44 -0700 Subject: [Cython] Cython not compiling .pyx files to .c Message-ID: <240b71640909171825u3d73a1ege01989e02cf1272b@mail.gmail.com> The following setup.py file doesn't generate the proper .c file when running install, build, or build_ext targets:

from setuptools import setup, find_packages
from pkg_resources import require
require('Cython')
from Cython.Distutils import build_ext
from Cython.Distutils.extension import Extension

setup(
    name="foo",
    version="0.6.0",
    author="foo",
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            'foo = foo.cli:main',
        ],
    },
    scripts=["scripts/foo"],
    cmdclass={'build_ext': build_ext},
    ext_modules=[
        Extension('bixdata.cbix',
            sources=['bixdata/bixidx.c', 'bixdata/cbix.pyx'],
            extra_compile_args=['-O2'],
        ),
    ],
    install_requires=[
        "cython >= 0.11, < 0.12",
        # other stuff....
    ],
)
#EOF

I've added print statements in Cython.Distutils.extension.Extension to find out that distutils.extension is turning these sources:

['bixdata/bixidx.c', 'bixdata/cbix.pyx']

...into this:

['bixdata/bixidx.c', 'bixdata/cbix.c']

So cbix.pyx becomes cbix.c, but Cython has never compiled the .pyx into the .c file in build_ext.cython_sources. It appears distutils is preventing Cython from automatically compiling .pyx to .c. Is there something I'm doing wrong? I've also tried using Extension from setuptools instead of from Cython, but it doesn't seem to make any difference. Thanks in advance, Michael Schurter @schmichael From robertwb at math.washington.edu Fri Sep 18 07:31:52 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Sep 2009 22:31:52 -0700 Subject: [Cython] Inheritance bug in pure Python mode?
In-Reply-To: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> References: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> Message-ID: <9CB17459-462E-402B-9C9A-CF780483E3CA@math.washington.edu> On Sep 17, 2009, at 2:18 PM, Magnus Lie Hetland wrote: > I've encountered some behavior that I don't quite understand in Cython > pure Python mode. Is this just in pure mode, or a general problem? (FYI, pure mode is still somewhat experimental--users like you will help iron out all the corner cases (though it should basically just work)). > I've got three classes, Foo, Bar and Baz. Both Bar > and Baz inherit from Foo. All three implement the empty method quux(). > (The problem also occurs if Bar inherits this from Foo -- but *not* if > Baz does.) > > The .pxd file looks like this: > >> cdef class Foo: >> >> cpdef quux(self) >> >> cdef class Bar(Foo): >> >> cpdef quux(self) >> >> cdef class Baz(Foo): >> >> cpdef quux(self) You shouldn't need to re-declare the cpdef method in each class--it should get it by inheritance. (Should we raise an error in this case?) Do you still have problems if you don't redeclare the method in Bar and Baz? In any case, it's a bug one way or another. - Robert From stefan_ml at behnel.de Fri Sep 18 08:26:17 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Sep 2009 08:26:17 +0200 Subject: [Cython] TreePath Message-ID: <4AB32809.60205@behnel.de> Hi, I implemented a little path language for Python's code tree, based on the path language in ElementTree and lxml. It happily steals from XPath and looks like this: //DefNode/ReturnStatNode/NameNode //NameNode[@name = 'decorator'] //NameNode/@name It allows selecting nodes (or values) from the parse tree based on the above expressions and is meant mainly for testing, but can be used for anything. The main motivation was to support assertions in the test suite, so that we can finally test that an expected optimisation really leads to a specific tree structure. 
This test support isn't there yet, but it should be trivial to add. Look at the unit tests in http://hg.cython.org/cython-unstable/file/tip/Cython/Compiler/Tests/TestTreePath.py The implementation is in http://hg.cython.org/cython-unstable/file/tip/Cython/Compiler/TreePath.py Have fun, Stefan From magnus at hetland.org Fri Sep 18 12:06:08 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri, 18 Sep 2009 12:06:08 +0200 Subject: [Cython] Inheritance bug in pure Python mode? In-Reply-To: <9CB17459-462E-402B-9C9A-CF780483E3CA@math.washington.edu> References: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> <9CB17459-462E-402B-9C9A-CF780483E3CA@math.washington.edu> Message-ID: On Sep 18, 2009, at 07:31 , Robert Bradshaw wrote: > On Sep 17, 2009, at 2:18 PM, Magnus Lie Hetland wrote: > >> I've encountered some behavior that I don't quite understand in >> Cython >> pure Python mode. > > Is this just in pure mode, or a general problem? Only in pure mode. I used the same code (modulo porting ;) with plain Cython before. > (FYI, pure mode is still somewhat experimental--users like you will > help iron out all the corner cases (though it should basically just > work)). OK with me :) >>> cdef class Foo: >>> >>> cpdef quux(self) >>> >>> cdef class Bar(Foo): >>> >>> cpdef quux(self) >>> >>> cdef class Baz(Foo): >>> >>> cpdef quux(self) > > You shouldn't need to re-declare the cpdef method in each class--it > should get it by inheritance. Sounds reasonable :) I just mapped the pyx stuff rather directly to the pxd (and the method was overridden in the pyx file). > (Should we raise an error in this case?) I have no opinion, really. My intuition was that the pxd file just gave you info about what was in the pyx file, and as long as that information is correct, an error might be counter-intuitive ... maybe? But either way is fine with me. > Do you still have problems if you don't redeclare the method in Bar > and Baz? In any case, it's a bug one way or another.
Yes, I do, actually :-/ I have now tried all eight combinations of declaring/not declaring the method in each class, and I get the error if and only if the method is declared in Foo (the shared superclass), regardless of what I do in the other classes. -- Magnus Lie Hetland http://hetland.org From stefan_ml at behnel.de Fri Sep 18 12:12:27 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Sep 2009 12:12:27 +0200 Subject: [Cython] Inheritance bug in pure Python mode? In-Reply-To: References: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> <9CB17459-462E-402B-9C9A-CF780483E3CA@math.washington.edu> Message-ID: <4AB35D0B.3000207@behnel.de> Magnus Lie Hetland wrote: > I have now tried all eight combinations of declaring/not declaring the > method in each class, and I get the error if and only if the method is > declared in Foo (the shared superclass), regardless of what I do in > the other classes. Sounds like a bug that was fixed a while ago, although I don't remember where it was fixed (hopefully cython-devel). Could you try with the latest developer versions from the cython-devel and/or cython-unstable branch? Stefan From dagss at student.matnat.uio.no Fri Sep 18 12:15:49 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 18 Sep 2009 12:15:49 +0200 Subject: [Cython] TreePath In-Reply-To: <4AB32809.60205@behnel.de> References: <4AB32809.60205@behnel.de> Message-ID: <4AB35DD5.2000801@student.matnat.uio.no> Stefan Behnel wrote: > Hi, > > I implemented a little path language for Python's code tree, based on the > path language in ElementTree and lxml. It happily steals from XPath and > looks like this: > > //DefNode/ReturnStatNode/NameNode > > //NameNode[@name = 'decorator'] > > //NameNode/@name > > It allows selecting nodes (or values) from the parse tree based on the > above expressions and is meant mainly for testing, but can be used for > anything.
The main motivation was to support assertions in the test suite, > so that we can finally test that an expected optimisation really leads to a > specific tree structure. This test support isn't there yet, but it should > be trivial to add. Look at the unit tests in > > http://hg.cython.org/cython-unstable/file/tip/Cython/Compiler/Tests/TestTreePath.py > > The implementation is in > > http://hg.cython.org/cython-unstable/file/tip/Cython/Compiler/TreePath.py > Wonderful, thanks! FYI, I have a metaclass lying around somewhere which makes Cython nodes support W3C DOM. If you think that is valuable then I could try to polish it up, but unless there's a real need I'd rather spend my time on other things. Also I can see that this overlaps a little bit in intended use case with my code tree writer, although I can see that it should be easier to write robust unit tests with this approach. Thanks again. Dag Sverre From stefan_ml at behnel.de Fri Sep 18 06:34:02 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Sep 2009 06:34:02 +0200 Subject: [Cython] Cython not compiling .pyx files to .c In-Reply-To: <240b71640909171825u3d73a1ege01989e02cf1272b@mail.gmail.com> References: <240b71640909171825u3d73a1ege01989e02cf1272b@mail.gmail.com> Message-ID: <4AB30DBA.7060903@behnel.de> Michael Schurter wrote: > The following setup.py file doesn't generate the proper .c file when > running install, build, or build_ext targets: > > from setuptools import setup, find_packages > from pkg_resources import require > require('Cython') > from Cython.Distutils import build_ext > from Cython.Distutils.extension import Extension > [...] > I've added print statements in Cython.Distutils.extension.Extension to > find out that distutils.extension is turning these sources: > > ['bixdata/bixidx.c', 'bixdata/cbix.pyx'] > > ...into this: > > ['bixdata/bixidx.c', 'bixdata/cbix.c'] That's a known 'feature' in setuptools.
When it sees a .pyx file, it checks for Pyrex being installed instead of Cython, and if that's not the case, it changes the .pyx into .c, which obviously breaks the build. What you can do is to provide a fake Pyrex in your sources, like this: http://codespeak.net/svn/lxml/trunk/fake_pyrex/ and include it in your Python path as at the top of this setup.py file: http://codespeak.net/svn/lxml/trunk/setup.py Stefan From magnus at hetland.org Fri Sep 18 12:18:12 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri, 18 Sep 2009 12:18:12 +0200 Subject: [Cython] On the subject of pure mode Message-ID: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> As mentioned in a couple of previous emails, I've been experimenting with pure mode lately, with the main motivation being the use of development tools (coverage, lint etc.). I'm sure I can get that to work, but I also sort of like the plain Cython syntax :-) So an alternative occurred to me: Building plain Python from Cython. IOW, I could just strip down the Cython code to get plain Python equivalents that I could use the dev-tools on. Is that just a silly idea? Would I be better off with pure mode? (Is there, perhaps, such a beast/processor somewhere? I'm assuming I could reuse the Cython parser if I needed to write something myself?) I guess this might be useful also if people have already written Cython code that they'd like to have in plain Python form (although much, if not most, Cython code probably includes wrapping stuff that's not easy to translate... I mainly use it to implement algorithms for experiments). From the other perspective, if I'm going to go for pure mode, I'd appreciate being able to keep the Python files to be compiled as simple as possible (i.e., with as little explicit declarations as possible). I'm guessing that associated pxd files, along with CEP 505 could make that possible, if I code carefully. How likely do you think it is that that CEP (or some equivalent) will be accepted? 
(I'm guessing it would be hard to make that handle Cython/numpy arrays, though...?) And another thought/idea: What about allowing further declarations in the .pxd file (or some similar mechanism)? I.e., declaring local variables? Might seem pointless, but the idea would be to allow the compilation of optimized C from a plain Python file without any signs of Cython declaration. (This could then even be done with Python code you're not allowed to modify.) I.e, you could specify all metadata separately. (If CEP 505 were accepted, this would only be needed in a few cases, I guess. Maybe?) Sorry for rambling :-) - M -- Magnus Lie Hetland http://hetland.org From magnus at hetland.org Fri Sep 18 12:39:13 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri, 18 Sep 2009 12:39:13 +0200 Subject: [Cython] Inheritance bug in pure Python mode? In-Reply-To: <4AB35D0B.3000207@behnel.de> References: <9E39F443-36E0-48F9-8EAB-2E353055F388@hetland.org> <9CB17459-462E-402B-9C9A-CF780483E3CA@math.washington.edu> <4AB35D0B.3000207@behnel.de> Message-ID: On Sep 18, 2009, at 12:12 , Stefan Behnel wrote: > > Magnus Lie Hetland wrote: >> I have now tried all eight combinations of declaring/not declaring >> the >> method in each class, and I get the error if and only if the method >> is >> declared in Foo (the shared superclass), regardless of what I do in >> the other classes. > > Sounds like a bug that was fixed a while ago, although I don't > remember > where it was fixed (hopefully cython-devel). Could you try with the > latest > developer versions from the cython-devel and/or cython-unstable > branch? Tried with cython-devel now. Had some trouble getting that to run -- the test_tm.py file in the Plex directory tried to import TransitionMaps, which was nowhere to be found, it seemed. Also several tests were run that just crashed (maybe because of my quick-and-dirty install -- just copying the relevant directory to where I was testing ;-) But: Yes.
When I got it to run, I got the same error. If you'd like to have a look, here's the setup I'm using (a couple of kB): http://www.idi.ntnu.no/~mlh/cybug.zip Just run make, and you should see the error. -- Magnus Lie Hetland http://hetland.org From sjparry88 at hotmail.co.uk Fri Sep 18 16:46:28 2009 From: sjparry88 at hotmail.co.uk (Sam Parry) Date: Fri, 18 Sep 2009 14:46:28 +0000 Subject: [Cython] vcvarsall.bat In-Reply-To: References: Message-ID: Hi guys, Thanks for the prompt reply. > Date: Tue, 15 Sep 2009 13:52:18 -0300 > From: dalcinl at gmail.com > To: cython-dev at codespeak.net > Subject: Re: [Cython] vcvarsall.bat > > IIRC, there are some patches in http://trac.cython.org/cython_trac/ to > make pyximport MinGW aware... Unfortunately, I did not have any chance > to review this, and Windows is always low in my priorities... You > know... the Windows OS has a lot of users, fans, and strong defenders > (we already had some of these "fights" here in this list!!!)... but > very few of them make any useful code contribution/testing/review for > their platform... I have not had a look at the patches yet but thought I would just comment on the other things you suggested. I dual boot Windows and Linux but, unfortunately, I am running the program on a Windows machine at work. > A fast workaround for your issue is to add a file named > "distutils.cfg" in C:\Python2.6\Lib\distutils (DISCLAIMER: do not > remember right now if this is the actual full path of distutils!) with > the contents below: > [build_ext] > compiler=mingw32 As for the distutils.cfg, I had already done that to no avail. It was in the correct place too. I just created the file using notepad and copied in the same text as you have written (but from one of the install help documents on the cython website). As I have had this in from the start it doesn't seem to help (me anyway)!
> Alternatively, you can add a "pydistutils.cfg" file with the same > contents in %HOME% or %UserProfile% or whatever your "home" directory > is in your Windows system (tip: use os.path.expanduser('~') in a > Python prompt to figure out the right place) Thanks for your detailed directions in finding the home directory. However, when I type this into the command line python interpreter, I get a NameError: 'os' is not defined. This seems a bit weird to me... As such I am unsure as to where to put the pydistutils.cfg file. > BTW, If you can elaborate a bit more on this and contribute all this > stuff to the Cython wiki, it would be great.... I wouldn't mind putting this stuff in the wiki in brief if I find a solution. It would probably be mostly copy and pasted from this conversation though, especially as I have no real experience at this kind of thing... Thanks again for the info, Sam > On Tue, Sep 15, 2009 at 11:14 AM, Sam Parry wrote: > > Hi guys, > > > > Not sure if I'm emailing to the correct place so apologies if I am spamming > > you... > > > > I am having problems with Cython compiling. I am following the tutorial on > > the main website (from the Users Guide) and when I type "python setup.py > > build_ext --inplace" I get an error saying "unable to find vcvarsall.bat". I > > am using MinGW as my compiler and running on windows XP. I have managed to > > find a way around this: typing "python setup.py build_ext --compiler=mingw32 > > --inplace" works for the first 'hello world' tutorial part. However, I get > > the vcvarsall error when trying the pyximport method. Adding the > > "--compiler=mingw32" does not work for any of the examples using any form of > > numpy import. > > > > I would be grateful for any insights provided that could help me run cython! > > I am new to using the command line, c and cython (and not all that > > experienced with python either!) so forgive me if I need more detail than > > the average user!
> > > > Thanks, > > > > Sam > > _______________________________________________ > > Cython-dev mailing list > > Cython-dev at codespeak.net > > http://codespeak.net/mailman/listinfo/cython-dev > > > > -- > Lisandro Dalcín > --------------- > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) > PTLC - Güemes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Sep 18 17:48:37 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 18 Sep 2009 12:48:37 -0300 Subject: [Cython] vcvarsall.bat In-Reply-To: References: Message-ID: OK, Sam... I'm very sorry, I got confused... The current implementation of pyximport will just ignore these config files I was talking about... You really need to apply the patch below to Cython sources. The patch is against cython-devel repo, but I think you should be able to fix Cython-0.11 as well. DISCLAIMER: I've not actually tested this !!!!... About how to find the location to put the per-user pydistutils.cfg file, you forgot to "import os"!!! .
In short, you have to enter a Python promp and do this (here doing it in Linux): >>> import os >>> os.path.expanduser('~') '/u/dalcinl' If you can help me to test the patch below, then I'll start a new thread to ask the other Cython developers to push this fix. diff -r 7fbe931e5ab7 pyximport/pyxbuild.py --- a/pyximport/pyxbuild.py Wed Sep 16 15:50:00 2009 +0200 +++ b/pyximport/pyxbuild.py Fri Sep 18 12:39:51 2009 -0300 @@ -55,6 +55,11 @@ build = dist.get_command_obj('build') build.build_base = pyxbuild_dir + config_files = dist.find_config_files() + try: config_files.remove('setup.cfg') + except ValueError: pass + dist.parse_config_files(config_files) + try: ok = dist.parse_command_line() except DistutilsArgError: On Fri, Sep 18, 2009 at 11:46 AM, Sam Parry wrote: > Hi guys, > > Thanks for the prompt reply. > > >> Date: Tue, 15 Sep 2009 13:52:18 -0300 >> From: dalcinl at gmail.com >> To: cython-dev at codespeak.net >> Subject: Re: [Cython] vcvarsall.bat >> >> IIRC, there are some patches in http://trac.cython.org/cython_trac/ to >> make pyximport MinGW aware... Unfortunately, I did not have any chance >> to review this, and Windows is always low in my priorities... You >> know... the Windows OS has a lot of users, fans, and strong defenders >> (we already had some of these "fights" here in this list!!!)... but >> very few of them make any useful code contribution/testing/review for >> their platform... > > I have not had a look at the patches yet but thought I would just comment on > the?other things you suggested. I dual boot Windows and Linux but, > unfortunately, I am running the program on a Windows machine at work. > > >> A fast workaround for your issue if to add a file named >> "distutils.cfg" in C:\Python2.6\Lib\distutils (DISCLAIMER: do not >> remember right now if this is the actual full path of distutils!) 
with >> the contents below: >> [build_ext] >> compiler=mingw32 > > As far as the disutils.cfg, I had already done that to no avail.?It was in > the correct place too. I just created the file using notepad and?copied in > the same text as you have writen (but from one of the install help documents > on the cython website). As I have had this in from the start it doesn't seem > to help (me anyway)! > > >> Alternatively, you can add a "pydistutils.cfg" file with the same >> contents in %HOME% or %UserProfile% or watever your "home" directory >> is in your Windows system (tip: use os.path.expanduser('~') in a >> Python prompt to figure out the right place) > > Thanks for your detailed directions in finding the home directory. However, > when I type this into the command line python interpreter, I get a > NameError: 'os' is not defined. This seems a bit weird to me... As such I am > unsure as to where to put the pydisutils.cfg file. > > >> BTW, If you can elaborate a bit more on this and contribute all this >> stuff to the Cython wiki, it would be great.... > > I wouldn't mind putting this stuff in the wiki in brief if I find a > solution. It would probably be mostly copy and pasted from this conversation > though, especially as I have no real experience at this kind of thing... > > > Thanks again for the info, > > Sam > > > > > > > > >> On Tue, Sep 15, 2009 at 11:14 AM, Sam Parry >> wrote: >> > Hi guys, >> > >> > Not sure if I'm emailing to the correct place so apologies if I am >> > spamming >> > you... >> > >> > I am having problems with Cython compiling. I am following the tutorial >> > on >> > the main website (from the Users Guide)?and when I type "python setup.py >> > build_ext --inplace" I get an error saying "unable to find >> > vcvarsall.bat". I >> > am using MinGW as my compiler and running on windows XP. 
I have managed >> > to >> > find a way around this: typing "python setup.py build_ext >> > --compiler=mingw32 >> > --inplace" works for the first 'hello world' tutorial part. However, I >> > get >> > the vcvarsall error when trying the pyximport method. Adding the >> > "--compiler=mingw32" does not work for any of the examples using any >> > form of >> > numpy import. >> > >> > I would be grateful for any insights provided that could help me run >> > cython! >> > I am new to using the command line, c and cython (and not all that >> > experienced with python either!) so forgive me if I need more detail >> > than >> > the average user! >> > >> > Thanks, >> > >> > Sam >> > ________________________________ >> > Use Hotmail to send and receive mail from your different email accounts. >> > Find out how. >> > _______________________________________________ >> > Cython-dev mailing list >> > Cython-dev at codespeak.net >> > http://codespeak.net/mailman/listinfo/cython-dev >> > >> > >> >> >> >> -- >> Lisandro Dalc?n >> --------------- >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina >> Tel/Fax: +54-(0)342-451.1594 >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > > > ________________________________ > View your Twitter and Flickr updates from one place - Learn more! 
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From michael at susens-schurter.com Fri Sep 18 19:58:20 2009
From: michael at susens-schurter.com (Michael Schurter)
Date: Fri, 18 Sep 2009 10:58:20 -0700
Subject: [Cython] Cython not compiling .pyx files to .c
In-Reply-To: <4AB30DBA.7060903@behnel.de>
References: <240b71640909171825u3d73a1ege01989e02cf1272b@mail.gmail.com>
 <4AB30DBA.7060903@behnel.de>
Message-ID: <240b71640909181058o3df4618bt38abf88f87304772@mail.gmail.com>

On Thu, Sep 17, 2009 at 9:34 PM, Stefan Behnel wrote:
> Michael Schurter wrote:
>> The following setup.py file doesn't generate the proper .c file when
>> running install, build, or build_ext targets:
>>
>> from setuptools import setup, find_packages
>> from pkg_resources import require
>> require('Cython')
>> from Cython.Distutils import build_ext
>> from Cython.Distutils.extension import Extension
>> [...]
>> I've added print statements in Cython.Distutils.extension.Extension to
>> find out that distutils.extension is turning these sources:
>>
>> ['bixdata/bixidx.c', 'bixdata/cbix.pyx']
>>
>> ...into this:
>>
>> ['bixdata/bixidx.c', 'bixdata/cbix.c']
>
> That's a known 'feature' in setuptools. When it sees a .pyx file, it checks
> for Pyrex being installed instead of Cython, and if that's not the case, it
> changes the .pyx into .c, which obviously breaks the build.
> > What you can do is to provide a fake Pyrex in your sources, like this: > > http://codespeak.net/svn/lxml/trunk/fake_pyrex/ > > and include it in your Python path as at the top of this setup.py file: > > http://codespeak.net/svn/lxml/trunk/setup.py > > Stefan Worked beautifully. Thanks Stefan! From jfrancis18 at gmail.com Fri Sep 18 23:56:54 2009 From: jfrancis18 at gmail.com (Jean-Francois Moulin) Date: Fri, 18 Sep 2009 23:56:54 +0200 Subject: [Cython] numpy array declaration problem Message-ID: <4AB40226.4070701@gmail.com> Hi to all! I am trying to cythonize some code including operations on numpy arrays. Following the user manual I declared the type of the arrays and tried to compile, which fails with a syntax error message for which I can't see any reason. It appears not to like the [] in my declaration (or at least notto find the closing ]. I tried compiling without using the DTYPE declaration at all (np.ndarray without specified type) and it works, but of course the speed gain is not much. Any clue? Thanks in advance for your help JF The abridged code: > > DTYPE = np.int > > ctypedef np.int_t DTYPE_t > > > > ... > > > > #@cython.boundscheck(False) #commented just to see if solves the problem... NOPE > > cpdef tuple discri_sweep_2D(... > > np.ndarray[DTYPE_t, ndim=1] tof_histo, > > np.ndarray[DTYPE_t, ndim=3] slab > > ): > > ... Compiling fails with the following output: > > running build_ext > > cythoning discri.pyx to discri.c > > > > Error converting Pyrex file to C: > > ------------------------------------------------------------ > > ... 
> > np.ndarray[DTYPE_t,ndim=1]tof_histo, > > ^ > > ------------------------------------------------------------ > > > > /home/.../discri_Cython/discri_Cython.main/discri.pyx:197:51: Expected ']' > > building 'discri' extension > > gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c discri.c -o build/temp.linux-x86_64-2.6/discri.o > > discri.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation. > > error: command 'gcc' failed with exit status 1 From dagss at student.matnat.uio.no Sat Sep 19 09:33:00 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: 19 Sep 2009 09:33:00 +0200 Subject: [Cython] numpy array declaration problem Message-ID: <3336197633.532900@smtp.netcom.no> You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs... Dag Sverre Seljebotn -----Original Message----- From: Jean-Francois Moulin Date: Friday, Sep 18, 2009 11:57 pm Subject: [Cython] numpy array declaration problem To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net Hi to all! > >I am trying to cythonize some code including operations on numpy arrays. Following the user manual I declared the type of the arrays and tried to compile, which fails with a syntax error message for which I can't see any reason. It appears not to like the [] in my declaration (or at least notto find the closing ]. I tried compiling without using the >DTYPE declaration at all (np.ndarray without specified type) and it >works, but of course the speed gain is not much. > >Any clue? > >Thanks in advance for your help >JF > > >The abridged code: > >> > DTYPE = np.int > > ctypedef np.int_t DTYPE_t > > > > ... > > > > #@cython.boundscheck(False) #commented just to see if solves the >problem... NOPE > > cpdef tuple discri_sweep_2D(... 
> > np.ndarray[DTYPE_t, ndim=1] tof_histo, > > np.ndarray[DTYPE_t, ndim=3] slab > > ): > > ... > > >Compiling fails with the following output: > > >> > running build_ext > > cythoning discri.pyx to discri.c > > > > Error converting Pyrex file to C: > > ------------------------------------------------------------ > > ... > > np.ndarray[DTYPE_t,ndim=1]tof_histo, > > ^ > > ------------------------------------------------------------ > > > > /home/.../discri_Cython/discri_Cython.main/discri.pyx:197:51: >Expected ']' > > building 'discri' extension > > gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall >-Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c discri.c -o >build/temp.linux-x86_64-2.6/discri.o > > discri.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation. > > error: command 'gcc' failed with exit status 1 >_______________________________________________ >Cython-dev mailing list >Cython-dev at codespeak.net >http://codespeak.net/mailman/listinfo/cython-dev > From dagss at student.matnat.uio.no Sat Sep 19 10:45:23 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 19 Sep 2009 10:45:23 +0200 Subject: [Cython] On the subject of pure mode In-Reply-To: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> References: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> Message-ID: <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no> Magnus Lie Hetland wrote: > As mentioned in a couple of previous emails, I've been experimenting > with pure mode lately, with the main motivation being the use of > development tools (coverage, lint etc.). I'm sure I can get that to > work, but I also sort of like the plain Cython syntax :-) So an > alternative occurred to me: Building plain Python from Cython. > > IOW, I could just strip down the Cython code to get plain Python > equivalents that I could use the dev-tools on. Is that just a silly > idea? Would I be better off with pure mode? 
> (Is there, perhaps, such a
> beast/processor somewhere? I'm assuming I could reuse the Cython
> parser if I needed to write something myself?)

No, it's a good idea (though see below). The work is about 1/3 done I
think... Compile/CodeWriter.py contains a tree transform which will
serialize the tree back to Cython code which should be a good starting
point (it only supports parts of the language currently though) -- it can
simply have a flag or subclass to skip any type declarations, output cdef
classes as py classes etc.

To quickly test with it, plug an instance into the pipeline (search for
"pipeline" in Main.py).

(If a proper tool is developed one should set up a separate pipeline for
the tool. In the devel branch I have been doing some refactorings so that
there's a separate Pipeline.py.)

>
> I guess this might be useful also if people have already written
> Cython code that they'd like to have in plain Python form (although
> much, if not most, Cython code probably includes wrapping stuff that's
> not easy to translate... I mainly use it to implement algorithms for
> experiments).
>
> From the other perspective, if I'm going to go for pure mode, I'd
> appreciate being able to keep the Python files to be compiled as
> simple as possible (i.e., with as few explicit declarations as
> possible). I'm guessing that associated pxd files, along with CEP 505
> could make that possible, if I code carefully. How likely do you think
> it is that that CEP (or some equivalent) will be accepted? (I'm
> guessing it would be hard to make that handle Cython/numpy arrays,
> though...?)

I have lots of thoughts on pure Python mode :-) (but I'm mostly
restraining myself until there's a good chance that somebody has time to
actually implement something).

First, as for NumPy arrays, it should be possible based on function
argument decorators.
Also with the new memoryview syntax, this could work in both Python and
Cython:

    import cython
    view = cython.int[:,:](my2darray)
    print view[3,4] # fast when compiled

My vote is always in favour of solutions that make Cython appear almost
like a library in Python code -- with the difference being that use of the
library is faster when compiled. E.g., for cdef classes I'd like a solution
based on declaring interfaces with Python 3's Abstract Base Classes, and so
on. (Don't tempt me, I have a whole CEP on this sitting in my head but I
should spend my time better than writing it down...)

> And another thought/idea: What about allowing further declarations in
> the .pxd file (or some similar mechanism)? I.e., declaring local
> variables? Might seem pointless, but the idea would be to allow the
> compilation of optimized C from a plain Python file without any signs
> of Cython declaration. (This could then even be done with Python code
> you're not allowed to modify.) I.e., you could specify all metadata
> separately. (If CEP 505 were accepted, this would only be needed in a
> few cases, I guess. Maybe?)

I think declaring local variables is already allowed in pxd files, it's
just not documented. I think

    @cython.locals(localvar=cython.int)
    cpdef foo(int x)

in the pxd file should work. Not that pretty though...perhaps stick to

    @cython.locals(localvar=cython.int)
    cpdef foo(cython.int x)

for better readability.
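
A minimal sketch of the "Cython as a library" idea discussed above, written
as a plain Python module rather than a pxd file. It assumes that
cython.locals and cython.int are importable when Cython is installed; the
_CythonShim fallback class and the triangular function are hypothetical
names made up for this illustration and are not part of Cython. Uncompiled,
the decorator does nothing; compiled, the typed locals should become C ints.

```python
# Fall back to a no-op stand-in when Cython is not installed.
# _CythonShim is a hypothetical helper written only for this sketch.
try:
    import cython
except ImportError:
    class _CythonShim(object):
        int = int          # placeholder for the C int type
        compiled = False

        @staticmethod
        def locals(**types):
            # Ignore the type hints and return the function unchanged.
            def decorate(func):
                return func
            return decorate

    cython = _CythonShim()


@cython.locals(i=cython.int, total=cython.int)
def triangular(n):
    """Sum 0..n; the loop variables are typed only when compiled."""
    total = 0
    for i in range(n + 1):
        total += i
    return total


print(triangular(4))
```

The point of the design is that the file stays importable by ordinary
development tools (coverage, lint) whether or not it has been compiled.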

Dag Sverre

From magnus at hetland.org Sat Sep 19 13:26:35 2009
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Sat, 19 Sep 2009 13:26:35 +0200
Subject: [Cython] On the subject of pure mode
In-Reply-To: <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no>
References: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org>
 <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no>
Message-ID: <6667699E-E77B-4C22-B78C-70A3B59337A1@hetland.org>

On Sep 19, 2009, at 10:45 , Dag Sverre Seljebotn wrote:

> Magnus Lie Hetland wrote:
[snip]
>> IOW, I could just strip down the Cython code to get plain Python
>> equivalents that I could use the dev-tools on. Is that just a silly
>> idea? Would I be better off with pure mode? (Is there, perhaps,
>> such a beast/processor somewhere? I'm assuming I could reuse the
>> Cython parser if I needed to write something myself?)
>
> No, it's a good idea (though see below). The work is about 1/3 done
> I think... Compile/CodeWriter.py contains a tree transform which
> will serialize the tree back to Cython code which should be a good
> starting point (it only supports parts of the language currently
> though) -- it can simply have a flag or subclass to skip any type
> declarations, output cdef classes as py classes etc.

Sounds great :) (I found it in Cython/Writer.py, though; I guess
perhaps it's been moved in the devel version or something?)

> To quickly test with it, plug an instance into the pipeline (search
> for "pipeline" in Main.py).

Hmm. I've been experimenting with this for a while now, and I'm a bit
stumped.

After fiddling about with using various classes from Visitor.py as
mix-ins to get CodeWriter to fit into the pipeline, I just added a simple
__call__ myself. However, it seems that somehow the results in the
LinesResult disappear somewhere along the pipeline? At the moment, I
simply print them out before returning from __call__ :-/

As for the pipeline -- is it OK to just put it right after the parsing?
For one thing, I get lots of unhandled nodes, but I guess that's just because they're not implemented yet. But I also get the transformation from "import numpy as np" to "np = (import numpy)", which isn't exactly valid :-> (I guess that's not an issue with the stage of the pipeline, though; probably not something that'll be get more Pythonesque further down the line toward code generation ;-) > (If a proper tool is developed one should set up a seperate pipeline > for the tool. In the devel branch I have been doing some > refactorings so that there's a seperate Pipeline.py.) Hm. OK. I guess in my case, it'd be useful to plug this thing into the pipeline set up by distutils, so that when I build the C code etc., I also get .py files in some configured directory. But perhaps there can be parallel pipelines based on the same parse tree? -- Magnus Lie Hetland http://hetland.org From dagss at student.matnat.uio.no Sat Sep 19 14:03:59 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 19 Sep 2009 14:03:59 +0200 Subject: [Cython] On the subject of pure mode In-Reply-To: <6667699E-E77B-4C22-B78C-70A3B59337A1@hetland.org> References: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no> <6667699E-E77B-4C22-B78C-70A3B59337A1@hetland.org> Message-ID: <4ef37ce5b89c36f1345e0aabde8ce0a0.squirrel@webmail.uio.no> > On Sep 19, 2009, at 10:45 , Dag Sverre Seljebotn wrote: > >> Magnus Lie Hetland wrote: > [snip] >>> IOW, I could just strip down the Cython code to get plain Python >>> equivalents that I could use the dev-tools on. Is that just a silly >>> idea? Would I be better off with pure mode? (Is there, perhaps, >>> such a beast/processor somewhere? I'm assuming I could reuse the >>> Cython parser if I needed to write something myself?) >> >> No, it's a good idea (though see below). The work is about 1/3 done >> I think... 
Compile/CodeWriter.py contains a tree transform which >> will serialize the tree back to Cython code which should be a good >> starting point (it only supports parts of the language currently >> though) -- it can simply have a flag or subclass to skip any type >> declarations, output cdef classes as py classes etc. > > Sounds great :) (I found it in Cython/Writer.py, though; I guess > perhaps it's been moved in the devel version or something?) > >> To quickly test with it, plug an instance into the pipeline (search >> for "pipeline" in Main.py). > > Hmm. I've been experimenting with this for a while now, and I'm a bit > stumped. > > After fiddling about with using various classes from Visitor.py as mix- > ins to get CodeWriter to fit into the pipeline, I just added a simple > __call__ myself. However, it seems that somehow the results in the > LinesResult disappear somewhere along the pipeline? At the moment, I > simply print them out before returning form __call__ :-/ I must be brief... Yes that's as intended, otherwise you'd disturb the pipeline. > As for the pipeline -- is it OK to just put it right after the parsing? Seems like the best location, yes. analyse_declarations tends to remove some of the type declarations for you, but it is probably easier (and more stable with regards to refactorings in Cython etc.) to strip these explicitly (= not ouput) in the writer. > For one thing, I get lots of unhandled nodes, but I guess that's just > because they're not implemented yet. But I also get the transformation Yep. 
> from "import numpy as np" to "np = (import numpy)", which isn't > exactly valid :-> (I guess that's not an issue with the stage of the > pipeline, though; probably not something that'll be get more > Pythonesque further down the line toward code generation ;-) No, unfortunately it is the opposite: The parser is too smart for its own good and does some transformations (we internally treat an import as the latter case, so it is already transformed too far). Correct solution: Submit a patch which make the parser leave import statements alone (probably through a new ImportStatNode, in contrast with the current which is more like a ImportExprNode). Then, in ParseTreeTransforms.PostParse, transform ImportStateNode to ImportExprNode, and insert your writer before PostParse in the pipeline. > >> (If a proper tool is developed one should set up a seperate pipeline >> for the tool. In the devel branch I have been doing some >> refactorings so that there's a seperate Pipeline.py.) > > > Hm. OK. I guess in my case, it'd be useful to plug this thing into the > pipeline set up by distutils, so that when I build the C code etc., I > also get .py files in some configured directory. But perhaps there can > be parallel pipelines based on the same parse tree? 
> > -- > Magnus Lie Hetland > http://hetland.org > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From dagss at student.matnat.uio.no Sat Sep 19 14:05:53 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 19 Sep 2009 14:05:53 +0200 Subject: [Cython] On the subject of pure mode In-Reply-To: <4ef37ce5b89c36f1345e0aabde8ce0a0.squirrel@webmail.uio.no> References: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no> <6667699E-E77B-4C22-B78C-70A3B59337A1@hetland.org> <4ef37ce5b89c36f1345e0aabde8ce0a0.squirrel@webmail.uio.no> Message-ID: >> On Sep 19, 2009, at 10:45 , Dag Sverre Seljebotn wrote: >> >>> Magnus Lie Hetland wrote: >> [snip] >>>> IOW, I could just strip down the Cython code to get plain Python >>>> equivalents that I could use the dev-tools on. Is that just a silly >>>> idea? Would I be better off with pure mode? (Is there, perhaps, >>>> such a beast/processor somewhere? I'm assuming I could reuse the >>>> Cython parser if I needed to write something myself?) >>> >>> No, it's a good idea (though see below). The work is about 1/3 done >>> I think... Compile/CodeWriter.py contains a tree transform which >>> will serialize the tree back to Cython code which should be a good >>> starting point (it only supports parts of the language currently >>> though) -- it can simply have a flag or subclass to skip any type >>> declarations, output cdef classes as py classes etc. >> >> Sounds great :) (I found it in Cython/Writer.py, though; I guess >> perhaps it's been moved in the devel version or something?) >> >>> To quickly test with it, plug an instance into the pipeline (search >>> for "pipeline" in Main.py). >> >> Hmm. I've been experimenting with this for a while now, and I'm a bit >> stumped. 
>> >> After fiddling about with using various classes from Visitor.py as mix- >> ins to get CodeWriter to fit into the pipeline, I just added a simple >> __call__ myself. However, it seems that somehow the results in the >> LinesResult disappear somewhere along the pipeline? At the moment, I >> simply print them out before returning form __call__ :-/ > > I must be brief... > > Yes that's as intended, otherwise you'd disturb the pipeline. > >> As for the pipeline -- is it OK to just put it right after the parsing? > > Seems like the best location, yes. analyse_declarations tends to remove > some of the type declarations for you, but it is probably easier (and more > stable with regards to refactorings in Cython etc.) to strip these > explicitly (= not ouput) in the writer. > >> For one thing, I get lots of unhandled nodes, but I guess that's just >> because they're not implemented yet. But I also get the transformation > > Yep. > >> from "import numpy as np" to "np = (import numpy)", which isn't >> exactly valid :-> (I guess that's not an issue with the stage of the >> pipeline, though; probably not something that'll be get more >> Pythonesque further down the line toward code generation ;-) > > No, unfortunately it is the opposite: The parser is too smart for its own > good and does some transformations (we internally treat an import as the > latter case, so it is already transformed too far). > > Correct solution: Submit a patch which make the parser leave import > statements alone (probably through a new ImportStatNode, in contrast with > the current which is more like a ImportExprNode). Then, in > ParseTreeTransforms.PostParse, transform ImportStateNode to > ImportExprNode, and insert your writer before PostParse in the pipeline. 
> Forgot: Hack: Have the writer check whenever it outputs assignment whether it assigns to an import :-) From jfrancis18 at gmail.com Sun Sep 20 08:02:02 2009 From: jfrancis18 at gmail.com (Jean-Francois Moulin) Date: Sun, 20 Sep 2009 08:02:02 +0200 Subject: [Cython] numpy array declaration problem In-Reply-To: <3336197633.532900@smtp.netcom.no> References: <3336197633.532900@smtp.netcom.no> Message-ID: <4AB5C55A.1040007@gmail.com> Hi! I tried to use the cdef instead as cpdef as suggested and it still fails, complaining for a missing closing ] (which is actually not missing)... I reduced the code down to this and it produces the error: > import numpy as np > cimport numpy as np > import cython > DTYPE = np.int > ctypedef np.int_t DTYPE_t > @cython.boundscheck(False) > cdef test(np.ndarray[DTYPE_t,ndim=1]tof_histo): > print tof_histo Just in case, here is the compile output: python setup_np_ndarray.py build_ext --inplace running build_ext cythoning np_ndarray_test.pyx to np_ndarray_test.c Error converting Pyrex file to C: ------------------------------------------------------------ ... DTYPE = np.int ctypedef np.int_t DTYPE_t @cython.boundscheck(False) cdef test(np.ndarray[DTYPE_t,ndim=1]tof_histo): ^ ------------------------------------------------------------ /home/jfmoulin/My_Progs/dev/cython/np_ndarray_test.pyx:9:38: Expected ']' building 'testnpnd' extension gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c np_ndarray_test.c -o build/temp.linux-x86_64-2.6/np_ndarray_test.o np_ndarray_test.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation. error: command 'gcc' failed with exit status 1 Dag Sverre Seljebotn wrote: > You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs... 
> > Dag Sverre Seljebotn > -----Original Message----- > From: Jean-Francois Moulin > Date: Friday, Sep 18, 2009 11:57 pm > Subject: [Cython] numpy array declaration problem > To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net > > Hi to all! >> I am trying to cythonize some code including operations on numpy arrays. Following the user manual I declared the type of the arrays and tried to compile, which fails with a syntax error message for which I can't see any reason. It appears not to like the [] in my declaration (or at least notto find the closing ]. I tried compiling without using the >> DTYPE declaration at all (np.ndarray without specified type) and it >> works, but of course the speed gain is not much. >> >> Any clue? >> >> Thanks in advance for your help >> JF >> >> >> The abridged code: >> >>>> DTYPE = np.int >>> ctypedef np.int_t DTYPE_t >>> >>> ... >>> >>> #@cython.boundscheck(False) #commented just to see if solves the >> problem... NOPE >>> cpdef tuple discri_sweep_2D(... >>> np.ndarray[DTYPE_t, ndim=1] tof_histo, >>> np.ndarray[DTYPE_t, ndim=3] slab >>> ): >>> ... >> >> Compiling fails with the following output: >> >> >>>> running build_ext >>> cythoning discri.pyx to discri.c >>> >>> Error converting Pyrex file to C: >>> ------------------------------------------------------------ >>> ... >>> np.ndarray[DTYPE_t,ndim=1]tof_histo, >>> ^ >>> ------------------------------------------------------------ >>> >>> /home/.../discri_Cython/discri_Cython.main/discri.pyx:197:51: >> Expected ']' >>> building 'discri' extension >>> gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall >> -Wstrict-prototypes -fPIC -I/usr/include/python2.6 -c discri.c -o >> build/temp.linux-x86_64-2.6/discri.o >>> discri.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation. 
>>> error: command 'gcc' failed with exit status 1 >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From johan.gronqvist at gmail.com Sun Sep 20 09:00:20 2009 From: johan.gronqvist at gmail.com (=?ISO-8859-1?Q?Johan_Gr=F6nqvist?=) Date: Sun, 20 Sep 2009 09:00:20 +0200 Subject: [Cython] numpy array declaration problem In-Reply-To: <4AB5C55A.1040007@gmail.com> References: <3336197633.532900@smtp.netcom.no> <4AB5C55A.1040007@gmail.com> Message-ID: Hi, You must use def, not cdef. It is not possible to use such a dtype declaration with cdef. / johan Jean-Francois Moulin skrev: > I tried to use the cdef instead as cpdef as suggested and it still > fails, > Dag Sverre Seljebotn wrote: >> You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs... >> >> Dag Sverre Seljebotn >> -----Original Message----- >> From: Jean-Francois Moulin >> >>>> cpdef tuple discri_sweep_2D(... >>>> np.ndarray[DTYPE_t, ndim=1] tof_histo, >>>> np.ndarray[DTYPE_t, ndim=3] slab >>>> ): From jfrancis18 at gmail.com Sun Sep 20 15:51:55 2009 From: jfrancis18 at gmail.com (Jean-Francois Moulin) Date: Sun, 20 Sep 2009 15:51:55 +0200 Subject: [Cython] numpy array declaration problem In-Reply-To: References: <3336197633.532900@smtp.netcom.no> <4AB5C55A.1040007@gmail.com> Message-ID: <4AB6337B.6030802@gmail.com> Oooppss... sorry for not reading correctly! But... does that mean that there will be no speed imporvement for these functions (which are the ones I want to speed up the most in my code) Moreover I see (now) in the manual that the def fns might include some cdef declarations of np.ndarrays.... what is happening there? 
Would I be better off implementing my stuff without numpy arrays if I am
really looking for speed (I do not think that would be very wise in my
case).

Thanks for your patience! ;0)
Best
JF

Johan Grönqvist wrote:
> Hi,
>
> You must use def, not cdef. It is not possible to use such a dtype
> declaration with cdef.
>
>
>
> / johan
>
> Jean-Francois Moulin wrote:
>> I tried to use the cdef instead as cpdef as suggested and it still
>> fails,
>> Dag Sverre Seljebotn wrote:
>>> You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs...
>>>
>>> Dag Sverre Seljebotn
>>> -----Original Message-----
>>> From: Jean-Francois Moulin
>>>
>>>>> cpdef tuple discri_sweep_2D(...
>>>>> np.ndarray[DTYPE_t, ndim=1] tof_histo,
>>>>> np.ndarray[DTYPE_t, ndim=3] slab
>>>>> ):
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev

From dagss at student.matnat.uio.no Sun Sep 20 18:51:00 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: 20 Sep 2009 18:51:00 +0200
Subject: [Cython] numpy array declaration problem
Message-ID: <3336317476.81509@smtp.netcom.no>

cdef vs. def only affects the speed of the *call* of the function, not the
time spent in the function itself. If you are processing anything but a
very small amount of array data there won't be a difference.

Or did you mean that your function only wants to e.g. process a single
element? Yes, then you need to not use arrays at the moment.

Dag Sverre Seljebotn
-----Original Message-----
From: Jean-Francois Moulin
Date: Sunday, Sep 20, 2009 3:52 pm
Subject: Re: [Cython] numpy array declaration problem
To: cython-dev at codespeak.net
Reply-To: cython-dev at codespeak.net

>Oooppss... sorry for not reading correctly!
>But...
does that mean that there will be no speed imporvement for these > functions (which are the ones I want to speed up the most in my code) >Moreover I see (now) in the manual that the def fns might include some >cdef declarations of np.ndarrays.... what is happening there? >Should I better try to implement my stuff without numpy arrays if I am >really looking for speed (I do not think that would be very wise in my >case). > >Thanks for your patience! ;0) >Best >JF >Johan Gr?nqvist wrote: >> Hi, >> >> You must use def, not cdef. It is not possible to use such a dtype >> declaration with cdef. >> >> >> >> / johan >> >> Jean-Francois Moulin skrev: >>> I tried to use the cdef instead as cpdef as suggested and it still >>> fails, >>> Dag Sverre Seljebotn wrote: >>>> You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs... >>>> >>>> Dag Sverre Seljebotn >>>> -----Original Message----- >>>> From: Jean-Francois Moulin >>>> >>>>>> cpdef tuple discri_sweep_2D(... >>>>>> np.ndarray[DTYPE_t, ndim=1] tof_histo, >>>>>> np.ndarray[DTYPE_t, ndim=3] slab >>>>>> ): >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > >_______________________________________________ >Cython-dev mailing list >Cython-dev at codespeak.net >http://codespeak.net/mailman/listinfo/cython-dev > From jfrancis18 at gmail.com Sun Sep 20 21:13:07 2009 From: jfrancis18 at gmail.com (Jean-Francois Moulin) Date: Sun, 20 Sep 2009 21:13:07 +0200 Subject: [Cython] numpy array declaration problem In-Reply-To: <3336317476.81509@smtp.netcom.no> References: <3336317476.81509@smtp.netcom.no> Message-ID: <4AB67EC3.3060504@gmail.com> Thanks for these answers! That will for sure help me. I actually tried and my code is ... slower after the declaration than before... 
( I have to check that I changed only this though and not a 'subtle' performance killer) What I have is a big 3d L array for which I need to increment a single element by one at each call (so, there really isn't much of work for numpy, only later do I use the full power of Numpy/Scipy). So, for now I can separate more sophisticated operations and perform them later once the array is finalized... I can thus build my array as a huge list of lists and eventually convert it to np.ndarray once the time expensive work is done (many, many fn calls... millions of them actually). Or is it worth (possible?) to use the array.array object of python. The type of the elements would then be defined... I read the CEP 517: Cython array type and understand that it is work in progress. But does that mean that the array.array python object will not be efficiently (or at all) understood by cython? Thanks again for answering my naive questions (to call my C background thin would be a kind of understatement). JF Dag Sverre Seljebotn wrote: > cdef vs. def only affects the speed of the *call* of the function, not the time spent in the function itself. If you are processing anything but a very small amount of array data there won't be a difference. > > Or did you mean that your function only wants to e.g. process a single element? Yes, then you need to not use arrays at the moment. > > Dag Sverre Seljebotn > -----Original Message----- > From: Jean-Francois Moulin > Date: Sunday, Sep 20, 2009 3:52 pm > Subject: Re: [Cython] numpy array declaration problem > To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net > > >> Oooppss... sorry for not reading correctly! >> But... does that mean that there will be no speed imporvement for these >> functions (which are the ones I want to speed up the most in my code) >> Moreover I see (now) in the manual that the def fns might include some >> cdef declarations of np.ndarrays.... what is happening there? 
>> Should I better try to implement my stuff without numpy arrays if I am >> really looking for speed (I do not think that would be very wise in my >> case). >> >> Thanks for your patience! ;0) >> Best >> JF >> Johan Gr?nqvist wrote: >>> Hi, >>> >>> You must use def, not cdef. It is not possible to use such a dtype >>> declaration with cdef. >>> >>> >>> >>> / johan >>> >>> Jean-Francois Moulin skrev: >>>> I tried to use the cdef instead as cpdef as suggested and it still >>>> fails, >>>> Dag Sverre Seljebotn wrote: >>>>> You need to use "def", not "cpdef", when dealing with array arguments. Sorry, this should have been mentioned in the docs... >>>>> >>>>> Dag Sverre Seljebotn >>>>> -----Original Message----- >>>>> From: Jean-Francois Moulin >>>>> >>>>>>> cpdef tuple discri_sweep_2D(... >>>>>>> np.ndarray[DTYPE_t, ndim=1] tof_histo, >>>>>>> np.ndarray[DTYPE_t, ndim=3] slab >>>>>>> ): >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > > ------------------------------------------------------------------------ > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From thesweeheng at gmail.com Sun Sep 20 23:07:12 2009 From: thesweeheng at gmail.com (Tan Swee Heng) Date: Mon, 21 Sep 2009 05:07:12 +0800 Subject: [Cython] Short circuit slower than nested if Message-ID: Hi, I've search the archives for "short circuit" but did not find an answer. So I am posting here. 
Consider the following code (app.pyx):

  from hashlib import sha1

  cdef long expensive(long n, long k):
    return long(sha1("%s:%s" % (n, k)).hexdigest()[:4], base=16) < 256

  def test1(long n):
    cdef long i, sum = 0
    for i in range(n):
      if expensive(i, 0) and expensive(i, 1): # short-circuit
        sum += i
    print sum

  def test2(long n):
    cdef long i, sum = 0
    for i in range(n):
      if expensive(i, 0): # nested-if
        if expensive(i, 1):
          sum += i
    print sum

If this was pure Python, the performance of test1() and test2() would be the same. However with Cython, test1() was twice as slow as test2().

Looking at app.c:

  //    if expensive(i, 0) and expensive(i, 1):     # <<<<<<<<<<<<<<
      if (__pyx_f_3app_expensive(__pyx_v_i, 0)) {
        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
      } else {
        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
      }

and

  //    if expensive(i, 0):     # <<<<<<<<<<<<<<
      __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
      if (__pyx_t_2) {

  //      if expensive(i, 1):     # <<<<<<<<<<<<<<
        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
        if (__pyx_t_2) {

For the short-circuit version, the first test is sometimes evaluated twice. For the nested-if version, it is evaluated exactly once.

Q: For the short-circuit, would the right behaviour be to store the result of the first evaluation in a temporary variable instead?

I am new to Cython so pardon me if this is not the right place to ask. If a patch is preferred, I can give it a try although I will take some time to get familiar with the code.

Swee Heng
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://codespeak.net/pipermail/cython-dev/attachments/20090921/92fd6de2/attachment-0001.htm

From greg.ewing at canterbury.ac.nz Mon Sep 21 03:32:20 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 21 Sep 2009 13:32:20 +1200
Subject: [Cython] Short circuit slower than nested if
In-Reply-To:
References:
Message-ID: <4AB6D7A4.6030303@canterbury.ac.nz>

Tan Swee Heng wrote:

> Looking at app.c:
>
>   //    if expensive(i, 0) and expensive(i, 1):     # <<<<<<<<<<<<<<
>       if (__pyx_f_3app_expensive(__pyx_v_i, 0)) {
>         __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
>       } else {
>         __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
>       }

> Q: For the short-circuit, would the right behaviour be to store the
> result of the first evaluation in a temporary variable instead?

I'd say so. Pyrex generates this:

     __pyx_5 = __pyx_f_3app_expensive(__pyx_v_i,0);
     if (__pyx_5) {
       __pyx_5 = __pyx_f_3app_expensive(__pyx_v_i,1);
     }

--
Greg

From dagss at student.matnat.uio.no Mon Sep 21 08:33:00 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: 21 Sep 2009 08:33:00 +0200
Subject: [Cython] Short circuit slower than nested if
Message-ID: <3336366795.144255@smtp.netcom.no>

This is a bug. But it looks like something which might be fixed in -unstable. Could you get cython-unstable from http://hg.cython.org/ and see if there's still a problem?

Dag Sverre Seljebotn
-----Original Message-----
From: Tan Swee Heng
Date: Sunday, Sep 20, 2009 11:30 pm
Subject: [Cython] Short circuit slower than nested if
To: cython-dev at codespeak.net
Reply-To: cython-dev at codespeak.net

Hi, I've search the archives for 'short circuit' but did not find an answer. So I am posting here.
>
>Consider the following code (app.pyx):
>
>  from hashlib import sha1
>
>  cdef long expensive(long n, long k):
>    return long(sha1('%s:%s' % (n, k)).hexdigest()[:4], base=16) < 256
>
>  def test1(long n):
>    cdef long i, sum = 0
>    for i in range(n):
>      if expensive(i, 0) and expensive(i, 1): # short-circuit
>        sum += i
>    print sum
>
>  def test2(long n):
>    cdef long i, sum = 0
>    for i in range(n):
>      if expensive(i, 0): # nested-if
>        if expensive(i, 1):
>          sum += i
>    print sum
>
>If this was pure Python, the performance of test1() and test2() would be the same. However with Cython, test1() was twice as slow as test2().
>
>Looking at app.c:
>
>  //    if expensive(i, 0) and expensive(i, 1):     # <<<<<<<<<<<<<<
>      if (__pyx_f_3app_expensive(__pyx_v_i, 0)) {
>        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
>      } else {
>        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
>      }
>
>and
>
>  //    if expensive(i, 0):     # <<<<<<<<<<<<<<
>      __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
>      if (__pyx_t_2) {
>
>  //      if expensive(i, 1):     # <<<<<<<<<<<<<<
>        __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
>        if (__pyx_t_2) {
>
>For the short-circuit version, the first test is sometimes evaluated twice. For the nested-if version, it is evaluated exactly once.
>
>Q: For the short-circuit, would the right behaviour be to store the result of the first evaluation in a temporary variable instead?
>
>I am new to Cython so pardon me if this is not the right place to ask. If a patch is preferred, I can give it a try although I will take some time to get familiar with the code.
>
>Swee Heng
>

From robertwb at math.washington.edu Mon Sep 21 19:00:06 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 21 Sep 2009 10:00:06 -0700
Subject: [Cython] Short circuit slower than nested if
In-Reply-To: <4AB6D7A4.6030303@canterbury.ac.nz>
References: <4AB6D7A4.6030303@canterbury.ac.nz>
Message-ID: <15CD7B24-7587-4DE0-AF94-587E287D2D4D@math.washington.edu>

On Sep 20, 2009, at 6:32 PM, Greg Ewing wrote:

> Tan Swee Heng wrote:
>
>> Looking at app.c:
>>
>>   //    if expensive(i, 0) and expensive(i, 1):     # <<<<<<<<<<<<<<
>>       if (__pyx_f_3app_expensive(__pyx_v_i, 0)) {
>>         __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1);
>>       } else {
>>         __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0);
>>       }
>
>> Q: For the short-circuit, would the right behaviour be to store the
>> result of the first evaluation in a temporary variable instead?
>
> I'd say so. Pyrex generates this:
>
>      __pyx_5 = __pyx_f_3app_expensive(__pyx_v_i,0);
>      if (__pyx_5) {
>        __pyx_5 = __pyx_f_3app_expensive(__pyx_v_i,1);
>      }

Yep, a bug for sure. It is not just a missed optimization: if expensive() has side effects, then the generated code disobeys Python semantics.

- Robert

From Chris.Barker at noaa.gov Mon Sep 21 20:13:49 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 21 Sep 2009 11:13:49 -0700
Subject: [Cython] numpy array declaration problem
In-Reply-To: <4AB67EC3.3060504@gmail.com>
References: <3336317476.81509@smtp.netcom.no> <4AB67EC3.3060504@gmail.com>
Message-ID: <4AB7C25D.9000909@noaa.gov>

Jean-Francois Moulin wrote:
> What I have is a big 3d L array for which I need to increment a single
> element by one at each call

Something like this?

def MyFun(arr, i, j, k):
    arr[i,j,k] += 1

If so, then you might as well keep it in Python.

That little operation must be inside a loop of some sort. The goal is to move the whole loop into Cython. Then you can pass in the array, and Cython can convert your code into nice, fast C-type indexing.
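
The difference between the two structures can be sketched in plain Python. This is only an illustration under stated assumptions: `increment_one`, `increment_all`, and the flat `counts` list (a stand-in for the 3-D NumPy array) are hypothetical names, not taken from the original script.

```python
def increment_one(counts, shape, i, j, k):
    """Per-event version: the caller pays one function call per increment."""
    _, ny, nz = shape
    counts[(i * ny + j) * nz + k] += 1


def increment_all(counts, shape, indices):
    """Batched version: the whole loop lives inside a single function.

    This is the shape of function one would type and compile with Cython,
    so that the call overhead is paid once rather than once per event.
    """
    _, ny, nz = shape
    for i, j, k in indices:
        # Row-major flattening of the (i, j, k) index.
        counts[(i * ny + j) * nz + k] += 1


shape = (2, 3, 4)
events = [(0, 0, 0), (1, 2, 3), (1, 2, 3), (0, 1, 2)]

per_call = [0] * (2 * 3 * 4)
for i, j, k in events:
    increment_one(per_call, shape, i, j, k)

batched = [0] * (2 * 3 * 4)
increment_all(batched, shape, events)
```

Both versions produce the same counts; only the second gives a compiler a loop to optimize. In Cython the batched function would presumably take the array as an `np.ndarray[DTYPE_t, ndim=3]` argument of a `def` function, as discussed earlier in this thread.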
> So, for now I > can separate more sophisticated operations and perform them later once > the array is finalized... I can thus build my array as a huge list of > lists Why do you need to build it as lists? Usually you only need to do this if you don't know how big it's going to be when you start. > Or is > it worth (possible?) to use the array.array object of python. I don't think that buys you anything over numpy arrays, and you lose a lot! Perhaps a bit more description of your problem, and/or a stripped down pure-python version for us to comment on. I suspect that you could improve your performance a lot just by using numpy optimally (which means a post to the numpy list may be in order) NOTE: when faced with issues like this, you'll get the best results from a list if you post your problem, rather than a solution you are trying -- If can can pose your problem succinctly and clearly, the odds are good someone may have a better solution that the one you've come up with! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jfrancis18 at gmail.com Mon Sep 21 21:11:03 2009 From: jfrancis18 at gmail.com (Jean-Francois Moulin) Date: Mon, 21 Sep 2009 21:11:03 +0200 Subject: [Cython] numpy array declaration problem In-Reply-To: <4AB7C25D.9000909@noaa.gov> References: <3336317476.81509@smtp.netcom.no> <4AB67EC3.3060504@gmail.com> <4AB7C25D.9000909@noaa.gov> Message-ID: <4AB7CFC7.9080801@gmail.com> Yes, a single value of the array is incremented at each iteration of a many many times loop (I 'll describe it below) My aim was as you point out to have the main content of this loop be optimized and tried also to have the fastest array ops as possible (why not after all). 
Later on, the full potential of numpy is used to analyse the array as a series of 2D images (filtering, extrema search, plotting, slicing ...).

My idea in going to lists was to have a type which Cython directly understands (or am I wrong there?); the array.array question goes in the same direction too, btw.

But ok... I got the point. No use moving away from numpy.

Now, since you asked, my problem is the following. I am working on a data analysis script to deal with time of flight (TOF) detection of neutrons with a 2D detector. Basically I need to analyse a huge series of events consisting of timing signals. Each time a neutron hits my detector I receive 5 signals: one of them tells me the arrival time, and the 4 others together tell me where the neutron hit the detector.
If you get bored here, skip the next paragraph!

4 signals for a 2D location means redundancy, which is more than useful to disentangle events which happen very close in time (and for which the 5 signals might be observed in a mixed order...). So in a loop I detect a clock signal, understand that a neutron just arrived, and then look for all signals on the other four channels which might be compatible with this clock signal (i.e. signals observed within a fixed time delay). If there are only four, bingo, I can locate the event; if I find more signals, trouble begins and I must start to look into timing details... Sometimes the electronics also misses events and I should try to reconstruct them as much as possible on the basis of the remaining info (thanks again to redundancy).

So basically a lookup into a large file and in the end increment the proper pixels in a series of 2D maps (1 map for each TOF slice which I consider according to my time resolution). Typically between 20 and 200 maps of ca. 350*350 pixels. Number of events to consider: somewhere between 100,000 and some millions. Raw data files between 50Mb (a joke or a bad sample) and 5Gb (an overkill measurement).
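
The accumulation step described above (one counter map per TOF slice, one pixel incremented per located event) might be sketched in plain Python as follows. This is a guess at the structure, not the actual script: it assumes the events have already been reduced to `(tof, x, y)` triples, and `make_maps`, `fill_maps`, and `slice_edges` are hypothetical names. Flat lists stand in for the NumPy array, and the standard-library `bisect` module is used to find the TOF slice.

```python
from bisect import bisect_right


def make_maps(n_slices, nx, ny):
    """Allocate one flat nx*ny counter list per TOF slice."""
    return [[0] * (nx * ny) for _ in range(n_slices)]


def fill_maps(maps, slice_edges, events, nx, ny):
    """Increment one pixel per event; return the number of dropped events.

    slice_edges: sorted TOF boundaries, len(maps) + 1 values; slice s
                 covers the half-open interval [edges[s], edges[s + 1]).
    events:      iterable of (tof, x, y) with integer pixel coordinates.
    """
    n_slices = len(maps)
    dropped = 0
    for tof, x, y in events:
        s = bisect_right(slice_edges, tof) - 1
        if not (0 <= s < n_slices and 0 <= x < nx and 0 <= y < ny):
            dropped += 1  # outside the time window or off the detector
            continue
        maps[s][x * ny + y] += 1
    return dropped


edges = [0.0, 10.0, 20.0]  # two TOF slices (assumed values)
events = [(5.0, 1, 2), (15.0, 0, 0), (5.0, 1, 2), (25.0, 0, 0), (-1.0, 0, 0)]
maps = make_maps(2, 3, 4)
n_dropped = fill_maps(maps, edges, events, 3, 4)
```

Keeping the whole event loop inside `fill_maps` is what would let a typed Cython version replace the list indexing with direct array access.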
One of the things I left to python is the histogramming itself, which I realise using bisect (I read this is already efficient C code, and I am inclined to believe it when I compare the speed to the first naive DIY attempts I made just for fun ;0) For the rest, Cython is helping a good deal with lists/tuple lookup and the simple arithmetics involved in the pixel position calculation. I just wanted to push it as far as possible. I also happen to use scipy on an everyday basis and I was curious to see it together with Cython (maybe for other tasks) So far... Thanks a lot for your comments JF Christopher Barker wrote: > Jean-Francois Moulin wrote: >> What I have is a big 3d L array for which I need to increment a single >> element by one at each call > > Something like this? > > > def MyFun(arr, i, j, k): > arr[i,j,k] += 1 > > > If so, the you might as well keep it in python. > > That little operation must be inside a some loop of some sort. The goal > is to move the whole loop into Cython. Then you can pass in the array, > and cython can convert your code into nice, fast C-type indexing. > > > > So, for now I >> can separate more sophisticated operations and perform them later once >> the array is finalized... I can thus build my array as a huge list of >> lists > > Why do you need to build it as lists? Usually you only need to do this > if you don't know how big it's going to be when you start. > >> Or is >> it worth (possible?) to use the array.array object of python. > > I don't think that buys you anything over numpy arrays, and you lose a lot! > > Perhaps a bit more description of your problem, and/or a stripped down > pure-python version for us to comment on. 
I suspect that you could
> improve your performance a lot just by using numpy optimally (which
> means a post to the numpy list may be in order)
>
> NOTE: when faced with issues like this, you'll get the best results from
> a list if you post your problem, rather than a solution you are trying
> -- If can can pose your problem succinctly and clearly, the odds are
> good someone may have a better solution that the one you've come up with!
>
>
> -Chris
>
>
>
>

From sturla at molden.no Tue Sep 22 04:50:29 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 22 Sep 2009 04:50:29 +0200
Subject: [Cython] Bug in np.ndarray?
Message-ID: <4AB83B75.7030809@molden.no>

I have discovered something peculiar in Cython 0.11.2.

Assume we have

    cdef np.ndarray[np.float64_t] y, z

and do something like

    return (y if (zi is None) else (y, z))

GCC complains about "assignment from incompatible pointer type" when compiling the generated C.

Whereas if I write this as

    if (zi is None):
        return y
    else:
        return (y,z)

GCC does not complain.

Is this a bug?

Sturla Molden

From robertwb at math.washington.edu Tue Sep 22 06:48:22 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 21 Sep 2009 21:48:22 -0700
Subject: [Cython] Bug in np.ndarray?
In-Reply-To: <4AB83B75.7030809@molden.no>
References: <4AB83B75.7030809@molden.no>
Message-ID:

On Sep 21, 2009, at 7:50 PM, Sturla Molden wrote:

>
> I have discovered something peculiar in Cython 0.11.2.
>
> Assume we have
>
>     cdef np.ndarray[np.float64_t] y, z
>
> and do something like
>
>     return (y if (zi is None) else (y, z))
>
> GCC complains about "assignment from incompatible pointer type" when
> compiling the generated C.

Is it just a warning, or does it not compile?

> Whereas if I write this as
>
>     if (zi is None):
>         return y
>     else:
>         return (y,z)
>
> GCC does not complain.
>
> Is this a bug?

Yes, it is.
There are way too many warnings when compiling the generated c code, and in my mind any such warning is a bug (though possibly a low-priority one). - Robert From robertwb at math.washington.edu Tue Sep 22 06:51:25 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 21 Sep 2009 21:51:25 -0700 Subject: [Cython] numpy array declaration problem In-Reply-To: <4AB7CFC7.9080801@gmail.com> References: <3336317476.81509@smtp.netcom.no> <4AB67EC3.3060504@gmail.com> <4AB7C25D.9000909@noaa.gov> <4AB7CFC7.9080801@gmail.com> Message-ID: <2AD12C02-B3BE-4A2A-A407-D54983FE78D1@math.washington.edu> Sounds like an interesting problem, and it also sounds like something that Cython could help a lot with. However, it'd be much easier to help you out if you posted a 10-20 line snippet of code that is your bottleneck and that you are trying to make faster. - Robert On Sep 21, 2009, at 12:11 PM, Jean-Francois Moulin wrote: > Yes, a single value of the array is incremented at each iteration of a > many many times loop (I 'll describe it below) > My aim was as you point out to have the main content of this loop be > optimized and tried also to have the fastest array ops as possible > (why > not after all). > Later on, the full numpy potentialities are used to analyse the > array as > a series of 2D images (filtering, extrema search, plotting, > slicing ...). > > > My idea to go to lists was to have a type which Cython directly > understands (or am I wrong there), the array.array questions goes > in the > same direction too btw. > > But ok... I got the point. No use to move away from numpy. > > Now, since you asked, my problem is the following I am working on a > data > analysis script to deal with time of flight (TOF) detection of > neutrons > with a 2d detector. Basically I need to analyse a huge series of > events > consisting of timing signals. 
Each time a neutron hits my detector I > receive 5 signals, one of which tells me the arrival time the 4 others > together, where did the neutron hit the detector. > If you get bored here, skip the next paragraph! > > 4 signals for 2 d location means redundancy, which is more than > useful > to desentangle envents which happen very close in time (and for which > the 5 signals might be observed in a mixed order...) > So in a loop I detect a clock signal, understand that a neutron just > arrived, I now look for all signals over the other four channels which > might be compatible with this clock signal (i.e. signals observed > within > a fixed time delay) . I they are only four bingo I can locate the > event, > if I found more signals, trouble begins and I must start to look into > timing details... Sometimes the electronics also misses events and I > should try to reconstruct them as much as possible on the basis of the > remaining info (thanks again to redundancy) > > So basically a lookup into a large file and in the end increment the > proper pixels in a series of 2d maps (1 map for each tof slice which I > consider according to my time resolution). Typically between 20 and > 200 > maps of ca 350*350 pixels. Number of events to consider somewhere > between 100.000 and some millions. Raw data files between 50Mb (a joke > or a bad sample) and 5Gb (an overkill measurement). > > One of the things I left to python is the histogramming itself, > which I > realise using bisect (I read this is already efficient C code, and > I am > inclined to believe it when I compare the speed to the first naive DIY > attempts I made just for fun ;0) > > For the rest, Cython is helping a good deal with lists/tuple lookup > and > the simple arithmetics involved in the pixel position calculation. I > just wanted to push it as far as possible. I also happen to use > scipy on > an everyday basis and I was curious to see it together with Cython > (maybe for other tasks) > > So far... 
> > Thanks a lot for your comments > > JF > > > Christopher Barker wrote: >> Jean-Francois Moulin wrote: >>> What I have is a big 3d L array for which I need to increment a >>> single >>> element by one at each call >> >> Something like this? >> >> >> def MyFun(arr, i, j, k): >> arr[i,j,k] += 1 >> >> >> If so, the you might as well keep it in python. >> >> That little operation must be inside a some loop of some sort. The >> goal >> is to move the whole loop into Cython. Then you can pass in the >> array, >> and cython can convert your code into nice, fast C-type indexing. >> >> >>> So, for now I >>> can separate more sophisticated operations and perform them later >>> once >>> the array is finalized... I can thus build my array as a huge >>> list of >>> lists >> >> Why do you need to build it as lists? Usually you only need to do >> this >> if you don't know how big it's going to be when you start. >> >>> Or is >>> it worth (possible?) to use the array.array object of python. >> >> I don't think that buys you anything over numpy arrays, and you >> lose a lot! >> >> Perhaps a bit more description of your problem, and/or a stripped >> down >> pure-python version for us to comment on. I suspect that you could >> improve your performance a lot just by using numpy optimally (which >> means a post to the numpy list may be in order) >> >> NOTE: when faced with issues like this, you'll get the best >> results from >> a list if you post your problem, rather than a solution you are >> trying >> -- If can can pose your problem succinctly and clearly, the odds are >> good someone may have a better solution that the one you've come >> up with! 
>> >> >> -Chris >> >> >> >> > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Tue Sep 22 07:31:54 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 21 Sep 2009 22:31:54 -0700 Subject: [Cython] New cython users group Message-ID: <3AE57B90-F53C-4BF2-9B35-B44C8352DE5E@math.washington.edu> I'd like to announce cython-users at googlegroups.com , a users list for non-development discussion that will hopefully be lower-trafic and more focused on users (as opposed to development). Of course, feel free to stay subscribed to both. - Robert From sturla at molden.no Tue Sep 22 08:30:14 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 22 Sep 2009 08:30:14 +0200 Subject: [Cython] Bug in np.ndarray? In-Reply-To: References: <4AB83B75.7030809@molden.no> Message-ID: <4AB86EF6.8020307@molden.no> Robert Bradshaw skrev: > Is it just a warning, or does it not compile? > It is a warning. S.M. From thesweeheng at gmail.com Tue Sep 22 08:53:04 2009 From: thesweeheng at gmail.com (Tan Swee Heng) Date: Tue, 22 Sep 2009 14:53:04 +0800 Subject: [Cython] Short circuit slower than nested if In-Reply-To: <3336366795.144255@smtp.netcom.no> References: <3336366795.144255@smtp.netcom.no> Message-ID: I tested it with cython-unstable-fe0733adf6f0. The bug seems to remain. Should I submit this as a bug on http://trac.cython.org/cython_trac? Swee Heng On Mon, Sep 21, 2009 at 2:33 PM, Dag Sverre Seljebotn < dagss at student.matnat.uio.no> wrote: > This is a bug. But it looks like something which might be fixed in > -unstable. Could you get cython-unstable from http://hg.cython.org/ and > see of there's still a problem? 
> > Dag Sverre Seljebotn > -----Original Message----- > From: Tan Swee Heng > Date: Sunday, Sep 20, 2009 11:30 pm > Subject: [Cython] Short circuit slower than nested if > To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net > > Hi, I've search the archives for 'short circuit' but did not find an > answer. So I am posting here. > > > >Consider the following code (app.pyx): > > > > from hashlib import sha1 > > > > cdef long expensive(long n, long k): > > return long(sha1('%s:%s' % (n, k)).hexdigest()[:4], base=16) < 256 > > > > def test1(long n): > > cdef long i, sum = 0 > > for i in range(n): > > if expensive(i, 0) and expensive(i, 1): # short-circuit > > sum += i > > print sum > > > > def test2(long n): > > cdef long i, sum = 0 > > for i in range(n): > > if expensive(i, 0): # nested-if > > if expensive(i, 1): > > sum += i > > print sum > > > >If this was pure Python, the performance of test1() and test2() would be > the same. However with Cython, test1() was twice as slow as test2(). > > > >Looking at app.c: > > > > // if expensive(i, 0) and expensive(i, 1): # > <<<<<<<<<<<<<< > > if (__pyx_f_3app_expensive(__pyx_v_i, 0)) { > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1); > > } else { > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0); > > } > > > >and > > > > // if expensive(i, 0): # <<<<<<<<<<<<<< > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0); > > if (__pyx_t_2) { > > > > // if expensive(i, 1): # <<<<<<<<<<<<<< > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1); > > if (__pyx_t_2) { > > > >For the short-circuit version, the first test is sometimes evaluated > twice. For the nested-if version, it is evaluated exactly once. > > > >Q: For the short-circuit, would the right behaviour be to store the result > of the first evaluation in a temporary variable instead? > > > >I am new to Cython so pardon me if this is not the right place to ask. 
If > a patch is preferred, I can give it a try although I will take some time to > get familiar with the code. > > > >Swee Heng > > > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://codespeak.net/pipermail/cython-dev/attachments/20090922/47c5daed/attachment.htm From robertwb at math.washington.edu Tue Sep 22 09:17:19 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 22 Sep 2009 00:17:19 -0700 Subject: [Cython] Short circuit slower than nested if In-Reply-To: References: <3336366795.144255@smtp.netcom.no> Message-ID: <6EAA7CFD-31D6-48BF-A961-362BC5EB92C6@math.washington.edu> On Sep 21, 2009, at 11:53 PM, Tan Swee Heng wrote: > I tested it with cython-unstable-fe0733adf6f0. The bug seems to > remain. > > Should I submit this as a bug on http://trac.cython.org/cython_trac? Yes, please do. > > > Swee Heng > > On Mon, Sep 21, 2009 at 2:33 PM, Dag Sverre Seljebotn > wrote: > This is a bug. But it looks like something which might be fixed in - > unstable. Could you get cython-unstable from http://hg.cython.org/ > and see of there's still a problem? > > Dag Sverre Seljebotn > -----Original Message----- > From: Tan Swee Heng > Date: Sunday, Sep 20, 2009 11:30 pm > Subject: [Cython] Short circuit slower than nested if > To: cython-dev at codespeak.netReply-To: cython-dev at codespeak.net > > Hi, I've search the archives for 'short circuit' but did not find > an answer. So I am posting here. 
> > > >Consider the following code (app.pyx): > > > > from hashlib import sha1 > > > > cdef long expensive(long n, long k): > > return long(sha1('%s:%s' % (n, k)).hexdigest()[:4], base=16) < > 256 > > > > def test1(long n): > > cdef long i, sum = 0 > > for i in range(n): > > if expensive(i, 0) and expensive(i, 1): # short-circuit > > sum += i > > print sum > > > > def test2(long n): > > cdef long i, sum = 0 > > for i in range(n): > > if expensive(i, 0): # nested-if > > if expensive(i, 1): > > sum += i > > print sum > > > >If this was pure Python, the performance of test1() and test2() > would be the same. However with Cython, test1() was twice as slow > as test2(). > > > >Looking at app.c: > > > > // if expensive(i, 0) and expensive(i, 1): # > <<<<<<<<<<<<<< > > if (__pyx_f_3app_expensive(__pyx_v_i, 0)) { > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1); > > } else { > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0); > > } > > > >and > > > > // if expensive(i, 0): # <<<<<<<<<<<<<< > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 0); > > if (__pyx_t_2) { > > > > // if expensive(i, 1): # <<<<<<<<<<<<<< > > __pyx_t_2 = __pyx_f_3app_expensive(__pyx_v_i, 1); > > if (__pyx_t_2) { > > > >For the short-circuit version, the first test is sometimes > evaluated twice. For the nested-if version, it is evaluated exactly > once. > > > >Q: For the short-circuit, would the right behaviour be to store > the result of the first evaluation in a temporary variable instead? > > > >I am new to Cython so pardon me if this is not the right place to > ask. If a patch is preferred, I can give it a try although I will > take some time to get familiar with the code. 
> > > >Swee Heng > > > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From seb.binet at gmail.com Tue Sep 22 09:21:13 2009 From: seb.binet at gmail.com (Sebastien Binet) Date: Tue, 22 Sep 2009 09:21:13 +0200 Subject: [Cython] New cython users group In-Reply-To: <3AE57B90-F53C-4BF2-9B35-B44C8352DE5E@math.washington.edu> References: <3AE57B90-F53C-4BF2-9B35-B44C8352DE5E@math.washington.edu> Message-ID: <200909220921.13308.binet@cern.ch> On Tuesday 22 September 2009 07:31:54 Robert Bradshaw wrote: > I'd like to announce cython-users at googlegroups.com , a users list for > non-development discussion that will hopefully be lower-trafic and > more focused on users (as opposed to development). Of course, feel > free to stay subscribed to both. FWIW, here is the direct link to subscribe (or just browse content): http://groups.google.com/group/cython-users hth, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay ######################################### From robertwb at math.washington.edu Tue Sep 22 10:14:42 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 22 Sep 2009 01:14:42 -0700 Subject: [Cython] Special methods: __iadd__ In-Reply-To: <20090907175532.E0DCD168014@codespeak.net> References: <20090907175532.E0DCD168014@codespeak.net> Message-ID: <0D72188C-2B0D-4A1E-BC0B-71D7F9F5B904@math.washington.edu> On Sep 7, 2009, at 10:51 AM, Philip Smith wrote: > Hi > > I am relatively new to Cython but making good progress I think. 
> However I have the following code (skeleton) related to a library
> I'm wrapping:
> [code]
> This compiles absolutely fine (under Mingw) but crashes at runtime.
>
> Any ideas?
>

Sorry no one's gotten back to you yet. I'm not sure what the issue is; it works fine on OS X (see code below). Perhaps you should be checking for None in your __iadd__?

- Robert

cdef class Foo:
    cdef int i
    def __init__(self, i):
        self.i = i
    def __repr__(self):
        return "Foo(%s)" % self.i
    def __iadd__(self, Foo other not None):
        self.i += other.i
        return self

cdef class Foo2(object):
    cdef Foo attribute
    def __init__(self, Foo a):
        self.attribute = a
    def __iadd__(self, Foo2 other not None):
        self.attribute+=other.attribute #This is defined in class foo and works fine there
        return self
    def __repr__(self):
        return "Foo2(%s)" % self.attribute

sage: from add import Foo, Foo2
sage: a = Foo2(Foo(3)); a
Foo2(Foo(3))
sage: b = Foo2(Foo(4)); b
Foo2(Foo(4))
sage: a += b
sage: a
Foo2(Foo(7))

From robertwb at math.washington.edu Tue Sep 22 11:19:19 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Tue, 22 Sep 2009 02:19:19 -0700
Subject: [Cython] Cython 0.11.3 beta is up
Message-ID: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu>

Based on http://hg.cython.org/cython-devel/shortlog/71980dd690eb

http://cython.org/Cython-0.11.3.beta0.tar.gz

- Robert

From ndbecker2 at gmail.com Tue Sep 22 16:29:40 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 22 Sep 2009 10:29:40 -0400
Subject: [Cython] New cython users group
References: <3AE57B90-F53C-4BF2-9B35-B44C8352DE5E@math.washington.edu>
Message-ID:

Robert Bradshaw wrote:

> I'd like to announce cython-users at googlegroups.com
> , a users list for non-development discussion that will hopefully be
> lower-trafic and more focused on users (as opposed to development). Of
> course, feel free to stay subscribed to both.
>
> - Robert

Not seen via gmane nntp interface, as of this time.
From magnus at hetland.org Wed Sep 23 18:33:57 2009 From: magnus at hetland.org (Magnus Lie Hetland) Date: Wed, 23 Sep 2009 18:33:57 +0200 Subject: [Cython] On the subject of pure mode In-Reply-To: <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no> References: <5E2C07CC-0995-450B-8562-63802D5D3AC9@hetland.org> <15a8b4b56647bbda474b39c912660463.squirrel@webmail.uio.no> Message-ID: <74DE6E20-7DFD-45E5-BCEF-51B54725FE57@hetland.org> On Sep 19, 2009, at 10:45, Dag Sverre Seljebotn wrote: > Magnus Lie Hetland wrote: [snip] >> IOW, I could just strip down the Cython code to get plain Python >> equivalents that I could use the dev-tools on. Is that just a silly >> idea? Would I be better off with pure mode? (Is there, perhaps, >> such a >> beast/processor somewhere? I'm assuming I could reuse the Cython >> parser if I needed to write something myself?) > > No, it's a good idea (though see below). The work is about 1/3 done I > think... Compile/CodeWriter.py contains a tree transform which will > serialize the tree back to Cython code which should be a good starting > point (it only supports parts of the language currently though) -- > it can > simply have a flag or subclass to skip any type declarations, output > cdef > classes as py classes etc. I fiddled a bit with this, but found that the task was a bit too daunting compared to my current needs. I only need to strip away certain syntactic elements -- and reconstructing the entire program based on a parse tree was a bit heavy. A cool thing to have as part of the Cython toolchain, certainly, but a bit too much work for me right now. What I ended up with for the moment is a very simple search/replace- based script that strips away the parts of the Cython syntax I'm currently using (preserving comments/whitespace, which the CodeWriter can't do either). 
Not sure how useful this is for others at the moment (as it only works with the most basic subset of functionality), but I guess I'll extend it as I use it in more of my code. (I also work with some others who will be using the same functionality.) If there's any interest, I'd be happy to make it available. (It's not exactly rocket science, though ;-) Maybe I'll have the capacity to look into a more elegant solution some time in the future... :-> - M -- Magnus Lie Hetland http://hetland.org From robertwb at math.washington.edu Thu Sep 24 09:20:04 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 24 Sep 2009 00:20:04 -0700 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> Message-ID: <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> On Sep 22, 2009, at 2:19 AM, Robert Bradshaw wrote: > Based on http://hg.cython.org/cython-devel/shortlog/71980dd690eb > > http://cython.org/Cython-0.11.3.beta0.tar.gz Baring any critical flaws, I plan to release this beta as 0.11.3 by the end of the week. - Robert From dagss at student.matnat.uio.no Thu Sep 24 09:47:44 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 24 Sep 2009 09:47:44 +0200 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> Message-ID: <4ABB2420.6010506@student.matnat.uio.no> Robert Bradshaw wrote: > On Sep 22, 2009, at 2:19 AM, Robert Bradshaw wrote: > > >> Based on http://hg.cython.org/cython-devel/shortlog/71980dd690eb >> >> http://cython.org/Cython-0.11.3.beta0.tar.gz >> > > Baring any critical flaws, I plan to release this beta as 0.11.3 by > the end of the week. 
> > I was hoping to push a small patch fixing a buffer bug today/tonight. I'd really like it in, but I think it should be very safe (and definitely only affects buffers). It's not a regression though... Hope that's OK? Dag Sverre From robertwb at math.washington.edu Thu Sep 24 10:06:28 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 24 Sep 2009 01:06:28 -0700 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: <4ABB2420.6010506@student.matnat.uio.no> References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> <4ABB2420.6010506@student.matnat.uio.no> Message-ID: <4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu> On Sep 24, 2009, at 12:47 AM, Dag Sverre Seljebotn wrote: > Robert Bradshaw wrote: >> On Sep 22, 2009, at 2:19 AM, Robert Bradshaw wrote: >> >> >>> Based on http://hg.cython.org/cython-devel/shortlog/71980dd690eb >>> >>> http://cython.org/Cython-0.11.3.beta0.tar.gz >>> >> >> Baring any critical flaws, I plan to release this beta as 0.11.3 by >> the end of the week. >> >> > I was hoping to push a small patch fixing a buffer bug today/tonight. > I'd really like it in, but I think it should be very safe (and > definitely only affects buffers). It's not a regression though... Hope > that's OK? Sure, no problem. I'll cut an rc tomorrow with your stuff, and then actually release Friday or Saturday so anyone who cares (but not enough to pull from -devel) can try it out. 
- Robert

From dalcinl at gmail.com Thu Sep 24 16:05:33 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 24 Sep 2009 11:05:33 -0300
Subject: [Cython] Cython 0.11.3 beta is up
In-Reply-To: <4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu>
References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu>
	<00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu>
	<4ABB2420.6010506@student.matnat.uio.no>
	<4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu>
Message-ID:

Robert, I would like to push this, in order to (hopefully) stop the
endless pain of WinDog users wanting to use MinGW.

IIRC, we discussed this in the past and agreed that distutils
config files (in Python's site-packages and user's $HOME) should be
honored, but NOT a bare 'setup.cfg' in the current working
directory... Well, here you have it... What do you think?

diff -r 71980dd690eb pyximport/pyxbuild.py
--- a/pyximport/pyxbuild.py	Tue Sep 22 02:13:13 2009 -0700
+++ b/pyximport/pyxbuild.py	Thu Sep 24 10:57:17 2009 -0300
@@ -55,6 +55,11 @@
     build = dist.get_command_obj('build')
     build.build_base = pyxbuild_dir
 
+    config_files = dist.find_config_files()
+    try: config_files.remove('setup.cfg')
+    except ValueError: pass
+    dist.parse_config_files(config_files)
+
     try:
         ok = dist.parse_command_line()
     except DistutilsArgError:


On Thu, Sep 24, 2009 at 5:06 AM, Robert Bradshaw wrote:
> On Sep 24, 2009, at 12:47 AM, Dag Sverre Seljebotn wrote:
>
>> Robert Bradshaw wrote:
>>> On Sep 22, 2009, at 2:19 AM, Robert Bradshaw wrote:
>>>
>>>
>>>> Based on http://hg.cython.org/cython-devel/shortlog/71980dd690eb
>>>>
>>>> http://cython.org/Cython-0.11.3.beta0.tar.gz
>>>>
>>>
>>> Barring any critical flaws, I plan to release this beta as 0.11.3 by
>>> the end of the week.
>>>
>>>
>> I was hoping to push a small patch fixing a buffer bug today/tonight.
>> I'd really like it in, but I think it should be very safe (and
>> definitely only affects buffers). It's not a regression though...
Hope >> that's OK? > > Sure, no problem. I'll cut an rc tomorrow with your stuff, and then > actually release Friday or Saturday so anyone who cares (but not > enough to pull from -devel) can try it out. > > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu Sep 24 21:24:37 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 24 Sep 2009 16:24:37 -0300 Subject: [Cython] C tracing enabled by default? Message-ID: Are all you sure this should by the default ?? #ifndef CYTHON_TRACING #define CYTHON_TRACING 1 #endif -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Thu Sep 24 21:59:45 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 24 Sep 2009 21:59:45 +0200 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> <4ABB2420.6010506@student.matnat.uio.no> <4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu> Message-ID: <4ABBCFB1.8080108@student.matnat.uio.no> Lisandro Dalcin wrote: > Robert, I would like to push this, in order to (hopefully) stop the > endless pain of WinDog users wanting to use MinGW. 
> > IIRC, we discussed about this in the past and agreed that distutils > config files (in Python's site-packages and user's $HOME) should be > honored, but NOT a bare 'setup.cfg' in the current working > directory... Well, here you have... What do you think? > > diff -r 71980dd690eb pyximport/pyxbuild.py > --- a/pyximport/pyxbuild.py Tue Sep 22 02:13:13 2009 -0700 > +++ b/pyximport/pyxbuild.py Thu Sep 24 10:57:17 2009 -0300 > @@ -55,6 +55,11 @@ > build = dist.get_command_obj('build') > build.build_base = pyxbuild_dir > > + config_files = dist.find_config_files() > + try: config_files.remove('setup.cfg') > + except ValueError: pass > + dist.parse_config_files(config_files) > + > try: > ok = dist.parse_command_line() > except DistutilsArgError: > > +1 from me. -- Dag Sverre From robertwb at math.washington.edu Thu Sep 24 22:24:13 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 24 Sep 2009 13:24:13 -0700 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: <4ABBCFB1.8080108@student.matnat.uio.no> References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> <4ABB2420.6010506@student.matnat.uio.no> <4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu> <4ABBCFB1.8080108@student.matnat.uio.no> Message-ID: <3246F2A5-5D5F-4203-9ACB-B0F546A0DEE6@math.washington.edu> On Sep 24, 2009, at 12:59 PM, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: >> Robert, I would like to push this, in order to (hopefully) stop the >> endless pain of WinDog users wanting to use MinGW. >> >> IIRC, we discussed about this in the past and agreed that distutils >> config files (in Python's site-packages and user's $HOME) should be >> honored, but NOT a bare 'setup.cfg' in the current working >> directory... Well, here you have... What do you think? 
>> >> diff -r 71980dd690eb pyximport/pyxbuild.py >> --- a/pyximport/pyxbuild.py Tue Sep 22 02:13:13 2009 -0700 >> +++ b/pyximport/pyxbuild.py Thu Sep 24 10:57:17 2009 -0300 >> @@ -55,6 +55,11 @@ >> build = dist.get_command_obj('build') >> build.build_base = pyxbuild_dir >> >> + config_files = dist.find_config_files() >> + try: config_files.remove('setup.cfg') >> + except ValueError: pass >> + dist.parse_config_files(config_files) >> + >> try: >> ok = dist.parse_command_line() >> except DistutilsArgError: >> >> > > +1 from me. Yep, sounds like a good idea to me to. Go ahead and push. - Robert From robertwb at math.washington.edu Thu Sep 24 22:25:24 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 24 Sep 2009 13:25:24 -0700 Subject: [Cython] C tracing enabled by default? In-Reply-To: References: Message-ID: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu> On Sep 24, 2009, at 12:24 PM, Lisandro Dalcin wrote: > Are all you sure this should by the default ?? > > #ifndef CYTHON_TRACING > #define CYTHON_TRACING 1 > #endif No, I'm not, but I'm not sure it should be off by default either. Anyone else have any input? I guess I should run some more benchmarks at least. - Robert From dalcinl at gmail.com Thu Sep 24 23:38:48 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 24 Sep 2009 18:38:48 -0300 Subject: [Cython] C tracing enabled by default? In-Reply-To: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu> References: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu> Message-ID: Wait a minute... I think the problem is not in the C code, but it is actually a bug in Cython ... See the patch below... 

diff -r 71980dd690eb Cython/Compiler/Nodes.py
--- a/Cython/Compiler/Nodes.py	Tue Sep 22 02:13:13 2009 -0700
+++ b/Cython/Compiler/Nodes.py	Thu Sep 24 18:36:00 2009 -0300
@@ -1062,7 +1062,7 @@
         is_getbuffer_slot = (self.entry.name == "__getbuffer__" and
                              self.entry.scope.is_c_class_scope)
 
-        if code.globalstate.directives['profile'] is None:
+        if code.globalstate.directives['profile']:
             profile = 'inline' not in self.modifiers and not lenv.nogil
         else:
             profile = code.globalstate.directives['profile']
diff -r 71980dd690eb Cython/Compiler/Options.py
--- a/Cython/Compiler/Options.py	Tue Sep 22 02:13:13 2009 -0700
+++ b/Cython/Compiler/Options.py	Thu Sep 24 18:36:00 2009 -0300
@@ -67,7 +67,7 @@
     'wraparound' : True,
     'c99_complex' : False, # Don't use macro wrappers for complex arith, not sure what to name this...
     'callspec' : "",
-    'profile': None,
+    'profile': False,
 }
 
 # Override types possibilities above, if needed


On Thu, Sep 24, 2009 at 5:25 PM, Robert Bradshaw wrote:
> On Sep 24, 2009, at 12:24 PM, Lisandro Dalcin wrote:
>
>> Are you all sure this should be the default?
>>
>> #ifndef CYTHON_TRACING
>> #define CYTHON_TRACING 1
>> #endif
>
> No, I'm not, but I'm not sure it should be off by default either.
> Anyone else have any input? I guess I should run some more benchmarks
> at least.
> > - Robert > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu Sep 24 23:45:12 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 24 Sep 2009 18:45:12 -0300 Subject: [Cython] Cython 0.11.3 beta is up In-Reply-To: <3246F2A5-5D5F-4203-9ACB-B0F546A0DEE6@math.washington.edu> References: <868C7868-EC5A-4E6B-BA8E-F81AB6F72F13@math.washington.edu> <00E43783-8D83-42C2-9F90-F23486D89ACF@math.washington.edu> <4ABB2420.6010506@student.matnat.uio.no> <4762943F-A4D6-4F41-8DD4-9126FFE91B6D@math.washington.edu> <4ABBCFB1.8080108@student.matnat.uio.no> <3246F2A5-5D5F-4203-9ACB-B0F546A0DEE6@math.washington.edu> Message-ID: Pushed to cython-devel http://hg.cython.org/cython-devel/rev/2ffa48b4073a On Thu, Sep 24, 2009 at 5:24 PM, Robert Bradshaw wrote: > On Sep 24, 2009, at 12:59 PM, Dag Sverre Seljebotn wrote: > >> Lisandro Dalcin wrote: >>> Robert, I would like to push this, in order to (hopefully) stop the >>> endless pain of WinDog users wanting to use MinGW. >>> >>> IIRC, we discussed about this in the past and agreed that distutils >>> config files (in Python's site-packages and user's $HOME) should be >>> honored, but NOT a bare 'setup.cfg' in the current working >>> directory... Well, here you have... What do you think? >>> >>> diff -r 71980dd690eb pyximport/pyxbuild.py >>> --- a/pyximport/pyxbuild.py ?Tue Sep 22 02:13:13 2009 -0700 >>> +++ b/pyximport/pyxbuild.py ?Thu Sep 24 10:57:17 2009 -0300 >>> @@ -55,6 +55,11 @@ >>> ? ? ?build = dist.get_command_obj('build') >>> ? ? 
?build.build_base = pyxbuild_dir >>> >>> + ? ?config_files = dist.find_config_files() >>> + ? ?try: config_files.remove('setup.cfg') >>> + ? ?except ValueError: pass >>> + ? ?dist.parse_config_files(config_files) >>> + >>> ? ? ?try: >>> ? ? ? ? ?ok = dist.parse_command_line() >>> ? ? ?except DistutilsArgError: >>> >>> >> >> +1 from me. > > Yep, sounds like a good idea to me to. Go ahead and push. > > - Robert > > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Thu Sep 24 23:50:18 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 24 Sep 2009 18:50:18 -0300 Subject: [Cython] C tracing enabled by default? In-Reply-To: References: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu> Message-ID: Sorry, I was not clear enough... if 'profile' directive is of type 'bool', but initialized to None, it seems that at some point Dag's transform machinery puts a 'False' there, so the "directives['profile'] is None" does not work as expected... So, in short, the 'profile' directive should be 'False' by default; but if enabled, the macro in the C code should definitely default to 1, as currently done. On Thu, Sep 24, 2009 at 6:38 PM, Lisandro Dalcin wrote: > Wait a minute... I think the problem is not in the C code, but it is > actually a bug in Cython ... See the patch below... > > diff -r 71980dd690eb Cython/Compiler/Nodes.py > --- a/Cython/Compiler/Nodes.py ?Tue Sep 22 02:13:13 2009 -0700 > +++ b/Cython/Compiler/Nodes.py ?Thu Sep 24 18:36:00 2009 -0300 > @@ -1062,7 +1062,7 @@ > ? ? ? ? 
is_getbuffer_slot = (self.entry.name == "__getbuffer__" and > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?self.entry.scope.is_c_class_scope) > > - ? ? ? ?if code.globalstate.directives['profile'] is None: > + ? ? ? ?if code.globalstate.directives['profile']: > ? ? ? ? ? ? profile = 'inline' not in self.modifiers and not lenv.nogil > ? ? ? ? else: > ? ? ? ? ? ? profile = code.globalstate.directives['profile'] > diff -r 71980dd690eb Cython/Compiler/Options.py > --- a/Cython/Compiler/Options.py ? ? ? ?Tue Sep 22 02:13:13 2009 -0700 > +++ b/Cython/Compiler/Options.py ? ? ? ?Thu Sep 24 18:36:00 2009 -0300 > @@ -67,7 +67,7 @@ > ? ? 'wraparound' : True, > ? ? 'c99_complex' : False, # Don't use macro wrappers for complex > arith, not sure what to name this... > ? ? 'callspec' : "", > - ? ?'profile': None, > + ? ?'profile': False, > ?} > > ?# Override types possibilities above, if needed > > > On Thu, Sep 24, 2009 at 5:25 PM, Robert Bradshaw > wrote: >> On Sep 24, 2009, at 12:24 PM, Lisandro Dalcin wrote: >> >>> Are all you sure this should by the default ?? >>> >>> #ifndef CYTHON_TRACING >>> #define CYTHON_TRACING 1 >>> #endif >> >> No, I'm not, but I'm not sure it should be off by default either. >> Anyone else have any input? I guess I should run some more benchmarks >> at least. 
>>
>> - Robert
>>
>> _______________________________________________
>> Cython-dev mailing list
>> Cython-dev at codespeak.net
>> http://codespeak.net/mailman/listinfo/cython-dev
>>
>
>
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
>



--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Fri Sep 25 00:16:36 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 24 Sep 2009 15:16:36 -0700
Subject: [Cython] C tracing enabled by default?
In-Reply-To:
References: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu>
Message-ID:

On Sep 24, 2009, at 2:50 PM, Lisandro Dalcin wrote:

> Sorry, I was not clear enough... if 'profile' directive is of type
> 'bool', but initialized to None, it seems that at some point Dag's
> transform machinery puts a 'False' there,

Ah.

> so the
> "directives['profile'] is None" does not work as expected...
>
> So, in short, the 'profile' directive should be 'False' by default;
> but if enabled, the macro in the C code should definitely default to
> 1, as currently done.

Actually, None (the default) was a special case--enabled for
non-inline non-nogil functions. It requires the GIL, so we need that at
least, and I disabled it for inline functions for performance reasons.

Maybe rather than True/False, it should be an int with different
thresholds? If so, what should they be? Should any level be on by
default? (I think so.)
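
The None-versus-False problem described in this exchange can be reproduced in a few lines of plain Python. This is only a sketch under stated assumptions: `coerce_bool_directive` is a hypothetical stand-in for whatever step turns the bool-typed directive into `False`, and the `should_profile_*` functions and the `DEFAULT` sentinel are made-up names, not Cython's actual code. It shows why an `is None` test stops selecting the "default" rule once the value has been coerced, and one possible way around that.

```python
# Hypothetical sketch of the tri-state default pitfall; not Cython's real code.

def coerce_bool_directive(value):
    """Stand-in for a step that normalizes a bool-typed directive."""
    return bool(value)  # None silently becomes False here


def should_profile_fragile(directive, is_inline, is_nogil):
    """Treats None as 'apply the default rule', like the `is None` check."""
    if directive is None:
        return not is_inline and not is_nogil
    return directive


DEFAULT = object()  # explicit sentinel meaning 'no value was given'


def should_profile_robust(directive, is_inline, is_nogil):
    """Same rule, but the default is a sentinel tested before any coercion."""
    if directive is DEFAULT:
        return not is_inline and not is_nogil
    return bool(directive)


# The default rule applies to an untouched None, but is lost after coercion.
untouched = should_profile_fragile(None, is_inline=False, is_nogil=False)
coerced = should_profile_fragile(coerce_bool_directive(None),
                                 is_inline=False, is_nogil=False)
print(untouched, coerced)
```

An integer level, as floated in the thread, would sidestep the issue in a different way, since every state then has an explicit value of its own.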

- Robert

From dalcinl at gmail.com Fri Sep 25 00:37:58 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 24 Sep 2009 19:37:58 -0300
Subject: [Cython] C tracing enabled by default?
In-Reply-To:
References: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu>
Message-ID:

On Thu, Sep 24, 2009 at 7:16 PM, Robert Bradshaw wrote:
> On Sep 24, 2009, at 2:50 PM, Lisandro Dalcin wrote:
>
>> Sorry, I was not clear enough... if 'profile' directive is of type
>> 'bool', but initialized to None, it seems that at some point Dag's
>> transform machinery puts a 'False' there,
>
> Ah.
>
>> so the
>> "directives['profile'] is None" does not work as expected...
>>
>> So, in short, the 'profile' directive should be 'False' by default;
>> but if enabled, the macro in the C code should definitely default to
>> 1, as currently done.
>
> Actually, None (the default) was a special case--enabled for non-
> inline non-nogil functions. It requires the GIL, so we need that at
> least, and I disabled it for inline functions for performance reasons.
>

Ah! I see... but that way is broken...

> Maybe rather than True/False, it should be an int with different
> thresholds?

I think so.

>
> If so, what should they be?
>

What about this?:

0: do not profile at all
1: profile only 'def' functions
2: profile 'def' and 'cdef' but not cdef+inline
>=3: profile all, i.e. def, cdef, and cdef+inline.

You could also merge 1 and 2 if the distinction is not worth it.

BTW, we could also allow the profiling support C code to be generated,
but with the define below by default:

#ifndef CYTHON_TRACING
#define CYTHON_TRACING 0
#endif

That way we could define both a default level of tracing (1, 2, 3) and
how the C code is compiled by default... Just throwing out some ideas...

>
> Should any level be on by
> default? (I think so.)
>

I personally prefer that profile be off by default (level 0 in my
proposal above), but as always, I do not care too much about defaults
provided that I can change them :-)

Anyway, until you can do some more benchmarking (or in case you do not
have the time for it), I think you should disable the beast, so that it
is not unintentionally enabled in the final 0.11.3 release...

> - Robert
>
> _______________________________________________
> Cython-dev mailing list
> Cython-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/cython-dev
>



--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Fri Sep 25 01:32:59 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 24 Sep 2009 16:32:59 -0700
Subject: [Cython] C tracing enabled by default?
In-Reply-To:
References: <6B020482-D3D5-43DD-94A1-B1FAA12FDCE4@math.washington.edu>
Message-ID:

On Sep 24, 2009, at 3:37 PM, Lisandro Dalcin wrote:

> On Thu, Sep 24, 2009 at 7:16 PM, Robert Bradshaw
> wrote:
>> On Sep 24, 2009, at 2:50 PM, Lisandro Dalcin wrote:
>>
>>> Sorry, I was not clear enough... if 'profile' directive is of type
>>> 'bool', but initialized to None, it seems that at some point Dag's
>>> transform machinery puts a 'False' there,
>>
>> Ah.
>>
>>> so the
>>> "directives['profile'] is None" does not work as expected...
>>>
>>> So, in short, the 'profile' directive should be 'False' by default;
>>> but if enabled, the macro in the C code should definitely
>>> default to
>>> 1, as currently done.
>>
>> Actually, None (the default) was a special case--enabled for non-
>> inline non-nogil functions.
It requires the GIL, so we need that at >> least, and I disabled it for inline functions for performance >> reasons. >> > > Ah! I see... but that way is broken... > >> Maybe rather than True/False, it should be an int with different >> thresholds? > > I think so. > >> >> If so, what should they be? >> > > What about this?: > > 0: