From robertwb at math.washington.edu Tue Dec 1 04:09:38 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 30 Nov 2009 19:09:38 -0800 Subject: [Cython] Another string encoding idea In-Reply-To: <4B12A58C.7020008@behnel.de> References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> Message-ID: <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> Just to clarify discussion, here is what I'm proposing (which is still in flux, and simplified due to memory issues, which does make it less attractive as one does not get to choose the used encoding, but it would always be UTF-8 in Py3). Without directive(s) (as it is now): char* <-> bytes With the directive(s) (which can be applied locally or globally): char* <-> str unicode/bytes -> char* would also work (for Py2/Py3 respectively) The encoding used would be the system default (in Py2) and UTF-8 (in Py3). This would use the defenc slot so the encoded char* would be valid as long as the unicode object is around, and the long term future of the defenc slot needs to be ensured before this could be used for non-arguments conversion. Also out there is the idea of a directive that would make char* become unicode in both Py2 and Py3. On Nov 29, 2009, at 8:47 AM, Stefan Behnel wrote: > Robert Bradshaw, 28.11.2009 22:12: >> My personal concern is the pain I see porting Sage to Py3. I'd have >> to >> go through the codebase and throw in encodes() and decodes() and >> change signatures of functions that take char* arguments > > That's what I figured. Instead of having to fix up the code, you > want a > do-what-I-mean str data type that unifies everything that's unicode, > bytes > and char*, and that magically handles it all for you. Exactly. Improve the compiler rather than change the code. 
> In that case, you should drop the argument of Pyrex compatibility > for now, > because I don't think you can have a Cython specific hyper-versatile > data > type with automatic memory management and all that, while staying > compatible to the simple str/bytes type in Pyrex - even if we manage > to get > it working without new syntax. Just because I don't need Pyrex compatibility doesn't mean it isn't a worthwhile goal (though that was the main point of the previous thread, not this one). > We'd clearly break a lot of existing Pyrex/Cython code by starting > to coerce char* to unicode, for example. Only if the directive was enabled, and perhaps only in Py3. Existing code wouldn't break. >> (which, I just realized, will be a step backwards for cpdef >> functions). > > True. For cpdef functions, a char* parameter would be well-defined > as long > as user code doesn't use different encodings for char* internally > (which is > somewhat unlikely). > > Ok, let's think this through. There's two different scenarios. One > deals > with function signatures (strings going in and out), the other one > deals > with conversion on assignments or casts. I think it's easier if the Python to C and C to Python conversions are uniform whether it happen via to coercion, assignment, or function signature constraints. Then the question is what objects can be turned into a char* (the directive would add unicode) and what object does char* turn into (the directive would create str in Py2 and Py3). Function arguments typed as char* are a particularly useful case though, and it would be nice to make this friendlier for Py3. > In total, there are three cases: > accepting bytes/str/unicode in a str/bytes/char* signature, coercing > str/unicode to char*, and coercing char* to bytes or unicode. > > Function signatures have two sides to them that are not symmetric. 
> One is > that you want your string accepting functions to be agnostic about > the type > of string that comes in (although you may or may not want to have > control > about memory usage if you use char* in the signature), and the other > side > is that you want some string to go back out, which you may want to > be a > Py2-str (read: bytes) or a unicode string (maybe in Py2 and > definitely in > Py3). Remember that if your code originally couldn't handle unicode, > there's likely to be more code that can't handle it, either, so you > wouldn't want your hyper-versatile type to always turn into unicode. You're right, it would be nice to be able to return a str in both Py2 and Py3, which neither "return some_c_string" nor "return some_c_string.decode(...)" will do. > 1) Passing unicode strings into a function that expects char* means > that > some kind of encoding must happen and a new Python bytes object must > be > created on the fly. The input object isn't a problem here as the > caller > holds a reference to it anyway. The encoded object, however, must > have a > lifetime. Looking at buffer arguments, I wouldn't mind if that was the > lifetime of the function call itself. After all, it's the user's > choice to > use char* instead of str/bytes/unicode. So the case of a parameter > typed as > char* is actually easy to handle from a memory POV, given that some > kind of > automatic encoding is in place. > > 2) Automatic encoding for an assignment from unicode to char* is > tricky, > because you can't easily make assumptions about the lifetime of the > unicode > object itself. You could get away with a weak-ref mapping from unicode > strings to their byte encoded representation. I think every other > attempt > to keep track of the lifetime of the unicode object is futile in > current > Cython. 
Think of code like this, which I would expect to work: > > cdef unicode u = u"abcdefg" > cdef char* s1 = u > u2 = u > cdef char* s2 = u2 > u = None > print u, u2, s1, s2 > > So supporting automatic unicode->char* coercion on assignments is > really > hard to do internally. As Greg pointed out, this would have *exactly* the same semantics as we now have for cdef bytes b = b"abcdefg" cdef char* s1 = b b2 = b cdef char* s2 = b2 b = None print b, b2, s1, s2 using the defenc slot. > > 3) The third case is the same for both sigs and assignments: automatic > decoding of char* to unicode vs. instantiation of a bytes object, > i.e. the > following should do The Right Thing: > > cdef char* some_c_string = ... > some_python_name = some_c_string > > This would be heavily simplified if some_python_name was typed as > either > bytes or unicode (the latter of which might fail due to decoding > errors), > and even str would work if it did different things in Py2 and Py3 > (with > potential decoding errors only in Py3). However, that won't work for > untyped return values of def functions. This is not really about assignment, it's about coercion. If we declare some_python_name = some_c_string to always have the same meaning as some_python_name = some_c_string then the meaning of some_c_string and some_c_string are clear, and some_c_string is the only ambiguity, and the directive would control what some_c_string means. > Given that users would likely want > to use bytes in Py2 (for simple non-unicode strings) and unicode for > other > strings in Py2 and all text strings in Py3, this isn't easy to handle > automatically. If a user wants to return a mixture of str/unicode in Py2, or bytes/ str in Py3, they're going to have to be explicit one way or another, whether or not this directive is used. (It just changes the default.) > Now, the proposal was to enable this with a compiler directive, > which would > basically provide a default encoding. 
> If this directive was used, all untyped coercions from char* to a
> Python object would use it. As Dag noted already, this would interfere
> with type inference, as the resulting type would still be char* in
> that case.

This is completely orthogonal to type inference. Type inference happens before any coercions are inserted, and would work just as well with or without this proposal. It's a question about what kind of coercion to allow/insert when one is needed.

> The only exceptions are untyped function return values.
>
> For typed coercions to str or unicode, I personally don't think that
> it's too much typing to require "c_string.decode(enc)", which would
> work nicely with type inference. However, that would, again, not yield
> the do-what-I-mean result of returning a byte string in Py2. Arguably,
> that might be considered an optimisation, but it could still fall under
> the DWIM compiler directive, e.g. as a "return_bytes_in_py2" option.
>
>
> Ok, to sum things up, it looks like a special kind of coercion at
> function call boundaries would be quite easy to support, and would work
> nicely with type inference enabled. It would also match the support
> that CPython's C-API argument unpacking functions have for converting
> Python strings. Everything else would mean hard work inside of Cython
> and be rather hard to explain to users.

Argument unpacking would be the most useful case. If defenc is actually going away, then I agree things could get messy, but otherwise it would be as easy (or hard) to explain and use as str -> char* is now.

> BTW, I wouldn't mind extending the string input argument conversion
> support to everything that supports the buffer protocol.

That might be interesting, though one difficulty is that buffers in general don't have an intrinsic notion of length. (Technically, nor do strings, but null-terminated strings encoded with null-free encodings are common enough to make char* usable.)
- Robert From robertwb at math.washington.edu Tue Dec 1 04:23:57 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 30 Nov 2009 19:23:57 -0800 Subject: [Cython] Another string encoding idea In-Reply-To: <4B140B9D.8040701@noaa.gov> References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B11A503.10307@student.matnat.uio.no> <4B140B9D.8040701@noaa.gov> Message-ID: On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote: >> Robert Bradshaw wrote: >> this is the kind >>> of thing that usually tells me there's a deficiency in the language >>> that should be fixed to ease the users burden instead. > > sure -- but the deficiency is in C (and py2), and that's not something > we can fix. As for the Cython language, it should really follow > Python: > unicode for "text", bytes for arbitrary data. > > But we need to deal with C (and fortran) no matter how you slice it. > > I wrote a similar post on the numpy list: I think the key from a > user's > perspective is that one is either working with "text": human readable > stuff, or data. If text, then the natural python3 data type is a > unicode > string. If data, then bytes -- we should really follow that as best > we can. Exactly. unicode = char* + length + encoding bytes = char* + length So what is the Python equivalent of char*? Neither, and what you want depends on the application and context. >> most of the >> libraries we work with would probably balk at anything but ASCII >> anyways > > This is key. unicode is new, and AFAICT, C still doesn't really have a > decent way to deal with it anyway (it never even had a native string > type). > > So a very, very, common usage is for C and Fortran code and > libraries to > expect char*, encoded in ASCI (or ANSI, but 1 byte per character, in > any > case). It needs to be easy, and perhaps automatic, to write code that > crosses the Python-C border in these cases. 
>
> I've lost track of what has been proposed here, but it seems to me that
> we need a Cython type:
>
> ANSI_string (not that that's what it should be called)
>
> It might be nice if there were a way to specify the encoding -- ASCII,
> Latin1, etc. -- though it would have to be a 1-byte-per-character
> encoding. I'm not sure what the syntax could be for that, but I'd like
> to have it specified in the code near where it is used, rather than as
> a program-wide default.

Compiler directives can be specified on a per-file, per-function, or per-block basis. On a per-line basis, I think it's easier to just call s.decode("ASCII").

> If you declare a variable an ANSI_string, then Cython will convert to a
> char* internally, using ASCII (or another defined encoding). At the
> Python level it could accept either a unicode string or a byte string,
> passing the byte string right on through. A runtime error would be
> raised if the input could not be ASCII encoded.
>
> It seems this would handle the very common case of libraries expecting
> simple ASCII strings for flags, etc.

That is another idea. A new type would handle conversion to char*, but not from char*. Bytes objects would still be returned by default unless one did something extra there (which is fine for some uses, but for others str is more natural).

> It would be kind of like numpy's "asarray" call, in that it may or may
> not make a copy, depending on what the input is, but I don't think that
> would be a problem, as strings are immutable anyway.
>
> Wouldn't this be much like declaring a variable a C int, and being able
> to pass in Python integers that may or may not (until run time) fit?

Yep, I'm thinking if the encoding fails, a runtime error would result.

> This is completely from a user's perspective.

Thank you! The more of the user's perspective we can get the better.
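
The ANSI_string behaviour described above (accept either text or bytes, pass bytes straight through, fail at run time if the text cannot be encoded) can be sketched in plain Python 3. This is only an illustration of the proposed semantics, not Cython code; the name as_ascii_bytes and its encoding parameter are made up for this sketch.

```python
def as_ascii_bytes(value, encoding="ascii"):
    """Return `value` as a bytes object that could back a char*.

    Hypothetical helper for illustration; not a Cython or CPython API.
    """
    if isinstance(value, bytes):
        # Byte strings pass right on through: no copy, no validation.
        return value
    if isinstance(value, str):
        # Text is encoded; UnicodeEncodeError is raised at run time
        # if a character does not fit the chosen encoding.
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)
```

Like numpy's asarray, the helper only creates a new object when the input is not already in the target form.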
- Robert

From dalcinl at gmail.com Tue Dec 1 04:26:58 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 1 Dec 2009 00:26:58 -0300
Subject: [Cython] Another string encoding idea
In-Reply-To: <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
Message-ID:

On Tue, Dec 1, 2009 at 12:09 AM, Robert Bradshaw wrote:
>
>> BTW, I wouldn't mind extending the string input argument conversion
>> support to everything that supports the buffer protocol.
>
> That might be interesting, though one difficulty is that buffers in
> general don't have an intrinsic notion of length. (Technically, nor do
> strings, but null-terminated strings encoded with null-free encodings
> are common enough to make char* usable.)
>

I see, then in the near future I'll be able to create a numpy array with "unsigned char" dtype (let's say, for storing an 8-bit image?). But at some point, I'll mistakenly pass these arrays to something accepting 'bytes'... No, -1, do not do that please. Explicit is better than implicit. Errors should never pass silently.
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Tue Dec 1 04:36:34 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 30 Nov 2009 19:36:34 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To:
References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
Message-ID: <38CFBFED-1033-4EA3-9BF4-A37392184DB5@math.washington.edu>

On Nov 30, 2009, at 7:26 PM, Lisandro Dalcin wrote:

> On Tue, Dec 1, 2009 at 12:09 AM, Robert Bradshaw wrote:
>>
>>> BTW, I wouldn't mind extending the string input argument conversion
>>> support to everything that supports the buffer protocol.
>>
>> That might be interesting, though one difficulty is that buffers in
>> general don't have an intrinsic notion of length. (Technically, nor do
>> strings, but null-terminated strings encoded with null-free encodings
>> are common enough to make char* usable.)
>>
>
> I see, then in the near future I'll be able to create a numpy array
> with "unsigned char" dtype (let's say, for storing an 8-bit image?). But
> at some point, I'll mistakenly pass these arrays to something
> accepting 'bytes'... No, -1, do not do that please. Explicit is better
> than implicit. Errors should never pass silently.

I agree, that would be bad. What I was interpreting this as is that one could write

    def foo(int* data):
        ...

and data would be "extracted" via the buffer interface. That could get messy with char*. I certainly don't support

    def foo(bytes data):
        ...

accepting a buffer object, and automatically creating the bytes object out of it. Doing some "magic" for C <-> Python conversion is a lot more natural (even necessary) vs. doing magic to transform from one kind of Python object to another. - Robert From stefan_ml at behnel.de Tue Dec 1 07:41:40 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 Dec 2009 07:41:40 +0100 Subject: [Cython] Another string encoding idea In-Reply-To: <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> Message-ID: <4B14BAA4.8040306@behnel.de> Robert Bradshaw, 01.12.2009 04:09: > Just to clarify discussion, here is what I'm proposing (which is still > in flux, and simplified due to memory issues, which does make it less > attractive as one does not get to choose the used encoding, but it > would always be UTF-8 in Py3). ... and the 'default encoding' in Py2, which may or may not be ASCII, but would likely be at least something that's compatible with ASCII, as it would break tons of code otherwise. > Without directive(s) (as it is now): > > char* <-> bytes > > With the directive(s) (which can be applied locally or globally): > > char* <-> str > unicode/bytes -> char* would also work (for Py2/Py3 respectively) 'respectively' in the sense of 'for both'? > The encoding used would be the system default (in Py2) and UTF-8 (in > Py3). This would use the defenc slot so the encoded char* would be > valid as long as the unicode object is around, and the long term > future of the defenc slot needs to be ensured before this could be > used for non-arguments conversion. That's my main concern here. We are basing a major feature on a side-effect of something that's declared "for internal use only". 
The new buffer interface isn't even supported by Unicode strings in Py3, so the mere existence of the defenc slot in Py3 is plainly for internal optimisation purposes, and the fact that it's safe for external code to just borrow the reference into a char* is everything but clear to me. It's obvious enough that defenc isn't going to go away in Py2 any more, but since you keep insisting, please ask on python-dev for making that part of the C-API publicly specified (i.e. the slot itself and the fact that the object in defenc is kept alive for the lifetime of the unicode string) before we even consider doing anything like this. I still don't like the list.pop() optimisation, but this is much worse, as we can't just take this feature back when we realise that it was a mistake in the first place. > Also out there is the idea of a directive that would make char* become > unicode in both Py2 and Py3. ... which would likely only be useful for new code, as existing code would break in all sorts of places if you enable that (just as with type inference). > On Nov 29, 2009, at 8:47 AM, Stefan Behnel wrote: > >> Robert Bradshaw, 28.11.2009 22:12: >>> My personal concern is the pain I see porting Sage to Py3. I'd have >>> to go through the codebase and throw in encodes() and decodes() and >>> change signatures of functions that take char* arguments >> That's what I figured. Instead of having to fix up the code, you want >> a do-what-I-mean str data type that unifies everything that's unicode, >> bytes and char*, and that magically handles it all for you. > > Exactly. Improve the compiler rather than change the code. You calling it 'improve' actually makes it sound better than I think it is. I do see the interest of simplifying the path between unicode strings and char*, but I also see an interest in making it easy for developers to write safe APIs that reject broken input (e.g. with 0 bytes or other control characters). 
I really don't like APIs that use "well, it's written in C" (and certainly not "well, it's written in Cython"!) as an excuse for silently dropping parts of my accidentally broken input (which I may not even have control of myself). Automatic coercion to char* is only one side of input handling, and it may just as well lead to less helpful APIs being written. So enabling such a directive requires careful consideration, too, because it's not a simple all-win thing, not even in the long term. > I think it's easier if the Python to C and C to Python conversions are > uniform whether it happen via to coercion, assignment, or function > signature constraints. Then the question is what objects can be turned > into a char* (the directive would add unicode) and what object does > char* turn into (the directive would create str in Py2 and Py3). > [...] > If we declare > > some_python_name = some_c_string > > to always have the same meaning as > > some_python_name = some_c_string > > then the meaning of some_c_string and some_c_string > are clear, and some_c_string is the only ambiguity, and the > directive would control what some_c_string means. This sounds reasonable - except for the implementation details. > Function arguments typed as char* are a particularly useful case > though, and it would be nice to make this friendlier for Py3. Function arguments typed bytes/str/unicode are a lot easier and safer to handle, though, and not a bit slower in general. Coercion from bytes to plain char* is pretty fast, and could be even faster if it's typed (as we could use a None check and a macro in that case). >> Now, the proposal was to enable this with a compiler directive, which >> would basically provide a default encoding. If this directive was >> used, all untyped coercions from char* to a Python object would use >> it. As Dag noted already, this would interfere with type inference, as >> the resulting type would still be char* in that case. 
> > This is completely orthogonal to type inference. It's not orthogonal, as type inference currently breaks C type to untyped Python name assignments, which is exactly the case you want to influence with the directive. This means that the char* directive would override the type inference directive for one special case. >> BTW, I wouldn't mind extending the string input argument conversion >> support to everything that supports the buffer protocol. > > That might be interesting, though one difficulty is that buffers in > general don't have a intrinsic notion of length. Huh? Py_buffer.len will do just fine for a 1D buffer. Actually, PyUnicode_FromEncodedObject() will handle this for us in Py3 (although incorrectly by using PyObject_AsCharBuffer(), I just filed a bug report on their tracker). But the case where this would matter doesn't seem to be part of your revised proposal above any more. Stefan From stefan_ml at behnel.de Tue Dec 1 07:48:32 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 Dec 2009 07:48:32 +0100 Subject: [Cython] Another string encoding idea In-Reply-To: <38CFBFED-1033-4EA3-9BF4-A37392184DB5@math.washington.edu> References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> <38CFBFED-1033-4EA3-9BF4-A37392184DB5@math.washington.edu> Message-ID: <4B14BC40.7020802@behnel.de> Robert Bradshaw, 01.12.2009 04:36: > one could write > > def foo(int* data): > ... > > and data would be "extracted" via the buffer interface. That could get > messy with char*. Less messy than for int*, for sure. At least, char* has a somewhat well defined termination character '\0'. How would you know how many elements an int* parameter would refer to? 
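
On the length question: an object that exports the buffer protocol does report its own size, which is what Py_buffer.len exposes at the C level, whereas a bare int* does not. A small Python 3 illustration follows, using memoryview as a stand-in for Py_buffer; the variable names are only for this sketch.

```python
from array import array

ints = array("i", [10, 20, 30])   # a 1D buffer of C ints
view = memoryview(ints)           # Python-level view of the exported buffer

n_items = len(view)               # element count of the 1D buffer
n_bytes = view.nbytes             # total byte length (what Py_buffer.len holds)
item_size = view.itemsize         # bytes per element, platform dependent
```

With the element count and item size available from the buffer, a function receiving such an object would not have to guess how many elements the underlying pointer refers to.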
Stefan

From robertwb at math.washington.edu Tue Dec 1 07:57:54 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 30 Nov 2009 22:57:54 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To: <4B14BC40.7020802@behnel.de>
References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> <38CFBFED-1033-4EA3-9BF4-A37392184DB5@math.washington.edu> <4B14BC40.7020802@behnel.de>
Message-ID:

On Nov 30, 2009, at 10:48 PM, Stefan Behnel wrote:

>
> Robert Bradshaw, 01.12.2009 04:36:
>> one could write
>>
>>     def foo(int* data):
>>         ...
>>
>> and data would be "extracted" via the buffer interface. That could
>> get messy with char*.
>
> Less messy than for int*, for sure. At least, char* has a somewhat
> well-defined termination character '\0'. How would you know how many
> elements an int* parameter would refer to?

You wouldn't, which is the problem that I was worried about. I think I misinterpreted what you were trying to say here, so just ignore it.
- Robert From robertwb at math.washington.edu Tue Dec 1 08:41:28 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 30 Nov 2009 23:41:28 -0800 Subject: [Cython] Another string encoding idea In-Reply-To: <4B14BAA4.8040306@behnel.de> References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B12A58C.7020008@behnel.de> <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu> <4B14BAA4.8040306@behnel.de> Message-ID: <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu> On Nov 30, 2009, at 10:41 PM, Stefan Behnel wrote: > > Robert Bradshaw, 01.12.2009 04:09: >> Just to clarify discussion, here is what I'm proposing (which is >> still >> in flux, and simplified due to memory issues, which does make it less >> attractive as one does not get to choose the used encoding, but it >> would always be UTF-8 in Py3). > > ... and the 'default encoding' in Py2, which may or may not be > ASCII, but > would likely be at least something that's compatible with ASCII, as it > would break tons of code otherwise. Yep. > > >> Without directive(s) (as it is now): >> >> char* <-> bytes >> >> With the directive(s) (which can be applied locally or globally): >> >> char* <-> str >> unicode/bytes -> char* would also work (for Py2/Py3 respectively) > > 'respectively' in the sense of 'for both'? I was just avoiding being redundant with the case covered by char* <-> str. Yes, 'for both' would be accurate to say as well. > >> The encoding used would be the system default (in Py2) and UTF-8 (in >> Py3). This would use the defenc slot so the encoded char* would be >> valid as long as the unicode object is around, and the long term >> future of the defenc slot needs to be ensured before this could be >> used for non-arguments conversion. > > That's my main concern here. We are basing a major feature on a side- > effect > of something that's declared "for internal use only". 
> > The new buffer interface isn't even supported by Unicode strings in > Py3, so > the mere existence of the defenc slot in Py3 is plainly for internal > optimisation purposes, and the fact that it's safe for external code > to > just borrow the reference into a char* is everything but clear to me. > > It's obvious enough that defenc isn't going to go away in Py2 any > more, but > since you keep insisting, please ask on python-dev for making that > part of > the C-API publicly specified (i.e. the slot itself and the fact that > the > object in defenc is kept alive for the lifetime of the unicode string) > before we even consider doing anything like this. > > I still don't like the list.pop() optimisation, but this is much > worse, as > we can't just take this feature back when we realise that it was a > mistake > in the first place. This *is* a concern of mine as well (though I didn't know it was being deprecated when I first thought of using it), and much of the proposal is conditional on defenc, as we need it, is not going away. Until that's resolved, there's no way this should go in. If it's going away, then we can still handle argument parameters, and char* -> object, but the spontaneous unicode -> char* suffers the aforementioned technical difficulties and may have to be abandoned. (An advantage would also be that we could actually support a variety of encodings, not just the "default" one, if there's interest.) > > >> Also out there is the idea of a directive that would make char* >> become >> unicode in both Py2 and Py3. > > ... which would likely only be useful for new code, as existing code > would > break in all sorts of places if you enable that (just as with type > inference). Yeah, I think this is particular one is a more invasive change, and I probably wouldn't want it tied with the first, but I was trying to summarize. 
> > >> On Nov 29, 2009, at 8:47 AM, Stefan Behnel wrote: >> >>> Robert Bradshaw, 28.11.2009 22:12: >>>> My personal concern is the pain I see porting Sage to Py3. I'd have >>>> to go through the codebase and throw in encodes() and decodes() and >>>> change signatures of functions that take char* arguments >>> That's what I figured. Instead of having to fix up the code, you >>> want >>> a do-what-I-mean str data type that unifies everything that's >>> unicode, >>> bytes and char*, and that magically handles it all for you. >> >> Exactly. Improve the compiler rather than change the code. > > You calling it 'improve' actually makes it sound better than I think > it is. Yeah, that colors it from my perspective. Maybe "enhance" would be a better word. > I do see the interest of simplifying the path between unicode > strings and > char*, but I also see an interest in making it easy for developers > to write > safe APIs that reject broken input (e.g. with 0 bytes or other control > characters). I really don't like APIs that use "well, it's written > in C" > (and certainly not "well, it's written in Cython"!) as an excuse for > silently dropping parts of my accidentally broken input (which I may > not > even have control of myself). Yeah, zero bytes are something that no coercion of object to char* can handle, as char* is short on information. We have this issue now. The SIMD/vector/memoryview types might be a way to hold pointer + length in a single object. > Automatic coercion to char* is only one side > of input handling, and it may just as well lead to less helpful APIs > being > written. So enabling such a directive requires careful > consideration, too, > because it's not a simple all-win thing, not even in the long term. Yes, I agree. Unlike type inference, backwards compatibility is not the only significant motivation for not having this on by default. 
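
The zero-byte problem mentioned above can be seen from plain Python 3 with ctypes, used here purely as an illustration (it is not how Cython performs the coercion). A char* carries no length, so anything after the first NUL is invisible to code that treats it as a C string. The reject_embedded_nul guard is a hypothetical example of an API that refuses such input instead of silently truncating it.

```python
import ctypes

data = b"abc\x00def"             # 7 bytes, with an embedded NUL
c_str = ctypes.c_char_p(data)    # char* view: a pointer, no length

# Reading the value back stops at the first NUL, as strlen() would.
seen_by_c = c_str.value

def reject_embedded_nul(raw):
    """Hypothetical guard: refuse input that a char* would truncate."""
    if b"\x00" in raw:
        raise ValueError("embedded NUL byte in string argument")
    return raw
```

A pointer-plus-length pair avoids the truncation, which is why a type that holds both was suggested.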
>> I think it's easier if the Python to C and C to Python conversions >> are >> uniform whether it happen via to coercion, assignment, or function >> signature constraints. Then the question is what objects can be >> turned >> into a char* (the directive would add unicode) and what object does >> char* turn into (the directive would create str in Py2 and Py3). >> [...] >> If we declare >> >> some_python_name = some_c_string >> >> to always have the same meaning as >> >> some_python_name = some_c_string >> >> then the meaning of some_c_string and some_c_string >> are clear, and some_c_string is the only ambiguity, and the >> directive would control what some_c_string means. > > This sounds reasonable - except for the implementation details. > > >> Function arguments typed as char* are a particularly useful case >> though, and it would be nice to make this friendlier for Py3. > > Function arguments typed bytes/str/unicode are a lot easier and > safer to > handle, though, and not a bit slower in general. Coercion from bytes > to > plain char* is pretty fast, and could be even faster if it's typed > (as we > could use a None check and a macro in that case). (As an asside, ironically, a type check is often just as fast as a None check, as I've noticed elsewhere statically typing things...) >>> Now, the proposal was to enable this with a compiler directive, >>> which >>> would basically provide a default encoding. If this directive was >>> used, all untyped coercions from char* to a Python object would use >>> it. As Dag noted already, this would interfere with type >>> inference, as >>> the resulting type would still be char* in that case. >> >> This is completely orthogonal to type inference. > > It's not orthogonal, as type inference currently breaks C type to > untyped > Python name assignments, which is exactly the case you want to > influence > with the directive. 
> This means that the char* directive would override the type inference
> directive for one special case.

I was just using assignment to an untyped variable as an implicit
coercion to object in my example. I should have been more explicit and
written

    cdef char* ss = ...
    cdef object x = ss

If type inference is enabled, then

    cdef char* ss = ...
    x = ss

(assuming no other assignments to x) will result in x being typed as
char*, so no coercion is triggered to do the assignment. The proposal is
to control what kind of object gets created if a char* needs to become
an object. All types are inferred and completely resolved before any
coercions get inserted.

- Robert

From stefan_ml at behnel.de Tue Dec 1 09:56:56 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 01 Dec 2009 09:56:56 +0100
Subject: [Cython] Another string encoding idea
In-Reply-To:
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B11A503.10307@student.matnat.uio.no>
 <4B140B9D.8040701@noaa.gov>
Message-ID: <4B14DA58.5080104@behnel.de>

Robert Bradshaw, 01.12.2009 04:23:
> On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote:
>> I think the key from a user's perspective is that one is either
>> working with "text": human readable stuff, or data. If text, then the
>> natural python3 data type is a unicode string. If data, then bytes --
>> we should really follow that as best we can.
>
> unicode = char* + length + encoding
> bytes = char* + length
>
> So what is the Python equivalent of char*? Neither, and what you want
> depends on the application and context.

Ok, so we agree that there are various different use cases that require
different setups.

As I indicated before, CPython's argument unpacking functions support
various ways of dealing with unicode/bytes conversion to char* through
their "s#", "u#" and "es#" formats. These are actually helpful, but not
currently supported by Cython.

Maybe a buffer emulation might help here, where Cython would set up a
Py_buffer struct for a function argument and fill in the values from the
Python string that was passed. That might be a way to handle all use
cases in a uniform way, and we could easily extend this to an additional
buffer option 'encoding', which would override the platform specific
default encoding used to handle char* buffers.

There's also still Dag's trac ticket about ctypedef support for buffer
parameters:

http://trac.cython.org/cython_trac/ticket/194

This would allow users to define their own encoded char*+length type.

The usage would be something like

    ctypedef str[encoding='ASCII'] ascii_string

    def func(ascii_string s):
        print s[:s.len].decode('ASCII')

and would accept and encode Unicode arguments as well as arguments that
support the 1D buffer protocol.

Given that there's the "es" and "et" formattings in CPython (not sure if
they continue to work for bytes in Py3, BTW, as it seems that their
documentation wasn't overhauled), we could also distinguish how bytes
arguments are handled: should they be checked for having the correct
encoding, or should they be passed through? Both use cases are
legitimate and could be distinguished by another buffer option.

> That is another idea. A new type would handle conversion to char*, but
> not from char*. Bytes objects would still be returned by default
> unless one did something extra there (which is fine for some uses, but
> for other str is more natural).

We could have a "cython.str()" function that converts char*+length or a
char* buffer to bytes or unicode depending on the platform and using
either the platform encoding or a different one passed as argument. So
you'd return "cython.str(c_string, length)" (or "cython.str(s)" for the
example above) and be happy.

For function return types, we could also accept the Py3 syntax:

    def func(str[encoding='ASCII'] s) -> cython.str:
        ...

that would handle the conversion on the fly, as would an equivalent
declaration for cdef/cpdef functions.

Stefan

From stefan_ml at behnel.de Tue Dec 1 10:43:39 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 01 Dec 2009 10:43:39 +0100
Subject: [Cython] Another string encoding idea
In-Reply-To: <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B12A58C.7020008@behnel.de>
 <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
 <4B14BAA4.8040306@behnel.de>
 <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu>
Message-ID: <4B14E54B.3030904@behnel.de>

Robert Bradshaw, 01.12.2009 08:41:
> On Nov 30, 2009, at 10:41 PM, Stefan Behnel wrote:
>> Robert Bradshaw, 01.12.2009 04:09:
>>> This is completely orthogonal to type inference.
>> It's not orthogonal, as type inference currently breaks C type to
>> untyped Python name assignments, which is exactly the case you want to
>> influence with the directive. This means that the char* directive
>> would override the type inference directive for one special case.
>
> I was just using assignment to an untyped variable as an implicit
> coercion to object in my example. I should have been more explicit and
> written
>
>     cdef char* ss = ...
>     cdef object x = ss

In which case you could just as well type x as str, or use an explicit
cast to the type you want. When enabling type inference, the 'default
behaviour' no longer comes for free, except for exactly the function
call boundary cases that I keep stressing.

Stefan

From Chris.Barker at noaa.gov Tue Dec 1 20:01:43 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 01 Dec 2009 11:01:43 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To:
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B11A503.10307@student.matnat.uio.no>
 <4B140B9D.8040701@noaa.gov>
Message-ID: <4B156817.4060801@noaa.gov>

Robert Bradshaw wrote:
> On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote:
>> If text, then the natural python3 data type is a unicode string. If
>> data, then bytes -- we should really follow that as best we can.
>
> Exactly.
>
> unicode = char* + length + encoding
> bytes = char* + length
>
>> It needs to be easy, and perhaps automatic, to write code that
>> crosses the Python-C border in these cases.
>>
>> I've lost track of what has been proposed here, but it seems to me
>> that we need a Cython type:
>>
>> ANSI_string (not that that's what it should be called)
>>
>> It seems this would handle the very common case of libraries expecting
>> simple ascii strings for flags, etc.
>
> That is another idea. A new type would handle conversion to char*, but
> not from char*. Bytes objects would still be returned by default
> unless one did something extra there (which is fine for some uses, but
> for other str is more natural).

This doesn't quite fit my vision -- I was thinking that the
"ANSI_string" type would look like a text string in python -- therefore
a Unicode object, certainly for py3. Py2 is a mess in this regard no
matter how you slice it, but I would think a string or Unicode object
would make more sense than bytes -- the idea is that this would be used
explicitly for "text", not data -- so the user would not want to get
bytes back.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From stefan_ml at behnel.de Wed Dec 2 13:25:58 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 02 Dec 2009 13:25:58 +0100
Subject: [Cython] Another string encoding idea
In-Reply-To: <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B12A58C.7020008@behnel.de>
 <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
 <4B14BAA4.8040306@behnel.de>
 <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu>
Message-ID: <4B165CD6.7000107@behnel.de>

Robert Bradshaw, 01.12.2009 08:41:
> On Nov 30, 2009, at 10:41 PM, Stefan Behnel wrote:
>> Coercion from bytes to plain char* is pretty fast, and could be even
>> faster if it's typed (as we could use a None check and a macro in that
>> case).
>
> (As an aside, ironically, a type check is often just as fast as a
> None check, as I've noticed elsewhere statically typing things...)

Note that this doesn't apply here, as we currently use a function call
to PyBytes_AsString(). Doing a None check and a macro call to
PyBytes_AS_STRING() is likely a lot faster than that.

Stefan

From robertwb at math.washington.edu Thu Dec 3 01:45:29 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Dec 2009 16:45:29 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To: <4B156817.4060801@noaa.gov>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B11A503.10307@student.matnat.uio.no>
 <4B140B9D.8040701@noaa.gov>
 <4B156817.4060801@noaa.gov>
Message-ID:

On Dec 1, 2009, at 11:01 AM, Christopher Barker wrote:

> Robert Bradshaw wrote:
>> On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote:
>>> If text, then the natural python3 data type is a unicode string. If
>>> data, then bytes -- we should really follow that as best we can.
>>
>> Exactly.
>>
>> unicode = char* + length + encoding
>> bytes = char* + length
>>
>>> It needs to be easy, and perhaps automatic, to write code that
>>> crosses the Python-C border in these cases.
>>>
>>> I've lost track of what has been proposed here, but it seems to me
>>> that we need a Cython type:
>>>
>>> ANSI_string (not that that's what it should be called)
>>>
>>> It seems this would handle the very common case of libraries
>>> expecting simple ascii strings for flags, etc.
>>
>> That is another idea. A new type would handle conversion to char*,
>> but not from char*. Bytes objects would still be returned by default
>> unless one did something extra there (which is fine for some uses,
>> but for other str is more natural).
>
> This doesn't quite fit my vision -- I was thinking that the
> "ANSI_string" type would look like a text string in python --
> therefore a Unicode object, certainly for py3. Py2 is a mess in this
> regard no matter how you slice it, but I would think a string or
> Unicode object would make more sense than bytes -- the idea is that
> this would be used explicitly for "text", not data -- so the user
> would not want to get bytes back.

Yes, this is the case that I'm thinking of as well. I wasn't seeing how
a new type would fix the

    cdef char* s = ...
    return

case.

- Robert

From robertwb at math.washington.edu Thu Dec 3 01:49:49 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Dec 2009 16:49:49 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To: <4B14E54B.3030904@behnel.de>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B12A58C.7020008@behnel.de>
 <9462C7CA-136D-4085-B2FF-103ED0E9BE88@math.washington.edu>
 <4B14BAA4.8040306@behnel.de>
 <45754378-854E-4E05-9EC4-38128AE87688@math.washington.edu>
 <4B14E54B.3030904@behnel.de>
Message-ID: <165C1ED2-F871-467F-A089-8C17037E66FF@math.washington.edu>

On Dec 1, 2009, at 1:43 AM, Stefan Behnel wrote:

> Robert Bradshaw, 01.12.2009 08:41:
>> On Nov 30, 2009, at 10:41 PM, Stefan Behnel wrote:
>>> Robert Bradshaw, 01.12.2009 04:09:
>>>> This is completely orthogonal to type inference.
>>> It's not orthogonal, as type inference currently breaks C type to
>>> untyped Python name assignments, which is exactly the case you want
>>> to influence with the directive. This means that the char* directive
>>> would override the type inference directive for one special case.
>>
>> I was just using assignment to an untyped variable as an implicit
>> coercion to object in my example. I should have been more explicit
>> and written
>>
>>     cdef char* ss = ...
>>     cdef object x = ss
>
> In which case you could just as well type x as str, or use an explicit
> cast to the type you want. When enabling type inference, the 'default
> behaviour' no longer comes for free, except for exactly the function
> call boundary cases that I keep stressing.

I agree that the function call boundary is probably the most important
case, but implicit coercions happen in lots of other places.

    cdef char* ss = ...
    foo = [ss], {ss: ss}, ss
    my_py_function(ss)
    py_object.attr = ss
    print(ss)
    return ss

- Robert

From robertwb at math.washington.edu Thu Dec 3 02:01:10 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Wed, 2 Dec 2009 17:01:10 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To: <4B14DA58.5080104@behnel.de>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B11A503.10307@student.matnat.uio.no>
 <4B140B9D.8040701@noaa.gov>
 <4B14DA58.5080104@behnel.de>
Message-ID: <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu>

On Dec 1, 2009, at 12:56 AM, Stefan Behnel wrote:

> Robert Bradshaw, 01.12.2009 04:23:
>> On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote:
>>> I think the key from a user's perspective is that one is either
>>> working with "text": human readable stuff, or data. If text, then
>>> the natural python3 data type is a unicode string. If data, then
>>> bytes -- we should really follow that as best we can.
>>
>> unicode = char* + length + encoding
>> bytes = char* + length
>>
>> So what is the Python equivalent of char*? Neither, and what you want
>> depends on the application and context.
>
> Ok, so we agree that there are various different use cases that
> require different setups.
>
> As I indicated before, CPython's argument unpacking functions support
> various ways of dealing with unicode/bytes conversion to char* through
> their "s#", "u#" and "es#" formats. These are actually helpful, but
> not currently supported by Cython.
>
> Maybe a buffer emulation might help here, where Cython would set up a
> Py_buffer struct for a function argument and fill in the values from
> the Python string that was passed.
> That might be a way to handle all use cases in a uniform way, and we
> could easily extend this to an additional buffer option 'encoding',
> which would override the platform specific default encoding used to
> handle char* buffers.
>
> There's also still Dag's trac ticket about ctypedef support for buffer
> parameters:
>
> http://trac.cython.org/cython_trac/ticket/194
>
> This would allow users to define their own encoded char*+length type.
>
> The usage would be something like
>
>     ctypedef str[encoding='ASCII'] ascii_string
>
>     def func(ascii_string s):
>         print s[:s.len].decode('ASCII')

I expect magic on the C <-> Python boundary, because conversion is
necessary. Implicit coercion from one Python type to another is a bit
less obvious. (If anything, I would want to introduce a new type rather
than overload str to have this meaning...)

> and would accept and encode Unicode arguments as well as arguments
> that support the 1D buffer protocol.

So it would try to decode a numpy char* array? I think I'd rather get a
type error than something implicit here.

> Given that there's the "es" and "et" formattings in CPython (not sure
> if they continue to work for bytes in Py3, BTW, as it seems that their
> documentation wasn't overhauled), we could also distinguish how bytes
> arguments are handled: should they be checked for having the correct
> encoding, or should they be passed through? Both use cases are
> legitimate and could be distinguished by another buffer option.
>
>
>> That is another idea. A new type would handle conversion to char*,
>> but not from char*. Bytes objects would still be returned by default
>> unless one did something extra there (which is fine for some uses,
>> but for other str is more natural).
>
> We could have a "cython.str()" function that converts char*+length or
> a char* buffer to bytes or unicode depending on the platform and using
> either the platform encoding or a different one passed as argument.
> So you'd return "cython.str(c_string, length)" (or "cython.str(s)" for
> the example above) and be happy.

That's a good idea, and should probably go in regardless of whatever
else happens.

> For function return types, we could also accept the Py3 syntax:
>
>     def func(str[encoding='ASCII'] s) -> cython.str:
>         ...
>
> that would handle the conversion on the fly, as would an equivalent
> declaration for cdef/cpdef functions.

The motivation for a directive was to avoid having to be explicit every
time a char* is used (or at least once for every function), which none
of the above ideas address.

- Robert

From stefan_ml at behnel.de Thu Dec 3 08:50:35 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 03 Dec 2009 08:50:35 +0100
Subject: [Cython] Another string encoding idea
In-Reply-To: <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu>
References: <4B10C890.6080106@behnel.de>
 <4B112FF7.40006@student.matnat.uio.no>
 <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu>
 <4B11A503.10307@student.matnat.uio.no>
 <4B140B9D.8040701@noaa.gov>
 <4B14DA58.5080104@behnel.de>
 <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu>
Message-ID: <4B176DCB.6030400@behnel.de>

Robert Bradshaw, 03.12.2009 02:01:
> On Dec 1, 2009, at 12:56 AM, Stefan Behnel wrote:
>> We could have a "cython.str()" function that converts char*+length or
>> a char* buffer to bytes or unicode depending on the platform and using
>> either the platform encoding or a different one passed as argument.
>> So you'd return "cython.str(c_string, length)" (or "cython.str(s)" for
>> the example above) and be happy.
>
> That's a good idea, and should probably go in regardless of whatever
> else happens.

Ok, so then we have three different cases for the char*->Python path:

1) create bytes - that's what currently happens automatically

2) create unicode - easy to do with "s.decode(enc)"

3) create str (i.e.
   bytes in Py2 and unicode in Py3) - easy to do with a future
   "cython.str(s)" or "cython.str(s[:length])", optionally taking an
   encoding as second argument and defaulting to the platform encoding
   otherwise.

I think all of these are easy enough to type and read. So isn't that all
we need for that direction? Or is it really the encoding name that you
want to keep users from typing?

Stefan

From stefan_ml at behnel.de Thu Dec 3 09:56:01 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 03 Dec 2009 09:56:01 +0100
Subject: [Cython] type inference - only for Python types?
Message-ID: <4B177D21.8010600@behnel.de>

Hi,

I mentioned this idea before, but I think it's worth its own thread.

The main problem with the current type inference mechanism for existing
code is that the semantics of assignments to untyped names change, i.e.

    cdef int i = 5
    x = i

will type 'x' as C int with type inference enabled. This will break
existing code, as I expect most Pyrex/Cython code to depend on the above
idiom for creating a Python object from a C type.

Robert, please correct me, but IIUC, the above problem was the main
reason not to enable type inference by default (except for some
remaining bugs, but those will go away over time).

However, type inference has many other virtues, e.g. for code like this:

    l = []
    for x in range(100):
        item = do_some_complex_calculation(x)
        l.append(item)

If the type of 'l' was inferred to be a list in this code, "l.append"
would get optimised into the obvious C-API call (potentially even
without a None check), instead of generating additional fallback code
for cases where l is not a list. The same applies to the various other
optimisations for builtin types.

Another use case is for extension types. Currently, when instantiating
an extension type, the result variable needs to be typed if you want to
access its C methods and fields later on.
With type inference enabled, the following would work without additional
typing:

    cdef class MyExt:
        cdef int i

    x = MyExt()
    print x.i

And, for the case that 'i' was actually declared 'public' or 'readonly',
the access would go through the C field instead of the Python
descriptor.

So my proposal is to enable type inference by default (after fixing the
remaining bugs), but only for Python types (including extension types).
That should break *very* little code, but would give a major improvement
in terms of both usability and performance.

The existing directive would then only switch between full inference
when enabled (including C types) and no type inference at all (when
disabled explicitly).

Comments?

Stefan

From dagss at student.matnat.uio.no Thu Dec 3 10:48:30 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Dec 2009 10:48:30 +0100
Subject: [Cython] type inference - only for Python types?
In-Reply-To: <4B177D21.8010600@behnel.de>
References: <4B177D21.8010600@behnel.de>
Message-ID: <4B17896E.8030806@student.matnat.uio.no>

Stefan Behnel wrote:
> Hi,
>
> I mentioned this idea before, but I think it's worth its own thread.
>
> The main problem with the current type inference mechanism for existing
> code is that the semantics of assignments to untyped names change, i.e.
>
>     cdef int i = 5
>     x = i
>
> will type 'x' as C int with type inference enabled. This will break
> existing code, as I expect most Pyrex/Cython code to depend on the above
> idiom for creating a Python object from a C type.
>
> Robert, please correct me, but IIUC, the above problem was the main
> reason not to enable type inference by default (except for some
> remaining bugs, but those will go away over time).
>
> However, type inference has many other virtues, e.g.
> for code like this:
>
>     l = []
>     for x in range(100):
>         item = do_some_complex_calculation(x)
>         l.append(item)
>
> If the type of 'l' was inferred to be a list in this code, "l.append"
> would get optimised into the obvious C-API call (potentially even
> without a None check), instead of generating additional fallback code
> for cases where l is not a list. The same applies to the various other
> optimisations for builtin types.
>
> Another use case is for extension types. Currently, when instantiating
> an extension type, the result variable needs to be typed if you want to
> access its C methods and fields later on. With type inference enabled,
> the following would work without additional typing:
>
>     cdef class MyExt:
>         cdef int i
>
>     x = MyExt()
>     print x.i
>
> And, for the case that 'i' was actually declared 'public' or
> 'readonly', the access would go through the C field instead of the
> Python descriptor.
>
> So my proposal is to enable type inference by default (after fixing the
> remaining bugs), but only for Python types (including extension types).
> That should break *very* little code, but would give a major
> improvement in terms of both usability and performance.
>
> The existing directive would then only switch between full inference
> when enabled (including C types) and no type inference at all (when
> disabled explicitly).
>
> Comments?
>

You might already have implied this, but I thought I'd point out that
this is more general than your example implies. When you say

    x = MyExt()

then the important characteristics are

1) MyExt is a callable with MyExt as return type
2) That callable is early-bound

So, for instance it's OK to type here:

    cdef MyExt factoryfunc(): ...
    x = factoryfunc()

Moving on to native types:

 - Python's floating point type is double, so it should be perfectly
   safe to do

       cdef double f(): ...
       x = f() # infer x as double

   By the same argument one might actually want to do

       cdef float f(): ...
       x = f() # infer x as *double* -- because that's what Python would use

 - For integers, I think we should have a directive "bigintegers" or
   similar. With it turned off, we blatantly assume that no integer
   computation will overflow ssize_t. We are then free to infer any
   integer type as ssize_t without losing semantics under this
   constraint; i.e.

       x = returns_short() # infer x as ssize_t
       x = returns_size_t() # hmm, not sure...probably no inference?

   This is particularly useful for loops, as looping variables can all
   be set to ssize_t without further ado.

 - Long term, control flow analysis can be used to tighten this up:

       x = f()
       call_function(x)
       # do not use x again
       # => safe to type x as return value of f(), regardless of type

Dag Sverre

From dagss at student.matnat.uio.no Thu Dec 3 10:52:09 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Dec 2009 10:52:09 +0100
Subject: [Cython] type inference - only for Python types?
In-Reply-To: <4B17896E.8030806@student.matnat.uio.no>
References: <4B177D21.8010600@behnel.de>
 <4B17896E.8030806@student.matnat.uio.no>
Message-ID: <4B178A49.1050507@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> Stefan Behnel wrote:
>> Hi,
>>
>> I mentioned this idea before, but I think it's worth its own thread.
>>
>> The main problem with the current type inference mechanism for
>> existing code is that the semantics of assignments to untyped names
>> change, i.e.
>>
>>     cdef int i = 5
>>     x = i
>>
>> will type 'x' as C int with type inference enabled. This will break
>> existing code, as I expect most Pyrex/Cython code to depend on the
>> above idiom for creating a Python object from a C type.
>>
>> Robert, please correct me, but IIUC, the above problem was the main
>> reason not to enable type inference by default (except for some
>> remaining bugs, but those will go away over time).
>>
>> However, type inference has many other virtues, e.g.
>> for code like this:
>>
>>     l = []
>>     for x in range(100):
>>         item = do_some_complex_calculation(x)
>>         l.append(item)
>>
>> If the type of 'l' was inferred to be a list in this code, "l.append"
>> would get optimised into the obvious C-API call (potentially even
>> without a None check), instead of generating additional fallback code
>> for cases where l is not a list. The same applies to the various
>> other optimisations for builtin types.
>>
>> Another use case is for extension types. Currently, when
>> instantiating an extension type, the result variable needs to be
>> typed if you want to access its C methods and fields later on. With
>> type inference enabled, the following would work without additional
>> typing:
>>
>>     cdef class MyExt:
>>         cdef int i
>>
>>     x = MyExt()
>>     print x.i
>>
>> And, for the case that 'i' was actually declared 'public' or
>> 'readonly', the access would go through the C field instead of the
>> Python descriptor.
>>
>> So my proposal is to enable type inference by default (after fixing
>> the remaining bugs), but only for Python types (including extension
>> types). That should break *very* little code, but would give a major
>> improvement in terms of both usability and performance.
>>
>> The existing directive would then only switch between full inference
>> when enabled (including C types) and no type inference at all (when
>> disabled explicitly).
>>
>> Comments?
>>
> You might already have implied this, but I thought I'd point out that
> this is more general than your example implies. When you say
>
>     x = MyExt()
>
> then the important characteristics are
>
> 1) MyExt is a callable with MyExt as return type
> 2) That callable is early-bound
>
> So, for instance it's OK to type here:
>
>     cdef MyExt factoryfunc(): ...
>     x = factoryfunc()
>
> Moving on to native types:
> - Python's floating point type is double, so it should be perfectly
>   safe to do
>
>     cdef double f(): ...
>     x = f() # infer x as double
>
> By the same argument one might actually want to do
>
>     cdef float f(): ...
>     x = f() # infer x as *double* -- because that's what Python would use
>
> - For integers, I think we should have a directive "bigintegers" or
>   similar. With it turned off, we blatantly assume that no integer
>   computation will overflow ssize_t. We are then free to infer any
>   integer type as ssize_t without losing semantics under this
>   constraint; i.e.
>
>     x = returns_short() # infer x as ssize_t
>     x = returns_size_t() # hmm, not sure...probably no inference?

BTW, this implies:

    #cython: bigintegers=False
    cdef short i = 0
    x = i # type x as ssize_t!

>
> This is particularly useful for loops, as looping variables can all
> be set to ssize_t without further ado.
>
> - Long term, control flow analysis can be used to tighten this up:
>
>     x = f()
>     call_function(x)
>     # do not use x again
>     # => safe to type x as return value of f(), regardless of type
>
> Dag Sverre
>

From stefan_ml at behnel.de Thu Dec 3 10:55:39 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 03 Dec 2009 10:55:39 +0100
Subject: [Cython] type inference - only for Python types?
In-Reply-To: <4B17896E.8030806@student.matnat.uio.no>
References: <4B177D21.8010600@behnel.de>
 <4B17896E.8030806@student.matnat.uio.no>
Message-ID: <4B178B1B.5030708@behnel.de>

Dag Sverre Seljebotn, 03.12.2009 10:48:
> This is particularly useful for loops, as looping variables can all be
> set to ssize_t without further ado.

This sounds a little too generic, so although I assume you're aware of
it, let me clarify that even loop variables would have to depend on such
a directive as this must work in the general case:

    for i in xrange(2**123456, 2**123456 + 2):
        print i

Stefan

From dagss at student.matnat.uio.no Thu Dec 3 11:13:22 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Dec 2009 11:13:22 +0100
Subject: [Cython] type inference - only for Python types?
In-Reply-To: <4B178B1B.5030708@behnel.de>
References: <4B177D21.8010600@behnel.de>
 <4B17896E.8030806@student.matnat.uio.no>
 <4B178B1B.5030708@behnel.de>
Message-ID: <4B178F42.20404@student.matnat.uio.no>

Stefan Behnel wrote:
> Dag Sverre Seljebotn, 03.12.2009 10:48:
>
>> This is particularly useful for loops, as looping variables can all
>> be set to ssize_t without further ado.
>>
>
> This sounds a little too generic, so although I assume you're aware of
> it, let me clarify that even loop variables would have to depend on
> such a directive as this must work in the general case:
>
>     for i in xrange(2**123456, 2**123456 + 2):
>         print i
>

Yep, that would violate setting the bigintegers directive. The directive
would basically say "operate as if the Python integer type was a
ssize_t", even for range and friends.

I guess the bigger point of my post was various user-friendly ways of
doing type inference which *don't* break backwards compatibility. I
think a typical user either

a) types all variables
b) or assumes no overflows occur (i.e. doesn't exploit overflow
   behaviour)

Trying to track which types variables would be assigned by a type
inference directive seems a very confusing way to work, hence I propose
bigintegers instead, which keeps backwards compatibility.

Dag Sverre

From stefan_ml at behnel.de Thu Dec 3 11:45:22 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 03 Dec 2009 11:45:22 +0100
Subject: [Cython] aliasing Python floats to C double
In-Reply-To: <4B178A49.1050507@student.matnat.uio.no>
References: <4B177D21.8010600@behnel.de>
 <4B17896E.8030806@student.matnat.uio.no>
 <4B178A49.1050507@student.matnat.uio.no>
Message-ID: <4B1796C2.4060005@behnel.de>

Dag Sverre Seljebotn, 03.12.2009 10:52:
>> - Python's floating point type is double, so it should be perfectly
>>   safe to do
>>
>>     cdef double f(): ...
>>     x = f() # infer x as double
>>
>> By the same argument one might actually want to do
>>
>>     cdef float f(): ...
>>     x = f() # infer x as *double* -- because that's what Python would use

That's actually an orthogonal feature: aliasing Python's float type to
C's double type in general, including support for methods etc. Since
'double' is immutable and has exactly the same value range as Python's
float type, this is something that can always be done safely. If I'm not
mistaken, even Python's operators behave exactly like the equivalent C
operators here, right? And numeric operations would always return a
double, so even operations on doubles that involve integers would run
completely in C space.

Apart from a couple of further optimisations for methods and properties,
I think enabling type inference here would give us a rather warmly
requested feature.

Stefan

From dagss at student.matnat.uio.no Thu Dec 3 12:45:41 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 03 Dec 2009 12:45:41 +0100
Subject: [Cython] aliasing Python floats to C double
In-Reply-To: <4B1796C2.4060005@behnel.de>
References: <4B177D21.8010600@behnel.de>
 <4B17896E.8030806@student.matnat.uio.no>
 <4B178A49.1050507@student.matnat.uio.no>
 <4B1796C2.4060005@behnel.de>
Message-ID: <4B17A4E5.3050706@student.matnat.uio.no>

Stefan Behnel wrote:
> Dag Sverre Seljebotn, 03.12.2009 10:52:
>
>>> - Python's floating point type is double, so it should be perfectly
>>>   safe to do
>>>
>>>     cdef double f(): ...
>>>     x = f() # infer x as double
>>>
>>> By the same argument one might actually want to do
>>>
>>>     cdef float f(): ...
>>>     x = f() # infer x as *double* -- because that's what Python would use
>>>
>
> That's actually an orthogonal feature: aliasing Python's float type to
> C's double type in general, including support for methods etc. Since
> 'double' is immutable and has exactly the same value range as Python's
> float type, this is something that can always be done safely. If I'm
> not mistaken, even Python's operators behave exactly like the
> equivalent C operators here, right?
And numeric operations would always return a double, so even > operations on doubles that involve integers would run completely in C space. > > Apart from a couple of further optimisations for methods and properties, I > think enabling type inference here would give us a rather warmly requested > feature. > This is now http://trac.cython.org/cython_trac/ticket/460. Dag Sverre From stefan_ml at behnel.de Thu Dec 3 13:51:38 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 03 Dec 2009 13:51:38 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B17A4E5.3050706@student.matnat.uio.no> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> Message-ID: <4B17B45A.6010000@behnel.de> Dag Sverre Seljebotn, 03.12.2009 12:45: >>>> cdef double f(): ... >>>> x = f() # infer x as double >> > This is now http://trac.cython.org/cython_trac/ticket/460. Note that I closed ticket 236, so you can now call a Python method on a corresponding C type, as in cdef int i = 1 print i.__add__(1) print i.bit_length() # Py3.1 It's implemented through coercion to a Python object. If we find an important use case where this isn't fast enough (after all, this is a very rare use case anyway), we can always add a specific handler in Optimize.py. As this didn't work at all before, I don't expect it to break code either (although I had to fix one test case that tested for the original error...) 
Stefan From dagss at student.matnat.uio.no Thu Dec 3 14:14:42 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 14:14:42 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B17B45A.6010000@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> Message-ID: <4B17B9C2.6050501@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn, 03.12.2009 12:45: > >>>>> cdef double f(): ... >>>>> x = f() # infer x as double >>>>> >>> >>> >> This is now http://trac.cython.org/cython_trac/ticket/460. >> > > Note that I closed ticket 236, so you can now call a Python method on a > corresponding C type, as in > > cdef int i = 1 > print i.__add__(1) > print i.bit_length() # Py3.1 > > It's implemented through coercion to a Python object. If we find an > important use case where this isn't fast enough (after all, this is a very > rare use case anyway), we can always add a specific handler in Optimize.py. > As this didn't work at all before, I don't expect it to break code either > (although I had to fix one test case that tested for the original error...) > Thanks! Heh, well, it certainly feels suboptimal to have "myfloat.imag", guaranteed to be 0, go through Python. Then again, who would ever call it? I'm perfectly fine with this. 
Dag Sverre From dalcinl at gmail.com Thu Dec 3 15:41:53 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 3 Dec 2009 11:41:53 -0300 Subject: [Cython] Another string encoding idea In-Reply-To: <4B176DCB.6030400@behnel.de> References: <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B11A503.10307@student.matnat.uio.no> <4B140B9D.8040701@noaa.gov> <4B14DA58.5080104@behnel.de> <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu> <4B176DCB.6030400@behnel.de> Message-ID: On Thu, Dec 3, 2009 at 4:50 AM, Stefan Behnel wrote: > > Robert Bradshaw, 03.12.2009 02:01: >> On Dec 1, 2009, at 12:56 AM, Stefan Behnel wrote: >>> We could have a "cython.str()" function that converts char*+length or >>> a char* buffer to bytes or unicode depending on the platform and using >>> either the platform encoding or a different one passed as argument. >>> So you'd return "cython.str(c_string, length)" (or "cython.str(s)" for >>> the example above) and be happy. >> >> That's a good idea, and should probably go in regardless of whatever >> else happens. > > Ok, so then we have three different cases for the char*->Python path: > > 1) create bytes - that's what currently happens automatically > > 2) create unicode - easy to do with "s.decode(enc)" > > 3) create str (i.e. bytes in Py2 and unicode in Py3) - easy to do with a > future "cython.str(s)" or "cython.str(s[:length])", optionally taking an > encoding as second argument and defaulting to the platform encoding otherwise. > And you forgot mapping NULL to None... > > I think all of these are easy enough to type and read. So isn't that all we > need for that direction? > For brand-new code, I think you are right. > > Or is it really the encoding name that you want to > keep users from typing? > I think Robert's concern is the HUGE amount of code that should have to be reviewed/modified in SAGE. 
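
For illustration, the char*->Python cases listed above (plus the NULL-to-None mapping) can be sketched in plain Python 3. This is only a model under stated assumptions: the helper name c_string_to_str is hypothetical and merely stands in for the proposed "cython.str()", which is not an existing Cython function; a char* is modelled as a bytes object, NULL as None, and the "platform encoding" is approximated by sys.getdefaultencoding().

```python
import sys


def c_string_to_str(raw, encoding=None):
    """Hypothetical stand-in for the proposed cython.str(); not a Cython API.

    raw models a char*: a bytes object, or None for a NULL pointer.
    """
    if raw is None:
        # Model of "mapping NULL to None".
        return None
    if encoding is None:
        # Assumption: approximate the platform encoding with the
        # interpreter's default codec.
        encoding = sys.getdefaultencoding()
    # On Python 3 the native str type is unicode, so cases 2) and 3)
    # from the list above coincide here.
    return raw.decode(encoding)


# Case 1) needs no helper: the bytes object is returned unchanged.
raw_bytes = b"abc"
# Cases 2)/3): decode with an explicit or a default encoding.
text_default = c_string_to_str(raw_bytes)
text_latin1 = c_string_to_str(b"caf\xe9", "latin-1")
missing = c_string_to_str(None)
```

The point of the sketch is only that each conversion is a one-liner once the encoding is known; whether the encoding should be spelled at every call site is the open question in this thread.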
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com Thu Dec 3 16:08:47 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 3 Dec 2009 12:08:47 -0300
Subject: [Cython] A typemap system for Cython (idea stolen from SWIG)
Message-ID:

Some time ago, I was a heavy and rather advanced SWIG user. Then I switched to Cython (except for a few legacy in-house codes), but there is a feature I still miss. As I dislike asking others (read: Dag, Stefan, Robert, i.e. the core devs) to think about or implement something I need but am not in a position to contribute, I never commented on this.

Suppose Cython could be enhanced by a mechanism where users can set up mappings C->Py and Py->C for any type using user-provided cdef functions or even external functions. Then we should be able to do:

    cdef current_encoding = "latin-1"

    cdef unicode charp2unicode(char* p):
        return p.decode(current_encoding)

and (somehow) add it to a mapping (new syntax likely required):

    C_2_Python['char*'] = charp2unicode

Of course, the other way is also possible and nice to have.

Implementing this idea would enable some niceties:

1) Robert will be able to decide on a module-by-module basis how to make the mappings char*<->str.

2) Projects that wrap other OO libs (like mpi4py and petsc4py, lxml?) could easily coerce Py instances (of cdef classes) to C handles, and the other way around.

3) ... (have you anything to add to this list?)

At some point in the future, when Danilo's C++ support is merged and some C/C++ header parser is available, Cython typemaps would enable AUTOMATIC wrapper generation, pretty much like SWIG does right now.

Just a wild idea.
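
The lookup side of such a mapping can be modelled at runtime in plain Python. This is a rough sketch only, not proposed Cython syntax: the real mechanism would be resolved by the compiler, a char* is modelled here as a bytes object, and the to_python helper is a hypothetical name introduced purely for illustration.

```python
# Runtime model of the C->Python typemap idea; nothing here is Cython API.
C_2_Python = {}

current_encoding = "latin-1"


def charp2unicode(p):
    # p models a char* buffer as a bytes object.
    return p.decode(current_encoding)


# Register the converter for the C type name, as in the proposal.
C_2_Python['char*'] = charp2unicode


def to_python(c_type, value):
    """Hypothetical helper: apply the registered converter, if any."""
    converter = C_2_Python.get(c_type)
    if converter is None:
        # No typemap registered: hand the value back unchanged.
        return value
    return converter(value)


converted = to_python('char*', b"caf\xe9")
untouched = to_python('int', 42)
```

Because the registry is keyed on the C type alone, two char* values that need different treatment cannot be told apart, which is the limitation a per-context or per-type mechanism would have to address.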
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From stefan_ml at behnel.de Thu Dec 3 16:40:55 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 03 Dec 2009 16:40:55 +0100
Subject: [Cython] A typemap system for Cython (idea stolen from SWIG)
In-Reply-To:
References:
Message-ID: <4B17DC07.7080507@behnel.de>

Hi Lisandro,

that idea isn't that wild at all. It also occurred to me while I was thinking about the string encoding stuff. However, there are certain drawbacks to this that do not make it easily suitable.

Lisandro Dalcin, 03.12.2009 16:08:
>     cdef current_encoding = "latin-1"
>
>     cdef unicode charp2unicode(char* p):
>         return p.decode(current_encoding)
>
> and (somehow) add a to mapping (new syntax likely required):
>
>     C_2_Python['char*'] = charp2unicode

Seeing this in action makes me think that a decorator would fit nicely here:

    @cython.typemapper(default=True)
    cdef inline unicode charp2unicode(char* p):
        return p.decode(current_encoding)

    @cython.typemapper
    cdef inline bytes charp2unicode(char* p):
        return p

The compiler could then collect all such type mappers, make sure that only one of them is declared as the default mapper for an input type, and then just call them to do the type conversion between the types in a given context.

Disadvantages:

1) The above works well for a global setup, but I'd expect there's a lot of code that requires different mappings depending on the context, at least for some types. (strings in lxml are certainly an example)

2) The above will not work for the unicode->char* case, for example, as there is no way to store a Python reference outside of the converter function scope. So this is limited to simple coercions that do not create new Python references.
Stefan From dagss at student.matnat.uio.no Thu Dec 3 16:48:26 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 16:48:26 +0100 Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) In-Reply-To: <4B17DC07.7080507@behnel.de> References: <4B17DC07.7080507@behnel.de> Message-ID: <4B17DDCA.4000303@student.matnat.uio.no> Stefan Behnel wrote: > Hi Lisandro, > > that idea isn't that wild at all. It also occurred to me while I was > thinking about the string encoding stuff. However, there are certain > drawbacks to this that do not make it easily suitable. > > Lisandro Dalcin, 03.12.2009 16:08: > >> cdef current_encoding = "latin-1" >> >> cdef unicode charp2unicode(char* p): >> return p.decode(current_encoding) >> >> and (somehow) add a to mapping (new syntax likely required): >> >> C_2_Python['char*'] = charp2unicode >> > > Seeing this in action makes me think that a decorator would fit nicely here: > > @cython.typemapper(default=True) > cdef inline unicode charp2unicode(char* p): > return p.decode(current_encoding) > > @cython.typemapper > cdef inline bytes charp2unicode(char* p): > return p > > The compiler could then collect all such type mappers, make sure that only > one of them is declared as the default mapper for an input type, and then > just call them to do the type conversion between the types in a given context. > > Disadvantages: > > 1) The above works well for a global setup, but I'd expect there's a lot of > code that requires different mappings depending on the context, at least > for some types. (strings in lxml are certainly an example) > > 2) The above will not work for the unicode->char* case, for example, as > there is no way to store a Python reference outside of the converter > function scope. So this is limited to simple coercions that do not create > new Python references. > There's also the possibility of "overloading conversion operators", similar to what you can do in C++. 
This binds it to type instead: cdef class utf8_charp: cdef char* ptr def __init__(self, char* ptr): self.ptr = ptr cdef char* __convert__(self): return self.ptr cdef object __convert__(self): return self.ptr.decode('utf-8') @classmethod cdef latin1_charp __coercefrom__(self, char* other): return latin1_charp(other) That might leave Sage with a global search&replace for char* for encoding issues... Dag Sverre From dagss at student.matnat.uio.no Thu Dec 3 16:52:35 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 16:52:35 +0100 Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) In-Reply-To: <4B17DDCA.4000303@student.matnat.uio.no> References: <4B17DC07.7080507@behnel.de> <4B17DDCA.4000303@student.matnat.uio.no> Message-ID: <4B17DEC3.5030807@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Hi Lisandro, >> >> that idea isn't that wild at all. It also occurred to me while I was >> thinking about the string encoding stuff. However, there are certain >> drawbacks to this that do not make it easily suitable. >> >> Lisandro Dalcin, 03.12.2009 16:08: >> >>> cdef current_encoding = "latin-1" >>> >>> cdef unicode charp2unicode(char* p): >>> return p.decode(current_encoding) >>> >>> and (somehow) add a to mapping (new syntax likely required): >>> >>> C_2_Python['char*'] = charp2unicode >>> >> >> Seeing this in action makes me think that a decorator would fit >> nicely here: >> >> @cython.typemapper(default=True) >> cdef inline unicode charp2unicode(char* p): >> return p.decode(current_encoding) >> >> @cython.typemapper >> cdef inline bytes charp2unicode(char* p): >> return p >> >> The compiler could then collect all such type mappers, make sure that >> only >> one of them is declared as the default mapper for an input type, and >> then >> just call them to do the type conversion between the types in a given >> context. 
>> >> Disadvantages: >> >> 1) The above works well for a global setup, but I'd expect there's a >> lot of >> code that requires different mappings depending on the context, at least >> for some types. (strings in lxml are certainly an example) >> >> 2) The above will not work for the unicode->char* case, for example, as >> there is no way to store a Python reference outside of the converter >> function scope. So this is limited to simple coercions that do not >> create >> new Python references. >> > There's also the possibility of "overloading conversion operators", > similar to what you can do in C++. This binds it to type instead: > > cdef class utf8_charp: > cdef char* ptr > > def __init__(self, char* ptr): > self.ptr = ptr > > cdef char* __convert__(self): > return self.ptr > cdef object __convert__(self): > return self.ptr.decode('utf-8') > > @classmethod > cdef latin1_charp __coercefrom__(self, char* other): > return latin1_charp(other) This should be utf8_charp everywhere, sorry about that. I'm less than happy with the signature of __coercefrom__, but a very slightly improvement: @staticmethod cdef utf8_charp __convertfrom__(char* other): ... Just brainstorming though. To loose the object construction overhead one could also allow this for structs, which would thus tend to simply decorate a struct containing only a char* with custom conversion rules. But now I'm really reinventing C++...not that that must be a bad thing... Dag Sverre From robertwb at math.washington.edu Thu Dec 3 18:03:52 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 3 Dec 2009 09:03:52 -0800 Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) In-Reply-To: <4B17DDCA.4000303@student.matnat.uio.no> References: <4B17DC07.7080507@behnel.de> <4B17DDCA.4000303@student.matnat.uio.no> Message-ID: On Dec 3, 2009, at 7:48 AM, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Hi Lisandro, >> >> that idea isn't that wild at all. 
It also occurred to me while I was >> thinking about the string encoding stuff. However, there are certain >> drawbacks to this that do not make it easily suitable. >> >> Lisandro Dalcin, 03.12.2009 16:08: >> >>> cdef current_encoding = "latin-1" >>> >>> cdef unicode charp2unicode(char* p): >>> return p.decode(current_encoding) >>> >>> and (somehow) add a to mapping (new syntax likely required): >>> >>> C_2_Python['char*'] = charp2unicode >>> >> >> Seeing this in action makes me think that a decorator would fit >> nicely here: >> >> @cython.typemapper(default=True) >> cdef inline unicode charp2unicode(char* p): >> return p.decode(current_encoding) >> >> @cython.typemapper >> cdef inline bytes charp2unicode(char* p): >> return p >> >> The compiler could then collect all such type mappers, make sure >> that only >> one of them is declared as the default mapper for an input type, >> and then >> just call them to do the type conversion between the types in a >> given context. >> >> Disadvantages: >> >> 1) The above works well for a global setup, but I'd expect there's >> a lot of >> code that requires different mappings depending on the context, at >> least >> for some types. (strings in lxml are certainly an example) Perhaps we could come up with a way of using them in conjunction with a with statement, which could take care of locality. >> 2) The above will not work for the unicode->char* case, for >> example, as >> there is no way to store a Python reference outside of the converter >> function scope. So this is limited to simple coercions that do not >> create >> new Python references. >> > There's also the possibility of "overloading conversion operators", > similar to what you can do in C++. This binds it to type instead I like the idea of binding the conversion as part of the type declaration as well, though I don't have a ready syntax (though the above decorator is nice). 
- Robert From robertwb at math.washington.edu Thu Dec 3 18:10:55 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 3 Dec 2009 09:10:55 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B17B45A.6010000@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> Message-ID: On Dec 3, 2009, at 4:51 AM, Stefan Behnel wrote: > > Dag Sverre Seljebotn, 03.12.2009 12:45: >>>>> cdef double f(): ... >>>>> x = f() # infer x as double >>> >> This is now http://trac.cython.org/cython_trac/ticket/460. > > Note that I closed ticket 236, so you can now call a Python method > on a > corresponding C type, as in > > cdef int i = 1 > print i.__add__(1) > print i.bit_length() # Py3.1 > > It's implemented through coercion to a Python object. If we find an > important use case where this isn't fast enough (after all, this is > a very > rare use case anyway), we can always add a specific handler in > Optimize.py. > As this didn't work at all before, I don't expect it to break code > either > (although I had to fix one test case that tested for the original > error...) Nice. With that, I can't see any place that inference of doubles wouldn't be safe either, and it would be very convenient. - Robert From robertwb at math.washington.edu Thu Dec 3 18:15:39 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 3 Dec 2009 09:15:39 -0800 Subject: [Cython] type inference - only for Python types? In-Reply-To: <4B177D21.8010600@behnel.de> References: <4B177D21.8010600@behnel.de> Message-ID: <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> On Dec 3, 2009, at 12:56 AM, Stefan Behnel wrote: > Hi, > > I mentioned this idea before, but I think it's worth it's own thread. 
> > The main problem with the current type inference mechanism for > existing > code is that the semantics of assignments to untyped names change, > i.e. > > cdef int i = 5 > x = i > > will type 'x' as C int with type interence enabled. This will break > existing code, as I expect most Pyrex/Cython code to depend on the > above > idiom for creating a Python object from a C type. > > Robert, please correct me, but IIUC, the above problem was the main > reason > not to enable type inference by default (except for some remaining > bugs, > but those will go away over time). Yep. > However, type inference has many other virtues, e.g. for code like > this: > > l = [] > for x in range(100): > item = do_some_complex_calculation(x) > l.append(item) > > If the type of 'l' was inferred to be a list in this code, > "l.append" would > get optimised into the obvious C-API call (potentially even without > a None > check), instead of generating additional fallback code for cases > where l is > not a list. The same applies to the various other optimisations for > builtin > types. > > Another use case is for extension types. Currently, when > instantiating an > extension type, the result variable needs to be typed if you want to > access > its C methods and fields later on. With type inference enabled, the > following would work without additional typing: > > cdef class MyExt: > cdef int i > > x = MyExt() > print x.i > > And, for the case that 'i' was actually declared 'public' or > 'readonly', > the access would go though the C field instead of the Python > descriptor. > > So my proposal is to enable type inference by default (after fixing > the > remaining bugs), but only for Python types (including extension > types). > That should break *very* little code, but would give a major > improvement in > terms of both usability and performance. 
> > The existing directive would then only switch between full inference > when > enabled (including C types) and no type inference at all (when > disabled > explicitly). > > Comments? Sounds like it would be safe (and very useful) to me. The only thing that can happen on assignment between object types is a type check, and the type inference engine infers based on the set of assignments, so I don't see how it could go wrong. (There are corner cases where the behavior would change, but hopefully people aren't *counting* on cdef class MyExt: cdef int i x = MyExt() print x.i raising an AttributeError...) > - Robert From Chris.Barker at noaa.gov Thu Dec 3 19:10:16 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 03 Dec 2009 10:10:16 -0800 Subject: [Cython] Another string encoding idea In-Reply-To: References: <4B10C890.6080106@behnel.de> <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B11A503.10307@student.matnat.uio.no> <4B140B9D.8040701@noaa.gov> <4B156817.4060801@noaa.gov> Message-ID: <4B17FF08.3060508@noaa.gov> Robert Bradshaw wrote: > On Dec 1, 2009, at 11:01 AM, Christopher Barker wrote: >>>> we need a Cython type: >>>> >>>> ANSI_string (not that that's what it should be called) >> the idea is that this would be used >> explicitly for "text", not data -- so the user would not want to get >> bytes back. > Yes, this is the case that I'm thinking of as well. I wasn't seeing > how a new type would fix the > > cdef char* s = ... > return do you mean "return s"? but anyway -- the point is that when you do: cdef char* s you've declared a char* only -- and that could be anything, so you're right, the only sane thing to return to python would be a bytes object. Maybe I'm confused about what's possible (I've only done a tiny bit of "real work" with Cython), but what I had in mid was that you'd do: cdef ansi_str s = ... 
and s would be a char*, but you'd have told Cython that you are using it as an ANSI string, and thus:

    return s

would return a Python unicode object using an ANSI encoding -- I'm not sure how you'd tell Cython what encoding to use, though.

Lisandro Dalcin wrote:
> Suppose Cython could be enhanced by a mechanism where users can setup
> mappings C->Py and Py->C for any type using a user-provided cdef
> functions or even external functions. Then we should be able to do:
>
>     cdef current_encoding = "latin-1"

Maybe something like this would solve it -- I do like the general idea.

> At some point in the future when Danilo's C++ support is merged, and
> some C/C++ header parser is available, Cython typemaps would enable
> AUTOMATIC wrapper generation, pretty much like SWIG does right now.

And that would be fabulous!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From robertwb at math.washington.edu Thu Dec 3 19:25:12 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 3 Dec 2009 10:25:12 -0800
Subject: [Cython] Another string encoding idea
In-Reply-To:
References: <4B112FF7.40006@student.matnat.uio.no> <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B11A503.10307@student.matnat.uio.no> <4B140B9D.8040701@noaa.gov> <4B14DA58.5080104@behnel.de> <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu> <4B176DCB.6030400@behnel.de>
Message-ID: <4FD386D2-6BD3-4DE5-A791-0CA573B0AA10@math.washington.edu>

On Dec 3, 2009, at 6:41 AM, Lisandro Dalcin wrote:
> On Thu, Dec 3, 2009 at 4:50 AM, Stefan Behnel wrote:
>>
>> Robert Bradshaw, 03.12.2009 02:01:
>>> On Dec 1, 2009, at 12:56 AM, Stefan Behnel wrote:
>>>> We could have a "cython.str()" function that converts char*+length or
>>>> a char* buffer to bytes or unicode depending on the platform and using
>>>> either the platform encoding or a different one passed as argument. >>>> So you'd return "cython.str(c_string, length)" (or >>>> "cython.str(s)" for >>>> the example above) and be happy. >>> >>> That's a good idea, and should probably go in regardless of whatever >>> else happens. >> >> Ok, so then we have three different cases for the char*->Python path: >> >> 1) create bytes - that's what currently happens automatically >> >> 2) create unicode - easy to do with "s.decode(enc)" >> >> 3) create str (i.e. bytes in Py2 and unicode in Py3) - easy to do >> with a >> future "cython.str(s)" or "cython.str(s[:length])", optionally >> taking an >> encoding as second argument and defaulting to the platform encoding >> otherwise. Yep. > > And you forgot mapping NULL to None... Oh... that's a nice thought as well. > >> >> I think all of these are easy enough to type and read. So isn't >> that all we >> need for that direction? >> > > For brand-new code, I think you are right. > >> Or is it really the encoding name that you want to >> keep users from typing? Not at all. I'm up for a default, but that's not my issue. The motivating issue is that the user has to manually do something /every time/ a char* is converted. I would guess in most (almost all) applications one wants the same behavior throughout a whole module. > I think Robert's concern is the HUGE amount of code that should have > to be reviewed/modified in SAGE. Yes, that is a big concern, though of course this would be a one-time patch that someone could just sit down for X hours and do. Of course then new code (not written or refereed by me) would go in, leak bytes objects when that probably wasn't intended when we finally migrate to Py3 (we depend on a lot of upstream projects doing so first), and down the road get reported as a bug and (finally) be corrected. 
One could argue that one should train all developers, and add making sure things are handling char* conversions correctly as part of the referee process, etc., but there's a huge number of things for new developers to learn and get familiar with already. Also, I strongly believe the maxim that changing a system is vastly easier than changing (and maintaining) human behavior. Even if everyone did this, in our case it would be 98% busywork for our project; I'd rather developers and referees spend their limited time thinking about more relevant things. I would guess many other projects are in the same boat, and will be surprised when they try to run their code under Py3 and all of a sudden bytes objects are returned all over.

I see a (weak) analogy to memory management. For some use cases, manually managing memory is important. For others, it's unneeded bookkeeping that the developer would be better off not having to think about at every step.

- Robert

From anders at embl.de Thu Dec 3 20:52:17 2009
From: anders at embl.de (Simon Anders)
Date: Thu, 03 Dec 2009 20:52:17 +0100
Subject: [Cython] A typemap system for Cython (idea stolen from SWIG)
In-Reply-To:
References: <4B17DC07.7080507@behnel.de> <4B17DDCA.4000303@student.matnat.uio.no>
Message-ID:

Hi,

typemaps sound very useful to me. However, what always bothers me in SWIG's implementation is that the map is associated with a C type. So, if you have, say, two 'char *' arguments in a function to be wrapped that need to be converted differently, you need some strange argument name pattern matching to tell SWIG which conversion to use where.

In Cython, we are free to define new types. What about an extension to the 'ctypedef' syntax similar to the property syntax?

As an example, let's say that in our C code, "char *" is sometimes a unicode string and sometimes an array of bytes. Then we might introduce

    ctypedef char* unicode with:
        cdef char* __py2c__( str a ):
            ...
        cdef str __c2py__( char* ):
            ...
ctypedef char * byte_array with: cdef char* __py2c__( list a ): ... cdef list __c2py__( char * ): ... If one then has an external C function void foo( char* aByteArray, char* aUnicodeString ); one might declare it in Cython as cdef extern void foo( byte_array aByteArray, unicode aUnicodeString ) and it will appear to calling Python code as accepting a list and a string, which are then converted by the two '__py2c__' functions. Of course, if a type only appears as parameter type or only as return type, one needs to declare only one of __py2c__ or __c2py__ (similar to SWIG's 'in' and 'out' typemaps). (BTW, as this is my first post to this list: Many thanks for Cython, it is an incredibly useful tool!) Cheers Simon +--- | Dr. Simon Anders, Dipl.-Phys. | European Molecular Biology Laboratory (EMBL), Heidelberg | office phone +49-6221-387-8632 | preferred (permanent) e-mail: sanders at fs.tum.de From stefan_ml at behnel.de Thu Dec 3 21:05:08 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 03 Dec 2009 21:05:08 +0100 Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) In-Reply-To: References: <4B17DC07.7080507@behnel.de> <4B17DDCA.4000303@student.matnat.uio.no> Message-ID: <4B1819F4.1030801@behnel.de> Hi, Simon Anders, 03.12.2009 20:52: > typemaps sound very useful to me. However, what always bothers me in > SWIG's implementation is that the map is associated with a C type. So, if > you have, say, two 'char *' arguments in a function to be wrapped that need > to be converted differently you need some strange argument name pattern > matching to tell SWIG which conversion to use where. > > In Cython, we are free to define new types. What about an extension to the > 'ctypedef' syntax similar to the property syntax? > > As an example, let's say that in our C code, "char *" is sometimes a > unicode string and sometimes an array of bytes. Then we might introduce > > ctypedef char* unicode with: > cdef char* __py2c__( str a ): > ... 
> cdef str __c2py__( char* ): > ... > > ctypedef char * byte_array with: > cdef char* __py2c__( list a ): > ... > cdef list __c2py__( char * ): > ... Makes sense to me. It would also allow you to define more than one mapper 'method', i.e. you could have ctypedef char * byte_array: cdef char* __py2c__( list a ): ... cdef char* __py2c__( bytes a ): ... cdef char* __py2c__( unsigned char* a ): ... cdef char* __py2c__( unicode a ): ... etc. (note that I stripped the 'with' at the end of the initial line, I think it's not needed). > Of course, if a type only appears as parameter type or only as return > type, one needs to declare only one of __py2c__ or __c2py__ (similar to > SWIG's 'in' and 'out' typemaps). That would be fine, sure. Although we might want to provide a way to say "please disallow any coercion from type X". Stefan From dalcinl at gmail.com Thu Dec 3 22:45:09 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 3 Dec 2009 18:45:09 -0300 Subject: [Cython] Another string encoding idea In-Reply-To: <4FD386D2-6BD3-4DE5-A791-0CA573B0AA10@math.washington.edu> References: <3B4EEB84-B85C-4801-878A-50AC793EA1D1@math.washington.edu> <4B11A503.10307@student.matnat.uio.no> <4B140B9D.8040701@noaa.gov> <4B14DA58.5080104@behnel.de> <3EB6BED2-C590-4BA6-AAB9-3734B02D4452@math.washington.edu> <4B176DCB.6030400@behnel.de> <4FD386D2-6BD3-4DE5-A791-0CA573B0AA10@math.washington.edu> Message-ID: On Thu, Dec 3, 2009 at 3:25 PM, Robert Bradshaw wrote: > > Not at all. I'm up for a default, but that's not my issue. The > motivating issue is that the user has to manually do something /every > time/ a char* is converted. I would guess in most (almost all) > applications one wants the same behavior throughout a whole module. > BTW, do you expect that these /every time/ would actually be /many times/ ?? In my projects, strings appear in a rather small percent of the C calls I have to use/wrap. 
I would love to ask Cython to generate a warning every time that a char*<->object coercion is implicitly done (I mean, done without an explicit cast or decode/encode method calls). If Cython could be instructed to point me to these locations, then I'll have a chance to eliminate bugs (like unintentionally returning bytes instead of str/unicode) and perhaps think a bit more about whether char* means data or text at every place the coercion is done. Such an approach would provide a "better safe than sorry" alternative/complement to Robert's proposal. I think this should be more or less trivial to implement, right? Then Robert could try to run Cython on the whole of Sage and report the outcome. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Dec 4 02:42:23 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 3 Dec 2009 22:42:23 -0300 Subject: [Cython] Why 'autotestdict' is on by default? Message-ID: Am I missing something? Or was this an oversight (for 0.12 final)? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Dec 4 04:31:45 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 04 Dec 2009 04:31:45 +0100 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: References: Message-ID: <4B1882A1.10701@behnel.de> Lisandro Dalcin, 04.12.2009 02:42: > Am I missing something? Or was this an oversight (for 0.12 final)?
Hmm, not sure if it *should* be on by default, but does it break anything? Stefan From dalcinl at gmail.com Fri Dec 4 04:52:29 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Dec 2009 00:52:29 -0300 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: <4B1882A1.10701@behnel.de> References: <4B1882A1.10701@behnel.de> Message-ID: On Fri, Dec 4, 2009 at 12:31 AM, Stefan Behnel wrote: > > Lisandro Dalcin, 04.12.2009 02:42: >> Am I missing something? Or was this an oversight (for 0.12 final)? > > Hmm, not sure if it *should* be on by default, but does it break anything? > It does not break my code, but for some reason I do not understand, one of the extension modules in mpi4py has a __test__ attribute and the other does not. In the module that has it, the value is an empty dict. I still cannot figure out why one module has __test__, and the other does not. Could it be that the ABSENCE of class&method docstrings would trigger the setattr of __test__ in the module namespace? I'm really lost. Still, I think Cython should not add stuff in module namespaces unless explicitly asked for... -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Dec 4 07:06:59 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 04 Dec 2009 07:06:59 +0100 Subject: [Cython] type inference - only for Python types?
In-Reply-To: <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> References: <4B177D21.8010600@behnel.de> <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> Message-ID: <4B18A703.5000504@behnel.de> Robert Bradshaw, 03.12.2009 18:15: > On Dec 3, 2009, at 12:56 AM, Stefan Behnel wrote: >> So my proposal is to enable type inference by default (after fixing >> the remaining bugs), but only for Python types (including extension >> types). > > Sounds like it would be safe (and very useful) to me. The only thing > that can happen on assignment between object types is a type check, > and the type inference engine infers based on the set of assignments, > so I don't see how it could go wrong. Implemented in http://hg.cython.org/cython-devel/rev/c0e5e7195070 I noticed that type inference wasn't available for type constructors. It is now: http://hg.cython.org/cython-devel/rev/ba5ae565b3d2 Robert, I didn't quite understand the comment in NameNode.infer_types(). Is there a reason why this wasn't enabled in the first place? Enabling the 'safe' mode breaks one (1!) test file in the test suite, typedfieldbug_T303. It was actually depending on the fact that MyType() returns a generic object, not something that is known to be an instance of MyType (with known readonly fields that are accessible at C level). This is really unlikely to break code IMHO, but we'll have to check our own test suite to make sure we didn't accidentally send some tests into void space by enabling some unexpected optimisations for them. Still, I left the default setting for the infer_types option as 'none' until we can be sure it works for all nodes. SliceIndexNode is still pending (which makes me wonder why this didn't show up in the test suite...). Once that's fixed, I'd vote for setting the option to 'safe' by default. Would that go in for 0.12.1 or should we wait for 0.13? (Maybe we should give Sage a build before answering that ...) 
Stefan From jonovik at gmail.com Fri Dec 4 08:41:49 2009 From: jonovik at gmail.com (Jon Olav Vik) Date: Fri, 4 Dec 2009 07:41:49 +0000 (UTC) Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) References: <4B17DC07.7080507@behnel.de> Message-ID: Stefan Behnel writes: > Seeing this in action makes me think that a decorator would fit nicely here: > > @cython.typemapper(default=True) > cdef inline unicode charp2unicode(char* p): > return p.decode(current_encoding) > > @cython.typemapper > cdef inline bytes charp2unicode(char* p): > return p > > The compiler could then collect all such type mappers, make sure that only > one of them is declared as the default mapper for an input type, and then > just call them to do the type conversion between the types in a given context. > > Disadvantages: > > 1) The above works well for a global setup, but I'd expect there's a lot of > code that requires different mappings depending on the context, at least > for some types. (strings in lxml are certainly an example) > > 2) The above will not work for the unicode->char* case, for example, as > there is no way to store a Python reference outside of the converter > function scope. So this is limited to simple coercions that do not create > new Python references. Disclaimers: * Sorry if I misunderstood your argument. * I'm completely ignorant about the inner workings of Cython, * and of C, * and of unicode vs char*... 8-) In Python, however, decorators can be classes, and thus have any amount of internal state and trickery -- including, I'd guess, creating and storing references. See the "memoization" example here: http://avinashv.net/2008/04/python-decorators-syntactic-sugar/ Best regards, Jon Olav From robertwb at math.washington.edu Fri Dec 4 09:59:53 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 4 Dec 2009 00:59:53 -0800 Subject: [Cython] type inference - only for Python types? 
In-Reply-To: <4B18A703.5000504@behnel.de> References: <4B177D21.8010600@behnel.de> <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> <4B18A703.5000504@behnel.de> Message-ID: <32A85A41-DB59-42CA-985D-9501B3BCF0FA@math.washington.edu> On Dec 3, 2009, at 10:06 PM, Stefan Behnel wrote: > > Robert Bradshaw, 03.12.2009 18:15: >> On Dec 3, 2009, at 12:56 AM, Stefan Behnel wrote: >>> So my proposal is to enable type inference by default (after fixing >>> the remaining bugs), but only for Python types (including extension >>> types). >> >> Sounds like it would be safe (and very useful) to me. The only thing >> that can happen on assignment between object types is a type check, >> and the type inference engine infers based on the set of assignments, >> so I don't see how it could go wrong. > > Implemented in > > http://hg.cython.org/cython-devel/rev/c0e5e7195070 Excellent. > I noticed that type inference wasn't available for type > constructors. It is > now: > > http://hg.cython.org/cython-devel/rev/ba5ae565b3d2 > > Robert, I didn't quite understand the comment in > NameNode.infer_types(). Is > there a reason why this wasn't enabled in the first place? The problem is when I write x = list The type of x is not list, it's type. This should be changed back. > Enabling the 'safe' mode breaks one (1!) test file in the test suite, > typedfieldbug_T303. It was actually depending on the fact that > MyType() > returns a generic object, not something that is known to be an > instance of > MyType (with known readonly fields that are accessible at C level). > > This is really unlikely to break code IMHO, but we'll have to check > our own > test suite to make sure we didn't accidentally send some tests into > void > space by enabling some unexpected optimisations for them. > > Still, I left the default setting for the infer_types option as 'none' > until we can be sure it works for all nodes. 
SliceIndexNode is still > pending (which makes me wonder why this didn't show up in the test > suite...). Once that's fixed, I'd vote for setting the option to > 'safe' by > default. Would that go in for 0.12.1 or should we wait for 0.13? > (Maybe we > should give Sage a build before answering that ...) Sounds sensible, I like it. I don't have any strong opinions on what version it should go into, but I'd lean towards 0.13 (which should not be as long as a wait as 0.12 was). I'd like to get at least some of the C++ code into 0.13 as well. - Robert From dagss at student.matnat.uio.no Fri Dec 4 11:54:02 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Dec 2009 11:54:02 +0100 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: References: <4B1882A1.10701@behnel.de> Message-ID: <4B18EA4A.1070107@student.matnat.uio.no> Lisandro Dalcin wrote: > On Fri, Dec 4, 2009 at 12:31 AM, Stefan Behnel wrote: >> Lisandro Dalcin, 04.12.2009 02:42: >>> Am I missing something? Or was this an oversight (for 0.12 final)? >> Hmm, not sure if it *should* be on by default, but does it break anything? >> > > It do not break my code, but some reason I do not understand one of > the extension modules in mpi4py have and the other does not have a > __test__ attribute. In the module that have it, the value is a empty > dict. > > I still cannot figure out why one module has __test__, and the other > does not. Could it be that the ABSENCE of class&method docstrings > would trigger the setattr of __test__ in the module namespace? I'm > really lost. > > Still, I think Cython should not add stuff in module namespaces unless > explicitly asked for... Well, it was discussed, and it seemed like there was no opposition at the time to leave it on by default -- so that was intentional. It was by principle of least surprise -- people would usually expect "doctest.testmod(mycythonmodule)" to just work, and it doesn't without this. 
Myself I've wasted at least an hour in total just by forgetting that Cython modules couldn't be doctested :-) Also it's already in use in the Cython test suite. Robert moved a lot of the docstrings into the functions (and the test suite now looks a lot better). If we turn it off by default, we have to turn it on in every test case in our test suite...(or in runtests.py, but that's rather obscure, -1 on that). I'll try to remember to look into the strange behaviour later. -- Dag Sverre From dalcinl at gmail.com Fri Dec 4 14:09:56 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Dec 2009 10:09:56 -0300 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: <4B18EA4A.1070107@student.matnat.uio.no> References: <4B1882A1.10701@behnel.de> <4B18EA4A.1070107@student.matnat.uio.no> Message-ID: On Fri, Dec 4, 2009 at 7:54 AM, Dag Sverre Seljebotn wrote: > > It was by principle of least surprise -- people would usually expect > "doctest.testmod(mycythonmodule)" to just work, and it doesn't without > this. Myself I've wasted at least an hour in total just by forgetting > that Cython modules couldn't be doctested :-) > OK. You now convinced me. After all, it is really easy to turn this off, and I agree that it is VERY unlikely that this could break something. > > I'll try to remember to look into the strange behaviour later. > Thanks. 
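As an illustration of why a module-level `__test__` dict matters for `doctest.testmod()`, here is a minimal pure-Python sketch. It is not Cython's generated code: the module name `fake_ext` and the dict key `addition` are made up for the example, and the module is built by hand as a stand-in for a compiled extension module.

```python
import doctest
import types

# Hypothetical stand-in for a compiled extension module (name is made up).
fake_ext = types.ModuleType("fake_ext")

# doctest.testmod() also collects tests from a module-level __test__ dict,
# which maps names to docstrings (or to objects that carry docstrings).
fake_ext.__test__ = {
    "addition": """
    >>> 1 + 1
    2
    """,
}

# Run every doctest that doctest can find in the module.
results = doctest.testmod(fake_ext)
print(results)
```

With an empty `__test__` dict, as in the mpi4py case described in this thread, `testmod()` would simply find nothing to run through that route.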
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Dec 4 14:26:32 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 04 Dec 2009 14:26:32 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B17A4E5.3050706@student.matnat.uio.no> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> Message-ID: <4B190E08.6050104@behnel.de> Dag Sverre Seljebotn, 03.12.2009 12:45: > Stefan Behnel wrote: >> Dag Sverre Seljebotn, 03.12.2009 10:52: >> >>>> - Python's floating point type is double, so it should be perfectly >>>> safe to do >>>> >>>> cdef double f(): ... >>>> x = f() # infer x as double >>>> >>>> By the same argument one might actually want to do >>>> >>>> cdef float f(): ... >>>> x = f() # infer x as *double* -- because that's what Python would use >>>> >> That's actually an orthogonal feature: aliasing Python's float type to C's >> double type in general, including support for methods etc. Since 'double' >> is immutable and has exactly the same value range as Python's float type, >> this is something that can always be done safely. If I'm not mistaken, even >> Python's operators behave exactly like the equivalent C operators here, >> right? And numeric operations would always return a double, so even >> operations on doubles that involve integers would run completely in C space. >> >> Apart from a couple of further optimisations for methods and properties, I >> think enabling type inference here would give us a rather warmly requested >> feature.
>> > This is now http://trac.cython.org/cython_trac/ticket/460. Rethinking this some more, there are a couple of problems with this, e.g. overflow errors: >>> 2.0**1000000 Traceback (most recent call last): OverflowError: (34, 'Numerical result out of range') Moving this calculation to C space means we have to catch the error and raise an exception for it. Then, while simplistic code like this will work beautifully: d = 1.0 for i in range(1000): d += 1.0 print d most code won't currently benefit from enabling type inference for doubles, as it's rather unlikely that all operations done on a variable only use other (implicitly/explicitly) typed names as operands. And as long as there is one assignment with unknown types on the rhs, the name will be typed as Python object (which is required since the value could be anything, e.g. some kind of vector object). So we clearly have to do more here to make this useful. Stefan From stefan_ml at behnel.de Fri Dec 4 14:52:05 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 04 Dec 2009 14:52:05 +0100 Subject: [Cython] type inference - only for Python types? In-Reply-To: <32A85A41-DB59-42CA-985D-9501B3BCF0FA@math.washington.edu> References: <4B177D21.8010600@behnel.de> <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> <4B18A703.5000504@behnel.de> <32A85A41-DB59-42CA-985D-9501B3BCF0FA@math.washington.edu> Message-ID: <4B191405.4080309@behnel.de> Robert Bradshaw, 04.12.2009 09:59: > The problem is when I write > > x = list > > The type of x is not list, it's type. This should be changed back. Done. > I don't have any strong opinions on what > version it should go into, but I'd lean towards 0.13 (which should not > be as long as a wait as 0.12 was). I'd like to get at least some of > the C++ code into 0.13 as well. For the records, I fixed another bug regarding type inference for extension types (ticket #461), and now lxml tests perfectly with safe type inference enabled. 
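Robert's `x = list` remark can be checked in plain Python. The sketch below only shows the runtime types involved and says nothing about how Cython's inference is implemented; the variable names are illustrative.

```python
# Assigning the type object itself: x refers to the class `list`,
# so the type of x is `type`, not `list`.
x = list
assert type(x) is type

# Calling the constructor is the case where inferring `list` is sound.
y = list()
assert type(y) is list
```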
Stefan From dagss at student.matnat.uio.no Fri Dec 4 18:57:08 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Dec 2009 18:57:08 +0100 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: References: <4B1882A1.10701@behnel.de> <4B18EA4A.1070107@student.matnat.uio.no> Message-ID: <4B194D74.6080609@student.matnat.uio.no> Lisandro Dalcin wrote: > On Fri, Dec 4, 2009 at 7:54 AM, Dag Sverre Seljebotn > wrote: >> It was by principle of least surprise -- people would usually expect >> "doctest.testmod(mycythonmodule)" to just work, and it doesn't without >> this. Myself I've wasted at least an hour in total just by forgetting >> that Cython modules couldn't be doctested :-) >> > > OK. You now convinced me. After all, it is really easy to turn this > off, and I agree that it is VERY unlikely that this could break > something. > >> I'll try to remember to look into the strange behaviour later. >> Of course, if you file a ticket, I'm more likely to remember (don't have time myself just now, sorry). -- Dag Sverre From dalcinl at gmail.com Fri Dec 4 19:19:01 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Dec 2009 15:19:01 -0300 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: <4B194D74.6080609@student.matnat.uio.no> References: <4B1882A1.10701@behnel.de> <4B18EA4A.1070107@student.matnat.uio.no> <4B194D74.6080609@student.matnat.uio.no> Message-ID: On Fri, Dec 4, 2009 at 2:57 PM, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: >> On Fri, Dec 4, 2009 at 7:54 AM, Dag Sverre Seljebotn >> wrote: >>> It was by principle of least surprise -- people would usually expect >>> "doctest.testmod(mycythonmodule)" to just work, and it doesn't without >>> this. Myself I've wasted at least an hour in total just by forgetting >>> that Cython modules couldn't be doctested :-) >>> >> >> OK. You now convinced me. 
After all, it is really easy to turn this >> off, and I agree that it is VERY unlikely that this could break >> something. >> >>> I'll try to remember to look into the strange behaviour later. >>> > > Of course, if you file a ticket, I'm more likely to remember (don't have > time myself just now, sorry). > That's my problem, too :-( -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sat Dec 5 01:26:47 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Dec 2009 21:26:47 -0300 Subject: [Cython] Fwd: classmethod-related changes for cls argument In-Reply-To: References: Message-ID: Here is a mail I sent a few days ago to Stefan... ---------- Forwarded message ---------- From: Lisandro Dalcin Date: Wed, Dec 2, 2009 at 11:40 PM Subject: classmethod-related changes for cls argument To: Stefan Behnel Stefan, I'm very busy with personal stuff, so no time at all to look into this. Below is the output from an interactive IPython session. Perhaps you forgot to update some code counting function arguments (I bet such code was discounting self_arg, but you should also discount type_arg?) In [1]: from mpi4py import MPI In [2]: MPI.Comm.Compare? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: Comm.Compare(type cls, Comm comm1, Comm comm2) Compare two communicators In [3]: MPI.Comm.Compare(MPI.COMM_WORLD, MPI.COMM_WORLD) --------------------------------------------------------------------------- TypeError
Traceback (most recent call last) /u/dalcinl/Devel/mpi4py-dev/ in () /u/dalcinl/lib64/python/mpi4py/MPI.so in mpi4py.MPI.Comm.Compare (src/mpi4py.MPI.c:46815)() TypeError: Compare() takes at least 3 positional arguments (2 given) In [4]: The patch below fixes my issues, but I'm not sure if this is the definitive solution ... diff -r f6efe60d7a31 Cython/Compiler/Nodes.py --- a/Cython/Compiler/Nodes.py Fri Dec 04 15:23:23 2009 +0100 +++ b/Cython/Compiler/Nodes.py Fri Dec 04 21:25:33 2009 -0300 @@ -2075,7 +2075,7 @@ argtuple_error_label = code.new_label("argtuple_error") min_positional_args = self.num_required_args - self.num_required_kw_args - if len(self.args) > 0 and self.args[0].is_self_arg: + if len(self.args) > 0 and (self.args[0].is_self_arg or self.args[0].is_type_arg): min_positional_args -= 1 max_positional_args = len(positional_args) has_fixed_positional_count = not self.star_arg and \ -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Sat Dec 5 01:31:23 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 05 Dec 2009 01:31:23 +0100 Subject: [Cython] Fwd: classmethod-related changes for cls argument In-Reply-To: References: Message-ID: <4B19A9DB.4050708@behnel.de> Lisandro Dalcin, 05.12.2009 01:26: > In [3]: MPI.Comm.Compare(MPI.COMM_WORLD, MPI.COMM_WORLD) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > > /u/dalcinl/Devel/mpi4py-dev/ in () > > /u/dalcinl/lib64/python/mpi4py/MPI.so in mpi4py.MPI.Comm.Compare > (src/mpi4py.MPI.c:46815)() > > TypeError: Compare() takes at least 3 positional arguments (2 given) > > In [4]: > > The patch below fixes my issues, but I'm not sure if this is
the > definitive solution ... > > diff -r f6efe60d7a31 Cython/Compiler/Nodes.py > --- a/Cython/Compiler/Nodes.py Fri Dec 04 15:23:23 2009 +0100 > +++ b/Cython/Compiler/Nodes.py Fri Dec 04 21:25:33 2009 -0300 > @@ -2075,7 +2075,7 @@ > argtuple_error_label = code.new_label("argtuple_error") > > min_positional_args = self.num_required_args - > self.num_required_kw_args > - if len(self.args) > 0 and self.args[0].is_self_arg: > + if len(self.args) > 0 and (self.args[0].is_self_arg or > self.args[0].is_type_arg): > min_positional_args -= 1 > max_positional_args = len(positional_args) > has_fixed_positional_count = not self.star_arg and \ Looks good to me, please push this fix. Stefan From dalcinl at gmail.com Sat Dec 5 01:47:14 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 4 Dec 2009 21:47:14 -0300 Subject: [Cython] Fwd: classmethod-related changes for cls argument In-Reply-To: <4B19A9DB.4050708@behnel.de> References: <4B19A9DB.4050708@behnel.de> Message-ID: On Fri, Dec 4, 2009 at 9:31 PM, Stefan Behnel wrote: > > Lisandro Dalcin, 05.12.2009 01:26: >> In [3]: MPI.Comm.Compare(MPI.COMM_WORLD, MPI.COMM_WORLD) >> --------------------------------------------------------------------------- >> TypeError Traceback (most recent call last) >> >> /u/dalcinl/Devel/mpi4py-dev/ in () >> >> /u/dalcinl/lib64/python/mpi4py/MPI.so in mpi4py.MPI.Comm.Compare >> (src/mpi4py.MPI.c:46815)() >> >> TypeError: Compare() takes at least 3 positional arguments (2 given) >> >> In [4]: >> >> The patch below fixes my issues, but I'm not sure if this is the >> definitive solution ... >> >> diff -r f6efe60d7a31 Cython/Compiler/Nodes.py >> --- a/Cython/Compiler/Nodes.py Fri Dec 04 15:23:23 2009 +0100 >> +++ b/Cython/Compiler/Nodes.py Fri Dec 04 21:25:33 2009 -0300 >> @@ -2075,7 +2075,7 @@ >> argtuple_error_label = code.new_label("argtuple_error") >> >>
min_positional_args = self.num_required_args - >> self.num_required_kw_args >> - if len(self.args) > 0 and self.args[0].is_self_arg: >> + if len(self.args) > 0 and (self.args[0].is_self_arg or >> self.args[0].is_type_arg): >> min_positional_args -= 1 >> max_positional_args = len(positional_args) >> has_fixed_positional_count = not self.star_arg and \ > > Looks good to me, please push this fix. > Done: changeset: 2738:723772dbc385 tag: tip user: Lisandro Dalcin date: Fri Dec 04 21:44:45 2009 -0300 summary: discount one to min pos args for classmethod (complementary fix for #454) -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Sat Dec 5 05:00:53 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 4 Dec 2009 20:00:53 -0800 Subject: [Cython] A typemap system for Cython (idea stolen from SWIG) In-Reply-To: References: <4B17DC07.7080507@behnel.de> Message-ID: <3A0717A8-8C78-45CE-A172-5E4D645DF359@math.washington.edu> On Dec 3, 2009, at 11:41 PM, Jon Olav Vik wrote: > Stefan Behnel writes: > >> Seeing this in action makes me think that a decorator would fit >> nicely here: >> >> @cython.typemapper(default=True) >> cdef inline unicode charp2unicode(char* p): >> return p.decode(current_encoding) >> >> @cython.typemapper >> cdef inline bytes charp2unicode(char* p): >> return p >> >> The compiler could then collect all such type mappers, make sure >> that only >> one of them is declared as the default mapper for an input type, >> and then >> just call them to do the type conversion between the types in a >> given context.
>> >> Disadvantages: >> >> 1) The above works well for a global setup, but I'd expect there's >> a lot of >> code that requires different mappings depending on the context, at >> least >> for some types. (strings in lxml are certainly an example) >> >> 2) The above will not work for the unicode->char* case, for >> example, as >> there is no way to store a Python reference outside of the converter >> function scope. So this is limited to simple coercions that do not >> create >> new Python references. > > Disclaimers: > * Sorry if I misunderstood your argument. > * I'm completely ignorant about the inner workings of Cython, > * and of C, > * and of unicode vs char*... 8-) > > In Python, however, decorators can be classes, and thus have any > amount of > internal state and trickery -- including, I'd guess, creating and > storing > references. See the "memoization" example here: > http://avinashv.net/2008/04/python-decorators-syntactic-sugar/ The cython decorators aren't really implemented as decorators, they're compiler directives. - Robert From robertwb at math.washington.edu Sat Dec 5 05:01:02 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 4 Dec 2009 20:01:02 -0800 Subject: [Cython] Why 'autotestdict' is on by default? In-Reply-To: <4B18EA4A.1070107@student.matnat.uio.no> References: <4B1882A1.10701@behnel.de> <4B18EA4A.1070107@student.matnat.uio.no> Message-ID: On Dec 4, 2009, at 2:54 AM, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: >> On Fri, Dec 4, 2009 at 12:31 AM, Stefan Behnel >> wrote: >>> Lisandro Dalcin, 04.12.2009 02:42: >>>> Am I missing something? Or was this an oversight (for 0.12 final)? >>> Hmm, not sure if it *should* be on by default, but does it break >>> anything? >>> >> >> It do not break my code, but some reason I do not understand one of >> the extension modules in mpi4py have and the other does not have a >> __test__ attribute. In the module that have it, the value is a empty >> dict. 
>> >> I still cannot figure out why one module has __test__, and the other >> does not. Could it be that the ABSENCE of class&method docstrings >> would trigger the setattr of __test__ in the module namespace? I'm >> really lost. >> >> Still, I think Cython should not add stuff in module namespaces >> unless >> explicitly asked for... > > Well, it was discussed, and it seemed like there was no opposition at > the time to leave it on by default -- so that was intentional. Well, there was a little bit of concern that it might break stuff, but I think we took care of the corner issues. > It was by principle of least surprise -- people would usually expect > "doctest.testmod(mycythonmodule)" to just work, and it doesn't without > this. Myself I've wasted at least an hour in total just by forgetting > that Cython modules couldn't be doctested :-) > > Also it's already in use in the Cython test suite. Robert moved a > lot of > the docstrings into the functions (and the test suite now looks a lot > better). If we turn it off by default, we have to turn it on in every > test case in our test suite...(or in runtests.py, but that's rather > obscure, -1 on that). It probably wouldn't be too hard to turn it on for just our runtests, but fortunately it looks like it's not causing any issues out in the wild. - Robert From dalcinl at gmail.com Sat Dec 5 05:03:23 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 5 Dec 2009 01:03:23 -0300 Subject: [Cython] clasmethod still broken Message-ID: Switching to test on Windows, MSVC complained with the code below. Note the zero-sized "PyObject* values[0] = {}" being generated... 
/* "/home/dalcinl/Devel/mpi4py-dev/src/MPI/Info.pyx":26 * * @classmethod * def Create(cls): # <<<<<<<<<<<<<< * """ * Create a new, empty info object */ static PyObject *__pyx_pf_6mpi4py_3MPI_4Info_Create(PyObject *__pyx_v_cls, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/ static char __pyx_doc_6mpi4py_3MPI_4Info_Create[] = "Info.Create(type cls)\n\n Create a new, empty info object\n "; static PyObject *__pyx_pf_6mpi4py_3MPI_4Info_Create(PyObject *__pyx_v_cls, PyObject *__pyx_args, PyObject *__pyx_kwds) { struct __pyx_obj_6mpi4py_3MPI_Info *__pyx_v_info = 0; PyObject *__pyx_r = NULL; PyObject *__pyx_t_1 = NULL; int __pyx_t_2; static PyObject **__pyx_pyargnames[] = {0}; __Pyx_RefNannySetupContext("Create"); if (unlikely(__pyx_kwds)) { Py_ssize_t kw_args = PyDict_Size(__pyx_kwds); PyObject* values[0] = {}; switch (PyTuple_GET_SIZE(__pyx_args)) { case 0: break; default: goto __pyx_L5_argtuple_error; } This patch fixes the issue, but perhaps I'm just reverting to previous behavior? diff -r f6efe60d7a31 Cython/Compiler/Nodes.py --- a/Cython/Compiler/Nodes.py Fri Dec 04 15:23:23 2009 +0100 +++ b/Cython/Compiler/Nodes.py Fri Dec 04 20:53:43 2009 -0300 @@ -1752,11 +1752,11 @@ arg = self.args[i] arg.is_generic = 0 if sig.is_self_arg(i) and not self.is_staticmethod: + arg.is_self_arg = 1 if self.is_classmethod: arg.is_type_arg = 1 arg.hdr_type = arg.type = Builtin.type_type else: - arg.is_self_arg = 1 arg.hdr_type = arg.type = env.parent_type arg.needs_conversion = 0 else: -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Sat Dec 5 05:00:59 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Fri, 4 Dec 2009 20:00:59 -0800 Subject: [Cython] type
inference - only for Python types? In-Reply-To: <4B191405.4080309@behnel.de> References: <4B177D21.8010600@behnel.de> <43B010C7-8B53-4898-BD24-E58905DA88FD@math.washington.edu> <4B18A703.5000504@behnel.de> <32A85A41-DB59-42CA-985D-9501B3BCF0FA@math.washington.edu> <4B191405.4080309@behnel.de> Message-ID: <667D423C-37D0-4CC7-BADA-AA65DB7C3A7A@math.washington.edu> On Dec 4, 2009, at 5:52 AM, Stefan Behnel wrote: > > Robert Bradshaw, 04.12.2009 09:59: >> The problem is when I write >> >> x = list >> >> The type of x is not list, it's type. This should be changed back. > > Done. > > >> I don't have any strong opinions on what >> version it should go into, but I'd lean towards 0.13 (which should >> not >> be as long as a wait as 0.12 was). I'd like to get at least some of >> the C++ code into 0.13 as well. > > For the records, I fixed another bug regarding type inference for > extension > types (ticket #461), and now lxml tests perfectly with safe type > inference > enabled. Next up, try to compile Sage :) - Robert From stefan_ml at behnel.de Sat Dec 5 08:22:41 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 05 Dec 2009 08:22:41 +0100 Subject: [Cython] clasmethod still broken In-Reply-To: References: Message-ID: <4B1A0A41.3010103@behnel.de> Lisandro Dalcin, 05.12.2009 05:03: > Switching to test on Windows, MSVC complained with the code below. > Note the zero-sized "PyObject* values[0] = {}" being generated... > [...] > This patch fixes the issue, but perhaps I'm just reverting to previous > behavior? Yes, looks like you are. I'll look into this. 
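For readers following the two classmethod patches in this thread, the counting rule from the first patch can be sketched in plain Python. This is an illustration only: `min_positional_args` below is a hypothetical standalone helper, not Cython's code, and its parameter names merely mirror the attributes that appear in the quoted diff.

```python
def min_positional_args(num_required_args, num_required_kw_args,
                        first_is_self_or_type):
    """Count the positional arguments a caller must pass explicitly."""
    count = num_required_args - num_required_kw_args
    # The implicit first argument (self, or cls for a classmethod) is
    # supplied by the binding, so the caller does not pass it.
    if first_is_self_or_type:
        count -= 1
    return count


# Compare(cls, comm1, comm2): three declared, two passed by the caller.
compare_min = min_positional_args(3, 0, True)

# Create(cls): only the implicit argument, so nothing is passed explicitly.
create_min = min_positional_args(1, 0, True)

print(compare_min, create_min)
```

The second case, where no explicit arguments remain, appears to be the situation in which the zero-sized `values` array reported for `Info.Create` was generated.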
Stefan From dagss at student.matnat.uio.no Sat Dec 5 10:13:09 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 05 Dec 2009 10:13:09 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B190E08.6050104@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B190E08.6050104@behnel.de> Message-ID: <4B1A2425.8040701@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn, 03.12.2009 12:45: >> Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 03.12.2009 10:52: >>> >>>>> - Python's floating point type is double, so it should be perfectly >>>>> safe to do >>>>> >>>>> cdef double f(): ... >>>>> x = f() # infer x as double >>>>> >>>>> By the same argument one might actually want to do >>>>> >>>>> cdef float f(): ... >>>>> x = f() # infer x as *double* -- because that's what Python would use >>>>> >>> That's actually an orthogonal feature: aliasing Python's float type to C's >>> double type in general, including support for methods etc. Since 'double' >>> is immutable and has exactly the same value range as Python's float type, >>> this is something that can always be done safely. If I'm not mistaken, even >>> Python's operators behave exactly like the equivalent C operators here, >>> right? And numeric operations would always return a double, so even >>> operations on doubles that involve integers would run completely in C space. >>> >>> Apart from a couple of further optimisations for methods and properties, I >>> think enabling type inference here would give us a rather warmly requested >>> feature. >>> >> This is now http://trac.cython.org/cython_trac/ticket/460. > > Rethinking this some more, there are a couple of problems with this, e.g. 
> overflow errors:
>
>     >>> 2.0**1000000
>     Traceback (most recent call last):
>     OverflowError: (34, 'Numerical result out of range')
>
> Moving this calculation to C space means we have to catch the error and
> raise an exception for it.

Hmm. Only seems like it applies to pow though:

In [23]: 9e307+9e307
Out[23]: inf

At least for Python 2.6.2. Which is a very strange and inconsistent
result to me.

2.0**10000000.0 also gives an exception. I don't quite have the time to
look up the Python specs on this one but it should be done...

Anyways, my opinion here is:
 - Directive to assume we don't go out of range (perhaps bignumbers and
   merge with the integer case)
 - We should raise exceptions in the same situations as in Python even
   for "cdef double"s (re: the integer division issue -- it's just much
   easier to learn stuff if we're consistent)

(Long-term, note that floating point exceptions can be raised by
checking flags in the CPU, so that when there's no side-effects, like

z = (x**2 + y**2)**(.5)

we don't have to check to raise an exception before the entire
expression has run. That needs control flow analysis though.)

> Then, while simplistic code like this will work beautifully:
>
>     d = 1.0
>     for i in range(1000):
>         d += 1.0
>     print d
>
> most code won't currently benefit from enabling type inference for doubles,
> as it's rather unlikely that all operations done on a variable only use
> other (implicitly/explicitly) typed names as operands. And as long as there
> is one assignment with unknown types on the rhs, the name will be typed as
> Python object (which is required since the value could be anything, e.g.
> some kind of vector object). So we clearly have to do more here to make
> this useful.

Well, when

Two remedies though:
 - Support for declaring that e.g. math.sin returns a double (e.g.
   through a math.pxd which is used on a pure Python import..)
 - I think there's no way around the fact that users have to care about
   types at some boundary points in the program.
However I think it is much friendlier (compatible with pure Python, for
one thing) to do e.g.

d = 1.0
for i in range(1000):
    d += float(myfunc(i))

which would be enough.

--
Dag Sverre

From f.guerrieri at gmail.com Sat Dec 5 13:02:55 2009
From: f.guerrieri at gmail.com (Francesco Guerrieri)
Date: Sat, 5 Dec 2009 13:02:55 +0100
Subject: [Cython] small typo in profiling tutorial
Message-ID: <79b79e730912050402id06ed19h6dbb218c6a95d9b9@mail.gmail.com>

Hi,

I was reading the profiling tutorial
http://docs.cython.org/src/tutorial/profiling_tutorial.html.
ISTM that there is a small typo, approx_pi is referenced instead of
recip_square.
I attach a small patch for the markup text.
Sorry if this is not the best way of communication for such small
issues :-)

Francesco
-------------- next part --------------
204c204
< changed a lot. Let's concentrate on the approx_pi function a bit more. First
---
> changed a lot. Let's concentrate on the recip_square function a bit more. First

From stefan_ml at behnel.de Sat Dec 5 13:16:10 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 05 Dec 2009 13:16:10 +0100
Subject: [Cython] clasmethod still broken
In-Reply-To: <4B1A0A41.3010103@behnel.de>
References: <4B1A0A41.3010103@behnel.de>
Message-ID: <4B1A4F0A.7020804@behnel.de>

Stefan Behnel, 05.12.2009 08:22:
> Lisandro Dalcin, 05.12.2009 05:03:
>> Switching to test on Windows, MSVC complained with the code below.
>> Note the zero-sized "PyObject* values[0] = {}" being generated...
>> [...]
>> This patch fixes the issue, but perhaps I'm just reverting to previous
>> behavior?
>
> Yes, looks like you are. I'll look into this.

BTW, I just noticed that this is also broken:

    cdef class cclass:
        def test(*args):
            pass

but it looks like it was already broken before my changes.
I'll add a couple of test cases and see what that gives.

http://trac.cython.org/cython_trac/ticket/462

Stefan

From stefan_ml at behnel.de Sat Dec 5 16:38:02 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 05 Dec 2009 16:38:02 +0100
Subject: [Cython] clasmethod still broken
In-Reply-To: <4B1A4F0A.7020804@behnel.de>
References: <4B1A0A41.3010103@behnel.de> <4B1A4F0A.7020804@behnel.de>
Message-ID: <4B1A7E5A.20602@behnel.de>

Stefan Behnel, 05.12.2009 13:16:
> I just noticed that this is also broken:
>
>     cdef class cclass:
>         def test(*args):
>             pass
>
> but it looks like it was already broken before my changes.

Pyrex suffers from the same problem:

$ python ./bin/pyrexc cdef_methods_T462.pyx
[...]
.../cdef_methods_T462.pyx:26:4: Method test_args has wrong number of arguments (0 declared, 1 or more expected)
.../cdef_methods_T462.pyx:33:4: Method test_args_kwargs has wrong number of arguments (0 declared, 1 or more expected)

Stefan

From dalcinl at gmail.com Sat Dec 5 20:38:45 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Sat, 5 Dec 2009 16:38:45 -0300
Subject: [Cython] clasmethod still broken
In-Reply-To: <4B1A4F0A.7020804@behnel.de>
References: <4B1A0A41.3010103@behnel.de> <4B1A4F0A.7020804@behnel.de>
Message-ID:

On Sat, Dec 5, 2009 at 9:16 AM, Stefan Behnel wrote:
>
> Stefan Behnel, 05.12.2009 08:22:
>> Lisandro Dalcin, 05.12.2009 05:03:
>>> Switching to test on Windows, MSVC complained with the code below.
>>> Note the zero-sized "PyObject* values[0] = {}" being generated...
>>> [...]
>>> This patch fixes the issue, but perhaps I'm just reverting to previous
>>> behavior?
>>
>> Yes, looks like you are. I'll look into this.
>
> BTW, I just noticed that this is also broken:
>
>     cdef class cclass:
>         def test(*args):
>             pass
>

Mmm, Should we support that? After all, cdef classes are not normal
classes...
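
For reference, this is how an ordinary (non-cdef) Python class treats a
method declared with nothing but *args; it is the behaviour the cdef
class case is being compared against. A minimal plain-Python sketch, not
Cython; the class and variable names are made up for illustration:

```python
# Plain-Python sketch (not Cython): a method declared with only *args
# still receives the instance as its first positional argument.
class PyClass(object):
    def test(*args):
        # args[0] is the instance ("self"); the rest are the call arguments
        return args

obj = PyClass()
result = obj.test(1, 2)
print(result[0] is obj)   # the instance comes first
print(result[1:])         # the remaining call arguments
```

So "0 declared, 1 or more expected" is stricter than what plain Python
accepts, since *args can absorb the implicit self.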
--
Lisandro Dalcín

From stefan_ml at behnel.de Sat Dec 5 22:56:11 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 05 Dec 2009 22:56:11 +0100
Subject: [Cython] clasmethod still broken
In-Reply-To:
References: <4B1A0A41.3010103@behnel.de> <4B1A4F0A.7020804@behnel.de>
Message-ID: <4B1AD6FB.7010308@behnel.de>

Lisandro Dalcin, 05.12.2009 20:38:
> On Sat, Dec 5, 2009 at 9:16 AM, Stefan Behnel wrote:
>> I just noticed that this is also broken:
>>
>>     cdef class cclass:
>>         def test(*args):
>>             pass
>
> Mmm, Should we support that? After all, cdef classes are not normal classes...

IMHO, any difference that we can remove is worth a patch, so the
question is rather: why should we not support it?

Anyway, the fix is almost ready. I'll just clean it up a bit and push it.

Stefan

From stefan_ml at behnel.de Sun Dec 6 00:36:08 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 06 Dec 2009 00:36:08 +0100
Subject: [Cython] clasmethod still broken
In-Reply-To: <4B1AD6FB.7010308@behnel.de>
References: <4B1A0A41.3010103@behnel.de> <4B1A4F0A.7020804@behnel.de> <4B1AD6FB.7010308@behnel.de>
Message-ID: <4B1AEE68.5010101@behnel.de>

Stefan Behnel, 05.12.2009 22:56:
> Lisandro Dalcin, 05.12.2009 20:38:
>> On Sat, Dec 5, 2009 at 9:16 AM, Stefan Behnel wrote:
>>> I just noticed that this is also broken:
>>>
>>>     cdef class cclass:
>>>         def test(*args):
>>>             pass
>> Mmm, Should we support that? After all, cdef classes are not normal classes...
>
> IMHO, any difference that we can remove is worth a patch, so the question
> is rather: why should we not support it?
>
> Anyway, the fix is almost ready. I'll just clean it up a bit and push it.
The above and the classmethod breakage should be fixed now. Lisandro, could you give it another try? Stefan From dagss at student.matnat.uio.no Sun Dec 6 21:25:10 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 6 Dec 2009 21:25:10 +0100 Subject: [Cython] Checking extension type sizes Message-ID: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> I'm bringing this over from the NumPy list. I'm guessing this behaviour is there so that subclasses written in Cython won't break? Perhaps the error message could at least be made clearer ("please recompile") + perhaps only do this check for types which are subclassed? Dag Sverre On Sun, Dec 6, 2009 at 07:53, Gael Varoquaux wrote: > I have a lot of code that has stopped working with my latest SVN pull to > numpy. > > * Some compiled code yields an error looking like (from memory): > > ? ?"incorrect type 'numpy.ndarray'" > > Rebuilding it is sufficient. Is this Cython or Pyrex code? Unfortunately Pyrex checks the size of types exactly such that even if you extend the type in a backwards compatible way, it will raise that exception. This behavior has been inherited by Cython. I have asked for this feature to be removed, or at least turned into a >= check, but it got no traction. -- Robert Kern From stefan_ml at behnel.de Sun Dec 6 23:30:20 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 06 Dec 2009 23:30:20 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> Message-ID: <4B1C307C.7090703@behnel.de> Robert Bradshaw, 03.12.2009 18:10: > Nice. With that, I can't see any place that inference of doubles > wouldn't be safe either, and it would be very convenient. ... what about 'bint'? 
Now that and/or behave as expected, would that type be safe to infer, too? Stefan From mayzel at gmail.com Mon Dec 7 14:00:18 2009 From: mayzel at gmail.com (Max) Date: Mon, 7 Dec 2009 14:00:18 +0100 Subject: [Cython] cython + lapack on OS X 10.6 Message-ID: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> Hi, I'm trying to run testlapack example from cython-notur09 on OS X 10.6. First, while compiling I faced with following error gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 build/temp.macosx-10.4-i386-2.6/lapack.o -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas -latlas -o lapack.so -g Undefined symbols: "_clapack_dgesv", referenced from: _dgesv in lapack.o _dsolve in lapack.o ld: symbol(s) not found So I changed clapack_dgesw in lapack.pxd to dgesw and successfully build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) but when I run testlapack.test() python froze, using 100% of CPU, below is debuging info. >>> tl.test() test start ^C Program received signal SIGINT, Interrupt. 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) at lapack.c:775 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { (gdb) exit I was also tried to build it against Mac OS X accelerated framework and the result is the same, while on linux this example works fine. Any suggestions what's wrong? 
Regards,
Max Mayzel

From dalcinl at gmail.com Mon Dec 7 15:41:34 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 7 Dec 2009 11:41:34 -0300
Subject: [Cython] clasmethod still broken
In-Reply-To: <4B1AEE68.5010101@behnel.de>
References: <4B1A0A41.3010103@behnel.de> <4B1A4F0A.7020804@behnel.de> <4B1AD6FB.7010308@behnel.de> <4B1AEE68.5010101@behnel.de>
Message-ID:

On Sat, Dec 5, 2009 at 8:36 PM, Stefan Behnel wrote:
>
> Stefan Behnel, 05.12.2009 22:56:
>> Lisandro Dalcin, 05.12.2009 20:38:
>>> On Sat, Dec 5, 2009 at 9:16 AM, Stefan Behnel wrote:
>>>> I just noticed that this is also broken:
>>>>
>>>>     cdef class cclass:
>>>>         def test(*args):
>>>>             pass
>>> Mmm, Should we support that? After all, cdef classes are not normal classes...
>>
>> IMHO, any difference that we can remove is worth a patch, so the question
>> is rather: why should we not support it?
>>
>> Anyway, the fix is almost ready. I'll just clean it up a bit and push it.
>
> The above and the classmethod breakage should be fixed now. Lisandro, could
> you give it another try?
>

Sorry for the delay. All is working now. Many thanks, Stefan.

--
Lisandro Dalcín

From robertwb at math.washington.edu Mon Dec 7 18:51:44 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 7 Dec 2009 09:51:44 -0800
Subject: [Cython] Checking extension type sizes
In-Reply-To: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no>
References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no>
Message-ID: <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu>

On Dec 6, 2009, at 12:25 PM, Dag Sverre Seljebotn wrote:

> I'm bringing this over from the NumPy list.
I'm guessing this > behaviour is > there so that subclasses written in Cython won't break? Also, very hard to diagnose behavior can happen if you change a class (for example, add or remove a cdef attribute) without re-compiling everything that depends on it. I'd like to have stronger checks on extending from other Cython classes. > Perhaps the error > message could at least be made clearer ("please recompile") + > perhaps only > do this check for types which are subclassed? +1 to making the error more clear--the current one is far from clear. > > Dag Sverre > > On Sun, Dec 6, 2009 at 07:53, Gael Varoquaux > wrote: >> I have a lot of code that has stopped working with my latest SVN >> pull to >> numpy. >> >> * Some compiled code yields an error looking like (from memory): >> >> "incorrect type 'numpy.ndarray'" >> >> Rebuilding it is sufficient. > > Is this Cython or Pyrex code? Unfortunately Pyrex checks the size of > types exactly such that even if you extend the type in a backwards > compatible way, it will raise that exception. This behavior has been > inherited by Cython. I have asked for this feature to be removed, or > at least turned into a >= check, but it got no traction. Could you better describe your usecase? You have a Cython type that you then extend from C and try to use from Cython again without redeclaring it? 
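
To make the two variants under discussion concrete (the exact size check
versus the ">=" check asked for above), here is a small Python sketch.
The helper name and the byte counts are made up for illustration, and
this is not Cython's actual runtime code; the relaxed variant also
assumes that a type only ever grows by appending fields at the end.

```python
# Hypothetical helper, not Cython's actual runtime code: it compares the
# size a module was compiled against with the size the imported type
# reports at import time.
def check_basicsize(actual, expected, strict=True):
    """Return True if a type occupying `actual` bytes is acceptable."""
    if strict:
        # Exact check: any size change is rejected, even a compatible one.
        return actual == expected
    # Relaxed check: accept a type that has only grown.
    return actual >= expected

# A type that grew from 80 to 96 bytes (made-up numbers):
grown_strict = check_basicsize(96, 80, strict=True)
grown_relaxed = check_basicsize(96, 80, strict=False)
# A type that shrank is rejected by both variants:
shrunk_relaxed = check_basicsize(64, 80, strict=False)
print(grown_strict, grown_relaxed, shrunk_relaxed)
```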
- Robert From robertwb at math.washington.edu Mon Dec 7 18:53:39 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 09:53:39 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1C307C.7090703@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> Message-ID: <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> On Dec 6, 2009, at 2:30 PM, Stefan Behnel wrote: > Robert Bradshaw, 03.12.2009 18:10: >> Nice. With that, I can't see any place that inference of doubles >> wouldn't be safe either, and it would be very convenient. > > ... what about 'bint'? Now that and/or behave as expected, would > that type > be safe to infer, too? What about x = 5 x = True print x - Robert From jonas at lophus.org Mon Dec 7 19:03:22 2009 From: jonas at lophus.org (jonas at lophus.org) Date: Mon, 07 Dec 2009 19:03:22 +0100 Subject: [Cython] [Docs bug] Wrong reference syntax Message-ID: <4B1D436A.4080405@lophus.org> You've got a syntax error on http://docs.cython.org/src/quickstart/build.html (ref to reference/compilation). 
Regards From robertwb at math.washington.edu Mon Dec 7 19:29:42 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 10:29:42 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: <20091207180016.GB1111@phare.normalesup.org> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> Message-ID: <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> On Dec 7, 2009, at 10:00 AM, Gael Varoquaux wrote: > On Mon, Dec 07, 2009 at 09:51:44AM -0800, Robert Bradshaw wrote: >>>> I have a lot of code that has stopped working with my latest SVN >>>> pull to >>>> numpy. > >>>> * Some compiled code yields an error looking like (from memory): > >>>> "incorrect type 'numpy.ndarray'" > >>>> Rebuilding it is sufficient. > >>> Is this Cython or Pyrex code? Unfortunately Pyrex checks the size of >>> types exactly such that even if you extend the type in a backwards >>> compatible way, it will raise that exception. This behavior has been >>> inherited by Cython. I have asked for this feature to be removed, or >>> at least turned into a >= check, but it got no traction. >> >> Could you better describe your usecase? You have a Cython type that >> you >> then extend from C and try to use from Cython again without >> redeclaring >> it? > > The usecase in my own code is simply to bind existing C code passing > it > the numpy array (and yes, it has to be contigous). Maybe you could boil this down to a tiny example? 
> However, some scipy code displays the same problem: > > /home/varoquau/dev/scipy/scipy/stats/distributions.py in () > 25 from scipy.special import gammaln as gamln > 26 from copy import copy > ---> 27 import vonmises_cython > 28 import textwrap > 29 > > /home/varoquau/Desktop/graph_francois/numpy.pxd in > scipy.stats.vonmises_cython (scipy/stats/vonmises_cython.c:2939)() > 28 > 29 > ---> 30 > 31 > 32 > > ValueError: numpy.dtype does not appear to be the correct type object > > To make sure that we are talking about the same thing, the object that > has changed, here, is the numpy array, which is independant of my > code, > or of scipy code. The problem is currenlty that when numpy changes, > even > in a backward compatible way, Cython code taking numpy arrays needs > to be > recompiled. Is there a way to detect whether the way its been changed is backward compatible? I'd rather be overly cautious. I don't think it's strange to have to recompile when a C-level dependancy changes, even if the change is compatible. I'm still a bit confused though--numpy types are declared as "extern" so I wouldn't expect their size to be checked. - Robert From stefan_ml at behnel.de Mon Dec 7 20:28:15 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 Dec 2009 20:28:15 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> Message-ID: <4B1D574F.6050902@behnel.de> Robert Bradshaw, 03.12.2009 18:10: > Nice. With that, I can't see any place that inference of doubles > wouldn't be safe either, and it would be very convenient. I have an implementation, but I noticed that typing the lhs of assignments isn't enough to support code like this: x = 2.5 + 1.5 ** float(some_obj) # could be string/float/... 
because the calculation would still run in Python space, although we know it could run completely in C after unpacking the last operand. So I needed to force Python float operands into C doubles inside of NumBinopNode, in addition to the support in the type inference mechanism. Would that still be enabled by the type inference switch? I guess a separate option would be better, right? Stefan From stefan_ml at behnel.de Mon Dec 7 20:30:48 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 Dec 2009 20:30:48 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> Message-ID: <4B1D57E8.1090205@behnel.de> Robert Bradshaw, 07.12.2009 18:53: > On Dec 6, 2009, at 2:30 PM, Stefan Behnel wrote: >> Robert Bradshaw, 03.12.2009 18:10: >>> Nice. With that, I can't see any place that inference of doubles >>> wouldn't be safe either, and it would be very convenient. >> ... what about 'bint'? Now that and/or behave as expected, would >> that type >> be safe to infer, too? > > What about > > x = 5 > x = True > print x That would unexpectedly print '1', I guess. But we could special case 'bint' in the type inference algorithms so that it won't be compatible with any other int type. So if you do the above, x will turn into a Python object instead. There's another case that I came up with. Since True/False are specified to be equivalent to the int values 1/0, there's likely some code out there that does this: cdef bint func(x): ... 
return 2 # also 'true' in C true_values = 0 for i in range(10): true_values += func(i) This currently coerces the return value to True and then adds its integer value 1 to true_values. But I'm actually fine with breaking that kind of weird code... Stefan From robertwb at math.washington.edu Mon Dec 7 20:36:57 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 11:36:57 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1D574F.6050902@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1D574F.6050902@behnel.de> Message-ID: <8671C18B-2A4F-431D-9F2A-7428565DB203@math.washington.edu> On Dec 7, 2009, at 11:28 AM, Stefan Behnel wrote: > Robert Bradshaw, 03.12.2009 18:10: >> Nice. With that, I can't see any place that inference of doubles >> wouldn't be safe either, and it would be very convenient. > > I have an implementation, but I noticed that typing the lhs of > assignments > isn't enough to support code like this: > > x = 2.5 + 1.5 ** float(some_obj) # could be string/float/... > > because the calculation would still run in Python space, although we > know > it could run completely in C after unpacking the last operand. So I > needed > to force Python float operands into C doubles inside of > NumBinopNode, in > addition to the support in the type inference mechanism. I'm not sure where exactly where the float(...) becoming a double operation optimization occurs, but perhaps it might be easier to simply modify the call node to infer the type of float(...) as being a double. > Would that still be enabled by the type inference switch? I guess a > separate option would be better, right? Well, that's controlled by the float() optimization, right? (I guess there's not a switch for that.) 
- Robert From robertwb at math.washington.edu Mon Dec 7 20:43:48 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 11:43:48 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1D57E8.1090205@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> Message-ID: <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> On Dec 7, 2009, at 11:30 AM, Stefan Behnel wrote: > Robert Bradshaw, 07.12.2009 18:53: >> On Dec 6, 2009, at 2:30 PM, Stefan Behnel wrote: >>> Robert Bradshaw, 03.12.2009 18:10: >>>> Nice. With that, I can't see any place that inference of doubles >>>> wouldn't be safe either, and it would be very convenient. >>> ... what about 'bint'? Now that and/or behave as expected, would >>> that type >>> be safe to infer, too? >> >> What about >> >> x = 5 >> x = True >> print x > > That would unexpectedly print '1', I guess. But we could special case > 'bint' in the type inference algorithms so that it won't be > compatible with > any other int type. So if you do the above, x will turn into a Python > object instead. Special casing bint for safe mode seems reasonable. > There's another case that I came up with. Since True/False are > specified to > be equivalent to the int values 1/0, there's likely some code out > there > that does this: > > cdef bint func(x): > ... > return 2 # also 'true' in C > > true_values = 0 > for i in range(10): > true_values += func(i) > > This currently coerces the return value to True and then adds its > integer > value 1 to true_values. But I'm actually fine with breaking that > kind of > weird code... Yeah, that's not breaking valid Python code at least. 
If your code is depending on the subtleties of non-identity C/Python
conversions, then you're asking for trouble to have any kind of
inference at all, and I don't think this is a worthy enough use case to
not enable safe mode.

- Robert

From dalcinl at gmail.com Mon Dec 7 20:57:48 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 7 Dec 2009 16:57:48 -0300
Subject: [Cython] Checking extension type sizes
In-Reply-To: <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu>
References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu>
Message-ID:

On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw wrote:
>
> Is there a way to detect whether the way its been changed is backward
> compatible?
>

NumPy exposes an API version number. However, I'm not sure if that
would be enough to detect if a change is backwards compatible.

>
> I'd rather be overly cautious.
>

Indeed. However, I think Cython should switch from an error to a
warning. That way, everybody should be more or less happy.

>
> I don't think it's strange
> to have to recompile when a C-level dependency changes, even if the
> change is compatible.
>

You are right, but developers should keep in mind that having to
recompile is a major hurdle for other end-users.

>
> I'm still a bit confused though--numpy types are declared as "extern"
> so I wouldn't expect their size to be checked.
>

This is not the case. The code is here:

#ifndef __PYX_HAVE_RT_ImportType
#define __PYX_HAVE_RT_ImportType
static PyTypeObject *__Pyx_ImportType(const char *module_name, const char *class_name,
    long size)
{
    .....
    if (((PyTypeObject *)result)->tp_basicsize != size) {
        PyErr_Format(PyExc_ValueError,
            "%s.%s does not appear to be the correct type object",
            module_name, class_name);
        goto bad;
    }
    ....
}
#endif

So, in short, what about making this at least a (silenceable) warning,
instead of an (unrecoverable) error?

--
Lisandro Dalcín

From dagss at student.matnat.uio.no Mon Dec 7 21:11:39 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 07 Dec 2009 21:11:39 +0100
Subject: [Cython] Checking extension type sizes
In-Reply-To:
References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu>
Message-ID: <4B1D617B.60800@student.matnat.uio.no>

Lisandro Dalcin wrote:
> On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw
> wrote:
>> Is there a way to detect whether the way its been changed is backward
>> compatible?
>>
>
> NumPy exposes an API version number. However, I'm not sure if that
> would be enough to detect if a change is backwards compatible.

Talking about NumPy-specific solutions, this was improved in 1.4.0 with
several kinds of version numbers depending on how things are broken.

http://docs.scipy.org/doc/numpy/reference/c-api.array.html#checking-the-api-version

So if you call import_array() for NumPy >=1.4.0, NumPy deals with
checking ABI compatibility. Not sure how that impacts the Cython project
though.
I guess just preparing ourself to answer "please recompile your Cython code towards the NumPy version in question" to the coming flood of user questions will get us far :-) -- Dag Sverre From dagss at student.matnat.uio.no Mon Dec 7 21:17:00 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 Dec 2009 21:17:00 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> Message-ID: <4B1D62BC.4010009@student.matnat.uio.no> Robert Bradshaw wrote: > On Dec 7, 2009, at 11:30 AM, Stefan Behnel wrote: > >> Robert Bradshaw, 07.12.2009 18:53: >>> On Dec 6, 2009, at 2:30 PM, Stefan Behnel wrote: >>>> Robert Bradshaw, 03.12.2009 18:10: >>>>> Nice. With that, I can't see any place that inference of doubles >>>>> wouldn't be safe either, and it would be very convenient. >>>> ... what about 'bint'? Now that and/or behave as expected, would >>>> that type >>>> be safe to infer, too? >>> What about >>> >>> x = 5 >>> x = True >>> print x >> That would unexpectedly print '1', I guess. But we could special case >> 'bint' in the type inference algorithms so that it won't be >> compatible with >> any other int type. So if you do the above, x will turn into a Python >> object instead. > > Special casing bint for safe mode seems reasonable. > >> There's another case that I came up with. Since True/False are >> specified to >> be equivalent to the int values 1/0, there's likely some code out >> there >> that does this: >> >> cdef bint func(x): >> ... 
>> return 2 # also 'true' in C >> >> true_values = 0 >> for i in range(10): >> true_values += func(i) >> >> This currently coerces the return value to True and then adds its >> integer >> value 1 to true_values. But I'm actually fine with breaking that >> kind of >> weird code... > > Yeah, that's not breaking valid Python code at least. If your code is > depending on the subtleties of non-identity C/Python conversions, then > you're asking for trouble to have any kind of inference at all, and I > don't thing this is a worthy enough use case to not enable safe mode. I'm growing worried here. Will there be three levels of inference (none, safe, full) exposed to end-users? That's way too complicated in my opinion. The point of "safe" inference is that they can be done by default, without users having to know anything about it (except that some more code just run faster). If you have to actually read the manual to understand it and turn it on, you might as well use the full mode. I'm not against "bint" always being 0 or 1 in general though, so that cdef bint x = 3 turns into cdef bint x = (3 != 0) and cdef extern bint foo() x = foo() turns into __pyx_v_x = (foo() != 0); But, it should be completely orthogonal to type inference! 
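To make the two outcomes concrete, here is a plain-Python model of the accumulation example quoted above. It is only a sketch: `as_bint` is a hypothetical helper standing in for the `(x != 0)` coercion, and `func` stands in for the `cdef bint` function. Neither is something Cython provides.

```python
def as_bint(value):
    """Model of the proposed rule: any non-zero C integer coerces to 1."""
    return int(value != 0)


def func(i):
    """Stands in for a `cdef bint` function whose body returns 2."""
    return 2  # "true" in C, but not equal to 1


# Summing the raw C values, as purely C-level arithmetic would do.
raw_total = sum(func(i) for i in range(10))

# Summing after normalizing each value to 0/1, which matches the
# Python-level result where each return value is first coerced to True.
normalized_total = sum(as_bint(func(i)) for i in range(10))

print(raw_total, normalized_total)  # 20 10
```

With the normalization applied at the point of coercion, the result no longer depends on whether the loop variable was inferred as a C type, which is why the two questions can be treated separately.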
-- Dag Sverre From dagss at student.matnat.uio.no Mon Dec 7 21:25:29 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 Dec 2009 21:25:29 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B190E08.6050104@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B190E08.6050104@behnel.de> Message-ID: <4B1D64B9.3030308@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn, 03.12.2009 12:45: >> Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 03.12.2009 10:52: >>> >>>>> - Python's floating point type is double, so it should be perfectly >>>>> safe to do >>>>> >>>>> cdef double f(): ... >>>>> x = f() # infer x as double >>>>> >>>>> By the same argument one might actually want to do >>>>> >>>>> cdef float f(): ... >>>>> x = f() # infer x as *double* -- because that's what Python would use >>>>> >>> That's actually an orthogonal feature: aliasing Python's float type to C's >>> double type in general, including support for methods etc. Since 'double' >>> is immutable and has exactly the same value range as Python's float type, >>> this is something that can always be done safely. If I'm not mistaken, even >>> Python's operators behave exactly like the equivalent C operators here, >>> right? And numeric operations would always return a double, so even >>> operations on doubles that involve integers would run completely in C space. >>> >>> Apart from a couple of further optimisations for methods and properties, I >>> think enabling type inference here would give us a rather warmly requested >>> feature. >>> >> This is now http://trac.cython.org/cython_trac/ticket/460. > > Rethinking this some more, there are a couple of problems with this, e.g. 
> overflow errors:
>
> >>> 2.0**1000000
> Traceback (most recent call last):
> OverflowError: (34, 'Numerical result out of range')
>
> Moving this calculation to C space means we have to catch the error and
> raise an exception for it.

I've tried to find information about this, but what seems most likely to me now is that this is simply implementation-defined in current CPython. x**y and x / 0 will give exceptions, but + will give "inf" on overflow.

The only real lead was this from 2000:

http://mail.python.org/pipermail/python-dev/2000-May/003990.html

"""
Anyway, once all that is done, float overflow *will* raise an exception (by default; there will also be a way to turn that off), unlike what happens today. Before then, I guess continuing the current policy of benign neglect (i.e., let it overflow silently) is best for consistency. Without access to all the 754 features in C, it's not even easy to detect overflow now!
"""

I think this is orthogonal to type inference. I.e., like integers, floating point in Cython should have Python semantics by default:

    cdef double x = 2.0, y = 1000000.0
    print x**y # exception, not "inf"

With a directive to disable it, of course.
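The asymmetry between + and ** can be checked directly in CPython. The following is a minimal sketch, assuming a CPython build with IEEE 754 doubles. The helpers `c_style_pow` and `python_style_pow` are hypothetical names used only to illustrate the difference between C behaviour and Python semantics for positive finite inputs; they are not Cython APIs.

```python
import math
import sys

big = sys.float_info.max
base, exponent = 2.0, 1000000.0

# Addition and multiplication overflow silently to infinity.
assert math.isinf(big + big)
assert math.isinf(big * 10.0)

# Exponentiation raises instead of returning infinity.
try:
    base ** exponent
except OverflowError:
    print("x**y raised OverflowError")


def c_style_pow(x, y):
    """Model of C pow() for positive finite x: overflow yields inf."""
    try:
        return x ** y
    except OverflowError:
        return math.inf


def python_style_pow(x, y):
    """Model of the check that generated code would need around pow()."""
    result = c_style_pow(x, y)
    if math.isinf(result) and math.isfinite(x) and math.isfinite(y):
        raise OverflowError("Numerical result out of range")
    return result
```

The second helper shows the kind of extra test the generated C code would have to perform after calling pow() in order to keep the exception, which is the cost a directive could switch off.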
-- Dag Sverre From dalcinl at gmail.com Mon Dec 7 21:46:03 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 7 Dec 2009 17:46:03 -0300 Subject: [Cython] enabling OpenID login on Cython's Trac Message-ID: http://bitbucket.org/Dalius/authopenid-plugin/wiki/Home -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dagss at student.matnat.uio.no Mon Dec 7 23:22:18 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 Dec 2009 23:22:18 +0100 Subject: [Cython] cython + lapack on OS X 10.6 In-Reply-To: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> Message-ID: <4B1D801A.7050108@student.matnat.uio.no> Max wrote: > Hi, > I'm trying to run testlapack example from cython-notur09 on OS X 10.6. > First, while compiling I faced with following error > gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 > build/temp.macosx-10.4-i386-2.6/lapack.o > -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas > -latlas -o lapack.so -g > Undefined symbols: > "_clapack_dgesv", referenced from: > _dgesv in lapack.o > _dsolve in lapack.o > ld: symbol(s) not found > So I changed clapack_dgesw in lapack.pxd to dgesw and successfully > build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) > but when I run testlapack.test() python froze, using 100% of CPU, > below is debuging info. > >>>> tl.test() > test start > ^C > Program received signal SIGINT, Interrupt. 
> 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, > __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, > __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) > at lapack.c:775 > 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, > int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int > *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { > (gdb) exit > > I was also tried to build it against Mac OS X accelerated framework > and the result is the same, while on linux this example works fine. > Any suggestions what's wrong? Your change likely caused calling the Fortran dgesv instead. Fortran functions must be called in a different manner, which explains the crash. As to getting it working, what does import numpy numpy.show_config() say? Also, I'm worried that -L/sw64/lib comes before the Sage include...perhaps some libraries come from that directory and some from Sage and they don't match... *shrug* -- Dag Sverre From dalcinl at gmail.com Mon Dec 7 23:43:12 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 7 Dec 2009 19:43:12 -0300 Subject: [Cython] except values: could we relax to non-constant expressions? Message-ID: In mpi4py, I have to declare C-level MPI handles. Depending on the MPI implementation, handles are integers (i.e. a plain C int) or pointers to a (opaque, incomplete) struct. So the actual type of an MPI handle is an implementation detail that user code should never rely on. 
Then in my pxd I do:

    cdef extern from "mpi.h":
        cdef struct _mpi_comm_t
        ctypedef _mpi_comm_t* MPI_Comm

So _mpi_comm_t is a fake definition, but so far it works everywhere (and it is IMHO better than using a typedef to an integral type, because of the added type-checking).

In MPI, invalid handles are indicated by "special" handle values (not necessarily zero or NULL), so I also have this definition:

    cdef extern from "mpi.h":
        MPI_Comm MPI_COMM_NULL

For each C-level MPI handle, I have a proxy class:

    ctypedef public api class Comm [type PyMPIComm_Type, object PyMPICommObject]:
        cdef MPI_Comm ob_mpi

Now I want to define an API routine for getting the C-level handle from a Python-level instance. Moreover, I want that routine to return the MPI_XXXX_NULL handle in case of failure (note the type-check):

    cdef api MPI_Comm PyMPIComm_AsComm(object arg) except MPI_COMM_NULL:
        return (arg).ob_mpi

However, Cython does not let me write that function; I'm asked to provide a constant expression. MPI_COMM_NULL is indeed a constant in MPI terms, but AFAIK I have no way to ask Cython to treat a general (I mean, non-enumeration) external definition as a constant.

After much thinking, I'm not sure why Cython (and likely Pyrex) requires a constant expression here... Could we relax this requirement? Or perhaps devise another way that lets me use an external definition?
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Tue Dec 8 00:41:02 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 7 Dec 2009 15:41:02 -0800
Subject: [Cython] aliasing Python floats to C double
In-Reply-To: <4B1D62BC.4010009@student.matnat.uio.no>
References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no>
Message-ID:

On Dec 7, 2009, at 12:17 PM, Dag Sverre Seljebotn wrote:

> I'm growing worried here. Will there be three levels of inference (none,
> safe, full) exposed to end-users?
>
> That's way too complicated in my opinion. The point of "safe" inference
> is that they can be done by default, without users having to know
> anything about it (except that some more code just run faster). If you
> have to actually read the manual to understand it and turn it on, you
> might as well use the full mode.

For debugging purposes, it might be good to make it easy to turn it completely off, especially since it's such a new feature. I do have to say I'm not a fan of the current interface. What I think would be more useful is

    cython.infer_types(True) # explicitly enable it everywhere.
    cython.infer_types(False) # explicitly disable it everywhere, including "safe" inference.
    cython.infer_types(None) # the default, "safe" inference.
What this actually means may change over time, but the semantics of the resulting code shouldn't. This is similar to the dilemma with Profiling, where the default should probably be sometimes enabled (e.g. for def functions, not for inline functions) and a boolean value is an easy to remember explicit flag one can pass. > I'm not against "bint" always being 0 or 1 in general though, so that > > cdef bint x = 3 > > turns into > > cdef bint x = (3 != 0) > > and > > cdef extern bint foo() > x = foo() > > turns into > > __pyx_v_x = (foo() != 0); This may have performance ramifications...though probably small. Also, we can't make any guarantees (without extra work) about extern functions that are declared to return bint (which are not as uncommon as one would think...) > But, it should be completely orthogonal to type inference! Yep. - Robert From robertwb at math.washington.edu Tue Dec 8 00:42:34 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 15:42:34 -0800 Subject: [Cython] enabling OpenID login on Cython's Trac In-Reply-To: References: Message-ID: <5726A82C-93F9-4D81-A84D-87D46CA185A5@math.washington.edu> So does this mean that anyone with an OpenID can log in? Really, the logins are primarily to cut off spam, so I'm not opposed if OpenID is hard for spammers to obtain/spoof. Did you want to set it up? 
On Dec 7, 2009, at 12:46 PM, Lisandro Dalcin wrote: > http://bitbucket.org/Dalius/authopenid-plugin/wiki/Home > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Tue Dec 8 00:46:34 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 15:46:34 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: <4B1D617B.60800@student.matnat.uio.no> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <4B1D617B.60800@student.matnat.uio.no> Message-ID: <0BEA36EA-DEB3-4A83-9D79-5B9BA894AAB4@math.washington.edu> On Dec 7, 2009, at 12:11 PM, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: >> On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw >> wrote: >>> Is there a way to detect whether the way its been changed is >>> backward >>> compatible? >>> >> >> NumPy exposes an API version number. However, I'm not sure if that >> would be enough to detect if a change is backwards. > > Talking about NumPy-specific solutions, this was improved in 1.4.0 > with > several kind of version numbers depending on how things are breaked. > > http://docs.scipy.org/doc/numpy/reference/c-api.array.html#checking-the-api-version > > So if you call import_array() for NumPy >=1.4.0, NumPy deals with > checking ABI compatability. > > Not sure how that impacts the Cython project though. 
I guess just > preparing ourself to answer "please recompile your Cython code towards > the NumPy version in question" to the coming flood of user questions > will get us far :-) So are you thinking of a "check_numpy_abi" function in numpy.pxd that users can call (which would check the runtime version against the compile time version) if they're worried about binary compatibility? - Robert From robert.kern at gmail.com Tue Dec 8 00:55:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 07 Dec 2009 17:55:57 -0600 Subject: [Cython] Checking extension type sizes In-Reply-To: <0BEA36EA-DEB3-4A83-9D79-5B9BA894AAB4@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <4B1D617B.60800@student.matnat.uio.no> <0BEA36EA-DEB3-4A83-9D79-5B9BA894AAB4@math.washington.edu> Message-ID: On 2009-12-07 17:46 PM, Robert Bradshaw wrote: > On Dec 7, 2009, at 12:11 PM, Dag Sverre Seljebotn wrote: > >> Lisandro Dalcin wrote: >>> On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw >>> wrote: >>>> Is there a way to detect whether the way its been changed is >>>> backward >>>> compatible? >>>> >>> >>> NumPy exposes an API version number. However, I'm not sure if that >>> would be enough to detect if a change is backwards. >> >> Talking about NumPy-specific solutions, this was improved in 1.4.0 >> with >> several kind of version numbers depending on how things are breaked. >> >> http://docs.scipy.org/doc/numpy/reference/c-api.array.html#checking-the-api-version >> >> So if you call import_array() for NumPy>=1.4.0, NumPy deals with >> checking ABI compatability. >> >> Not sure how that impacts the Cython project though. 
I guess just >> preparing ourself to answer "please recompile your Cython code towards >> the NumPy version in question" to the coming flood of user questions >> will get us far :-) > > So are you thinking of a "check_numpy_abi" function in numpy.pxd that > users can call (which would check the runtime version against the > compile time version) if they're worried about binary compatibility? As Dag says, ABI compatibility is already tested automatically inside import_array(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robertwb at math.washington.edu Tue Dec 8 01:01:13 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 16:01:13 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <4B1D617B.60800@student.matnat.uio.no> <0BEA36EA-DEB3-4A83-9D79-5B9BA894AAB4@math.washington.edu> Message-ID: On Dec 7, 2009, at 3:55 PM, Robert Kern wrote: > On 2009-12-07 17:46 PM, Robert Bradshaw wrote: >> On Dec 7, 2009, at 12:11 PM, Dag Sverre Seljebotn wrote: >> >>> Lisandro Dalcin wrote: >>>> On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw >>>> wrote: >>>>> Is there a way to detect whether the way its been changed is >>>>> backward >>>>> compatible? >>>>> >>>> >>>> NumPy exposes an API version number. However, I'm not sure if that >>>> would be enough to detect if a change is backwards. >>> >>> Talking about NumPy-specific solutions, this was improved in 1.4.0 >>> with >>> several kind of version numbers depending on how things are breaked. 
>>> >>> http://docs.scipy.org/doc/numpy/reference/c-api.array.html#checking-the-api-version >>> >>> So if you call import_array() for NumPy>=1.4.0, NumPy deals with >>> checking ABI compatability. >>> >>> Not sure how that impacts the Cython project though. I guess just >>> preparing ourself to answer "please recompile your Cython code >>> towards >>> the NumPy version in question" to the coming flood of user questions >>> will get us far :-) >> >> So are you thinking of a "check_numpy_abi" function in numpy.pxd that >> users can call (which would check the runtime version against the >> compile time version) if they're worried about binary compatibility? > > As Dag says, ABI compatibility is already tested automatically inside > import_array(). Thanks. I should have read this more carefully. - Robert From robertwb at math.washington.edu Tue Dec 8 01:09:07 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 7 Dec 2009 16:09:07 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> Message-ID: <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> On Dec 7, 2009, at 11:57 AM, Lisandro Dalcin wrote: > On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw > wrote: >> >> I'd rather be overly cautious. > > Indeed. However, I think Cython should switch from an error to a > warning. That way, everybody should be more or less happy. > >> >> I don't think it's strange >> to have to recompile when a C-level dependancy changes, even if the >> change is compatible. >> > > You are right, but developers should keep in mind that having to > recompile is a major hurdle for other end-users. > >> >> I'm still a bit confused though--numpy types are declared as "extern" >> so I wouldn't expect their size to be checked. 
>> > > This is not the case. The code is here: Oh, I forgot that it used sizeof(...) on the externally declared struct. > > > #ifndef __PYX_HAVE_RT_ImportType > #define __PYX_HAVE_RT_ImportType > static PyTypeObject *__Pyx_ImportType(const char *module_name, const > char *class_name, long size) > { > ..... > if (((PyTypeObject *)result)->tp_basicsize != size) { > PyErr_Format(PyExc_ValueError, > "%s.%s does not appear to be the correct type object", > module_name, class_name); > goto bad; > } > .... > } > #endif > > > So, in short, what about making this at least a (silenceable) warning, > instead of an (unrecoverable) error ? You do make a good argument for issuing a warning rather than an error. Are you sure this is the only place we use the size of the type? Should we be more strict with Cython-defined (non-extern) types? Should we require that the struct size at least goes up? What about if one tried to extend one of these "expanded" types? One hesitation I have is that allowing this is that very strange and hard to debug errors can result due to undetected binary incompatibilities--I certainly don't want it to be too easy to ignore this warning. This is one of those situations where I want a big sign that says "undefined behavior ahead, you better know what you're doing!" - Robert From dalcinl at gmail.com Tue Dec 8 01:13:38 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 7 Dec 2009 21:13:38 -0300 Subject: [Cython] enabling OpenID login on Cython's Trac In-Reply-To: <5726A82C-93F9-4D81-A84D-87D46CA185A5@math.washington.edu> References: <5726A82C-93F9-4D81-A84D-87D46CA185A5@math.washington.edu> Message-ID: On Mon, Dec 7, 2009 at 8:42 PM, Robert Bradshaw wrote: > So does this mean that anyone with an OpenID can log in? Well, I'm not sure about this. But the point is that once you have an account, you do not need to enter username&password to log-in. 
> Really, the > logins are primarily to cut off spam, so I'm not opposed if OpenID is > hard for spammers to obtain/spoof. > Well, no, IIUC, OpenID does not protect against spammers. Any spammer could set-up a fake OpenID-provider server and start creating accounts for spamming. But my point is not automatic creation of new account, but easy logging once you have your account. > Did you want to set it up? > > On Dec 7, 2009, at 12:46 PM, Lisandro Dalcin wrote: > >> http://bitbucket.org/Dalius/authopenid-plugin/wiki/Home >> >> >> -- >> Lisandro Dalc?n >> --------------- >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina >> Tel/Fax: +54-(0)342-451.1594 >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Tue Dec 8 01:27:04 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 7 Dec 2009 21:27:04 -0300 Subject: [Cython] Checking extension type sizes In-Reply-To: <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> 
<349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> Message-ID: On Mon, Dec 7, 2009 at 9:09 PM, Robert Bradshaw wrote: > > What about if one tried to extend one of these "expanded" types? > OK, good point. Strong/Weak checking depending on the type being/being not extended could be done, but this complicates the implementation and we would still have a half-assed protection mechanism. However, I would not be opposed to specil-casing NumPy and complicating the ext type import machinery. NumPy is a target worth enough to seamless support. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Tue Dec 8 01:29:37 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 01:29:37 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <8671C18B-2A4F-431D-9F2A-7428565DB203@math.washington.edu> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1D574F.6050902@behnel.de> <8671C18B-2A4F-431D-9F2A-7428565DB203@math.washington.edu> Message-ID: <4B1D9DF1.8040509@behnel.de> Robert Bradshaw, 07.12.2009 20:36: > On Dec 7, 2009, at 11:28 AM, Stefan Behnel wrote: > >> Robert Bradshaw, 03.12.2009 18:10: >>> Nice. With that, I can't see any place that inference of doubles >>> wouldn't be safe either, and it would be very convenient. >> I have an implementation, but I noticed that typing the lhs of >> assignments >> isn't enough to support code like this: >> >> x = 2.5 + 1.5 ** float(some_obj) # could be string/float/... 
>> >> because the calculation would still run in Python space, although we >> know >> it could run completely in C after unpacking the last operand. So I >> needed >> to force Python float operands into C doubles inside of >> NumBinopNode, in >> addition to the support in the type inference mechanism. > > I'm not sure where exactly where the float(...) becoming a double > operation optimization occurs, but perhaps it might be easier to > simply modify the call node to infer the type of float(...) as being a > double. Change is in, feel free to review and please test. :) Stefan From robert.kern at gmail.com Tue Dec 8 02:31:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 07 Dec 2009 19:31:27 -0600 Subject: [Cython] Checking extension type sizes In-Reply-To: <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> Message-ID: On 2009-12-07 18:09 PM, Robert Bradshaw wrote: > You do make a good argument for issuing a warning rather than an > error. Are you sure this is the only place we use the size of the > type? Should we be more strict with Cython-defined (non-extern) types? I think that there is a use case for this. I believe that a Pyrex type hierarchy across modules was Greg's original use case for this check in the first place, not extern types from non-Pyrex extension modules. The former will be pretty common as one develops a collection of Cython module and rebuilds one module but not another. The latter case is much rarer, mostly because without Cython's syntax support, writing types and subclassing them is such a bear. 
I conjecture that pure C type writers will be more careful about only extending types that are not likely to have subtypes, as in the numpy.dtype case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan_ml at behnel.de Tue Dec 8 10:42:50 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 10:42:50 +0100 Subject: [Cython] Checking extension type sizes In-Reply-To: <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> Message-ID: <4B1E1F9A.1000008@behnel.de> Robert Bradshaw, 08.12.2009 01:09: > You do make a good argument for issuing a warning rather than an > error. Are you sure this is the only place we use the size of the > type? Should we be more strict with Cython-defined (non-extern) types? > Should we require that the struct size at least goes up? What about if > one tried to extend one of these "expanded" types? FWIW, I'm for attempting to make this a warning depending on >= for arbitrary non-subtyped external types that do not define any C methods, and keeping the error strict for all subtyped types or those that do define C methods. Disabling it completely is just asking for trouble. 
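The policy proposed here can be sketched in a few lines. This is a hedged Python model only: the real check lives in the generated C helper (`__Pyx_ImportType`, comparing `tp_basicsize`), and the function name `check_type_size`, its parameters, and the sizes used below are hypothetical illustrations rather than anything Cython exposes.

```python
import warnings


def check_type_size(name, runtime_size, compiled_size,
                    subtyped=False, has_c_methods=False):
    """Hypothetical model of the import-time struct size check.

    Equal sizes pass. A type that is subtyped or defines C methods must
    match exactly. Any other external type may grow (warning only), but
    never shrink.
    """
    if runtime_size == compiled_size:
        return
    strict = subtyped or has_c_methods
    if strict or runtime_size < compiled_size:
        raise ValueError(
            "%s does not appear to be the correct type object" % name)
    warnings.warn(
        "%s size changed (compiled against %d, got %d); "
        "binary incompatibility is possible"
        % (name, compiled_size, runtime_size),
        RuntimeWarning)


# A plain external type that grew: warning, not an error.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_type_size("numpy.dtype", 88, 80)  # sizes are made up
print(len(caught))  # 1
```

Keeping the strict branch for subtyped types reflects the concern raised earlier in the thread: a subclass lays out its own fields after the base struct, so a grown base would silently overlap them.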
Stefan From stefan_ml at behnel.de Tue Dec 8 13:29:17 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 13:29:17 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> Message-ID: <4B1E469D.80402@behnel.de> Robert Bradshaw, 08.12.2009 00:41: > I'm not a fan of the current interface. What I think would be more > useful is > > cython.infer_types(True) # explicitly enable it everywhere. > cython.infer_types(False) # explicitly disable it everywere, > including "safe" inference. > cython.infer_types(None) # the default, "safe" inference. What > this actually means may change over time, but the semantics of the > resulting code shouldn't. I switched to strings as I imagined that we may have to support other values as well, such as a list of single type names or a name for a group of types. But all of that isn't easy to support with the current directive parsers anyway. So, if we ever want to support that, we can just as well allow keyword arguments as in "infer_types(True, bint=False)". (I love keyword arguments) >> I'm not against "bint" always being 0 or 1 in general though, so that >> >> cdef bint x = 3 >> >> turns into >> >> cdef bint x = (3 != 0) >> >> and >> >> cdef extern bint foo() >> x = foo() >> >> turns into >> >> __pyx_v_x = (foo() != 0); > > This may have performance ramifications...though probably small. Also, > we can't make any guarantees (without extra work) about extern > functions that are declared to return bint (which are not as uncommon > as one would think...) 
At least for the Python .pxd files that we ship, I've taken care to use bint only where it's really just 0/1, although additional -1 values can be used as exception values, for example. If we do a switch like the above, we'd have to make sure that at least the exception values are checked before conversion. >> But, it should be completely orthogonal to type inference! > > Yep. ... except that it would broaden the niche of safe type inference a bit. Stefan From stefan_ml at behnel.de Tue Dec 8 13:33:09 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 13:33:09 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> Message-ID: <4B1E4785.8020806@behnel.de> Robert Bradshaw, 08.12.2009 00:41: > cython.infer_types(None) # the default, "safe" inference. Note how "infer_types(None)" basically reads "do not do any type inference" in the source that uses it. However, that would be "infer_types(False)". 
Stefan From stefan_ml at behnel.de Tue Dec 8 13:51:35 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 13:51:35 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1D62BC.4010009@student.matnat.uio.no> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> Message-ID: <4B1E4BD7.8090800@behnel.de> Dag Sverre Seljebotn, 07.12.2009 21:17: > I'm not against "bint" always being 0 or 1 in general though, so that > > cdef bint x = 3 > > turns into > > cdef bint x = (3 != 0) > > and > > cdef extern bint foo() > x = foo() > > turns into > > __pyx_v_x = (foo() != 0); Actually, that's not even required. All you'd have to take care of is that bint becomes 0/1 when coercing to a non-bint type. That could simply include other int types in the future, so that cdef int i = 0 cdef bint x = 5 i = x would be guaranteed to set i to 1, whereas x would still be 5. I'm not sure how this would apply to comparisons, though, such as cdef bint x = 5 cdef int i = 5 cdef bint result = i == x print result should that print False (being evaluated as 'int'), or would the comparison use the bint type and return True? 
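
The two possible readings of that comparison can be modelled in plain Python. This is only a sketch, not what Cython does: Python ints stand in for the raw C values, and the helper names bint_to_int and int_to_bint are hypothetical, introduced here for illustration.

```python
# Plain Python ints stand in for the raw C values.
# Both helpers are hypothetical; they are not Cython API.

def bint_to_int(raw):
    """Coercing a bint to a non-bint type normalises it to 0 or 1."""
    return 1 if raw != 0 else 0


def int_to_bint(value):
    """Coercing an int to bint keeps only its truth value."""
    return value != 0


x = 5  # models: cdef bint x = 5 (the raw value 5 is kept in this proposal)
i = 5  # models: cdef int i = 5

# Reading 1: evaluate as 'int', so x is normalised to 1 before comparing.
result_as_int = (i == bint_to_int(x))

# Reading 2: evaluate as 'bint', so both sides are compared by truth value.
result_as_bint = (int_to_bint(i) == int_to_bint(x))

print(result_as_int, result_as_bint)
```

Under the first reading the comparison is 5 == 1, under the second it is True == True, which is why the two readings disagree.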
Stefan From dagss at student.matnat.uio.no Tue Dec 8 13:54:28 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Dec 2009 13:54:28 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1E4785.8020806@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> <4B1E4785.8020806@behnel.de> Message-ID: <4B1E4C84.6090400@student.matnat.uio.no> Stefan Behnel wrote: > Robert Bradshaw, 08.12.2009 00:41: > >> cython.infer_types(None) # the default, "safe" inference. >> > > Note how "infer_types(None)" basically reads "do not do any type inference" > in the source that uses it. However, that would be "infer_types(False)". > Just to make a point, I think it should be - infer_types(True) - unsafe mode - infer_types(False) - backwards-compatible mode - _hidden_developers_only_directive_turn_off_inference_completely(True) - absolutely no inference Similarly for "bint" etc. I think there's a limit to how much of the language should be controllable by compiler directives; sometimes we just have to make choices on the user's behalf, or the result will be completely unusable. If we have to do tweaks to double and bint to get acceptable default behaviour, but would like a switch for backwards compatibility, a simple language_level("0.12") directive would be better.
Dag Sverre From stefan_ml at behnel.de Tue Dec 8 14:10:13 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 Dec 2009 14:10:13 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1E4C84.6090400@student.matnat.uio.no> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> <4B1E4785.8020806@behnel.de> <4B1E4C84.6090400@student.matnat.uio.no> Message-ID: <4B1E5035.9000708@behnel.de> Dag Sverre Seljebotn, 08.12.2009 13:54: > Stefan Behnel wrote: >> Robert Bradshaw, 08.12.2009 00:41: >> >>> cython.infer_types(None) # the default, "safe" inference. >>> >> Note how "infer_types(None)" basically reads "do not do any type inference" >> in the source that uses it. However, that would be "infer_types(False)". >> > Just to make a point, I think it should be > > - infer_types(True) - unsafe mode > - infer_types(False) - backwards-compatible mode > - _hidden_developers_only_directive_turn_off_inference_completely(True) > - absolutely no inference But then, the directive is not "infer_types" but "infer_unsafe_types", "infer_c_types" or "infer_backwards_incompatible_types". Stefan From mayzel at gmail.com Tue Dec 8 13:56:24 2009 From: mayzel at gmail.com (Max Mayzel) Date: Tue, 8 Dec 2009 12:56:24 +0000 (UTC) Subject: [Cython] cython + lapack on OS X 10.6 References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn writes: > > Max wrote: > > Hi, > > I'm trying to run testlapack example from cython-notur09 on OS X 10.6. 
> > First, while compiling I faced with following error > > gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 > > build/temp.macosx-10.4-i386-2.6/lapack.o > > -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas > > -latlas -o lapack.so -g > > Undefined symbols: > > "_clapack_dgesv", referenced from: > > _dgesv in lapack.o > > _dsolve in lapack.o > > ld: symbol(s) not found > > So I changed clapack_dgesw in lapack.pxd to dgesw and successfully > > build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) > > but when I run testlapack.test() python froze, using 100% of CPU, > > below is debuging info. > > > >>>> tl.test() > > test start > > ^C > > Program received signal SIGINT, Interrupt. > > 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, > > __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, > > __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) > > at lapack.c:775 > > 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, > > int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int > > *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { > > (gdb) exit > > > > I was also tried to build it against Mac OS X accelerated framework > > and the result is the same, while on linux this example works fine. > > Any suggestions what's wrong? > > Your change likely caused calling the Fortran dgesv instead. Fortran > functions must be called in a different manner, which explains the crash. > > As to getting it working, what does > > import numpy > numpy.show_config() > > say? Also, I'm worried that -L/sw64/lib comes before the Sage > include...perhaps some libraries come from that directory and some from > Sage and they don't match... *shrug* > Hi Dag, thanks for answer. Here is a numpy config. 
In [3]: numpy.show_config() lapack_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3'] define_macros = [('NO_ATLAS_INFO', 3)] blas_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] define_macros = [('NO_ATLAS_INFO', 3)] I tried to remove -L/sw64/lib, but gcc didn't found libf77blas; moving -L/sw64/lib after -L/Applications/development/sage//local/lib still gave Undefined symbols: "_clapack_dgesv", referenced from: _dgesv in lapack.o _dsolve in lapack.o ld: symbol(s) not found Regards, Max From dagss at student.matnat.uio.no Tue Dec 8 14:36:54 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Dec 2009 14:36:54 +0100 Subject: [Cython] cython + lapack on OS X 10.6 In-Reply-To: References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> Message-ID: <4B1E5676.8040207@student.matnat.uio.no> Max Mayzel wrote: > Dag Sverre Seljebotn writes: > > >> Max wrote: >> >>> Hi, >>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. >>> First, while compiling I faced with following error >>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 >>> build/temp.macosx-10.4-i386-2.6/lapack.o >>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas >>> -latlas -o lapack.so -g >>> Undefined symbols: >>> "_clapack_dgesv", referenced from: >>> _dgesv in lapack.o >>> _dsolve in lapack.o >>> ld: symbol(s) not found >>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully >>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) >>> but when I run testlapack.test() python froze, using 100% of CPU, >>> below is debuging info. >>> >>> >>>>>> tl.test() >>>>>> >>> test start >>> ^C >>> Program received signal SIGINT, Interrupt. 
>>> 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, >>> __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, >>> __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) >>> at lapack.c:775 >>> 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, >>> int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int >>> *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { >>> (gdb) exit >>> >>> I was also tried to build it against Mac OS X accelerated framework >>> and the result is the same, while on linux this example works fine. >>> Any suggestions what's wrong? >>> >> Your change likely caused calling the Fortran dgesv instead. Fortran >> functions must be called in a different manner, which explains the crash. >> >> As to getting it working, what does >> >> import numpy >> numpy.show_config() >> >> say? Also, I'm worried that -L/sw64/lib comes before the Sage >> include...perhaps some libraries come from that directory and some from >> Sage and they don't match... *shrug* >> >> > > Hi Dag, thanks for answer. > > Here is a numpy config. > > In [3]: numpy.show_config() > lapack_opt_info: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > extra_compile_args = ['-msse3'] > define_macros = [('NO_ATLAS_INFO', 3)] > > blas_opt_info: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > extra_compile_args = ['-msse3', > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > define_macros = [('NO_ATLAS_INFO', 3)] > > I tried to remove -L/sw64/lib, but gcc didn't found libf77blas; moving > -L/sw64/lib after -L/Applications/development/sage//local/lib still gave > Undefined symbols: > "_clapack_dgesv", referenced from: > _dgesv in lapack.o > _dsolve in lapack.o > ld: symbol(s) not found > Hmm. I expected a line libraries=[...] in there. Try scipy.show_config()? I'm afraid I don't have a good answer, one just has to find the right set of libraries to link with. 
Dag Sverre From dagss at student.matnat.uio.no Tue Dec 8 14:37:54 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Dec 2009 14:37:54 +0100 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1E5035.9000708@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> <4B1E4785.8020806@behnel.de> <4B1E4C84.6090400@student.matnat.uio.no> <4B1E5035.9000708@behnel.de> Message-ID: <4B1E56B2.6030808@student.matnat.uio.no> Stefan Behnel wrote: > Dag Sverre Seljebotn, 08.12.2009 13:54: > >> Stefan Behnel wrote: >> >>> Robert Bradshaw, 08.12.2009 00:41: >>> >>> >>>> cython.infer_types(None) # the default, "safe" inference. >>>> >>>> >>> Note how "infer_types(None)" basically reads "do not do any type inference" >>> in the source that uses it. However, that would be "infer_types(False)". >>> >>> >> Just to make a point, I think it should be >> >> - infer_types(True) - unsafe mode >> - infer_types(False) - backwards-compatible mode >> - _hidden_developers_only_directive_turn_off_inference_completely(True) >> - absolutely no inference >> > > But then, the directive is not "infer_types" but "infer_unsafe_types", > "infer_c_types" or "infer_backwards_incompatible_types". > Sounds good to me. Perhaps "unsafe_inference". 
Dag Sverre From cournape at gmail.com Tue Dec 8 14:51:51 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 8 Dec 2009 22:51:51 +0900 Subject: [Cython] Checking extension type sizes In-Reply-To: <4B1D617B.60800@student.matnat.uio.no> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <4B1D617B.60800@student.matnat.uio.no> Message-ID: <5b8d13220912080551r6dcba5ebi41f3cc2eac602d5@mail.gmail.com> On Tue, Dec 8, 2009 at 5:11 AM, Dag Sverre Seljebotn wrote: > Lisandro Dalcin wrote: >> On Mon, Dec 7, 2009 at 3:29 PM, Robert Bradshaw >> wrote: >>> Is there a way to detect whether the way it's been changed is backward >>> compatible? >>> >> >> NumPy exposes an API version number. However, I'm not sure if that >> would be enough to detect if a change is backwards. > > Talking about NumPy-specific solutions, this was improved in 1.4.0 with > several kinds of version numbers depending on how things are broken. > > http://docs.scipy.org/doc/numpy/reference/c-api.array.html#checking-the-api-version > To complete Dag's explanation, the numpy C API has two numbers: - ABI: the one obtained from PyArray_GetNDArrayCVersion. Every extension using the numpy C API must have exactly the same number. This has always been checked at import time. - API: the one obtained from PyArray_GetNDArrayCFeatureVersion. Every time the C API is extended, we increase this number. This is only available since 1.4.0. The basic rule is that two extensions ext 1 and ext 2 are compatible iff ABI_1 == ABI_2 == NUMPY_ABI and API_1 >= API_2 >= NUMPY_API. This rule is what has been checked since 1.4.0.
Note that this is as reliable as we are tracking changes in the ABI - it is unfortunately quite difficult to guarantee that we have not broken the ABI between two versions (it is already difficult in pure C, but it is worse with python extensions because we don't have the linker safety net, nor do we have versioned symbols). cheers, David From mayzel at gmail.com Tue Dec 8 15:02:43 2009 From: mayzel at gmail.com (Max Mayzel) Date: Tue, 8 Dec 2009 14:02:43 +0000 (UTC) Subject: [Cython] cython + lapack on OS X 10.6 References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> <4B1E5676.8040207@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn writes: > > > >>> Hi, > >>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. > >>> First, while compiling I faced with following error > >>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 > >>> build/temp.macosx-10.4-i386-2.6/lapack.o > >>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas > >>> -latlas -o lapack.so -g > >>> Undefined symbols: > >>> "_clapack_dgesv", referenced from: > >>> _dgesv in lapack.o > >>> _dsolve in lapack.o > >>> ld: symbol(s) not found > >>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully > >>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) > >>> but when I run testlapack.test() python froze, using 100% of CPU, > >>> below is debuging info. > >>> > >>> > >>>>>> tl.test() > >>>>>> > >>> test start > >>> ^C > >>> Program received signal SIGINT, Interrupt. 
> >>> 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, > >>> __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, > >>> __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) > >>> at lapack.c:775 > >>> 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, > >>> int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int > >>> *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { > >>> (gdb) exit > >>> > > > In [3]: numpy.show_config() > > lapack_opt_info: > > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > > extra_compile_args = ['-msse3'] > > define_macros = [('NO_ATLAS_INFO', 3)] > > > > blas_opt_info: > > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > > extra_compile_args = ['-msse3', > > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > define_macros = [('NO_ATLAS_INFO', 3)] > > > > > Hmm. I expected a line libraries=[...] in there. Try > scipy.show_config()? I'm afraid I don't have a good answer, one just has > to find the right set of libraries to link with. 
> > Dag Sverre > Here it is In [2]: scipy.show_config() amd_info: libraries = ['amd'] library_dirs = ['/sw64/lib'] umfpack_info: libraries = ['umfpack', 'amd'] library_dirs = ['/sw64/lib'] lapack_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3'] define_macros = [('NO_ATLAS_INFO', 3)] blas_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] define_macros = [('NO_ATLAS_INFO', 3)] As far, as I understand, numpy and scipy are build against Mac OS X Accelerated framework, which includes lapack and blas libs, but as I've wrote recently, when I'm building case a crash Max From dagss at student.matnat.uio.no Tue Dec 8 15:09:50 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Dec 2009 15:09:50 +0100 Subject: [Cython] cython + lapack on OS X 10.6 In-Reply-To: References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> <4B1E5676.8040207@student.matnat.uio.no> Message-ID: <4B1E5E2E.5080605@student.matnat.uio.no> Max Mayzel wrote: > Dag Sverre Seljebotn writes: > > >> >>>>> Hi, >>>>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. >>>>> First, while compiling I faced with following error >>>>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 >>>>> build/temp.macosx-10.4-i386-2.6/lapack.o >>>>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas >>>>> -latlas -o lapack.so -g >>>>> Undefined symbols: >>>>> "_clapack_dgesv", referenced from: >>>>> _dgesv in lapack.o >>>>> _dsolve in lapack.o >>>>> ld: symbol(s) not found >>>>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully >>>>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) >>>>> but when I run testlapack.test() python froze, using 100% of CPU, >>>>> below is debuging info. 
>>>>> >>>>> >>>>> >>>>>>>> tl.test() >>>>>>>> >>>>>>>> >>>>> test start >>>>> ^C >>>>> Program received signal SIGINT, Interrupt. >>>>> 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, >>>>> __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, >>>>> __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) >>>>> at lapack.c:775 >>>>> 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, >>>>> int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int >>>>> *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { >>>>> (gdb) exit >>>>> >>>>> >>> In [3]: numpy.show_config() >>> lapack_opt_info: >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>> extra_compile_args = ['-msse3'] >>> define_macros = [('NO_ATLAS_INFO', 3)] >>> >>> blas_opt_info: >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>> extra_compile_args = ['-msse3', >>> '-I/System/Library/Frameworks/vecLib.framework/Headers'] >>> define_macros = [('NO_ATLAS_INFO', 3)] >>> >>> >>> >> Hmm. I expected a line libraries=[...] in there. Try >> scipy.show_config()? I'm afraid I don't have a good answer, one just has >> to find the right set of libraries to link with. 
>> >> Dag Sverre >> >> > Here it is > In [2]: scipy.show_config() > amd_info: > libraries = ['amd'] > library_dirs = ['/sw64/lib'] > > umfpack_info: > libraries = ['umfpack', 'amd'] > library_dirs = ['/sw64/lib'] > > lapack_opt_info: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > extra_compile_args = ['-msse3'] > define_macros = [('NO_ATLAS_INFO', 3)] > > blas_opt_info: > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > extra_compile_args = ['-msse3', > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > define_macros = [('NO_ATLAS_INFO', 3)] > > As far, as I understand, numpy and scipy are build against Mac OS X Accelerated > framework, which includes lapack and blas libs, but as I've wrote recently, when > I'm building case a crash > Have you tried dropping libraries=.... from setup.py? You should really build your extension the way SciPy has been built, meaning modifying setup.py so that gcc is invoked in the way described by blas_opt_info above. Dag Sverre From robertwb at math.washington.edu Tue Dec 8 18:42:52 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 8 Dec 2009 09:42:52 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: <4B1E1F9A.1000008@behnel.de> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> <4B1E1F9A.1000008@behnel.de> Message-ID: <957D379D-7BCB-4CF5-BE22-ECF65D1AD8AE@math.washington.edu> On Dec 8, 2009, at 1:42 AM, Stefan Behnel wrote: > > Robert Bradshaw, 08.12.2009 01:09: >> You do make a good argument for issuing a warning rather than an >> error. Are you sure this is the only place we use the size of the >> type? Should we be more strict with Cython-defined (non-extern) >> types? 
>> Should we require that the struct size at least goes up? What about >> if >> one tried to extend one of these "expanded" types? > > FWIW, I'm for attempting to make this a warning depending on >= for > arbitrary non-subtyped external types that do not define any C > methods, and > keeping the error strict for all subtyped types OK, sounds like we're in agreement. > or those that do define C methods. I don't think extern types can define C methods. > Disabling it completely is just asking for trouble. Yep. - Robert From robertwb at math.washington.edu Tue Dec 8 18:49:32 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 8 Dec 2009 09:49:32 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1E469D.80402@behnel.de> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> <4B1E469D.80402@behnel.de> Message-ID: <5D9B2463-5E38-4E20-B96A-1455FA02335F@math.washington.edu> On Dec 8, 2009, at 4:29 AM, Stefan Behnel wrote: >> This may have performance ramifications...though probably small. >> Also, >> we can't make any guarantees (without extra work) about extern >> functions that are declared to return bint (which are not as uncommon >> as one would think...) > > At least for the Python .pxd files that we ship, I've taken care to > use > bint only where it's really just 0/1, although additional -1 values > can be > used as exception values, for example. Some libraries only specify that they return a non-zero value on success/error, yet the bint type still makes sense.
> If we do a switch like the above, > we'd have to make sure that at least the exception values are checked > before conversion. Yep. > Actually, that's not even required. All you'd have to take care of > is that > bint becomes 0/1 when coercing to a non-bint type. That could simply > include other int types in the future, so that > > cdef int i = 0 > cdef bint x = 5 > i = x > > would be guaranteed to set i to 1, whereas x would still be 5. I think it makes much more sense (if we do it) to convert to 0/1 when converting *to* a bint type, rather than from it. > I'm not sure how this would apply to comparisons, though, such as > > cdef bint x = 5 > cdef int i = 5 > cdef bint result = i == x > print result > > should that print False (being evaluated as 'int'), or would the > comparison > use the bint type and return True? The natural widening is bint -> int. - Robert From robert.kern at gmail.com Tue Dec 8 18:59:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 08 Dec 2009 11:59:44 -0600 Subject: [Cython] Checking extension type sizes In-Reply-To: <957D379D-7BCB-4CF5-BE22-ECF65D1AD8AE@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> <4B1E1F9A.1000008@behnel.de> <957D379D-7BCB-4CF5-BE22-ECF65D1AD8AE@math.washington.edu> Message-ID: On 2009-12-08 11:42 AM, Robert Bradshaw wrote: > On Dec 8, 2009, at 1:42 AM, Stefan Behnel wrote: > >> >> Robert Bradshaw, 08.12.2009 01:09: >>> You do make a good argument for issuing a warning rather than an >>> error. Are you sure this is the only place we use the size of the >>> type? Should we be more strict with Cython-defined (non-extern) >>> types? >>> Should we require that the struct size at least goes up? 
What about >>> if >>> one tried to extend one of these "expanded" types? >> >> FWIW, I'm for attempting to make this a warning depending on>= for >> arbitrary non-subtyped external types that do not define any C >> methods, and >> keeping the error strict for all subtyped types > > OK, sounds like we're in agreement. +1. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robertwb at math.washington.edu Tue Dec 8 19:02:04 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 8 Dec 2009 10:02:04 -0800 Subject: [Cython] aliasing Python floats to C double In-Reply-To: <4B1E4C84.6090400@student.matnat.uio.no> References: <4B177D21.8010600@behnel.de> <4B17896E.8030806@student.matnat.uio.no> <4B178A49.1050507@student.matnat.uio.no> <4B1796C2.4060005@behnel.de> <4B17A4E5.3050706@student.matnat.uio.no> <4B17B45A.6010000@behnel.de> <4B1C307C.7090703@behnel.de> <6BE579DA-177B-436E-A8EE-71FE06B0113D@math.washington.edu> <4B1D57E8.1090205@behnel.de> <34C9D51A-E655-4E2D-8378-D2829C6CA79C@math.washington.edu> <4B1D62BC.4010009@student.matnat.uio.no> <4B1E4785.8020806@behnel.de> <4B1E4C84.6090400@student.matnat.uio.no> Message-ID: On Dec 8, 2009, at 4:54 AM, Dag Sverre Seljebotn wrote: > Stefan Behnel wrote: >> Robert Bradshaw, 08.12.2009 00:41: >> >>> cython.infer_types(None) # the default, "safe" inference. >>> >> >> Note how "infer_types(None)" basically reads "do not do any type >> inference" >> in the source that uses it. However, that would be >> "infer_types(False)". I guess this comes from a convention in Sage where one does def some_calculation(proof=None): ... where one can pass True or False, and passing None (or nothing at all) falls back to a global default. I'm all for keywords as well. 
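
The Sage-style convention described above can be sketched in Python as follows. The names DEFAULT_PROOF and resolve_flag are made up for illustration and are not Sage or Cython API; the sketch only shows how None can mean "fall back to the global default" while True and False remain explicit choices.

```python
# Hypothetical module-level default, standing in for a global setting.
DEFAULT_PROOF = True


def resolve_flag(value, default):
    """Return `default` when `value` is None, otherwise the explicit choice."""
    if value is None:
        return default
    return bool(value)


def some_calculation(proof=None):
    # None (or no argument at all) defers to the global default.
    proof = resolve_flag(proof, DEFAULT_PROOF)
    return "rigorous" if proof else "heuristic"


print(some_calculation())       # uses the global default
print(some_calculation(False))  # explicit override
```

The same three-way reading is what infer_types(None) would mean as a directive value: neither forced on nor forced off.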
> Just to make a point, I think it should be > > - infer_types(True) - unsafe mode > - infer_types(False) - backwards-compatible mode > - > _hidden_developers_only_directive_turn_off_inference_completely(True) > - absolutely no inference I think we should support True (unsafe)/False (backwards compatible) as the official interface, and then have extra keyword arguments, which may not be forwards compatible, for developer use only. I don't like the label "unsafe," in particular, we shouldn't leak that to the user. It's not unsafe to use, only unsafe to turn on by default. > Similarily for "bint" etc. I think there's a limit to how much of the > language should be controllable by compiler directives; sometimes we > have to just make choices on the users behalf, or the result will be > completely unusable. Good point. > If we have to do tweaks to double and bint to get acceptable default > behaviour, but would like a switch for backwards-compatability, a > simple > language_level("0.12") directive would be better. This would seem to imply a promise to make everything the same... even stuff we don't think about. Also, would we support every language level? If people need the old semantics, old versions are always still around. Actually, I hope things don't change so much we need to differentiate between language levels... - Robert From greg.ewing at canterbury.ac.nz Tue Dec 8 22:32:10 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Dec 2009 10:32:10 +1300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: References: Message-ID: <4B1EC5DA.2080702@canterbury.ac.nz> Lisandro Dalcin wrote: > After much thinking, I'm not sure why Cython (and likely Pyrex) do > require a constant expression here... Could we relax this requirement? Allowing a fully general expression here would be quite tricky, as it needs to be evaluable in any context where the function is called. 
However it might be possible to relax it enough to allow an extern variable. Are you sure you really need to use this particular value, though? If NULL is a valid return value, you can use 'except? NULL' (with a question mark) to have an exception test generated if the value is NULL. -- Greg From greg.ewing at canterbury.ac.nz Tue Dec 8 22:36:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Dec 2009 10:36:51 +1300 Subject: [Cython] Checking extension type sizes In-Reply-To: References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> Message-ID: <4B1EC6F3.9040903@canterbury.ac.nz> Robert Kern wrote: > I think that there is a use case for this. I believe that a Pyrex type hierarchy > across modules was Greg's original use case for this check in the first place Yes, that's mainly what I had in mind. However, it's not really feasible to make a distinction between Pyrex and non-Pyrex defined classes, since Pyrex has no idea whether a .pxd file it's importing describes something implemented in Pyrex or not -- nor should it, IMO. -- Greg From dalcinl at gmail.com Wed Dec 9 00:02:30 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 8 Dec 2009 20:02:30 -0300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: <4B1EC5DA.2080702@canterbury.ac.nz> References: <4B1EC5DA.2080702@canterbury.ac.nz> Message-ID: On Tue, Dec 8, 2009 at 6:32 PM, Greg Ewing wrote: > Lisandro Dalcin wrote: > >> After much thinking, I'm not sure why Cython (and likely Pyrex) do >> require a constant expression here... Could we relax this requirement? > > Allowing a fully general expression here would be quite > tricky, as it needs to be evaluable in any context where > the function is called. 
> Yes, I understand. > However it might be possible to relax it enough to allow > an extern variable. > That's basically what I'm asking for. > Are you sure you really need to use this particular value, > though? If NULL is a valid return value, you can use > 'except? NULL' (with a question mark) to have an exception > test generated if the value is NULL. > The point is that NULL is not a valid return value, but MPI_COMM_NULL is. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From mayzel at gmail.com Wed Dec 9 12:23:08 2009 From: mayzel at gmail.com (Max Mayzel) Date: Wed, 9 Dec 2009 11:23:08 +0000 (UTC) Subject: [Cython] cython + lapack on OS X 10.6 References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> <4B1E5676.8040207@student.matnat.uio.no> <4B1E5E2E.5080605@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn writes: > >>>>> Hi, > >>>>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. > >>>>> First, while compiling I faced with following error > >>>>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 > >>>>> build/temp.macosx-10.4-i386-2.6/lapack.o > >>>>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas > >>>>> -latlas -o lapack.so -g > >>>>> Undefined symbols: > >>>>> "_clapack_dgesv", referenced from: > >>>>> _dgesv in lapack.o > >>>>> _dsolve in lapack.o > >>>>> ld: symbol(s) not found > >>>>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully > >>>>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) > >>>>> but when I run testlapack.test() python froze, using 100% of CPU, > >>>>> below is debuging info.
> >>> In [3]: numpy.show_config() > >>> lapack_opt_info: > >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > >>> extra_compile_args = ['-msse3'] > >>> define_macros = [('NO_ATLAS_INFO', 3)] > >>> > >>> blas_opt_info: > >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > >>> extra_compile_args = ['-msse3', > >>> '-I/System/Library/Frameworks/vecLib.framework/Headers'] > >>> define_macros = [('NO_ATLAS_INFO', 3)] > > Here it is > > In [2]: scipy.show_config() > > amd_info: > > libraries = ['amd'] > > library_dirs = ['/sw64/lib'] > > > > umfpack_info: > > libraries = ['umfpack', 'amd'] > > library_dirs = ['/sw64/lib'] > > > > lapack_opt_info: > > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > > extra_compile_args = ['-msse3'] > > define_macros = [('NO_ATLAS_INFO', 3)] > > > > blas_opt_info: > > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] > > extra_compile_args = ['-msse3', > > '-I/System/Library/Frameworks/vecLib.framework/Headers'] > > define_macros = [('NO_ATLAS_INFO', 3)] > > > Have you tried dropping libraries=.... from setup.py? You should really > build your extension the way SciPy has been built, meaning modifying > setup.py so that gcc is invoked in the way described by blas_opt_info above. 
> > Dag Sverre > Finally I used following setup.py from distutils.core import setup from distutils.extension import Extension from Cython.Distutils import build_ext import os import numpy as np ext_modules=[ Extension("lapack", ["lapack.pyx"], libraries=['lapack', 'cblas'], extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'], extra_compile_args = ['-faltivec','-I/System/Library/Frameworks/vecLib.framework/Headers'], define_macros = [('NO_ATLAS_INFO', 3)], include_dirs=[np.get_include()]), Extension("testlapack", ["testlapack.pyx"], include_dirs=[np.get_include()]), ] setup( name = 'LAPACK wrapping demo', cmdclass = {'build_ext': build_ext}, ext_modules = ext_modules, ) I build it using /usr/bin/python, lapack.so and testlapack.so are used following libs, but it still don't work. otool -L lapack.so lapack.so: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/ vecLib.framework/Versions/A/libLAPACK.dylib (compatibility version 1.0.0, current version 219.0.0) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/ vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 219.0.0) /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0) /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 246.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.0.0) otool -L testlapack.so testlapack.so: /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 246.0.0) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, c urrent version 125.0.0) But it still doesn't work ( Max From dagss at student.matnat.uio.no Wed Dec 9 14:43:30 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Dec 2009 14:43:30 +0100 Subject: [Cython] cython + lapack on OS X 10.6 In-Reply-To: References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> 
<4B1D801A.7050108@student.matnat.uio.no> <4B1E5676.8040207@student.matnat.uio.no> <4B1E5E2E.5080605@student.matnat.uio.no> Message-ID: <4B1FA982.8060307@student.matnat.uio.no> Max Mayzel wrote: > Dag Sverre Seljebotn writes: > > >>>>>>> Hi, >>>>>>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. >>>>>>> First, while compiling I faced with following error >>>>>>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 >>>>>>> build/temp.macosx-10.4-i386-2.6/lapack.o >>>>>>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas >>>>>>> -latlas -o lapack.so -g >>>>>>> Undefined symbols: >>>>>>> "_clapack_dgesv", referenced from: >>>>>>> _dgesv in lapack.o >>>>>>> _dsolve in lapack.o >>>>>>> ld: symbol(s) not found >>>>>>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully >>>>>>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) >>>>>>> but when I run testlapack.test() python froze, using 100% of CPU, >>>>>>> below is debuging info. 
>>>>>>> >>>>> In [3]: numpy.show_config() >>>>> lapack_opt_info: >>>>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>>>> extra_compile_args = ['-msse3'] >>>>> define_macros = [('NO_ATLAS_INFO', 3)] >>>>> >>>>> blas_opt_info: >>>>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>>>> extra_compile_args = ['-msse3', >>>>> '-I/System/Library/Frameworks/vecLib.framework/Headers'] >>>>> define_macros = [('NO_ATLAS_INFO', 3)] >>>>> >>> Here it is >>> In [2]: scipy.show_config() >>> amd_info: >>> libraries = ['amd'] >>> library_dirs = ['/sw64/lib'] >>> >>> umfpack_info: >>> libraries = ['umfpack', 'amd'] >>> library_dirs = ['/sw64/lib'] >>> >>> lapack_opt_info: >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>> extra_compile_args = ['-msse3'] >>> define_macros = [('NO_ATLAS_INFO', 3)] >>> >>> blas_opt_info: >>> extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] >>> extra_compile_args = ['-msse3', >>> '-I/System/Library/Frameworks/vecLib.framework/Headers'] >>> define_macros = [('NO_ATLAS_INFO', 3)] >>> >>> >> Have you tried dropping libraries=.... from setup.py? You should really >> build your extension the way SciPy has been built, meaning modifying >> setup.py so that gcc is invoked in the way described by blas_opt_info above. 
>> >> Dag Sverre >> >> > Finally I used following setup.py > > from distutils.core import setup > from distutils.extension import Extension > from Cython.Distutils import build_ext > import os > import numpy as np > > ext_modules=[ > Extension("lapack", > ["lapack.pyx"], > libraries=['lapack', 'cblas'], > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'], > extra_compile_args = > ['-faltivec','-I/System/Library/Frameworks/vecLib.framework/Headers'], > define_macros = [('NO_ATLAS_INFO', 3)], > include_dirs=[np.get_include()]), > Extension("testlapack", > ["testlapack.pyx"], > include_dirs=[np.get_include()]), > ] > > setup( > name = 'LAPACK wrapping demo', > cmdclass = {'build_ext': build_ext}, > ext_modules = ext_modules, > ) > > I build it using /usr/bin/python, lapack.so and testlapack.so are used following > libs, but it still don't work. > otool -L lapack.so > lapack.so: > /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/ > vecLib.framework/Versions/A/libLAPACK.dylib (compatibility version 1.0.0, > current version 219.0.0) > /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/ > vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, > current version 219.0.0) > /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate > (compatibility version 1.0.0, current version 4.0.0) > /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, > current version 246.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, > current version 125.0.0) > > otool -L testlapack.so > testlapack.so: > /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, > current version 246.0.0) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, c > urrent version 125.0.0) > > But it still doesn't work ( > You mentioned Sage earlier...Sage comes with its own LAPACK, so by using "sage -python" BOTH to get scipy.show_info() and put the information in setup.py AND do the compilation you might get different 
results... Dag Sverre From greg.ewing at canterbury.ac.nz Wed Dec 9 22:35:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Dec 2009 10:35:42 +1300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: References: <4B1EC5DA.2080702@canterbury.ac.nz> Message-ID: <4B20182E.1040704@canterbury.ac.nz> Lisandro Dalcin wrote: > The point is that NULL is not a valid return value, but MPI_COMM_NULL is. In that case, you don't have a problem in the first place, because you *want* something that is not a valid return value. Are you sure you understand what the 'except' clause is doing here? It *doesn't* cause an exception to be raised when the routine returns that value. It's an out-of-band value for Pyrex to use to indicate that the routine has, or may have, already raised an exception. If you're calling an external routine, you need to check for whatever it uses to signal an error and raise and exception yourself. When you do that, Pyrex will use the exception value you've declared to signal that to calling Python-aware code. The exception value is ideally a value that the routine can't ever return normally, or (with '?') one that it returns rarely. If the external routine is guaranteed never to return NULL, then that's an ideal unconditional exception value for your wrapper routine to use. -- Greg From dagss at student.matnat.uio.no Wed Dec 9 22:38:59 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Dec 2009 22:38:59 +0100 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: <4B20182E.1040704@canterbury.ac.nz> References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> Message-ID: <4B2018F3.6010309@student.matnat.uio.no> Greg Ewing wrote: > Lisandro Dalcin wrote: > >> The point is that NULL is not a valid return value, but MPI_COMM_NULL is. 
> > In that case, you don't have a problem in the first > place, because you *want* something that is not a > valid return value. > > Are you sure you understand what the 'except' clause > is doing here? It *doesn't* cause an exception to be > raised when the routine returns that value. It's an > out-of-band value for Pyrex to use to indicate that > the routine has, or may have, already raised an > exception. > > If you're calling an external routine, you need to > check for whatever it uses to signal an error and raise > and exception yourself. When you do that, Pyrex will > use the exception value you've declared to signal > that to calling Python-aware code. > > The exception value is ideally a value that the routine > can't ever return normally, or (with '?') one that > it returns rarely. If the external routine is guaranteed > never to return NULL, then that's an ideal unconditional > exception value for your wrapper routine to use. I believe the idea is that the routine doesn't return a pointer, but a struct or similar. Of course, also structs can have NULL-like values defined for them. -- Dag Sverre From dalcinl at gmail.com Thu Dec 10 01:37:12 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 9 Dec 2009 21:37:12 -0300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: <4B2018F3.6010309@student.matnat.uio.no> References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> Message-ID: On Wed, Dec 9, 2009 at 6:38 PM, Dag Sverre Seljebotn wrote: > Greg Ewing wrote: >> Lisandro Dalcin wrote: >> >>> The point is that NULL is not a valid return value, but MPI_COMM_NULL is. >> >> In that case, you don't have a problem in the first >> place, because you *want* something that is not a >> valid return value. >> >> Are you sure you understand what the 'except' clause >> is doing here? 
It *doesn't* cause an exception to be
>> raised when the routine returns that value. It's an
>> out-of-band value for Pyrex to use to indicate that
>> the routine has, or may have, already raised an
>> exception.
>>
>> If you're calling an external routine, you need to
>> check for whatever it uses to signal an error and raise
>> and exception yourself. When you do that, Pyrex will
>> use the exception value you've declared to signal
>> that to calling Python-aware code.
>>
>> The exception value is ideally a value that the routine
>> can't ever return normally, or (with '?') one that
>> it returns rarely. If the external routine is guaranteed
>> never to return NULL, then that's an ideal unconditional
>> exception value for your wrapper routine to use.
>
> I believe the idea is that the routine doesn't return a pointer, but a
> struct or similar. Of course, also structs can have NULL-like values
> defined for them.
>

Indeed. I need to return an MPI_Comm value, and MPI_Comm can be
anything (well, it usually is an integer or a pointer, depending on
the implementation). So I need to code this:

cdef class Comm:
    cdef MPI_Comm ob_mpi

cdef api MPI_Comm PyMPIComm_AsComm(object o) except? MPI_COMM_NULL:
    return (<Comm?>o).ob_mpi

Inside that function I do a typecheck to make sure that object 'o' has
the appropriate type. So if the typecheck fails, Cython/Pyrex
should set the Python error, and I want MPI_COMM_NULL to be returned
to signal the failure to the caller.

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From mayzel at gmail.com Thu Dec 10 12:02:38 2009
From: mayzel at gmail.com (Max Mayzel)
Date: Thu, 10 Dec 2009 11:02:38 +0000 (UTC)
Subject: [Cython] cython + lapack on OS X 10.6
References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com>
 <4B1D801A.7050108@student.matnat.uio.no>
 <4B1E5676.8040207@student.matnat.uio.no>
 <4B1E5E2E.5080605@student.matnat.uio.no>
 <4B1FA982.8060307@student.matnat.uio.no>
Message-ID:

Dag Sverre Seljebotn writes:

>
> Max Mayzel wrote:
> > Dag Sverre Seljebotn writes:
> >
> >
> >>> Here it is
> >>> In [2]: scipy.show_config()
> >>> amd_info:
> >>>     libraries = ['amd']
> >>>     library_dirs = ['/sw64/lib']
> >>>
> >>> umfpack_info:
> >>>     libraries = ['umfpack', 'amd']
> >>>     library_dirs = ['/sw64/lib']
> >>>
> >>> lapack_opt_info:
> >>>     extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
> >>>     extra_compile_args = ['-msse3']
> >>>     define_macros = [('NO_ATLAS_INFO', 3)]
> >>>
> >>> blas_opt_info:
> >>>     extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
> >>>     extra_compile_args = ['-msse3',
> >>>     '-I/System/Library/Frameworks/vecLib.framework/Headers']
> >>>     define_macros = [('NO_ATLAS_INFO', 3)]
> >>>
> >>>
> >> Have you tried dropping libraries=.... from setup.py? You should really
> >> build your extension the way SciPy has been built, meaning modifying
> >> setup.py so that gcc is invoked in the way described by blas_opt_info above.
> >> > >> Dag Sverre > >> > >> > > Finally I used following setup.py > > > > from distutils.core import setup > > from distutils.extension import Extension > > from Cython.Distutils import build_ext > > import os > > import numpy as np > > > > ext_modules=[ > > Extension("lapack", > > ["lapack.pyx"], > > libraries=['lapack', 'cblas'], > > extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'], > > extra_compile_args = > > ['-faltivec','-I/System/Library/Frameworks/vecLib.framework/Headers'], > > define_macros = [('NO_ATLAS_INFO', 3)], > > include_dirs=[np.get_include()]), > > Extension("testlapack", > > ["testlapack.pyx"], > > include_dirs=[np.get_include()]), > > ] > > > > setup( > > name = 'LAPACK wrapping demo', > > cmdclass = {'build_ext': build_ext}, > > ext_modules = ext_modules, > > ) > > > You mentioned Sage earlier...Sage comes with its own LAPACK, so by using > "sage -python" BOTH to get scipy.show_info() and put the information in > setup.py AND do the compilation you might get different results... > > Dag Sverre > It seems, that sage is also build against mac accelerated framework, sage -python Python 2.6.2 (r262:71600, Nov 14 2009, 10:42:01) [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.show_config() lapack_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3'] define_macros = [('NO_ATLAS_INFO', 3)] blas_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] define_macros = [('NO_ATLAS_INFO', 3)] >>> import scipy >>> scipy.show_config() umfpack_info: NOT AVAILABLE lapack_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3'] define_macros = [('NO_ATLAS_INFO', 3)] blas_opt_info: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] define_macros = [('NO_ATLAS_INFO', 3)] >>> exit and it still doesn't work ( Max From mayzel at gmail.com Thu Dec 10 12:22:31 2009 From: mayzel at gmail.com (Max Mayzel) Date: Thu, 10 Dec 2009 11:22:31 +0000 (UTC) Subject: [Cython] cython + lapack on OS X 10.6 References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> Message-ID: Dag Sverre Seljebotn writes: > > Max wrote: > > Hi, > > I'm trying to run testlapack example from cython-notur09 on OS X 10.6. > > First, while compiling I faced with following error > > gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 > > build/temp.macosx-10.4-i386-2.6/lapack.o > > -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas > > -latlas -o lapack.so -g > > Undefined symbols: > > "_clapack_dgesv", referenced from: > > _dgesv in lapack.o > > _dsolve in lapack.o > > ld: symbol(s) not found > > So I changed clapack_dgesw in lapack.pxd to dgesw and successfully > > build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) > > but when I run testlapack.test() python froze, using 100% of CPU, > > below is debuging info. 
> > > >>>> tl.test() > > test start > > ^C > > Program received signal SIGINT, Interrupt. > > 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, > > __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, > > __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) > > at lapack.c:775 > > 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, > > int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int > > *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { > > (gdb) exit > > > > I was also tried to build it against Mac OS X accelerated framework > > and the result is the same, while on linux this example works fine. > > Any suggestions what's wrong? Dag, here are clapack.h on linux int clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb); and on Mac accelerated framework /* Subroutine */ int dgesv_(__CLPK_integer *n, __CLPK_integer *nrhs, __CLPK_doublereal *a, __CLPK_integer *lda, __CLPK_integer *ipiv, __CLPK_doublereal *b, __CLPK_integer *ldb, __CLPK_integer *info); Max From dagss at student.matnat.uio.no Thu Dec 10 14:10:04 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 10 Dec 2009 14:10:04 +0100 Subject: [Cython] cython + lapack on OS X 10.6 In-Reply-To: References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com> <4B1D801A.7050108@student.matnat.uio.no> Message-ID: <4B20F32C.4060708@student.matnat.uio.no> Max Mayzel wrote: > Dag Sverre Seljebotn writes: > > >> Max wrote: >> >>> Hi, >>> I'm trying to run testlapack example from cython-notur09 on OS X 10.6. 
>>> First, while compiling I faced with following error >>> gcc -L/sw64/lib -bundle -L/sw64/lib/python2.6/config -lpython2.6 >>> build/temp.macosx-10.4-i386-2.6/lapack.o >>> -L/Applications/development/sage//local/lib -llapack -lf77blas -lcblas >>> -latlas -o lapack.so -g >>> Undefined symbols: >>> "_clapack_dgesv", referenced from: >>> _dgesv in lapack.o >>> _dsolve in lapack.o >>> ld: symbol(s) not found >>> So I changed clapack_dgesw in lapack.pxd to dgesw and successfully >>> build it against SAGE (sage-4.2.1-OSX10.6-Intel-64bit-i386-Darwin.dmg) >>> but when I run testlapack.test() python froze, using 100% of CPU, >>> below is debuging info. >>> >>> >>>>>> tl.test() >>>>>> >>> test start >>> ^C >>> Program received signal SIGINT, Interrupt. >>> 0x00000001005280a4 in dgesv (__pyx_v_Order=CblasRowMajor, __pyx_v_N=3, >>> __pyx_v_NRHS=1, __pyx_v_A=0x7fff5fbfdda0, __pyx_v_lda=3, >>> __pyx_v_ipiv=0x7fff5fbfde10, __pyx_v_B=0x7fff5fbfddf0, __pyx_v_ldb=3) >>> at lapack.c:775 >>> 775 static int dgesv(enum CBLAS_ORDER __pyx_v_Order, int __pyx_v_N, >>> int __pyx_v_NRHS, double *__pyx_v_A, int __pyx_v_lda, int >>> *__pyx_v_ipiv, double *__pyx_v_B, int __pyx_v_ldb) { >>> (gdb) exit >>> >>> I was also tried to build it against Mac OS X accelerated framework >>> and the result is the same, while on linux this example works fine. >>> Any suggestions what's wrong? >>> > > Dag, here are clapack.h on linux > int clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, > double *A, const int lda, int *ipiv, > double *B, const int ldb); > > and on Mac accelerated framework > > /* Subroutine */ int dgesv_(__CLPK_integer *n, __CLPK_integer *nrhs, > __CLPK_doublereal *a, __CLPK_integer > *lda, __CLPK_integer *ipiv, __CLPK_doublereal *b, __CLPK_integer *ldb, > __CLPK_integer *info); > > Max > OK, that explains it -- they're different APIs. What clapack.h on Mac apparently contains is a direct interface to the FORTRAN lapack. 
You need to declare the function above to Cython (note that the
signature is very different, taking pointers to integers etc.), and
write a wrapper in Cython which

a) Passes e.g. "&n" rather than "n", "&nrhs" rather than "nrhs" and so on
(and make sure n and nrhs and so on all have the type __CLPK_integer).

b) The data (__CLPK_doublereal*s) should be passed like this:

def dgesv(..., a, b):
    cdef np.ndarray[double, ndim=2, mode='fortran'] a_work = np.asfortranarray(a)
    cdef np.ndarray[double, ndim=2, mode='fortran'] b_work = np.asfortranarray(b)
    errcode = dgesv_(...., <__CLPK_doublereal*>a_work.data, <...>b_work.data, ...)
    ...
    if b_work is not b:
        b[...] = b_work # verbatim "..."! Copies back data if contiguous copy had to be made

This overwrites b with the solution, which is perhaps not what you want,
but makes for a raw wrapper. Otherwise:

    ...
    b_work = b.copy('F')
    return b_work # or some slice of it...don't remember LAPACK here

Dag Sverre

From dalcinl at gmail.com Thu Dec 10 16:13:56 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 10 Dec 2009 12:13:56 -0300
Subject: [Cython] cannot cythonize mpi4py after 885dbfad02aa
Message-ID:

After this changeset ...

changeset:   2764:885dbfad02aa
user:        Stefan Behnel
date:        Tue Dec 08 01:05:01 2009 +0100
summary:     translate Python float calculations into C doubles

I cannot Cythonize mpi4py (note: I've not enabled type inference). The
failure happens in many methods like this one:

cdef class Datatype:
    ...
    def Dup(self):
        """
        Duplicate a datatype
        """
        cdef Datatype datatype = type(self)()
        .....
        return datatype


Traceback below:

python ./conf/cythonize.py
Traceback (most recent call last):
  File "./conf/cythonize.py", line 77, in <module>
    run('mpi4py.MPI.pyx', 'src')
  File "./conf/cythonize.py", line 58, in run
    output_h=os.path.join('include', package),
  File "./conf/cythonize.py", line 22, in cythonize
    result = compile(source, options)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", line 737, in compile
    return compile_single(source, options, full_module_name)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", line 682, in compile_single
    return run_pipeline(source, options, full_module_name)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", line 570, in run_pipeline
    err, enddata = context.run_pipeline(pipeline, source)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Main.py", line 221, in run_pipeline
    data = phase(data)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 265, in __call__
    return super(CythonTransform, self).__call__(node)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 248, in __call__
    return self.visit(root)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit
    return handler_method(obj)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/ParseTreeTransforms.py", line 799, in visit_ModuleNode
    self.visitchildren(node)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren
    result = TreeVisitor.visitchildren(self, parent, attrs)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 199, in visitchildren
    childretval = self.visitchild(child, parent, attr, None)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild
    result = self.visit(child)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit
    return handler_method(obj)
  File
"/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 275, in visit_Node self.visitchildren(node) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren result = TreeVisitor.visitchildren(self, parent, attrs) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 197, in visitchildren childretval = [self.visitchild(x, parent, attr, idx) for idx, x in enumerate(child)] File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild result = self.visit(child) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit return handler_method(obj) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 275, in visit_Node self.visitchildren(node) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren result = TreeVisitor.visitchildren(self, parent, attrs) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 197, in visitchildren childretval = [self.visitchild(x, parent, attr, idx) for idx, x in enumerate(child)] File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild result = self.visit(child) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit return handler_method(obj) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 275, in visit_Node self.visitchildren(node) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren result = TreeVisitor.visitchildren(self, parent, attrs) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 197, in visitchildren childretval = [self.visitchild(x, parent, attr, idx) for idx, x in enumerate(child)] File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild result = self.visit(child) File 
"/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit return handler_method(obj) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 275, in visit_Node self.visitchildren(node) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren result = TreeVisitor.visitchildren(self, parent, attrs) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 199, in visitchildren childretval = self.visitchild(child, parent, attr, None) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild result = self.visit(child) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit return handler_method(obj) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 275, in visit_Node self.visitchildren(node) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 227, in visitchildren result = TreeVisitor.visitchildren(self, parent, attrs) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 197, in visitchildren childretval = [self.visitchild(x, parent, attr, idx) for idx, x in enumerate(child)] File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 150, in visitchild result = self.visit(child) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Visitor.py", line 46, in visit return handler_method(obj) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/ParseTreeTransforms.py", line 804, in visit_FuncDefNode node.body.analyse_expressions(node.local_scope) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Nodes.py", line 336, in analyse_expressions stat.analyse_expressions(env) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Nodes.py", line 2936, in analyse_expressions self.analyse_types(env) File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/Nodes.py", line 3027, in 
analyse_types
    self.rhs.analyse_types(env)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/ExprNodes.py", line 4271, in analyse_types
    self.operand.analyse_types(env)
  File "/u/dalcinl/Devel/Cython/cython-devel/Cython/Compiler/ExprNodes.py", line 2452, in analyse_types
    if func_type is Builtin.type_type and function.entry.is_builtin and \
AttributeError: 'SimpleCallNode' object has no attribute 'entry'

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From mayzel at gmail.com Thu Dec 10 17:46:41 2009
From: mayzel at gmail.com (Max Mayzel)
Date: Thu, 10 Dec 2009 16:46:41 +0000 (UTC)
Subject: [Cython] cython + lapack on OS X 10.6
References: <164c760d0912070500v2087ad7v5653642447871384@mail.gmail.com>
 <4B1D801A.7050108@student.matnat.uio.no>
 <4B20F32C.4060708@student.matnat.uio.no>
Message-ID:

Thanks Dag, it works.

Max

From robertwb at math.washington.edu Thu Dec 10 19:54:06 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 10 Dec 2009 10:54:06 -0800
Subject: [Cython] except values: could we relax to non-constant expressions?
In-Reply-To:
References: <4B1EC5DA.2080702@canterbury.ac.nz>
 <4B20182E.1040704@canterbury.ac.nz>
 <4B2018F3.6010309@student.matnat.uio.no>
Message-ID: <898CA4F0-CF29-474D-9E19-339690F05919@math.washington.edu>

On Dec 9, 2009, at 4:37 PM, Lisandro Dalcin wrote:

> On Wed, Dec 9, 2009 at 6:38 PM, Dag Sverre Seljebotn
> wrote:
>> Greg Ewing wrote:
>>> Lisandro Dalcin wrote:
>>>
>>>> The point is that NULL is not a valid return value, but
>>>> MPI_COMM_NULL is.
>>>
>>> In that case, you don't have a problem in the first
>>> place, because you *want* something that is not a
>>> valid return value.
>>> >>> Are you sure you understand what the 'except' clause >>> is doing here? It *doesn't* cause an exception to be >>> raised when the routine returns that value. It's an >>> out-of-band value for Pyrex to use to indicate that >>> the routine has, or may have, already raised an >>> exception. >>> >>> If you're calling an external routine, you need to >>> check for whatever it uses to signal an error and raise >>> and exception yourself. When you do that, Pyrex will >>> use the exception value you've declared to signal >>> that to calling Python-aware code. >>> >>> The exception value is ideally a value that the routine >>> can't ever return normally, or (with '?') one that >>> it returns rarely. If the external routine is guaranteed >>> never to return NULL, then that's an ideal unconditional >>> exception value for your wrapper routine to use. >> >> I believe the idea is that the routine doesn't return a pointer, >> but a >> struct or similar. Of course, also structs can have NULL-like values >> defined for them. >> > > Indeed. I need to return a MPI_Comm value, and MPI_Comm can be > anything (well, it usually is an integer or a pointer, depending on > the implementation). So I need to code this: > > cdef class Comm: > MPI_Comm ob_mpi > > cdef api MPI_Comm PyMPIComm_AsComm(object o) except? MPI_COMM_NULL: > return (o).ob_mpi > > Inside that function I do a typecheck to make sure that object 'o' do > have the appropriate type. So if the typecheck fail, Cython/Pyrex > should set the Python error, and I want MPI_COMM_NULL to be returned > to signal the failure to the caller. I would be fine with allowing extern variables as exception values. 
- Robert From robertwb at math.washington.edu Thu Dec 10 20:01:39 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Dec 2009 11:01:39 -0800 Subject: [Cython] small typo in profiling tutorial In-Reply-To: <79b79e730912050402id06ed19h6dbb218c6a95d9b9@mail.gmail.com> References: <79b79e730912050402id06ed19h6dbb218c6a95d9b9@mail.gmail.com> Message-ID: On Dec 5, 2009, at 4:02 AM, Francesco Guerrieri wrote: > Hi, I was reading the profiling tutorial http://docs.cython.org/src/tutorial/profiling_tutorial.html > . > ISTM that there is a small typo, approx_py is referenced instead of > recip_square. Thanks. > I attach a small patch for the markup text. > Sorry if this is not the best way of communication for such small > issues :-) No, this is perfect. - Robert From robertwb at math.washington.edu Thu Dec 10 20:08:42 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 10 Dec 2009 11:08:42 -0800 Subject: [Cython] [Docs bug] Wrong reference syntax In-Reply-To: <4B1D436A.4080405@lophus.org> References: <4B1D436A.4080405@lophus.org> Message-ID: <6E4806AF-5BBB-4809-8234-D441315F4F31@math.washington.edu> Thanks. Fixed. On Dec 7, 2009, at 10:03 AM, jonas at lophus.org wrote: > You've got a syntax error on > http://docs.cython.org/src/quickstart/build.html (ref to > reference/compilation). > > Regards > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From greg.ewing at canterbury.ac.nz Fri Dec 11 00:01:44 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 11 Dec 2009 12:01:44 +1300 Subject: [Cython] except values: could we relax to non-constant expressions? 
In-Reply-To: References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> Message-ID: <4B217DD8.7020401@canterbury.ac.nz> Lisandro Dalcin wrote: > I need to return a MPI_Comm value, and MPI_Comm can be > anything (well, it usually is an integer or a pointer, depending on > the implementation). If it's actually a struct, then you're in trouble, because an exception value has to be something comparable with == in C, and you can't do that with structs. If it's a scalar type, you can probably tell Pyrex that it's an int and use an enum for the exception value. -- Greg From stefan_ml at behnel.de Fri Dec 11 14:14:25 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 Dec 2009 14:14:25 +0100 Subject: [Cython] cannot cythonize mpi4py after 885dbfad02aa In-Reply-To: References: Message-ID: <4B2245B1.9040008@behnel.de> Hi Lisandro, Lisandro Dalcin, 10.12.2009 16:13: > After this changeset ... > > changeset: 2764:885dbfad02aa > user: Stefan Behnel > date: Tue Dec 08 01:05:01 2009 +0100 > summary: translate Python float calculations into C doubles > > I cannot Cythonize mpi4py (note: I've not enabled type inference). The > failure happens in many methods like this one: > > cdef class Datatype: > ... > def Dup(self): > """ > Duplicate a datatype > """ > cdef Datatype datatype = type(self)() > ..... > return datatype > > > Traceback below: > [...] > line 2452, in analyse_types > if func_type is Builtin.type_type and function.entry.is_builtin and \ > AttributeError: 'SimpleCallNode' object has no attribute 'entry' Thanks for the test case, I can reproduce this. I'll look into it. 
Stefan From stefan_ml at behnel.de Fri Dec 11 15:49:47 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 Dec 2009 15:49:47 +0100 Subject: [Cython] cannot cythonize mpi4py after 885dbfad02aa In-Reply-To: <4B2245B1.9040008@behnel.de> References: <4B2245B1.9040008@behnel.de> Message-ID: <4B225C0B.1020008@behnel.de> Stefan Behnel, 11.12.2009 14:14: > Hi Lisandro, > > Lisandro Dalcin, 10.12.2009 16:13: >> After this changeset ... >> >> changeset: 2764:885dbfad02aa >> user: Stefan Behnel >> date: Tue Dec 08 01:05:01 2009 +0100 >> summary: translate Python float calculations into C doubles >> >> I cannot Cythonize mpi4py (note: I've not enabled type inference). The >> failure happens in many methods like this one: >> >> cdef class Datatype: >> ... >> def Dup(self): >> """ >> Duplicate a datatype >> """ >> cdef Datatype datatype = type(self)() >> ..... >> return datatype >> >> >> Traceback below: >> [...] >> line 2452, in analyse_types >> if func_type is Builtin.type_type and function.entry.is_builtin and \ >> AttributeError: 'SimpleCallNode' object has no attribute 'entry' > > Thanks for the test case, I can reproduce this. I'll look into it. Yep, that code was actually much too generic. Fixed here: http://hg.cython.org/cython-devel/rev/0c48cba30316 Stefan From dalcinl at gmail.com Fri Dec 11 16:17:39 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 11 Dec 2009 12:17:39 -0300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: <4B217DD8.7020401@canterbury.ac.nz> References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> <4B217DD8.7020401@canterbury.ac.nz> Message-ID: On Thu, Dec 10, 2009 at 8:01 PM, Greg Ewing wrote: > Lisandro Dalcin wrote: >> I need to return a MPI_Comm value, and MPI_Comm can be >> anything (well, it usually is an integer or a pointer, depending on >> the implementation). 
> > If it's actually a struct, then you're in trouble, because > an exception value has to be something comparable with > == in C, and you can't do that with structs. > OK. "Anything" was too much. It is an integer typedef or a pointer to an opaque struct, depending on the MPI implementation. But I make Cython see MPI_Comm as a pointer type. > > If it's a scalar type, you can probably tell Pyrex that > it's an int and use an enum for the exception value. > Well, at least in Cython, an enum is not compatible with all scalar types, just with integral types. As I have to return a pointer type, I lose. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Dec 11 16:26:21 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 11 Dec 2009 12:26:21 -0300 Subject: [Cython] cannot cythonize mpi4py after 885dbfad02aa In-Reply-To: <4B225C0B.1020008@behnel.de> References: <4B2245B1.9040008@behnel.de> <4B225C0B.1020008@behnel.de> Message-ID: On Fri, Dec 11, 2009 at 11:49 AM, Stefan Behnel wrote: > > Stefan Behnel, 11.12.2009 14:14: >> Hi Lisandro, >> >> Lisandro Dalcin, 10.12.2009 16:13: >>> After this changeset ... >>> >>> changeset: 2764:885dbfad02aa >>> user: Stefan Behnel >>> date: Tue Dec 08 01:05:01 2009 +0100 >>> summary: translate Python float calculations into C doubles >>> >>> I cannot Cythonize mpi4py (note: I've not enabled type inference). The >>> failure happens in many methods like this one: >>> >>> cdef class Datatype: >>> ... >>> def Dup(self): >>> """ >>> Duplicate a datatype >>> """ >>> cdef Datatype datatype = type(self)() >>> ..... >>>
return datatype >>> >>> >>> Traceback below: >>> [...] >>> line 2452, in analyse_types >>> if func_type is Builtin.type_type and function.entry.is_builtin and \ >>> AttributeError: 'SimpleCallNode' object has no attribute 'entry' >> >> Thanks for the test case, I can reproduce this. I'll look into it. > > Yep, that code was actually much too generic. Fixed here: > > http://hg.cython.org/cython-devel/rev/0c48cba30316 > Many thanks, Stefan. All is working from my side. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From david.n.mashburn at gmail.com Fri Dec 11 20:49:17 2009 From: david.n.mashburn at gmail.com (David Mashburn) Date: Fri, 11 Dec 2009 14:49:17 -0500 Subject: [Cython] Cpyx for automatic Module creation and inlining of Cython code... Message-ID: <4B22A23D.1010303@gmail.com> Hello Cython Developers, I wanted to announce "Cpyx", a module I've been working on off and on since 2006 that I use to automatically compile and also inline Cython code in my work (mostly because I like to do everything in one step). This is more-or-less a prototype, but it works for me on Windows, Mac, and Ubuntu, so I thought I'd share! I know it has similar goals to pyx_import, but I think the two are quite complementary... (and I couldn't figure out how to get numpy support in pyx_import when it came out...) My main hope for this is that it can give people a starting point for using manual compilation/distutils on their system (it is very verbose by default) and that it can automatically inline code with numpy support! If you find it useful, I think it is almost mature enough to be included in cython, and if not I certainly enjoy using it! In any case, I'd love some feedback.
Thanks for all the hard work you all are doing with Greg's brainchild! -David Mashburn -------------- next part -------------- A non-text attachment was scrubbed... Name: Cpyx.py Type: text/x-python Size: 15745 bytes Desc: not available Url : http://codespeak.net/pipermail/cython-dev/attachments/20091211/b2486d06/attachment.py From greg.ewing at canterbury.ac.nz Sat Dec 12 03:02:52 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 12 Dec 2009 15:02:52 +1300 Subject: [Cython] Idea for automatic encoding and decoding Message-ID: <4B22F9CC.3090606@canterbury.ac.nz> I've had an idea that might help with making the encoding and decoding of unicode strings more automatic. Suppose we have a way of expressing a type parameterised with an encoding, maybe something like encoding[name] We could have a few predefined ones, such as ctypedef encoding['ascii'] ascii ctypedef encoding['utf8'] utf8 ctypedef encoding['latin1'] latin1 These are Python object types. Internally they're represented as bytes objects, but the compiler knows statically that they have an encoding associated with them, and the appropriate encoding and decoding operations are performed when coercing from and to strings. Being bytes, they can also be cast to char * without any problem. So we can write things like cdef extern from "foo.h": void cflump(char *) def flump(utf8 s): cflump(s) Now we can pass a unicode string to flump() and it will first be encoded to bytes as utf8, and then passed to cflump() as a char *. For going the other way, we also need a corresponding family of C string types with associated encodings. We could give them different names, but that isn't really necessary, since we can re-use the same ones: cdef extern from "foo.h": utf8 *cbrazzle() This is unambiguous, because you can't declare a pointer to a Python object. What we're saying here is that cbrazzle() returns a char *, but it is to be understood as encoded in utf8. 
So we can write def brazzle(): return <str>cbrazzle() and the return value from cbrazzle is automatically decoded using utf8. I've put a cast there because otherwise there would be an ambiguity -- should a utf8 * be converted to a str on coercion to a Python type, or a utf8 (i.e. bytes) object? Having to use a cast is a bit ugly, though. It could be eliminated by allowing a def function to specify a return type: def str brazzle(): return cbrazzle() Or there could simply be a rule that resolves the ambiguity in favour of str whenever the target type is a generic Python object, in which case we could simply write def brazzle(): return cbrazzle() What do you think? Seems like this sort of scheme would keep the encoding being used at each point fairly explicit without being too intrusive. -- Greg From dalcinl at gmail.com Sat Dec 12 03:50:03 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 11 Dec 2009 23:50:03 -0300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B22F9CC.3090606@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: On Fri, Dec 11, 2009 at 11:02 PM, Greg Ewing wrote: > I've had an idea that might help with making the > encoding and decoding of unicode strings more > automatic. > > Suppose we have a way of expressing a type parameterised > with an encoding, maybe something like > > encoding[name] > > We could have a few predefined ones, such as > > ctypedef encoding['ascii'] ascii > ctypedef encoding['utf8'] utf8 > ctypedef encoding['latin1'] latin1 > Just in case, did you mean ctypedef encoding['ascii'] char* ascii ctypedef encoding['utf8'] char* utf8 ctypedef encoding['latin1'] char* latin1 ?? > > What do you think? > A long time ago (about 4 years), Guido commented on Python-Dev that Greg's taste in language features is hard to beat. > Seems like this sort of scheme would > keep the encoding being used at each point fairly explicit > without being too intrusive.
> I think this does not handle Robert's concern of updating all the code in Sage, as every usage of char* will still need review. But Greg's proposal would be really handy for NEW code targeting Py3k. Moreover, Robert's idea of using compiler directives to automate the char* <-> str conversion could build on Greg's proposal. In short, I'm definitely +1 on this approach. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Sat Dec 12 08:52:13 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 Dec 2009 08:52:13 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B22F9CC.3090606@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: <4B234BAD.9060900@behnel.de> Hi Greg, Greg Ewing, 12.12.2009 03:02: > I've had an idea that might help with making the > encoding and decoding of unicode strings more > automatic. > > Suppose we have a way of expressing a type parameterised > with an encoding, maybe something like > > encoding[name] > > We could have a few predefined ones, such as > > ctypedef encoding['ascii'] ascii > ctypedef encoding['utf8'] utf8 > ctypedef encoding['latin1'] latin1 > > These are Python object types. Internally they're > represented as bytes objects, but the compiler knows > statically that they have an encoding associated with > them, and the appropriate encoding and decoding > operations are performed when coercing from and to > strings. > > Being bytes, they can also be cast to char * without > any problem.
So we can write things like > > cdef extern from "foo.h": > void cflump(char *) > > def flump(utf8 s): > cflump(s) > > Now we can pass a unicode string to flump() and it will > first be encoded to bytes as utf8, and then passed to > cflump() as a char *. Thanks for bringing my recent proposals back into the discussion. I actually prefer something closer to the existing buffer syntax, but I'm certainly +1 on such a feature. Note that Cython has cpdef functions already, so adding a return type to def functions isn't far off. However, the above describes a new feature - not a solution to Robert's intention of making string recoding fully automatic for existing code. Stefan From njs at pobox.com Sat Dec 12 10:05:03 2009 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 12 Dec 2009 01:05:03 -0800 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? Message-ID: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux), some code of mine broke. Specifically, it's code for reading a binary format, and in the tests I had a string that made Cython fail to compile with the error: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding. As an example, here's a complete file that Cython 0.12 will refuse to compile: ------------- s = "\x12\x34\x9f\x65" ------------- I'm not sure why it's nattering about the source code encoding when the problem is with explicitly quoted byte values, but... my question is, I can fix this by adding a "b" sigil on the front, but that's incompatible with earlier versions of Cython. Is there any way to write this string that will work with all versions of Cython? (And was it really intentional to break Python source compatibility so badly?) 
-- Nathaniel From stefan_ml at behnel.de Sat Dec 12 10:51:38 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 Dec 2009 10:51:38 +0100 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> Message-ID: <4B2367AA.7090407@behnel.de> Nathaniel Smith, 12.12.2009 10:05: > After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux), > some code of mine broke. Specifically, it's code for reading a binary > format, and in the tests I had a string that made Cython fail to > compile with the error: > String decoding as 'UTF-8' failed. Consider using a byte string or > unicode string explicitly, or adjust the source code encoding. > > As an example, here's a complete file that Cython 0.12 will refuse to compile: > ------------- > s = "\x12\x34\x9f\x65" > ------------- > > I'm not sure why it's nattering about the source code encoding when > the problem is with explicitly quoted byte values Because you are using a 'str' literal, which needs to be decoded in Python 3 to become the equivalent str (i.e. unicode) object. A check for that is required for the semantics of the 'str' type in Cython, as it would otherwise be impossible to switch the type in the generated C code - you simply can't write out a unicode literal into C in a portable way. The relevant CEP is here: http://wiki.cython.org/enhancements/stringliterals > but... my question > is, I can fix this by adding a "b" sigil on the front, but that's > incompatible with earlier versions of Cython. Yes, bytes literals were fixed up fairly recently - may have been 0.11 or so. Given that they were partly broken before that, I don't really see why you would want to support earlier versions of Cython anyway. > Is there any way to > write this string that will work with all versions of Cython? 
I'd just drop support for earlier Cython versions and go with an explicit b'...' literal. > (And was it really intentional to break Python source compatibility so badly?) What do you mean? And what version of Python are you referring to? Stefan From greg.ewing at canterbury.ac.nz Sat Dec 12 11:44:15 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 12 Dec 2009 23:44:15 +1300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: <4B2373FF.2030406@canterbury.ac.nz> Lisandro Dalcin wrote: > Just in case, did you mean > > ctypedef encoding['ascii'] char* ascii > ctypedef encoding['utf8'] char* utf8 > ctypedef encoding['latin1'] char* latin1 No, I didn't.
 ctypedef encoding['utf8'] utf8
          ^^^^^^^^^^^^^^^^ ^^^^
          This denotes a   We are declaring this name
          type             to be an alias for that type
-- Greg From njs at vorpus.org Sat Dec 12 12:19:46 2009 From: njs at vorpus.org (Nathaniel Smith) Date: Sat, 12 Dec 2009 03:19:46 -0800 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <4B2367AA.7090407@behnel.de> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> <4B2367AA.7090407@behnel.de> Message-ID: <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> On Sat, Dec 12, 2009 at 1:51 AM, Stefan Behnel wrote: > Nathaniel Smith, 12.12.2009 10:05: >> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux), >> some code of mine broke. Specifically, it's code for reading a binary >> format, and in the tests I had a string that made Cython fail to >> compile with the error: >> String decoding as 'UTF-8' failed. Consider using a byte string or >> unicode string explicitly, or adjust the source code encoding.
>> >> As an example, here's a complete file that Cython 0.12 will refuse to compile: >> ------------- >> s = "\x12\x34\x9f\x65" >> ------------- >> >> I'm not sure why it's nattering about the source code encoding when >> the problem is with explicitly quoted byte values > > Because you are using a 'str' literal, which needs to be decoded in Python > 3 to become the equivalent str (i.e. unicode) object. A check for that is > required for the semantics of the 'str' type in Cython, as it would > otherwise be impossible to switch the type in the generated C code - you > simply can't write out a unicode literal into C in a portable way. > > The relevant CEP is here: > > http://wiki.cython.org/enhancements/stringliterals Sure, I know. But I'm not using Python 3 (I'm using 2.5.2, as mentioned), and that page says "Unmarked string literals, when used in a Python context, would be [...] byte strings in Py2", and the table labeled "Proposal" seems to imply that in Py2, cython will treat "foo" and b"foo" as equivalent (just as CPython would). Similarly, under "Cons" it notes that the changes under discussion may cause backwards compatibility problems when moving from Py2 to Py3, but it does not note that they also cause (IMHO rather more serious) backwards incompatibility between Cython 0.11+Py2 and Cython 0.12+Py2. >> but... my question >> is, I can fix this by adding a "b" sigil on the front, but that's >> incompatible with earlier versions of Cython. > > Yes, bytes literals were fixed up fairly recently - may have been 0.11 or > so. Given that they were partly broken before that, I don't really see why > you would want to support earlier versions of Cython anyway. Oh, does that work in 0.11? All the documentation I had found (e.g. at the top of that page you linked) only mentions py3-style string handling in the context of 0.12. That solves my personal problem. >> (And was it really intentional to break Python source compatibility so badly?) > > What do you mean? 
And what version of Python are you referring to? Just that -- AFAICT -- I can no longer rely on Py2 syntax for specifying string literals in Py2 extension modules. That seems odd. -- Nathaniel From dagss at student.matnat.uio.no Sat Dec 12 12:38:57 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 12 Dec 2009 12:38:57 +0100 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> <4B2367AA.7090407@behnel.de> <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> Message-ID: <1c7406972325640fd07d063cef502f6a.squirrel@webmail.uio.no> Nathaniel Smith wrote: > On Sat, Dec 12, 2009 at 1:51 AM, Stefan Behnel > wrote: >> Nathaniel Smith, 12.12.2009 10:05: >>> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux), >>> some code of mine broke. Specifically, it's code for reading a binary >>> format, and in the tests I had a string that made Cython fail to >>> compile with the error: >>> ? String decoding as 'UTF-8' failed. Consider using a byte string or >>> unicode string explicitly, or adjust the source code encoding. >>> >>> As an example, here's a complete file that Cython 0.12 will refuse to >>> compile: >>> ------------- >>> s = "\x12\x34\x9f\x65" >>> ------------- >>> >>> I'm not sure why it's nattering about the source code encoding when >>> the problem is with explicitly quoted byte values >> >> Because you are using a 'str' literal, which needs to be decoded in >> Python >> 3 to become the equivalent str (i.e. unicode) object. A check for that >> is >> required for the semantics of the 'str' type in Cython, as it would >> otherwise be impossible to switch the type in the generated C code - you >> simply can't write out a unicode literal into C in a portable way. 
>> >> The relevant CEP is here: >> >> http://wiki.cython.org/enhancements/stringliterals > > Sure, I know. But I'm not using Python 3 (I'm using 2.5.2, as > mentioned), and that page says "Unmarked string literals, when used in > a Python context, would be [...] byte strings in Py2", and the table > labeled "Proposal" seems to imply that in Py2, cython will treat "foo" > and b"foo" as equivalent (just as CPython would). Similarly, under > "Cons" it notes that the changes under discussion may cause backwards > compatibility problems when moving from Py2 to Py3, but it does not > note that they also cause (IMHO rather more serious) backwards > incompatibility between Cython 0.11+Py2 and Cython 0.12+Py2. > >>> but... my question >>> is, I can fix this by adding a "b" sigil on the front, but that's >>> incompatible with earlier versions of Cython. >> >> Yes, bytes literals were fixed up fairly recently - may have been 0.11 >> or >> so. Given that they were partly broken before that, I don't really see >> why >> you would want to support earlier versions of Cython anyway. > > Oh, does that work in 0.11? All the documentation I had found (e.g. at > the top of that page you linked) only mentions py3-style string > handling in the context of 0.12. That solves my personal problem. > >>> (And was it really intentional to break Python source compatibility so >>> badly?) >> >> What do you mean? And what version of Python are you referring to? > > Just that -- AFAICT -- I can no longer rely on Py2 syntax for > specifying string literals in Py2 extension modules. That seems odd. The point is that Cython doesn't know when compiling whether the target is py2 or py3; the same C source works for both. The alternative would be to do something like allowing the code on Py2 but make the module always raise an exception when being loaded in Py3...doesn't seem like an improvement to me though. 
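To make the decoding issue in this thread concrete, here is a small illustrative sketch (not part of the original message). It is plain Python 3, not Cython's actual implementation, and the helper name decodes_as_utf8 is made up for the example. It shows that the four bytes of the reported literal are not valid UTF-8, so they cannot be turned into a text string by decoding, while an explicit bytes literal carries them through untouched.

```python
# Illustrative only: plain Python 3, not Cython internals.
raw = b"\x12\x34\x9f\x65"  # the literal from the report, written as explicit bytes


def decodes_as_utf8(data):
    """Return True if the byte sequence is valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0x9f is a continuation byte with no preceding lead byte, so decoding fails.
print(decodes_as_utf8(raw))              # False
# Pure-ASCII content decodes fine, so it is usable as either bytes or text.
print(decodes_as_utf8(b"\x12\x34\x65"))  # True
# An explicit bytes literal keeps its four byte values unchanged.
print(list(raw))                         # [18, 52, 159, 101]
```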
Dag Sverre From stefan_ml at behnel.de Sat Dec 12 14:26:00 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 Dec 2009 14:26:00 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B2373FF.2030406@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> Message-ID: <4B2399E8.3010409@behnel.de> Greg Ewing, 12.12.2009 11:44: > Lisandro Dalcin wrote: > >> Just in case, did you mean >> >> ctypedef encoding['ascii'] char* ascii >> ctypedef encoding['utf8'] char* utf8 >> ctypedef encoding['latin1'] char* latin1 > > No, I didn't. > > ctypedef encoding['utf8'] utf8 > > ^^^^^^^^^^^^^^^^ ^^^^ > This denotes a We are declaring this name > type to be an alias for that type I don't think "encoding" is a good name for a type, though. The purpose of names of that type is to hold data, not encodings. Stefan From robertwb at math.washington.edu Sat Dec 12 20:27:01 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Dec 2009 11:27:01 -0800 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> <4B2367AA.7090407@behnel.de> <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> Message-ID: <62EDAE7E-21AC-49C0-AA32-E22CA5694CEA@math.washington.edu> On Dec 12, 2009, at 3:19 AM, Nathaniel Smith wrote: > On Sat, Dec 12, 2009 at 1:51 AM, Stefan Behnel > wrote: >> Nathaniel Smith, 12.12.2009 10:05: >>> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux), >>> some code of mine broke. Specifically, it's code for reading a >>> binary >>> format, and in the tests I had a string that made Cython fail to >>> compile with the error: >>> String decoding as 'UTF-8' failed. Consider using a byte string or >>> unicode string explicitly, or adjust the source code encoding. 
>>> >>> As an example, here's a complete file that Cython 0.12 will refuse >>> to compile: >>> ------------- >>> s = "\x12\x34\x9f\x65" >>> ------------- >>> >>> I'm not sure why it's nattering about the source code encoding when >>> the problem is with explicitly quoted byte values >> >> Because you are using a 'str' literal, which needs to be decoded in >> Python >> 3 to become the equivalent str (i.e. unicode) object. A check for >> that is >> required for the semantics of the 'str' type in Cython, as it would >> otherwise be impossible to switch the type in the generated C code >> - you >> simply can't write out a unicode literal into C in a portable way. >> >> The relevant CEP is here: >> >> http://wiki.cython.org/enhancements/stringliterals > > Sure, I know. But I'm not using Python 3 (I'm using 2.5.2, as > mentioned), and that page says "Unmarked string literals, when used in > a Python context, would be [...] byte strings in Py2", and the table > labeled "Proposal" seems to imply that in Py2, cython will treat "foo" > and b"foo" as equivalent (just as CPython would). Similarly, under > "Cons" it notes that the changes under discussion may cause backwards > compatibility problems when moving from Py2 to Py3, but it does not > note that they also cause (IMHO rather more serious) backwards > incompatibility between Cython 0.11+Py2 and Cython 0.12+Py2. > >>> but... my question >>> is, I can fix this by adding a "b" sigil on the front, but that's >>> incompatible with earlier versions of Cython. >> >> Yes, bytes literals were fixed up fairly recently - may have been >> 0.11 or >> so. Given that they were partly broken before that, I don't really >> see why >> you would want to support earlier versions of Cython anyway. > > Oh, does that work in 0.11? All the documentation I had found (e.g. at > the top of that page you linked) only mentions py3-style string > handling in the context of 0.12. That solves my personal problem. 
If you really want byte strings, then I agree that prefixing with b is the best option. However, I agree with your assessment of backwards incompatibility. Consider len("\xc3\xbf") In both Python 2 and Python 3 this gives 2, but in Cython it gives 2 when compiled against 2.x and 1 when compiled against 3.x. That seems inconsistent. Given that "abc\xFF" works in Py2 and Py3, would it make more sense that this also work (and have the same behavior) in Cython? The underlying representation would differ, but in both cases it would be the unambiguous 4-character (bytes or unicode) string that one would get typing the same thing at the Python prompt. Thus when one writes "abc\xFF" it would be interpreted as the actual value of the string, not as an encoded value of the string. - Robert From robertwb at math.washington.edu Sat Dec 12 20:35:35 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Dec 2009 11:35:35 -0800 Subject: [Cython] Cpyx for automatic Module creation and inlining of Cython code... In-Reply-To: <4B22A23D.1010303@gmail.com> References: <4B22A23D.1010303@gmail.com> Message-ID: <40620762-2F32-4036-B63F-80050F7ED935@math.washington.edu> On Dec 11, 2009, at 11:49 AM, David Mashburn wrote: > Hello Cython Developers, > > I wanted to announce "Cpyx", a module I've been working on off and > on since 2006 that I use to automatically compile and also inline > Cython code in my work (mostly because I like to do everything in > one step). > > This is more-or-less a prototype, but it works for me on Windows, > Mac, and Ubuntu, so I thought I'd share! > > I know it has similar goals to pyx_import, but I think the two are > quite complementary... (and I couldn't figure out how to get numpy > support in pyx_import when it came out...) 
> > My main hope for this is that it can give people a starting point > for using manual compilation/distutils > on their system (it is very verbose by default) and that it can > automatically inline code with numpy support! > > If you find it useful, I think it is almost mature enough to be > included in cython, and if not I certainly enjoy using it! > > In any case, I'd love some feedback. Thanks for posting this. This reminds me a bit of what we do with Cython for the notebook in sage. One comment I have is that a lot of paths seem to be hard coded, and may not always be accurate depending on how/where Python is installed or what version of the OS you have. There is the handy sys.prefix that you can use to determine the running Python's directory and include paths. - Robert > Thanks for all the hard work you all are doing with Greg's brainchild! > -David Mashburn > # Author: David Mashburn > # Created July 2006 > # Last Modified December 11, 2009 > # License: ??? (Apache 2) -- whatever is easiest for cython folks... > > # This module is for the automatic compilation (and also inlining) of > # Pyrex / Cython code... > # It can use distutils or manual compilation with gcc (or another > compiler) > # It can work with a single existing C source and automatically > compile it as well > # It has been tested on Windows, Mac, and Ubuntu Linux > > # That said, I make no guarantees that it will work as expected! > # Numpy support is automatically enabled for the non-distutils > version... > > # Unless the printCmds option is set to False, the script will > output every action taken > # and command run > > # My main goal for this is to aid people in learning how to compile > cython code > # on their system, and give them a starting point so they can tweak > what they want... > > # My other goal is to automate the Cython compile process so I can > do everything in > # one step after getting it set up :) > > # I really like the inline feature a lot for testing! 
> # And try it with PySlices, the latest incarnation of the wxPython > shell, PyCrust! (Shameless plug...) > > import os > import sys > import glob > import random > import numpy > import SetEnvironVars > > # Making this work in Vista... > # Download the latest mingw (5.x.x): > # add C:\MinGW\bin to the PATH environment variable > > # Should work with latest MingW on Windows 7... > > # Making this work on Mac... > # Download Xcode from the apple developer site (create a login) and > install it: > # http://connect.apple.com > > # Sample output for Cpyx on Windows: > # Pieces: > # gcc -c -IC:/Python25/include PyrexExample.c -o PyrexExample.o > # gcc -shared PyrexExample.o -LC:/Python25/libs -lpython25 -o > PyrexExample.pyd > # All-in-one: > # gcc -shared PyrexExample.c -IC:/Python25/include -LC:/Python25/ > libs -lpython25 -o PyrexExample.pyd > # All-in-one with linking dll... > # gcc -shared numpyTest.c -IC:/Python25/include -LC:/Python25/libs - > LC:/Users/mashbudn/Programming/Python/Pyx -lpython25 -lnumpyTestC -o > numpyTest.pyd > > myPythonDir=os.environ['MYPYTHON'] > myPyrexDir=os.environ['MYPYREX'] > globalUseCython=True > > if sys.platform=='win32': > pyrexcName='"' + 'C:\\Python25\\Scripts\\pyrexc.py' + '"' # Full > path to the Pyrex compiler script > cythonName='"' + 'C:\\Python25\\Scripts\\cython.py' + '"' # Full > path to the Cython compiler script > pythonName='C:\\Python25\\python.exe' # Full path to python.exe > sitePackages='C:\\Python25\\Lib\\site-packages' > pythonInclude='C:/Python25/include' > pythonLibs='C:/Python25/libs' > elif sys.platform=='darwin': > pyrexcName='"' + '/Library/Frameworks/Python.framework/Versions/ > 5.1.1/bin/pyrexc' + '"' # Full path to the Pyrex compiler script > cythonName='"' + '/Library/Frameworks/Python.framework/Versions/ > 5.1.1/bin/cython' + '"' # Full path to the Cython compiler script > pythonName='/Library/Frameworks/Python.framework/Versions/5.1.1/ > bin/python' # Full path to python.exe > 
sitePackages='/Library/Frameworks/Python.framework/Versions/5.1.1/ > lib/python2.5/site-packages' > pythonInclude='/Library/Frameworks/Python.framework/Versions/ > 5.1.1/include' > pythonLibs='/Library/Frameworks/Python.framework/Versions/5.1.1/ > lib/python2.5/config/' # contains libpython2.5.so > elif sys.platform=='linux2': > pyrexcName='"' + '/usr/bin/pyrexc' + '"' # Full path to the Pyrex > compiler script > cythonName='"' + '/usr/bin/cython' + '"' # Full path to the > Cython compiler script > pythonName='/usr/bin/python2.5' # Full path to python.exe > sitePackages='/usr/lib/python2.5/site-packages' > pythonInclude='/usr/include/python2.5' > pythonLibs='/usr/lib' # contains libpython2.5.so > else: > print 'Platform "' + sys.platform + '" not supported yet' > > # New way to find numpy's arrayobject.h to include > arrayobjecthPath = > os.path.join(numpy.get_include(),'numpy','arrayobject.h') > arrayObjectDir = numpy.get_include() > > def Cdll(cNameIn='',printCmds=True,gccOptions=''): > cwd=os.getcwd() > > (cPath,cName)=os.path.split(cNameIn) # input path and input file > name > if cPath=='': > cPath=myPyrexDir # directory used for all Pyrex stuff > dllPath=cPath > > stripName=(os.path.splitext(cName))[0] # input file name without > extension > > if sys.platform=='win32': dllName='"' + > os.path.join(dllPath,stripName+'.dll') + '"' > elif sys.platform=='darwin': dllName='"' + > os.path.join(dllPath,'lib'+stripName+'.so') + '"' > elif sys.platform=='linux2': dllName='"' + > os.path.join(dllPath,'lib'+stripName+'.so') + '"' > else: print 'Platform "' + sys.platform > + '" not supported yet' > > cName='"' + os.path.join(cPath,stripName+'.c') + '"' # redefine > cName > hName='"' + os.path.join(cPath,stripName+'.h') + '"' > oName='"' + os.path.join(dllPath,stripName+'.o') + '"' > > os.chdir(cPath) > > cmd=' '.join(['gcc',gccOptions,'-fPIC','-c',cName,'-o',stripName > +'.o']) > if printCmds: > print '\n', cmd > os.system(cmd) > > cmd=' 
'.join(['gcc','-shared','-o',dllName,oName]) > if printCmds: > print '\n', cmd > os.system(cmd) > > os.chdir(cwd) > > def > Cpyx > (pyxNameIn > = > 'PyrexExample > .pyx > ',useDistutils > =False,useCython=globalUseCython,gccOptions='',printCmds=True): > cwd=os.getcwd() > > (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input > file name > > if pyxPath=='': > pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff > else: > pydPath=mainDir=pyxPath > > pyxStrip=(os.path.splitext(pyxName))[0] # input file name without > extension > > extName='"' + pyxStrip + '"' > pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full > path to the PYX file (must be in Python/Pyx folder) redefine pyxName > pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full > path to the C file to be created > pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full > path to the PYD file to be created > soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full > path to the lib*.so file to be created > setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full > path of the Setup File to be created > > # run the main pyrex command to make the C file > pyxCompiler = cythonName if useCython else pyrexcName > cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) > if printCmds: > print '\n', cmd > os.system(cmd) > > if useDistutils: > > #write setup.py which will make a PYD file that can be imported > setupText="""### This file is setup.py ### > from distutils.core import setup > from distutils.extension import Extension > from Pyrex.Distutils import build_ext > > setup( > name = 'Lock module', > ext_modules=[ > Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + > ']' + """), > ], > cmdclass = {'build_ext': build_ext} > )""" > > if printCmds: > print 'Write Stuff to ', setupName[1:-1] > fid = open(setupName[1:-1],'w') # [1:-1] removes quotes > fid.write(setupText) > fid.close() > > # run setup.py > > os.chdir(mainDir) > > 
if sys.platform=='win32': cmd=' > '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- > inplace']) > elif sys.platform=='darwin': cmd=' > '.join([pythonName,setupName,'build_ext','--inplace']) > elif sys.platform=='linux2': cmd=' > '.join([pythonName,setupName,'build_ext','--inplace']) > else: print 'Platform "' + > sys.platform + '" not supported yet' > > else: > if sys.platform=='win32': cmd=' > '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- > I'+pythonInclude,'-L'+pythonLibs,'-lpython25','-o',pydName]) > elif sys.platform=='darwin': cmd=' > '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- > no-cpp-precomp','-mno-fused-madd','-fno-common', > '-dynamic','- > DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- > I'+pythonInclude, > '- > I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- > L/usr/local/lib',pyx2cName,'-o',soName]) > elif sys.platform=='linux2': cmd=' > '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- > I'+pythonInclude,'-L'+pythonLibs,'-lpython2.5','-o',soName]) > else: print 'Platform "' + > sys.platform + '" not supported yet' > > if printCmds: > print '\n', cmd > os.system(cmd) > > os.chdir(cwd) > > def > CpyxLib > (pyxNameIn > = > 'PyrexExample > .pyx > ',cNameIn > = > 'CTestC > .c > ',recompile > = > True > ,useDistutils > =False,useCython=globalUseCython,gccOptions='',printCmds=True): > cwd=os.getcwd() > > (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input > file name > (cPath,cName)=os.path.split(cNameIn) # input path and input file > name > > if cPath=='': > dllPath=cPath=myPyrexDir # directory used for all Pyrex stuff > > if pyxPath=='': > pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff > else: > pydPath=mainDir=pyxPath > > pyxStrip=(os.path.splitext(pyxName))[0] # input file name without > extension > cStrip=(os.path.splitext(cName))[0] # input file name without > extension > > extName='"' + pyxStrip + '"' > pyxName='"' + 
os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full > path to the PYX file (must be in Python\\Pyrex folder) > > if sys.platform=='win32': > dllName='"' + os.path.join(dllPath,cStrip+'.dll') + '"' > libName='"' + os.path.join(dllPath,cStrip) + '"' > library_dirs_txt='' > elif sys.platform=='darwin': > dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' > libName='"' + cStrip + '"' > library_dirs_txt=""" > library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + > """], > runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + > '"' + """],""" > elif sys.platform=='linux2': > dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' > libName='"' + cStrip + '"' > library_dirs_txt=""" > library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + > """], > runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + > '"' + """],""" > else: > print 'Platform "' + sys.platform + '" not supported yet' > > cName='"' + os.path.join(cPath,cStrip+'.c') + '"' # redefine cName > hName='"' + os.path.join(cPath,cStrip+'.h') + '"' > oName='"' + os.path.join(dllPath,cStrip+'.o') + '"' > > os.chdir(cPath) > > pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full > path to the C file to be created > setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full > path of the Setup File to be created > pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full > path to the PYD file to be created > soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full > path to the lib*.so file to be created > > # compile the DLL needed for the link to the C file > if recompile: > Cdll(cName[1:-1],printCmds=printCmds,gccOptions=gccOptions) # > [1:-1] to remove the quotes > > # run the main pyrex command to make the C file > pyxCompiler = cythonName if useCython else pyrexcName > cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) > if printCmds: > print '\n', cmd > os.system(cmd) > > if useDistutils: > #write setup.py which will 
make a PYD file that can be imported > setupText="""### This file is setup.py ### > from distutils.core import setup > from distutils.extension import Extension > from Pyrex.Distutils import build_ext > > setup( > name = 'Lock module', > ext_modules=[ > Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + > """],"""+library_dirs_txt+""" > libraries=[""" + libName.replace('\\','\\\\') + ']' + """), > ], > cmdclass = {'build_ext': build_ext} > )""" > if printCmds: > print 'Write Stuff to ', setupName[1:-1] > fid = open(setupName[1:-1],'w') # [1:-1] removes quotes > fid.write(setupText) > fid.close() > > # run setup.py > os.chdir(mainDir) > > if sys.platform=='win32': cmd=' > '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- > inplace']) > elif sys.platform=='darwin': cmd=' > '.join([pythonName,setupName,'build_ext','--inplace']) > elif sys.platform=='linux2': cmd=' > '.join([pythonName,setupName,'build_ext','--inplace']) > else: print 'Platform "' + > sys.platform + '" not supported yet' > else: > if sys.platform=='win32': cmd=' > '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- > I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- > lpython25','-l'+cStrip,'-o',pydName]) > elif sys.platform=='darwin': cmd=' > '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- > no-cpp-precomp','-mno-fused-madd','-fno-common', > '-dynamic','- > DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- > I'+pythonInclude, > '- > I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- > L/usr/local/lib','-L'+cPath,'-Wl,-R'+cPath, > '- > l'+cStrip,pyx2cName,'-o',soName]) > elif sys.platform=='linux2': cmd=' > '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- > I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- > lpython2.5','-l'+cStrip,'-o',soName]) > else: print 'Platform "' + > sys.platform + '" not supported yet' > > if printCmds: > print '\n', cmd > os.system(cmd) > > os.chdir(cwd) > > 
#Shamelessly steal the idea used by scipy.weave.inline but for Pyrex/ > Cython instead... > # In order to be able to import *, have to use exec in the calling > module... > def > PyrexInline > (code > ,cleanUp > = > False > ,useDistutils=False,useCython=False,gccOptions='',printCmds=True): > '''PyrexInline returns a string that is an import statement to > the temporary cython module'''+ \ > '''Use this like: exec(PyrexInline(r"""""",))''' > > testCode=r""" > cdef extern from "stdio.h": > ctypedef struct FILE > > FILE * stdout > int printf(char *format,...) > int fflush( FILE *stream ) > > def PyrexPrint(mystring): > printf(mystring) > fflush(stdout) > > PyrexPrint('HelloWorld!') > """ > tmpPath=os.path.expanduser('~/.Cpyx_tmp') > if not os.path.isdir(tmpPath): > os.mkdir(tmpPath) > if tmpPath not in sys.path: > sys.path.append(tmpPath) > if cleanUp: > CleanTmp() > > # Ensure you always get a new module! > # This means there is no reason to "reload" > # Also means memory gets majorly eaten up! > # Can't have everything! > moduleName='Pyrex'+str(random.randint(0,1e18)) > file=os.path.join(tmpPath,moduleName+'.pyx') > > fid=open(file,'w') > fid.write(code) > fid.close() > > > Cpyx > (file > ,useDistutils > = > useDistutils > ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) > > #cmd="""import """+moduleName+""" as LoadPyrexInline""" > cmd="""from """+moduleName+""" import *""" > if printCmds: > print cmd > return cmd > > # Create a dummy function that defaults to using Cython instead for > clarity... 
> def > CythonInline > (code > ,cleanUp > = > False,useDistutils=False,useCython=True,gccOptions='',printCmds=True): > return > PyrexInline > (code > ,cleanUp > = > cleanUp > ,useDistutils > = > useDistutils > ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) > > def CleanTmp(): > tmpPath=os.path.expanduser('~/.Cpyx_tmp') > for i in glob.glob(os.path.join(tmpPath,'*')): > os.remove(i) > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Sat Dec 12 22:02:09 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Dec 2009 13:02:09 -0800 Subject: [Cython] Checking extension type sizes In-Reply-To: <957D379D-7BCB-4CF5-BE22-ECF65D1AD8AE@math.washington.edu> References: <7cd6de7e5b1443a191ecbf4482f2ebd0.squirrel@webmail.uio.no> <898ACD1D-08F3-4AA8-B957-4B94A822F632@math.washington.edu> <20091207180016.GB1111@phare.normalesup.org> <349DAB97-4856-4BFA-AF40-6EDCB6E9A154@math.washington.edu> <8CCDFE65-C88E-4916-B458-82CC707E3E7D@math.washington.edu> <4B1E1F9A.1000008@behnel.de> <957D379D-7BCB-4CF5-BE22-ECF65D1AD8AE@math.washington.edu> Message-ID: <9768916A-3C66-43E4-AC4D-2AC531CFF0E6@math.washington.edu> On Dec 8, 2009, at 9:42 AM, Robert Bradshaw wrote: > On Dec 8, 2009, at 1:42 AM, Stefan Behnel wrote: > >> FWIW, I'm for attempting to make this a warning depending on >= for >> arbitrary non-subtyped external types that do not define any C >> methods, and keeping the error strict for all subtyped types > > OK, sounds like we're in agreement. Done. 
- Robert From robertwb at math.washington.edu Sat Dec 12 22:49:24 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sat, 12 Dec 2009 13:49:24 -0800 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B22F9CC.3090606@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: On Dec 11, 2009, at 6:02 PM, Greg Ewing wrote: > I've had an idea that might help with making the > encoding and decoding of unicode strings more > automatic. > > Suppose we have a way of expressing a type parameterised > with an encoding, maybe something like > > encoding[name] > > We could have a few predefined ones, such as > > ctypedef encoding['ascii'] ascii > ctypedef encoding['utf8'] utf8 > ctypedef encoding['latin1'] latin1 > > These are Python object types. Internally they're > represented as bytes objects, but the compiler knows > statically that they have an encoding associated with > them, and the appropriate encoding and decoding > operations are performed when coercing from and to > strings. > > Being bytes, they can also be cast to char * without > any problem. So we can write things like > > cdef extern from "foo.h": > void cflump(char *) > > def flump(utf8 s): > cflump(s) > > Now we can pass a unicode string to flump() and it will > first be encoded to bytes as utf8, and then passed to > cflump() as a char *. So if I'm understanding correctly here utf8 would behave like a bytes object except one could assign unicode objects to it? Would def flump(utf8 s): return s return a bytes object? I think I've mentioned this before, but I find conversion/construction happening on object/object boundaries to be a bit less intuitive, a bit like "def flump(tuple t)" accepting a list and creating a tuple behind the scenes. The final goal is not to get a bytes object, but a char*, so it seems more natural to put the decoding at that spot. 
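One rough way to picture the reading being asked about, written in plain Python 3 rather than in the proposed Cython syntax: a utf8-typed parameter behaves as if the argument were encoded on entry, so the function body (and a bare "return s") only ever sees bytes. This is only a sketch of one possible interpretation; the helper name as_encoded_bytes is made up for illustration and nothing like it exists in Cython.

```python
def as_encoded_bytes(value, encoding="utf8"):
    """Hypothetical stand-in for the coercion a utf8-typed parameter might perform."""
    if isinstance(value, bytes):
        # Already encoded: pass through untouched.
        return value
    if isinstance(value, str):
        # Unicode text: encode once, at the point of coercion.
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def flump(s):
    # Emulates "def flump(utf8 s)": coercion happens before the body runs.
    s = as_encoded_bytes(s, "utf8")
    # Under this reading, returning s hands back a bytes object.
    return s
```

Under this interpretation the object/object conversion is exactly the step questioned above: the caller passes text and silently gets bytes back.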
> For going the other way, we also need a corresponding > family of C string types with associated encodings. We > could give them different names, but that isn't really > necessary, since we can re-use the same ones: > > cdef extern from "foo.h": > utf8 *cbrazzle() > > This is unambiguous, because you can't declare a pointer > to a Python object. What we're saying here is that > cbrazzle() returns a char *, but it is to be understood as > encoded in utf8. So we can write > > def brazzle(): > return <str>cbrazzle() > > and the return value from cbrazzle is automatically > decoded using utf8. > > I've put a cast there because otherwise there would be an > ambiguity -- should a utf8 * be converted to a str on > coercion to a Python type, or a utf8 (i.e. bytes) object? > > Having to use a cast is a bit ugly, though. It could be > eliminated by allowing a def function to specify a return > type: > > def str brazzle(): > return cbrazzle() > > Or there could simply be a rule that resolves the > ambiguity in favour of str whenever the target type is > a generic Python object, in which case we could simply > write > > def brazzle(): > return cbrazzle() Actually, Cython already has a c_utf8_char_array_type that I think is supposed to do this, though I don't think it's actually used anywhere. There is kind of an odd asymmetry here, for instance if I had a function that both accepted and returned a char* I would have to write cdef extern from "foo.h": utf8* cblarg(char*) [somewhere much later] def blarg(utf8 s): return cblarg(s) Another disadvantage of attaching the encoding to the C signature is that for many declarations, especially ones that could be widely shared (printf, fread, ...) or eventually auto-generated (from a C header file) it doesn't make as much sense to attach an encoding to the C function so much as to the module/function in which it's used. > What do you think?
Seems like this sort of scheme would > keep the encoding being used at each point fairly explicit > without being too intrusive. My whole goal was to not have to be explicit at each point, but to be able to specify the encoding (or at least to use a default encoding) for an entire file, function, or block of code at once rather than at every line. You're right, this isn't very intrusive, but no matter how unintrusive, there's still the matter of converting old code (e.g. Sage), as-yet-unwritten code (not necessarily by those participating in this discussion, but any current or future Cython users who aren't thinking about unicode or targeting Py3 yet), and the fact that it's just one more thing to have to learn and constantly keep in mind. If I want to be explicit at every point, Stefan's optimized .encode() and .decode() plus a cython.str() special method seem natural enough, and also have the advantage that it's what you'd write in Python anyways, so there's no new keywords and special "cython" way of doing things. The only glaring deficiency is the inverse of cython.str, which would create a char* from a bytes or unicode, which would be especially convenient to be able to do for function arguments (including cpdef functions that would be able to accept a raw char*). (As an aside, perhaps we could let str(...) take a char* directly and (optional or not) encoding so they wouldn't even have to use cython.str(...).) - Robert From greg.ewing at canterbury.ac.nz Sun Dec 13 00:28:23 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Dec 2009 12:28:23 +1300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B2399E8.3010409@behnel.de> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> <4B2399E8.3010409@behnel.de> Message-ID: <4B242717.8070607@canterbury.ac.nz> Stefan Behnel wrote: > I don't think "encoding" is a good name for a type, though.
I'm open to suggestions if you can think of something better. -- Greg From greg.ewing at canterbury.ac.nz Sun Dec 13 01:05:42 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Dec 2009 13:05:42 +1300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: <4B242FD6.8010505@canterbury.ac.nz> Robert Bradshaw wrote: > So if I'm understanding correctly here utf8 would behave like a bytes > object except one could assign unicode objects to it? Would > > def flump(utf8 s): > return s > > return a bytes object? That's something that would require some thought. It's another case where declaring a return type might be needed. > The final goal is not to get a bytes object, but a > char*, so it seems more natural to put the decoding at that spot. The reason for the intermediate bytes object is that it neatly solves the memory management issue that arises if you try to go directly from str to char *, and it does it without having to make a special case of function arguments. > There is kind of an odd asymmetry here, for instance if I had a > function that both accepted and returned a char* I would have to write > > cdef extern from "foo.h": > utf8* cblarg(char*) You can write that part more symmetrically if you want: cdef extern from "foo.h": utf8* cblarg(utf8*) > [somewhere much later] > > def blarg(utf8 s): > return cblarg(s) Yes, it's a bit odd having the str->bytes conversion determined by the Python side and the bytes->str by the C side. I'll have to think about it some more. > My whole goal was to not have to be explicit at each point, but to be > able to specify the encoding (or at least to use a default encoding) > for an entire file Yes, I realise it doesn't fully address your use case. It's more aimed at people who think a blanket declaration would be too implicit and error-prone. 
However, it seems to be difficult to implement fully automatic conversions directly between str and char * except for a very few encodings -- ascii and utf8 -- and even the latter would appear to hinge on a deprecated feature held over from Py2. The advantages of my proposal are that it would work for any encoding and wouldn't be restricted to function arguments. -- Greg From stefan_ml at behnel.de Sun Dec 13 08:15:42 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Dec 2009 08:15:42 +0100 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <62EDAE7E-21AC-49C0-AA32-E22CA5694CEA@math.washington.edu> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> <4B2367AA.7090407@behnel.de> <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> <62EDAE7E-21AC-49C0-AA32-E22CA5694CEA@math.washington.edu> Message-ID: <4B24949E.5060803@behnel.de> Robert Bradshaw, 12.12.2009 20:27: > However, I agree with your assessment of backwards incompatibility. > Consider > > len("\xc3\xbf") > > In both Python 2 and Python 3 this gives 2, but in Cython it gives 2 > when compiled against 2.x and 1 when compiled against 3.x. That seems > inconsistent. The inconsistent thing here is that the string changes semantics *after* being parsed, whereas Python simply parses it differently in Py2 and Py3. This could be worked around in Cython by parsing the string literal twice (potentially in parallel) once with byte string semantics and once with unicode string semantics, and then generate two C string literals into the C code that get converted back into a Python string depending on the C compile time Python version. (Note that simple recoding isn't possible as there may not be an encoding that maps the unicode string literal to the byte string literal if character escapes are used). This whole 'str' semantics business is really getting hard to understand by now. 
If we're having a hard time to "get it right", how is a user ever going to understand the semantics once we're done? Stefan From strombrg at gmail.com Sun Dec 13 08:29:11 2009 From: strombrg at gmail.com (Dan Stromberg) Date: Sat, 12 Dec 2009 23:29:11 -0800 Subject: [Cython] Generators - status? Message-ID: <4B2497C7.8070406@gmail.com> How close is cython (0.12 and trunk or whatever cython calls trunk, or the cython-closures repository?) to supporting generators? I have a performance-oriented project I'm working on, and cython seems like a good fit - except I really wanted it to have generators. I'd happily eliminate monkey patching and nested scopes for cython's performance, if it just had generators. I heard that cython will want "functional" closures first, then generators (but does "functional" here mean "'functional' as in 'working'" or "'functional' as in 'functional programming'"?). Have the true closures been added? Someone wrote that the requirements for adding generators are understood (it was probably in the cython FAQ). Is there a list of these requirements written down somewhere? Thanks! PS: I tried building the cython-closures code, but it still gives an error on the 'yield' keyword, despite there appearing to be some yield-related code in there. From stefan_ml at behnel.de Sun Dec 13 08:35:17 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Dec 2009 08:35:17 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: References: <4B22F9CC.3090606@canterbury.ac.nz> Message-ID: <4B249935.4070001@behnel.de> Robert Bradshaw, 12.12.2009 22:49: > Another disadvantage of attaching the encoding to the C signature is > that for many declarations, especially ones that could be widely > shared (printf, fread, ...) or eventually auto-generated (from a C > header file) it doesn't make as much sense to attach an encoding to > the C function so much as to the module/function in which it's used. Very good point.
So it's actually only part of the internal workings of a function, not the externally visible signature. Code that calls a function shouldn't be bothered with the implementation details of that function, so if it wants to pass anything other than a byte string (in which case the encoding *is* part of the signature), the encoding used internally by the function should be completely transparent. That gets us back to the idea of transparently encoding at function call boundaries. Actually, this isn't even about function call boundaries but about the Python call boundary. C functions that take a char* will always only accept an encoded byte string no matter what, so there is no reason to pass them a unicode string in the first place. And once the Python call boundary is passed, module internal code is best served by using byte strings anyway, for passing them around internally, for iterating over them efficiently (at least for ASCII string content and single-byte encodings), and for passing them to C code. Remember that, in C, char is actually an integer type, so it won't matter much if iteration returns an integer value or a byte character value. So I think the right solution is to support automatic conversion *only* at the Python call boundary, i.e. for Python function parameters and return values. Now, parameters are easy as long as we stick with the bytes type, for which "bytes[encoding='utf-8']" would be an obvious syntax in Cython. Function return values can be made to work in the same way, by simply allowing their declaration also for 'def' functions. And ctypedefs would make this quite writeable, as Greg suggested. Again, this won't rescue code that was already written, but I think it would solve the problem for future code, and existing (unicode unaware) code could be fixed up relatively easily by replacing char* in Python function signatures with "bytes[encoding=...]" or the ctypedef-ed equivalent. Comments? 
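The boundary-only conversion described above can be approximated today in plain Python 3 with a decorator, as a rough sketch: text arguments are encoded to bytes on the way in, and a bytes result is decoded on the way out. The decorator name text_boundary and the example function shout are invented for illustration; this is not the proposed "bytes[encoding=...]" syntax, and it assumes Python 3, where str is the unicode type.

```python
import functools


def text_boundary(encoding="utf-8"):
    """Hypothetical decorator: encode str arguments, decode a bytes result."""

    def encode_if_text(value):
        # Only unicode text is converted; bytes and other objects pass through.
        return value.encode(encoding) if isinstance(value, str) else value

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            new_args = [encode_if_text(a) for a in args]
            new_kwargs = {k: encode_if_text(v) for k, v in kwargs.items()}
            result = func(*new_args, **new_kwargs)
            # On the way out, hand the caller the platform text type.
            return result.decode(encoding) if isinstance(result, bytes) else result
        return wrapper

    return decorate


@text_boundary("ascii")
def shout(s):
    # Inside the function, s is always a byte string.
    return s.upper() + b"!"
```

Calling shout("hello") or shout(b"hello") both give back the text "HELLO!", while text that cannot be represented in the declared encoding fails at the boundary with a UnicodeEncodeError instead of somewhere inside the function.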
Stefan From stefan_ml at behnel.de Sun Dec 13 08:40:06 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Dec 2009 08:40:06 +0100 Subject: [Cython] Generators - status? In-Reply-To: <4B2497C7.8070406@gmail.com> References: <4B2497C7.8070406@gmail.com> Message-ID: <4B249A56.2030606@behnel.de> Dan Stromberg, 13.12.2009 08:29: > How close is cython (0.12 and trunk or whatever cython calls trunk, or > the cython-closures repository?) to supporting generators? Close, but lots of work left. There is a spec in CEP 307: http://wiki.cython.org/enhancements/generators > PS: I tried building the cython-closures code, but it still gives an > error on the 'yield' keyword, despite there appearing to be some > yield-related code in there. Yes, but at least the error is more explicit than a simple syntax error by now. :) Stefan From robertwb at math.washington.edu Sun Dec 13 09:22:33 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 13 Dec 2009 00:22:33 -0800 Subject: [Cython] how do I write binary strings in a way that works with multiple cython versions? In-Reply-To: <4B24949E.5060803@behnel.de> References: <961fa2b40912120105s4b8d7148i9755764f55980d5c@mail.gmail.com> <4B2367AA.7090407@behnel.de> <961fa2b40912120319y28687668g7c6d8f91279df4fb@mail.gmail.com> <62EDAE7E-21AC-49C0-AA32-E22CA5694CEA@math.washington.edu> <4B24949E.5060803@behnel.de> Message-ID: On Dec 12, 2009, at 11:15 PM, Stefan Behnel wrote: > Robert Bradshaw, 12.12.2009 20:27: >> However, I agree with your assessment of backwards incompatibility. >> Consider >> >> len("\xc3\xbf") >> >> In both Python 2 and Python 3 this gives 2, but in Cython it gives 2 >> when compiled against 2.x and 1 when compiled against 3.x. That seems >> inconsistent. > > The inconsistent thing here is that the string changes semantics > *after* > being parsed, whereas Python simply parses it differently in Py2 and > Py3. 
> > This could be worked around in Cython by parsing the string literal > twice > (potentially in parallel) once with byte string semantics and once > with > unicode string semantics, and then generate two C string literals > into the > C code that get converted back into a Python string depending on the C > compile time Python version. (Note that simple recoding isn't > possible as > there may not be an encoding that maps the unicode string literal to > the > byte string literal if character escapes are used). Yeah, it wouldn't be trivial to change (though nor would it be that hard...) > This whole 'str' semantics business is really getting hard to > understand by > now. If we're having a hard time to "get it right", how is a user ever > going to understand the semantics once we're done? In my mind, the guiding principle should be that they behave in a .pyx file as similar as possible to the way they would behave in a .py file, and where there are differences we document and justify them. The smaller the number of differences, the easier for the user to understand. (Of course, we do things in .pyx files that don't make sense in Python, so it can be a bit more complicated.) - Robert From stefan_ml at behnel.de Sun Dec 13 09:25:51 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 13 Dec 2009 09:25:51 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B249935.4070001@behnel.de> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> Message-ID: <4B24A50F.3010108@behnel.de> Stefan Behnel, 13.12.2009 08:35: > So I think the right solution is to support automatic conversion *only* at > the Python call boundary, i.e. for Python function parameters and return > values. > > Now, parameters are easy as long as we stick with the bytes type, for which > "bytes[encoding='utf-8']" would be an obvious syntax in Cython. 
Function > return values can be made to work in the same way, by simply allowing their > declaration also for 'def' functions. And ctypedefs would make this quite > writeable, as Greg suggested. > > Again, this won't rescue code that was already written, but I think it > would solve the problem for future code, and existing (unicode unaware) > code could be fixed up relatively easily by replacing char* in Python > function signatures with "bytes[encoding=...]" or the ctypedef-ed equivalent. Thinking about this some more, I actually believe that the main usage pattern would be to declare a function like this: def str[encoding='ASCII'] func(bytes[encoding='ASCII'] s): ... So most my-data-is-not-unicode users would want to make sure that they always get an easy-to-use bytes object on the way in and that the return value is an easy-to-use Python value, i.e. it follows the normal platform str type: bytes on Py2 and unicode on Py3. So there is an intrinsic asymmetry in input and output types here. Stefan From robertwb at math.washington.edu Sun Dec 13 09:40:31 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 13 Dec 2009 00:40:31 -0800 Subject: [Cython] Generators - status? In-Reply-To: <4B2497C7.8070406@gmail.com> References: <4B2497C7.8070406@gmail.com> Message-ID: On Dec 12, 2009, at 11:29 PM, Dan Stromberg wrote: > How close is cython (0.12 and trunk or whatever cython calls trunk, or > the cython-closures repository?) to supporting generators? It's getting closer. Actually, Craig Citro made some progress on stress-testing it last week (he's been looking at doing this for a while, but has been very busy with teaching this last quarter...), and if all goes well closures should be merged in, otherwise we'll at least know where we stand. (I think they're pretty solid.) > I have a performance-oriented project I'm working on, and cython seems > like a good fit - except I really wanted it to have generators. 
I'd > happily eliminate monkey patching and nested scopes for cython's > performance, if it just had generators. > > I heard that cython will want "functional" closures first, then > generators (but does "functional" here mean "'functional' as in > 'working'" or "'functional' as in 'functional programming'". Both :). > Have the true closures been added? That's what's in the closures branch. > Someone wrote that the requirements for adding generators are > understood > (it was probably in the cython FAQ). Is there a list of these > requirements written down somewhere? As Stefan mentioned, there's the CEP which documents it pretty well. > Thanks! > > PS: I tried building the cython-closures code, but it still gives an > error on the 'yield' keyword, despite there appearing to be some > yield-related code in there. Closures are an essential building block for generators. Essentially, closures boil down to being able to retain the local state of a function while it's not actively running--something that generators obviously need to be able to do. It is the one glaring deficiency, and I don't think it's that far away (in terms of work, probably months before I could personally sit down and attack it). - Robert From robertwb at math.washington.edu Sun Dec 13 10:10:36 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Sun, 13 Dec 2009 01:10:36 -0800 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B242FD6.8010505@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B242FD6.8010505@canterbury.ac.nz> Message-ID: <927A514E-BF2F-4CD1-A441-8D9D29E9F9DA@math.washington.edu> On Dec 12, 2009, at 4:05 PM, Greg Ewing wrote: > The reason for the intermediate bytes object is that it neatly > solves the memory management issue that arises if you try to > go directly from str to char *, and it does it without having > to make a special case of function arguments. I agree, that is nice. 
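To illustrate the point about the intermediate bytes object, here is a small Python 3 sketch that uses ctypes as a stand-in for Cython's char* coercion. The helper name as_c_string is made up for this example; the only claim is that the pointer stays usable while the bytes object it came from is kept alive.

```python
import ctypes


def as_c_string(text, encoding="utf-8"):
    """Return (owner, pointer): the encoded bytes object and a char* for it.

    Hypothetical helper: the caller keeps `owner` alive for as long as
    `pointer` is in use, which is the role the intermediate bytes object
    plays in the proposal.
    """
    owner = text.encode(encoding)     # Python-managed, NUL-terminated buffer
    pointer = ctypes.c_char_p(owner)  # C-level char* for that buffer
    return owner, pointer


owner, pointer = as_c_string("caf\u00e9")
print(pointer.value == owner)  # the C view sees the same encoded bytes
```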
>> My whole goal was to not have to be explicit at each point, but to be
>> able to specify the encoding (or at least to use a default encoding)
>> for an entire file
>
> Yes, I realise it doesn't fully address your use case.
> It's more aimed at people who think a blanket declaration
> would be too implicit and error-prone.

With the exception of function argument declaration, I think the people
who don't want blanket declarations are already fairly well served with
encode() and decode().

     cdef bytes[encoding='utf8'] ss = s

or even

     cdef utf8 ss = s

is not (to me at least) clearer than

     cdef bytes ss = s.encode('utf8')

which requires no new syntax or types.

> However, it seems to be difficult to implement fully
> automatic conversions directly between str and char *
> except for a very few encodings -- ascii and utf8 --
> and even the latter would appear to hinge on a
> deprecated feature held over from Py2.

I think ascii and utf8 alone would cover a broad range of use cases,
especially for those who want more global declarations. The defenc slot
is a real concern though.

> The advantages of my proposal are that it would work
> for any encoding and wouldn't be restricted to function
> arguments.

I think this is a valuable point.

- Robert

From robertwb at math.washington.edu  Sun Dec 13 10:51:54 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Sun, 13 Dec 2009 01:51:54 -0800
Subject: [Cython] Idea for automatic encoding and decoding
In-Reply-To: <4B249935.4070001@behnel.de>
References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de>
Message-ID: <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu>

On Dec 12, 2009, at 11:35 PM, Stefan Behnel wrote:

> Robert Bradshaw, 12.12.2009 22:49:
>> Another disadvantage of attaching the encoding to the C signature is
>> that for many declarations, especially ones that could be widely
>> shared (printf, fread, ...)
>> or eventually auto-generated (from a C
>> header file) it doesn't make as much sense to attach an encoding to
>> the C function so much as to the module/function in which it's used.
>
> Very good point. So it's actually only part of the internal workings of a
> function, not the externally visible signature. Code that calls a function
> shouldn't be bothered with the implementation details of that function, so
> if it wants to pass anything other than a byte string (in which case the
> encoding *is* part of the signature), the encoding used internally by the
> function should be completely transparent.
>
> That gets us back to the idea of transparently encoding at function call
> boundaries. Actually, this isn't even about function call boundaries but
> about the Python call boundary. C functions that take a char* will always
> only accept an encoded byte string no matter what, so there is no reason to
> pass them a unicode string in the first place. And once the Python call
> boundary is passed, module internal code is best served by using byte
> strings anyway, for passing them around internally, for iterating over them
> efficiently (at least for ASCII string content and single-byte encodings),
> and for passing them to C code. Remember that, in C, char is actually an
> integer type, so it won't matter much if iteration returns an integer value
> or a byte character value.
>
> So I think the right solution is to support automatic conversion *only* at
> the Python call boundary, i.e. for Python function parameters and return
> values.

I disagree. Most of the examples here have been very simple, but in
general the Python/C boundary need not be cleanly aligned with the Python
call boundary. Some more general examples would be

     cdef extern from "foo.h":
         cdef cblarg(int i, char*)

     def blarg(obj):
         cblarg(obj.id, obj.name)  # I realize I'm assuming name is not a
                                   # dynamically generated attribute...

or even

     def barg_all(list L):
         for i, a in enumerate(L):
             cblarg(i, a)

Of course, this boundary is an important one, and when passing in
arguments there is currently no way to call encode(), either implicitly
or explicitly.

> Now, parameters are easy as long as we stick with the bytes type, for which
> "bytes[encoding='utf-8']" would be an obvious syntax in Cython. Function
> return values can be made to work in the same way, by simply allowing their
> declaration also for 'def' functions. And ctypedefs would make this quite
> writeable, as Greg suggested.
>
> Again, this won't rescue code that was already written, but I think it
> would solve the problem for future code, and existing (unicode unaware)
> code could be fixed up relatively easily by replacing char* in Python
> function signatures with "bytes[encoding=...]" or the ctypedef-ed
> equivalent.
>
> Comments?

I'm all for making string encodings easier to use, though as I've said
encode() and decode() seem to be a clean enough solution for nearly
everything but argument parsing.

However (and maybe this belongs on the other thread), you are
completely skirting the issue of being able to declare the encoding
for a block of code in one place, rather than having to specify it
every single place it is used. I initially thought your concern with
char* <-> unicode conversion was the ambiguity in what character set
to use, which I was proposing could be declared at a higher than
case-by-case level. Is there another reason it is vital that the encoding
step and/or parameters be reiterated at every instance they are used?
- Robert

From stefan_ml at behnel.de  Sun Dec 13 12:11:21 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 13 Dec 2009 12:11:21 +0100
Subject: [Cython] Idea for automatic encoding and decoding
In-Reply-To: <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu>
References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu>
Message-ID: <4B24CBD9.3000202@behnel.de>

Robert Bradshaw, 13.12.2009 10:51:
> On Dec 12, 2009, at 11:35 PM, Stefan Behnel wrote:
>> So I think the right solution is to support automatic conversion
>> *only* at the Python call boundary, i.e. for Python function
>> parameters and return values.
>
> I disagree. Most of the examples here have been very simple, but in
> general Python/C boundary need not be cleanly aligned with the Python
> call boundary. Some more general examples would be
>
>      cdef extern from "foo.h":
>          cdef cblarg(int i, char*)
>
>      def blarg(obj):
>          cblarg(obj.id, obj.name)  # I realize I'm assuming name is not a
>                                    # dynamically generated attribute...
>
> or even
>
>      def barg_all(list L):
>          for i, a in enumerate(L):
>              cblarg(i, a)

I guess I'm still not used to passing arbitrary user values into a C
function call without doing some kind of parameter checking beforehand.
That's different for function arguments, where only the encoding would
happen automatically (and would raise an appropriate error on failure), and
the result would still be a safe Python bytes object that users can
validate in any way they want, without having to care about 0 bytes
silently becoming end markers.

We are still talking about two different use cases here. One deals with
automatic encoding of unicode strings into byte strings on input and with
automatic decoding of byte strings (or char*) on the way out. The other use
case deals with automatic coercion of Python string objects to char*, which
is what you show above. I personally think it's good to keep those separate.
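As a rough illustration of the validation step described above, the following Python 3 sketch encodes at the boundary and then rejects embedded 0 bytes before the value would be handed to a char* parameter. The helper name to_c_safe_bytes is hypothetical and not part of Cython.

```python
def to_c_safe_bytes(value, encoding="utf-8"):
    """Encode text (if needed) and refuse data that C would truncate.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, str):
        # Raises UnicodeEncodeError if the text does not fit the encoding.
        data = value.encode(encoding)
    elif isinstance(value, bytes):
        data = value
    else:
        raise TypeError("expected str or bytes, got %s" % type(value).__name__)
    if b"\x00" in data:
        # In C the first 0 byte ends the string, silently dropping the rest.
        raise ValueError("embedded 0 byte would truncate the C string")
    return data


print(to_c_safe_bytes("name"))
```

Because the result is still an ordinary bytes object, any further application-specific checks can run on it before it reaches C code.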
Remember that you mentioned the performance issue of a char* vs. a Python object parameter when the function is called from Cython code? The only place where this matters is for cpdef functions, and that should be rare enough to ignore it and require an explicit wrapper function, as it's quite likely that user input would have to be validated separately anyway. To make this clear: I don't think it's worth encouraging users to drop input validation in favour of automatic and unsafe coercion. > I'm all for making string encodings easier to use, though as I've said > encode() and decode() seem to be a clean enough solution for nearly > everything but argument parsing. That seems to match my distinction above then. > However (and maybe this belongs on the other thread), you are > completely skirting the issue of being able to declare the encoding > for a block of code in one place, rather than having to specify it > every single place it is used. Yes, the above would actually be orthogonal to that feature. Although I'm not sure simply saying def func(bytes s): ... plus a global setting somewhere at the top of your code is really readable enough as "this function accepts unicode strings which get converted automatically". And, no, I don't think typing the input parameter as "str" is what people want in most cases. I'm really leaning towards the assumption that most people really *want* bytes as basic string input type in their Cython code. Either that, or exactly unicode strings. Not 'str'. > I initially thought your concern with > char* <-> unicode conversion was the ambiguity in what character set > to use, which I was proposing could be declared at a higher than case- > by-case level. Is there another reason it is vital that the encoding > step and/or parameters be reiterated at every instance they are used? I don't like code redundancy either. But making up a default should only be the second step after fixing the semantics of the feature that has this default. 
Stefan

From stefan_ml at behnel.de  Sun Dec 13 21:27:24 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 13 Dec 2009 21:27:24 +0100
Subject: [Cython] Idea for automatic encoding and decoding
In-Reply-To: <4B24CBD9.3000202@behnel.de>
References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de>
Message-ID: <4B254E2C.6030200@behnel.de>

Stefan Behnel, 13.12.2009 12:11:
> I don't think typing the input parameter as "str"
> is what people want in most cases. I'm really leaning towards the
> assumption that most people really *want* bytes as basic string input type
> in their Cython code. Either that, or exactly unicode strings. Not 'str'.

To fill this with a bit of background, I started writing up a couple of
thoughts on use cases that I think are relevant here.

http://wiki.cython.org/enhancements/stringcoercion

I hope this will allow us to see where actual compiler support is needed
and what it will allow us to do. Please add to them as you see fit.

Stefan

From greg.ewing at canterbury.ac.nz  Sun Dec 13 22:49:23 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 Dec 2009 10:49:23 +1300
Subject: [Cython] Generators - status?
In-Reply-To: <4B2497C7.8070406@gmail.com>
References: <4B2497C7.8070406@gmail.com>
Message-ID: <4B256163.9060204@canterbury.ac.nz>

Dan Stromberg wrote:
> I heard that cython will want "functional" closures first, then
> generators (but does "functional" here mean "'functional' as in
> 'working'" or "'functional' as in 'functional programming'".

Both, I think. :-)

--
Greg

From strombrg at gmail.com  Mon Dec 14 02:51:23 2009
From: strombrg at gmail.com (Dan Stromberg)
Date: Sun, 13 Dec 2009 17:51:23 -0800
Subject: [Cython] Preprocessor?
Message-ID: <4B259A1B.8000306@gmail.com>

Has anyone already worked out a system for preprocessing (maybe via m4)
a single input file into two output files: one a plain .py, and one a .pyx?

I sometimes prefer a pure python version of a dependency if I can get
it, because it makes the ongoing maintenance costs lower than compiling
(and recompiling) extension modules would. It seems like due to the
similarities between .py and .pyx, there should be a way of maintaining
both as a single document - so people who want the speed can have it,
and people who want convenience can have that, but the programmer
doesn't end up maintaining two versions of said dependency.

I googled, but didn't find anything that looked relevant, other than
pages about how to use various preprocessors.

It's probably just a set of macros I'm looking for.

Anyone?

From stefan_ml at behnel.de  Mon Dec 14 07:55:52 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 14 Dec 2009 07:55:52 +0100
Subject: [Cython] Preprocessor?
In-Reply-To: <4B259A1B.8000306@gmail.com>
References: <4B259A1B.8000306@gmail.com>
Message-ID: <4B25E178.4060601@behnel.de>

Dan Stromberg, 14.12.2009 02:51:
> Has anyone already worked out a system for preprocessing (maybe via m4)
> a single input file into two output files: one a plain .py, and one a .pyx?
>
> I sometimes prefer a pure python version of a dependency if I can get
> it, because it makes the ongoing maintenance costs lower than compiling
> (and recompiling) extension modules would. It seems like due to the
> similarities between .py and .pyx, there should be a way of maintaining
> both as a single document - so people who want the speed can have it,
> and people who want convenience can have that, but the programmer
> doesn't end up maintaining two versions of said dependency.
>
> I googled, but didn't find anything that looked relevant, other than
> pages about how to use various preprocessors.

I assume you know that Cython can compile .py files and has a pure
Python syntax for type declarations?
http://wiki.cython.org/pure Stefan From stefan_ml at behnel.de Mon Dec 14 08:11:57 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Dec 2009 08:11:57 +0100 Subject: [Cython] Generators - status? In-Reply-To: References: <4B2497C7.8070406@gmail.com> Message-ID: <4B25E53D.4050902@behnel.de> Robert Bradshaw, 13.12.2009 09:40: > On Dec 12, 2009, at 11:29 PM, Dan Stromberg wrote: > >> How close is cython (0.12 and trunk or whatever cython calls trunk, or >> the cython-closures repository?) to supporting generators? > > It's getting closer. Actually, Craig Citro made some progress on > stress-testing it last week (he's been looking at doing this for a > while, but has been very busy with teaching this last quarter...), and > if all goes well closures should be merged in, otherwise we'll at > least know where we stand. Is he working on extending the test suite, or is he testing with his own code? Extending the test corpus would really be helpful here. For now, there are only a couple of simple tests but we are far from any reasonable coverage, and close to zero when it comes to corner cases. Stefan From david.n.mashburn at gmail.com Mon Dec 14 08:15:10 2009 From: david.n.mashburn at gmail.com (David Mashburn) Date: Mon, 14 Dec 2009 02:15:10 -0500 Subject: [Cython] Cpyx for automatic Module creation and inlining of Cython code... In-Reply-To: <40620762-2F32-4036-B63F-80050F7ED935@math.washington.edu> References: <4B22A23D.1010303@gmail.com> <40620762-2F32-4036-B63F-80050F7ED935@math.washington.edu> Message-ID: <4B25E5FE.6020700@gmail.com> Thanks Robert! I had been hoping to upgrade this because I knew the hard-coding was going to be problematic... Thanks for the tip-off about where to look for the paths! Could you point me to the code where this is done in Sage? I'd love to look at it for ideas! 
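For illustration, a minimal sketch of looking these paths up at run time rather than hard-coding them, in the spirit of the sys.prefix suggestion quoted below. It relies on the standard sysconfig module found in current Python versions, which is an assumption on top of the original tip (at the time of this thread the comparable helpers lived in distutils.sysconfig). The function name find_python_paths is hypothetical.

```python
import sys
import sysconfig


def find_python_paths():
    """Collect the locations of the running interpreter (illustrative)."""
    paths = sysconfig.get_paths()
    return {
        "python": sys.executable,            # interpreter binary
        "prefix": sys.prefix,                # installation root
        "include": paths["include"],         # directory holding Python.h
        "site_packages": paths["purelib"],   # pure-Python site-packages
    }


for name, value in sorted(find_python_paths().items()):
    print(name, "=", value)
```

The same dictionary could then feed the -I and -L options of a manual gcc command line instead of per-platform string constants.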
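Thanks,
-David

Robert Bradshaw wrote:
> On Dec 11, 2009, at 11:49 AM, David Mashburn wrote:
>
>> Hello Cython Developers,
>>
>> I wanted to announce "Cpyx", a module I've been working on off and
>> on since 2006 that I use to automatically compile and also inline
>> Cython code in my work (mostly because I like to do everything in
>> one step).
>>
>> This is more-or-less a prototype, but it works for me on Windows,
>> Mac, and Ubuntu, so I thought I'd share!
>>
>> I know it has similar goals to pyx_import, but I think the two are
>> quite complementary... (and I couldn't figure out how to get numpy
>> support in pyx_import when it came out...)
>>
>> My main hope for this is that it can give people a starting point
>> for using manual compilation/distutils
>> on their system (it is very verbose by default) and that it can
>> automatically inline code with numpy support!
>>
>> If you find it useful, I think it is almost mature enough to be
>> included in cython, and if not I certainly enjoy using it!
>>
>> In any case, I'd love some feedback.
>
> Thanks for posting this. This reminds me a bit of what we do with
> Cython for the notebook in sage. One comment I have is that a lot of
> paths seem to be hard coded, and may not always be accurate depending
> on how/where Python is installed or what version of the OS you have.
> There is the handy sys.prefix that you can use to determine the
> running Python's directory and include paths.
>
> - Robert
>
>> Thanks for all the hard work you all are doing with Greg's brainchild!
>> -David Mashburn
>> # Author: David Mashburn
>> # Created July 2006
>> # Last Modified December 11, 2009
>> # License: ??? (Apache 2) -- whatever is easiest for cython folks...
>>
>> # This module is for the automatic compilation (and also inlining) of
>> # Pyrex / Cython code...
>> # It can use distutils or manual compilation with gcc (or another compiler)
>> # It can work with a single existing C source and automatically compile it as well
>> # It has been tested on Windows, Mac, and Ubuntu Linux
>>
>> # That said, I make no guarantees that it will work as expected!
>> # Numpy support is automatically enabled for the non-distutils version...
>>
>> # Unless the printCmds option is set to False, the script will output every action taken
>> # and command run
>>
>> # My main goal for this is to aid people in learning how to compile cython code
>> # on their system, and give them a starting point so they can tweak what they want...
>>
>> # My other goal is to automate the Cython compile process so I can do everything in
>> # one step after getting it set up :)
>>
>> # I really like the inline feature a lot for testing!
>> # And try it with PySlices, the latest incarnation of the wxPython shell, PyCrust! (Shameless plug...)
>>
>> import os
>> import sys
>> import glob
>> import random
>> import numpy
>> import SetEnvironVars
>>
>> # Making this work in Vista...
>> # Download the latest mingw (5.x.x):
>> # add C:\MinGW\bin to the PATH environment variable
>>
>> # Should work with latest MingW on Windows 7...
>>
>> # Making this work on Mac...
>> # Download Xcode from the apple developer site (create a login) and install it:
>> # http://connect.apple.com
>>
>> # Sample output for Cpyx on Windows:
>> # Pieces:
>> # gcc -c -IC:/Python25/include PyrexExample.c -o PyrexExample.o
>> # gcc -shared PyrexExample.o -LC:/Python25/libs -lpython25 -o PyrexExample.pyd
>> # All-in-one:
>> # gcc -shared PyrexExample.c -IC:/Python25/include -LC:/Python25/libs -lpython25 -o PyrexExample.pyd
>> # All-in-one with linking dll...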
>> # It can use distutils or manual compilation with gcc (or another >> compiler) >> # It can work with a single existing C source and automatically >> compile it as well >> # It has been tested on Windows, Mac, and Ubuntu Linux >> >> # That said, I make no guarantees that it will work as expected! >> # Numpy support is automatically enabled for the non-distutils >> version... >> >> # Unless the printCmds option is set to False, the script will >> output every action taken >> # and command run >> >> # My main goal for this is to aid people in learning how to compile >> cython code >> # on their system, and give them a starting point so they can tweak >> what they want... >> >> # My other goal is to automate the Cython compile process so I can >> do everything in >> # one step after getting it set up :) >> >> # I really like the inline feature a lot for testing! >> # And try it with PySlices, the latest incarnation of the wxPython >> shell, PyCrust! (Shameless plug...) >> >> import os >> import sys >> import glob >> import random >> import numpy >> import SetEnvironVars >> >> # Making this work in Vista... >> # Download the latest mingw (5.x.x): >> # add C:\MinGW\bin to the PATH environment variable >> >> # Should work with latest MingW on Windows 7... >> >> # Making this work on Mac... >> # Download Xcode from the apple developer site (create a login) and >> install it: >> # http://connect.apple.com >> >> # Sample output for Cpyx on Windows: >> # Pieces: >> # gcc -c -IC:/Python25/include PyrexExample.c -o PyrexExample.o >> # gcc -shared PyrexExample.o -LC:/Python25/libs -lpython25 -o >> PyrexExample.pyd >> # All-in-one: >> # gcc -shared PyrexExample.c -IC:/Python25/include -LC:/Python25/ >> libs -lpython25 -o PyrexExample.pyd >> # All-in-one with linking dll... 
>> # gcc -shared numpyTest.c -IC:/Python25/include -LC:/Python25/libs - >> LC:/Users/mashbudn/Programming/Python/Pyx -lpython25 -lnumpyTestC -o >> numpyTest.pyd >> >> myPythonDir=os.environ['MYPYTHON'] >> myPyrexDir=os.environ['MYPYREX'] >> globalUseCython=True >> >> if sys.platform=='win32': >> pyrexcName='"' + 'C:\\Python25\\Scripts\\pyrexc.py' + '"' # Full >> path to the Pyrex compiler script >> cythonName='"' + 'C:\\Python25\\Scripts\\cython.py' + '"' # Full >> path to the Cython compiler script >> pythonName='C:\\Python25\\python.exe' # Full path to python.exe >> sitePackages='C:\\Python25\\Lib\\site-packages' >> pythonInclude='C:/Python25/include' >> pythonLibs='C:/Python25/libs' >> elif sys.platform=='darwin': >> pyrexcName='"' + '/Library/Frameworks/Python.framework/Versions/ >> 5.1.1/bin/pyrexc' + '"' # Full path to the Pyrex compiler script >> cythonName='"' + '/Library/Frameworks/Python.framework/Versions/ >> 5.1.1/bin/cython' + '"' # Full path to the Cython compiler script >> pythonName='/Library/Frameworks/Python.framework/Versions/5.1.1/ >> bin/python' # Full path to python.exe >> sitePackages='/Library/Frameworks/Python.framework/Versions/5.1.1/ >> lib/python2.5/site-packages' >> pythonInclude='/Library/Frameworks/Python.framework/Versions/ >> 5.1.1/include' >> pythonLibs='/Library/Frameworks/Python.framework/Versions/5.1.1/ >> lib/python2.5/config/' # contains libpython2.5.so >> elif sys.platform=='linux2': >> pyrexcName='"' + '/usr/bin/pyrexc' + '"' # Full path to the Pyrex >> compiler script >> cythonName='"' + '/usr/bin/cython' + '"' # Full path to the >> Cython compiler script >> pythonName='/usr/bin/python2.5' # Full path to python.exe >> sitePackages='/usr/lib/python2.5/site-packages' >> pythonInclude='/usr/include/python2.5' >> pythonLibs='/usr/lib' # contains libpython2.5.so >> else: >> print 'Platform "' + sys.platform + '" not supported yet' >> >> # New way to find numpy's arrayobject.h to include >> arrayobjecthPath = >> 
os.path.join(numpy.get_include(),'numpy','arrayobject.h') >> arrayObjectDir = numpy.get_include() >> >> def Cdll(cNameIn='',printCmds=True,gccOptions=''): >> cwd=os.getcwd() >> >> (cPath,cName)=os.path.split(cNameIn) # input path and input file >> name >> if cPath=='': >> cPath=myPyrexDir # directory used for all Pyrex stuff >> dllPath=cPath >> >> stripName=(os.path.splitext(cName))[0] # input file name without >> extension >> >> if sys.platform=='win32': dllName='"' + >> os.path.join(dllPath,stripName+'.dll') + '"' >> elif sys.platform=='darwin': dllName='"' + >> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >> elif sys.platform=='linux2': dllName='"' + >> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >> else: print 'Platform "' + sys.platform >> + '" not supported yet' >> >> cName='"' + os.path.join(cPath,stripName+'.c') + '"' # redefine >> cName >> hName='"' + os.path.join(cPath,stripName+'.h') + '"' >> oName='"' + os.path.join(dllPath,stripName+'.o') + '"' >> >> os.chdir(cPath) >> >> cmd=' '.join(['gcc',gccOptions,'-fPIC','-c',cName,'-o',stripName >> +'.o']) >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> cmd=' '.join(['gcc','-shared','-o',dllName,oName]) >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> os.chdir(cwd) >> >> def >> Cpyx >> (pyxNameIn >> = >> 'PyrexExample >> .pyx >> ',useDistutils >> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >> cwd=os.getcwd() >> >> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >> file name >> >> if pyxPath=='': >> pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff >> else: >> pydPath=mainDir=pyxPath >> >> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >> extension >> >> extName='"' + pyxStrip + '"' >> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >> path to the PYX file (must be in Python/Pyx folder) redefine pyxName >> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >> path to the C file 
to be created >> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >> path to the PYD file to be created >> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >> path to the lib*.so file to be created >> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >> path of the Setup File to be created >> >> # run the main pyrex command to make the C file >> pyxCompiler = cythonName if useCython else pyrexcName >> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> if useDistutils: >> >> #write setup.py which will make a PYD file that can be imported >> setupText="""### This file is setup.py ### >> from distutils.core import setup >> from distutils.extension import Extension >> from Pyrex.Distutils import build_ext >> >> setup( >> name = 'Lock module', >> ext_modules=[ >> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >> ']' + """), >> ], >> cmdclass = {'build_ext': build_ext} >> )""" >> >> if printCmds: >> print 'Write Stuff to ', setupName[1:-1] >> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >> fid.write(setupText) >> fid.close() >> >> # run setup.py >> >> os.chdir(mainDir) >> >> if sys.platform=='win32': cmd=' >> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >> inplace']) >> elif sys.platform=='darwin': cmd=' >> '.join([pythonName,setupName,'build_ext','--inplace']) >> elif sys.platform=='linux2': cmd=' >> '.join([pythonName,setupName,'build_ext','--inplace']) >> else: print 'Platform "' + >> sys.platform + '" not supported yet' >> >> else: >> if sys.platform=='win32': cmd=' >> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >> I'+pythonInclude,'-L'+pythonLibs,'-lpython25','-o',pydName]) >> elif sys.platform=='darwin': cmd=' >> '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- >> no-cpp-precomp','-mno-fused-madd','-fno-common', >> '-dynamic','- >> DNDEBUG','-g','-O3','-bundle','-undefined 
dynamic_lookup','- >> I'+pythonInclude, >> '- >> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >> L/usr/local/lib',pyx2cName,'-o',soName]) >> elif sys.platform=='linux2': cmd=' >> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >> I'+pythonInclude,'-L'+pythonLibs,'-lpython2.5','-o',soName]) >> else: print 'Platform "' + >> sys.platform + '" not supported yet' >> >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> os.chdir(cwd) >> >> def >> CpyxLib >> (pyxNameIn >> = >> 'PyrexExample >> .pyx >> ',cNameIn >> = >> 'CTestC >> .c >> ',recompile >> = >> True >> ,useDistutils >> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >> cwd=os.getcwd() >> >> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >> file name >> (cPath,cName)=os.path.split(cNameIn) # input path and input file >> name >> >> if cPath=='': >> dllPath=cPath=myPyrexDir # directory used for all Pyrex stuff >> >> if pyxPath=='': >> pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff >> else: >> pydPath=mainDir=pyxPath >> >> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >> extension >> cStrip=(os.path.splitext(cName))[0] # input file name without >> extension >> >> extName='"' + pyxStrip + '"' >> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >> path to the PYX file (must be in Python\\Pyrex folder) >> >> if sys.platform=='win32': >> dllName='"' + os.path.join(dllPath,cStrip+'.dll') + '"' >> libName='"' + os.path.join(dllPath,cStrip) + '"' >> library_dirs_txt='' >> elif sys.platform=='darwin': >> dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >> libName='"' + cStrip + '"' >> library_dirs_txt=""" >> library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + >> """], >> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >> '"' + """],""" >> elif sys.platform=='linux2': >> dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >> libName='"' + cStrip + '"' 
>> library_dirs_txt=""" >> library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + >> """], >> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >> '"' + """],""" >> else: >> print 'Platform "' + sys.platform + '" not supported yet' >> >> cName='"' + os.path.join(cPath,cStrip+'.c') + '"' # redefine cName >> hName='"' + os.path.join(cPath,cStrip+'.h') + '"' >> oName='"' + os.path.join(dllPath,cStrip+'.o') + '"' >> >> os.chdir(cPath) >> >> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >> path to the C file to be created >> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >> path of the Setup File to be created >> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >> path to the PYD file to be created >> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >> path to the lib*.so file to be created >> >> # compile the DLL needed for the link to the C file >> if recompile: >> Cdll(cName[1:-1],printCmds=printCmds,gccOptions=gccOptions) # >> [1:-1] to remove the quotes >> >> # run the main pyrex command to make the C file >> pyxCompiler = cythonName if useCython else pyrexcName >> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> if useDistutils: >> #write setup.py which will make a PYD file that can be imported >> setupText="""### This file is setup.py ### >> from distutils.core import setup >> from distutils.extension import Extension >> from Pyrex.Distutils import build_ext >> >> setup( >> name = 'Lock module', >> ext_modules=[ >> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >> """],"""+library_dirs_txt+""" >> libraries=[""" + libName.replace('\\','\\\\') + ']' + """), >> ], >> cmdclass = {'build_ext': build_ext} >> )""" >> if printCmds: >> print 'Write Stuff to ', setupName[1:-1] >> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >> fid.write(setupText) >> fid.close() >> >> # run setup.py >> 
os.chdir(mainDir) >> >> if sys.platform=='win32': cmd=' >> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >> inplace']) >> elif sys.platform=='darwin': cmd=' >> '.join([pythonName,setupName,'build_ext','--inplace']) >> elif sys.platform=='linux2': cmd=' >> '.join([pythonName,setupName,'build_ext','--inplace']) >> else: print 'Platform "' + >> sys.platform + '" not supported yet' >> else: >> if sys.platform=='win32': cmd=' >> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >> lpython25','-l'+cStrip,'-o',pydName]) >> elif sys.platform=='darwin': cmd=' >> '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- >> no-cpp-precomp','-mno-fused-madd','-fno-common', >> '-dynamic','- >> DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- >> I'+pythonInclude, >> '- >> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >> L/usr/local/lib','-L'+cPath,'-Wl,-R'+cPath, >> '- >> l'+cStrip,pyx2cName,'-o',soName]) >> elif sys.platform=='linux2': cmd=' >> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >> lpython2.5','-l'+cStrip,'-o',soName]) >> else: print 'Platform "' + >> sys.platform + '" not supported yet' >> >> if printCmds: >> print '\n', cmd >> os.system(cmd) >> >> os.chdir(cwd) >> >> #Shamelessly steal the idea used by scipy.weave.inline but for Pyrex/ >> Cython instead... >> # In order to be able to import *, have to use exec in the calling >> module... >> def >> PyrexInline >> (code >> ,cleanUp >> = >> False >> ,useDistutils=False,useCython=False,gccOptions='',printCmds=True): >> '''PyrexInline returns a string that is an import statement to >> the temporary cython module'''+ \ >> '''Use this like: exec(PyrexInline(r"""""",))''' >> >> testCode=r""" >> cdef extern from "stdio.h": >> ctypedef struct FILE >> >> FILE * stdout >> int printf(char *format,...) 
>> int fflush( FILE *stream ) >> >> def PyrexPrint(mystring): >> printf(mystring) >> fflush(stdout) >> >> PyrexPrint('HelloWorld!') >> """ >> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >> if not os.path.isdir(tmpPath): >> os.mkdir(tmpPath) >> if tmpPath not in sys.path: >> sys.path.append(tmpPath) >> if cleanUp: >> CleanTmp() >> >> # Ensure you always get a new module! >> # This means there is no reason to "reload" >> # Also means memory gets majorly eaten up! >> # Can't have everything! >> moduleName='Pyrex'+str(random.randint(0,1e18)) >> file=os.path.join(tmpPath,moduleName+'.pyx') >> >> fid=open(file,'w') >> fid.write(code) >> fid.close() >> >> >> Cpyx >> (file >> ,useDistutils >> = >> useDistutils >> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >> >> #cmd="""import """+moduleName+""" as LoadPyrexInline""" >> cmd="""from """+moduleName+""" import *""" >> if printCmds: >> print cmd >> return cmd >> >> # Create a dummy function that defaults to using Cython instead for >> clarity... >> def >> CythonInline >> (code >> ,cleanUp >> = >> False,useDistutils=False,useCython=True,gccOptions='',printCmds=True): >> return >> PyrexInline >> (code >> ,cleanUp >> = >> cleanUp >> ,useDistutils >> = >> useDistutils >> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >> >> def CleanTmp(): >> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >> for i in glob.glob(os.path.join(tmpPath,'*')): >> os.remove(i) >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > From dalcinl at gmail.com Mon Dec 14 17:04:14 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Mon, 14 Dec 2009 13:04:14 -0300 Subject: [Cython] Cpyx for automatic Module creation and inlining of Cython code... 
In-Reply-To: <4B25E5FE.6020700@gmail.com> References: <4B22A23D.1010303@gmail.com> <40620762-2F32-4036-B63F-80050F7ED935@math.washington.edu> <4B25E5FE.6020700@gmail.com> Message-ID: On Mon, Dec 14, 2009 at 4:15 AM, David Mashburn wrote: > Thanks Robert! > > I had been hoping to upgrade this because I knew the hard-coding was > going to be problematic... Thanks for the tip-off about where to look > for the paths! > > Could you point me to the code where this is done in Sage? I'd love to > look at it for ideas! > You might find the first block of the script here interesting: http://code.google.com/p/mpi4py/source/browse/trunk/misc/cy2py . That's IMHO a better way to call distutils' setup function, as you do not need to create a temporary file or spawn a new Python process to do the job (though doing os.fork() could be a good idea). Another thing you should look at is caching. Cpyx should not waste time re-cythonizing and re-compiling inline Cython/Pyrex code that never changes. > Thanks, > -David > > Robert Bradshaw wrote: >> On Dec 11, 2009, at 11:49 AM, David Mashburn wrote: >> >> >>> Hello Cython Developers, >>> >>> I wanted to announce "Cpyx", a module I've been working on off and >>> on since 2006 that I use to automatically compile and also inline >>> Cython code in my work (mostly because I like to do everything in >>> one step). >>> >>> This is more-or-less a prototype, but it works for me on Windows, >>> Mac, and Ubuntu, so I thought I'd share! >>> >>> I know it has similar goals to pyx_import, but I think the two are >>> quite compilementary... (and I couldn't figure out how to get numpy >>> support in pyx_import when it came out...) >>> >>> My main hope for this is that it can give people a starting point >>> for using manual compilation/distutils >>> on their system (it is very verbose by default) and that it can >>> automatically inline code with numpy support!
>>> >>> If you find it useful, I think it is almost mature enough to be >>> included in cython, and if not I certainly enjoy using it! >>> >>> In any case, I'd love some feedback. >>> >> >> Thanks for posting this. This reminds me a bit of what we do with >> Cython for the notebook in sage. One comment I have is that a lot of >> paths seem to be hard coded, and may not always be accurate depending >> on how/where Python is installed or what version of the OS you have. >> There is the handy sys.prefix that you can use to determine the >> running Python's directory and include paths. >> >> - Robert >> >> >>> Thanks for all the hard work you all are doing with Greg's brainchild! >>> -David Mashburn >>> # Author: David Mashburn >>> # Created July 2006 >>> # Last Modified December 11, 2009 >>> # License: ??? (Apache 2) -- whatever is easiest for cython folks... >>> >>> # This module is for the automatic compilation (and also inlining) of >>> # Pyrex / Cython code... >>> # It can use distutils or manual compilation with gcc (or another >>> compiler) >>> # It can work with a single existing C source and automatically >>> compile it as well >>> # It has been tested on Windows, Mac, and Ubuntu Linux >>> >>> # That said, I make no guarantees that it will work as expected! >>> # Numpy support is automatically enabled for the non-distutils >>> version... >>> >>> # Unless the printCmds option is set to False, the script will >>> output every action taken >>> # and command run >>> >>> # My main goal for this is to aid people in learning how to compile >>> cython code >>> # on their system, and give them a starting point so they can tweak >>> what they want... >>> >>> # My other goal is to automate the Cython compile process so I can >>> do everything in >>> # one step after getting it set up :) >>> >>> # I really like the inline feature a lot for testing! >>> # And try it with PySlices, the latest incarnation of the wxPython >>> shell, PyCrust! (Shameless plug...) 
>>> >>> import os >>> import sys >>> import glob >>> import random >>> import numpy >>> import SetEnvironVars >>> >>> # Making this work in Vista... >>> # Download the latest mingw (5.x.x): >>> # add C:\MinGW\bin to the PATH environment variable >>> >>> # Should work with latest MingW on Windows 7... >>> >>> # Making this work on Mac... >>> # Download Xcode from the apple developer site (create a login) and >>> install it: >>> # http://connect.apple.com >>> >>> # Sample output for Cpyx on Windows: >>> # Pieces: >>> # gcc -c -IC:/Python25/include PyrexExample.c -o PyrexExample.o >>> # gcc -shared PyrexExample.o -LC:/Python25/libs -lpython25 -o >>> PyrexExample.pyd >>> # All-in-one: >>> # gcc -shared PyrexExample.c -IC:/Python25/include -LC:/Python25/ >>> libs -lpython25 -o PyrexExample.pyd >>> # All-in-one with linking dll... >>> # gcc -shared numpyTest.c -IC:/Python25/include -LC:/Python25/libs - >>> LC:/Users/mashbudn/Programming/Python/Pyx -lpython25 -lnumpyTestC -o >>> numpyTest.pyd >>> >>> myPythonDir=os.environ['MYPYTHON'] >>> myPyrexDir=os.environ['MYPYREX'] >>> globalUseCython=True >>> >>> if sys.platform=='win32': >>> pyrexcName='"' + 'C:\\Python25\\Scripts\\pyrexc.py' + '"' # Full >>> path to the Pyrex compiler script >>> cythonName='"' + 'C:\\Python25\\Scripts\\cython.py' + '"' # Full >>> path to the Cython compiler script >>> pythonName='C:\\Python25\\python.exe' # Full path to python.exe >>> sitePackages='C:\\Python25\\Lib\\site-packages' >>> pythonInclude='C:/Python25/include' >>> pythonLibs='C:/Python25/libs' >>> elif sys.platform=='darwin': >>> pyrexcName='"' + '/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/bin/pyrexc' + '"' # Full path to the Pyrex compiler script >>> cythonName='"' + '/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/bin/cython' + '"' # Full path to the Cython compiler script >>>
pythonName='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> bin/python' # Full path to python.exe >>> sitePackages='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> lib/python2.5/site-packages' >>> pythonInclude='/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/include' >>> pythonLibs='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> lib/python2.5/config/' # contains libpython2.5.so >>> elif sys.platform=='linux2': >>> pyrexcName='"' + '/usr/bin/pyrexc' + '"' # Full path to the Pyrex >>> compiler script >>> cythonName='"' + '/usr/bin/cython' + '"' # Full path to the >>> Cython compiler script >>> pythonName='/usr/bin/python2.5' # Full path to python.exe >>> sitePackages='/usr/lib/python2.5/site-packages' >>> pythonInclude='/usr/include/python2.5' >>> pythonLibs='/usr/lib' # contains libpython2.5.so >>> else: >>> print 'Platform "' + sys.platform + '" not supported yet' >>> >>> # New way to find numpy's arrayobject.h to include >>> arrayobjecthPath = >>> os.path.join(numpy.get_include(),'numpy','arrayobject.h') >>> arrayObjectDir = numpy.get_include() >>> >>> def Cdll(cNameIn='',printCmds=True,gccOptions=''): >>> cwd=os.getcwd() >>> >>> (cPath,cName)=os.path.split(cNameIn) # input path and input file >>> name >>> if cPath=='': >>> cPath=myPyrexDir # directory used for all Pyrex stuff >>> dllPath=cPath >>> >>> stripName=(os.path.splitext(cName))[0] # input file name without >>> extension >>> >>> if sys.platform=='win32': dllName='"' + >>> os.path.join(dllPath,stripName+'.dll') + '"' >>> elif sys.platform=='darwin': dllName='"' + >>> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >>> elif sys.platform=='linux2': dllName='"' + >>> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >>> else: print 'Platform "' + sys.platform >>> + '" not supported yet' >>> >>>
cName='"' + os.path.join(cPath,stripName+'.c') + '"' # redefine >>> cName >>> hName='"' + os.path.join(cPath,stripName+'.h') + '"' >>> oName='"' + os.path.join(dllPath,stripName+'.o') + '"' >>> >>> os.chdir(cPath) >>> >>> cmd=' '.join(['gcc',gccOptions,'-fPIC','-c',cName,'-o',stripName >>> +'.o']) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> cmd=' '.join(['gcc','-shared','-o',dllName,oName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> def >>> Cpyx >>> (pyxNameIn >>> = >>> 'PyrexExample >>> .pyx >>> ',useDistutils >>> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >>> cwd=os.getcwd() >>> >>> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >>> file name >>> >>> if pyxPath=='': >>> pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff >>> else: >>> pydPath=mainDir=pyxPath >>> >>> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >>> extension >>> >>> extName='"' + pyxStrip + '"' >>> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >>> path to the PYX file (must be in Python/Pyx folder) redefine pyxName >>> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >>> path to the C file to be created >>> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >>> path to the PYD file to be created >>> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >>> path to the lib*.so file to be created >>> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >>> path of the Setup File to be created >>> >>> # run the main pyrex command to make the C file >>> pyxCompiler = cythonName if useCython else pyrexcName >>> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> if useDistutils: >>> >>>
#write setup.py which will make a PYD file that can be imported >>> setupText="""### This file is setup.py ### >>> from distutils.core import setup >>> from distutils.extension import Extension >>> from Pyrex.Distutils import build_ext >>> >>> setup( >>> name = 'Lock module', >>> ext_modules=[ >>> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >>> ']' + """), >>> ], >>> cmdclass = {'build_ext': build_ext} >>> )""" >>> >>> if printCmds: >>> print 'Write Stuff to ', setupName[1:-1] >>> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >>> fid.write(setupText) >>> fid.close() >>> >>> # run setup.py >>> >>> os.chdir(mainDir) >>> >>> if sys.platform=='win32': cmd=' >>> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >>> inplace']) >>> elif sys.platform=='darwin': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> elif sys.platform=='linux2': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> else: >>> if sys.platform=='win32': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-lpython25','-o',pydName]) >>> elif sys.platform=='darwin': cmd=' >>> '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- >>> no-cpp-precomp','-mno-fused-madd','-fno-common', >>> '-dynamic','- >>> DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- >>> I'+pythonInclude, >>> '- >>> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >>> L/usr/local/lib',pyx2cName,'-o',soName]) >>> elif sys.platform=='linux2':
cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-lpython2.5','-o',soName]) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> def >>> CpyxLib >>> (pyxNameIn >>> = >>> 'PyrexExample >>> .pyx >>> ',cNameIn >>> = >>> 'CTestC >>> .c >>> ',recompile >>> = >>> True >>> ,useDistutils >>> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >>> cwd=os.getcwd() >>> >>> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >>> file name >>> (cPath,cName)=os.path.split(cNameIn) # input path and input file >>> name >>> >>> if cPath=='': >>> dllPath=cPath=myPyrexDir # directory used for all Pyrex stuff >>> >>> if pyxPath=='': >>> pydPath=mainDir=myPyrexDir # directory used for all Pyrex stuff >>> else: >>> pydPath=mainDir=pyxPath >>> >>> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >>> extension >>> cStrip=(os.path.splitext(cName))[0] # input file name without >>> extension >>> >>> extName='"' + pyxStrip + '"' >>> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >>> path to the PYX file (must be in Python\\Pyrex folder) >>> >>> if sys.platform=='win32': >>> dllName='"' + os.path.join(dllPath,cStrip+'.dll') + '"' >>> libName='"' + os.path.join(dllPath,cStrip) + '"' >>> library_dirs_txt='' >>> elif sys.platform=='darwin': >>> dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >>> libName='"' + cStrip + '"' >>> library_dirs_txt=""" >>> library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + >>> """], >>> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >>> '"' + """],""" >>> elif sys.platform=='linux2': >>>
dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >>> libName='"' + cStrip + '"' >>> library_dirs_txt=""" >>> library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + >>> """], >>> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >>> '"' + """],""" >>> else: >>> print 'Platform "' + sys.platform + '" not supported yet' >>> >>> cName='"' + os.path.join(cPath,cStrip+'.c') + '"' # redefine cName >>> hName='"' + os.path.join(cPath,cStrip+'.h') + '"' >>> oName='"' + os.path.join(dllPath,cStrip+'.o') + '"' >>> >>> os.chdir(cPath) >>> >>> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >>> path to the C file to be created >>> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >>> path of the Setup File to be created >>> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >>> path to the PYD file to be created >>> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >>> path to the lib*.so file to be created >>> >>> # compile the DLL needed for the link to the C file >>> if recompile: >>> Cdll(cName[1:-1],printCmds=printCmds,gccOptions=gccOptions) # >>> [1:-1] to remove the quotes >>> >>> # run the main pyrex command to make the C file >>> pyxCompiler = cythonName if useCython else pyrexcName >>> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> if useDistutils: >>> #write setup.py which will make a PYD file that can be imported >>> setupText="""### This file is setup.py ### >>> from distutils.core import setup >>> from distutils.extension import Extension >>> from Pyrex.Distutils import build_ext >>> >>> setup( >>> name = 'Lock module', >>> ext_modules=[ >>> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >>> """],"""+library_dirs_txt+""" >>>
libraries=[""" + libName.replace('\\','\\\\') + ']' + """), >>> ], >>> cmdclass = {'build_ext': build_ext} >>> )""" >>> if printCmds: >>> print 'Write Stuff to ', setupName[1:-1] >>> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >>> fid.write(setupText) >>> fid.close() >>> >>> # run setup.py >>> os.chdir(mainDir) >>> >>> if sys.platform=='win32': cmd=' >>> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >>> inplace']) >>> elif sys.platform=='darwin': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> elif sys.platform=='linux2': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> else: >>> if sys.platform=='win32': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >>> lpython25','-l'+cStrip,'-o',pydName]) >>> elif sys.platform=='darwin': cmd=' >>> '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long-double','- >>> no-cpp-precomp','-mno-fused-madd','-fno-common', >>> '-dynamic','- >>> DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- >>> I'+pythonInclude, >>> '- >>> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >>> L/usr/local/lib','-L'+cPath,'-Wl,-R'+cPath, >>> '- >>> l'+cStrip,pyx2cName,'-o',soName]) >>> elif sys.platform=='linux2': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >>> lpython2.5','-l'+cStrip,'-o',soName]) >>> else:
print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> #Shamelessly steal the idea used by scipy.weave.inline but for Pyrex/ >>> Cython instead... >>> # In order to be able to import *, have to use exec in the calling >>> module... >>> def >>> PyrexInline >>> (code >>> ,cleanUp >>> = >>> False >>> ,useDistutils=False,useCython=False,gccOptions='',printCmds=True): >>> '''PyrexInline returns a string that is an import statement to >>> the temporary cython module'''+ \ >>> '''Use this like: exec(PyrexInline(r"""""",))''' >>> >>> testCode=r""" >>> cdef extern from "stdio.h": >>> ctypedef struct FILE >>> >>> FILE * stdout >>> int printf(char *format,...) >>> int fflush( FILE *stream ) >>> >>> def PyrexPrint(mystring): >>> printf(mystring) >>> fflush(stdout) >>> >>> PyrexPrint('HelloWorld!') >>> """ >>> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >>> if not os.path.isdir(tmpPath): >>> os.mkdir(tmpPath) >>> if tmpPath not in sys.path: >>> sys.path.append(tmpPath) >>> if cleanUp: >>> CleanTmp() >>> >>> # Ensure you always get a new module! >>> # This means there is no reason to "reload" >>> # Also means memory gets majorly eaten up! >>> # Can't have everything! >>> moduleName='Pyrex'+str(random.randint(0,1e18)) >>> file=os.path.join(tmpPath,moduleName+'.pyx') >>> >>> fid=open(file,'w') >>> fid.write(code) >>> fid.close() >>> >>> >>> Cpyx >>> (file >>> ,useDistutils >>> = >>> useDistutils >>> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >>> >>> #cmd="""import """+moduleName+""" as LoadPyrexInline""" >>> cmd="""from """+moduleName+""" import *""" >>> if printCmds: >>> print cmd >>> return cmd >>> >>> # Create a dummy function that defaults to using Cython instead for >>> clarity...
>>> def >>> CythonInline >>> (code >>> ,cleanUp >>> = >>> False,useDistutils=False,useCython=True,gccOptions='',printCmds=True): >>> return >>> PyrexInline >>> (code >>> ,cleanUp >>> = >>> cleanUp >>> ,useDistutils >>> = >>> useDistutils >>> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >>> >>> def CleanTmp(): >>> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >>> for i in glob.glob(os.path.join(tmpPath,'*')): >>> os.remove(i) >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >>> >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From strombrg at gmail.com Mon Dec 14 17:27:00 2009 From: strombrg at gmail.com (Dan Stromberg) Date: Mon, 14 Dec 2009 08:27:00 -0800 Subject: [Cython] Preprocessor? In-Reply-To: <4B25E178.4060601@behnel.de> References: <4B259A1B.8000306@gmail.com> <4B25E178.4060601@behnel.de> Message-ID: <4B266754.1040107@gmail.com> Stefan Behnel wrote: > Dan Stromberg, 14.12.2009 02:51: > >> Has anyone already worked out a system for preprocessing (maybe via m4) >> a single input file into two output files: one a plain .py, and on a .pyx?
>> >> I sometimes prefer a pure python version of a dependency if I can get >> it, because it makes the ongoing maintenance costs lower than compiling >> (and recompiling) extension modules would. It seems like due to the >> similarities between .py and .pyx, there should be a way of maintaining >> both as a single document - so people who want the speed can have it, >> and people who want convenience can have that, but the programmer >> doesn't end up maintaining two versions of said dependency. >> >> I googled, but didn't find anything that looked relevant, other than >> pages about how to use various preprocessors. >> > > I assume you know that Cython can compile .py files and has a pure Python > syntax for type declarations? > > http://wiki.cython.org/pure > > Stefan > Actually, I didn't know that. It's interesting, but I don't think it's what I want; I'm looking for something that will allow my code to not depend on cython at all in one form. I've since worked out some simple ifdef's in m4 to do what I wanted. From robertwb at math.washington.edu Mon Dec 14 19:41:33 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Dec 2009 10:41:33 -0800 Subject: [Cython] Generators - status? In-Reply-To: <4B25E53D.4050902@behnel.de> References: <4B2497C7.8070406@gmail.com> <4B25E53D.4050902@behnel.de> Message-ID: On Dec 13, 2009, at 11:11 PM, Stefan Behnel wrote: > Robert Bradshaw, 13.12.2009 09:40: >> On Dec 12, 2009, at 11:29 PM, Dan Stromberg wrote: >> >>> How close is cython (0.12 and trunk or whatever cython calls >>> trunk, or >>> the cython-closures repository?) to supporting generators? >> >> It's getting closer. Actually, Craig Citro made some progress on >> stress-testing it last week (he's been looking at doing this for a >> while, but has been very busy with teaching this last quarter...), >> and >> if all goes well closures should be merged in, otherwise we'll at >> least know where we stand. 
> > Is he working on extending the test suite, or is he testing with his > own code? Both. One thing he added was the ability to put def functions inside if statements, etc. (Basically a parsing, rather than code generation, issue). > Extending the test corpus would really be helpful here. For now, > there are > only a couple of simple tests but we are far from any reasonable > coverage, > and close to zero when it comes to corner cases. A long time ago he wrote a scheme compiler, and has a huge battery of closure-related tests he used for that. He wrote a translator to represent them in Python which should provide good coverage of standard and corner-case behavior. - Robert From robertwb at math.washington.edu Mon Dec 14 20:10:56 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Dec 2009 11:10:56 -0800 Subject: [Cython] Cpyx for automatic Module creation and inlining of Cython code... In-Reply-To: <4B25E5FE.6020700@gmail.com> References: <4B22A23D.1010303@gmail.com> <40620762-2F32-4036-B63F-80050F7ED935@math.washington.edu> <4B25E5FE.6020700@gmail.com> Message-ID: <7A787FB9-D58B-48BB-87D8-3A42A1A2F61A@math.washington.edu> On Dec 13, 2009, at 11:15 PM, David Mashburn wrote: > Thanks Robert! > > I had been hoping to upgrade this because I knew the hard-coding was > going to be problematic... Thanks for the tip-off about where to look > for the paths! > > Could you point me to the code where this is done in Sage? I'd love > to > look at it for ideas! http://hg.sagemath.org/sage-main/file/5db805d3bdaf/sage/misc/cython.py > Thanks, > -David > > Robert Bradshaw wrote: >> On Dec 11, 2009, at 11:49 AM, David Mashburn wrote: >> >> >>> Hello Cython Developers, >>> >>> I wanted to announce "Cpyx", a module I've been working on off and >>> on since 2006 that I use to automatically compile and also inline >>> Cython code in my work (mostly because I like to do everything in >>> one step). 
>>> >>> This is more-or-less a prototype, but it works for me on Windows, >>> Mac, and Ubuntu, so I thought I'd share! >>> >>> I know it has similar goals to pyx_import, but I think the two are >>> quite compilementary... (and I couldn't figure out how to get numpy >>> support in pyx_import when it came out...) >>> >>> My main hope for this is that it can give people a starting point >>> for using manual compilation/distutils >>> on their system (it is very verbose by default) and that it can >>> automatically inline code with numpy support! >>> >>> If you find it useful, I think it is almost mature enough to be >>> included in cython, and if not I certainly enjoy using it! >>> >>> In any case, I'd love some feedback. >>> >> >> Thanks for posting this. This reminds me a bit of what we do with >> Cython for the notebook in sage. One comment I have is that a lot of >> paths seem to be hard coded, and may not always be accurate depending >> on how/where Python is installed or what version of the OS you have. >> There is the handy sys.prefix that you can use to determine the >> running Python's directory and include paths. >> >> - Robert >> >> >>> Thanks for all the hard work you all are doing with Greg's >>> brainchild! >>> -David Mashburn >>> # Author: David Mashburn >>> # Created July 2006 >>> # Last Modified December 11, 2009 >>> # License: ??? (Apache 2) -- whatever is easiest for cython folks... >>> >>> # This module is for the automatic compilation (and also inlining) >>> of >>> # Pyrex / Cython code... >>> # It can use distutils or manual compilation with gcc (or another >>> compiler) >>> # It can work with a single existing C source and automatically >>> compile it as well >>> # It has been tested on Windows, Mac, and Ubuntu Linux >>> >>> # That said, I make no guarantees that it will work as expected! >>> # Numpy support is automatically enabled for the non-distutils >>> version... 
>>> >>> # Unless the printCmds option is set to False, the script will >>> output every action taken >>> # and command run >>> >>> # My main goal for this is to aid people in learning how to compile >>> cython code >>> # on their system, and give them a starting point so they can tweak >>> what they want... >>> >>> # My other goal is to automate the Cython compile process so I can >>> do everything in >>> # one step after getting it set up :) >>> >>> # I really like the inline feature a lot for testing! >>> # And try it with PySlices, the latest incarnation of the wxPython >>> shell, PyCrust! (Shameless plug...) >>> >>> import os >>> import sys >>> import glob >>> import random >>> import numpy >>> import SetEnvironVars >>> >>> # Making this work in Vista... >>> # Download the latest mingw (5.x.x): >>> # add C:\MinGW\bin to the PATH environment variable >>> >>> # Should work with latest MingW on Windows 7... >>> >>> # Making this work on Mac... >>> # Download Xcode from the apple developer site (create a login) and >>> install it: >>> # http://connect.apple.com >>> >>> # Sample output for Cpyx on Windows: >>> # Pieces: >>> # gcc -c -IC:/Python25/include PyrexExample.c -o PyrexExample.o >>> # gcc -shared PyrexExample.o -LC:/Python25/libs -lpython25 -o >>> PyrexExample.pyd >>> # All-in-one: >>> # gcc -shared PyrexExample.c -IC:/Python25/include -LC:/Python25/ >>> libs -lpython25 -o PyrexExample.pyd >>> # All-in-one with linking dll... 
>>> # gcc -shared numpyTest.c -IC:/Python25/include -LC:/Python25/libs - >>> LC:/Users/mashbudn/Programming/Python/Pyx -lpython25 -lnumpyTestC -o >>> numpyTest.pyd >>> >>> myPythonDir=os.environ['MYPYTHON'] >>> myPyrexDir=os.environ['MYPYREX'] >>> globalUseCython=True >>> >>> if sys.platform=='win32': >>> pyrexcName='"' + 'C:\\Python25\\Scripts\\pyrexc.py' + '"' # Full >>> path to the Pyrex compiler script >>> cythonName='"' + 'C:\\Python25\\Scripts\\cython.py' + '"' # Full >>> path to the Cython compiler script >>> pythonName='C:\\Python25\\python.exe' # Full path to python.exe >>> sitePackages='C:\\Python25\\Lib\\site-packages' >>> pythonInclude='C:/Python25/include' >>> pythonLibs='C:/Python25/libs' >>> elif sys.platform=='darwin': >>> pyrexcName='"' + '/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/bin/pyrexc' + '"' # Full path to the Pyrex compiler script >>> cythonName='"' + '/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/bin/cython' + '"' # Full path to the Cython compiler script >>> pythonName='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> bin/python' # Full path to python.exe >>> sitePackages='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> lib/python2.5/site-packages' >>> pythonInclude='/Library/Frameworks/Python.framework/Versions/ >>> 5.1.1/include' >>> pythonLibs='/Library/Frameworks/Python.framework/Versions/5.1.1/ >>> lib/python2.5/config/' # contains libpython2.5.so >>> elif sys.platform=='linux2': >>> pyrexcName='"' + '/usr/bin/pyrexc' + '"' # Full path to the Pyrex >>> compiler script >>> cythonName='"' + '/usr/bin/cython' + '"' # Full path to the >>> Cython compiler script >>> pythonName='/usr/bin/python2.5' # Full path to python.exe >>> sitePackages='/usr/lib/python2.5/site-packages' >>> pythonInclude='/usr/include/python2.5' >>> pythonLibs='/usr/lib' # contains libpython2.5.so >>> else: >>> print 'Platform "' + sys.platform + '" not supported yet' >>> >>> # New way to find numpy's arrayobject.h to include 
>>> arrayobjecthPath = >>> os.path.join(numpy.get_include(),'numpy','arrayobject.h') >>> arrayObjectDir = numpy.get_include() >>> >>> def Cdll(cNameIn='',printCmds=True,gccOptions=''): >>> cwd=os.getcwd() >>> >>> (cPath,cName)=os.path.split(cNameIn) # input path and input file >>> name >>> if cPath=='': >>> cPath=myPyrexDir # directory used for all Pyrex stuff >>> dllPath=cPath >>> >>> stripName=(os.path.splitext(cName))[0] # input file name without >>> extension >>> >>> if sys.platform=='win32': dllName='"' + >>> os.path.join(dllPath,stripName+'.dll') + '"' >>> elif sys.platform=='darwin': dllName='"' + >>> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >>> elif sys.platform=='linux2': dllName='"' + >>> os.path.join(dllPath,'lib'+stripName+'.so') + '"' >>> else: print 'Platform "' + sys.platform >>> + '" not supported yet' >>> >>> cName='"' + os.path.join(cPath,stripName+'.c') + '"' # redefine >>> cName >>> hName='"' + os.path.join(cPath,stripName+'.h') + '"' >>> oName='"' + os.path.join(dllPath,stripName+'.o') + '"' >>> >>> os.chdir(cPath) >>> >>> cmd=' '.join(['gcc',gccOptions,'-fPIC','-c',cName,'-o',stripName >>> +'.o']) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> cmd=' '.join(['gcc','-shared','-o',dllName,oName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> def >>> Cpyx >>> (pyxNameIn >>> = >>> 'PyrexExample >>> .pyx >>> ',useDistutils >>> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >>> cwd=os.getcwd() >>> >>> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >>> file name >>> >>> if pyxPath=='': >>> pydPath=mainDir=myPyrexDir # directory used for all Pyrex >>> stuff >>> else: >>> pydPath=mainDir=pyxPath >>> >>> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >>> extension >>> >>> extName='"' + pyxStrip + '"' >>> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >>> path to the PYX file (must be in Python/Pyx folder) redefine 
pyxName >>> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >>> path to the C file to be created >>> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >>> path to the PYD file to be created >>> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >>> path to the lib*.so file to be created >>> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >>> path of the Setup File to be created >>> >>> # run the main pyrex command to make the C file >>> pyxCompiler = cythonName if useCython else pyrexcName >>> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> if useDistutils: >>> >>> #write setup.py which will make a PYD file that can be >>> imported >>> setupText="""### This file is setup.py ### >>> from distutils.core import setup >>> from distutils.extension import Extension >>> from Pyrex.Distutils import build_ext >>> >>> setup( >>> name = 'Lock module', >>> ext_modules=[ >>> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >>> ']' + """), >>> ], >>> cmdclass = {'build_ext': build_ext} >>> )""" >>> >>> if printCmds: >>> print 'Write Stuff to ', setupName[1:-1] >>> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >>> fid.write(setupText) >>> fid.close() >>> >>> # run setup.py >>> >>> os.chdir(mainDir) >>> >>> if sys.platform=='win32': cmd=' >>> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >>> inplace']) >>> elif sys.platform=='darwin': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> elif sys.platform=='linux2': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> else: >>> if sys.platform=='win32': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-lpython25','-o',pydName]) >>> elif sys.platform=='darwin': cmd=' >>> 
'.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long- >>> double','- >>> no-cpp-precomp','-mno-fused-madd','-fno-common', >>> '-dynamic','- >>> DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- >>> I'+pythonInclude, >>> '- >>> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >>> L/usr/local/lib',pyx2cName,'-o',soName]) >>> elif sys.platform=='linux2': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-lpython2.5','-o',soName]) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> def >>> CpyxLib >>> (pyxNameIn >>> = >>> 'PyrexExample >>> .pyx >>> ',cNameIn >>> = >>> 'CTestC >>> .c >>> ',recompile >>> = >>> True >>> ,useDistutils >>> =False,useCython=globalUseCython,gccOptions='',printCmds=True): >>> cwd=os.getcwd() >>> >>> (pyxPath,pyxName)=os.path.split(pyxNameIn) # input path and input >>> file name >>> (cPath,cName)=os.path.split(cNameIn) # input path and input file >>> name >>> >>> if cPath=='': >>> dllPath=cPath=myPyrexDir # directory used for all Pyrex stuff >>> >>> if pyxPath=='': >>> pydPath=mainDir=myPyrexDir # directory used for all Pyrex >>> stuff >>> else: >>> pydPath=mainDir=pyxPath >>> >>> pyxStrip=(os.path.splitext(pyxName))[0] # input file name without >>> extension >>> cStrip=(os.path.splitext(cName))[0] # input file name without >>> extension >>> >>> extName='"' + pyxStrip + '"' >>> pyxName='"' + os.path.join(mainDir,pyxStrip+'.pyx') + '"' # Full >>> path to the PYX file (must be in Python\\Pyrex folder) >>> >>> if sys.platform=='win32': >>> dllName='"' + os.path.join(dllPath,cStrip+'.dll') + '"' >>> libName='"' + os.path.join(dllPath,cStrip) + '"' >>> library_dirs_txt='' >>> elif sys.platform=='darwin': >>> dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >>> libName='"' + cStrip + '"' >>> library_dirs_txt=""" >>> library_dirs=[""" + '"' 
+ pydPath.replace('\\','\\\\') + '"' + >>> """], >>> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >>> '"' + """],""" >>> elif sys.platform=='linux2': >>> dllName='"' + os.path.join(dllPath,'lib'+cStrip+'.so') + '"' >>> libName='"' + cStrip + '"' >>> library_dirs_txt=""" >>> library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + '"' + >>> """], >>> runtime_library_dirs=[""" + '"' + pydPath.replace('\\','\\\\') + >>> '"' + """],""" >>> else: >>> print 'Platform "' + sys.platform + '" not supported yet' >>> >>> cName='"' + os.path.join(cPath,cStrip+'.c') + '"' # redefine cName >>> hName='"' + os.path.join(cPath,cStrip+'.h') + '"' >>> oName='"' + os.path.join(dllPath,cStrip+'.o') + '"' >>> >>> os.chdir(cPath) >>> >>> pyx2cName='"' + os.path.join(pydPath,pyxStrip+'.c') + '"' # Full >>> path to the C file to be created >>> setupName='"' + os.path.join(pydPath,'setup.py') + '"' # Full >>> path of the Setup File to be created >>> pydName='"' + os.path.join(pydPath,pyxStrip+'.pyd') + '"' # Full >>> path to the PYD file to be created >>> soName='"' + os.path.join(pydPath,pyxStrip+'.so') + '"' # Full >>> path to the lib*.so file to be created >>> >>> # compile the DLL needed for the link to the C file >>> if recompile: >>> Cdll(cName[1:-1],printCmds=printCmds,gccOptions=gccOptions) # >>> [1:-1] to remove the quotes >>> >>> # run the main pyrex command to make the C file >>> pyxCompiler = cythonName if useCython else pyrexcName >>> cmd=' '.join([pythonName,pyxCompiler,pyxName,'-o',pyx2cName]) >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> if useDistutils: >>> #write setup.py which will make a PYD file that can be >>> imported >>> setupText="""### This file is setup.py ### >>> from distutils.core import setup >>> from distutils.extension import Extension >>> from Pyrex.Distutils import build_ext >>> >>> setup( >>> name = 'Lock module', >>> ext_modules=[ >>> Extension(""" + extName + ', [' + pyxName.replace('\\','\\\\') + >>> 
"""],"""+library_dirs_txt+""" >>> libraries=[""" + libName.replace('\\','\\\\') + ']' + """), >>> ], >>> cmdclass = {'build_ext': build_ext} >>> )""" >>> if printCmds: >>> print 'Write Stuff to ', setupName[1:-1] >>> fid = open(setupName[1:-1],'w') # [1:-1] removes quotes >>> fid.write(setupText) >>> fid.close() >>> >>> # run setup.py >>> os.chdir(mainDir) >>> >>> if sys.platform=='win32': cmd=' >>> '.join([pythonName,setupName,'build_ext','--compiler=mingw32','-- >>> inplace']) >>> elif sys.platform=='darwin': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> elif sys.platform=='linux2': cmd=' >>> '.join([pythonName,setupName,'build_ext','--inplace']) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> else: >>> if sys.platform=='win32': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >>> lpython25','-l'+cStrip,'-o',pydName]) >>> elif sys.platform=='darwin': cmd=' >>> '.join(['gcc',gccOptions,'-fno-strict-aliasing','-Wno-long- >>> double','- >>> no-cpp-precomp','-mno-fused-madd','-fno-common', >>> '-dynamic','- >>> DNDEBUG','-g','-O3','-bundle','-undefined dynamic_lookup','- >>> I'+pythonInclude, >>> '- >>> I'+pythonInclude+'/python2.5','-I'+arrayObjectDir,'-L'+pythonLibs,'- >>> L/usr/local/lib','-L'+cPath,'-Wl,-R'+cPath, >>> '- >>> l'+cStrip,pyx2cName,'-o',soName]) >>> elif sys.platform=='linux2': cmd=' >>> '.join(['gcc',gccOptions,'-fPIC','-shared',pyx2cName,'- >>> I'+pythonInclude,'-L'+pythonLibs,'-L'+cPath,'-Wl,-R'+cPath,'- >>> lpython2.5','-l'+cStrip,'-o',soName]) >>> else: print 'Platform "' + >>> sys.platform + '" not supported yet' >>> >>> if printCmds: >>> print '\n', cmd >>> os.system(cmd) >>> >>> os.chdir(cwd) >>> >>> #Shamelessly steal the idea used by scipy.weave.inline but for >>> Pyrex/ >>> Cython instead... >>> # In order to be able to import *, have to use exec in the calling >>> module... 
>>> def >>> PyrexInline >>> (code >>> ,cleanUp >>> = >>> False >>> ,useDistutils=False,useCython=False,gccOptions='',printCmds=True): >>> '''PyrexInline returns a string that is an import statement to >>> the temporary cython module'''+ \ >>> '''Use this like: >>> exec(PyrexInline(r"""""",))''' >>> >>> testCode=r""" >>> cdef extern from "stdio.h": >>> ctypedef struct FILE >>> >>> FILE * stdout >>> int printf(char *format,...) >>> int fflush( FILE *stream ) >>> >>> def PyrexPrint(mystring): >>> printf(mystring) >>> fflush(stdout) >>> >>> PyrexPrint('HelloWorld!') >>> """ >>> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >>> if not os.path.isdir(tmpPath): >>> os.mkdir(tmpPath) >>> if tmpPath not in sys.path: >>> sys.path.append(tmpPath) >>> if cleanUp: >>> CleanTmp() >>> >>> # Ensure you always get a new module! >>> # This means there is no reason to "reload" >>> # Also means memory gets majorly eaten up! >>> # Can't have everything! >>> moduleName='Pyrex'+str(random.randint(0,1e18)) >>> file=os.path.join(tmpPath,moduleName+'.pyx') >>> >>> fid=open(file,'w') >>> fid.write(code) >>> fid.close() >>> >>> >>> Cpyx >>> (file >>> ,useDistutils >>> = >>> useDistutils >>> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >>> >>> #cmd="""import """+moduleName+""" as LoadPyrexInline""" >>> cmd="""from """+moduleName+""" import *""" >>> if printCmds: >>> print cmd >>> return cmd >>> >>> # Create a dummy function that defaults to using Cython instead for >>> clarity... 
>>> def >>> CythonInline >>> (code >>> ,cleanUp >>> = >>> False >>> ,useDistutils=False,useCython=True,gccOptions='',printCmds=True): >>> return >>> PyrexInline >>> (code >>> ,cleanUp >>> = >>> cleanUp >>> ,useDistutils >>> = >>> useDistutils >>> ,useCython=useCython,gccOptions=gccOptions,printCmds=printCmds) >>> >>> def CleanTmp(): >>> tmpPath=os.path.expanduser('~/.Cpyx_tmp') >>> for i in glob.glob(os.path.join(tmpPath,'*')): >>> os.remove(i) >>> _______________________________________________ >>> Cython-dev mailing list >>> Cython-dev at codespeak.net >>> http://codespeak.net/mailman/listinfo/cython-dev >>> >> >> _______________________________________________ >> Cython-dev mailing list >> Cython-dev at codespeak.net >> http://codespeak.net/mailman/listinfo/cython-dev >> > > _______________________________________________ > Cython-dev mailing list > Cython-dev at codespeak.net > http://codespeak.net/mailman/listinfo/cython-dev From robertwb at math.washington.edu Mon Dec 14 20:13:21 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Mon, 14 Dec 2009 11:13:21 -0800 Subject: [Cython] Preprocessor? In-Reply-To: <4B266754.1040107@gmail.com> References: <4B259A1B.8000306@gmail.com> <4B25E178.4060601@behnel.de> <4B266754.1040107@gmail.com> Message-ID: <59CC8580-C028-47DB-B295-9DCD63DAF42B@math.washington.edu> On Dec 14, 2009, at 8:27 AM, Dan Stromberg wrote: > Stefan Behnel wrote: >> Dan Stromberg, 14.12.2009 02:51: >> >>> Has anyone already worked out a system for preprocessing (maybe >>> via m4) >>> a single input file into two output files: one a plain .py, and on >>> a .pyx? >>> >>> I sometimes prefer a pure python version of a dependency if I can >>> get >>> it, because it makes the ongoing maintenance costs lower than >>> compiling >>> (and recompiling) extension modules would. 
It seems like due to the >>> similarities between .py and .pyx, there should be a way of >>> maintaining >>> both as a single document - so people who want the speed can have >>> it, >>> and people who want convenience can have that, but the programmer >>> doesn't end up maintaining two versions of said dependency. >>> >>> I googled, but didn't find anything that looked relevant, other than >>> pages about how to use various preprocessors. >>> >> >> I assume you know that Cython can compile .py files and has a pure >> Python >> syntax for type declarations? >> >> http://wiki.cython.org/pure >> >> Stefan >> > Actually, I didn't know that. It's interesting, but I don't think > it's > what I want; I'm looking for something that will allow my code to not > depend on cython at all in one form. The @cython decorators in pure mode are all noops, so in that sense it doesn't depend on Cython. > I've since worked out some simple ifdef's in m4 to do what I wanted. Good. I remember a big discussion about preparsing long ago, and I think the consensus was to have people use an external tool rather than build anything into Cython itself (though perhaps making it easier to integrate). - Robert From stefan_ml at behnel.de Mon Dec 14 20:15:51 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 Dec 2009 20:15:51 +0100 Subject: [Cython] Preprocessor? In-Reply-To: <4B266754.1040107@gmail.com> References: <4B259A1B.8000306@gmail.com> <4B25E178.4060601@behnel.de> <4B266754.1040107@gmail.com> Message-ID: <4B268EE7.8060409@behnel.de> Dan Stromberg, 14.12.2009 17:27: > Stefan Behnel wrote: >> I assume you know that Cython can compile .py files and has a pure Python >> syntax for type declarations? >> >> http://wiki.cython.org/pure >> > Actually, I didn't know that. It's interesting, but I don't think it's > what I want; I'm looking for something that will allow my code to not > depend on cython at all in one form. 
Your code /won't/ depend on Cython, that's what this is all about. All you
need is a tiny cython.py file in your distribution that implements a dummy
'cython' module (we actually ship that). That's all you add as a
"dependency" to your otherwise normal Python code. It will run in the
normal Python interpreter, although it will run faster if you use Cython to
compile it.

Stefan

From dagss at student.matnat.uio.no Mon Dec 14 20:22:27 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 14 Dec 2009 20:22:27 +0100
Subject: [Cython] Preprocessor?
In-Reply-To: <4B268EE7.8060409@behnel.de>
References: <4B259A1B.8000306@gmail.com> <4B25E178.4060601@behnel.de> <4B266754.1040107@gmail.com> <4B268EE7.8060409@behnel.de>
Message-ID: <4B269073.10809@student.matnat.uio.no>

Stefan Behnel wrote:
> Dan Stromberg, 14.12.2009 17:27:
>> Stefan Behnel wrote:
>>> I assume you know that Cython can compile .py files and has a pure Python
>>> syntax for type declarations?
>>>
>>> http://wiki.cython.org/pure
>>>
>> Actually, I didn't know that. It's interesting, but I don't think it's
>> what I want; I'm looking for something that will allow my code to not
>> depend on cython at all in one form.
>
> Your code /won't/ depend on Cython, that's what this is all about. All you
> need is a tiny cython.py file in your distribution that implements a dummy
> 'cython' module (we actually ship that). That's all you add as a
> "dependency" to your otherwise normal Python code. It will run in the
> normal Python interpreter, although it will run faster if you use Cython to
> compile it.

Be advised, however, that a) there are quite a few features not available
in pure Python mode, and b) one has to be a bit careful, because code
written in pure Python mode can sometimes mean one thing in Python and
another in Cython, as the semantics are not closely enough lined up (yet).

As long as one knows what one is doing and knows Cython pretty well, it's
pretty good though.
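
As a rough illustration of the dummy-module idea discussed above (a sketch, not code from this thread or from Cython itself): the fallback class `_NoOpCython` below is a hypothetical stand-in for the shipped cython.py shim, and `mean_of_squares` is an invented example function. The sketch assumes only that `cython.locals`, `cython.int` and `cython.double` are available in pure Python mode.

```python
# Sketch only: _NoOpCython is a hypothetical stand-in for the dummy
# 'cython' module; it is used only when the real one cannot be imported.
class _NoOpCython(object):
    int = int        # type names map to ordinary Python types
    double = float

    def locals(self, **declared_types):
        # Ignore the declarations and hand the function back unchanged.
        def decorator(func):
            return func
        return decorator


try:
    import cython
except ImportError:
    cython = _NoOpCython()


@cython.locals(n=cython.int, i=cython.int, total=cython.double)
def mean_of_squares(n):
    """Mean of i*i for i in range(n); plain Python when not compiled."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total / n if n > 0 else 0.0
```

Run uncompiled, the decorator does nothing and the function behaves as ordinary Python; compiled with Cython, the same declarations would presumably be read as C types, subject to the semantic caveats mentioned above.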
Another approach that was mentioned here some time ago was a script that would take Cython code (well, a subset) and transform it into pure Python code. I don't remember the final outcome though. -- Dag Sverre From greg.ewing at canterbury.ac.nz Mon Dec 14 23:20:07 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 15 Dec 2009 11:20:07 +1300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> <4B217DD8.7020401@canterbury.ac.nz> Message-ID: <4B26BA17.2090208@canterbury.ac.nz> Lisandro Dalcin wrote: > Well, at least in Cython, an enum is not compatible with all scalar > types, just with integral types. That won't matter if you tell Pyrex that the opaque type is an int. Pyrex will think it's compatible with an enum, and the C compiler will see it as whatever type it really is. -- Greg From dalcinl at gmail.com Mon Dec 14 23:35:46 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Mon, 14 Dec 2009 19:35:46 -0300 Subject: [Cython] except values: could we relax to non-constant expressions? In-Reply-To: <4B26BA17.2090208@canterbury.ac.nz> References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> <4B217DD8.7020401@canterbury.ac.nz> <4B26BA17.2090208@canterbury.ac.nz> Message-ID: On Mon, Dec 14, 2009 at 7:20 PM, Greg Ewing wrote: > Lisandro Dalcin wrote: > >> Well, at least in Cython, an enum is not compatible with all scalar >> types, just with integral types. > > That won't matter if you tell Pyrex that the opaque type > is an int. I have to insist: I have VERY good reasons to not tell Cython/Pyrex that the opaque type is an int. Instead, I REALLY need to tell that they are pointers, and not just a typedef to void*, but to DIFFERENT, INCOMPATIBLE, fake structures. Why am I being so pedantic about this? 
Because if I tell Cython/Pyrex that various MPI opaque types are 'int',
and the underlying MPI implementation also uses 'int' for defining
(using typedefs) the various different handle types, then I do not
have ANY guard against mistakes (I mean, like passing an actual int
instead of MPI_Comm in some call).

Greg, please tell me, what could go wrong with accepting external
definitions for except clauses? I cannot imagine any potential issue,
but your "resistance" to my use case makes me think that you have some
concerns about my request...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From greg.ewing at canterbury.ac.nz Tue Dec 15 00:19:46 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 15 Dec 2009 12:19:46 +1300
Subject: [Cython] except values: could we relax to non-constant expressions?
In-Reply-To:
References: <4B1EC5DA.2080702@canterbury.ac.nz> <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> <4B217DD8.7020401@canterbury.ac.nz> <4B26BA17.2090208@canterbury.ac.nz>
Message-ID: <4B26C812.8060404@canterbury.ac.nz>

Lisandro Dalcín wrote:

> Greg, please tell me, what could go wrong with accepting external
> definitions for except clauses? I cannot imagine any potential issue,
> but your "resistance" to my use case makes me think that you have some
> concerns about my request...

You misunderstand -- I'm not resisting, I'm just
suggesting ways in which you could work around the
problem in the meantime.

Your point about type checking is a valid one. There
would still be some protection, since the C compiler
will complain if you misuse the type, but I agree
it's not an ideal situation.
--
Greg

From dalcinl at gmail.com Tue Dec 15 00:28:11 2009
From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=)
Date: Mon, 14 Dec 2009 20:28:11 -0300
Subject: [Cython] except values: could we relax to non-constant expressions?
In-Reply-To: <4B26C812.8060404@canterbury.ac.nz>
References: <4B20182E.1040704@canterbury.ac.nz> <4B2018F3.6010309@student.matnat.uio.no> <4B217DD8.7020401@canterbury.ac.nz> <4B26BA17.2090208@canterbury.ac.nz> <4B26C812.8060404@canterbury.ac.nz>
Message-ID:

On Mon, Dec 14, 2009 at 8:19 PM, Greg Ewing wrote:
> Lisandro Dalcín wrote:
>
>> Greg, please tell me, what could go wrong with accepting external
>> definition for except clauses? I cannot imagine any potential issue,
>> but your "resistance" to my use case makes me think that you have some
>> concerns about my request...
>
> You misunderstand -- I'm not resisting, I'm just
> suggesting ways in which you could work around the
> problem in the meantime.
>

OK. Sorry!

> Your point about type checking is a valid one. There
> would still be some protection, since the C compiler
> will complain if you misuse the type, but I agree
> it's not an ideal situation.
>

Well, not always! ... MPICH2 (an MPI implementation) does this:

$ grep 'typedef int MPI_' /usr/local/mpich2/include/mpi.h
typedef int MPI_Datatype;
typedef int MPI_Comm;
typedef int MPI_Group;
typedef int MPI_Win;
typedef int MPI_Op;
typedef int MPI_Errhandler;
typedef int MPI_Request;
typedef int MPI_Info;
typedef int MPI_Fint;

So the C compiler sees all the opaque types as an 'int' ... Then you
do not get any actual type-checking from the C compiler ...

So, for this call:

int MPI_Isend(void*, int, MPI_Datatype, int, int, MPI_Comm, MPI_Request *);

I can write the code below, and it compiles without a single
warning... but that code is actually broken...
#include <mpi.h>
int main( int argc, char ** argv )
{
  int a=0,b=0,c=0,d=0,e=0,f=0;
  MPI_Isend(0,a, b, c, d, e, &f);
  return 0;
}

Fortunately, telling Cython that MPI_Comm, MPI_Datatype, and
MPI_Request are pointers to incomplete structs saves me from the lack
of type-checking in C land.

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu Tue Dec 15 00:40:09 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Mon, 14 Dec 2009 15:40:09 -0800
Subject: [Cython] Idea for automatic encoding and decoding
In-Reply-To: <4B24CBD9.3000202@behnel.de>
References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de>
Message-ID: <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu>

On Dec 13, 2009, at 3:11 AM, Stefan Behnel wrote:

> Robert Bradshaw, 13.12.2009 10:51:
>> On Dec 12, 2009, at 11:35 PM, Stefan Behnel wrote:
>>> So I think the right solution is to support automatic conversion
>>> *only* at the Python call boundary, i.e. for Python function
>>> parameters and return values.
>>
>> I disagree. Most of the examples here have been very simple, but in
>> general Python/C boundary need not be cleanly aligned with the Python
>> call boundary. Some more general examples would be
>>
>> cdef extern from "foo.h":
>>     cdef cblarg(int i, char*):
>>
>> def blarg(obj):
>>     cblarg(obj.id, obj.name) # I realize I'm assuming
>> name is not a dynamically generated attribute...
>>
>> or even
>>
>> def barg_all(list L):
>>     for i, a in enumerate(L):
>>         cblarg(i, a)
>
> I guess I'm still not used to passing arbitrary user values into a C
> function call without doing some kind of parameter checking beforehand.
> That's different for function arguments, where only the encoding would
> happen automatically (and would raise an appropriate error on failure),
> and the result would still be a safe Python bytes object that users can
> validate in any way they want, without having to care about 0 bytes
> silently becoming end markers.
>
> We are still talking about two different use cases here. One deals with
> automatic encoding of unicode strings into byte strings on input and
> with automatic decoding of byte strings (or char*) on the way out.

Yep, though they are related. I would imagine that most (though not
all) of the time an API would return the same kind of string it
expects.

> The other use case deals with automatic coercion of Python string
> objects to char*, which is what you show above. I personally think it's
> good to keep those separate.
>
> Remember that you mentioned the performance issue of a char* vs. a
> Python object parameter when the function is called from Cython code?
> The only place where this matters is for cpdef functions, and that
> should be rare enough to ignore it and require an explicit wrapper
> function,

Not if we start using a def -> cpdef optimization by default.

> as it's quite likely that user input would have to be validated
> separately anyway.
>
> To make this clear: I don't think it's worth encouraging users to drop
> input validation in favour of automatic and unsafe coercion.

I don't think users doing input validation are going to stop doing
input validation because of an easier str -> char* conversion option.
I'm also skeptical that having to manually do str -> bytes -> char*
encourages input validation. Validation is good.
Shunning user friendliness to try to enforce validation is not (in my mind) so good. > >> I'm all for making string encodings easier to use, though as I've >> said >> encode() and decode() seem to be a clean enough solution for nearly >> everything but argument parsing. > > That seems to match my distinction above then. > > >> However (and maybe this belongs on the other thread), you are >> completely skirting the issue of being able to declare the encoding >> for a block of code in one place, rather than having to specify it >> every single place it is used. > > Yes, the above would actually be orthogonal to that feature. > Although I'm > not sure simply saying > > def func(bytes s): > ... > > plus a global setting somewhere at the top of your code is really > readable > enough as "this function accepts unicode strings which get converted > automatically". And, no, I don't think typing the input parameter as > "str" > is what people want in most cases. I'm really leaning towards the > assumption that most people really *want* bytes as basic string > input type > in their Cython code. Either that, or exactly unicode strings. Not > 'str'. I agree with you for Py3, but Py2 is an important target, arguably more important than Py3 at this point in time (until numpy and the rest of the scientific world moves over), and will be with us for at least a while longer. > >> I initially thought your concern with >> char* <-> unicode conversion was the ambiguity in what character set >> to use, which I was proposing could be declared at a higher than >> case- >> by-case level. Is there another reason it is vital that the encoding >> step and/or parameters be reiterated at every instance they are used? > > I don't like code redundancy either. But making up a default should > only be > the second step after fixing the semantics of the feature that has > this > default. I think they're relatively orthogonal. 
Most of the discussion has been about adding new types, new syntax, mutating objects from one type to another, etc. and the semantics of doing all that are much less clear than "if an encoding is needed, use this one rather than bailing..." - Robert From stefan_ml at behnel.de Tue Dec 15 09:06:33 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 15 Dec 2009 09:06:33 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de> <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu> Message-ID: <4B274389.6070800@behnel.de> Robert Bradshaw, 15.12.2009 00:40: > I don't think users doing input validation are going to stop doing > input validation because of an easier str -> char* conversion option. > I'm also skeptical that having to manually do str -> byes -> char* > encourages input validation. Validation is good. Shunning user > friendliness to try to enforce validation is not (in my mind) so good. The only case I really care about here are 0 bytes. Besides that case, 'bytes' and 'char*' are basically equivalent (or should be, at least), except for memory management, which is the main advantage of the bytes type. >> I'm not sure simply saying >> >> def func(bytes s): >> ... >> >> plus a global setting somewhere at the top of your code is really >> readable >> enough as "this function accepts unicode strings which get converted >> automatically". And, no, I don't think typing the input parameter as >> "str" >> is what people want in most cases. I'm really leaning towards the >> assumption that most people really *want* bytes as basic string >> input type >> in their Cython code. Either that, or exactly unicode strings. Not >> 'str'. 
> > I agree with you for Py3, but Py2 is an important target, arguably > more important than Py3 at this point in time (until numpy and the > rest of the scientific world moves over), and will be with us for at > least a while longer. In Py2, 'str' is 'bytes', and my statement certainly holds for Py2. Honestly, what would you want with an input data type that suddenly switches to something completely different when you compile your code in Py3? If you want encoded bytes input in Py2, you most likely want encoded bytes input in Py3 as well (see the Wiki page I started). And if you want unicode in Py2, you surely want unicode in Py3. > I think they're relatively orthogonal. Most of the discussion has been > about adding new types, new syntax, mutating objects from one type to > another, etc. and the semantics of doing all that are much less clear > than "if an encoding is needed, use this one rather than bailing..." If that's so clear, then please answer the following: when is an encoding needed? Is that only when coercing between char* and Python strings, or also when coercing between bytes/unicode? Will there be a different handling for function signatures, or will it work the same everywhere? I.e. will a "def func(bytes b)" function always accept unicode, and what is the way to disable that? Or will only "def func(char*)" accept unicode input? And will the latter still accept bytes input? Not so clear to me, at least, and certainly not obvious. Stefan From Chris.Barker at noaa.gov Wed Dec 16 19:38:01 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 16 Dec 2009 10:38:01 -0800 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B2373FF.2030406@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> Message-ID: <4B292909.5010300@noaa.gov> Speaking as a user that is still confused about many implementation issues... 
Greg Ewing wrote:
> Suppose we have a way of expressing a type parameterised
> with an encoding, maybe something like
>
>    encoding[name]
>
> We could have a few predefined ones, such as
>
>    ctypedef encoding['ascii'] ascii
>    ctypedef encoding['utf8'] utf8
>    ctypedef encoding['latin1'] latin1

I like this -- something like this would be really helpful. I posted a similar note a little while back, but to repeat:

From the user's side, I am generally thinking of a given variable as working with "text" or "data". For text, a string makes sense, for data, a bytes object makes sense. It so happens that in Py2 and C, both are stored the same way, which is the source of all this mess.

Nevertheless, if I'm writing a Cython method, I'll know the nature of the data I'm working with, and I'll use the above types for text-like data. In the case of ascii, of course, the actual bytes will be the same as a bytes object, and that will be a very common case for methods that need to pass a char* on to C code, and will work well particularly for things like flags and the like -- little ascii strings that are kind of like data.

Stefan Behnel wrote:
> I don't think "encoding" is a good name for a type, though. The purpose of
> names of that type is to hold data, not encodings.

How about something like "ustring"?

Am I missing something, or is this a lot like a unicode object, but with the encoding statically defined? Kind of like a numpy array with the datatype statically defined?

Robert Bradshaw wrote:
> Would
>
> def flump(utf8 s):
>      return s
>
> return a bytes object?

I would expect it to return a unicode object -- in Python, I'd expect bytes+encoding to be returned as a unicode object -- it's the only way not to lose the encoding information. Of course, in Py2, you might expect an ascii string returned, rather than unicode or bytes -- arrrgg! But I could live with unicode.

Stefan Behnel wrote:
> So most my-data-is-not-unicode users would want to make sure that they
> always get an easy-to-use bytes object on the way in and that the return
> value is an easy-to-use Python value, i.e. it follows the normal platform
> str type: bytes on Py2 and unicode on Py3.

But what about non-ansi encodings? Does a string make sense here? The Python code would then need to know the encoding to do anything intelligent with it.

Greg Ewing wrote:
> Yes, I realize it doesn't fully address your use case.
> It's more aimed at people who think a blanket declaration
> would be too implicit and error-prone.

I agree -- while ascii is such a common case that a blanket declaration would be useful, I'd rather declare it where I need it. It seems it wouldn't be hard to convert existing code, if what you want is ascii everywhere.

Stefan Behnel wrote:
> To fill this with a bit of background, I started writing up a couple of
> thoughts on use cases that I think are relevant here.
>
> http://wiki.cython.org/enhancements/stringcoercion

Thanks -- I do think that's helpful.

Robert Bradshaw wrote:
> Yep, though they are related. I would imagine that most (though not
> all) of the time one an API would return the same kind of string it
> expects.

Sure, though with Python's duck typing, it's pretty common for a function to always return a given type, but accept anything that can be converted to that type. At least in my code.

Stefan Behnel wrote:
> If that's so clear, then please answer the following: when is an encoding
> needed? Is that only when coercing between char* and Python strings, or
> also when coercing between bytes/unicode?

Certainly the latter. You can't convert from bytes to unicode without defining an encoding -- can you?

> Will there be a different
> handling for function signatures, or will it work the same everywhere? I.e.
> will a "def func(bytes b)" function always accept unicode,

I don't think it should (aside from maybe backward compatibility--sigh). Again, I use bytes for data, unicode for text. Yes, an encoded string can be data, and stored in bytes, but I would use that only explicitly.

> Or will only "def func(char*)" accept unicode input?

Actually, I think char* is really analogous to bytes, and shouldn't accept unicode -- again -- dangerous without encoding information.

I guess the short version is that coercion between unicode and bytes (or char*) should only be done explicitly. That could mean that you've explicitly defined a global encoding, but I think that's too subtle, really; I'd rather it was declared where it was used.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From greg.ewing at canterbury.ac.nz  Thu Dec 17 00:14:20 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 Dec 2009 12:14:20 +1300
Subject: [Cython] Idea for automatic encoding and decoding
In-Reply-To: <4B292909.5010300@noaa.gov>
References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> <4B292909.5010300@noaa.gov>
Message-ID: <4B2969CC.8050000@canterbury.ac.nz>

Christopher Barker wrote:
> Robert Bradshaw wrote:
>
>>Would
>>
>>def flump(utf8 s):
>>     return s
>>
>>return a bytes object?
>
> I would expect it to return a unicode object -- in Python, I'd expect
> bytes+encoding to be returned as a unicode object -- it's the only way
> not to lose the encoding information.

I've been thinking something similar myself. Perhaps there should be a rule that the encoded-bytes types are only for "internal" use by Cython code, and whenever one gets coerced to a generic Python object, it gets decoded into a unicode string.
I think that would allow us to drop the C versions of the encoded types altogether, and write things like cdef extern from "somewhere.h": char *cflump(char *) def utf8 flump(utf8 s): return cflump(s) Advantages of this are that all the declarations are now symmetrical and there is no need for any encoding declarations on the C side. A disadvantage is that it may not be obvious that flump() actually returns a unicode string despite being declared as returning utf8. If you wanted it to actually return a bytes object, you would have to write def bytes flump(utf8 s): return cflump(s) >>Will there be a different >>handling for function signatures, or will it work the same everywhere? I.e. >>will a "def func(bytes b)" function always accept unicode, Not under my version of the proposal -- there is only automatic conversion between unicode and a bytes type with a declared encoding. Unicode and plain bytes are still incompatible. -- Greg From stefan_ml at behnel.de Thu Dec 17 09:20:17 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 Dec 2009 09:20:17 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B2969CC.8050000@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> <4B292909.5010300@noaa.gov> <4B2969CC.8050000@canterbury.ac.nz> Message-ID: <4B29E9C1.5090503@behnel.de> Greg Ewing, 17.12.2009 00:14: >>> Will there be a different >>> handling for function signatures, or will it work the same everywhere? I.e. >>> will a "def func(bytes b)" function always accept unicode, > > Not under my version of the proposal -- there is only > automatic conversion between unicode and a bytes type > with a declared encoding. Unicode and plain bytes are > still incompatible. That's my preference, too. And I think having a special unicode type that coerces to and from a specifically encoded Python byte string makes sense. 
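
Editor's illustration: the rule discussed above (unicode is encoded on the way in, encoded bytes are decoded on the way out, and plain bytes are never mixed with unicode) can be approximated in plain Python 3. This is only a sketch under stated assumptions: the decorator name `encoded` and the helper `byte_length` are hypothetical, nothing here is Cython syntax or an existing Cython feature, and `str` stands in for the thread's "unicode" type.

```python
import functools


def encoded(encoding):
    """Hypothetical stand-in for a parameter type with a declared encoding."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(text):
            # Plain bytes stay incompatible with unicode: only text is accepted.
            if not isinstance(text, str):
                raise TypeError(
                    "expected a unicode string, got %s" % type(text).__name__)
            # Encode on the way in, so the body sees bytes (as a char* would).
            result = func(text.encode(encoding))
            # Decode on the way out, so callers never see the encoded bytes.
            if isinstance(result, bytes):
                return result.decode(encoding)
            return result
        return wrapper
    return decorate


@encoded("utf-8")
def flump(s):
    # s is a bytes object here
    return s


@encoded("utf-8")
def byte_length(s):
    # Shows that the body really works on the encoded form
    return len(s)
```

With this sketch, `flump` returns a unicode string even though its body only ever handles bytes, which mirrors the "not obvious" return type mentioned in the thread.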
Stefan

From dalcinl at gmail.com  Thu Dec 17 16:16:40 2009
From: dalcinl at gmail.com (Lisandro Dalcín)
Date: Thu, 17 Dec 2009 12:16:40 -0300
Subject: [Cython] remove old cruft
Message-ID:

Cython contains some code to C-compile&link generated C sources. This likely comes from Pyrex. I would like to remove all that (we already have pyximport, and there are better ways to implement .pyx -> .so|.pyd):

This change is going to require:

1) Fix Cython/Compiler/Main.py in a few places

2) rm -r Cython/Mac

3) rm -r Cython/Unix

4) adjust setup.py for changes in (2) and (3)

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no  Thu Dec 17 16:21:42 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 17 Dec 2009 16:21:42 +0100
Subject: [Cython] remove old cruft
In-Reply-To:
References:
Message-ID: <4B2A4C86.8080908@student.matnat.uio.no>

Lisandro Dalcín wrote:
> Cython contains some code to C-compile&link generated C sources. This
> likely comes from Pyrex. I would like to remove all that (we already
> have pyximport, and there are better ways to implement .pyx ->
> .so|.pyd):
>
> This change is going to require:
>
> 1) Fix Cython/Compiler/Main.py in a few places
>

I'd appreciate it if you stayed away from the parts containing pipeline construction -- in the kurt-gsoc branch I've moved it to another file (Pipeline.py) which makes merging a pain (I hope to be able to merge it soon). Other than that, go for it IMO.
Dag Sverre > 2) rm -r Cython/Mac > > 3) rm -r Cython/Unix > > 4) adjust setup.py for changes in (2) y (3) > > > > From julien at danjou.info Thu Dec 17 16:25:31 2009 From: julien at danjou.info (Julien Danjou) Date: Thu, 17 Dec 2009 16:25:31 +0100 Subject: [Cython] [PATCH] Fix usage of elif on undefined values Message-ID: <1261063531-8183-1-git-send-email-julien@danjou.info> This kills a compilation warning. Signed-off-by: Julien Danjou --- Cython/Compiler/Nodes.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py index e6b0048..dfe94d6 100644 --- a/Cython/Compiler/Nodes.py +++ b/Cython/Compiler/Nodes.py @@ -4822,7 +4822,7 @@ utility_function_predeclarations = \ """ #ifdef __GNUC__ #define INLINE __inline__ -#elif _WIN32 +#elif defined(_WIN32) #define INLINE __inline #else #define INLINE -- 1.6.5.4 From stefan_ml at behnel.de Thu Dec 17 16:29:14 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 Dec 2009 16:29:14 +0100 Subject: [Cython] remove old cruft In-Reply-To: References: Message-ID: <4B2A4E4A.9010704@behnel.de> Lisandro Dalc?n, 17.12.2009 16:16: > Cython contains some code to C-compile&link generated C sources. This > likely comes from Pyrex. I would like to remove all that (we already > have pyximport, and there are better ways to implement .pyx -> > .so|.pyd): > > This change is going to require: > > 1) Fix Cython/Compiler/Main.py in a few places > > 2) rm -r Cython/Mac > > 3) rm -r Cython/Unix > > 4) adjust setup.py for changes in (2) y (3) +1, make that "rm -fr". Stefan From dagss at student.matnat.uio.no Thu Dec 17 16:58:04 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 17 Dec 2009 16:58:04 +0100 Subject: [Cython] remove old cruft In-Reply-To: References: Message-ID: <4B2A550C.2030407@student.matnat.uio.no> Lisandro Dalc?n wrote: > Cython contains some code to C-compile&link generated C sources. This > likely comes from Pyrex. 
I would like to remove all that (we already > have pyximport, and there are better ways to implement .pyx -> > .so|.pyd): > > This change is going to require: > > 1) Fix Cython/Compiler/Main.py in a few places > > 2) rm -r Cython/Mac > > 3) rm -r Cython/Unix > > 4) adjust setup.py for changes in (2) y (3) > > 5) Remove some command line argument parsing in CmdLine.py Thanks! Dag Sverre From dalcinl at gmail.com Thu Dec 17 18:41:20 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Thu, 17 Dec 2009 14:41:20 -0300 Subject: [Cython] running Cython from a zip file Message-ID: I'm experimenting with running Cython from a zip file containing the minimum code required to get a fully functional Cython compiler. So far, I can make a zipfile of about 290K $ ./mklite.sh ../cython-devel $ ls -alh cython.zip -rw-r--r--. 1 dalcinl users 289K 2009-12-17 14:01 cython.zip Then I can run cython from the zip file setting PYTHONPATH and using 'python -m' $ touch tmp.pyx $ PYTHONPATH=./cython.zip python -m cython tmp.pyx Unable to hash scanner source file ([Errno 20] Not a directory: '/u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.py') Creating lexicon... Done (0.15 seconds) Warning: Unable to save pickled lexicon in /u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.pickle but these warnings are REALLY annoying. Assuming I could determine in Cython sources that Cython is being used from a zipfile (ideas?) ... Could I add some hackery to silent these annoying warnings? 
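
Editor's illustration: one way to answer the "(ideas?)" question above is to walk up from a module's `__file__` and look for a path component that is a regular file and a valid zip archive. This is a sketch only; the function name `containing_zip` is hypothetical and is not part of Cython.

```python
import os
import zipfile


def containing_zip(module_file):
    """Return the zip archive that holds module_file, or None if there is none."""
    path = os.path.abspath(module_file)
    while True:
        # A path such as .../cython.zip/Cython/Compiler/Lexicon.py has a
        # regular file (the archive) where a directory would be expected.
        if os.path.isfile(path) and zipfile.is_zipfile(path):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding an archive.
            return None
        path = parent
```

A caller could use `containing_zip(__file__) is not None` to decide that writing Lexicon.pickle next to the module is not worth attempting.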

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dagss at student.matnat.uio.no  Thu Dec 17 19:01:20 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 17 Dec 2009 19:01:20 +0100
Subject: [Cython] running Cython from a zip file
In-Reply-To:
References:
Message-ID: <4B2A71F0.1030105@student.matnat.uio.no>

Lisandro Dalcín wrote:
> I'm experimenting with running Cython from a zip file containing the
> minimum code required to get a fully functional Cython compiler.
>
> So far, I can make a zipfile of about 290K
>
> $ ./mklite.sh ../cython-devel
> $ ls -alh cython.zip
> -rw-r--r--. 1 dalcinl users 289K 2009-12-17 14:01 cython.zip
>
> Then I can run cython from the zip file setting PYTHONPATH and using 'python -m'
>
> $ touch tmp.pyx
> $ PYTHONPATH=./cython.zip python -m cython tmp.pyx
> Unable to hash scanner source file ([Errno 20] Not a directory:
> '/u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.py')
> Creating lexicon...
> Done (0.15 seconds)
> Warning: Unable to save pickled lexicon in
> /u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.pickle
>
> but these warnings are REALLY annoying.
>
> Assuming I could determine in Cython sources that Cython is being used
> from a zipfile (ideas?) ... Could I add some hackery to silent these
> annoying warnings?

I wouldn't even consider that hackery...the Python warnings module even has explicit code for temporary or permanent warning silencing (though I guess that might not be Py2.3).
-- Dag Sverre From stefan_ml at behnel.de Thu Dec 17 19:48:20 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 Dec 2009 19:48:20 +0100 Subject: [Cython] running Cython from a zip file In-Reply-To: <4B2A71F0.1030105@student.matnat.uio.no> References: <4B2A71F0.1030105@student.matnat.uio.no> Message-ID: <4B2A7CF4.7050006@behnel.de> Dag Sverre Seljebotn, 17.12.2009 19:01: > Lisandro Dalc?n wrote: >> I'm experimenting with running Cython from a zip file containing the >> minimum code required to get a fully functional Cython compiler. >> >> So far, I can make a zipfile of about 290K >> >> $ ./mklite.sh ../cython-devel >> $ ls -alh cython.zip >> -rw-r--r--. 1 dalcinl users 289K 2009-12-17 14:01 cython.zip >> >> Then I can run cython from the zip file setting PYTHONPATH and using 'python -m' >> >> $ touch tmp.pyx >> $ PYTHONPATH=./cython.zip python -m cython tmp.pyx >> Unable to hash scanner source file ([Errno 20] Not a directory: >> '/u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.py') >> Creating lexicon... >> Done (0.15 seconds) >> Warning: Unable to save pickled lexicon in >> /u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.pickle >> >> but these warnings are REALLY annoying. >> >> Assuming I could determine in Cython sources that Cython is being used >> from a zipfile (ideas?) ... Could I add some hackery to silent these >> annoying warnings? You could check if the directory you are trying to write to really is a directory. If not, we shouldn't try to use pickle. BTW, why isn't the lexicon pickled already in the case above? > I wouldn't even consider that hackery...the Python warnings module even > has explicit code for temporary or permanent warning silencing (though I > guess that might not be Py2.3). No need to silence a warning that we generate ourselves. 
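
Editor's illustration: the suggestion above (check that the target really is a directory, and otherwise skip pickling) could look roughly like the following. This is a sketch under stated assumptions; `try_save_lexicon` is a hypothetical name and this is not the code in Cython's scanner.

```python
import os
import pickle


def try_save_lexicon(lexicon, pickle_path):
    """Pickle lexicon to pickle_path if possible; return True on success."""
    directory = os.path.dirname(os.path.abspath(pickle_path))
    if not os.path.isdir(directory):
        # For example, the "directory" lives inside a zip archive:
        # skip silently instead of warning.
        return False
    try:
        with open(pickle_path, "wb") as f:
            pickle.dump(lexicon, f)
    except (IOError, OSError):
        # For example, a system install without write permission.
        return False
    return True
```

Returning a flag rather than printing leaves the decision about whether to warn to the caller.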
Stefan

From dalcinl at gmail.com  Thu Dec 17 23:28:38 2009
From: dalcinl at gmail.com (Lisandro Dalcín)
Date: Thu, 17 Dec 2009 19:28:38 -0300
Subject: [Cython] running Cython from a zip file
In-Reply-To: <4B2A7CF4.7050006@behnel.de>
References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de>
Message-ID:

On Thu, Dec 17, 2009 at 3:48 PM, Stefan Behnel wrote:
>
> Dag Sverre Seljebotn, 17.12.2009 19:01:
>> Lisandro Dalcín wrote:
>>> I'm experimenting with running Cython from a zip file containing the
>>> minimum code required to get a fully functional Cython compiler.
>>>
>>> So far, I can make a zipfile of about 290K
>>>
>>> $ ./mklite.sh ../cython-devel
>>> $ ls -alh cython.zip
>>> -rw-r--r--. 1 dalcinl users 289K 2009-12-17 14:01 cython.zip
>>>
>>> Then I can run cython from the zip file setting PYTHONPATH and using 'python -m'
>>>
>>> $ touch tmp.pyx
>>> $ PYTHONPATH=./cython.zip python -m cython tmp.pyx
>>> Unable to hash scanner source file ([Errno 20] Not a directory:
>>> '/u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.py')
>>> Creating lexicon...
>>> Done (0.15 seconds)
>>> Warning: Unable to save pickled lexicon in
>>> /u/dalcinl/Devel/Cython/LITE/cython.zip/Cython/Compiler/Lexicon.pickle
>>>
>>> but these warnings are REALLY annoying.
>>>
>>> Assuming I could determine in Cython sources that Cython is being used
>>> from a zipfile (ideas?) ... Could I add some hackery to silent these
>>> annoying warnings?
>
> You could check if the directory you are trying to write to really is a
> directory. If not, we shouldn't try to use pickle.
>

OK.

>
> BTW, why isn't the lexicon pickled already in the case above?
>

Lexicon.pickle is inside the zip file, but the Cython code is trying to read Lexicon.pickle as a "standard" filesystem file, so it fails. In short, the Lexicon.pickle packaged inside the zip is useless. Moreover, as

  Creating lexicon...
  Done (0.08 seconds)

is SO FAST, I'm not interested at all in complicating the zip file support for actually using a pickled Lexicon.

>
>> I wouldn't even consider that hackery...the Python warnings module even
>> has explicit code for temporary or permanent warning silencing (though I
>> guess that might not be Py2.3).

Well, the point is that since there is no (easy) way to use Lexicon.pickle when running Cython from a zip file, the warning is pointless... So I asked for your opinions on removing the warning, but ONLY in the case where I can figure out that Cython is actually being used from a zip file.

>
> No need to silence a warning that we generate ourselves.
>

Unless the warning is pointless, as in this very specific use case.

PS: It could seem that I'm worrying too much about this, but I want this feature to function really well. This is a really nice way to easily "upgrade"/"downgrade" your Cython version for building specific projects... This is going to be important if Cython ever makes its way into Python's stdlib. It is also important if you are using some Linux distro where Cython-0.11 is in the system install, but you need to build the -dev copy of some Cython-based project that requires 0.12... Then a user can download the big 4.3MB Cython-0.12.zip (or a lite, stripped version of 290KB like the one I'm trying to make) and get the -dev code cythonized.

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com  Thu Dec 17 23:40:12 2009
From: dalcinl at gmail.com (Lisandro Dalcín)
Date: Thu, 17 Dec 2009 19:40:12 -0300
Subject: [Cython] [PATCH] Fix usage of elif on undefined values
In-Reply-To: <1261063531-8183-1-git-send-email-julien@danjou.info>
References: <1261063531-8183-1-git-send-email-julien@danjou.info>
Message-ID:

On Thu, Dec 17, 2009 at 12:25 PM, Julien Danjou wrote:
> This kills a compilation warning.
>
> Signed-off-by: Julien Danjou
> ---
>  Cython/Compiler/Nodes.py |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py
> index e6b0048..dfe94d6 100644
> --- a/Cython/Compiler/Nodes.py
> +++ b/Cython/Compiler/Nodes.py
> @@ -4822,7 +4822,7 @@ utility_function_predeclarations = \
>  """
>  #ifdef __GNUC__
>  #define INLINE __inline__
> -#elif _WIN32
> +#elif defined(_WIN32)
>  #define INLINE __inline
>  #else
>  #define INLINE
> --
> 1.6.5.4
>

Mmm... What about the fix below? IIUC, __inline is a builtin keyword for MSVC, but not for every other C compiler running on Windows... Better safe than sorry...

$ hg diff Cython/Compiler/Nodes.py
diff -r d76177fc0796 Cython/Compiler/Nodes.py
--- a/Cython/Compiler/Nodes.py	Thu Dec 17 09:32:44 2009 +0100
+++ b/Cython/Compiler/Nodes.py	Thu Dec 17 19:38:15 2009 -0300
@@ -4820,9 +4820,9 @@
 
 utility_function_predeclarations = \
 """
-#ifdef __GNUC__
+#if defined(__GNUC__)
 #define INLINE __inline__
-#elif _WIN32
+#elif defined(_MSC_VER)
 #define INLINE __inline
 #else
 #define INLINE

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu  Fri Dec 18 00:24:59 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 17 Dec 2009 15:24:59 -0800
Subject: [Cython] running Cython from a zip file
In-Reply-To:
References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de>
Message-ID: <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu>

On Dec 17, 2009, at 2:28 PM, Lisandro Dalcín wrote:
> On Thu, Dec 17, 2009 at 3:48 PM, Stefan Behnel
> wrote:
>
>> No need to silence a warning that we generate ourselves.
>
> Unless the warning is pointless, as in this very specific use case.

This could also happen with a system install, where the first person to run Cython doesn't have the required permissions to write the pickle.

> PS: It could seem that I'm worrying too much about this, but I want
> this feature to function really well. This is really nice way to
> easily "upgrade"/"downgrade" your Cython version for building specific
> projects... This is going to be important if Cython ever gets its way
> to Python's stdlib. It is also important if you are using some Linux
> distro where Cython-0.11 is in the system install, but you need to
> build the -dev copy of some Cython-based project that requires 0.12...
> Then a user can download the big 4.3MB Cython-0.12.zip (or a lite, > striped version with 290KB like the one I'm trying to do) and get the > -dev code cythonized. I think this is a worthy usecase. > Creating lexicon... > Done (0.08 seconds) Another data point: $ touch empty.pyx $ time python cython.py empty.pyx Creating lexicon... Done (0.04 seconds) Pickling lexicon... Done (0.01 seconds) real 0m0.353s user 0m0.279s sys 0m0.071s cython-devel$ time python cython.py empty.pyx real 0m0.321s user 0m0.243s sys 0m0.073s Really, that's hardly any savings at all. Maybe at one point this was really expensive? In any case, I think it might make more sense to generate the pickle at install time (with a possible error), and silently ignore failed re-creating attempts otherwise. - Robert From robertwb at math.washington.edu Fri Dec 18 00:35:22 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Dec 2009 15:35:22 -0800 Subject: [Cython] remove old cruft In-Reply-To: <4B2A4E4A.9010704@behnel.de> References: <4B2A4E4A.9010704@behnel.de> Message-ID: On Dec 17, 2009, at 7:29 AM, Stefan Behnel wrote: > Lisandro Dalc?n, 17.12.2009 16:16: >> Cython contains some code to C-compile&link generated C sources. This >> likely comes from Pyrex. I would like to remove all that (we already >> have pyximport, and there are better ways to implement .pyx -> >> .so|.pyd): >> >> This change is going to require: >> >> 1) Fix Cython/Compiler/Main.py in a few places >> >> 2) rm -r Cython/Mac >> >> 3) rm -r Cython/Unix >> >> 4) adjust setup.py for changes in (2) y (3) > > +1, make that "rm -fr". +1 from me too. 
- Robert

From dalcinl at gmail.com  Fri Dec 18 00:37:00 2009
From: dalcinl at gmail.com (Lisandro Dalcín)
Date: Thu, 17 Dec 2009 20:37:00 -0300
Subject: [Cython] running Cython from a zip file
In-Reply-To: <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu>
References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu>
Message-ID:

On Thu, Dec 17, 2009 at 8:24 PM, Robert Bradshaw wrote:
>
> Really, that's hardly any savings at all. Maybe at one point this was
> really expensive? In any case, I think it might make more sense to
> generate the pickle at install time (with a possible error), and
> silently ignore failed re-creating attempts otherwise.
>

Well, I was tempted to propose exactly that, but I was a bit afraid of meeting strong opposition :-) ...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From robertwb at math.washington.edu  Fri Dec 18 00:38:26 2009
From: robertwb at math.washington.edu (Robert Bradshaw)
Date: Thu, 17 Dec 2009 15:38:26 -0800
Subject: [Cython] [PATCH] Fix usage of elif on undefined values
In-Reply-To:
References: <1261063531-8183-1-git-send-email-julien@danjou.info>
Message-ID: <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu>

On Dec 17, 2009, at 2:40 PM, Lisandro Dalcín wrote:
> On Thu, Dec 17, 2009 at 12:25 PM, Julien Danjou
> wrote:
>> This kills a compilation warning.
>> >> Signed-off-by: Julien Danjou >> --- >> Cython/Compiler/Nodes.py | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py >> index e6b0048..dfe94d6 100644 >> --- a/Cython/Compiler/Nodes.py >> +++ b/Cython/Compiler/Nodes.py >> @@ -4822,7 +4822,7 @@ utility_function_predeclarations = \ >> """ >> #ifdef __GNUC__ >> #define INLINE __inline__ >> -#elif _WIN32 >> +#elif defined(_WIN32) >> #define INLINE __inline >> #else >> #define INLINE >> -- >> 1.6.5.4 >> > > Mmm... What about the fix below? IIUC, __inline is a builtin keyword > for MSVC, but not for every other C compiler running on Windows... > Better safe than sorry... > > > $ hg diff Cython/Compiler/Nodes.py > diff -r d76177fc0796 Cython/Compiler/Nodes.py > --- a/Cython/Compiler/Nodes.py Thu Dec 17 09:32:44 2009 +0100 > +++ b/Cython/Compiler/Nodes.py Thu Dec 17 19:38:15 2009 -0300 > @@ -4820,9 +4820,9 @@ > > utility_function_predeclarations = \ > """ > -#ifdef __GNUC__ > +#if defined(__GNUC__) > #define INLINE __inline__ > -#elif _WIN32 > +#elif defined(_MSC_VER) > #define INLINE __inline > #else > #define INLINE Good point, please push. Are there any other compilers that we should single out? We heavily use the assumption that inlined functions actually get inlined for optimization purposes. 
- Robert From robertwb at math.washington.edu Fri Dec 18 00:44:02 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Dec 2009 15:44:02 -0800 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B29E9C1.5090503@behnel.de> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> <4B292909.5010300@noaa.gov> <4B2969CC.8050000@canterbury.ac.nz> <4B29E9C1.5090503@behnel.de> Message-ID: <01E2439A-9D4A-4A3D-8E47-57C5CDC3B7B5@math.washington.edu> On Dec 17, 2009, at 12:20 AM, Stefan Behnel wrote: > Greg Ewing, 17.12.2009 00:14: >>>> Will there be a different >>>> handling for function signatures, or will it work the same >>>> everywhere? I.e. >>>> will a "def func(bytes b)" function always accept unicode, >> >> Not under my version of the proposal -- there is only >> automatic conversion between unicode and a bytes type >> with a declared encoding. Unicode and plain bytes are >> still incompatible. > > That's my preference, too. And I think having a special unicode type > that > coerces to and from a specifically encoded Python byte string makes > sense. I actually think unicode -> bytes is weirder than unicode -> char* (because the latter is the best C has to offer for strings), though it's messier from a null-bytes and memory-management perspective. - Robert From dalcinl at gmail.com Fri Dec 18 00:49:28 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Thu, 17 Dec 2009 20:49:28 -0300 Subject: [Cython] remove old cruft In-Reply-To: References: <4B2A4E4A.9010704@behnel.de> Message-ID: On Thu, Dec 17, 2009 at 8:35 PM, Robert Bradshaw wrote: > On Dec 17, 2009, at 7:29 AM, Stefan Behnel wrote: > >> Lisandro Dalc?n, 17.12.2009 16:16: >>> Cython contains some code to C-compile&link generated C sources. This >>> likely comes from Pyrex. 
I would like to remove all that (we already >>> have pyximport, and there are better ways to implement .pyx -> >>> .so|.pyd): >>> >>> This change is going to require: >>> >>> 1) Fix Cython/Compiler/Main.py in a few places >>> >>> 2) rm -r Cython/Mac >>> >>> 3) rm -r Cython/Unix >>> >>> 4) adjust setup.py for changes in (2) y (3) >> >> +1, make that "rm -fr". > > +1 from me too. > http://hg.cython.org/cython-devel/rev/adcb695965d7 -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Dec 18 00:56:32 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Thu, 17 Dec 2009 20:56:32 -0300 Subject: [Cython] [PATCH] Fix usage of elif on undefined values In-Reply-To: <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu> References: <1261063531-8183-1-git-send-email-julien@danjou.info> <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu> Message-ID: On Thu, Dec 17, 2009 at 8:38 PM, Robert Bradshaw wrote: > On Dec 17, 2009, at 2:40 PM, Lisandro Dalc?n wrote: > >> On Thu, Dec 17, 2009 at 12:25 PM, Julien Danjou >> wrote: >>> This kills a compilation warning. >>> >>> Signed-off-by: Julien Danjou >>> --- >>> ?Cython/Compiler/Nodes.py | ? ?2 +- >>> ?1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py >>> index e6b0048..dfe94d6 100644 >>> --- a/Cython/Compiler/Nodes.py >>> +++ b/Cython/Compiler/Nodes.py >>> @@ -4822,7 +4822,7 @@ utility_function_predeclarations = \ >>> ?""" >>> ?#ifdef __GNUC__ >>> ?#define INLINE __inline__ >>> -#elif _WIN32 >>> +#elif defined(_WIN32) >>> ?#define INLINE __inline >>> ?#else >>> ?#define INLINE >>> -- >>> 1.6.5.4 >>> >> >> Mmm... What about the fix below? 
IIUC, __inline is a builtin keyword >> for MSVC, but not for every other C compiler running on Windows... >> Better safe than sorry... >> >> >> $ hg diff Cython/Compiler/Nodes.py >> diff -r d76177fc0796 Cython/Compiler/Nodes.py >> --- a/Cython/Compiler/Nodes.py ? ? ? ?Thu Dec 17 09:32:44 2009 +0100 >> +++ b/Cython/Compiler/Nodes.py ? ? ? ?Thu Dec 17 19:38:15 2009 -0300 >> @@ -4820,9 +4820,9 @@ >> >> utility_function_predeclarations = \ >> """ >> -#ifdef __GNUC__ >> +#if defined(__GNUC__) >> #define INLINE __inline__ >> -#elif _WIN32 >> +#elif defined(_MSC_VER) >> #define INLINE __inline >> #else >> #define INLINE > > Good point, please push. Are there any other compilers that we should > single out? We heavily use the assumption that inlined functions > actually get inlined for optimization purposes. > Intel? PathScale? PGI? Borland? (Open) Watcom? I can do it for Intel and PathScale ... -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Dec 18 01:08:35 2009 From: dalcinl at gmail.com (=?UTF-8?Q?Lisandro_Dalc=C3=ADn?=) Date: Thu, 17 Dec 2009 21:08:35 -0300 Subject: [Cython] [PATCH] Fix usage of elif on undefined values In-Reply-To: References: <1261063531-8183-1-git-send-email-julien@danjou.info> <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu> Message-ID: On Thu, Dec 17, 2009 at 8:56 PM, Lisandro Dalc?n wrote: > On Thu, Dec 17, 2009 at 8:38 PM, Robert Bradshaw > wrote: >> On Dec 17, 2009, at 2:40 PM, Lisandro Dalc?n wrote: >> >>> On Thu, Dec 17, 2009 at 12:25 PM, Julien Danjou >>> wrote: >>>> This kills a compilation warning. >>>> >>>> Signed-off-by: Julien Danjou >>>> --- >>>> ?Cython/Compiler/Nodes.py | ? 
?2 +- >>>> ?1 files changed, 1 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/Cython/Compiler/Nodes.py b/Cython/Compiler/Nodes.py >>>> index e6b0048..dfe94d6 100644 >>>> --- a/Cython/Compiler/Nodes.py >>>> +++ b/Cython/Compiler/Nodes.py >>>> @@ -4822,7 +4822,7 @@ utility_function_predeclarations = \ >>>> ?""" >>>> ?#ifdef __GNUC__ >>>> ?#define INLINE __inline__ >>>> -#elif _WIN32 >>>> +#elif defined(_WIN32) >>>> ?#define INLINE __inline >>>> ?#else >>>> ?#define INLINE >>>> -- >>>> 1.6.5.4 >>>> >>> >>> Mmm... What about the fix below? IIUC, __inline is a builtin keyword >>> for MSVC, but not for every other C compiler running on Windows... >>> Better safe than sorry... >>> >>> >>> $ hg diff Cython/Compiler/Nodes.py >>> diff -r d76177fc0796 Cython/Compiler/Nodes.py >>> --- a/Cython/Compiler/Nodes.py ? ? ? ?Thu Dec 17 09:32:44 2009 +0100 >>> +++ b/Cython/Compiler/Nodes.py ? ? ? ?Thu Dec 17 19:38:15 2009 -0300 >>> @@ -4820,9 +4820,9 @@ >>> >>> utility_function_predeclarations = \ >>> """ >>> -#ifdef __GNUC__ >>> +#if defined(__GNUC__) >>> #define INLINE __inline__ >>> -#elif _WIN32 >>> +#elif defined(_MSC_VER) >>> #define INLINE __inline >>> #else >>> #define INLINE >> > >> Good point, please push. Are there any other compilers that we should >> single out? We heavily use the assumption that inlined functions >> actually get inlined for optimization purposes. >> > > Intel? PathScale? PGI? Borland? (Open) Watcom? > > I can do it for Intel and PathScale ... > BTW, we should protect all these definitions of INLINE inside an outer #ifndef INLINE ... #endif. That way, in the face of a compiler Cython is not aware of, we can pass -DINLINE=something and make it work. What do you think? 
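A minimal sketch of the outer guard proposed above, held as a Python string the way Nodes.py holds its C preamble. The C99 branch and the names `INLINE_PREAMBLE` and `conditionals_balanced` are assumptions for illustration; this is not the committed code.

```python
# Sketch only: a guarded INLINE block as a Python string, as Nodes.py does
# for its C preamble. The C99 branch is an assumption taken from the thread.
INLINE_PREAMBLE = """
#ifndef INLINE
  #if defined(__GNUC__)
    #define INLINE __inline__
  #elif defined(_MSC_VER)
    #define INLINE __inline
  #elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    #define INLINE inline
  #else
    #define INLINE
  #endif
#endif
"""


def conditionals_balanced(c_source):
    """Return True if every #if/#ifdef/#ifndef has a matching #endif."""
    depth = 0
    for line in c_source.splitlines():
        directive = line.strip()
        if directive.startswith(("#ifdef", "#ifndef", "#if ")):
            depth += 1
        elif directive.startswith("#endif"):
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

With the outer #ifndef in place, building with -DINLINE=something skips the whole block, so a compiler Cython does not know about can still be given a working definition.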
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robertwb at math.washington.edu Fri Dec 18 01:23:25 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Dec 2009 16:23:25 -0800 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B274389.6070800@behnel.de> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de> <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu> <4B274389.6070800@behnel.de> Message-ID: <9BA4E1EB-38E4-4265-BF01-3E5631BD58EC@math.washington.edu> On Dec 15, 2009, at 12:06 AM, Stefan Behnel wrote: > Robert Bradshaw, 15.12.2009 00:40: >> I don't think users doing input validation are going to stop doing >> input validation because of an easier str -> char* conversion option. >> I'm also skeptical that having to manually do str -> bytes -> char* >> encourages input validation. Validation is good. Shunning user >> friendliness to try to enforce validation is not (in my mind) so >> good. > > The only case I really care about here are 0 bytes. Besides that case, > 'bytes' and 'char*' are basically equivalent (or should be, at least), > except for memory management, which is the main advantage of the > bytes type. The other difference is that introducing object -> object conversions, while solving the memory issue, makes the language semantics much messier. For example "<bytes>o is o" would no longer always hold, and "<bytes>o" would no longer be shorthand for raising an error if not isinstance(o, bytes), and that's just for explicit coercions.
Any magic that happens is much less surprising on the Python/C boundary, as it's already obvious something non-trivial is going on there. (That being said, something like o is overt enough to diminish the level of surprise.) > >>> I'm not sure simply saying >>> >>> def func(bytes s): >>> ... >>> >>> plus a global setting somewhere at the top of your code is really >>> readable >>> enough as "this function accepts unicode strings which get converted >>> automatically". And, no, I don't think typing the input parameter as >>> "str" >>> is what people want in most cases. I'm really leaning towards the >>> assumption that most people really *want* bytes as basic string >>> input type >>> in their Cython code. Either that, or exactly unicode strings. Not >>> 'str'. >> >> I agree with you for Py3, but Py2 is an important target, arguably >> more important than Py3 at this point in time (until numpy and the >> rest of the scientific world moves over), and will be with us for at >> least a while longer. > > In Py2, 'str' is 'bytes', and my statement certainly holds for Py2. > Honestly, what would you want with an input data type that suddenly > switches to something completely different when you compile your > code in > Py3? If you want encoded bytes input in Py2, you most likely want > encoded > bytes input in Py3 as well (see the Wiki page I started). And if you > want > unicode in Py2, you surely want unicode in Py3. I wasn't trying to say people should type their arguments str, I was claiming that it's common to want to accept both bytes and unicode in Py2. This is what you said in the wiki "For Python 2.x, the code needs to deal with both str (bytes) and unicode, whereas it would only accept unicode strings (str) in Python 3." so I think we're in agreement here. > >> I think they're relatively orthogonal. Most of the discussion has >> been >> about adding new types, new syntax, mutating objects from one type to >> another, etc. 
and the semantics of doing all that are much less clear >> than "if an encoding is needed, use this one rather than bailing..." > > If that's so clear, then please answer the following: when is an > encoding > needed? Is that only when coercing between char* and Python strings, > or > also when coercing between bytes/unicode? Only object <-> char* would use a default encoding, everything else would be explicit. > Will there be a different > handling for function signatures, or will it work the same everywhere? That depends on how we are able to handle the memory. Ideally the same everywhere, but that may not be feasible. > I.e. > will a "def func(bytes b)" function always accept unicode, and what > is the > way to disable that? I was thinking not. > Or will only "def func(char*)" accept unicode input? Yep. > And will the latter still accept bytes input? Yes. One could make the case that in the treat-all-char*-as-encoded- text mode, bytes should be disallowed in Py3. > Not so clear to me, at least, and certainly not obvious. Too many other ideas floating around. I should write up a CEP. - Robert From robertwb at math.washington.edu Fri Dec 18 01:25:20 2009 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Thu, 17 Dec 2009 16:25:20 -0800 Subject: [Cython] [PATCH] Fix usage of elif on undefined values In-Reply-To: References: <1261063531-8183-1-git-send-email-julien@danjou.info> <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu> Message-ID: <87626171-8BD7-4B8F-A856-9BE61DAB8452@math.washington.edu> On Dec 17, 2009, at 4:08 PM, Lisandro Dalc?n wrote: >>> Good point, please push. Are there any other compilers that we >>> should >>> single out? We heavily use the assumption that inlined functions >>> actually get inlined for optimization purposes. >>> >> >> Intel? PathScale? PGI? Borland? (Open) Watcom? >> >> I can do it for Intel and PathScale ... 
Never heard of PathScale, but if you think there's a good chance of people compiling Cython code with it then it shouldn't hurt. Also, inline is part of the C99 standard; maybe we could check for that generically too. > > BTW, we should protect all these definitions of INLINE inside an outer > #ifndef INLINE ... #endif. That way, in the face of a compiler Cython > is not aware of, we can pass -DINLINE=something and make it work. What > do you think? Yes, we should. - Robert From dalcinl at gmail.com Fri Dec 18 01:47:34 2009 From: dalcinl at gmail.com (Lisandro Dalcín) Date: Thu, 17 Dec 2009 21:47:34 -0300 Subject: [Cython] [PATCH] Fix usage of elif on undefined values In-Reply-To: <87626171-8BD7-4B8F-A856-9BE61DAB8452@math.washington.edu> References: <1261063531-8183-1-git-send-email-julien@danjou.info> <47CEF261-6FA7-4B11-9B41-05AFC23251A9@math.washington.edu> <87626171-8BD7-4B8F-A856-9BE61DAB8452@math.washington.edu> Message-ID: On Thu, Dec 17, 2009 at 9:25 PM, Robert Bradshaw wrote: > On Dec 17, 2009, at 4:08 PM, Lisandro Dalcín wrote: > >>>> Good point, please push. Are there any other compilers that we >>>> should >>>> single out? We heavily use the assumption that inlined functions >>>> actually get inlined for optimization purposes. >>>> >>> >>> Intel? PathScale? PGI? Borland? (Open) Watcom? >>> >>> I can do it for Intel and PathScale ... > > Never heard of PathScale, but if you think there's a good chance of > people compiling Cython code with it then it shouldn't hurt. > Well, I build mpi4py on SiCortex machines (MIPS arch) with PathScale :-) > Also, > inline is part of the C99 standard; maybe we could check for that > generically too. > Of course. >> >> BTW, we should protect all these definitions of INLINE inside an outer >> #ifndef INLINE ... #endif. That way, in the face of a compiler Cython >> is not aware of, we can pass -DINLINE=something and make it work. What >> do you think? > > Yes, we should.
> This is a preliminary fix: http://hg.cython.org/cython-devel/rev/9918bc676467 -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From greg.ewing at canterbury.ac.nz Fri Dec 18 02:07:29 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 18 Dec 2009 14:07:29 +1300 Subject: [Cython] running Cython from a zip file In-Reply-To: References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> Message-ID: <4B2AD5D1.2040802@canterbury.ac.nz> Lisandro Dalcín wrote: > Moreover, as > > Creating lexicon... > Done (0.08 seconds) > > is SO FAST, I'm not interested at all in complicating the zip file > support for actually using a pickled Lexicon. The pickling mechanism was put in a long time ago when machines were slower, and building the lexicon took a noticeable amount of time. It could probably be dropped altogether nowadays. Or at least get rid of the warning -- it was mainly there to notify me when there was something wrong with the pickling system.
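As a rough illustration of the "get rid of the warning" option, here is a sketch of a cache that falls back silently when the pickle cannot be read or written. The names `build_lexicon` and `load_lexicon` are hypothetical stand-ins, not Cython's actual functions.

```python
import pickle


def build_lexicon():
    # Hypothetical stand-in for the real (more expensive) lexicon construction.
    return {"keywords": ["cdef", "cpdef", "def"]}


def load_lexicon(cache_path):
    """Load a cached lexicon if possible; otherwise rebuild it quietly."""
    try:
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        pass  # missing or unreadable cache: fall through and rebuild
    lexicon = build_lexicon()
    try:
        with open(cache_path, "wb") as f:
            pickle.dump(lexicon, f)
    except OSError:
        pass  # read-only install or zip file: skip caching, no warning
    return lexicon
```

If rebuilding takes a few hundredths of a second, as reported in the thread, dropping the cache entirely is simpler still.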
-- Greg From greg.ewing at canterbury.ac.nz Fri Dec 18 02:13:31 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 18 Dec 2009 14:13:31 +1300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <01E2439A-9D4A-4A3D-8E47-57C5CDC3B7B5@math.washington.edu> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B2373FF.2030406@canterbury.ac.nz> <4B292909.5010300@noaa.gov> <4B2969CC.8050000@canterbury.ac.nz> <4B29E9C1.5090503@behnel.de> <01E2439A-9D4A-4A3D-8E47-57C5CDC3B7B5@math.washington.edu> Message-ID: <4B2AD73B.20904@canterbury.ac.nz> Robert Bradshaw wrote: > I actually think unicode -> bytes is weirder than unicode -> char* > (because the latter is the best C has to offer for strings), though > it's messier from a null-bytes and memory-management perspective. Another way of thinking about it is that unicode -> bytes is the same thing as unicode -> char*; it's just that you're making use of Python to manage the memory for you. -- Greg From greg.ewing at canterbury.ac.nz Fri Dec 18 02:23:36 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 18 Dec 2009 14:23:36 +1300 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <9BA4E1EB-38E4-4265-BF01-3E5631BD58EC@math.washington.edu> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de> <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu> <4B274389.6070800@behnel.de> <9BA4E1EB-38E4-4265-BF01-3E5631BD58EC@math.washington.edu> Message-ID: <4B2AD998.3080509@canterbury.ac.nz> Robert Bradshaw wrote: > For example "<bytes>o is o" would no longer always hold, If you don't like that idea, then come up with another syntax for in-line type coercions, like C++ has done. >> Or will only "def func(char*)" accept unicode input? > > Yep. In my world that would only accept bytes.
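The two positions on what a char* argument should accept can be modelled in plain Python 3 (where str is the unicode type). This is a sketch only: `as_c_string` is a hypothetical helper, not a Cython API, and the 0-byte check reflects the concern raised earlier in the thread.

```python
def as_c_string(obj, encoding=None):
    """Return the bytes that would back a char* argument.

    encoding=None models the strict rule (bytes only); passing an encoding
    models the proposed directive that also accepts unicode text.
    """
    if isinstance(obj, bytes):
        data = obj
    elif isinstance(obj, str) and encoding is not None:
        data = obj.encode(encoding)
    else:
        raise TypeError("expected bytes, got %s" % type(obj).__name__)
    if b"\x00" in data:
        # A char* stops at the first 0 byte, so the tail would be lost silently.
        raise ValueError("embedded null byte")
    return data
```

Under the strict rule, as_c_string("abc") raises TypeError; with encoding="utf-8" the same call returns the encoded bytes, which is the convenience the directive is meant to buy.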
-- Greg From stefan_ml at behnel.de Fri Dec 18 08:10:44 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Dec 2009 08:10:44 +0100 Subject: [Cython] running Cython from a zip file In-Reply-To: <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> Message-ID: <4B2B2AF4.10100@behnel.de> Robert Bradshaw, 18.12.2009 00:24: > On Dec 17, 2009, at 2:28 PM, Lisandro Dalc?n wrote: >> On Thu, Dec 17, 2009 at 3:48 PM, Stefan Behnel wrote: >> >>> No need to silence a warning that we generate ourselves. >> Unless the warning is pointless, as in this very specific use case. I actually meant that it would be easier to drop the warning altogether than to try to silence it using the warnings module. > This could also happen with a system install, where the first person > to run Cython doesn't have the required permissions to write the pickle. Then that's a good reason to actually remove the warning. If pickling fails, fine. >> PS: It could seem that I'm worrying too much about this, but I want >> this feature to function really well. This is really nice way to >> easily "upgrade"/"downgrade" your Cython version for building specific >> projects... This is going to be important if Cython ever gets its way >> to Python's stdlib. It is also important if you are using some Linux >> distro where Cython-0.11 is in the system install, but you need to >> build the -dev copy of some Cython-based project that requires 0.12... >> Then a user can download the big 4.3MB Cython-0.12.zip (or a lite, >> striped version with 290KB like the one I'm trying to do) and get the >> -dev code cythonized. > > I think this is a worthy usecase. Note that zip files don't support binaries, though. So a Cython version running from a zip file will be substantially slower than one that is properly installed with a compiled parser. 
Setuptools has some fake support for .so/.dll files in zip files, but it basically just copies them out of the zip and into a temporary directory to load them from there. I find that rather unintuitive and ugly behaviour. >> Creating lexicon... >> Done (0.08 seconds) > > Another data point: > > $ touch empty.pyx > $ time python cython.py empty.pyx > Creating lexicon... > Done (0.04 seconds) > Pickling lexicon... > Done (0.01 seconds) > > real 0m0.353s > user 0m0.279s > sys 0m0.071s > > cython-devel$ time python cython.py empty.pyx > > real 0m0.321s > user 0m0.243s > sys 0m0.073s > > Really, that's hardly any savings at all. Maybe at one point this was > really expensive? In any case, I think it might make more sense to > generate the pickle at install time (with a possible error), and > silently ignore failed re-creating attempts otherwise. Or just drop the pickling completely, as Greg suggested. Stefan From stefan_ml at behnel.de Fri Dec 18 08:15:19 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Dec 2009 08:15:19 +0100 Subject: [Cython] Idea for automatic encoding and decoding In-Reply-To: <4B2AD998.3080509@canterbury.ac.nz> References: <4B22F9CC.3090606@canterbury.ac.nz> <4B249935.4070001@behnel.de> <7B661FB0-367B-4516-AC13-BBBCD0F4A880@math.washington.edu> <4B24CBD9.3000202@behnel.de> <4CE79EB3-7E2A-463E-B209-9B61BB5435AD@math.washington.edu> <4B274389.6070800@behnel.de> <9BA4E1EB-38E4-4265-BF01-3E5631BD58EC@math.washington.edu> <4B2AD998.3080509@canterbury.ac.nz> Message-ID: <4B2B2C07.7040201@behnel.de> Greg Ewing, 18.12.2009 02:23: > Robert Bradshaw wrote: >>> Or will only "def func(char*)" accept unicode input? >> Yep. > > In my world that would only accept bytes. Sounds cleaner to me, too. 
Stefan From stefan_ml at behnel.de Fri Dec 18 08:37:56 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Dec 2009 08:37:56 +0100 Subject: [Cython] running Cython from a zip file In-Reply-To: <4B2B2AF4.10100@behnel.de> References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> <4B2B2AF4.10100@behnel.de> Message-ID: <4B2B3154.7060708@behnel.de> Stefan Behnel, 18.12.2009 08:10: > Robert Bradshaw, 18.12.2009 00:24: >> Really, that's hardly any savings at all. Maybe at one point this was >> really expensive? In any case, I think it might make more sense to >> generate the pickle at install time (with a possible error), and >> silently ignore failed re-creating attempts otherwise. > > Or just drop the pickling completely, as Greg suggested. Since I didn't read any major argument for keeping it in so far: http://hg.cython.org/cython-devel/rev/1a5be8d78c80 I only left the lazy initialisation in, the rest is gone. Should also make the zip file another bit smaller. ;-) Stefan From dalcinl at gmail.com Fri Dec 18 20:26:50 2009 From: dalcinl at gmail.com (Lisandro Dalcín) Date: Fri, 18 Dec 2009 16:26:50 -0300 Subject: [Cython] running Cython from a zip file In-Reply-To: <4B2B2AF4.10100@behnel.de> References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> <4B2B2AF4.10100@behnel.de> Message-ID: On Fri, Dec 18, 2009 at 4:10 AM, Stefan Behnel wrote: > > Note that zip files don't support binaries, though. So a Cython version > running from a zip file will be substantially slower than one that is > properly installed with a compiled parser. > And there is an additional gotcha: using the PXD's in Cython/Includes...
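One way to read data files such as .pxd sources out of a zipped package, without extracting anything, is the standard library's pkgutil.get_data. The sketch below builds a throwaway zip to demonstrate it; the package name `demo_zip_pkg` and both helper functions are made up for the example, and nothing here is taken from Cython's own loader.

```python
import importlib
import os
import pkgutil
import sys
import zipfile


def make_demo_zip(directory):
    """Write a tiny zipped package that ships a .pxd file as package data."""
    zip_path = os.path.join(directory, "demo_bundle.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("demo_zip_pkg/__init__.py", "")
        zf.writestr("demo_zip_pkg/Includes/hello.pxd", "cdef int answer\n")
    return zip_path


def read_pxd_from_zip(zip_path):
    """Put the zip on sys.path and read the .pxd through the import system."""
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    try:
        # Returns the file contents as bytes, straight from the archive.
        return pkgutil.get_data("demo_zip_pkg", "Includes/hello.pxd")
    finally:
        sys.path.remove(zip_path)
```

Because the data comes back as bytes in memory, this works for text sources like .pxd files but does nothing for compiled extension modules, which still need a real file on disk.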
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Dec 18 20:36:56 2009 From: dalcinl at gmail.com (Lisandro Dalcín) Date: Fri, 18 Dec 2009 16:36:56 -0300 Subject: [Cython] running Cython from a zip file In-Reply-To: References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> <4B2B2AF4.10100@behnel.de> Message-ID: On Fri, Dec 18, 2009 at 4:26 PM, Lisandro Dalcín wrote: > On Fri, Dec 18, 2009 at 4:10 AM, Stefan Behnel wrote: >> >> Note that zip files don't support binaries, though. So a Cython version >> running from a zip file will be substantially slower than one that is >> properly installed with a compiled parser. >> > > And there is an additional gotcha: using the PXD's in Cython/Includes... > BTW, we could support this in the future (using the "annoying" setuptools mechanism ;-) ) http://peak.telecommunity.com/DevCenter/PkgResources#resource-extraction PS: Everybody hates setuptools, but no one (before Tarek) contributed any solution to all the issues setuptools tried to address... Setuptools' .so|.dll extraction could be unintuitive or seem ugly behaviour, but if you do not like it, do not use it. No one forces you to create or use compiled ext modules inside a zip file :-) ...
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From stefan_ml at behnel.de Fri Dec 18 22:02:43 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 18 Dec 2009 22:02:43 +0100 Subject: [Cython] running Cython from a zip file In-Reply-To: References: <4B2A71F0.1030105@student.matnat.uio.no> <4B2A7CF4.7050006@behnel.de> <473FCA5A-D155-42FC-B1CB-355DEB87B756@math.washington.edu> <4B2B2AF4.10100@behnel.de> Message-ID: <4B2BEDF3.20605@behnel.de> Lisandro Dalc?n, 18.12.2009 20:36: > On Fri, Dec 18, 2009 at 4:26 PM, Lisandro Dalc?n wrote: >> On Fri, Dec 18, 2009 at 4:10 AM, Stefan Behnel wrote: >>> Note that zip files don't support binaries, though. So a Cython version >>> running from a zip file will be substantially slower than one that is >>> properly installed with a compiled parser. >>> >> And there is an additional gotcha: using the PXD's in Cython/Includes... That's not a problem at all. If and how we support that (using setuptools or not) is totally up to us. > BTW, we could support this in the future (using the "annoying" > setuptool mechanism ;-) ) Oh, that's a totally different thing. The problem I have with binaries in eggs is that setuptools doesn't tell you that it won't work the way it looks. First time I tried this with lxml, I just set a setup.py flag and it worked like a snap. It was so easy, I ran it through strace to see how it had managed to load the .so file from the zip file. It was only then that I discovered that it wasn't. Instead, setuptools was copying the .so file into my home directory! 
This means that a package distributor can easily fall into the trap of building an egg with binaries ("because it's so easy"), and the end users of the packages will not be made aware that something is copied into the home directory of every single user of the package. So you may end up with tons of copies of DLLs on a system, just because setuptools managed to trick a developer into believing that it could do stuff that normally won't work, and that actually doesn't work. Great! So, big fat warning here: package maintainers, don't fall into that trap! But I'm perfectly fine with loading Cython uncompiled from a zip file, and with supporting any sensible way of loading .pxd (and other compilable) files from them, be it from Cython's own zip file or from user provided zipped packages that end up in the PYTHONPATH in one way or another. Stefan From markflorisson88 at gmail.com Fri Dec 18 23:41:58 2009 F