[Cython] target language syntax of Cython: Py2.6 or Py3.0?
Stefan Behnel
stefan_ml at behnel.de
Tue Apr 15 12:02:06 CEST 2008
Hi,
just answering the first part of your comments for now.
Dag Sverre Seljebotn wrote:
>> Also, I really like the fact that "test" is a plain byte string in Cython that
>> can directly be converted to a C char*, depending on its use. This shouldn't
>> change, even if Py3 dictates that this literal becomes a Unicode string.
>>
> What exactly are the consequences here... if it is just about the
> runtime object used then I suppose it can be inferred from context?
"In the face of ambiguity, refuse the temptation to guess." :)
Having the compiler "infer" the difference between str and unicode literals
is the wrong thing to do.
> (I.e., coercion to char* deals with it...) Or does it mean that string
> literals converted to char* should be UTF-8 strings or something?
You cannot automatically convert a unicode object to a char*, since that
requires choosing an encoding. That's why I said that a byte string makes
more sense in the Cython context.
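To illustrate why the conversion cannot be automatic, here is a minimal sketch in plain Python (Python 3 syntax, where str is the unicode type; this is not Cython code). The sample text is an arbitrary assumption. The same unicode string maps to different byte sequences depending on the encoding, so there is no single obvious char* to hand to C:

```python
# Python 3 sketch: one unicode string, several possible byte representations.
text = "caf\u00e9"  # 'café', arbitrary sample text (assumption)

as_utf8 = text.encode("utf-8")      # 5 bytes: the last character takes 2
as_latin1 = text.encode("latin-1")  # 4 bytes: one byte per character

# Different encodings give different bytes (and lengths), so a char*
# cannot be derived from a unicode object without picking an encoding.
print(as_utf8, len(as_utf8))
print(as_latin1, len(as_latin1))
```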
> What is the current behaviour for string literals anyway..probably that
> the encoding of the Cython source gets carried through to the strings in
> C source?
Yes, they are passed through to the C compiler as they are - although that's
not really what I'd call "well defined semantics". We can improve on this by
supporting PEP 263.
http://www.python.org/doc/2.3/whatsnew/section-encodings.html
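As a rough sketch of what PEP 263 support could involve, the snippet below looks for an encoding declaration in the first two lines of a source file and decodes the source accordingly. The helper name declared_encoding and the sample source are hypothetical, and the "ascii" fallback mirrors the Python 2 default described in the PEP; none of this is Cython's actual implementation:

```python
import re

# Declaration pattern as described in PEP 263 (checked on the first two lines).
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

def declared_encoding(source_bytes, default="ascii"):
    """Hypothetical helper: return the encoding named in a PEP 263
    comment on line 1 or 2, or `default` if none is found."""
    for line in source_bytes.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default

# Sample source (assumption): a Latin-1 encoded file with a declaration.
source = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"
encoding = declared_encoding(source)
print(encoding)
print(source.decode(encoding))
```

With a declaration like this, the compiler knows how to interpret non-ASCII bytes inside literals instead of passing them through blindly.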
The current string literal semantics in Cython are:
"text" is a literal byte sequence that translates directly to a Py2 str object
or a C char*.
u"text" is a unicode literal that is parsed as a UTF-8 encoded byte sequence
and converted into a Python unicode object (at runtime).
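The two cases can be mimicked in plain Python 3 (again not Cython itself; the sample bytes are an assumption standing in for what a UTF-8 encoded source file would contain between the quotes):

```python
# Bytes as they would appear in a UTF-8 encoded source file (assumption).
literal_bytes = b"gr\xc3\xbc\xc3\x9f"

# "text": the byte sequence is kept as is (Py2 str object or C char*).
byte_string = literal_bytes

# u"text": the same bytes are decoded as UTF-8 into a unicode object
# at runtime.
unicode_string = literal_bytes.decode("utf-8")

print(len(byte_string))     # length in bytes
print(len(unicode_string))  # length in characters
```

The byte string is longer than the unicode string here because each non-ASCII character occupies two bytes in UTF-8.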
Stefan