[Cython] string literals in Py2 vs Py3
Stefan Behnel
stefan_ml at behnel.de
Sat May 17 07:29:02 CEST 2008
Hi,
Robert Bradshaw wrote:
>>> --- a.pyx ---
>>>
>>> def foo(x):
>>>     if x > 0:
>>>         return "good"
>>>     else:
>>>         return "bad"
>>> -------------
>>>
>>> import a
>>> print "3 is %s" % a.foo(3)
>>>
>>> won't work in both Py2 and Py3, which I think it should. "Principle
>>> of least surprise."
>
> What it does mean is that you have to ship two separate sets of C
> files,
Not at all. You just have to state what you mean *in your source file*, i.e.
in your Cython source. If you say "I want this literal to be a byte string",
you will get a byte string in both Py2 and Py3. If you say "I want this
literal to be a unicode string", you will get a unicode string in both Py2 and
Py3. How is that a surprise?
Just because your code assumes that a byte string is the same as a unicode
string does not mean Cython has to take measures to fix this for you,
especially in a way that you might or might not have intended. Be explicit.
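To make the mismatch concrete, here is a minimal sketch in plain Python 3
(not Cython). The names foo_bytes and foo_text are hypothetical stand-ins for
a.foo() compiled with byte string literals and with unicode literals
respectively. It only illustrates what the caller sees in Py3 when a byte
string is formatted into a unicode string.

```python
# Plain Python 3 sketch; foo_bytes/foo_text are hypothetical stand-ins
# for the a.foo() function from the quoted example.

def foo_bytes(x):
    # Explicit byte string literals: bytes objects in Py3.
    return b"good" if x > 0 else b"bad"

def foo_text(x):
    # Explicit unicode literals: str objects in Py3.
    return u"good" if x > 0 else u"bad"

# Formatting bytes into a unicode string inserts the bytes repr in Py3.
mixed = "3 is %s" % foo_bytes(3)

# Returning unicode, or decoding explicitly, gives the intended text.
text = "3 is %s" % foo_text(3)
decoded = "3 is %s" % foo_bytes(3).decode("ascii")

print(mixed)
print(text)
print(decoded)
```

The explicit .decode("ascii") is the caller stating which of the two types it
wants, rather than relying on them being interchangeable.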
You now have a number of ways to say what a literal should be, based on the
Py2 syntax + the 'b' prefix of Py3:
    u"abc"   # a unicode string
    "abc"    # a byte string
    b"abc"   # a byte string
Or you can do
    from __future__ import unicode_literals
    u"abc"   # a unicode string
    "abc"    # a unicode string
    b"abc"   # a byte string
This is actually different from Py3 (and I think in line with 2.6) in that
the 'u' prefix and the 'b' prefix are both allowed in the same source. I think
that's ok, you will just have to use them wisely. Some style guide policies
will save your project here. ;)
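
As a rough illustration of "use them wisely", here is a small plain-Python
sketch (not Cython; the helper name kind is hypothetical) that checks what
type a prefixed literal ends up with. It assumes an interpreter that accepts
both prefixes, i.e. Python 2.6+ or a Python 3 version that accepts the 'u'
prefix. The prefixed literals give the same answer everywhere; the unprefixed
one is deliberately left out because its type depends on the interpreter
version and on the unicode_literals import.

```python
import sys

# The text type is 'unicode' on Py2 and 'str' on Py3; the name 'unicode'
# is only looked up when running on Py2.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

def kind(value):
    """Return 'bytes' or 'unicode' for a string literal's runtime type."""
    if isinstance(value, bytes):      # 'bytes' is an alias of 'str' on Py2.6+
        return "bytes"
    if isinstance(value, text_type):
        return "unicode"
    raise TypeError("not a string: %r" % (value,))

print(kind(u"abc"))
print(kind(b"abc"))
```

A style guide could then simply require a prefix on every literal whose type
matters at an API boundary.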
Stefan