[Cython] string literals in Py2 vs Py3
Stefan Behnel
stefan_ml at behnel.de
Fri May 16 12:23:21 CEST 2008
Hi,
Robert Bradshaw wrote:
> I would rather that string literals be interpreted according to the C
> library they're linked against. I'm also thinking about code that,
> say, returns string literals. I would much rather it returns str in
> Py2 and unicode in Py3.
That would be unexpected, especially in Py3, where byte strings and unicode
strings are distinct types.
As I said to Lisandro in another post:
S> If you want source compatibility, you can't change the semantics based on
S> the compile time environment - except for the cases where the runtime
S> environments really differ (such as byte/unicode identifiers). Imagine you
S> had some latin-1 encoded XML byte literal in your code. In Py2, under your
S> proposal, this would become a byte string that can be parsed. In Py3,
S> however, this would suddenly become a unicode string and the parser would
S> refuse to handle it, as it's no longer ISO encoded.
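
A minimal sketch of the mismatch described above, assuming Python 3 and the
standard library's xml.etree.ElementTree as a stand-in parser (the post does
not name one, and the XML snippet is made up for illustration). It does not
reproduce a parser refusing the input; it only shows that the payload stops
matching its own encoding declaration once it is no longer the original
byte string:

```python
import xml.etree.ElementTree as ET

# Hypothetical latin-1 encoded XML byte literal; \xe9 is "é" in ISO-8859-1.
xml_bytes = b'<?xml version="1.0" encoding="ISO-8859-1"?><root>caf\xe9</root>'

# As a byte string, the payload agrees with its encoding declaration.
text_from_bytes = ET.fromstring(xml_bytes).text

# Once treated as text and serialised again (UTF-8 chosen here), the bytes
# no longer agree with the ISO-8859-1 declaration they still carry.
reencoded = xml_bytes.decode("iso-8859-1").encode("utf-8")
text_from_reencoded = ET.fromstring(reencoded).text

print(repr(text_from_bytes))      # the intended content
print(repr(text_from_reencoded))  # garbled: UTF-8 bytes read as latin-1
```

Which failure a given parser shows (an error or silently garbled text)
depends on the parser; the point is only that the literal's type changes
what the data means.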
> Note, this is not something that needs to be done to get ready for
> Py3--it's an assumption that unqualified string literals are the same
> type as python identifiers.
This happens to be a correct assumption in both Py2 and Py3, but I don't see
how it bears on the question.
> I was doing some playing around with str and unicode in Python, and I
> noticed that it will automatically convert between the two (no
> explicit encoding needed) as long as the data in question is pure
> ASCII.
That would be Py2. Py3 will never attempt any kind of automatic conversion
between bytes and str. And I am convinced that Cython shouldn't do that either.
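
A minimal sketch of that Py3 behaviour, assuming a Python 3 interpreter run
without the -b/-bb flags (the variable names are illustrative only):

```python
data = b"abc"   # bytes
text = "abc"    # str (unicode)

# Pure-ASCII content does not make the two compare equal in Py3.
equal = (data == text)

# Mixing them is an error rather than a silent conversion.
try:
    data + text
except TypeError as exc:
    mixing_error = exc
else:
    mixing_error = None

# Conversion has to be spelled out with an explicit encoding.
decoded = data.decode("ascii")

print(equal, type(mixing_error).__name__, decoded == text)
```

In Py2, by contrast, 'abc' + u'abc' succeeds through an implicit ASCII
decode, which is the behaviour noted in the quoted paragraph.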
Stefan