[Cython] PEPs 263 and 3120
Stefan Behnel
stefan_ml at behnel.de
Tue Apr 22 22:13:43 CEST 2008
Hi,
Stefan Behnel wrote:
> just a quick note that I started working on PEP 263 (source code encoding
> declaration) and PEP 3120 (UTF-8 as default source encoding).
Ok, I have it working, including correct doc-string types.
The idea is that strings remember their encoding themselves. Every string that
ends up in a StringNode is an EncodedString, which is a subclass of unicode
with an additional "encoding" attribute. It's None for unicode strings and the
source encoding name for byte strings. Strings are then re-encoded into bytes on
the way into a Symtab.Entry, and serialised into the C file as expected. This also
nicely handles .pxi includes with different source encodings.
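
A minimal sketch of the idea, for illustration only. It is written for Python 3, so it subclasses str where the description above says unicode. The as_bytes() and make_byte_string() helpers are hypothetical names that do not come from the Cython sources.

```python
class EncodedString(str):
    """Text that remembers the source encoding it was read with."""

    # None marks a real unicode string; otherwise this holds the name
    # of the source encoding of a byte string literal.
    encoding = None

    def as_bytes(self):
        # Hypothetical helper: recover the bytes as they appeared in the source.
        if self.encoding is None:
            raise ValueError("unicode strings have no source byte form")
        return self.encode(self.encoding)


def make_byte_string(text, source_encoding):
    # Hypothetical factory: tag decoded text with its source encoding.
    s = EncodedString(text)
    s.encoding = source_encoding
    return s


# The same characters read from two files with different encodings
# (for example a .pyx and an included .pxi) give different bytes.
latin = make_byte_string("caf\u00e9", "iso-8859-1")
utf = make_byte_string("caf\u00e9", "utf-8")
print(latin.as_bytes())  # b'caf\xe9'
print(utf.as_bytes())    # b'caf\xc3\xa9'
```

Because each string carries its own encoding, nothing global has to track which file a literal came from.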
However, once I got that finished, I noticed that there is a bug in the
control flow tracker. It keeps an empty tuple as initial end position and
seems to rely on the fact that in Py2 the empty tuple () sorts before ('some
string',). However, (u'some string') sorts *after* (). This is obviously a
completely arbitrary order, and no code should really rely on something like
that. For exactly that reason, Py3 will not allow this anymore and (IIRC)
raises an exception instead.
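
To illustrate the ordering problem, here is a small Python 3 sketch. It does not reproduce the control flow tracker. The (filename, line, column) position layout and the later_position() helper are assumptions made only for this example, and using None as the "no position yet" marker is one possible fix, not necessarily the one Cython should adopt.

```python
def mixed_compare_fails():
    # Python 3 refuses to order a tuple against a string.
    try:
        () < "some string"
    except TypeError:
        return True
    return False


def later_position(current, candidate):
    # Hypothetical helper: None means "no end position recorded yet",
    # so no comparison against an empty tuple is ever needed.
    if current is None:
        return candidate
    return max(current, candidate)


print(() < ("some string",))   # True: tuple against tuple is well defined
print(mixed_compare_fails())   # True: tuple against str raises TypeError

end = None
for pos in [("a.pyx", 3, 0), ("a.pyx", 1, 5)]:
    end = later_position(end, pos)
print(end)  # ('a.pyx', 3, 0)
```

With an explicit sentinel the code only ever compares positions of the same shape, so it no longer depends on how unrelated types happen to sort.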
Robert, before I dig into this, you know this part of the code much better
than I do. Could you try fixing this up?
Thanks,
Stefan