[Cython] PEPs 263 and 3120

Stefan Behnel stefan_ml at behnel.de
Mon Apr 21 10:54:27 CEST 2008


Hi,

just a quick note that I started working on PEP 263 (source code encoding
declaration) and PEP 3120 (UTF-8 as default source encoding). I think the main
problem is my initial implementation of unicode string support. On my first
try, class names accidentally ended up as unicode strings, which Python 2.5
rejects in its C-API. :)  Unicode strings will have to be rewritten to
actually use something like a UnicodeNode; that should fix it. Also, non-ASCII
bytes should get escaped when written to the C file.
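
To illustrate the escaping step, here is a minimal sketch in Python 3 syntax, not the actual Cython code. The function name escape_for_c is made up for this example. It assumes three-digit octal escapes are acceptable in the generated C, which avoids the problem that C hex escapes swallow any hex digits that follow them.

```python
def escape_for_c(data):
    """Return a C string literal (including quotes) for a bytes object.

    Hypothetical helper for illustration only, not part of Cython.
    """
    parts = []
    for byte in data:  # iterating over bytes yields ints in Python 3
        ch = chr(byte)
        if ch in '\\"':
            # Backslash and double quote need a backslash in C literals.
            parts.append('\\' + ch)
        elif 32 <= byte < 127:
            # Printable ASCII is written as-is.
            parts.append(ch)
        else:
            # Control characters and non-ASCII bytes become octal escapes.
            parts.append('\\%03o' % byte)
    return '"' + ''.join(parts) + '"'
```

With this, the UTF-8 bytes of "é" (0xC3 0xA9) come out as the pure-ASCII literal "\303\251", so the C file itself never contains non-ASCII bytes.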

According to PEP 263, the Python tokenizer is supposed to work on a UTF-8 byte
sequence. However, since Cython runs under Python, I think it's easier to work
with unicode strings throughout the compiler. Any objections to that? I assume
that this will also make the transition of Cython itself to Py3 simpler.
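
A rough sketch of what "unicode strings throughout" could look like at the input side, in Python 3 syntax: find the declared encoding, decode once, and hand only unicode text to the rest of the compiler. The function names are hypothetical, the pattern follows the style of the one given in PEP 263, and the sketch is simplified (it ignores a UTF-8 BOM and always checks both of the first two lines).

```python
import re

# Coding-declaration pattern in the style of PEP 263; only the first
# two lines of a source file are supposed to be searched.
_CODING_RE = re.compile(rb'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')


def detect_source_encoding(raw, default='utf-8'):
    """Return the declared encoding of raw source bytes.

    Falls back to UTF-8 (the PEP 3120 default) if nothing is declared.
    Hypothetical helper for illustration only.
    """
    for line in raw.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1).decode('ascii')
    return default


def decode_source(raw):
    """Decode raw source bytes into a unicode string for the compiler."""
    return raw.decode(detect_source_encoding(raw))
```

For example, a file starting with "# -*- coding: latin-1 -*-" is decoded as Latin-1, while a file without a declaration is decoded as UTF-8. For real use, Python 3's tokenize.detect_encoding() covers the cases this sketch skips.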

Stefan

