[py-dev] py.test in Unicode context
François Pinard
pinard at iro.umontreal.ca
Mon Apr 17 16:22:00 CEST 2006
Hi, people. I hope this is an appropriate forum for discussing such
things; otherwise, please kindly tell me! :-)
For a while now, and increasingly so, we have been using py.test and
py.log in some projects. A few are bigger, most are smaller...
A few weeks ago, we started the experiment of converting a set of
programs to use Unicode throughout internally. That is, for example,
*all* constant strings in the sources got a 'u' prepended by the
application of ``unipy *.py``, where ``unipy`` is a script of ours. A bit
sadly, Python is not fully ready for such usage -- comments censored :-) --
yet with a few appropriate local stunts, it seems we can manage
nevertheless. In fact, it looks promising. The ``unipy`` script adds
the following special line near the start of Python modules::

    from Unicode import file, isinstance, open, os, str, sys, unicode

and also removes any pre-existing import statements for ``os`` and
``sys``. The effect is that, for example, ``file`` or ``os.popen`` have
a Unicode-aware filter automatically installed around the real file
object, and the same holds for, say, ``sys.stdin`` and ``sys.stdout``.
This applies only to modules using the special ``from`` line; the real
things are left alone for modules that have not been unipy-ized.
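
To make the idea of a "Unicode-aware filter" a bit more concrete, here is a minimal sketch of a wrapper that accepts text and writes encoded bytes to the underlying stream. It is not the actual ``Unicode`` module: the class name ``EncodingWriter``, the UTF-8 default, and the use of ``io.BytesIO`` as a stand-in for the real file object are all assumptions made for illustration, and the code is written for a current Python where text and bytes are distinct types.

```python
import io


class EncodingWriter:
    """Hypothetical filter: takes text, hands encoded bytes to the real stream."""

    def __init__(self, stream, encoding="utf-8"):
        self._stream = stream        # the real, byte-oriented file object
        self._encoding = encoding    # assumed encoding, not taken from the post

    def write(self, text):
        # Byte strings pass through untouched; text is encoded first.
        if isinstance(text, bytes):
            data = text
        else:
            data = text.encode(self._encoding)
        self._stream.write(data)

    def __getattr__(self, name):
        # Everything else (flush, close, ...) is delegated to the real stream.
        return getattr(self._stream, name)


raw = io.BytesIO()                   # stands in for the real sys.stdout
out = EncodingWriter(raw)
out.write(u"Fran\u00e7ois\n")        # text goes in, bytes come out
```

Only modules that import the wrapped names would see ``out``; everything else would keep talking to ``raw`` directly.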
py.test and py.log do not behave well in such contexts, and I would
much like not to give up on them, hence my incentive for this conversation.
I'll likely adjust a local copy of py.log, but py.test is less easy for
me. It uses some magic by which, for example, ``sys.stdout`` is
overridden in the tested module's namespace by a ``cStringIO`` object.
For one thing, ``cStringIO`` does not work with Unicode strings, while
``StringIO`` does. But this should not even be a problem, because the
special ``sys`` imported from our ``Unicode`` module should, for
example, write only 8-bit strings to the real ``sys.stdout``. So I would
guess the interception installed by ``py.test`` is not low-level enough:
ideally, it should not operate in the tested module's namespace at all.
Do you have any opinion, suggestion, or thought you would feel like
sharing, on this matter?
[On a parallel line of thought, I also wonder whether the pylib project
could adopt, as one of its sub-projects, the search for a workable
solution to the problems of those like us, who try to match Python and
Unicode for real. :-)]
--
François Pinard http://pinard.progiciels-bpi.ca