[Cython] Assignment nodes, buffers and transforms
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Sun Jul 20 15:30:30 CEST 2008
I've had some skirmishes with Robert and Stefan on this and I think it
is time for a thread.
Currently, for buffers, I must somehow write code for two different
cases that have to do with assignment:
1. When buffer variables are assigned to
2. When buffers have items assigned to
The question is whether to write more code in Nodes.py to do this, or to
make it easier for transforms to react to assignments. The latter seems
more intrusive to how things are usually done; however, I also see more
use cases than just buffers: optimizing refcounting in a better way while
keeping the algorithmic code simple, parts of type inference,
compile-time evaluation, and reducing overall transform complexity by
cutting the number of node types to deal with.
I'm not sure if I'm suggesting that there should only be one type of
assignment node, but I'd love for them to be a bit different:
- SetItemNode
- SetAttributeNode
- SetCVarNode
...so that the node type starts corresponding to what is going on in C
without having to examine the lhs. I feel that would simplify a lot of
code, including existing code (and it could be made "backwards-compatible"
through inheritance anyway). However, that is a second step that relies
on reducing to a single assignment node first (to avoid combinatorial
explosion).
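As a rough sketch of what such a node hierarchy could look like: only the
three Set*Node names come from the list above. AssignmentNode, the *Ref
classes and specialize() are made up for illustration and are not Cython's
actual classes. The sketch also treats every plain name as a C variable,
which is a simplification.

```python
# Hypothetical stand-ins for lhs expression nodes; not Cython's real classes.
class NameRef:
    def __init__(self, name):
        self.name = name


class IndexRef:
    def __init__(self, base, index):
        self.base, self.index = base, index


class AttributeRef:
    def __init__(self, obj, attr):
        self.obj, self.attr = obj, attr


class AssignmentNode:
    """Hypothetical single, generic assignment node (lhs = rhs)."""

    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs


# The subclasses keep the generic interface, so code written against
# AssignmentNode keeps working ("backwards-compatible" through inheritance).
class SetItemNode(AssignmentNode):
    pass


class SetAttributeNode(AssignmentNode):
    pass


class SetCVarNode(AssignmentNode):
    pass


# Assumption: a plain name always maps to SetCVarNode in this sketch.
_SPECIALIZED = {
    IndexRef: SetItemNode,
    AttributeRef: SetAttributeNode,
    NameRef: SetCVarNode,
}


def specialize(node):
    """Examine the lhs once; later passes can dispatch on node type alone."""
    cls = _SPECIALIZED.get(type(node.lhs), AssignmentNode)
    return cls(node.lhs, node.rhs)
```

After specialize() has run, a pass that only cares about item assignment
can check isinstance(node, SetItemNode) and never look at the lhs.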
Current ways of assigning a variable x:
a) def f(x):
b) cdef type x = 2
c) x = 2
d) x, (y, z) = (2, (3, 4))
e) x = y = 2
f) except Exception, x
g) for x in C:
That's all I can think of (the with statement is already turned into c)).
All of these cases must be covered for both 1 and 2 above. What I've done
so far is support a) and c) directly, and transform b) into c) in the
parsing stage.
What I *could* do now is move on and turn d) and e) into c) as well. The
effect of all this is to i) reduce the number of different node types,
and ii) remove code from Nodes.py that requires you to know how all of
code generation/analysis works, and instead add code that requires you
to know about writing tree transforms. (I think that even if the latter
usually takes more lines of code, the transforms all follow one
fundamental principle and result in code that is more isolated and more
robust to changes elsewhere in the codebase.)
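To make the d)/e)-into-c) idea concrete, here is a minimal sketch. It uses
Python's standard ast module (Python 3.9+ for ast.unparse) as a stand-in
for the Cython tree, so the class name SingleAssignment, the helper
simplify() and the _tmpN naming are all assumptions for illustration, not
Cython code. It only splits a tuple target when the rhs is a tuple literal
of the same shape; anything else is left as an ordinary unpacking.

```python
import ast


class SingleAssignment(ast.NodeTransformer):
    """Rewrite cascaded and literal-tuple assignments into simple ones."""

    def __init__(self):
        self.counter = 0

    def _temp(self):
        self.counter += 1
        return "_tmp%d" % self.counter

    def _pairs(self, target, value, out):
        # Pair up leaves while target and value are sequences of equal shape.
        seq = (ast.Tuple, ast.List)
        if (isinstance(target, seq) and isinstance(value, seq)
                and len(target.elts) == len(value.elts)
                and not any(isinstance(e, ast.Starred)
                            for e in target.elts + value.elts)):
            for t, v in zip(target.elts, value.elts):
                self._pairs(t, v, out)
        else:
            out.append((target, value))

    def visit_Assign(self, node):
        if len(node.targets) > 1:
            # e) x = y = 2: evaluate the rhs once, then assign left to right.
            tmp = self._temp()
            stmts = [ast.Assign(targets=[ast.Name(tmp, ast.Store())],
                                value=node.value)]
            for t in node.targets:
                stmts.append(ast.Assign(targets=[t],
                                        value=ast.Name(tmp, ast.Load())))
            return stmts
        pairs = []
        self._pairs(node.targets[0], node.value, pairs)
        if len(pairs) == 1:
            return node  # already of form c), or not a literal tuple
        # d) evaluate every rhs leaf first, then do the stores, so that
        # something like "a, b = b, a" keeps its meaning.
        temps = [self._temp() for _ in pairs]
        evals = [ast.Assign(targets=[ast.Name(n, ast.Store())], value=v)
                 for n, (_, v) in zip(temps, pairs)]
        stores = [ast.Assign(targets=[t], value=ast.Name(n, ast.Load()))
                  for n, (t, _) in zip(temps, pairs)]
        return evals + stores


def simplify(source):
    tree = SingleAssignment().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Calling simplify("x, (y, z) = (2, (3, 4))") yields a sequence of
single-target assignments through temporaries. The temporaries are the
price of preserving evaluation order; a real transform could drop them
when the rhs leaves are known to be side-effect free.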
f) and g) can also be turned into c) but with a bit of extra work and in
less obvious ways. If you think about SetItemNode, SetAttributeNode etc.
you see that it isn't wholly unnatural for a loop to reduce into such
instructions (that is of course what is happening today too, but less
explicitly).
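For case g), the reduction could look roughly like the following at the
Python level. This is only an illustration of the idea, not what Cython
generates; the function names and the tmp variable are made up, and the
loop's else clause is ignored.

```python
def loop_sugar(C):
    out = []
    for x in C:          # g) the loop target x is assigned implicitly
        out.append(x)
    return out


def loop_desugared(C):
    out = []
    it = iter(C)
    while True:
        try:
            tmp = next(it)
        except StopIteration:
            break
        x = tmp          # now an ordinary assignment, i.e. case c)
        out.append(x)
    return out
```

In the second form the assignment to x is explicit, so whatever reacts to
plain assignments (for buffers or anything else) sees the loop variable
without any special-casing of the for statement.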
All in all I'm unsure about this point: either I have to transform them
all, or I have to write BufferNodes to wrap NameNodes in the tree anyway,
in which case the transforms weren't really necessary. So I ask for
input. (Help me counter the exhilarating feeling I get from having
Buffer.py "just work" in more cases by refactoring unrelated pieces of
code :-) )
In my own experience, whenever I have to do something in Nodes.py or
ExprNodes.py I invariably introduce bugs that take hours to find,
because I do not really understand the code. Of course I could learn, but
I'd rather write the transforms _if that is a preferred direction anyway_.
So the preferred direction is what I'd like input on; I'll then make the
final call on the amount of work myself (and on how clever a solution I
can cook up for f) and g); if they end up being hopeless, I'll drop
them).
--
Dag Sverre