[Cython] Assignment nodes, buffers and transforms

Robert Bradshaw robertwb at math.washington.edu
Thu Jul 24 06:38:04 CEST 2008


On Jul 23, 2008, at 7:07 AM, Dag Sverre Seljebotn wrote:

> Note (forgot where you touched upon it) that I consider the current  
> temp handling in transforms a temporary solution - temps (also  
> those written as cdef in my examples for notation) should be  
> regular allocate_temps things.

OK, great you've already been thinking about this.

> In fact I have something locally and a proposal for that which I  
> will post when I get around to it (tomorrow).


>>> f) and g) can also be turned into c) but with a bit extra work  
>>> and in
>> less obvious ways. If you think about SetItemNode, SetAttributeNode
>> etc.
>> you see that it isn't wholly unatural for a loop to reduce into such
>> instructions (that is of course what is happening today too, but less
>> explicit).
>>
>> I am curious if C compilers are better at optimizing/unrolling for
>> loops (especially with static bounds) then while loops with an
>> implicit counter...
>
> Yes, For*From*StatNode, which would be the target node for that  
> kind of thing. So add that to the list as a seperate case... and it  
> probably shouldn't turn into singleassignment.

Yep.

> It doesn't include object assignments so tracking it is less  
> crucial, a lot of algorithms could do without it. And it cannot  
> assign to module scope etc either.

It may, and it can assign to module scope too.

>
>>> All in all I'm unsure about this point -- I have to transform them
>> all,
>> or write BufferNodes to wrap namenodes in the tree anyway, in which
>> case
>> the transforms weren't really necesarry. So I ask for input. (Help me
>> counter the exhilirating feeling I get from having Buffer.py "just
>> work"
>> in more cases by refactoring unrelated pieces of code :-) )
>>
>>>  From my own experiences, whenever I have to do something in
>> Nodes.py or
>> ExprNodes.py I invariably introduce bugs that it takes hours to find,
>> because I do not really understand it. Of course I could learn, but
>> I'd
>> rather write the transforms _if that is a preferred direction  
>> anyway_.
>> So the ideal preferred direction is what I'd like input on, and then
>> I'll take the final call on work amount myself (and on how smart
>> things
>> I can cook up for f) and g), if they end up being hopeless then I'll
>> drop it).
>>
>> I think transformations to simplify/consolidate some of the
>> assignments is a good direction, but it seems things get increasingly
>> messy and unnatural (for example with the exception catching). We are
>
> Good point.
>
>> also potentially loosing information that could be used for
>> optimization (which can be recovered, e.g. as you did for (b), but
>> what about
>
> I prefer to think about it as "refined" rather than recovered :-)
>
> In general, with each transform you loose some and get some info,  
> and each algorithm that needs the information must figure out where  
> in the pipeline it wants to be. I didn't say these transforms would  
> be done very early, just that I wanted buffers to be after them.  
> You just have added flexibility rather than having to look at them  
> as seperate concepts..


>> Switching gears a bit, what you state you (might) want is the nodes
>>
>> - SetItemNode
>> - SetAttributeNode
>> - SetCVarNode
>>
>> Now we already have
>>
>> - ItemNode
>> - AttributeNode
>> - NameNode
>>
>> And each of them comes with a analyse_target_types(),
>> generate_assignment_code(), and generate_post_assignment_code()
>> method. All types of assignments call these methods, so you only need
>> to worry about putting the buffer code there, instead of worrying
>> about all the current (or possibly future) ways of making an
>> assignment.
>
> Yes, this is what I do now. (and as you know at this point this  
> discussion is not for gsoc any longer)

Yep, but this GSoC or not it brings up a lot of interesting points. :-)

> My point is still that you cannot so easily write a visitor to do  
> general code analysis. Also, each of these function you mention  
> often read like a lots of top-level if-tests to see which situation  
> we are in (setting a module dict entry or a local var? And so on.  
> So there is no correspondance in the tree with what is vastly  
> different operations in CPython).
>
> Consider the old (obsolete?) idea of Cython-inlineable compile-time  
> operator overloads. Then SetItem and GetItem could be transformed  
> into method call nodes. And so on (of course, it is entirely  
> possible to just add yet another if test and duplicate/call into  
> some SimpleCallNode code instead).

Yes, this idea certainly has merits.

>> Another way to go is to treat it like a type test with
>> side effects (so stuff like "(<object[int,2]>x)[0,0]" could possibly
>> work), though ignore that if you're onto a working solution.
>
> On a tangential note, this would mean that all the auxiliary vars  
> and buffer acquisiton would happen again for each such cast, so I'd  
> rather not support it because it should be avoided :-) which is why  
> I've said that I think of this as acting on the variable rather  
> than the type (though there is  BufferType similar to TypedefType  
> now).

True, but it could still be cheaper than going through Python and  
more versatile than creating a variable of the right type, sticking  
it in, then clearing the variable (though harder to implement too).

- Robert



More information about the Cython-dev mailing list