[Cython] Some more optimisations

Martin C. Martin martin at martincmartin.com
Fri Feb 1 16:02:13 CET 2008


Stefan Behnel wrote:
> Hi,
> 
> Martin C. Martin wrote:
>> Stefan Behnel wrote:
>>
>>> One thing I'm not sure about is how to propagate undeclared function
>>> result
>>> types. If we can figure out what type a function has, we can use that
>>> type.
>>> But the function might be declared later in the code, so we wouldn't
>>> have that
>>> information early enough. I think that's where two passed are required.
>> I think you're absolutely right, and that's the standard answer to this.
> 
> Now that I give it another thought, there's one very ugly thing to deal with.
> Imagine this:
> 
>   cdef f1(a):
>       d = f2(a)
>       d[1] = 1
>       return d
> 
>   cdef f2(a):
>       d = f3(a)
>       d[2] = 2
>       return d
> 
>   cdef f3(a):
>       d = f4(a)
>       d[3] = 3
>       return d
> 
>   cdef f4(a):
>     return {4:a}
> 
> Here, we will traverse all functions, and at the last one, we can decide that
> f4 will always return a dict. Then, we traverse again, and we can decide that
> f3 will always return a dict, and so on.
> 
> That would let us run through as many passes as we have functions.

Actually, it could be worse than that, if the function bodies reference 
more than one function.  It could mean a separate pass for each expression.

But there's a better way.  What you're describing is type inference:

http://en.wikipedia.org/wiki/Type_inference

Given that the only types we care about are dict, list, tuple, and 
"everything else," that might be overkill.  On the other hand, it would 
be a fun project.

The alternative is to only use the type when people have explicitly 
declared it in the function prototype.  So in your example, all of the 
d[x] = y calls wouldn't know that d is a dict, because the user didn't 
declare f4's return type.  But if they did, then we'd know that d was 
always a dict, so we could call the dict-specific set for all those d[x] 
= y.  That only requires a single extra pass: A first pass that just 
collects all the function prototypes, then a second pass to generate the 
actual code.

> Couldn't someone just write code for this? ;)

Oh, look at the time!  It's getting late, I'd better be going...  ;)

Best,
Martin


More information about the Cython-dev mailing list