[Cython] Some more optimisations

Robert Bradshaw robertwb at math.washington.edu
Fri Feb 1 21:30:32 CET 2008


On Feb 1, 2008, at 1:12 AM, Stefan Behnel wrote:

> Hi Robert,
>
> Robert Bradshaw wrote:
>> On Jan 24, 2008, at 2:53 AM, Stefan Behnel wrote:
>>> What I could imagine, on the other hand, is exploiting the type hint
>>> given by
>>> *args and **kwargs and propagate that (at least up to the next
>>> assignment), so
>>> that access to the variables can use straight PyAPI calls. But as  
>>> Kay
>>> noted,
>>> we don't currently have a framework for questions like: where is  
>>> "the
>>> next assignment?".
>>
>> What we need is special list/dict/tuple types. I know Greg  
>> (eventually)
>> plans to do something like this too. They would be like extension  
>> types,
>> though should fail for subtypes (e.g. if one subclasses a list,  
>> then one
>> can't assign it to a cdef list variable, or it might invalidate using
>> the faster api's).
>
> If we require exact types (no subtypes), we would be inconsistent  
> with how
> things currently work for extension types (and Python types). The only
> exception are plain C types. So here, list/dict/tuple would  
> basically behave
> like a C type (but I guess it would be enough to document that...)

Yes, they would behave more like C types. I think this is a necessary  
and not overburdonsome restriction--it is much more common to have  
exactly a list than to have a subclass of a list. The real motivation  
for doing this is that unless we have exactly these types, we can't  
use the fast Python C/API calls anyways, so what would be the gain?

>> As for the type assignment propagation, I think we need to  
>> introduce a
>> two-pass analyses_types. There would be a special undeclared type,  
>> and
>> by tracking assignments one could create a dependancy tree of  
>> types. One
>> would then use this to resolve all variable types and run  
>> analyses_types
>> again.
>>
>> The reason one needs to run analyses_types twice is because there is
>> often branching on type.is_pyobject, as well as determining coercion
>> needs, etc. I'm sure it could be done with a single pass and solving
>> some giant type-constraint problem, but that would take a major  
>> rewrite.
>
> I had imagined building a per-name list of previously assigned  
> types in
> analyse_types() that could be traversed to check what type we  
> currently
> expect. Would work as follows:
>
> - assignments replace the list content by their result of the  
> analyse_types()
> for the right-hand side. This possibly requires taking into account  
> the
> current list of types, and it might mean we just append an  
> additional type
> possibility.

analyse_types() won't be able to do much until we actually know what  
the types in the expression are. I think the best we can do is build  
up a table of "this type depends on these types in this way."

> - branches (if/loops) collect the results of each branch and sort  
> the types in
> the expected order of probability - which might be arbitrary, but  
> could be
> based on the number of occurrences over the branches.

I'm not quite following here--how are branches/loops different than  
two adjacent statements?

> The later "is_pyobject" tests would then be replaced by a more  
> accurate test
> for relevant types, but from a point on, we would always have a  
> list of
> possible types for each code point that we would base our decision  
> on. BTW,
> "is_pyobject" is not wrong, it's just not accurate enough for  
> everything.

This is completely tangential to type inference, isn't it?

> One thing I'm not sure about is how to propagate undeclared  
> function result
> types. If we can figure out what type a function has, we can use  
> that type.
> But the function might be declared later in the code, so we  
> wouldn't have that
> information early enough. I think that's where two passed are  
> required.

I think for the first step, we should just leave functions (and class  
members) as they are now--we can come back to them later but this  
will make things much messier and less local (not to mention the  
issues of cimporting functions and types from other modules).

- Robert



More information about the Cython-dev mailing list