[Cython] Some more optimisations
Robert Bradshaw
robertwb at math.washington.edu
Fri Feb 1 21:30:32 CET 2008
On Feb 1, 2008, at 1:12 AM, Stefan Behnel wrote:
> Hi Robert,
>
> Robert Bradshaw wrote:
>> On Jan 24, 2008, at 2:53 AM, Stefan Behnel wrote:
>>> What I could imagine, on the other hand, is exploiting the type hint
>>> given by
>>> *args and **kwargs and propagate that (at least up to the next
>>> assignment), so
>>> that access to the variables can use straight PyAPI calls. But as
>>> Kay
>>> noted,
>>> we don't currently have a framework for questions like: where is
>>> "the
>>> next assignment?".
>>
>> What we need is special list/dict/tuple types. I know Greg
>> (eventually)
>> plans to do something like this too. They would be like extension
>> types,
>> though should fail for subtypes (e.g. if one subclasses a list,
>> then one
>> can't assign it to a cdef list variable, or it might invalidate using
>> the faster api's).
>
> If we require exact types (no subtypes), we would be inconsistent
> with how
> things currently work for extension types (and Python types). The only
> exception are plain C types. So here, list/dict/tuple would
> basically behave
> like a C type (but I guess it would be enough to document that...)
Yes, they would behave more like C types. I think this is a necessary
and not overburdonsome restriction--it is much more common to have
exactly a list than to have a subclass of a list. The real motivation
for doing this is that unless we have exactly these types, we can't
use the fast Python C/API calls anyways, so what would be the gain?
>> As for the type assignment propagation, I think we need to
>> introduce a
>> two-pass analyses_types. There would be a special undeclared type,
>> and
>> by tracking assignments one could create a dependancy tree of
>> types. One
>> would then use this to resolve all variable types and run
>> analyses_types
>> again.
>>
>> The reason one needs to run analyses_types twice is because there is
>> often branching on type.is_pyobject, as well as determining coercion
>> needs, etc. I'm sure it could be done with a single pass and solving
>> some giant type-constraint problem, but that would take a major
>> rewrite.
>
> I had imagined building a per-name list of previously assigned
> types in
> analyse_types() that could be traversed to check what type we
> currently
> expect. Would work as follows:
>
> - assignments replace the list content by their result of the
> analyse_types()
> for the right-hand side. This possibly requires taking into account
> the
> current list of types, and it might mean we just append an
> additional type
> possibility.
analyse_types() won't be able to do much until we actually know what
the types in the expression are. I think the best we can do is build
up a table of "this type depends on these types in this way."
> - branches (if/loops) collect the results of each branch and sort
> the types in
> the expected order of probability - which might be arbitrary, but
> could be
> based on the number of occurrences over the branches.
I'm not quite following here--how are branches/loops different than
two adjacent statements?
> The later "is_pyobject" tests would then be replaced by a more
> accurate test
> for relevant types, but from a point on, we would always have a
> list of
> possible types for each code point that we would base our decision
> on. BTW,
> "is_pyobject" is not wrong, it's just not accurate enough for
> everything.
This is completely tangential to type inference, isn't it?
> One thing I'm not sure about is how to propagate undeclared
> function result
> types. If we can figure out what type a function has, we can use
> that type.
> But the function might be declared later in the code, so we
> wouldn't have that
> information early enough. I think that's where two passed are
> required.
I think for the first step, we should just leave functions (and class
members) as they are now--we can come back to them later but this
will make things much messier and less local (not to mention the
issues of cimporting functions and types from other modules).
- Robert
More information about the Cython-dev
mailing list