[Cython] Some more optimisations
Stefan Behnel
stefan_ml at behnel.de
Fri Feb 1 10:12:15 CET 2008
Hi Robert,
Robert Bradshaw wrote:
> On Jan 24, 2008, at 2:53 AM, Stefan Behnel wrote:
>> What I could imagine, on the other hand, is exploiting the type hint
>> given by *args and **kwargs and propagate that (at least up to the
>> next assignment), so that access to the variables can use straight
>> PyAPI calls. But as Kay noted, we don't currently have a framework
>> for questions like: where is "the next assignment?".
>
> What we need is special list/dict/tuple types. I know Greg (eventually)
> plans to do something like this too. They would be like extension types,
> though should fail for subtypes (e.g. if one subclasses a list, then one
> can't assign it to a cdef list variable, or it might invalidate using
> the faster api's).
If we require exact types (no subtypes), we would be inconsistent with how
things currently work for extension types (and Python types). The only
exceptions are plain C types. So here, list/dict/tuple would basically behave
like C types (but I guess it would be enough to document that...)
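To illustrate the exact-type question in plain Python (a minimal sketch; MyList and the two helper functions are hypothetical names, not part of Cython): an exact check rejects subclasses, much like PyList_CheckExact() in the C API, while a subtype-friendly check accepts them, like PyList_Check(). The subclass below shows why a fast path that bypasses overridden methods could give a different result.

```python
class MyList(list):
    """Hypothetical list subclass that overrides item access."""
    def __getitem__(self, index):
        return "overridden"


def is_exact_list(obj):
    # Exact check: subclasses are rejected (compare PyList_CheckExact).
    return type(obj) is list


def is_list_or_subtype(obj):
    # Subtype-friendly check: subclasses pass (compare PyList_Check).
    return isinstance(obj, list)


plain = [1, 2, 3]
sub = MyList([1, 2, 3])

# Normal item access honours the override ...
via_override = sub[0]
# ... while going straight to the base type's slot ignores it, which is
# roughly what a direct list API call would do.
via_base_type = list.__getitem__(sub, 0)
```

Here via_override and via_base_type differ, which is the situation an exact-type requirement would rule out.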
> As for the type assignment propagation, I think we need to introduce a
> two-pass analyses_types. There would be a special undeclared type, and
> by tracking assignments one could create a dependency tree of types. One
> would then use this to resolve all variable types and run analyses_types
> again.
>
> The reason one needs to run analyses_types twice is because there is
> often branching on type.is_pyobject, as well as determining coercion
> needs, etc. I'm sure it could be done with a single pass and solving
> some giant type-constraint problem, but that would take a major rewrite.
I had imagined building a per-name list of previously assigned types in
analyse_types() that could be traversed to check what type we currently
expect. It would work as follows:
- assignments replace the list content with the result of analyse_types()
for the right-hand side. This may require taking the current list of types
into account, and it might mean we just append an additional possible
type.
- branches (if/loops) collect the results of each branch and sort the types in
the expected order of probability - which might be arbitrary, but could be
based on the number of occurrences over the branches.
The later "is_pyobject" tests would then be replaced by a more accurate test
for the relevant types, and from that point on, we would always have a list of
possible types for each code point that we would base our decision on. BTW,
"is_pyobject" is not wrong, it's just not accurate enough for everything.
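The per-name type list described above could be sketched roughly as follows. This is only an illustration in plain Python, not Cython's compiler code: TypeTracker and its methods are hypothetical names, types are represented as plain strings, and names that stay unassigned in some branch are not handled specially.

```python
from collections import Counter


class TypeTracker:
    """Hypothetical per-name record of possible types, most likely first."""

    def __init__(self, known=None):
        self.known = dict(known or {})  # name -> list of type names

    def assign(self, name, rhs_types):
        # An assignment replaces whatever was known about the name.
        self.known[name] = list(rhs_types)

    def fork(self):
        # Each branch starts from a copy of the current state.
        return TypeTracker({n: list(t) for n, t in self.known.items()})

    def merge(self, branches):
        # Collect the types seen per name over all branches and order them
        # by the number of branches they occur in (ties broken by name).
        counts = {}
        for branch in branches:
            for name, types in branch.known.items():
                counts.setdefault(name, Counter()).update(set(types))
        for name, counter in counts.items():
            self.known[name] = sorted(counter, key=lambda t: (-counter[t], t))


tracker = TypeTracker()
tracker.assign("x", ["list"])

if_branch = tracker.fork()
if_branch.assign("x", ["dict"])      # x = {} in the first branch
elif_branch = tracker.fork()
elif_branch.assign("x", ["list"])    # x = [] in the second branch
else_branch = tracker.fork()         # no assignment, x is still a list

tracker.merge([if_branch, elif_branch, else_branch])
possible_types_of_x = tracker.known["x"]
```

After the merge, "list" comes first for x because two of the three branches leave it as a list, and "dict" is kept as the less likely alternative.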
One thing I'm not sure about is how to propagate undeclared function result
types. If we can figure out what type a function returns, we can use that type.
But the function might be declared later in the code, so we wouldn't have that
information early enough. I think that's where two passes are required.
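A toy sketch of why a second pass helps with functions that are declared after their first use. The statement tuples and all function names below are made up for illustration and have nothing to do with Cython's real parse tree; unknown result types simply fall back to "object".

```python
# Toy "module": ("def", func_name, return_type) and
# ("assign", var_name, ("call", func_name)).
module = [
    ("assign", "a", ("call", "make_list")),  # used before it is declared
    ("def", "make_list", "list"),
    ("assign", "b", ("call", "unknown_func")),
]


def single_pass(statements):
    # Resolves calls with only the declarations seen so far.
    signatures, resolved = {}, {}
    for stmt in statements:
        if stmt[0] == "def":
            signatures[stmt[1]] = stmt[2]
        elif stmt[0] == "assign":
            _, name, (_, func) = stmt
            resolved[name] = signatures.get(func, "object")
    return resolved


def two_pass(statements):
    # Pass 1: collect every declared function result type.
    signatures = {s[1]: s[2] for s in statements if s[0] == "def"}
    # Pass 2: resolve assignments with the complete table available.
    resolved = {}
    for stmt in statements:
        if stmt[0] == "assign":
            _, name, (_, func) = stmt
            resolved[name] = signatures.get(func, "object")
    return resolved


one_pass_result = single_pass(module)
two_pass_result = two_pass(module)
```

With a single pass, "a" can only be treated as a generic object because make_list has not been seen yet; with two passes it resolves to "list", while the call to the undeclared function stays generic in both cases.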
Stefan