[Cython] Question on Efficient Indexing
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Tue Sep 2 10:11:26 CEST 2008
T J wrote:
> I was reading: http://wiki.cython.org/tutorials/numpy
>
> and I am confused on the difference between:
>
> A)
>
> cimport numpy as np
> cdef np.ndarray h = np.zeros([xmax, ymax], dtype=DTYPE)
>
> and
>
> B)
>
> cimport numpy as np
> cdef np.ndarray[DTYPE_t, ndim=2] h = np.zeros([xmax, ymax], dtype=DTYPE)
>
>
> In particular, I guess I don't understand what is gained by using A.
> I had thought that the cimport of numpy and the cdef h would be enough
> to give significant benefits. From the article, I understand what B
> improves upon over A, but what does A give us in the first place?
>
I'll have to admit that this was not very well motivated in the tutorial
(feel free to improve it).
First off, the cimport of numpy and "cdef h" (whatever you mean by that?)
will not give you any improvements at all. All the cimport statement
really does is make the declarations in A) and B) available; they still
have to be used. If something is still unclear here, please ask again.
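
To make the point concrete, here is a minimal sketch (not taken from the
tutorial). DTYPE, DTYPE_t and the function names are illustrative
assumptions; the tutorial defines its own DTYPE. It contrasts a function
that merely lives in a module with the cimport against one that actually
uses the declaration from A):

```cython
# Sketch only: DTYPE, DTYPE_t and both function names are assumptions.
import numpy as np      # run-time module, needed for np.zeros
cimport numpy as np     # compile-time declarations, e.g. the np.ndarray type

DTYPE = np.int64
ctypedef np.int64_t DTYPE_t

def untyped(int xmax, int ymax):
    # The cimport is present, but h is an ordinary Python object here,
    # so every attribute lookup goes through the Python API.
    h = np.zeros([xmax, ymax], dtype=DTYPE)
    return h.shape[0]

def typed_a(int xmax, int ymax):
    # A) h is declared as an ndarray, so Cython knows the layout of the
    # array object and can read h.shape[0] without a Python lookup.
    cdef np.ndarray h = np.zeros([xmax, ymax], dtype=DTYPE)
    return h.shape[0]
```

Both functions return the same value; only the generated C code for the
shape lookup differs.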
The reason for introducing A) first was primarily so that things would
be done one step at a time. However, A) does give efficient access to the
parts of "h" that do not involve accessing its items; specifically,
lookups of "h.shape[0]" are much more efficient.
(This is only used for f and g in the example, and even then it doesn't
really matter as it is a constant overhead. I felt that being
inconsistent would be more confusing, but in the specific example
nothing is actually gained by using A) for h.)
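
As a rough illustration of where A) stops helping and B) takes over, the
sketch below fills an array in a loop under each declaration. The
function names and the choice of np.int64 for DTYPE are assumptions made
for the example, not the tutorial's code:

```cython
# Sketch only: DTYPE, DTYPE_t, fill_a and fill_b are illustrative names.
import numpy as np
cimport numpy as np

DTYPE = np.int64
ctypedef np.int64_t DTYPE_t

def fill_a(int xmax, int ymax):
    # A) typed as ndarray only: shape lookups are cheap ...
    cdef np.ndarray h = np.zeros([xmax, ymax], dtype=DTYPE)
    cdef Py_ssize_t x, y
    for x in range(h.shape[0]):
        for y in range(h.shape[1]):
            # ... but item assignment is still a Python-level operation.
            h[x, y] = x + y
    return h

def fill_b(int xmax, int ymax):
    # B) dtype and ndim are known at compile time.
    cdef np.ndarray[DTYPE_t, ndim=2] h = np.zeros([xmax, ymax], dtype=DTYPE)
    cdef Py_ssize_t x, y
    for x in range(h.shape[0]):
        for y in range(h.shape[1]):
            # Indexing can now be compiled to direct buffer access.
            h[x, y] = x + y
    return h
```

The two functions should produce identical arrays; the per-element cost
inside the inner loop is what B) is meant to reduce.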
Another reason for A) is that it makes it possible for the code to
benefit from future improvements to the Cython/NumPy interface. For
instance, when h is typed as ndarray, operations like "h += g" or
"np.asum(h)" have potential for optimizations that don't depend on
knowing the dtype or ndim (but at present such things are only
hypothetical).
Dag Sverre