.. highlight:: cython

.. _memoryviews:

**************************
Typed Memoryviews
**************************

Typed memoryviews can be used for efficient access to buffers, such as NumPy
arrays, without incurring any Python overhead. They are similar to the current
buffer support (``np.ndarray[np.float64_t, ndim=2]``), but have more features
and cleaner syntax. A memoryview can be used in any context (function
parameters, module-level, cdef class attribute, etc.) and can be obtained from
any object that exposes the PEP 3118 buffer interface.

.. Note:: Support is experimental and new in this release; there may be bugs!

Memoryview slices
=================

Copying
-------

Memoryview slices can be obtained as follows::

    cdef int[:, :] myslice = obj

The memoryview slice can then be efficiently indexed and sliced in GIL and
nogil mode. In GIL mode these slices can also be transposed (which gives a new
memoryview slice), or copied to a C or Fortran contiguous array::

    # This slice is C contiguous
    cdef int[:, ::1] c_contiguous_slice = myslice.copy()

    # This slice is Fortran contiguous
    cdef int[::1, :] f_contiguous_slice = myslice.copy_fortran()

    print c_contiguous_slice.is_c_contig()
    print f_contiguous_slice.is_f_contig()

The ``::1`` in the slice type specification indicates in which dimension the
data is contiguous. It can only be used to specify full C or Fortran
contiguity.

Slices can also be copied in place::

    cdef int[:, :, :] to_slice, from_slice
    ...

    # copy the elements in from_slice to to_slice
    to_slice[...] = from_slice

.. Note:: Copying of buffers with ``object`` as the base type is not supported
          yet. Pointer types are not supported at all yet in memoryview slices.

Indexing and Slicing
--------------------

Indexing and slicing can be done with or without the GIL, and work much like
they do in NumPy. If indices are specified for every dimension you will get an
element of the base type (e.g. ``int``), otherwise you will get a new view.
An Ellipsis means you get consecutive slices for every unspecified dimension::

    cdef int[:, :, :] slice = ...

    # These are all equivalent
    slice[10]
    slice[10, :, :]
    slice[10, ...]

Transposing
-----------

If all dimensions are direct (i.e., there are no indirections through
pointers), then the slice can be transposed in the same way that NumPy slices
can be transposed::

    cdef int[:, ::1] c_contig = ...
    cdef int[::1, :] f_contig = c_contig.T

This gives a new, transposed view on the data.

Specifying data layout
======================

Data layout can be specified using the previously seen ``::1`` slice syntax,
or by using any of the constants in ``cython.view``. Two concepts are
involved: data access and data packing. Data access is either direct (no
pointer) or indirect (pointer). Data packing is either strided (e.g. after
slicing, ``a[::2]``) or contiguous (consecutive elements are adjacent in
memory).

If no specifier is given in any dimension, then the data access is assumed to
be direct, and the data packing assumed to be strided. If you don't know
whether a dimension will be direct or indirect (because you're getting an
object with a buffer interface from some library perhaps), then you can
specify the ``generic`` flag, in which case it will be determined at runtime.
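
As a minimal sketch of the default (strided) and contiguous packing described
above, the following assumes NumPy is available and uses a hypothetical
function name, ``packing_demo``. It is an illustration only, not part of the
Cython API::

    import numpy as np

    def packing_demo():
        cdef int[:, :] strided
        cdef int[:, ::1] packed

        # A C contiguous 3x4 array of C ints (np.intc matches C ``int``).
        arr = np.arange(12, dtype=np.intc).reshape(3, 4)

        # No specifier: direct access and strided packing in both
        # dimensions, so a non-contiguous view is accepted.
        strided = arr[:, ::2]

        # ``::1`` in the last dimension additionally requires the buffer
        # to be C contiguous, which holds for ``arr`` itself.
        packed = arr

        return strided.shape[1], packed.shape[1]

Assigning ``arr[:, ::2]`` to ``packed`` instead would be expected to fail at
run time, since that view is no longer contiguous.
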

The flags are as follows:

* generic - strided and direct or indirect
* strided - strided and direct (this is the default)
* indirect - strided and indirect
* contiguous - contiguous and direct
* indirect_contiguous - the list of pointers is contiguous

and they can be used like this::

    from cython cimport view

    # direct access in both dimensions, strided in the first dimension, contiguous in the last
    cdef int[:, ::view.contiguous] a

    # contiguous list of pointers to contiguous lists of ints
    cdef int[::view.indirect_contiguous, ::1] b

    # direct or indirect in the first dimension, direct in the second dimension
    # strided in both dimensions
    cdef int[::view.generic, :] c

Only the first, last or the dimension following an indirect dimension may be
specified contiguous::

    # INVALID
    cdef int[::view.contiguous, ::view.indirect, :] a
    cdef int[::1, ::view.indirect, :] b

    # VALID
    cdef int[::view.indirect, ::1, :] a
    cdef int[::view.indirect, :, ::1] b
    cdef int[::view.indirect_contiguous, ::1, :] c

The difference between the ``contiguous`` flag and the ``::1`` specifier is
that the former specifies contiguity for only one dimension, whereas the
latter specifies contiguity for all following (Fortran) or preceding (C)
dimensions::

    cdef int[:, ::1] c_contig = ...

    # VALID
    cdef int[:, ::view.contiguous] myslice = c_contig[::2]

    # INVALID
    cdef int[:, ::1] myslice = c_contig[::2]

The former case is valid because the last dimension remains contiguous, but
the first dimension does not "follow" the last one anymore (meaning, it was
strided already, but it is not C or Fortran contiguous any longer), since it
was sliced.

Memoryview objects and cython.array
===================================

These typed slices can be converted to Python objects
(``cython.memoryview``), and are indexable, sliceable and transposable in the
same way that the slices are. They can also be converted back to typed slices
at any time.
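
The round trip between a typed slice and a memoryview object can be sketched
as follows. This assumes NumPy is available as the buffer provider, and the
function name ``roundtrip_demo`` is hypothetical::

    import numpy as np

    def roundtrip_demo():
        cdef int[:, :] typed = np.zeros((4, 3), dtype=np.intc)
        cdef int[:, :] typed_again

        # Assigning to an untyped name coerces the slice to a Python
        # memoryview object.
        obj = typed

        # The object can be sliced and transposed from Python space.
        sub = obj[1:, :]
        transposed = obj.T

        # Assigning to a typed name converts it back to a typed slice.
        typed_again = sub

        return transposed.shape, typed_again.shape[0]

No data is copied in either direction; all of these names refer to the buffer
of the original array.
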

Memoryview objects have the following attributes:

* shape
* strides
* suboffsets
* ndim
* size
* itemsize
* nbytes
* base

And of course the aforementioned ``T`` attribute. These attributes have the
same semantics as in NumPy_. For instance, to retrieve the original object::

    import numpy
    cimport numpy as np

    cdef np.int32_t[:] a = numpy.arange(10, dtype=numpy.int32)
    a = a[::2]

    print a, numpy.asarray(a), a.base

    # this prints: <MemoryView of 'ndarray' object> [0 2 4 6 8] [0 1 2 3 4 5 6 7 8 9]

Note that this example returns the original object from which the view was
obtained, and that the view was resliced in the meantime.

Cython Array
============

Whenever a slice is copied (using any of the ``copy`` or ``copy_fortran``
methods), you get a new memoryview slice of a newly created ``cython.array``
object. This array can also be used manually, and will automatically allocate
a block of data. It can later be assigned to a C or Fortran contiguous slice
(or a strided slice). It can be used like::

    import cython

    my_array = cython.array(shape=(10, 2), itemsize=sizeof(int), format="i")
    cdef int[:, :] my_slice = my_array

It also takes an optional argument ``mode`` (``'c'`` or ``'fortran'``) and a
boolean ``allocate_buffer``, which indicates whether a buffer should be
allocated and freed when it goes out of scope::

    # free() comes from the C standard library
    from libc.stdlib cimport free

    cdef cython.array my_array = cython.array(..., mode="fortran", allocate_buffer=False)
    my_array.data = <char *> my_data_pointer

    # define a function that can deallocate the data (if needed)
    my_array.callback_free_data = free

You can also cast pointers to arrays::

    cdef cython.array my_array = <int[:10, :2]> my_data_pointer

Of course, you can also immediately assign a ``cython.array`` to a typed
memoryview slice. The arrays are indexable and sliceable from Python space
just like memoryview objects, and have the same attributes as memoryview
objects.

Coercion to NumPy
=================

Memoryview (and array) objects can be coerced to a NumPy ndarray without
having to copy the data. For example::

    cimport numpy as np
    import numpy as np

    numpy_array = np.asarray(<np.int32_t[:10, :10]> my_pointer)

Of course, you are not restricted to using NumPy's types (such as
``np.int32_t`` here); you can use any usable type.

The future
==========

In the future some functionality may be added for convenience, such as:

1. A NumPy-like ``.flat`` attribute (that allows efficient iteration)
2. Indexing with newaxis or None to introduce a new axis

.. _NumPy: http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#memory-layout