Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
C
cython
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Issues
0
Issues
0
List
Boards
Labels
Milestones
Merge Requests
0
Merge Requests
0
Analytics
Analytics
Repository
Value Stream
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Commits
Issue Boards
Open sidebar
Xavier Thompson
cython
Commits
cdcb14ed
Commit
cdcb14ed
authored
Feb 03, 2015
by
Robert Bradshaw
Browse files
Options
Browse Files
Download
Plain Diff
Merge pull request #358 from larsmans/buffer-doc
Docs: how to implement PEP 3118, the buffer protocol
parents
b2807e3d
ee3e6ae3
Changes
3
Hide whitespace changes
Inline
Side-by-side
Showing
3 changed files
with
194 additions
and
0 deletions
+194
-0
docs/src/userguide/buffer.rst
docs/src/userguide/buffer.rst
+192
-0
docs/src/userguide/index.rst
docs/src/userguide/index.rst
+1
-0
docs/src/userguide/special_methods.rst
docs/src/userguide/special_methods.rst
+1
-0
No files found.
docs/src/userguide/buffer.rst
0 → 100644
View file @
cdcb14ed
.. _buffer:
Implementing the buffer protocol
================================
Cython objects can expose memory buffers to Python code
by implementing the "buffer protocol".
This chapter shows how to implement the protocol
and make use of the memory managed by an extension type from NumPy.
A matrix class
--------------
The following Cython/C++ code implements a matrix of floats,
where the number of columns is fixed at construction time
but rows can be added dynamically.
::
# matrix.pyx
from libcpp.vector cimport vector
cdef class Matrix:
cdef unsigned ncols
cdef vector[float] v
def __cinit__(self, unsigned ncols):
self.ncols = ncols
def add_row(self):
"""Adds a row, initially zero-filled."""
self.v.extend(self.ncols)
There are no members to do anything productive with the matrices' contents.
We could implement custom ``__getitem__``, ``__setitem__``, etc. for this,
but instead we'll use the buffer protocol to expose the matrix's data to Python
so we can use NumPy to do useful work.
Implementing the buffer protocol requires adding two members,
``__getbuffer__`` and ``__releasebuffer__``,
which Cython handles specially.
::
from cpython cimport Py_buffer
from libcpp.vector cimport vector
cdef class Matrix:
cdef Py_ssize_t ncols
cdef Py_ssize_t shape[2]
cdef Py_ssize_t strides[2]
cdef vector[float] v
def __cinit__(self, Py_ssize_t ncols):
self.ncols = ncols
def add_row(self):
"""Adds a row, initially zero-filled."""
self.v.resize(self.v.size() + self.ncols)
def __getbuffer__(self, Py_buffer *buffer, int flags):
cdef Py_ssize_t itemsize = sizeof(self.v[0])
self.shape[0] = self.v.size()
self.shape[1] = self.ncols
# Stride 1 is the distance, in bytes, between two items in a row;
# this is the distance between two adjance items in the vector.
# Stride 0 is the distance between the first elements of adjacent rows.
self.strides[1] = <Py_ssize_t>( <char *>&(self.v[1])
- <char *>&(self.v[0]))
self.strides[0] = self.ncols * self.strides[1]
buffer.buf = <char *>&(self.v[0])
buffer.format = 'f'
buffer.internal = NULL
buffer.itemsize = itemsize
buffer.len = self.v.size() * itemsize
buffer.ndim = 2
buffer.obj = self
buffer.readonly = 0
buffer.shape = self.shape
buffer.strides = self.strides
buffer.suboffsets = NULL
def __releasebuffer__(self, Py_buffer *buffer):
pass
The method ``Matrix.__getbuffer__`` fills a descriptor structure,
called a ``Py_buffer``, that is defined by the Python C-API.
It contains a pointer to the actual buffer in memory,
as well as metadata about the shape of the array and the strides
(step sizes to get from one element or row to the next).
Its ``shape`` and ``strides`` members are pointers
that must be point to arrays of ``Py_ssize_t[ndim]``.
These arrays have to stay alive as long as any buffer views the data,
so we store them on the ``Matrix`` object as members.
The code is not yet complete, but we can already compile it
and test the basic functionality.
::
>>> from matrix import Matrix
>>> import numpy as np
>>> m = Matrix(10)
>>> np.asarray(m)
array([], shape=(0, 10), dtype=float32)
>>> m.add_row()
>>> a = np.asarray(m)
>>> a[:] = 1
>>> m.add_row()
>>> a = np.asarray(m)
>>> a
array([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)
Now we can view the ``Matrix`` as a NumPy ``ndarray``,
and modify its contents using standard NumPy operations.
Memory safety and reference counting
------------------------------------
The ``Matrix`` class as implemented so far is unsafe.
The ``add_row`` operation can move the underlying buffer,
which invalidates any NumPy (or other) view on the data.
If you try to access values after an ``add_row`` call,
you'll get outdated values or a segfault.
This is where ``__releasebuffer__`` comes in.
We can add a reference count to each matrix,
and lock it for mutation whenever a view exists.
::
cdef class Matrix:
# ...
cdef int view_count
def __cinit__(self, Py_ssize_t ncols):
self.ncols = ncols
self.view_count = 0
def add_row(self):
if self.view_count > 0:
raise ValueError("can't add row while being viewed")
self.v.resize(self.v.size() + self.ncols)
def __getbuffer__(self, Py_buffer *buffer, int flags):
# ... as before
self.view_count += 1
def __releasebuffer__(self, Py_buffer *buffer):
self.view_count -= 1
Flags
-----
We skipped some input validation in the code.
The ``flags`` argument to ``__getbuffer__`` comes from ``np.asarray``
(and other clients) and is an OR of boolean flags
that describe the kind of array that is requested.
Strictly speaking, if the flags contain ``PyBUF_ND``, ``PyBUF_SIMPLE``,
or ``PyBUF_F_CONTIGUOUS``, ``__getbuffer__`` must raise a ``BufferError``.
These macros can be ``cimport``'d from the pseudo-package ``cython``.
(The matrix-in-vector structure actually conforms to ``PyBUF_ND``,
but that would prohibit ``__getbuffer__`` from filling in the strides.
A single-row matrix is F-contiguous, but a larger matrix is not.)
References
----------
The buffer interface used here is set out in
`PEP 3118, Revising the buffer protocol
<http://legacy.python.org/dev/peps/pep-3118/>`_
A tutorial for using this API from C is on Jake Vanderplas's blog,
`An Introduction to the Python Buffer Protocol
<https://jakevdp.github.io/blog/2014/05/05/introduction-to-the-python-buffer-protocol/>`_.
Reference documentation is available for
`Python 3 <https://docs.python.org/3/c-api/buffer.html>`_
and `Python 2 <https://docs.python.org/2.7/c-api/buffer.html>`_.
The Py2 documentation also describes an older buffer protocol
that is no longer in use;
since Python 2.6, the PEP 3118 protocol has been implemented,
and the older protocol is only relevant for legacy code.
docs/src/userguide/index.rst
View file @
cdcb14ed
...
...
@@ -19,6 +19,7 @@ Contents:
limitations
pyrex_differences
memoryviews
buffer
parallelism
debugging
...
...
docs/src/userguide/special_methods.rst
View file @
cdcb14ed
...
...
@@ -372,6 +372,7 @@ Descriptor objects (see note 2)
accessible from Python. It is described in the Python/C API Reference Manual
of Python 2.x under sections 6.6 and 10.6. It was superseded by the new
PEP 3118 buffer protocol in Python 2.6 and is no longer available in Python 3.
For a how-to guide to the new API, see :ref:`buffer`.
.. note:: (2) Descriptor objects are part of the support mechanism for new-style
Python classes. See the discussion of descriptors in the Python documentation.
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment