author melissawm <melissawm.github@gmail.com> 2021-11-09 17:59:50 -0300
committer melissawm <melissawm.github@gmail.com> 2021-11-22 11:25:20 -0300
commit 7ce32d6188fcb76ad4790dd9679abdb3b7a6dacf
tree 9e021af1994e9edd416fbaf80fe4f1a19b58f968
parent fc56a4d93943c8ac3b047b33b65e5bb554ae62ab
Addressing review comments
-rw-r--r-- doc/source/reference/arrays.classes.rst 1
-rw-r--r-- doc/source/reference/arrays.interface.rst 32
-rw-r--r-- doc/source/user/basics.interoperability.rst 122
3 files changed, 85 insertions, 70 deletions
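
Note on the revised ``__array__()`` section in basics.interoperability.rst: the text says that an object implementing ``__array__()`` can be used as a NumPy array, at the cost of losing the original object's type and attributes. The following is a minimal sketch of that behavior. The ``Celsius`` class and its ``unit`` attribute are hypothetical and exist only for illustration; the ``copy`` keyword is accepted because newer NumPy releases may pass it, and this simple version always builds a new array.

```python
import numpy as np


class Celsius:
    """Hypothetical container used only for illustration; not part of NumPy."""

    unit = "degC"  # metadata that a plain ndarray does not carry

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Always build a fresh ndarray from the stored values.
        return np.array(self._values, dtype=dtype)


temps = Celsius([20.0, 21.5, 19.0])

# NumPy falls back to __array__() to coerce the foreign object.
arr = np.asarray(temps)

# NumPy functions coerce the object the same way before computing.
mean = np.mean(temps)

# The result is a plain ndarray: the Celsius type and its metadata are gone.
has_unit = hasattr(arr, "unit")
```

Here ``arr`` is an ordinary ``ndarray`` and ``has_unit`` is ``False``, which is the loss of metadata the revised paragraph describes.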
diff --git a/doc/source/reference/arrays.classes.rst b/doc/source/reference/arrays.classes.rst
index 92c271f6b..d79c2e78a 100644
--- a/doc/source/reference/arrays.classes.rst
+++ b/doc/source/reference/arrays.classes.rst
@@ -42,6 +42,7 @@ however, of why your subroutine may not be able to handle an arbitrary
subclass of an array is that matrices redefine the "*" operator to be
matrix-multiplication, rather than element-by-element multiplication.
+.. _special-attributes-and-methods:
Special attributes and methods
==============================
diff --git a/doc/source/reference/arrays.interface.rst b/doc/source/reference/arrays.interface.rst
index 6a8c5f9c4..25cfa3de1 100644
--- a/doc/source/reference/arrays.interface.rst
+++ b/doc/source/reference/arrays.interface.rst
@@ -4,18 +4,18 @@
.. _arrays.interface:
-*******************
-The Array Interface
-*******************
+****************************
+The array interface protocol
+****************************
.. note::
- This page describes the numpy-specific API for accessing the contents of
- a numpy array from other C extensions. :pep:`3118` --
+ This page describes the NumPy-specific API for accessing the contents of
+ a NumPy array from other C extensions. :pep:`3118` --
:c:func:`The Revised Buffer Protocol <PyObject_GetBuffer>` introduces
similar, standardized API to Python 2.6 and 3.0 for any extension
module to use. Cython__'s buffer array support
- uses the :pep:`3118` API; see the `Cython numpy
+ uses the :pep:`3118` API; see the `Cython NumPy
tutorial`__. Cython provides a way to write code that supports the buffer
protocol with Python versions older than 2.6 because it has a
backward-compatible implementation utilizing the array interface
@@ -81,7 +81,8 @@ This approach to the interface consists of the object having an
===== ================================================================
``t`` Bit field (following integer gives the number of
bits in the bit field).
- ``b`` Boolean (integer type where all values are only True or False)
+ ``b`` Boolean (integer type where all values are only ``True`` or
+ ``False``)
``i`` Integer
``u`` Unsigned integer
``f`` Floating point
@@ -141,11 +142,11 @@ This approach to the interface consists of the object having an
must be stored by the new object if the memory area is to be
secured.
- **Default**: None
+ **Default**: ``None``
**strides** (optional)
Either ``None`` to indicate a C-style contiguous array or
- a Tuple of strides which provides the number of bytes needed
+ a tuple of strides which provides the number of bytes needed
to jump to the next array element in the corresponding
dimension. Each entry must be an integer (a Python
:py:class:`int`). As with shape, the values may
@@ -156,26 +157,26 @@ This approach to the interface consists of the object having an
memory buffer. In this model, the last dimension of the array
varies the fastest. For example, the default strides tuple
for an object whose array entries are 8 bytes long and whose
- shape is ``(10, 20, 30)`` would be ``(4800, 240, 8)``
+ shape is ``(10, 20, 30)`` would be ``(4800, 240, 8)``.
**Default**: ``None`` (C-style contiguous)
**mask** (optional)
- None or an object exposing the array interface. All
+ ``None`` or an object exposing the array interface. All
elements of the mask array should be interpreted only as true
or not true indicating which elements of this array are valid.
The shape of this object should be `"broadcastable"
<arrays.broadcasting.broadcastable>` to the shape of the
original array.
- **Default**: None (All array values are valid)
+ **Default**: ``None`` (All array values are valid)
**offset** (optional)
An integer offset into the array data region. This can only be
used when data is ``None`` or returns a :class:`buffer`
object.
- **Default**: 0.
+ **Default**: ``0``.
**version** (required)
An integer showing the version of the interface (i.e. 3 for
@@ -243,6 +244,11 @@ flag is present.
returning the :c:type:`PyCapsule`, and configure a destructor to decref this
reference.
+.. note::
+
+ :obj:`__array_struct__` is considered legacy and should not be used for new
+ code. Use the :py:doc:`buffer protocol <c-api/buffer>` instead.
+
Type description examples
=========================
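
Note on the ``strides`` hunk in the arrays.interface.rst diff above: the default C-contiguous strides quoted there, ``(4800, 240, 8)`` for shape ``(10, 20, 30)`` with 8-byte entries, can be reproduced with a short calculation. The following is a minimal sketch; ``c_contiguous_strides`` is a hypothetical helper written for illustration, not a NumPy function.

```python
def c_contiguous_strides(shape, itemsize):
    """Hypothetical helper: byte strides of a C-style contiguous array.

    The last dimension varies fastest, so its stride is the item size;
    each earlier stride is the next stride times the next dimension's length.
    """
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


# Shape and item size taken from the documentation text.
default_strides = c_contiguous_strides((10, 20, 30), 8)
print(default_strides)  # (4800, 240, 8)
```

For comparison, ``np.zeros((10, 20, 30)).strides`` reports the same tuple for NumPy's default 8-byte float dtype.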
diff --git a/doc/source/user/basics.interoperability.rst b/doc/source/user/basics.interoperability.rst
index 444574e32..eeb7492ef 100644
--- a/doc/source/user/basics.interoperability.rst
+++ b/doc/source/user/basics.interoperability.rst
@@ -3,9 +3,9 @@
Interoperability with NumPy
***************************
-NumPy’s ndarray objects provide both a high-level API for operations on
+NumPy's ndarray objects provide both a high-level API for operations on
array-structured data and a concrete implementation of the API based on
-`strided in-RAM storage <https://numpy.org/doc/stable/reference/arrays.html>`__.
+:ref:`strided in-RAM storage <arrays>`.
While this API is powerful and fairly general, its concrete implementation has
limitations. As datasets grow and NumPy becomes used in a variety of new
environments and architectures, there are cases where the strided in-RAM storage
@@ -29,44 +29,39 @@ Using arbitrary objects in NumPy
When NumPy functions encounter a foreign object, they will try (in order):
-1. The buffer protocol, described `in the Python C-API documentation
- <https://docs.python.org/3/c-api/buffer.html>`__.
+1. The buffer protocol, described :py:doc:`in the Python C-API documentation
+ <c-api/buffer>`.
2. The ``__array_interface__`` protocol, described
- :ref:`in this page <arrays.interface>`. A precursor to Python’s buffer
+ :ref:`in this page <arrays.interface>`. A precursor to Python's buffer
protocol, it defines a way to access the contents of a NumPy array from other
C extensions.
-3. The ``__array__`` protocol, which asks an arbitrary object to convert itself
- into an array.
+3. The ``__array__()`` method, which asks an arbitrary object to convert
+ itself into an array.
For both the buffer and the ``__array_interface__`` protocols, the object
describes its memory layout and NumPy does everything else (zero-copy if
-possible). If that’s not possible, the object itself is responsible for
+possible). If that's not possible, the object itself is responsible for
returning a ``ndarray`` from ``__array__()``.
-The array interface
-~~~~~~~~~~~~~~~~~~~
+The array interface protocol
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The :ref:`array interface <arrays.interface>` defines a protocol for array-like
-objects to re-use each other’s data buffers. Its implementation relies on the
-existence of the following attributes or methods:
+The :ref:`array interface protocol <arrays.interface>` defines a way for
+array-like objects to re-use each other's data buffers. Its implementation
+relies on the existence of the following attributes or methods:
- ``__array_interface__``: a Python dictionary containing the shape, the
element type, and optionally, the data buffer address and the strides of an
array-like object;
- ``__array__()``: a method returning the NumPy ndarray view of an array-like
object;
-- ``__array_struct__``: a ``PyCapsule`` containing a pointer to a
- ``PyArrayInterface`` C-structure.
-The ``__array_interface__`` and ``__array_struct__`` attributes can be inspected
-directly:
+The ``__array_interface__`` attribute can be inspected directly:
>>> import numpy as np
>>> x = np.array([1, 2, 5.0, 8])
>>> x.__array_interface__
{'data': (94708397920832, False), 'strides': None, 'descr': [('', '<f8')], 'typestr': '<f8', 'shape': (4,), 'version': 3}
- >>> x.__array_struct__
- <capsule object NULL at 0x7f798800be40>
The ``__array_interface__`` attribute can also be used to manipulate the object
data in place:
@@ -96,21 +91,20 @@ We can check that ``arr`` and ``new_arr`` share the same data buffer:
array([1000, 2, 3, 4])
-The ``__array__`` protocol
+The ``__array__()`` method
~~~~~~~~~~~~~~~~~~~~~~~~~~
-The ``__array__`` protocol acts as a dispatch mechanism and ensures that any
-NumPy-like object (an array, any object exposing the array interface, an object
-whose ``__array__`` method returns an array or any nested sequence) that
-implements it can be used as a NumPy array. If possible, this will mean using
-``__array__`` to create a NumPy ndarray view of the array-like object.
-Otherwise, this copies the data into a new ndarray object. This is not optimal,
-as coercing arrays into ndarrays may cause performance problems or create the
-need for copies and loss of metadata.
+The ``__array__()`` method ensures that any NumPy-like object (an array, any
+object exposing the array interface, an object whose ``__array__()`` method
+returns an array or any nested sequence) that implements it can be used as a
+NumPy array. If possible, this will mean using ``__array__()`` to create a NumPy
+ndarray view of the array-like object. Otherwise, this copies the data into a
+new ndarray object. This is not optimal, as coercing arrays into ndarrays may
+cause performance problems or create the need for copies and loss of metadata,
+as the original object and any attributes/behavior it may have had, is lost.
-To see an example of a custom array implementation including the use of the
-``__array__`` protocol, see `Writing custom array containers
-<https://numpy.org/devdocs/user/basics.dispatch.html>`__.
+To see an example of a custom array implementation including the use of
+``__array__()``, see :ref:`basics.dispatch`.
Operating on foreign objects without converting
-----------------------------------------------
@@ -121,7 +115,11 @@ Consider the following function.
>>> def f(x):
... return np.mean(np.exp(x))
-We can apply it to a NumPy ndarray object directly:
+Note that `np.exp` is a :ref:`ufunc <ufuncs-basics>`, which means that it
+operates on ndarrays in an element-by-element fashion. On the other hand,
+`np.mean` operates along one of the array's axes.
+
+We can apply ``f`` to a NumPy ndarray object directly:
>>> x = np.array([1, 2, 3, 4])
>>> f(x)
@@ -149,9 +147,13 @@ The ``__array_ufunc__`` protocol
A :ref:`universal function (or ufunc for short) <ufuncs-basics>` is a
“vectorized” wrapper for a function that takes a fixed number of specific inputs
and produces a fixed number of specific outputs. The output of the ufunc (and
-its methods) is not necessarily an ndarray, if all input arguments are not
+its methods) is not necessarily an ndarray, if not all input arguments are
ndarrays. Indeed, if any input defines an ``__array_ufunc__`` method, control
-will be passed completely to that function, i.e., the ufunc is overridden.
+will be passed completely to that function, i.e., the ufunc is overridden. The
+``__array_ufunc__`` method defined on that (non-ndarray) object has access to
+the NumPy ufunc. Because ufuncs have a well-defined structure, the foreign
+``__array_ufunc__`` method may rely on ufunc attributes like ``.at()``,
+``.reduce()``, and others.
A subclass can override what happens when executing NumPy ufuncs on it by
overriding the default ``ndarray.__array_ufunc__`` method. This method is
@@ -169,9 +171,7 @@ is safe and consistent across projects.
The semantics of ``__array_function__`` are very similar to ``__array_ufunc__``,
except the operation is specified by an arbitrary callable object rather than a
-ufunc instance and method. For more details, see `NEP 18
-<https://numpy.org/neps/nep-0018-array-function-protocol.html>`__.
-
+ufunc instance and method. For more details, see :ref:`NEP18`.
Interoperability examples
-------------------------
@@ -223,7 +223,7 @@ Example: PyTorch tensors
`PyTorch <https://pytorch.org/>`__ is an optimized tensor library for deep
learning using GPUs and CPUs. PyTorch arrays are commonly called *tensors*.
-Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or
+Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or
other hardware accelerators. In fact, tensors and NumPy arrays can often share
the same underlying memory, eliminating the need to copy data.
@@ -251,13 +251,22 @@ explicit conversion:
Also, note that the return type of this function is compatible with the initial
data type.
-**Note** PyTorch does not implement ``__array_function__`` or
-``__array_ufunc__``. Under the hood, the ``Tensor.__array__()`` method returns a
-NumPy ndarray as a view of the tensor data buffer. See `this issue
-<https://github.com/pytorch/pytorch/issues/24015>`__ and the
-`__torch_function__ implementation
-<https://github.com/pytorch/pytorch/blob/master/torch/overrides.py>`__
-for details.
+.. admonition:: Warning
+
+ While this mixing of ndarrays and tensors may be convenient, it is not
+ recommended. It will not work for non-CPU tensors, and will have unexpected
+ behavior in corner cases. Users should prefer explicitly converting the
+ ndarray to a tensor.
+
+.. note::
+
+ PyTorch does not implement ``__array_function__`` or ``__array_ufunc__``.
+ Under the hood, the ``Tensor.__array__()`` method returns a NumPy ndarray as
+ a view of the tensor data buffer. See `this issue
+ <https://github.com/pytorch/pytorch/issues/24015>`__ and the
+ `__torch_function__ implementation
+ <https://github.com/pytorch/pytorch/blob/master/torch/overrides.py>`__
+ for details.
Example: CuPy arrays
~~~~~~~~~~~~~~~~~~~~
@@ -271,7 +280,8 @@ with Python. CuPy implements a subset of the NumPy interface by implementing
>>> x_gpu = cp.array([1, 2, 3, 4])
The ``cupy.ndarray`` object implements the ``__array_ufunc__`` interface. This
-enables NumPy ufuncs to be directly operated on CuPy arrays:
+enables NumPy ufuncs to be applied to CuPy arrays (this will defer operation to
+the matching CuPy CUDA/ROCm implementation of the ufunc):
>>> np.mean(np.exp(x_gpu))
array(21.19775622)
@@ -307,8 +317,7 @@ implements a subset of the NumPy ndarray interface using blocked algorithms,
cutting up the large array into many small arrays. This allows computations on
larger-than-memory arrays using multiple cores.
-Dask supports array protocols like ``__array__`` and
-``__array_ufunc__``.
+Dask supports ``__array__()`` and ``__array_ufunc__``.
>>> import dask.array as da
>>> x = da.random.normal(1, 0.1, size=(20, 20), chunks=(10, 10))
@@ -317,8 +326,10 @@ Dask supports array protocols like ``__array__`` and
>>> np.mean(np.exp(x)).compute()
5.090097550553843
-**Note** Dask is lazily evaluated, and the result from a computation isn’t
-computed until you ask for it by invoking ``compute()``.
+.. note::
+
+ Dask is lazily evaluated, and the result from a computation isn't computed
+ until you ask for it by invoking ``compute()``.
See `the Dask array documentation
<https://docs.dask.org/en/stable/array.html>`__
@@ -328,13 +339,10 @@ and the `scope of Dask arrays interoperability with NumPy arrays
Further reading
---------------
-- `The Array interface
- <https://numpy.org/doc/stable/reference/arrays.interface.html>`__
-- `Writing custom array containers
- <https://numpy.org/devdocs/user/basics.dispatch.html>`__.
-- `Special array attributes
- <https://numpy.org/devdocs/reference/arrays.classes.html#special-attributes-and-methods>`__
- (details on the ``__array_ufunc__`` and ``__array_function__`` protocols)
+- :ref:`arrays.interface`
+- :ref:`basics.dispatch`
+- :ref:`special-attributes-and-methods` (details on the ``__array_ufunc__`` and
+ ``__array_function__`` protocols)
- `NumPy roadmap: interoperability
<https://numpy.org/neps/roadmap.html#interoperability>`__
- `PyTorch documentation on the Bridge with NumPy