author     Matti Picus <matti.picus@gmail.com>        2020-09-02 21:01:35 +0300
committer  GitHub <noreply@github.com>                2020-09-02 13:01:35 -0500
commit     1f8ce6341159ebb0731c2c262f4576609210d2c8 (patch)
tree       aa10443358243366d776ad669c0cbcd383ed3634 /doc/source/user
parent     e3c84a44b68966ab887a3623a0ff57169e508deb (diff)
MAINT, DOC: move informational files from numpy.doc.*.py to their *.rst counterparts (#17222)
* DOC: redistribute docstring-only content from numpy/doc
* DOC: post-transition clean-up
* DOC, MAINT: reskip doctests, fix a few easy ones
Diffstat (limited to 'doc/source/user')
-rw-r--r--  doc/source/user/basics.broadcasting.rst   176
-rw-r--r--  doc/source/user/basics.byteswapping.rst   150
-rw-r--r--  doc/source/user/basics.creation.rst       139
-rw-r--r--  doc/source/user/basics.dispatch.rst       266
-rw-r--r--  doc/source/user/basics.indexing.rst       452
-rw-r--r--  doc/source/user/basics.io.genfromtxt.rst   26
-rw-r--r--  doc/source/user/basics.rec.rst            643
-rw-r--r--  doc/source/user/basics.subclassing.rst    749
-rw-r--r--  doc/source/user/basics.types.rst          337
-rw-r--r--  doc/source/user/misc.rst                  222
-rw-r--r--  doc/source/user/whatisnumpy.rst             2
11 files changed, 3138 insertions, 24 deletions
diff --git a/doc/source/user/basics.broadcasting.rst b/doc/source/user/basics.broadcasting.rst
index 00bf17a41..5eae3eb32 100644
--- a/doc/source/user/basics.broadcasting.rst
+++ b/doc/source/user/basics.broadcasting.rst
@@ -10,4 +10,178 @@ Broadcasting
:ref:`array-broadcasting-in-numpy`
An introduction to the concepts discussed here
-.. automodule:: numpy.doc.broadcasting
+.. note::
+ See `this article
+ <https://numpy.org/devdocs/user/theory.broadcasting.html>`_
+ for illustrations of broadcasting concepts.
+
+
+The term broadcasting describes how numpy treats arrays with different
+shapes during arithmetic operations. Subject to certain constraints,
+the smaller array is "broadcast" across the larger array so that they
+have compatible shapes. Broadcasting provides a means of vectorizing
+array operations so that looping occurs in C instead of Python. It does
+this without making needless copies of data and usually leads to
+efficient algorithm implementations. There are, however, cases where
+broadcasting is a bad idea because it leads to inefficient use of memory
+that slows computation.
+
+NumPy operations are usually done on pairs of arrays on an
+element-by-element basis. In the simplest case, the two arrays must
+have exactly the same shape, as in the following example:
+
+ >>> a = np.array([1.0, 2.0, 3.0])
+ >>> b = np.array([2.0, 2.0, 2.0])
+ >>> a * b
+ array([ 2., 4., 6.])
+
+NumPy's broadcasting rule relaxes this constraint when the arrays'
+shapes meet certain conditions. The simplest broadcasting example occurs
+when an array and a scalar value are combined in an operation:
+
+>>> a = np.array([1.0, 2.0, 3.0])
+>>> b = 2.0
+>>> a * b
+array([ 2., 4., 6.])
+
+The result is equivalent to the previous example where ``b`` was an array.
+We can think of the scalar ``b`` being *stretched* during the arithmetic
+operation into an array with the same shape as ``a``. The new elements in
+``b`` are simply copies of the original scalar. The stretching analogy is
+only conceptual. NumPy is smart enough to use the original scalar value
+without actually making copies so that broadcasting operations are as
+memory and computationally efficient as possible.
+
+The code in the second example is more efficient than that in the first
+because broadcasting moves less memory around during the multiplication
+(``b`` is a scalar rather than an array).
+
+General Broadcasting Rules
+==========================
+When operating on two arrays, NumPy compares their shapes element-wise.
+It starts with the trailing (i.e. rightmost) dimensions and works its
+way left. Two dimensions are compatible when
+
+1) they are equal, or
+2) one of them is 1
+
+If these conditions are not met, a
+``ValueError: operands could not be broadcast together`` exception is
+raised, indicating that the arrays have incompatible shapes. The size
+of the resulting array along each axis is the larger of the two input
+sizes (i.e. the size that is not 1).
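+
+One way to check these rules without performing any arithmetic is
+``np.broadcast``, which computes the broadcast shape of its arguments:
+
+>>> np.broadcast(np.ones((5, 4)), np.ones(4)).shape
+(5, 4)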
+
+Arrays do not need to have the same *number* of dimensions. For example,
+if you have a ``256x256x3`` array of RGB values, and you want to scale
+each color in the image by a different value, you can multiply the image
+by a one-dimensional array with 3 values. Lining up the sizes of the
+trailing axes of these arrays according to the broadcast rules shows that
+they are compatible::
+
+ Image (3d array): 256 x 256 x 3
+ Scale (1d array): 3
+ Result (3d array): 256 x 256 x 3
+
+When either of the dimensions compared is one, the other is
+used. In other words, dimensions with size 1 are stretched or "copied"
+to match the other.
+
+In the following example, both the ``A`` and ``B`` arrays have axes with
+length one that are expanded to a larger size during the broadcast
+operation::
+
+ A (4d array): 8 x 1 x 6 x 1
+ B (3d array): 7 x 1 x 5
+ Result (4d array): 8 x 7 x 6 x 5
+
+Here are some more examples::
+
+ A (2d array): 5 x 4
+ B (1d array): 1
+ Result (2d array): 5 x 4
+
+ A (2d array): 5 x 4
+ B (1d array): 4
+ Result (2d array): 5 x 4
+
+ A (3d array): 15 x 3 x 5
+ B (3d array): 15 x 1 x 5
+ Result (3d array): 15 x 3 x 5
+
+ A (3d array): 15 x 3 x 5
+ B (2d array): 3 x 5
+ Result (3d array): 15 x 3 x 5
+
+ A (3d array): 15 x 3 x 5
+ B (2d array): 3 x 1
+ Result (3d array): 15 x 3 x 5
+
+Here are examples of shapes that do not broadcast::
+
+ A (1d array): 3
+ B (1d array): 4 # trailing dimensions do not match
+
+ A (2d array): 2 x 1
+ B (3d array): 8 x 4 x 3 # second from last dimensions mismatched
+
+An example of broadcasting in practice::
+
+ >>> x = np.arange(4)
+ >>> xx = x.reshape(4,1)
+ >>> y = np.ones(5)
+ >>> z = np.ones((3,4))
+
+ >>> x.shape
+ (4,)
+
+ >>> y.shape
+ (5,)
+
+ >>> x + y
+ ValueError: operands could not be broadcast together with shapes (4,) (5,)
+
+ >>> xx.shape
+ (4, 1)
+
+ >>> y.shape
+ (5,)
+
+ >>> (xx + y).shape
+ (4, 5)
+
+ >>> xx + y
+ array([[ 1., 1., 1., 1., 1.],
+ [ 2., 2., 2., 2., 2.],
+ [ 3., 3., 3., 3., 3.],
+ [ 4., 4., 4., 4., 4.]])
+
+ >>> x.shape
+ (4,)
+
+ >>> z.shape
+ (3, 4)
+
+ >>> (x + z).shape
+ (3, 4)
+
+ >>> x + z
+ array([[ 1., 2., 3., 4.],
+ [ 1., 2., 3., 4.],
+ [ 1., 2., 3., 4.]])
+
+Broadcasting provides a convenient way of taking the outer product (or
+any other outer operation) of two arrays. The following example shows an
+outer addition operation of two 1-d arrays::
+
+ >>> a = np.array([0.0, 10.0, 20.0, 30.0])
+ >>> b = np.array([1.0, 2.0, 3.0])
+ >>> a[:, np.newaxis] + b
+ array([[ 1., 2., 3.],
+ [ 11., 12., 13.],
+ [ 21., 22., 23.],
+ [ 31., 32., 33.]])
+
+Here the ``newaxis`` index operator inserts a new axis into ``a``,
+making it a two-dimensional ``4x1`` array. Combining the ``4x1`` array
+with ``b``, which has shape ``(3,)``, yields a ``4x3`` array.
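+
+For binary ufuncs, the same result can be obtained with the ufunc's
+``outer`` method:
+
+>>> np.add.outer(a, b)
+array([[ 1., 2., 3.],
+ [ 11., 12., 13.],
+ [ 21., 22., 23.],
+ [ 31., 32., 33.]])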
+
+
diff --git a/doc/source/user/basics.byteswapping.rst b/doc/source/user/basics.byteswapping.rst
index 4b1008df3..fecdb9ee8 100644
--- a/doc/source/user/basics.byteswapping.rst
+++ b/doc/source/user/basics.byteswapping.rst
@@ -2,4 +2,152 @@
Byte-swapping
*************
-.. automodule:: numpy.doc.byteswapping
+Introduction to byte ordering and ndarrays
+==========================================
+
+The ``ndarray`` is an object that provides a Python array interface to
+data in memory.
+
+It often happens that the memory that you want to view with an array is
+not of the same byte ordering as the computer on which you are running
+Python.
+
+For example, I might be working on a computer with a little-endian CPU -
+such as an Intel Pentium, but I have loaded some data from a file
+written by a computer that is big-endian. Let's say I have loaded 4
+bytes from a file written by a Sun (big-endian) computer. I know that
+these 4 bytes represent two 16-bit integers. On a big-endian machine, a
+two-byte integer is stored with the Most Significant Byte (MSB) first,
+and then the Least Significant Byte (LSB). Thus the bytes are, in memory order:
+
+#. MSB integer 1
+#. LSB integer 1
+#. MSB integer 2
+#. LSB integer 2
+
+Let's say the two integers were in fact 1 and 770. Because 770 = 256 *
+3 + 2, the 4 bytes in memory would contain respectively: 0, 1, 3, 2.
+The bytes I have loaded from the file would have these contents:
+
+>>> big_end_buffer = bytearray([0,1,3,2])
+>>> big_end_buffer
+bytearray(b'\x00\x01\x03\x02')
+
+We might want to use an ``ndarray`` to access these integers. In that
+case, we can create an array around this memory, and tell numpy that
+there are two integers, and that they are 16 bit and big-endian:
+
+>>> import numpy as np
+>>> big_end_arr = np.ndarray(shape=(2,),dtype='>i2', buffer=big_end_buffer)
+>>> big_end_arr[0]
+1
+>>> big_end_arr[1]
+770
+
+Note the array ``dtype`` above of ``>i2``. The ``>`` means 'big-endian'
+(``<`` is little-endian) and ``i2`` means 'signed 2-byte integer'. For
+example, if our data represented a single unsigned 4-byte little-endian
+integer, the dtype string would be ``<u4``.
+
+In fact, why don't we try that?
+
+>>> little_end_u4 = np.ndarray(shape=(1,),dtype='<u4', buffer=big_end_buffer)
+>>> little_end_u4[0] == 1 * 256**1 + 3 * 256**2 + 2 * 256**3
+True
+
+Returning to our ``big_end_arr`` - in this case our underlying data is
+big-endian (data endianness) and we've set the dtype to match (the dtype
+is also big-endian). However, sometimes you need to flip these around.
+
+.. warning::
+
+ Scalars currently do not include byte order information, so extracting
+ a scalar from an array will return an integer in native byte order.
+ Hence:
+
+ >>> big_end_arr[0].dtype.byteorder == little_end_u4[0].dtype.byteorder
+ True
+
+Changing byte ordering
+======================
+
+As you can imagine from the introduction, there are two ways you can
+affect the relationship between the byte ordering of the array and the
+underlying memory it is looking at:
+
+* Change the byte-ordering information in the array dtype so that it
+ interprets the underlying data as being in a different byte order.
+ This is the role of ``arr.newbyteorder()``
+* Change the byte-ordering of the underlying data, leaving the dtype
+ interpretation as it was. This is what ``arr.byteswap()`` does.
+
+The common situations in which you need to change byte ordering are:
+
+#. Your data and dtype endianness don't match, and you want to change
+   the dtype so that it matches the data.
+#. Your data and dtype endianness don't match, and you want to swap the
+   data so that they match the dtype.
+#. Your data and dtype endianness match, but you want the data swapped
+   and the dtype to reflect this.
+
+Data and dtype endianness don't match, change dtype to match data
+-----------------------------------------------------------------
+
+We make something where they don't match:
+
+>>> wrong_end_dtype_arr = np.ndarray(shape=(2,),dtype='<i2', buffer=big_end_buffer)
+>>> wrong_end_dtype_arr[0]
+256
+
+The obvious fix for this situation is to change the dtype so it gives
+the correct endianness:
+
+>>> fixed_end_dtype_arr = wrong_end_dtype_arr.newbyteorder()
+>>> fixed_end_dtype_arr[0]
+1
+
+Note the array has not changed in memory:
+
+>>> fixed_end_dtype_arr.tobytes() == big_end_buffer
+True
+
+Data and dtype endianness don't match, change data to match dtype
+------------------------------------------------------------------
+
+You might want to do this if you need the data in memory to be a certain
+ordering. For example you might be writing the memory out to a file
+that needs a certain byte ordering.
+
+>>> fixed_end_mem_arr = wrong_end_dtype_arr.byteswap()
+>>> fixed_end_mem_arr[0]
+1
+
+Now the array *has* changed in memory:
+
+>>> fixed_end_mem_arr.tobytes() == big_end_buffer
+False
+
+Data and dtype endianness match, swap data and dtype
+----------------------------------------------------
+
+You may have a correctly specified array dtype, but you need the array
+to have the opposite byte order in memory, and you want the dtype to
+match so the array values make sense. In this case you just do both of
+the previous operations:
+
+>>> swapped_end_arr = big_end_arr.byteswap().newbyteorder()
+>>> swapped_end_arr[0]
+1
+>>> swapped_end_arr.tobytes() == big_end_buffer
+False
+
+An easier way to cast the data to a specific dtype and byte ordering
+is the ndarray astype method:
+
+>>> swapped_end_arr = big_end_arr.astype('<i2')
+>>> swapped_end_arr[0]
+1
+>>> swapped_end_arr.tobytes() == big_end_buffer
+False
+
+
diff --git a/doc/source/user/basics.creation.rst b/doc/source/user/basics.creation.rst
index b3fa81017..671a8ec59 100644
--- a/doc/source/user/basics.creation.rst
+++ b/doc/source/user/basics.creation.rst
@@ -6,4 +6,141 @@ Array creation
.. seealso:: :ref:`Array creation routines <routines.array-creation>`
-.. automodule:: numpy.doc.creation
+Introduction
+============
+
+There are 5 general mechanisms for creating arrays:
+
+1) Conversion from other Python structures (e.g., lists, tuples)
+2) Intrinsic numpy array creation objects (e.g., arange, ones, zeros,
+ etc.)
+3) Reading arrays from disk, either from standard or custom formats
+4) Creating arrays from raw bytes through the use of strings or buffers
+5) Use of special library functions (e.g., random)
+
+This section will not cover means of replicating, joining, or otherwise
+expanding or mutating existing arrays. Nor will it cover creating object
+arrays or structured arrays. Both of those are covered in their own sections.
+
+Converting Python array_like Objects to NumPy Arrays
+====================================================
+
+In general, numerical data arranged in an array-like structure in Python can
+be converted to arrays through the use of the array() function. The most
+obvious examples are lists and tuples. See the documentation for array() for
+details for its use. Some objects may support the array-protocol and allow
+conversion to arrays this way. A simple way to find out if the object can be
+converted to a numpy array using array() is to try it interactively and
+see if it works! (The Python Way).
+
+Examples: ::
+
+ >>> x = np.array([2, 3, 1, 0])
+ >>> # note mix of tuple and lists, and types
+ >>> x = np.array([[1, 2.0], [0, 0], (1+1j, 3.)])
+ >>> x
+ array([[ 1.+0.j, 2.+0.j],
+        [ 0.+0.j, 0.+0.j],
+        [ 1.+1.j, 3.+0.j]])
+
+Intrinsic NumPy Array Creation
+==============================
+
+NumPy has built-in functions for creating arrays from scratch:
+
+zeros(shape) will create an array filled with 0 values with the specified
+shape. The default dtype is float64. ::
+
+ >>> np.zeros((2, 3))
+ array([[ 0., 0., 0.],
+        [ 0., 0., 0.]])
+
+ones(shape) will create an array filled with 1 values. It is identical to
+zeros in all other respects.
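+
+For example: ::
+
+ >>> np.ones((2, 3))
+ array([[ 1., 1., 1.],
+        [ 1., 1., 1.]])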
+
+arange() will create arrays with regularly incrementing values. Check the
+docstring for complete information on the various ways it can be used. A few
+examples will be given here: ::
+
+ >>> np.arange(10)
+ array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
+ >>> np.arange(2, 10, dtype=float)
+ array([ 2., 3., 4., 5., 6., 7., 8., 9.])
+ >>> np.arange(2, 3, 0.1)
+ array([ 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
+
+Note that there are some subtleties regarding the last usage that the user
+should be aware of that are described in the arange docstring.
+
+linspace() will create arrays with a specified number of elements, and
+spaced equally between the specified beginning and end values. For
+example: ::
+
+ >>> np.linspace(1., 4., 6)
+ array([ 1. , 1.6, 2.2, 2.8, 3.4, 4. ])
+
+The advantage of this creation function is that one can guarantee the
+number of elements and the starting and end point, which arange()
+generally will not do for arbitrary start, stop, and step values.
+
+indices() will create a set of arrays (stacked as a one-higher dimensioned
+array), one per dimension with each representing variation in that dimension.
+An example illustrates much better than a verbal description: ::
+
+ >>> np.indices((3,3))
+ array([[[0, 0, 0],
+         [1, 1, 1],
+         [2, 2, 2]],
+        [[0, 1, 2],
+         [0, 1, 2],
+         [0, 1, 2]]])
+
+This is particularly useful for evaluating functions of multiple dimensions on
+a regular grid.
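+
+For example, the row and column index arrays returned above can be fed
+directly into a function of two variables to evaluate it on the grid: ::
+
+ >>> row, col = np.indices((3, 3))
+ >>> row + 10*col
+ array([[ 0, 10, 20],
+        [ 1, 11, 21],
+        [ 2, 12, 22]])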
+
+Reading Arrays From Disk
+========================
+
+This is presumably the most common case of large array creation. The details,
+of course, depend greatly on the format of data on disk and so this section
+can only give general pointers on how to handle various formats.
+
+Standard Binary Formats
+-----------------------
+
+Various fields have standard formats for array data. The following lists the
+ones with known Python libraries to read them and return numpy arrays (there
+may be others for which it is possible to read and convert to numpy arrays,
+so check the last section as well)
+::
+
+ HDF5: h5py
+ FITS: Astropy
+
+Examples of formats that cannot be read directly but for which conversion
+is not hard are those supported by libraries like PIL (able to read and
+write many image formats such as jpg, png, etc.).
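+
+A minimal sketch of such a conversion (assuming Pillow is installed and a
+hypothetical file ``example.png`` exists): ::
+
+ >>> from PIL import Image
+ >>> img = np.asarray(Image.open('example.png')) # (ny, nx, 3) for RGB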
+
+Common ASCII Formats
+--------------------
+
+Comma Separated Value files (CSV) are widely used (and an export and import
+option for programs like Excel). There are a number of ways of reading these
+files in Python: the ``csv`` module in the Python standard library, numpy's
+own ``genfromtxt`` function, and functions in pylab (part of matplotlib).
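+
+For example, numpy's own genfromtxt can parse CSV text directly (shown
+here from an in-memory string rather than a file): ::
+
+ >>> from io import StringIO
+ >>> np.genfromtxt(StringIO("1.0,2.0\n3.0,4.0"), delimiter=",")
+ array([[ 1., 2.],
+        [ 3., 4.]])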
+
+More generic ASCII files can be read using the io package in scipy.
+
+Custom Binary Formats
+---------------------
+
+There are a variety of approaches one can use. If the file has a relatively
+simple format then one can write a simple I/O library and use the numpy
+fromfile() function and .tofile() method to read and write numpy arrays
+directly (mind your byteorder though!). If a good C or C++ library exists
+that reads the data, one can wrap that library with a variety of
+techniques, though that certainly is much more work and requires
+significantly more advanced knowledge to interface with C or C++.
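+
+A minimal round-trip sketch (this writes a file ``data.bin`` to the
+current directory; the explicit ``<i4`` dtype pins down the byte order): ::
+
+ >>> a = np.arange(4, dtype='<i4')
+ >>> a.tofile('data.bin') # raw bytes, no header or shape information
+ >>> np.fromfile('data.bin', dtype='<i4')
+ array([0, 1, 2, 3])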
+
+Use of Special Libraries
+------------------------
+
+There are libraries that can be used to generate arrays for special purposes
+and it isn't possible to enumerate all of them. The most common case is the
+use of the many array generation functions in random that can generate
+arrays of random values, and some utility functions to generate special
+matrices (e.g. diagonal).
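+
+For instance: ::
+
+ >>> np.diag([1, 2, 3]) # utility function for a special matrix
+ array([[1, 0, 0],
+        [0, 2, 0],
+        [0, 0, 3]])
+ >>> rng = np.random.default_rng(seed=42)
+ >>> rng.random((2, 3)).shape # array of random values
+ (2, 3)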
+
+
diff --git a/doc/source/user/basics.dispatch.rst b/doc/source/user/basics.dispatch.rst
index f7b8da262..c0e1cf9ba 100644
--- a/doc/source/user/basics.dispatch.rst
+++ b/doc/source/user/basics.dispatch.rst
@@ -4,5 +4,269 @@
Writing custom array containers
*******************************
-.. automodule:: numpy.doc.dispatch
+Numpy's dispatch mechanism, introduced in numpy version v1.16, is the
+recommended approach for writing custom N-dimensional array containers that are
+compatible with the numpy API and provide custom implementations of numpy
+functionality. Applications include `dask <http://dask.pydata.org>`_ arrays, an
+N-dimensional array distributed across multiple nodes, and `cupy
+<https://docs-cupy.chainer.org/en/stable/>`_ arrays, an N-dimensional array on
+a GPU.
+
+To get a feel for writing custom array containers, we'll begin with a simple
+example that has rather narrow utility but illustrates the concepts involved.
+
+>>> import numpy as np
+>>> class DiagonalArray:
+... def __init__(self, N, value):
+... self._N = N
+... self._i = value
+... def __repr__(self):
+... return f"{self.__class__.__name__}(N={self._N}, value={self._i})"
+... def __array__(self):
+... return self._i * np.eye(self._N)
+
+Our custom array can be instantiated like:
+
+>>> arr = DiagonalArray(5, 1)
+>>> arr
+DiagonalArray(N=5, value=1)
+
+We can convert to a numpy array using :func:`numpy.array` or
+:func:`numpy.asarray`, which will call its ``__array__`` method to obtain a
+standard ``numpy.ndarray``.
+
+>>> np.asarray(arr)
+array([[1., 0., 0., 0., 0.],
+ [0., 1., 0., 0., 0.],
+ [0., 0., 1., 0., 0.],
+ [0., 0., 0., 1., 0.],
+ [0., 0., 0., 0., 1.]])
+
+If we operate on ``arr`` with a numpy function, numpy will again use the
+``__array__`` interface to convert it to an array and then apply the function
+in the usual way.
+
+>>> np.multiply(arr, 2)
+array([[2., 0., 0., 0., 0.],
+ [0., 2., 0., 0., 0.],
+ [0., 0., 2., 0., 0.],
+ [0., 0., 0., 2., 0.],
+ [0., 0., 0., 0., 2.]])
+
+
+Notice that the return type is a standard ``numpy.ndarray``.
+
+>>> type(np.multiply(arr, 2))
+<class 'numpy.ndarray'>
+
+How can we pass our custom array type through this function? Numpy allows a
+class to indicate that it would like to handle computations in a custom-defined
+way through the interfaces ``__array_ufunc__`` and ``__array_function__``. Let's
+take one at a time, starting with ``__array_ufunc__``. This method covers
+:ref:`ufuncs`, a class of functions that includes, for example,
+:func:`numpy.multiply` and :func:`numpy.sin`.
+
+The ``__array_ufunc__`` method receives:
+
+- ``ufunc``, a function like ``numpy.multiply``
+- ``method``, a string, differentiating between ``numpy.multiply(...)`` and
+ variants like ``numpy.multiply.outer``, ``numpy.multiply.accumulate``, and so
+ on. For the common case, ``numpy.multiply(...)``, ``method == '__call__'``.
+- ``inputs``, which could be a mixture of different types
+- ``kwargs``, keyword arguments passed to the function
+
+For this example we will only handle the method ``__call__``.
+
+>>> from numbers import Number
+>>> class DiagonalArray:
+... def __init__(self, N, value):
+... self._N = N
+... self._i = value
+... def __repr__(self):
+... return f"{self.__class__.__name__}(N={self._N}, value={self._i})"
+... def __array__(self):
+... return self._i * np.eye(self._N)
+... def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+... if method == '__call__':
+... N = None
+... scalars = []
+... for input in inputs:
+... if isinstance(input, Number):
+... scalars.append(input)
+... elif isinstance(input, self.__class__):
+... scalars.append(input._i)
+... if N is not None:
+... if N != self._N:
+... raise TypeError("inconsistent sizes")
+... else:
+... N = self._N
+... else:
+... return NotImplemented
+... return self.__class__(N, ufunc(*scalars, **kwargs))
+... else:
+... return NotImplemented
+
+Now our custom array type passes through numpy functions.
+
+>>> arr = DiagonalArray(5, 1)
+>>> np.multiply(arr, 3)
+DiagonalArray(N=5, value=3)
+>>> np.add(arr, 3)
+DiagonalArray(N=5, value=4)
+>>> np.sin(arr)
+DiagonalArray(N=5, value=0.8414709848078965)
+
+At this point ``arr + 3`` does not work.
+
+>>> arr + 3
+TypeError: unsupported operand type(s) for +: 'DiagonalArray' and 'int'
+
+To support it, we need to define the Python interfaces ``__add__``, ``__lt__``,
+and so on to dispatch to the corresponding ufunc. We can achieve this
+conveniently by inheriting from the mixin
+:class:`~numpy.lib.mixins.NDArrayOperatorsMixin`.
+
+>>> import numpy.lib.mixins
+>>> class DiagonalArray(numpy.lib.mixins.NDArrayOperatorsMixin):
+... def __init__(self, N, value):
+... self._N = N
+... self._i = value
+... def __repr__(self):
+... return f"{self.__class__.__name__}(N={self._N}, value={self._i})"
+... def __array__(self):
+... return self._i * np.eye(self._N)
+... def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+... if method == '__call__':
+... N = None
+... scalars = []
+... for input in inputs:
+... if isinstance(input, Number):
+... scalars.append(input)
+... elif isinstance(input, self.__class__):
+... scalars.append(input._i)
+... if N is not None:
+... if N != self._N:
+... raise TypeError("inconsistent sizes")
+... else:
+... N = self._N
+... else:
+... return NotImplemented
+... return self.__class__(N, ufunc(*scalars, **kwargs))
+... else:
+... return NotImplemented
+
+>>> arr = DiagonalArray(5, 1)
+>>> arr + 3
+DiagonalArray(N=5, value=4)
+>>> arr > 0
+DiagonalArray(N=5, value=True)
+
+Now let's tackle ``__array_function__``. We'll create a dict that maps numpy
+functions to our custom variants.
+
+>>> HANDLED_FUNCTIONS = {}
+>>> class DiagonalArray(numpy.lib.mixins.NDArrayOperatorsMixin):
+... def __init__(self, N, value):
+... self._N = N
+... self._i = value
+... def __repr__(self):
+... return f"{self.__class__.__name__}(N={self._N}, value={self._i})"
+... def __array__(self):
+... return self._i * np.eye(self._N)
+... def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+... if method == '__call__':
+... N = None
+... scalars = []
+... for input in inputs:
+... # In this case we accept only scalar numbers or DiagonalArrays.
+... if isinstance(input, Number):
+... scalars.append(input)
+... elif isinstance(input, self.__class__):
+... scalars.append(input._i)
+... if N is not None:
+... if N != self._N:
+... raise TypeError("inconsistent sizes")
+... else:
+... N = self._N
+... else:
+... return NotImplemented
+... return self.__class__(N, ufunc(*scalars, **kwargs))
+... else:
+... return NotImplemented
+... def __array_function__(self, func, types, args, kwargs):
+... if func not in HANDLED_FUNCTIONS:
+... return NotImplemented
+... # Note: this allows subclasses that don't override
+... # __array_function__ to handle DiagonalArray objects.
+... if not all(issubclass(t, self.__class__) for t in types):
+... return NotImplemented
+... return HANDLED_FUNCTIONS[func](*args, **kwargs)
+...
+
+A convenient pattern is to define a decorator ``implements`` that can be used
+to add functions to ``HANDLED_FUNCTIONS``.
+
+>>> def implements(np_function):
+... "Register an __array_function__ implementation for DiagonalArray objects."
+... def decorator(func):
+... HANDLED_FUNCTIONS[np_function] = func
+... return func
+... return decorator
+...
+
+Now we write implementations of numpy functions for ``DiagonalArray``.
+For completeness, to support the usage ``arr.sum()`` add a method ``sum``
+that calls ``numpy.sum(self)``, and the same for ``mean`` (a sketch is
+shown after the examples below).
+
+>>> @implements(np.sum)
+... def sum(arr):
+... "Implementation of np.sum for DiagonalArray objects"
+... return arr._i * arr._N
+...
+>>> @implements(np.mean)
+... def mean(arr):
+... "Implementation of np.mean for DiagonalArray objects"
+... return arr._i / arr._N
+...
+>>> arr = DiagonalArray(5, 1)
+>>> np.sum(arr)
+5
+>>> np.mean(arr)
+0.2
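+
+As mentioned above, supporting the method calls ``arr.sum()`` and
+``arr.mean()`` only requires small forwarding methods. A sketch, attaching
+them after the fact for brevity (in real code they would be ``def``
+statements in the class body):
+
+>>> DiagonalArray.sum = lambda self: np.sum(self)
+>>> DiagonalArray.mean = lambda self: np.mean(self)
+>>> arr.sum()
+5
+>>> arr.mean()
+0.2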
+
+If the user tries to use any numpy functions not included in
+``HANDLED_FUNCTIONS``, a ``TypeError`` will be raised by numpy, indicating that
+this operation is not supported. For example, concatenating two
+``DiagonalArrays`` does not produce another diagonal array, so it is not
+supported.
+
+>>> np.concatenate([arr, arr])
+TypeError: no implementation found for 'numpy.concatenate' on types that implement __array_function__: [<class '__main__.DiagonalArray'>]
+
+Additionally, our implementations of ``sum`` and ``mean`` do not accept the
+optional arguments that numpy's implementation does.
+
+>>> np.sum(arr, axis=0)
+TypeError: sum() got an unexpected keyword argument 'axis'
+
+The user always has the option of converting to a normal ``numpy.ndarray`` with
+:func:`numpy.asarray` and using standard numpy from there.
+
+>>> np.concatenate([np.asarray(arr), np.asarray(arr)])
+array([[1., 0., 0., 0., 0.],
+ [0., 1., 0., 0., 0.],
+ [0., 0., 1., 0., 0.],
+ [0., 0., 0., 1., 0.],
+ [0., 0., 0., 0., 1.],
+ [1., 0., 0., 0., 0.],
+ [0., 1., 0., 0., 0.],
+ [0., 0., 1., 0., 0.],
+ [0., 0., 0., 1., 0.],
+ [0., 0., 0., 0., 1.]])
+
+Refer to the `dask source code <https://github.com/dask/dask>`_ and
+`cupy source code <https://github.com/cupy/cupy>`_ for more fully-worked
+examples of custom array containers.
+
+See also :doc:`NEP 18<neps:nep-0018-array-function-protocol>`.
diff --git a/doc/source/user/basics.indexing.rst b/doc/source/user/basics.indexing.rst
index 0dca4b884..9545bb78c 100644
--- a/doc/source/user/basics.indexing.rst
+++ b/doc/source/user/basics.indexing.rst
@@ -10,4 +10,454 @@ Indexing
:ref:`Indexing routines <routines.indexing>`
-.. automodule:: numpy.doc.indexing
+Array indexing refers to any use of the square brackets ([]) to index
+array values. There are many options to indexing, which give numpy
+indexing great power, but with power comes some complexity and the
+potential for confusion. This section is just an overview of the
+various options and issues related to indexing. Aside from single
+element indexing, the details on most of these options are to be
+found in related sections.
+
+Assignment vs referencing
+=========================
+
+Most of the following examples show the use of indexing when
+referencing data in an array. The examples work just as well
+when assigning to an array. See the section at the end for
+specific examples and explanations on how assignments work.
+
+Single element indexing
+=======================
+
+Single element indexing for a 1-D array is what one expects. It works
+exactly like that for other standard Python sequences. It is 0-based,
+and accepts negative indices for indexing from the end of the array. ::
+
+ >>> x = np.arange(10)
+ >>> x[2]
+ 2
+ >>> x[-2]
+ 8
+
+Unlike lists and tuples, numpy arrays support multidimensional indexing
+for multidimensional arrays. That means that it is not necessary to
+separate each dimension's index into its own set of square brackets. ::
+
+ >>> x.shape = (2,5) # now x is 2-dimensional
+ >>> x[1,3]
+ 8
+ >>> x[1,-1]
+ 9
+
+Note that if one indexes a multidimensional array with fewer indices
+than dimensions, one gets a subdimensional array. For example: ::
+
+ >>> x[0]
+ array([0, 1, 2, 3, 4])
+
+That is, each index specified selects the array corresponding to the
+rest of the dimensions selected. In the above example, choosing 0
+means that the remaining dimension of length 5 is being left unspecified,
+and that what is returned is an array of that dimensionality and size.
+It must be noted that the returned array is not a copy of the original,
+but points to the same values in memory as does the original array.
+In this case, the 1-D array at the first position (0) is returned.
+So using a single index on the returned array results in a single
+element being returned. That is: ::
+
+ >>> x[0][2]
+ 2
+
+So note that ``x[0,2] == x[0][2]`` though the second case is less
+efficient as a new temporary array is created after the first index
+that is subsequently indexed by 2.
+
+Note to those used to IDL or Fortran memory order as it relates to
+indexing. NumPy uses C-order indexing. That means that the last
+index usually represents the most rapidly changing memory location,
+unlike Fortran or IDL, where the first index represents the most
+rapidly changing location in memory. This difference represents a
+great potential for confusion.
+
+Other indexing options
+======================
+
+It is possible to slice and stride arrays to extract arrays of the
+same number of dimensions, but of different sizes than the original.
+The slicing and striding works exactly the same way it does for lists
+and tuples except that they can be applied to multiple dimensions as
+well. A few examples illustrate this best: ::
+
+ >>> x = np.arange(10)
+ >>> x[2:5]
+ array([2, 3, 4])
+ >>> x[:-7]
+ array([0, 1, 2])
+ >>> x[1:7:2]
+ array([1, 3, 5])
+ >>> y = np.arange(35).reshape(5,7)
+ >>> y[1:5:2,::3]
+ array([[ 7, 10, 13],
+ [21, 24, 27]])
+
+Note that slices of arrays do not copy the internal array data but
+only produce new views of the original data. This is different from
+list or tuple slicing and an explicit ``copy()`` is recommended if
+the original data is not required anymore.
+
+It is possible to index arrays with other arrays for the purposes of
+selecting lists of values out of arrays into new arrays. There are
+two different ways of accomplishing this. One uses one or more arrays
+of index values. The other involves giving a boolean array of the proper
+shape to indicate the values to be selected. Index arrays are a very
+powerful tool that allow one to avoid looping over individual elements in
+arrays and thus greatly improve performance.
+
+It is possible to use special features to effectively increase the
+number of dimensions in an array through indexing so the resulting
+array acquires the shape needed for use in an expression or with a
+specific function.
+
+Index arrays
+============
+
+NumPy arrays may be indexed with other arrays (or any other sequence-like
+object that can be converted to an array, such as lists, with the
+exception of tuples; see the end of this document for why this is). The
+use of index arrays ranges from simple, straightforward cases to
+complex, hard-to-understand cases. For all cases of index arrays, what
+is returned is a copy of the original data, not a view as one gets for
+slices.
+
+Index arrays must be of integer type. Each value in the index array
+indicates which value in the indexed array to use in place of the index.
+To illustrate: ::
+
+ >>> x = np.arange(10,1,-1)
+ >>> x
+ array([10, 9, 8, 7, 6, 5, 4, 3, 2])
+ >>> x[np.array([3, 3, 1, 8])]
+ array([7, 7, 9, 2])
+
+
+The index array consisting of the values 3, 3, 1 and 8 produces an
+array of length 4 (the same as the index array) where each index is
+replaced by the corresponding value of the array being indexed.
+
+Negative values are permitted and work as they do with single indices
+or slices: ::
+
+ >>> x[np.array([3,3,-3,8])]
+ array([7, 7, 4, 2])
+
+It is an error to have index values out of bounds: ::
+
+ >>> x[np.array([3, 3, 20, 8])]
+ IndexError: index 20 is out of bounds for axis 0 with size 9
+
+Generally speaking, what is returned when index arrays are used is
+an array with the same shape as the index array, but with the type
+and values of the array being indexed. As an example, we can use a
+multidimensional index array instead: ::
+
+ >>> x[np.array([[1,1],[2,3]])]
+ array([[9, 9],
+ [8, 7]])
+
+Indexing Multi-dimensional arrays
+=================================
+
+Things become more complex when multidimensional arrays are indexed,
+particularly with multidimensional index arrays. These tend to be
+more unusual uses, but they are permitted, and they are useful for some
+problems. We'll start with the simplest multidimensional case (using
+the array y from the previous examples): ::
+
+ >>> y[np.array([0,2,4]), np.array([0,1,2])]
+ array([ 0, 15, 30])
+
+In this case, if the index arrays have a matching shape, and there is
+an index array for each dimension of the array being indexed, the
+resultant array has the same shape as the index arrays, and the values
+correspond to the index set for each position in the index arrays. In
+this example, the first index value is 0 for both index arrays, and
+thus the first value of the resultant array is y[0,0]. The next value
+is y[2,1], and the last is y[4,2].
+
+If the index arrays do not have the same shape, there is an attempt to
+broadcast them to the same shape. If they cannot be broadcast to the
+same shape, an exception is raised: ::
+
+ >>> y[np.array([0,2,4]), np.array([0,1])]
+ IndexError: shape mismatch: indexing arrays could not be broadcast
+ together with shapes (3,) (2,)
+
+The broadcasting mechanism permits index arrays to be combined with
+scalars for other indices. The effect is that the scalar value is used
+for all the corresponding values of the index arrays: ::
+
+ >>> y[np.array([0,2,4]), 1]
+ array([ 1, 15, 29])
+
+Jumping to the next level of complexity, it is possible to only
+partially index an array with index arrays. It takes a bit of thought
+to understand what happens in such cases. For example if we just use
+one index array with y: ::
+
+ >>> y[np.array([0,2,4])]
+ array([[ 0, 1, 2, 3, 4, 5, 6],
+ [14, 15, 16, 17, 18, 19, 20],
+ [28, 29, 30, 31, 32, 33, 34]])
+
+What results is the construction of a new array where each value of
+the index array selects one row from the array being indexed and the
+resultant array has the resulting shape (number of index elements,
+size of row).
+
+An example of where this may be useful is for a color lookup table
+where we want to map the values of an image into RGB triples for
+display. The lookup table could have a shape (nlookup, 3). Indexing
+such an array with an image with shape (ny, nx) with dtype=np.uint8
+(or any integer type so long as values are within the bounds of the
+lookup table) will result in an array of shape (ny, nx, 3) where a
+triple of RGB values is associated with each pixel location.
+
+In general, the shape of the resultant array will be the concatenation
+of the shape of the index array (or the shape that all the index arrays
+were broadcast to) with the shape of any unused dimensions (those not
+indexed) in the array being indexed.
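+
+A small version of the lookup-table case illustrates this rule (a
+hypothetical 4-entry palette indexed by a 2x2 "image"): ::
+
+ >>> palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255]])
+ >>> image = np.array([[0, 1], [2, 3]], dtype=np.uint8)
+ >>> palette[image].shape # (2, 2) index shape + (3,) unindexed dimension
+ (2, 2, 3)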
+
+Boolean or "mask" index arrays
+==============================
+
+Boolean arrays used as indices are treated in a different manner
+entirely than index arrays. Boolean arrays must be of the same shape
+as the initial dimensions of the array being indexed. In the
+most straightforward case, the boolean array has the same shape: ::
+
+ >>> b = y>20
+ >>> y[b]
+ array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34])
+
+Unlike in the case of integer index arrays, in the boolean case, the
+result is a 1-D array containing all the elements in the indexed array
+corresponding to all the true elements in the boolean array. The
+elements in the indexed array are always iterated and returned in
+:term:`row-major` (C-style) order. The result is also identical to
+``y[np.nonzero(b)]``. As with index arrays, what is returned is a copy
+of the data, not a view as one gets with slices.
+
+The result will be multidimensional if y has more dimensions than b.
+For example: ::
+
+ >>> b[:,5] # use a 1-D boolean whose first dim agrees with the first dim of y
+ array([False, False, False, True, True])
+ >>> y[b[:,5]]
+ array([[21, 22, 23, 24, 25, 26, 27],
+ [28, 29, 30, 31, 32, 33, 34]])
+
+Here the 4th and 5th rows are selected from the indexed array and
+combined to make a 2-D array.
+
+In general, when the boolean array has fewer dimensions than the array
+being indexed, this is equivalent to y[b, ...], which means
+y is indexed by b followed by as many : as are needed to fill
+out the rank of y.
+Thus the shape of the result is one dimension containing the number
+of True elements of the boolean array, followed by the remaining
+dimensions of the array being indexed.
+
+For example, using a 2-D boolean array of shape (2,3)
+with four True elements to select rows from a 3-D array of shape
+(2,3,5) results in a 2-D result of shape (4,5): ::
+
+ >>> x = np.arange(30).reshape(2,3,5)
+ >>> x
+ array([[[ 0, 1, 2, 3, 4],
+ [ 5, 6, 7, 8, 9],
+ [10, 11, 12, 13, 14]],
+ [[15, 16, 17, 18, 19],
+ [20, 21, 22, 23, 24],
+ [25, 26, 27, 28, 29]]])
+ >>> b = np.array([[True, True, False], [False, True, True]])
+ >>> x[b]
+ array([[ 0, 1, 2, 3, 4],
+ [ 5, 6, 7, 8, 9],
+ [20, 21, 22, 23, 24],
+ [25, 26, 27, 28, 29]])
+
+For further details, consult the numpy reference documentation on array indexing.
+
+Combining index arrays with slices
+==================================
+
+Index arrays may be combined with slices. For example: ::
+
+ >>> y[np.array([0, 2, 4]), 1:3]
+ array([[ 1, 2],
+ [15, 16],
+ [29, 30]])
+
+In effect, the slice and index array operations are independent.
+The slice operation extracts columns with index 1 and 2,
+(i.e. the 2nd and 3rd columns),
+followed by the index array operation which extracts rows with
+index 0, 2 and 4 (i.e the first, third and fifth rows).
+
+This is equivalent to::
+
+ >>> y[:, 1:3][np.array([0, 2, 4]), :]
+ array([[ 1, 2],
+ [15, 16],
+ [29, 30]])
+
+Likewise, slicing can be combined with broadcasted boolean indices: ::
+
+ >>> b = y > 20
+ >>> b
+ array([[False, False, False, False, False, False, False],
+ [False, False, False, False, False, False, False],
+ [False, False, False, False, False, False, False],
+ [ True, True, True, True, True, True, True],
+ [ True, True, True, True, True, True, True]])
+ >>> y[b[:,5],1:3]
+ array([[22, 23],
+ [29, 30]])
+
+Structural indexing tools
+=========================
+
+To facilitate easy matching of array shapes with expressions and in
+assignments, the np.newaxis object can be used within array indices
+to add new dimensions with a size of 1. For example: ::
+
+ >>> y.shape
+ (5, 7)
+ >>> y[:,np.newaxis,:].shape
+ (5, 1, 7)
+
+Note that there are no new elements in the array, just that the
+dimensionality is increased. This can be handy to combine two
+arrays in a way that otherwise would require explicit reshaping
+operations. For example: ::
+
+ >>> x = np.arange(5)
+ >>> x[:,np.newaxis] + x[np.newaxis,:]
+ array([[0, 1, 2, 3, 4],
+ [1, 2, 3, 4, 5],
+ [2, 3, 4, 5, 6],
+ [3, 4, 5, 6, 7],
+ [4, 5, 6, 7, 8]])
+
+The ellipsis syntax may be used to indicate selecting in full any
+remaining unspecified dimensions. For example: ::
+
+ >>> z = np.arange(81).reshape(3,3,3,3)
+ >>> z[1,...,2]
+ array([[29, 32, 35],
+ [38, 41, 44],
+ [47, 50, 53]])
+
+This is equivalent to: ::
+
+ >>> z[1,:,:,2]
+ array([[29, 32, 35],
+ [38, 41, 44],
+ [47, 50, 53]])
+
+Assigning values to indexed arrays
+==================================
+
+As mentioned, one can select a subset of an array to assign to using
+a single index, slices, and index and mask arrays. The value being
+assigned to the indexed array must be shape consistent (the same shape
+or broadcastable to the shape the index produces). For example, it is
+permitted to assign a constant to a slice: ::
+
+ >>> x = np.arange(10)
+ >>> x[2:7] = 1
+
+or an array of the right size: ::
+
+ >>> x[2:7] = np.arange(5)
+
+Note that assignments may silently truncate values when assigning
+higher types to lower types (like floats to ints), or even raise
+exceptions (assigning complex to floats or ints): ::
+
+ >>> x[1] = 1.2
+ >>> x[1]
+ 1
+ >>> x[1] = 1.2j
+ TypeError: can't convert complex to int
+
+
+Unlike some of the references (such as array and mask indices)
+assignments are always made to the original data in the array
+(indeed, nothing else would make sense!). Note though, that some
+actions may not work as one may naively expect. This particular
+example is often surprising to people: ::
+
+ >>> x = np.arange(0, 50, 10)
+ >>> x
+ array([ 0, 10, 20, 30, 40])
+ >>> x[np.array([1, 1, 3, 1])] += 1
+ >>> x
+ array([ 0, 11, 20, 31, 40])
+
+People often expect that the 1st location will be incremented by 3.
+In fact, it will only be incremented by 1. The reason is that
+a new array is extracted from the original (as a temporary) containing
+the values at 1, 1, 3, 1, then the value 1 is added to the temporary,
+and then the temporary is assigned back to the original array. Thus
+the value of the array at x[1]+1 is assigned to x[1] three times,
+rather than being incremented 3 times.
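+
+If the accumulated behavior is what you want, the unbuffered ``np.add.at``
+ufunc method applies the operation once per index, including repeats: ::
+
+ >>> x = np.arange(0, 50, 10)
+ >>> np.add.at(x, np.array([1, 1, 3, 1]), 1)
+ >>> x
+ array([ 0, 13, 20, 31, 40])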
+
+Dealing with variable numbers of indices within programs
+========================================================
+
+The index syntax is very powerful but limiting when dealing with
+a variable number of indices. For example, if you want to write
+a function that can handle arguments with various numbers of
+dimensions without having to write special case code for each
+number of possible dimensions, how can that be done? If one
+supplies to the index a tuple, the tuple will be interpreted
+as a list of indices. For example (using the previous definition
+for the array z): ::
+
+ >>> indices = (1,1,1,1)
+ >>> z[indices]
+ 40
+
+So one can use code to construct tuples of any number of indices
+and then use these within an index.
+
+Slices can be specified within programs by using the slice() function
+in Python. For example: ::
+
+ >>> indices = (1,1,1,slice(0,2)) # same as [1,1,1,0:2]
+ >>> z[indices]
+ array([39, 40])
+
+Likewise, ellipsis can be specified in code by using the Ellipsis
+object: ::
+
+ >>> indices = (1, Ellipsis, 1) # same as [1,...,1]
+ >>> z[indices]
+ array([[28, 31, 34],
+ [37, 40, 43],
+ [46, 49, 52]])
+
+For this reason it is possible to use the output from the np.nonzero()
+function directly as an index since it always returns a tuple of index
+arrays.
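+
+For example: ::
+
+ >>> a = np.array([[1, 0], [0, 2]])
+ >>> np.nonzero(a) # a tuple of index arrays
+ (array([0, 1]), array([0, 1]))
+ >>> a[np.nonzero(a)]
+ array([1, 2])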
+
+Because of the special treatment of tuples, they are not automatically
+converted to an array as a list would be. As an example: ::
+
+ >>> z[[1,1,1,1]] # produces a large array
+ array([[[[27, 28, 29],
+ [30, 31, 32], ...
+ >>> z[(1,1,1,1)] # returns a single value
+ 40
+
+
diff --git a/doc/source/user/basics.io.genfromtxt.rst b/doc/source/user/basics.io.genfromtxt.rst
index 3fce6a8aa..5364acbe9 100644
--- a/doc/source/user/basics.io.genfromtxt.rst
+++ b/doc/source/user/basics.io.genfromtxt.rst
@@ -28,7 +28,7 @@ Defining the input
The only mandatory argument of :func:`~numpy.genfromtxt` is the source of
the data. It can be a string, a list of strings, a generator or an open
-file-like object with a :meth:`read` method, for example, a file or
+file-like object with a ``read`` method, for example, a file or
:class:`io.StringIO` object. If a single string is provided, it is assumed
to be the name of a local or remote file. If a list of strings or a generator
returning strings is provided, each string is treated as one line in a file.
@@ -36,10 +36,10 @@ When the URL of a remote file is passed, the file is automatically downloaded
to the current directory and opened.
Recognized file types are text files and archives. Currently, the function
-recognizes :class:`gzip` and :class:`bz2` (`bzip2`) archives. The type of
+recognizes ``gzip`` and ``bz2`` (``bzip2``) archives. The type of
the archive is determined from the extension of the file: if the filename
-ends with ``'.gz'``, a :class:`gzip` archive is expected; if it ends with
-``'bz2'``, a :class:`bzip2` archive is assumed.
+ends with ``'.gz'``, a ``gzip`` archive is expected; if it ends with
+``'bz2'``, a ``bzip2`` archive is assumed.
@@ -360,9 +360,9 @@ The ``converters`` argument
Usually, defining a dtype is sufficient to define how the sequence of
strings must be converted. However, some additional control may sometimes
be required. For example, we may want to make sure that a date in a format
-``YYYY/MM/DD`` is converted to a :class:`datetime` object, or that a string
-like ``xx%`` is properly converted to a float between 0 and 1. In such
-cases, we should define conversion functions with the ``converters``
+``YYYY/MM/DD`` is converted to a :class:`~datetime.datetime` object, or that
+a string like ``xx%`` is properly converted to a float between 0 and 1. In
+such cases, we should define conversion functions with the ``converters``
arguments.
The value of this argument is typically a dictionary with column indices or
@@ -427,7 +427,7 @@ previous example, we used a converter to transform an empty string into a
float. However, user-defined converters may rapidly become cumbersome to
manage.
-The :func:`~nummpy.genfromtxt` function provides two other complementary
+The :func:`~numpy.genfromtxt` function provides two other complementary
mechanisms: the ``missing_values`` argument is used to recognize
missing data and a second argument, ``filling_values``, is used to
process these missing data.
@@ -514,15 +514,15 @@ output array will then be a :class:`~numpy.ma.MaskedArray`.
Shortcut functions
==================
-In addition to :func:`~numpy.genfromtxt`, the :mod:`numpy.lib.io` module
+In addition to :func:`~numpy.genfromtxt`, the :mod:`numpy.lib.npyio` module
provides several convenience functions derived from
:func:`~numpy.genfromtxt`. These functions work the same way as the
original, but they have different default values.
-:func:`~numpy.recfromtxt`
+:func:`~numpy.npyio.recfromtxt`
Returns a standard :class:`numpy.recarray` (if ``usemask=False``) or a
- :class:`~numpy.ma.MaskedRecords` array (if ``usemaske=True``). The
+ :class:`~numpy.ma.mrecords.MaskedRecords` array (if ``usemask=True``). The
default dtype is ``dtype=None``, meaning that the types of each column
will be automatically determined.
-:func:`~numpy.recfromcsv`
- Like :func:`~numpy.recfromtxt`, but with a default ``delimiter=","``.
+:func:`~numpy.npyio.recfromcsv`
+ Like :func:`~numpy.npyio.recfromtxt`, but with a default ``delimiter=","``.
diff --git a/doc/source/user/basics.rec.rst b/doc/source/user/basics.rec.rst
index b885c9e77..f579b0d85 100644
--- a/doc/source/user/basics.rec.rst
+++ b/doc/source/user/basics.rec.rst
@@ -4,10 +4,649 @@
Structured arrays
*****************
-.. automodule:: numpy.doc.structured_arrays
+Introduction
+============
+
+Structured arrays are ndarrays whose datatype is a composition of simpler
+datatypes organized as a sequence of named :term:`fields <field>`. For example,
+::
+
+ >>> x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
+ ... dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
+ >>> x
+ array([('Rex', 9, 81.), ('Fido', 3, 27.)],
+ dtype=[('name', 'U10'), ('age', '<i4'), ('weight', '<f4')])
+
+Here ``x`` is a one-dimensional array of length two whose datatype is a
+structure with three fields: 1. A string of length 10 or less named 'name', 2.
+a 32-bit integer named 'age', and 3. a 32-bit float named 'weight'.
+
+If you index ``x`` at position 1 you get a structure::
+
+ >>> x[1]
+ ('Fido', 3, 27.0)
+
+You can access and modify individual fields of a structured array by indexing
+with the field name::
+
+ >>> x['age']
+ array([9, 3], dtype=int32)
+ >>> x['age'] = 5
+ >>> x
+ array([('Rex', 5, 81.), ('Fido', 5, 27.)],
+ dtype=[('name', 'U10'), ('age', '<i4'), ('weight', '<f4')])
+
+Structured datatypes are designed to be able to mimic 'structs' in the C
+language, and share a similar memory layout. They are meant for interfacing with
+C code and for low-level manipulation of structured buffers, for example for
+interpreting binary blobs. For these purposes they support specialized features
+such as subarrays, nested datatypes, and unions, and allow control over the
+memory layout of the structure.
+
+Users looking to manipulate tabular data, such as data stored in csv files,
+may find other pydata projects more suitable, such as xarray or pandas.
+These provide a high-level interface for tabular data analysis and are better
+optimized for that use. For instance, the C-struct-like memory layout of
+structured arrays in numpy can lead to poor cache behavior in comparison.
+
+.. _defining-structured-types:
+
+Structured Datatypes
+====================
+
+A structured datatype can be thought of as a sequence of bytes of a certain
+length (the structure's :term:`itemsize`) which is interpreted as a collection
+of fields. Each field has a name, a datatype, and a byte offset within the
+structure. The datatype of a field may be any numpy datatype including other
+structured datatypes, and it may also be a :term:`subarray data type` which
+behaves like an ndarray of a specified shape. The offsets of the fields are
+arbitrary, and fields may even overlap. These offsets are usually determined
+automatically by numpy, but can also be specified.
+
+Structured Datatype Creation
+----------------------------
+
+Structured datatypes may be created using the function :func:`numpy.dtype`.
+There are 4 alternative forms of specification which vary in flexibility and
+conciseness. These are further documented in the
+:ref:`Data Type Objects <arrays.dtypes.constructing>` reference page, and in
+summary they are:
+
+1. A list of tuples, one tuple per field
+
+ Each tuple has the form ``(fieldname, datatype, shape)`` where shape is
+ optional. ``fieldname`` is a string (or tuple if titles are used, see
+ :ref:`Field Titles <titles>` below), ``datatype`` may be any object
+ convertible to a datatype, and ``shape`` is a tuple of integers specifying
+ subarray shape.
+
+ >>> np.dtype([('x', 'f4'), ('y', np.float32), ('z', 'f4', (2, 2))])
+ dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4', (2, 2))])
+
+ If ``fieldname`` is the empty string ``''``, the field will be given a
+ default name of the form ``f#``, where ``#`` is the integer index of the
+ field, counting from 0 from the left::
+
+ >>> np.dtype([('x', 'f4'), ('', 'i4'), ('z', 'i8')])
+ dtype([('x', '<f4'), ('f1', '<i4'), ('z', '<i8')])
+
+ The byte offsets of the fields within the structure and the total
+ structure itemsize are determined automatically.
+
+2. A string of comma-separated dtype specifications
+
+ In this shorthand notation any of the :ref:`string dtype specifications
+ <arrays.dtypes.constructing>` may be used in a string and separated by
+ commas. The itemsize and byte offsets of the fields are determined
+ automatically, and the field names are given the default names ``f0``,
+ ``f1``, etc. ::
+
+ >>> np.dtype('i8, f4, S3')
+ dtype([('f0', '<i8'), ('f1', '<f4'), ('f2', 'S3')])
+ >>> np.dtype('3int8, float32, (2, 3)float64')
+ dtype([('f0', 'i1', (3,)), ('f1', '<f4'), ('f2', '<f8', (2, 3))])
+
+3. A dictionary of field parameter arrays
+
+ This is the most flexible form of specification since it allows control
+ over the byte-offsets of the fields and the itemsize of the structure.
+
+ The dictionary has two required keys, 'names' and 'formats', and four
+ optional keys, 'offsets', 'itemsize', 'aligned' and 'titles'. The values
+ for 'names' and 'formats' should respectively be a list of field names and
+ a list of dtype specifications, of the same length. The optional 'offsets'
+ value should be a list of integer byte-offsets, one for each field within
+ the structure. If 'offsets' is not given the offsets are determined
+ automatically. The optional 'itemsize' value should be an integer
+ describing the total size in bytes of the dtype, which must be large
+ enough to contain all the fields.
+ ::
+
+ >>> np.dtype({'names': ['col1', 'col2'], 'formats': ['i4', 'f4']})
+ dtype([('col1', '<i4'), ('col2', '<f4')])
+ >>> np.dtype({'names': ['col1', 'col2'],
+ ... 'formats': ['i4', 'f4'],
+ ... 'offsets': [0, 4],
+ ... 'itemsize': 12})
+ dtype({'names':['col1','col2'], 'formats':['<i4','<f4'], 'offsets':[0,4], 'itemsize':12})
+
+ Offsets may be chosen such that the fields overlap, though this will mean
+ that assigning to one field may clobber any overlapping field's data. As
+ an exception, fields of :class:`numpy.object` type cannot overlap with
+ other fields, because of the risk of clobbering the internal object
+ pointer and then dereferencing it.
+
+ The optional 'aligned' value can be set to ``True`` to make the automatic
+ offset computation use aligned offsets (see :ref:`offsets-and-alignment`),
+ as if the 'align' keyword argument of :func:`numpy.dtype` had been set to
+ True.
+
+ The optional 'titles' value should be a list of titles of the same length
+ as 'names', see :ref:`Field Titles <titles>` below.
+
+4. A dictionary of field names
+
+ The use of this form of specification is discouraged, but documented here
+ because older numpy code may use it. The keys of the dictionary are the
+ field names and the values are tuples specifying type and offset::
+
+ >>> np.dtype({'col1': ('i1', 0), 'col2': ('f4', 1)})
+ dtype([('col1', 'i1'), ('col2', '<f4')])
+
+ This form is discouraged because Python dictionaries do not preserve order
+ in Python versions before Python 3.6, and the order of the fields in a
+ structured dtype has meaning. :ref:`Field Titles <titles>` may be
+ specified by using a 3-tuple, see below.
+
+Manipulating and Displaying Structured Datatypes
+------------------------------------------------
+
+The list of field names of a structured datatype can be found in the ``names``
+attribute of the dtype object::
+
+ >>> d = np.dtype([('x', 'i8'), ('y', 'f4')])
+ >>> d.names
+ ('x', 'y')
+
+The field names may be modified by assigning to the ``names`` attribute using a
+sequence of strings of the same length.
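+
+For example (using a fresh dtype so that ``d`` above is left unchanged): ::
+
+ >>> d2 = np.dtype([('x', 'i8'), ('y', 'f4')])
+ >>> d2.names = ('row', 'col')
+ >>> d2
+ dtype([('row', '<i8'), ('col', '<f4')])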
+
+The dtype object also has a dictionary-like attribute, ``fields``, whose keys
+are the field names (and :ref:`Field Titles <titles>`, see below) and whose
+values are tuples containing the dtype and byte offset of each field. ::
+
+ >>> d.fields
+ mappingproxy({'x': (dtype('int64'), 0), 'y': (dtype('float32'), 8)})
+
+Both the ``names`` and ``fields`` attributes will equal ``None`` for
+unstructured arrays. The recommended way to test if a dtype is structured is
+with `if dt.names is not None` rather than `if dt.names`, to account for dtypes
+with 0 fields.
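+
+For instance: ::
+
+ >>> np.dtype('f8').names is None
+ True
+ >>> np.dtype([('x', 'f8')]).names
+ ('x',)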
+
+The string representation of a structured datatype is shown in the "list of
+tuples" form if possible, otherwise numpy falls back to using the more general
+dictionary form.
+
+.. _offsets-and-alignment:
+
+Automatic Byte Offsets and Alignment
+------------------------------------
+
+Numpy uses one of two methods to automatically determine the field byte offsets
+and the overall itemsize of a structured datatype, depending on whether
+``align=True`` was specified as a keyword argument to :func:`numpy.dtype`.
+
+By default (``align=False``), numpy will pack the fields together such that
+each field starts at the byte offset where the previous field ended, and
+the fields are contiguous in memory. ::
+
+ >>> def print_offsets(d):
+ ... print("offsets:", [d.fields[name][1] for name in d.names])
+ ... print("itemsize:", d.itemsize)
+ >>> print_offsets(np.dtype('u1, u1, i4, u1, i8, u2'))
+ offsets: [0, 1, 2, 6, 7, 15]
+ itemsize: 17
+
+If ``align=True`` is set, numpy will pad the structure in the same way many C
+compilers would pad a C-struct. Aligned structures can give a performance
+improvement in some cases, at the cost of increased datatype size. Padding
+bytes are inserted between fields such that each field's byte offset will be a
+multiple of that field's alignment, which is usually equal to the field's size
+in bytes for simple datatypes, see :c:member:`PyArray_Descr.alignment`. The
+structure will also have trailing padding added so that its itemsize is a
+multiple of the largest field's alignment. ::
+
+ >>> print_offsets(np.dtype('u1, u1, i4, u1, i8, u2', align=True))
+ offsets: [0, 1, 4, 8, 16, 24]
+ itemsize: 32
+
+Note that although almost all modern C compilers pad in this way by default,
+padding in C structs is C-implementation-dependent so this memory layout is not
+guaranteed to exactly match that of a corresponding struct in a C program. Some
+work may be needed, either on the numpy side or the C side, to obtain exact
+correspondence.
+
+If offsets were specified using the optional ``offsets`` key in the
+dictionary-based dtype specification, setting ``align=True`` will check that
+each field's offset is a multiple of its size and that the itemsize is a
+multiple of the largest field size, and raise an exception if not.
+
+If the offsets of the fields and itemsize of a structured array satisfy the
+alignment conditions, the array will have the ``ALIGNED`` :attr:`flag
+<numpy.ndarray.flags>` set.
+
+A convenience function :func:`numpy.lib.recfunctions.repack_fields` converts an
+aligned dtype or array to a packed one and vice versa. It takes either a dtype
+or structured ndarray as an argument, and returns a copy with fields re-packed,
+with or without padding bytes.
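+
+For example, re-packing the aligned datatype from the example above
+recovers the packed layout::
+
+ >>> from numpy.lib.recfunctions import repack_fields
+ >>> dt = np.dtype('u1, u1, i4, u1, i8, u2', align=True)
+ >>> dt.itemsize
+ 32
+ >>> repack_fields(dt).itemsize
+ 17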
+
+.. _titles:
+
+Field Titles
+------------
+
+In addition to field names, fields may also have an associated :term:`title`,
+an alternate name, which is sometimes used as an additional description or
+alias for the field. The title may be used to index an array, just like a
+field name.
+
+To add titles when using the list-of-tuples form of dtype specification, the
+field name may be specified as a tuple of two strings instead of a single
+string, which will be the field's title and field name respectively. For
+example::
+
+ >>> np.dtype([(('my title', 'name'), 'f4')])
+ dtype([(('my title', 'name'), '<f4')])
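+
+The title can then be used to index an array, just like the field name::
+
+ >>> arr = np.zeros(3, dtype=[(('my title', 'name'), 'f4')])
+ >>> arr['my title']
+ array([0., 0., 0.], dtype=float32)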
+
+When using the first form of dictionary-based specification, the titles may be
+supplied as an extra ``'titles'`` key as described above. When using the second
+(discouraged) dictionary-based specification, the title can be supplied by
+providing a 3-element tuple ``(datatype, offset, title)`` instead of the usual
+2-element tuple::
+
+ >>> np.dtype({'name': ('i4', 0, 'my title')})
+ dtype([(('my title', 'name'), '<i4')])
+
+The ``dtype.fields`` dictionary will contain titles as keys, if any
+titles are used. This means effectively that a field with a title will be
+represented twice in the fields dictionary. The tuple values for these fields
+will also have a third element, the field title. Because of this, and because
+the ``names`` attribute preserves the field order while the ``fields``
+attribute may not, it is recommended to iterate through the fields of a dtype
+using the ``names`` attribute of the dtype, which will not list titles, as
+in::
+
+ >>> for name in d.names:
+ ... print(d.fields[name][:2])
+ (dtype('int64'), 0)
+ (dtype('float32'), 8)
+
+Union types
+-----------
+
+Structured datatypes are implemented in numpy to have base type
+:class:`numpy.void` by default, but it is possible to interpret other numpy
+types as structured types using the ``(base_dtype, dtype)`` form of dtype
+specification described in
+:ref:`Data Type Objects <arrays.dtypes.constructing>`. Here, ``base_dtype`` is
+the desired underlying dtype, and fields and flags will be copied from
+``dtype``. This dtype is similar to a 'union' in C.
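+
+As a sketch, an ``int32`` can be overlaid with two ``int16`` fields (the
+field names here are illustrative; explicit little-endian formats keep
+the output deterministic)::
+
+ >>> dt = np.dtype(('<i4', [('lo', '<i2'), ('hi', '<i2')]))
+ >>> x = np.zeros(2, dtype=dt)
+ >>> x['lo'] = 1
+ >>> x['hi'] = 2
+ >>> x.view('<i4')
+ array([131073, 131073], dtype=int32)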
+
+Indexing and Assignment to Structured arrays
+============================================
+
+Assigning data to a Structured Array
+------------------------------------
+
+There are a number of ways to assign values to a structured array: using python
+tuples, using scalar values, or using other structured arrays.
+
+Assignment from Python Native Types (Tuples)
+````````````````````````````````````````````
+
+The simplest way to assign values to a structured array is using python tuples.
+Each assigned value should be a tuple of length equal to the number of fields
+in the array, and not a list or array as these will trigger numpy's
+broadcasting rules. The tuple's elements are assigned to the successive fields
+of the array, from left to right::
+
+ >>> x = np.array([(1, 2, 3), (4, 5, 6)], dtype='i8, f4, f8')
+ >>> x[1] = (7, 8, 9)
+ >>> x
+ array([(1, 2., 3.), (7, 8., 9.)],
+ dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '<f8')])
+
+Assignment from Scalars
+```````````````````````
+
+A scalar assigned to a structured element will be assigned to all fields. This
+happens when a scalar is assigned to a structured array, or when an
+unstructured array is assigned to a structured array::
+
+ >>> x = np.zeros(2, dtype='i8, f4, ?, S1')
+ >>> x[:] = 3
+ >>> x
+ array([(3, 3., True, b'3'), (3, 3., True, b'3')],
+ dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '?'), ('f3', 'S1')])
+ >>> x[:] = np.arange(2)
+ >>> x
+ array([(0, 0., False, b'0'), (1, 1., True, b'1')],
+ dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '?'), ('f3', 'S1')])
+
+Structured arrays can also be assigned to unstructured arrays, but only if the
+structured datatype has just a single field::
+
+ >>> twofield = np.zeros(2, dtype=[('A', 'i4'), ('B', 'i4')])
+ >>> onefield = np.zeros(2, dtype=[('A', 'i4')])
+ >>> nostruct = np.zeros(2, dtype='i4')
+ >>> nostruct[:] = twofield
+ Traceback (most recent call last):
+ ...
+ TypeError: Cannot cast array data from dtype([('A', '<i4'), ('B', '<i4')]) to dtype('int32') according to the rule 'unsafe'
+
+Assignment from other Structured Arrays
+```````````````````````````````````````
+
+Assignment between two structured arrays occurs as if the source elements had
+been converted to tuples and then assigned to the destination elements. That
+is, the first field of the source array is assigned to the first field of the
+destination array, and the second field likewise, and so on, regardless of
+field names. Structured arrays with a different number of fields cannot be
+assigned to each other. Bytes of the destination structure which are not
+included in any of the fields are unaffected. ::
+
+ >>> a = np.zeros(3, dtype=[('a', 'i8'), ('b', 'f4'), ('c', 'S3')])
+ >>> b = np.ones(3, dtype=[('x', 'f4'), ('y', 'S3'), ('z', 'O')])
+ >>> b[:] = a
+ >>> b
+ array([(0., b'0.0', b''), (0., b'0.0', b''), (0., b'0.0', b'')],
+ dtype=[('x', '<f4'), ('y', 'S3'), ('z', 'O')])
+
+
+Assignment involving subarrays
+``````````````````````````````
+
+When assigning to fields which are subarrays, the assigned value will first be
+broadcast to the shape of the subarray.
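+
+For example::
+
+ >>> x = np.zeros(2, dtype=[('a', 'i8', (2,))])
+ >>> x['a'] = 5  # the scalar is broadcast to each (2,) subarray
+ >>> x['a']
+ array([[5, 5],
+        [5, 5]])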
+
+Indexing Structured Arrays
+--------------------------
+
+Accessing Individual Fields
+```````````````````````````
+
+Individual fields of a structured array may be accessed and modified by indexing
+the array with the field name. ::
+
+ >>> x = np.array([(1, 2), (3, 4)], dtype=[('foo', 'i8'), ('bar', 'f4')])
+ >>> x['foo']
+ array([1, 3])
+ >>> x['foo'] = 10
+ >>> x
+ array([(10, 2.), (10, 4.)],
+ dtype=[('foo', '<i8'), ('bar', '<f4')])
+
+The resulting array is a view into the original array. It shares the same
+memory locations and writing to the view will modify the original array. ::
+
+ >>> y = x['bar']
+ >>> y[:] = 11
+ >>> x
+ array([(10, 11.), (10, 11.)],
+ dtype=[('foo', '<i8'), ('bar', '<f4')])
+
+This view has the same dtype and itemsize as the indexed field, so it is
+typically a non-structured array, except in the case of nested structures. ::
+
+ >>> y.dtype, y.shape, y.strides
+ (dtype('float32'), (2,), (12,))
+
+If the accessed field is a subarray, the dimensions of the subarray
+are appended to the shape of the result::
+
+ >>> x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])
+ >>> x['a'].shape
+ (2, 2)
+ >>> x['b'].shape
+ (2, 2, 3, 3)
+
+Accessing Multiple Fields
+```````````````````````````
+
+One can index and assign to a structured array with a multi-field index, where
+the index is a list of field names.
+
+.. warning::
+ The behavior of multi-field indexes changed from Numpy 1.15 to Numpy 1.16.
+
+The result of indexing with a multi-field index is a view into the original
+array, as follows::
+
+ >>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])
+ >>> a[['a', 'c']]
+ array([(0, 0.), (0, 0.), (0, 0.)],
+ dtype={'names':['a','c'], 'formats':['<i4','<f4'], 'offsets':[0,8], 'itemsize':12})
+
+Assignment to the view modifies the original array. The view's fields will be
+in the order they were indexed. Note that unlike for single-field indexing, the
+dtype of the view has the same itemsize as the original array, and has fields
+at the same offsets as in the original array, and unindexed fields are merely
+missing.
+
+.. warning::
+ In Numpy 1.15, indexing an array with a multi-field index returned a copy of
+ the result above, but with fields packed together in memory as if
+ passed through :func:`numpy.lib.recfunctions.repack_fields`.
+
+ The new behavior as of Numpy 1.16 leads to extra "padding" bytes at the
+ location of unindexed fields compared to 1.15. You will need to update any
+ code which depends on the data having a "packed" layout. For instance code
+ such as::
+
+ >>> a[['a', 'c']].view('i8') # Fails in Numpy 1.16
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ ValueError: When changing to a smaller dtype, its size must be a divisor of the size of original dtype
+
+ will need to be changed. This code has raised a ``FutureWarning`` since
+ Numpy 1.12, and similar code has raised ``FutureWarning`` since 1.7.
+
+ In 1.16 a number of functions have been introduced in the
+ :mod:`numpy.lib.recfunctions` module to help users account for this
+ change. These are
+ :func:`numpy.lib.recfunctions.repack_fields`,
+ :func:`numpy.lib.recfunctions.structured_to_unstructured`,
+ :func:`numpy.lib.recfunctions.unstructured_to_structured`,
+ :func:`numpy.lib.recfunctions.apply_along_fields`,
+ :func:`numpy.lib.recfunctions.assign_fields_by_name`, and
+ :func:`numpy.lib.recfunctions.require_fields`.
+
+ The function :func:`numpy.lib.recfunctions.repack_fields` can always be
+ used to reproduce the old behavior, as it will return a packed copy of the
+ structured array. The code above, for example, can be replaced with:
+
+ >>> from numpy.lib.recfunctions import repack_fields
+ >>> repack_fields(a[['a', 'c']]).view('i8') # supported in 1.16
+ array([0, 0, 0])
+
+ Furthermore, numpy now provides a new function
+ :func:`numpy.lib.recfunctions.structured_to_unstructured` which is a safer
+ and more efficient alternative for users who wish to convert structured
+ arrays to unstructured arrays, as the view above is often intended to do.
+ This function allows safe conversion to an unstructured type taking into
+ account padding, often avoids a copy, and also casts the datatypes
+ as needed, unlike the view. Code such as:
+
+ >>> b = np.zeros(3, dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
+ >>> b[['x', 'z']].view('f4')
+ array([0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
+
+ can be made safer by replacing with:
+
+ >>> from numpy.lib.recfunctions import structured_to_unstructured
+ >>> structured_to_unstructured(b[['x', 'z']])
+ array([[0., 0.],
+        [0., 0.],
+        [0., 0.]], dtype=float32)
+
+
+Assignment to an array with a multi-field index modifies the original array::
+
+ >>> a[['a', 'c']] = (2, 3)
+ >>> a
+ array([(2, 0, 3.), (2, 0, 3.), (2, 0, 3.)],
+ dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<f4')])
+
+This obeys the structured array assignment rules described above. For example,
+this means that one can swap the values of two fields using appropriate
+multi-field indexes::
+
+ >>> a[['a', 'c']] = a[['c', 'a']]
+
+Indexing with an Integer to get a Structured Scalar
+```````````````````````````````````````````````````
+
+Indexing a single element of a structured array (with an integer index) returns
+a structured scalar::
+
+ >>> x = np.array([(1, 2., 3.)], dtype='i, f, f')
+ >>> scalar = x[0]
+ >>> scalar
+ (1, 2., 3.)
+ >>> type(scalar)
+ <class 'numpy.void'>
+
+Unlike other numpy scalars, structured scalars are mutable and act like views
+into the original array, such that modifying the scalar will modify the
+original array. Structured scalars also support access and assignment by field
+name::
+
+ >>> x = np.array([(1, 2), (3, 4)], dtype=[('foo', 'i8'), ('bar', 'f4')])
+ >>> s = x[0]
+ >>> s['bar'] = 100
+ >>> x
+ array([(1, 100.), (3, 4.)],
+ dtype=[('foo', '<i8'), ('bar', '<f4')])
+
+Similarly to tuples, structured scalars can also be indexed with an integer::
+
+ >>> scalar = np.array([(1, 2., 3.)], dtype='i, f, f')[0]
+ >>> scalar[0]
+ 1
+ >>> scalar[1] = 4
+
+Thus, tuples might be thought of as the native Python equivalent to numpy's
+structured types, much like native python integers are the equivalent of
+numpy's integer types. Structured scalars may be converted to a tuple by
+calling `numpy.ndarray.item`::
+
+ >>> scalar.item(), type(scalar.item())
+ ((1, 4.0, 3.0), <class 'tuple'>)
+
+Viewing Structured Arrays Containing Objects
+--------------------------------------------
+
+In order to prevent clobbering object pointers in fields of
+:class:`numpy.object` type, numpy currently does not allow views of structured
+arrays containing objects.
+
+Structure Comparison
+--------------------
+
+If the dtypes of two void structured arrays are equal, testing the equality of
+the arrays will result in a boolean array with the dimensions of the original
+arrays, with elements set to ``True`` where all fields of the corresponding
+structures are equal. Structured dtypes are equal if the field names,
+dtypes and titles are the same, ignoring endianness, and the fields are in
+the same order::
+
+ >>> a = np.zeros(2, dtype=[('a', 'i4'), ('b', 'i4')])
+ >>> b = np.ones(2, dtype=[('a', 'i4'), ('b', 'i4')])
+ >>> a == b
+ array([False, False])
+
+Currently, if the dtypes of two void structured arrays are not equivalent the
+comparison fails, returning the scalar value ``False``. This behavior is
+deprecated as of numpy 1.10 and will raise an error or perform elementwise
+comparison in the future.
+
+The ``<`` and ``>`` operators always return ``False`` when comparing void
+structured arrays, and arithmetic and bitwise operations are not supported.
+
+Record Arrays
+=============
+
+As an optional convenience numpy provides an ndarray subclass,
+:class:`numpy.recarray`, and associated helper functions in the
+``numpy.rec`` submodule, that allows
+access to fields of structured arrays by attribute instead of only by index.
+Record arrays also use a special datatype, :class:`numpy.record`, that allows
+field access by attribute on the structured scalars obtained from the array.
+
+The simplest way to create a record array is with ``numpy.rec.array``::
+
+ >>> recordarr = np.rec.array([(1, 2., 'Hello'), (2, 3., "World")],
+ ... dtype=[('foo', 'i4'),('bar', 'f4'), ('baz', 'S10')])
+ >>> recordarr.bar
+ array([ 2., 3.], dtype=float32)
+ >>> recordarr[1:2]
+ rec.array([(2, 3., b'World')],
+ dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])
+ >>> recordarr[1:2].foo
+ array([2], dtype=int32)
+ >>> recordarr.foo[1:2]
+ array([2], dtype=int32)
+ >>> recordarr[1].baz
+ b'World'
+
+:func:`numpy.rec.array` can convert a wide variety of arguments into record
+arrays, including structured arrays::
+
+ >>> arr = np.array([(1, 2., 'Hello'), (2, 3., "World")],
+ ... dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])
+ >>> recordarr = np.rec.array(arr)
+
+The :mod:`numpy.rec` module provides a number of other convenience functions for
+creating record arrays, see :ref:`record array creation routines
+<routines.array-creation.rec>`.
+
+A record array representation of a structured array can be obtained using the
+appropriate :meth:`view <numpy.ndarray.view>`::
+
+ >>> arr = np.array([(1, 2., 'Hello'), (2, 3., "World")],
+ ... dtype=[('foo', 'i4'),('bar', 'f4'), ('baz', 'a10')])
+ >>> recordarr = arr.view(dtype=np.dtype((np.record, arr.dtype)),
+ ... type=np.recarray)
+
+For convenience, viewing an ndarray as type :class:`np.recarray` will
+automatically convert to :class:`np.record` datatype, so the dtype can be left
+out of the view::
+
+ >>> recordarr = arr.view(np.recarray)
+ >>> recordarr.dtype
+ dtype((numpy.record, [('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')]))
+
+To get back to a plain ndarray both the dtype and type must be reset. The
+following view does so, taking into account the unusual case that the
+recordarr was not a structured type::
+
+ >>> arr2 = recordarr.view(recordarr.dtype.fields or recordarr.dtype, np.ndarray)
+
+Record array fields accessed by index or by attribute are returned as a record
+array if the field has a structured type but as a plain ndarray otherwise. ::
+
+ >>> recordarr = np.rec.array([('Hello', (1, 2)), ("World", (3, 4))],
+ ... dtype=[('foo', 'S6'),('bar', [('A', int), ('B', int)])])
+ >>> type(recordarr.foo)
+ <class 'numpy.ndarray'>
+ >>> type(recordarr.bar)
+ <class 'numpy.recarray'>
+
+Note that if a field has the same name as an ndarray attribute, the ndarray
+attribute takes precedence. Such fields will be inaccessible by attribute but
+will still be accessible by index.
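+
+For example, a field named ``mean`` is shadowed by the ``mean`` method of
+ndarray, but remains available by index::
+
+ >>> rec = np.rec.array([(1., 2.)], dtype=[('mean', 'f4'), ('x', 'f4')])
+ >>> rec.mean  # the ndarray method, not the field
+ <built-in method mean of recarray object at 0x...>
+ >>> rec['mean']
+ array([1.], dtype=float32)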
+
Recarray Helper Functions
-*************************
+-------------------------
.. automodule:: numpy.lib.recfunctions
:members:
diff --git a/doc/source/user/basics.subclassing.rst b/doc/source/user/basics.subclassing.rst
index 43315521c..d8d104220 100644
--- a/doc/source/user/basics.subclassing.rst
+++ b/doc/source/user/basics.subclassing.rst
@@ -4,4 +4,751 @@
Subclassing ndarray
*******************
-.. automodule:: numpy.doc.subclassing
+Introduction
+------------
+
+Subclassing ndarray is relatively simple, but it has some complications
+compared to other Python objects. On this page we explain the machinery
+that allows you to subclass ndarray, and the implications for
+implementing a subclass.
+
+ndarrays and object creation
+============================
+
+Subclassing ndarray is complicated by the fact that new instances of
+ndarray classes can come about in three different ways. These are:
+
+#. Explicit constructor call - as in ``MySubClass(params)``. This is
+ the usual route to Python instance creation.
+#. View casting - casting an existing ndarray as a given subclass
+#. New from template - creating a new instance from a template
+ instance. Examples include returning slices from a subclassed array,
+ creating return types from ufuncs, and copying arrays. See
+ :ref:`new-from-template` for more details
+
+The last two are characteristics of ndarrays - in order to support
+things like array slicing. The complications of subclassing ndarray are
+due to the mechanisms numpy has to support these latter two routes of
+instance creation.
+
+.. _view-casting:
+
+View casting
+------------
+
+*View casting* is the standard ndarray mechanism by which you take an
+ndarray of any subclass, and return a view of the array as another
+(specified) subclass:
+
+>>> import numpy as np
+>>> # create a completely useless ndarray subclass
+>>> class C(np.ndarray): pass
+>>> # create a standard ndarray
+>>> arr = np.zeros((3,))
+>>> # take a view of it, as our useless subclass
+>>> c_arr = arr.view(C)
+>>> type(c_arr)
+<class 'C'>
+
+.. _new-from-template:
+
+Creating new from template
+--------------------------
+
+New instances of an ndarray subclass can also come about by a very
+similar mechanism to :ref:`view-casting`, when numpy finds it needs to
+create a new instance from a template instance. The most obvious place
+this has to happen is when you are taking slices of subclassed arrays.
+For example:
+
+>>> v = c_arr[1:]
+>>> type(v) # the view is of type 'C'
+<class 'C'>
+>>> v is c_arr # but it's a new instance
+False
+
+The slice is a *view* onto the original ``c_arr`` data. So, when we
+take a view from the ndarray, we return a new ndarray, of the same
+class, that points to the data in the original.
+
+There are other points in the use of ndarrays where we need such views,
+such as copying arrays (``c_arr.copy()``), creating ufunc output arrays
+(see also :ref:`array-wrap`), and reducing methods (like
+``c_arr.mean()``).
+
+Relationship of view casting and new-from-template
+--------------------------------------------------
+
+These paths both use the same machinery. We make the distinction here,
+because they result in different input to your methods. Specifically,
+:ref:`view-casting` means you have created a new instance of your array
+type from any potential subclass of ndarray. :ref:`new-from-template`
+means you have created a new instance of your class from a pre-existing
+instance, allowing you - for example - to copy across attributes that
+are particular to your subclass.
+
+Implications for subclassing
+----------------------------
+
+If we subclass ndarray, we need to deal not only with explicit
+construction of our array type, but also :ref:`view-casting` or
+:ref:`new-from-template`. NumPy has the machinery to do this, and it is
+this machinery that makes subclassing slightly non-standard.
+
+There are two aspects to the machinery that ndarray uses to support
+views and new-from-template in subclasses.
+
+The first is the use of the ``ndarray.__new__`` method for the main work
+of object initialization, rather than the more usual ``__init__``
+method. The second is the use of the ``__array_finalize__`` method to
+allow subclasses to clean up after the creation of views and new
+instances from templates.
+
+A brief Python primer on ``__new__`` and ``__init__``
+=====================================================
+
+``__new__`` is a standard Python method, and, if present, is called
+before ``__init__`` when we create a class instance. See the `python
+__new__ documentation
+<https://docs.python.org/reference/datamodel.html#object.__new__>`_ for more detail.
+
+For example, consider the following Python code:
+
+.. testcode::
+
+ class C:
+ def __new__(cls, *args):
+ print('Cls in __new__:', cls)
+ print('Args in __new__:', args)
+ # The `object` type __new__ method takes a single argument.
+ return object.__new__(cls)
+
+ def __init__(self, *args):
+ print('type(self) in __init__:', type(self))
+ print('Args in __init__:', args)
+
+meaning that we get:
+
+>>> c = C('hello')
+Cls in __new__: <class 'C'>
+Args in __new__: ('hello',)
+type(self) in __init__: <class 'C'>
+Args in __init__: ('hello',)
+
+When we call ``C('hello')``, the ``__new__`` method gets its own class
+as first argument, and the passed argument, which is the string
+``'hello'``. After python calls ``__new__``, it usually (see below)
+calls our ``__init__`` method, with the output of ``__new__`` as the
+first argument (now a class instance), and the passed arguments
+following.
+
+As you can see, the object can be initialized in the ``__new__``
+method or the ``__init__`` method, or both, and in fact ndarray does
+not have an ``__init__`` method, because all the initialization is
+done in the ``__new__`` method.
+
+Why use ``__new__`` rather than just the usual ``__init__``? Because
+in some cases, as for ndarray, we want to be able to return an object
+of some other class. Consider the following:
+
+.. testcode::
+
+ class D(C):
+ def __new__(cls, *args):
+ print('D cls is:', cls)
+ print('D args in __new__:', args)
+ return C.__new__(C, *args)
+
+ def __init__(self, *args):
+ # we never get here
+ print('In D __init__')
+
+meaning that:
+
+>>> obj = D('hello')
+D cls is: <class 'D'>
+D args in __new__: ('hello',)
+Cls in __new__: <class 'C'>
+Args in __new__: ('hello',)
+>>> type(obj)
+<class 'C'>
+
+The definition of ``C`` is the same as before, but for ``D``, the
+``__new__`` method returns an instance of class ``C`` rather than
+``D``. Note that the ``__init__`` method of ``D`` does not get
+called. In general, when the ``__new__`` method returns an object of
+class other than the class in which it is defined, the ``__init__``
+method of that class is not called.
+
+This is how subclasses of the ndarray class are able to return views
+that preserve the class type. When taking a view, the standard
+ndarray machinery creates the new ndarray object with something
+like::
+
+ obj = ndarray.__new__(subtype, shape, ...
+
+where ``subtype`` is the subclass. Thus the returned view is of the
+same class as the subclass, rather than being of class ``ndarray``.
+
+That solves the problem of returning views of the same type, but now
+we have a new problem. The machinery of ndarray can set the class
+this way, in its standard methods for taking views, but the ndarray
+``__new__`` method knows nothing of what we have done in our own
+``__new__`` method in order to set attributes, and so on. (Aside -
+why not call ``obj = subtype.__new__(...`` then? Because we may not
+have a ``__new__`` method with the same call signature).
+
+The role of ``__array_finalize__``
+==================================
+
+``__array_finalize__`` is the mechanism that numpy provides to allow
+subclasses to handle the various ways that new instances get created.
+
+Remember that subclass instances can come about in these three ways:
+
+#. explicit constructor call (``obj = MySubClass(params)``). This will
+ call the usual sequence of ``MySubClass.__new__`` then (if it exists)
+ ``MySubClass.__init__``.
+#. :ref:`view-casting`
+#. :ref:`new-from-template`
+
+Our ``MySubClass.__new__`` method only gets called in the case of the
+explicit constructor call, so we can't rely on ``MySubClass.__new__`` or
+``MySubClass.__init__`` to deal with the view casting and
+new-from-template. It turns out that ``MySubClass.__array_finalize__``
+*does* get called for all three methods of object creation, so this is
+where our object creation housekeeping usually goes.
+
+* For the explicit constructor call, our subclass will need to create a
+ new ndarray instance of its own class. In practice this means that
+ we, the authors of the code, will need to make a call to
+ ``ndarray.__new__(MySubClass,...)``, a class-hierarchy prepared call to
+ ``super(MySubClass, cls).__new__(cls, ...)``, or do view casting of an
+ existing array (see below)
+* For view casting and new-from-template, the equivalent of
+ ``ndarray.__new__(MySubClass,...`` is called, at the C level.
+
+The arguments that ``__array_finalize__`` receives differ for the three
+methods of instance creation above.
+
+The following code allows us to look at the call sequences and arguments:
+
+.. testcode::
+
+ import numpy as np
+
+ class C(np.ndarray):
+ def __new__(cls, *args, **kwargs):
+ print('In __new__ with class %s' % cls)
+ return super(C, cls).__new__(cls, *args, **kwargs)
+
+ def __init__(self, *args, **kwargs):
+ # in practice you probably will not need or want an __init__
+ # method for your subclass
+ print('In __init__ with class %s' % self.__class__)
+
+ def __array_finalize__(self, obj):
+ print('In array_finalize:')
+ print(' self type is %s' % type(self))
+ print(' obj type is %s' % type(obj))
+
+
+Now:
+
+>>> # Explicit constructor
+>>> c = C((10,))
+In __new__ with class <class 'C'>
+In array_finalize:
+ self type is <class 'C'>
+ obj type is <class 'NoneType'>
+In __init__ with class <class 'C'>
+>>> # View casting
+>>> a = np.arange(10)
+>>> cast_a = a.view(C)
+In array_finalize:
+ self type is <class 'C'>
+ obj type is <class 'numpy.ndarray'>
+>>> # Slicing (example of new-from-template)
+>>> cv = c[:1]
+In array_finalize:
+ self type is <class 'C'>
+ obj type is <class 'C'>
+
+The signature of ``__array_finalize__`` is::
+
+ def __array_finalize__(self, obj):
+
+One sees that the ``super`` call, which goes to
+``ndarray.__new__``, passes ``__array_finalize__`` the new object, of our
+own class (``self``) as well as the object from which the view has been
+taken (``obj``). As you can see from the output above, the ``self`` is
+always a newly created instance of our subclass, and the type of ``obj``
+differs for the three instance creation methods:
+
+* When called from the explicit constructor, ``obj`` is ``None``
+* When called from view casting, ``obj`` can be an instance of any
+ subclass of ndarray, including our own.
+* When called in new-from-template, ``obj`` is another instance of our
+ own subclass, that we might use to update the new ``self`` instance.
+
+Because ``__array_finalize__`` is the only method that always sees new
+instances being created, it is the sensible place to fill in instance
+defaults for new object attributes, among other tasks.
+
+This may be clearer with an example.
+
+Simple example - adding an extra attribute to ndarray
+-----------------------------------------------------
+
+.. testcode::
+
+ import numpy as np
+
+ class InfoArray(np.ndarray):
+
+ def __new__(subtype, shape, dtype=float, buffer=None, offset=0,
+ strides=None, order=None, info=None):
+ # Create the ndarray instance of our type, given the usual
+ # ndarray input arguments. This will call the standard
+ # ndarray constructor, but return an object of our type.
+ # It also triggers a call to InfoArray.__array_finalize__
+ obj = super(InfoArray, subtype).__new__(subtype, shape, dtype,
+ buffer, offset, strides,
+ order)
+ # set the new 'info' attribute to the value passed
+ obj.info = info
+ # Finally, we must return the newly created object:
+ return obj
+
+ def __array_finalize__(self, obj):
+ # ``self`` is a new object resulting from
+ # ndarray.__new__(InfoArray, ...), therefore it only has
+ # attributes that the ndarray.__new__ constructor gave it -
+ # i.e. those of a standard ndarray.
+ #
+ # We could have got to the ndarray.__new__ call in 3 ways:
+ # From an explicit constructor - e.g. InfoArray():
+ # obj is None
+ # (we're in the middle of the InfoArray.__new__
+ # constructor, and self.info will be set when we return to
+ # InfoArray.__new__)
+ if obj is None: return
+ # From view casting - e.g arr.view(InfoArray):
+ # obj is arr
+ # (type(obj) can be InfoArray)
+ # From new-from-template - e.g infoarr[:3]
+ # type(obj) is InfoArray
+ #
+ # Note that it is here, rather than in the __new__ method,
+ # that we set the default value for 'info', because this
+ # method sees all creation of default objects - with the
+ # InfoArray.__new__ constructor, but also with
+ # arr.view(InfoArray).
+ self.info = getattr(obj, 'info', None)
+ # We do not need to return anything
+
+
+Using the object looks like this:
+
+ >>> obj = InfoArray(shape=(3,)) # explicit constructor
+ >>> type(obj)
+ <class 'InfoArray'>
+ >>> obj.info is None
+ True
+ >>> obj = InfoArray(shape=(3,), info='information')
+ >>> obj.info
+ 'information'
+ >>> v = obj[1:] # new-from-template - here - slicing
+ >>> type(v)
+ <class 'InfoArray'>
+ >>> v.info
+ 'information'
+ >>> arr = np.arange(10)
+ >>> cast_arr = arr.view(InfoArray) # view casting
+ >>> type(cast_arr)
+ <class 'InfoArray'>
+ >>> cast_arr.info is None
+ True
+
+This class isn't very useful, because it has the same constructor as the
+bare ndarray object, including passing in buffers and shapes and so on.
+We would probably prefer the constructor to be able to take an already
+formed ndarray from the usual numpy calls to ``np.array`` and return an
+object.
+
+Slightly more realistic example - attribute added to existing array
+-------------------------------------------------------------------
+
+Here is a class that takes a standard ndarray that already exists, casts
+as our type, and adds an extra attribute.
+
+.. testcode::
+
+ import numpy as np
+
+ class RealisticInfoArray(np.ndarray):
+
+ def __new__(cls, input_array, info=None):
+ # Input array is an already formed ndarray instance
+ # We first cast to be our class type
+ obj = np.asarray(input_array).view(cls)
+ # add the new attribute to the created instance
+ obj.info = info
+ # Finally, we must return the newly created object:
+ return obj
+
+ def __array_finalize__(self, obj):
+ # see InfoArray.__array_finalize__ for comments
+ if obj is None: return
+ self.info = getattr(obj, 'info', None)
+
+
+So:
+
+ >>> arr = np.arange(5)
+ >>> obj = RealisticInfoArray(arr, info='information')
+ >>> type(obj)
+ <class 'RealisticInfoArray'>
+ >>> obj.info
+ 'information'
+ >>> v = obj[1:]
+ >>> type(v)
+ <class 'RealisticInfoArray'>
+ >>> v.info
+ 'information'
+
+.. _array-ufunc:
+
+``__array_ufunc__`` for ufuncs
+------------------------------
+
+ .. versionadded:: 1.13
+
+A subclass can override what happens when executing numpy ufuncs on it by
+overriding the default ``ndarray.__array_ufunc__`` method. This method is
+executed *instead* of the ufunc and should return either the result of the
+operation, or :obj:`NotImplemented` if the operation requested is not
+implemented.
+
+The signature of ``__array_ufunc__`` is::
+
+ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+
+ - *ufunc* is the ufunc object that was called.
+ - *method* is a string indicating how the Ufunc was called, either
+ ``"__call__"`` to indicate it was called directly, or one of its
+ :ref:`methods<ufuncs.methods>`: ``"reduce"``, ``"accumulate"``,
+ ``"reduceat"``, ``"outer"``, or ``"at"``.
+ - *inputs* is a tuple of the input arguments to the ``ufunc``
+ - *kwargs* contains any optional or keyword arguments passed to the
+ function. This includes any ``out`` arguments, which are always
+ contained in a tuple.
+
+A typical implementation would convert any inputs or outputs that are
+instances of one's own class, pass everything on to a superclass using
+``super()``, and finally return the results after possible
+back-conversion. An example, taken from the test case
+``test_ufunc_override_with_super`` in ``core/tests/test_umath.py``, is the
+following.
+
+.. testcode::
+
+ import numpy as np
+
+ class A(np.ndarray):
+ def __array_ufunc__(self, ufunc, method, *inputs, out=None, **kwargs):
+ args = []
+ in_no = []
+ for i, input_ in enumerate(inputs):
+ if isinstance(input_, A):
+ in_no.append(i)
+ args.append(input_.view(np.ndarray))
+ else:
+ args.append(input_)
+
+ outputs = out
+ out_no = []
+ if outputs:
+ out_args = []
+ for j, output in enumerate(outputs):
+ if isinstance(output, A):
+ out_no.append(j)
+ out_args.append(output.view(np.ndarray))
+ else:
+ out_args.append(output)
+ kwargs['out'] = tuple(out_args)
+ else:
+ outputs = (None,) * ufunc.nout
+
+ info = {}
+ if in_no:
+ info['inputs'] = in_no
+ if out_no:
+ info['outputs'] = out_no
+
+ results = super(A, self).__array_ufunc__(ufunc, method,
+ *args, **kwargs)
+ if results is NotImplemented:
+ return NotImplemented
+
+ if method == 'at':
+ if isinstance(inputs[0], A):
+ inputs[0].info = info
+ return
+
+ if ufunc.nout == 1:
+ results = (results,)
+
+ results = tuple((np.asarray(result).view(A)
+ if output is None else output)
+ for result, output in zip(results, outputs))
+ if results and isinstance(results[0], A):
+ results[0].info = info
+
+ return results[0] if len(results) == 1 else results
+
+So, this class does not actually do anything interesting: it just
+converts any instances of its own to regular ndarray (otherwise, we'd
+get infinite recursion!), and adds an ``info`` dictionary that tells
+which inputs and outputs it converted. Hence, e.g.,
+
+>>> a = np.arange(5.).view(A)
+>>> b = np.sin(a)
+>>> b.info
+{'inputs': [0]}
+>>> b = np.sin(np.arange(5.), out=(a,))
+>>> b.info
+{'outputs': [0]}
+>>> a = np.arange(5.).view(A)
+>>> b = np.ones(1).view(A)
+>>> c = a + b
+>>> c.info
+{'inputs': [0, 1]}
+>>> a += b
+>>> a.info
+{'inputs': [0, 1], 'outputs': [0]}
+
+Note that another approach would be to use ``getattr(ufunc,
+method)(*inputs, **kwargs)`` instead of the ``super`` call. For this example,
+the result would be identical, but there is a difference if another operand
+also defines ``__array_ufunc__``. E.g., let's assume that we evaluate
+``np.add(a, b)``, where ``b`` is an instance of another class ``B`` that has
+an override. If you use ``super`` as in the example,
+``ndarray.__array_ufunc__`` will notice that ``b`` has an override, which
+means it cannot evaluate the result itself. Thus, it will return
+`NotImplemented` and so will our class ``A``. Then, control will be passed
+over to ``b``, which either knows how to deal with us and produces a result,
+or does not and returns `NotImplemented`, raising a ``TypeError``.
+
+If instead, we replace our ``super`` call with ``getattr(ufunc, method)``, we
+effectively do ``np.add(a.view(np.ndarray), b)``. Again, ``B.__array_ufunc__``
+will be called, but now it sees an ``ndarray`` as the other argument. Likely,
+it will know how to handle this, and return a new instance of the ``B`` class
+to us. Our example class is not set up to handle this, but it might well be
+the best approach if, e.g., one were to re-implement ``MaskedArray`` using
+``__array_ufunc__``.
+
+As a final note: if the ``super`` route is suited to a given class, an
+advantage of using it is that it helps in constructing class hierarchies.
+E.g., suppose that our other class ``B`` also used the ``super`` in its
+``__array_ufunc__`` implementation, and we created a class ``C`` that depended
+on both, i.e., ``class C(A, B)`` (with, for simplicity, not another
+``__array_ufunc__`` override). Then any ufunc on an instance of ``C`` would
+pass on to ``A.__array_ufunc__``, the ``super`` call in ``A`` would go to
+``B.__array_ufunc__``, and the ``super`` call in ``B`` would go to
+``ndarray.__array_ufunc__``, thus allowing ``A`` and ``B`` to collaborate.
+
+.. _array-wrap:
+
+``__array_wrap__`` for ufuncs and other functions
+-------------------------------------------------
+
+Prior to numpy 1.13, the behaviour of ufuncs could only be tuned using
+``__array_wrap__`` and ``__array_prepare__``. These two allowed one to
+change the output type of a ufunc, but, in contrast to
+``__array_ufunc__``, did not allow one to make any changes to the inputs.
+It is hoped to eventually deprecate these, but ``__array_wrap__`` is also
+used by other numpy functions and methods, such as ``squeeze``, so at the
+present time is still needed for full functionality.
+
+Conceptually, ``__array_wrap__`` "wraps up the action" in the sense of
+allowing a subclass to set the type of the return value and update
+attributes and metadata. Let's show how this works with an example. First
+we return to the simpler example subclass, but with a different name and
+some print statements:
+
+.. testcode::
+
+ import numpy as np
+
+ class MySubClass(np.ndarray):
+
+ def __new__(cls, input_array, info=None):
+ obj = np.asarray(input_array).view(cls)
+ obj.info = info
+ return obj
+
+ def __array_finalize__(self, obj):
+ print('In __array_finalize__:')
+ print(' self is %s' % repr(self))
+ print(' obj is %s' % repr(obj))
+ if obj is None: return
+ self.info = getattr(obj, 'info', None)
+
+ def __array_wrap__(self, out_arr, context=None):
+ print('In __array_wrap__:')
+ print(' self is %s' % repr(self))
+ print(' arr is %s' % repr(out_arr))
+ # then just call the parent
+ return super(MySubClass, self).__array_wrap__(out_arr, context)
+
+We run a ufunc on an instance of our new array:
+
+>>> obj = MySubClass(np.arange(5), info='spam')
+In __array_finalize__:
+ self is MySubClass([0, 1, 2, 3, 4])
+ obj is array([0, 1, 2, 3, 4])
+>>> arr2 = np.arange(5)+1
+>>> ret = np.add(arr2, obj)
+In __array_wrap__:
+ self is MySubClass([0, 1, 2, 3, 4])
+ arr is array([1, 3, 5, 7, 9])
+In __array_finalize__:
+ self is MySubClass([1, 3, 5, 7, 9])
+ obj is MySubClass([0, 1, 2, 3, 4])
+>>> ret
+MySubClass([1, 3, 5, 7, 9])
+>>> ret.info
+'spam'
+
+Note that the ufunc (``np.add``) has called the ``__array_wrap__`` method
+with arguments ``self`` as ``obj``, and ``out_arr`` as the (ndarray) result
+of the addition. In turn, the default ``__array_wrap__``
+(``ndarray.__array_wrap__``) has cast the result to class ``MySubClass``,
+and called ``__array_finalize__`` - hence the copying of the ``info``
+attribute. This has all happened at the C level.
+
+But, we could do anything we wanted:
+
+.. testcode::
+
+ class SillySubClass(np.ndarray):
+
+ def __array_wrap__(self, arr, context=None):
+ return 'I lost your data'
+
+>>> arr1 = np.arange(5)
+>>> obj = arr1.view(SillySubClass)
+>>> arr2 = np.arange(5)
+>>> ret = np.multiply(obj, arr2)
+>>> ret
+'I lost your data'
+
+So, by defining a specific ``__array_wrap__`` method for our subclass,
+we can tweak the output from ufuncs. The ``__array_wrap__`` method
+requires ``self``, then an argument - which is the result of the ufunc -
+and an optional parameter *context*. This parameter is passed by
+ufuncs as a 3-element tuple: (name of the ufunc, arguments of the ufunc,
+domain of the ufunc), but is not set by other numpy functions. Though it
+is possible to do otherwise (as seen above), ``__array_wrap__`` should
+return an instance of its containing class. See the masked array
+subclass for an implementation.
+
+In addition to ``__array_wrap__``, which is called on the way out of the
+ufunc, there is also an ``__array_prepare__`` method which is called on
+the way into the ufunc, after the output arrays are created but before any
+computation has been performed. The default implementation does nothing
+but pass through the array. ``__array_prepare__`` should not attempt to
+access the array data or resize the array, it is intended for setting the
+output array type, updating attributes and metadata, and performing any
+checks based on the input that may be desired before computation begins.
+Like ``__array_wrap__``, ``__array_prepare__`` must return an ndarray or
+subclass thereof or raise an error.
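+
+As a minimal sketch (the class name here is illustrative; the override
+only prints on the way into the ufunc and defers to the default
+behavior):
+
+.. testcode::
+
+    class LoggedArray(np.ndarray):
+
+        def __array_prepare__(self, out_arr, context=None):
+            # out_arr is the freshly created, not-yet-computed output
+            # array; inspect or re-type it here, but don't touch its data
+            print('In __array_prepare__:', type(out_arr))
+            return super(LoggedArray, self).__array_prepare__(out_arr, context)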
+
+Extra gotchas - custom ``__del__`` methods and ndarray.base
+-----------------------------------------------------------
+
+One of the problems that ndarray solves is keeping track of memory
+ownership of ndarrays and their views. Consider the case where we have
+created an ndarray, ``arr`` and have taken a slice with ``v = arr[1:]``.
+The two objects are looking at the same memory. NumPy keeps track of
+where the data came from for a particular array or view, with the
+``base`` attribute:
+
+>>> # A normal ndarray, that owns its own data
+>>> arr = np.zeros((4,))
+>>> # In this case, base is None
+>>> arr.base is None
+True
+>>> # We take a view
+>>> v1 = arr[1:]
+>>> # base now points to the array that it derived from
+>>> v1.base is arr
+True
+>>> # Take a view of a view
+>>> v2 = v1[1:]
+>>> # base points to the original array that it was derived from
+>>> v2.base is arr
+True
+
+In general, if the array owns its own memory, as for ``arr`` in this
+case, then ``arr.base`` will be None - there are some exceptions to this
+- see the numpy book for more details.
+
+The ``base`` attribute is useful in being able to tell whether we have
+a view or the original array. This in turn can be useful if we need
+to know whether or not to do some specific cleanup when the subclassed
+array is deleted. For example, we may only want to do the cleanup if
+the original array is deleted, but not the views. For an example of
+how this can work, have a look at the ``memmap`` class in
+``numpy.core``.
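+
+A hypothetical sketch of such base-aware cleanup (the resource handling
+here is imaginary):
+
+.. testcode::
+
+    class CleanupArray(np.ndarray):
+
+        def __del__(self):
+            # views derived from this array have ``base`` set, so only
+            # the memory-owning original triggers the cleanup
+            if self.base is None:
+                pass  # release the (hypothetical) external resource here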
+
+Subclassing and Downstream Compatibility
+----------------------------------------
+
+When sub-classing ``ndarray`` or creating duck-types that mimic the ``ndarray``
+interface, it is your responsibility to decide how aligned your APIs will be
+with those of numpy. For convenience, many numpy functions that have a corresponding
+``ndarray`` method (e.g., ``sum``, ``mean``, ``take``, ``reshape``) work by checking
+if the first argument to a function has a method of the same name. If it exists, the
+method is called instead of coercing the arguments to a numpy array.
+
+For example, if you want your sub-class or duck-type to be compatible with
+numpy's ``sum`` function, the method signature for this object's ``sum`` method
+should be the following:
+
+.. testcode::
+
+ def sum(self, axis=None, dtype=None, out=None, keepdims=False):
+ ...
+
+This is exactly the same method signature as that of ``np.sum``, so now if a user calls
+``np.sum`` on this object, numpy will call the object's own ``sum`` method and
+pass in these arguments enumerated above in the signature, and no errors will
+be raised because the signatures are completely compatible with each other.
+
+If, however, you decide to deviate from this signature and do something like this:
+
+.. testcode::
+
+ def sum(self, axis=None, dtype=None):
+ ...
+
+This object is no longer compatible with ``np.sum`` because if you call ``np.sum``,
+it will pass in unexpected arguments ``out`` and ``keepdims``, causing a TypeError
+to be raised.
+
+If you wish to maintain compatibility with numpy and its subsequent versions (which
+might add new keyword arguments) but do not want to surface all of numpy's arguments,
+your function's signature should accept ``**kwargs``. For example:
+
+.. testcode::
+
+ def sum(self, axis=None, dtype=None, **unused_kwargs):
+ ...
+
+This object is now compatible with ``np.sum`` again because any extraneous arguments
+(i.e. keywords that are not ``axis`` or ``dtype``) will be hidden away in the
+``**unused_kwargs`` parameter.
+
+
diff --git a/doc/source/user/basics.types.rst b/doc/source/user/basics.types.rst
index 5ce5af15a..3c39b35d0 100644
--- a/doc/source/user/basics.types.rst
+++ b/doc/source/user/basics.types.rst
@@ -4,4 +4,339 @@ Data types
.. seealso:: :ref:`Data type objects <arrays.dtypes>`
-.. automodule:: numpy.doc.basics
+Array types and conversions between types
+=========================================
+
+NumPy supports a much greater variety of numerical types than Python does.
+This section shows which are available, and how to modify an array's data-type.
+
+The primitive types supported are tied closely to those in C:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Numpy type
+ - C type
+ - Description
+
+ * - `np.bool_`
+ - ``bool``
+ - Boolean (True or False) stored as a byte
+
+ * - `np.byte`
+ - ``signed char``
+ - Platform-defined
+
+ * - `np.ubyte`
+ - ``unsigned char``
+ - Platform-defined
+
+ * - `np.short`
+ - ``short``
+ - Platform-defined
+
+ * - `np.ushort`
+ - ``unsigned short``
+ - Platform-defined
+
+ * - `np.intc`
+ - ``int``
+ - Platform-defined
+
+ * - `np.uintc`
+ - ``unsigned int``
+ - Platform-defined
+
+ * - `np.int_`
+ - ``long``
+ - Platform-defined
+
+ * - `np.uint`
+ - ``unsigned long``
+ - Platform-defined
+
+ * - `np.longlong`
+ - ``long long``
+ - Platform-defined
+
+ * - `np.ulonglong`
+ - ``unsigned long long``
+ - Platform-defined
+
+ * - `np.half` / `np.float16`
+ -
+ - Half precision float:
+ sign bit, 5 bits exponent, 10 bits mantissa
+
+ * - `np.single`
+ - ``float``
+ - Platform-defined single precision float:
+ typically sign bit, 8 bits exponent, 23 bits mantissa
+
+ * - `np.double`
+ - ``double``
+ - Platform-defined double precision float:
+ typically sign bit, 11 bits exponent, 52 bits mantissa.
+
+ * - `np.longdouble`
+ - ``long double``
+ - Platform-defined extended-precision float
+
+ * - `np.csingle`
+ - ``float complex``
+ - Complex number, represented by two single-precision floats (real and imaginary components)
+
+ * - `np.cdouble`
+ - ``double complex``
+ - Complex number, represented by two double-precision floats (real and imaginary components).
+
+ * - `np.clongdouble`
+ - ``long double complex``
+ - Complex number, represented by two extended-precision floats (real and imaginary components).
+
+
+Since many of these have platform-dependent definitions, a set of fixed-size
+aliases are provided:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Numpy type
+ - C type
+ - Description
+
+ * - `np.int8`
+ - ``int8_t``
+ - Byte (-128 to 127)
+
+ * - `np.int16`
+ - ``int16_t``
+ - Integer (-32768 to 32767)
+
+ * - `np.int32`
+ - ``int32_t``
+ - Integer (-2147483648 to 2147483647)
+
+ * - `np.int64`
+ - ``int64_t``
+ - Integer (-9223372036854775808 to 9223372036854775807)
+
+ * - `np.uint8`
+ - ``uint8_t``
+ - Unsigned integer (0 to 255)
+
+ * - `np.uint16`
+ - ``uint16_t``
+ - Unsigned integer (0 to 65535)
+
+ * - `np.uint32`
+ - ``uint32_t``
+ - Unsigned integer (0 to 4294967295)
+
+ * - `np.uint64`
+ - ``uint64_t``
+ - Unsigned integer (0 to 18446744073709551615)
+
+ * - `np.intp`
+ - ``intptr_t``
+ - Integer used for indexing, typically the same as ``ssize_t``
+
+ * - `np.uintp`
+ - ``uintptr_t``
+ - Integer large enough to hold a pointer
+
+ * - `np.float32`
+ - ``float``
+ -
+
+ * - `np.float64` / `np.float_`
+ - ``double``
+ - Note that this matches the precision of the builtin python `float`.
+
+ * - `np.complex64`
+ - ``float complex``
+ - Complex number, represented by two 32-bit floats (real and imaginary components)
+
+ * - `np.complex128` / `np.complex_`
+ - ``double complex``
+ - Note that this matches the precision of the builtin python `complex`.
+
+
+NumPy numerical types are instances of ``dtype`` (data-type) objects, each
+having unique characteristics. Once you have imported NumPy using ::
+
+ >>> import numpy as np
+
+the dtypes are available as ``np.bool_``, ``np.float32``, etc.
+
+Advanced types, not listed in the table above, are explored in
+section :ref:`structured_arrays`.
+
+There are 5 basic numerical types representing booleans (bool), integers (int),
+unsigned integers (uint), floating point (float), and complex. Those with numbers
+in their name indicate the bitsize of the type (i.e. how many bits are needed
+to represent a single value in memory). Some types, such as ``int`` and
+``intp``, have differing bitsizes, dependent on the platforms (e.g. 32-bit
+vs. 64-bit machines). This should be taken into account when interfacing
+with low-level code (such as C or Fortran) where the raw memory is addressed.
+
+Data-types can be used as functions to convert python numbers to array scalars
+(see the array scalar section for an explanation), python sequences of numbers
+to arrays of that type, or as arguments to the dtype keyword that many numpy
+functions or methods accept. Some examples::
+
+ >>> import numpy as np
+ >>> x = np.float32(1.0)
+ >>> x
+ 1.0
+ >>> y = np.int_([1,2,4])
+ >>> y
+ array([1, 2, 4])
+ >>> z = np.arange(3, dtype=np.uint8)
+ >>> z
+ array([0, 1, 2], dtype=uint8)
+
+Array types can also be referred to by character codes, mostly to retain
+backward compatibility with older packages such as Numeric. Some
+documentation may still refer to these, for example::
+
+ >>> np.array([1, 2, 3], dtype='f')
+ array([ 1., 2., 3.], dtype=float32)
+
+We recommend using dtype objects instead.
+
+To convert the type of an array, use the .astype() method (preferred) or
+the type itself as a function. For example: ::
+
+ >>> z.astype(float) #doctest: +NORMALIZE_WHITESPACE
+ array([ 0., 1., 2.])
+ >>> np.int8(z)
+ array([0, 1, 2], dtype=int8)
+
+Note that, above, we use the *Python* float object as a dtype. NumPy knows
+that ``int`` refers to ``np.int_``, ``bool`` means ``np.bool_``,
+that ``float`` is ``np.float_`` and ``complex`` is ``np.complex_``.
+The other data-types do not have Python equivalents.
+
+To determine the type of an array, look at the dtype attribute::
+
+ >>> z.dtype
+ dtype('uint8')
+
+dtype objects also contain information about the type, such as its bit-width
+and its byte-order. The data type can also be used indirectly to query
+properties of the type, such as whether it is an integer::
+
+ >>> d = np.dtype(int)
+ >>> d  # may vary by platform
+ dtype('int64')
+
+ >>> np.issubdtype(d, np.integer)
+ True
+
+ >>> np.issubdtype(d, np.floating)
+ False
+
+
+Array Scalars
+=============
+
+NumPy generally returns elements of arrays as array scalars (a scalar
+with an associated dtype). Array scalars differ from Python scalars, but
+for the most part they can be used interchangeably (the primary
+exception was for very old versions of Python, where integer array
+scalars could not act as indices for lists and tuples). There are some
+exceptions, such as when code requires very specific attributes of a scalar
+or when it checks specifically whether a value is a Python scalar. Generally,
+problems are easily fixed by explicitly converting array scalars
+to Python scalars, using the corresponding Python type function
+(e.g., ``int``, ``float``, ``complex``, ``str``).
+
+The primary advantage of using array scalars is that
+they preserve the array type (Python may not have a matching scalar type
+available, e.g. ``int16``). Therefore, the use of array scalars ensures
+identical behaviour between arrays and scalars, irrespective of whether the
+value is inside an array or not. NumPy scalars also have many of the same
+methods arrays do.
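+
+For example, there is no builtin Python equivalent of ``int16``, but the
+array scalar can be converted explicitly when a true Python ``int`` is
+required::
+
+ >>> x = np.int16(7)
+ >>> isinstance(x, int)
+ False
+ >>> int(x), type(int(x))
+ (7, <class 'int'>)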
+
+Overflow Errors
+===============
+
+The fixed size of NumPy numeric types may cause overflow errors when a value
+requires more memory than available in the data type. For example,
+`numpy.power` evaluates ``100 ** 8`` correctly for 64-bit integers,
+but gives 1874919424 (incorrect) for a 32-bit integer. ::
+
+ >>> np.power(100, 8, dtype=np.int64)
+ 10000000000000000
+ >>> np.power(100, 8, dtype=np.int32)
+ 1874919424
+
+The behaviour of NumPy and Python integer types differs significantly for
+integer overflows and may confuse users expecting NumPy integers to behave
+similarly to Python's ``int``. Unlike NumPy, the size of Python's ``int`` is
+flexible. This means Python integers may expand to accommodate any integer and
+will not overflow.
+
+NumPy provides `numpy.iinfo` and `numpy.finfo` to verify the
+minimum or maximum values of NumPy integer and floating point values
+respectively ::
+
+ >>> np.iinfo(int) # Bounds of the default integer on this system.
+ iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
+ >>> np.iinfo(np.int32) # Bounds of a 32-bit integer
+ iinfo(min=-2147483648, max=2147483647, dtype=int32)
+ >>> np.iinfo(np.int64) # Bounds of a 64-bit integer
+ iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
+
+If 64-bit integers are still too small the result may be cast to a
+floating point number. Floating point numbers offer a larger, but inexact,
+range of possible values. ::
+
+ >>> np.power(100, 100, dtype=np.int64) # Incorrect even with 64-bit int
+ 0
+ >>> np.power(100, 100, dtype=np.float64)
+ 1e+200
+
+Extended Precision
+==================
+
+Python's floating-point numbers are usually 64-bit floating-point numbers,
+nearly equivalent to ``np.float64``. In some unusual situations it may be
+useful to use floating-point numbers with more precision. Whether this
+is possible in numpy depends on the hardware and on the development
+environment: specifically, x86 machines provide hardware floating-point
+with 80-bit precision, and while most C compilers provide this as their
+``long double`` type, MSVC (standard for Windows builds) makes
+``long double`` identical to ``double`` (64 bits). NumPy makes the
+compiler's ``long double`` available as ``np.longdouble`` (and
+``np.clongdouble`` for the complex numbers). You can find out what your
+numpy provides with ``np.finfo(np.longdouble)``.
+
+NumPy does not provide a dtype with more precision than C's
+``long double``; in particular, the 128-bit IEEE quad precision
+data type (FORTRAN's ``REAL*16``) is not available.
+
+For efficient memory alignment, ``np.longdouble`` is usually stored
+padded with zero bits, either to 96 or 128 bits. Which is more efficient
+depends on hardware and development environment; typically on 32-bit
+systems they are padded to 96 bits, while on 64-bit systems they are
+padded to 128 bits. ``np.longdouble`` is padded to the system
+default; ``np.float96`` and ``np.float128`` are provided for users who
+want specific padding. In spite of the names, ``np.float96`` and
+``np.float128`` provide only as much precision as ``np.longdouble``,
+that is, 80 bits on most x86 machines and 64 bits in standard
+Windows builds.
+
+Be warned that even if ``np.longdouble`` offers more precision than
+Python's ``float``, it is easy to lose that extra precision, since
+Python often forces values to pass through ``float``. For example,
+the ``%`` formatting operator requires its arguments to be converted
+to standard Python types, and it is therefore impossible to preserve
+extended precision even if many decimal places are requested. It can
+be useful to test your code with the value
+``1 + np.finfo(np.longdouble).eps``.
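+
+A short sketch of such a test; the second comparison assumes a platform
+where ``np.longdouble`` is wider than ``float`` (e.g. 80-bit x86), and
+behaves differently on standard Windows builds::
+
+ >>> eps = np.finfo(np.longdouble).eps
+ >>> 1 + eps > 1  # the extra precision is visible in longdouble
+ True
+ >>> 1 + float(eps) > 1  # doctest: +SKIP
+ False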
+
+
diff --git a/doc/source/user/misc.rst b/doc/source/user/misc.rst
index c10aea486..031ce4efa 100644
--- a/doc/source/user/misc.rst
+++ b/doc/source/user/misc.rst
@@ -2,4 +2,224 @@
Miscellaneous
*************
-.. automodule:: numpy.doc.misc
+IEEE 754 Floating Point Special Values
+--------------------------------------
+
+Special values defined in numpy: ``nan``, ``inf``.
+
+NaNs can be used as a poor-man's mask (if you don't care what the
+original value was).
+
+Note: you cannot use equality to test for NaNs. E.g.: ::
+
+ >>> myarr = np.array([1., 0., np.nan, 3.])
+ >>> np.nonzero(myarr == np.nan)
+ (array([], dtype=int64),)
+ >>> np.nan == np.nan # is always False! Use special numpy functions instead.
+ False
+ >>> myarr[myarr == np.nan] = 0. # doesn't work
+ >>> myarr
+ array([ 1., 0., nan, 3.])
+ >>> myarr[np.isnan(myarr)] = 0. # use this instead
+ >>> myarr
+ array([ 1., 0., 0., 3.])
+
+Other related special value functions: ::
+
+ isinf(): True if value is inf
+ isfinite(): True if not nan or inf
+ nan_to_num(): Map nan to 0, inf to max float, -inf to min float
+
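+For instance (the exact print formatting of the large values depends on
+the NumPy version)::
+
+ >>> np.isinf(np.array([1., np.inf, -np.inf, np.nan]))
+ array([False,  True,  True, False])
+ >>> np.nan_to_num(np.array([np.nan, np.inf, -np.inf]))  # doctest: +SKIP
+ array([ 0.00000000e+000,  1.79769313e+308, -1.79769313e+308])
+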
+The following correspond to the usual functions, except that nans are
+excluded from the results: ::
+
+ nansum()
+ nanmax()
+ nanmin()
+ nanargmax()
+ nanargmin()
+
+ >>> x = np.arange(10.)
+ >>> x[3] = np.nan
+ >>> x.sum()
+ nan
+ >>> np.nansum(x)
+ 42.0
+
+How numpy handles numerical exceptions
+--------------------------------------
+
+The default is ``'warn'`` for ``invalid``, ``divide``, and ``overflow``
+and ``'ignore'`` for ``underflow``. These defaults can be changed, and
+can be set individually for different kinds of exceptions. The different
+behaviors are:
+
+ - 'ignore' : Take no action when the exception occurs.
+ - 'warn' : Print a `RuntimeWarning` (via the Python `warnings` module).
+ - 'raise' : Raise a `FloatingPointError`.
+ - 'call' : Call a function specified using the `seterrcall` function.
+ - 'print' : Print a warning directly to ``stdout``.
+ - 'log' : Record error in a Log object specified by `seterrcall`.
+
+These behaviors can be set for all kinds of errors or specific ones:
+
+ - all : apply to all numeric exceptions
+ - invalid : when NaNs are generated
+ - divide : divide by zero (for integers as well!)
+ - overflow : floating point overflows
+ - underflow : floating point underflows
+
+Note that integer divide-by-zero is handled by the same machinery.
+These behaviors are set on a per-thread basis.
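+
+The current settings can be inspected with `numpy.geterr`; the values
+shown here assume a session in which the defaults have not been changed::
+
+ >>> np.geterr()  # doctest: +SKIP
+ {'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}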
+
+Examples
+--------
+
+::
+
+ >>> oldsettings = np.seterr(all='warn')
+ >>> np.zeros(5, dtype=np.float32)/0.
+ invalid value encountered in divide
+ array([nan, nan, nan, nan, nan], dtype=float32)
+ >>> j = np.seterr(under='ignore')
+ >>> np.array([1.e-100])**10
+ array([0.])
+ >>> j = np.seterr(invalid='raise')
+ >>> np.sqrt(np.array([-1.]))
+ FloatingPointError: invalid value encountered in sqrt
+ >>> def errorhandler(errstr, errflag):
+ ...     print("saw stupid error!")
+ >>> np.seterrcall(errorhandler)
+ <function errorhandler at 0x...>
+ >>> j = np.seterr(all='call')
+ >>> np.zeros(5, dtype=np.int32)/0
+ saw stupid error!
+ >>> j = np.seterr(**oldsettings)  # restore previous error-handling settings
+
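+For temporary, scoped changes, `numpy.errstate` can also be used as a
+context manager; a minimal sketch::
+
+ >>> with np.errstate(divide='ignore'):
+ ...     res = np.array([1.]) / 0.
+ >>> res
+ array([inf])
+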
+Interfacing to C
+----------------
+This is only a survey of the choices; little detail is given on how each
+works.
+
+1) Bare metal, wrap your own C-code manually.
+
+ - Plusses:
+
+ - Efficient
+ - No dependencies on other tools
+
+ - Minuses:
+
+ - Lots of learning overhead:
+
+ - need to learn basics of Python C API
+ - need to learn basics of numpy C API
+ - need to learn how to handle reference counting and love it.
+
+ - Reference counting often difficult to get right.
+
+ - getting it wrong leads to memory leaks, and worse, segfaults
+
+ - the Python C API can change between Python versions
+
+2) Cython
+
+ - Plusses:
+
+ - avoids learning the C APIs
+ - no dealing with reference counting
+ - can code in pseudo-Python and generate C code
+ - can also interface to existing C code
+ - should shield you from changes to the Python C API
+ - has become the de-facto standard within the scientific Python community
+ - fast indexing support for arrays
+
+ - Minuses:
+
+ - Can write code in non-standard form which may become obsolete
+ - Not as flexible as manual wrapping
+
+3) ctypes
+
+ - Plusses:
+
+ - part of Python standard library
+ - good for interfacing to existing sharable libraries, particularly
+ Windows DLLs
+ - avoids API/reference counting issues
+ - good numpy support: arrays have all these in their ctypes
+ attribute (see the usage sketch after this list): ::
+
+ a.ctypes.data a.ctypes.get_strides
+ a.ctypes.data_as a.ctypes.shape
+ a.ctypes.get_as_parameter a.ctypes.shape_as
+ a.ctypes.get_data a.ctypes.strides
+ a.ctypes.get_shape a.ctypes.strides_as
+
+ - Minuses:
+
+ - can't use for writing code to be turned into C extensions, only a wrapper
+ tool.
+
+4) SWIG (automatic wrapper generator)
+
+ - Plusses:
+
+ - around a long time
+ - multiple scripting language support
+ - C++ support
+ - Good for wrapping large (many functions) existing C libraries
+
+ - Minuses:
+
+ - generates lots of code between Python and the C code
+ - can cause performance problems that are nearly impossible to optimize
+ out
+ - interface files can be hard to write
+ - doesn't necessarily avoid reference counting issues or needing to know
+ APIs
+
+5) scipy.weave
+
+ - Plusses:
+
+ - can turn many numpy expressions into C code
+ - dynamic compiling and loading of generated C code
+ - can embed pure C code in Python module and have weave extract, generate
+ interfaces and compile, etc.
+
+ - Minuses:
+
+ - Future very uncertain: it's the only part of Scipy not ported to Python 3
+ and is effectively deprecated in favor of Cython.
+
+6) Psyco
+
+ - Plusses:
+
+ - Turns pure Python into efficient machine code through JIT-like
+ optimizations
+ - very fast when it optimizes well
+
+ - Minuses:
+
+ - Only on Intel (Windows?)
+ - Doesn't do much for numpy?
+
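+As referenced in the ctypes item above, here is a minimal sketch of calling
+a C function on a numpy array's data. The shared library ``libexample.so``
+and its function ``void scale(double *data, int n, double factor)`` are
+hypothetical stand-ins for code you would have compiled yourself::
+
+ import ctypes
+ import numpy as np
+
+ # Load the (hypothetical) compiled shared library.
+ lib = ctypes.CDLL("./libexample.so")
+ lib.scale.argtypes = [ctypes.POINTER(ctypes.c_double),
+                       ctypes.c_int, ctypes.c_double]
+ lib.scale.restype = None
+
+ a = np.arange(5, dtype=np.float64)
+ # Pass a pointer to the array's buffer; the C code scales it in place.
+ lib.scale(a.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
+           len(a), 2.0)
+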
+Interfacing to Fortran
+----------------------
+The clear choice to wrap Fortran code is
+`f2py <https://docs.scipy.org/doc/numpy/f2py/>`_.
+
+Pyfort is an older alternative, but is no longer supported.
+Fwrap is a newer project that looked promising but is no longer being
+developed.
+
+Interfacing to C++
+------------------
+ 1) Cython
+ 2) CXX
+ 3) Boost.python
+ 4) SWIG
+ 5) SIP (used mainly in PyQt)
+
+
diff --git a/doc/source/user/whatisnumpy.rst b/doc/source/user/whatisnumpy.rst
index 8478a77c4..154f91c84 100644
--- a/doc/source/user/whatisnumpy.rst
+++ b/doc/source/user/whatisnumpy.rst
@@ -125,7 +125,7 @@ same shape, or a scalar and an array, or even two arrays of with
different shapes, provided that the smaller array is "expandable" to
the shape of the larger in such a way that the resulting broadcast is
unambiguous. For detailed "rules" of broadcasting see
-`numpy.doc.broadcasting`.
+`basics.broadcasting`.
Who Else Uses NumPy?
--------------------