author Matti Picus <matti.picus@gmail.com> 2020-09-28 23:15:03 +0300
committer GitHub <noreply@github.com> 2020-09-28 23:15:03 +0300
commit 6525ed58f1757092d73f33ee8717be2b5988c3e8 (patch)
tree a576d459b8074d28f04212de0e4dd619a5bd8bea /doc/source
parent 34d7d395d79f880dc9e156ff83af0cf5844867bf (diff)
parent 33b25bc3da8c9f5de3890b016dde2f24f98903cb (diff)
download numpy-6525ed58f1757092d73f33ee8717be2b5988c3e8.tar.gz
Merge pull request #16996 from bjnath/revised-glossary
DOC: Revise glossary page
Diffstat (limited to 'doc/source')
-rw-r--r-- doc/source/glossary.rst 513
1 files changed, 201 insertions, 312 deletions
diff --git a/doc/source/glossary.rst b/doc/source/glossary.rst
index 4a59c990b..17071c8f1 100644
--- a/doc/source/glossary.rst
+++ b/doc/source/glossary.rst
@@ -6,20 +6,27 @@ Glossary
(`n`,)
- A tuple with one element. The trailing comma distinguishes a one-element
- tuple from a parenthesized ``n``.
+ A parenthesized number followed by a comma denotes a tuple with one
+ element. The trailing comma distinguishes a one-element tuple from a
+ parenthesized ``n``.
-1
- Used as a dimension entry, ``-1`` instructs NumPy to choose the length
- that will keep the total number of elements the same.
+ - **In a dimension entry**, instructs NumPy to choose the length
+ that will keep the total number of array elements the same.
+ >>> np.arange(12).reshape(4, -1).shape
+ (4, 3)
- ``...``
- An :py:data:`Ellipsis`
+ - **In an index**, any negative value
+ `denotes <https://docs.python.org/dev/faq/programming.html#what-s-a-negative-index>`_
+ indexing from the right.
- **When indexing an array**, shorthand that the missing axes, if they
- exist, are full slices.
+ . . .
+ An :py:data:`Ellipsis`.
+
+ - **When indexing an array**, shorthand that the missing axes, if they
+ exist, are full slices.
>>> a = np.arange(24).reshape(2,3,4)
@@ -35,13 +42,13 @@ Glossary
>>> a[0,...,0].shape
(3,)
- It can be used at most once; ``a[...,0,...]`` raises an :exc:`IndexError`.
+ It can be used at most once; ``a[...,0,...]`` raises an :exc:`IndexError`.
- **In printouts**, NumPy substitutes ``...`` for the middle elements of
- large arrays. To see the entire array, use `numpy.printoptions`
+ - **In printouts**, NumPy substitutes ``...`` for the middle elements of
+ large arrays. To see the entire array, use `numpy.printoptions`
- ``:``
+ :
The Python :term:`python:slice`
operator. In ndarrays, slicing can be applied to every
axis:
@@ -73,14 +80,14 @@ Glossary
For details, see :ref:`combining-advanced-and-basic-indexing`.
- ``<``
+ <
In a dtype declaration, indicates that the data is
:term:`little-endian` (the bracket is big on the right). ::
>>> dt = np.dtype('<f') # little-endian single-precision float
- ``>``
+ >
In a dtype declaration, indicates that the data is
:term:`big-endian` (the bracket is big on the left). ::
@@ -95,54 +102,67 @@ Glossary
along an axis
- Axes are defined for arrays with more than one dimension. A
- 2-dimensional array has two corresponding axes: the first running
- vertically downwards across rows (axis 0), and the second running
- horizontally across columns (axis 1).
+ An operation `along axis n` of array ``a`` behaves as if its argument
+ were an array of slices of ``a`` where each slice has a successive
+ index of axis `n`.
- Many operations can take place along one of these axes. For example,
- we can sum each row of an array, in which case we operate along
- columns, or axis 1::
+ For example, if ``a`` is a 3 x `N` array, an operation along axis 0
+ behaves as if its argument were an array containing slices of each row:
- >>> x = np.arange(12).reshape((3,4))
+ >>> np.array((a[0,:], a[1,:], a[2,:])) #doctest: +SKIP
- >>> x
- array([[ 0, 1, 2, 3],
- [ 4, 5, 6, 7],
- [ 8, 9, 10, 11]])
+ To make it concrete, we can pick the operation to be the array-reversal
+ function :func:`numpy.flip`, which accepts an ``axis`` argument. We
+ construct a 3 x 4 array ``a``:
- >>> x.sum(axis=1)
- array([ 6, 22, 38])
+ >>> a = np.arange(12).reshape(3,4)
+ >>> a
+ array([[ 0, 1, 2, 3],
+ [ 4, 5, 6, 7],
+ [ 8, 9, 10, 11]])
+ Reversing along axis 0 (the row axis) yields
- array
- A homogeneous container of numerical elements. Each element in the
- array occupies a fixed amount of memory (hence homogeneous), and
- can be a numerical element of a single type (such as float, int
- or complex) or a combination (such as ``(float, int, float)``). Each
- array has an associated data-type (or ``dtype``), which describes
- the numerical type of its elements::
+ >>> np.flip(a,axis=0)
+ array([[ 8, 9, 10, 11],
+ [ 4, 5, 6, 7],
+ [ 0, 1, 2, 3]])
- >>> x = np.array([1, 2, 3], float)
+ Recalling the definition of `along an axis`, ``flip`` along axis 0 is
+ treating its argument as if it were
- >>> x
- array([ 1., 2., 3.])
+ >>> np.array((a[0,:], a[1,:], a[2,:]))
+ array([[ 0, 1, 2, 3],
+ [ 4, 5, 6, 7],
+ [ 8, 9, 10, 11]])
- >>> x.dtype # floating point number, 64 bits of memory per element
- dtype('float64')
+ and the result of ``np.flip(a,axis=0)`` is to reverse the slices:
+ >>> np.array((a[2,:],a[1,:],a[0,:]))
+ array([[ 8, 9, 10, 11],
+ [ 4, 5, 6, 7],
+ [ 0, 1, 2, 3]])
- # More complicated data type: each array element is a combination of
- # and integer and a floating point number
- >>> np.array([(1, 2.0), (3, 4.0)], dtype=[('x', np.int64), ('y', float)])
- array([(1, 2.), (3, 4.)], dtype=[('x', '<i8'), ('y', '<f8')])
- Fast element-wise operations, called a :term:`ufunc`, operate on arrays.
+ array
+ Used synonymously in the NumPy docs with :term:`ndarray`.
array_like
- Any sequence that can be interpreted as an ndarray. This includes
- nested lists, tuples, scalars and existing arrays.
+ Any :doc:`scalar <reference/arrays.scalars>` or
+ :term:`python:sequence`
+ that can be interpreted as an ndarray. In addition to ndarrays
+ and scalars this category includes lists (possibly nested and with
+ different element types) and tuples. Any argument accepted by
+ :doc:`numpy.array <reference/generated/numpy.array>`
+ is array_like. ::
+
+ >>> a = np.array([[1, 2.0], [0, 0], (1+1j, 3.)])
+
+ >>> a
+ array([[1.+0.j, 2.+0.j],
+ [0.+0.j, 0.+0.j],
+ [1.+1.j, 3.+0.j]])
array scalar
@@ -152,7 +172,6 @@ Glossary
axis
-
Another term for an array dimension. Axes are numbered left to right;
axis 0 is the first element in the shape tuple.
@@ -167,7 +186,6 @@ Glossary
>>> a
array([[[ 0, 1, 2],
[ 3, 4, 5]],
- <BLANKLINE>
[[ 6, 7, 8],
[ 9, 10, 11]]])
@@ -206,19 +224,16 @@ Glossary
.base
If an array does not own its memory, then its
- :doc:`base <reference/generated/numpy.ndarray.base>` attribute
- returns the object whose memory the array is referencing. That object
- may be borrowing the memory from still another object, so the
- owning object may be ``a.base.base.base...``. Despite advice to the
- contrary, testing ``base`` is not a surefire way to determine if two
- arrays are :term:`view`\ s.
+ :doc:`base <reference/generated/numpy.ndarray.base>` attribute returns
+ the object whose memory the array is referencing. That object may be
+ referencing the memory from still another object, so the owning object
+ may be ``a.base.base.base...``. Some writers erroneously claim that
+ testing ``base`` determines if arrays are :term:`view`\ s. For the
+ correct way, see :func:`numpy.shares_memory`.
big-endian
- When storing a multi-byte value in memory as a sequence of bytes, the
- sequence addresses/sends/stores the most significant byte first (lowest
- address) and the least significant byte last (highest address). Common in
- micro-processors and used for transmission of data over network protocols.
+ See `Endianness <https://en.wikipedia.org/wiki/Endianness>`_.
BLAS
@@ -226,244 +241,145 @@ Glossary
broadcast
- NumPy can do operations on arrays whose shapes are mismatched::
+ *Broadcasting* is NumPy's ability to process ndarrays of
+ different sizes as if all were the same size.
- >>> x = np.array([1, 2])
- >>> y = np.array([[3], [4]])
+ It permits an elegant do-what-I-mean behavior where, for instance,
+ adding a scalar to a vector adds the scalar value to every element.
- >>> x
- array([1, 2])
+ >>> a = np.arange(3)
+ >>> a
+ array([0, 1, 2])
- >>> y
- array([[3],
- [4]])
+ >>> a + [3, 3, 3]
+ array([3, 4, 5])
+
+ >>> a + 3
+ array([3, 4, 5])
- >>> x + y
- array([[4, 5],
- [5, 6]])
+ Ordinarily, vector operands must all be the same size, because NumPy
+ works element by element -- for instance, ``c = a * b`` is ::
- See `basics.broadcasting` for more information.
+ c[0,0,0] = a[0,0,0] * b[0,0,0]
+ c[0,0,1] = a[0,0,1] * b[0,0,1]
+ ...
+
+ But in certain useful cases, NumPy can duplicate data along "missing"
+ axes or "too-short" dimensions so shapes will match. The duplication
+ costs no memory or time. For details, see
+ :doc:`Broadcasting. <user/basics.broadcasting>`
C order
- See `row-major`
+ Same as :term:`row-major`.
column-major
- A way to represent items in a N-dimensional array in the 1-dimensional
- computer memory. In column-major order, the leftmost index "varies the
- fastest": for example the array::
-
- [[1, 2, 3],
- [4, 5, 6]]
+ See `Row- and column-major order <https://en.wikipedia.org/wiki/Row-_and_column-major_order>`_.
- is represented in the column-major order as::
- [1, 4, 2, 5, 3, 6]
+ contiguous
+ An array is contiguous if
+ * it occupies an unbroken block of memory, and
+ * array elements with higher indexes occupy higher addresses (that
+ is, no :term:`stride` is negative).
- Column-major order is also known as the Fortran order, as the Fortran
- programming language uses it.
copy
-
See :term:`view`.
- decorator
- An operator that transforms a function. For example, a ``log``
- decorator may be defined to print debugging information upon
- function execution::
-
- >>> def log(f):
- ... def new_logging_func(*args, **kwargs):
- ... print("Logging call with parameters:", args, kwargs)
- ... return f(*args, **kwargs)
- ...
- ... return new_logging_func
-
- Now, when we define a function, we can "decorate" it using ``log``::
-
- >>> @log
- ... def add(a, b):
- ... return a + b
-
- Calling ``add`` then yields:
-
- >>> add(1, 2)
- Logging call with parameters: (1, 2) {}
- 3
-
-
- dictionary
- Resembling a language dictionary, which provides a mapping between
- words and descriptions thereof, a Python dictionary is a mapping
- between two objects::
-
- >>> x = {1: 'one', 'two': [1, 2]}
-
- Here, `x` is a dictionary mapping keys to values, in this case
- the integer 1 to the string "one", and the string "two" to
- the list ``[1, 2]``. The values may be accessed using their
- corresponding keys::
-
- >>> x[1]
- 'one'
-
- >>> x['two']
- [1, 2]
-
- Note that dictionaries are not stored in any specific order. Also,
- most mutable (see *immutable* below) objects, such as lists, may not
- be used as keys.
-
- For more information on dictionaries, read the
- `Python tutorial <https://docs.python.org/tutorial/>`_.
-
-
dimension
-
See :term:`axis`.
dtype
-
The datatype describing the (identically typed) elements in an ndarray.
It can be changed to reinterpret the array contents. For details, see
:doc:`Data type objects (dtype). <reference/arrays.dtypes>`
fancy indexing
-
Another term for :term:`advanced indexing`.
field
- In a :term:`structured data type`, each sub-type is called a `field`.
+ In a :term:`structured data type`, each subtype is called a `field`.
The `field` has a name (a string), a type (any valid dtype), and
- an optional `title`. See :ref:`arrays.dtypes`
+ an optional `title`. See :ref:`arrays.dtypes`.
Fortran order
- See `column-major`
+ Same as :term:`column-major`.
flattened
- Collapsed to a one-dimensional array. See `numpy.ndarray.flatten`
- for details.
+ See :term:`ravel`.
homogeneous
- Describes a block of memory comprised of blocks, each block comprised of
- items and of the same size, and blocks are interpreted in exactly the
- same way. In the simplest case each block contains a single item, for
- instance int32 or float64.
-
+ All elements of a homogeneous array have the same type. ndarrays, in
+ contrast to Python lists, are homogeneous. The type can be complicated,
+ as in a :term:`structured array`, but all elements have that type.
- immutable
- An object that cannot be modified after execution is called
- immutable. Two common examples are strings and tuples.
+ NumPy `object arrays <#term-object-array>`_, which contain references to
+ Python objects, fill the role of heterogeneous arrays.
itemsize
The size of the dtype element in bytes.
- list
- A Python container that can hold any number of objects or items.
- The items do not have to be of the same type, and can even be
- lists themselves::
-
- >>> x = [2, 2.0, "two", [2, 2.0]]
-
- The list `x` contains 4 items, each which can be accessed individually::
-
- >>> x[2] # the string 'two'
- 'two'
-
- >>> x[3] # a list, containing an integer 2 and a float 2.0
- [2, 2.0]
-
- It is also possible to select more than one item at a time,
- using *slicing*::
-
- >>> x[0:2] # or, equivalently, x[:2]
- [2, 2.0]
-
- In code, arrays are often conveniently expressed as nested lists::
-
-
- >>> np.array([[1, 2], [3, 4]])
- array([[1, 2],
- [3, 4]])
-
- For more information, read the section on lists in the `Python
- tutorial <https://docs.python.org/tutorial/>`_. For a mapping
- type (key-value), see *dictionary*.
-
-
little-endian
- When storing a multi-byte value in memory as a sequence of bytes, the
- sequence addresses/sends/stores the least significant byte first (lowest
- address) and the most significant byte last (highest address). Common in
- x86 processors.
+ See `Endianness <https://en.wikipedia.org/wiki/Endianness>`_.
mask
- A boolean array, used to select only certain elements for an operation::
+ A boolean array used to select only certain elements for an operation:
- >>> x = np.arange(5)
- >>> x
- array([0, 1, 2, 3, 4])
+ >>> x = np.arange(5)
+ >>> x
+ array([0, 1, 2, 3, 4])
- >>> mask = (x > 2)
- >>> mask
- array([False, False, False, True, True])
+ >>> mask = (x > 2)
+ >>> mask
+ array([False, False, False, True, True])
- >>> x[mask] = -1
- >>> x
- array([ 0, 1, 2, -1, -1])
+ >>> x[mask] = -1
+ >>> x
+ array([ 0, 1, 2, -1, -1])
masked array
- Array that suppressed values indicated by a mask::
+ Bad or missing data can be cleanly ignored by putting it in a masked
+ array, which has an internal boolean array indicating invalid
+ entries. Operations with masked arrays ignore these entries. ::
- >>> x = np.ma.masked_array([np.nan, 2, np.nan], [True, False, True])
- >>> x
+ >>> a = np.ma.masked_array([np.nan, 2, np.nan], [True, False, True])
+ >>> a
masked_array(data=[--, 2.0, --],
mask=[ True, False, True],
fill_value=1e+20)
- >>> x + [1, 2, 3]
+ >>> a + [1, 2, 3]
masked_array(data=[--, 4.0, --],
mask=[ True, False, True],
fill_value=1e+20)
-
- Masked arrays are often used when operating on arrays containing
- missing or invalid entries.
+ For details, see :doc:`Masked arrays. <reference/maskedarray>`
matrix
- A 2-dimensional ndarray that preserves its two-dimensional nature
- throughout operations. It has certain special operations, such as ``*``
- (matrix multiplication) and ``**`` (matrix power), defined::
-
- >>> x = np.mat([[1, 2], [3, 4]])
- >>> x
- matrix([[1, 2],
- [3, 4]])
-
- >>> x**2
- matrix([[ 7, 10],
- [15, 22]])
+ NumPy's two-dimensional
+ :doc:`matrix class <reference/generated/numpy.matrix>`
+ should no longer be used; use regular ndarrays.
ndarray
- See *array*.
+ :doc:`NumPy's basic structure <reference/arrays>`.
object array
-
An array whose dtype is ``object``; that is, it contains references to
Python objects. Indexing the array dereferences the Python objects, so
unlike other ndarrays, an object array has the ability to hold
@@ -471,72 +387,43 @@ Glossary
ravel
-
- `numpy.ravel` and `numpy.ndarray.flatten` both flatten an ndarray. ``ravel``
- will return a view if possible; ``flatten`` always returns a copy.
-
- Flattening collapses a multi-dimensional array to a single dimension;
+ :doc:`numpy.ravel \
+ <reference/generated/numpy.ravel>`
+ and :doc:`numpy.ndarray.flatten \
+ <reference/generated/numpy.ndarray.flatten>`
+ both flatten an ndarray. ``ravel`` will return a view if possible;
+ ``flatten`` always returns a copy.
+
+ Flattening collapses a multidimensional array to a single dimension;
details of how this is done (for instance, whether ``a[n+1]`` should be
the next row or next column) are parameters.
record array
- An :term:`ndarray` with :term:`structured data type` which has been
- subclassed as ``np.recarray`` and whose dtype is of type ``np.record``,
- making the fields of its data type to be accessible by attribute.
-
-
- reference
- If ``a`` is a reference to ``b``, then ``(a is b) == True``. Therefore,
- ``a`` and ``b`` are different names for the same Python object.
+ A :term:`structured array` that allows access in an attribute style
+ (``a.field``) in addition to ``a['field']``. For details, see
+ :doc:`numpy.recarray. <reference/generated/numpy.recarray>`
row-major
- A way to represent items in a N-dimensional array in the 1-dimensional
- computer memory. In row-major order, the rightmost index "varies
- the fastest": for example the array::
-
- [[1, 2, 3],
- [4, 5, 6]]
-
- is represented in the row-major order as::
-
- [1, 2, 3, 4, 5, 6]
-
- Row-major order is also known as the C order, as the C programming
- language uses it. New NumPy arrays are by default in row-major order.
+ See `Row- and column-major order <https://en.wikipedia.org/wiki/Row-_and_column-major_order>`_.
+ NumPy creates arrays in row-major order by default.
- slice
- Used to select only certain elements from a sequence:
+ scalar
+ In NumPy, usually a synonym for :term:`array scalar`.
- >>> x = range(5)
- >>> x
- [0, 1, 2, 3, 4]
- >>> x[1:3] # slice from 1 to 3 (excluding 3 itself)
- [1, 2]
-
- >>> x[1:5:2] # slice from 1 to 5, but skipping every second element
- [1, 3]
-
- >>> x[::-1] # slice a sequence in reverse
- [4, 3, 2, 1, 0]
-
- Arrays may have more than one dimension, each which can be sliced
- individually:
-
- >>> x = np.array([[1, 2], [3, 4]])
- >>> x
- array([[1, 2],
- [3, 4]])
-
- >>> x[:, 1]
- array([2, 4])
+ shape
+ A tuple showing the length of each dimension of an ndarray. The
+ length of the tuple itself is the number of dimensions
+ (:doc:`numpy.ndarray.ndim <reference/generated/numpy.ndarray.ndim>`).
+ The product of the tuple elements is the number of elements in the
+ array. For details, see
+ :doc:`numpy.ndarray.shape <reference/generated/numpy.ndarray.shape>`.
stride
-
Physical memory is one-dimensional; strides provide a mechanism to map
a given index to an address in memory. For an N-dimensional array, its
``strides`` attribute is an N-element tuple; advancing from index
@@ -555,56 +442,70 @@ Glossary
<https://arxiv.org/pdf/1102.1523.pdf>`_
- structure
- See :term:`structured data type`
-
-
structured array
-
Array whose :term:`dtype` is a :term:`structured data type`.
structured data type
- A data type composed of other datatypes
+ Users can create arbitrarily complex :term:`dtypes <dtype>`
+ that can include other arrays and dtypes. These composite dtypes are called
+ :doc:`structured data types. <user/basics.rec>`
- subarray data type
- A :term:`structured data type` may contain a :term:`ndarray` with its
- own dtype and shape:
+ subarray
+ An array nested in a :term:`structured data type`, as ``b`` is here:
+
+ >>> dt = np.dtype([('a', np.int32), ('b', np.float32, (3,))])
+ >>> np.zeros(3, dtype=dt)
+ array([(0, [0., 0., 0.]), (0, [0., 0., 0.]), (0, [0., 0., 0.])],
+ dtype=[('a', '<i4'), ('b', '<f4', (3,))])
- >>> dt = np.dtype([('a', np.int32), ('b', np.float32, (3,))])
- >>> np.zeros(3, dtype=dt)
- array([(0, [0., 0., 0.]), (0, [0., 0., 0.]), (0, [0., 0., 0.])],
- dtype=[('a', '<i4'), ('b', '<f4', (3,))])
+
+ subarray data type
+ An element of a structured datatype that behaves like an ndarray.
title
- In addition to field names, structured array fields may have an
- associated :ref:`title <titles>` which is an alias to the name and is
- commonly used for plotting.
+ An alias for a field name in a structured datatype.
+
+
+ type
+ In NumPy, usually a synonym for :term:`dtype`. For the more general
+ Python meaning, :term:`see here. <python:type>`
ufunc
- Universal function. A fast element-wise, :term:`vectorized
- <vectorization>` array operation. Examples include ``add``, ``sin`` and
- ``logical_or``.
+ NumPy's fast element-by-element computation (:term:`vectorization`)
+ gives a choice of which function gets applied. The general term for the
+ function is ``ufunc``, short for ``universal function``. NumPy routines
+ have built-in ufuncs, but users can also
+ :doc:`write their own. <reference/ufuncs>`
vectorization
- Optimizing a looping block by specialized code. In a traditional sense,
- vectorization performs the same operation on multiple elements with
- fixed strides between them via specialized hardware. Compilers know how
- to take advantage of well-constructed loops to implement such
- optimizations. NumPy uses :ref:`vectorization <whatis-vectorization>`
- to mean any optimization via specialized code performing the same
- operations on multiple elements, typically achieving speedups by
- avoiding some of the overhead in looking up and converting the elements.
-
+ NumPy hands off array processing to C, where looping and computation are
+ much faster than in Python. To exploit this, programmers using NumPy
+ eliminate Python loops in favor of array-to-array operations.
+ :term:`vectorization` can refer both to the C offloading and to
+ structuring NumPy code to leverage it.
view
- An array that does not own its data, but refers to another array's
- data instead. For example, we may create a view that only shows
- every second element of another array::
+ Without touching underlying data, NumPy can make one array appear
+ to change its datatype and shape.
+
+ An array created this way is a `view`, and NumPy often exploits the
+ performance gain of using a view versus making a new array.
+
+ A potential drawback is that writing to a view can alter the original
+ as well. If this is a problem, NumPy instead needs to create a
+ physically distinct array -- a `copy`.
+
+ Some NumPy routines always return views, some always return copies, some
+ may return one or the other, and for some the choice can be specified.
+ Responsibility for managing views and copies falls to the programmer.
+ :func:`numpy.shares_memory` will check whether ``b`` is a view of
+ ``a``, but an exact answer isn't always feasible, as the documentation
+ page explains.
>>> x = np.arange(5)
>>> x
@@ -618,15 +519,3 @@ Glossary
>>> y
array([3, 2, 4])
-
- wrapper
- Python is a high-level (highly abstracted, or English-like) language.
- This abstraction comes at a price in execution speed, and sometimes
- it becomes necessary to use lower level languages to do fast
- computations. A wrapper is code that provides a bridge between
- high and the low level languages, allowing, e.g., Python to execute
- code written in C or Fortran.
-
- Examples include ctypes, SWIG and Cython (which wraps C and C++)
- and f2py (which wraps Fortran).
-
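
The revised ``view`` entry in the diff above says that writing to a view can alter the original, and it points to ``numpy.shares_memory`` for checking whether two arrays overlap. The following is a minimal sketch of that behavior and is not part of the commit; the array names ``a``, ``v`` and ``c`` are illustrative assumptions.

```python
import numpy as np

a = np.arange(6)        # array([0, 1, 2, 3, 4, 5]); owns its memory

v = a[::2]              # basic slicing: a view onto a's memory
c = a[[0, 2, 4]]        # integer-array (advanced) indexing: a copy

v[0] = 100              # write through the view; a[0] changes too
c[1] = -1               # write to the copy; a[2] is unaffected

print(a)                        # a[0] is 100, a[2] is still 2
print(np.shares_memory(a, v))   # the view overlaps a's memory
print(np.shares_memory(a, c))   # the copy does not
print(v.base is a)              # here the view's base is a itself
```

Basic slicing yields a view and integer-array indexing yields a copy, which is why only the write through ``v`` shows up in ``a``. As the glossary entry notes, ``base`` is not a reliable test in general; ``numpy.shares_memory`` is the documented check.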