author | Matti Picus <matti.picus@gmail.com> | 2020-09-28 23:15:03 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-09-28 23:15:03 +0300 |
commit | 6525ed58f1757092d73f33ee8717be2b5988c3e8 | |
tree | a576d459b8074d28f04212de0e4dd619a5bd8bea /doc/source | |
parent | 34d7d395d79f880dc9e156ff83af0cf5844867bf | |
parent | 33b25bc3da8c9f5de3890b016dde2f24f98903cb | |
Merge pull request #16996 from bjnath/revised-glossary
DOC: Revise glossary page
Diffstat (limited to 'doc/source')
-rw-r--r-- | doc/source/glossary.rst | 513 |
1 files changed, 201 insertions, 312 deletions
diff --git a/doc/source/glossary.rst b/doc/source/glossary.rst index 4a59c990b..17071c8f1 100644 --- a/doc/source/glossary.rst +++ b/doc/source/glossary.rst @@ -6,20 +6,27 @@ Glossary (`n`,) - A tuple with one element. The trailing comma distinguishes a one-element - tuple from a parenthesized ``n``. + A parenthesized number followed by a comma denotes a tuple with one + element. The trailing comma distinguishes a one-element tuple from a + parenthesized ``n``. -1 - Used as a dimension entry, ``-1`` instructs NumPy to choose the length - that will keep the total number of elements the same. + - **In a dimension entry**, instructs NumPy to choose the length + that will keep the total number of array elements the same. + >>> np.arange(12).reshape(4, -1).shape + (4, 3) - ``...`` - An :py:data:`Ellipsis` + - **In an index**, any negative value + `denotes <https://docs.python.org/dev/faq/programming.html#what-s-a-negative-index>`_ + indexing from the right. - **When indexing an array**, shorthand that the missing axes, if they - exist, are full slices. + . . . + An :py:data:`Ellipsis`. + + - **When indexing an array**, shorthand that the missing axes, if they + exist, are full slices. >>> a = np.arange(24).reshape(2,3,4) @@ -35,13 +42,13 @@ Glossary >>> a[0,...,0].shape (3,) - It can be used at most once; ``a[...,0,...]`` raises an :exc:`IndexError`. + It can be used at most once; ``a[...,0,...]`` raises an :exc:`IndexError`. - **In printouts**, NumPy substitutes ``...`` for the middle elements of - large arrays. To see the entire array, use `numpy.printoptions` + - **In printouts**, NumPy substitutes ``...`` for the middle elements of + large arrays. To see the entire array, use `numpy.printoptions` - ``:`` + : The Python :term:`python:slice` operator. In ndarrays, slicing can be applied to every axis: @@ -73,14 +80,14 @@ Glossary For details, see :ref:`combining-advanced-and-basic-indexing`. 
- ``<`` + < In a dtype declaration, indicates that the data is :term:`little-endian` (the bracket is big on the right). :: >>> dt = np.dtype('<f') # little-endian single-precision float - ``>`` + > In a dtype declaration, indicates that the data is :term:`big-endian` (the bracket is big on the left). :: @@ -95,54 +102,67 @@ Glossary along an axis - Axes are defined for arrays with more than one dimension. A - 2-dimensional array has two corresponding axes: the first running - vertically downwards across rows (axis 0), and the second running - horizontally across columns (axis 1). + An operation `along axis n` of array ``a`` behaves as if its argument + were an array of slices of ``a`` where each slice has a successive + index of axis `n`. - Many operations can take place along one of these axes. For example, - we can sum each row of an array, in which case we operate along - columns, or axis 1:: + For example, if ``a`` is a 3 x `N` array, an operation along axis 0 + behaves as if its argument were an array containing slices of each row: - >>> x = np.arange(12).reshape((3,4)) + >>> np.array((a[0,:], a[1,:], a[2,:])) #doctest: +SKIP - >>> x - array([[ 0, 1, 2, 3], - [ 4, 5, 6, 7], - [ 8, 9, 10, 11]]) + To make it concrete, we can pick the operation to be the array-reversal + function :func:`numpy.flip`, which accepts an ``axis`` argument. We + construct a 3 x 4 array ``a``: - >>> x.sum(axis=1) - array([ 6, 22, 38]) + >>> a = np.arange(12).reshape(3,4) + >>> a + array([[ 0, 1, 2, 3], + [ 4, 5, 6, 7], + [ 8, 9, 10, 11]]) + Reversing along axis 0 (the row axis) yields - array - A homogeneous container of numerical elements. Each element in the - array occupies a fixed amount of memory (hence homogeneous), and - can be a numerical element of a single type (such as float, int - or complex) or a combination (such as ``(float, int, float)``). 
Each - array has an associated data-type (or ``dtype``), which describes - the numerical type of its elements:: + >>> np.flip(a,axis=0) + array([[ 8, 9, 10, 11], + [ 4, 5, 6, 7], + [ 0, 1, 2, 3]]) - >>> x = np.array([1, 2, 3], float) + Recalling the definition of `along an axis`, ``flip`` along axis 0 is + treating its argument as if it were - >>> x - array([ 1., 2., 3.]) + >>> np.array((a[0,:], a[1,:], a[2,:])) + array([[ 0, 1, 2, 3], + [ 4, 5, 6, 7], + [ 8, 9, 10, 11]]) - >>> x.dtype # floating point number, 64 bits of memory per element - dtype('float64') + and the result of ``np.flip(a,axis=0)`` is to reverse the slices: + >>> np.array((a[2,:],a[1,:],a[0,:])) + array([[ 8, 9, 10, 11], + [ 4, 5, 6, 7], + [ 0, 1, 2, 3]]) - # More complicated data type: each array element is a combination of - # and integer and a floating point number - >>> np.array([(1, 2.0), (3, 4.0)], dtype=[('x', np.int64), ('y', float)]) - array([(1, 2.), (3, 4.)], dtype=[('x', '<i8'), ('y', '<f8')]) - Fast element-wise operations, called a :term:`ufunc`, operate on arrays. + array + Used synonymously in the NumPy docs with :term:`ndarray`. array_like - Any sequence that can be interpreted as an ndarray. This includes - nested lists, tuples, scalars and existing arrays. + Any :doc:`scalar <reference/arrays.scalars>` or + :term:`python:sequence` + that can be interpreted as an ndarray. In addition to ndarrays + and scalars this category includes lists (possibly nested and with + different element types) and tuples. Any argument accepted by + :doc:`numpy.array <reference/generated/numpy.array>` + is array_like. :: + + >>> a = np.array([[1, 2.0], [0, 0], (1+1j, 3.)]) + + >>> a + array([[1.+0.j, 2.+0.j], + [0.+0.j, 0.+0.j], + [1.+1.j, 3.+0.j]]) array scalar @@ -152,7 +172,6 @@ Glossary axis - Another term for an array dimension. Axes are numbered left to right; axis 0 is the first element in the shape tuple. 
@@ -167,7 +186,6 @@ Glossary >>> a array([[[ 0, 1, 2], [ 3, 4, 5]], - <BLANKLINE> [[ 6, 7, 8], [ 9, 10, 11]]]) @@ -206,19 +224,16 @@ Glossary .base If an array does not own its memory, then its - :doc:`base <reference/generated/numpy.ndarray.base>` attribute - returns the object whose memory the array is referencing. That object - may be borrowing the memory from still another object, so the - owning object may be ``a.base.base.base...``. Despite advice to the - contrary, testing ``base`` is not a surefire way to determine if two - arrays are :term:`view`\ s. + :doc:`base <reference/generated/numpy.ndarray.base>` attribute returns + the object whose memory the array is referencing. That object may be + referencing the memory from still another object, so the owning object + may be ``a.base.base.base...``. Some writers erroneously claim that + testing ``base`` determines if arrays are :term:`view`\ s. For the + correct way, see :func:`numpy.shares_memory`. big-endian - When storing a multi-byte value in memory as a sequence of bytes, the - sequence addresses/sends/stores the most significant byte first (lowest - address) and the least significant byte last (highest address). Common in - micro-processors and used for transmission of data over network protocols. + See `Endianness <https://en.wikipedia.org/wiki/Endianness>`_. BLAS @@ -226,244 +241,145 @@ Glossary broadcast - NumPy can do operations on arrays whose shapes are mismatched:: + *broadcasting* is NumPy's ability to process ndarrays of + different sizes as if all were the same size. - >>> x = np.array([1, 2]) - >>> y = np.array([[3], [4]]) + It permits an elegant do-what-I-mean behavior where, for instance, + adding a scalar to a vector adds the scalar value to every element. 
- >>> x - array([1, 2]) + >>> a = np.arange(3) + >>> a + array([0, 1, 2]) - >>> y - array([[3], - [4]]) + >>> a + [3, 3, 3] + array([3, 4, 5]) + + >>> a + 3 + array([3, 4, 5]) - >>> x + y - array([[4, 5], - [5, 6]]) + Ordinarly, vector operands must all be the same size, because NumPy + works element by element -- for instance, ``c = a * b`` is :: - See `basics.broadcasting` for more information. + c[0,0,0] = a[0,0,0] * b[0,0,0] + c[0,0,1] = a[0,0,1] * b[0,0,1] + ... + + But in certain useful cases, NumPy can duplicate data along "missing" + axes or "too-short" dimensions so shapes will match. The duplication + costs no memory or time. For details, see + :doc:`Broadcasting. <user/basics.broadcasting>` C order - See `row-major` + Same as :term:`row-major`. column-major - A way to represent items in a N-dimensional array in the 1-dimensional - computer memory. In column-major order, the leftmost index "varies the - fastest": for example the array:: - - [[1, 2, 3], - [4, 5, 6]] + See `Row- and column-major order <https://en.wikipedia.org/wiki/Row-_and_column-major_order>`_. - is represented in the column-major order as:: - [1, 4, 2, 5, 3, 6] + contiguous + An array is contiguous if + * it occupies an unbroken block of memory, and + * array elements with higher indexes occupy higher addresses (that + is, no :term:`stride` is negative). - Column-major order is also known as the Fortran order, as the Fortran - programming language uses it. copy - See :term:`view`. - decorator - An operator that transforms a function. For example, a ``log`` - decorator may be defined to print debugging information upon - function execution:: - - >>> def log(f): - ... def new_logging_func(*args, **kwargs): - ... print("Logging call with parameters:", args, kwargs) - ... return f(*args, **kwargs) - ... - ... return new_logging_func - - Now, when we define a function, we can "decorate" it using ``log``:: - - >>> @log - ... def add(a, b): - ... 
return a + b - - Calling ``add`` then yields: - - >>> add(1, 2) - Logging call with parameters: (1, 2) {} - 3 - - - dictionary - Resembling a language dictionary, which provides a mapping between - words and descriptions thereof, a Python dictionary is a mapping - between two objects:: - - >>> x = {1: 'one', 'two': [1, 2]} - - Here, `x` is a dictionary mapping keys to values, in this case - the integer 1 to the string "one", and the string "two" to - the list ``[1, 2]``. The values may be accessed using their - corresponding keys:: - - >>> x[1] - 'one' - - >>> x['two'] - [1, 2] - - Note that dictionaries are not stored in any specific order. Also, - most mutable (see *immutable* below) objects, such as lists, may not - be used as keys. - - For more information on dictionaries, read the - `Python tutorial <https://docs.python.org/tutorial/>`_. - - dimension - See :term:`axis`. dtype - The datatype describing the (identically typed) elements in an ndarray. It can be changed to reinterpret the array contents. For details, see :doc:`Data type objects (dtype). <reference/arrays.dtypes>` fancy indexing - Another term for :term:`advanced indexing`. field - In a :term:`structured data type`, each sub-type is called a `field`. + In a :term:`structured data type`, each subtype is called a `field`. The `field` has a name (a string), a type (any valid dtype), and - an optional `title`. See :ref:`arrays.dtypes` + an optional `title`. See :ref:`arrays.dtypes`. Fortran order - See `column-major` + Same as :term:`column-major`. flattened - Collapsed to a one-dimensional array. See `numpy.ndarray.flatten` - for details. + See :term:`ravel`. homogeneous - Describes a block of memory comprised of blocks, each block comprised of - items and of the same size, and blocks are interpreted in exactly the - same way. In the simplest case each block contains a single item, for - instance int32 or float64. - + All elements of a homogeneous array have the same type. 
ndarrays, in + contrast to Python lists, are homogeneous. The type can be complicated, + as in a :term:`structured array`, but all elements have that type. - immutable - An object that cannot be modified after execution is called - immutable. Two common examples are strings and tuples. + NumPy `object arrays <#term-object-array>`_, which contain references to + Python objects, fill the role of heterogeneous arrays. itemsize The size of the dtype element in bytes. - list - A Python container that can hold any number of objects or items. - The items do not have to be of the same type, and can even be - lists themselves:: - - >>> x = [2, 2.0, "two", [2, 2.0]] - - The list `x` contains 4 items, each which can be accessed individually:: - - >>> x[2] # the string 'two' - 'two' - - >>> x[3] # a list, containing an integer 2 and a float 2.0 - [2, 2.0] - - It is also possible to select more than one item at a time, - using *slicing*:: - - >>> x[0:2] # or, equivalently, x[:2] - [2, 2.0] - - In code, arrays are often conveniently expressed as nested lists:: - - - >>> np.array([[1, 2], [3, 4]]) - array([[1, 2], - [3, 4]]) - - For more information, read the section on lists in the `Python - tutorial <https://docs.python.org/tutorial/>`_. For a mapping - type (key-value), see *dictionary*. - - little-endian - When storing a multi-byte value in memory as a sequence of bytes, the - sequence addresses/sends/stores the least significant byte first (lowest - address) and the most significant byte last (highest address). Common in - x86 processors. + See `Endianness <https://en.wikipedia.org/wiki/Endianness>`_. 
mask - A boolean array, used to select only certain elements for an operation:: + A boolean array used to select only certain elements for an operation: - >>> x = np.arange(5) - >>> x - array([0, 1, 2, 3, 4]) + >>> x = np.arange(5) + >>> x + array([0, 1, 2, 3, 4]) - >>> mask = (x > 2) - >>> mask - array([False, False, False, True, True]) + >>> mask = (x > 2) + >>> mask + array([False, False, False, True, True]) - >>> x[mask] = -1 - >>> x - array([ 0, 1, 2, -1, -1]) + >>> x[mask] = -1 + >>> x + array([ 0, 1, 2, -1, -1]) masked array - Array that suppressed values indicated by a mask:: + Bad or missing data can be cleanly ignored by putting it in a masked + array, which has an internal boolean array indicating invalid + entries. Operations with masked arrays ignore these entries. :: - >>> x = np.ma.masked_array([np.nan, 2, np.nan], [True, False, True]) - >>> x + >>> a = np.ma.masked_array([np.nan, 2, np.nan], [True, False, True]) + >>> a masked_array(data=[--, 2.0, --], mask=[ True, False, True], fill_value=1e+20) - >>> x + [1, 2, 3] + >>> a + [1, 2, 3] masked_array(data=[--, 4.0, --], mask=[ True, False, True], fill_value=1e+20) - - Masked arrays are often used when operating on arrays containing - missing or invalid entries. + For details, see :doc:`Masked arrays. <reference/maskedarray>` matrix - A 2-dimensional ndarray that preserves its two-dimensional nature - throughout operations. It has certain special operations, such as ``*`` - (matrix multiplication) and ``**`` (matrix power), defined:: - - >>> x = np.mat([[1, 2], [3, 4]]) - >>> x - matrix([[1, 2], - [3, 4]]) - - >>> x**2 - matrix([[ 7, 10], - [15, 22]]) + NumPy's two-dimensional + :doc:`matrix class <reference/generated/numpy.matrix>` + should no longer be used; use regular ndarrays. ndarray - See *array*. + :doc:`NumPy's basic structure <reference/arrays>`. object array - An array whose dtype is ``object``; that is, it contains references to Python objects. 
Indexing the array dereferences the Python objects, so unlike other ndarrays, an object array has the ability to hold @@ -471,72 +387,43 @@ Glossary ravel - - `numpy.ravel` and `numpy.ndarray.flatten` both flatten an ndarray. ``ravel`` - will return a view if possible; ``flatten`` always returns a copy. - - Flattening collapses a multi-dimensional array to a single dimension; + :doc:`numpy.ravel \ + <reference/generated/numpy.ravel>` + and :doc:`numpy.flatten \ + <reference/generated/numpy.ndarray.flatten>` + both flatten an ndarray. ``ravel`` will return a view if possible; + ``flatten`` always returns a copy. + + Flattening collapses a multimdimensional array to a single dimension; details of how this is done (for instance, whether ``a[n+1]`` should be the next row or next column) are parameters. record array - An :term:`ndarray` with :term:`structured data type` which has been - subclassed as ``np.recarray`` and whose dtype is of type ``np.record``, - making the fields of its data type to be accessible by attribute. - - - reference - If ``a`` is a reference to ``b``, then ``(a is b) == True``. Therefore, - ``a`` and ``b`` are different names for the same Python object. + A :term:`structured array` with allowing access in an attribute style + (``a.field``) in addition to ``a['field']``. For details, see + :doc:`numpy.recarray. <reference/generated/numpy.recarray>` row-major - A way to represent items in a N-dimensional array in the 1-dimensional - computer memory. In row-major order, the rightmost index "varies - the fastest": for example the array:: - - [[1, 2, 3], - [4, 5, 6]] - - is represented in the row-major order as:: - - [1, 2, 3, 4, 5, 6] - - Row-major order is also known as the C order, as the C programming - language uses it. New NumPy arrays are by default in row-major order. + See `Row- and column-major order <https://en.wikipedia.org/wiki/Row-_and_column-major_order>`_. + NumPy creates arrays in row-major order by default. 
- slice - Used to select only certain elements from a sequence: + scalar + In NumPy, usually a synonym for :term:`array scalar`. - >>> x = range(5) - >>> x - [0, 1, 2, 3, 4] - >>> x[1:3] # slice from 1 to 3 (excluding 3 itself) - [1, 2] - - >>> x[1:5:2] # slice from 1 to 5, but skipping every second element - [1, 3] - - >>> x[::-1] # slice a sequence in reverse - [4, 3, 2, 1, 0] - - Arrays may have more than one dimension, each which can be sliced - individually: - - >>> x = np.array([[1, 2], [3, 4]]) - >>> x - array([[1, 2], - [3, 4]]) - - >>> x[:, 1] - array([2, 4]) + shape + A tuple showing the length of each dimension of an ndarray. The + length of the tuple itself is the number of dimensions + (:doc:`numpy.ndim <reference/generated/numpy.ndarray.ndim>`). + The product of the tuple elements is the number of elements in the + array. For details, see + :doc:`numpy.ndarray.shape <reference/generated/numpy.ndarray.shape>`. stride - Physical memory is one-dimensional; strides provide a mechanism to map a given index to an address in memory. For an N-dimensional array, its ``strides`` attribute is an N-element tuple; advancing from index @@ -555,56 +442,70 @@ Glossary <https://arxiv.org/pdf/1102.1523.pdf>`_ - structure - See :term:`structured data type` - - structured array - Array whose :term:`dtype` is a :term:`structured data type`. structured data type - A data type composed of other datatypes + Users can create arbitrarily complex :term:`dtypes <dtype>` + that can include other arrays and dtypes. These composite dtypes are called + :doc:`structured data types. 
<user/basics.rec>` - subarray data type - A :term:`structured data type` may contain a :term:`ndarray` with its - own dtype and shape: + subarray + An array nested in a :term:`structured data type`, as ``b`` is here: + + >>> dt = np.dtype([('a', np.int32), ('b', np.float32, (3,))]) + >>> np.zeros(3, dtype=dt) + array([(0, [0., 0., 0.]), (0, [0., 0., 0.]), (0, [0., 0., 0.])], + dtype=[('a', '<i4'), ('b', '<f4', (3,))]) - >>> dt = np.dtype([('a', np.int32), ('b', np.float32, (3,))]) - >>> np.zeros(3, dtype=dt) - array([(0, [0., 0., 0.]), (0, [0., 0., 0.]), (0, [0., 0., 0.])], - dtype=[('a', '<i4'), ('b', '<f4', (3,))]) + + subarray data type + An element of a structured datatype that behaves like an ndarray. title - In addition to field names, structured array fields may have an - associated :ref:`title <titles>` which is an alias to the name and is - commonly used for plotting. + An alias for a field name in a structured datatype. + + + type + In NumPy, usually a synonym for :term:`dtype`. For the more general + Python meaning, :term:`see here. <python:type>` ufunc - Universal function. A fast element-wise, :term:`vectorized - <vectorization>` array operation. Examples include ``add``, ``sin`` and - ``logical_or``. + NumPy's fast element-by-element computation (:term:`vectorization`) + gives a choice which function gets applied. The general term for the + function is ``ufunc``, short for ``universal function``. NumPy routines + have built-in ufuncs, but users can also + :doc:`write their own. <reference/ufuncs>` vectorization - Optimizing a looping block by specialized code. In a traditional sense, - vectorization performs the same operation on multiple elements with - fixed strides between them via specialized hardware. Compilers know how - to take advantage of well-constructed loops to implement such - optimizations. 
NumPy uses :ref:`vectorization <whatis-vectorization>` - to mean any optimization via specialized code performing the same - operations on multiple elements, typically achieving speedups by - avoiding some of the overhead in looking up and converting the elements. - + NumPy hands off array processing to C, where looping and computation are + much faster than in Python. To exploit this, programmers using NumPy + eliminate Python loops in favor of array-to-array operations. + :term:`vectorization` can refer both to the C offloading and to + structuring NumPy code to leverage it. view - An array that does not own its data, but refers to another array's - data instead. For example, we may create a view that only shows - every second element of another array:: + Without touching underlying data, NumPy can make one array appear + to change its datatype and shape. + + An array created this way is a `view`, and NumPy often exploits the + performance gain of using a view versus making a new array. + + A potential drawback is that writing to a view can alter the original + as well. If this is a problem, NumPy instead needs to create a + physically distinct array -- a `copy`. + + Some NumPy routines always return views, some always return copies, some + may return one or the other, and for some the choice can be specified. + Responsiblity for managing views and copies falls to the programmer. + :func:`numpy.shares_memory` will check whether ``b`` is a view of + ``a``, but an exact answer isn't always feasible, as the documentation + page explains. >>> x = np.arange(5) >>> x @@ -618,15 +519,3 @@ Glossary >>> y array([3, 2, 4]) - - wrapper - Python is a high-level (highly abstracted, or English-like) language. - This abstraction comes at a price in execution speed, and sometimes - it becomes necessary to use lower level languages to do fast - computations. 
A wrapper is code that provides a bridge between - high and the low level languages, allowing, e.g., Python to execute - code written in C or Fortran. - - Examples include ctypes, SWIG and Cython (which wraps C and C++) - and f2py (which wraps Fortran). - |
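
The revised `view` entry in the diff above contrasts views with copies and points to `numpy.shares_memory`. The following is an illustration only, not part of the commit: a minimal sketch, assuming NumPy is imported as `np`, with the names `a`, `v`, `c`, and `m` chosen arbitrarily for this example. It also exercises the `ravel` entry's claim that `ravel` returns a view when possible while `flatten` always copies.

```python
import numpy as np

a = np.arange(6)

# Basic slicing yields a view: no data is copied.
v = a[::2]
# Advanced (integer-list) indexing yields a copy.
c = a[[0, 2, 4]]

print(np.shares_memory(a, v))  # True
print(np.shares_memory(a, c))  # False

# Writing through the view alters the original; writing to the copy does not.
v[0] = 100
c[1] = -1
print(a.tolist())  # [100, 1, 2, 3, 4, 5]

# For a contiguous array, ravel can return a view; flatten always copies.
m = np.arange(6).reshape(2, 3)
print(np.shares_memory(m, m.ravel()))    # True
print(np.shares_memory(m, m.flatten()))  # False
```

As the glossary text itself cautions, an exact answer from `numpy.shares_memory` is not always feasible; `numpy.may_share_memory` is a cheaper, conservative alternative that can report `True` even when no memory actually overlaps.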