summaryrefslogtreecommitdiff
path: root/numpy/doc/indexing.py
diff options
context:
space:
mode:
authorStefan van der Walt <stefan@sun.ac.za>2008-08-23 23:17:23 +0000
committerStefan van der Walt <stefan@sun.ac.za>2008-08-23 23:17:23 +0000
commit5c86844c34674e3d580ac2cd12ef171e18130b13 (patch)
tree2fdf1150706c07c7e193eb7483ce58a5074e5774 /numpy/doc/indexing.py
parent376d483d31c4c5427510cf3a8c69fc795aef63aa (diff)
downloadnumpy-5c86844c34674e3d580ac2cd12ef171e18130b13.tar.gz
Move documentation outside of source tree. Remove `doc` import from __init__.
Diffstat (limited to 'numpy/doc/indexing.py')
-rw-r--r--numpy/doc/indexing.py384
1 files changed, 384 insertions, 0 deletions
diff --git a/numpy/doc/indexing.py b/numpy/doc/indexing.py
new file mode 100644
index 000000000..365edd67a
--- /dev/null
+++ b/numpy/doc/indexing.py
@@ -0,0 +1,384 @@
+"""
+==============
+Array indexing
+==============
+
+Array indexing refers to any use of the square brackets ([]) to index
+array values. There are many options to indexing, which give numpy
+indexing great power, but with power comes some complexity and the
+potential for confusion. This section is just an overview of the
+various options and issues related to indexing. Aside from single
+element indexing, the details on most of these options are to be
+found in related sections.
+
+Assignment vs referencing
+=========================
+
+Most of the following examples show the use of indexing when referencing
+data in an array. The examples work just as well when assigning to an
+array. See the section at the end for specific examples and explanations
+on how assignments work.
+
+Single element indexing
+=======================
+
+Single element indexing for a 1-D array is what one expects. It work
+exactly like that for other standard Python sequences. It is 0-based,
+and accepts negative indices for indexing from the end of the array. ::
+
+ >>> x = np.arange(10)
+ >>> x[2]
+ 2
+ >>> x[-2]
+ 8
+
+Unlike lists and tuples, numpy arrays support multidimensional indexing
+for multidimensional arrays. That means that it is not necessary to
+separate each dimension's index into its own set of square brackets. ::
+
+ >>> x.shape = (2,5) # now x is 2-dimensional
+ >>> x[1,3]
+ 8
+ >>> x[1,-1]
+ 9
+
+Note that if one indexes a multidimensional array with fewer indices
+than dimensions, one gets a subdimensional array. For example: ::
+
+ >>> x[0]
+ array([0, 1, 2, 3, 4])
+
+That is, each index specified selects the array corresponding to the rest
+of the dimensions selected. In the above example, choosing 0 means that
+remaining dimension of lenth 5 is being left unspecified, and that what
+is returned is an array of that dimensionality and size. It must be noted
+that the returned array is not a copy of the original, but points to the
+same values in memory as does the original array (a new view of the same
+data in other words, see xxx for details). In this case,
+the 1-D array at the first position (0) is returned. So using a single
+index on the returned array, results in a single element being returned.
+That is: ::
+
+ >>> x[0][2]
+ 2
+
+So note that ``x[0,2] = x[0][2]`` though the second case is more inefficient
+a new temporary array is created after the first index that is subsequently
+indexed by 2.
+
+Note to those used to IDL or Fortran memory order as it relates to indexing.
+Numpy uses C-order indexing. That means that the last index usually (see
+xxx for exceptions) represents the most rapidly changing memory location,
+unlike Fortran or IDL, where the first index represents the most rapidly
+changing location in memory. This difference represents a great potential
+for confusion.
+
+Other indexing options
+======================
+
+It is possible to slice and stride arrays to extract arrays of the same
+number of dimensions, but of different sizes than the original. The slicing
+and striding works exactly the same way it does for lists and tuples except
+that they can be applied to multiple dimensions as well. A few
+examples illustrates best: ::
+
+ >>> x = np.arange(10)
+ >>> x[2:5]
+ array([2, 3, 4])
+ >>> x[:-7]
+ array([0, 1, 2])
+ >>> x[1:7:2]
+ array([1,3,5])
+ >>> y = np.arange(35).reshape(5,7)
+ >>> y[1:5:2,::3]
+ array([[ 7, 10, 13],
+ [21, 24, 27]])
+
+Note that slices of arrays do not copy the internal array data but
+also produce new views of the original data (see xxx for more
+explanation of this issue).
+
+It is possible to index arrays with other arrays for the purposes of
+selecting lists of values out of arrays into new arrays. There are two
+different ways of accomplishing this. One uses one or more arrays of
+index values (see xxx for details). The other involves giving a boolean
+array of the proper shape to indicate the values to be selected.
+Index arrays are a very powerful tool that allow one to avoid looping
+over individual elements in arrays and thus greatly improve performance
+(see xxx for examples)
+
+It is possible to use special features to effectively increase the
+number of dimensions in an array through indexing so the resulting
+array aquires the shape needed for use in an expression or with a
+specific function. See xxx.
+
+Index arrays
+============
+
+Numpy arrays may be indexed with other arrays (or any other sequence-like
+object that can be converted to an array, such as lists, with the exception
+of tuples; see the end of this document for why this is). The use of index
+arrays ranges from simple, straightforward cases to complex, hard-to-understand
+cases. For all cases of index arrays, what is returned is a copy of the
+original data, not a view as one gets for slices.
+
+Index arrays must be of integer type. Each value in the array indicates which
+value in the array to use in place of the index. To illustrate: ::
+
+ >>> x = np.arange(10,1,-1)
+ >>> x
+ array([10, 9, 8, 7, 6, 5, 4, 3, 2])
+ >>> x[np.array([3, 3, 1, 8])]
+ array([7, 7, 9, 2])
+
+
+The index array consisting of the values 3, 3, 1 and 8 correspondingly create
+an array of length 4 (same as the index array) where each index is replaced by
+the value the index array has in the array being indexed.
+
+Negative values are permitted and work as they do with single indices or slices: ::
+
+ >>> x[np.array([3,3,-3,8])]
+ array([7, 7, 4, 2])
+
+It is an error to have index values out of bounds: ::
+
+ >>> x[np.array([3, 3, 20, 8])]
+ <type 'exceptions.IndexError'>: index 20 out of bounds 0<=index<9
+
+Generally speaking, what is returned when index arrays are used is an array with
+the same shape as the index array, but with the type and values of the array being
+indexed. As an example, we can use a multidimensional index array instead: ::
+
+ >>> x[np.array([[1,1],[2,3]])]
+ array([[9, 9],
+ [8, 7]])
+
+Indexing Multi-dimensional arrays
+=================================
+
+Things become more complex when multidimensional arrays are indexed, particularly
+with multidimensional index arrays. These tend to be more unusal uses, but they
+are permitted, and they are useful for some problems. We'll start with the
+simplest multidimensional case (using the array y from the previous examples): ::
+
+ >>> y[np.array([0,2,4]), np.array([0,1,2])]
+ array([ 0, 15, 30])
+
+In this case, if the index arrays have a matching shape, and there is an index
+array for each dimension of the array being indexed, the resultant array has the
+same shape as the index arrays, and the values correspond to the index set for each
+position in the index arrays. In this example, the first index value is 0 for both
+index arrays, and thus the first value of the resultant array is y[0,0]. The next
+value is y[2,1], and the last is y[4,2].
+
+If the index arrays do not have the same shape, there is an attempt to broadcast
+them to the same shape. Broadcasting won't be discussed here but is discussed in
+detail in xxx. If they cannot be broadcast to the same shape, an exception is
+raised: ::
+
+ >>> y[np.array([0,2,4]), np.array([0,1])]
+ <type 'exceptions.ValueError'>: shape mismatch: objects cannot be broadcast to a single shape
+
+The broadcasting mechanism permits index arrays to be combined with scalars for
+other indices. The effect is that the scalar value is used for all the corresponding
+values of the index arrays: ::
+
+ >>> y[np.array([0,2,4]), 1]
+ array([ 1, 15, 29])
+
+Jumping to the next level of complexity, it is possible to only partially index an array
+with index arrays. It takes a bit of thought to understand what happens in such cases.
+For example if we just use one index array with y: ::
+
+ >>> y[np.array([0,2,4])]
+ array([[ 0, 1, 2, 3, 4, 5, 6],
+ [14, 15, 16, 17, 18, 19, 20],
+ [28, 29, 30, 31, 32, 33, 34]])
+
+What results is the construction of a new array where each value of the index array
+selects one row from the array being indexed and the resultant array has the resulting
+shape (size of row, number index elements).
+
+An example of where this may be useful is for a color lookup table where we want to map
+the values of an image into RGB triples for display. The lookup table could have a shape
+(nlookup, 3). Indexing such an array with an image with shape (ny, nx) with dtype=np.uint8
+(or any integer type so long as values are with the bounds of the lookup table) will
+result in an array of shape (ny, nx, 3) where a triple of RGB values is associated with
+each pixel location.
+
+In general, the shape of the resulant array will be the concatenation of the shape of
+the index array (or the shape that all the index arrays were broadcast to) with the
+shape of any unused dimensions (those not indexed) in the array being indexed.
+
+Boolean or "mask" index arrays
+==============================
+
+Boolean arrays used as indices are treated in a different manner entirely than index
+arrays. Boolean arrays must be of the same shape as the array being indexed, or
+broadcastable to the same shape. In the most straightforward case, the boolean array
+has the same shape: ::
+
+ >>> b = y>20
+ >>> y[b]
+ array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34])
+
+The result is a 1-D array containing all the elements in the indexed array corresponding
+to all the true elements in the boolean array. As with index arrays, what is returned
+is a copy of the data, not a view as one gets with slices.
+
+With broadcasting, multidimesional arrays may be the result. For example: ::
+
+ >>> b[:,5] # use a 1-D boolean that broadcasts with y
+ array([False, False, False, True, True], dtype=bool)
+ >>> y[b[:,5]]
+ array([[21, 22, 23, 24, 25, 26, 27],
+ [28, 29, 30, 31, 32, 33, 34]])
+
+Here the 4th and 5th rows are selected from the indexed array and combined to make a
+2-D array.
+
+Combining index arrays with slices
+==================================
+
+Index arrays may be combined with slices. For example: ::
+
+ >>> y[np.array([0,2,4]),1:3]
+ array([[ 1, 2],
+ [15, 16],
+ [29, 30]])
+
+In effect, the slice is converted to an index array np.array([[1,2]]) (shape (1,2)) that is
+broadcast with the index array to produce a resultant array of shape (3,2).
+
+Likewise, slicing can be combined with broadcasted boolean indices: ::
+
+ >>> y[b[:,5],1:3]
+ array([[22, 23],
+ [29, 30]])
+
+Structural indexing tools
+=========================
+
+To facilitate easy matching of array shapes with expressions and in
+assignments, the np.newaxis object can be used within array indices
+to add new dimensions with a size of 1. For example: ::
+
+ >>> y.shape
+ (5, 7)
+ >>> y[:,np.newaxis,:].shape
+ (5, 1, 7)
+
+Note that there are no new elements in the array, just that the
+dimensionality is increased. This can be handy to combine two
+arrays in a way that otherwise would require explicitly reshaping
+operations. For example: ::
+
+ >>> x = np.arange(5)
+ >>> x[:,np.newaxis] + x[np.newaxis,:]
+ array([[0, 1, 2, 3, 4],
+ [1, 2, 3, 4, 5],
+ [2, 3, 4, 5, 6],
+ [3, 4, 5, 6, 7],
+ [4, 5, 6, 7, 8]])
+
+The ellipsis syntax maybe used to indicate selecting in full any
+remaining unspecified dimensions. For example: ::
+
+ >>> z = np.arange(81).reshape(3,3,3,3)
+ >>> z[1,...,2]
+ array([[29, 32, 35],
+ [38, 41, 44],
+ [47, 50, 53]])
+
+This is equivalent to: ::
+
+ >>> z[1,:,:,2]
+
+Assigning values to indexed arrays
+==================================
+
+As mentioned, one can select a subset of an array to assign to using
+a single index, slices, and index and mask arrays. The value being
+assigned to the indexed array must be shape consistent (the same shape
+or broadcastable to the shape the index produces). For example, it is
+permitted to assign a constant to a slice: ::
+
+ >>> x[2:7] = 1
+
+or an array of the right size: ::
+
+ >>> x[2:7] = np.arange(5)
+
+Note that assignments may result in changes if assigning
+higher types to lower types (like floats to ints) or even
+exceptions (assigning complex to floats or ints): ::
+
+ >>> x[1] = 1.2
+ >>> x[1]
+ 1
+ >>> x[1] = 1.2j
+ <type 'exceptions.TypeError'>: can't convert complex to long; use long(abs(z))
+
+
+Unlike some of the references (such as array and mask indices)
+assignments are always made to the original data in the array
+(indeed, nothing else would make sense!). Note though, that some
+actions may not work as one may naively expect. This particular
+example is often surprising to people: ::
+
+ >>> x[np.array([1, 1, 3, 1]) += 1
+
+Where people expect that the 1st location will be incremented by 3.
+In fact, it will only be incremented by 1. The reason is because
+a new array is extracted from the original (as a temporary) containing
+the values at 1, 1, 3, 1, then the value 1 is added to the temporary,
+and then the temporary is assigned back to the original array. Thus
+the value of the array at x[1]+1 is assigned to x[1] three times,
+rather than being incremented 3 times.
+
+Dealing with variable numbers of indices within programs
+========================================================
+
+The index syntax is very powerful but limiting when dealing with
+a variable number of indices. For example, if you want to write
+a function that can handle arguments with various numbers of
+dimensions without having to write special case code for each
+number of possible dimensions, how can that be done? If one
+supplies to the index a tuple, the tuple will be interpreted
+as a list of indices. For example (using the previous definition
+for the array z): ::
+
+ >>> indices = (1,1,1,1)
+ >>> z[indices]
+ 40
+
+So one can use code to construct tuples of any number of indices
+and then use these within an index.
+
+Slices can be specified within programs by using the slice() function
+in Python. For example: ::
+
+ >>> indices = (1,1,1,slice(0,2)) # same as [1,1,1,0:2]
+ array([39, 40])
+
+Likewise, ellipsis can be specified by code by using the Ellipsis object: ::
+
+ >>> indices = (1, Ellipsis, 1) # same as [1,...,1]
+ >>> z[indices]
+ array([[28, 31, 34],
+ [37, 40, 43],
+ [46, 49, 52]])
+
+For this reason it is possible to use the output from the np.where()
+function directly as an index since it always returns a tuple of index arrays.
+
+Because the special treatment of tuples, they are not automatically converted
+to an array as a list would be. As an example: ::
+
+ >>> z[[1,1,1,1]]
+ ... # produces a large array
+ >>> z[(1,1,1,1)]
+ 40 # returns a single value
+
+"""