diff options
Diffstat (limited to 'doc/source/user/basics.copies.rst')
-rw-r--r-- | doc/source/user/basics.copies.rst | 152 |
1 files changed, 152 insertions, 0 deletions
diff --git a/doc/source/user/basics.copies.rst b/doc/source/user/basics.copies.rst new file mode 100644 index 000000000..583a59b95 --- /dev/null +++ b/doc/source/user/basics.copies.rst @@ -0,0 +1,152 @@ +.. _basics.copies-and-views: + +**************** +Copies and views +**************** + +When operating on NumPy arrays, it is possible to access the internal data +buffer directly using a :ref:`view <view>` without copying data around. This +ensures good performance but can also cause unwanted problems if the user is +not aware of how this works. Hence, it is important to know the difference +between these two terms and to know which operations return copies and +which return views. + +The NumPy array is a data structure consisting of two parts: +the :term:`contiguous` data buffer with the actual data elements and the +metadata that contains information about the data buffer. The metadata +includes data type, strides, and other important information that helps +manipulate the :class:`.ndarray` easily. See the :ref:`numpy-internals` +section for a detailed look. + +.. _view: + +View +==== + +It is possible to access the array differently by just changing certain +metadata like :term:`stride` and :term:`dtype` without changing the +data buffer. This creates a new way of looking at the data and these new +arrays are called views. The data buffer remains the same, so any changes made +to a view reflects in the original copy. A view can be forced through the +:meth:`.ndarray.view` method. + +Copy +==== + +When a new array is created by duplicating the data buffer as well as the +metadata, it is called a copy. Changes made to the copy +do not reflect on the original array. Making a copy is slower and +memory-consuming but sometimes necessary. A copy can be forced by using +:meth:`.ndarray.copy`. + +Indexing operations +=================== + +.. seealso:: :ref:`basics.indexing` + +Views are created when elements can be addressed with offsets and strides +in the original array. Hence, basic indexing always creates views. +For example:: + + >>> x = np.arange(10) + >>> x + array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) + >>> y = x[1:3] # creates a view + >>> y + array([1, 2]) + >>> x[1:3] = [10, 11] + >>> x + array([ 0, 10, 11, 3, 4, 5, 6, 7, 8, 9]) + >>> y + array([10, 11]) + +Here, ``y`` gets changed when ``x`` is changed because it is a view. + +:ref:`advanced-indexing`, on the other hand, always creates copies. +For example:: + + >>> x = np.arange(9).reshape(3, 3) + >>> x + array([[0, 1, 2], + [3, 4, 5], + [6, 7, 8]]) + >>> y = x[[1, 2]] + >>> y + array([[3, 4, 5], + [6, 7, 8]]) + >>> y.base is None + True + +Here, ``y`` is a copy, as signified by the :attr:`base <.ndarray.base>` +attribute. We can also confirm this by assigning new values to ``x[[1, 2]]`` +which in turn will not affect ``y`` at all:: + + >>> x[[1, 2]] = [[10, 11, 12], [13, 14, 15]] + >>> x + array([[ 0, 1, 2], + [10, 11, 12], + [13, 14, 15]]) + >>> y + array([[3, 4, 5], + [6, 7, 8]]) + +It must be noted here that during the assignment of ``x[[1, 2]]`` no view +or copy is created as the assignment happens in-place. + + +Other operations +================ + +The :func:`numpy.reshape` function creates a view where possible or a copy +otherwise. In most cases, the strides can be modified to reshape the +array with a view. However, in some cases where the array becomes +non-contiguous (perhaps after a :meth:`.ndarray.transpose` operation), +the reshaping cannot be done by modifying strides and requires a copy. +In these cases, we can raise an error by assigning the new shape to the +shape attribute of the array. For example:: + + >>> x = np.ones((2, 3)) + >>> y = x.T # makes the array non-contiguous + >>> y + array([[1., 1.], + [1., 1.], + [1., 1.]]) + >>> z = y.view() + >>> z.shape = 6 + Traceback (most recent call last): + ... + AttributeError: Incompatible shape for in-place modification. Use + `.reshape()` to make a copy with the desired shape. + +Taking the example of another operation, :func:`.ravel` returns a contiguous +flattened view of the array wherever possible. On the other hand, +:meth:`.ndarray.flatten` always returns a flattened copy of the array. +However, to guarantee a view in most cases, ``x.reshape(-1)`` may be preferable. + +How to tell if the array is a view or a copy +============================================ + +The :attr:`base <.ndarray.base>` attribute of the ndarray makes it easy +to tell if an array is a view or a copy. The base attribute of a view returns +the original array while it returns ``None`` for a copy. + + >>> x = np.arange(9) + >>> x + array([0, 1, 2, 3, 4, 5, 6, 7, 8]) + >>> y = x.reshape(3, 3) + >>> y + array([[0, 1, 2], + [3, 4, 5], + [6, 7, 8]]) + >>> y.base # .reshape() creates a view + array([0, 1, 2, 3, 4, 5, 6, 7, 8]) + >>> z = y[[2, 1]] + >>> z + array([[6, 7, 8], + [3, 4, 5]]) + >>> z.base is None # advanced indexing creates a copy + True + +Note that the ``base`` attribute should not be used to determine +if an ndarray object is *new*; only if it is a view or a copy +of another ndarray.
\ No newline at end of file |