author Eric Wieser <wieser.eric@gmail.com> 2018-10-15 19:36:53 -0700
committer GitHub <noreply@github.com> 2018-10-15 19:36:53 -0700
commit a5e10f8b2903892c1c0771de3ff6516709cbb739 (patch)
tree 94481665dfd767de4a0a68097da98ed39916f91d
parent 86ebcffb482afb67c2f6ec4f396d9017ea610bf1 (diff)
parent 1f027a6e8fd7eb953dbb5f6c43e689fc5059c889 (diff)
download numpy-a5e10f8b2903892c1c0771de3ff6516709cbb739.tar.gz
Merge pull request #12166 from mattip/nep-16
NEP: Add zero-rank arrays historical info NEP
-rw-r--r--doc/neps/nep-0027-zero-rank-arrarys.rst251
1 files changed, 251 insertions, 0 deletions
diff --git a/doc/neps/nep-0027-zero-rank-arrarys.rst b/doc/neps/nep-0027-zero-rank-arrarys.rst
new file mode 100644
index 000000000..11ea44dbd
--- /dev/null
+++ b/doc/neps/nep-0027-zero-rank-arrarys.rst
@@ -0,0 +1,251 @@
+=========================
+NEP 27 — Zero Rank Arrays
+=========================
+
+:Author: Alexander Belopolsky (sasha), transcribed by Matt Picus <matti.picus@gmail.com>
+:Status: Draft
+:Type: Informational
+:Created: 2006-06-10
+
+Abstract
+--------
+
+NumPy has both zero rank arrays and scalars. This design document, adapted from
+a `2006 wiki entry`_, describes what zero rank arrays are and why they exist.
+It was transcribed 2018-10-13 into a NEP and links were updated.
+
+Note that some of the information here is dated; for instance, indexing of 0-D
+arrays is now implemented and does not raise an error.
+
+Zero-Rank Arrays
+----------------
+
+Zero-rank arrays are arrays with shape=(). For example::
+
+ >>> x = array(1)
+ >>> x.shape
+ ()
+
+
+Zero-Rank Arrays and Array Scalars
+----------------------------------
+
+Array scalars are similar to zero-rank arrays in many aspects::
+
+
+ >>> int_(1).shape
+ ()
+
+They even print the same::
+
+
+ >>> print int_(1)
+ 1
+ >>> print array(1)
+ 1
+
+
+However, there are some important differences:
+
+* Array scalars are immutable.
+* Array scalars have a different Python type for each data type.
+
+Motivation for Array Scalars
+----------------------------
+
+NumPy's design decision to provide 0-d arrays and array scalars in addition to
+native Python types goes against one of the fundamental Python design
+principles: that there should be only one obvious way to do it. In this section
+we will try to explain why it is necessary to have three different ways to
+represent a number.
+
+There were several numpy-discussion threads:
+
+
+* `rank-0 arrays`_ in a 2002 mailing list thread.
+* Thoughts about zero dimensional arrays vs Python scalars in a `2005 mailing list thread`_
+
+It has been suggested several times that NumPy just use rank-0 arrays to
+represent scalar quantities in all cases. Pros and cons of converting rank-0
+arrays to scalars were summarized as follows:
+
+- Pros:
+
+ - In some cases where Python expects an integer (the most
+ dramatic is when slicing and indexing a sequence:
+ _PyEval_SliceIndex in ceval.c), it will not try to
+ convert the object to an integer before raising an error.
+ It is therefore convenient to have 0-dim integer arrays
+ converted to integers for you by the array object.
+
+ - No risk of user confusion by having two types that
+ are nearly but not exactly the same and whose separate
+ existence can only be explained by the history of
+ Python and NumPy development.
+
+ - No problems with code that does explicit typechecks
+ (``isinstance(x, float)`` or ``type(x) == types.FloatType``). Although
+ explicit typechecks are considered bad practice in general, there are a
+ couple of valid reasons to use them.
+
+ - No creation of a dependency on Numeric in pickle
+ files (though this could also be done by a special case
+ in the pickling code for arrays)
+
+- Cons:
+
+ - It is difficult to write generic code because scalars
+ do not have the same methods and attributes as arrays
+ (such as ``.type`` or ``.shape``). Python scalars also have
+ different numeric behavior.
+
+ - This results in special-case checking that is not
+ pleasant. Fundamentally, it lets the user believe that
+ somehow multidimensional homogeneous arrays
+ are something like Python lists (which, except for
+ Object arrays, they are not).
+
+NumPy implements a solution that is designed to have all the pros and none of the cons above:
+
+ Create Python scalar types for all of the 21 types and also
+ inherit from the three that already exist. Define equivalent
+ methods and attributes for these Python scalar types.
+
+The Need for Zero-Rank Arrays
+-----------------------------
+
+Once the idea to use zero-rank arrays to represent scalars was rejected, it was
+natural to consider whether zero-rank arrays can be eliminated altogether.
+However, there are some important use cases where zero-rank arrays cannot be
+replaced by array scalars. See also `A case for rank-0 arrays`_ from February
+2006.
+
+* Output arguments::
+
+ >>> y = int_(5)
+ >>> add(5,5,x)
+ array(10)
+ >>> x
+ array(10)
+ >>> add(5,5,y)
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in ?
+ TypeError: return arrays must be of ArrayType
+
+* Shared data::
+
+ >>> x = array([1,2])
+ >>> y = x[1:2]
+ >>> y.shape = ()
+ >>> y
+ array(2)
+ >>> x[1] = 20
+ >>> y
+ array(20)
+
+Indexing of Zero-Rank Arrays
+----------------------------
+
+As of NumPy release 0.9.3, zero-rank arrays do not support any indexing::
+
+ >>> x[...]
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in ?
+ IndexError: 0-d arrays can't be indexed.
+
+On the other hand, there are several indexing operations that make sense for zero-rank arrays.
+
+Ellipsis and empty tuple
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sasha started a `Jan 2006 discussion`_ on scipy-dev
+with the following proposal:
+
+ ... it may be reasonable to allow ``a[...]``. This way
+ ellipsis can be interpreted as any number of ``:`` s including zero.
+ Another subscript operation that makes sense for scalars would be
+ ``a[...,newaxis]`` or even ``a[{newaxis, }* ..., {newaxis,}*]``, where
+ ``{newaxis,}*`` stands for any number of comma-separated newaxis tokens.
+ This will allow one to use ellipsis in generic code that would work on
+ any numpy type.
+
+Francesc Altet supported the idea of ``[...]`` on zero-rank arrays and
+`suggested`_ that ``[()]`` be supported as well.
+
+Francesc's proposal was::
+
+ In [65]: type(numpy.array(0)[...])
+ Out[65]: <type 'numpy.ndarray'>
+
+ In [66]: type(numpy.array(0)[()]) # Indexing a la numarray
+ Out[66]: <type 'int32_arrtype'>
+
+ In [67]: type(numpy.array(0).item()) # already works
+ Out[67]: <type 'int'>
+
+There is a consensus that for a zero-rank array ``x``, both ``x[...]`` and ``x[()]`` should be valid, but the question
+remains of what the type of the result should be: a zero-rank ndarray or ``x.dtype``?
+
+(Sasha)
+ First, whatever choice is made for ``x[...]`` and ``x[()]`` they should be
+ the same because ``...`` is just syntactic sugar for "as many ``:`` as
+ necessary", which in the case of zero rank leads to ``... = (:,)*0 = ()``.
+ Second, rank zero arrays and numpy scalar types are interchangeable within
+ numpy, but numpy scalars can be used in some python constructs where ndarrays
+ can't. For example::
+
+ >>> (1,)[array(0)]
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in ?
+ TypeError: tuple indices must be integers
+ >>> (1,)[int32(0)]
+ 1
+
+Since most if not all numpy functions automatically convert zero-rank arrays to scalars on return, there is no reason for
+``[...]`` and ``[()]`` operations to be different.
+
+See SVN changeset 1864 (which became git commit `9024ff0`_) for
+implementation of ``x[...]`` and ``x[()]`` returning numpy scalars.
+
+See SVN changeset 1866 (which became git commit `743d922`_) for
+implementation of ``x[...] = v`` and ``x[()] = v``.
+
+Increasing rank with newaxis
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Everyone who commented liked this feature, so as of SVN changeset 1871 (which became git commit `b32744e`_) any number of ellipses and
+newaxis tokens can be placed as a subscript argument for a zero-rank array. For
+example::
+
+ >>> x = array(1)
+ >>> x[newaxis,...,newaxis,...]
+ array([[1]])
+
+It is not clear why more than one ellipsis should be allowed, but this is the
+behavior of higher rank arrays that we are trying to preserve.
+
+Refactoring
+~~~~~~~~~~~
+
+Currently all indexing on zero-rank arrays is implemented in a special ``if (nd
+== 0)`` branch of code that used to always raise an index error. This ensures
+that the changes do not affect any existing usage (except usage that
+relies on exceptions). On the other hand, part of the motivation for these changes
+was to make the behavior of ndarrays more uniform, and this should make it
+possible to eliminate ``if (nd == 0)`` checks altogether.
+
+Copyright
+---------
+
+The original document appeared on the scipy.org wiki, with no Copyright notice, and its `history`_ attributes it to sasha.
+
+.. _`2006 wiki entry`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray
+.. _`history`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray?action=history
+.. _`2005 mailing list thread`: https://sourceforge.net/p/numpy/mailman/message/11299166
+.. _`suggested`: https://mail.python.org/pipermail/numpy-discussion/2006-January/005572.html
+.. _`Jan 2006 discussion`: https://mail.python.org/pipermail/numpy-discussion/2006-January/005579.html
+.. _`A case for rank-0 arrays`: https://mail.python.org/pipermail/numpy-discussion/2006-February/006384.html
+.. _`rank-0 arrays`: https://mail.python.org/pipermail/numpy-discussion/2002-September/001600.html
+.. _`9024ff0`: https://github.com/numpy/numpy/commit/9024ff0dc052888b5922dde0f3e615607a9e99d7
+.. _`743d922`: https://github.com/numpy/numpy/commit/743d922bf5893acf00ac92e823fe12f460726f90
+.. _`b32744e`: https://github.com/numpy/numpy/commit/b32744e3fc5b40bdfbd626dcc1f72907d77c01c4
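
The behaviours described in the NEP text above (mutability, output arguments, shared data, and indexing of zero-rank arrays) can be checked against a current installation. The following is a minimal sketch, not part of the commit, assuming a recent NumPy release imported as ``np`` under Python 3 (the 2006 transcripts used ``from numpy import *`` under Python 2). All variable names are illustrative. As the abstract warns, some results may differ from the 2006 text.

```python
import numpy as np

# Zero-rank array versus array scalar: both report an empty shape.
x = np.array(1)
s = np.int_(1)
assert x.shape == () and s.shape == ()

# Zero-rank arrays are mutable; array scalars are not.
x[()] = 7
scalar_is_immutable = False
try:
    s[()] = 7            # array scalars do not support item assignment
except TypeError:
    scalar_is_immutable = True

# Output arguments: a zero-rank array can receive a ufunc result in place,
# whereas an array scalar is rejected as an output.
out = np.array(0)
np.add(5, 5, out=out)
scalar_rejected_as_out = False
try:
    np.add(5, 5, out=np.int_(5))
except TypeError:
    scalar_rejected_as_out = True

# Shared data: a zero-rank view tracks changes made through its parent.
parent = np.array([1, 2])
view = parent[1:2].reshape(())   # reshaping this one-element slice gives a view
parent[1] = 20

# Indexing a zero-rank array with an ellipsis, an empty tuple, and newaxis.
z = np.array(1)
ellipsis_result = z[...]
empty_tuple_result = z[()]
expanded = z[np.newaxis, ..., np.newaxis]

print(int(x), int(out), int(view), expanded.shape)
print(type(ellipsis_result).__name__, type(empty_tuple_result).__name__)
```

In the releases this sketch assumes, ``z[...]`` yields a zero-rank array while ``z[()]`` yields an array scalar, which differs from the behaviour recorded for SVN changeset 1864. An index containing more than one ellipsis is also rejected there, so the ``x[newaxis,...,newaxis,...]`` example from the 2006 text is shown here with a single ellipsis.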