Merge pull request #12166 from mattip/nep-16

NEP: Add zero-rank arrays historical info NEP
author: Eric Wieser <wieser.eric@gmail.com> 2018-10-15 19:36:53 -0700
committer: GitHub <noreply@github.com> 2018-10-15 19:36:53 -0700
commit: a5e10f8b2903892c1c0771de3ff6516709cbb739 (patch)
tree: 94481665dfd767de4a0a68097da98ed39916f91d
parent: 86ebcffb482afb67c2f6ec4f396d9017ea610bf1 (diff)
parent: 1f027a6e8fd7eb953dbb5f6c43e689fc5059c889 (diff)
download: numpy-a5e10f8b2903892c1c0771de3ff6516709cbb739.tar.gz
1 files changed, 251 insertions, 0 deletions
diff --git a/doc/neps/nep-0027-zero-rank-arrarys.rst b/doc/neps/nep-0027-zero-rank-arrarys.rst
new file mode 100644
index 000000000..11ea44dbd
--- /dev/null
+++ b/doc/neps/nep-0027-zero-rank-arrarys.rst
@@ -0,0 +1,251 @@
+=========================
+NEP 27 — Zero Rank Arrays
+=========================
+
+:Author: Alexander Belopolsky (sasha), transcribed by Matt Picus <matti.picus@gmail.com>
+:Status: Draft
+:Type: Informational
+:Created: 2006-06-10
+
+Abstract
+--------
+
+NumPy has both zero rank arrays and scalars. This design document, adapted from
+a `2006 wiki entry`_, describes what zero rank arrays are and why they exist.
+It was transcribed 2018-10-13 into a NEP and links were updated.
+
+Note that some of the information here is dated; for instance, indexing of 0-D
+arrays is now implemented and does not raise an error.
+
+Zero-Rank Arrays
+----------------
+
+Zero-rank arrays are arrays with shape=().  For example::
+
+    >>> x = array(1)
+    >>> x.shape
+    ()
+
+
+Zero-Rank Arrays and Array Scalars
+----------------------------------
+
+Array scalars are similar to zero-rank arrays in many respects::
+
+
+    >>> int_(1).shape
+    ()
+
+They even print the same::
+
+
+    >>> print int_(1)
+    1
+    >>> print array(1)
+    1
+
+
+However, there are some important differences:
+
+* Array scalars are immutable
+* Array scalars have a different Python type for each data type
+
+Motivation for Array Scalars
+----------------------------
+
+NumPy's design decision to provide 0-d arrays and array scalars in addition to
+native Python types goes against one of the fundamental Python design
+principles: that there should be only one obvious way to do it.  In this
+section we will try to explain why it is necessary to have three different
+ways to represent a number.
+
+There were several numpy-discussion threads:
+
+
+* `rank-0 arrays`_ in a 2002 mailing list thread.
+* Thoughts about zero dimensional arrays vs Python scalars in a `2005 mailing list thread`_
+
+It has been suggested several times that NumPy just use rank-0 arrays to
+represent scalar quantities in all cases.  Pros and cons of converting rank-0
+arrays to scalars were summarized as follows:
+
+- Pros:
+
+  - In some cases where Python expects an integer (the most
+    dramatic being slicing and indexing a sequence:
+    ``_PyEval_SliceIndex`` in ``ceval.c``), it will not try to
+    convert the object to an integer before raising an error.
+    It is therefore convenient to have 0-dim integer arrays
+    converted for you by the array object.
+
+  - No risk of user confusion by having two types that
+    are nearly but not exactly the same and whose separate
+    existence can only be explained by the history of
+    Python and NumPy development.
+
+  - No problems with code that does explicit typechecks
+    (``isinstance(x, float)`` or ``type(x) == types.FloatType``). Although
+    explicit typechecks are considered bad practice in general, there are a
+    couple of valid reasons to use them.
+
+  - No creation of a dependency on Numeric in pickle
+    files (though this could also be done by a special case
+    in the pickling code for arrays)
+
+- Cons:
+
+  - It is difficult to write generic code because scalars
+    do not have the same methods and attributes as arrays
+    (such as ``.type`` or ``.shape``).  Python scalars also
+    have different numeric behavior.
+
+  - This results in special-case checking that is not
+    pleasant.  Fundamentally, it lets the user believe that
+    multidimensional homogeneous arrays are somehow
+    something like Python lists (which, except for
+    Object arrays, they are not).
+
+NumPy implements a solution that is designed to have all the pros and none of the cons above.
+
+    Create Python scalar types for all of the 21 types and also
+    inherit from the three that already exist. Define equivalent
+    methods and attributes for these Python scalar types.
+
+The Need for Zero-Rank Arrays
+-----------------------------
+
+Once the idea to use zero-rank arrays to represent scalars was rejected, it was
+natural to consider whether zero-rank arrays could be eliminated altogether.
+However, there are some important use cases where zero-rank arrays cannot be
+replaced by array scalars.  See also `A case for rank-0 arrays`_ from February
+2006.
+
+* Output arguments::
+
+    >>> x = array(1)
+    >>> y = int_(5)
+    >>> add(5,5,x)
+    array(10)
+    >>> x
+    array(10)
+    >>> add(5,5,y)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in ?
+    TypeError: return arrays must be of ArrayType
+
+* Shared data::
+
+    >>> x = array([1,2])
+    >>> y = x[1:2]
+    >>> y.shape = ()
+    >>> y
+    array(2)
+    >>> x[1] = 20
+    >>> y
+    array(20)
+
+Indexing of Zero-Rank Arrays
+----------------------------
+
+As of NumPy release 0.9.3, zero-rank arrays do not support any indexing::
+
+    >>> x[...]
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in ?
+    IndexError: 0-d arrays can't be indexed.
+
+On the other hand, there are several indexing cases that make sense for rank-zero arrays.
+
+Ellipsis and empty tuple
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sasha started a `Jan 2006 discussion`_ on scipy-dev
+with the following proposal:
+
+    ... it may be reasonable to allow ``a[...]``.  This way
+    ellipsis can be interpreted as any number of ``:`` s including zero.
+    Another subscript operation that makes sense for scalars would be
+    ``a[...,newaxis]`` or even ``a[{newaxis, }* ..., {newaxis,}*]``, where
+    ``{newaxis,}*`` stands for any number of comma-separated newaxis tokens.
+    This will allow one to use ellipsis in generic code that would work on
+    any numpy type.
+
+Francesc Altet supported the idea of ``[...]`` on zero-rank arrays and
+`suggested`_ that ``[()]`` be supported as well.
+
+Francesc's proposal was::
+
+    In [65]: type(numpy.array(0)[...])
+    Out[65]: <type 'numpy.ndarray'>
+
+    In [66]: type(numpy.array(0)[()])   # Indexing a la numarray
+    Out[66]: <type 'int32_arrtype'>
+
+    In [67]: type(numpy.array(0).item())  # already works
+    Out[67]: <type 'int'>
+
+There is a consensus that for a zero-rank array ``x``, both ``x[...]`` and
+``x[()]`` should be valid, but the question remains of what the type of the
+result should be: a zero-rank ndarray or ``x.dtype``?
+
+(Sasha)
+    First, whatever choice is made for ``x[...]`` and ``x[()]`` they should be
+    the same because ``...`` is just syntactic sugar for "as many ``:`` as
+    necessary", which in the case of zero rank leads to ``... = (:,)*0 = ()``.
+    Second, rank zero arrays and numpy scalar types are interchangeable within
+    numpy, but numpy scalars can be used in some Python constructs where ndarrays
+    can't.  For example::
+
+        >>> (1,)[array(0)]
+        Traceback (most recent call last):
+          File "<stdin>", line 1, in ?
+        TypeError: tuple indices must be integers
+        >>> (1,)[int32(0)]
+        1
+
+Since most, if not all, NumPy functions automatically convert zero-rank arrays
+to scalars on return, there is no reason for the ``[...]`` and ``[()]``
+operations to behave differently.
+
+See SVN changeset 1864 (which became git commit `9024ff0`_) for
+implementation of ``x[...]`` and ``x[()]`` returning numpy scalars.
+
+See SVN changeset 1866 (which became git commit `743d922`_) for
+implementation of ``x[...] = v`` and ``x[()] = v``.
+
+Increasing rank with newaxis
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Everyone who commented liked this feature, so as of SVN changeset 1871 (which
+became git commit `b32744e`_) any number of ellipses and newaxis tokens can be
+placed as a subscript argument for a zero-rank array. For example::
+
+    >>> x = array(1)
+    >>> x[newaxis,...,newaxis,...]
+    array([[1]])
+
+It is not clear why more than one ellipsis should be allowed, but this is the
+behavior of higher rank arrays that we are trying to preserve.
+
+Refactoring
+~~~~~~~~~~~
+
+Currently all indexing on zero-rank arrays is implemented in a special
+``if (nd == 0)`` branch of code that used to always raise an index error. This
+ensures that the changes do not affect any existing usage (except usage that
+relies on exceptions).  On the other hand, part of the motivation for these
+changes was to make the behavior of ndarrays more uniform, and this should make
+it possible to eliminate ``if (nd == 0)`` checks altogether.
+
+Copyright
+---------
+
+The original document appeared on the scipy.org wiki, with no copyright notice, and its `history`_ attributes it to sasha.
+
+.. _`2006 wiki entry`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray
+.. _`history`: https://web.archive.org/web/20100503065506/http://projects.scipy.org:80/numpy/wiki/ZeroRankArray?action=history
+.. _`2005 mailing list thread`: https://sourceforge.net/p/numpy/mailman/message/11299166
+.. _`suggested`: https://mail.python.org/pipermail/numpy-discussion/2006-January/005572.html
+.. _`Jan 2006 discussion`: https://mail.python.org/pipermail/numpy-discussion/2006-January/005579.html
+.. _`A case for rank-0 arrays`: https://mail.python.org/pipermail/numpy-discussion/2006-February/006384.html
+.. _`rank-0 arrays`: https://mail.python.org/pipermail/numpy-discussion/2002-September/001600.html
+.. _`9024ff0`: https://github.com/numpy/numpy/commit/9024ff0dc052888b5922dde0f3e615607a9e99d7
+.. _`743d922`: https://github.com/numpy/numpy/commit/743d922bf5893acf00ac92e823fe12f460726f90
+.. _`b32744e`: https://github.com/numpy/numpy/commit/b32744e3fc5b40bdfbd626dcc1f72907d77c01c4
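
The ellipsis rule quoted in the NEP (`...` stands for as many `:` as necessary, possibly zero) can be illustrated with a small pure-Python sketch. The helper name `expand_index` is hypothetical, and this is not how NumPy implements indexing: it assumes at most one ellipsis per subscript and uses `None` in the role of `newaxis`.

```python
def expand_index(index, ndim):
    """Replace a single Ellipsis with the full slices it stands for.

    Hypothetical helper for illustration only; not NumPy's implementation.
    `None` plays the role of newaxis and consumes no array dimension.
    """
    if not isinstance(index, tuple):
        index = (index,)
    if index.count(Ellipsis) > 1:
        raise IndexError("this sketch supports at most one ellipsis")

    # Count the entries that consume one dimension of the array each.
    consumed = sum(
        1 for item in index if item is not None and item is not Ellipsis
    )
    if consumed > ndim:
        raise IndexError("too many indices for array")

    expanded = []
    for item in index:
        if item is Ellipsis:
            # As many ':' as necessary, which may be zero of them.
            expanded.extend([slice(None)] * (ndim - consumed))
        else:
            expanded.append(item)
    return tuple(expanded)


# For a zero-rank array, '...' and '()' expand to the same empty subscript.
print(expand_index(Ellipsis, 0))
print(expand_index((), 0))
# Each remaining None adds one axis to the result.
print(expand_index((None, Ellipsis, None), 0))
```

With `ndim=0` both `...` and `()` expand to the empty tuple, which is the equivalence Sasha argues for, and the two `None` entries left in the last case correspond to the rank-2 result `array([[1]])` shown in the newaxis example.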