-rw-r--r-- doc/neps/nep-0031-uarray.rst 124
1 files changed, 122 insertions, 2 deletions
diff --git a/doc/neps/nep-0031-uarray.rst b/doc/neps/nep-0031-uarray.rst
index d1c3d806e..c7b950f14 100644
--- a/doc/neps/nep-0031-uarray.rst
+++ b/doc/neps/nep-0031-uarray.rst
@@ -114,7 +114,8 @@ Proposals
~~~~~~~~~
The only change this NEP proposes at its acceptance, is to make ``unumpy`` the
-officially recommended way to override NumPy. ``unumpy`` will remain a separate
+officially recommended way to override NumPy, along with making some submodules
+overridable by default via ``uarray``. ``unumpy`` will remain a separate
repository/package (which we propose to vendor to avoid a hard dependency, and
use the separate ``unumpy`` package only if it is installed, rather than depend
on for the time being). In concrete terms, ``numpy.overridable`` becomes an
@@ -130,6 +131,10 @@ GitHub workflow. There are a few reasons for this:
rather than breakages happening when it is least expected.
In simple terms, bugs in ``unumpy`` mean that ``numpy`` remains
unaffected.
+* For ``numpy.fft``, ``numpy.linalg`` and ``numpy.random``, the functions in
+  the main namespace will mirror those in the ``numpy.overridable`` namespace.
+  The reason for this is that there may exist functions in these
+  submodules that need backends, even for ``numpy.ndarray`` inputs.
Advantanges of ``unumpy`` over other solutions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -156,7 +161,13 @@ allows one to override a large part of the NumPy API by defining only a small
part of it. This is to ease the creation of new duck-arrays, by providing
default implementations of many functions that can be easily expressed in
terms of others, as well as a repository of utility functions that help in the
-implementation of duck-arrays that most duck-arrays would require.
+implementation of duck-arrays that most duck-arrays would require. This would
+allow us to avoid designing entire protocols, e.g., a protocol for stacking
+and concatenating would be replaced by simply implementing ``stack`` and/or
+``concatenate`` and then providing default implementations for everything else
+in that class. The same applies for transposing, and many other functions for
+which protocols haven't been proposed, such as ``isin`` in terms of ``in1d``,
+``setdiff1d`` in terms of ``unique``, and so on.
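
As a rough illustration of the default-implementation idea described above, the sketch below derives ``stack`` from ``concatenate`` and ``setdiff1d`` from ``unique`` using plain NumPy. It is not taken from the NEP or from ``unumpy``; the names ``default_stack`` and ``default_setdiff1d`` and the keyword arguments used to pass in the "backend-provided" primitives are hypothetical.

```python
import numpy as np


def default_stack(arrays, axis=0, *, concatenate=np.concatenate):
    """Hypothetical default for ``stack`` built only on ``concatenate``."""
    # Give every input a new length-1 axis, then join along that axis.
    expanded = [np.expand_dims(np.asarray(a), axis) for a in arrays]
    return concatenate(expanded, axis=axis)


def default_setdiff1d(ar1, ar2, *, unique=np.unique):
    """Hypothetical default for ``setdiff1d`` built only on ``unique``."""
    u1 = unique(ar1)
    u2 = unique(ar2)
    # Mark every unique value of ar1 that also occurs in ar2.
    in_second = (u1[:, None] == u2[None, :]).any(axis=1)
    return u1[~in_second]


a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
stacked = default_stack([a, b], axis=1)
difference = default_setdiff1d([1, 2, 3, 4, 3], [3, 5])
```

Under this sketch, a backend would only need to supply ``concatenate`` and ``unique``; the derived functions would then come for free.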
It also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum`` with the
@@ -211,6 +222,101 @@ If the user wishes to obtain a NumPy array, there are two ways of doing it:
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and coercion
enabled
+Aliases outside of the ``numpy.overridable`` namespace
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All functionality in ``numpy.random``, ``numpy.linalg`` and ``numpy.fft``
+will be aliased to their respective overridable versions inside
+``numpy.overridable``. The reason for this is that there are alternative
+implementations of RNGs (``mkl-random``), linear algebra routines (``eigen``,
+``blis``) and FFT routines (``mkl-fft``, ``pyFFTW``) that need to operate on
+``numpy.ndarray`` inputs, but still need the ability to switch behaviour.
+
+This differs from monkeypatching in a few ways:
+
+* The caller-facing signature of the function is always the same,
+  so there is at least the loose sense of an API contract. Monkeypatching
+  does not provide this ability.
+* The backend can be switched locally.
+* It has been `suggested <http://numpy-discussion.10968.n7.nabble.com/NEP-31-Context-local-and-global-overrides-of-the-NumPy-API-tp47452p47472.html>`_
+  that the reason that 1.17 hasn't landed in the Anaconda defaults channel is
+  due to the incompatibility between monkeypatching and ``__array_function__``,
+  as monkeypatching would bypass the protocol completely.
+* Statements of the form ``from numpy import x; x`` and ``np.x`` would have
+  different results depending on whether the import was made before or
+  after monkeypatching happened.
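
The import-order problem in the last bullet can be reproduced without NumPy. The sketch below is only an illustration: ``fake_numpy`` is a stand-in module created on the fly, and ``original_sum`` and ``patched_sum`` are hypothetical functions, not part of any real library.

```python
import types

# Stand-in for a library module (hypothetical, not the real numpy).
fake_numpy = types.ModuleType("fake_numpy")


def original_sum(values):
    return sum(values)


fake_numpy.sum = original_sum

# Equivalent to ``from fake_numpy import sum`` executed before patching:
# the name is bound to the function object that exists right now.
early_sum = fake_numpy.sum


def patched_sum(values):
    # A monkeypatched replacement that changes the return type.
    return float(sum(values))


fake_numpy.sum = patched_sum

# Attribute access on the module sees the patch; the early binding does not.
early_result = early_sum([1, 2])
late_result = fake_numpy.sum([1, 2])
```

Both calls look identical to the caller, yet they run different implementations depending only on when the name was bound.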
+
+None of this is possible with ``__array_function__`` or
+``__array_ufunc__``.
+
+It has been formally realised (at least in part) that a backend system is
+needed for this, in the `NumPy roadmap <https://numpy.org/neps/roadmap.html#other-functionality>`_.
+
+For ``numpy.random``, it's still necessary to make the C-API fit the one
+proposed in `NEP-19 <https://numpy.org/neps/nep-0019-rng-policy.html>`_.
+This is impossible for ``mkl-random``, because then it would need to be
+rewritten to fit that framework. The guarantees on stream
+compatibility will be the same as before, but if there's a backend that affects
+``numpy.random`` set, we make no guarantees about stream compatibility, and it
+is up to the backend author to provide their own guarantees.
+
+Providing a way for implicit dispatch
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It has been suggested that the ability to dispatch methods which do not take
+a dispatchable is needed, with the backend inferred from another dispatchable.
+
+As a concrete example, consider the following:
+
+.. code:: python
+
+    with unumpy.determine_backend(array_like, np.ndarray):
+        unumpy.arange(len(array_like))
+
+While this does not exist yet in ``uarray``, it is trivial to add it. The need for
+this kind of code exists because one might want to have an alternative for the
+proposed ``*_like`` functions, or the ``like=`` keyword argument. The need for these
+exists because there are functions in the NumPy API that do not take a dispatchable
+argument, but there is still the need to select a backend based on a different
+dispatchable.
+
+The need for an opt-in module
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The need for an opt-in module arises for a few reasons:
+
+* There are parts of the API (like ``numpy.asarray``) that simply cannot be
+  overridden due to incompatibility concerns with C/Cython extensions; however,
+  one may want to coerce to a duck-array using ``asarray`` with a backend set.
+* There are possible issues around an implicit option and monkeypatching, such
+  as those mentioned above.
+
+NEP 18 notes that this may require maintenance of two separate APIs. However,
+this burden may be lessened by, for example, parametrizing all tests over
+``numpy.overridable`` separately via a fixture. This also has the side-effect
+of thoroughly testing it, unlike ``__array_function__``. We also feel that it
+provides an opportunity to separate the NumPy API contract properly from the
+implementation.
+
+Benefits to end-users and mixing backends
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Mixing backends is easy in ``uarray``; one only has to do:
+
+.. code:: python
+
+    # Explicitly say which backends you want to mix
+    ua.register_backend(backend1)
+    ua.register_backend(backend2)
+    ua.register_backend(backend3)
+
+    # Freely use code that mixes backends here.
+
+The benefits to end-users extend beyond just writing new code. Old code
+(usually in the form of scripts) can be easily ported to different backends
+by a simple import switch and a line adding the preferred backend. This way,
+users may find it easier to port existing code to GPU or distributed computing.
+
Related Work
------------
@@ -245,6 +351,14 @@ Existing alternate dtype implementations
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/
+Alternate implementations of parts of the NumPy API
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* ``mkl_random``: https://github.com/IntelPython/mkl_random
+* ``mkl_fft``: https://github.com/IntelPython/mkl_fft
+* ``bottleneck``: https://github.com/pydata/bottleneck
+* ``opt_einsum``: https://github.com/dgasmith/opt_einsum
+
Implementation
--------------
@@ -420,6 +534,12 @@ also a possibility that can be considered by this NEP. However, the act of
doing an extra ``pip install`` or ``conda install`` may discourage some users
from adopting this method.
+An alternative to requiring opt-in is to *not* override ``np.asarray``
+and ``np.array``, while making the rest of the NumPy API surface overridable,
+and to instead provide ``np.duckarray`` and ``np.asduckarray``
+as duck-array friendly alternatives that use the respective overrides. However,
+this has the downside of adding a minor overhead to NumPy calls.
+
Discussion
----------