author Marten van Kerkwijk <mhvk@astro.utoronto.ca> 2017-03-31 12:28:16 -0400
committer Charles Harris <charlesr.harris@gmail.com> 2017-04-27 13:25:50 -0600
commit 5fe6fc640d752fe9e4a9a51635bf070b503aa85e
tree ac0a5244c98a24e4f3ec88954e50e2f4f675c43b
parent 30417109170d1f5f1256172e6506ea32751b0587
DOC Update NEP to reflect actual implementation.
-rw-r--r-- doc/neps/ufunc-overrides.rst 262
1 files changed, 151 insertions, 111 deletions
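
For orientation, the interface this patch documents can be sketched roughly as follows. This is a minimal illustration, not code from the commit: the class names ``Tagged`` and ``OptOut`` are hypothetical, and the sketch assumes a NumPy release that implements ``__array_ufunc__`` (1.13 or later).

```python
import numpy as np


class Tagged:
    """Hypothetical container that wraps an ndarray and overrides ufuncs."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Handle only plain calls without ``out``; defer on everything else.
        if method != "__call__" or "out" in kwargs:
            return NotImplemented
        # Unwrap Tagged inputs so the underlying ufunc sees plain arrays.
        raw = [x.data if isinstance(x, Tagged) else x for x in inputs]
        return Tagged(getattr(ufunc, method)(*raw, **kwargs))


class OptOut:
    """Hypothetical class that opts out of ufuncs but supports ``*``."""

    __array_ufunc__ = None

    def __rmul__(self, other):
        return 4321


arr = np.array([1, 2, 3])

# The ufunc hands the operation to Tagged.__array_ufunc__.
result = np.multiply(arr, Tagged([10, 20, 30]))

# ndarray.__mul__ returns NotImplemented, so Python falls back to __rmul__.
product = arr * OptOut()

# Calling the ufunc directly on an opted-out object raises TypeError.
try:
    np.multiply(arr, OptOut())
    ufunc_error = None
except TypeError as exc:
    ufunc_error = exc
```

The two classes correspond to the two cases the revised text distinguishes: an object that takes over the ufunc call, and an object that sets ``__array_ufunc__`` to ``None`` so that binary operators defer to its reflected methods.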
diff --git a/doc/neps/ufunc-overrides.rst b/doc/neps/ufunc-overrides.rst
index 480e229c2..f69da0090 100644
--- a/doc/neps/ufunc-overrides.rst
+++ b/doc/neps/ufunc-overrides.rst
@@ -2,6 +2,8 @@
A Mechanism for Overriding Ufuncs
=================================
+.. currentmodule:: numpy
+
:Author: Blake Griffith
:Contact: blake.g@utexas.edu
:Date: 2013-07-10
@@ -10,25 +12,32 @@ A Mechanism for Overriding Ufuncs
:Author: Nathaniel Smith
+:Author: Marten van Kerkwijk
+:Date: 2017-03-31
Executive summary
=================
NumPy's universal functions (ufuncs) currently have some limited
-functionality for operating on user defined subclasses of ndarray using
-``__array_prepare__`` and ``__array_wrap__`` [1]_, and there is little
-to no support for arbitrary objects. e.g. SciPy's sparse matrices [2]_
-[3]_.
+functionality for operating on user defined subclasses of
+:class:`ndarray` using ``__array_prepare__`` and ``__array_wrap__``
+[1]_, and there is little to no support for arbitrary
+objects, e.g. SciPy's sparse matrices [2]_ [3]_.
Here we propose adding a mechanism to override ufuncs based on the ufunc
-checking each of it's arguments for a ``__numpy_ufunc__`` method.
-On discovery of ``__numpy_ufunc__`` the ufunc will hand off the
+checking each of its arguments for a ``__array_ufunc__`` method.
+On discovery of ``__array_ufunc__`` the ufunc will hand off the
operation to the method.
This covers some of the same ground as Travis Oliphant's proposal to
retro-fit NumPy with multi-methods [4]_, which would solve the same
problem. The mechanism here follows more closely the way Python enables
-classes to override ``__mul__`` and other binary operations.
+classes to override ``__mul__`` and other binary operations. It also
+specifically addresses how binary operators and ufuncs should interact.
+
+.. note:: In earlier iterations, the override was called
+ ``__numpy_ufunc__``. An implementation was made, but had not
+ quite the right behaviour, hence the change in name.
.. [1] http://docs.python.org/doc/numpy/user/basics.subclassing.html
.. [2] https://github.com/scipy/scipy/issues/2123
@@ -41,13 +50,14 @@ Motivation
The current machinery for dispatching Ufuncs is generally agreed to be
insufficient. There have been lengthy discussions and other proposed
-solutions [5]_.
+solutions [5]_, [6]_.
-Using ufuncs with subclasses of ndarray is limited to ``__array_prepare__`` and
-``__array_wrap__`` to prepare the arguments, but these don't allow you to for
-example change the shape or the data of the arguments. Trying to ufunc things
-that don't subclass ndarray is even more difficult, as the input arguments tend
-to be cast to object arrays, which ends up producing surprising results.
+Using ufuncs with subclasses of :class:`ndarray` is limited to
+``__array_prepare__`` and ``__array_wrap__`` to prepare the arguments,
+but these don't allow you to for example change the shape or the data of
+the arguments. Trying to ufunc things that don't subclass
+:class:`ndarray` is even more difficult, as the input arguments tend to
+be cast to object arrays, which ends up producing surprising results.
Take this example of ufuncs interoperability with sparse matrices.::
@@ -81,7 +91,7 @@ Take this example of ufuncs interoperability with sparse matrices.::
In [5]: np.multiply(a, bsp) # Returns NotImplemented to user, bad!
Out[5]: NotImplemented
-Returning ``NotImplemented`` to user should not happen. Moreover::
+Returning :obj:`NotImplemented` to user should not happen. Moreover::
In [6]: np.multiply(asp, b)
Out[6]: array([[ <3x3 sparse matrix of type '<class 'numpy.int64'>'
@@ -106,21 +116,24 @@ Returning ``NotImplemented`` to user should not happen. Moreover::
Here, it appears that the sparse matrix was converted to an object array
scalar, which was then multiplied with all elements of the ``b`` array.
However, this behavior is more confusing than useful, and having a
-``TypeError`` would be preferable.
+:exc:`TypeError` would be preferable.
-Adding the ``__numpy_ufunc__`` functionality fixes this and would
+Adding the ``__array_ufunc__`` functionality fixes this and would
deprecate the other ufunc modifying functions.
.. [5] http://mail.python.org/pipermail/numpy-discussion/2011-June/056945.html
+.. [6] https://github.com/numpy/numpy/issues/5844
Proposed interface
==================
-Objects that want to override Ufuncs can define a ``__numpy_ufunc__`` method.
-The method signature is::
+The standard array class :class:`ndarray` gains an ``__array_ufunc__``
+method and objects can override Ufuncs by overriding this method (if
+they are :class:`ndarray` subclasses) or defining their own. The method
+signature is::
- def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs)
+ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
Here:
@@ -128,141 +141,168 @@ Here:
- *method* is a string indicating which Ufunc method was called
(one of ``"__call__"``, ``"reduce"``, ``"reduceat"``,
``"accumulate"``, ``"outer"``, ``"inner"``).
-- *i* is the index of *self* in *inputs*.
- *inputs* is a tuple of the input arguments to the ``ufunc``
- *kwargs* are the keyword arguments passed to the function. The ``out``
- arguments are always contained in *kwargs*, how positional variables
- are passed is discussed below.
-
-The ufunc's arguments are first normalized into a tuple of input data
-(``inputs``), and dict of keyword arguments. If there are output
-arguments they are handled as follows:
-
-- One positional output variable x is passed in the kwargs dict as ``out :
- x``.
-- Multiple positional output variables ``x0, x1, ...`` are passed as a tuple
- in the kwargs dict as ``out : (x0, x1, ...)``.
-- Keyword output variables like ``out = x`` and ``out = (x0, x1, ...)`` are
- passed unchanged to the kwargs dict like ``out : x`` and ``out : (x0, x1,
- ...)`` respectively.
-- Combinations of positional and keyword output variables are not
- supported.
+ arguments are always contained as a tuple in *kwargs*.
+
+Hence, the arguments are normalized: only the input data (``inputs``)
+are passed on as positional arguments, all the others are passed on as a
+dict of keyword arguments (``kwargs``). In particular, if there are
+output arguments, positional or otherwise, they are passed on as a
+tuple in the ``out`` keyword argument.
The function dispatch proceeds as follows:
-- If one of the input arguments implements ``__numpy_ufunc__`` it is
+- If one of the input arguments implements ``__array_ufunc__`` it is
executed instead of the Ufunc.
-- If more than one of the input arguments implements ``__numpy_ufunc__``,
+- If more than one of the input arguments implements ``__array_ufunc__``,
they are tried in the following order: subclasses before superclasses,
- otherwise left to right. The first ``__numpy_ufunc__`` method returning
- something else than ``NotImplemented`` determines the return value of
+ otherwise left to right. The first ``__array_ufunc__`` method returning
+ something else than :obj:`NotImplemented` determines the return value of
the Ufunc.
-- If all ``__numpy_ufunc__`` methods of the input arguments return
- ``NotImplemented``, a ``TypeError`` is raised.
+- If all ``__array_ufunc__`` methods of the input arguments return
+ :obj:`NotImplemented`, a :exc:`TypeError` is raised.
-- If a ``__numpy_ufunc__`` method raises an error, the error is propagated
+- If a ``__array_ufunc__`` method raises an error, the error is propagated
immediately.
-If none of the input arguments has a ``__numpy_ufunc__`` method, the
+If none of the input arguments has an ``__array_ufunc__`` method, the
execution falls back on the default ufunc behaviour.
+Subclass hierarchies
+--------------------
+
+Hierarchies of such containers (say, a masked quantity), are most easily
+constructed if methods consistently use :func:`super` to pass through
+the class hierarchy [7]_. To support this, :class:`ndarray` has its own
+``__array_ufunc__`` method (which is equivalent to ``getattr(ufunc,
+method)(*inputs, **kwargs)``, i.e., if any of the (adjusted) inputs
+still defines ``__array_ufunc__`` that will be called in turn). This
+should be particularly useful for container-like subclasses of
+:class:`ndarray`, which add an attribute like a unit or mask to a
+regular :class:`ndarray`. Such classes can do possible adjustment of the
+arguments relevant to their own class, pass on to another class in the
+hierarchy using :func:`super` until the Ufunc is actually done, and then
+do possible adjustments of the outputs.
+
+Turning Ufuncs off
+------------------
+
+For some classes, Ufuncs make no sense, and, like for other special
+methods [8]_, one can indicate Ufuncs are not available by setting
+``__array_ufunc__`` to :obj:`None`. Inside a Ufunc, this is
+equivalent to unconditionally returning :obj:`NotImplemented`, and thus
+will lead to a :exc:`TypeError` (unless another operand implements
+``__array_ufunc__`` and knows how to deal with the class).
+
+.. [7] https://rhettinger.wordpress.com/2011/05/26/super-considered-super/
+
+.. [8] https://docs.python.org/3/reference/datamodel.html#specialnames
In combination with Python's binary operations
----------------------------------------------
-The ``__numpy_ufunc__`` mechanism is fully independent of Python's
+The ``__array_ufunc__`` mechanism is fully independent of Python's
standard operator override mechanism, and the two do not interact
directly.
-They however have indirect interactions, because NumPy's ``ndarray``
-type implements its binary operations via Ufuncs. Effectively, we have::
-
- class ndarray(object):
+They have indirect interactions, however, because NumPy's
+:class:`ndarray` type implements its binary operations via Ufuncs. For
+most numerical classes, the easiest way to override binary operations is
+thus to define ``__array_ufunc__`` and override the corresponding
+Ufunc. The class can then, like :class:`ndarray` itself, define the
+binary operators in terms of Ufuncs. Here, one has to take some care.
+E.g., the simplest implementation would be::
+
+ class ArrayLike(object):
+ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
+ ...
+ return result
...
def __mul__(self, other):
- return np.multiply(self, other)
+ return self.__array_ufunc__(np.multiply, '__call__', self, other)
-Suppose now we have a second class::
+Suppose now, however, that ``other`` is a class that does not know how to
+deal with arrays and ufuncs, but does know how to do multiplication::
class MyObject(object):
- def __numpy_ufunc__(self, *a, **kw):
- return "ufunc"
+ __array_ufunc__ = None
def __mul__(self, other):
return 1234
def __rmul__(self, other):
return 4321
In this case, standard Python override rules combined with the above
-discussion imply::
+discussion would imply::
- a = MyObject()
- b = np.array([0])
+ mine = MyObject()
+ arr = ArrayLike([0])
- a * b # == 1234 OK
- b * a # == "ufunc" surprising
+ mine * arr # == 1234 OK
+ arr * mine # TypeError surprising
-This is not what would be naively expected, and is therefore somewhat
-undesirable behavior.
+The reason why this would occur is: because ``MyObject`` is not an
+``ArrayLike`` subclass, Python resolves the expression ``arr * mine`` by
+calling first ``arr.__mul__``. In the above implementation, this would
+just call the Ufunc, which would see that ``mine.__array_ufunc__`` is
+:obj:`None` and raise a :exc:`TypeError`. (Note that if ``MyObject``
+is a subclass of :class:`ndarray`, Python calls ``mine.__rmul__`` first.)
-The reason why this occurs is: because ``MyObject`` is not an ndarray
-subclass, Python resolves the expression ``b * a`` by calling first
-``b.__mul__``. Since NumPy implements this via an Ufunc, the call is
-forwarded to ``__numpy_ufunc__`` and not to ``__rmul__``. Note that if
-``MyObject`` is a subclass of ``ndarray``, Python calls ``a.__rmul__``
-first. The issue is therefore that ``__numpy_ufunc__`` implements
-"virtual subclassing" of ndarray behavior, without actual subclassing.
+So, a better implementation of the binary operators would check whether
+the other class can be dealt with in ``__array_ufunc__`` and, if not,
+return :obj:`NotImplemented`::
-This issue can be resolved by a modification of the binary operation
-methods in NumPy::
-
- class ndarray(object):
+ class ArrayLike(object):
...
def __mul__(self, other):
- if (not isinstance(other, self.__class__)
- and hasattr(other, '__numpy_ufunc__')
- and hasattr(other, '__rmul__')):
- return NotImplemented
- return np.multiply(self, other)
-
- def __imul__(self, other):
- if (other.__class__ is not self.__class__
- and hasattr(other, '__numpy_ufunc__')
- and hasattr(other, '__rmul__')):
+ if getattr(other, '__array_ufunc__', False) is None:
return NotImplemented
- return np.multiply(self, other, out=self)
-
- b * a # == 4321 OK
-
-The rationale here is the following: since the user class explicitly
-defines both ``__numpy_ufunc__`` and ``__rmul__``, the implementor has
-very likely made sure that the ``__rmul__`` method can process ndarrays.
-If not, the special case is simple to deal with (just call
-``np.multiply``).
-
-The exclusion of subclasses of self can be made because Python itself
-calls the right-hand method first in this case. Moreover, it is
-desirable that ndarray subclasses are able to inherit the right-hand
-binary operation methods from ndarray.
-
-The same priority shuffling needs to be done also for the in-place
-operations, so that ``MyObject.__rmul__`` is prioritized over
-``ndarray.__imul__``.
-
+ return self.__array_ufunc__(np.multiply, '__call__', self, other)
+
+ arr = ArrayLike([0])
+
+ arr * mine # == 4321 OK
+
+Indeed, after long discussion about whether it might make more sense to
+ask classes like ``ArrayLike`` to implement a full ``__array_ufunc__``
+[6]_, the same design as the above was agreed on for :class:`ndarray`
+itself.
+
+.. note:: The above holds for regular operators. For in-place
+ operators, :class:`ndarray` never returns
+ :obj:`NotImplemented`, i.e., ``ndarr *= mine`` would always
+ lead to a :exc:`TypeError`. This is because for arrays
+ in-place operations cannot generically be replaced by a simple
+ reverse operation. For instance, sticking to the above
+ example, what would ``ndarr[:] *= mine`` imply? Assuming it
+ means ``ndarr[:] = ndarr[:] * mine``, as Python does by
+ default, is likely to be wrong.
+
+Extension to other numpy functions
+----------------------------------
+
+The ``__array_ufunc__`` method is used to override :func:`~numpy.dot`
+and :func:`~numpy.matmul` as well, since while these functions are not
+(yet) implemented as (generalized) Ufuncs, they are very similar. For
+other functions, such as :func:`~numpy.median`, :func:`~numpy.min`,
+etc., implementations as (generalized) Ufuncs may well be possible and
+logical as well, in which case it will become possible to override these
+as well.
Demo
====
-A pull request[6]_ has been made including the changes proposed in this NEP.
-Here is a demo highlighting the functionality.::
+A pull request [8]_ has been made including the changes and revisions
+proposed in this NEP. Here is a demo highlighting the functionality.::
In [1]: import numpy as np;
In [2]: a = np.array([1])
In [3]: class B():
- ...: def __numpy_ufunc__(self, func, method, pos, inputs, **kwargs):
+ ...: def __array_ufunc__(self, func, method, *inputs, **kwargs):
...: return "B"
...:
@@ -274,24 +314,24 @@ Here is a demo highlighting the functionality.::
In [6]: np.multiply(a, b)
Out[6]: 'B'
-A simple ``__numpy_ufunc__`` has been added to SciPy's sparse matrices
-Currently this only handles ``np.dot`` and ``np.multiply`` because it was the
-two most common cases where users would attempt to use sparse matrices with ufuncs.
-The method is defined below::
+As a simple example, one could add the following ``__array_ufunc__`` to
+SciPy's sparse matrices (just for ``np.dot`` and ``np.multiply`` as
+these are the two most common cases where users would attempt to use
+sparse matrices with ufuncs)::
- def __numpy_ufunc__(self, func, method, pos, inputs, **kwargs):
+ def __array_ufunc__(self, func, method, pos, inputs, **kwargs):
"""Method for compatibility with NumPy's ufuncs and dot
functions.
"""
without_self = list(inputs)
- del without_self[pos]
+ without_self.pop(pos)
without_self = tuple(without_self)
- if func == np.multiply:
+ if func is np.multiply:
return self.multiply(*without_self)
- elif func == np.dot:
+ elif func is np.dot:
if pos == 0:
return self.__mul__(inputs[1])
if pos == 1: