doc/neps/nep-0018-array-function-protocol.rst


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957

.. _NEP18:

====================================================================
NEP 18 — A dispatch mechanism for NumPy's high level array functions
====================================================================

:Author: Stephan Hoyer <shoyer@google.com>
:Author: Matthew Rocklin <mrocklin@gmail.com>
:Author: Marten van Kerkwijk <mhvk@astro.utoronto.ca>
:Author: Hameer Abbasi <hameerabbasi@yahoo.com>
:Author: Eric Wieser <wieser.eric@gmail.com>
:Status: Final
:Type: Standards Track
:Created: 2018-05-29
:Updated: 2019-05-25
:Resolution: https://mail.python.org/pipermail/numpy-discussion/2018-August/078493.html

Abstract
--------

We propose the ``__array_function__`` protocol, to allow arguments of NumPy
functions to define how that function operates on them. This will allow
using NumPy as a high level API for efficient multi-dimensional array
operations, even with array implementations that differ greatly from
``numpy.ndarray``.

Detailed description
--------------------

NumPy's high level ndarray API has been implemented several times
outside of NumPy itself for different architectures, such as for GPU
arrays (CuPy), Sparse arrays (scipy.sparse, pydata/sparse) and parallel
arrays (Dask array) as well as various NumPy-like implementations in the
deep learning frameworks, like TensorFlow and PyTorch.

Similarly there are many projects that build on top of the NumPy API
for labeled and indexed arrays (XArray), automatic differentiation
(Autograd, Tangent), masked arrays (numpy.ma), physical units (astropy.units,
pint, unyt), etc. that add additional functionality on top of the NumPy API.
Most of these project also implement a close variation of NumPy's level high
API.

We would like to be able to use these libraries together, for example we
would like to be able to place a CuPy array within XArray, or perform
automatic differentiation on Dask array code. This would be easier to
accomplish if code written for NumPy ndarrays could also be used by
other NumPy-like projects.

For example, we would like for the following code example to work
equally well with any NumPy-like array object:

.. code:: python

    def f(x):
        y = np.tensordot(x, x.T)
        return np.mean(np.exp(y))

Some of this is possible today with various protocol mechanisms within
NumPy.

-  The ``np.exp`` function checks the ``__array_ufunc__`` protocol
-  The ``.T`` method works using Python's method dispatch
-  The ``np.mean`` function explicitly checks for a ``.mean`` method on
   the argument

However other functions, like ``np.tensordot`` do not dispatch, and
instead are likely to coerce to a NumPy array (using the ``__array__``)
protocol, or err outright. To achieve enough coverage of the NumPy API
to support downstream projects like XArray and autograd we want to
support *almost all* functions within NumPy, which calls for a more
reaching protocol than just ``__array_ufunc__``. We would like a
protocol that allows arguments of a NumPy function to take control and
divert execution to another function (for example a GPU or parallel
implementation) in a way that is safe and consistent across projects.

Implementation
--------------

We propose adding support for a new protocol in NumPy,
``__array_function__``.

This protocol is intended to be a catch-all for NumPy functionality that
is not covered by the ``__array_ufunc__`` protocol for universal functions
(like ``np.exp``). The semantics are very similar to ``__array_ufunc__``, except
the operation is specified by an arbitrary callable object rather than a ufunc
instance and method.

A prototype implementation can be found in
`this notebook <https://nbviewer.jupyter.org/gist/shoyer/1f0a308a06cd96df20879a1ddb8f0006>`_.

.. warning::

  The ``__array_function__`` protocol, and its use on particular functions,
  is *experimental*. We plan to retain an interface that makes it possible
  to override NumPy functions, but the way to do so for particular functions
  **can and will change** with little warning. If such reduced backwards
  compatibility guarantees are not accepted to you, do not rely upon overrides
  of NumPy functions for non-NumPy arrays. See "Non-goals" below for more
  details.

.. note::

  Dispatch with the ``__array_function__`` protocol has been implemented but is
  not yet enabled by default:

  - In NumPy 1.16, you need to set the environment variable
    ``NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1`` before importing NumPy to test
    NumPy function overrides.
  - In NumPy 1.17, the protocol will be enabled by default, but can be disabled
    with ``NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0``.
  - Eventually, expect to ``__array_function__`` to always be enabled.

The interface
~~~~~~~~~~~~~

We propose the following signature for implementations of
``__array_function__``:

.. code-block:: python

    def __array_function__(self, func, types, args, kwargs)

-  ``func`` is an arbitrary callable exposed by NumPy's public API,
   which was called in the form ``func(*args, **kwargs)``.
-  ``types`` is a `collection <https://docs.python.org/3/library/collections.abc.html#collections.abc.Collection>`_
   of unique argument types from the original NumPy function call that
   implement ``__array_function__``.
-  The tuple ``args`` and dict ``kwargs`` are directly passed on from the
   original call.

Unlike ``__array_ufunc__``, there are no high-level guarantees about the
type of ``func``, or about which of ``args`` and ``kwargs`` may contain objects
implementing the array API.

As a convenience for ``__array_function__`` implementors, ``types`` provides all
argument types with an ``'__array_function__'`` attribute. This
allows implementors to quickly identify cases where they should defer to
``__array_function__`` implementations on other arguments.
The type of ``types`` is intentionally vague:
``frozenset`` would most closely match intended use, but we may use ``tuple``
instead for performance reasons. In any case, ``__array_function__``
implementations should not rely on the iteration order of ``types``, which
would violate a well-defined "Type casting hierarchy" (as described in
`NEP-13 <https://www.numpy.org/neps/nep-0013-ufunc-overrides.html>`_).

Example for a project implementing the NumPy API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Most implementations of ``__array_function__`` will start with two
checks:

1.  Is the given function something that we know how to overload?
2.  Are all arguments of a type that we know how to handle?

If these conditions hold, ``__array_function__`` should return
the result from calling its implementation for ``func(*args, **kwargs)``.
Otherwise, it should return the sentinel value ``NotImplemented``, indicating
that the function is not implemented by these types. This is preferable to
raising ``TypeError`` directly, because it gives *other* arguments the
opportunity to define the operations.

There are no general requirements on the return value from
``__array_function__``, although most sensible implementations should probably
return array(s) with the same type as one of the function's arguments.
If/when Python gains
`typing support for protocols <https://www.python.org/dev/peps/pep-0544/>`_
and NumPy adds static type annotations, the ``@overload`` implementation
for ``SupportsArrayFunction`` will indicate a return type of ``Any``.

It may also be convenient to define a custom decorators (``implements`` below)
for registering ``__array_function__`` implementations.

.. code:: python

    HANDLED_FUNCTIONS = {}

    class MyArray:
        def __array_function__(self, func, types, args, kwargs):
            if func not in HANDLED_FUNCTIONS:
                return NotImplemented
            # Note: this allows subclasses that don't override
            # __array_function__ to handle MyArray objects
            if not all(issubclass(t, MyArray) for t in types):
                return NotImplemented
            return HANDLED_FUNCTIONS[func](*args, **kwargs)

    def implements(numpy_function):
        """Register an __array_function__ implementation for MyArray objects."""
        def decorator(func):
            HANDLED_FUNCTIONS[numpy_function] = func
            return func
        return decorator

    @implements(np.concatenate)
    def concatenate(arrays, axis=0, out=None):
        ...  # implementation of concatenate for MyArray objects

    @implements(np.broadcast_to)
    def broadcast_to(array, shape):
        ...  # implementation of broadcast_to for MyArray objects

Note that it is not required for ``__array_function__`` implementations to
include *all* of the corresponding NumPy function's optional arguments
(e.g., ``broadcast_to`` above omits the irrelevant ``subok`` argument).
Optional arguments are only passed in to ``__array_function__`` if they
were explicitly used in the NumPy function call.

.. note::

    Just like the case for builtin special methods like ``__add__``, properly
    written ``__array_function__`` methods should always return
    ``NotImplemented`` when an unknown type is encountered. Otherwise, it will
    be impossible to correctly override NumPy functions from another object
    if the operation also includes one of your objects.

Necessary changes within the NumPy codebase itself
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This will require two changes within the NumPy codebase:

1. A function to inspect available inputs, look for the
   ``__array_function__`` attribute on those inputs, and call those
   methods appropriately until one succeeds.  This needs to be fast in the
   common all-NumPy case, and have acceptable performance (no worse than
   linear time) even if the number of overloaded inputs is large (e.g.,
   as might be the case for `np.concatenate`).

   This is one additional function of moderate complexity.
2. Calling this function within all relevant NumPy functions.

   This affects many parts of the NumPy codebase, although with very low
   complexity.

Finding and calling the right ``__array_function__``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a NumPy function, ``*args`` and ``**kwargs`` inputs, we need to
search through ``*args`` and ``**kwargs`` for all appropriate inputs
that might have the ``__array_function__`` attribute. Then we need to
select among those possible methods and execute the right one.
Negotiating between several possible implementations can be complex.

Finding arguments
'''''''''''''''''

Valid arguments may be directly in the ``*args`` and ``**kwargs``, such
as in the case for ``np.tensordot(left, right, out=out)``, or they may
be nested within lists or dictionaries, such as in the case of
``np.concatenate([x, y, z])``. This can be problematic for two reasons:

1. Some functions are given long lists of values, and traversing them
   might be prohibitively expensive.
2. Some functions may have arguments that we don't want to inspect, even
   if they have the ``__array_function__`` method.

To resolve these issues, NumPy functions should explicitly indicate which
of their arguments may be overloaded, and how these arguments should be
checked. As a rule, this should include all arguments documented as either
``array_like`` or ``ndarray``.

We propose to do so by writing "dispatcher" functions for each overloaded
NumPy function:

- These functions will be called with the exact same arguments that were passed
  into the NumPy function (i.e., ``dispatcher(*args, **kwargs)``), and should
  return an iterable of arguments to check for overrides.
- Dispatcher functions are required to share the exact same positional,
  optional and keyword-only arguments as their corresponding NumPy functions.
  Otherwise, valid invocations of a NumPy function could result in an error when
  calling its dispatcher.
- Because default *values* for keyword arguments do not have
  ``__array_function__`` attributes, by convention we set all default argument
  values to ``None``. This reduces the likelihood of signatures falling out
  of sync, and minimizes extraneous information in the dispatcher.
  The only exception should be cases where the argument value in some way
  effects dispatching, which should be rare.

An example of the dispatcher for ``np.concatenate`` may be instructive:

.. code:: python

    def _concatenate_dispatcher(arrays, axis=None, out=None):
        for array in arrays:
            yield array
        if out is not None:
            yield out

The concatenate dispatcher is written as generator function, which allows it
to potentially include the value of the optional ``out`` argument without
needing to create a new sequence with the (potentially long) list of objects
to be concatenated.

Trying ``__array_function__`` methods until the right one works
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Many arguments may implement the ``__array_function__`` protocol. Some
of these may decide that, given the available inputs, they are unable to
determine the correct result. How do we call the right one? If several
are valid then which has precedence?

For the most part, the rules for dispatch with ``__array_function__``
match those for ``__array_ufunc__`` (see
`NEP-13 <https://www.numpy.org/neps/nep-0013-ufunc-overrides.html>`_).
In particular:

-  NumPy will gather implementations of ``__array_function__`` from all
   specified inputs and call them in order: subclasses before
   superclasses, and otherwise left to right. Note that in some edge cases
   involving subclasses, this differs slightly from the
   `current behavior <https://bugs.python.org/issue30140>`_ of Python.
-  Implementations of ``__array_function__`` indicate that they can
   handle the operation by returning any value other than
   ``NotImplemented``.
-  If all ``__array_function__`` methods return ``NotImplemented``,
   NumPy will raise ``TypeError``.

If no ``__array_function__`` methods exist, NumPy will default to calling its
own implementation, intended for use on NumPy arrays. This case arises, for
example, when all array-like arguments are Python numbers or lists.
(NumPy arrays do have a ``__array_function__`` method, given below, but it
always returns ``NotImplemented`` if any argument other than a NumPy array
subclass implements ``__array_function__``.)

One deviation from the current behavior of ``__array_ufunc__`` is that NumPy
will only call ``__array_function__`` on the *first* argument of each unique
type. This matches Python's
`rule for calling reflected methods <https://docs.python.org/3/reference/datamodel.html#object.__ror__>`_,
and this ensures that checking overloads has acceptable performance even when
there are a large number of overloaded arguments. To avoid long-term divergence
between these two dispatch protocols, we should
`also update <https://github.com/numpy/numpy/issues/11306>`_
``__array_ufunc__`` to match this behavior.

The ``__array_function__`` method on ``numpy.ndarray``
''''''''''''''''''''''''''''''''''''''''''''''''''''''

The use cases for subclasses with ``__array_function__`` are the same as those
with ``__array_ufunc__``, so ``numpy.ndarray`` also defines a
``__array_function__`` method:

.. code:: python

    def __array_function__(self, func, types, args, kwargs):
        if not all(issubclass(t, ndarray) for t in types):
            # Defer to any non-subclasses that implement __array_function__
            return NotImplemented

        # Use NumPy's private implementation without __array_function__
        # dispatching
        return func._implementation(*args, **kwargs)

This method matches NumPy's dispatching rules, so for most part it is
possible to pretend that ``ndarray.__array_function__`` does not exist.
The private ``_implementation`` attribute, defined below in the
``array_function_dispatch`` decorator, allows us to avoid the special cases for
NumPy arrays that were needed in the ``__array_ufunc__`` protocol.

The ``__array_function__`` protocol always calls subclasses before
superclasses, so if any ``ndarray`` subclasses are involved in an operation,
they will get the chance to override it, just as if any other argument
overrides ``__array_function__``. But the default behavior in an operation
that combines a base NumPy array and a subclass is different: if the subclass
returns ``NotImplemented``, NumPy's implementation of the function will be
called instead of raising an exception. This is appropriate since subclasses
are `expected to be substitutable <https://en.wikipedia.org/wiki/Liskov_substitution_principle>`_.

We still caution authors of subclasses to exercise caution when relying
upon details of NumPy's internal implementations. It is not always possible to
write a perfectly substitutable ndarray subclass, e.g., in cases involving the
creation of new arrays, not least because NumPy makes use of internal
optimizations specialized to base NumPy arrays, e.g., code written in C. Even
if NumPy's implementation happens to work today, it may not work in the future.
In these cases, your recourse is to re-implement top-level NumPy functions via
``__array_function__`` on your subclass.

Changes within NumPy functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given a function defining the above behavior, for now call it
``implement_array_function``, we now need to call that
function from within every relevant NumPy function. This is a pervasive change,
but of fairly simple and innocuous code that should complete quickly and
without effect if no arguments implement the ``__array_function__``
protocol.

To achieve this, we define a ``array_function_dispatch`` decorator to rewrite
NumPy functions. The basic implementation is as follows:

.. code:: python

    def array_function_dispatch(dispatcher, module=None):
        """Wrap a function for dispatch with the __array_function__ protocol."""
        def decorator(implementation):
            @functools.wraps(implementation)
            def public_api(*args, **kwargs):
                relevant_args = dispatcher(*args, **kwargs)
                return implement_array_function(
                    implementation, public_api, relevant_args, args, kwargs)
            if module is not None:
                public_api.__module__ = module
            # for ndarray.__array_function__
            public_api._implementation = implementation
            return public_api
        return decorator

    # example usage
    def _broadcast_to_dispatcher(array, shape, subok=None):
        return (array,)

    @array_function_dispatch(_broadcast_to_dispatcher, module='numpy')
    def broadcast_to(array, shape, subok=False):
        ...  # existing definition of np.broadcast_to

Using a decorator is great! We don't need to change the definitions of
existing NumPy functions, and only need to write a few additional lines
for the dispatcher function. We could even reuse a single dispatcher for
families of functions with the same signature (e.g., ``sum`` and ``prod``).
For such functions, the largest change could be adding a few lines to the
docstring to note which arguments are checked for overloads.

It's particularly worth calling out the decorator's use of
``functools.wraps``:

- This ensures that the wrapped function has the same name and docstring as
  the wrapped NumPy function.
- On Python 3, it also ensures that the decorator function copies the original
  function signature, which is important for introspection based tools such as
  auto-complete.
- Finally, it ensures that the wrapped function
  `can be pickled <http://gael-varoquaux.info/programming/decoration-in-python-done-right-decorating-and-pickling.html>`_.

The example usage illustrates several best practices for writing dispatchers
relevant to NumPy contributors:

- We passed the ``module`` argument, which in turn sets the  ``__module__``
  attribute on the generated function. This is for the benefit of better error
  messages, here for errors raised internally by NumPy when no implementation
  is found, e.g.,
  ``TypeError: no implementation found for 'numpy.broadcast_to'``. Setting
  ``__module__`` to the canonical location in NumPy's public API encourages
  users to use NumPy's public API for identifying functions in
  ``__array_function__``.

- The dispatcher is a function that returns a tuple, rather than an equivalent
  (and equally valid) generator using ``yield``:

  .. code:: python

    # example usage
    def broadcast_to(array, shape, subok=None):
        yield array

  This is no accident: NumPy's implementation of dispatch for
  ``__array_function__`` is fastest when dispatcher functions return a builtin
  sequence type (``tuple`` or ``list``).

  On a related note, it's perfectly fine for dispatchers to return arguments
  even if in some cases you *know* that they cannot have an
  ``__array_function__`` method. This can arise for functions with default
  arguments (e.g., ``None``) or complex signatures. NumPy's dispatching logic
  sorts out these cases very quickly, so it generally is not worth the trouble
  of parsing them on your own.

.. note::

    The code for ``array_function_dispatch`` above has been updated from the
    original version of this NEP to match the actual
    `implementation in NumPy <https://github.com/numpy/numpy/blob/e104f03ac8f65ae5b92a9b413b0fa639f39e6de2/numpy/core/overrides.py>`_.

Extensibility
~~~~~~~~~~~~~

An important virtue of this approach is that it allows for adding new
optional arguments to NumPy functions without breaking code that already
relies on ``__array_function__``.

This is not a theoretical concern. NumPy's older, haphazard implementation of
overrides *within* functions like ``np.sum()`` necessitated some awkward
gymnastics when we decided to add new optional arguments, e.g., the new
``keepdims`` argument is only passed in cases where it is used:

.. code:: python

    def sum(array, ..., keepdims=np._NoValue):
        kwargs = {}
        if keepdims is not np._NoValue:
            kwargs['keepdims'] = keepdims
        return array.sum(..., **kwargs)

For ``__array_function__`` implementors, this also means that it is possible
to implement even existing optional arguments incrementally, and only in cases
where it makes sense. For example, a library implementing immutable arrays
would not be required to explicitly include an unsupported ``out`` argument in
the function signature. This can be somewhat onerous to implement properly,
e.g.,

.. code:: python

    def my_sum(array, ..., out=None):
        if out is not None:
            raise TypeError('out argument is not supported')
        ...

We thus avoid encouraging the tempting shortcut of adding catch-all
``**ignored_kwargs`` to the signatures of functions called by NumPy, which fails
silently for misspelled or ignored arguments.

Performance
~~~~~~~~~~~

Performance is always a concern with NumPy, even though NumPy users have
already prioritized usability over pure speed with their choice of the Python
language itself. It's important that this new ``__array_function__`` protocol
not impose a significant cost in the typical case of NumPy functions acting
on NumPy arrays.

Our `microbenchmark results <https://nbviewer.jupyter.org/gist/shoyer/1f0a308a06cd96df20879a1ddb8f0006>`_
show that a pure Python implementation of the override machinery described
above adds roughly 2-3 microseconds of overhead to each NumPy function call
without any overloaded arguments. For context, typical NumPy functions on small
arrays have a runtime of 1-10 microseconds, mostly determined by what fraction
of the function's logic is written in C. For example, one microsecond is about
the difference in speed between the ``ndarray.sum()`` method (1.6 us) and
``numpy.sum()`` function (2.6 us).

Fortunately, we expect significantly less overhead with a C implementation of
``implement_array_function``, which is where the bulk of the
runtime is. This would leave the ``array_function_dispatch`` decorator and
dispatcher function on their own adding about 0.5 microseconds of overhead,
for perhaps ~1 microsecond of overhead in the typical case.

In our view, this level of overhead is reasonable to accept for code written
in Python. We're pretty sure that the vast majority of NumPy users aren't
concerned about performance differences measured in microsecond(s) on NumPy
functions, because it's difficult to do *anything* in Python in less than a
microsecond.

Use outside of NumPy
~~~~~~~~~~~~~~~~~~~~

Nothing about this protocol that is particular to NumPy itself. Should
we encourage use of the same ``__array_function__`` protocol third-party
libraries for overloading non-NumPy functions, e.g., for making
array-implementation generic functionality in SciPy?

This would offer significant advantages (SciPy wouldn't need to invent
its own dispatch system) and no downsides that we can think of, because
every function that dispatches with ``__array_function__`` already needs
to be explicitly recognized. Libraries like Dask, CuPy, and Autograd
already wrap a limited subset of SciPy functionality (e.g.,
``scipy.linalg``) similarly to how they wrap NumPy.

If we want to do this, we should expose at least the decorator
``array_function_dispatch()`` and possibly also the lower level
``implement_array_function()`` as part of NumPy's public API.

Non-goals
---------

We are aiming for basic strategy that can be relatively mechanistically
applied to almost all functions in NumPy's API in a relatively short
period of time, the development cycle of a single NumPy release.

We hope to get both the ``__array_function__`` protocol and all specific
overloads right on the first try, but our explicit aim here is to get
something that mostly works (and can be iterated upon), rather than to
wait for an optimal implementation. The price of moving fast is that for
now **this protocol should be considered strictly experimental**. We
reserve the right to change the details of this protocol and how
specific NumPy functions use it at any time in the future -- even in
otherwise bug-fix only releases of NumPy. In practice, once initial
issues with ``__array_function__`` are worked out, we will use abbreviated
deprecation cycles as short as a single major NumPy release (e.g., as
little as four months).

In particular, we don't plan to write additional NEPs that list all
specific functions to overload, with exactly how they should be
overloaded. We will leave this up to the discretion of committers on
individual pull requests, trusting that they will surface any
controversies for discussion by interested parties.

However, we already know several families of functions that should be
explicitly exclude from ``__array_function__``. These will need their
own protocols:

-  universal functions, which already have their own protocol.
-  ``array`` and ``asarray``, because they are explicitly intended for
   coercion to actual ``numpy.ndarray`` object.
-  dispatch for methods of any kind, e.g., methods on
   ``np.random.RandomState`` objects.

We also expect that the mechanism for overriding specific functions
that will initially use the ``__array_function__`` protocol can and will
change in the future. As a concrete example of how we expect to break
behavior in the future, some functions such as ``np.where`` are currently
not NumPy universal functions, but conceivably could become universal
functions in the future. When/if this happens, we will change such overloads
from using ``__array_function__`` to the more specialized ``__array_ufunc__``.


Backward compatibility
----------------------

This proposal does not change existing semantics, except for those arguments
that currently have ``__array_function__`` attributes, which should be rare.


Alternatives
------------

Specialized protocols
~~~~~~~~~~~~~~~~~~~~~

We could (and should) continue to develop protocols like
``__array_ufunc__`` for cohesive subsets of NumPy functionality.

As mentioned above, if this means that some functions that we overload
with ``__array_function__`` should switch to a new protocol instead,
that is explicitly OK for as long as ``__array_function__`` retains its
experimental status.

Switching to a new protocol should use an abbreviated version of NumPy's
normal deprecation cycle:

- For a single major release, after checking for any new protocols, NumPy
  should still check for ``__array_function__`` methods that implement the
  given function. If any argument returns a value other than
  ``NotImplemented`` from ``__array_function__``, a descriptive
  ``FutureWarning`` should be issued.
- In the next major release, the checks for ``__array_function__`` will be
  removed.

Separate namespace
~~~~~~~~~~~~~~~~~~

A separate namespace for overloaded functions is another possibility,
either inside or outside of NumPy.

This has the advantage of alleviating any possible concerns about
backwards compatibility and would provide the maximum freedom for quick
experimentation. In the long term, it would provide a clean abstraction
layer, separating NumPy's high level API from default implementations on
``numpy.ndarray`` objects.

The downsides are that this would require an explicit opt-in from all
existing code, e.g., ``import numpy.api as np``, and in the long term
would result in the maintenance of two separate NumPy APIs. Also, many
functions from ``numpy`` itself are already overloaded (but
inadequately), so confusion about high vs. low level APIs in NumPy would
still persist.

Alternatively, a separate namespace, e.g., ``numpy.array_only``, could be
created for a non-overloaded version of NumPy's high level API, for cases
where performance with NumPy arrays is a critical concern. This has most
of the same downsides as the separate namespace.

Multiple dispatch
~~~~~~~~~~~~~~~~~

An alternative to our suggestion of the ``__array_function__`` protocol
would be implementing NumPy's core functions as
`multi-methods <https://en.wikipedia.org/wiki/Multiple_dispatch>`_.
Although one of us wrote a `multiple dispatch
library <https://github.com/mrocklin/multipledispatch>`_ for Python, we
don't think this approach makes sense for NumPy in the near term.

The main reason is that NumPy already has a well-proven dispatching
mechanism with ``__array_ufunc__``, based on Python's own dispatching
system for arithmetic, and it would be confusing to add another
mechanism that works in a very different way. This would also be more
invasive change to NumPy itself, which would need to gain a multiple
dispatch implementation.

It is possible that multiple dispatch implementation for NumPy's high
level API could make sense in the future. Fortunately,
``__array_function__`` does not preclude this possibility, because it
would be straightforward to write a shim for a default
``__array_function__`` implementation in terms of multiple dispatch.

Implementations in terms of a limited core API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The internal implementation of some NumPy functions is extremely simple.
For example:

- ``np.stack()`` is implemented in only a few lines of code by combining
  indexing with ``np.newaxis``, ``np.concatenate`` and the ``shape`` attribute.
- ``np.mean()`` is implemented internally in terms of ``np.sum()``,
  ``np.divide()``, ``.astype()`` and ``.shape``.

This suggests the possibility of defining a minimal "core" ndarray
interface, and relying upon it internally in NumPy to implement the full
API. This is an attractive option, because it could significantly reduce
the work required for new array implementations.

However, this also comes with several downsides:

1. The details of how NumPy implements a high-level function in terms of
   overloaded functions now becomes an implicit part of NumPy's public API. For
   example, refactoring ``stack`` to use ``np.block()`` instead of
   ``np.concatenate()`` internally would now become a breaking change.
2. Array libraries may prefer to implement high level functions differently than
   NumPy. For example, a library might prefer to implement a fundamental
   operations like ``mean()`` directly rather than relying on ``sum()`` followed
   by division. More generally, it's not clear yet what exactly qualifies as
   core functionality, and figuring this out could be a large project.
3. We don't yet have an overloading system for attributes and methods on array
   objects, e.g., for accessing ``.dtype`` and ``.shape``. This should be the
   subject of a future NEP, but until then we should be reluctant to rely on
   these properties.

Given these concerns, we think it's valuable to support explicit overloading of
nearly every public function in NumPy's API. This does not preclude the future
possibility of rewriting NumPy functions in terms of simplified core
functionality with ``__array_function__`` and a protocol and/or base class for
ensuring that arrays expose methods and properties like ``numpy.ndarray``.
However, to work well this would require the possibility of implementing
*some* but not all functions with ``__array_function__``, e.g., as described
in the next section.

Partial implementation of NumPy's API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With the current design, classes that implement ``__array_function__``
to overload at least one function implicitly declare an intent to
implement the entire NumPy API. It's not possible to implement *only*
``np.concatenate()`` on a type, but fall back to NumPy's default
behavior of casting with ``np.asarray()`` for all other functions.

This could present a backwards compatibility concern that would
discourage libraries from adopting ``__array_function__`` in an
incremental fashion. For example, currently most numpy functions will
implicitly convert ``pandas.Series`` objects into NumPy arrays, behavior
that assuredly many pandas users rely on. If pandas implemented
``__array_function__`` only for ``np.concatenate``, unrelated NumPy
functions like ``np.nanmean`` would suddenly break on pandas objects by
raising TypeError.

Even libraries that reimplement most of NumPy's public API sometimes rely upon
using utility functions from NumPy without a wrapper. For example, both CuPy
and JAX simply `use an alias <https://github.com/numpy/numpy/issues/12974>`_ to
``np.result_type``, which already supports duck-types with a ``dtype``
attribute.

With ``__array_ufunc__``, it's possible to alleviate this concern by
casting all arguments to numpy arrays and re-calling the ufunc, but the
heterogeneous function signatures supported by ``__array_function__``
make it impossible to implement this generic fallback behavior for
``__array_function__``.

We considered three possible ways to resolve this issue, but none were
entirely satisfactory:

1. Change the meaning of all arguments returning ``NotImplemented`` from
   ``__array_function__`` to indicate that all arguments should be coerced to
   NumPy arrays and the operation should be retried. However, many array
   libraries (e.g., scipy.sparse) really don't want implicit conversions to
   NumPy arrays, and often avoid implementing ``__array__`` for exactly this
   reason. Implicit conversions can result in silent bugs and performance
   degradation.

   Potentially, we could enable this behavior only for types that implement
   ``__array__``, which would resolve the most problematic cases like
   scipy.sparse. But in practice, a large fraction of classes that present a
   high level API like NumPy arrays already implement ``__array__``. This would
   preclude reliable use of NumPy's high level API on these objects.

2. Use another sentinel value of some sort, e.g.,
   ``np.NotImplementedButCoercible``, to indicate that a class implementing
   part of NumPy's higher level array API is coercible as a fallback. If all
   arguments return ``NotImplementedButCoercible``, arguments would be coerced
   and the operation would be retried.

   Unfortunately, correct behavior after encountering
   ``NotImplementedButCoercible`` is not always obvious. Particularly
   challenging is the "mixed" case where some arguments return
   ``NotImplementedButCoercible`` and others return ``NotImplemented``.
   Would dispatching be retried after only coercing the "coercible" arguments?
   If so, then conceivably we could end up looping through the dispatching
   logic an arbitrary number of times. Either way, the dispatching rules would
   definitely get more complex and harder to reason about.

3. Allow access to NumPy's implementation of functions, e.g., in the form of
   a publicly exposed ``__skip_array_function__`` attribute on the NumPy
   functions. This would allow for falling back to NumPy's implementation by
   using ``func.__skip_array_function__`` inside ``__array_function__``
   methods, and could also potentially be used to be used to avoid the
   overhead of dispatching. However, it runs the risk of potentially exposing
   details of NumPy's implementations for NumPy functions that do not call
   ``np.asarray()`` internally. See
   `this note <https://mail.python.org/pipermail/numpy-discussion/2019-May/079541.html>`_
   for a summary of the full discussion.

These solutions would solve real use cases, but at the cost of additional
complexity. We would like to gain experience with how ``__array_function__`` is
actually used before making decisions that would be difficult to roll back.

A magic decorator that inspects type annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In principle, Python 3 type annotations contain sufficient information to
automatically create most ``dispatcher`` functions. It would be convenient to
use these annotations to dispense with the need for manually writing
dispatchers, e.g.,

.. code:: python

    @array_function_dispatch
    def broadcast_to(array: ArrayLike
                     shape: Tuple[int, ...],
                     subok: bool = False):
        ...  # existing definition of np.broadcast_to

This would require some form of automatic code generation, either at compile or
import time.

We think this is an interesting possible extension to consider in the future. We
don't think it makes sense to do so now, because code generation involves
tradeoffs and NumPy's experience with type annotations is still
`quite limited <https://github.com/numpy/numpy-stubs>`_. Even if NumPy
was Python 3 only (which will happen
`sometime in 2019 <http://www.numpy.org/neps/nep-0014-dropping-python2.7-proposal.html>`_),
we aren't ready to annotate NumPy's codebase directly yet.

Support for implementation-specific arguments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We could allow ``__array_function__`` implementations to add their own
optional keyword arguments by including ``**ignored_kwargs`` in dispatcher
functions, e.g.,

.. code:: python

    def _concatenate_dispatcher(arrays, axis=None, out=None, **ignored_kwargs):
        ...  # same implementation of _concatenate_dispatcher as above

Implementation-specific arguments are somewhat common in libraries that
otherwise emulate NumPy's higher level API (e.g., ``dask.array.sum()`` adds
``split_every`` and ``tensorflow.reduce_sum()`` adds ``name``). Supporting
them in NumPy would be particularly useful for libraries that implement new
high-level array functions on top of NumPy functions, e.g.,

.. code:: python

    def mean_squared_error(x, y, **kwargs):
        return np.mean((x - y) ** 2, **kwargs)

Otherwise, we would need separate versions of ``mean_squared_error`` for each
array implementation in order to pass implementation-specific arguments to
``mean()``.

We wouldn't allow adding optional positional arguments, because these are
reserved for future use by NumPy itself, but conflicts between keyword arguments
should be relatively rare.

However, this flexibility would come with a cost. In particular, it implicitly
adds ``**kwargs`` to the signature for all wrapped NumPy functions without
actually including it (because we use ``functools.wraps``). This means it is
unlikely to work well with static analysis tools, which could report invalid
arguments. Likewise, there is a price in readability: these optional arguments
won't be included in the docstrings for NumPy functions.

It's not clear that this tradeoff is worth it, so we propose to leave this out
for now. Adding implementation-specific arguments will require using those
libraries directly.

Other possible choices for the protocol
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The array function ``__array_function__`` includes only two arguments, ``func``
and ``types``, that provide information about the context of the function call.

``func`` is part of the protocol because there is no way to avoid it:
implementations need to be able to dispatch by matching a function to NumPy's
public API.

``types`` is included because we can compute it almost for free as part of
collecting ``__array_function__`` implementations to call in
``implement_array_function``. We also think it will be used
by many ``__array_function__`` methods, which otherwise would need to extract
this information themselves. It would be equivalently easy to provide single
instances of each type, but providing only types seemed cleaner.

Taking this even further, it was suggested that ``__array_function__`` should be
a ``classmethod``. We agree that it would be a little cleaner to remove the
redundant ``self`` argument, but feel that this minor clean-up would not be
worth breaking from the precedence of ``__array_ufunc__``.

There are two other arguments that we think *might* be important to pass to
``__array_ufunc__`` implementations:

- Access to the non-dispatched implementation (i.e., before wrapping with
  ``array_function_dispatch``) in ``ndarray.__array_function__`` would allow
  us to drop special case logic for that method from
  ``implement_array_function``.
- Access to the ``dispatcher`` function passed into
  ``array_function_dispatch()`` would allow ``__array_function__``
  implementations to determine the list of "array-like" arguments in a generic
  way by calling ``dispatcher(*args, **kwargs)``. This *could* be useful for
  ``__array_function__`` implementations that dispatch based on the value of an
  array attribute (e.g., ``dtype`` or ``units``) rather than directly on the
  array type.

We have left these out for now, because we don't know that they are necessary.
If we want to include them in the future, the easiest way to do so would be to
update the ``array_function_dispatch`` decorator to add them as function
attributes.

Callable objects generated at runtime
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NumPy has some APIs that define callable objects *dynamically*, such as
``vectorize`` and methods on ``random.RandomState`` object. Examples can
also be found in other core libraries in the scientific Python stack, e.g.,
distribution objects in scipy.stats and model objects in scikit-learn. It would
be nice to be able to write overloads for such callables, too. This presents a
challenge for the ``__array_function__`` protocol, because unlike the case for
functions there is no public object in the ``numpy`` namespace to pass into
the ``func`` argument.

We could potentially handle this by establishing an alternative convention
for how the ``func`` argument could be inspected, e.g., by using
``func.__self__`` to obtain the class object and ``func.__func__`` to return
the unbound function object. However, some caution is in order, because
this would immesh what are currently implementation details as a permanent
features of the interface, such as the fact that ``vectorize`` is implemented as a
class rather than closure, or whether a method is implemented directly or using
a descriptor.

Given the complexity and the limited use cases, we are also deferring on this
issue for now, but we are confident that ``__array_function__`` could be
expanded to accommodate these use cases in the future if need be.

Discussion
----------

Various alternatives to this proposal were discussed in a few GitHub issues:

1. `pydata/sparse #1 <https://github.com/pydata/sparse/issues/1>`_
2. `numpy/numpy #11129 <https://github.com/numpy/numpy/issues/11129>`_

Additionally it was the subject of `a blogpost
<http://matthewrocklin.com/blog/work/2018/05/27/beyond-numpy>`_. Following this
it was discussed at a `NumPy developer sprint
<https://scisprints.github.io/#may-numpy-developer-sprint>`_ at the `UC
Berkeley Institute for Data Science (BIDS) <https://bids.berkeley.edu/>`_.

Detailed discussion of this proposal itself can be found on the
`the mailing list <https://mail.python.org/pipermail/numpy-discussion/2018-June/078127.html>`_ and relevant pull requests
(`1 <https://github.com/numpy/numpy/pull/11189>`_,
`2 <https://github.com/numpy/numpy/pull/11303#issuecomment-396638175>`_,
`3 <https://github.com/numpy/numpy/pull/11374>`_)

Copyright
---------

This document has been placed in the public domain.