author: Sebastian Berg <sebastian@sipsolutions.net> 2020-11-16 10:48:23 -0600
committer: GitHub <noreply@github.com> 2020-11-16 10:48:23 -0600
commit: 5da4a8e1835a11d5a03b715e9c0afe3bb96c883b
tree: 6aeb45c2beabbeed2bf8809a8c528e7c100ec8b4 /doc
parent: 360ba0572483457837992d711a0a00580741fc88
parent: 27e67ce1d2cdf85f29e05b88131731b93673e57c
Merge pull request #17725 from pentschev/nep-35-downstream-like-instructions
NEP: Add NEP-35 instructions on reading like= downstream
Diffstat (limited to 'doc')
-rw-r--r-- doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst | 117
1 files changed, 102 insertions, 15 deletions
diff --git a/doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst b/doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst
index dca8b2418..5ec01081a 100644
--- a/doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst
+++ b/doc/neps/nep-0035-array-creation-dispatch-with-array-function.rst
@@ -8,7 +8,7 @@ NEP 35 — Array Creation Dispatching With __array_function__
:Status: Draft
:Type: Standards Track
:Created: 2019-10-15
-:Updated: 2020-08-17
+:Updated: 2020-11-06
:Resolution:
Abstract
@@ -120,9 +120,9 @@ conversion, ultimately raising a
Now we should look at how a library like Dask could benefit from ``like=``.
Before we understand that, it's important to understand a bit about Dask basics
-and ensures correctness with ``__array_function__``. Note that Dask can perform
-computations on different sorts of objects, like dataframes, bags and arrays,
-here we will focus strictly on arrays, which are the objects we can use
+and how it ensures correctness with ``__array_function__``. Note that Dask can
+perform computations on different sorts of objects, like dataframes, bags and
+arrays, here we will focus strictly on arrays, which are the objects we can use
``__array_function__`` with.
Dask uses a graph computing model, meaning it breaks down a large problem in
@@ -221,11 +221,14 @@ array creation, the new ``like=`` keyword shall be used for the purpose of
dispatching.
Downstream libraries will benefit from the ``like=`` argument without any
-changes to their API, given the argument is of exclusive implementation in
-NumPy. It will still be required that downstream libraries implement the
-``__array_function__`` protocol, as described by NEP 18 [1]_, and appropriately
-introduce the argument to their calls to NumPy array creation functions, as
-exemplified in :ref:`neps.like-kwarg.usage-and-impact`.
+changes to their API, given the argument only needs to be implemented by NumPy.
+Downstream libraries are still allowed to include the ``like=`` argument,
+as it can be useful in some cases; please refer to
+:ref:`neps.like-kwarg.implementation` for details on those cases. It will still
+be required that downstream libraries implement the ``__array_function__``
+protocol, as described by NEP 18 [1]_, and appropriately introduce the argument
+to their calls to NumPy array creation functions, as exemplified in
+:ref:`neps.like-kwarg.usage-and-impact`.
Related work
------------
@@ -235,6 +238,8 @@ protocol's limitation, such as the introduction of the ``__duckarray__``
protocol in NEP 30 [3]_, and the introduction of an overriding mechanism called
``uarray`` by NEP 31 [4]_.
+.. _neps.like-kwarg.implementation:
+
Implementation
--------------
@@ -252,13 +257,20 @@ This newly proposed keyword shall be removed by the ``__array_function__``
mechanism from the keyword dictionary before dispatching. The purpose for this
is twofold:
-1. The object will have no use in the downstream library's implementation; and
-2. Simplifies adoption of array creation by those libraries already opting-in
+1. Simplifies adoption of array creation by those libraries already opting-in
to implement the ``__array_function__`` protocol, thus removing the
- requirement to explicitly opt-in for all array creation functions.
-
-Downstream libraries thus shall _NOT_ include the ``like=`` keyword to their
-array creation APIs, which is a NumPy-exclusive keyword.
+ requirement to explicitly opt-in for all array creation functions; and
+2. Most downstream libraries will have no use for the keyword argument, and
+ those that do may access the reference array by capturing ``self`` from
+ ``__array_function__``.
+
+Downstream libraries are thus not required to include the ``like=`` keyword in
+their array creation APIs. In some cases (e.g., Dask), having the ``like=``
+keyword can be useful, as it would allow the implementation to identify
+array internals. As an example, Dask could benefit from the reference array
+to identify its chunk type (e.g., NumPy, CuPy, Sparse), and thus create a new
+Dask array backed by the same chunk type, something that's not possible unless
+Dask can read the reference array's attributes.
Function Dispatching
~~~~~~~~~~~~~~~~~~~~
@@ -317,6 +329,81 @@ downsides pointed out above we have decided to discard any changes on the Python
side and resolve those issues with a pure-C implementation. Please refer to
[implementation]_ for details.
+Reading the Reference Array Downstream
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As stated at the beginning of the :ref:`neps.like-kwarg.implementation` section,
+``like=`` is not propagated to the downstream library. Nevertheless, it's still
+possible to access it. This requires some changes in the downstream library's
+``__array_function__`` definition, where the ``self`` attribute is in practice
+the array passed via ``like=``. This is the case because we use ``like=`` as the
+dispatching array, unlike other compute functions covered by NEP-18 that usually
+dispatch on the first positional argument.
+
+An example of such use is to create a new Dask array while preserving its
+backend type:
+
+.. code:: python
+ # Returns dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=cupy.ndarray>
+ np.asarray([1, 2, 3], like=da.array(cp.array(())))
+
+ # Returns a cupy.ndarray
+ type(np.asarray([1, 2, 3], like=da.array(cp.array(()))).compute())
+
+Note how above the array is backed by ``chunktype=cupy.ndarray``, and the
+resulting array after computing it is also a ``cupy.ndarray``. If Dask did
+not use the ``like=`` argument via the ``self`` attribute from
+``__array_function__``, the example above would be backed by ``numpy.ndarray``
+instead:
+
+.. code:: python
+ # Returns dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
+ np.asarray([1, 2, 3], like=da.array(cp.array(())))
+
+ # Returns a numpy.ndarray
+ type(np.asarray([1, 2, 3], like=da.array(cp.array(()))).compute())
+
+Given the library would need to rely on ``self`` attribute from
+``__array_function__`` to dispatch the function with the correct reference
+array, we suggest one of two alternatives:
+
+1. Introduce a list of functions in the downstream library that do support the
+ ``like=`` argument and pass ``like=self`` when calling the function; or
+2. Inspect the function's signature and verify whether it includes the
+ ``like=`` argument. Note that this may incur a higher performance penalty
+ and assumes introspection is possible, which may not be the case if the
+ function is a C function.
+
+To make things clearer, let's take a look at how suggestion 2 could be
+implemented in Dask. The current relevant part of ``__array_function__``
+definition in Dask is seen below:
+
+.. code:: python
+    def __array_function__(self, func, types, args, kwargs):
+        # Code not relevant for this example here
+
+        # Dispatch ``da_func`` (da.asarray, for example) with *args and **kwargs
+        da_func(*args, **kwargs)
+
+And this is how the updated code would look:
+
+.. code:: python
+    def __array_function__(self, func, types, args, kwargs):
+        # Code not relevant for this example here
+
+        # Inspect ``da_func``'s signature and store keyword-only arguments
+        import inspect
+        kwonlyargs = inspect.getfullargspec(da_func).kwonlyargs
+
+        # If ``like`` is contained in ``da_func``'s signature, add ``like=self``
+        # to the kwargs dictionary.
+        if 'like' in kwonlyargs:
+            kwargs['like'] = self
+
+        # Dispatch ``da_func`` (da.asarray, for example) with args and kwargs.
+        # Here, kwargs contain ``like=self`` if the function's signature does too.
+        da_func(*args, **kwargs)
+
Alternatives
------------
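
Note on the patch: the two suggestions from the "Reading the Reference Array Downstream" section can be tried out without NumPy or Dask installed. The sketch below is only an illustration under stated assumptions, not the Dask implementation. ``accepts_like``, ``LIKE_AWARE_FUNCTIONS``, ``DuckArray``, ``dispatch``, and the two ``asarray_*`` stand-ins are hypothetical names introduced here. It models suggestion 1 as an explicit registry and suggestion 2 as a signature check that falls back to "no ``like=``" when introspection fails.

```python
import inspect


def accepts_like(func):
    """Return True if ``func`` declares a keyword-only ``like`` parameter."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Introspection can fail, e.g. for some C-implemented callables.
        return False
    param = params.get("like")
    return param is not None and param.kind is inspect.Parameter.KEYWORD_ONLY


# Hypothetical stand-ins for downstream array creation functions.
def asarray_with_like(a, *, like=None):
    return ("asarray", a, like)


def asarray_without_like(a):
    return ("asarray", a)


# Suggestion 1: an explicit registry of functions known to support ``like=``.
LIKE_AWARE_FUNCTIONS = {asarray_with_like}


class DuckArray:
    """Hypothetical array type; only the dispatch step is modelled."""

    def dispatch(self, da_func, args, kwargs, use_registry=False):
        kwargs = dict(kwargs)  # avoid mutating the caller's dictionary
        if use_registry:
            supports_like = da_func in LIKE_AWARE_FUNCTIONS
        else:
            # Suggestion 2: inspect the signature at call time.
            supports_like = accepts_like(da_func)
        if supports_like:
            kwargs["like"] = self
        return da_func(*args, **kwargs)
```

The registry lookup avoids the per-call introspection cost but has to be maintained by hand, whereas the signature check adapts automatically and only works for callables that expose a signature.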