author: Sebastian Berg <sebastian@sipsolutions.net> 2022-02-28 14:01:33 -0800
committer: Sebastian Berg <sebastian@sipsolutions.net> 2022-05-25 07:24:42 -0700
commit: 90433becf2f4ff20bd0f2f965f0d50befd87cf1f
tree: 6b4b358ab2e5e09b52cf113cdefcdd2be598d7c4 /doc/neps
parent: 27733042aed1338347469dbf0e916518139bb112
DOC: Fixups and brief "weakly typed arrays" discussion/alternative
Based on Tyler's and Jake Vanderplas' comments.
Diffstat (limited to 'doc/neps')
-rw-r--r-- doc/neps/nep-0050-scalar-promotion.rst 85
1 files changed, 55 insertions, 30 deletions
diff --git a/doc/neps/nep-0050-scalar-promotion.rst b/doc/neps/nep-0050-scalar-promotion.rst
index 8045f7e48..45edc4013 100644
--- a/doc/neps/nep-0050-scalar-promotion.rst
+++ b/doc/neps/nep-0050-scalar-promotion.rst
@@ -42,16 +42,16 @@ There are two main ways this can lead to confusing results:
Note that the examples apply just as well for operations like multiplication,
addition, or comparisons and the corresponding functions like `np.multiply`.
-This NEPs proposes to refactor the behaviour around two guiding principles:
+This NEP proposes to refactor the behaviour around two guiding principles:
1. The value must never influence the result type.
2. NumPy scalars or 0-D arrays must always lead to the same behaviour as
their N-D counterparts.
-We propose to removes all value-based logic and add special handling for
+We propose to remove all value-based logic and add special handling for
Python scalars to preserve some of the convenience that it provided.
-This changes also apply to ``np.can_cast(100, np.int8)``, however, we expect
+These changes also apply to ``np.can_cast(100, np.int8)``, however, we expect
that the behaviour in functions (promotion) will in practice be far more
relevant than this casting change.
@@ -102,14 +102,16 @@ the following changes.
to some degree, but this is not currently planned.
-Impact on operators functions involving NumPy arrays or scalars
----------------------------------------------------------------
+Impact on operators and functions involving NumPy arrays or scalars
+-------------------------------------------------------------------
The main impact on operations not involving Python scalars (float, int, complex)
will be that 0-D arrays and NumPy scalars will never behave value-sensitive.
This removes currently surprising cases. For example::
np.arange(10, dtype=np.uint8) + np.int64(1)
+ # and:
+ np.add(np.arange(10, dtype=np.uint8), np.int64(1))
Will return an int64 array because the type of ``np.int64(1)`` is strictly
honoured.
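The rule above can be illustrated with a toy model in plain Python. The
``promote_ints`` helper and the ``(is_signed, bits)`` encoding are assumptions
made for this sketch and are not NumPy API; the point is only that the result
depends on the two types and never on the value ``1``.

```python
# Toy model of value-independent integer promotion (not NumPy's implementation).
# A dtype is modelled as an (is_signed, bits) pair.

def promote_ints(a, b):
    """Return the smallest modelled dtype that can hold all values of a and b."""
    (a_signed, a_bits), (b_signed, b_bits) = a, b
    if a_signed == b_signed:
        return (a_signed, max(a_bits, b_bits))
    # Mixed signedness: pick out the signed and the unsigned operand.
    s_bits = a_bits if a_signed else b_bits
    u_bits = b_bits if a_signed else a_bits
    if s_bits > u_bits:
        return (True, s_bits)
    if u_bits >= 64:
        raise TypeError("no modelled integer type holds both operands")
    return (True, 2 * u_bits)

UINT8 = (False, 8)
INT8 = (True, 8)
INT64 = (True, 64)

# uint8 array + int64 scalar: the scalar's type is honoured, whatever its value.
result = promote_ints(UINT8, INT64)
```

Here ``result`` models int64, matching the example above, and the function
never receives the scalar's value at all.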
@@ -118,49 +120,40 @@ honoured.
Impact on operators involving Python ``int``, ``float``, and ``complex``
------------------------------------------------------------------------
-This NEP attempts to preserve most of the convenience that the old behaviour
-gave for Python operators, but remove it for
-
-The current value-based logic has some nice properties when "untyped" Python
-scalars involved::
+This NEP attempts to preserve the convenience that the old behaviour
+gave when working with literal values.
+The current value-based logic had some nice properties when "untyped",
+literal Python scalars are involved::
np.arange(10, dtype=np.int8) + 1 # returns an int8 array
- nparray([1., 2.], dtype=np.float32) * 3.5 # returns a float32 array
+ np.array([1., 2.], dtype=np.float32) * 3.5 # returns a float32 array
But led to complexity when it came to "unrepresentable" values:
np.arange(10, dtype=np.int8) + 256 # returns int16
- nparray([1., 2.], dtype=np.float32) * 1e200 # returns float64
+ np.array([1., 2.], dtype=np.float32) * 1e200 # returns float64
The proposal is to preserve this behaviour for the most part. This is achieved
by considering Python ``int``, ``float``, and ``complex`` to be "weakly" typed
in these operations.
-To mitigate user surprises, we would further make conversion to the new type
-more strict. This means that the results will be unchanged in the first
+However, to mitigate user surprises, we plan to make conversion to the new type
+more strict: This means that the results will be unchanged in the first
two examples. For the second one, the results will be the following::
np.arange(10, dtype=np.int8) + 256 # raises a TypeError
- nparray([1., 2.], dtype=np.float32) * 1e200 # warning and returns infinity
+ np.array([1., 2.], dtype=np.float32) * 1e200 # warning and returns infinity
The second one will warn because ``np.float32(1e200)`` overflows to infinity.
It will then do the calculation with ``inf`` as normally.
-Impact on functions involving Python ``int``, ``float``, and ``complex``
-------------------------------------------------------------------------
-
-Most functions, in particular ``ufuncs`` will also use this weakly typed
-logic.
-In some cases, functions will call `np.asarray()` on inputs before any operations
-and thus will s
+.. admonition:: Behaviour in other libraries
-.. note::
+ Overflowing in the conversion rather than raising an error is a choice;
+ it is one that is the default in most C setups (similar to NumPy C can be
+ set up to raise an error due to the overflow, however).
+ It is also for example the behaviour of ``pytorch`` 1.10.
- There is a real alternative to not do this for `ufuncs` and limit the special
- behaviour to Python operators. From a user perspective, we assume that most
- functions effectively call `np.asarray()`.
- Because Python operators allow more custom logic, this would ensure that an
- overflow warning is given for all results with decreased precision.
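The "weak" handling of Python integers described in this section can be
sketched as a toy model in plain Python. The names ``convert_weak_int`` and
``add_weak`` are hypothetical, and the use of ``TypeError`` simply follows the
draft text above; none of this is NumPy's actual code.

```python
# Toy model of "weak" Python scalars (an illustration, not NumPy's code).
INT8_RANGE = (-128, 127)

def convert_weak_int(value, dtype_range):
    """Convert a Python int to the array's dtype, refusing values that do not fit."""
    low, high = dtype_range
    if not low <= value <= high:
        # The exception type mirrors the draft text; it is otherwise an
        # assumption of this sketch.
        raise TypeError(f"Python int {value} does not fit into [{low}, {high}]")
    return value

def add_weak(array_values, scalar, dtype_range):
    """Add a Python int to int8-like data; the result keeps the array's type."""
    s = convert_weak_int(scalar, dtype_range)
    # Overflow of individual elements is deliberately ignored in this toy.
    return [v + s for v in array_values]

ok = add_weak(range(10), 1, INT8_RANGE)       # the scalar adopts the int8 range
try:
    add_weak(range(10), 256, INT8_RANGE)      # 256 cannot be converted to int8
    rejected = False
except TypeError:
    rejected = True
```

The design choice illustrated here is that the scalar is converted to the
array's type first, so a value that does not fit is reported instead of
silently changing the result type.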
Backward compatibility
@@ -192,7 +185,7 @@ scalar operators.
Similarly, if the storage array is float32 a calculation may retain the lower
float32 precision rather than use the default float64.
-Further issues can occure. For example:
+Further issues can occur. For example:
* Floating point comparisons, especially equality, may change when mixing
precisions:
@@ -205,7 +198,7 @@ Further issues can occure. For example:
np.array([1], np.uint8) == 1000 # possibly also
```
to protect users in cases where previous value-based casting led to an
- upcast.
+ upcast. (Failures occur when converting ``1000`` to a ``uint8``.)
* Floating point overflow may occur in odder cases:
```python3
np.float32(1e-30) * 1e50 # will return ``inf`` and a warning
@@ -418,6 +411,37 @@ or even ignore the "unsafe" conversion which (on all relevant hardware) would
lead to ``np.uint8(1000) == np.uint8(232)`` being used.
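The value ``232`` quoted above is ``1000`` reduced modulo ``2**8``. A quick
check in plain Python, where ``ctypes.c_uint8`` serves only as a stand-in for
an unchecked C conversion and is not how NumPy performs the cast:

```python
import ctypes

# Arithmetic wrap-around for an 8-bit unsigned integer.
wrapped = 1000 % 2**8

# ctypes integer types perform no overflow checking, so the value wraps.
c_wrapped = ctypes.c_uint8(1000).value
```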
+Allowing weakly typed arrays
+----------------------------
+
+One problem with having weakly typed Python scalars, but not weakly typed
+arrays is that in many cases ``np.asarray()`` is called indiscriminately on
+inputs. To solve this issue JAX will consider the result of ``np.asarray(1)``
+also to be weakly typed.
+There are, however, two difficulties with this:
+
+1. JAX noticed that it can be confusing that::
+
+ np.broadcast_to(np.asarray(1), (100, 100))
+
+ is a non 0-D array that "inherits" the weak typing. [2]_
+2. Unlike JAX tensors, NumPy arrays are mutable, so assignment may need to
+ cause it to be strongly typed?
+
+A flag will likely be useful as an implementation detail (e.g. in ufuncs),
+however, as of now we do not expect to have this as user API.
+The main reason is that such a flag may be surprising for users if it is
+passed out as a result from a function, rather than used only very localized.
+
+
+.. admonition:: TODO
+
+ Before accepting the NEP it may be good to discuss this issue further.
+ Libraries may need clearer patterns to "propagate" the "weak" type, this
+ could just be an ``np.asarray_or_literal()`` to preserve Python scalars,
+ or a pattern of calling ``np.result_type()`` before ``np.asarray()``.
+
+
Discussion
==========
@@ -436,6 +460,7 @@ References and Footnotes
.. _JAX promotion: https://jax.readthedocs.io/en/latest/type_promotion.html
+.. [2] https://github.com/numpy/numpy/pull/21103/files#r814188019
Copyright
=========