author     Sebastian Berg <sebastian@sipsolutions.net>   2022-02-28 14:01:33 -0800
committer  Sebastian Berg <sebastian@sipsolutions.net>   2022-05-25 07:24:42 -0700
commit     90433becf2f4ff20bd0f2f965f0d50befd87cf1f (patch)
tree       6b4b358ab2e5e09b52cf113cdefcdd2be598d7c4 /doc/neps
parent     27733042aed1338347469dbf0e916518139bb112 (diff)
DOC: Fixups and brief "weakly typed arrays" discussion/alternative
Based on Tyler's and Jake Vanderplas' comments.
Diffstat (limited to 'doc/neps')
-rw-r--r--   doc/neps/nep-0050-scalar-promotion.rst   85
1 file changed, 55 insertions(+), 30 deletions(-)
diff --git a/doc/neps/nep-0050-scalar-promotion.rst b/doc/neps/nep-0050-scalar-promotion.rst
index 8045f7e48..45edc4013 100644
--- a/doc/neps/nep-0050-scalar-promotion.rst
+++ b/doc/neps/nep-0050-scalar-promotion.rst
@@ -42,16 +42,16 @@ There are two main ways this can lead to confusing results:
 Note that the examples apply just as well for operations like multiplication,
 addition, or comparisons and the corresponding functions like `np.multiply`.
 
-This NEPs proposes to refactor the behaviour around two guiding principles:
+This NEP proposes to refactor the behaviour around two guiding principles:
 
 1. The value must never influence the result type.
 2. NumPy scalars or 0-D arrays must always lead to the same behaviour as
    their N-D counterparts.
 
-We propose to removes all value-based logic and add special handling for
+We propose to remove all value-based logic and add special handling for
 Python scalars to preserve some of the convenience that it provided.
 
-This changes also apply to ``np.can_cast(100, np.int8)``, however, we expect
+These changes also apply to ``np.can_cast(100, np.int8)``, however, we expect
 that the behaviour in functions (promotion) will in practice be far more
 relevant than this casting change.
@@ -102,14 +102,16 @@ the following changes.
   to some degree, but this is not currently planned.
 
 
-Impact on operators functions involving NumPy arrays or scalars
----------------------------------------------------------------
+Impact on operators and functions involving NumPy arrays or scalars
+-------------------------------------------------------------------
 
 The main impact on operations not involving Python scalars (float, int, complex)
 will be that 0-D arrays and NumPy scalars will never behave value-sensitive.
 This removes currently surprising cases. For example::
 
     np.arange(10, dtype=np.uint8) + np.int64(1)
+    # and:
+    np.add(np.arange(10, dtype=np.uint8), np.int64(1))
 
 Will return an int64 array because the type of ``np.int64(1)`` is strictly
 honoured.
@@ -118,49 +120,40 @@ honoured.
 Impact on operators involving Python ``int``, ``float``, and ``complex``
 -------------------------------------------------------------------------
 
-This NEP attempts to preserve most of the convenience that the old behaviour
-gave for Python operators, but remove it for
-
-The current value-based logic has some nice properties when "untyped" Python
-scalars involved::
+This NEP attempts to preserve the convenience that the old behaviour
+gave when working with literal values.
+The current value-based logic had some nice properties when "untyped",
+literal Python scalars are involved::
 
     np.arange(10, dtype=np.int8) + 1  # returns an int8 array
-    nparray([1., 2.], dtype=np.float32) * 3.5  # returns a float32 array
+    np.array([1., 2.], dtype=np.float32) * 3.5  # returns a float32 array
 
 But led to complexity when it came to "unrepresentable" values:
 
     np.arange(10, dtype=np.int8) + 256  # returns int16
-    nparray([1., 2.], dtype=np.float32) * 1e200  # returns float64
+    np.array([1., 2.], dtype=np.float32) * 1e200  # returns float64
 
 The proposal is to preserve this behaviour for the most part.  This is
 achieved by considering Python ``int``, ``float``, and ``complex`` to be
 "weakly" typed in these operations.
-To mitigate user surprises, we would further make conversion to the new type
-more strict.  This means that the results will be unchanged in the first
+However, to mitigate user surprises, we plan to make conversion to the new type
+more strict: This means that the results will be unchanged in the first
 two examples. For the second one, the results will be the following::
 
     np.arange(10, dtype=np.int8) + 256  # raises a TypeError
-    nparray([1., 2.], dtype=np.float32) * 1e200  # warning and returns infinity
+    np.array([1., 2.], dtype=np.float32) * 1e200  # warning and returns infinity
 
 The second one will warn because ``np.float32(1e200)`` overflows to infinity.
 It will then do the calculation with ``inf`` as normally.
 
 
-Impact on functions involving Python ``int``, ``float``, and ``complex``
--------------------------------------------------------------------------
-
-Most functions, in particular ``ufuncs`` will also use this weakly typed
-logic.
-In some cases, functions will call `np.asarray()` on inputs before any operations
-and thus will s
+.. admonition:: Behaviour in other libraries
 
-.. note::
+   Overflowing in the conversion rather than raising an error is a choice;
+   it is one that is the default in most C setups (similar to NumPy C can be
+   set up to raise an error due to the overflow, however).
+   It is also for example the behaviour of ``pytorch`` 1.10.
 
-   There is a real alternative to not do this for `ufuncs` and limit the special
-   behaviour to Python operators. From a user perspective, we assume that most
-   functions effectively call `np.asarray()`.
-   Because Python operators allow more custom logic, this would ensure that an
-   overflow warning is given for all results with decreased precision.
 
 
 Backward compatibility
@@ -192,7 +185,7 @@ scalar operators.
 Similarliy, if the storage array is float32 a calculation may retain the lower
 float32 precision rather than use the default float64.
 
-Further issues can occure. For example:
+Further issues can occur. For example:
 
 * Floating point comparisons, especially equality, may change when mixing
   precisions:
@@ -205,7 +198,7 @@ Further issues can occure. For example:
       np.array([1], np.uint8) == 1000  # possibly also
   ```
   to protect users in cases where previous value-based casting led to an
-  upcast.
+  upcast. (Failures occur when converting ``1000`` to a ``uint8``.)
 
 * Floating point overflow may occur in odder cases:
   ```python3
   np.float32(1e-30) * 1e50  # will return ``inf`` and a warning
@@ -418,6 +411,37 @@ or even ignore the "unsafe" conversion which (on all relevant hardware) would
 lead to ``np.uint8(1000) == np.uint8(232)`` being used.
 
 
+Allowing weakly typed arrays
+----------------------------
+
+One problem with having weakly typed Python scalars, but not weakly typed
+arrays is that in many cases ``np.asarray()`` is called indiscriminately on
+inputs. To solve this issue JAX will consider the result of ``np.asarray(1)``
+also to be weakly typed.
+There are, however, two difficulties with this:
+
+1. JAX noticed that it can be confusing that::
+
+       np.broadcast_to(np.asarray(1), (100, 100))
+
+   is a non 0-D array that "inherits" the weak typing. [2]_
+2. Unlike JAX tensors, NumPy arrays are mutable, so assignment may need to
+   cause it to be strongly typed?
+
+A flag will likely be useful as an implementation detail (e.g. in ufuncs),
+however, as of now we do not expect to have this as user API.
+The main reason is that such a flag may be surprising for users if it is
+passed out as a result from a function, rather than used only very locally.
+
+
+.. admonition:: TODO
+
+   Before accepting the NEP it may be good to discuss this issue further.
+   Libraries may need clearer patterns to "propagate" the "weak" type; this
+   could just be an ``np.asarray_or_literal()`` to preserve Python scalars,
+   or a pattern of calling ``np.result_type()`` before ``np.asarray()``.
+
+
 Discussion
 ==========
@@ -436,6 +460,7 @@ References and Footnotes
 
 .. _JAX promotion: https://jax.readthedocs.io/en/latest/type_promotion.html
 
+.. [2] https://github.com/numpy/numpy/pull/21103/files#r814188019
 
 Copyright
 =========
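
As a quick reference, the examples below restate the promotion outcomes this patch describes as plain NumPy expressions. This is a sketch assuming the proposed NEP 50 rules are in effect; NumPy releases that still use value-based promotion return the older results noted in the diff above, so the outcomes appear only as comments.

```python3
import numpy as np

# NumPy scalars and 0-D arrays: their dtype is honoured strictly under the NEP.
np.arange(10, dtype=np.uint8) + np.int64(1)          # proposed: int64 array
np.add(np.arange(10, dtype=np.uint8), np.int64(1))   # proposed: int64 array (same rule)

# Python scalars remain "weakly" typed, so representable literals keep the
# array's dtype, as before.
np.arange(10, dtype=np.int8) + 1                     # int8 array
np.array([1., 2.], dtype=np.float32) * 3.5           # float32 array

# Unrepresentable literals no longer upcast silently under the proposal;
# these lines are left commented out because their behaviour is exactly
# what the NEP changes:
# np.arange(10, dtype=np.int8) + 256                 # proposed: raises TypeError
# np.array([1., 2.], dtype=np.float32) * 1e200       # proposed: warns, returns inf
```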
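The TODO admonition in the diff mentions a pattern of calling ``np.result_type()`` before ``np.asarray()`` so that libraries do not lose the "weak" typing of Python scalars. The helper below is a hypothetical illustration of that ordering; ``promote_then_asarray`` is not an existing or proposed NumPy API.

```python3
import numpy as np

def promote_then_asarray(*values):
    # Hypothetical helper: decide the common dtype while Python scalars are
    # still scalars (so they can participate "weakly"), then convert.
    dtype = np.result_type(*values)
    return [np.asarray(v, dtype=dtype) for v in values]

# The int8 array dtype wins over the small Python literal:
a, b = promote_then_asarray(np.arange(3, dtype=np.int8), 1)
print(a.dtype, b.dtype)      # int8 int8

# Calling np.asarray() first would discard the weak-typing information:
print(np.asarray(1).dtype)   # platform default integer, e.g. int64
```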