author | Matti Picus <matti.picus@gmail.com> | 2021-01-25 09:55:36 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-01-25 09:55:36 +0200 |
commit | bfc7e4347f9b999712768ef7ca8bb3cd0e17dc63 (patch) | |
tree | 36dd8dec1d7998a129ca9dd8dcfe9c129e1d57ed /doc | |
parent | d4e40757ae4dc63a3c9dc8c6546d96abb47996c8 (diff) | |
parent | b97ef1fc36d7547a369928c640cc7246bff1a6ae (diff) | |
download | numpy-bfc7e4347f9b999712768ef7ca8bb3cd0e17dc63.tar.gz |
Merge pull request #18097 from rgommers/nep-backcompat-update
NEP: update backwards compatibility and deprecation policy NEP
Diffstat (limited to 'doc')
-rw-r--r-- | doc/neps/nep-0023-backwards-compatibility.rst | 408 |
1 files changed, 233 insertions, 175 deletions
diff --git a/doc/neps/nep-0023-backwards-compatibility.rst b/doc/neps/nep-0023-backwards-compatibility.rst index c8bd7c180..af5bdab29 100644 --- a/doc/neps/nep-0023-backwards-compatibility.rst +++ b/doc/neps/nep-0023-backwards-compatibility.rst @@ -19,46 +19,222 @@ processes for individual cases where breaking backwards compatibility is considered. -Detailed description +Motivation and Scope -------------------- NumPy has a very large user base. Those users rely on NumPy being stable and the code they write that uses NumPy functionality to keep working. NumPy is also actively maintained and improved -- and sometimes improvements -require, or are made much easier, by breaking backwards compatibility. +require, or are made easier, by breaking backwards compatibility. Finally, there are trade-offs in stability for existing users vs. avoiding errors or having a better user experience for new users. These competing -needs often give rise to heated debates and delays in accepting or rejecting +needs often give rise to long debates and delay accepting or rejecting contributions. This NEP tries to address that by providing a policy as well as examples and rationales for when it is or isn't a good idea to break backwards compatibility. -General principles: - -- Aim not to break users' code unnecessarily. -- Aim never to change code in ways that can result in users silently getting - incorrect results from their previously working code. -- Backwards incompatible changes can be made, provided the benefits outweigh - the costs. -- When assessing the costs, keep in mind that most users do not read the mailing - list, do not look at deprecation warnings, and sometimes wait more than one or - two years before upgrading from their old version. And that NumPy has - many hundreds of thousands or even a couple of million users, so "no one will - do or use this" is very likely incorrect. 
-- Benefits include improved functionality, usability and performance (in order - of importance), as well as lower maintenance cost and improved future - extensibility. -- Bug fixes are exempt from the backwards compatibility policy. However in case - of serious impact on users (e.g. a downstream library doesn't build anymore), - even bug fixes may have to be delayed for one or more releases. -- The Python API and the C API will be treated in the same way. - - -Examples -^^^^^^^^ - -We now discuss a number of concrete examples to illustrate typical issues -and trade-offs. +In addition, this NEP can serve as documentation for users about how the NumPy +project treats backwards compatibility, and the speed at which they can expect +changes to be made. + +In scope for this NEP are: + +- Principles of NumPy's approach to backwards compatibility. +- How to deprecate functionality, and when to remove already deprecated + functionality. +- Decision making process for deprecations and removals. +- How to ensure that users are well informed about any change. + +Out of scope are: + +- Making concrete decisions about deprecations of particular functionality. +- NumPy's versioning scheme. + + +General principles +------------------ + +When considering proposed changes that are backwards incompatible, the +main principles the NumPy developers use when making a decision are: + +1. Changes need to benefit more than they harm users. +2. NumPy is widely used, so breaking changes should be assumed by default to be + harmful. +3. Decisions should be based on how they affect users and downstream packages + and should be based on usage data where possible. It does not matter whether + this use contradicts the documentation or best practices. +4. The possibility of an incorrect result is worse than an error or even crash. 
+ +When assessing the costs of proposed changes, keep in mind that most users do +not read the mailing list, do not notice deprecation warnings, and sometimes +wait more than one or two years before upgrading from their old version. And +that NumPy has millions of users, so "no one will do or use this" is likely +incorrect. + +Benefits of proposed changes can include improved functionality, usability and +performance, as well as lower maintenance cost and improved future +extensibility. + +Fixes for clear bugs are exempt from this backwards compatibility policy. +However, in case of serious impact on users even bug fixes may have to be +delayed for one or more releases. For example, if a downstream library would no +longer build or would give incorrect results. + + +Strategies related to deprecations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Impact assessment +````````````````` + +Getting hard data on the impact of a deprecation is often difficult. Strategies +that can be used to assess such impact include: + +- Use a code search engine ([1]_, [2]_) or static ([3]_) or dynamic ([4]_) code + analysis tools to determine where and how the functionality is used. +- Test prominent downstream libraries against a development build of NumPy + containing the proposed change to get real-world data on its impact. +- Make a change in master and revert it before release if it causes problems. + We encourage other packages to test against NumPy's master branch and if + that's too burdensome, then at least to test pre-releases. This often + turns up issues quickly. + +Alternatives to deprecations +```````````````````````````` + +If the impact is unclear or significant, it is often good to consider +alternatives to deprecations. For example, discouraging use in documentation +only, or moving the documentation for the functionality to a less prominent +place or even removing it completely. 
Commenting on open issues related to it +that they are low priority or labeling them as "wontfix" will also be a signal to +users, and reduce the maintenance effort needing to be spent. + + +Implementing deprecations and removals +-------------------------------------- + +Deprecation warnings are necessary in all cases where functionality +will eventually be removed. If there is no intent to remove functionality, +then it should not be deprecated. A "please don't use this for new code" +in the documentation or other type of warning should be used instead, and the +documentation can be organized such that the preferred alternative is more +prominently shown. + +Deprecations: + +- shall include the version number of the release in which the functionality + was deprecated. +- shall include information on alternatives to the deprecated functionality, or a + reason for the deprecation if no clear alternative is available. Note that + release notes can include longer messages if needed. +- shall use ``DeprecationWarning`` by default, and ``VisibleDeprecationWarning`` + for changes that need attention again after already having been deprecated or + needing extra attention for some reason. +- shall be listed in the release notes of the release where the deprecation is + first present. +- shall not be introduced in micro (bug fix) releases. +- shall set a ``stacklevel``, so the warning appears to come from the correct + place. +- shall be mentioned in the documentation for the functionality. A + ``.. deprecated::`` directive can be used for this. + +Examples of good deprecation warnings (also note the standard form of the comments +above the warning, which helps when grepping): + +.. 
code-block:: python + + # NumPy 1.15.0, 2018-09-02 + warnings.warn('np.asscalar(a) is deprecated since NumPy 1.16.0, use ' + 'a.item() instead', DeprecationWarning, stacklevel=3) + + # NumPy 1.15.0, 2018-02-10 + warnings.warn("Importing from numpy.testing.utils is deprecated " + "since 1.15.0, import from numpy.testing instead.", + DeprecationWarning, stacklevel=2) + + # NumPy 1.14.0, 2017-07-14 + warnings.warn( + "Reading unicode strings without specifying the encoding " + "argument is deprecated since NumPy 1.14.0. Set the encoding, " + "use None for the system default.", + np.VisibleDeprecationWarning, stacklevel=2) + +.. code-block:: C + + /* DEPRECATED 2020-05-13, NumPy 1.20 */ + if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1, + matrix_deprecation_msg, ufunc->name, "first") < 0) { + return NULL; + } + +Removal of deprecated functionality: + +- shall be done after at least 2 releases assuming the current 6-monthly + release cycle; if that changes, there shall be at least 1 year between + deprecation and removal. +- shall be listed in the release notes of the release where the removal happened. +- can be done in any minor, but not bugfix, release. + +For backwards incompatible changes that aren't "deprecate and remove" but for +which code will start behaving differently, a ``FutureWarning`` should be +used. Release notes, mentioning version number and using ``stacklevel`` should +be done in the same way as for deprecation warnings. A ``.. versionchanged::`` +directive shall be used in the documentation after the behaviour change was +made to indicate when the behavior changed: + +.. code-block:: python + + def argsort(self, axis=np._NoValue, ...): + """ + Parameters + ---------- + axis : int, optional + Axis along which to sort. If None, the default, the flattened array + is used. + + .. versionchanged:: 1.13.0 + Previously, the default was documented to be -1, but that was + in error. 
At some future date, the default will change to -1, as + originally intended. + Until then, the axis should be given explicitly when + ``arr.ndim > 1``, to avoid a FutureWarning. + """ + ... + warnings.warn( + "In the future the default for argsort will be axis=-1, not the " + "current None, to match its documentation and np.argsort. " + "Explicitly pass -1 or None to silence this warning.", + MaskedArrayFutureWarning, stacklevel=3) + + +Decision making +--------------- + +In concrete cases where this policy needs to be applied, decisions are made according +to the `NumPy governance model +<https://docs.scipy.org/doc/numpy/dev/governance/index.html>`_. + +All deprecations must be proposed on the mailing list in order to give everyone +with an interest in NumPy development a chance to comment. Removal of +deprecated functionality does not need discussion on the mailing list. + + +Functionality with more strict deprecation policies +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ``numpy.random`` has its own backwards compatibility policy with additional + requirements on top of the ones in this NEP, see + `NEP 19 <http://www.numpy.org/neps/nep-0019-rng-policy.html>`_. +- The file format of ``.npy`` and ``.npz`` files is strictly versioned + independent of the NumPy version; existing format versions must remain + backwards compatible even if a newer format version is introduced. + + +Example cases +------------- + +We now discuss a few concrete examples from NumPy's history to illustrate +typical issues and trade-offs. **Changing the behavior of a function** @@ -89,21 +265,6 @@ forces users to change their code more than once, which is almost never the right thing to do. Instead, a better approach here would have been to deprecate ``histogram`` and introduce a new function ``hist`` in its place. -**Returning a view rather than a copy** - -The ``ndarray.diag`` method used to return a copy. A view would be better for -both performance and design consistency. 
This change was warned about -(``FutureWarning``) in v.8.0, and in v1.9.0 ``diag`` was changed to return -a *read-only* view. The planned change to a writeable view in v1.10.0 was -postponed due to backwards compatibility concerns, and is still an open issue -(gh-7661). - -What should have happened instead: nothing. This change resulted in a lot of -discussions and wasted effort, did not achieve its final goal, and was not that -important in the first place. Finishing the change to a *writeable* view in -the future is not desired, because it will result in users silently getting -different results if they upgraded multiple versions or simply missed the -warnings. **Disallowing indexing with floats** @@ -120,128 +281,30 @@ scikit-learn. Overall the change was worth the cost, and introducing it in master first to allow testing, then removing it again before a release, is a useful strategy. -Similar recent deprecations also look like good examples of +Similar deprecations that also look like good examples of cleanups/improvements: -- removing deprecated boolean indexing (gh-8312) -- deprecating truth testing on empty arrays (gh-9718) -- deprecating ``np.sum(generator)`` (gh-10670, one issue with this one is that - its warning message is wrong - this should error in the future). +- removing deprecated boolean indexing (in 2016, see `gh-8312 <https://github.com/numpy/numpy/pull/8312>`__) +- deprecating truth testing on empty arrays (in 2017, see `gh-9718 <https://github.com/numpy/numpy/pull/9718>`__) + **Removing the financial functions** -The financial functions (e.g. ``np.pmt``) are badly named, are present in the -main NumPy namespace, and don't really fit well within NumPy's scope. -They were added in 2008 after +The financial functions (e.g. ``np.pmt``) had short non-descriptive names, were +present in the main NumPy namespace, and didn't really fit well within NumPy's +scope. 
They were added in 2008 after `a discussion <https://mail.python.org/pipermail/numpy-discussion/2008-April/032353.html>`_ on the mailing list where opinion was divided (but a majority in favor). -At the moment these functions don't cause a lot of overhead, however there are -multiple issues and PRs a year for them which cost maintainer time to deal -with. And they clutter up the ``numpy`` namespace. Discussion in 2013 happened -on removing them again (gh-2880). - -This case is borderline, but given that they're clearly out of scope, -deprecation and removal out of at least the main ``numpy`` namespace can be -proposed. Alternatively, document clearly that new features for financial -functions are unwanted, to keep the maintenance costs to a minimum. - -**Examples of features not added because of backwards compatibility** - -TODO: do we have good examples here? Possibly subclassing related? - - -Removing complete submodules -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This year there have been suggestions to consider removing some or all of -``numpy.distutils``, ``numpy.f2py``, ``numpy.linalg``, and ``numpy.random``. -The motivation was that all these cost maintenance effort, and that they slow -down work on the core of NumPy (ndarrays, dtypes and ufuncs). - -The impact on downstream libraries and users would be very large, and -maintenance of these modules would still have to happen. Therefore this is -simply not a good idea; removing these submodules should not happen even for -a new major version of NumPy. - - -Subclassing of ndarray -^^^^^^^^^^^^^^^^^^^^^^ - -Subclassing of ``ndarray`` is a pain point. ``ndarray`` was not (or at least -not well) designed to be subclassed. Despite that, a lot of subclasses have -been created even within the NumPy code base itself, and some of those (e.g. -``MaskedArray``, ``astropy.units.Quantity``) are quite popular. 
The main -problems with subclasses are: - -- They make it hard to change ``ndarray`` in ways that would otherwise be - backwards compatible. -- Some of them change the behavior of ndarray methods, making it difficult to - write code that accepts array duck-types. - -Subclassing ``ndarray`` has been officially discouraged for a long time. Of -the most important subclasses, ``np.matrix`` will be deprecated (see gh-10142) -and ``MaskedArray`` will be kept in NumPy (`NEP 17 -<http://www.numpy.org/neps/nep-0017-split-out-maskedarray.html>`_). -``MaskedArray`` will ideally be rewritten in a way such that it uses only -public NumPy APIs. For subclasses outside of NumPy, more work is needed to -provide alternatives (e.g. mixins, see gh-9016 and gh-10446) or better support -for custom dtypes (see gh-2899). Until that is done, subclasses need to be -taken into account when making change to the NumPy code base. A future change -in NumPy to not support subclassing will certainly need a major version -increase. - - -Policy ------- - -1. Code changes that have the potential to silently change the results of a users' - code must never be made (except in the case of clear bugs). -2. Code changes that break users' code (i.e. the user will see a clear exception) - can be made, *provided the benefit is worth the cost* and suitable deprecation - warnings have been raised first. -3. Deprecation warnings are in all cases warnings that functionality will be removed. - If there is no intent to remove functionality, then deprecation in documentation - only or other types of warnings shall be used. -4. Deprecations for stylistic reasons (e.g. consistency between functions) are - strongly discouraged. - -Deprecations: - -- shall include the version numbers of both when the functionality was deprecated - and when it will be removed (either two releases after the warning is - introduced, or in the next major version). 
-- shall include information on alternatives to the deprecated functionality, or a - reason for the deprecation if no clear alternative is available. -- shall use ``VisibleDeprecationWarning`` rather than ``DeprecationWarning`` - for cases of relevance to end users (as opposed to cases only relevant to - libraries building on top of NumPy). -- shall be listed in the release notes of the release where the deprecation happened. - -Removal of deprecated functionality: +The financial functions didn't cause a lot of overhead; however, there were +still multiple issues and PRs a year for them which cost maintainer time to +deal with. And they cluttered up the ``numpy`` namespace. Removing them was +discussed in 2013 (gh-2880, rejected) and in 2019 +(:ref:`NEP32`, accepted without significant complaints). -- shall be done after 2 releases (assuming a 6-monthly release cycle; if that changes, - there shall be at least 1 year between deprecation and removal), unless the - impact of the removal is such that a major version number increase is - warranted. -- shall be listed in the release notes of the release where the removal happened. - -Versioning: - -- removal of deprecated code can be done in any minor (but not bugfix) release. -- for heavily used functionality (e.g. removal of ``np.matrix``, of a whole submodule, - or significant changes to behavior for subclasses) the major version number shall - be increased. - -In concrete cases where this policy needs to be applied, decisions are made according -to the `NumPy governance model -<https://docs.scipy.org/doc/numpy/dev/governance/index.html>`_. - -Functionality with more strict policies: - -- ``numpy.random`` has its own backwards compatibility policy, - see `NEP 19 <http://www.numpy.org/neps/nep-0019-rng-policy.html>`_. -- The file format for ``.npy`` and ``.npz`` files must not be changed in a backwards - incompatible way. 
+Given that they were clearly outside of NumPy's scope, moving them to a +separate ``numpy-financial`` package and removing them from NumPy after a +deprecation period made sense. That also gave users an easy way to update +their code by doing ``pip install numpy-financial``. Alternatives @@ -257,34 +320,29 @@ ecosystem - being fairly conservative is required in order to not increase the extra maintenance for downstream libraries and end users to an unacceptable level. -**Semantic versioning.** - -This would change the versioning scheme for code removals; those could then -only be done when the major version number is increased. Rationale for -rejection: semantic versioning is relatively common in software engineering, -however it is not at all common in the Python world. Also, it would mean that -NumPy's version number simply starts to increase faster, which would be more -confusing than helpful. gh-10156 contains more discussion on this alternative. - Discussion ---------- -TODO - -This section may just be a bullet list including links to any discussions -regarding the NEP: - -- This includes links to mailing list threads or relevant GitHub issues. +- `Mailing list discussion on the first version of this NEP in 2018 <https://mail.python.org/pipermail/numpy-discussion/2018-July/078432.html>`__ References and Footnotes ------------------------ -.. [1] TODO +- `Issue requesting semantic versioning <https://github.com/numpy/numpy/issues/10156>`__ + +- `PEP 387 - Backwards Compatibility Policy <https://www.python.org/dev/peps/pep-0387/>`__ + +.. [1] https://searchcode.com/ + +.. [2] https://sourcegraph.com/search + +.. [3] https://github.com/Quansight-Labs/python-api-inspect +.. [4] https://github.com/data-apis/python-record-api Copyright --------- -This document has been placed in the public domain. [1]_ +This document has been placed in the public domain.
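
For readers of this patch, the "Deprecations:" rules it adds (state the version, name an alternative, use ``DeprecationWarning`` by default, set a ``stacklevel``, mention it in the docstring) can be sketched roughly as below. This is only an illustration under stated assumptions, not NumPy code: ``ExampleLib``, ``old_scale``, ``new_scale``, the version ``1.99.0`` and the date in the comment are all hypothetical.

```python
import warnings


def new_scale(values, factor):
    """Preferred replacement function (hypothetical)."""
    return [v * factor for v in values]


def old_scale(values, factor):
    """Deprecated alias of new_scale (hypothetical).

    .. deprecated:: 1.99.0
        Use `new_scale` instead.
    """
    # ExampleLib 1.99.0, 2021-01-25  (hypothetical; fixed comment form is easy to grep)
    warnings.warn(
        "old_scale is deprecated since ExampleLib 1.99.0, "
        "use new_scale instead.",
        DeprecationWarning,
        stacklevel=2,  # report the warning at the caller's line, not here
    )
    return new_scale(values, factor)


# Record warnings so the category and message can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_scale([1, 2, 3], 2)
```

With ``stacklevel=2`` the warning is attributed to the code that called ``old_scale``, which is what lets users find the line they need to change.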