author Charles Harris <charlesr.harris@gmail.com> 2018-09-25 11:17:38 -0500
committer GitHub <noreply@github.com> 2018-09-25 11:17:38 -0500
commit 5f2a5ae02dee34469d27896249f71deaa0c6de24 (patch)
tree 57a90955afec18dbce0559546eccdff9b663d05b /doc/neps
parent c233a1e2e2ec7991054f8a8a8f690a4fd578f57a (diff)
parent 5ee09d47c7212b2f029408b0197e6954421dca38 (diff)
Merge pull request #11596 from rgommers/nep-backcompat
NEP: backwards compatibility and deprecation policy
Diffstat (limited to 'doc/neps')
-rw-r--r-- doc/neps/nep-0023-backwards-compatibility.rst 288
1 files changed, 288 insertions, 0 deletions
diff --git a/doc/neps/nep-0023-backwards-compatibility.rst b/doc/neps/nep-0023-backwards-compatibility.rst
new file mode 100644
index 000000000..158b08f1f
--- /dev/null
+++ b/doc/neps/nep-0023-backwards-compatibility.rst
@@ -0,0 +1,288 @@
+=======================================================
+NEP 23 — Backwards compatibility and deprecation policy
+=======================================================
+
+:Author: Ralf Gommers <ralf.gommers@gmail.com>
+:Status: Draft
+:Type: Process
+:Created: 2018-07-14
+:Resolution: <url> (required for Accepted | Rejected | Withdrawn)
+
+Abstract
+--------
+
+In this NEP we describe NumPy's approach to backwards compatibility,
+its deprecation and removal policy, and the trade-offs and decision
+processes for individual cases where breaking backwards compatibility
+is considered.
+
+
+Detailed description
+--------------------
+
+NumPy has a very large user base. Those users rely on NumPy being stable
+and the code they write that uses NumPy functionality to keep working.
+NumPy is also actively maintained and improved -- and sometimes improvements
+require, or are made much easier by, breaking backwards compatibility.
+Finally, there are trade-offs in stability for existing users vs. avoiding
+errors or having a better user experience for new users. These competing
+needs often give rise to heated debates and delays in accepting or rejecting
+contributions. This NEP tries to address that by providing a policy as well
+as examples and rationales for when it is or isn't a good idea to break
+backwards compatibility.
+
+General principles:
+
+- Aim not to break users' code unnecessarily.
+- Aim never to change code in ways that can result in users silently getting
+ incorrect results from their previously working code.
+- Backwards incompatible changes can be made, provided the benefits outweigh
+ the costs.
+- When assessing the costs, keep in mind that most users do not read the mailing
+ list, do not look at deprecation warnings, and sometimes wait more than one or
+ two years before upgrading from their old version. Also keep in mind that
+ NumPy has many hundreds of thousands or even a couple of million users, so
+ "no one will do or use this" is very likely incorrect.
+- Benefits include improved functionality, usability and performance (in order
+ of importance), as well as lower maintenance cost and improved future
+ extensibility.
+- Bug fixes are exempt from the backwards compatibility policy. However, in case
+ of serious impact on users (e.g. a downstream library doesn't build anymore),
+ even bug fixes may have to be delayed for one or more releases.
+- The Python API and the C API will be treated in the same way.
+
+
+Examples
+^^^^^^^^
+
+We now discuss a number of concrete examples to illustrate typical issues
+and trade-offs.
+
+**Changing the behavior of a function**
+
+``np.histogram`` is probably the most infamous example.
+First, a new keyword ``new=False`` was introduced; its default was switched
+to ``None`` one release later, and finally the keyword was removed again.
+Also, it has a ``normed`` keyword whose behavior could be considered
+either suboptimal or broken (depending on one's opinion on the statistics).
+A new keyword ``density`` was introduced to replace it; ``normed`` started giving
+``DeprecationWarning`` only in v1.15.0. Evolution of ``histogram``::
+
+ def histogram(a, bins=10, range=None, normed=False):  # v1.0.0
+
+ def histogram(a, bins=10, range=None, normed=False, weights=None, new=False):  # v1.1.0
+
+ def histogram(a, bins=10, range=None, normed=False, weights=None, new=None):  # v1.2.0
+
+ def histogram(a, bins=10, range=None, normed=False, weights=None):  # v1.5.0
+
+ def histogram(a, bins=10, range=None, normed=False, weights=None, density=None):  # v1.6.0
+
+ def histogram(a, bins=10, range=None, normed=None, weights=None, density=None):  # v1.15.0
+ # v1.15.0 was the first release where `normed` started emitting
+ # DeprecationWarnings
+
+The ``new`` keyword was planned from the start to be temporary. Such a plan
+forces users to change their code more than once, which is almost never the
+right thing to do. Instead, a better approach here would have been to
+deprecate ``histogram`` and introduce a new function ``hist`` in its place.
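+
+A minimal sketch of that alternative, using only the standard library. The
+names ``hist`` and ``_bin_counts`` and all signatures below are hypothetical and
+chosen for illustration; they do not reflect an actual or planned NumPy API.
+The old function keeps its behavior and only gains a warning, while the new
+function starts out with its final signature::
+
+    import warnings
+
+
+    def _bin_counts(values, bins):
+        """Count values in `bins` equal-width bins spanning min..max."""
+        lo, hi = min(values), max(values)
+        width = (hi - lo) / bins or 1.0  # avoid a zero width for constant input
+        counts = [0] * bins
+        for v in values:
+            # Clamp so that the maximum value falls into the last bin.
+            counts[min(int((v - lo) / width), bins - 1)] += 1
+        return counts, width
+
+
+    def hist(values, bins=10, density=False):
+        """Hypothetical new function with its final signature from the start."""
+        counts, width = _bin_counts(values, bins)
+        if density:
+            return [c / (len(values) * width) for c in counts]
+        return counts
+
+
+    def histogram(values, bins=10):
+        """Old entry point: behavior unchanged, it only warns on use."""
+        warnings.warn(
+            "histogram is deprecated; use hist instead",
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        return _bin_counts(values, bins)[0]
+
+With this layout callers migrate once, from ``histogram`` to ``hist``, instead
+of following a temporary keyword through several releases.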
+
+**Returning a view rather than a copy**
+
+The ``ndarray.diagonal`` method used to return a copy. A view would be better for
+both performance and design consistency. This change was warned about
+(``FutureWarning``) in v1.8.0, and in v1.9.0 ``diagonal`` was changed to return
+a *read-only* view. The planned change to a writeable view in v1.10.0 was
+postponed due to backwards compatibility concerns, and is still an open issue
+(gh-7661).
+
+What should have happened instead: nothing. This change resulted in a lot of
+discussions and wasted effort, did not achieve its final goal, and was not that
+important in the first place. Finishing the change to a *writeable* view in
+the future is not desired, because it will result in users silently getting
+different results if they upgrade across multiple versions or simply miss the
+warnings.
+
+**Disallowing indexing with floats**
+
+Indexing an array with floats is asking for something ambiguous, and can be a
+sign of a bug in user code. After some discussion, it was deemed a good idea
+to deprecate indexing with floats. This was first tried for the v1.8.0
+release; however, in pre-release testing it became clear that this would break
+many libraries that depend on NumPy. Therefore it was reverted before release,
+to give those libraries time to fix their code first. It was finally
+introduced for v1.11.0 and turned into a hard error for v1.12.0.
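+
+The two-step transition described above (a deprecation warning in one release,
+a hard error in a later one) can be sketched with the standard library. The
+function ``normalize_index`` and its ``stage`` argument are hypothetical
+stand-ins for illustration and are not NumPy internals::
+
+    import operator
+    import warnings
+
+
+    def normalize_index(idx, *, stage):
+        """Return `idx` as an int; `stage` is either "warn" or "error"."""
+        try:
+            # operator.index accepts integer-like objects and rejects floats.
+            return operator.index(idx)
+        except TypeError:
+            if stage == "error":
+                raise IndexError("only integers are valid indices") from None
+            warnings.warn(
+                "using a non-integer index is deprecated and will be an error",
+                DeprecationWarning,
+                stacklevel=2,
+            )
+            return int(idx)  # old behavior: silently truncate
+
+During the ``"warn"`` stage ``normalize_index(2.7, stage="warn")`` still returns
+``2`` but emits a warning; in the ``"error"`` stage the same call raises.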
+
+This change was disruptive; however, it did catch real bugs in, e.g., SciPy and
+scikit-learn. Overall the change was worth the cost, and introducing it in
+master first to allow testing, then removing it again before a release, is a
+useful strategy.
+
+Similar recent deprecations also look like good examples of
+cleanups/improvements:
+
+- removing deprecated boolean indexing (gh-8312)
+- deprecating truth testing on empty arrays (gh-9718)
+- deprecating ``np.sum(generator)`` (gh-10670, one issue with this one is that
+ its warning message is wrong - this should error in the future).
+
+**Removing the financial functions**
+
+The financial functions (e.g. ``np.pmt``) are badly named, are present in the
+main NumPy namespace, and don't really fit well within NumPy's scope.
+They were added in 2008 after
+`a discussion <https://mail.python.org/pipermail/numpy-discussion/2008-April/032353.html>`_
+on the mailing list where opinion was divided (but a majority in favor).
+At the moment these functions don't cause a lot of overhead; however, there are
+multiple issues and PRs a year for them which cost maintainer time to deal
+with, and they clutter up the ``numpy`` namespace. Removing them again was
+discussed in 2013 (gh-2880).
+
+This case is borderline, but given that they're clearly out of scope,
+deprecation and removal out of at least the main ``numpy`` namespace can be
+proposed. Alternatively, document clearly that new features for financial
+functions are unwanted, to keep the maintenance costs to a minimum.
+
+**Examples of features not added because of backwards compatibility**
+
+TODO: do we have good examples here? Possibly subclassing related?
+
+
+Removing complete submodules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In 2018 there have been suggestions to consider removing some or all of
+``numpy.distutils``, ``numpy.f2py``, ``numpy.linalg``, and ``numpy.random``.
+The motivation was that all these cost maintenance effort, and that they slow
+down work on the core of NumPy (ndarrays, dtypes and ufuncs).
+
+The impact on downstream libraries and users would be very large, and
+maintenance of these modules would still have to happen. Therefore this is
+simply not a good idea; removing these submodules should not happen even for
+a new major version of NumPy.
+
+
+Subclassing of ndarray
+^^^^^^^^^^^^^^^^^^^^^^
+
+Subclassing of ``ndarray`` is a pain point. ``ndarray`` was not (or at least
+not well) designed to be subclassed. Despite that, a lot of subclasses have
+been created even within the NumPy code base itself, and some of those (e.g.
+``MaskedArray``, ``astropy.units.Quantity``) are quite popular. The main
+problems with subclasses are:
+
+- They make it hard to change ``ndarray`` in ways that would otherwise be
+ backwards compatible.
+- Some of them change the behavior of ndarray methods, making it difficult to
+ write code that accepts array duck-types.
+
+Subclassing ``ndarray`` has been officially discouraged for a long time. Of
+the most important subclasses, ``np.matrix`` will be deprecated (see gh-10142)
+and ``MaskedArray`` will be kept in NumPy (`NEP 17
+<http://www.numpy.org/neps/nep-0017-split-out-maskedarray.html>`_).
+``MaskedArray`` will ideally be rewritten in a way such that it uses only
+public NumPy APIs. For subclasses outside of NumPy, more work is needed to
+provide alternatives (e.g. mixins, see gh-9016 and gh-10446) or better support
+for custom dtypes (see gh-2899). Until that is done, subclasses need to be
+taken into account when making changes to the NumPy code base. A future change
+in NumPy to not support subclassing will certainly need a major version
+increase.
+
+
+Policy
+------
+
+1. Code changes that have the potential to silently change the results of users'
+ code must never be made (except in the case of clear bugs).
+2. Code changes that break users' code (i.e. the user will see a clear exception)
+ can be made, *provided the benefit is worth the cost* and suitable deprecation
+ warnings have been raised first.
+3. Deprecation warnings are in all cases warnings that functionality will be removed.
+ If there is no intent to remove functionality, then deprecation in documentation
+ only or other types of warnings shall be used.
+4. Deprecations for stylistic reasons (e.g. consistency between functions) are
+ strongly discouraged.
+
+Deprecations:
+
+- shall include the version numbers of both when the functionality was deprecated
+ and when it will be removed (either two releases after the warning is
+ introduced, or in the next major version).
+- shall include information on alternatives to the deprecated functionality, or a
+ reason for the deprecation if no clear alternative is available.
+- shall use ``VisibleDeprecationWarning`` rather than ``DeprecationWarning``
+ for cases of relevance to end users (as opposed to cases only relevant to
+ libraries building on top of NumPy).
+- shall be listed in the release notes of the release where the deprecation happened.
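+
+The requirements above can be sketched as a small decorator that refuses to
+build a warning without the required information. This is a standard-library
+illustration only: the ``deprecated`` helper, ``old_func``, ``new_func`` and
+the version strings are hypothetical, and a local stand-in class is defined for
+``VisibleDeprecationWarning`` so that the sketch needs no NumPy import::
+
+    import functools
+    import warnings
+
+
+    class VisibleDeprecationWarning(UserWarning):
+        """Local stand-in; UserWarning subclasses are shown by default."""
+
+
+    def deprecated(since, removal, alternative=None, reason=None,
+                   category=DeprecationWarning):
+        """Build a decorator whose warning carries the details listed above."""
+        if alternative is None and reason is None:
+            raise ValueError("give an alternative or a reason")
+
+        def decorator(func):
+            if alternative is not None:
+                hint = f"Use {alternative} instead."
+            else:
+                hint = f"Reason: {reason}."
+            message = (
+                f"{func.__name__} is deprecated since version {since} and "
+                f"will be removed in version {removal}. {hint}"
+            )
+
+            @functools.wraps(func)
+            def wrapper(*args, **kwargs):
+                warnings.warn(message, category, stacklevel=2)
+                return func(*args, **kwargs)
+
+            return wrapper
+
+        return decorator
+
+
+    # Placeholder versions: deprecated in "1.16", removal two releases later.
+    @deprecated("1.16", "1.18", alternative="new_func",
+                category=VisibleDeprecationWarning)
+    def old_func(x):
+        return x + 1
+
+Passing ``category`` explicitly keeps the choice between a warning aimed at end
+users and one aimed at library authors visible at the point of deprecation.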
+
+Removal of deprecated functionality:
+
+- shall be done after 2 releases (assuming a 6-monthly release cycle; if that changes,
+ there shall be at least 1 year between deprecation and removal), unless the
+ impact of the removal is such that a major version number increase is
+ warranted.
+- shall be listed in the release notes of the release where the removal happened.
+
+Versioning:
+
+- removal of deprecated code can be done in any minor (but not bugfix) release.
+- for heavily used functionality (e.g. removal of ``np.matrix``, of a whole submodule,
+ or significant changes to behavior for subclasses) the major version number shall
+ be increased.
+
+In concrete cases where this policy needs to be applied, decisions are made according
+to the `NumPy governance model
+<https://docs.scipy.org/doc/numpy/dev/governance/index.html>`_.
+
+Functionality with more strict policies:
+
+- ``numpy.random`` has its own backwards compatibility policy,
+ see `NEP 19 <http://www.numpy.org/neps/nep-0019-rng-policy.html>`_.
+- The file format for ``.npy`` and ``.npz`` files must not be changed in a backwards
+ incompatible way.
+
+
+Alternatives
+------------
+
+**Being more aggressive with deprecations.**
+
+The goal of being more aggressive is to allow NumPy to move forward faster.
+This would avoid others inventing their own solutions (often in multiple
+places), as well as be a benefit to users without a legacy code base. We
+reject this alternative because of the place NumPy has in the scientific Python
+ecosystem - being fairly conservative is required in order to not increase the
+extra maintenance for downstream libraries and end users to an unacceptable
+level.
+
+**Semantic versioning.**
+
+This would change the versioning scheme for code removals; those could then
+only be done when the major version number is increased. Rationale for
+rejection: semantic versioning is relatively common in software engineering;
+however, it is not at all common in the Python world. Also, it would mean that
+NumPy's version number simply starts to increase faster, which would be more
+confusing than helpful. gh-10156 contains more discussion on this alternative.
+
+
+Discussion
+----------
+
+TODO
+
+This section may just be a bullet list including links to any discussions
+regarding the NEP:
+
+- This includes links to mailing list threads or relevant GitHub issues.
+
+
+References and Footnotes
+------------------------
+
+.. [1] TODO
+
+
+Copyright
+---------
+
+This document has been placed in the public domain. [1]_