author mattip <matti.picus@gmail.com> 2019-10-10 15:08:43 +0300
committer mattip <matti.picus@gmail.com> 2019-10-10 15:08:43 +0300
commit 6dc23b070cc66d10e6c03946483b33b176ecb6d8
tree ae8c89422aef0860b8b7489f821eaa081138f63b /doc
parent ea66b1d154a99ed7e4325150ef1bf509b7a2b268
NEP: add default-dtype-object-deprecation nep
Diffstat (limited to 'doc')
-rw-r--r-- doc/neps/nep-0034.rst 87
1 files changed, 87 insertions, 0 deletions
diff --git a/doc/neps/nep-0034.rst b/doc/neps/nep-0034.rst
new file mode 100644
index 000000000..b7b9d048e
--- /dev/null
+++ b/doc/neps/nep-0034.rst
@@ -0,0 +1,87 @@
+==========================================
+NEP 34 — Disallow default ``dtype=object``
+==========================================
+
+:Author: Matti Picus
+:Status: Draft
+:Type: Standards Track
+:Created: 2019-10-10
+
+
+Abstract
+--------
+
+``np.array([<value>])`` with no ``dtype`` keyword argument will sometimes
+default to an ``object``-dtype array. Change the behaviour to raise a
+`ValueError` instead.
+
+Motivation and Scope
+--------------------
+
+Users who specify lists-of-lists when creating a `numpy.ndarray` via
+``np.array`` may mistakenly pass in lists of different lengths. Currently we
+accept this input and create a ragged array with ``dtype=object``. This can be
+confusing, since it is rarely what is desired. Changing the default dtype
+detection to never return ``object`` will force users who actually wish to
+create ``object`` arrays to specify that explicitly; see for instance
+`issue 5303`_.
+
+Detailed description
+--------------------
+
+After this change, ragged array creation must explicitly define a dtype::
+
+    >>> a = np.array([[1, 2], [1]])
+    ValueError: cannot guess the desired dtype from the input
+
+    >>> a = np.array([[1, 2], [1]], dtype=object)
+    >>> print(a.dtype)
+    object
+
+Related Work
+------------
+
+`PR 14341`_ tried to raise an error when ragged arrays were specified with
+a numeric dtype, as in ``np.array([[1], [2, 3]], dtype=int)``, but failed due
+to false positives, for instance ``np.array([1, np.array([5])], dtype=int)``.
+
+.. _`PR 14341`: https://github.com/numpy/numpy/pull/14341
+
+Implementation
+--------------
+
+The code to be changed is inside ``PyArray_DTypeFromObject``, specifically in
+``PyArray_DTypeFromObjectHelper``. Since ``PyArray_DTypeFromObject`` is part of
+the NumPy C-API, its interface cannot be changed, but it can return ``-1`` to
+indicate failure.
+
+Backward compatibility
+----------------------
+
+Anyone depending on ragged lists-of-lists creating object arrays will need to
+modify their code. There will be a deprecation period during which the current
+behaviour will emit a ``DeprecationWarning``.
+
+
+Alternatives
+------------
+
+We could continue with the current situation.
+
+Discussion
+----------
+
+Comments on `issue 5303`_ indicate that this has been considered unintended
+behaviour since at least 2014. Suggestions to change it have been made in the
+ensuing years, but none have stuck.
+
+References and Footnotes
+------------------------
+
+.. _`issue 5303`: https://github.com/numpy/numpy/issues/5303
+
+
+Copyright
+---------
+
+This document has been placed in the public domain.
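
As a rough illustration of the kind of input the NEP in the patch above targets, the following pure-Python sketch detects ragged nesting before any array is built. The helper name `regular_shape` and its error message are hypothetical, chosen only for this example. It is not NumPy's dtype-discovery code, and it only considers nested lists and tuples.

```python
def regular_shape(obj):
    """Return the shape of a nested list/tuple, or raise ValueError if ragged.

    Hypothetical helper for illustration only; NumPy's own dtype and shape
    discovery is implemented in C and handles many more input types.
    """
    if not isinstance(obj, (list, tuple)):
        return ()  # anything that is not a list or tuple is treated as a scalar leaf

    # Compute the shape of every child; all of them must agree.
    child_shapes = [regular_shape(item) for item in obj]
    if any(shape != child_shapes[0] for shape in child_shapes):
        raise ValueError("ragged nested sequence: pass dtype=object explicitly")

    # An empty sequence has length 0 and no further dimensions.
    return (len(obj),) + (child_shapes[0] if child_shapes else ())


print(regular_shape([[1, 2], [3, 4]]))  # (2, 2)

try:
    regular_shape([[1, 2], [1]])
except ValueError as exc:
    print("rejected:", exc)
```

A caller could run such a check on user-supplied lists and decide deliberately whether to request ``dtype=object``, rather than relying on the default dtype detection.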