-rw-r--r-- doc/release/upcoming_changes/19226.compatibility.rst 85
1 files changed, 85 insertions, 0 deletions
diff --git a/doc/release/upcoming_changes/19226.compatibility.rst b/doc/release/upcoming_changes/19226.compatibility.rst
new file mode 100644
index 000000000..0f5932567
--- /dev/null
+++ b/doc/release/upcoming_changes/19226.compatibility.rst
@@ -0,0 +1,85 @@
+Changes to structured (void) dtype promotion and comparisons
+------------------------------------------------------------
+NumPy usually uses field position of structured dtypes when assigning
+from one structured dtype to another. This means that::
+
+ arr[["field1", "field2"]] = arr[["field2", "field1"]]
+
+swaps the data of the two fields. However, until now this behaviour
+was not matched for ``np.concatenate``. NumPy was also overly
+restrictive when comparing two structured dtypes. For example::
+
+ np.ones(3, dtype="i,i") == np.ones(3, dtype="i,d")
+
+will now succeed instead of giving a ``FutureWarning`` and returning ``False``.
+
+In general, NumPy now defines correct, but slightly limited, promotion for
+structured dtypes::
+
+ >>> np.result_type(np.dtype("i,i"), np.dtype("i,d"))
+ dtype([('f0', '<i4'), ('f1', '<f8')])
+
+For promotion, matching field names, order, and titles are enforced; however,
+padding is ignored.
+Note that this also now always ensures native byte-order for all fields,
+which can change the result (this can affect ``np.concatenate``)::
+
+ >>> np.result_type(np.dtype("i,>i"))
+ dtype([('f0', '<i4'), ('f1', '<i4')])
+ >>> np.result_type(np.dtype("i,>i"), np.dtype("i,i"))
+ dtype([('f0', '<i4'), ('f1', '<i4')])
+
+which previously returned the first dtype unmodified.
+
+Further, the new result of ``np.result_type`` and promotion in general
+is considered "canonical". In addition to ensuring native byte-order
+for all fields, the result will also be "packed". This means that
+all fields are ordered contiguously and any unnecessary padding
+is now removed::
+
+ >>> dt = np.dtype("i1,V3,i4,V1")[["f0", "f2"]]
+ >>> dt
+ dtype({'names':['f0','f2'], 'formats':['i1','<i4'], 'offsets':[0,4], 'itemsize':9})
+ >>> np.result_type(dt)
+ dtype([('f0', 'i1'), ('f2', '<i4')])
+
+Note that the result prints without ``offsets`` or ``itemsize``, indicating no
+additional padding.
+If a structured dtype is created with ``align=True``, ensuring that
+``dtype.isalignedstruct`` is true, this property is preserved::
+
+ >>> dt = np.dtype("i1,V3,i4,V1", align=True)[["f0", "f2"]]
+ >>> dt
+ dtype({'names':['f0','f2'], 'formats':['i1','<i4'], 'offsets':[0,4], 'itemsize':12}, align=True)
+ >>> np.result_type(dt)
+ dtype([('f0', 'i1'), ('f2', '<i4')], align=True)
+ >>> np.result_type(dt).isalignedstruct
+ True
+
+When promoting multiple dtypes, the result is aligned if any of the inputs is::
+
+ >>> np.result_type(np.dtype("i,i"), np.dtype("i,i", align=True))
+ dtype([('f0', '<i4'), ('f1', '<i4')], align=True)
+
+The ``repr`` of aligned structures will now never print the long form
+including ``offsets`` and ``itemsize`` unless the struct includes padding
+not guaranteed by ``align=True``.
+
+
+Changes to structured dtype casting safety
+------------------------------------------
+In alignment with the above changes to the promotion logic, the
+casting safety has been updated:
+
+* ``"equiv"`` enforces matching names and titles. The itemsize
+ is allowed to differ due to padding.
+* ``"safe"`` allows mismatching field names and titles.
+* The cast safety is limited by the cast safety of each included
+ field.
+* The order of fields is used to decide cast safety of each
+ individual field. Previously, the field names were used and
+ only unsafe casts were possible when names mismatched.
+
+The most important change here is that name mismatches are now
+considered "safe" casts.
+
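The behaviour described in this release note can be spot-checked with a short script. This is a minimal sketch, not part of the patch, and it assumes a NumPy build that already includes this change. All variable names are illustrative, and `np.promote_types` applied to the same dtype twice is used here in place of the single-argument `np.result_type` call shown in the note.

```python
import numpy as np

# Promotion works field by field, in field order: int32 + float64 -> float64.
promoted = np.result_type(np.dtype("i,i"), np.dtype("i,d"))
promotion_matches = promoted == np.dtype("i,d")

# Promotion yields native byte order for every field.
swapped = np.result_type(np.dtype("i,>i"), np.dtype("i,i"))
all_native = all(swapped.fields[name][0].isnative for name in swapped.names)

# Promotion "packs" the result: padding left by the dropped fields is removed.
padded = np.dtype("i1,V3,i4,V1")[["f0", "f2"]]
packed = np.promote_types(padded, padded)
sizes = (padded.itemsize, packed.itemsize)

# The result is aligned if any input was created with align=True.
aligned = np.result_type(
    np.dtype("i,i"), np.dtype("i,i", align=True)
).isalignedstruct

# Comparing arrays with promotable structured dtypes gives a boolean array.
equal = np.ones(3, dtype="i,i") == np.ones(3, dtype="i,d")

# Casting safety: a name mismatch is "safe" but not "equiv".
dt_a = np.dtype([("a", "i4")])
dt_b = np.dtype([("b", "i4")])
name_mismatch_safe = np.can_cast(dt_a, dt_b, casting="safe")
name_mismatch_equiv = np.can_cast(dt_a, dt_b, casting="equiv")

# Differing padding alone still allows an "equiv" cast.
padding_equiv = np.can_cast(padded, packed, casting="equiv")

# The overall safety is limited by each field: float64 -> int32 is not "safe".
narrowing_safe = np.can_cast(
    np.dtype([("a", "f8")]), np.dtype([("a", "i4")]), casting="safe"
)

print(promoted, swapped, sizes, aligned, equal)
print(name_mismatch_safe, name_mismatch_equiv, padding_equiv, narrowing_safe)
```

Each check corresponds to one statement in the note, so a failing line points at the specific behaviour that differs in the installed NumPy version.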