=========================
NumPy 2.0.0 Release Notes
=========================

[Possibly 1.7.0 release notes, as ABI compatibility is still being maintained]

Highlights
==========


Compatibility notes
===================

In a future version of numpy, the functions np.diag, np.diagonal, and
the diagonal method of ndarrays will return a view onto the original
array, instead of producing a copy as they do now. This makes a
difference if you write to the array returned by any of these
functions. To facilitate this transition, numpy 1.7 produces a
DeprecationWarning if it detects that you may be attempting to write
to such an array. See the documentation for np.diagonal for details.
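
Code that needs a writeable diagonal can take an explicit copy, which should
behave the same whether ``np.diagonal`` returns a copy or a view. A minimal
sketch, with an illustrative array ``a`` that is not part of the notes above:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# An explicit copy is always writeable and never aliases ``a``,
# regardless of whether np.diagonal itself returns a copy or a view.
d = np.diagonal(a).copy()
d[0] = 100

print(a[0, 0])  # the original array is left untouched
```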

The default casting rule for UFunc out= parameters has been changed from
'unsafe' to 'same_kind'.  Most usages which violate the 'same_kind'
rule are likely bugs, so this change may expose previously undetected
errors in projects that depend on NumPy.
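
As an illustration of the 'same_kind' rule, the sketch below writes a float
result into an integer ``out=`` array. It assumes a NumPy version in which the
rule is enforced with an exception derived from ``TypeError``; the array names
are illustrative only.

```python
import numpy as np

a = np.array([1.5, 2.5])
out = np.zeros(2, dtype=np.int64)

# float64 -> int64 is not a 'same_kind' cast, so this call is rejected.
try:
    np.add(a, a, out=out)
    rejected = False
except TypeError:
    rejected = True

# The previous behavior can still be requested explicitly.
np.add(a, a, out=out, casting='unsafe')
print(rejected, out)
```

Passing ``casting='unsafe'`` makes the lossy conversion visible at the call
site instead of relying on the default.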

Full-array boolean indexing now uses a different, optimized code path.
This code path should produce the same results, but any feedback about
changes in the behavior of your code would be appreciated.

Attempting to write to a read-only array (one with
``arr.flags.writeable`` set to ``False``) used to raise either a
RuntimeError, ValueError, or TypeError inconsistently, depending on
which code path was taken. It now consistently raises a ValueError.
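
A short sketch of the consistent behavior, using an illustrative array
``arr``:

```python
import numpy as np

arr = np.zeros(3)
arr.flags.writeable = False

# Any attempt to write to the read-only array raises ValueError.
caught = False
try:
    arr[0] = 1.0
except ValueError:
    caught = True

print(caught)
```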

The <ufunc>.reduce functions evaluate some reductions in a different
order than in previous versions of NumPy, generally providing higher
performance. Because of the nature of floating-point arithmetic, this
may subtly change some results, just as linking NumPy to a different
BLAS implementation such as MKL can.
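
The underlying effect does not depend on NumPy: floating-point addition is
not associative, so grouping the same terms differently can change the last
bits of the result. A plain-Python sketch:

```python
# The same three terms, summed with two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)      # False
print(abs(left_to_right - right_to_left))  # on the order of 1e-16
```

Comparing reduction results with a tolerance rather than with ``==`` avoids
depending on the evaluation order.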

If you are upgrading from 1.5, note that in 1.6 and 1.7 a substantial
amount of code has been added and some code paths have been altered,
particularly in the areas of type resolution and buffered iteration over
universal functions. This might affect your code, particularly if
you relied on accidental behavior in the past.

New features
============

Mask-based NA missing values
----------------------------

Preliminary support for NA missing values similar to those in R has
been implemented.  This was done by adding optional NA masks to the core
array object.

.. note:: The NA API is *experimental*, and may undergo changes in future
   versions of NumPy.  The current implementation based on masks will likely be
   supplemented by a second one based on bit-patterns, and it is possible that
   a difference will be made between missing and ignored data.

While a significant amount of the NumPy functionality has been extended to
support NA masks, not everything is yet supported. Here is an (incomplete)
list of things that do and do not work with NA values:

What works with NA:
    * Basic indexing and slicing, as well as full boolean mask indexing.
    * All element-wise ufuncs.
    * All UFunc.reduce methods, with a new skipna parameter.
    * np.all and np.any satisfy the rules NA | True == True and
      NA & False == False
    * The nditer object.
    * Array methods and functions:
       + ndarray.clip, ndarray.min, ndarray.max, ndarray.sum, ndarray.prod,
         ndarray.conjugate, ndarray.diagonal, ndarray.flatten
       + numpy.concatenate, numpy.column_stack, numpy.hstack,
         numpy.vstack, numpy.dstack, numpy.squeeze, numpy.mean, numpy.std,
         numpy.var

What doesn't work with NA:
    * Fancy indexing, such as with lists and partial boolean masks.
    * ndarray.flat and any other methods that use the old iterator
      mechanism instead of the newer nditer.
    * Struct dtypes, which will have corresponding struct masks with
      one mask value per primitive field of the struct dtype.
    * UFunc.accumulate, UFunc.reduceat.
    * Ufunc calls with both NA masks and a where= mask at the same time.
    * File I/O has not been extended to support NA-masks.
    * np.logical_and and np.logical_or don't satisfy the
      rules NA | True == True and NA & False == False yet.
    * Array methods and functions:
       + ndarray.argmax, ndarray.argmin
       + numpy.repeat, numpy.delete (relies on fancy indexing),
         numpy.append, numpy.insert (relies on fancy indexing),
         numpy.where

Differences with R:
    * R's parameter na.rm=TRUE is spelled skipna=True in NumPy.
    * np.isna(nan) is False, but R's is.na(NaN) is TRUE. This is because
      NumPy's NA is treated independently of the underlying data type.
    * Boolean indexing, where the result is compressed to just
      the elements with true in the mask, raises if the boolean mask
      has an NA value in it. This is because that value could be either
      True or False, meaning the count of the output array is actually
      NA. R treats this case in a manner inconsistent with the NA model,
      returning NA values in the spots where the boolean index has NA.
      This may have a practical advantage in spite of violating the
      NA theoretical model, so NumPy could adopt the behavior if necessary.

Reduction UFuncs Generalize axis= Parameter
-------------------------------------------

Any ufunc.reduce function call, as well as other reductions like
sum, prod, any, all, max and min, can now reduce over a chosen
subset of the axes. Previously, one could say axis=None to mean all
the axes or axis=# to pick a single axis.
Now, one can also say axis=(#,#) to pick a tuple of axes for reduction.
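
For example, a three-dimensional array can be reduced over its first and last
axes in one call. The array ``a`` below is illustrative:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Reduce over axes 0 and 2, leaving only the middle axis.
s = np.add.reduce(a, axis=(0, 2))
t = np.sum(a, axis=(0, 2))

print(s.shape)  # (3,)
print(s)        # [ 60  92 124]
```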

Reduction UFuncs New keepdims= Parameter
----------------------------------------

There is a new keepdims= parameter, which if set to True, doesn't
throw away the reduction axes but instead sets them to have size one.
When this option is set, the reduction result will broadcast correctly
to the original operand which was reduced.
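
A small sketch of why this is useful: normalizing each row of an illustrative
array ``a`` by its row total, without reshaping the reduction result by hand.

```python
import numpy as np

a = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])

# The reduced axis is kept with size one, so the result has shape (2, 1)
# and broadcasts against the original (2, 3) operand.
totals = np.add.reduce(a, axis=1, keepdims=True)
fractions = a / totals

print(totals.shape)  # (2, 1)
```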

Datetime support
----------------

.. note:: The datetime API is *experimental* in 1.7.0, and may undergo changes
   in future versions of NumPy.

TODO: describe changes in datetime


Custom formatter for printing arrays
------------------------------------

New function numpy.random.choice
---------------------------------

A generic sampling function has been added which will generate samples from
a given array-like. The samples can be with or without replacement, and
with uniform or given non-uniform probabilities.
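
The three variants described above might be used as in the following sketch.
The ``size``, ``replace`` and ``p`` keyword names are those of the function as
currently documented; the seed and the sample data are illustrative only.

```python
import numpy as np

np.random.seed(0)  # seeded only so the sketch is repeatable
items = ['a', 'b', 'c', 'd']

# Uniform sampling with replacement (the default).
with_replacement = np.random.choice(items, size=6)

# Uniform sampling without replacement: no item appears twice.
without_replacement = np.random.choice(items, size=3, replace=False)

# Non-uniform sampling: 'a' is drawn with probability 0.7.
weighted = np.random.choice(items, size=5, p=[0.7, 0.1, 0.1, 0.1])
```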

New function isclose
--------------------

Returns a boolean array where two arrays are element-wise equal within a
tolerance. Both relative and absolute tolerance can be specified. The
function is NA aware.
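
A brief sketch with illustrative inputs, assuming the ``rtol`` and ``atol``
keyword names and defaults (``rtol=1e-05``, ``atol=1e-08``) of the function
as currently documented:

```python
import numpy as np

a = np.array([1.0, 100.0, 1e-9])
b = np.array([1.0 + 1e-9, 101.0, 0.0])

# Default tolerances: the middle pair differs by far more than allowed.
default = np.isclose(a, b)

# A 2% relative tolerance with no absolute tolerance: the middle pair now
# passes, but the comparison against exact zero fails.
loose = np.isclose(a, b, rtol=0.02, atol=0.0)

print(default)  # [ True False  True]
print(loose)    # [ True  True False]
```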

Preliminary multi-dimensional support in the polynomial package
---------------------------------------------------------------

Axis keywords have been added to the integration and differentiation
functions and a tensor keyword was added to the evaluation functions.
These additions allow multi-dimensional coefficient arrays to be used in
those functions. New functions for evaluating 2-D and 3-D coefficient
arrays on grids or sets of points were added together with 2-D and 3-D
pseudo-Vandermonde matrices that can be used for fitting.
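
As a sketch of the 2-D case, the example below uses ``polyval2d``,
``polygrid2d`` and ``polyvander2d`` from ``numpy.polynomial.polynomial`` as
they exist in current NumPy; the coefficient array and sample points are
illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# c[i, j] multiplies x**i * y**j, so p(x, y) = 1 + 2*y + 3*x*y.
c = np.array([[1.0, 2.0],
              [0.0, 3.0]])

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 1.0, 2.0])

values = P.polyval2d(x, y, c)   # evaluate at the paired points (x[k], y[k])
grid = P.polygrid2d(x, y, c)    # evaluate on the Cartesian product of x and y

# Fit: sample p on a 3x3 grid, then solve a least-squares problem on the
# pseudo-Vandermonde matrix to recover the coefficients.
xs, ys = np.meshgrid([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
xs, ys = xs.ravel(), ys.ravel()
V = P.polyvander2d(xs, ys, [1, 1])
coef, _, _, _ = np.linalg.lstsq(V, P.polyval2d(xs, ys, c), rcond=None)

print(values)              # [ 3.  6. 17.]
print(coef.reshape(2, 2))  # approximately equal to c
```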

Support for mask-based NA values in the polynomial package fits
---------------------------------------------------------------

The fitting functions recognize and remove masked data from the fit.

Ability to pad rank-n arrays
----------------------------

A pad module containing functions for padding n-dimensional arrays has
been added. The various private padding functions are exposed as options to
a public 'pad' function.  Example::

    pad(a, 5, mode='mean')

Current modes are 'constant', 'edge', 'linear_ramp', 'maximum', 'mean',
'median', 'minimum', 'reflect', 'symmetric', 'wrap', and <function>.
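
A slightly fuller sketch of three of the modes, assuming the function is
reachable as ``np.pad`` (as it is in current NumPy) and using an illustrative
one-dimensional array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Two zeros on each side ('constant' pads with 0 by default).
padded_const = np.pad(a, 2, mode='constant')

# One element before and three after, repeating the edge values.
padded_edge = np.pad(a, (1, 3), mode='edge')

# Two elements on each side, filled with the mean of the array (3).
padded_mean = np.pad(a, 2, mode='mean')

print(padded_const)  # [0 0 1 2 3 4 5 0 0]
print(padded_edge)   # [1 1 2 3 4 5 5 5 5]
print(padded_mean)   # [3 3 1 2 3 4 5 3 3]
```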


New argument to searchsorted
----------------------------

The function searchsorted now accepts a 'sorter' argument that is a
permutation array that sorts the array to search.
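
A typical way to obtain such a permutation is ``np.argsort``. A minimal
sketch with an illustrative unsorted array:

```python
import numpy as np

a = np.array([40, 10, 30, 20])  # not sorted
order = np.argsort(a)           # permutation that sorts ``a``

# The returned indices are positions in the sorted sequence a[order],
# not positions in ``a`` itself.
pos = np.searchsorted(a, [25, 40], sorter=order)

print(pos)  # [2 3]
```

This avoids materializing a sorted copy of ``a`` when the sorting permutation
is already available.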

C API
-----

New function ``PyArray_RequireWriteable`` provides a consistent
interface for checking array writeability -- any C code which works
with arrays whose WRITEABLE flag is not known to be True a priori
should make sure to call this function before writing.

Changes
=======

General
-------

The function np.concatenate tries to match the layout of its input
arrays. Previously, the layout did not follow any particular rule,
and depended in an undesirable way on the particular axis chosen for
concatenation. A bug was also fixed which silently allowed out of bounds
axis arguments.

The ufuncs logical_or, logical_and, and logical_not now follow Python's
behavior with object arrays, instead of trying to call methods on the
objects. For example the expression (3 and 'test') produces the string
'test', and now np.logical_and(np.array(3, 'O'), np.array('test', 'O'))
produces 'test' as well.
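
The same rule applies element-wise. The sketch below uses illustrative object
arrays and assumes the object loops return the selected operand, as described
above:

```python
import numpy as np

x = np.array([3, 0, 'a'], dtype=object)
y = np.array(['test', 'test', ''], dtype=object)

both = np.logical_and(x, y)   # element-wise ``x and y``
either = np.logical_or(x, y)  # element-wise ``x or y``

print(both.tolist())    # ['test', 0, '']
print(either.tolist())  # [3, 'test', 'a']
```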

C-API
-----

The following macros now require trailing semicolons::

    NPY_BEGIN_THREADS_DEF
    NPY_BEGIN_THREADS
    NPY_ALLOW_C_API
    NPY_ALLOW_C_API_DEF
    NPY_DISABLE_C_API

Datetime
--------

There have been a lot of fixes and enhancements to datetime64. The notes
in doc/source/reference/arrays.datetime.rst or in the generated
documentation should be consulted for the details.


Deprecations
============

General
-------

Specifying a custom string formatter with a ``_format`` array attribute is
deprecated. The new ``formatter`` keyword in ``numpy.set_printoptions`` or
``numpy.array2string`` can be used instead.
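
A minimal sketch of the replacement, assuming the ``'float_kind'`` key
accepted by the ``formatter`` dictionary in current NumPy; the array and the
format string are illustrative:

```python
import numpy as np

a = np.array([1.0, 2.5, 3.25])

# Map each floating-point element to a string with two decimals.
text = np.array2string(a, formatter={'float_kind': lambda x: '%.2f' % x})

print(text)  # [1.00 2.50 3.25]
```

Using ``numpy.array2string`` keeps the change local to one call, whereas
``numpy.set_printoptions`` would change the global print settings.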

The deprecated imports in the polynomial package have been removed.

C-API
-----

Direct access to the fields of PyArrayObject* has been deprecated. Direct
access has been recommended against for many releases. Expect similar
deprecations for PyArray_Descr* and other core objects in the future as
preparation for NumPy 2.0.

The macros in old_defines.h are deprecated and will be removed in the next
minor release (>= 1.8). The sed script tools/replace_old_macros.sed can
be used to replace these macros with the newer versions.

You can test your code against the deprecated C API by #defining
NPY_NO_DEPRECATED_API to the target version number, for example
NPY_1_7_API_VERSION, before including any NumPy headers.