diff options
author | Sebastian Berg <sebastian@sipsolutions.net> | 2014-03-07 23:05:47 +0100 |
---|---|---|
committer | Sebastian Berg <sebastian@sipsolutions.net> | 2014-03-12 10:34:02 +0100 |
commit | 5a3b0abf6da4a7357e24a8243bee14b3cc7a242f (patch) | |
tree | 0ce937843dc23515a84476a9842548e91ce71bf5 | |
parent | 4da29a8e7393356094b0eb3289b14360beac9d89 (diff) | |
download | numpy-5a3b0abf6da4a7357e24a8243bee14b3cc7a242f.tar.gz |
BUG: Delay npyiter size check when size may change
When a multi index is tracked and RemoveAxis can be called, the
size of the iterator may still change. This was causing failures
for example for the SVD, because the gufunc machinery requires
a temporarily larger iterator for output allocation.
Thanks to Jaime (jaime.frio@gmail.com) for noting that this is
plausible since the size check can be delayed pretty ok up
until GetIterNext (or similar functions).
Closes gh-4442
-rw-r--r-- | doc/release/1.9.0-notes.rst | 18 | ||||
-rw-r--r-- | doc/source/reference/c-api.iterator.rst | 22 | ||||
-rw-r--r-- | numpy/core/src/multiarray/multiarray_tests.c.src | 106 | ||||
-rw-r--r-- | numpy/core/src/multiarray/nditer_api.c | 39 | ||||
-rw-r--r-- | numpy/core/src/multiarray/nditer_constr.c | 17 | ||||
-rw-r--r-- | numpy/core/src/multiarray/nditer_pywrap.c | 25 | ||||
-rw-r--r-- | numpy/core/src/multiarray/nditer_templ.c.src | 10 | ||||
-rw-r--r-- | numpy/core/tests/test_nditer.py | 40 | ||||
-rw-r--r-- | numpy/linalg/tests/test_regression.py | 6 |
9 files changed, 266 insertions, 17 deletions
diff --git a/doc/release/1.9.0-notes.rst b/doc/release/1.9.0-notes.rst index f999f29b5..2e457ad84 100644 --- a/doc/release/1.9.0-notes.rst +++ b/doc/release/1.9.0-notes.rst @@ -227,10 +227,26 @@ integer/float array that is being casted. Previously the casting was allowed even if the result was truncated. +NDIter +~~~~~~ +When ``NpyIter_RemoveAxis`` is now called, the iterator range will be reset. + +When a multi index is being tracked and an iterator is not buffered, it is +possible to use ``NpyIter_RemoveAxis``. In this case an iterator can shrink +in size. Because the total size of an iterator is limited, the iterator +may be too large before these calls. In this case its size will be set to ``-1`` +and an error issued not at construction time but when removing the multi +index, setting the iterator range, or getting the next function. + +This has no effect on currently working code, but highlights the necessity +of checking for an error return if these conditions can occur. In most +cases the arrays being iterated are as large as the iterator so that such +a problem cannot occur. + + C-API ~~~~~ -None Deprecations ============ diff --git a/doc/source/reference/c-api.iterator.rst b/doc/source/reference/c-api.iterator.rst index 569d8ec56..084fdcbce 100644 --- a/doc/source/reference/c-api.iterator.rst +++ b/doc/source/reference/c-api.iterator.rst @@ -369,7 +369,14 @@ Construction and Destruction Causes the iterator to track a multi-index. This prevents the iterator from coalescing axes to - produce bigger inner loops. + produce bigger inner loops. If the loop is also not buffered + and no index is being tracked (`NpyIter_RemoveAxis` can be called), + then the iterator size can be ``-1`` to indicate that the iterator + is too large. This can happen due to complex broadcasting and + will result in errors being created when the setting the iterator + range, removing the multi index, or getting the next function. + However, it is possible to remove axes again and use the iterator + normally if the size is small enough after removal. .. cvar:: NPY_ITER_EXTERNAL_LOOP @@ -412,8 +419,9 @@ Construction and Destruction Indicates that arrays with a size of zero should be permitted. Since the typical iteration loop does not naturally work with - zero-sized arrays, you must check that the IterSize is non-zero - before entering the iteration loop. + zero-sized arrays, you must check that the IterSize is larger + than zero before entering the iteration loop. + Currently only the operands are checked, not a forced shape. .. cvar:: NPY_ITER_REDUCE_OK @@ -721,7 +729,7 @@ Construction and Destruction **WARNING**: This function may change the internal memory layout of the iterator. Any cached functions or pointers from the iterator - must be retrieved again! + must be retrieved again! The iterator range will be reset as well. Returns ``NPY_SUCCEED`` or ``NPY_FAIL``. @@ -887,7 +895,11 @@ Construction and Destruction .. cfunction:: npy_intp NpyIter_GetIterSize(NpyIter* iter) Returns the number of elements being iterated. This is the product - of all the dimensions in the shape. + of all the dimensions in the shape. When a multi index is being tracked + (and `NpyIter_RemoveAxis` may be called) the size may be ``-1`` to + indicate an iterator is too large. Such an iterator is invalid, but + may become valid after `NpyIter_RemoveAxis` is called. It is not + necessary to check for this case. .. cfunction:: npy_intp NpyIter_GetIterIndex(NpyIter* iter) diff --git a/numpy/core/src/multiarray/multiarray_tests.c.src b/numpy/core/src/multiarray/multiarray_tests.c.src index 6285d0d82..bd0366bd5 100644 --- a/numpy/core/src/multiarray/multiarray_tests.c.src +++ b/numpy/core/src/multiarray/multiarray_tests.c.src @@ -720,6 +720,109 @@ array_indexing(PyObject *NPY_UNUSED(self), PyObject *args) } +/* + * Test nditer of too large arrays using remove axis, etc. + */ +static PyObject * +test_nditer_too_large(PyObject *NPY_UNUSED(self), PyObject *args) { + NpyIter *iter; + PyObject *array_tuple, *arr; + PyArrayObject *arrays[NPY_MAXARGS]; + npy_uint32 op_flags[NPY_MAXARGS]; + Py_ssize_t nop; + int i, axis, mode; + + npy_intp index[NPY_MAXARGS] = {0}; + char *msg; + + if (!PyArg_ParseTuple(args, "Oii", &array_tuple, &axis, &mode)) { + return NULL; + } + + if (!PyTuple_CheckExact(array_tuple)) { + PyErr_SetString(PyExc_ValueError, "tuple required as first argument"); + return NULL; + } + nop = PyTuple_Size(array_tuple); + if (nop > NPY_MAXARGS) { + PyErr_SetString(PyExc_ValueError, "tuple must be smaller then maxargs"); + return NULL; + } + + for (i=0; i < nop; i++) { + arr = PyTuple_GET_ITEM(array_tuple, i); + if (!PyArray_CheckExact(arr)) { + PyErr_SetString(PyExc_ValueError, "require base class ndarray"); + return NULL; + } + arrays[i] = (PyArrayObject *)arr; + op_flags[i] = NPY_ITER_READONLY; + } + + iter = NpyIter_MultiNew(nop, arrays, NPY_ITER_MULTI_INDEX | NPY_ITER_RANGED, + NPY_KEEPORDER, NPY_NO_CASTING, op_flags, NULL); + + if (iter == NULL) { + return NULL; + } + + /* Remove an axis (negative, do not remove any) */ + if (axis >= 0) { + if (!NpyIter_RemoveAxis(iter, axis)) { + goto fail; + } + } + + switch (mode) { + /* Test IterNext getting */ + case 0: + if (NpyIter_GetIterNext(iter, NULL) == NULL) { + goto fail; + } + break; + case 1: + if (NpyIter_GetIterNext(iter, &msg) == NULL) { + PyErr_SetString(PyExc_ValueError, msg); + goto fail; + } + break; + /* Test Multi Index removal */ + case 2: + if (!NpyIter_RemoveMultiIndex(iter)) { + goto fail; + } + break; + /* Test GotoMultiIndex (just 0 hardcoded) */ + case 3: + if (!NpyIter_GotoMultiIndex(iter, index)) { + goto fail; + } + break; + /* Test setting iterrange (hardcoded range of 0, 1) */ + case 4: + if (!NpyIter_ResetToIterIndexRange(iter, 0, 1, NULL)) { + goto fail; + } + break; + case 5: + if (!NpyIter_ResetToIterIndexRange(iter, 0, 1, &msg)) { + PyErr_SetString(PyExc_ValueError, msg); + goto fail; + } + break; + /* Do nothing */ + default: + break; + } + + NpyIter_Deallocate(iter); + Py_RETURN_NONE; + fail: + NpyIter_Deallocate(iter); + return NULL; +} + + static PyMethodDef Multiarray_TestsMethods[] = { {"test_neighborhood_iterator", test_neighborhood_iterator, @@ -747,6 +850,9 @@ static PyMethodDef Multiarray_TestsMethods[] = { {"array_indexing", array_indexing, METH_VARARGS, NULL}, + {"test_nditer_too_large", + test_nditer_too_large, + METH_VARARGS, NULL}, {NULL, NULL, 0, NULL} /* Sentinel */ }; diff --git a/numpy/core/src/multiarray/nditer_api.c b/numpy/core/src/multiarray/nditer_api.c index d2c956426..ca407b1c1 100644 --- a/numpy/core/src/multiarray/nditer_api.c +++ b/numpy/core/src/multiarray/nditer_api.c @@ -14,6 +14,7 @@ /* Indicate that this .c file is allowed to include the header */ #define NPY_ITERATOR_IMPLEMENTATION_CODE #include "nditer_impl.h" +#include "scalarmathmodule.h" /* Internal helper functions private to this file */ static npy_intp @@ -127,13 +128,23 @@ NpyIter_RemoveAxis(NpyIter *iter, int axis) perm[idim] = p; } - /* Adjust the iteration size */ - NIT_ITERSIZE(iter) /= NAD_SHAPE(axisdata_del); - /* Shift all the axisdata structures by one */ axisdata = NIT_INDEX_AXISDATA(axisdata_del, 1); memmove(axisdata_del, axisdata, (ndim-1-xdim)*sizeof_axisdata); + /* Adjust the iteration size and reset iterend */ + NIT_ITERSIZE(iter) = 1; + axisdata = NIT_AXISDATA(iter); + for (idim = 0; idim < ndim-1; ++idim) { + if (npy_mul_with_overflow_intp(&NIT_ITERSIZE(iter), + NIT_ITERSIZE(iter), NAD_SHAPE(axisdata))) { + NIT_ITERSIZE(iter) = -1; + break; + } + NIT_ADVANCE_AXISDATA(axisdata, 1); + } + NIT_ITEREND(iter) = NIT_ITERSIZE(iter); + /* Shrink the iterator */ NIT_NDIM(iter) = ndim - 1; /* If it is now 0-d fill the singleton dimension */ @@ -166,6 +177,11 @@ NpyIter_RemoveMultiIndex(NpyIter *iter) itflags = NIT_ITFLAGS(iter); if (itflags&NPY_ITFLAG_HASMULTIINDEX) { + if (NIT_ITERSIZE(iter) < 0) { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + return NPY_FAIL; + } + NIT_ITFLAGS(iter) = itflags & ~NPY_ITFLAG_HASMULTIINDEX; npyiter_coalesce_axes(iter); } @@ -349,6 +365,15 @@ NpyIter_ResetToIterIndexRange(NpyIter *iter, } if (istart < 0 || iend > NIT_ITERSIZE(iter)) { + if (NIT_ITERSIZE(iter) < 0) { + if (errmsg == NULL) { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + } + else { + *errmsg = "iterator is too large"; + } + return NPY_FAIL; + } if (errmsg == NULL) { PyErr_Format(PyExc_ValueError, "Out-of-bounds range [%d, %d) passed to " @@ -454,6 +479,10 @@ NpyIter_GotoMultiIndex(NpyIter *iter, npy_intp *multi_index) } if (iterindex < NIT_ITERSTART(iter) || iterindex >= NIT_ITEREND(iter)) { + if (NIT_ITERSIZE(iter) < 0) { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + return NPY_FAIL; + } PyErr_SetString(PyExc_IndexError, "Iterator GotoMultiIndex called with a multi-index outside the " "restricted iteration range"); @@ -574,6 +603,10 @@ NpyIter_GotoIterIndex(NpyIter *iter, npy_intp iterindex) } if (iterindex < NIT_ITERSTART(iter) || iterindex >= NIT_ITEREND(iter)) { + if (NIT_ITERSIZE(iter) < 0) { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + return NPY_FAIL; + } PyErr_SetString(PyExc_IndexError, "Iterator GotoIterIndex called with an iterindex outside the " "iteration range."); diff --git a/numpy/core/src/multiarray/nditer_constr.c b/numpy/core/src/multiarray/nditer_constr.c index 47523863e..41f0b97c4 100644 --- a/numpy/core/src/multiarray/nditer_constr.c +++ b/numpy/core/src/multiarray/nditer_constr.c @@ -1656,8 +1656,21 @@ npyiter_fill_axisdata(NpyIter *iter, npy_uint32 flags, npyiter_opitflags *op_itf for (idim = 0; idim < ndim; ++idim) { if (npy_mul_with_overflow_intp(&NIT_ITERSIZE(iter), NIT_ITERSIZE(iter), broadcast_shape[idim])) { - PyErr_SetString(PyExc_ValueError, "iterator is too large"); - return 0; + if ((itflags & NPY_ITFLAG_HASMULTIINDEX) && + !(itflags & NPY_ITFLAG_HASINDEX) && + !(itflags & NPY_ITFLAG_BUFFER)) { + /* + * If RemoveAxis may be called, the size check is delayed + * until either the multi index is removed, or GetIterNext + * is called. + */ + NIT_ITERSIZE(iter) = -1; + break; + } + else { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + return 0; + } } } /* The range defaults to everything */ diff --git a/numpy/core/src/multiarray/nditer_pywrap.c b/numpy/core/src/multiarray/nditer_pywrap.c index 339a713cd..1ca5de8d6 100644 --- a/numpy/core/src/multiarray/nditer_pywrap.c +++ b/numpy/core/src/multiarray/nditer_pywrap.c @@ -37,12 +37,16 @@ struct NewNpyArrayIterObject_tag { char writeflags[NPY_MAXARGS]; }; -static void npyiter_cache_values(NewNpyArrayIterObject *self) +static int npyiter_cache_values(NewNpyArrayIterObject *self) { NpyIter *iter = self->iter; /* iternext and get_multi_index functions */ self->iternext = NpyIter_GetIterNext(iter, NULL); + if (self->iternext == NULL) { + return -1; + } + if (NpyIter_HasMultiIndex(iter) && !NpyIter_HasDelayedBufAlloc(iter)) { self->get_multi_index = NpyIter_GetGetMultiIndex(iter, NULL); } @@ -67,6 +71,7 @@ static void npyiter_cache_values(NewNpyArrayIterObject *self) /* The read/write settings */ NpyIter_GetReadFlags(iter, self->readflags); NpyIter_GetWriteFlags(iter, self->writeflags); + return 0; } static PyObject * @@ -803,7 +808,9 @@ npyiter_init(NewNpyArrayIterObject *self, PyObject *args, PyObject *kwds) } /* Cache some values for the member functions to use */ - npyiter_cache_values(self); + if (npyiter_cache_values(self) < 0) { + goto fail; + } if (NpyIter_GetIterSize(self->iter) == 0) { self->started = 1; @@ -1068,7 +1075,10 @@ NpyIter_NestedIters(PyObject *NPY_UNUSED(self), } /* Cache some values for the member functions to use */ - npyiter_cache_values(iter); + if (npyiter_cache_values(iter) < 0) { + Py_DECREF(ret); + goto fail; + } if (NpyIter_GetIterSize(iter->iter) == 0) { iter->started = 1; @@ -1242,7 +1252,10 @@ npyiter_copy(NewNpyArrayIterObject *self) } /* Cache some values for the member functions to use */ - npyiter_cache_values(iter); + if (npyiter_cache_values(iter) < 0) { + Py_DECREF(iter); + return NULL; + } iter->started = self->started; iter->finished = self->finished; @@ -1287,7 +1300,9 @@ npyiter_remove_axis(NewNpyArrayIterObject *self, PyObject *args) return NULL; } /* RemoveAxis invalidates cached values */ - npyiter_cache_values(self); + if (npyiter_cache_values(self) < 0) { + return NULL; + } /* RemoveAxis also resets the iterator */ if (NpyIter_GetIterSize(self->iter) == 0) { self->started = 1; diff --git a/numpy/core/src/multiarray/nditer_templ.c.src b/numpy/core/src/multiarray/nditer_templ.c.src index 59aae244b..8976b132e 100644 --- a/numpy/core/src/multiarray/nditer_templ.c.src +++ b/numpy/core/src/multiarray/nditer_templ.c.src @@ -347,6 +347,16 @@ NpyIter_GetIterNext(NpyIter *iter, char **errmsg) int ndim = NIT_NDIM(iter); int nop = NIT_NOP(iter); + if (NIT_ITERSIZE(iter) < 0) { + if (errmsg == NULL) { + PyErr_SetString(PyExc_ValueError, "iterator is too large"); + } + else { + *errmsg = "iterator is too large"; + } + return NULL; + } + /* * When there is just one iteration and buffering is disabled * the iternext function is very simple. diff --git a/numpy/core/tests/test_nditer.py b/numpy/core/tests/test_nditer.py index 3eac5428f..0055c038b 100644 --- a/numpy/core/tests/test_nditer.py +++ b/numpy/core/tests/test_nditer.py @@ -6,7 +6,7 @@ import numpy as np from numpy import array, arange, nditer, all from numpy.compat import asbytes, sixu from numpy.testing import * - +from numpy.core.multiarray_tests import test_nditer_too_large def iter_multi_index(i): @@ -2586,6 +2586,44 @@ def test_iter_too_large(): size = np.iinfo(np.intp).max // 1024 arr = np.lib.stride_tricks.as_strided(np.zeros(1), (size,), (0,)) assert_raises(ValueError, nditer, (arr, arr[:, None])) + # test the same for multiindex. That may get more interesting when + # removing 0 dimensional axis is allowed (since an iterator can grow then) + assert_raises(ValueError, nditer, + (arr, arr[:, None]), flags=['multi_index']) + + +def test_iter_too_large_with_multiindex(): + # When a multi index is being tracked, the error is delayed this + # checks the delayed error messages and getting below that by + # removing an axis. + base_size = 2**10 + num = 1 + while base_size**num < np.iinfo(np.intp).max: + num += 1 + + shape_template = [1, 1] * num + arrays = [] + for i in range(num): + shape = shape_template[:] + shape[i * 2] = 2**10 + arrays.append(np.empty(shape)) + arrays = tuple(arrays) + + # arrays are now too large to be broadcast. The different modes test + # different nditer functionality with or without GIL. + for mode in range(6): + assert_raises(ValueError, test_nditer_too_large, arrays, -1, mode) + # but if we do nothing with the nditer, it can be constructed: + test_nditer_too_large(arrays, -1, 7) + + # When an axis is removed, things should work again (half the time): + for i in range(num): + for mode in range(6): + # an axis with size 1024 is removed: + test_nditer_too_large(arrays, i*2, mode) + # an axis with size 1 is removed: + assert_raises(ValueError, test_nditer_too_large, + arrays, i*2 + 1, mode) if __name__ == "__main__": diff --git a/numpy/linalg/tests/test_regression.py b/numpy/linalg/tests/test_regression.py index 4ff14a6a5..b0b2b2e30 100644 --- a/numpy/linalg/tests/test_regression.py +++ b/numpy/linalg/tests/test_regression.py @@ -69,5 +69,11 @@ class TestRegression(TestCase): bp = linalg.cholesky(b) assert_array_equal(ap, bp) + def test_large_svd_32bit(self): + # See gh-4442, 64bit would require very large/slow matrices. + x = np.eye(1000, 66) + np.linalg.svd(x) + + if __name__ == '__main__': run_module_suite() |