From c5176ed63306d4abceca7d76d6a3ef063ed5a6b2 Mon Sep 17 00:00:00 2001
From: Pauli Virtanen
Date: Mon, 5 Sep 2016 20:41:56 +0200
Subject: ENH: NpyIter: add a flag to handle read/write operand overlap

Add a new NPY_ITER_COPY_IF_OVERLAP iterator flag to NpyIter, which
instructs it to check if read operands overlap with write operands in
memory, and make temporary copies to eliminate detected overlap.

Thanks to Sebastian Berg.
---
 doc/source/reference/c-api.iterator.rst | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

(limited to 'doc/source/reference/c-api.iterator.rst')

diff --git a/doc/source/reference/c-api.iterator.rst b/doc/source/reference/c-api.iterator.rst
index b38c21390..b5d00f4be 100644
--- a/doc/source/reference/c-api.iterator.rst
+++ b/doc/source/reference/c-api.iterator.rst
@@ -461,6 +461,33 @@ Construction and Destruction
             Then, call :c:func:`NpyIter_Reset` to allocate and fill the
             buffers with their initial values.
 
+        .. c:var:: NPY_ITER_COPY_IF_OVERLAP
+
+            If a write operand has overlap with a read operand, eliminate all
+            overlap by making temporary copies (with UPDATEIFCOPY for write
+            operands).
+
+            Overlapping means:
+
+            - For a (read, write) pair of operands, there is a memory address
+              that contains data common to both arrays, which can be reached
+              via *different* index/dtype/shape combinations.
+
+            - In particular, unless the arrays have the same shape, dtype,
+              strides, and start address, any shared common data byte accessible
+              by indexing implies overlap.
+
+            Because exact overlap detection has exponential runtime
+            in the number of dimensions, the decision is made based
+            on heuristics, which has false positives (needless copies in unusual
+            cases) but has no false negatives.
+
+            If read/write overlap exists and write operands are modified in the
+            iterator loop element-wise, this flag ensures the result of the
+            operation is the same as if all operands were copied.
+            In cases where copies would need to be made, **the result of the
+            computation may be undefined without this flag!**
+
     Flags that may be passed in ``op_flags[i]``, where ``0 <= i < nop``:
 
         .. c:var:: NPY_ITER_READWRITE
-- 
cgit v1.2.1
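
Note on the documented behaviour: the sketch below illustrates, in plain Python, why an element-wise loop over overlapping read and write operands can give a different result than the same loop run on copies, and how a conservative check can report false positives without ever missing a real overlap. It is not NumPy's implementation. The helper names (``byte_extent``, ``may_overlap``, ``shifted_copy``) are hypothetical, the check is a simple extent comparison chosen for illustration, and for brevity the sketch copies the read operand, whereas the patch text describes temporary copies with UPDATEIFCOPY for write operands.

```python
def byte_extent(offset, shape, strides, itemsize):
    """Return the half-open byte range [low, high) touched by a strided view."""
    if any(n == 0 for n in shape):
        return (offset, offset)  # an empty view touches no memory
    low = high = offset
    for n, s in zip(shape, strides):
        span = (n - 1) * s
        if span < 0:
            low += span   # negative strides extend the range downwards
        else:
            high += span
    return (low, high + itemsize)


def may_overlap(view_a, view_b):
    """Conservative check: True whenever the two byte extents intersect.

    Interleaved views are reported as overlapping (a false positive),
    but views that really share a byte are never missed.
    """
    a_low, a_high = byte_extent(*view_a)
    b_low, b_high = byte_extent(*view_b)
    if a_low == a_high or b_low == b_high:
        return False  # at least one view is empty
    return a_low < b_high and b_low < a_high


def shifted_copy(buf, copy_if_overlap):
    """Element-wise out[i] = src[i], with src = buf[:-1] and out = buf[1:].

    Each list element is treated as one byte-sized item.
    """
    n = len(buf) - 1
    src_view = (0, (n,), (1,), 1)  # (offset, shape, strides, itemsize)
    out_view = (1, (n,), (1,), 1)
    src = buf
    if copy_if_overlap and may_overlap(src_view, out_view):
        src = list(buf)  # temporary copy removes the read/write overlap
    for i in range(n):
        buf[i + 1] = src[i]
    return buf


if __name__ == "__main__":
    print(shifted_copy([1, 2, 3, 4, 5], copy_if_overlap=False))
    print(shifted_copy([1, 2, 3, 4, 5], copy_if_overlap=True))
```

Without the copy, each write is read back by the next loop step, so the first element is smeared across the whole buffer. With the copy, the loop produces the shifted data that an operation on independent arrays would give. The extent comparison also flags the interleaved views ``buf[::2]`` and ``buf[1::2]``, which share no element: a needless copy, but never a wrong result.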