From b3239c67acc85c2bd649548598971ac9b390bad0 Mon Sep 17 00:00:00 2001
From: Chiara Marmo <chiara.marmo@inria.fr>
Date: Mon, 2 Aug 2021 20:22:31 +0200
Subject: Dedent list of macros.

---
 doc/source/reference/c-api/array.rst    |   24 +-
 doc/source/reference/c-api/iterator.rst | 1157 ++++++++++++++++---------------
 2 files changed, 592 insertions(+), 589 deletions(-)

(limited to 'doc/source/reference')

diff --git a/doc/source/reference/c-api/array.rst b/doc/source/reference/c-api/array.rst
index 901affe8b..26a8f643d 100644
--- a/doc/source/reference/c-api/array.rst
+++ b/doc/source/reference/c-api/array.rst
@@ -1540,19 +1540,6 @@ specify desired properties of the new array.
 
     Make sure the resulting object is an actual ndarray, and not a sub-class.
 
-These constant are used in :c:func:`PyArray_DescrNewByteorder` to describe the
-byteorder of the new datatype.
-
-.. c:macro:: NPY_IGNORE
-
-.. c:macro:: NPY_SWAP
-
-.. c:macro:: NPY_NATIVE
-
-.. c:macro:: NPY_LITTLE
-
-.. c:macro:: NPY_BIG
-
 
 Flag checking
 ^^^^^^^^^^^^^
@@ -2849,6 +2836,17 @@ Data-type descriptors
     *newendian*. All referenced data-type objects (in subdescr and
     fields members of the data-type object) are also changed
     (recursively).
+
+    The value of *newendian* is one of these macros:
+..
+    dedent the enumeration of macros to avoid Sphinx missing-reference warnings
+
+.. c:macro:: NPY_IGNORE
+             NPY_SWAP
+             NPY_NATIVE
+             NPY_LITTLE
+             NPY_BIG
+
     If a byteorder of :c:data:`NPY_IGNORE` is encountered it
     is left alone. If newendian is :c:data:`NPY_SWAP`, then all byte-orders
     are swapped. Other valid newendian values are :c:data:`NPY_NATIVE`,
diff --git a/doc/source/reference/c-api/iterator.rst b/doc/source/reference/c-api/iterator.rst
index 1d09bcef8..2208cdd2f 100644
--- a/doc/source/reference/c-api/iterator.rst
+++ b/doc/source/reference/c-api/iterator.rst
@@ -310,777 +310,782 @@ Construction and Destruction
     Returns NULL if there is an error, otherwise returns the allocated
     iterator.
 
-.. c:function:: NpyIter* NpyIter_AdvancedNew( \
-        npy_intp nop, PyArrayObject** op, npy_uint32 flags, NPY_ORDER order, \
-        NPY_CASTING casting, npy_uint32* op_flags, PyArray_Descr** op_dtypes, \
-        int oa_ndim, int** op_axes, npy_intp const* itershape, npy_intp buffersize)
-
-    Extends :c:func:`NpyIter_MultiNew` with several advanced options providing
-    more control over broadcasting and buffering.
-
-    If -1/NULL values are passed to ``oa_ndim``, ``op_axes``, ``itershape``,
-    and ``buffersize``, it is equivalent to :c:func:`NpyIter_MultiNew`.
+    Flags that may be passed in ``flags``, applying to the whole
+    iterator, are:
+..
+    dedent the enumeration of flags to avoid Sphinx missing-reference warnings
 
-    The parameter ``oa_ndim``, when not zero or -1, specifies the number of
-    dimensions that will be iterated with customized broadcasting.
-    If it is provided, ``op_axes`` must and ``itershape`` can also be provided.
-    The ``op_axes`` parameter let you control in detail how the
-    axes of the operand arrays get matched together and iterated.
-    In ``op_axes``, you must provide an array of ``nop`` pointers
-    to ``oa_ndim``-sized arrays of type ``npy_intp``.  If an entry
-    in ``op_axes`` is NULL, normal broadcasting rules will apply.
-    In ``op_axes[j][i]`` is stored either a valid axis of ``op[j]``, or
-    -1 which means ``newaxis``.  Within each ``op_axes[j]`` array, axes
-    may not be repeated.  The following example is how normal broadcasting
-    applies to a 3-D array, a 2-D array, a 1-D array and a scalar.
+.. c:macro:: NPY_ITER_C_INDEX
 
-    **Note**: Before NumPy 1.8 ``oa_ndim == 0` was used for signalling that
-    that ``op_axes`` and ``itershape`` are unused. This is deprecated and
-    should be replaced with -1. Better backward compatibility may be
-    achieved by using :c:func:`NpyIter_MultiNew` for this case.
+    Causes the iterator to track a raveled flat index matching C
+    order. This option cannot be used with :c:data:`NPY_ITER_F_INDEX`.
 
-    .. code-block:: c
+.. c:macro:: NPY_ITER_F_INDEX
 
-        int oa_ndim = 3;               /* # iteration axes */
-        int op0_axes[] = {0, 1, 2};    /* 3-D operand */
-        int op1_axes[] = {-1, 0, 1};   /* 2-D operand */
-        int op2_axes[] = {-1, -1, 0};  /* 1-D operand */
-        int op3_axes[] = {-1, -1, -1}  /* 0-D (scalar) operand */
-        int* op_axes[] = {op0_axes, op1_axes, op2_axes, op3_axes};
+    Causes the iterator to track a raveled flat index matching Fortran
+    order. This option cannot be used with :c:data:`NPY_ITER_C_INDEX`.
 
-    The ``itershape`` parameter allows you to force the iterator
-    to have a specific iteration shape. It is an array of length
-    ``oa_ndim``. When an entry is negative, its value is determined
-    from the operands. This parameter allows automatically allocated
-    outputs to get additional dimensions which don't match up with
-    any dimension of an input.
+.. c:macro:: NPY_ITER_MULTI_INDEX
 
-    If ``buffersize`` is zero, a default buffer size is used,
-    otherwise it specifies how big of a buffer to use.  Buffers
-    which are powers of 2 such as 4096 or 8192 are recommended.
+    Causes the iterator to track a multi-index.
+    This prevents the iterator from coalescing axes to
+    produce bigger inner loops. If the loop is also not buffered
+    and no index is being tracked (`NpyIter_RemoveAxis` can be called),
+    then the iterator size can be ``-1`` to indicate that the iterator
+    is too large. This can happen due to complex broadcasting and
+    will result in errors being created when setting the iterator
+    range, removing the multi index, or getting the next function.
+    However, it is possible to remove axes again and use the iterator
+    normally if the size is small enough after removal.
 
-    Returns NULL if there is an error, otherwise returns the allocated
-    iterator.
+.. c:macro:: NPY_ITER_EXTERNAL_LOOP
 
-.. c:function:: NpyIter* NpyIter_Copy(NpyIter* iter)
+    Causes the iterator to skip iteration of the innermost
+    loop, requiring the user of the iterator to handle it.
 
-    Makes a copy of the given iterator.  This function is provided
-    primarily to enable multi-threaded iteration of the data.
+    This flag is incompatible with :c:data:`NPY_ITER_C_INDEX`,
+    :c:data:`NPY_ITER_F_INDEX`, and :c:data:`NPY_ITER_MULTI_INDEX`.
 
-    *TODO*: Move this to a section about multithreaded iteration.
+.. c:macro:: NPY_ITER_DONT_NEGATE_STRIDES
 
-    The recommended approach to multithreaded iteration is to
-    first create an iterator with the flags
-    :c:data:`NPY_ITER_EXTERNAL_LOOP`, :c:data:`NPY_ITER_RANGED`,
-    :c:data:`NPY_ITER_BUFFERED`, :c:data:`NPY_ITER_DELAY_BUFALLOC`, and
-    possibly :c:data:`NPY_ITER_GROWINNER`.  Create a copy of this iterator
-    for each thread (minus one for the first iterator).  Then, take
-    the iteration index range ``[0, NpyIter_GetIterSize(iter))`` and
-    split it up into tasks, for example using a TBB parallel_for loop.
-    When a thread gets a task to execute, it then uses its copy of
-    the iterator by calling :c:func:`NpyIter_ResetToIterIndexRange` and
-    iterating over the full range.
+    This only affects the iterator when :c:type:`NPY_KEEPORDER` is
+    specified for the order parameter.  By default with
+    :c:type:`NPY_KEEPORDER`, the iterator reverses axes which have
+    negative strides, so that memory is traversed in a forward
+    direction.  This disables this step.  Use this flag if you
+    want to use the underlying memory-ordering of the axes,
+    but don't want an axis reversed. This is the behavior of
+    ``numpy.ravel(a, order='K')``, for instance.
 
-    When using the iterator in multi-threaded code or in code not
-    holding the Python GIL, care must be taken to only call functions
-    which are safe in that context.  :c:func:`NpyIter_Copy` cannot be safely
-    called without the Python GIL, because it increments Python
-    references.  The ``Reset*`` and some other functions may be safely
-    called by passing in the ``errmsg`` parameter as non-NULL, so that
-    the functions will pass back errors through it instead of setting
-    a Python exception.
+.. c:macro:: NPY_ITER_COMMON_DTYPE
 
-    :c:func:`NpyIter_Deallocate` must be called for each copy.
+    Causes the iterator to convert all the operands to a common
+    data type, calculated based on the ufunc type promotion rules.
+    Copying or buffering must be enabled.
 
-.. c:function:: int NpyIter_RemoveAxis(NpyIter* iter, int axis)
+    If the common data type is known ahead of time, don't use this
+    flag.  Instead, set the requested dtype for all the operands.
 
-    Removes an axis from iteration.  This requires that
-    :c:data:`NPY_ITER_MULTI_INDEX` was set for iterator creation, and does
-    not work if buffering is enabled or an index is being tracked. This
-    function also resets the iterator to its initial state.
+.. c:macro:: NPY_ITER_REFS_OK
 
-    This is useful for setting up an accumulation loop, for example.
-    The iterator can first be created with all the dimensions, including
-    the accumulation axis, so that the output gets created correctly.
-    Then, the accumulation axis can be removed, and the calculation
-    done in a nested fashion.
+    Indicates that arrays with reference types (object
+    arrays or structured arrays containing an object type)
+    may be accepted and used in the iterator.  If this flag
+    is enabled, the caller must be sure to check whether
+    :c:expr:`NpyIter_IterationNeedsAPI(iter)` is true, in which case
+    it may not release the GIL during iteration.
 
-    **WARNING**: This function may change the internal memory layout of
-    the iterator.  Any cached functions or pointers from the iterator
-    must be retrieved again! The iterator range will be reset as well.
+.. c:macro:: NPY_ITER_ZEROSIZE_OK
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    Indicates that arrays with a size of zero should be permitted.
+    Since the typical iteration loop does not naturally work with
+    zero-sized arrays, you must check that the IterSize is larger
+    than zero before entering the iteration loop.
+    Currently only the operands are checked, not a forced shape.
 
+.. c:macro:: NPY_ITER_REDUCE_OK
 
-.. c:function:: int NpyIter_RemoveMultiIndex(NpyIter* iter)
+    Permits writeable operands with a dimension with zero
+    stride and size greater than one.  Note that such operands
+    must be read/write.
 
-    If the iterator is tracking a multi-index, this strips support for them,
-    and does further iterator optimizations that are possible if multi-indices
-    are not needed.  This function also resets the iterator to its initial
-    state.
+    When buffering is enabled, this also switches to a special
+    buffering mode which reduces the loop length as necessary to
+    not trample on values being reduced.
 
-    **WARNING**: This function may change the internal memory layout of
-    the iterator.  Any cached functions or pointers from the iterator
-    must be retrieved again!
+    Note that if you want to do a reduction on an automatically
+    allocated output, you must use :c:func:`NpyIter_GetOperandArray`
+    to get its reference, then set every value to the reduction
+    unit before doing the iteration loop.  In the case of a
+    buffered reduction, this means you must also specify the
+    flag :c:data:`NPY_ITER_DELAY_BUFALLOC`, then reset the iterator
+    after initializing the allocated operand to prepare the
+    buffers.
 
-    After calling this function, :c:expr:`NpyIter_HasMultiIndex(iter)` will
-    return false.
+.. c:macro:: NPY_ITER_RANGED
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    Enables support for iteration of sub-ranges of the full
+    ``iterindex`` range ``[0, NpyIter_GetIterSize(iter))``.  Use
+    the function :c:func:`NpyIter_ResetToIterIndexRange` to specify
+    a range for iteration.
 
-.. c:function:: int NpyIter_EnableExternalLoop(NpyIter* iter)
+    This flag can only be used with :c:data:`NPY_ITER_EXTERNAL_LOOP`
+    when :c:data:`NPY_ITER_BUFFERED` is enabled.  This is because
+    without buffering, the inner loop is always the size of the
+    innermost iteration dimension, and allowing it to get cut up
+    would require special handling, effectively making it more
+    like the buffered version.
 
-    If :c:func:`NpyIter_RemoveMultiIndex` was called, you may want to enable the
-    flag :c:data:`NPY_ITER_EXTERNAL_LOOP`.  This flag is not permitted
-    together with :c:data:`NPY_ITER_MULTI_INDEX`, so this function is provided
-    to enable the feature after :c:func:`NpyIter_RemoveMultiIndex` is called.
-    This function also resets the iterator to its initial state.
+.. c:macro:: NPY_ITER_BUFFERED
 
-    **WARNING**: This function changes the internal logic of the iterator.
-    Any cached functions or pointers from the iterator must be retrieved
-    again!
+    Causes the iterator to store buffering data, and use buffering
+    to satisfy data type, alignment, and byte-order requirements.
+    To buffer an operand, do not specify the :c:data:`NPY_ITER_COPY`
+    or :c:data:`NPY_ITER_UPDATEIFCOPY` flags, because they will
+    override buffering.  Buffering is especially useful for Python
+    code using the iterator, allowing for larger chunks
+    of data at once to amortize the Python interpreter overhead.
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    If used with :c:data:`NPY_ITER_EXTERNAL_LOOP`, the inner loop
+    for the caller may get larger chunks than would be possible
+    without buffering, because of how the strides are laid out.
 
-.. c:function:: int NpyIter_Deallocate(NpyIter* iter)
+    Note that if an operand is given the flag :c:data:`NPY_ITER_COPY`
+    or :c:data:`NPY_ITER_UPDATEIFCOPY`, a copy will be made in preference
+    to buffering.  Buffering will still occur when the array was
+    broadcast so elements need to be duplicated to get a constant
+    stride.
 
-    Deallocates the iterator object and resolves any needed writebacks.
+    In normal buffering, the size of each inner loop is equal
+    to the buffer size, or possibly larger if
+    :c:data:`NPY_ITER_GROWINNER` is specified.  If
+    :c:data:`NPY_ITER_REDUCE_OK` is enabled and a reduction occurs,
+    the inner loops may become smaller depending
+    on the structure of the reduction.
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+.. c:macro:: NPY_ITER_GROWINNER
 
-.. c:function:: int NpyIter_Reset(NpyIter* iter, char** errmsg)
+    When buffering is enabled, this allows the size of the inner
+    loop to grow when buffering isn't necessary.  This option
+    is best used if you're doing a straight pass through all the
+    data, rather than anything with small cache-friendly arrays
+    of temporary values for each inner loop.
 
-    Resets the iterator back to its initial state, at the beginning
-    of the iteration range.
+.. c:macro:: NPY_ITER_DELAY_BUFALLOC
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
-    no Python exception is set when ``NPY_FAIL`` is returned.
-    Instead, \*errmsg is set to an error message.  When errmsg is
-    non-NULL, the function may be safely called without holding
-    the Python GIL.
+    When buffering is enabled, this delays allocation of the
+    buffers until :c:func:`NpyIter_Reset` or another reset function is
+    called.  This flag exists to avoid wasteful copying of
+    buffer data when making multiple copies of a buffered
+    iterator for multi-threaded iteration.
 
-.. c:function:: int NpyIter_ResetToIterIndexRange( \
-        NpyIter* iter, npy_intp istart, npy_intp iend, char** errmsg)
+    Another use of this flag is for setting up reduction operations.
+    After the iterator is created, and a reduction output
+    is allocated automatically by the iterator (be sure to use
+    READWRITE access), its value may be initialized to the reduction
+    unit.  Use :c:func:`NpyIter_GetOperandArray` to get the object.
+    Then, call :c:func:`NpyIter_Reset` to allocate and fill the buffers
+    with their initial values.
 
-    Resets the iterator and restricts it to the ``iterindex`` range
-    ``[istart, iend)``.  See :c:func:`NpyIter_Copy` for an explanation of
-    how to use this for multi-threaded iteration.  This requires that
-    the flag :c:data:`NPY_ITER_RANGED` was passed to the iterator constructor.
+.. c:macro:: NPY_ITER_COPY_IF_OVERLAP
 
-    If you want to reset both the ``iterindex`` range and the base
-    pointers at the same time, you can do the following to avoid
-    extra buffer copying (be sure to add the return code error checks
-    when you copy this code).
+    If any write operand has overlap with any read operand, eliminate all
+    overlap by making temporary copies (enabling UPDATEIFCOPY for write
+    operands, if necessary). A pair of operands has overlap if there is
+    a memory address that contains data common to both arrays.
 
-    .. code-block:: c
+    Because exact overlap detection has exponential runtime
+    in the number of dimensions, the decision is made based
+    on heuristics, which have false positives (needless copies in unusual
+    cases) but no false negatives.
 
-        /* Set to a trivial empty range */
-        NpyIter_ResetToIterIndexRange(iter, 0, 0);
-        /* Set the base pointers */
-        NpyIter_ResetBasePointers(iter, baseptrs);
-        /* Set to the desired range */
-        NpyIter_ResetToIterIndexRange(iter, istart, iend);
+    If any read/write overlap exists, this flag ensures the result of the
+    operation is the same as if all operands were copied.
+    In cases where copies would need to be made, **the result of the
+    computation may be undefined without this flag!**
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
-    no Python exception is set when ``NPY_FAIL`` is returned.
-    Instead, \*errmsg is set to an error message.  When errmsg is
-    non-NULL, the function may be safely called without holding
-    the Python GIL.
+    Flags that may be passed in ``op_flags[i]``, where ``0 <= i < nop``:
+..
+    dedent the enumeration of flags to avoid Sphinx missing-reference warnings
 
-.. c:function:: int NpyIter_ResetBasePointers( \
-        NpyIter *iter, char** baseptrs, char** errmsg)
+.. c:macro:: NPY_ITER_READWRITE
+.. c:macro:: NPY_ITER_READONLY
+.. c:macro:: NPY_ITER_WRITEONLY
 
-    Resets the iterator back to its initial state, but using the values
-    in ``baseptrs`` for the data instead of the pointers from the arrays
-    being iterated.  This functions is intended to be used, together with
-    the ``op_axes`` parameter, by nested iteration code with two or more
-    iterators.
+    Indicate how the user of the iterator will read or write
+    to ``op[i]``.  Exactly one of these flags must be specified
+    per operand. Using ``NPY_ITER_READWRITE`` or ``NPY_ITER_WRITEONLY``
+    for a user-provided operand may trigger ``WRITEBACKIFCOPY``
+    semantics. The data will be written back to the original array
+    when ``NpyIter_Deallocate`` is called.
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
-    no Python exception is set when ``NPY_FAIL`` is returned.
-    Instead, \*errmsg is set to an error message.  When errmsg is
-    non-NULL, the function may be safely called without holding
-    the Python GIL.
+.. c:macro:: NPY_ITER_COPY
 
-    *TODO*: Move the following into a special section on nested iterators.
+    Allow a copy of ``op[i]`` to be made if it does not
+    meet the data type or alignment requirements as specified
+    by the constructor flags and parameters.
 
-    Creating iterators for nested iteration requires some care.  All
-    the iterator operands must match exactly, or the calls to
-    :c:func:`NpyIter_ResetBasePointers` will be invalid.  This means that
-    automatic copies and output allocation should not be used haphazardly.
-    It is possible to still use the automatic data conversion and casting
-    features of the iterator by creating one of the iterators with
-    all the conversion parameters enabled, then grabbing the allocated
-    operands with the :c:func:`NpyIter_GetOperandArray` function and passing
-    them into the constructors for the rest of the iterators.
+.. c:macro:: NPY_ITER_UPDATEIFCOPY
 
-    **WARNING**: When creating iterators for nested iteration,
-    the code must not use a dimension more than once in the different
-    iterators.  If this is done, nested iteration will produce
-    out-of-bounds pointers during iteration.
+    Triggers :c:data:`NPY_ITER_COPY`, and when an array operand
+    is flagged for writing and is copied, causes the data
+    in a copy to be copied back to ``op[i]`` when
+    ``NpyIter_Deallocate`` is called.
 
-    **WARNING**: When creating iterators for nested iteration, buffering
-    can only be applied to the innermost iterator.  If a buffered iterator
-    is used as the source for ``baseptrs``, it will point into a small buffer
-    instead of the array and the inner iteration will be invalid.
+    If the operand is flagged as write-only and a copy is needed,
+    an uninitialized temporary array will be created and then copied
+    back to ``op[i]`` on calling ``NpyIter_Deallocate``, instead of
+    doing the unnecessary copy operation.
 
-    The pattern for using nested iterators is as follows.
+.. c:macro:: NPY_ITER_NBO
+.. c:macro:: NPY_ITER_ALIGNED
+.. c:macro:: NPY_ITER_CONTIG
 
-    .. code-block:: c
+    Causes the iterator to provide data for ``op[i]``
+    that is in native byte order, aligned according to
+    the dtype requirements, contiguous, or any combination.
 
-        NpyIter *iter1, *iter1;
-        NpyIter_IterNextFunc *iternext1, *iternext2;
-        char **dataptrs1;
+    By default, the iterator produces pointers into the
+    arrays provided, which may be aligned or unaligned, and
+    with any byte order.  If copying or buffering is not
+    enabled and the operand data doesn't satisfy the constraints,
+    an error will be raised.
 
-        /*
-         * With the exact same operands, no copies allowed, and
-         * no axis in op_axes used both in iter1 and iter2.
-         * Buffering may be enabled for iter2, but not for iter1.
-         */
-        iter1 = ...; iter2 = ...;
+    The contiguous constraint applies only to the inner loop,
+    successive inner loops may have arbitrary pointer changes.
 
-        iternext1 = NpyIter_GetIterNext(iter1);
-        iternext2 = NpyIter_GetIterNext(iter2);
-        dataptrs1 = NpyIter_GetDataPtrArray(iter1);
+    If the requested data type is in non-native byte order,
+    the NBO flag overrides it and the requested data type is
+    converted to be in native byte order.
 
-        do {
-            NpyIter_ResetBasePointers(iter2, dataptrs1);
-            do {
-                /* Use the iter2 values */
-            } while (iternext2(iter2));
-        } while (iternext1(iter1));
+.. c:macro:: NPY_ITER_ALLOCATE
 
-.. c:function:: int NpyIter_GotoMultiIndex(NpyIter* iter, npy_intp const* multi_index)
+    This is for output arrays, and requires that the flag
+    :c:data:`NPY_ITER_WRITEONLY` or :c:data:`NPY_ITER_READWRITE`
+    be set.  If ``op[i]`` is NULL, creates a new array with
+    the final broadcast dimensions, and a layout matching
+    the iteration order of the iterator.
 
-    Adjusts the iterator to point to the ``ndim`` indices
-    pointed to by ``multi_index``.  Returns an error if a multi-index
-    is not being tracked, the indices are out of bounds,
-    or inner loop iteration is disabled.
+    When ``op[i]`` is NULL, the requested data type
+    ``op_dtypes[i]`` may be NULL as well, in which case it is
+    automatically generated from the dtypes of the arrays which
+    are flagged as readable.  The rules for generating the dtype
+    are the same as for UFuncs.  Of special note is handling
+    of byte order in the selected dtype.  If there is exactly
+    one input, the input's dtype is used as is.  Otherwise,
+    if the dtypes of more than one input are combined, the
+    output will be in native byte order.
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    After being allocated with this flag, the caller may retrieve
+    the new array by calling :c:func:`NpyIter_GetOperandArray` and
+    getting the i-th object in the returned C array.  The caller
+    must call Py_INCREF on it to claim a reference to the array.
 
-.. c:function:: int NpyIter_GotoIndex(NpyIter* iter, npy_intp index)
+.. c:macro:: NPY_ITER_NO_SUBTYPE
 
-    Adjusts the iterator to point to the ``index`` specified.
-    If the iterator was constructed with the flag
-    :c:data:`NPY_ITER_C_INDEX`, ``index`` is the C-order index,
-    and if the iterator was constructed with the flag
-    :c:data:`NPY_ITER_F_INDEX`, ``index`` is the Fortran-order
-    index.  Returns an error if there is no index being tracked,
-    the index is out of bounds, or inner loop iteration is disabled.
+    For use with :c:data:`NPY_ITER_ALLOCATE`, this flag disables
+    allocating an array subtype for the output, forcing
+    it to be a straight ndarray.
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    TODO: Maybe it would be better to introduce a function
+    ``NpyIter_GetWrappedOutput`` and remove this flag?
 
-.. c:function:: npy_intp NpyIter_GetIterSize(NpyIter* iter)
+.. c:macro:: NPY_ITER_NO_BROADCAST
 
-    Returns the number of elements being iterated.  This is the product
-    of all the dimensions in the shape.  When a multi index is being tracked
-    (and `NpyIter_RemoveAxis` may be called) the size may be ``-1`` to
-    indicate an iterator is too large.  Such an iterator is invalid, but
-    may become valid after `NpyIter_RemoveAxis` is called. It is not
-    necessary to check for this case.
+    Ensures that the input or output matches the iteration
+    dimensions exactly.
 
-.. c:function:: npy_intp NpyIter_GetIterIndex(NpyIter* iter)
+.. c:macro:: NPY_ITER_ARRAYMASK
 
-    Gets the ``iterindex`` of the iterator, which is an index matching
-    the iteration order of the iterator.
+    .. versionadded:: 1.7
 
-.. c:function:: void NpyIter_GetIterIndexRange( \
-        NpyIter* iter, npy_intp* istart, npy_intp* iend)
+    Indicates that this operand is the mask to use for
+    selecting elements when writing to operands which have
+    the :c:data:`NPY_ITER_WRITEMASKED` flag applied to them.
+    Only one operand may have :c:data:`NPY_ITER_ARRAYMASK` flag
+    applied to it.
 
-    Gets the ``iterindex`` sub-range that is being iterated.  If
-    :c:data:`NPY_ITER_RANGED` was not specified, this always returns the
-    range ``[0, NpyIter_IterSize(iter))``.
+    The data type of an operand with this flag should be either
+    :c:data:`NPY_BOOL`, :c:data:`NPY_MASK`, or a struct dtype
+    whose fields are all valid mask dtypes. In the latter case,
+    it must match up with a struct operand being WRITEMASKED,
+    as it is specifying a mask for each field of that array.
 
-.. c:function:: int NpyIter_GotoIterIndex(NpyIter* iter, npy_intp iterindex)
+    This flag only affects writing from the buffer back to
+    the array. This means that if the operand is also
+    :c:data:`NPY_ITER_READWRITE` or :c:data:`NPY_ITER_WRITEONLY`,
+    code doing iteration can write to this operand to
+    control which elements will be untouched and which ones will be
+    modified. This is useful when the mask should be a combination
+    of input masks.
 
-    Adjusts the iterator to point to the ``iterindex`` specified.
-    The IterIndex is an index matching the iteration order of the iterator.
-    Returns an error if the ``iterindex`` is out of bounds,
-    buffering is enabled, or inner loop iteration is disabled.
+.. c:macro:: NPY_ITER_WRITEMASKED
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    .. versionadded:: 1.7
 
-.. c:function:: npy_bool NpyIter_HasDelayedBufAlloc(NpyIter* iter)
+    This array is the mask for all `writemasked <numpy.nditer>`
+    operands. Code uses the ``writemasked`` flag which indicates
+    that only elements where the chosen ARRAYMASK operand is True
+    will be written to. In general, the iterator does not enforce
+    this; it is up to the code doing the iteration to follow that
+    promise.
 
-    Returns 1 if the flag :c:data:`NPY_ITER_DELAY_BUFALLOC` was passed
-    to the iterator constructor, and no call to one of the Reset
-    functions has been done yet, 0 otherwise.
+    When the ``writemasked`` flag is used, and this operand is buffered,
+    this changes how data is copied from the buffer into the array.
+    A masked copying routine is used, which only copies the
+    elements in the buffer for which ``writemasked``
+    returns true from the corresponding element in the ARRAYMASK
+    operand.
 
-.. c:function:: npy_bool NpyIter_HasExternalLoop(NpyIter* iter)
+.. c:macro:: NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE
 
-    Returns 1 if the caller needs to handle the inner-most 1-dimensional
-    loop, or 0 if the iterator handles all looping. This is controlled
-    by the constructor flag :c:data:`NPY_ITER_EXTERNAL_LOOP` or
-    :c:func:`NpyIter_EnableExternalLoop`.
+    In memory overlap checks, assume that operands with
+    ``NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE`` enabled are accessed only
+    in the iterator order.
 
-.. c:function:: npy_bool NpyIter_HasMultiIndex(NpyIter* iter)
+    This enables the iterator to reason about data dependency,
+    possibly avoiding unnecessary copies.
 
-    Returns 1 if the iterator was created with the
-    :c:data:`NPY_ITER_MULTI_INDEX` flag, 0 otherwise.
+    This flag has effect only if ``NPY_ITER_COPY_IF_OVERLAP`` is enabled
+    on the iterator.
 
-.. c:function:: npy_bool NpyIter_HasIndex(NpyIter* iter)
+.. c:function:: NpyIter* NpyIter_AdvancedNew( \
+        npy_intp nop, PyArrayObject** op, npy_uint32 flags, NPY_ORDER order, \
+        NPY_CASTING casting, npy_uint32* op_flags, PyArray_Descr** op_dtypes, \
+        int oa_ndim, int** op_axes, npy_intp const* itershape, npy_intp buffersize)
 
-    Returns 1 if the iterator was created with the
-    :c:data:`NPY_ITER_C_INDEX` or :c:data:`NPY_ITER_F_INDEX`
-    flag, 0 otherwise.
+    Extends :c:func:`NpyIter_MultiNew` with several advanced options providing
+    more control over broadcasting and buffering.
 
-.. c:function:: npy_bool NpyIter_RequiresBuffering(NpyIter* iter)
+    If -1/NULL values are passed to ``oa_ndim``, ``op_axes``, ``itershape``,
+    and ``buffersize``, it is equivalent to :c:func:`NpyIter_MultiNew`.
 
-    Returns 1 if the iterator requires buffering, which occurs
-    when an operand needs conversion or alignment and so cannot
-    be used directly.
+    The parameter ``oa_ndim``, when not zero or -1, specifies the number of
+    dimensions that will be iterated with customized broadcasting.
+    If it is provided, ``op_axes`` must and ``itershape`` can also be provided.
+    The ``op_axes`` parameter lets you control in detail how the
+    axes of the operand arrays get matched together and iterated.
+    In ``op_axes``, you must provide an array of ``nop`` pointers
+    to ``oa_ndim``-sized arrays of type ``int``.  If an entry
+    in ``op_axes`` is NULL, normal broadcasting rules will apply.
+    In ``op_axes[j][i]`` is stored either a valid axis of ``op[j]``, or
+    -1 which means ``newaxis``.  Within each ``op_axes[j]`` array, axes
+    may not be repeated.  The following example is how normal broadcasting
+    applies to a 3-D array, a 2-D array, a 1-D array and a scalar.
 
-.. c:function:: npy_bool NpyIter_IsBuffered(NpyIter* iter)
+    **Note**: Before NumPy 1.8 ``oa_ndim == 0`` was used for signalling that
+    ``op_axes`` and ``itershape`` are unused. This is deprecated and
+    should be replaced with -1. Better backward compatibility may be
+    achieved by using :c:func:`NpyIter_MultiNew` for this case.
 
-    Returns 1 if the iterator was created with the
-    :c:data:`NPY_ITER_BUFFERED` flag, 0 otherwise.
+    .. code-block:: c
 
-.. c:function:: npy_bool NpyIter_IsGrowInner(NpyIter* iter)
+        int oa_ndim = 3;               /* # iteration axes */
+        int op0_axes[] = {0, 1, 2};    /* 3-D operand */
+        int op1_axes[] = {-1, 0, 1};   /* 2-D operand */
+        int op2_axes[] = {-1, -1, 0};  /* 1-D operand */
+        int op3_axes[] = {-1, -1, -1}; /* 0-D (scalar) operand */
+        int* op_axes[] = {op0_axes, op1_axes, op2_axes, op3_axes};
 
-    Returns 1 if the iterator was created with the
-    :c:data:`NPY_ITER_GROWINNER` flag, 0 otherwise.
+    The ``itershape`` parameter allows you to force the iterator
+    to have a specific iteration shape. It is an array of length
+    ``oa_ndim``. When an entry is negative, its value is determined
+    from the operands. This parameter allows automatically allocated
+    outputs to get additional dimensions which don't match up with
+    any dimension of an input.
 
-.. c:function:: npy_intp NpyIter_GetBufferSize(NpyIter* iter)
+    If ``buffersize`` is zero, a default buffer size is used,
+    otherwise it specifies the size of the buffer to use.  Buffer
+    sizes that are powers of 2, such as 4096 or 8192, are recommended.
 
-    If the iterator is buffered, returns the size of the buffer
-    being used, otherwise returns 0.
+    Returns NULL if there is an error, otherwise returns the allocated
+    iterator.
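The interplay of ``op_axes`` and ``itershape`` described above can be modelled in plain C without NumPy. This is a minimal sketch under stated assumptions, not NumPy's implementation: ``resolve_itershape`` is a hypothetical helper, ``long`` stands in for ``npy_intp``, and the only broadcasting rule modelled is that a length-1 dimension stretches to match the others.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of NumPy: models how an iteration shape
 * could be resolved from per-operand axis maps.
 * op_axes[j][i] is an axis of operand j, or -1 for "newaxis".
 * A negative itershape[i] (or itershape == NULL) means "infer from operands".
 * Returns 0 on success, -1 if the lengths cannot be broadcast together.
 */
int resolve_itershape(int nop, int oa_ndim,
                      const long *op_shapes[], const int *op_axes[],
                      const long *itershape, long *out)
{
    assert(nop > 0 && oa_ndim > 0 && out != NULL);
    for (int i = 0; i < oa_ndim; ++i) {
        int forced = (itershape != NULL && itershape[i] >= 0);
        long dim = forced ? itershape[i] : 1;
        for (int j = 0; j < nop; ++j) {
            int axis = op_axes[j][i];
            if (axis < 0) {
                continue;               /* newaxis: no constraint */
            }
            long n = op_shapes[j][axis];
            if (n == dim || n == 1) {
                continue;               /* matches, or broadcasts */
            }
            if (dim == 1 && !forced) {
                dim = n;                /* first non-trivial length wins */
            } else {
                return -1;              /* incompatible lengths */
            }
        }
        out[i] = dim;
    }
    return 0;
}
```

In this model, operands shaped (2, 3, 4), (3, 1) and (4,) with axis maps {0, 1, 2}, {-1, 0, 1} and {-1, -1, 0} resolve to the iteration shape (2, 3, 4). A single operand of shape (4,) with axis map {-1, 0} and a forced ``itershape`` of {5, -1} resolves to (5, 4), which is how an extra dimension with no matching input axis can be introduced.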
 
-.. c:function:: int NpyIter_GetNDim(NpyIter* iter)
+.. c:function:: NpyIter* NpyIter_Copy(NpyIter* iter)
 
-    Returns the number of dimensions being iterated.  If a multi-index
-    was not requested in the iterator constructor, this value
-    may be smaller than the number of dimensions in the original
-    objects.
+    Makes a copy of the given iterator.  This function is provided
+    primarily to enable multi-threaded iteration of the data.
 
-.. c:function:: int NpyIter_GetNOp(NpyIter* iter)
+    *TODO*: Move this to a section about multithreaded iteration.
 
-    Returns the number of operands in the iterator.
+    The recommended approach to multithreaded iteration is to
+    first create an iterator with the flags
+    :c:data:`NPY_ITER_EXTERNAL_LOOP`, :c:data:`NPY_ITER_RANGED`,
+    :c:data:`NPY_ITER_BUFFERED`, :c:data:`NPY_ITER_DELAY_BUFALLOC`, and
+    possibly :c:data:`NPY_ITER_GROWINNER`.  Create a copy of this iterator
+    for each thread (minus one for the first iterator).  Then, take
+    the iteration index range ``[0, NpyIter_GetIterSize(iter))`` and
+    split it up into tasks, for example using a TBB parallel_for loop.
+    When a thread gets a task to execute, it then uses its copy of
+    the iterator by calling :c:func:`NpyIter_ResetToIterIndexRange` and
+    iterating over the full range.
 
-.. c:function:: npy_intp* NpyIter_GetAxisStrideArray(NpyIter* iter, int axis)
+    When using the iterator in multi-threaded code or in code not
+    holding the Python GIL, care must be taken to only call functions
+    which are safe in that context.  :c:func:`NpyIter_Copy` cannot be safely
+    called without the Python GIL, because it increments Python
+    references.  The ``Reset*`` and some other functions may be safely
+    called by passing in the ``errmsg`` parameter as non-NULL, so that
+    the functions will pass back errors through it instead of setting
+    a Python exception.
 
-    Gets the array of strides for the specified axis. Requires that
-    the iterator be tracking a multi-index, and that buffering not
-    be enabled.
+    :c:func:`NpyIter_Deallocate` must be called for each copy.
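The splitting step in the multithreaded recipe above can be sketched in plain C. This is only an illustration under assumptions: ``task_range`` is a hypothetical helper (not a NumPy function), ``long`` stands in for ``npy_intp``, and an even contiguous split is just one possible scheduling choice. Each resulting ``[istart, iend)`` pair is what a thread would pass to :c:func:`NpyIter_ResetToIterIndexRange` on its own iterator copy.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of NumPy: computes the half-open
 * sub-range [*istart, *iend) handled by task number `task` when
 * `itersize` iteration indices are split across `ntasks` tasks
 * as evenly as possible.
 */
void task_range(long itersize, int ntasks, int task,
                long *istart, long *iend)
{
    assert(itersize >= 0 && ntasks > 0 && task >= 0 && task < ntasks);
    long base = itersize / ntasks;     /* minimum indices per task */
    long extra = itersize % ntasks;    /* first `extra` tasks get one more */
    *istart = task * base + (task < extra ? task : extra);
    *iend = *istart + base + (task < extra ? 1 : 0);
}
```

With 10 iteration indices and 3 tasks, this yields the ranges [0, 4), [4, 7) and [7, 10). Tasks beyond the available indices receive an empty range.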
 
-    This may be used when you want to match up operand axes in
-    some fashion, then remove them with :c:func:`NpyIter_RemoveAxis` to
-    handle their processing manually.  By calling this function
-    before removing the axes, you can get the strides for the
-    manual processing.
+.. c:function:: int NpyIter_RemoveAxis(NpyIter* iter, int axis)
 
-    Returns ``NULL`` on error.
+    Removes an axis from iteration.  This requires that
+    :c:data:`NPY_ITER_MULTI_INDEX` was set for iterator creation, and does
+    not work if buffering is enabled or an index is being tracked. This
+    function also resets the iterator to its initial state.
 
-.. c:function:: int NpyIter_GetShape(NpyIter* iter, npy_intp* outshape)
+    This is useful for setting up an accumulation loop, for example.
+    The iterator can first be created with all the dimensions, including
+    the accumulation axis, so that the output gets created correctly.
+    Then, the accumulation axis can be removed, and the calculation
+    done in a nested fashion.
 
-    Returns the broadcast shape of the iterator in ``outshape``.
-    This can only be called on an iterator which is tracking a multi-index.
+    **WARNING**: This function may change the internal memory layout of
+    the iterator.  Any cached functions or pointers from the iterator
+    must be retrieved again! The iterator range will be reset as well.
 
     Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-.. c:function:: PyArray_Descr** NpyIter_GetDescrArray(NpyIter* iter)
 
-    This gives back a pointer to the ``nop`` data type Descrs for
-    the objects being iterated.  The result points into ``iter``,
-    so the caller does not gain any references to the Descrs.
+.. c:function:: int NpyIter_RemoveMultiIndex(NpyIter* iter)
 
-    This pointer may be cached before the iteration loop, calling
-    ``iternext`` will not change it.
+    If the iterator is tracking a multi-index, this strips support for it
+    and performs further iterator optimizations that are possible if multi-indices
+    are not needed.  This function also resets the iterator to its initial
+    state.
 
-.. c:function:: PyObject** NpyIter_GetOperandArray(NpyIter* iter)
+    **WARNING**: This function may change the internal memory layout of
+    the iterator.  Any cached functions or pointers from the iterator
+    must be retrieved again!
 
-    This gives back a pointer to the ``nop`` operand PyObjects
-    that are being iterated.  The result points into ``iter``,
-    so the caller does not gain any references to the PyObjects.
+    After calling this function, :c:expr:`NpyIter_HasMultiIndex(iter)` will
+    return false.
 
-.. c:function:: PyObject* NpyIter_GetIterView(NpyIter* iter, npy_intp i)
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-    This gives back a reference to a new ndarray view, which is a view
-    into the i-th object in the array :c:func:`NpyIter_GetOperandArray()`,
-    whose dimensions and strides match the internal optimized
-    iteration pattern.  A C-order iteration of this view is equivalent
-    to the iterator's iteration order.
+.. c:function:: int NpyIter_EnableExternalLoop(NpyIter* iter)
 
-    For example, if an iterator was created with a single array as its
-    input, and it was possible to rearrange all its axes and then
-    collapse it into a single strided iteration, this would return
-    a view that is a one-dimensional array.
+    If :c:func:`NpyIter_RemoveMultiIndex` was called, you may want to enable the
+    flag :c:data:`NPY_ITER_EXTERNAL_LOOP`.  This flag is not permitted
+    together with :c:data:`NPY_ITER_MULTI_INDEX`, so this function is provided
+    to enable the feature after :c:func:`NpyIter_RemoveMultiIndex` is called.
+    This function also resets the iterator to its initial state.
 
-.. c:function:: void NpyIter_GetReadFlags(NpyIter* iter, char* outreadflags)
+    **WARNING**: This function changes the internal logic of the iterator.
+    Any cached functions or pointers from the iterator must be retrieved
+    again!
 
-    Fills ``nop`` flags. Sets ``outreadflags[i]`` to 1 if
-    ``op[i]`` can be read from, and to 0 if not.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-.. c:function:: void NpyIter_GetWriteFlags(NpyIter* iter, char* outwriteflags)
+.. c:function:: int NpyIter_Deallocate(NpyIter* iter)
+
+    Deallocates the iterator object and resolves any needed writebacks.
+
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-    Fills ``nop`` flags. Sets ``outwriteflags[i]`` to 1 if
-    ``op[i]`` can be written to, and to 0 if not.
+.. c:function:: int NpyIter_Reset(NpyIter* iter, char** errmsg)
 
-.. c:function:: int NpyIter_CreateCompatibleStrides( \
-        NpyIter* iter, npy_intp itemsize, npy_intp* outstrides)
+    Resets the iterator back to its initial state, at the beginning
+    of the iteration range.
 
-    Builds a set of strides which are the same as the strides of an
-    output array created using the :c:data:`NPY_ITER_ALLOCATE` flag, where NULL
-    was passed for op_axes.  This is for data packed contiguously,
-    but not necessarily in C or Fortran order. This should be used
-    together with :c:func:`NpyIter_GetShape` and :c:func:`NpyIter_GetNDim`
-    with the flag :c:data:`NPY_ITER_MULTI_INDEX` passed into the constructor.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
+    no Python exception is set when ``NPY_FAIL`` is returned.
+    Instead, \*errmsg is set to an error message.  When errmsg is
+    non-NULL, the function may be safely called without holding
+    the Python GIL.
 
-    A use case for this function is to match the shape and layout of
-    the iterator and tack on one or more dimensions.  For example,
-    in order to generate a vector per input value for a numerical gradient,
-    you pass in ndim*itemsize for itemsize, then add another dimension to
-    the end with size ndim and stride itemsize.  To do the Hessian matrix,
-    you do the same thing but add two dimensions, or take advantage of
-    the symmetry and pack it into 1 dimension with a particular encoding.
+.. c:function:: int NpyIter_ResetToIterIndexRange( \
+        NpyIter* iter, npy_intp istart, npy_intp iend, char** errmsg)
 
-    This function may only be called if the iterator is tracking a multi-index
-    and if :c:data:`NPY_ITER_DONT_NEGATE_STRIDES` was used to prevent an axis
-    from being iterated in reverse order.
+    Resets the iterator and restricts it to the ``iterindex`` range
+    ``[istart, iend)``.  See :c:func:`NpyIter_Copy` for an explanation of
+    how to use this for multi-threaded iteration.  This requires that
+    the flag :c:data:`NPY_ITER_RANGED` was passed to the iterator constructor.
 
-    If an array is created with this method, simply adding 'itemsize'
-    for each iteration will traverse the new array matching the
-    iterator.
+    If you want to reset both the ``iterindex`` range and the base
+    pointers at the same time, you can do the following to avoid
+    extra buffer copying (be sure to add the return code error checks
+    when you copy this code).
 
-    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
+    .. code-block:: c
 
-.. c:function:: npy_bool NpyIter_IsFirstVisit(NpyIter* iter, int iop)
+        /* Set to a trivial empty range */
+        NpyIter_ResetToIterIndexRange(iter, 0, 0, NULL);
+        /* Set the base pointers */
+        NpyIter_ResetBasePointers(iter, baseptrs, NULL);
+        /* Set to the desired range */
+        NpyIter_ResetToIterIndexRange(iter, istart, iend, NULL);
 
-    .. versionadded:: 1.7
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
+    no Python exception is set when ``NPY_FAIL`` is returned.
+    Instead, \*errmsg is set to an error message.  When errmsg is
+    non-NULL, the function may be safely called without holding
+    the Python GIL.
 
-    Checks to see whether this is the first time the elements of the
-    specified reduction operand which the iterator points at are being
-    seen for the first time. The function returns a reasonable answer
-    for reduction operands and when buffering is disabled. The answer
-    may be incorrect for buffered non-reduction operands.
+.. c:function:: int NpyIter_ResetBasePointers( \
+        NpyIter *iter, char** baseptrs, char** errmsg)
 
-    This function is intended to be used in EXTERNAL_LOOP mode only,
-    and will produce some wrong answers when that mode is not enabled.
+    Resets the iterator back to its initial state, but using the values
+    in ``baseptrs`` for the data instead of the pointers from the arrays
+    being iterated.  This function is intended to be used, together with
+    the ``op_axes`` parameter, by nested iteration code with two or more
+    iterators.
 
-    If this function returns true, the caller should also check the inner
-    loop stride of the operand, because if that stride is 0, then only
-    the first element of the innermost external loop is being visited
-    for the first time.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.  If errmsg is non-NULL,
+    no Python exception is set when ``NPY_FAIL`` is returned.
+    Instead, \*errmsg is set to an error message.  When errmsg is
+    non-NULL, the function may be safely called without holding
+    the Python GIL.
 
-    *WARNING*: For performance reasons, 'iop' is not bounds-checked,
-    it is not confirmed that 'iop' is actually a reduction operand,
-    and it is not confirmed that EXTERNAL_LOOP mode is enabled. These
-    checks are the responsibility of the caller, and should be done
-    outside of any inner loops.
+    *TODO*: Move the following into a special section on nested iterators.
 
-Flags that may be passed in ``flags``, applying to the whole iterator, are:
+    Creating iterators for nested iteration requires some care.  All
+    the iterator operands must match exactly, or the calls to
+    :c:func:`NpyIter_ResetBasePointers` will be invalid.  This means that
+    automatic copies and output allocation should not be used haphazardly.
+    It is possible to still use the automatic data conversion and casting
+    features of the iterator by creating one of the iterators with
+    all the conversion parameters enabled, then grabbing the allocated
+    operands with the :c:func:`NpyIter_GetOperandArray` function and passing
+    them into the constructors for the rest of the iterators.
 
-.. c:macro:: NPY_ITER_C_INDEX
+    **WARNING**: When creating iterators for nested iteration,
+    the code must not use a dimension more than once in the different
+    iterators.  If this is done, nested iteration will produce
+    out-of-bounds pointers during iteration.
 
-    Causes the iterator to track a raveled flat index matching C
-    order. This option cannot be used with :c:data:`NPY_ITER_F_INDEX`.
+    **WARNING**: When creating iterators for nested iteration, buffering
+    can only be applied to the innermost iterator.  If a buffered iterator
+    is used as the source for ``baseptrs``, it will point into a small buffer
+    instead of the array and the inner iteration will be invalid.
 
-.. c:macro:: NPY_ITER_F_INDEX
+    The pattern for using nested iterators is as follows.
 
-    Causes the iterator to track a raveled flat index matching Fortran
-    order. This option cannot be used with :c:data:`NPY_ITER_C_INDEX`.
+    .. code-block:: c
 
-.. c:macro:: NPY_ITER_MULTI_INDEX
+        NpyIter *iter1, *iter2;
+        NpyIter_IterNextFunc *iternext1, *iternext2;
+        char **dataptrs1;
 
-    Causes the iterator to track a multi-index.
-    This prevents the iterator from coalescing axes to
-    produce bigger inner loops. If the loop is also not buffered
-    and no index is being tracked (`NpyIter_RemoveAxis` can be called),
-    then the iterator size can be ``-1`` to indicate that the iterator
-    is too large. This can happen due to complex broadcasting and
-    will result in errors being created when the setting the iterator
-    range, removing the multi index, or getting the next function.
-    However, it is possible to remove axes again and use the iterator
-    normally if the size is small enough after removal.
+        /*
+         * With the exact same operands, no copies allowed, and
+         * no axis in op_axes used both in iter1 and iter2.
+         * Buffering may be enabled for iter2, but not for iter1.
+         */
+        iter1 = ...; iter2 = ...;
 
-.. c:macro:: NPY_ITER_EXTERNAL_LOOP
+        iternext1 = NpyIter_GetIterNext(iter1);
+        iternext2 = NpyIter_GetIterNext(iter2);
+        dataptrs1 = NpyIter_GetDataPtrArray(iter1);
 
-    Causes the iterator to skip iteration of the innermost
-    loop, requiring the user of the iterator to handle it.
+        do {
+            NpyIter_ResetBasePointers(iter2, dataptrs1, NULL);
+            do {
+                /* Use the iter2 values */
+            } while (iternext2(iter2));
+        } while (iternext1(iter1));
 
-    This flag is incompatible with :c:data:`NPY_ITER_C_INDEX`,
-    :c:data:`NPY_ITER_F_INDEX`, and :c:data:`NPY_ITER_MULTI_INDEX`.
+.. c:function:: int NpyIter_GotoMultiIndex(NpyIter* iter, npy_intp const* multi_index)
 
-.. c:macro:: NPY_ITER_DONT_NEGATE_STRIDES
+    Adjusts the iterator to point to the ``ndim`` indices
+    pointed to by ``multi_index``.  Returns an error if a multi-index
+    is not being tracked, the indices are out of bounds,
+    or inner loop iteration is disabled.
 
-    This only affects the iterator when :c:type:`NPY_KEEPORDER` is
-    specified for the order parameter.  By default with
-    :c:type:`NPY_KEEPORDER`, the iterator reverses axes which have
-    negative strides, so that memory is traversed in a forward
-    direction.  This disables this step.  Use this flag if you
-    want to use the underlying memory-ordering of the axes,
-    but don't want an axis reversed. This is the behavior of
-    ``numpy.ravel(a, order='K')``, for instance.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-.. c:macro:: NPY_ITER_COMMON_DTYPE
+.. c:function:: int NpyIter_GotoIndex(NpyIter* iter, npy_intp index)
 
-    Causes the iterator to convert all the operands to a common
-    data type, calculated based on the ufunc type promotion rules.
-    Copying or buffering must be enabled.
+    Adjusts the iterator to point to the ``index`` specified.
+    If the iterator was constructed with the flag
+    :c:data:`NPY_ITER_C_INDEX`, ``index`` is the C-order index,
+    and if the iterator was constructed with the flag
+    :c:data:`NPY_ITER_F_INDEX`, ``index`` is the Fortran-order
+    index.  Returns an error if there is no index being tracked,
+    the index is out of bounds, or inner loop iteration is disabled.
 
-    If the common data type is known ahead of time, don't use this
-    flag.  Instead, set the requested dtype for all the operands.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
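The difference between the two flat indices can be illustrated without NumPy. This is a minimal sketch, assuming standard row-major (C) and column-major (Fortran) raveling of a multi-index; ``c_order_index`` and ``f_order_index`` are hypothetical helpers, and ``long`` stands in for ``npy_intp``.

```c
#include <assert.h>

/*
 * Hypothetical helper: flat index of `multi_index` in C (row-major)
 * order, where the last axis varies fastest.
 */
long c_order_index(int ndim, const long *shape, const long *multi_index)
{
    long index = 0;
    for (int i = 0; i < ndim; ++i) {
        assert(multi_index[i] >= 0 && multi_index[i] < shape[i]);
        index = index * shape[i] + multi_index[i];
    }
    return index;
}

/*
 * Hypothetical helper: flat index of `multi_index` in Fortran
 * (column-major) order, where the first axis varies fastest.
 */
long f_order_index(int ndim, const long *shape, const long *multi_index)
{
    long index = 0;
    for (int i = ndim - 1; i >= 0; --i) {
        assert(multi_index[i] >= 0 && multi_index[i] < shape[i]);
        index = index * shape[i] + multi_index[i];
    }
    return index;
}
```

For a shape of (2, 3), the multi-index (0, 1) maps to 1 in C order and to 2 in Fortran order, while the last element (1, 2) maps to 5 in both.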
 
-.. c:macro:: NPY_ITER_REFS_OK
+.. c:function:: npy_intp NpyIter_GetIterSize(NpyIter* iter)
 
-    Indicates that arrays with reference types (object
-    arrays or structured arrays containing an object type)
-    may be accepted and used in the iterator.  If this flag
-    is enabled, the caller must be sure to check whether
-    :c:expr:`NpyIter_IterationNeedsAPI(iter)` is true, in which case
-    it may not release the GIL during iteration.
+    Returns the number of elements being iterated.  This is the product
+    of all the dimensions in the shape.  When a multi-index is being tracked
+    (and `NpyIter_RemoveAxis` may be called) the size may be ``-1`` to
+    indicate that the iterator is too large.  Such an iterator is invalid, but
+    may become valid after `NpyIter_RemoveAxis` is called. It is not
+    necessary to check for this case.
 
-.. c:macro:: NPY_ITER_ZEROSIZE_OK
+.. c:function:: npy_intp NpyIter_GetIterIndex(NpyIter* iter)
 
-    Indicates that arrays with a size of zero should be permitted.
-    Since the typical iteration loop does not naturally work with
-    zero-sized arrays, you must check that the IterSize is larger
-    than zero before entering the iteration loop.
-    Currently only the operands are checked, not a forced shape.
+    Gets the ``iterindex`` of the iterator, which is an index matching
+    the iteration order of the iterator.
 
-.. c:macro:: NPY_ITER_REDUCE_OK
+.. c:function:: void NpyIter_GetIterIndexRange( \
+        NpyIter* iter, npy_intp* istart, npy_intp* iend)
 
-    Permits writeable operands with a dimension with zero
-    stride and size greater than one.  Note that such operands
-    must be read/write.
+    Gets the ``iterindex`` sub-range that is being iterated.  If
+    :c:data:`NPY_ITER_RANGED` was not specified, this always returns the
+    range ``[0, NpyIter_GetIterSize(iter))``.
 
-    When buffering is enabled, this also switches to a special
-    buffering mode which reduces the loop length as necessary to
-    not trample on values being reduced.
+.. c:function:: int NpyIter_GotoIterIndex(NpyIter* iter, npy_intp iterindex)
 
-    Note that if you want to do a reduction on an automatically
-    allocated output, you must use :c:func:`NpyIter_GetOperandArray`
-    to get its reference, then set every value to the reduction
-    unit before doing the iteration loop.  In the case of a
-    buffered reduction, this means you must also specify the
-    flag :c:data:`NPY_ITER_DELAY_BUFALLOC`, then reset the iterator
-    after initializing the allocated operand to prepare the
-    buffers.
+    Adjusts the iterator to point to the ``iterindex`` specified.
+    The ``iterindex`` is an index matching the iteration order of the iterator.
+    Returns an error if the ``iterindex`` is out of bounds,
+    buffering is enabled, or inner loop iteration is disabled.
 
-.. c:macro:: NPY_ITER_RANGED
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-    Enables support for iteration of sub-ranges of the full
-    ``iterindex`` range ``[0, NpyIter_IterSize(iter))``.  Use
-    the function :c:func:`NpyIter_ResetToIterIndexRange` to specify
-    a range for iteration.
+.. c:function:: npy_bool NpyIter_HasDelayedBufAlloc(NpyIter* iter)
 
-    This flag can only be used with :c:data:`NPY_ITER_EXTERNAL_LOOP`
-    when :c:data:`NPY_ITER_BUFFERED` is enabled.  This is because
-    without buffering, the inner loop is always the size of the
-    innermost iteration dimension, and allowing it to get cut up
-    would require special handling, effectively making it more
-    like the buffered version.
+    Returns 1 if the flag :c:data:`NPY_ITER_DELAY_BUFALLOC` was passed
+    to the iterator constructor, and no call to one of the Reset
+    functions has been done yet, 0 otherwise.
 
-.. c:macro:: NPY_ITER_BUFFERED
+.. c:function:: npy_bool NpyIter_HasExternalLoop(NpyIter* iter)
 
-    Causes the iterator to store buffering data, and use buffering
-    to satisfy data type, alignment, and byte-order requirements.
-    To buffer an operand, do not specify the :c:data:`NPY_ITER_COPY`
-    or :c:data:`NPY_ITER_UPDATEIFCOPY` flags, because they will
-    override buffering.  Buffering is especially useful for Python
-    code using the iterator, allowing for larger chunks
-    of data at once to amortize the Python interpreter overhead.
+    Returns 1 if the caller needs to handle the inner-most 1-dimensional
+    loop, or 0 if the iterator handles all looping. This is controlled
+    by the constructor flag :c:data:`NPY_ITER_EXTERNAL_LOOP` or
+    :c:func:`NpyIter_EnableExternalLoop`.
 
-    If used with :c:data:`NPY_ITER_EXTERNAL_LOOP`, the inner loop
-    for the caller may get larger chunks than would be possible
-    without buffering, because of how the strides are laid out.
+.. c:function:: npy_bool NpyIter_HasMultiIndex(NpyIter* iter)
 
-    Note that if an operand is given the flag :c:data:`NPY_ITER_COPY`
-    or :c:data:`NPY_ITER_UPDATEIFCOPY`, a copy will be made in preference
-    to buffering.  Buffering will still occur when the array was
-    broadcast so elements need to be duplicated to get a constant
-    stride.
+    Returns 1 if the iterator was created with the
+    :c:data:`NPY_ITER_MULTI_INDEX` flag, 0 otherwise.
 
-    In normal buffering, the size of each inner loop is equal
-    to the buffer size, or possibly larger if
-    :c:data:`NPY_ITER_GROWINNER` is specified.  If
-    :c:data:`NPY_ITER_REDUCE_OK` is enabled and a reduction occurs,
-    the inner loops may become smaller depending
-    on the structure of the reduction.
+.. c:function:: npy_bool NpyIter_HasIndex(NpyIter* iter)
 
-.. c:macro:: NPY_ITER_GROWINNER
+    Returns 1 if the iterator was created with the
+    :c:data:`NPY_ITER_C_INDEX` or :c:data:`NPY_ITER_F_INDEX`
+    flag, 0 otherwise.
 
-    When buffering is enabled, this allows the size of the inner
-    loop to grow when buffering isn't necessary.  This option
-    is best used if you're doing a straight pass through all the
-    data, rather than anything with small cache-friendly arrays
-    of temporary values for each inner loop.
+.. c:function:: npy_bool NpyIter_RequiresBuffering(NpyIter* iter)
 
-.. c:macro:: NPY_ITER_DELAY_BUFALLOC
+    Returns 1 if the iterator requires buffering, which occurs
+    when an operand needs conversion or alignment and so cannot
+    be used directly.
 
-    When buffering is enabled, this delays allocation of the
-    buffers until :c:func:`NpyIter_Reset` or another reset function is
-    called.  This flag exists to avoid wasteful copying of
-    buffer data when making multiple copies of a buffered
-    iterator for multi-threaded iteration.
+.. c:function:: npy_bool NpyIter_IsBuffered(NpyIter* iter)
 
-    Another use of this flag is for setting up reduction operations.
-    After the iterator is created, and a reduction output
-    is allocated automatically by the iterator (be sure to use
-    READWRITE access), its value may be initialized to the reduction
-    unit.  Use :c:func:`NpyIter_GetOperandArray` to get the object.
-    Then, call :c:func:`NpyIter_Reset` to allocate and fill the buffers
-    with their initial values.
+    Returns 1 if the iterator was created with the
+    :c:data:`NPY_ITER_BUFFERED` flag, 0 otherwise.
 
-.. c:macro:: NPY_ITER_COPY_IF_OVERLAP
+.. c:function:: npy_bool NpyIter_IsGrowInner(NpyIter* iter)
 
-    If any write operand has overlap with any read operand, eliminate all
-    overlap by making temporary copies (enabling UPDATEIFCOPY for write
-    operands, if necessary). A pair of operands has overlap if there is
-    a memory address that contains data common to both arrays.
+    Returns 1 if the iterator was created with the
+    :c:data:`NPY_ITER_GROWINNER` flag, 0 otherwise.
 
-    Because exact overlap detection has exponential runtime
-    in the number of dimensions, the decision is made based
-    on heuristics, which has false positives (needless copies in unusual
-    cases) but has no false negatives.
+.. c:function:: npy_intp NpyIter_GetBufferSize(NpyIter* iter)
 
-    If any read/write overlap exists, this flag ensures the result of the
-    operation is the same as if all operands were copied.
-    In cases where copies would need to be made, **the result of the
-    computation may be undefined without this flag!**
+    If the iterator is buffered, returns the size of the buffer
+    being used, otherwise returns 0.
 
-Flags that may be passed in ``op_flags[i]``, where ``0 <= i < nop``:
+.. c:function:: int NpyIter_GetNDim(NpyIter* iter)
 
-.. c:macro:: NPY_ITER_READWRITE
-.. c:macro:: NPY_ITER_READONLY
-.. c:macro:: NPY_ITER_WRITEONLY
+    Returns the number of dimensions being iterated.  If a multi-index
+    was not requested in the iterator constructor, this value
+    may be smaller than the number of dimensions in the original
+    objects.
 
-    Indicate how the user of the iterator will read or write
-    to ``op[i]``.  Exactly one of these flags must be specified
-    per operand. Using ``NPY_ITER_READWRITE`` or ``NPY_ITER_WRITEONLY``
-    for a user-provided operand may trigger `WRITEBACKIFCOPY``
-    semantics. The data will be written back to the original array
-    when ``NpyIter_Deallocate`` is called.
+.. c:function:: int NpyIter_GetNOp(NpyIter* iter)
 
-.. c:macro:: NPY_ITER_COPY
+    Returns the number of operands in the iterator.
 
-    Allow a copy of ``op[i]`` to be made if it does not
-    meet the data type or alignment requirements as specified
-    by the constructor flags and parameters.
+.. c:function:: npy_intp* NpyIter_GetAxisStrideArray(NpyIter* iter, int axis)
 
-.. c:macro:: NPY_ITER_UPDATEIFCOPY
+    Gets the array of strides for the specified axis. Requires that
+    the iterator be tracking a multi-index, and that buffering not
+    be enabled.
 
-    Triggers :c:data:`NPY_ITER_COPY`, and when an array operand
-    is flagged for writing and is copied, causes the data
-    in a copy to be copied back to ``op[i]`` when
-    ``NpyIter_Deallocate`` is called.
+    This may be used when you want to match up operand axes in
+    some fashion, then remove them with :c:func:`NpyIter_RemoveAxis` to
+    handle their processing manually.  By calling this function
+    before removing the axes, you can get the strides for the
+    manual processing.
 
-    If the operand is flagged as write-only and a copy is needed,
-    an uninitialized temporary array will be created and then copied
-    to back to ``op[i]`` on calling ``NpyIter_Deallocate``, instead of
-    doing the unnecessary copy operation.
+    Returns ``NULL`` on error.
 
-.. c:macro:: NPY_ITER_NBO
-.. c:macro:: NPY_ITER_ALIGNED
-.. c:macro:: NPY_ITER_CONTIG
+.. c:function:: int NpyIter_GetShape(NpyIter* iter, npy_intp* outshape)
 
-    Causes the iterator to provide data for ``op[i]``
-    that is in native byte order, aligned according to
-    the dtype requirements, contiguous, or any combination.
+    Returns the broadcast shape of the iterator in ``outshape``.
+    This can only be called on an iterator which is tracking a multi-index.
 
-    By default, the iterator produces pointers into the
-    arrays provided, which may be aligned or unaligned, and
-    with any byte order.  If copying or buffering is not
-    enabled and the operand data doesn't satisfy the constraints,
-    an error will be raised.
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-    The contiguous constraint applies only to the inner loop,
-    successive inner loops may have arbitrary pointer changes.
+.. c:function:: PyArray_Descr** NpyIter_GetDescrArray(NpyIter* iter)
 
-    If the requested data type is in non-native byte order,
-    the NBO flag overrides it and the requested data type is
-    converted to be in native byte order.
+    This gives back a pointer to the ``nop`` data type Descrs for
+    the objects being iterated.  The result points into ``iter``,
+    so the caller does not gain any references to the Descrs.
 
-.. c:macro:: NPY_ITER_ALLOCATE
+    This pointer may be cached before the iteration loop; calling
+    ``iternext`` will not change it.
 
-    This is for output arrays, and requires that the flag
-    :c:data:`NPY_ITER_WRITEONLY` or :c:data:`NPY_ITER_READWRITE`
-    be set.  If ``op[i]`` is NULL, creates a new array with
-    the final broadcast dimensions, and a layout matching
-    the iteration order of the iterator.
+.. c:function:: PyObject** NpyIter_GetOperandArray(NpyIter* iter)
 
-    When ``op[i]`` is NULL, the requested data type
-    ``op_dtypes[i]`` may be NULL as well, in which case it is
-    automatically generated from the dtypes of the arrays which
-    are flagged as readable.  The rules for generating the dtype
-    are the same is for UFuncs.  Of special note is handling
-    of byte order in the selected dtype.  If there is exactly
-    one input, the input's dtype is used as is.  Otherwise,
-    if more than one input dtypes are combined together, the
-    output will be in native byte order.
+    This gives back a pointer to the ``nop`` operand PyObjects
+    that are being iterated.  The result points into ``iter``,
+    so the caller does not gain any references to the PyObjects.
 
-    After being allocated with this flag, the caller may retrieve
-    the new array by calling :c:func:`NpyIter_GetOperandArray` and
-    getting the i-th object in the returned C array.  The caller
-    must call Py_INCREF on it to claim a reference to the array.
+.. c:function:: PyObject* NpyIter_GetIterView(NpyIter* iter, npy_intp i)
 
-.. c:macro:: NPY_ITER_NO_SUBTYPE
+    This gives back a reference to a new ndarray view, which is a view
+    into the i-th object in the array :c:func:`NpyIter_GetOperandArray()`,
+    whose dimensions and strides match the internal optimized
+    iteration pattern.  A C-order iteration of this view is equivalent
+    to the iterator's iteration order.
 
-    For use with :c:data:`NPY_ITER_ALLOCATE`, this flag disables
-    allocating an array subtype for the output, forcing
-    it to be a straight ndarray.
+    For example, if an iterator was created with a single array as its
+    input, and it was possible to rearrange all its axes and then
+    collapse it into a single strided iteration, this would return
+    a view that is a one-dimensional array.
 
-    TODO: Maybe it would be better to introduce a function
-    ``NpyIter_GetWrappedOutput`` and remove this flag?
+.. c:function:: void NpyIter_GetReadFlags(NpyIter* iter, char* outreadflags)
 
-.. c:macro:: NPY_ITER_NO_BROADCAST
+    Fills ``nop`` flags. Sets ``outreadflags[i]`` to 1 if
+    ``op[i]`` can be read from, and to 0 if not.
 
-    Ensures that the input or output matches the iteration
-    dimensions exactly.
+.. c:function:: void NpyIter_GetWriteFlags(NpyIter* iter, char* outwriteflags)
 
-.. c:macro:: NPY_ITER_ARRAYMASK
+    Fills ``nop`` flags. Sets ``outwriteflags[i]`` to 1 if
+    ``op[i]`` can be written to, and to 0 if not.
 
-    .. versionadded:: 1.7
+.. c:function:: int NpyIter_CreateCompatibleStrides( \
+        NpyIter* iter, npy_intp itemsize, npy_intp* outstrides)
 
-    Indicates that this operand is the mask to use for
-    selecting elements when writing to operands which have
-    the :c:data:`NPY_ITER_WRITEMASKED` flag applied to them.
-    Only one operand may have :c:data:`NPY_ITER_ARRAYMASK` flag
-    applied to it.
+    Builds a set of strides which are the same as the strides of an
+    output array created using the :c:data:`NPY_ITER_ALLOCATE` flag, where NULL
+    was passed for op_axes.  This is for data packed contiguously,
+    but not necessarily in C or Fortran order. Use it together with
+    :c:func:`NpyIter_GetShape` and :c:func:`NpyIter_GetNDim`, on an iterator
+    constructed with the :c:data:`NPY_ITER_MULTI_INDEX` flag.
 
-    The data type of an operand with this flag should be either
-    :c:data:`NPY_BOOL`, :c:data:`NPY_MASK`, or a struct dtype
-    whose fields are all valid mask dtypes. In the latter case,
-    it must match up with a struct operand being WRITEMASKED,
-    as it is specifying a mask for each field of that array.
+    A use case for this function is to match the shape and layout of
+    the iterator and tack on one or more dimensions.  For example,
+    in order to generate a vector per input value for a numerical gradient,
+    you pass in ndim*itemsize for itemsize, then add another dimension to
+    the end with size ndim and stride itemsize.  To do the Hessian matrix,
+    you do the same thing but add two dimensions, or take advantage of
+    the symmetry and pack it into 1 dimension with a particular encoding.
 
-    This flag only affects writing from the buffer back to
-    the array. This means that if the operand is also
-    :c:data:`NPY_ITER_READWRITE` or :c:data:`NPY_ITER_WRITEONLY`,
-    code doing iteration can write to this operand to
-    control which elements will be untouched and which ones will be
-    modified. This is useful when the mask should be a combination
-    of input masks.
+    This function may only be called if the iterator is tracking a multi-index
+    and if :c:data:`NPY_ITER_DONT_NEGATE_STRIDES` was used to prevent an axis
+    from being iterated in reverse order.
 
-.. c:macro:: NPY_ITER_WRITEMASKED
+    If an array is created with this method, simply adding 'itemsize'
+    for each iteration will traverse the new array matching the
+    iterator.
 
-    .. versionadded:: 1.7
+    Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.
 
-    This array is the mask for all `writemasked <numpy.nditer>`
-    operands. Code uses the ``writemasked`` flag which indicates 
-    that only elements where the chosen ARRAYMASK operand is True
-    will be written to. In general, the iterator does not enforce
-    this, it is up to the code doing the iteration to follow that
-    promise.
+.. c:function:: npy_bool NpyIter_IsFirstVisit(NpyIter* iter, int iop)
 
-    When ``writemasked`` flag is used, and this operand is buffered,
-    this changes how data is copied from the buffer into the array.
-    A masked copying routine is used, which only copies the
-    elements in the buffer for which ``writemasked``
-    returns true from the corresponding element in the ARRAYMASK
-    operand.
+    .. versionadded:: 1.7
 
-.. c:macro:: NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE
+    Checks whether the elements of the specified reduction operand
+    which the iterator points at are being seen for the first time.
+    The function returns a reasonable answer for reduction operands
+    and when buffering is disabled. The answer may be incorrect for
+    buffered non-reduction operands.
 
-    In memory overlap checks, assume that operands with
-    ``NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE`` enabled are accessed only
-    in the iterator order.
+    This function is intended to be used in EXTERNAL_LOOP mode only,
+    and will produce some wrong answers when that mode is not enabled.
 
-    This enables the iterator to reason about data dependency,
-    possibly avoiding unnecessary copies.
+    If this function returns true, the caller should also check the inner
+    loop stride of the operand, because if that stride is 0, then only
+    the first element of the innermost external loop is being visited
+    for the first time.
 
-    This flag has effect only if ``NPY_ITER_COPY_IF_OVERLAP`` is enabled
-    on the iterator.
+    *WARNING*: For performance reasons, 'iop' is not bounds-checked,
+    it is not confirmed that 'iop' is actually a reduction operand,
+    and it is not confirmed that EXTERNAL_LOOP mode is enabled. These
+    checks are the responsibility of the caller, and should be done
+    outside of any inner loops.
 
 Functions For Iteration
 -----------------------
-- 
cgit v1.2.1