8 files changed, 209 insertions, 23 deletions
diff --git a/doc/source/reference/c-api/array.rst b/doc/source/reference/c-api/array.rst
index 6a135fd71..bb4405825 100644
--- a/doc/source/reference/c-api/array.rst
+++ b/doc/source/reference/c-api/array.rst
@@ -325,8 +325,7 @@ From scratch
     should be increased after the pointer is passed in, and the base member
     of the returned ndarray should point to the Python object that owns
     the data. This will ensure that the provided memory is not
-    freed while the returned array is in existence. To free memory as soon
-    as the ndarray is deallocated, set the OWNDATA flag on the returned ndarray.
+    freed while the returned array is in existence.
 
 .. c:function:: PyObject* PyArray_SimpleNewFromDescr( \
         int nd, npy_int const* dims, PyArray_Descr* descr)
@@ -1323,7 +1322,7 @@ User-defined data types
     data-type object, *descr*, of the given *scalar* kind. Use
     *scalar* = :c:data:`NPY_NOSCALAR` to register that an array of data-type
     *descr* can be cast safely to a data-type whose type_number is
-    *totype*.
+    *totype*. The return value is 0 on success or -1 on failure.
 
 .. c:function:: int PyArray_TypeNumFromName( \
         char const *str)
@@ -1463,7 +1462,9 @@ of the constant names is deprecated in 1.7.
 
 .. c:macro:: NPY_ARRAY_OWNDATA
 
-    The data area is owned by this array.
+    The data area is owned by this array. This flag should never be set
+    manually; instead, create a ``PyObject`` wrapping the data and set the
+    array's base to that object. For an example, see the test in ``test_mem_policy``.
 
 .. c:macro:: NPY_ARRAY_ALIGNED
 
@@ -2778,13 +2779,19 @@ Array Scalars
     whenever 0-dimensional arrays could be returned to Python.
 
 .. c:function:: PyObject* PyArray_Scalar( \
-        void* data, PyArray_Descr* dtype, PyObject* itemsize)
-
-    Return an array scalar object of the given enumerated *typenum*
-    and *itemsize* by **copying** from memory pointed to by *data*
-    . If *swap* is nonzero then this function will byteswap the data
-    if appropriate to the data-type because array scalars are always
-    in correct machine-byte order.
+        void* data, PyArray_Descr* dtype, PyObject* base)
+
+    Return an array scalar object of the given *dtype* by **copying**
+    from memory pointed to by *data*.  *base* is expected to be the
+    array object that is the owner of the data.  *base* is required
+    if *dtype* is a ``void`` scalar, or if the ``NPY_USE_GETITEM``
+    flag is set and it is known that the ``getitem`` method uses
+    the ``arr`` argument without checking if it is ``NULL``.  Otherwise
+    *base* may be ``NULL``.
+
+    If the data is not in native byte order (as indicated by
+    ``dtype->byteorder``) then this function will byteswap the data,
+    because array scalars are always in correct machine-byte order.
 
 .. c:function:: PyObject* PyArray_ToScalar(void* data, PyArrayObject* arr)
 
diff --git a/doc/source/reference/c-api/data_memory.rst b/doc/source/reference/c-api/data_memory.rst
new file mode 100644
index 000000000..2084ab5d0
--- /dev/null
+++ b/doc/source/reference/c-api/data_memory.rst
@@ -0,0 +1,161 @@
+.. _data_memory:
+
+Memory management in NumPy
+==========================
+
+The `numpy.ndarray` is a Python class. It requires additional memory allocations
+to hold the `numpy.ndarray.strides`, `numpy.ndarray.shape` and
+`numpy.ndarray.data` attributes. These attributes are specially allocated
+after creating the Python object in `__new__`. The ``strides`` and
+``shape`` are stored in a piece of memory allocated internally.
+
+The ``data`` allocation used to store the actual array values (which could be
+pointers in the case of ``object`` arrays) can be very large, so NumPy has
+provided interfaces to manage its allocation and release. This document details
+how those interfaces work.
+
+Historical overview
+-------------------
+
+Since version 1.7.0, NumPy has exposed a set of ``PyDataMem_*`` functions
+(:c:func:`PyDataMem_NEW`, :c:func:`PyDataMem_FREE`, :c:func:`PyDataMem_RENEW`)
+which are backed by `alloc`, `free`, and `realloc`, respectively. In that version
+NumPy also exposed the `PyDataMem_EventHook` function (now deprecated)
+described below, which wraps the OS-level calls.
+
+Since those early days, Python has also improved its memory management
+capabilities and, beginning in version 3.4, has provided various
+:ref:`management policies <memoryoverview>`. These routines are divided
+into a set of domains; each domain has a
+:c:type:`PyMemAllocatorEx` structure of routines for memory management. Python also
+added a `tracemalloc` module to trace calls to the various routines. These
+tracking hooks were added to the NumPy ``PyDataMem_*`` routines.
+
+NumPy added a small cache of allocated memory in its internal
+``npy_alloc_cache``, ``npy_alloc_cache_zero``, and ``npy_free_cache``
+functions. These wrap ``alloc``, ``alloc-and-memset(0)`` and ``free``
+respectively, but when ``npy_free_cache`` is called, it adds the pointer to a
+short list of available blocks marked by size. These blocks can be re-used by
+subsequent calls to ``npy_alloc*``, avoiding memory thrashing.
+
+Configurable memory routines in NumPy (NEP 49)
+----------------------------------------------
+
+Users may wish to override the internal data memory routines with ones of their
+own. Since NumPy does not use the Python domain strategy to manage data memory,
+it provides an alternative set of C-APIs to change memory routines. There are
+no Python domain-wide strategies for large chunks of object data, so those are
+less suited to NumPy's needs. Users who wish to change the NumPy data memory
+management routines can use :c:func:`PyDataMem_SetHandler`, which uses a
+:c:type:`PyDataMem_Handler` structure to hold pointers to functions used to
+manage the data memory. The calls are still wrapped by internal routines that
+call :c:func:`PyTraceMalloc_Track` and :c:func:`PyTraceMalloc_Untrack`, and they
+still use the deprecated :c:func:`PyDataMem_EventHookFunc` mechanism. Since the
+functions may change during the lifetime of the process, each ``ndarray``
+carries with it the functions used at the time of its instantiation, and these
+will be used to reallocate or free the data memory of the instance.
+
+.. c:type:: PyDataMem_Handler
+
+    A struct to hold function pointers used to manipulate memory.
+
+    .. code-block:: c
+
+        typedef struct {
+            char name[127];  /* multiple of 64 to keep the struct aligned */
+            uint8_t version; /* currently 1 */
+            PyDataMemAllocator allocator;
+        } PyDataMem_Handler;
+
+    where the allocator structure is
+
+    .. code-block:: c
+
+        /* The declaration of free differs from PyMemAllocatorEx */ 
+        typedef struct {
+            void *ctx;
+            void* (*malloc) (void *ctx, size_t size);
+            void* (*calloc) (void *ctx, size_t nelem, size_t elsize);
+            void* (*realloc) (void *ctx, void *ptr, size_t new_size);
+            void (*free) (void *ctx, void *ptr, size_t size);
+        } PyDataMemAllocator;
+
+.. c:function:: PyObject * PyDataMem_SetHandler(PyObject *handler)
+
+   Set a new allocation policy. If the input value is ``NULL``, the policy is
+   reset to the default. Return the previous policy, or
+   return ``NULL`` if an error has occurred. We wrap the user-provided functions
+   so they will still call the Python and NumPy memory management callback
+   hooks.
+    
+.. c:function:: PyObject * PyDataMem_GetHandler()
+
+   Return the current policy that will be used to allocate data for the
+   next ``PyArrayObject``. On failure, return ``NULL``.
+
+For an example of setting up and using the ``PyDataMem_Handler``, see the test in
+:file:`numpy/core/tests/test_mem_policy.py`.
+
+.. c:function:: void PyDataMem_EventHookFunc(void *inp, void *outp, size_t size, void *user_data)
+
+    This function will be called during data memory manipulation.
+
+.. c:function:: PyDataMem_EventHookFunc * PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook, void *user_data, void **old_data)
+
+    Sets the allocation event hook for numpy array data.
+  
+    Returns a pointer to the previous hook or ``NULL``.  If old_data is
+    non-``NULL``, the previous user_data pointer will be copied to it.
+  
+    If not ``NULL``, the hook will be called at the end of each ``PyDataMem_NEW/FREE/RENEW``:
+
+    .. code-block:: c
+   
+        result = PyDataMem_NEW(size)        -> (*hook)(NULL, result, size, user_data)
+        PyDataMem_FREE(ptr)                 -> (*hook)(ptr, NULL, 0, user_data)
+        result = PyDataMem_RENEW(ptr, size) -> (*hook)(ptr, result, size, user_data)
+  
+    When the hook is called, the GIL will be held by the calling
+    thread.  The hook should be written to be reentrant if it performs
+    operations that might cause new allocation events (such as the
+    creation/destruction of NumPy objects, or creating/destroying Python
+    objects which might cause a gc).
+
+    .. deprecated:: 1.23
+
+What happens when deallocating if there is no policy set
+--------------------------------------------------------
+
+A rare but useful technique is to allocate a buffer outside NumPy, use
+:c:func:`PyArray_NewFromDescr` to wrap the buffer in an ``ndarray``, then switch
+the ``OWNDATA`` flag to true. When the ``ndarray`` is released, the
+appropriate function from the ``ndarray``'s ``PyDataMem_Handler`` should be
+called to free the buffer. But because the ``PyDataMem_Handler`` field was never
+set, it will be ``NULL``. For backward compatibility, NumPy will call ``free()`` to
+release the buffer. If ``NUMPY_WARN_IF_NO_MEM_POLICY`` is set to ``1``, a
+warning will be emitted. The current default is not to emit a warning; this may
+change in a future version of NumPy.
+
+A better technique would be to use a ``PyCapsule`` as a base object:
+
+.. code-block:: c
+
+    /* define a PyCapsule_Destructor, using the correct deallocator for buf */
+    void free_wrap(PyObject *capsule) {
+        void *obj = PyCapsule_GetPointer(capsule, PyCapsule_GetName(capsule));
+        free(obj);
+    }
+
+    /* then inside the function that creates arr from buf */
+    ...
+    arr = PyArray_NewFromDescr(... buf, ...);
+    if (arr == NULL) {
+        return NULL;
+    }
+    capsule = PyCapsule_New(buf, "my_wrapped_buffer",
+                            free_wrap);
+    if (PyArray_SetBaseObject(arr, capsule) == -1) {
+        Py_DECREF(arr);
+        return NULL;
+    }
+    ...
diff --git a/doc/source/reference/c-api/index.rst b/doc/source/reference/c-api/index.rst
index bb1ed154e..6288ff33b 100644
--- a/doc/source/reference/c-api/index.rst
+++ b/doc/source/reference/c-api/index.rst
@@ -49,3 +49,4 @@ code.
    generalized-ufuncs
    coremath
    deprecations
+   data_memory
diff --git a/doc/source/reference/global_state.rst b/doc/source/reference/global_state.rst
index f18481235..20874ceaa 100644
--- a/doc/source/reference/global_state.rst
+++ b/doc/source/reference/global_state.rst
@@ -84,3 +84,13 @@ contiguous in memory.
 Most users will have no reason to change these; for details
 see the :ref:`memory layout <memory-layout>` documentation.
 
+
+Warn if no memory allocation policy when deallocating data
+----------------------------------------------------------
+
+Some users might pass ownership of the data pointer to the ``ndarray`` by
+setting the ``OWNDATA`` flag. If they do this without manually setting a
+memory allocation policy, the default will be to call ``free``. If
+``NUMPY_WARN_IF_NO_MEM_POLICY`` is set to ``"1"``, a ``RuntimeWarning`` will
+be emitted. A better alternative is to use a ``PyCapsule`` with a deallocator
+and set the ``ndarray.base``.
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
index 96cd47017..aaabc9b39 100644
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -55,7 +55,7 @@ properties than the legacy `MT19937` used in `RandomState`.
   more_vals = random.standard_normal(10)
 
 `Generator` can be used as a replacement for `RandomState`. Both class
-instances hold a internal `BitGenerator` instance to provide the bit
+instances hold an internal `BitGenerator` instance to provide the bit
 stream, it is accessible as ``gen.bit_generator``. Some long-overdue API
 cleanup means that legacy and compatibility methods have been removed from
 `Generator`
diff --git a/doc/source/reference/random/performance.rst b/doc/source/reference/random/performance.rst
index 85855be59..cb9b94113 100644
--- a/doc/source/reference/random/performance.rst
+++ b/doc/source/reference/random/performance.rst
@@ -13,7 +13,7 @@ full-featured, and fast on most platforms, but somewhat slow when compiled for
 parallelism would indicate using `PCG64DXSM`.
 
 `Philox` is fairly slow, but its statistical properties have
-very high quality, and it is easy to get assuredly-independent stream by using
+very high quality, and it is easy to get an assuredly-independent stream by using
 unique keys. If that is the style you wish to use for parallel streams, or you
 are porting from another system that uses that style, then
 `Philox` is your choice.
diff --git a/doc/source/reference/routines.math.rst b/doc/source/reference/routines.math.rst
index 3c2f96830..2a09b8d20 100644
--- a/doc/source/reference/routines.math.rst
+++ b/doc/source/reference/routines.math.rst
@@ -143,6 +143,21 @@ Handling complex numbers
    conj
    conjugate
 
+Extrema finding
+---------------
+.. autosummary::
+   :toctree: generated/
+
+   maximum
+   fmax
+   amax
+   nanmax
+   
+   minimum
+   fmin
+   amin
+   nanmin
+   
 
 Miscellaneous
 -------------
@@ -160,11 +175,7 @@ Miscellaneous
    fabs
    sign
    heaviside
-   maximum
-   minimum
-   fmax
-   fmin
-
+   
    nan_to_num
    real_if_close
 
diff --git a/doc/source/reference/routines.statistics.rst b/doc/source/reference/routines.statistics.rst
index c675b6090..cd93e6025 100644
--- a/doc/source/reference/routines.statistics.rst
+++ b/doc/source/reference/routines.statistics.rst
@@ -9,11 +9,7 @@ Order statistics
 
 .. autosummary::
    :toctree: generated/
-
-   amin
-   amax
-   nanmin
-   nanmax
+   
    ptp
    percentile
    nanpercentile