diff options
author | jaimefrio <jaime.frio@gmail.com> | 2014-10-20 22:43:35 -0700 |
---|---|---|
committer | jaimefrio <jaime.frio@gmail.com> | 2014-10-20 23:07:13 -0700 |
commit | 528bac1380c782772b9de207bb8466b03117b96d (patch) | |
tree | e5a492af281410b74f622dd202fddabaf168ded2 /doc | |
parent | 140c50537087dccc764cd7540b46b039cb934530 (diff) | |
download | numpy-528bac1380c782772b9de207bb8466b03117b96d.tar.gz |
DOC: Stricter checks for gufunc signatures
Documented the the new behavior in c-api.generalized-ufuncs.rst.
Added PyUFunc_FromFuncAndDataAndSignature to c-api.ufunc.rst.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/release/1.10.0-notes.rst | 8 | ||||
-rw-r--r-- | doc/source/reference/c-api.generalized-ufuncs.rst | 66 | ||||
-rw-r--r-- | doc/source/reference/c-api.ufunc.rst | 16 |
3 files changed, 66 insertions, 24 deletions
diff --git a/doc/release/1.10.0-notes.rst b/doc/release/1.10.0-notes.rst index 553267cad..34b4c4d0e 100644 --- a/doc/release/1.10.0-notes.rst +++ b/doc/release/1.10.0-notes.rst @@ -47,6 +47,14 @@ The cblas versions of dot, inner, and vdot have been integrated into the multiarray module. In particular, vdot is now a multiarray function, which it was not before. +stricter check of gufunc signature compliance +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Inputs to generalized universal functions are now more strictly checked +against the function's signature: all core dimensions are now required to +be present in input arrays; core dimensions with the same label must have +the exact same size; and output core dimension's must be specified, either +by a same label input core dimension or by a passed-in output array. + Deprecations ============ diff --git a/doc/source/reference/c-api.generalized-ufuncs.rst b/doc/source/reference/c-api.generalized-ufuncs.rst index 14f33efcb..92dc8aec0 100644 --- a/doc/source/reference/c-api.generalized-ufuncs.rst +++ b/doc/source/reference/c-api.generalized-ufuncs.rst @@ -18,30 +18,52 @@ arguments is called the "signature" of a ufunc. For example, the ufunc numpy.add has signature ``(),()->()`` defining two scalar inputs and one scalar output. -Another example is the function ``inner1d(a,b)`` with a signature of -``(i),(i)->()``. This applies the inner product along the last axis of +Another example is the function ``inner1d(a, b)`` with a signature of +``(i),(i)->()``. This applies the inner product along the last axis of each input, but keeps the remaining indices intact. -For example, where ``a`` is of shape ``(3,5,N)`` -and ``b`` is of shape ``(5,N)``, this will return an output of shape ``(3,5)``. +For example, where ``a`` is of shape ``(3, 5, N)`` and ``b`` is of shape +``(5, N)``, this will return an output of shape ``(3,5)``. The underlying elementary function is called ``3 * 5`` times. In the signature, we specify one core dimension ``(i)`` for each input and zero core dimensions ``()`` for the output, since it takes two 1-d arrays and returns a scalar. By using the same name ``i``, we specify that the two -corresponding dimensions should be of the same size (or one of them is -of size 1 and will be broadcasted). +corresponding dimensions should be of the same size. The dimensions beyond the core dimensions are called "loop" dimensions. In -the above example, this corresponds to ``(3,5)``. - -The usual numpy "broadcasting" rules apply, where the signature -determines how the dimensions of each input/output object are split -into core and loop dimensions: - -#. While an input array has a smaller dimensionality than the corresponding - number of core dimensions, 1's are pre-pended to its shape. +the above example, this corresponds to ``(3, 5)``. + +The signature determines how the dimensions of each input/output array are +split into core and loop dimensions: + +#. Each dimension in the signature is matched to a dimension of the + corresponding passed-in array, starting from the end of the shape tuple. + These are the core dimensions, and they must be present in the arrays, or + an error will be raised. +#. Core dimensions assigned to the same label in the signature (e.g. the + ``i`` in ``inner1d``'s ``(i),(i)->()``) must have exactly matching sizes, + no broadcasting is performed. #. The core dimensions are removed from all inputs and the remaining - dimensions are broadcasted; defining the loop dimensions. -#. The output is given by the loop dimensions plus the output core dimensions. + dimensions are broadcast together, defining the loop dimensions. +#. The shape of each output is determined from the loop dimensions plus the + output's core dimensions + +Typically, the size of all core dimensions in an output will be determined by +the size of a core dimension with the same label in an input array. This is +not a requirement, and it is possible to define a signature where a label +comes up for the first time in an output, although some precautions must be +taken when calling such a function. An example would be the function +``euclidean_pdist(a)``, with signature ``(n,d)->(p)``, that given an array of +``n`` ``d``-dimensional vectors, computes all unique pairwise Euclidean +distances among them. The output dimension ``p`` must therefore be equal to +``n * (n - 1) / 2``, but it is the caller's responsibility to pass in an +output array of the right size. If the size of a core dimension of an output +cannot be determined from a passed in input or output array, an error will be +raised. + +Note: Prior to Numpy 1.10.0, less strict checks were in place: missing core +dimensions were created by prepending 1's to the shape as necessary, core +dimensions with the same label were broadcast together, and undetermined +dimensions were created with size 1. Definitions @@ -70,7 +92,7 @@ Core Dimension Dimension Name A dimension name represents a core dimension in the signature. Different dimensions may share a name, indicating that they are of - the same size (or are broadcastable). + the same size. Dimension Index A dimension index is an integer representing a dimension name. It @@ -93,8 +115,7 @@ following format: * Dimension lists for different arguments are separated by ``","``. Input/output arguments are separated by ``"->"``. * If one uses the same dimension name in multiple locations, this - enforces the same size (or broadcastable size) of the corresponding - dimensions. + enforces the same size of the corresponding dimensions. The formal syntax of signatures is as follows:: @@ -111,10 +132,9 @@ The formal syntax of signatures is as follows:: Notes: #. All quotes are for clarity. -#. Core dimensions that share the same name must be broadcastable, as - the two ``i`` in our example above. Each dimension name typically - corresponding to one level of looping in the elementary function's - implementation. +#. Core dimensions that share the same name must have the exact same size. + Each dimension name typically corresponds to one level of looping in the + elementary function's implementation. #. White spaces are ignored. Here are some examples of signatures: diff --git a/doc/source/reference/c-api.ufunc.rst b/doc/source/reference/c-api.ufunc.rst index 71abffd04..3673958d9 100644 --- a/doc/source/reference/c-api.ufunc.rst +++ b/doc/source/reference/c-api.ufunc.rst @@ -114,7 +114,6 @@ Functions data type, it will be internally upcast to the int_ (or uint) data type. - :param doc: Allows passing in a documentation string to be stored with the ufunc. The documentation string should not contain the name @@ -128,6 +127,21 @@ Functions structure and it does get set with this value when the ufunc object is created. +.. cfunction:: PyObject* PyUFunc_FromFuncAndDataAndSignature(PyUFuncGenericFunction* func, + void** data, char* types, int ntypes, int nin, int nout, int identity, + char* name, char* doc, int check_return, char *signature) + + This function is very similar to PyUFunc_FromFuncAndData above, but has + an extra *signature* argument, to define generalized universal functions. + Similarly to how ufuncs are built around an element-by-element operation, + gufuncs are around subarray-by-subarray operations, the signature defining + the subarrays to operate on. + + :param signature: + The signature for the new gufunc. Setting it to NULL is equivalent + to calling PyUFunc_FromFuncAndData. A copy of the string is made, + so the passed in buffer can be freed. + .. cfunction:: int PyUFunc_RegisterLoopForType(PyUFuncObject* ufunc, int usertype, PyUFuncGenericFunction function, int* arg_types, void* data) |