author: Charles Harris <charlesr.harris@gmail.com> 2014-10-27 12:07:48 -0700
committer: Charles Harris <charlesr.harris@gmail.com> 2014-10-27 12:07:48 -0700
commit: 16575443239fa84615fc795692a79ef27f25c216
tree: ed9c9e0fb1f74d45eebc69317a4680ce5160cb11 /doc/source
parent: 23ee379e86434518bc33ccd9e711a86188914de0
parent: 528bac1380c782772b9de207bb8466b03117b96d
Merge pull request #5077 from jaimefrio/gufuncs_core_dim_no_broadcast
WIP: gufunc core dimensions should not broadcast
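The shape rules that this merge documents (core dimensions matched from the end of each shape, same-label core dimensions required to match exactly, loop dimensions broadcast together, undetermined output dimensions rejected) can be illustrated with a small pure-Python sketch. This is not NumPy's implementation: the helper names `parse_signature`, `broadcast_shapes`, and `resolve_gufunc_shapes` are hypothetical, and the sketch only covers the case where no output arrays are passed in.

```python
import re


def parse_signature(signature):
    """Split a signature such as '(i),(i)->()' into per-argument name tuples."""
    def parse_side(side):
        return [
            tuple(name for name in group.split(",") if name)
            for group in re.findall(r"\(([^()]*)\)", side)
        ]

    inputs, outputs = signature.replace(" ", "").split("->")
    return parse_side(inputs), parse_side(outputs)


def broadcast_shapes(*shapes):
    """Broadcast loop shapes together, aligning them from the last axis."""
    ndim = max((len(shape) for shape in shapes), default=0)
    result = []
    for axis in range(1, ndim + 1):
        sizes = {shape[-axis] for shape in shapes if len(shape) >= axis}
        sizes.discard(1)  # size-1 loop dimensions stretch to match the others
        if len(sizes) > 1:
            raise ValueError("loop dimensions cannot be broadcast together")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


def resolve_gufunc_shapes(signature, *input_shapes):
    """Return the output shapes implied by the strict (no-broadcast) core rules."""
    in_names, out_names = parse_signature(signature)
    if len(in_names) != len(input_shapes):
        raise ValueError("wrong number of input shapes for this signature")

    core_sizes = {}
    loop_shapes = []
    for shape, names in zip(input_shapes, in_names):
        split = len(shape) - len(names)
        if split < 0:
            # Core dimensions must be present; no 1's are prepended.
            raise ValueError("input is missing required core dimensions")
        for name, size in zip(names, shape[split:]):
            # Same label means exactly the same size; no broadcasting here.
            if core_sizes.setdefault(name, size) != size:
                raise ValueError("mismatched size for core dimension %r" % name)
        loop_shapes.append(shape[:split])

    loop = broadcast_shapes(*loop_shapes)

    outputs = []
    for names in out_names:
        for name in names:
            if name not in core_sizes:
                raise ValueError("size of core dimension %r is undetermined" % name)
        outputs.append(loop + tuple(core_sizes[name] for name in names))
    return outputs
```

Under these assumptions, `resolve_gufunc_shapes("(i),(i)->()", (3, 5, 7), (5, 7))` gives a single output shape of `(3, 5)`, while passing `(5, 1)` as the second shape raises an error instead of broadcasting the core dimension.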
Diffstat (limited to 'doc/source')
-rw-r--r-- doc/source/reference/c-api.generalized-ufuncs.rst 66
-rw-r--r-- doc/source/reference/c-api.ufunc.rst 16
2 files changed, 58 insertions, 24 deletions
diff --git a/doc/source/reference/c-api.generalized-ufuncs.rst b/doc/source/reference/c-api.generalized-ufuncs.rst
index 14f33efcb..92dc8aec0 100644
--- a/doc/source/reference/c-api.generalized-ufuncs.rst
+++ b/doc/source/reference/c-api.generalized-ufuncs.rst
@@ -18,30 +18,52 @@ arguments is called the "signature" of a ufunc. For example, the
ufunc numpy.add has signature ``(),()->()`` defining two scalar inputs
and one scalar output.
-Another example is the function ``inner1d(a,b)`` with a signature of
-``(i),(i)->()``. This applies the inner product along the last axis of
+Another example is the function ``inner1d(a, b)`` with a signature of
+``(i),(i)->()``. This applies the inner product along the last axis of
each input, but keeps the remaining indices intact.
-For example, where ``a`` is of shape ``(3,5,N)``
-and ``b`` is of shape ``(5,N)``, this will return an output of shape ``(3,5)``.
+For example, where ``a`` is of shape ``(3, 5, N)`` and ``b`` is of shape
+``(5, N)``, this will return an output of shape ``(3,5)``.
The underlying elementary function is called ``3 * 5`` times. In the
signature, we specify one core dimension ``(i)`` for each input and zero core
dimensions ``()`` for the output, since it takes two 1-d arrays and
returns a scalar. By using the same name ``i``, we specify that the two
-corresponding dimensions should be of the same size (or one of them is
-of size 1 and will be broadcasted).
+corresponding dimensions should be of the same size.
The dimensions beyond the core dimensions are called "loop" dimensions. In
-the above example, this corresponds to ``(3,5)``.
-
-The usual numpy "broadcasting" rules apply, where the signature
-determines how the dimensions of each input/output object are split
-into core and loop dimensions:
-
-#. While an input array has a smaller dimensionality than the corresponding
- number of core dimensions, 1's are pre-pended to its shape.
+the above example, this corresponds to ``(3, 5)``.
+
+The signature determines how the dimensions of each input/output array are
+split into core and loop dimensions:
+
+#. Each dimension in the signature is matched to a dimension of the
+ corresponding passed-in array, starting from the end of the shape tuple.
+ These are the core dimensions, and they must be present in the arrays, or
+ an error will be raised.
+#. Core dimensions assigned to the same label in the signature (e.g. the
+ ``i`` in ``inner1d``'s ``(i),(i)->()``) must have exactly matching sizes,
+ no broadcasting is performed.
#. The core dimensions are removed from all inputs and the remaining
- dimensions are broadcasted; defining the loop dimensions.
-#. The output is given by the loop dimensions plus the output core dimensions.
+ dimensions are broadcast together, defining the loop dimensions.
+#. The shape of each output is determined from the loop dimensions plus the
+ output's core dimensions
+
+Typically, the size of all core dimensions in an output will be determined by
+the size of a core dimension with the same label in an input array. This is
+not a requirement, and it is possible to define a signature where a label
+comes up for the first time in an output, although some precautions must be
+taken when calling such a function. An example would be the function
+``euclidean_pdist(a)``, with signature ``(n,d)->(p)``, that given an array of
+``n`` ``d``-dimensional vectors, computes all unique pairwise Euclidean
+distances among them. The output dimension ``p`` must therefore be equal to
+``n * (n - 1) / 2``, but it is the caller's responsibility to pass in an
+output array of the right size. If the size of a core dimension of an output
+cannot be determined from a passed in input or output array, an error will be
+raised.
+
+Note: Prior to Numpy 1.10.0, less strict checks were in place: missing core
+dimensions were created by prepending 1's to the shape as necessary, core
+dimensions with the same label were broadcast together, and undetermined
+dimensions were created with size 1.
Definitions
@@ -70,7 +92,7 @@ Core Dimension
Dimension Name
A dimension name represents a core dimension in the signature.
Different dimensions may share a name, indicating that they are of
- the same size (or are broadcastable).
+ the same size.
Dimension Index
A dimension index is an integer representing a dimension name. It
@@ -93,8 +115,7 @@ following format:
* Dimension lists for different arguments are separated by ``","``.
Input/output arguments are separated by ``"->"``.
* If one uses the same dimension name in multiple locations, this
- enforces the same size (or broadcastable size) of the corresponding
- dimensions.
+ enforces the same size of the corresponding dimensions.
The formal syntax of signatures is as follows::
@@ -111,10 +132,9 @@ The formal syntax of signatures is as follows::
Notes:
#. All quotes are for clarity.
-#. Core dimensions that share the same name must be broadcastable, as
- the two ``i`` in our example above. Each dimension name typically
- corresponding to one level of looping in the elementary function's
- implementation.
+#. Core dimensions that share the same name must have the exact same size.
+ Each dimension name typically corresponds to one level of looping in the
+ elementary function's implementation.
#. White spaces are ignored.
Here are some examples of signatures:
diff --git a/doc/source/reference/c-api.ufunc.rst b/doc/source/reference/c-api.ufunc.rst
index 71abffd04..3673958d9 100644
--- a/doc/source/reference/c-api.ufunc.rst
+++ b/doc/source/reference/c-api.ufunc.rst
@@ -114,7 +114,6 @@ Functions
data type, it will be internally upcast to the int_ (or uint)
data type.
-
:param doc:
Allows passing in a documentation string to be stored with the
ufunc. The documentation string should not contain the name
@@ -128,6 +127,21 @@ Functions
structure and it does get set with this value when the ufunc
object is created.
+.. cfunction:: PyObject* PyUFunc_FromFuncAndDataAndSignature(PyUFuncGenericFunction* func,
+ void** data, char* types, int ntypes, int nin, int nout, int identity,
+ char* name, char* doc, int check_return, char *signature)
+
+ This function is very similar to PyUFunc_FromFuncAndData above, but has
+ an extra *signature* argument, to define generalized universal functions.
+ Similarly to how ufuncs are built around an element-by-element operation,
+ gufuncs are around subarray-by-subarray operations, the signature defining
+ the subarrays to operate on.
+
+ :param signature:
+ The signature for the new gufunc. Setting it to NULL is equivalent
+ to calling PyUFunc_FromFuncAndData. A copy of the string is made,
+ so the passed in buffer can be freed.
+
.. cfunction:: int PyUFunc_RegisterLoopForType(PyUFuncObject* ufunc,
int usertype, PyUFuncGenericFunction function, int* arg_types, void* data)