From 83a55036e9f0e4dca9819b4e6d1eb326bcf4167f Mon Sep 17 00:00:00 2001 From: Chris Jordan-Squire Date: Wed, 3 Aug 2011 16:11:06 -0500 Subject: DOCS: New ufunc creation docs --- doc/source/reference/c-api.coremath.rst | 4 +- doc/source/user/c-info.beyond-basics.rst | 180 ------ doc/source/user/c-info.rst | 1 + doc/source/user/c-info.ufunc-tutorial.rst | 897 ++++++++++++++++++++++++++++++ 4 files changed, 900 insertions(+), 182 deletions(-) create mode 100644 doc/source/user/c-info.ufunc-tutorial.rst (limited to 'doc') diff --git a/doc/source/reference/c-api.coremath.rst b/doc/source/reference/c-api.coremath.rst index 6584f216d..dba092a20 100644 --- a/doc/source/reference/c-api.coremath.rst +++ b/doc/source/reference/c-api.coremath.rst @@ -175,9 +175,9 @@ Linking against the core math library in an extension To use the core math library in your own extension, you need to add the npymath compile and link options to your extension in your setup.py: - >>> from numpy.distutils.misc_utils import get_info + >>> from numpy.distutils.misc_util import get_info >>> info = get_info('npymath') - >>> config.add_extension('foo', sources=['foo.c'], extra_info=**info) + >>> config.add_extension('foo', sources=['foo.c'], extra_info=info) In other words, the usage of info is exactly the same as when using blas_info and co. diff --git a/doc/source/user/c-info.beyond-basics.rst b/doc/source/user/c-info.beyond-basics.rst index 5ff92a122..9ed2ab386 100644 --- a/doc/source/user/c-info.beyond-basics.rst +++ b/doc/source/user/c-info.beyond-basics.rst @@ -204,186 +204,6 @@ There are several examples of using the multi-iterator in the NumPy source code as it makes N-dimensional broadcasting-code very simple to write. Browse the source for more examples. -.. _`sec:Creating-a-new`: - -Creating a new universal function -================================= - -.. index:: - pair: ufunc; adding new - -The umath module is a computer-generated C-module that creates many -ufuncs. It provides a great many examples of how to create a universal -function. Creating your own ufunc that will make use of the ufunc -machinery is not difficult either. Suppose you have a function that -you want to operate element-by-element over its inputs. By creating a -new ufunc you will obtain a function that handles - -- broadcasting - -- N-dimensional looping - -- automatic type-conversions with minimal memory usage - -- optional output arrays - -It is not difficult to create your own ufunc. All that is required is -a 1-d loop for each data-type you want to support. Each 1-d loop must -have a specific signature, and only ufuncs for fixed-size data-types -can be used. The function call used to create a new ufunc to work on -built-in data-types is given below. A different mechanism is used to -register ufuncs for user-defined data-types. - -.. cfunction:: PyObject *PyUFunc_FromFuncAndData( PyUFuncGenericFunction* func, - void** data, char* types, int ntypes, int nin, int nout, int identity, - char* name, char* doc, int check_return) - - *func* - - A pointer to an array of 1-d functions to use. This array must be at - least ntypes long. Each entry in the array must be a - ``PyUFuncGenericFunction`` function. This function has the following - signature. An example of a valid 1d loop function is also given. - - .. cfunction:: void loop1d(char** args, npy_intp* dimensions, - npy_intp* steps, void* data) - - *args* - - An array of pointers to the actual data for the input and output - arrays. The input arguments are given first followed by the output - arguments. - - *dimensions* - - A pointer to the size of the dimension over which this function is - looping. - - *steps* - - A pointer to the number of bytes to jump to get to the - next element in this dimension for each of the input and - output arguments. - - *data* - - Arbitrary data (extra arguments, function names, *etc.* ) - that can be stored with the ufunc and will be passed in - when it is called. - - .. code-block:: c - - static void - double_add(char *args, npy_intp *dimensions, npy_intp *steps, - void *extra) - { - npy_intp i; - npy_intp is1=steps[0], is2=steps[1]; - npy_intp os=steps[2], n=dimensions[0]; - char *i1=args[0], *i2=args[1], *op=args[2]; - for (i=0; i`_ and in `How to extend +Numpy `_ + +The umath module is a computer-generated C-module that creates many +ufuncs. It provides a great many examples of how to create a universal +function. Creating your own ufunc that will make use of the ufunc +machinery is not difficult either. Suppose you have a function that +you want to operate element-by-element over its inputs. By creating a +new ufunc you will obtain a function that handles + +- broadcasting + +- N-dimensional looping + +- automatic type-conversions with minimal memory usage + +- optional output arrays + +It is not difficult to create your own ufunc. All that is required is +a 1-d loop for each data-type you want to support. Each 1-d loop must +have a specific signature, and only ufuncs for fixed-size data-types +can be used. The function call used to create a new ufunc to work on +built-in data-types is given below. A different mechanism is used to +register ufuncs for user-defined data-types. + +In the next several sections we give example code that can be +easily modified to create your own ufuncs. The examples are +successively more complete or complicated versions of the logit +function, a common function in statistical modeling. Logit is also +interesting because, due to the magic of IEEE standards (specifically +IEEE 754), all of the logit functions created below +automatically have the following behavior. + +>>> logit(0) +-inf +>>> logit(1) +inf +>>> logit(2) +nan +>>> logit(-2) +nan + +This is wonderful because the function writer doesn't have to +manually propagate infs or nans. + +.. _`sec:Non-numpy-example`: + +Example Non-ufunc extension +=========================== + +.. index:: + pair: ufunc; adding new + +For comparison and general edificaiton of the reader we provide +a simple implementation of a C extension of logit that uses no +numpy. + +To do this we need two files. The first is the C file which contains +the actual code, and the second is the setup.py file used to create +the module. + + .. code-block:: c + + #include + #include + + /* + spammodule.c + This is the C code for a non-numpy Python extension to + define the logit function, where logit(p) = log(p/(1-p)). + This function will not work on numpy arrays automatically. + numpy.vectorize must be called in python to generate + a numpy-friendly function. + + Details explaining the Python-C API can be found under + 'Extending and Embedding' and 'Python/C API' at + docs.python.org . + */ + + + /* This declares the logit function */ + static PyObject* spam_logit(PyObject *self, PyObject *args); + + + /* This tells Python what methods this module has */ + static PyMethodDef SpamMethods[] = { + {"logit", spam_logit, METH_VARARGS, + "compute logit"}, + {NULL, NULL, 0, NULL} + }; + + + /* + This actually defines the logit function for + input args from Python. + */ + + static PyObject* spam_logit(PyObject *self, PyObject *args){ + + double p; + + /* This parses the Python argument into a double */ + if(!PyArg_ParseTuple(args, "d", &p)){ + return NULL; + } + + /* THE ACTUAL LOGIT FUNCTION */ + p = p/(1-p); + p = log(p); + + /*This builds the answer back into a python object */ + return Py_BuildValue("d", p); + } + + + /* This initiates the module using the above definitions. */ + PyMODINIT_FUNC initspam(void){ + PyObject *m; + + m = Py_InitModule("spam", SpamMethods); + if( m==NULL){ + return; + } + + } + +To use the setup.py file, place setup.py and spammodule.c in the same +folder. Then python setup.py build will build the module to import, +or setup.py install will install the module to your site-packages +directory. + + .. code-block:: python + + ''' + setup.py file for spammodule.c + + Calling + $python setup.py build + will build the extension library in a file + that looks like ./build/lib*, where lib* is + a file that begins with lib. + + Calling + $python setup.py install + will install the module in your site-packages file. + + See the distutils section of + 'Extending and Embedding the Python Interpreter' + at docs.python.org for more information. + ''' + + + from distutils.core import setup, Extension + + module1 = Extension('spam', sources=['spammodule.c'], + include_dirs=['/usr/local/lib']) + + setup(name = 'spam', + version='1.0', + description='This is my spam package', + ext_modules = [module1]) + + +Once the spam module is imported into python, you can call logit +via spam.logit. Note that the function used above cannot be applied +as-is to numpy arrays. To do so we must call numpy.vectorize on it. +For example: + +>>> import numpy as np +>>> import spam +>>> f = np.vectorize(spam.logit) + +THE RESULTING LOGIT FUNCTION IS NOT FAST! numpy.vectorize simply +loops over spam.logit. The loop is done at the C level, but the numpy +array is constantly being parsed and build back up. This is expensive. +When the author compared numpy.vectorize(spam.logit) against the +logit ufuncs constructed below, the logit ufuncs were almost exactly +4 times faster. Larger or smaller speedups are, of course, possible +depending on the nature of the function. + + +.. _`sec:Numpy-one-loop`: + +Example Numpy ufunc for one dtype +================================= + +.. index:: + pair: ufunc; adding new + +For simplicity we give a ufunc for a single dtype, the 'f8' double. +As in the previous section, we first give the .c file and then the +setup.py file used to create the module containing the ufunc. + +The place in the code corresponding to the actual computations for +the ufunc are marked with /\*BEGIN main ufunc computation\*/ and +/\*END main ufunc computation\*/. The code in between those lines is +the primary thing that must be changed to create your own ufunc. + + .. code-block:: c + + #include "Python.h" + #include "math.h" + #include "numpy/ndarraytypes.h" + #include "numpy/ufuncobject.h" + #include "numpy/halffloat.h" + + /* + single_type_logit.c + This is the C code for creating your own + Numpy ufunc for a logit function. + + In this code we only define the ufunc for + a single dtype. The computations that must + be replaced to create a ufunc for + a different funciton are marked with BEGIN + and END. + + Details explaining the Python-C API can be found under + 'Extending and Embedding' and 'Python/C API' at + docs.python.org . + + */ + + + static PyMethodDef LogitMethods[] = { + {NULL, NULL, 0, NULL} + }; + + + static void double_logit(char **args, npy_intp *dimensions, + npy_intp* steps, void* data){ + + npy_intp i; + npy_intp n=dimensions[0]; + char *in=args[0], *out=args[1]; + npy_intp in_step=steps[0], out_step=steps[1]; + + double tmp; + + for(i=0; i>> import numpy as np +>>> import npspam +>>> npspam.logit(0.5) +0.0 +>>> a = np.linspace(0,1,5) +>>> npspam.logit(a) +array([ -inf, -1.09861229, 0. , 1.09861229, inf]) + + + +.. _`sec:Numpy-many-loop`: + +Example Numpy ufunc with multiple dtypes +======================================== + +.. index:: + pair: ufunc; adding new + +We finally give an example of a full ufunc, with inner loops for +half-floats, floats, doubles, and long doubles. As in the previous +sections we first give the .c file and then the corresponding +setup.py file. + +The places in the code corresponding to the actual computations for +the ufunc are marked with /\*BEGIN main ufunc computation\*/ and +/\*END main ufunc computation\*/. The code in between those lines is +the primary thing that must be changed to create your own ufunc. + + + .. code-block:: c + + #include "Python.h" + #include "math.h" + #include "numpy/ndarraytypes.h" + #include "numpy/ufuncobject.h" + #include "numpy/halffloat.h" + + /* + multi_type_logit.c + This is the C code for creating your own + Numpy ufunc for a logit function. + + Each function of the form type_logit defines the + logit function for a different numpy dtype. Each + of these functions must be modified when you + create your own ufunc. The computations that must + be replaced to create a ufunc for + a different funciton are marked with BEGIN + and END. + + Details explaining the Python-C API can be found under + 'Extending and Embedding' and 'Python/C API' at + docs.python.org . + + */ + + + static PyMethodDef LogitMethods[] = { + {NULL, NULL, 0, NULL} + }; + + + static void long_double_logit(char **args, npy_intp *dimensions, + npy_intp* steps, void* data){ + + npy_intp i; + npy_intp n=dimensions[0]; + char *in=args[0], *out=args[1]; + npy_intp in_step=steps[0], out_step=steps[1]; + + long double tmp; + + for(i=0; i>> import numpy as np +>>> import npspam +>>> npspam.logit(0.5) +0.0 +>>> a = np.linspace(0,1,5) +>>> npspam.logit(a) +array([ -inf, -1.09861229, 0. , 1.09861229, inf]) + + + +.. _`sec:Numpy-many-arg`: + +Example Numpy ufunc with multiple arguments/return values +========================================================= + +Our final example is a ufunc with multiple arguments. It is a modification +of the code for a logit ufunc for data with a single dtype. We +compute (A*B, logit(A*B)). + +We only give the C code as the setup.py file is exactly the same as +the setup.py file in `Example Numpy ufunc for one dtype`_, except that +the line + + .. code-block:: python + + config.add_extension('npspam', ['single_type_logit.c']) + +is replaced with + + .. code-block:: python + + config.add_extension('npspam', ['multi_arg_logit.c']) + +The C file is given below. The ufunc generated takes two arguments A +and B. It returns a tuple whose first element is A*B and whose second +element is logit(A*B). Note that it automatically supports broadcasting, +as well as all other properties of a ufunc. + + .. code-block:: c + + #include "Python.h" + #include "math.h" + #include "numpy/ndarraytypes.h" + #include "numpy/ufuncobject.h" + #include "numpy/halffloat.h" + + /* + multi_arg_logit.c + This is the C code for creating your own + Numpy ufunc for a multiple argument, multiple + return value ufunc. The places where the + ufunc computation is carried out are marked + with comments. + + Details explaining the Python-C API can be found under + 'Extending and Embedding' and 'Python/C API' at + docs.python.org . + + */ + + + static PyMethodDef LogitMethods[] = { + {NULL, NULL, 0, NULL} + }; + + + static void double_logitprod(char **args, npy_intp *dimensions, + npy_intp* steps, void* data){ + + npy_intp i; + npy_intp n=dimensions[0]; + char *in1=args[0], *in2=args[1]; + char *out1=args[2], *out2=args[3]; + npy_intp in1_step=steps[0], in2_step=steps[1]; + npy_intp out1_step=steps[2], out2_step=steps[3]; + + double tmp; + + for(i=0; i