diff options
Diffstat (limited to 'trunk/source/user')
-rw-r--r-- | trunk/source/user/basics.broadcasting.rst | 7
-rw-r--r-- | trunk/source/user/basics.creation.rst | 9
-rw-r--r-- | trunk/source/user/basics.indexing.rst | 16
-rw-r--r-- | trunk/source/user/basics.rec.rst | 5
-rw-r--r-- | trunk/source/user/basics.rst | 17
-rw-r--r-- | trunk/source/user/basics.subclassing.rst | 7
-rw-r--r-- | trunk/source/user/basics.types.rst | 14
-rw-r--r-- | trunk/source/user/c-info.beyond-basics.rst | 734
-rw-r--r-- | trunk/source/user/c-info.how-to-extend.rst | 641
-rw-r--r-- | trunk/source/user/c-info.python-as-glue.rst | 1523
-rw-r--r-- | trunk/source/user/c-info.rst | 9
-rw-r--r-- | trunk/source/user/howtofind.rst | 9
-rw-r--r-- | trunk/source/user/index.rst | 27
-rw-r--r-- | trunk/source/user/misc.rst | 9
-rw-r--r-- | trunk/source/user/performance.rst | 7
15 files changed, 0 insertions, 3034 deletions
diff --git a/trunk/source/user/basics.broadcasting.rst b/trunk/source/user/basics.broadcasting.rst
deleted file mode 100644
index 65584b1fd..000000000
--- a/trunk/source/user/basics.broadcasting.rst
+++ /dev/null
@@ -1,7 +0,0 @@
-************
-Broadcasting
-************
-
-.. seealso:: :class:`numpy.broadcast`
-
-.. automodule:: numpy.doc.broadcasting
diff --git a/trunk/source/user/basics.creation.rst b/trunk/source/user/basics.creation.rst
deleted file mode 100644
index b3fa81017..000000000
--- a/trunk/source/user/basics.creation.rst
+++ /dev/null
@@ -1,9 +0,0 @@
-.. _arrays.creation:
-
-**************
-Array creation
-**************
-
-.. seealso:: :ref:`Array creation routines <routines.array-creation>`
-
-.. automodule:: numpy.doc.creation
diff --git a/trunk/source/user/basics.indexing.rst b/trunk/source/user/basics.indexing.rst
deleted file mode 100644
index 7427874a5..000000000
--- a/trunk/source/user/basics.indexing.rst
+++ /dev/null
@@ -1,16 +0,0 @@
-.. _basics.indexing:
-
-********
-Indexing
-********
-
-.. seealso:: :ref:`Indexing routines <routines.indexing>`
-
-.. note::
-
-   XXX: Combine ``numpy.doc.indexing`` with material from
-   section 2.2 Basic indexing?
-   Or incorporate the material directly here?
-
-
-.. automodule:: numpy.doc.indexing
diff --git a/trunk/source/user/basics.rec.rst b/trunk/source/user/basics.rec.rst
deleted file mode 100644
index 81a3de8e3..000000000
--- a/trunk/source/user/basics.rec.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-***************************************
-Structured arrays (aka "Record arrays")
-***************************************
-
-.. automodule:: numpy.doc.structured_arrays
diff --git a/trunk/source/user/basics.rst b/trunk/source/user/basics.rst
deleted file mode 100644
index b31f38ae9..000000000
--- a/trunk/source/user/basics.rst
+++ /dev/null
@@ -1,17 +0,0 @@
-************
-Numpy basics
-************
-
-.. note::
-
-   XXX: there is overlap between this text extracted from ``numpy.doc``
-   and "Guide to Numpy" chapter 2. Needs combining?
-
-.. toctree::
-
-   basics.types
-   basics.creation
-   basics.indexing
-   basics.broadcasting
-   basics.rec
-   basics.subclassing
diff --git a/trunk/source/user/basics.subclassing.rst b/trunk/source/user/basics.subclassing.rst
deleted file mode 100644
index 43315521c..000000000
--- a/trunk/source/user/basics.subclassing.rst
+++ /dev/null
@@ -1,7 +0,0 @@
-.. _basics.subclassing:
-
-*******************
-Subclassing ndarray
-*******************
-
-.. automodule:: numpy.doc.subclassing
diff --git a/trunk/source/user/basics.types.rst b/trunk/source/user/basics.types.rst
deleted file mode 100644
index 1a95dc6b4..000000000
--- a/trunk/source/user/basics.types.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-**********
-Data types
-**********
-
-.. seealso:: :ref:`Data type objects <arrays.dtypes>`
-
-.. note::
-
-   XXX: Combine ``numpy.doc.basics`` with material from
-   "Guide to Numpy" (section 2.1 Data-Type descriptors)?
-   Or incorporate the material directly here?
-
-
-.. automodule:: numpy.doc.basics
diff --git a/trunk/source/user/c-info.beyond-basics.rst b/trunk/source/user/c-info.beyond-basics.rst
deleted file mode 100644
index 905ab67eb..000000000
--- a/trunk/source/user/c-info.beyond-basics.rst
+++ /dev/null
@@ -1,734 +0,0 @@
-*****************
-Beyond the Basics
-*****************
-
-| The voyage of discovery is not in seeking new landscapes but in having
-| new eyes.
-| --- *Marcel Proust*
-
-| Discovery is seeing what everyone else has seen and thinking what no
-| one else has thought.
-| --- *Albert Szent-Gyorgi*
-
-
-Iterating over elements in the array
-====================================
-
-.. _`sec:array_iterator`:
-
-Basic Iteration
----------------
-
-One common algorithmic requirement is to be able to walk over all
-elements in a multidimensional array. The array iterator object makes
-this easy to do in a generic way that works for arrays of any
-dimension. Naturally, if you know the number of dimensions you will be
-using, then you can always write nested for loops to accomplish the
-iteration. If, however, you want to write code that works with any
-number of dimensions, then you can make use of the array iterator. An
-array iterator object is returned when accessing the .flat attribute
-of an array.
-
-.. index::
-   single: array iterator
-
-Basic usage is to call :cfunc:`PyArray_IterNew` ( ``array`` ) where array
-is an ndarray object (or one of its sub-classes). The returned object
-is an array-iterator object (the same object returned by the .flat
-attribute of the ndarray). This object is usually cast to
-PyArrayIterObject* so that its members can be accessed. The only
-members that are needed are ``iter->size`` which contains the total
-size of the array, ``iter->index``, which contains the current 1-d
-index into the array, and ``iter->dataptr`` which is a pointer to the
-data for the current element of the array. Sometimes it is also
-useful to access ``iter->ao`` which is a pointer to the underlying
-ndarray object.
-
-After processing data at the current element of the array, the next
-element of the array can be obtained using the macro
-:cfunc:`PyArray_ITER_NEXT` ( ``iter`` ). The iteration always proceeds in a
-C-style contiguous fashion (last index varying the fastest). The
-:cfunc:`PyArray_ITER_GOTO` ( ``iter``, ``destination`` ) can be used to
-jump to a particular point in the array, where ``destination`` is an
-array of npy_intp data-type with space to handle at least the number
-of dimensions in the underlying array. Occasionally it is useful to
-use :cfunc:`PyArray_ITER_GOTO1D` ( ``iter``, ``index`` ) which will jump
-to the 1-d index given by the value of ``index``. The most common
-usage, however, is given in the following example.
-
-.. code-block:: c
-
-    PyObject *obj; /* assumed to be some ndarray object */
-    PyArrayIterObject *iter;
-    ...
-    iter = (PyArrayIterObject *)PyArray_IterNew(obj);
-    if (iter == NULL) goto fail;   /* Assume fail has clean-up code */
-    while (iter->index < iter->size) {
-        /* do something with the data at iter->dataptr */
-        PyArray_ITER_NEXT(iter);
-    }
-    ...
-
-You can also use :cfunc:`PyArrayIter_Check` ( ``obj`` ) to ensure you have
-an iterator object and :cfunc:`PyArray_ITER_RESET` ( ``iter`` ) to reset an
-iterator object back to the beginning of the array.
-
-It should be emphasized at this point that you may not need the array
-iterator if your array is already contiguous (using an array iterator
-will work but will be slower than the fastest code you could write).
-The major purpose of array iterators is to encapsulate iteration over
-N-dimensional arrays with arbitrary strides. They are used in many,
-many places in the NumPy source code itself. If you already know your
-array is contiguous (Fortran or C), then simply adding the element-size
-to a running pointer variable will step you through the array very
-efficiently. In other words, code like this will probably be faster
-for you in the contiguous case (assuming doubles):
-
-.. code-block:: c
-
-    npy_intp size;
-    double *dptr;  /* could make this any variable type */
-    size = PyArray_SIZE(obj);
-    dptr = PyArray_DATA(obj);
-    while (size--) {
-        /* do something with the data at dptr */
-        dptr++;
-    }
-
-
-Iterating over all but one axis
--------------------------------
-
-A common algorithm is to loop over all elements of an array and
-perform some function with each element by issuing a function call. As
-function calls can be time consuming, one way to speed up this kind of
-algorithm is to write the function so that it takes a vector of data
-and then write the iteration so that the function call is performed
-for an entire dimension of data at a time. This increases the amount
-of work done per function call, thereby reducing the function-call
-overhead to a small(er) fraction of the total time.
-Even if the interior of the loop is performed without a function call
-it can be advantageous to perform the inner loop over the dimension
-with the highest number of elements to take advantage of speed
-enhancements available on micro-processors that use pipelining to
-enhance fundamental operations.
-
-The :cfunc:`PyArray_IterAllButAxis` ( ``array``, ``&dim`` ) constructs an
-iterator object that is modified so that it will not iterate over the
-dimension indicated by dim. The only restriction on this iterator
-object is that the :cfunc:`PyArray_ITER_GOTO1D` ( ``it``, ``ind`` ) macro
-cannot be used (thus flat indexing won't work either if you pass this
-object back to Python --- so you shouldn't do this). Note that the
-returned object from this routine is still usually cast to
-PyArrayIterObject \*. All that's been done is to modify the strides
-and dimensions of the returned iterator to simulate iterating over
-array[...,0,...] where 0 is placed on the
-:math:`\textrm{dim}^{\textrm{th}}` dimension. If dim is negative, then
-the dimension with the largest axis is found and used.
-
-
-Iterating over multiple arrays
-------------------------------
-
-Very often, it is desirable to iterate over several arrays at the
-same time. The universal functions are an example of this kind of
-behavior. If all you want to do is iterate over arrays with the same
-shape, then simply creating several iterator objects is the standard
-procedure. For example, the following code iterates over two arrays
-assumed to be the same shape and size (actually obj1 just has to have
-at least as many total elements as does obj2):
-
-.. code-block:: c
-
-    /* It is already assumed that obj1 and obj2
-       are ndarrays of the same shape and size.
-    */
-    iter1 = (PyArrayIterObject *)PyArray_IterNew(obj1);
-    if (iter1 == NULL) goto fail;
-    iter2 = (PyArrayIterObject *)PyArray_IterNew(obj2);
-    if (iter2 == NULL) goto fail;  /* assume iter1 is DECREF'd at fail */
-    while (iter2->index < iter2->size) {
-        /* process with iter1->dataptr and iter2->dataptr */
-        PyArray_ITER_NEXT(iter1);
-        PyArray_ITER_NEXT(iter2);
-    }
-
-
-Broadcasting over multiple arrays
----------------------------------
-
-.. index::
-   single: broadcasting
-
-When multiple arrays are involved in an operation, you may want to use the same
-broadcasting rules that the math operations ( *i.e.* the ufuncs) use. This can
-be done easily using the :ctype:`PyArrayMultiIterObject`. This is the object
-returned from the Python command numpy.broadcast and it is almost as easy to
-use from C. The function :cfunc:`PyArray_MultiIterNew` ( ``n``, ``...`` ) is
-used (with ``n`` input objects in place of ``...`` ). The input objects can be
-arrays or anything that can be converted into an array. A pointer to a
-PyArrayMultiIterObject is returned. Broadcasting has already been accomplished,
-which adjusts the iterators so that all that needs to be done to advance to the
-next element in each array is for PyArray_ITER_NEXT to be called for each of
-the inputs. This incrementing is automatically performed by the
-:cfunc:`PyArray_MultiIter_NEXT` ( ``obj`` ) macro (which can handle a
-multiterator ``obj`` as either a :ctype:`PyArrayMultiIterObject *` or a
-:ctype:`PyObject *`). The data from input number ``i`` is available using
-:cfunc:`PyArray_MultiIter_DATA` ( ``obj``, ``i`` ) and the total (broadcasted)
-size as :cfunc:`PyArray_MultiIter_SIZE` ( ``obj`` ). An example of using this
-feature follows.
-
-.. code-block:: c
-
-    mobj = PyArray_MultiIterNew(2, obj1, obj2);
-    size = PyArray_MultiIter_SIZE(mobj);
-    while (size--) {
-        ptr1 = PyArray_MultiIter_DATA(mobj, 0);
-        ptr2 = PyArray_MultiIter_DATA(mobj, 1);
-        /* code using contents of ptr1 and ptr2 */
-        PyArray_MultiIter_NEXT(mobj);
-    }
-
-The function :cfunc:`PyArray_RemoveSmallest` ( ``multi`` ) can be used to
-take a multi-iterator object and adjust all the iterators so that
-iteration does not take place over the largest dimension (it makes
-that dimension of size 1). The code being looped over that makes use
-of the pointers will very likely also need the strides data for each
-of the iterators. This information is stored in
-multi->iters[i]->strides.
-
-.. index::
-   single: array iterator
-
-There are several examples of using the multi-iterator in the NumPy
-source code as it makes N-dimensional broadcasting-code very simple to
-write. Browse the source for more examples.
-
-.. _`sec:Creating-a-new`:
-
-Creating a new universal function
-=================================
-
-.. index::
-   pair: ufunc; adding new
-
-The umath module is a computer-generated C-module that creates many
-ufuncs. It provides a great many examples of how to create a universal
-function. Creating your own ufunc that will make use of the ufunc
-machinery is not difficult either. Suppose you have a function that
-you want to operate element-by-element over its inputs. By creating a
-new ufunc you will obtain a function that handles
-
-- broadcasting
-
-- N-dimensional looping
-
-- automatic type-conversions with minimal memory usage
-
-- optional output arrays
-
-It is not difficult to create your own ufunc. All that is required is
-a 1-d loop for each data-type you want to support. Each 1-d loop must
-have a specific signature, and only ufuncs for fixed-size data-types
-can be used. The function call used to create a new ufunc to work on
-built-in data-types is given below.
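Before turning to the C-level creation call, the four behaviors listed above can be observed from Python with ``numpy.frompyfunc``, which wraps a scalar Python function in the same ufunc machinery (an illustrative sketch added here, not part of the original file):

```python
import numpy as np

def scalar_add(a, b):
    return a + b

# Wrap the scalar function as a ufunc: 2 inputs, 1 output.
uadd = np.frompyfunc(scalar_add, 2, 1)

# Broadcasting and N-dimensional looping come for free:
# the scalar 10 is broadcast against a 2x3 array.
result = uadd(np.arange(6).reshape(2, 3), 10)
print(result.tolist())  # [[10, 11, 12], [13, 14, 15]]
```

Note that ``frompyfunc`` always produces an object-dtype result; a ufunc registered in C with typed 1-d loops, as described next, avoids that overhead.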
-A different mechanism is used to
-register ufuncs for user-defined data-types.
-
-.. cfunction:: PyObject *PyUFunc_FromFuncAndData( PyUFuncGenericFunction* func, void** data, char* types, int ntypes, int nin, int nout, int identity, char* name, char* doc, int check_return)
-
-   *func*
-
-       A pointer to an array of 1-d functions to use. This array must be at
-       least ntypes long. Each entry in the array must be a
-       ``PyUFuncGenericFunction`` function. This function has the following
-       signature. An example of a valid 1d loop function is also given.
-
-       .. cfunction:: void loop1d(char** args, npy_intp* dimensions, npy_intp* steps, void* data)
-
-          *args*
-
-              An array of pointers to the actual data for the input and output
-              arrays. The input arguments are given first followed by the
-              output arguments.
-
-          *dimensions*
-
-              A pointer to the size of the dimension over which this function
-              is looping.
-
-          *steps*
-
-              A pointer to the number of bytes to jump to get to the
-              next element in this dimension for each of the input and
-              output arguments.
-
-          *data*
-
-              Arbitrary data (extra arguments, function names, *etc.* )
-              that can be stored with the ufunc and will be passed in
-              when it is called.
-
-       .. code-block:: c
-
-          static void
-          double_add(char **args, npy_intp *dimensions, npy_intp *steps, void *extra)
-          {
-              npy_intp i;
-              npy_intp is1 = steps[0], is2 = steps[1];
-              npy_intp os = steps[2], n = dimensions[0];
-              char *i1 = args[0], *i2 = args[1], *op = args[2];
-              for (i = 0; i < n; i++) {
-                  *((double *)op) = *((double *)i1) + *((double *)i2);
-                  i1 += is1; i2 += is2; op += os;
-              }
-          }
-
-   *data*
-
-       An array of data. There should be ntypes entries (or NULL) --- one for
-       every loop function defined for this ufunc. This data will be passed
-       in to the 1-d loop. One common use of this data variable is to pass in
-       an actual function to call to compute the result when a generic 1-d
-       loop (e.g. :cfunc:`PyUFunc_d_d`) is being used.
-
-   *types*
-
-       An array of type-number signatures (type ``char`` ). This
-       array should be of size (nin+nout)*ntypes and contain the
-       data-types for the corresponding 1-d loop. The inputs should
-       be first followed by the outputs. For example, suppose I have
-       a ufunc that supports 1 integer and 1 double 1-d loop
-       (length-2 func and data arrays) that takes 2 inputs and
-       returns 1 output that is always a complex double, then the
-       types array would be
-
-       .. code-block:: c
-
-          static char types[6] = {NPY_INT, NPY_INT, NPY_CDOUBLE,
-                                  NPY_DOUBLE, NPY_DOUBLE, NPY_CDOUBLE};
-
-       The bit-width names can also be used (e.g. :cdata:`NPY_INT32`,
-       :cdata:`NPY_COMPLEX128` ) if desired.
-
-   *ntypes*
-
-       The number of data-types supported. This is equal to the number of 1-d
-       loops provided.
-
-   *nin*
-
-       The number of input arguments.
-
-   *nout*
-
-       The number of output arguments.
-
-   *identity*
-
-       Either :cdata:`PyUFunc_One`, :cdata:`PyUFunc_Zero`, or
-       :cdata:`PyUFunc_None`. This specifies what should be returned when
-       an empty array is passed to the reduce method of the ufunc.
-
-   *name*
-
-       A ``NULL`` -terminated string providing the name of this ufunc
-       (this should be the Python name by which it will be called).
-
-   *doc*
-
-       A documentation string for this ufunc (will be used in generating the
-       response to ``{ufunc_name}.__doc__``). Do not include the function
-       signature or the name as this is generated automatically.
-
-   *check_return*
-
-       Not presently used, but this integer value does get set in the
-       structure-member of similar name.
-
-   .. index::
-      pair: ufunc; adding new
-
-   The returned ufunc object is a callable Python object. It should be
-   placed in a (module) dictionary under the same name as was used in the
-   name argument to the ufunc-creation routine. The following example is
-   adapted from the umath module.
-
-   .. code-block:: c
-
-      static PyUFuncGenericFunction atan2_functions[] =
-          {PyUFunc_ff_f, PyUFunc_dd_d,
-           PyUFunc_gg_g, PyUFunc_OO_O_method};
-      static void* atan2_data[] =
-          {(void *)atan2f, (void *)atan2,
-           (void *)atan2l, (void *)"arctan2"};
-      static char atan2_signatures[] =
-          {NPY_FLOAT, NPY_FLOAT, NPY_FLOAT,
-           NPY_DOUBLE, NPY_DOUBLE, NPY_DOUBLE,
-           NPY_LONGDOUBLE, NPY_LONGDOUBLE, NPY_LONGDOUBLE,
-           NPY_OBJECT, NPY_OBJECT, NPY_OBJECT};
-      ...
-      /* in the module initialization code */
-      PyObject *f, *dict, *module;
-      ...
-      dict = PyModule_GetDict(module);
-      ...
-      f = PyUFunc_FromFuncAndData(atan2_functions,
-          atan2_data, atan2_signatures, 4, 2, 1,
-          PyUFunc_None, "arctan2",
-          "a safe and correct arctan(x1/x2)", 0);
-      PyDict_SetItemString(dict, "arctan2", f);
-      Py_DECREF(f);
-      ...
-
-
-User-defined data-types
-=======================
-
-NumPy comes with 21 builtin data-types. While this covers a large
-majority of possible use cases, it is conceivable that a user may have
-a need for an additional data-type. There is some support for adding
-an additional data-type into the NumPy system. This additional
-data-type will behave much like a regular data-type except ufuncs must
-have 1-d loops registered to handle it separately. Also, checking
-whether or not other data-types can be cast "safely" to and from this
-new type will always return "can cast" unless you also register which
-types your new data-type can be cast to and from. Adding data-types is
-one of the less well-tested areas for NumPy 1.0, so there may be bugs
-remaining in the approach. Only add a new data-type if you can't do
-what you want to do using the OBJECT or VOID data-types that are
-already available. An example of what I consider a useful application
-of the ability to add data-types is the possibility of adding a
-data-type of arbitrary precision floats to NumPy.
-
-.. index::
-   pair: dtype; adding new
-
-
-Adding the new data-type
-------------------------
-
-To begin to make use of the new data-type, you need to first define a
-new Python type to hold the scalars of your new data-type. It should
-be acceptable to inherit from one of the array scalars if your new
-type has a binary compatible layout. This will allow your new data
-type to have the methods and attributes of array scalars. New
-data-types must have a fixed memory size (if you want to define a
-data-type that needs a flexible representation, like a
-variable-precision number, then use a pointer to the object as the
-data-type). The memory layout of the object structure for the new
-Python type must be PyObject_HEAD followed by the fixed-size memory
-needed for the data-type. For example, a suitable structure for the
-new Python type is:
-
-.. code-block:: c
-
-    typedef struct {
-        PyObject_HEAD;
-        some_data_type obval;
-        /* the name can be whatever you want */
-    } PySomeDataTypeObject;
-
-After you have defined a new Python type object, you must then define
-a new :ctype:`PyArray_Descr` structure whose typeobject member will contain a
-pointer to the data-type you've just defined. In addition, the
-required functions in the ".f" member must be defined: nonzero,
-copyswap, copyswapn, setitem, getitem, and cast. The more functions in
-the ".f" member you define, however, the more useful the new data-type
-will be. It is very important to initialize unused functions to NULL.
-This can be achieved using :cfunc:`PyArray_InitArrFuncs` (f).
-
-Once a new :ctype:`PyArray_Descr` structure is created and filled with the
-needed information and useful functions you call
-:cfunc:`PyArray_RegisterDataType` (new_descr). The return value from this
-call is an integer providing you with a unique type_number that
-specifies your data-type.
-This type number should be stored and made available by your module so
-that other modules can use it to recognize your data-type (the other
-mechanism for finding a user-defined data-type number is to search
-based on the name of the type-object associated with the data-type
-using :cfunc:`PyArray_TypeNumFromName` ).
-
-
-Registering a casting function
-------------------------------
-
-You may want to allow builtin (and other user-defined) data-types to
-be cast automatically to your data-type. In order to make this
-possible, you must register a casting function with the data-type you
-want to be able to cast from. This requires writing low-level casting
-functions for each conversion you want to support and then registering
-these functions with the data-type descriptor. A low-level casting
-function has the signature:
-
-.. cfunction:: void castfunc( void* from, void* to, npy_intp n, void* fromarr, void* toarr)
-
-   Cast ``n`` elements ``from`` one type ``to`` another. The data to
-   cast from is in a contiguous, correctly-swapped and aligned chunk
-   of memory pointed to by from. The buffer to cast to is also
-   contiguous, correctly-swapped and aligned. The fromarr and toarr
-   arguments should only be used for flexible-element-sized arrays
-   (string, unicode, void).
-
-An example castfunc is:
-
-.. code-block:: c
-
-    static void
-    double_to_float(double *from, float *to, npy_intp n,
-                    void *ig1, void *ig2)
-    {
-        while (n--) {
-            (*to++) = (float) *(from++);
-        }
-    }
-
-This could then be registered to convert doubles to floats using the
-code:
-
-.. code-block:: c
-
-    doub = PyArray_DescrFromType(NPY_DOUBLE);
-    PyArray_RegisterCastFunc(doub, NPY_FLOAT,
-        (PyArray_VectorUnaryFunc *)double_to_float);
-    Py_DECREF(doub);
-
-
-Registering coercion rules
---------------------------
-
-By default, all user-defined data-types are not presumed to be safely
-castable to any builtin data-types.
-In addition, builtin data-types are
-not presumed to be safely castable to user-defined data-types. This
-situation limits the ability of user-defined data-types to participate
-in the coercion system used by ufuncs and other times when automatic
-coercion takes place in NumPy. This can be changed by registering
-data-types as safely castable from a particular data-type object. The
-function :cfunc:`PyArray_RegisterCanCast` (from_descr, totype_number,
-scalarkind) should be used to specify that the data-type object
-from_descr can be cast to the data-type with type number
-totype_number. If you are not trying to alter scalar coercion rules,
-then use :cdata:`PyArray_NOSCALAR` for the scalarkind argument.
-
-If you want to allow your new data-type to also be able to share in
-the scalar coercion rules, then you need to specify the scalarkind
-function in the data-type object's ".f" member to return the kind of
-scalar the new data-type should be seen as (the value of the scalar is
-available to that function). Then, you can register data-types that
-can be cast to separately for each scalar kind that may be returned
-from your user-defined data-type. If you don't register scalar
-coercion handling, then all of your user-defined data-types will be
-seen as :cdata:`PyArray_NOSCALAR`.
-
-
-Registering a ufunc loop
-------------------------
-
-You may also want to register low-level ufunc loops for your data-type
-so that an ndarray of your data-type can have math applied to it
-seamlessly. Registering a new loop with exactly the same arg_types
-signature silently replaces any previously registered loops for that
-data-type.
-
-Before you can register a 1-d loop for a ufunc, the ufunc must be
-previously created. Then you call :cfunc:`PyUFunc_RegisterLoopForType`
-(...) with the information needed for the loop. The return value of
-this function is ``0`` if the process was successful and ``-1`` with
-an error condition set if it was not successful.
-
-.. cfunction:: int PyUFunc_RegisterLoopForType( PyUFuncObject* ufunc, int usertype, PyUFuncGenericFunction function, int* arg_types, void* data)
-
-   *ufunc*
-
-       The ufunc to attach this loop to.
-
-   *usertype*
-
-       The user-defined type this loop should be indexed under. This number
-       must be a user-defined type or an error occurs.
-
-   *function*
-
-       The ufunc inner 1-d loop. This function must have the signature as
-       explained in Section `3 <#sec-creating-a-new>`__ .
-
-   *arg_types*
-
-       (optional) If given, this should contain an array of integers of at
-       least size ufunc.nargs containing the data-types expected by the loop
-       function. The data will be copied into a NumPy-managed structure so
-       the memory for this argument should be deleted after calling this
-       function. If this is NULL, then it will be assumed that all data-types
-       are of type usertype.
-
-   *data*
-
-       (optional) Specify any optional data needed by the function which will
-       be passed when the function is called.
-
-   .. index::
-      pair: dtype; adding new
-
-
-Subtyping the ndarray in C
-==========================
-
-One of the lesser-used features that has been lurking in Python since
-2.2 is the ability to sub-class types in C. This facility is one of
-the important reasons for basing NumPy off of the Numeric code-base
-which was already in C. A sub-type in C allows much more flexibility
-with regards to memory management. Sub-typing in C is not difficult
-even if you have only a rudimentary understanding of how to create new
-types for Python. While it is easiest to sub-type from a single parent
-type, sub-typing from multiple parent types is also possible. Multiple
-inheritance in C is generally less useful than it is in Python because
-a restriction on Python sub-types is that they have a binary
-compatible memory layout. Perhaps for this reason, it is somewhat
-easier to sub-type from a single parent type.
-
-.. index::
-   pair: ndarray; subtyping
-
-All C-structures corresponding to Python objects must begin with
-:cmacro:`PyObject_HEAD` (or :cmacro:`PyObject_VAR_HEAD`). In the same
-way, any sub-type must have a C-structure that begins with exactly the
-same memory layout as the parent type (or all of the parent types in
-the case of multiple-inheritance). The reason for this is that Python
-may attempt to access a member of the sub-type structure as if it had
-the parent structure ( *i.e.* it will cast a given pointer to a
-pointer to the parent structure and then dereference one of its
-members). If the memory layouts are not compatible, then this attempt
-will cause unpredictable behavior (eventually leading to a memory
-violation and program crash).
-
-One of the elements in :cmacro:`PyObject_HEAD` is a pointer to a
-type-object structure. A new Python type is created by creating a new
-type-object structure and populating it with functions and pointers to
-describe the desired behavior of the type. Typically, a new
-C-structure is also created to contain the instance-specific
-information needed for each object of the type as well. For example,
-:cdata:`&PyArray_Type` is a pointer to the type-object table for the ndarray
-while a :ctype:`PyArrayObject *` variable is a pointer to a particular instance
-of an ndarray (one of the members of the ndarray structure is, in
-turn, a pointer to the type-object table :cdata:`&PyArray_Type`). Finally,
-:cfunc:`PyType_Ready` (<pointer_to_type_object>) must be called for
-every new Python type.
-
-
-Creating sub-types
-------------------
-
-To create a sub-type, a similar procedure must be followed except
-only behaviors that are different require new entries in the
-type-object structure. All other entries can be NULL and will be
-filled in by :cfunc:`PyType_Ready` with appropriate functions from the
-parent type(s). In particular, to create a sub-type in C follow these
-steps:
-
-1. If needed, create a new C-structure to handle each instance of your
-   type. A typical C-structure would be:
-
-   .. code-block:: c
-
-      typedef struct {
-          PyArrayObject base;
-          /* new things here */
-      } NewArrayObject;
-
-   Notice that the full PyArrayObject is used as the first entry in order
-   to ensure that the binary layout of instances of the new type is
-   identical to the PyArrayObject.
-
-2. Fill in a new Python type-object structure with pointers to new
-   functions that will over-ride the default behavior while leaving any
-   function that should remain the same unfilled (or NULL). The tp_name
-   element should be different.
-
-3. Fill in the tp_base member of the new type-object structure with a
-   pointer to the (main) parent type object. For multiple-inheritance,
-   also fill in the tp_bases member with a tuple containing all of the
-   parent objects in the order they should be used to define inheritance.
-   Remember, all parent-types must have the same C-structure for multiple
-   inheritance to work properly.
-
-4. Call :cfunc:`PyType_Ready` (<pointer_to_new_type>). If this function
-   returns a negative number, a failure occurred and the type is not
-   initialized. Otherwise, the type is ready to be used. It is
-   generally important to place a reference to the new type into the
-   module dictionary so it can be accessed from Python.
-
-More information on creating sub-types in C can be learned by reading
-PEP 253 (available at http://www.python.org/dev/peps/pep-0253).
-
-
-Specific features of ndarray sub-typing
----------------------------------------
-
-Some special methods and attributes are used by arrays in order to
-facilitate the interoperation of sub-types with the base ndarray type.
-
-.. note:: XXX: some of the documentation below needs to be moved to the
-   reference guide.
-
-
-The __array_finalize\__ method
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. attribute:: ndarray.__array_finalize__
-
-   Several array-creation functions of the ndarray allow
-   specification of a particular sub-type to be created. This allows
-   sub-types to be handled seamlessly in many routines. When a
-   sub-type is created in such a fashion, however, neither the
-   __new\__ method nor the __init\__ method gets called. Instead, the
-   sub-type is allocated and the appropriate instance-structure
-   members are filled in. Finally, the :obj:`__array_finalize__`
-   attribute is looked-up in the object dictionary. If it is present
-   and not None, then it can be either a CObject containing a pointer
-   to a :cfunc:`PyArray_FinalizeFunc` or it can be a method taking a
-   single argument (which could be None).
-
-   If the :obj:`__array_finalize__` attribute is a CObject, then the pointer
-   must be a pointer to a function with the signature:
-
-   .. code-block:: c
-
-      (int) (PyArrayObject *, PyObject *)
-
-   The first argument is the newly created sub-type. The second argument
-   (if not NULL) is the "parent" array (if the array was created using
-   slicing or some other operation where a clearly-distinguishable parent
-   is present). This routine can do anything it wants to. It should
-   return a -1 on error and 0 otherwise.
-
-   If the :obj:`__array_finalize__` attribute is neither None nor a CObject,
-   then it must be a Python method that takes the parent array as an
-   argument (which could be None if there is no parent), and returns
-   nothing. Errors in this method will be caught and handled.
-
-
-The __array_priority\__ attribute
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. attribute:: ndarray.__array_priority__
-
-   This attribute allows simple but flexible determination of which
-   sub-type should be considered "primary" when an operation involving
-   two or more sub-types arises. In operations where different
-   sub-types are being used, the sub-type with the largest
-   :obj:`__array_priority__` attribute will determine the sub-type of
-   the output(s).
If two sub-
-   types have the same :obj:`__array_priority__` then the sub-type of the
-   first argument determines the output. The default
-   :obj:`__array_priority__` attribute returns a value of 0.0 for the base
-   ndarray type and 1.0 for a sub-type. This attribute can also be
-   defined by objects that are not sub-types of the ndarray and can be
-   used to determine which :obj:`__array_wrap__` method should be called for
-   the return output.
-
-The __array_wrap\__ method
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. attribute:: ndarray.__array_wrap__
-
-   Any class or type can define this method, which should take an ndarray
-   argument and return an instance of the type. It can be seen as the
-   opposite of the :obj:`__array__` method. This method is used by the
-   ufuncs (and other NumPy functions) to allow other objects to pass
-   through. For Python >2.4, it can also be used to write a decorator
-   that converts a function that works only with ndarrays to one that
-   works with any type with :obj:`__array__` and :obj:`__array_wrap__` methods.
-
-.. index::
-   pair: ndarray; subtyping
diff --git a/trunk/source/user/c-info.how-to-extend.rst b/trunk/source/user/c-info.how-to-extend.rst
deleted file mode 100644
index 56f3c99f1..000000000
--- a/trunk/source/user/c-info.how-to-extend.rst
+++ /dev/null
@@ -1,641 +0,0 @@
-*******************
-How to extend NumPy
-*******************
-
-| That which is static and repetitive is boring. That which is dynamic
-| and random is confusing. In between lies art.
-| --- *John A. Locke*
-
-| Science is a differential equation. Religion is a boundary condition.
-| --- *Alan Turing*
-
-
-.. _`sec:Writing-an-extension`:
-
-Writing an extension module
-===========================
-
-While the ndarray object is designed to allow rapid computation in
-Python, it is also designed to be general-purpose and satisfy a wide
-variety of computational needs. 
As a result, if absolute speed is
-essential, there is no replacement for a well-crafted, compiled loop
-specific to your application and hardware. This is one of the reasons
-that numpy includes f2py so that easy-to-use mechanisms for linking
-(simple) C/C++ and (arbitrary) Fortran code directly into Python are
-available. You are encouraged to use and improve this mechanism. The
-purpose of this section is not to document this tool but to document
-the more basic steps to writing an extension module that this tool
-depends on.
-
-.. index::
-   single: extension module
-
-When an extension module is written, compiled, and installed to
-somewhere in the Python path (sys.path), the code can then be imported
-into Python as if it were a standard Python file. It will contain
-objects and methods that have been defined and compiled in C code. The
-basic steps for doing this in Python are well-documented and you can
-find more information in the documentation for Python itself available
-online at `www.python.org <http://www.python.org>`_ .
-
-In addition to the Python C-API, there is a full and rich C-API for
-NumPy allowing sophisticated manipulations on a C-level. However, for
-most applications, only a few API calls will typically be used. If all
-you need to do is extract a pointer to memory along with some shape
-information to pass to another calculation routine, then you will use
-very different calls than if you are trying to create a new array-
-like type or add a new data type for ndarrays. This chapter documents
-the API calls and macros that are most commonly used.
-
-
-Required subroutine
-===================
-
-There is exactly one function that must be defined in your C-code in
-order for Python to use it as an extension module. The function must
-be called init{name} where {name} is the name of the module from
-Python. This function must be declared so that it is visible to code
-outside of the routine. 
Besides adding the methods and constants you
-desire, this subroutine must also contain calls to import_array()
-and/or import_ufunc() depending on which C-API is needed. Forgetting
-to place these commands will show itself as an ugly segmentation fault
-(crash) as soon as any C-API subroutine is actually called. It is
-actually possible to have multiple init{name} functions in a single
-file in which case multiple modules will be defined by that file.
-However, there are some tricks to get that to work correctly and it is
-not covered here.
-
-A minimal ``init{name}`` method looks like:
-
-.. code-block:: c
-
-   PyMODINIT_FUNC
-   init{name}(void)
-   {
-      (void)Py_InitModule("{name}", mymethods);
-      import_array();
-   }
-
-The mymethods must be an array (usually statically declared) of
-PyMethodDef structures which contain method names, actual C-functions,
-a variable indicating whether the method uses keyword arguments or
-not, and docstrings. These are explained in the next section. If you
-want to add constants to the module, then you store the returned value
-from Py_InitModule which is a module object. The most general way to
-add items to the module is to get the module dictionary using
-PyModule_GetDict(module). With the module dictionary, you can add
-whatever you like to the module manually. An easier way to add objects
-to the module is to use one of three additional Python C-API calls
-that do not require a separate extraction of the module dictionary.
-These are documented in the Python documentation, but repeated here
-for convenience:
-
-.. cfunction:: int PyModule_AddObject(PyObject* module, char* name, PyObject* value)
-
-.. cfunction:: int PyModule_AddIntConstant(PyObject* module, char* name, long value)
-
-.. cfunction:: int PyModule_AddStringConstant(PyObject* module, char* name, char* value)
-
-   All three of these functions require the *module* object (the
-   return value of Py_InitModule). The *name* is a string that
-   labels the value in the module. 
Depending on which function is
-   called, the *value* argument is either a general object
-   (:cfunc:`PyModule_AddObject` steals a reference to it), an integer
-   constant, or a string constant.
-
-
-Defining functions
-==================
-
-The second argument passed in to the Py_InitModule function is a
-structure that makes it easy to define functions in the module. In
-the example given above, the mymethods structure would have been
-defined earlier in the file (usually right before the init{name}
-subroutine) to:
-
-.. code-block:: c
-
-   static PyMethodDef mymethods[] = {
-      {"nokeywordfunc", nokeyword_cfunc,
-       METH_VARARGS,
-       "Doc string"},
-      {"keywordfunc", keyword_cfunc,
-       METH_VARARGS|METH_KEYWORDS,
-       "Doc string"},
-      {NULL, NULL, 0, NULL}   /* Sentinel */
-   };
-
-Each entry in the mymethods array is a :ctype:`PyMethodDef` structure
-containing 1) the Python name, 2) the C-function that implements the
-function, 3) flags indicating whether or not keywords are accepted for
-this function, and 4) the docstring for the function. Any number of
-functions may be defined for a single module by adding more entries to
-this table. The last entry must be all NULL as shown to act as a
-sentinel. Python looks for this entry to know that all of the
-functions for the module have been defined.
-
-The last thing that must be done to finish the extension module is to
-actually write the code that performs the desired functions. There are
-two kinds of functions: those that don't accept keyword arguments, and
-those that do.
-
-
-Functions without keyword arguments
------------------------------------
-
-Functions that don't accept keyword arguments should be written as:
-
-.. code-block:: c
-
-   static PyObject*
-   nokeyword_cfunc (PyObject *dummy, PyObject *args)
-   {
-      /* convert Python arguments */
-      /* do function */
-      /* return something */
-   }
-
-The dummy argument is not used in this context and can be safely
-ignored. 
The *args* argument contains all of the arguments passed in -to the function as a tuple. You can do anything you want at this -point, but usually the easiest way to manage the input arguments is to -call :cfunc:`PyArg_ParseTuple` (args, format_string, -addresses_to_C_variables...) or :cfunc:`PyArg_UnpackTuple` (tuple, "name" , -min, max, ...). A good description of how to use the first function is -contained in the Python C-API reference manual under section 5.5 -(Parsing arguments and building values). You should pay particular -attention to the "O&" format which uses converter functions to go -between the Python object and the C object. All of the other format -functions can be (mostly) thought of as special cases of this general -rule. There are several converter functions defined in the NumPy C-API -that may be of use. In particular, the :cfunc:`PyArray_DescrConverter` -function is very useful to support arbitrary data-type specification. -This function transforms any valid data-type Python object into a -:ctype:`PyArray_Descr *` object. Remember to pass in the address of the -C-variables that should be filled in. - -There are lots of examples of how to use :cfunc:`PyArg_ParseTuple` -throughout the NumPy source code. The standard usage is like this: - -.. code-block:: c - - PyObject *input; - PyArray_Descr *dtype; - if (!PyArg_ParseTuple(args, "OO&", &input, - PyArray_DescrConverter, - &dtype)) return NULL; - -It is important to keep in mind that you get a *borrowed* reference to -the object when using the "O" format string. However, the converter -functions usually require some form of memory handling. In this -example, if the conversion is successful, *dtype* will hold a new -reference to a :ctype:`PyArray_Descr *` object, while *input* will hold a -borrowed reference. 
Therefore, if this conversion were mixed with
-another conversion (say to an integer) and the data-type conversion
-was successful but the integer conversion failed, then you would need
-to release the reference count to the data-type object before
-returning. A typical way to do this is to set *dtype* to ``NULL``
-before calling :cfunc:`PyArg_ParseTuple` and then use :cfunc:`Py_XDECREF`
-on *dtype* before returning.
-
-After the input arguments are processed, the code that actually does
-the work is written (likely calling other functions as needed). The
-final step of the C-function is to return something. If an error is
-encountered then ``NULL`` should be returned (making sure an error has
-actually been set). If nothing should be returned then increment
-:cdata:`Py_None` and return it. If a single object should be returned then
-it is returned (ensuring that you own a reference to it first). If
-multiple objects should be returned then you need to return a tuple.
-The :cfunc:`Py_BuildValue` (format_string, c_variables...) function makes
-it easy to build tuples of Python objects from C variables. Pay
-special attention to the difference between 'N' and 'O' in the format
-string or you can easily create memory leaks. The 'O' format string
-increments the reference count of the :ctype:`PyObject *` C-variable it
-corresponds to, while the 'N' format string steals a reference to the
-corresponding :ctype:`PyObject *` C-variable. You should use 'N' if you have
-already created a reference for the object and just want to give that
-reference to the tuple. You should use 'O' if you only have a borrowed
-reference to an object and need to create one to provide for the
-tuple.
-
-
-Functions with keyword arguments
---------------------------------
-
-These functions are very similar to functions without keyword
-arguments. The only difference is that the function signature is:
-
-.. 
code-block:: c
-
-   static PyObject*
-   keyword_cfunc (PyObject *dummy, PyObject *args, PyObject *kwds)
-   {
-   ...
-   }
-
-The kwds argument holds a Python dictionary whose keys are the names
-of the keyword arguments and whose values are the corresponding
-keyword-argument values. This dictionary can be processed however you
-see fit. The easiest way to handle it, however, is to replace the
-:cfunc:`PyArg_ParseTuple` (args, format_string, addresses...) function with
-a call to :cfunc:`PyArg_ParseTupleAndKeywords` (args, kwds, format_string,
-char \*kwlist[], addresses...). The kwlist parameter to this function
-is a ``NULL`` -terminated array of strings providing the expected
-keyword arguments. There should be one string for each entry in the
-format_string. Using this function will raise a TypeError if invalid
-keyword arguments are passed in.
-
-For more help on this function please see section 1.8 (Keyword
-Parameters for Extension Functions) of the Extending and Embedding
-tutorial in the Python documentation.
-
-
-Reference counting
-------------------
-
-The biggest difficulty when writing extension modules is reference
-counting. It is an important reason for the popularity of f2py, weave,
-pyrex, ctypes, etc.... If you mis-handle reference counts you can get
-problems from memory-leaks to segmentation faults. The only strategy I
-know of to handle reference counts correctly is blood, sweat, and
-tears. First, you force it into your head that every Python variable
-has a reference count. Then, you understand exactly what each function
-does to the reference count of your objects, so that you can properly
-use DECREF and INCREF when you need them. Reference counting can
-really test the amount of patience and diligence you have towards your
-programming craft. 
Despite the grim depiction, most cases of reference
-counting are quite straightforward with the most common difficulty
-being not using DECREF on objects before exiting early from a routine
-due to some error. In second place is the common error of not owning
-the reference on an object that is passed to a function or macro that
-is going to steal the reference ( *e.g.* :cfunc:`PyTuple_SET_ITEM`, and
-most functions that take :ctype:`PyArray_Descr` objects).
-
-.. index::
-   single: reference counting
-
-Typically you get a new reference to a variable when it is created or
-is the return value of some function (there are some prominent
-exceptions, however --- such as getting an item out of a tuple or a
-dictionary). When you own the reference, you are responsible to make
-sure that :cfunc:`Py_DECREF` (var) is called when the variable is no
-longer necessary (and no other function has "stolen" its
-reference). Also, if you are passing a Python object to a function
-that will "steal" the reference, then you need to make sure you own it
-(or use :cfunc:`Py_INCREF` to get your own reference). You will also
-encounter the notion of borrowing a reference. A function that borrows
-a reference does not alter the reference count of the object and does
-not expect to "hold on" to the reference. It's just going to use the
-object temporarily. When you use :cfunc:`PyArg_ParseTuple` or
-:cfunc:`PyArg_UnpackTuple` you receive a borrowed reference to the
-objects in the tuple and should not alter their reference count inside
-your function. With practice, you can learn to get reference counting
-right, but it can be frustrating at first.
-
-One common source of reference-count errors is the :cfunc:`Py_BuildValue`
-function. Pay careful attention to the difference between the 'N'
-format character and the 'O' format character. 
If you create a new
-object in your subroutine (such as an output array), and you are
-passing it back in a tuple of return values, then you should most
-likely use the 'N' format character in :cfunc:`Py_BuildValue`. The 'O'
-character will increase the reference count by one. This will leave
-the caller with two reference counts for a brand-new array. When the
-variable is deleted and the reference count decremented by one, there
-will still be that extra reference count, and the array will never be
-deallocated. You will have a reference-counting induced memory leak.
-Using the 'N' character will avoid this situation as it will return to
-the caller an object (inside the tuple) with a single reference count.
-
-.. index::
-   single: reference counting
-
-
-
-
-Dealing with array objects
-==========================
-
-Most extension modules for NumPy will need to access the memory for an
-ndarray object (or one of its sub-classes). The easiest way to do
-this doesn't require you to know much about the internals of NumPy.
-The method is to
-
-1. Ensure you are dealing with a well-behaved array (aligned, in machine
-   byte-order and single-segment) of the correct type and number of
-   dimensions.
-
-   1. By converting it from some Python object using
-      :cfunc:`PyArray_FromAny` or a macro built on it.
-
-   2. By constructing a new ndarray of your desired shape and type
-      using :cfunc:`PyArray_NewFromDescr` or a simpler macro or function
-      based on it.
-
-
-2. Get the shape of the array and a pointer to its actual data.
-
-3. Pass the data and shape information on to a subroutine or other
-   section of code that actually performs the computation.
-
-4. If you are writing the algorithm, then I recommend that you use the
-   stride information contained in the array to access the elements of
-   the array (the :cfunc:`PyArray_GETPTR` macros make this painless). 
Then,
-   you can relax your requirements so as not to force a single-segment
-   array and the data-copying that might result.
-
-Each of these sub-topics is covered in the following sub-sections.
-
-
-Converting an arbitrary sequence object
----------------------------------------
-
-The main routine for obtaining an array from any Python object that
-can be converted to an array is :cfunc:`PyArray_FromAny`. This
-function is very flexible with many input arguments. Several macros
-make it easier to use the basic function. :cfunc:`PyArray_FROM_OTF` is
-arguably the most useful of these macros for the most common uses. It
-allows you to convert an arbitrary Python object to an array of a
-specific builtin data-type ( *e.g.* float), while specifying a
-particular set of requirements ( *e.g.* contiguous, aligned, and
-writeable). The syntax is
-
-.. cfunction:: PyObject *PyArray_FROM_OTF(PyObject* obj, int typenum, int requirements)
-
-   Return an ndarray from any Python object, *obj*, that can be
-   converted to an array. The number of dimensions in the returned
-   array is determined by the object. The desired data-type of the
-   returned array is provided in *typenum* which should be one of the
-   enumerated types. The *requirements* for the returned array can be
-   any combination of standard array flags. Each of these arguments
-   is explained in more detail below. You receive a new reference to
-   the array on success. On failure, ``NULL`` is returned and an
-   exception is set.
-
-   *obj*
-
-      The object can be any Python object convertible to an ndarray.
-      If the object is already (a subclass of) the ndarray that
-      satisfies the requirements then a new reference is returned.
-      Otherwise, a new array is constructed. The contents of *obj*
-      are copied to the new array unless the array interface is used
-      so that data does not have to be copied. 
Objects that can be - converted to an array include: 1) any nested sequence object, - 2) any object exposing the array interface, 3) any object with - an :obj:`__array__` method (which should return an ndarray), - and 4) any scalar object (becomes a zero-dimensional - array). Sub-classes of the ndarray that otherwise fit the - requirements will be passed through. If you want to ensure - a base-class ndarray, then use :cdata:`NPY_ENSUREARRAY` in the - requirements flag. A copy is made only if necessary. If you - want to guarantee a copy, then pass in :cdata:`NPY_ENSURECOPY` - to the requirements flag. - - *typenum* - - One of the enumerated types or :cdata:`NPY_NOTYPE` if the data-type - should be determined from the object itself. The C-based names - can be used: - - :cdata:`NPY_BOOL`, :cdata:`NPY_BYTE`, :cdata:`NPY_UBYTE`, - :cdata:`NPY_SHORT`, :cdata:`NPY_USHORT`, :cdata:`NPY_INT`, - :cdata:`NPY_UINT`, :cdata:`NPY_LONG`, :cdata:`NPY_ULONG`, - :cdata:`NPY_LONGLONG`, :cdata:`NPY_ULONGLONG`, :cdata:`NPY_DOUBLE`, - :cdata:`NPY_LONGDOUBLE`, :cdata:`NPY_CFLOAT`, :cdata:`NPY_CDOUBLE`, - :cdata:`NPY_CLONGDOUBLE`, :cdata:`NPY_OBJECT`. - - Alternatively, the bit-width names can be used as supported on the - platform. For example: - - :cdata:`NPY_INT8`, :cdata:`NPY_INT16`, :cdata:`NPY_INT32`, - :cdata:`NPY_INT64`, :cdata:`NPY_UINT8`, - :cdata:`NPY_UINT16`, :cdata:`NPY_UINT32`, - :cdata:`NPY_UINT64`, :cdata:`NPY_FLOAT32`, - :cdata:`NPY_FLOAT64`, :cdata:`NPY_COMPLEX64`, - :cdata:`NPY_COMPLEX128`. - - The object will be converted to the desired type only if it - can be done without losing precision. Otherwise ``NULL`` will - be returned and an error raised. Use :cdata:`NPY_FORCECAST` in the - requirements flag to override this behavior. - - *requirements* - - The memory model for an ndarray admits arbitrary strides in - each dimension to advance to the next element of the array. 
-      Often, however, you need to interface with code that expects a
-      C-contiguous or a Fortran-contiguous memory layout. In
-      addition, an ndarray can be misaligned (the address of an
-      element is not at an integral multiple of the size of the
-      element) which can cause your program to crash (or at least
-      work more slowly) if you try to dereference a pointer into
-      the array data. Both of these problems can be solved by
-      converting the Python object into an array that is more
-      "well-behaved" for your specific usage.
-
-      The requirements flag allows specification of what kind of array
-      is acceptable. If the object passed in does not satisfy these
-      requirements then a copy is made so that the returned object
-      will satisfy them. All of the flags are explained in the detailed
-      API chapter. The flags most commonly needed are :cdata:`NPY_IN_ARRAY`,
-      :cdata:`NPY_OUT_ARRAY`, and :cdata:`NPY_INOUT_ARRAY`:
-
-      .. cvar:: NPY_IN_ARRAY
-
-         Equivalent to :cdata:`NPY_CONTIGUOUS` \|
-         :cdata:`NPY_ALIGNED`. This combination of flags is useful
-         for arrays that must be in C-contiguous order and aligned.
-         These kinds of arrays are usually input arrays for some
-         algorithm.
-
-      .. cvar:: NPY_OUT_ARRAY
-
-         Equivalent to :cdata:`NPY_CONTIGUOUS` \|
-         :cdata:`NPY_ALIGNED` \| :cdata:`NPY_WRITEABLE`. This
-         combination of flags is useful to specify an array that is
-         in C-contiguous order, is aligned, and can be written to
-         as well. Such an array is usually returned as output
-         (although normally such output arrays are created from
-         scratch).
-
-      .. cvar:: NPY_INOUT_ARRAY
-
-         Equivalent to :cdata:`NPY_CONTIGUOUS` \|
-         :cdata:`NPY_ALIGNED` \| :cdata:`NPY_WRITEABLE` \|
-         :cdata:`NPY_UPDATEIFCOPY`. This combination of flags is
-         useful to specify an array that will be used for both
-         input and output. 
If a copy is needed, then when the
-         temporary is deleted (by your use of :cfunc:`Py_DECREF` at
-         the end of the interface routine), the temporary array
-         will be copied back into the original array passed in. Use
-         of the :cdata:`NPY_UPDATEIFCOPY` flag requires that the input
-         object is already an array (because other objects cannot
-         be automatically updated in this fashion). If an error
-         occurs use :cfunc:`PyArray_DECREF_ERR` (obj) on an array
-         with the :cdata:`NPY_UPDATEIFCOPY` flag set. This will
-         delete the array without causing the contents to be copied
-         back into the original array.
-
-
-      Other useful flags that can be OR'd as additional requirements are:
-
-      .. cvar:: NPY_FORCECAST
-
-         Cast to the desired type, even if it can't be done without losing
-         information.
-
-      .. cvar:: NPY_ENSURECOPY
-
-         Make sure the resulting array is a copy of the original.
-
-      .. cvar:: NPY_ENSUREARRAY
-
-         Make sure the resulting object is an actual ndarray and not a sub-
-         class.
-
-.. note::
-
-   Whether or not an array is byte-swapped is determined by the
-   data-type of the array. Native byte-order arrays are always
-   requested by :cfunc:`PyArray_FROM_OTF` and so there is no need for
-   a :cdata:`NPY_NOTSWAPPED` flag in the requirements argument. There
-   is also no way to get a byte-swapped array from this routine.
-
-
-Creating a brand-new ndarray
-----------------------------
-
-Quite often new arrays must be created from within extension-module
-code. Perhaps an output array is needed and you don't want the caller
-to have to supply it. Perhaps only a temporary array is needed to hold
-an intermediate calculation. Whatever the need, there are simple ways
-to get an ndarray object of whatever data-type is needed. The most
-general function for doing this is :cfunc:`PyArray_NewFromDescr`. All array
-creation functions go through this heavily re-used code. Because of
-its flexibility, it can be somewhat confusing to use. 
As a result,
-simpler forms exist that are easier to use.
-
-.. cfunction:: PyObject *PyArray_SimpleNew(int nd, npy_intp* dims, int typenum)
-
-   This function allocates new memory and places it in an ndarray
-   with *nd* dimensions whose shape is determined by the array of
-   at least *nd* items pointed to by *dims*. The memory for the
-   array is uninitialized (unless typenum is :cdata:`PyArray_OBJECT` in
-   which case each element in the array is set to NULL). The
-   *typenum* argument allows specification of any of the builtin
-   data-types such as :cdata:`PyArray_FLOAT` or :cdata:`PyArray_LONG`. The
-   memory for the array can be set to zero if desired using
-   :cfunc:`PyArray_FILLWBYTE` (return_object, 0).
-
-.. cfunction:: PyObject *PyArray_SimpleNewFromData( int nd, npy_intp* dims, int typenum, void* data)
-
-   Sometimes, you want to wrap memory allocated elsewhere into an
-   ndarray object for downstream use. This routine makes it
-   straightforward to do that. The first three arguments are the same
-   as in :cfunc:`PyArray_SimpleNew`; the final argument is a pointer to a
-   block of contiguous memory that the ndarray should use as its
-   data-buffer which will be interpreted in C-style contiguous
-   fashion. A new reference to an ndarray is returned, but the
-   ndarray will not own its data. When this ndarray is deallocated,
-   the pointer will not be freed.
-
-   You should ensure that the provided memory is not freed while the
-   returned array is in existence. The easiest way to handle this is
-   if data comes from another reference-counted Python object. The
-   reference count on this object should be increased after the
-   pointer is passed in, and the base member of the returned ndarray
-   should point to the Python object that owns the data. Then, when
-   the ndarray is deallocated, the base-member will be DECREF'd
-   appropriately. If you want the memory to be freed as soon as the
-   ndarray is deallocated then simply set the OWNDATA flag on the
-   returned ndarray. 
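The base/ownership bookkeeping described above can also be observed from the Python side, which is a convenient way to check that a wrapper behaves as intended. A small illustrative sketch (not part of the original text; it assumes NumPy is importable) using ``numpy.frombuffer`` as a stand-in for an array wrapping externally allocated memory:

```python
import struct

import numpy as np

# A bytearray stands in for externally allocated, reference-counted memory.
buf = bytearray(8 * 3)

# Wrap the existing memory instead of copying it.
a = np.frombuffer(buf, dtype=np.float64)

# The array does not own its memory (OWNDATA is False); the memory
# owner is kept alive through the array's .base attribute instead.
print(a.flags.owndata)     # False
print(a.base is not None)  # True

# Writing through the array mutates the original buffer in place.
a[0] = 1.0
print(struct.unpack("d", bytes(buf[:8]))[0])   # 1.0
```

Here ``frombuffer`` plays the role of ``PyArray_SimpleNewFromData`` plus the base-setting step: the wrapped object remains responsible for the memory's lifetime, and the array merely borrows it.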
-
-
-Getting at ndarray memory and accessing elements of the ndarray
----------------------------------------------------------------
-
-If obj is an ndarray (:ctype:`PyArrayObject *`), then the data-area of the
-ndarray is pointed to by the void* pointer :cfunc:`PyArray_DATA` (obj) or
-the char* pointer :cfunc:`PyArray_BYTES` (obj). Remember that (in general)
-this data-area may not be aligned according to the data-type, it may
-represent byte-swapped data, and/or it may not be writeable. If the
-data area is aligned and in native byte-order, then how to get at a
-specific element of the array is determined only by the array of
-npy_intp variables, :cfunc:`PyArray_STRIDES` (obj). In particular, this
-C-array of integers shows how many **bytes** must be added to the
-current element pointer to get to the next element in each dimension.
-For arrays of four or fewer dimensions there are :cfunc:`PyArray_GETPTR{k}`
-(obj, ...) macros where {k} is the integer 1, 2, 3, or 4 that make
-using the array strides easier. The arguments ... represent {k} non-
-negative integer indices into the array. For example, suppose ``E`` is
-a 3-dimensional ndarray. A (void*) pointer to the element ``E[i,j,k]``
-is obtained as :cfunc:`PyArray_GETPTR3` (E, i, j, k).
-
-As explained previously, C-style contiguous arrays and Fortran-style
-contiguous arrays have particular striding patterns. Two array flags
-(:cdata:`NPY_C_CONTIGUOUS` and :cdata:`NPY_F_CONTIGUOUS`) indicate
-whether or not the striding pattern of a particular array matches the
-C-style contiguous or Fortran-style contiguous or neither. Whether or
-not the striding pattern matches a standard C or Fortran one can be
-tested using :cfunc:`PyArray_ISCONTIGUOUS` (obj) and
-:cfunc:`PyArray_ISFORTRAN` (obj) respectively. Most third-party
-libraries expect contiguous arrays. But, often it is not difficult to
-support general-purpose striding. 
I encourage you to use the striding
-information in your own code whenever possible, and reserve
-single-segment requirements for wrapping third-party code. Using the
-striding information provided with the ndarray rather than requiring a
-contiguous striding reduces copying that otherwise must be made.
-
-
-Example
-=======
-
-.. index::
-   single: extension module
-
-The following example shows how you might write a wrapper that accepts
-two input arguments (that will be converted to an array) and an output
-argument (that must be an array). The function returns None and
-updates the output array.
-
-.. code-block:: c
-
-   static PyObject *
-   example_wrapper(PyObject *dummy, PyObject *args)
-   {
-      PyObject *arg1=NULL, *arg2=NULL, *out=NULL;
-      PyObject *arr1=NULL, *arr2=NULL, *oarr=NULL;
-
-      if (!PyArg_ParseTuple(args, "OOO!", &arg1, &arg2,
-          &PyArray_Type, &out)) return NULL;
-
-      arr1 = PyArray_FROM_OTF(arg1, NPY_DOUBLE, NPY_IN_ARRAY);
-      if (arr1 == NULL) return NULL;
-      arr2 = PyArray_FROM_OTF(arg2, NPY_DOUBLE, NPY_IN_ARRAY);
-      if (arr2 == NULL) goto fail;
-      oarr = PyArray_FROM_OTF(out, NPY_DOUBLE, NPY_INOUT_ARRAY);
-      if (oarr == NULL) goto fail;
-
-      /* code that makes use of arguments */
-      /* You will probably need at least
-         nd = PyArray_NDIM(<..>)    -- number of dimensions
-         dims = PyArray_DIMS(<..>)  -- npy_intp array of length nd
-                                       showing length in each dim.
-         dptr = (double *)PyArray_DATA(<..>) -- pointer to data.
-
-         If an error occurs goto fail. 
-      */
-
-      Py_DECREF(arr1);
-      Py_DECREF(arr2);
-      Py_DECREF(oarr);
-      Py_INCREF(Py_None);
-      return Py_None;
-
-    fail:
-      Py_XDECREF(arr1);
-      Py_XDECREF(arr2);
-      PyArray_XDECREF_ERR(oarr);
-      return NULL;
-   }
diff --git a/trunk/source/user/c-info.python-as-glue.rst b/trunk/source/user/c-info.python-as-glue.rst
deleted file mode 100644
index 0e0c73cd8..000000000
--- a/trunk/source/user/c-info.python-as-glue.rst
+++ /dev/null
@@ -1,1523 +0,0 @@
-********************
-Using Python as glue
-********************
-
-| There is no conversation more boring than the one where everybody
-| agrees.
-| --- *Michel de Montaigne*
-
-| Duct tape is like the force. It has a light side, and a dark side, and
-| it holds the universe together.
-| --- *Carl Zwanzig*
-
-Many people like to say that Python is a fantastic glue language.
-Hopefully, this chapter will convince you that this is true. The first
-adopters of Python for science were typically people who used it to
-glue together large application codes running on super-computers. Not
-only was it much nicer to code in Python than in a shell script or
-Perl, but the ability to easily extend Python also made it
-relatively easy to create new classes and types specifically adapted
-to the problems being solved. From the interactions of these early
-contributors, Numeric emerged as an array-like object that could be
-used to pass data between these applications.
-
-As Numeric has matured and developed into NumPy, people have been able
-to write more code directly in NumPy. Often this code is fast enough
-for production use, but there are still times that there is a need to
-access compiled code, either to get that last bit of efficiency out of
-the algorithm or to make it easier to access widely-available codes
-written in C/C++ or Fortran.
-
-This chapter will review many of the tools that are available for the
-purpose of accessing code written in other compiled languages. 
There
-are many resources available for learning to call other compiled
-libraries from Python, and the purpose of this chapter is not to make
-you an expert. The main goal is to make you aware of some of the
-possibilities so that you will know what to "Google" in order to learn more.
-
-The http://www.scipy.org website also contains a great deal of useful
-information about many of these tools. For example, there is a nice
-description of using several of the tools explained in this chapter at
-http://www.scipy.org/PerformancePython. This link provides several
-ways to solve the same problem, showing how to use and connect with
-compiled code to get the best performance. In the process you can get
-a taste for several of the approaches that will be discussed in this
-chapter.
-
-
-Calling other compiled libraries from Python
-============================================
-
-While Python is a great language and a pleasure to code in, its
-dynamic nature results in overhead that can cause some code ( *i.e.*
-raw computations inside of for loops) to be up to 10-100 times slower
-than equivalent code written in a statically compiled language. In
-addition, it can cause memory usage to be larger than necessary as
-temporary arrays are created and destroyed during computation. For
-many types of computing needs, the extra slow-down and memory
-consumption often cannot be tolerated (at least in time- or
-memory-critical portions of your code). Therefore one of the most common
-needs is to call out from Python code to a fast, machine-code routine
-(e.g. compiled using C/C++ or Fortran). The fact that this is
-relatively easy to do is a big reason why Python is such an excellent
-high-level language for scientific and engineering programming.
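The overhead described above is easy to observe for yourself. The following pure-Python sketch (our own illustration, not part of the original text; the exact speed-up factor is machine-dependent) times a raw Python loop against the same computation carried out inside NumPy's compiled routines:

```python
import timeit

import numpy as np

# The same reduction computed two ways: a raw Python for loop, and a
# call into NumPy's compiled inner loop.  The ratio of the two timings
# illustrates the 10-100x overhead discussed above.
a = np.arange(100000, dtype=np.float64)

def python_sum_of_squares(arr):
    # Interpreted loop: every iteration pays Python bytecode overhead.
    total = 0.0
    for x in arr:
        total += x * x
    return total

def numpy_sum_of_squares(arr):
    # The whole loop runs in compiled C code.
    return np.dot(arr, arr)

t_py = timeit.timeit(lambda: python_sum_of_squares(a), number=3)
t_np = timeit.timeit(lambda: numpy_sum_of_squares(a), number=3)
print("speed-up factor: %.0f" % (t_py / t_np))
```

The two functions compute the same quantity; only the location of the inner loop (interpreter versus machine code) differs.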
-
-There are two basic approaches to calling compiled code: writing an
-extension module that is then imported to Python using the import
-command, or calling a shared-library subroutine directly from Python
-using the ctypes module (included in the standard distribution with
-Python 2.5). The first method is the most common (but with the
-inclusion of ctypes into Python 2.5 this status may change).
-
-.. warning::
-
-   Calling C-code from Python can result in Python crashes if you are not
-   careful. None of the approaches in this chapter are immune. You have
-   to know something about the way data is handled by both NumPy and by
-   the third-party library being used.
-
-
-Hand-generated wrappers
-=======================
-
-Extension modules were discussed in Chapter `1
-<#sec-writing-an-extension>`__ . The most basic way to interface with
-compiled code is to write an extension module and construct a module
-method that calls the compiled code. For improved readability, your
-method should take advantage of the PyArg_ParseTuple call to convert
-between Python objects and C data-types. For standard C data-types
-there is probably already a built-in converter. For others you may
-need to write your own converter and use the "O&" format string which
-allows you to specify a function that will be used to perform the
-conversion from the Python object to whatever C-structures are needed.
-
-Once the conversions to the appropriate C-structures and C data-types
-have been performed, the next step in the wrapper is to call the
-underlying function. This is straightforward if the underlying
-function is in C or C++. However, in order to call Fortran code you
-must be familiar with how Fortran subroutines are called from C/C++
-using your compiler and platform.
This can vary somewhat across platforms and
-compilers (which is another reason f2py makes life much simpler for
-interfacing Fortran code) but generally involves underscore mangling
-of the name and the fact that all variables are passed by reference
-(i.e. all arguments are pointers).
-
-The advantage of the hand-generated wrapper is that you have complete
-control over how the C-library gets used and called, which can lead to
-a lean and tight interface with minimal over-head. The disadvantage is
-that you have to write, debug, and maintain C-code, although most of
-it can be adapted using the time-honored technique of
-"cutting-pasting-and-modifying" from other extension modules. Because
-the procedure of calling out to additional C-code is fairly
-regimented, code-generation procedures have been developed to make
-this process easier. One of these code-generation techniques is
-distributed with NumPy and allows easy integration with Fortran and
-(simple) C code. This package, f2py, will be covered briefly in the
-next section.
-
-
-f2py
-====
-
-F2py allows you to automatically construct an extension module that
-interfaces to routines in Fortran 77/90/95 code. It has the ability to
-parse Fortran 77/90/95 code and automatically generate Python
-signatures for the subroutines it encounters, or you can guide how the
-subroutine interfaces with Python by constructing an
-interface-definition file (or modifying the f2py-produced one).
-
-.. index::
-   single: f2py
-
-Creating source for a basic extension module
---------------------------------------------
-
-Probably the easiest way to introduce f2py is to offer a simple
-example. Here is one of the subroutines contained in a file named
-:file:`add.f`:
-
-..
code-block:: none
-
-   C
-         SUBROUTINE ZADD(A,B,C,N)
-   C
-         DOUBLE COMPLEX A(*)
-         DOUBLE COMPLEX B(*)
-         DOUBLE COMPLEX C(*)
-         INTEGER N
-         DO 20 J = 1, N
-            C(J) = A(J)+B(J)
-   20    CONTINUE
-         END
-
-This routine simply adds the elements in two contiguous arrays and
-places the result in a third. The memory for all three arrays must be
-provided by the calling routine. A very basic interface to this
-routine can be automatically generated by f2py::
-
-   f2py -m add add.f
-
-You should be able to run this command assuming your search-path is
-set-up properly. This command will produce an extension module named
-addmodule.c in the current directory. This extension module can now be
-compiled and used from Python just like any other extension module.
-
-
-Creating a compiled extension module
-------------------------------------
-
-You can also get f2py to compile add.f and also compile its produced
-extension module leaving only a shared-library extension file that can
-be imported from Python::
-
-   f2py -c -m add add.f
-
-This command leaves a file named add.{ext} in the current directory
-(where {ext} is the appropriate extension for a python extension
-module on your platform --- so, pyd, *etc.* ). This module may then be
-imported from Python. It will contain a method for each subroutine in
-add (zadd, cadd, dadd, sadd). The docstring of each method contains
-information about how the module method may be called:
-
-    >>> import add
-    >>> print add.zadd.__doc__
-    zadd - Function signature:
-      zadd(a,b,c,n)
-    Required arguments:
-      a : input rank-1 array('D') with bounds (*)
-      b : input rank-1 array('D') with bounds (*)
-      c : input rank-1 array('D') with bounds (*)
-      n : input int
-
-
-Improving the basic interface
------------------------------
-
-The default interface is a very literal translation of the fortran
-code into Python. The Fortran array arguments must now be NumPy arrays
-and the integer argument should be an integer.
The interface will
-attempt to convert all arguments to their required types (and shapes)
-and issue an error if unsuccessful. However, because it knows nothing
-about the semantics of the arguments (such as that ``c`` is an output
-and ``n`` should really match the array sizes), it is possible to abuse
-this function in ways that can cause Python to crash. For example:
-
-    >>> add.zadd([1,2,3],[1,2],[3,4],1000)
-
-will cause a program crash on most systems. Under the covers, the
-lists are being converted to proper arrays but then the underlying add
-loop is told to cycle way beyond the borders of the allocated memory.
-
-In order to improve the interface, directives should be provided. This
-is accomplished by constructing an interface definition file. It is
-usually best to start from the interface file that f2py can produce
-(where it gets its default behavior from). To get f2py to generate the
-interface file use the -h option::
-
-    f2py -h add.pyf -m add add.f
-
-This command leaves the file add.pyf in the current directory. The
-section of this file corresponding to zadd is:
-
-.. code-block:: none
-
-    subroutine zadd(a,b,c,n) ! in :add:add.f
-       double complex dimension(*) :: a
-       double complex dimension(*) :: b
-       double complex dimension(*) :: c
-       integer :: n
-    end subroutine zadd
-
-By placing intent directives and checking code, the interface can be
-cleaned up quite a bit until the Python module method is both easier
-to use and more robust.
-
-.. code-block:: none
-
-    subroutine zadd(a,b,c,n) ! in :add:add.f
-       double complex dimension(n) :: a
-       double complex dimension(n) :: b
-       double complex intent(out),dimension(n) :: c
-       integer intent(hide),depend(a) :: n=len(a)
-    end subroutine zadd
-
-The intent directive, intent(out) is used to tell f2py that ``c`` is
-an output variable and should be created by the interface before being
-passed to the underlying code.
The intent(hide) directive tells f2py
-to not allow the user to specify the variable, ``n``, but instead to
-get it from the size of ``a``. The depend( ``a`` ) directive is
-necessary to tell f2py that the value of n depends on the input a (so
-that it won't try to create the variable n until the variable a is
-created).
-
-The new interface has the docstring:
-
-    >>> print add.zadd.__doc__
-    zadd - Function signature:
-      c = zadd(a,b)
-    Required arguments:
-      a : input rank-1 array('D') with bounds (n)
-      b : input rank-1 array('D') with bounds (n)
-    Return objects:
-      c : rank-1 array('D') with bounds (n)
-
-Now, the function can be called in a much more robust way:
-
-    >>> add.zadd([1,2,3],[4,5,6])
-    array([ 5.+0.j,  7.+0.j,  9.+0.j])
-
-Notice the automatic conversion to the correct format that occurred.
-
-
-Inserting directives in Fortran source
---------------------------------------
-
-The nice interface can also be generated automatically by placing the
-variable directives as special comments in the original fortran code.
-Thus, if I modify the source code to contain:
-
-.. code-block:: none
-
-   C
-         SUBROUTINE ZADD(A,B,C,N)
-   C
-   CF2PY INTENT(OUT) :: C
-   CF2PY INTENT(HIDE) :: N
-   CF2PY DOUBLE COMPLEX :: A(N)
-   CF2PY DOUBLE COMPLEX :: B(N)
-   CF2PY DOUBLE COMPLEX :: C(N)
-         DOUBLE COMPLEX A(*)
-         DOUBLE COMPLEX B(*)
-         DOUBLE COMPLEX C(*)
-         INTEGER N
-         DO 20 J = 1, N
-            C(J) = A(J) + B(J)
-   20    CONTINUE
-         END
-
-Then, I can compile the extension module using::
-
-   f2py -c -m add add.f
-
-The resulting signature for the function add.zadd is exactly the same
-one that was created previously. If the original source code had
-contained A(N) instead of A(\*) and so forth with B and C, then I
-could obtain (nearly) the same interface simply by placing the
-INTENT(OUT) :: C comment line in the source code. The only difference
-is that N would be an optional input that would default to the length
-of A.
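As a rough sanity check on such a wrapped module, the behavior of the cleaned-up interface can be mirrored in pure NumPy. The sketch below is our own illustration (the name ``zadd_reference`` is hypothetical and not produced by f2py); it follows the improved signature ``c = zadd(a, b)``, with the hidden ``n`` taken from the length of ``a``:

```python
import numpy as np

# Pure-Python reference for the improved zadd interface: inputs are
# converted to rank-1 complex arrays (as the 'D' typecode in the
# docstring indicates), lengths are checked, and c is returned rather
# than passed in.
def zadd_reference(a, b):
    a = np.asarray(a, dtype=np.complex128).ravel()
    b = np.asarray(b, dtype=np.complex128).ravel()
    if a.shape != b.shape:
        # The real wrapper enforces this via the dimension(n) directives.
        raise ValueError("a and b must have the same length")
    return a + b

print(zadd_reference([1, 2, 3], [4, 5, 6]))
```

Comparing a compiled ``add.zadd`` against such a reference on a few inputs is a quick way to confirm the intent directives did what you expected.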
-
-
-A filtering example
--------------------
-
-For comparison with the other methods to be discussed, here is another
-example of a function that filters a two-dimensional array of double
-precision floating-point numbers using a fixed averaging filter. The
-advantage of using Fortran to index into multi-dimensional arrays
-should be clear from this example.
-
-.. code-block:: none
-
-         SUBROUTINE DFILTER2D(A,B,M,N)
-   C
-         DOUBLE PRECISION A(M,N)
-         DOUBLE PRECISION B(M,N)
-         INTEGER N, M
-   CF2PY INTENT(OUT) :: B
-   CF2PY INTENT(HIDE) :: N
-   CF2PY INTENT(HIDE) :: M
-         DO 20 I = 2,M-1
-            DO 40 J=2,N-1
-               B(I,J) = A(I,J) +
-     $         (A(I-1,J)+A(I+1,J) +
-     $          A(I,J-1)+A(I,J+1) )*0.5D0 +
-     $         (A(I-1,J-1) + A(I-1,J+1) +
-     $          A(I+1,J-1) + A(I+1,J+1))*0.25D0
-   40       CONTINUE
-   20    CONTINUE
-         END
-
-This code can be compiled and linked into an extension module named
-filter using::
-
-   f2py -c -m filter filter.f
-
-This will produce an extension module named filter.so in the current
-directory with a method named dfilter2d that returns a filtered
-version of the input.
-
-
-Calling f2py from Python
-------------------------
-
-The f2py program is written in Python and can be run from inside your
-module. This provides a facility that is somewhat similar to the use
-of weave.ext_tools described below. An example of the final interface
-executed using Python code is:
-
-.. code-block:: python
-
-   import numpy.f2py as f2py
-   fid = open('add.f')
-   source = fid.read()
-   fid.close()
-   f2py.compile(source, modulename='add')
-   import add
-
-The source string can be any valid Fortran code. If you want to save
-the extension-module source code then a suitable file-name can be
-provided by the source_fn keyword to the compile function.
-
-
-Automatic extension module generation
--------------------------------------
-
-If you want to distribute your f2py extension module, then you only
-need to include the .pyf file and the Fortran code.
The distutils
-extensions in NumPy allow you to define an extension module entirely
-in terms of this interface file. A valid setup.py file allowing
-distribution of the add.f module (as part of the package f2py_examples
-so that it would be loaded as f2py_examples.add) is:
-
-.. code-block:: python
-
-   def configuration(parent_package='', top_path=None):
-       from numpy.distutils.misc_util import Configuration
-       config = Configuration('f2py_examples',parent_package, top_path)
-       config.add_extension('add', sources=['add.pyf','add.f'])
-       return config
-
-   if __name__ == '__main__':
-       from numpy.distutils.core import setup
-       setup(**configuration(top_path='').todict())
-
-Installation of the new package is easy using::
-
-   python setup.py install
-
-assuming you have the proper permissions to write to the main
-site-packages directory for the version of Python you are using. For the
-resulting package to work, you need to create a file named __init__.py
-(in the same directory as add.pyf). Notice the extension module is
-defined entirely in terms of the "add.pyf" and "add.f" files. The
-conversion of the .pyf file to a .c file is handled by numpy.distutils.
-
-
-Conclusion
-----------
-
-The interface definition file (.pyf) is how you can fine-tune the
-interface between Python and Fortran. There is decent documentation
-for f2py found in the numpy/f2py/docs directory wherever NumPy is
-installed on your system (usually under site-packages). There is also
-more information on using f2py (including how to use it to wrap C
-codes) at http://www.scipy.org/Cookbook under the "Using NumPy with
-Other Languages" heading.
-
-The f2py method of linking compiled code is currently the most
-sophisticated and integrated approach. It allows clean separation of
-Python with compiled code while still allowing for separate
-distribution of the extension module. The only draw-back is that it
-requires the existence of a Fortran compiler in order for a user to
-install the code.
However, with the existence of the free compilers
-g77, gfortran, and g95, as well as high-quality commercial compilers,
-this restriction is not particularly onerous. In my opinion, Fortran
-is still the easiest way to write fast and clear code for scientific
-computing. It handles complex numbers, and multi-dimensional indexing
-in the most straightforward way. Be aware, however, that some Fortran
-compilers will not be able to optimize code as well as good
-hand-written C-code.
-
-.. index::
-   single: f2py
-
-
-weave
-=====
-
-Weave is a scipy package that can be used to automate the process of
-extending Python with C/C++ code. It can be used to speed up
-evaluation of an array expression that would otherwise create
-temporary variables, to directly "inline" C/C++ code into Python, or
-to create a fully-named extension module. You must either install
-scipy or get the weave package separately and install it using the
-standard python setup.py install. You must also have a C/C++ compiler
-installed and usable by Python distutils in order to use weave.
-
-.. index::
-   single: weave
-
-Somewhat dated, but still useful documentation for weave can be found
-at the link http://www.scipy.org/Weave. There are also many examples
-found in the examples directory which is installed under the weave
-directory in the place where weave is installed on your system.
-
-
-Speed up code involving arrays (also see scipy.numexpr)
--------------------------------------------------------
-
-This is the easiest way to use weave and requires minimal changes to
-your Python code. It involves placing quotes around the expression of
-interest and calling weave.blitz. Weave will parse the code and
-generate C++ code using Blitz C++ arrays. It will then compile the
-code and catalog the shared library so that the next time this exact
-string is asked for (and the array types are the same), the
-already-compiled shared library will be loaded and used.
Because Blitz makes
-extensive use of C++ templating, it can take a long time to compile
-the first time. After that, however, the code should evaluate more
-quickly than the equivalent NumPy expression. This is especially true
-if your array sizes are large and the expression would require NumPy
-to create several temporaries. Only expressions involving basic
-arithmetic operations and basic array slicing can be converted to
-Blitz C++ code.
-
-For example, consider the expression::
-
-    d = 4*a + 5*a*b + 6*b*c
-
-where a, b, and c are all arrays of the same type and shape. When the
-data-type is double-precision and the size is 1000x1000, this
-expression takes about 0.5 seconds to compute on a 1.1 GHz AMD Athlon
-machine. When this expression is executed instead using blitz:
-
-.. code-block:: python
-
-   d = empty(a.shape, 'd'); weave.blitz(expr)
-
-execution time is only about 0.20 seconds (about 0.14 seconds spent in
-weave and the rest in allocating space for d). Thus, we've sped up the
-code by a factor of 2 using only a simple command (weave.blitz). Your
-mileage may vary, but factors of 2-8 speed-ups are possible with this
-very simple technique.
-
-If you are interested in using weave in this way, then you should also
-look at scipy.numexpr which is another similar way to speed up
-expressions by eliminating the need for temporary variables. Using
-numexpr does not require a C/C++ compiler.
-
-
-Inline C-code
--------------
-
-Probably the most widely-used method of employing weave is to
-"in-line" C/C++ code into Python in order to speed up a time-critical
-section of Python code. In this method of using weave, you define a
-string containing useful C-code and then pass it to the function
-**weave.inline** ( ``code_string``, ``variables`` ), where
-code_string is a string of valid C/C++ code and variables is a list of
-variables that should be passed in from Python.
The C/C++ code should
-refer to the variables with the same names as they are defined with in
-Python. If weave.inline should return anything, then the special value
-return_val should be set to whatever object should be returned. The
-following example shows how to use weave on basic Python objects:
-
-.. code-block:: python
-
-   from scipy import weave
-
-   code = r"""
-   int i;
-   py::tuple results(2);
-   for (i=0; i<a.length(); i++) {
-        a[i] = i;
-   }
-   results[0] = 3.0;
-   results[1] = 4.0;
-   return_val = results;
-   """
-   a = [None]*10
-   res = weave.inline(code,['a'])
-
-The C++ code shown in the code string uses the name 'a' to refer to
-the Python list that is passed in. Because the Python list is a
-mutable type, the elements of the list itself are modified by the C++
-code. A set of C++ classes are used to access Python objects using
-simple syntax.
-
-The main advantage of using C-code, however, is to speed up processing
-on an array of data. Accessing a NumPy array in C++ code using weave
-depends on what kind of type converter is chosen in going from NumPy
-arrays to C++ code. The default converter creates 5 variables for the
-C-code for every NumPy array passed in to weave.inline. The following
-table shows these variables which can all be used in the C++ code. The
-table assumes that ``myvar`` is the name of the array in Python with
-data-type {dtype} (i.e. float64, float32, int8, etc.)
-
-=========== ============== =========================================
-Variable    Type           Contents
-=========== ============== =========================================
-myvar       {dtype}*       Pointer to the first element of the array
-Nmyvar      npy_intp*      A pointer to the dimensions array
-Smyvar      npy_intp*      A pointer to the strides array
-Dmyvar      int            The number of dimensions
-myvar_array PyArrayObject* The entire structure for the array
-=========== ============== =========================================
-
-The in-lined code can contain references to any of these variables as
-well as to the standard macros MYVAR1(i), MYVAR2(i,j), MYVAR3(i,j,k),
-and MYVAR4(i,j,k,l). These name-based macros (they are the Python name
-capitalized followed by the number of dimensions needed) will
-dereference the memory for the array at the given location with no
-error checking (be sure to use the correct macro and ensure the array
-is aligned and in correct byte-swap order in order to get useful
-results). The following code shows how you might use these variables
-and macros to code a loop in C that computes a simple 2-d weighted
-averaging filter.
-
-.. code-block:: c++
-
-   int i,j;
-   for(i=1;i<Na[0]-1;i++) {
-      for(j=1;j<Na[1]-1;j++) {
-          B2(i,j) = A2(i,j) + (A2(i-1,j) +
-                    A2(i+1,j)+A2(i,j-1)
-                    + A2(i,j+1))*0.5
-                    + (A2(i-1,j-1)
-                    + A2(i-1,j+1)
-                    + A2(i+1,j-1)
-                    + A2(i+1,j+1))*0.25;
-      }
-   }
-
-The above code doesn't have any error checking and so could fail with
-a Python crash if ``a`` had the wrong number of dimensions, or ``b``
-did not have the same shape as ``a``. However, it could be placed
-inside a standard Python function with the necessary error checking to
-produce a robust but fast subroutine.
-
-One final note about weave.inline: if you have additional code you
-want to include in the final extension module such as supporting
-function calls, include statements, etc.
you can pass this code in as a
-string using the keyword support_code: ``weave.inline(code, variables,
-support_code=support)``. If you need the extension module to link
-against an additional library then you can also pass in
-distutils-style keyword arguments such as library_dirs, libraries,
-and/or runtime_library_dirs which point to the appropriate libraries
-and directories.
-
-Simplify creation of an extension module
-----------------------------------------
-
-The inline function creates one extension module for each function
-to be inlined. It also generates a lot of intermediate code that is
-duplicated for each extension module. If you have several related
-codes to execute in C, it would be better to make them all separate
-functions in a single extension module with multiple functions. You
-can also use the tools weave provides to produce this larger extension
-module. In fact, the weave.inline function just uses these more
-general tools to do its work.
-
-The approach is to:
-
-1. construct an extension module object using
-   ext_tools.ext_module(``module_name``);
-
-2. create function objects using ext_tools.ext_function(``func_name``,
-   ``code``, ``variables``);
-
-3. (optional) add support code to the function using the
-   .customize.add_support_code( ``support_code`` ) method of the
-   function object;
-
-4. add the functions to the extension module object using the
-   .add_function(``func``) method;
-
-5. when all the functions are added, compile the extension with its
-   .compile() method.
-
-Several examples are available in the examples directory where weave
-is installed on your system. Look particularly at ramp2.py,
-increment_example.py and fibonacci.py.
-
-
-Conclusion
-----------
-
-Weave is a useful tool for quickly writing routines in C/C++ and
-linking them into Python. Its caching mechanism allows for on-the-fly
-compilation, which makes it particularly attractive for in-house code.
Because of
-the requirement that the user have a C++ compiler, it can be difficult
-(but not impossible) to distribute a package that uses weave to other
-users who don't have a compiler installed. Of course, weave could be
-used to construct an extension module which is then distributed in the
-normal way (using a setup.py file). While you can use weave to
-build larger extension modules with many methods, creating methods
-with a variable number of arguments is not possible. Thus, for a more
-sophisticated module, you will still probably want a Python-layer that
-calls the weave-produced extension.
-
-.. index::
-   single: weave
-
-
-Pyrex
-=====
-
-Pyrex is a way to write C-extension modules using Python-like syntax.
-It is an interesting way to generate extension modules that is growing
-in popularity, particularly among people who have rusty or
-non-existent C-skills. It does require the user to write the
-"interface" code and so is more time-consuming than SWIG or f2py if
-you are trying to interface to a large library of code. However, if
-you are writing an extension module that will include quite a bit of
-your own algorithmic code, as well, then Pyrex is a good match. A big
-weakness perhaps is the inability to easily and quickly access the
-elements of a multidimensional array.
-
-.. index::
-   single: pyrex
-
-Notice that Pyrex is an extension-module generator only. Unlike weave
-or f2py, it includes no automatic facility for compiling and linking
-the extension module (which must be done in the usual fashion). It
-does provide a modified distutils class called build_ext which lets
-you build an extension module from a .pyx source. Thus, you could
-write in a setup.py file:
-
-..
code-block:: python
-
-   from Pyrex.Distutils import build_ext
-   from distutils.extension import Extension
-   from distutils.core import setup
-
-   import numpy
-   pyx_ext = Extension('mine', ['mine.pyx'],
-                       include_dirs=[numpy.get_include()])
-
-   setup(name='mine', description='Nothing',
-         ext_modules=[pyx_ext],
-         cmdclass = {'build_ext':build_ext})
-
-Adding the NumPy include directory is, of course, only necessary if
-you are using NumPy arrays in the extension module (which is what I
-assume you are using Pyrex for). The distutils extensions in NumPy
-also include support for automatically producing the extension-module
-and linking it from a ``.pyx`` file. It works so that if the user does
-not have Pyrex installed, then it looks for a file with the same
-file-name but a ``.c`` extension which it then uses instead of trying
-to produce the ``.c`` file again.
-
-Pyrex does not natively understand NumPy arrays. However, it is not
-difficult to include information that lets Pyrex deal with them
-usefully. In fact, the numpy.random.mtrand module was written using
-Pyrex so an example of Pyrex usage is already included in the NumPy
-source distribution. That experience led to the creation of a standard
-c_numpy.pxd file that you can use to simplify interacting with NumPy
-array objects in a Pyrex-written extension. The file may not be
-complete (it wasn't at the time of this writing). If you have
-additions you'd like to contribute, please send them. The file is
-located in the .../site-packages/numpy/doc/pyrex directory where you
-have Python installed. There is also an example in that directory of
-using Pyrex to construct a simple extension module. It shows that
-Pyrex looks a lot like Python but also contains some new syntax that
-is necessary in order to get C-like speed.
-
-If you just use Pyrex to compile a standard Python module, then you
-will get a C-extension module that runs either as fast or, possibly,
-more slowly than the equivalent Python module.
Speed increases are
-possible only when you use cdef to statically define C variables and
-use a special construct to create for loops:
-
-.. code-block:: none
-
-   cdef int i
-   for i from start <= i < stop:
-
-Let's look at two examples we've seen before to see how they might be
-implemented using Pyrex. These examples were compiled into extension
-modules using Pyrex-0.9.3.1.
-
-
-Pyrex-add
----------
-
-Here is part of a Pyrex-file I named add.pyx which implements the add
-functions we previously implemented using f2py:
-
-.. code-block:: none
-
-   cimport c_numpy
-   from c_numpy cimport import_array, ndarray, npy_intp, npy_cdouble, \
-        npy_cfloat, NPY_DOUBLE, NPY_CDOUBLE, NPY_FLOAT, \
-        NPY_CFLOAT
-
-   #We need to initialize NumPy
-   import_array()
-
-   def zadd(object ao, object bo):
-       cdef ndarray c, a, b
-       cdef npy_intp i
-       a = c_numpy.PyArray_ContiguousFromAny(ao,
-                    NPY_CDOUBLE, 1, 1)
-       b = c_numpy.PyArray_ContiguousFromAny(bo,
-                    NPY_CDOUBLE, 1, 1)
-       c = c_numpy.PyArray_SimpleNew(a.nd, a.dimensions,
-                    a.descr.type_num)
-       for i from 0 <= i < a.dimensions[0]:
-           (<npy_cdouble *>c.data)[i].real = \
-                (<npy_cdouble *>a.data)[i].real + \
-                (<npy_cdouble *>b.data)[i].real
-           (<npy_cdouble *>c.data)[i].imag = \
-                (<npy_cdouble *>a.data)[i].imag + \
-                (<npy_cdouble *>b.data)[i].imag
-       return c
-
-This module shows use of the ``cimport`` statement to load the
-definitions from the c_numpy.pxd file. As shown, both versions of the
-import statement are supported. It also shows use of the NumPy C-API
-to construct NumPy arrays from arbitrary input objects. The array c is
-created using PyArray_SimpleNew. Then the c-array is filled by
-addition. Casting to a particular data-type is accomplished using
-<cast \*>. Pointers are de-referenced with bracket notation and
-members of structures are accessed using '.' notation even if the
-object is technically a pointer to a structure.
The use of the
-special for loop construct ensures that the underlying code will have
-a similar C-loop so the addition calculation will proceed quickly.
-Notice that we have not checked for NULL after calling to the C-API
---- a cardinal sin when writing C-code. For routines that return
-Python objects, Pyrex inserts the checks for NULL into the C-code for
-you and returns with failure if need be. There is also a way to get
-Pyrex to automatically check for exceptions when you call functions
-that don't return Python objects. See the documentation of Pyrex for
-details.
-
-
-Pyrex-filter
-------------
-
-The two-dimensional example we created using weave is a bit uglier to
-implement in Pyrex because two-dimensional indexing using Pyrex is not
-as simple. But, it is straightforward (and possibly faster because of
-pre-computed indices). Here is the Pyrex-file I named image.pyx.
-
-.. code-block:: none
-
-   cimport c_numpy
-   from c_numpy cimport import_array, ndarray, npy_intp,\
-        NPY_DOUBLE, NPY_CDOUBLE, \
-        NPY_FLOAT, NPY_CFLOAT, NPY_ALIGNED \
-
-   #We need to initialize NumPy
-   import_array()
-   def filter(object ao):
-       cdef ndarray a, b
-       cdef npy_intp i, j, M, N, oS
-       cdef npy_intp r,rm1,rp1,c,cm1,cp1
-       cdef npy_intp S0, S1
-       cdef double value
-       # Require an ALIGNED array
-       # (but not necessarily contiguous)
-       #  We will use strides to access the elements.
-       a = c_numpy.PyArray_FROMANY(ao, NPY_DOUBLE, \
-                    2, 2, NPY_ALIGNED)
-       b = c_numpy.PyArray_SimpleNew(a.nd,a.dimensions, \
-                    a.descr.type_num)
-       M = a.dimensions[0]
-       N = a.dimensions[1]
-       S0 = a.strides[0]
-       S1 = a.strides[1]
-       for i from 1 <= i < M-1:
-           r = i*S0
-           rm1 = r-S0
-           rp1 = r+S0
-           oS = i*N
-           for j from 1 <= j < N-1:
-               c = j*S1
-               cm1 = c-S1
-               cp1 = c+S1
-               (<double *>b.data)[oS+j] = \
-                    (<double *>(a.data+r+c))[0] + \
-                    ((<double *>(a.data+rm1+c))[0] + \
-                     (<double *>(a.data+rp1+c))[0] + \
-                     (<double *>(a.data+r+cm1))[0] + \
-                     (<double *>(a.data+r+cp1))[0])*0.5 + \
-                    ((<double *>(a.data+rm1+cm1))[0] + \
-                     (<double *>(a.data+rp1+cm1))[0] + \
-                     (<double *>(a.data+rp1+cp1))[0] + \
-                     (<double *>(a.data+rm1+cp1))[0])*0.25
-       return b
-
-This 2-d averaging filter runs quickly because the loop is in C and
-the pointer computations are done only as needed. However, it is not
-particularly easy to understand what is happening. A 2-d image,
-``img``, can be filtered using this code very quickly using:
-
-.. code-block:: python
-
-   import image
-   out = image.filter(img)
-
-
-Conclusion
-----------
-
-There are several disadvantages of using Pyrex:
-
-1. The syntax for Pyrex can get a bit bulky, and it can be confusing at
-   first to understand what kind of objects you are getting and how to
-   interface them with C-like constructs.
-
-2. Inappropriate Pyrex syntax or incorrect calls to C-code or
-   type-mismatches can result in failures such as
-
-   1. Pyrex failing to generate the extension module source code,
-
-   2. Compiler failure while generating the extension module binary due to
-      incorrect C syntax,
-
-   3. Python failure when trying to use the module.
-
-3. It is easy to lose a clean separation between Python and C which makes
-   re-using your C-code for other non-Python-related projects more
-   difficult.
-
-4. Multi-dimensional arrays are "bulky" to index (appropriate macros
-   may be able to fix this).
-
-5.
   The C code generated by Pyrex is hard to read and modify (and
   typically compiles with annoying but harmless warnings).

Writing a good Pyrex extension module still takes a bit of effort
because not only does it require (a little) familiarity with C, but
also with Pyrex's brand of Python-mixed-with-C. One big advantage of
Pyrex-generated extension modules is that they are easy to distribute
using distutils. In summary, Pyrex is a very capable tool for either
gluing C code or generating an extension module quickly and should not
be overlooked. It is especially useful for people who can't or won't
write C or Fortran code. But, if you are already able to write simple
subroutines in C or Fortran, then I would use one of the other
approaches, such as f2py (for Fortran), ctypes (for C shared
libraries), or weave (for inline C code).

.. index::
   single: pyrex




ctypes
======

Ctypes is a Python extension module (downloaded separately for Python
< 2.5 and included with Python 2.5) that allows you to call an
arbitrary function in a shared library directly from Python. This
approach allows you to interface with C code directly from Python.
This opens up an enormous number of libraries for use from Python. The
drawback, however, is that coding mistakes can lead to ugly program
crashes very easily (just as can happen in C) because there is little
type or bounds checking done on the parameters. This is especially
true when array data is passed in as a pointer to a raw memory
location. The responsibility is then on you to ensure that the
subroutine does not access memory outside the actual array area. But,
if you don't mind living a little dangerously, ctypes can be an
effective tool for quickly taking advantage of a large shared library
(or writing extended functionality in your own shared library).

.. index::
   single: ctypes

Because the ctypes approach exposes a raw interface to the compiled
code, it is not always tolerant of user mistakes. Robust use of the
ctypes module typically involves an additional layer of Python code in
order to check the data types and array bounds of objects passed to
the underlying subroutine. This additional layer of checking (not to
mention the conversion from ctypes objects to C data-types that ctypes
itself performs) will make the interface slower than a hand-written
extension-module interface. However, this overhead should be negligible
if the C routine being called is doing any significant amount of work.
If you are a great Python programmer with weak C skills, ctypes is an
easy way to write a useful interface to a (shared) library of compiled
code.

To use ctypes you must

1. Have a shared library.

2. Load the shared library.

3. Convert the Python objects to ctypes-understood arguments.

4. Call the function from the library with the ctypes arguments.


Having a shared library
-----------------------

There are several requirements for a shared library that can be used
with ctypes, and they are platform specific. This guide assumes you
have some familiarity with making a shared library on your system (or
simply have a shared library available to you). Items to remember are:

- A shared library must be compiled in a special way (*e.g.* using
  the -shared flag with gcc).

- On some platforms (*e.g.* Windows), a shared library requires a
  .def file that specifies the functions to be exported. For example,
  a mylib.def file might contain::

      LIBRARY mylib.dll
      EXPORTS
      cool_function1
      cool_function2

  Alternatively, you may be able to use the storage-class specifier
  __declspec(dllexport) in the C definition of the function to avoid the
  need for this .def file.

There is no standard way in Python's distutils to create a plain
shared library (an extension module is a "special" shared library
that Python understands) in a cross-platform manner. Thus, a big
disadvantage of ctypes at the time of writing this book is that it is
difficult to distribute, in a cross-platform manner, a Python
extension that uses ctypes and includes your own code, which should be
compiled as a shared library on the user's system.


Loading the shared library
--------------------------

A simple, but robust way to load the shared library is to get the
absolute path name and load it using the cdll object of ctypes:

.. code-block:: python

    lib = ctypes.cdll[<full_path_name>]

However, on Windows accessing an attribute of the cdll method will
load the first DLL by that name found in the current directory or on
the PATH. Loading the absolute path name requires a little finesse for
cross-platform work since the extension of shared libraries varies.
There is a ``ctypes.util.find_library`` utility available that can
simplify the process of finding the library to load, but it is not
foolproof. Complicating matters, different platforms use different
default extensions for shared libraries (e.g. .dll -- Windows, .so
-- Linux, .dylib -- Mac OS X). This must also be taken into account if
you are using ctypes to wrap code that needs to work on several
platforms.

NumPy provides a convenience function called
:func:`ctypeslib.load_library` (name, path). This function takes the
name of the shared library (including any prefix like 'lib' but
excluding the extension) and a path where the shared library can be
located. It returns a ctypes library object, raises an OSError if the
library cannot be found, or raises an ImportError if the ctypes module
is not available. (Windows users: the ctypes library object loaded
using :func:`load_library` is always loaded assuming the cdecl calling
convention.
See the ctypes documentation under ctypes.windll and/or ctypes.oledll
for ways to load libraries under other calling conventions.)

The functions in the shared library are available as attributes of the
ctypes library object (returned from :func:`ctypeslib.load_library`) or
as items using ``lib['func_name']`` syntax. The latter method for
retrieving a function name is particularly useful if the function name
contains characters that are not allowed in Python variable names.


Converting arguments
--------------------

Python ints/longs, strings, and unicode objects are automatically
converted as needed to equivalent ctypes arguments. The None object is
also converted automatically to a NULL pointer. All other Python
objects must be converted to ctypes-specific types. There are two ways
around this restriction that allow ctypes to integrate with other
objects.

1. Don't set the argtypes attribute of the function object, and define
   an :obj:`_as_parameter_` attribute for the object you want to pass
   in. The :obj:`_as_parameter_` attribute must be a Python int, which
   will be passed directly to the function.

2. Set the argtypes attribute to a list whose entries contain objects
   with a classmethod named from_param that knows how to convert your
   object to an object that ctypes can understand (an int/long, string,
   unicode, or object with the :obj:`_as_parameter_` attribute).

NumPy uses both methods, with a preference for the second method
because it can be safer. The ctypes attribute of the ndarray returns
an object that has an _as_parameter\_ attribute which returns an
integer representing the address of the ndarray to which it is
associated. As a result, one can pass this ctypes attribute object
directly to a function expecting a pointer to the data in your
ndarray.
The caller must be sure that the ndarray object is of the
correct type and shape and has the correct flags set, or risk nasty
crashes if the data-pointer of an inappropriate array is passed in.

To implement the second method, NumPy provides the class-factory
function :func:`ndpointer` in the :mod:`ctypeslib` module. This
class-factory function produces an appropriate class that can be
placed in an argtypes attribute entry of a ctypes function. The class
will contain a from_param method which ctypes will use to convert any
ndarray passed in to the function to a ctypes-recognized object. In
the process, the conversion will perform checking on any properties of
the ndarray that were specified by the user in the call to :func:`ndpointer`.
Aspects of the ndarray that can be checked include the data-type, the
number of dimensions, the shape, and/or the state of the flags on any
array passed. The return value of the from_param method is the ctypes
attribute of the array which (because it contains the _as_parameter\_
attribute pointing to the array data area) can be used by ctypes
directly.

The ctypes attribute of an ndarray is also endowed with additional
attributes that may be convenient when passing additional information
about the array into a ctypes function. The attributes **data**,
**shape**, and **strides** can provide ctypes-compatible types
corresponding to the data-area, the shape, and the strides of the
array. The data attribute returns a ``c_void_p`` representing a
pointer to the data area. The shape and strides attributes each return
an array of ctypes integers (or None representing a NULL pointer, for
a 0-d array). The base ctype of the array is a ctype integer of the
same size as a pointer on the platform. There are also methods
data_as({ctype}), shape_as(<base ctype>), and strides_as(<base
ctype>). These return the data as a ctype object of your choice and
the shape/strides arrays using an underlying base type of your choice.
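A quick interactive sketch shows these attributes in action (using the
top-level ``numpy`` namespace; the printed strides assume a C-contiguous
array of 8-byte doubles):

```python
import ctypes

import numpy as np

a = np.zeros((2, 3), dtype=np.double)

# data_as casts the data address to a typed ctypes pointer
p = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
print(p[0])                    # 0.0 -- first element, read through the pointer

# shape and strides come back as arrays of platform-sized ctypes integers
print(list(a.ctypes.shape))    # [2, 3]
print(list(a.ctypes.strides))  # [24, 8]
```

Note that the array must be kept alive while the pointer is in use; the
pointer does not own the memory it refers to.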
For convenience, the **ctypeslib** module also contains **c_intp** as
a ctypes integer data-type whose size is the same as the size of
``c_void_p`` on the platform (its value is None if ctypes is not
installed).


Calling the function
--------------------

The function is accessed as an attribute of, or an item from, the
loaded shared library. Thus, if "./mylib.so" has a function named
"cool_function1", I could access this function either as:

.. code-block:: python

    lib = numpy.ctypeslib.load_library('mylib', '.')
    func1 = lib.cool_function1    # or equivalently
    func1 = lib['cool_function1']

In ctypes, the return value of a function is set to be 'int' by
default. This behavior can be changed by setting the restype attribute
of the function. Use None for the restype if the function has no
return value ('void'):

.. code-block:: python

    func1.restype = None

As previously discussed, you can also set the argtypes attribute of
the function in order to have ctypes check the types of the input
arguments when the function is called. Use the :func:`ndpointer` factory
function to generate a ready-made class for data-type, shape, and
flags checking on your new function. The :func:`ndpointer` function has the
signature

.. function:: ndpointer(dtype=None, ndim=None, shape=None, flags=None)

   Keyword arguments with the value ``None`` are not checked.
   Specifying a keyword enforces checking of that aspect of the
   ndarray on conversion to a ctypes-compatible object. The dtype
   keyword can be any object understood as a data-type object. The
   ndim keyword should be an integer, and the shape keyword should be
   an integer or a sequence of integers. The flags keyword specifies
   the minimal flags that are required on any array passed in.
   This can be specified as a string of comma-separated requirements,
   an integer indicating the requirement bits OR'd together, or a
   flags object returned from the flags attribute of an array with the
   necessary requirements.

Using an ndpointer class in the argtypes attribute can make it
significantly safer to call a C function using ctypes and the data
area of an ndarray. You may still want to wrap the function in an
additional Python wrapper to make it user-friendly (hiding some
obvious arguments and making some arguments output arguments). In this
process, the :func:`require` function in NumPy may be useful for
returning the right kind of array from a given input.


Complete example
----------------

In this example, I will show how the addition function and the filter
function implemented previously using the other approaches can be
implemented using ctypes. First, the C code which implements the
algorithms contains the functions zadd, dadd, sadd, cadd, and
dfilter2d. The zadd function is:

.. code-block:: c

    /* Add arrays of contiguous data */
    typedef struct {double real; double imag;} cdouble;
    typedef struct {float real; float imag;} cfloat;

    void zadd(cdouble *a, cdouble *b, cdouble *c, long n)
    {
        while (n--) {
            c->real = a->real + b->real;
            c->imag = a->imag + b->imag;
            a++; b++; c++;
        }
    }

with similar code in cadd, dadd, and sadd to handle complex float,
double, and float data-types, respectively:

.. code-block:: c

    void cadd(cfloat *a, cfloat *b, cfloat *c, long n)
    {
        while (n--) {
            c->real = a->real + b->real;
            c->imag = a->imag + b->imag;
            a++; b++; c++;
        }
    }

    void dadd(double *a, double *b, double *c, long n)
    {
        while (n--) {
            *c++ = *a++ + *b++;
        }
    }

    void sadd(float *a, float *b, float *c, long n)
    {
        while (n--) {
            *c++ = *a++ + *b++;
        }
    }

The code.c file also contains the function dfilter2d:

..
code-block:: c

    /* Assumes b is contiguous and
       a has strides that are multiples of sizeof(double)
    */
    void
    dfilter2d(double *a, double *b, int *astrides, int *dims)
    {
        int i, j, M, N, S0, S1;
        int r, c, rm1, rp1, cp1, cm1;

        M = dims[0]; N = dims[1];
        S0 = astrides[0]/sizeof(double);
        S1 = astrides[1]/sizeof(double);
        for (i=1; i<M-1; i++) {
            r = i*S0; rp1 = r+S0; rm1 = r-S0;
            for (j=1; j<N-1; j++) {
                c = j*S1; cp1 = c+S1; cm1 = c-S1;
                b[i*N+j] = a[r+c] +
                    (a[rp1+c] + a[rm1+c] +
                     a[r+cp1] + a[r+cm1])*0.5 +
                    (a[rp1+cp1] + a[rp1+cm1] +
                     a[rm1+cp1] + a[rm1+cm1])*0.25;
            }
        }
    }

A possible advantage this code has over the Fortran-equivalent code is
that it takes arbitrarily strided (i.e. non-contiguous) arrays and may
also run faster depending on the optimization capability of your
compiler. But, it is obviously more complicated than the simple code
in filter.f. This code must be compiled into a shared library. On my
Linux system this is accomplished using::

    gcc -o code.so -shared -fPIC code.c

This creates a shared library named code.so in the current directory.
On Windows don't forget to either add __declspec(dllexport) in front
of void on the line preceding each function definition, or write a
code.def file that lists the names of the functions to be exported.

A suitable Python interface to this shared library should be
constructed. To do this create a file named interface.py with the
following lines at the top:

..
code-block:: python

    __all__ = ['add', 'filter2d']

    import ctypes
    import os

    import numpy as N

    _path = os.path.dirname(__file__)
    lib = N.ctypeslib.load_library('code', _path)
    _typedict = {'zadd': complex, 'sadd': N.single,
                 'cadd': N.csingle, 'dadd': float}
    for name in _typedict.keys():
        val = getattr(lib, name)
        val.restype = None
        _type = _typedict[name]
        val.argtypes = [N.ctypeslib.ndpointer(_type,
                            flags='aligned, contiguous'),
                        N.ctypeslib.ndpointer(_type,
                            flags='aligned, contiguous'),
                        N.ctypeslib.ndpointer(_type,
                            flags='aligned, contiguous,'
                                  'writeable'),
                        N.ctypeslib.c_intp]

This code loads the shared library named code.{ext} located in the
same path as this file. It then adds a return type of void to the
functions contained in the library. It also adds argument checking to
the functions in the library so that ndarrays can be passed as the
first three arguments along with an integer (large enough to hold a
pointer on the platform) as the fourth argument.

Setting up the filtering function is similar and allows the filtering
function to be called with ndarray arguments as the first two
arguments and with pointers to integers (large enough to hold the
strides and shape of an ndarray) as the last two arguments:

.. code-block:: python

    lib.dfilter2d.restype = None
    lib.dfilter2d.argtypes = [N.ctypeslib.ndpointer(float, ndim=2,
                                  flags='aligned'),
                              N.ctypeslib.ndpointer(float, ndim=2,
                                  flags='aligned, contiguous,'
                                        'writeable'),
                              ctypes.POINTER(N.ctypeslib.c_intp),
                              ctypes.POINTER(N.ctypeslib.c_intp)]

Next, define a simple selection function that chooses which addition
function to call in the shared library based on the data-type:

..
code-block:: python

    def select(dtype):
        if dtype.char in '?bBhHf':
            return lib.sadd, N.single
        elif dtype.char == 'F':
            return lib.cadd, N.csingle
        elif dtype.char in 'DG':
            return lib.zadd, complex
        else:
            return lib.dadd, float

Finally, the two functions to be exported by the interface can be
written simply as:

.. code-block:: python

    def add(a, b):
        requires = ['CONTIGUOUS', 'ALIGNED']
        a = N.asanyarray(a)
        func, dtype = select(a.dtype)
        a = N.require(a, dtype, requires)
        b = N.require(b, dtype, requires)
        c = N.empty_like(a)
        func(a, b, c, a.size)
        return c

and:

.. code-block:: python

    def filter2d(a):
        a = N.require(a, float, ['ALIGNED'])
        b = N.zeros_like(a)
        lib.dfilter2d(a, b, a.ctypes.strides, a.ctypes.shape)
        return b


Conclusion
----------

.. index::
   single: ctypes

Using ctypes is a powerful way to connect Python with arbitrary
C code. Its advantages for extending Python include:

- clean separation of C code from Python code

  - no need to learn a new syntax except Python and C

  - allows re-use of C code

  - functionality in shared libraries written for other purposes can be
    obtained with a simple Python wrapper and a search for the library.

- easy integration with NumPy through the ctypes attribute

- full argument checking with the ndpointer class factory

Its disadvantages include:

- It is difficult to distribute an extension module made using ctypes
  because of a lack of support for building shared libraries in
  distutils (but I suspect this will change in time).

- You must have shared libraries of your code (no static libraries).

- There is very little support for C++ code and its different
  library-calling conventions. You will probably need a C wrapper
  around C++ code to use with ctypes (or just use Boost.Python instead).
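Whichever wrapping approach you choose, it helps to have an independent
check of the result. The 2-d averaging filter used throughout this
chapter can also be written directly with NumPy slicing; the name
``filter2d_ref`` below is purely illustrative, and any of the compiled
implementations should agree with this (slower, but obviously correct)
sketch on interior points:

```python
import numpy as np

def filter2d_ref(a):
    # Pure-NumPy reference for the 2-d averaging filter:
    # center + 0.5*(4 nearest neighbors) + 0.25*(4 diagonal neighbors),
    # with the one-pixel border left at zero.
    a = np.asarray(a, dtype=float)
    b = np.zeros_like(a)
    b[1:-1, 1:-1] = (a[1:-1, 1:-1]
                     + 0.5*(a[2:, 1:-1] + a[:-2, 1:-1]
                            + a[1:-1, 2:] + a[1:-1, :-2])
                     + 0.25*(a[2:, 2:] + a[2:, :-2]
                             + a[:-2, 2:] + a[:-2, :-2]))
    return b

# On a constant image every interior pixel becomes 1 + 0.5*4 + 0.25*4 = 4
print(filter2d_ref(np.ones((4, 4)))[1, 1])   # 4.0
```

Comparing a compiled version against this with ``numpy.allclose`` is a
quick way to catch stride or indexing mistakes in the C code.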

Because of the difficulty in distributing an extension module made
using ctypes, f2py is still the easiest way to extend Python for
package creation. However, ctypes is a close second and will probably
grow in popularity now that it is part of the Python distribution.
This should bring more features to ctypes and eventually eliminate the
difficulty in extending Python and distributing the extension using
ctypes.


Additional tools you may find useful
====================================

These tools have been found useful by others using Python and so are
included here. They are discussed separately because I see them either
as older ways of doing things now handled more modernly by f2py,
weave, Pyrex, or ctypes (SWIG, PyFort, PyInline), or because I don't
know much about them (SIP, Boost, Instant). I have not added links to
these methods because my experience is that you can find the most
relevant link faster using Google or some other search engine, and any
links provided here would be quickly dated. Do not assume that just
because a package is included in this list, I don't think it deserves
your attention. I'm including information about these packages because
many people have found them useful and I'd like to give you as many
options as possible for tackling the problem of easily integrating
your code.


SWIG
----

.. index::
   single: swig

Simplified Wrapper and Interface Generator (SWIG) is an old and fairly
stable method for wrapping C/C++ libraries for a large variety of other
languages. It does not specifically understand NumPy arrays but can be
made usable with NumPy through the use of typemaps. There are some
sample typemaps in the numpy/doc/swig directory under numpy.i, along
with an example module that makes use of them. SWIG excels at wrapping
large C/C++ libraries because it can (almost) parse their headers and
auto-produce an interface.
Technically, you need to generate a ``.i``
file that defines the interface. Often, however, this ``.i`` file can
be part of the header itself. The interface usually needs a bit of
tweaking to be very useful. This ability to parse C/C++ headers and
auto-generate the interface still makes SWIG a useful approach to
adding functionality from C/C++ into Python, despite the other
methods that have emerged that are more targeted to Python. SWIG can
actually target extensions for several languages, but the typemaps
usually have to be language-specific. Nonetheless, with modifications
to the Python-specific typemaps, SWIG can be used to interface a
library with other languages such as Perl, Tcl, and Ruby.

My experience with SWIG has been generally positive in that it is
relatively easy to use and quite powerful. I used to use it quite
often before becoming more proficient at writing C-extensions.
However, I struggled writing custom interfaces with SWIG because it
must be done using the concept of typemaps, which are not Python
specific and are written in a C-like syntax. Therefore, I tend to
prefer other gluing strategies and would only attempt to use SWIG to
wrap a very large C/C++ library. Nonetheless, there are others who use
SWIG quite happily.


SIP
---

.. index::
   single: SIP

SIP is another tool for wrapping C/C++ libraries that is Python
specific and appears to have very good support for C++. Riverbank
Computing developed SIP in order to create Python bindings to the QT
library. An interface file must be written to generate the binding,
but the interface file looks a lot like a C/C++ header file. While SIP
is not a full C++ parser, it understands quite a bit of C++ syntax as
well as its own special directives that allow modification of how the
Python binding is accomplished. It also allows the user to define
mappings between Python types and C/C++ structures and classes.


Boost Python
------------

.. index::
   single: Boost.Python

Boost is a repository of C++ libraries, and Boost.Python is one of
those libraries, providing a concise interface for binding C++
classes and functions to Python. The amazing part of the Boost.Python
approach is that it works entirely in pure C++ without introducing a
new syntax. Many users of C++ report that Boost.Python makes it
possible to combine the best of both worlds in a seamless fashion. I
have not used Boost.Python because I am not a big user of C++, and
using Boost to wrap simple C subroutines is usually overkill. Its
primary purpose is to make C++ classes available in Python. So, if you
have a set of C++ classes that need to be integrated cleanly into
Python, consider learning about and using Boost.Python.


Instant
-------

.. index::
   single: Instant

This is a relatively new package (called pyinstant at sourceforge)
that builds on top of SWIG to make it easy to inline C and C++ code in
Python, very much like weave. However, Instant builds extension modules
on the fly with specific module names and specific method names. In
this respect it is more like f2py in its behavior. The extension
modules are built on the fly (as long as SWIG is installed). They
can then be imported. Here is an example of using Instant with NumPy
arrays (adapted from the test2 included in the Instant distribution):

..
code-block:: python

    code = """
    PyObject* add(PyObject* a_, PyObject* b_){
        /*
        various checks
        */
        PyArrayObject* a = (PyArrayObject*) a_;
        PyArrayObject* b = (PyArrayObject*) b_;
        int n = a->dimensions[0];
        int dims[1];
        dims[0] = n;
        PyArrayObject* ret;
        ret = (PyArrayObject*) PyArray_FromDims(1, dims, NPY_DOUBLE);
        int i;
        char *aj = a->data;
        char *bj = b->data;
        double *retj = (double *)ret->data;
        for (i = 0; i < n; i++) {
            *retj++ = *((double *)aj) + *((double *)bj);
            aj += a->strides[0];
            bj += b->strides[0];
        }
        return (PyObject *)ret;
    }
    """
    import Instant, numpy
    ext = Instant.Instant()
    ext.create_extension(code=code, headers=["numpy/arrayobject.h"],
                         include_dirs=[numpy.get_include()],
                         init_code='import_array();', module="test2b_ext")
    import test2b_ext
    a = numpy.arange(1000)
    b = numpy.arange(1000)
    d = test2b_ext.add(a, b)

Except perhaps for the dependence on SWIG, Instant is a
straightforward utility for writing extension modules.


PyInline
--------

This is a much older module that allows automatic building of
extension modules so that C code can be included with Python code.
Its latest release (version 0.03) was in 2001, and it appears that it
is not being updated.


PyFort
------

PyFort is a nice tool for wrapping Fortran and Fortran-like C code
into Python with support for Numeric arrays. It was written by Paul
Dubois, a distinguished computer scientist and the very first
maintainer of Numeric (now retired). It is worth mentioning in the
hopes that somebody will update PyFort to work with NumPy arrays as
well, which now support either Fortran- or C-style contiguous arrays.


#################
Using Numpy C-API
#################

..
toctree::

   c-info.how-to-extend
   c-info.python-as-glue
   c-info.beyond-basics


*************************
How to find documentation
*************************

.. seealso:: :ref:`Numpy-specific help functions <routines.help>`

.. note:: XXX: this part is not yet written.

.. automodule:: numpy.doc.howtofind


.. _user:

################
Numpy User Guide
################

This guide explains how to make use of different features
of Numpy. For detailed documentation of the various functions
and classes, see :ref:`reference`.

.. warning::

   This "User Guide" is still very much work in progress; the material
   is not organized, and many aspects of Numpy are not covered.

   More documentation for Numpy can be found on the
   `scipy.org <http://www.scipy.org/Documentation>`__ website.

.. toctree::
   :maxdepth: 2

   howtofind
   basics
   performance
   misc
   c-info


*************
Miscellaneous
*************

.. note:: XXX: This section is not yet written.

.. automodule:: numpy.doc.misc

.. automodule:: numpy.doc.methods_vs_functions


***********
Performance
***********

.. note:: XXX: This section is not yet written.

.. automodule:: numpy.doc.performance