author | rgommers <ralf.gommers@googlemail.com> | 2010-11-11 21:29:56 +0800 |
---|---|---|
committer | rgommers <ralf.gommers@googlemail.com> | 2010-11-11 21:29:56 +0800 |
commit | 4a7de57448091ef02e50edf9d1e302c20a26ff0c (patch) | |
tree | 357de983a7ba4ff56e7800781f4619d70111a89f /doc/ufuncs.rst.txt | |
parent | a07ac0f4703cbac0e8747ac0e9e08f41cba9b896 (diff) | |
download | numpy-4a7de57448091ef02e50edf9d1e302c20a26ff0c.tar.gz |
DOC: rename ReST files under doc/ from *.txt to *.rst.txt, so they render on github.
Diffstat (limited to 'doc/ufuncs.rst.txt')
-rw-r--r-- | doc/ufuncs.rst.txt | 103 |
1 files changed, 103 insertions, 0 deletions
diff --git a/doc/ufuncs.rst.txt b/doc/ufuncs.rst.txt
new file mode 100644
index 000000000..fa107cc21
--- /dev/null
+++ b/doc/ufuncs.rst.txt
@@ -0,0 +1,103 @@
+BUFFERED General Ufunc explanation
+==================================
+
+.. note::
+
+   This was implemented already, but the notes are kept here for historical
+   and explanatory purposes.
+
+We need to optimize the section of ufunc code that handles mixed-type
+and misbehaved arrays.  In particular, we need to fix it so that items
+are not copied into the buffer if they don't have to be.
+
+Right now, all data is copied into the buffers (even scalars are copied
+multiple times into the buffers even if they are not going to be cast).
+
+Some benchmarks show that this results in a significant slow-down
+(factor of 4) over similar numarray code.
+
+The approach is therefore to loop over the largest dimension (just like
+the NO_BUFFER portion of the code).  All arrays will either have N or
+1 in this last dimension (or there would be a mismatch error). The
+buffer size is B.
+
+If N <= B (and only if needed), we copy the entire last-dimension into
+the buffer as fast as possible using the single-stride information.
+
+We also copy into output arrays only if needed (otherwise the
+output arrays are used directly in the ufunc code).
+
+Call the function using the appropriate strides information from all the input
+arrays.  Only set the strides to the element-size for arrays that will be copied.
+
+If N > B, then we have to do the above operation in a loop (with an extra loop
+at the end with a different buffer size).
+
+Both of these cases are handled with the following code::
+
+    Compute N = quotient * B + remainder.
+    quotient = N / B  # integer math
+    (store quotient + 1) as the number of innerloops
+    remainder = N % B # integer remainder
+
+On the inner dimension we will have (quotient + 1) loops, where
+the size of the inner function is B for all but the last, when the niter size is
+remainder.
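The quotient/remainder split described above can be illustrated with a small Python sketch. It is not part of the committed file and is not NumPy's implementation; the helper name `inner_loop_sizes` and its arguments are hypothetical, chosen only to mirror the N and B of the notes.

```python
def inner_loop_sizes(n, bufsize):
    """Return the item count handled by each inner loop.

    Hypothetical helper: n is the length N of the last dimension and
    bufsize is the buffer size B from the notes.
    """
    if n < 0 or bufsize <= 0:
        raise ValueError("n must be >= 0 and bufsize must be > 0")
    # N = quotient * B + remainder, using integer math.
    quotient, remainder = divmod(n, bufsize)
    # (quotient + 1) loops: quotient full loops of B items, then one
    # final loop of `remainder` items (which may be empty).
    return [bufsize] * quotient + [remainder]


print(inner_loop_sizes(10, 4))
print(inner_loop_sizes(3, 8))
```

When N <= B the quotient is zero and the single loop handles all N items, so both cases from the notes fall out of the same arithmetic. When N is an exact multiple of B the final loop is empty; a real implementation might skip it, which this sketch does not attempt.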
+
+So, the code looks very similar to NOBUFFER_LOOP except the inner loop is
+replaced with::
+
+    for(k=0; k<quotient+1; k++) {
+        if (k==quotient) make itersize remainder size
+        copy only needed items to buffer.
+        swap input buffers if needed
+        cast input buffers if needed
+        call function()
+        cast outputs in buffers if needed
+        swap outputs in buffers if needed
+        copy only needed items back to output arrays.
+        update all data-pointers by strides*niter
+    }
+
+
+Reference counting for OBJECT arrays:
+
+If there are object arrays involved then loop->obj gets set to 1.  Then there are two cases:
+
+1) The loop function is an object loop:
+
+   Inputs:
+      - castbuf starts as NULL and then gets filled with new references.
+      - function gets called and doesn't alter the reference count in castbuf
+      - on the next iteration (next value of k), the casting function will
+        DECREF what is present in castbuf already and place a new object.
+
+      - At the end of the inner loop (for loop over k), the final new references
+        in castbuf must be DECREF'd.  If it's a scalar then a single DECREF suffices.
+        Otherwise, "bufsize" DECREF's are needed (unless there was only one
+        loop, then "remainder" DECREF's are needed).
+
+   Outputs:
+      - castbuf contains a new reference as the result of the function call.  This
+        gets converted to the type of interest.  This new reference in castbuf
+        will be DECREF'd by later calls to the function.  Thus, only after the
+        innermost loop do we need to DECREF the remaining references in castbuf.
+
+2) The loop function is of a different type:
+
+   Inputs:
+
+      - The PyObject input is copied over to buffer which receives a "borrowed"
+        reference.  This reference is then used but not altered by the cast
+        call.   Nothing needs to be done.
+
+   Outputs:
+
+      - The buffer[i] memory receives the PyObject input after the cast.  This is
+        a new reference which will be "stolen" as it is copied over into memory.
+        The only problem is that what is presently in memory must be DECREF'd first.
+
+
+
+
+
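To make the buffered inner loop from the notes concrete, here is a minimal Python sketch of the same control flow. It is an illustration under stated assumptions, not the NumPy C code: the names `add_loop` and `buffered_call`, the `(data, start, stride)` triples standing in for data pointers and strides, and the per-input `casts` list are all hypothetical. Byte-swapping, output casting, and reference counting are left out.

```python
def add_loop(ins, out, niter):
    """Inner loop: out[i] = a[i] + b[i], addressed by (data, start, stride)."""
    (a, a0, sa), (b, b0, sb) = ins
    o, o0, so = out
    for i in range(niter):
        o[o0 + i * so] = a[a0 + i * sa] + b[b0 + i * sb]


def buffered_call(loop, inputs, casts, out, bufsize):
    """Run `loop` over `inputs` in chunks of at most `bufsize` items.

    Hypothetical sketch: each input has length N or 1, and casts[j] is a
    per-item conversion or None.  Only inputs that need a cast are copied
    into a buffer; the rest are used in place.  Returns the number of
    items copied into buffers.
    """
    n = len(out)
    if bufsize <= 0:
        raise ValueError("bufsize must be positive")
    for seq in inputs:
        if len(seq) not in (1, n):
            raise ValueError("each input must have length N or 1")
    quotient, remainder = divmod(n, bufsize)
    buffers = [None if cast is None else [None] * bufsize for cast in casts]
    copied = 0
    start = 0
    for k in range(quotient + 1):
        # Every loop handles bufsize items except the last one.
        niter = remainder if k == quotient else bufsize
        ins = []
        for seq, cast, buf in zip(inputs, casts, buffers):
            stride = 0 if len(seq) == 1 else 1  # stride 0 repeats a scalar
            if cast is None:
                # No copy: hand the original data and its stride to the loop.
                ins.append((seq, start * stride, stride))
            else:
                # Copy and cast only what is needed; a scalar is cast once.
                count = niter if stride else min(niter, 1)
                for i in range(count):
                    buf[i] = cast(seq[(start + i) * stride])
                copied += count
                ins.append((buf, 0, stride))
        # The output is written directly; no output buffer in this sketch.
        loop(ins, (out, start, 1), niter)
        start += niter  # advance the data pointers by niter items
    return copied


out = [None] * 5
ncopied = buffered_call(add_loop, [[1, 2, 3, 4, 5], [10]], [None, None], out, 2)
print(out, ncopied)
```

In the usage above neither input needs a cast, so nothing is copied and the scalar is read through a zero stride, which is the behavior the notes argue for.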