author | Julian Taylor <jtaylor.debian@googlemail.com> | 2013-06-08 23:13:34 +0200 |
---|---|---|
committer | Julian Taylor <jtaylor.debian@googlemail.com> | 2013-06-08 23:18:58 +0200 |
commit | 7fb8b714906a92516905cc0f03e45511bd1ac1ed (patch) | |
tree | 48d492f5557424b48b43b1f4406a989b6ceee096 | |
parent | 63123538b7d4d948919dfb5366a78eaa972fcda9 (diff) | |
download | numpy-7fb8b714906a92516905cc0f03e45511bd1ac1ed.tar.gz |
DOC: document isnan/bswap and SSE2 improvements
-rw-r--r-- | doc/release/1.8.0-notes.rst | 18 |
1 files changed, 18 insertions, 0 deletions
diff --git a/doc/release/1.8.0-notes.rst b/doc/release/1.8.0-notes.rst
index 76dcf50c2..c5315e6cd 100644
--- a/doc/release/1.8.0-notes.rst
+++ b/doc/release/1.8.0-notes.rst
@@ -142,6 +142,24 @@ The `pad` function has a new implementation, greatly improving performance for
 all inputs except `mode=<function>` (retained for backwards compatibility).
 Scaling with dimensionality is dramatically improved for rank >= 4.
 
+Performance improvements to `isnan`, `isinf`, `isfinite` and `byteswap`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+`isnan`, `isinf`, `isfinite` and `byteswap` have been improved to take
+advantage of compiler builtins to avoid expensive calls to libc.
+This improves performance of these operations by about a factor of two on gnu
+libc systems.
+
+Performance improvements to `sqrt` and `abs`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The `sqrt` and `abs` functions for unit stride elementary operations have been
+improved to make use of SSE2 CPU SIMD instructions.
+This improves performance of these operations up to 4x/2x for float32/float64
+depending on the location of the data in the CPU caches. The performance gain
+is greatest for in-place operations.
+In order to use the improved functions the SSE2 instruction set must be enabled
+at compile time. It is enabled by default on x86_64 systems. On x86_32 with a
+capable CPU it must be enabled by passing the appropriate flag to CFLAGS build
+variable (-msse2 with gcc).
+
 Changes
 =======
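
For orientation, the operations named in the added release notes can be exercised from Python roughly as follows. This is only an illustrative sketch and not part of the commit: the array contents and variable names are made up, and nothing here can confirm whether a given NumPy build actually takes the builtin or SSE2 code paths, since that depends on the compiler flags described above.

```python
import numpy as np

# Classification functions mentioned in the first new section.
a = np.array([1.0, 4.0, np.nan, np.inf, -9.0], dtype=np.float32)
nan_mask = np.isnan(a)
inf_mask = np.isinf(a)
finite_mask = np.isfinite(a)

# byteswap reverses the bytes of each element: 0x0001 <-> 0x0100 for int16.
swapped = np.array([1, 256], dtype=np.int16).byteswap()

# Contiguous (unit stride) arrays operated on in place, the case the
# second new section describes as benefiting most.
b = np.array([1.0, 4.0, 9.0, 16.0], dtype=np.float32)
np.sqrt(b, out=b)

c = np.array([-1.0, 2.0, -3.0], dtype=np.float64)
np.abs(c, out=c)

# A sliced view like this one has a non-unit stride; the result is the same,
# but per the notes it is not the case the SSE2 improvement targets.
strided = np.arange(8, dtype=np.float32)[::2]
strided_abs = np.abs(strided)
```

Passing `out=` with the input array itself is one way to request an in-place operation; any speedup is a property of the build and hardware, so it should be measured rather than assumed.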