author Julian Taylor <jtaylor.debian@googlemail.com> 2013-06-08 23:13:34 +0200
committer Julian Taylor <jtaylor.debian@googlemail.com> 2013-06-08 23:18:58 +0200
commit 7fb8b714906a92516905cc0f03e45511bd1ac1ed
tree 48d492f5557424b48b43b1f4406a989b6ceee096
parent 63123538b7d4d948919dfb5366a78eaa972fcda9
DOC: document isnan/bswap and SSE2 improvements
-rw-r--r-- doc/release/1.8.0-notes.rst | 18
1 file changed, 18 insertions, 0 deletions
diff --git a/doc/release/1.8.0-notes.rst b/doc/release/1.8.0-notes.rst
index 76dcf50c2..c5315e6cd 100644
--- a/doc/release/1.8.0-notes.rst
+++ b/doc/release/1.8.0-notes.rst
@@ -142,6 +142,24 @@ The `pad` function has a new implementation, greatly improving performance for
all inputs except `mode=<function>` (retained for backwards compatibility).
Scaling with dimensionality is dramatically improved for rank >= 4.
+Performance improvements to `isnan`, `isinf`, `isfinite` and `byteswap`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+`isnan`, `isinf`, `isfinite` and `byteswap` have been improved to take
+advantage of compiler builtins to avoid expensive calls to libc.
+This improves performance of these operations by about a factor of two on GNU
+libc systems.
+
+Performance improvements to `sqrt` and `abs`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The `sqrt` and `abs` functions for unit stride elementary operations have been
+improved to make use of SSE2 CPU SIMD instructions.
+This improves performance of these operations by up to 4x for float32 and 2x
+for float64, depending on the location of the data in the CPU caches. The
+performance gain is greatest for in-place operations.
+To use the improved functions, the SSE2 instruction set must be enabled at
+compile time. It is enabled by default on x86_64 systems. On x86_32 with a
+capable CPU, it must be enabled by passing the appropriate flag in the CFLAGS
+build variable (-msse2 with gcc).
Changes
=======
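
The notes added by this patch distinguish unit-stride from strided inputs and single out in-place operations. A rough way to observe the difference on a given build is a small timing script like the sketch below. It is not part of the commit: the array size `N`, the helper names `sqrt_in_place` and `bench`, and the repeat counts are all illustrative assumptions, and the measured ratios will depend on the CPU, the cache sizes, and whether NumPy was compiled with SSE2 enabled.

```python
import timeit

import numpy as np

# Hypothetical problem size; pick something that fits your CPU caches.
N = 100_000

# Unit-stride (contiguous) float32 data.
contiguous = np.random.rand(N).astype(np.float32)
# Every second element of a larger buffer: same length, but not unit stride.
strided = np.random.rand(2 * N).astype(np.float32)[::2]
# Scratch buffer for the in-place case, so `contiguous` is left untouched.
work = contiguous.copy()


def sqrt_in_place(a):
    """Compute sqrt, writing the result back into the input buffer."""
    np.sqrt(a, out=a)
    return a


def bench(func, number=200):
    """Return the best per-call time in seconds over three repeats."""
    return min(timeit.repeat(func, number=number, repeat=3)) / number


results = {
    "sqrt, unit stride": bench(lambda: np.sqrt(contiguous)),
    "sqrt, strided": bench(lambda: np.sqrt(strided)),
    "sqrt, in place": bench(lambda: sqrt_in_place(work)),
    "abs, unit stride": bench(lambda: np.abs(contiguous)),
    "isnan, unit stride": bench(lambda: np.isnan(contiguous)),
}

if __name__ == "__main__":
    for name, seconds in results.items():
        print(f"{name:20s} {seconds * 1e6:8.1f} us per call")
```

Running the same script against builds with and without the patch (or with and without `-msse2` on x86_32) gives a before/after comparison. The absolute numbers are only meaningful relative to each other on the same machine.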