-rw-r--r-- | doc/release/1.8.0-notes.rst | 14 |
1 files changed, 7 insertions, 7 deletions
diff --git a/doc/release/1.8.0-notes.rst b/doc/release/1.8.0-notes.rst
index b0caf8103..127226054 100644
--- a/doc/release/1.8.0-notes.rst
+++ b/doc/release/1.8.0-notes.rst
@@ -149,18 +149,18 @@ advantage of compiler builtins to avoid expensive calls to libc.
 This improves performance of these operations by about a factor of two on gnu
 libc systems.
 
-Performance improvements to base math, `sqrt`, `abs` and `min/max`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The base math (add, subtract, divide, multiply) and `sqrt`, `abs`, `min/max`
-functions for unit stride elementary operations have been improved to make use
-of SSE2 CPU SIMD instructions.
+Performance improvements to base math, `sqrt`, `absolute` and `minimum/maximum`
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The base math (add, subtract, divide, multiply), `sqrt`, `absolute` and
+`minimum/maximum` functions for unit stride elementary operations have been
+improved to make use of SSE2 CPU SIMD instructions.
 This improves performance of these operations up to 4x/2x for float32/float64
 depending on the location of the data in the CPU caches. The performance gain
 is greatest for in-place operations.
 In order to use the improved functions the SSE2 instruction set must be enabled
 at compile time. It is enabled by default on x86_64 systems. On x86_32 with a
-capable CPU it must be enabled by passing the appropriate flag to CFLAGS build
-variable (-msse2 with gcc).
+capable CPU it must be enabled by passing the appropriate flag to the CFLAGS
+build variable (-msse2 with gcc).
 
 Changes
 =======