delta/python-packages/numpy.git - github.com: numpy/numpy.git


author	Julian Taylor <jtaylor.debian@googlemail.com>	2013-05-19 17:04:27 +0200
committer	Julian Taylor <jtaylor.debian@googlemail.com>	2013-05-25 17:36:00 +0200
commit	0adccaaa910ab495e993f453956fd983775604f3 (patch)
tree	575e6b1bc7066bbe24ade1fee8576e4e31f2f7ef /numpy/core/numeric.py
parent	8ff5e37bff03925da4c1b121b38188f9fd779b4d (diff)
download	numpy-0adccaaa910ab495e993f453956fd983775604f3.tar.gz

ENH: vectorize sqrt ufunc using SSE2

Specialize the sqrt ufunc for float and double and vectorize it using SSE2. This improves performance by a factor of 4 for float and 2 for double, provided the operation is not memory bound due to non-cached data. Performance is always better on all tested machines (AMD Phenom X2, Intel Xeon 5xxx/7xxx, Core 2 Duo, Core i7).

This version will not set errno on invalid input, but numpy only checks the FPU flags, so the behavior is the same.

In principle the compiler could autovectorize the loop when given -ffast-math (for no errno), a loop specialized for the vectorizable strides, and some hints (restrict, __builtin_assume_aligned, etc.), but it is simpler and more reliable to vectorize it by hand.
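For illustration only, a hand-vectorized square root over contiguous doubles might look roughly like the sketch below. This is not the code from the commit: the function name `sse2_sqrt_double` and the peel/main/remainder layout are assumptions made for this example. Only the SSE2 intrinsics from `<emmintrin.h>` are real. A float variant would presumably use `_mm_sqrt_ps` and process four lanes per iteration instead of two.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not NumPy's actual loop): out[i] = sqrt(in[i])
 * for n contiguous doubles. Uses only SSE2 instructions, so errno is
 * never set; invalid input only raises the FPU invalid flag.
 */
static void
sse2_sqrt_double(double *out, const double *in, size_t n)
{
    size_t i = 0;

    /* Peel scalar iterations until the output is 16-byte aligned. */
    while (i < n && ((uintptr_t)(out + i) % 16) != 0) {
        __m128d s = _mm_load_sd(in + i);
        _mm_store_sd(out + i, _mm_sqrt_sd(s, s));
        i++;
    }

    /* Main loop: two doubles per iteration, aligned stores. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(in + i); /* input may be unaligned */
        _mm_store_pd(out + i, _mm_sqrt_pd(v));
    }

    /* Remainder: at most one trailing element. */
    for (; i < n; i++) {
        __m128d s = _mm_load_sd(in + i);
        _mm_store_sd(out + i, _mm_sqrt_sd(s, s));
    }
}
```

The scalar paths use `_mm_sqrt_sd` rather than the C library `sqrt`, so every element goes through the same errno-free instruction, whether it falls in the peeled, main, or remainder part of the loop.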

Diffstat (limited to 'numpy/core/numeric.py')

0 files changed, 0 insertions, 0 deletions

