author: Julian Taylor <jtaylor.debian@googlemail.com> 2016-11-29 00:19:21 +0100
committer: Julian Taylor <jtaylor.debian@googlemail.com> 2017-01-12 17:17:07 +0100
commit: f0f7ad80f2ef2d7525965dfe27c0e2ab68647197 (patch)
tree: 885b69a2c89ca924395da8f908f5ba2c48383864 /numpy/lib/function_base.py
parent: 7e6091c9a3fc4536ccbadb337e88650b2c901313 (diff)
ENH: vectorize packbits with SSE2
SSE2 has a special instruction to pack bytes into bits, available as the intrinsic _mm_movemask_epi8. It is significantly faster than the per-byte loop currently being used. Unfortunately, packbits is bitwise "big endian": the most significant bit of each output byte comes from the first input byte, whereas _mm_movemask_epi8 is little endian, so we need to byteswap the input first. Even so, it is still about 8-10 times faster than the scalar code.
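The bit-order mismatch described above can be illustrated with a small pure-Python emulation. This is only a sketch of the idea, not the C code from this commit: the helper names (movemask_epi8, packbits_scalar, packbits_movemask) are hypothetical, the emulation assumes the input length is a multiple of 16, and the byte reversal shown here is one plausible way to do the swap.

```python
def movemask_epi8(block):
    """Emulate _mm_movemask_epi8 on 16 bytes: result bit i is the top bit of byte i."""
    assert len(block) == 16
    mask = 0
    for i, b in enumerate(block):
        mask |= ((b >> 7) & 1) << i
    return mask


def packbits_scalar(values):
    """Reference per-byte loop: the first input of each group becomes the MSB."""
    out = bytearray()
    for start in range(0, len(values), 8):
        chunk = values[start:start + 8]
        byte = 0
        for v in chunk:
            byte = (byte << 1) | (1 if v else 0)
        byte <<= 8 - len(chunk)  # pad a short final group with zero bits
        out.append(byte)
    return bytes(out)


def packbits_movemask(values):
    """Pack 16 inputs per step (assumes len(values) is a multiple of 16)."""
    out = bytearray()
    for start in range(0, len(values), 16):
        # Map nonzero inputs to 0xFF so their top bit is set.
        block = [0xFF if v else 0x00 for v in values[start:start + 16]]
        # Reverse each 8-byte half so the first input of each half
        # ends up in the highest mask bit of that half.
        swapped = block[7::-1] + block[15:7:-1]
        mask = movemask_epi8(swapped)
        out.append(mask & 0xFF)         # inputs 0..7
        out.append((mask >> 8) & 0xFF)  # inputs 8..15
    return bytes(out)
```

Without the reversal step, the first input would land in the least significant bit of each output byte, which is the opposite of the packbits convention.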
Diffstat (limited to 'numpy/lib/function_base.py')
0 files changed, 0 insertions, 0 deletions