author | Rafael Cardoso Fernandes Sousa <rafaelcfsousa@ibm.com> | 2022-02-15 19:07:58 -0600 |
---|---|---|
committer | Rafael Cardoso Fernandes Sousa <rafaelcfsousa@ibm.com> | 2022-04-15 22:42:38 -0500 |
commit | a14d04752036c9f1b4eb000d079b27da3bacedf2 (patch) | |
tree | cb918438e3bff523fbff923eed84c9b51686f171 /numpy/array_api/_typing.py | |
parent | 1ab7e8fbf90ac4a81d2ffdde7d78ec464dccb02e (diff) | |
download | numpy-a14d04752036c9f1b4eb000d079b27da3bacedf2.tar.gz |
ENH,SIMD: Vectorize modulo/divide using the universal intrinsics
This commit optimizes the operations below:
- fmod (signed/unsigned integers)
- remainder (signed/unsigned integers)
- divmod (signed/unsigned integers)
- floor_divide (signed integers)
using the VSX4/Power10 integer vector division/modulo instructions.
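For reference, the four operations differ only in how they round the quotient for signed operands. The sketch below is a pure-Python model of that integer semantics, not the SIMD code added by this commit. The helper names (`trunc_divmod`, `fmod_int`, `floor_divmod`) are hypothetical and chosen for illustration. It relies on the documented behavior that numpy.fmod returns a remainder with the sign of the dividend (C-style), while numpy.remainder, numpy.divmod and numpy.floor_divide follow Python's floored division.

```python
def trunc_divmod(a: int, b: int) -> tuple[int, int]:
    """C-style division: quotient rounds toward zero, remainder has the sign of a.

    Division by zero is not modeled here (Python raises ZeroDivisionError).
    """
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def fmod_int(a: int, b: int) -> int:
    """Model of fmod on integers: the truncated remainder."""
    return trunc_divmod(a, b)[1]


def floor_divmod(a: int, b: int) -> tuple[int, int]:
    """Model of divmod: floored quotient, remainder has the sign of b.

    floor_divide is the first element, remainder is the second.
    """
    q, r = trunc_divmod(a, b)
    # Fix-up step: a nonzero remainder whose sign differs from the divisor
    # means the truncated quotient is one too large.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
        r += b
    return q, r


if __name__ == "__main__":
    for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        print(a, b, "fmod:", fmod_int(a, b), "divmod:", floor_divmod(a, b))
```

For non-negative operands, which covers every unsigned case, the two conventions give the same result, so the fix-up step never fires.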
See the improvements below (maximum speedup):
- numpy.fmod
  - arr OP arr: signed (1.17x), unsigned (1.13x)
  - arr OP scalar: signed (1.34x), unsigned (1.29x)
- numpy.remainder
  - arr OP arr: signed (4.19x), unsigned (1.17x)
  - arr OP scalar: signed (4.87x), unsigned (1.29x)
- numpy.divmod
  - arr OP arr: signed (4.73x), unsigned (1.23x)
  - arr OP scalar: signed (5.05x), unsigned (1.31x)
- numpy.floor_divide
  - arr OP arr: signed (4.44x)
The times above were collected using the benchmark tool available in NumPy.
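To make the "arr OP arr" and "arr OP scalar" cases concrete, here is a minimal timing sketch using `timeit`. It is not the NumPy benchmark suite used for the numbers above. The array size, dtype, value ranges, seed and repeat counts are all assumptions made for illustration. A speedup would only show up when comparing builds from before and after this commit on hardware with VSX4 support.

```python
import timeit

import numpy as np

# Assumed inputs: 10,000 signed 32-bit integers, with nonzero divisors.
rng = np.random.default_rng(0)
a = rng.integers(-1000, 1000, size=10_000, dtype=np.int32)
b = rng.integers(1, 1000, size=10_000, dtype=np.int32)
scalar = np.int32(7)

# One entry per operation and operand shape listed in the commit message.
cases = {
    "fmod: arr OP arr": lambda: np.fmod(a, b),
    "fmod: arr OP scalar": lambda: np.fmod(a, scalar),
    "remainder: arr OP arr": lambda: np.remainder(a, b),
    "remainder: arr OP scalar": lambda: np.remainder(a, scalar),
    "divmod: arr OP arr": lambda: np.divmod(a, b),
    "divmod: arr OP scalar": lambda: np.divmod(a, scalar),
    "floor_divide: arr OP arr": lambda: np.floor_divide(a, b),
}


def best_time(fn, number=50, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    return min(timeit.repeat(fn, number=number, repeat=repeat)) / number


timings = {name: best_time(fn) for name, fn in cases.items()}

for name, seconds in timings.items():
    print(f"{name:28s} {seconds * 1e6:8.2f} us per call")
```

Running the same script against both builds and dividing the old time by the new time gives a speedup ratio comparable in kind to the figures listed above.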