author Sebastian Pop <spop@amazon.com> 2019-07-03 20:10:38 +0000
committer Dmitry Stogov <dmitry@zend.com> 2019-07-11 12:04:29 +0300
commit 3b73c9fb8692a6ffc0f4cb6e66eb649871dfed34
tree 134ec1adc671faad79666100f2ff430fadeb826d /ext/reflection/php_reflection.c
parent 2a535a9707c89502df8bc0bd785f2e9192929422
neon vectorization for base64
A similar algorithm is used to vectorize base64 on x86_64; https://arxiv.org/abs/1704.00605 gives a good description. The AArch64 implementation differs in that, instead of using multiplies to shift bits around, it uses the vld3+vst4 and vld4+vst3 combinations to load and store interleaved data.

This patch is based on the NEON implementation by Wojciech Mula:
https://github.com/WojciechMula/base64simd/blob/master/encode/encode.neon.cpp
https://github.com/WojciechMula/base64simd/blob/master/encode/lookup.neon.cpp
adapted to php/ext/standard/base64.c and vectorized with a factor of 16 instead of 8.

On a Graviton A1 instance, using the synthetic benchmarks in https://github.com/lemire/fastbase64, I see a 175% speedup on base64 encoding and a 60% speedup on base64 decoding compared to the scalar implementation.

The patch passes `make test` regression testing on aarch64-linux.
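To illustrate the load/store pairing described above, here is a minimal scalar sketch in portable C. It is not the code from this patch: the three loops only model what a de-interleaving 3-way load (vld3), lane-wise shifts and masks, and an interleaving 4-way store (vst4) would do on 16 lanes for the encoding direction. The names `LANES`, `b64_alphabet`, and `encode_block_soa` are hypothetical, and the alphabet lookup is plain array indexing rather than a vectorized lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 16 /* assumed to match the factor-16 vectorization mentioned above */

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encodes exactly 3*LANES input bytes into
 * 4*LANES output characters. No padding or tail handling. */
static void encode_block_soa(const uint8_t *in, char *out)
{
    uint8_t a[LANES], b[LANES], c[LANES];               /* one array per input byte position */
    uint8_t i0[LANES], i1[LANES], i2[LANES], i3[LANES]; /* 6-bit indices */

    assert(in != NULL && out != NULL);

    /* Models a de-interleaving load: byte k of every 3-byte group
     * lands in its own array, so each array holds 16 "lanes". */
    for (size_t i = 0; i < LANES; i++) {
        a[i] = in[3 * i];
        b[i] = in[3 * i + 1];
        c[i] = in[3 * i + 2];
    }

    /* Lane-wise shifts and masks split 3 x 8 bits into 4 x 6 bits.
     * No multiplies are needed because the bytes are already separated. */
    for (size_t i = 0; i < LANES; i++) {
        i0[i] = (uint8_t)(a[i] >> 2);
        i1[i] = (uint8_t)(((a[i] & 0x03) << 4) | (b[i] >> 4));
        i2[i] = (uint8_t)(((b[i] & 0x0f) << 2) | (c[i] >> 6));
        i3[i] = (uint8_t)(c[i] & 0x3f);
    }

    /* Models an interleaving store: the four index arrays are mapped
     * through the alphabet and written back as 4-character groups. */
    for (size_t i = 0; i < LANES; i++) {
        out[4 * i]     = b64_alphabet[i0[i]];
        out[4 * i + 1] = b64_alphabet[i1[i]];
        out[4 * i + 2] = b64_alphabet[i2[i]];
        out[4 * i + 3] = b64_alphabet[i3[i]];
    }
}
```

Calling `encode_block_soa` on a 48-byte buffer holding "Man" repeated 16 times fills a 64-byte output buffer with "TWFu" repeated 16 times. Decoding would run the same idea in reverse, with a 4-way load and a 3-way store.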
Diffstat (limited to 'ext/reflection/php_reflection.c')
0 files changed, 0 insertions, 0 deletions