author | Matti Picus <matti.picus@gmail.com> | 2020-07-12 16:30:42 +0300 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-07-12 08:30:42 -0500 |
commit | b234742e2e26ff886f52864c73460fe4916a66d0 (patch) | |
tree | 5f6f90804adf53329d03edefadffd6d1e07a6a1b /doc/source/reference/simd/simd-optimizations-tables.inc | |
parent | 62fa23c44fb49c1d238e1de4f791ffc3ca4b1d11 (diff) | |
DOC: Add SIMD optimization documentation (gh-15551)
Add documentation for the new build infrastructure and API developed to enable
universal intrinsics. Written by @seiko2plus with some fixes by @mattip.
* DOC: add SIMD optimization doc (seiko2plus)
* DOC: reformat as valid RST
* trim whitespace
* first part of Understanding CPU Dispatching
* update build options and remove implied features, gonna update it later
* add more explanations for the dispatcher and fix doc style
* fix up style
* add figure
* Improve and more explanations for Understanding CPU Dispatching
* fix up syntax
* DOC: tweak formatting
* DOC: more tweaks
* fix rst formatting
* DOC: Generate CPU features tables from CCompilerOpt
* DOC: move files around
* DOC: add comment to top of file
* DOC: rebuild tables, fix links
* DOC: minor copyedits
Co-authored-by: Sayed Adel <seiko@imavr.com>
Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
Diffstat (limited to 'doc/source/reference/simd/simd-optimizations-tables.inc')
-rw-r--r-- | doc/source/reference/simd/simd-optimizations-tables.inc | 110 |
1 files changed, 110 insertions, 0 deletions
diff --git a/doc/source/reference/simd/simd-optimizations-tables.inc b/doc/source/reference/simd/simd-optimizations-tables.inc
new file mode 100644
index 000000000..d5b82ee0c
--- /dev/null
+++ b/doc/source/reference/simd/simd-optimizations-tables.inc
@@ -0,0 +1,110 @@
+.. generated via source/reference/simd/simd-optimizations.py
+
+``X86`` - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ======== =================================================================================================================
+    Name     Implies
+    ======== =================================================================================================================
+    SSE      ``SSE`` ``SSE2``
+    SSE2     ``SSE`` ``SSE2``
+    SSE3     ``SSE`` ``SSE2``
+    SSSE3    ``SSE`` ``SSE2`` ``SSE3``
+    SSE41    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3``
+    POPCNT   ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41``
+    SSE42    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT``
+    AVX      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42``
+    XOP      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    FMA4     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    F16C     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    FMA3     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
+    AVX2     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
+    AVX512F  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2``
+    AVX512CD ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F``
+    ======== =================================================================================================================
+
+``X86`` - Group names
+~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ========== ===================================================== ===========================================================================================================================================================================
+    Name       Gather                                                Implies
+    ========== ===================================================== ===========================================================================================================================================================================
+    AVX512_KNL ``AVX512ER`` ``AVX512PF``                             ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
+    AVX512_KNM ``AVX5124FMAPS`` ``AVX5124VNNIW`` ``AVX512VPOPCNTDQ`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_KNL``
+    AVX512_SKX ``AVX512VL`` ``AVX512BW`` ``AVX512DQ``                ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
+    AVX512_CLX ``AVX512VNNI``                                        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
+    AVX512_CNL ``AVX512IFMA`` ``AVX512VBMI``                         ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
+    AVX512_ICL ``AVX512VBMI2`` ``AVX512BITALG`` ``AVX512VPOPCNTDQ``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX`` ``AVX512_CLX`` ``AVX512_CNL``
+    ========== ===================================================== ===========================================================================================================================================================================
+
+``IBM/POWER`` ``big-endian`` - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ==== ================
+    Name Implies
+    ==== ================
+    VSX
+    VSX2 ``VSX``
+    VSX3 ``VSX`` ``VSX2``
+    ==== ================
+
+``IBM/POWER`` ``little-endian mode`` - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ==== ================
+    Name Implies
+    ==== ================
+    VSX  ``VSX`` ``VSX2``
+    VSX2 ``VSX`` ``VSX2``
+    VSX3 ``VSX`` ``VSX2``
+    ==== ================
+
+``ARMHF`` - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ========== ===========================================================
+    Name       Implies
+    ========== ===========================================================
+    NEON
+    NEON_FP16  ``NEON``
+    NEON_VFPV4 ``NEON`` ``NEON_FP16``
+    ASIMD      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
+    ASIMDHP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMDDP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMDFHM   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ========== ===========================================================
+
+``ARM64`` ``AARCH64`` - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. table::
+    :align: left
+
+    ========== ===========================================================
+    Name       Implies
+    ========== ===========================================================
+    NEON       ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    NEON_FP16  ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    NEON_VFPV4 ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMD      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMDHP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMDDP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ASIMDFHM   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ========== ===========================================================
+
+
\ No newline at end of file
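
Note on reading the tables in this diff: each row lists the features that the named feature implies, so a feature appears to be usable only when everything in its "Implies" column is available too. Below is a minimal Python sketch of that reading, using three rows copied from the ``X86`` CPU feature table. The names `X86_IMPLIES`, `enabled_by`, and `is_supported` are hypothetical and are not part of NumPy's build API or of `CCompilerOpt`.

```python
# Rows copied from the X86 "CPU feature names" table in this diff.
# X86_IMPLIES, enabled_by and is_supported are illustrative names only;
# they are assumptions for this sketch, not NumPy functions.
X86_IMPLIES = {
    "SSE41": ["SSE", "SSE2", "SSE3", "SSSE3"],
    "AVX": ["SSE", "SSE2", "SSE3", "SSSE3", "SSE41", "POPCNT", "SSE42"],
    "AVX2": ["SSE", "SSE2", "SSE3", "SSSE3", "SSE41", "POPCNT", "SSE42",
             "AVX", "F16C"],
}


def enabled_by(feature):
    """Return the feature together with everything its table row implies."""
    return {feature, *X86_IMPLIES[feature]}


def is_supported(feature, cpu_features):
    """True if the CPU reports the feature and all of its implied features."""
    return enabled_by(feature) <= set(cpu_features)
```

For instance, under this reading `is_supported("AVX2", {"SSE", "SSE2"})` is false, because the `AVX2` row also lists `AVX` and `F16C` among others.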