author | Sayed Adel <seiko@imavr.com> | 2020-07-13 05:28:10 +0200 |
---|---|---|
committer | Sayed Adel <seiko@imavr.com> | 2020-07-14 09:13:59 +0200 |
commit | f817de8f058bd7503bbc94e153279e926bf15603 (patch) | |
tree | 6b0614c882a3ff14d23807e3b9c93ccb74e0de24 /doc/source/reference/simd/simd-optimizations-tables.inc | |
parent | 151c0aae81c16627cad79e9e81424c2221b94970 (diff) | |
download | numpy-f817de8f058bd7503bbc94e153279e926bf15603.tar.gz |
DOC: improve SIMD features tables
- improve the tables generator(style/simplify)
- show the differences between the compilers
- add an explanation about interrelated CPU features
Diffstat (limited to 'doc/source/reference/simd/simd-optimizations-tables.inc')
-rw-r--r-- | doc/source/reference/simd/simd-optimizations-tables.inc | 165 |
1 files changed, 79 insertions, 86 deletions
diff --git a/doc/source/reference/simd/simd-optimizations-tables.inc b/doc/source/reference/simd/simd-optimizations-tables.inc
index d5b82ee0c..f038a91e1 100644
--- a/doc/source/reference/simd/simd-optimizations-tables.inc
+++ b/doc/source/reference/simd/simd-optimizations-tables.inc
@@ -1,110 +1,103 @@
 .. generated via source/reference/simd/simd-optimizations.py
 
-``X86`` - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
+x86 - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~
 .. table::
     :align: left
 
-    ======== =================================================================================================================
-    Name     Implies
-    ======== =================================================================================================================
-    SSE      ``SSE`` ``SSE2``
-    SSE2     ``SSE`` ``SSE2``
-    SSE3     ``SSE`` ``SSE2``
-    SSSE3    ``SSE`` ``SSE2`` ``SSE3``
-    SSE41    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3``
-    POPCNT   ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41``
-    SSE42    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT``
-    AVX      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42``
-    XOP      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
-    FMA4     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
-    F16C     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
-    FMA3     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
-    AVX2     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
-    AVX512F  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2``
-    AVX512CD ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F``
-    ======== =================================================================================================================
-
-``X86`` - Group names
-~~~~~~~~~~~~~~~~~~~~~
-
+    ============ =================================================================================================================
+    Name         Implies
+    ============ =================================================================================================================
+    ``SSE``      ``SSE2``
+    ``SSE2``     ``SSE``
+    ``SSE3``     ``SSE`` ``SSE2``
+    ``SSSE3``    ``SSE`` ``SSE2`` ``SSE3``
+    ``SSE41``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3``
+    ``POPCNT``   ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41``
+    ``SSE42``    ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT``
+    ``AVX``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42``
+    ``XOP``      ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    ``FMA4``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    ``F16C``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX``
+    ``FMA3``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
+    ``AVX2``     ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C``
+    ``AVX512F``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2``
+    ``AVX512CD`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F``
+    ============ =================================================================================================================
+
+x86 - Group names
+~~~~~~~~~~~~~~~~~
 .. table::
     :align: left
 
-    ========== ===================================================== ===========================================================================================================================================================================
-    Name       Gather                                                Implies
-    ========== ===================================================== ===========================================================================================================================================================================
-    AVX512_KNL ``AVX512ER`` ``AVX512PF``                             ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
-    AVX512_KNM ``AVX5124FMAPS`` ``AVX5124VNNIW`` ``AVX512VPOPCNTDQ`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_KNL``
-    AVX512_SKX ``AVX512VL`` ``AVX512BW`` ``AVX512DQ``                ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
-    AVX512_CLX ``AVX512VNNI``                                        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
-    AVX512_CNL ``AVX512IFMA`` ``AVX512VBMI``                         ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
-    AVX512_ICL ``AVX512VBMI2`` ``AVX512BITALG`` ``AVX512VPOPCNTDQ``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX`` ``AVX512_CLX`` ``AVX512_CNL``
-    ========== ===================================================== ===========================================================================================================================================================================
-
-``IBM/POWER`` ``big-endian`` - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
+    ============== ===================================================== ===========================================================================================================================================================================
+    Name           Gather                                                Implies
+    ============== ===================================================== ===========================================================================================================================================================================
+    ``AVX512_KNL`` ``AVX512ER`` ``AVX512PF``                             ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
+    ``AVX512_KNM`` ``AVX5124FMAPS`` ``AVX5124VNNIW`` ``AVX512VPOPCNTDQ`` ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_KNL``
+    ``AVX512_SKX`` ``AVX512VL`` ``AVX512BW`` ``AVX512DQ``                ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD``
+    ``AVX512_CLX`` ``AVX512VNNI``                                        ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
+    ``AVX512_CNL`` ``AVX512IFMA`` ``AVX512VBMI``                         ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX``
+    ``AVX512_ICL`` ``AVX512VBMI2`` ``AVX512BITALG`` ``AVX512VPOPCNTDQ``  ``SSE`` ``SSE2`` ``SSE3`` ``SSSE3`` ``SSE41`` ``POPCNT`` ``SSE42`` ``AVX`` ``F16C`` ``FMA3`` ``AVX2`` ``AVX512F`` ``AVX512CD`` ``AVX512_SKX`` ``AVX512_CLX`` ``AVX512_CNL``
+    ============== ===================================================== ===========================================================================================================================================================================
+
+IBM/POWER big-endian - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. table::
     :align: left
 
-    ==== ================
-    Name Implies
-    ==== ================
-    VSX
-    VSX2 ``VSX``
-    VSX3 ``VSX`` ``VSX2``
-    ==== ================
-
-``IBM/POWER`` ``little-endian mode`` - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    ======== ================
+    Name     Implies
+    ======== ================
+    ``VSX``
+    ``VSX2`` ``VSX``
+    ``VSX3`` ``VSX`` ``VSX2``
+    ======== ================
 
+IBM/POWER little-endian - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. table::
     :align: left
 
-    ==== ================
-    Name Implies
-    ==== ================
-    VSX  ``VSX`` ``VSX2``
-    VSX2 ``VSX`` ``VSX2``
-    VSX3 ``VSX`` ``VSX2``
-    ==== ================
+    ======== ================
+    Name     Implies
+    ======== ================
+    ``VSX``  ``VSX2``
+    ``VSX2`` ``VSX``
+    ``VSX3`` ``VSX`` ``VSX2``
+    ======== ================
 
-``ARMHF`` - CPU feature names
+ARMv7/A32 - CPU feature names
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
 .. table::
     :align: left
 
-    ========== ===========================================================
-    Name       Implies
-    ========== ===========================================================
-    NEON
-    NEON_FP16  ``NEON``
-    NEON_VFPV4 ``NEON`` ``NEON_FP16``
-    ASIMD      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
-    ASIMDHP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMDDP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMDFHM   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
-    ========== ===========================================================
-
-``ARM64`` ``AARCH64`` - CPU feature names
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
+    ============== ===========================================================
+    Name           Implies
+    ============== ===========================================================
+    ``NEON``
+    ``NEON_FP16``  ``NEON``
+    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16``
+    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
+    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ============== ===========================================================
+
+ARMv8/A64 - CPU feature names
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. table::
     :align: left
 
-    ========== ===========================================================
-    Name       Implies
-    ========== ===========================================================
-    NEON       ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    NEON_FP16  ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    NEON_VFPV4 ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMD      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMDHP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMDDP    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
-    ASIMDFHM   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
-    ========== ===========================================================
+    ============== ===========================================================
+    Name           Implies
+    ============== ===========================================================
+    ``NEON``       ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ``NEON_FP16``  ``NEON`` ``NEON_VFPV4`` ``ASIMD``
+    ``NEON_VFPV4`` ``NEON`` ``NEON_FP16`` ``ASIMD``
+    ``ASIMD``      ``NEON`` ``NEON_FP16`` ``NEON_VFPV4``
+    ``ASIMDHP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ``ASIMDDP``    ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD``
+    ``ASIMDFHM``   ``NEON`` ``NEON_FP16`` ``NEON_VFPV4`` ``ASIMD`` ``ASIMDHP``
+    ============== ===========================================================
 
-
\ No newline at end of file
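
The updated x86 table no longer lists a feature among its own implications, and interrelated features such as ``SSE`` and ``SSE2`` now simply imply each other. One way to read the "Implies" column is as the transitive closure of a smaller set of direct links. The sketch below illustrates that reading only: the `DIRECT_IMPLIES` mapping and the `implied_features` helper are hypothetical, hand-written for this example from a subset of the x86 rows, and are not taken from the generator script or from NumPy's build code.

```python
# Hypothetical direct links, chosen so that their closure reproduces
# a subset of the x86 "Implies" column shown in the diff above.
DIRECT_IMPLIES = {
    "SSE": ["SSE2"],
    "SSE2": ["SSE"],      # SSE and SSE2 imply each other
    "SSE3": ["SSE2"],
    "SSSE3": ["SSE3"],
    "SSE41": ["SSSE3"],
    "POPCNT": ["SSE41"],
    "SSE42": ["POPCNT"],
    "AVX": ["SSE42"],
    "F16C": ["AVX"],
    "FMA3": ["F16C"],
    "AVX2": ["F16C"],
}


def implied_features(name, direct=DIRECT_IMPLIES):
    """Return every feature reachable from `name`, excluding `name` itself."""
    seen = set()
    stack = list(direct.get(name, []))
    while stack:
        feature = stack.pop()
        # Skip the starting feature (mutual implication) and repeats.
        if feature == name or feature in seen:
            continue
        seen.add(feature)
        stack.extend(direct.get(feature, []))
    return seen


print(sorted(implied_features("SSE")))   # ['SSE2']
print(sorted(implied_features("SSE3")))  # ['SSE', 'SSE2']
```

Excluding the starting feature from its own result mirrors the change in the new tables, where for example ``SSE2`` lists only ``SSE``.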