Changes in this revision:

  * Remove features that were classified as feature creep (Gimple folding
    and rewriting the aarch64/arm dotprod builtin initialization
    routines).  These will be submitted separately later.
  * Add the second mode to the arm back-end pattern that was missed in
    the original submission.
  * Add implementation for internal_fn in `directly_supported_p' for
    convert optabs.
  * Reuse existing iterators in the i386 back-end.
  * Improve ChangeLog entries involving renaming of back-end patterns.
  * Improve tests, including a new test with run-time checks to verify
    correctness.

-----

Given that the specification in the GCC internals manual defines the
{u|s}dot_prod<m> standard name as taking "two signed elements of the same
mode, adding them to a third operand of wider mode", there is currently
ambiguity in the relationship between the mode of the first two
arguments and that of the third.

This vagueness means that, in theory, different modes may be supportable
in the third argument.  This flexibility would allow a given back-end
to add a different number of vectorized products to the accumulator,
e.g. a back-end may provide instructions for both:

  accum += a[0] * b[0]

and

  accum += a[0] * b[0] + a[1] * b[1],

as is now seen in the SVE2.1 extension to AArch64.

In spite of this flexibility, modeling the dot-product operation as a
direct optab means that we have no way to encode both the input and the
accumulator data modes into the back-end pattern name, which prevents us
from harnessing it.

The purpose of this patch series is therefore to remedy this
shortcoming by moving `dot_prod' from its current implementation as a
direct optab to a conversion optab, where we are able to differentiate
between dot products taking the same input mode but resulting in a
different output mode.

Regression-tested on x86_64, aarch64 and armhf.  I'd appreciate help
running relevant tests on the remaining architectures, i.e. arc, mips,
altivec and c6x, to ensure I've not inadvertently broken anything for
those back-ends.
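
As a rough illustration of the mode relationship discussed above, here
is a scalar C sketch.  It is not taken from the patch series and the
function names are hypothetical.  It shows two reductions that share the
same 16-bit input type but accumulate into 32-bit and 64-bit results
respectively.  Whether and how a given target vectorizes either loop
depends on the back-end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: 16-bit inputs accumulated into a 32-bit sum.
   Unsigned types are used so that accumulator wrap-around is well
   defined.  */
uint32_t
dot_u16_to_u32 (const uint16_t *a, const uint16_t *b, size_t n)
{
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (uint32_t) a[i] * b[i];
  return acc;
}

/* Hypothetical example: the same 16-bit inputs, but accumulated into a
   64-bit sum.  The input type is identical to the function above; only
   the accumulator type differs.  */
uint64_t
dot_u16_to_u64 (const uint16_t *a, const uint16_t *b, size_t n)
{
  uint64_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (uint64_t) a[i] * b[i];
  return acc;
}
```

Both loops multiply elements of the same input type, so a pattern name
that encodes only one mode cannot tell them apart, whereas a name
carrying both the input and the accumulator mode can.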

Victor Do Nascimento (10):
  optabs: Make all `*dot_prod_optab's modeled as conversions
  autovectorizer: Add basic support for convert optabs
  aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns
  arm: Fix arm backend-use of (u|s|us)dot_prod patterns
  i386: Fix dot_prod backend patterns for mmx and sse targets
  arc: Adjust dot-product backend patterns
  mips: Adjust dot-product backend patterns
  rs6000: Adjust altivec dot-product backend patterns
  c6x: Adjust dot-product backend patterns
  autovectorizer: Test autovectorization of different dot-prod modes.

 gcc/config/aarch64/aarch64-builtins.cc       |  7 ++
 gcc/config/aarch64/aarch64-simd-builtins.def |  6 +-
 gcc/config/aarch64/aarch64-simd.md           |  9 +-
 .../aarch64/aarch64-sve-builtins-base.cc     | 13 +--
 gcc/config/aarch64/aarch64-sve-builtins.cc   | 17 ++++
 gcc/config/aarch64/aarch64-sve-builtins.h    |  3 +
 gcc/config/aarch64/aarch64-sve.md            |  6 +-
 gcc/config/aarch64/aarch64-sve2.md           |  2 +-
 gcc/config/arc/simdext.md                    |  8 +-
 gcc/config/arm/arm-builtins.cc               | 95 +++++++++++++++++++
 gcc/config/arm/arm-protos.h                  |  3 +
 gcc/config/arm/arm.cc                        |  1 +
 gcc/config/arm/arm_neon_builtins.def         |  3 -
 gcc/config/arm/neon.md                       |  6 +-
 gcc/config/c6x/c6x.md                        |  2 +-
 gcc/config/i386/mmx.md                       | 30 +++---
 gcc/config/i386/sse.md                       | 38 ++++----
 gcc/config/mips/loongson-mmi.md              |  2 +-
 gcc/config/rs6000/altivec.md                 |  4 +-
 gcc/doc/md.texi                              | 46 ++++-----
 gcc/gimple-match-exports.cc                  | 23 +++++
 gcc/gimple-match.h                           |  2 +
 gcc/optabs.cc                                |  3 +-
 gcc/optabs.def                               |  6 +-
 .../gcc.dg/vect/vect-dotprod-twoway.c        | 39 ++++++++
 .../aarch64/sme/vect-dotprod-twoway.c        | 25 +++++
 .../gcc.target/aarch64/vect-dotprod-twoway.c | 65 +++++++++++++
 gcc/testsuite/lib/target-supports.exp        |  8 ++
 gcc/tree-vect-loop.cc                        |  1 +
 gcc/tree-vect-patterns.cc                    | 43 ++++++++-
 30 files changed, 420 insertions(+), 96 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-dotprod-twoway.c

-- 
2.34.1