Changes in this revision:

* Remove features that were classified as feature creep (GIMPLE
folding and rewriting the aarch64/arm dotprod builtin initialization
routines).  These will be submitted separately later.
* Add the second mode that was missing from an arm-backend pattern in
the original submission.
* Add implementation for internal_fn in `directly_supported_p' for
convert optabs.
* Reuse existing iterators in the i386 backend.
* Improve ChangeLog entries involving renaming of back-end patterns.
* Improve tests, including new test with run-time checks to verify
correctness.

-----

The GCC internals manual defines the {u|s}dot_prod<m> standard name as
taking "two signed elements of the same mode, adding them to a third
operand of wider mode".  This wording leaves the relationship between
the mode of the first two arguments and that of the third ambiguous.

This vagueness means that, in theory, different modes may be
supported in the third argument.  This flexibility would allow a
given backend to add a different number of vectorized products to the
accumulator, e.g. a backend may provide instructions for both:

  accum += a[0] * b[0]

and

  accum += a[0] * b[0] + a[1] * b[1],

as is now seen in the SVE2.1 extension to AArch64.  In spite of the
aforementioned flexibility, modeling the dot-product operation as a
direct optab means that we have no way to encode both the input and
the accumulator data modes into the backend pattern name, which
prevents us from harnessing this flexibility.
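
To make the distinction concrete, here is a scalar C sketch of the
same idea with two and four products per accumulator lane.  It is
illustrative only and not part of the patch: the function names and
the vector length N_IN are hypothetical.  Both functions take the same
16-bit input element type but accumulate into lanes of different
width, which is the case a pattern name keyed on a single mode cannot
tell apart.

```c
#include <stdint.h>

/* Number of 16-bit input elements per "vector" (assumed for
   illustration only).  */
#define N_IN 8

/* Two-way dot product: each 32-bit accumulator lane absorbs two
   adjacent 16-bit products.  */
void
dot2_i16_i32 (int32_t acc[N_IN / 2], const int16_t a[N_IN],
	      const int16_t b[N_IN])
{
  for (int i = 0; i < N_IN / 2; i++)
    for (int j = 0; j < 2; j++)
      acc[i] += (int32_t) a[2 * i + j] * b[2 * i + j];
}

/* Four-way dot product: each 64-bit accumulator lane absorbs four
   adjacent 16-bit products.  */
void
dot4_i16_i64 (int64_t acc[N_IN / 4], const int16_t a[N_IN],
	      const int16_t b[N_IN])
{
  for (int i = 0; i < N_IN / 4; i++)
    for (int j = 0; j < 4; j++)
      acc[i] += (int64_t) a[4 * i + j] * b[4 * i + j];
}
```

With identical inputs, the first function fills four 32-bit lanes and
the second fills two 64-bit lanes, so the input mode alone does not
identify which operation is meant.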

The purpose of this patch series is therefore to remedy this
shortcoming by moving `dot_prod' from its current implementation as a
direct optab to a conversion optab, which lets us differentiate
between dot products that take the same input mode but produce a
different output mode.

Regression-tested on x86_64, aarch64 and armhf.  I'd appreciate help
running relevant tests on the remaining architectures, i.e. arc, mips,
altivec and c6x to ensure I've not inadvertently broken anything for
those back-ends.

Victor Do Nascimento (10):
  optabs: Make all `*dot_prod_optab's modeled as conversions
  autovectorizer: Add basic support for convert optabs
  aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns
  arm: Fix arm backend-use of (u|s|us)dot_prod patterns
  i386: Fix dot_prod backend patterns for mmx and sse targets
  arc: Adjust dot-product backend patterns
  mips:  Adjust dot-product backend patterns
  rs6000: Adjust altivec dot-product backend patterns
  c6x:  Adjust dot-product backend patterns
  autovectorizer: Test autovectorization of different dot-prod modes.

 gcc/config/aarch64/aarch64-builtins.cc        |  7 ++
 gcc/config/aarch64/aarch64-simd-builtins.def  |  6 +-
 gcc/config/aarch64/aarch64-simd.md            |  9 +-
 .../aarch64/aarch64-sve-builtins-base.cc      | 13 +--
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 17 ++++
 gcc/config/aarch64/aarch64-sve-builtins.h     |  3 +
 gcc/config/aarch64/aarch64-sve.md             |  6 +-
 gcc/config/aarch64/aarch64-sve2.md            |  2 +-
 gcc/config/arc/simdext.md                     |  8 +-
 gcc/config/arm/arm-builtins.cc                | 95 +++++++++++++++++++
 gcc/config/arm/arm-protos.h                   |  3 +
 gcc/config/arm/arm.cc                         |  1 +
 gcc/config/arm/arm_neon_builtins.def          |  3 -
 gcc/config/arm/neon.md                        |  6 +-
 gcc/config/c6x/c6x.md                         |  2 +-
 gcc/config/i386/mmx.md                        | 30 +++---
 gcc/config/i386/sse.md                        | 38 ++++----
 gcc/config/mips/loongson-mmi.md               |  2 +-
 gcc/config/rs6000/altivec.md                  |  4 +-
 gcc/doc/md.texi                               | 46 ++++-----
 gcc/gimple-match-exports.cc                   | 23 +++++
 gcc/gimple-match.h                            |  2 +
 gcc/optabs.cc                                 |  3 +-
 gcc/optabs.def                                |  6 +-
 .../gcc.dg/vect/vect-dotprod-twoway.c         | 39 ++++++++
 .../aarch64/sme/vect-dotprod-twoway.c         | 25 +++++
 .../gcc.target/aarch64/vect-dotprod-twoway.c  | 65 +++++++++++++
 gcc/testsuite/lib/target-supports.exp         |  8 ++
 gcc/tree-vect-loop.cc                         |  1 +
 gcc/tree-vect-patterns.cc                     | 43 ++++++++-
 30 files changed, 420 insertions(+), 96 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-dotprod-twoway.c

-- 
2.34.1
