On Thu, Jul 11, 2024 at 9:03 AM Tamar Christina <tamar.christ...@arm.com> wrote:
>
> Hi Victor,
>
> > -----Original Message-----
> > From: Victor Do Nascimento <victor.donascime...@arm.com>
> > Sent: Wednesday, July 10, 2024 3:06 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
> > <richard.earns...@arm.com>; Victor Do Nascimento
> > <vicdo...@e125768.arm.com>
> > Subject: [PATCH 10/10] autovectorizer: Test autovectorization of different dot-prod modes.
> >
> > From: Victor Do Nascimento <vicdo...@e125768.arm.com>
> >
> > Given the novel treatment of the dot product optab as a conversion we
> > are now able to target, for a given architecture, different
> > relationships between output modes and input modes.
> >
> > This is made clearer by way of example. Previously, on AArch64, the
> > following loop was vectorizable:
> >
> > uint32_t udot4(int n, uint8_t* data) {
> >   uint32_t sum = 0;
> >   for (int i=0; i<n; i+=1)
> >     sum += data[i] * data[i];
> >   return sum;
> > }
> >
> > while the following wasn't:
> >
> > uint32_t udot2(int n, uint16_t* data) {
> >   uint32_t sum = 0;
> >   for (int i=0; i<n; i+=1)
> >     sum += data[i] * data[i];
> >   return sum;
> > }
> >
> > Under the new treatment of the dot product optab, they are both now
> > vectorizable.
> >
> > This adds the relevant target-agnostic check to ensure this behaviour
> > in the autovectorizer.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/vect/vect-dotprod-twoway.c: New.
> > ---
> >  .../gcc.dg/vect/vect-dotprod-twoway.c         | 38 +++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> > b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> > new file mode 100644
> > index 00000000000..5caa7b81fce
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target vect_int } */
> > +/* Ensure both the two-way and four-way dot products are autovectorized.  */
> > +#include <stdint.h>
> > +
> > +uint32_t udot4(int n, uint8_t* data) {
> > +  uint32_t sum = 0;
> > +  for (int i=0; i<n; i+=1) {
> > +    sum += data[i] * data[i];
> > +  }
> > +  return sum;
> > +}
> > +
> > +int32_t sdot4(int n, int8_t* data) {
> > +  int32_t sum = 0;
> > +  for (int i=0; i<n; i+=1) {
> > +    sum += data[i] * data[i];
> > +  }
> > +  return sum;
> > +}
> > +
> > +uint32_t udot2(int n, uint16_t* data) {
> > +  uint32_t sum = 0;
> > +  for (int i=0; i<n; i+=1) {
> > +    sum += data[i] * data[i];
> > +  }
> > +  return sum;
> > +}
> > +
> > +int32_t sdot2(int n, int16_t* data) {
> > +  int32_t sum = 0;
> > +  for (int i=0; i<n; i+=1) {
> > +    sum += data[i] * data[i];
> > +  }
> > +  return sum;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
>
> These tests only test that you have vectorized the loops, not that the loop
> was vectorized using dotprod.  I think you want to have a scan for
> DOT_PROD_EXPR as well, gated to the targets that support two-way dot prod.
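
A minimal sketch of what such a gated scan could look like, shown for the udot2 case only. The effective-target keyword vect_udot_hi is used here as an assumption about the right gate; the series may need a different or new keyword for two-way dot products, and the exact dump string should be checked against the actual vect dump.

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
#include <stdint.h>

/* Same loop shape as udot2 in the patch: 16-bit inputs, 32-bit sum.  */
uint32_t udot2(int n, uint16_t *data) {
  uint32_t sum = 0;
  for (int i = 0; i < n; i += 1)
    sum += data[i] * data[i];
  return sum;
}

/* Assumption: vect_udot_hi is the appropriate gate for this target
   capability; substitute whichever keyword the series settles on.  */
/* { dg-final { scan-tree-dump "DOT_PROD_EXPR" "vect" { target vect_udot_hi } } } */
```

With the scan gated this way, targets without the capability would still run the existing "vectorized 1 loops" check but skip the DOT_PROD_EXPR one.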

Ideally they'd also verify correctness, i.e. be runtime tests that check the
computed results.
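
As a rough, standalone sketch of what such a runtime check might look like (not the testsuite's own conventions): the loops under test are compared against a scalar reference accumulated through a volatile, which is only one assumed way of keeping the reference from being vectorized the same way. The helper name check_dotprods, the array size N and the input values are all hypothetical.

```c
#include <stdint.h>

#define N 64  /* hypothetical array length, chosen for illustration */

/* Loops under test, same shape as in the patch.  */
uint32_t udot2(int n, uint16_t *data) {
  uint32_t sum = 0;
  for (int i = 0; i < n; i += 1)
    sum += data[i] * data[i];
  return sum;
}

int32_t sdot4(int n, int8_t *data) {
  int32_t sum = 0;
  for (int i = 0; i < n; i += 1)
    sum += data[i] * data[i];
  return sum;
}

/* Hypothetical helper: returns the number of mismatches between the
   loops under test and a scalar reference.  */
int check_dotprods(void) {
  uint16_t u16[N];
  int8_t s8[N];
  /* volatile accumulators: an assumed way to keep the reference scalar.  */
  volatile uint32_t uref = 0;
  volatile int32_t sref = 0;
  int failures = 0;

  for (int i = 0; i < N; i++) {
    /* Small values so the promoted int products cannot overflow.  */
    u16[i] = (uint16_t)(i * 3 + 1);
    s8[i] = (int8_t)(i - 32);
    uref = uref + (uint32_t)u16[i] * u16[i];
    sref = sref + (int32_t)s8[i] * s8[i];
  }

  if (udot2(N, u16) != uref)
    failures++;
  if (sdot4(N, s8) != sref)
    failures++;
  return failures;
}
```

A dg-do run version would presumably call something like this from main and abort on a nonzero result, alongside the dump scans.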

> Cheers,
> Tamar
>
> > --
> > 2.34.1
>
