> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: Thursday, July 15, 2021 8:35 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; Marcus Shawcroft
> <marcus.shawcr...@arm.com>; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Subject: Re: [PATCH 2/4]AArch64: correct usdot vectorizer and intrinsics
> optabs
> 
> Tamar Christina <tamar.christ...@arm.com> writes:
> > Hi All,
> >
> > There's a slight mismatch between the vectorizer optabs and the
> > intrinsics patterns for NEON.  The vectorizer expects operands[3] and
> > operands[0] to be the same but the aarch64 intrinsics expanders expect
> > operands[0] and operands[1] to be the same.
> >
> > This means we need different patterns here.  This adds a separate
> > usdot vectorizer pattern which just shuffles around the RTL params.
> >
> > There's also an inconsistency between the usdot and (u|s)dot
> > intrinsics RTL patterns which is not corrected here.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> 
> Couldn't we just change:
> 
> > diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> > index 00d76ea937ace5763746478cbdfadf6479e0b15a..17e059efb80fa86a8a32127ace4fc7f43e2040a8 100644
> > --- a/gcc/config/aarch64/arm_neon.h
> > +++ b/gcc/config/aarch64/arm_neon.h
> > @@ -34039,14 +34039,14 @@ __extension__ extern __inline int32x2_t
> > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >  vusdot_s32 (int32x2_t __r, uint8x8_t __a, int8x8_t __b)  {
> > -  return __builtin_aarch64_usdot_prodv8qi_ssus (__r, __a, __b);
> > +  return __builtin_aarch64_usdotv8qi_ssus (__r, __a, __b);
> 
> …this to __builtin_aarch64_usdot_prodv8qi_ssus (__a, __b, __r) etc.?

Not easily. As I mentioned before, the Neon intrinsics assume that
operands[0] and operands[1] are the same, and this goes much further than just
the call in the header.
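
To make the two operand orders concrete, here is a minimal scalar sketch in
plain C. It is not the GCC implementation: the helper names (usdot_ref,
usdot_intrinsic_order, usdot_optab_order, check_orderings_agree) are
hypothetical, and the arrays only model the 64-bit vector operands. It assumes
the usual usdot semantics, where each 32-bit lane accumulates four
unsigned-by-signed byte products.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference for usdot on 64-bit vectors: each 32-bit lane
   accumulates four unsigned-by-signed byte products.  */
static void
usdot_ref (int32_t dest[2], const int32_t acc[2],
           const uint8_t a[8], const int8_t b[8])
{
  for (int lane = 0; lane < 2; lane++)
    {
      int32_t sum = acc[lane];
      for (int i = 0; i < 4; i++)
        sum += (int32_t) a[4 * lane + i] * (int32_t) b[4 * lane + i];
      dest[lane] = sum;
    }
}

/* Intrinsic-style order: operands[1] is the accumulator, so it is the
   operand that is expected to match operands[0].  */
static void
usdot_intrinsic_order (int32_t op0[2], const int32_t op1[2],
                       const uint8_t op2[8], const int8_t op3[8])
{
  usdot_ref (op0, op1, op2, op3);
}

/* Vectorizer-optab-style order: operands[3] is the accumulator, so it is
   the operand that is expected to match operands[0].  */
static void
usdot_optab_order (int32_t op0[2], const uint8_t op1[8],
                   const int8_t op2[8], const int32_t op3[2])
{
  usdot_ref (op0, op3, op1, op2);
}

/* Both orders compute the same values; only the operand positions differ.  */
static void
check_orderings_agree (void)
{
  const int32_t r[2] = { 10, -5 };
  const uint8_t a[8] = { 1, 2, 3, 4, 255, 0, 1, 2 };
  const int8_t b[8] = { 1, -1, 2, -2, -1, 7, 3, 4 };
  int32_t x[2], y[2];

  usdot_intrinsic_order (x, r, a, b);
  usdot_optab_order (y, a, b, r);
  assert (x[0] == y[0] && x[1] == y[1]);
}
```

The arithmetic is identical in both wrappers. What changes is which operand
slot carries the accumulator, and that is the part the patterns disagree on.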

The actual type is determined by the optabs and the C stubs that are generated.

aarch64_init_simd_builtins, which creates the C function stubs, starts
processing arguments from the end and, for non-void functions, assumes that
operands[0] is the return type. So simply moving __r would make it think that
the result type should be uint8x8_t.

I can bypass this, but then I have to write a custom expander in the expand
code to handle it. At that point, is it really worth it?

Tamar

> I think that's an OK thing to do when the function is named after
> an optab rather than an arm_neon.h intrinsic.
> 
> Thanks,
> Richard
> 
> >  }
> >
> >  __extension__ extern __inline int32x4_t
> >  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >  vusdotq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b)
> >  {
> > -  return __builtin_aarch64_usdot_prodv16qi_ssus (__r, __a, __b);
> > +  return __builtin_aarch64_usdotv16qi_ssus (__r, __a, __b);
> >  }
> >
> >  __extension__ extern __inline int32x2_t
