Tamar Christina <tamar.christ...@arm.com> writes:
> Hi All,
>
> There's a slight mismatch between the vectorizer optabs and the intrinsics
> patterns for NEON.  The vectorizer expects operands[3] and operands[0] to be
> the same but the aarch64 intrinsics expanders expect operands[0] and
> operands[1] to be the same.
>
> This means we need different patterns here.  This adds a separate usdot
> vectorizer pattern which just shuffles around the RTL params.
>
> There's also an inconsistency between the usdot and (u|s)dot intrinsics RTL
> patterns which is not corrected here.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?

Couldn't we just change:

> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index 00d76ea937ace5763746478cbdfadf6479e0b15a..17e059efb80fa86a8a32127ace4fc7f43e2040a8 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -34039,14 +34039,14 @@ __extension__ extern __inline int32x2_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vusdot_s32 (int32x2_t __r, uint8x8_t __a, int8x8_t __b)
>  {
> -  return __builtin_aarch64_usdot_prodv8qi_ssus (__r, __a, __b);
> +  return __builtin_aarch64_usdotv8qi_ssus (__r, __a, __b);

…this to __builtin_aarch64_usdot_prodv8qi_ssus (__a, __b, __r) etc.?
I think that's an OK thing to do when the function is named after
an optab rather than an arm_neon.h intrinsic.
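
As a rough illustration of that reordering, here is a scalar sketch. It is
not the real arm_neon.h or builtin code: the model_* names and the array-based
signatures are made up for this example. It assumes the optab-style function
takes the accumulator as its last input, while the intrinsic-style wrapper
takes it first and only permutes the arguments.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of an unsigned-by-signed dot product:
   the accumulator plus the sum of four byte products.  */
static int32_t
usdot_lane (const uint8_t a[4], const int8_t b[4], int32_t acc)
{
  for (int i = 0; i < 4; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}

/* Hypothetical stand-in for a builtin named after the optab.
   Assumed operand order: multiplicands first, accumulator last.  */
static void
model_usdot_prod_v8qi (int32_t out[2], const uint8_t a[8],
                       const int8_t b[8], const int32_t acc[2])
{
  for (int lane = 0; lane < 2; lane++)
    out[lane] = usdot_lane (a + 4 * lane, b + 4 * lane, acc[lane]);
}

/* Hypothetical stand-in for the arm_neon.h wrapper.  It keeps the
   intrinsic's accumulator-first signature and reorders the arguments
   when calling the optab-style function, so no second pattern is
   needed in this model.  */
static void
model_vusdot_s32 (int32_t out[2], const int32_t r[2],
                  const uint8_t a[8], const int8_t b[8])
{
  model_usdot_prod_v8qi (out, a, b, r);
}
```

Calling model_vusdot_s32 (out, r, a, b) gives the same result as calling
model_usdot_prod_v8qi (out, a, b, r) directly; only the argument order at the
call site differs.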

Thanks,
Richard

>  }
>  
>  __extension__ extern __inline int32x4_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vusdotq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b)
>  {
> -  return __builtin_aarch64_usdot_prodv16qi_ssus (__r, __a, __b);
> +  return __builtin_aarch64_usdotv16qi_ssus (__r, __a, __b);
>  }
>  
>  __extension__ extern __inline int32x2_t
