On Thu, 21 Nov 2024, Tamar Christina wrote:

> Hi All,
> 
> This commit fixes the failures of complex.exp=fast-math-complex-mls-*.c on the
> GCC 14 branch and some of the ones on the master.
> 
> The current matching just looks for one order for multiplication and was 
> relying
> on canonicalization to always give the right order because of the 
> TWO_OPERANDS.
> 
> However when it comes to the multiplication trying only one order is a bit
> fragile as they can be flipped.
> 
> The failing tests on the branch are:
> 
> #include <complex.h>
> 
> #define TYPE double
> #define N 200
> 
> void fms180snd(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
>                _Complex TYPE c[restrict N]) {
>   for (int i = 0; i < N; i++)
>     c[i] -= a[i] * (b[i] * I * I);
> }
> 
> void fms180fst(_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
>                _Complex TYPE c[restrict N]) {
>   for (int i = 0; i < N; i++)
>     c[i] -= (a[i] * I * I) * b[i];
> }
> 
> The issue is just a small difference in commutative operations.
> we look for {R,R} * {R,I} but found {R,I} * {R,R}.
> 
> Since the DF analysis is cached, we should be able to swap operands and retry
> for multiply cheaply.
> 
> There is a constraint being checked by vect_validate_multiplication for the 
> data
> flow of the operands feeding the multiplications.  So e.g.
> 
> between the nodes:
> 
> note:   node 0x4d1d210 (max_nunits=2, refcnt=3) vector(2) double
> note:   op template: _27 = _10 * _25;
> note:      stmt 0 _27 = _10 * _25;
> note:      stmt 1 _29 = _11 * _25;
> note:   node 0x4d1d060 (max_nunits=2, refcnt=2) vector(2) double
> note:   op template: _26 = _11 * _24;
> note:      stmt 0 _26 = _11 * _24;
> note:      stmt 1 _28 = _10 * _24;
> 
> we require the lanes to come from the same source which
> vect_validate_multiplication checks.  As such it doesn't make sense to flip 
> them
> individually because that would invalidate the earlier linear_loads_p checks
> which have validated that the arguments all come from the same datarefs.
> 
> This patch thus flips the operands in unison to still maintain this invariant,
> but also honor the commutative nature of multiplication.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues on master and
> GCC 14 branch.
> 
> Ok for master and backport to GCC 14?

OK for master and branch (after a week or so).

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       PR tree-optimizations/116463
>       * tree-vect-slp-patterns.cc (complex_mul_pattern::matches,
>       complex_fms_pattern::matches): Try swapping operands on multiply.
> 
> ---
> diff --git a/gcc/tree-vect-slp-patterns.cc b/gcc/tree-vect-slp-patterns.cc
> index 
> d62682be43c98f2d16af8bf6a6a049c73100ef16..2535d46db3e84700d7591cc8d1bae3b0d098a803
>  100644
> --- a/gcc/tree-vect-slp-patterns.cc
> +++ b/gcc/tree-vect-slp-patterns.cc
> @@ -1076,7 +1076,15 @@ complex_mul_pattern::matches (complex_operation_t op,
>    enum _conj_status status;
>    if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
>                                    right_op, false, &status))
> -    return IFN_LAST;
> +    {
> +      /* Try swapping the order and re-trying since multiplication is
> +      commutative.  */
> +      std::swap (left_op[0], left_op[1]);
> +      std::swap (right_op[0], right_op[1]);
> +      if (!vect_validate_multiplication (perm_cache, compat_cache, left_op,
> +                                      right_op, false, &status))
> +     return IFN_LAST;
> +    }
>  
>    if (status == CONJ_NONE)
>      {
> @@ -1293,7 +1301,15 @@ complex_fms_pattern::matches (complex_operation_t op,
>    enum _conj_status status;
>    if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
>                                    left_op, true, &status))
> -    return IFN_LAST;
> +    {
> +      /* Try swapping the order and re-trying since multiplication is
> +      commutative.  */
> +      std::swap (left_op[0], left_op[1]);
> +      std::swap (right_op[0], right_op[1]);
> +      if (!vect_validate_multiplication (perm_cache, compat_cache, right_op,
> +                                      left_op, true, &status))
> +     return IFN_LAST;
> +    }
>  
>    if (status == CONJ_NONE)
>      ifn = IFN_COMPLEX_FMS;
> 
> 
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to