RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Richard Biener Thu, 11 Jul 2024 05:10:09 -0700

On Wed, 10 Jul 2024, Tamar Christina wrote:

> > > > I might also point back to the idea I threw in somewhere, adding
> > > > OEP_VALUE (or a better name) to the set of flags accepted by
> > > > operand_equal_p.  You mentioned hashing IIRC but I don't see the patches
> > > > touching hashing?
> > > >
> > >
> > > Yes, That can indeed be done with this approach.  The hashing was that 
> > > before I
> > > was trying to prevent the "duplicate" IV expressions from being created 
> > > in the
> > > first place by modifying get_loop_invariant_expr.
> > >
> > > This function looks up if we have already seen a particular IV expression 
> > > and if
> > > we have it just returns that expression.  However after reading more of 
> > > the code
> > > I realized this wasn't the right approach, as without also dealing with 
> > > the
> > candidates
> > > we'd end up creating IV expression that can't be handled by any candidate.
> > >
> > > IVops would just give up then.   Reading the code it seems that
> > get_loop_invariant_expr
> > > is just there to prevent blatant duplicates.  i.e. it treats `(signed) a` 
> > > and `a` as the
> > same.
> > >
> > > This is also why I think that everywhere else *has* to continue stripping 
> > > the
> > expression.
> > >
> > > On a note from Richard S that he thought IVopts already had some code to 
> > > deal
> > with
> > > expressions that differ only in signs led me to take a different approach.
> > >
> > > The goal wasn't to remove the different sign/unsigned IV expressions, but
> > instead get
> > > Then to be servable by the same candidates. i.e. we want them in the same
> > candidate
> > > groups and then candidate optimization will just do its thing.
> > >
> > > That seemed a more natural fit to how it worked.
> > 
> > Yeah, I agree that sounds like the better strathegy.
> > 
> > > Do you want me to try the operand_equal_p approach? Though in this case 
> > > the
> > issue
> > > is we not only need to know they're equal, but also need to know the scale
> > factor.
> > 
> > For this case yes, but if you'd keep the code as-is, the equal with scale
> > factor one case would be fixed.  Not a case with different scale factors
> > though - but conversions "elsewhere" should be handled via the stripping.
> > So it would work to simply adjust the operand_equal_p check here?
> > 
> > > get_computation_aff_1 scales the common type IV by the scale we 
> > > determined,
> > > so I think operand_equal_p would not be very useful here.  But it does 
> > > look like
> > > constant_multiple_of can just be implemented with
> > aff_combination_constant_multiple_p.
> > >
> > > Should I try?
> > 
> > You've had the other places where you replace operand_equal_p with
> > affine-compute and compare.  As said that has some associated cost
> > as well as a limit on the number of elements after which it resorts
> > back to operand_equal_p.  So for strict equality tests implementing
> > a weaker operand_equal_p might be a better solution.
> > 
> 
> The structural comparison is implemented as a new mode for operand_equal_p 
> which
> compares two expressions ignoring NOP converts (unless their bitsizes differ)
> and ignoring constant values, but requiring both operands be a constant.
> 
> There is one downside compared to affine comparison, in that this approach 
> does
> not deal well with commutative operations. i.e. it does not see a + (b + c) as
> equivalent to c + (b + a).
> 
> This means we lose out on some of the more complicated addressing modes, but
> with so many operations the address will likely be split anyway and we'll deal
> with it then.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu -m32, -m64 and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       PR tree-optimization/114932
>       * fold-const.cc (operand_compare::operand_equal_p): Use it.
>       (operand_compare::verify_hash_value): Likewise.
>       * tree-core.h (enum operand_equal_flag): Add OEP_STRUCTURAL_EQ.
>       * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.
> 
> gcc/testsuite/ChangeLog:
> 
>       PR tree-optimization/114932
>       * gfortran.dg/addressing-modes_2.f90: New test.
> 
> -- inline copy of --
> 
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 
> 710d697c0217c784b34f9f9f7b00b1945369076a..3d43020541c082c094164724da9d17fbb5793237
>  100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -3191,6 +3191,9 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>       precision differences.  */
>    if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
>      {
> +      if (flags & OEP_STRUCTURAL_EQ)
> +     return true;
> +


Hmm, so you ignore all constants?

>        /* Address of INTEGER_CST is not defined; check that we did not forget
>        to drop the OEP_ADDRESS_OF flags.  */
>        gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
> @@ -3204,7 +3207,8 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>        because they may change the signedness of the arguments.  As pointers
>        strictly don't have a signedness, require either two pointers or
>        two non-pointers as well.  */
> -      if (TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE 
> (arg1))
> +      if ((TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE 
> (arg1))
> +        && !(flags & OEP_STRUCTURAL_EQ))
>         || POINTER_TYPE_P (TREE_TYPE (arg0))
>                            != POINTER_TYPE_P (TREE_TYPE (arg1)))

and all sign-conversions?

I was thinking to implement a "value equivalence" check, so
(int)((unsigned)i + (unsigned)j) would be equal to i + j for int i and j.

Doesn't your patch still treat (int)(unsigned)i and i as different?
And won't it treat (unsigned)i / 4 and (int)i / 4 as the same?  I'm not
sure this is the desired semantic for IVOPTs (or a useful one in general).

That said, when OEP_STRUCTURAL_EQ ignores a sign-change it can only
do that for operations where that maintains value-equivalence
(in terms of twos-complement bit representation).

OTOH if you ignore all differences in constants that's another thing.

>       return false;
> @@ -4204,7 +4208,7 @@ operand_compare::verify_hash_value (const_tree arg0, 
> const_tree arg1,
>      {
>        if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
>       {
> -       if (arg0 != arg1 && !(flags & OEP_DECL_NAME))
> +       if (arg0 != arg1 && !(flags & (OEP_DECL_NAME | OEP_STRUCTURAL_EQ)))
>           {
>             inchash::hash hstate0 (0), hstate1 (0);
>             hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK);
> diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 
> b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c2e24a4973c8539fae
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> @@ -0,0 +1,20 @@
> +! { dg-do compile { target aarch64-*-* } }
> +! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" }
> +
> +module a
> +integer, parameter :: b=3, c=b
> +contains
> +subroutine d(block)
> +integer f, col   , block(:, :, :), e
> +do f = 1, c
> +   do col = 1, c
> +             block(:f,                          :, e()) = do
> +     end do
> +  end do
> +  end
> +  end
> +
> +! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3 
> IVs:} ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2 
> IVs:} 1 ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1 
> IVs:} 1 ivopts } }
> +
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index 
> 27c569c77022227643151fa4a909a95f3d45bf55..fe398d99bcaa9e8fcec45e5e07d133bc393aa4a7
>  100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -962,7 +962,9 @@ enum operand_equal_flag {
>    /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations
>       of the same kind.  Used to compare VLA bounds involving parameters
>       across redeclarations of the same function.  */
> -  OEP_DECL_NAME = 512
> +  OEP_DECL_NAME = 512,
> +  /* Check for structural equivalence and ignore NOP_CONVERT casts.  */
> +  OEP_STRUCTURAL_EQ = 1024
>  };
>  
>  /* Enum and arrays used for tree allocation stats.
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index 
> c3218a3e8eedbb8d0a7f14c01eeb069cb6024c29..b05e4ac1e63b5c110707a8a3ed5e614b7ccc702f
>  100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -1623,8 +1623,8 @@ record_group_use (struct ivopts_data *data, tree *use_p,
>  
>         /* Check if it has the same stripped base and step.  */
>         if (operand_equal_p (iv->base_object, use->iv->base_object, 0)
> -           && operand_equal_p (iv->step, use->iv->step, 0)
> -           && operand_equal_p (addr_base, use->addr_base, 0))
> +           && operand_equal_p (iv->step, use->iv->step, OEP_STRUCTURAL_EQ)
> +           && operand_equal_p (addr_base, use->addr_base, OEP_STRUCTURAL_EQ))
>           break;
>       }
>        if (i == data->vgroups.length ())
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Reply via email to