RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Tamar Christina Wed, 10 Jul 2024 03:19:44 -0700

> > > I might also point back to the idea I threw in somewhere, adding
> > > OEP_VALUE (or a better name) to the set of flags accepted by
> > > operand_equal_p.  You mentioned hashing IIRC but I don't see the patches
> > > touching hashing?
> > >
> >
> > Yes, That can indeed be done with this approach.  The hashing was that 
> > before I
> > was trying to prevent the "duplicate" IV expressions from being created in 
> > the
> > first place by modifying get_loop_invariant_expr.
> >
> > This function looks up if we have already seen a particular IV expression 
> > and if
> > we have it just returns that expression.  However after reading more of the 
> > code
> > I realized this wasn't the right approach, as without also dealing with the
> candidates
> > we'd end up creating IV expression that can't be handled by any candidate.
> >
> > IVops would just give up then.   Reading the code it seems that
> get_loop_invariant_expr
> > is just there to prevent blatant duplicates.  i.e. it treats `(signed) a` 
> > and `a` as the
> same.
> >
> > This is also why I think that everywhere else *has* to continue stripping 
> > the
> expression.
> >
> > On a note from Richard S that he thought IVopts already had some code to 
> > deal
> with
> > expressions that differ only in signs led me to take a different approach.
> >
> > The goal wasn't to remove the different sign/unsigned IV expressions, but
> instead get
> > Then to be servable by the same candidates. i.e. we want them in the same
> candidate
> > groups and then candidate optimization will just do its thing.
> >
> > That seemed a more natural fit to how it worked.
> 
> Yeah, I agree that sounds like the better strathegy.
> 
> > Do you want me to try the operand_equal_p approach? Though in this case the
> issue
> > is we not only need to know they're equal, but also need to know the scale
> factor.
> 
> For this case yes, but if you'd keep the code as-is, the equal with scale
> factor one case would be fixed.  Not a case with different scale factors
> though - but conversions "elsewhere" should be handled via the stripping.
> So it would work to simply adjust the operand_equal_p check here?
> 
> > get_computation_aff_1 scales the common type IV by the scale we determined,
> > so I think operand_equal_p would not be very useful here.  But it does look 
> > like
> > constant_multiple_of can just be implemented with
> aff_combination_constant_multiple_p.
> >
> > Should I try?
> 
> You've had the other places where you replace operand_equal_p with
> affine-compute and compare.  As said that has some associated cost
> as well as a limit on the number of elements after which it resorts
> back to operand_equal_p.  So for strict equality tests implementing
> a weaker operand_equal_p might be a better solution.
>


The structural comparison is implemented as a new mode for operand_equal_p which
compares two expressions ignoring NOP converts (unless their bitsizes differ)
and ignoring constant values, but requiring both operands be a constant.

There is one downside compared to affine comparison, in that this approach does
not deal well with commutative operations. i.e. it does not see a + (b + c) as
equivalent to c + (b + a).

This means we lose out on some of the more complicated addressing modes, but
with so many operations the address will likely be split anyway and we'll deal
with it then.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu -m32, -m64 and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        PR tree-optimization/114932
        * fold-const.cc (operand_compare::operand_equal_p): Use it.
        (operand_compare::verify_hash_value): Likewise.
        * tree-core.h (enum operand_equal_flag): Add OEP_STRUCTURAL_EQ.
        * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.

gcc/testsuite/ChangeLog:

        PR tree-optimization/114932
        * gfortran.dg/addressing-modes_2.f90: New test.

-- inline copy of --

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 
710d697c0217c784b34f9f9f7b00b1945369076a..3d43020541c082c094164724da9d17fbb5793237
 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -3191,6 +3191,9 @@ operand_compare::operand_equal_p (const_tree arg0, 
const_tree arg1,
      precision differences.  */
   if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
     {
+      if (flags & OEP_STRUCTURAL_EQ)
+       return true;
+
       /* Address of INTEGER_CST is not defined; check that we did not forget
         to drop the OEP_ADDRESS_OF flags.  */
       gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
@@ -3204,7 +3207,8 @@ operand_compare::operand_equal_p (const_tree arg0, 
const_tree arg1,
         because they may change the signedness of the arguments.  As pointers
         strictly don't have a signedness, require either two pointers or
         two non-pointers as well.  */
-      if (TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE (arg1))
+      if ((TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE (arg1))
+          && !(flags & OEP_STRUCTURAL_EQ))
          || POINTER_TYPE_P (TREE_TYPE (arg0))
                             != POINTER_TYPE_P (TREE_TYPE (arg1)))
        return false;
@@ -4204,7 +4208,7 @@ operand_compare::verify_hash_value (const_tree arg0, 
const_tree arg1,
     {
       if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
        {
-         if (arg0 != arg1 && !(flags & OEP_DECL_NAME))
+         if (arg0 != arg1 && !(flags & (OEP_DECL_NAME | OEP_STRUCTURAL_EQ)))
            {
              inchash::hash hstate0 (0), hstate1 (0);
              hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK);
diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 
b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
new file mode 100644
index 
0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c2e24a4973c8539fae
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
@@ -0,0 +1,20 @@
+! { dg-do compile { target aarch64-*-* } }
+! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" }
+
+module a
+integer, parameter :: b=3, c=b
+contains
+subroutine d(block)
+integer f, col   , block(:, :, :), e
+do f = 1, c
+   do col = 1, c
+             block(:f,                          :, e()) = do
+     end do
+  end do
+  end
+  end
+
+! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3 IVs:} 
ivopts } }
+! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2 
IVs:} 1 ivopts } }
+! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1 
IVs:} 1 ivopts } }
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 
27c569c77022227643151fa4a909a95f3d45bf55..fe398d99bcaa9e8fcec45e5e07d133bc393aa4a7
 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -962,7 +962,9 @@ enum operand_equal_flag {
   /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations
      of the same kind.  Used to compare VLA bounds involving parameters
      across redeclarations of the same function.  */
-  OEP_DECL_NAME = 512
+  OEP_DECL_NAME = 512,
+  /* Check for structural equivalence and ignore NOP_CONVERT casts.  */
+  OEP_STRUCTURAL_EQ = 1024
 };
 
 /* Enum and arrays used for tree allocation stats.
diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 
c3218a3e8eedbb8d0a7f14c01eeb069cb6024c29..b05e4ac1e63b5c110707a8a3ed5e614b7ccc702f
 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -1623,8 +1623,8 @@ record_group_use (struct ivopts_data *data, tree *use_p,
 
          /* Check if it has the same stripped base and step.  */
          if (operand_equal_p (iv->base_object, use->iv->base_object, 0)
-             && operand_equal_p (iv->step, use->iv->step, 0)
-             && operand_equal_p (addr_base, use->addr_base, 0))
+             && operand_equal_p (iv->step, use->iv->step, OEP_STRUCTURAL_EQ)
+             && operand_equal_p (addr_base, use->addr_base, OEP_STRUCTURAL_EQ))
            break;
        }
       if (i == data->vgroups.length ())

rb18566.patch
Description: rb18566.patch

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Reply via email to