RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Tamar Christina Thu, 11 Jul 2024 06:32:43 -0700

> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Thursday, July 11, 2024 1:10 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; bin.ch...@linux.alibaba.com
> Subject: RE: [PATCH][ivopts]: use affine_tree when comparing IVs during 
> candidate
> selection [PR114932]
> 
> On Wed, 10 Jul 2024, Tamar Christina wrote:
> 
> > > > > I might also point back to the idea I threw in somewhere, adding
> > > > > OEP_VALUE (or a better name) to the set of flags accepted by
> > > > > operand_equal_p.  You mentioned hashing IIRC but I don't see the 
> > > > > patches
> > > > > touching hashing?
> > > > >
> > > >
> > > > Yes, That can indeed be done with this approach.  The hashing was that
> before I
> > > > was trying to prevent the "duplicate" IV expressions from being created 
> > > > in the
> > > > first place by modifying get_loop_invariant_expr.
> > > >
> > > > This function looks up if we have already seen a particular IV 
> > > > expression and if
> > > > we have it just returns that expression.  However after reading more of 
> > > > the
> code
> > > > I realized this wasn't the right approach, as without also dealing with 
> > > > the
> > > candidates
> > > > we'd end up creating IV expression that can't be handled by any 
> > > > candidate.
> > > >
> > > > IVops would just give up then.   Reading the code it seems that
> > > get_loop_invariant_expr
> > > > is just there to prevent blatant duplicates.  i.e. it treats `(signed) 
> > > > a` and `a` as
> the
> > > same.
> > > >
> > > > This is also why I think that everywhere else *has* to continue 
> > > > stripping the
> > > expression.
> > > >
> > > > On a note from Richard S that he thought IVopts already had some code to
> deal
> > > with
> > > > expressions that differ only in signs led me to take a different 
> > > > approach.
> > > >
> > > > The goal wasn't to remove the different sign/unsigned IV expressions, 
> > > > but
> > > instead get
> > > > Then to be servable by the same candidates. i.e. we want them in the 
> > > > same
> > > candidate
> > > > groups and then candidate optimization will just do its thing.
> > > >
> > > > That seemed a more natural fit to how it worked.
> > >
> > > Yeah, I agree that sounds like the better strathegy.
> > >
> > > > Do you want me to try the operand_equal_p approach? Though in this case
> the
> > > issue
> > > > is we not only need to know they're equal, but also need to know the 
> > > > scale
> > > factor.
> > >
> > > For this case yes, but if you'd keep the code as-is, the equal with scale
> > > factor one case would be fixed.  Not a case with different scale factors
> > > though - but conversions "elsewhere" should be handled via the stripping.
> > > So it would work to simply adjust the operand_equal_p check here?
> > >
> > > > get_computation_aff_1 scales the common type IV by the scale we
> determined,
> > > > so I think operand_equal_p would not be very useful here.  But it does 
> > > > look
> like
> > > > constant_multiple_of can just be implemented with
> > > aff_combination_constant_multiple_p.
> > > >
> > > > Should I try?
> > >
> > > You've had the other places where you replace operand_equal_p with
> > > affine-compute and compare.  As said that has some associated cost
> > > as well as a limit on the number of elements after which it resorts
> > > back to operand_equal_p.  So for strict equality tests implementing
> > > a weaker operand_equal_p might be a better solution.
> > >
> >
> > The structural comparison is implemented as a new mode for operand_equal_p
> which
> > compares two expressions ignoring NOP converts (unless their bitsizes 
> > differ)
> > and ignoring constant values, but requiring both operands be a constant.
> >
> > There is one downside compared to affine comparison, in that this approach
> does
> > not deal well with commutative operations. i.e. it does not see a + (b + c) 
> > as
> > equivalent to c + (b + a).
> >
> > This means we lose out on some of the more complicated addressing modes, but
> > with so many operations the address will likely be split anyway and we'll 
> > deal
> > with it then.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > x86_64-pc-linux-gnu -m32, -m64 and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     PR tree-optimization/114932
> >     * fold-const.cc (operand_compare::operand_equal_p): Use it.
> >     (operand_compare::verify_hash_value): Likewise.
> >     * tree-core.h (enum operand_equal_flag): Add OEP_STRUCTURAL_EQ.
> >     * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     PR tree-optimization/114932
> >     * gfortran.dg/addressing-modes_2.f90: New test.
> >
> > -- inline copy of --
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index
> 710d697c0217c784b34f9f9f7b00b1945369076a..3d43020541c082c09416472
> 4da9d17fbb5793237 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -3191,6 +3191,9 @@ operand_compare::operand_equal_p (const_tree
> arg0, const_tree arg1,
> >       precision differences.  */
> >    if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
> >      {
> > +      if (flags & OEP_STRUCTURAL_EQ)
> > +   return true;
> > +
> 
> Hmm, so you ignore all constants?


Yes, because later on it still needs to check that one expression is a multiple 
of the other.
If not while they're in the same group, they'll get different IVs.

Ideally I'd only check if the two expressions differ only in a constant offset 
or multiply.
But I have no idea how to do this from operand_equal_p as I'd need to know what 
the
parent expression is.  I guess I could do extra checks when comparing + and *.

> 
> >        /* Address of INTEGER_CST is not defined; check that we did not 
> > forget
> >      to drop the OEP_ADDRESS_OF flags.  */
> >        gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
> > @@ -3204,7 +3207,8 @@ operand_compare::operand_equal_p (const_tree
> arg0, const_tree arg1,
> >      because they may change the signedness of the arguments.  As pointers
> >      strictly don't have a signedness, require either two pointers or
> >      two non-pointers as well.  */
> > -      if (TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE
> (arg1))
> > +      if ((TYPE_UNSIGNED (TREE_TYPE (arg0)) != TYPE_UNSIGNED (TREE_TYPE
> (arg1))
> > +      && !(flags & OEP_STRUCTURAL_EQ))
> >       || POINTER_TYPE_P (TREE_TYPE (arg0))
> >                          != POINTER_TYPE_P (TREE_TYPE (arg1)))
> 
> and all sign-conversions?

Yes, as long as they don't change bitsize.  That's checked in the lines below.

> 
> I was thinking to implement a "value equivalence" check, so
> (int)((unsigned)i + (unsigned)j) would be equal to i + j for int i and j.
> 

That's not enough for IV ops, what it's after is knowing whether two expressions
are the same but for some constant offset of multiply.  i.e. can one expression
replace the other with come constant known adjustment.

> Doesn't your patch still treat (int)(unsigned)i and i as different?

Sure, but we don't' get those because we've folded everything to
unsigned before in patch 1.

> And won't it treat (unsigned)i / 4 and (int)i / 4 as the same?  I'm not
> sure this is the desired semantic for IVOPTs (or a useful one in general).

Agree, though I didn't expect to have division in addressing mode calculations.
I only expected +,-,*.  Do we actually do /?

> 
> That said, when OEP_STRUCTURAL_EQ ignores a sign-change it can only
> do that for operations where that maintains value-equivalence
> (in terms of twos-complement bit representation).
> 
> OTOH if you ignore all differences in constants that's another thing.

It's a bit of a weird one.  As I mentioned above I only need +,- and *.
But whatever semantics I encode the use of the flag is going to be
specialized.

I can start by figuring out how to ignore the constants only in those
three cases.

Thanks,
Tamar

> 
> >     return false;
> > @@ -4204,7 +4208,7 @@ operand_compare::verify_hash_value (const_tree
> arg0, const_tree arg1,
> >      {
> >        if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
> >     {
> > -     if (arg0 != arg1 && !(flags & OEP_DECL_NAME))
> > +     if (arg0 != arg1 && !(flags & (OEP_DECL_NAME |
> OEP_STRUCTURAL_EQ)))
> >         {
> >           inchash::hash hstate0 (0), hstate1 (0);
> >           hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK);
> > diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> > new file mode 100644
> > index
> 0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c
> 2e24a4973c8539fae
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> > @@ -0,0 +1,20 @@
> > +! { dg-do compile { target aarch64-*-* } }
> > +! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" }
> > +
> > +module a
> > +integer, parameter :: b=3, c=b
> > +contains
> > +subroutine d(block)
> > +integer f, col   , block(:, :, :), e
> > +do f = 1, c
> > +   do col = 1, c
> > +             block(:f,                          :, e()) = do
> > +     end do
> > +  end do
> > +  end
> > +  end
> > +
> > +! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3 
> > IVs:} ivopts
> } }
> > +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2 
> > IVs:} 1
> ivopts } }
> > +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1 
> > IVs:} 1
> ivopts } }
> > +
> > diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> > index
> 27c569c77022227643151fa4a909a95f3d45bf55..fe398d99bcaa9e8fcec45e5e0
> 7d133bc393aa4a7 100644
> > --- a/gcc/tree-core.h
> > +++ b/gcc/tree-core.h
> > @@ -962,7 +962,9 @@ enum operand_equal_flag {
> >    /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations
> >       of the same kind.  Used to compare VLA bounds involving parameters
> >       across redeclarations of the same function.  */
> > -  OEP_DECL_NAME = 512
> > +  OEP_DECL_NAME = 512,
> > +  /* Check for structural equivalence and ignore NOP_CONVERT casts.  */
> > +  OEP_STRUCTURAL_EQ = 1024
> >  };
> >
> >  /* Enum and arrays used for tree allocation stats.
> > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> > index
> c3218a3e8eedbb8d0a7f14c01eeb069cb6024c29..b05e4ac1e63b5c110707a8a3
> ed5e614b7ccc702f 100644
> > --- a/gcc/tree-ssa-loop-ivopts.cc
> > +++ b/gcc/tree-ssa-loop-ivopts.cc
> > @@ -1623,8 +1623,8 @@ record_group_use (struct ivopts_data *data, tree
> *use_p,
> >
> >       /* Check if it has the same stripped base and step.  */
> >       if (operand_equal_p (iv->base_object, use->iv->base_object, 0)
> > -         && operand_equal_p (iv->step, use->iv->step, 0)
> > -         && operand_equal_p (addr_base, use->addr_base, 0))
> > +         && operand_equal_p (iv->step, use->iv->step, OEP_STRUCTURAL_EQ)
> > +         && operand_equal_p (addr_base, use->addr_base,
> OEP_STRUCTURAL_EQ))
> >         break;
> >     }
> >        if (i == data->vgroups.length ())
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH][ivopts]: use affine_tree when comparing IVs during candidate selection [PR114932]

Reply via email to