RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

Richard Biener Thu, 05 Dec 2024 05:11:46 -0800

On Tue, 3 Dec 2024, Tamar Christina wrote:

> > > This patch implements a new OEP flag called OEP_STRUCTURAL_EQ.  This flag 
> > > will
> > > check if the operands would produce the same bit values after the 
> > > computations
> > > even if the final sign is different.
> > 
> > I think the name is badly chosen - we already have OEP_LEXICOGRAPHIC and
> > OEP_BITWISE.  I would suggest to use OEP_ASSUME_TWOS_COMPLEMENT
> > or OEP_ASSUME_WRAPV.
> > 
> 
> I went with OEP_ASSUME_WRAPV.. saves chars on the 80 char line limit :)
> 
> > > -  if (!(flags & OEP_ADDRESS_OF))
> > > +  if ((flags & OEP_STRUCTURAL_EQ)
> > > +      && (CONVERT_EXPR_P (arg0) || CONVERT_EXPR_P (arg1)))
> > > +    {
> > > +      const_tree t_arg0 = arg0;
> > > +      const_tree t_arg1 = arg1;
> > > +      STRIP_NOPS (arg0);
> > > +      STRIP_NOPS (arg1);
> > > +      /* Only recurse if the conversion was one that was valid to strip. 
> > >  */
> > > +      if (t_arg0 != arg0 || t_arg1 != arg1)
> > > +       return operand_equal_p (type0, arg0, type1, arg1, flags);
> > 
> > (*)
> > 
> > > +    }
> > > +
> > > +  if (!(flags & (OEP_ADDRESS_OF | OEP_STRUCTURAL_EQ)))
> > >      {
> > >        /* If both types don't have the same signedness, then we can't 
> > > consider
> > >          them equal.  We must check this before the STRIP_NOPS calls
> > > @@ -3439,6 +3451,22 @@ operand_compare::operand_equal_p (tree type0,
> > const_tree arg0,
> > >    if (flags & (OEP_ONLY_CONST | OEP_BITWISE))
> > >      return false;
> > >
> > > +  /* Check if we are checking an operation where the two's compliment 
> > > bitwise
> > > +     representation of the result is not the same between signed and
> > > +     unsigned arithmetic.  */
> > > +  switch (TREE_CODE (arg0))
> > > +  {
> > > +  case PLUS_EXPR:
> > > +  case MINUS_EXPR:
> > > +  case MULT_EXPR:
> > > +    break;
> > > +
> > > +  default:
> > > +    /* Clear the structural comparison bit.  */
> > > +    flags &= ~OEP_STRUCTURAL_EQ;
> > 
> > So this handling means that a / (int)((unsigned)b - (unsigned)c)) and
> > a / (b - c) are not equal with respect to OEP_STRUCTURAL_EQ because
> > we drop that flag when processing the division and thus the recursion on
> > the minus operands doesn't have it any more.
> 
> Yeah I did do this intentionally, since I don’t expect division in IV 
> computations
> anyway.  But given that this is supposed to be a more generic helper I've 
> extended
> it properly now.
> 
> > 
> > So shouldn't this and (*) be integrated with the TYPE_UNSIGNED compare in
> > the !OEP_ADDRESS_OF case?  Aka not require same signedness for
> > PLUS/MINUS/MULT/CONVERT?
> > 
> 
> Yup, fair enough, I had a similar thought after I sent out the patch, in that 
> the precision
> check is still good to have.  I've done this in the patch, it turns the 
> conditions for ignoring
> the signs into an opt-in check rather than an opt-out check.  Hopefully I 
> kept the logic simple
> to follow. 
> 
> > I think we have to document what the TYPE, ARG combination means
> > semantically.  As far as I understand TYPE doesn't specify the signedness
> > of the operation ARG but instead represents a possibly stripped conversion
> > of the result?
> > 
> 
> I added some more documentation to the function arguments and the enum 
> definition.
> 
> > > +  /* (int)((unsigned x) + (unsigned y)) == x + y.  */
> > > +  ASSERT_TRUE (operand_equal_p (lhs1, rhs1, OEP_STRUCTURAL_EQ));
> > > +  ASSERT_FALSE (operand_equal_p (lhs1, rhs1, 0));
> > > +
> > > +  /* (int)(unsigned) x == x.  */
> > > +  tree lhs2 = build1 (NOP_EXPR, stype,
> > > +               build1 (NOP_EXPR, utype, x));
> > > +  tree rhs2 = x;
> > > +  ASSERT_TRUE (operand_equal_p (lhs2, rhs2, OEP_STRUCTURAL_EQ));
> > > +  ASSERT_TRUE (operand_equal_p (lhs2, rhs2, 0));
> > 
> > That's not too useful - we'd never see this due to folding.
> > 
> 
> Yeah, it's why I avoided fold here explicitly.  I added this as when I had 
> messed up
> something it was a cheap and early test that failed.  So in that sense it's 
> usefulness
> is purely for debugging.
> 
> --
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       PR tree-optimization/114932
>       * fold-const.cc (operand_compare::operand_equal_p): Use it.
>       (operand_compare::verify_hash_value): Likewise.
>       (test_operand_equality::test): New.
>       (fold_const_cc_tests): Use it.
>       * tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV.
>       * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq.
> 
> gcc/testsuite/ChangeLog:
> 
>       PR tree-optimization/114932
>       * gfortran.dg/addressing-modes_2.f90: New test.
> 
> -- inline copy of patch --
> 
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 
> 158591165ccdea3c7a7c47fb575d59484368e147..cfbd367c708475ba7fefb3433d0c3ca9f98f949d
>  100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -3172,8 +3172,10 @@ operand_compare::operand_equal_p (const_tree arg0, 
> const_tree arg1,
>    return operand_equal_p (TREE_TYPE (arg0), arg0, TREE_TYPE (arg1), arg1, 
> flags);
>  }
>  
> -/* The same as operand_equal_p however the type of ARG0 and ARG1 are assumed 
> to be
> -   the TYPE0 and TYPE1 respectively.  */
> +/* The same as operand_equal_p however the type of ARG0 and ARG1 are assumed 
> to
> +   be the TYPE0 and TYPE1 respectively.  TYPE0 and TYPE1 represent the type 
> the
> +   expression is being compared under for equality.  This means that they can
> +   differ from the actual TREE_TYPE (..) value of ARG0 and ARG1.  */
>  
>  bool
>  operand_compare::operand_equal_p (tree type0, const_tree arg0,
> @@ -3220,15 +3222,52 @@ operand_compare::operand_equal_p (tree type0, 
> const_tree arg0,
>        return tree_int_cst_equal (arg0, arg1);
>      }
>  
> +  if ((flags & OEP_ASSUME_WRAPV)
> +      && (CONVERT_EXPR_P (arg0) || CONVERT_EXPR_P (arg1)))
> +    {
> +      const_tree t_arg0 = arg0;
> +      const_tree t_arg1 = arg1;
> +      STRIP_NOPS (arg0);
> +      STRIP_NOPS (arg1);
> +      /* Only recurse if the conversion was one that was valid to strip.  */
> +      if (t_arg0 != arg0 || t_arg1 != arg1)
> +     return operand_equal_p (type0, arg0, type1, arg1, flags);
> +    }
> +
>    if (!(flags & OEP_ADDRESS_OF))
>      {
> +      /* Check if we are checking an operation where the two's compliment
> +      bitwise representation of the result is not the same between signed and
> +      unsigned arithmetic.  */
> +      bool enforce_signedness = true;
> +      if (flags & OEP_ASSUME_WRAPV)
> +     {
> +       switch (TREE_CODE (arg0))
> +         {
            /* Twos-complement ops.  */
> +         case PLUS_EXPR:
> +         case MINUS_EXPR:
> +         case MULT_EXPR:


the obvious BIT_*_EXPR are missing?

            /* Leafs.  */
> +         case SSA_NAME:
> +         case INTEGER_CST:
> +         case VAR_DECL:
> +         case PARM_DECL:

is this set complete?  An obvious missing case would be RETURN_DECL.

> +         CASE_CONVERT:
> +           enforce_signedness = false;
> +           break;
> +
> +         default:
> +           break;
> +         }
> +     }
> +
>        /* If both types don't have the same signedness, then we can't consider
>        them equal.  We must check this before the STRIP_NOPS calls
>        because they may change the signedness of the arguments.  As pointers
>        strictly don't have a signedness, require either two pointers or
>        two non-pointers as well.  */
> -      if (TYPE_UNSIGNED (type0) != TYPE_UNSIGNED (type1)
> -       || POINTER_TYPE_P (type0) != POINTER_TYPE_P (type1))
> +      if (POINTER_TYPE_P (type0) != POINTER_TYPE_P (type1)
> +       || (TYPE_UNSIGNED (type0) != TYPE_UNSIGNED (type1)
> +           && enforce_signedness))
>       return false;
>  
>        /* If both types don't have the same precision, then it is not safe
> @@ -4236,7 +4275,7 @@ operand_compare::verify_hash_value (const_tree arg0, 
> const_tree arg1,
>      {
>        if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
>       {
> -       if (arg0 != arg1 && !(flags & OEP_DECL_NAME))
> +       if (arg0 != arg1 && !(flags & (OEP_DECL_NAME | OEP_ASSUME_WRAPV)))

I think we should document that OEP_ASSUME_WRAPV cannot be used with
hash_operand to produce a hash that compares equal when operand_equal_p
does.

Note there are several points in operand_equal_p where we need to
clear OEP_ASSUME_WRAPV, basically the same where we clear
OEP_ADDRESS_OF, like when recusing into parts that do not form
the expression themselves.  I realize this might never be relevant
for the IVOPTs use but it would make it robust from the start.
Like when comparing DECL_FIELD_OFFSET of CTOR element indices or
TYPE_SIZE of IMAGPART_EXPRs.  I'm not sure expressions in such
fields should be covered by the outer expression flag setting.

So I suggest to strip OEP_ASSUME_WRAPV when OEP_ADDRESS_OF is
stripped.  COND_EXPR is an interesting case in this regard.
Maybe it's also a documentation issue.

Otherwise OK.

Richard.

>           {
>             inchash::hash hstate0 (0), hstate1 (0);
>             hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK);
> @@ -17476,6 +17515,111 @@ test_arithmetic_folding ()
>                                  x);
>  }
>  
> +namespace test_operand_equality {
> +
> +/* Verify structural equality.  */
> +
> +/* Execute fold_vec_perm_cst unit tests.  */
> +
> +static void
> +test ()
> +{
> +  tree stype = integer_type_node;
> +  tree utype = unsigned_type_node;
> +  tree x = create_tmp_var_raw (stype, "x");
> +  tree y = create_tmp_var_raw (stype, "y");
> +  tree z = create_tmp_var_raw (stype, "z");
> +  tree four = build_int_cst (stype, 4);
> +  tree lhs1 = fold_build2 (PLUS_EXPR, stype, x, y);
> +  tree rhs1 = fold_convert (stype,
> +             fold_build2 (PLUS_EXPR, utype,
> +                          fold_convert (utype, x),
> +                          fold_convert (utype, y)));
> +
> +  /* (int)((unsigned x) + (unsigned y)) == x + y.  */
> +  ASSERT_TRUE (operand_equal_p (lhs1, rhs1, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs1, rhs1, 0));
> +
> +  /* (int)(unsigned) x == x.  */
> +  tree lhs2 = build1 (NOP_EXPR, stype,
> +             build1 (NOP_EXPR, utype, x));
> +  tree rhs2 = x;
> +  ASSERT_TRUE (operand_equal_p (lhs2, rhs2, OEP_ASSUME_WRAPV));
> +  ASSERT_TRUE (operand_equal_p (lhs2, rhs2, 0));
> +
> +  /* (unsigned x) + (unsigned y) == x + y.  */
> +  tree lhs3 = lhs1;
> +  tree rhs3 = fold_build2 (PLUS_EXPR, utype,
> +                        fold_convert (utype, x),
> +                        fold_convert (utype, y));
> +  ASSERT_TRUE (operand_equal_p (lhs3, rhs3, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs3, rhs3, 0));
> +
> +  /* (unsigned x) / (unsigned y) == x / y.  */
> +  tree lhs4 = fold_build2 (TRUNC_DIV_EXPR, stype, x, y);;
> +  tree rhs4 = fold_build2 (TRUNC_DIV_EXPR, utype,
> +                        fold_convert (utype, x),
> +                        fold_convert (utype, y));
> +  ASSERT_FALSE (operand_equal_p (lhs4, rhs4, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs4, rhs4, 0));
> +
> +  /* (long x) / 4 == (long)(x / 4).  */
> +  tree lstype = long_long_integer_type_node;
> +  tree lfour = build_int_cst (lstype, 4);
> +  tree lhs5 = fold_build2 (TRUNC_DIV_EXPR, lstype,
> +                        fold_build1 (VIEW_CONVERT_EXPR, lstype, x), lfour);
> +  tree rhs5 = fold_build1 (VIEW_CONVERT_EXPR, lstype,
> +                        fold_build2 (TRUNC_DIV_EXPR, stype, x, four));
> +  ASSERT_FALSE (operand_equal_p (lhs5, rhs5, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs5, rhs5, 0));
> +
> +  /* (unsigned x) / 4 == x / 4.  */
> +  tree lhs6 = fold_build2 (TRUNC_DIV_EXPR, stype, x, four);;
> +  tree rhs6 = fold_build2 (TRUNC_DIV_EXPR, utype,
> +                        fold_convert (utype, x),
> +                        fold_convert (utype, four));
> +  ASSERT_FALSE (operand_equal_p (lhs6, rhs6, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs6, rhs6, 0));
> +
> +  /* a / (int)((unsigned)b - (unsigned)c)) == a / (b - c).  */
> +  tree lhs7 = fold_build2 (TRUNC_DIV_EXPR, stype, x, lhs1);
> +  tree rhs7 = fold_build2 (TRUNC_DIV_EXPR, stype, x, rhs1);
> +  ASSERT_TRUE (operand_equal_p (lhs7, rhs7, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs7, rhs7, 0));
> +
> +  /* (unsigned x) + 4 == x + 4.  */
> +  tree lhs8 = fold_build2 (PLUS_EXPR, stype, x, four);
> +  tree rhs8 = fold_build2 (PLUS_EXPR, utype,
> +                        fold_convert (utype, x),
> +                        fold_convert (utype, four));
> +  ASSERT_TRUE (operand_equal_p (lhs8, rhs8, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs8, rhs8, 0));
> +
> +  /* (unsigned x) + 4 == 4 + x.  */
> +  tree lhs9 = fold_build2 (PLUS_EXPR, stype, four, x);
> +  tree rhs9 = fold_build2 (PLUS_EXPR, utype,
> +                        fold_convert (utype, x),
> +                        fold_convert (utype, four));
> +  ASSERT_TRUE (operand_equal_p (lhs9, rhs9, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs9, rhs9, 0));
> +
> +  /* ((unsigned x) + 4) * (unsigned y)) + z == ((4 + x) * y) + z.  */
> +  tree lhs10 = fold_build2 (PLUS_EXPR, stype,
> +                        fold_build2 (MULT_EXPR, stype,
> +                                     fold_build2 (PLUS_EXPR, stype, four, x),
> +                                     y),
> +                        z);
> +  tree rhs10 = fold_build2 (MULT_EXPR, utype,
> +                        fold_build2 (PLUS_EXPR, utype,
> +                                     fold_convert (utype, x),
> +                                     fold_convert (utype, four)),
> +                        fold_convert (utype, y));
> +  rhs10 = fold_build2 (PLUS_EXPR, stype, fold_convert (stype, rhs10), z);
> +  ASSERT_TRUE (operand_equal_p (lhs10, rhs10, OEP_ASSUME_WRAPV));
> +  ASSERT_FALSE (operand_equal_p (lhs10, rhs10, 0));
> +}
> +}
> +
>  namespace test_fold_vec_perm_cst {
>  
>  /* Build a VECTOR_CST corresponding to VMODE, and has
> @@ -18269,6 +18413,7 @@ fold_const_cc_tests ()
>    test_vector_folding ();
>    test_vec_duplicate_folding ();
>    test_fold_vec_perm_cst::test ();
> +  test_operand_equality::test ();
>  }
>  
>  } // namespace selftest
> diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 
> b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> new file mode 100644
> index 
> 0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c2e24a4973c8539fae
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90
> @@ -0,0 +1,20 @@
> +! { dg-do compile { target aarch64-*-* } }
> +! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" }
> +
> +module a
> +integer, parameter :: b=3, c=b
> +contains
> +subroutine d(block)
> +integer f, col   , block(:, :, :), e
> +do f = 1, c
> +   do col = 1, c
> +             block(:f,                          :, e()) = do
> +     end do
> +  end do
> +  end
> +  end
> +
> +! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3 
> IVs:} ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2 
> IVs:} 1 ivopts } }
> +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1 
> IVs:} 1 ivopts } }
> +
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index 
> 1776c154d94e12e6efaf846d1b60c3170d469605..30bada4776d8363f7fa355caa9bd6c2608d3074a
>  100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1007,7 +1007,14 @@ enum operand_equal_flag {
>    /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations
>       of the same kind.  Used to compare VLA bounds involving parameters
>       across redeclarations of the same function.  */
> -  OEP_DECL_NAME = 512
> +  OEP_DECL_NAME = 512,
> +  /* Check if two expressions result in the same bit values while possibly
> +     ignoring the sign of the expressions and any differences in undefined
> +     behaviour.  The compared expressions must however perform the same
> +     operations.  Because this comparison ignores any possible UB it cannot
> +     be used blindly without ensuring that the context you are using it in
> +     itself doesn't guarantee that there will be no UB.  */
> +  OEP_ASSUME_WRAPV = 1024
>  };
>  
>  /* Enum and arrays used for tree allocation stats.
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index 
> b38bc6971a8ebfa16c30329bc69d8e3add77f8d0..0f5dd089a0e6335410acc8bd0a466b9c525551c6
>  100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -1624,8 +1624,8 @@ record_group_use (struct ivopts_data *data, tree *use_p,
>  
>         /* Check if it has the same stripped base and step.  */
>         if (operand_equal_p (iv->base_object, use->iv->base_object, 0)
> -           && operand_equal_p (iv->step, use->iv->step, 0)
> -           && operand_equal_p (addr_base, use->addr_base, 0))
> +           && operand_equal_p (iv->step, use->iv->step, OEP_ASSUME_WRAPV)
> +           && operand_equal_p (addr_base, use->addr_base, OEP_ASSUME_WRAPV))
>           break;
>       }
>        if (i == data->vgroups.length ())
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

Reply via email to