On Tue, 3 Dec 2024, Tamar Christina wrote: > > > This patch implements a new OEP flag called OEP_STRUCTURAL_EQ. This flag > > > will > > > check if the operands would produce the same bit values after the > > > computations > > > even if the final sign is different. > > > > I think the name is badly chosen - we already have OEP_LEXICOGRAPHIC and > > OEP_BITWISE. I would suggest to use OEP_ASSUME_TWOS_COMPLEMENT > > or OEP_ASSUME_WRAPV. > > > > I went with OEP_ASSUME_WRAPV.. saves chars on the 80 char line limit :) > > > > - if (!(flags & OEP_ADDRESS_OF)) > > > + if ((flags & OEP_STRUCTURAL_EQ) > > > + && (CONVERT_EXPR_P (arg0) || CONVERT_EXPR_P (arg1))) > > > + { > > > + const_tree t_arg0 = arg0; > > > + const_tree t_arg1 = arg1; > > > + STRIP_NOPS (arg0); > > > + STRIP_NOPS (arg1); > > > + /* Only recurse if the conversion was one that was valid to strip. > > > */ > > > + if (t_arg0 != arg0 || t_arg1 != arg1) > > > + return operand_equal_p (type0, arg0, type1, arg1, flags); > > > > (*) > > > > > + } > > > + > > > + if (!(flags & (OEP_ADDRESS_OF | OEP_STRUCTURAL_EQ))) > > > { > > > /* If both types don't have the same signedness, then we can't > > > consider > > > them equal. We must check this before the STRIP_NOPS calls > > > @@ -3439,6 +3451,22 @@ operand_compare::operand_equal_p (tree type0, > > const_tree arg0, > > > if (flags & (OEP_ONLY_CONST | OEP_BITWISE)) > > > return false; > > > > > > + /* Check if we are checking an operation where the two's compliment > > > bitwise > > > + representation of the result is not the same between signed and > > > + unsigned arithmetic. */ > > > + switch (TREE_CODE (arg0)) > > > + { > > > + case PLUS_EXPR: > > > + case MINUS_EXPR: > > > + case MULT_EXPR: > > > + break; > > > + > > > + default: > > > + /* Clear the structural comparison bit. */ > > > + flags &= ~OEP_STRUCTURAL_EQ; > > > > So this handling means that a / (int)((unsigned)b - (unsigned)c)) and > > a / (b - c) are not equal with respect to OEP_STRUCTURAL_EQ because > > we drop that flag when processing the division and thus the recursion on > > the minus operands doesn't have it any more. > > Yeah I did do this intentionally, since I don’t expect division in IV > computations > anyway. But given that this is supposed to be a more generic helper I've > extended > it properly now. > > > > > So shouldn't this and (*) be integrated with the TYPE_UNSIGNED compare in > > the !OEP_ADDRESS_OF case? Aka not require same signedness for > > PLUS/MINUS/MULT/CONVERT? > > > > Yup, fair enough, I had a similar thought after I sent out the patch, in that > the precision > check is still good to have. I've done this in the patch, it turns the > conditions for ignoring > the signs into an opt-in check rather than an opt-out check. Hopefully I > kept the logic simple > to follow. > > > I think we have to document what the TYPE, ARG combination means > > semantically. As far as I understand TYPE doesn't specify the signedness > > of the operation ARG but instead represents a possibly stripped conversion > > of the result? > > > > I added some more documentation to the function arguments and the enum > definition. > > > > + /* (int)((unsigned x) + (unsigned y)) == x + y. */ > > > + ASSERT_TRUE (operand_equal_p (lhs1, rhs1, OEP_STRUCTURAL_EQ)); > > > + ASSERT_FALSE (operand_equal_p (lhs1, rhs1, 0)); > > > + > > > + /* (int)(unsigned) x == x. */ > > > + tree lhs2 = build1 (NOP_EXPR, stype, > > > + build1 (NOP_EXPR, utype, x)); > > > + tree rhs2 = x; > > > + ASSERT_TRUE (operand_equal_p (lhs2, rhs2, OEP_STRUCTURAL_EQ)); > > > + ASSERT_TRUE (operand_equal_p (lhs2, rhs2, 0)); > > > > That's not too useful - we'd never see this due to folding. > > > > Yeah, it's why I avoided fold here explicitly. I added this as when I had > messed up > something it was a cheap and early test that failed. So in that sense it's > usefulness > is purely for debugging. > > -- > > Bootstrapped Regtested on aarch64-none-linux-gnu, > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu > -m32, -m64 and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR tree-optimization/114932 > * fold-const.cc (operand_compare::operand_equal_p): Use it. > (operand_compare::verify_hash_value): Likewise. > (test_operand_equality::test): New. > (fold_const_cc_tests): Use it. > * tree-core.h (enum operand_equal_flag): Add OEP_ASSUME_WRAPV. > * tree-ssa-loop-ivopts.cc (record_group_use): Check for structural eq. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/114932 > * gfortran.dg/addressing-modes_2.f90: New test. > > -- inline copy of patch -- > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index > 158591165ccdea3c7a7c47fb575d59484368e147..cfbd367c708475ba7fefb3433d0c3ca9f98f949d > 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -3172,8 +3172,10 @@ operand_compare::operand_equal_p (const_tree arg0, > const_tree arg1, > return operand_equal_p (TREE_TYPE (arg0), arg0, TREE_TYPE (arg1), arg1, > flags); > } > > -/* The same as operand_equal_p however the type of ARG0 and ARG1 are assumed > to be > - the TYPE0 and TYPE1 respectively. */ > +/* The same as operand_equal_p however the type of ARG0 and ARG1 are assumed > to > + be the TYPE0 and TYPE1 respectively. TYPE0 and TYPE1 represent the type > the > + expression is being compared under for equality. This means that they can > + differ from the actual TREE_TYPE (..) value of ARG0 and ARG1. */ > > bool > operand_compare::operand_equal_p (tree type0, const_tree arg0, > @@ -3220,15 +3222,52 @@ operand_compare::operand_equal_p (tree type0, > const_tree arg0, > return tree_int_cst_equal (arg0, arg1); > } > > + if ((flags & OEP_ASSUME_WRAPV) > + && (CONVERT_EXPR_P (arg0) || CONVERT_EXPR_P (arg1))) > + { > + const_tree t_arg0 = arg0; > + const_tree t_arg1 = arg1; > + STRIP_NOPS (arg0); > + STRIP_NOPS (arg1); > + /* Only recurse if the conversion was one that was valid to strip. */ > + if (t_arg0 != arg0 || t_arg1 != arg1) > + return operand_equal_p (type0, arg0, type1, arg1, flags); > + } > + > if (!(flags & OEP_ADDRESS_OF)) > { > + /* Check if we are checking an operation where the two's compliment > + bitwise representation of the result is not the same between signed and > + unsigned arithmetic. */ > + bool enforce_signedness = true; > + if (flags & OEP_ASSUME_WRAPV) > + { > + switch (TREE_CODE (arg0)) > + { /* Twos-complement ops. */ > + case PLUS_EXPR: > + case MINUS_EXPR: > + case MULT_EXPR:
the obvious BIT_*_EXPR are missing? /* Leafs. */ > + case SSA_NAME: > + case INTEGER_CST: > + case VAR_DECL: > + case PARM_DECL: is this set complete? An obvious missing case would be RETURN_DECL. > + CASE_CONVERT: > + enforce_signedness = false; > + break; > + > + default: > + break; > + } > + } > + > /* If both types don't have the same signedness, then we can't consider > them equal. We must check this before the STRIP_NOPS calls > because they may change the signedness of the arguments. As pointers > strictly don't have a signedness, require either two pointers or > two non-pointers as well. */ > - if (TYPE_UNSIGNED (type0) != TYPE_UNSIGNED (type1) > - || POINTER_TYPE_P (type0) != POINTER_TYPE_P (type1)) > + if (POINTER_TYPE_P (type0) != POINTER_TYPE_P (type1) > + || (TYPE_UNSIGNED (type0) != TYPE_UNSIGNED (type1) > + && enforce_signedness)) > return false; > > /* If both types don't have the same precision, then it is not safe > @@ -4236,7 +4275,7 @@ operand_compare::verify_hash_value (const_tree arg0, > const_tree arg1, > { > if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK)) > { > - if (arg0 != arg1 && !(flags & OEP_DECL_NAME)) > + if (arg0 != arg1 && !(flags & (OEP_DECL_NAME | OEP_ASSUME_WRAPV))) I think we should document that OEP_ASSUME_WRAPV cannot be used with hash_operand to produce a hash that compares equal when operand_equal_p does. Note there are several points in operand_equal_p where we need to clear OEP_ASSUME_WRAPV, basically the same where we clear OEP_ADDRESS_OF, like when recusing into parts that do not form the expression themselves. I realize this might never be relevant for the IVOPTs use but it would make it robust from the start. Like when comparing DECL_FIELD_OFFSET of CTOR element indices or TYPE_SIZE of IMAGPART_EXPRs. I'm not sure expressions in such fields should be covered by the outer expression flag setting. So I suggest to strip OEP_ASSUME_WRAPV when OEP_ADDRESS_OF is stripped. COND_EXPR is an interesting case in this regard. Maybe it's also a documentation issue. Otherwise OK. Richard. > { > inchash::hash hstate0 (0), hstate1 (0); > hash_operand (arg0, hstate0, flags | OEP_HASH_CHECK); > @@ -17476,6 +17515,111 @@ test_arithmetic_folding () > x); > } > > +namespace test_operand_equality { > + > +/* Verify structural equality. */ > + > +/* Execute fold_vec_perm_cst unit tests. */ > + > +static void > +test () > +{ > + tree stype = integer_type_node; > + tree utype = unsigned_type_node; > + tree x = create_tmp_var_raw (stype, "x"); > + tree y = create_tmp_var_raw (stype, "y"); > + tree z = create_tmp_var_raw (stype, "z"); > + tree four = build_int_cst (stype, 4); > + tree lhs1 = fold_build2 (PLUS_EXPR, stype, x, y); > + tree rhs1 = fold_convert (stype, > + fold_build2 (PLUS_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, y))); > + > + /* (int)((unsigned x) + (unsigned y)) == x + y. */ > + ASSERT_TRUE (operand_equal_p (lhs1, rhs1, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs1, rhs1, 0)); > + > + /* (int)(unsigned) x == x. */ > + tree lhs2 = build1 (NOP_EXPR, stype, > + build1 (NOP_EXPR, utype, x)); > + tree rhs2 = x; > + ASSERT_TRUE (operand_equal_p (lhs2, rhs2, OEP_ASSUME_WRAPV)); > + ASSERT_TRUE (operand_equal_p (lhs2, rhs2, 0)); > + > + /* (unsigned x) + (unsigned y) == x + y. */ > + tree lhs3 = lhs1; > + tree rhs3 = fold_build2 (PLUS_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, y)); > + ASSERT_TRUE (operand_equal_p (lhs3, rhs3, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs3, rhs3, 0)); > + > + /* (unsigned x) / (unsigned y) == x / y. */ > + tree lhs4 = fold_build2 (TRUNC_DIV_EXPR, stype, x, y);; > + tree rhs4 = fold_build2 (TRUNC_DIV_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, y)); > + ASSERT_FALSE (operand_equal_p (lhs4, rhs4, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs4, rhs4, 0)); > + > + /* (long x) / 4 == (long)(x / 4). */ > + tree lstype = long_long_integer_type_node; > + tree lfour = build_int_cst (lstype, 4); > + tree lhs5 = fold_build2 (TRUNC_DIV_EXPR, lstype, > + fold_build1 (VIEW_CONVERT_EXPR, lstype, x), lfour); > + tree rhs5 = fold_build1 (VIEW_CONVERT_EXPR, lstype, > + fold_build2 (TRUNC_DIV_EXPR, stype, x, four)); > + ASSERT_FALSE (operand_equal_p (lhs5, rhs5, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs5, rhs5, 0)); > + > + /* (unsigned x) / 4 == x / 4. */ > + tree lhs6 = fold_build2 (TRUNC_DIV_EXPR, stype, x, four);; > + tree rhs6 = fold_build2 (TRUNC_DIV_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, four)); > + ASSERT_FALSE (operand_equal_p (lhs6, rhs6, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs6, rhs6, 0)); > + > + /* a / (int)((unsigned)b - (unsigned)c)) == a / (b - c). */ > + tree lhs7 = fold_build2 (TRUNC_DIV_EXPR, stype, x, lhs1); > + tree rhs7 = fold_build2 (TRUNC_DIV_EXPR, stype, x, rhs1); > + ASSERT_TRUE (operand_equal_p (lhs7, rhs7, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs7, rhs7, 0)); > + > + /* (unsigned x) + 4 == x + 4. */ > + tree lhs8 = fold_build2 (PLUS_EXPR, stype, x, four); > + tree rhs8 = fold_build2 (PLUS_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, four)); > + ASSERT_TRUE (operand_equal_p (lhs8, rhs8, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs8, rhs8, 0)); > + > + /* (unsigned x) + 4 == 4 + x. */ > + tree lhs9 = fold_build2 (PLUS_EXPR, stype, four, x); > + tree rhs9 = fold_build2 (PLUS_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, four)); > + ASSERT_TRUE (operand_equal_p (lhs9, rhs9, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs9, rhs9, 0)); > + > + /* ((unsigned x) + 4) * (unsigned y)) + z == ((4 + x) * y) + z. */ > + tree lhs10 = fold_build2 (PLUS_EXPR, stype, > + fold_build2 (MULT_EXPR, stype, > + fold_build2 (PLUS_EXPR, stype, four, x), > + y), > + z); > + tree rhs10 = fold_build2 (MULT_EXPR, utype, > + fold_build2 (PLUS_EXPR, utype, > + fold_convert (utype, x), > + fold_convert (utype, four)), > + fold_convert (utype, y)); > + rhs10 = fold_build2 (PLUS_EXPR, stype, fold_convert (stype, rhs10), z); > + ASSERT_TRUE (operand_equal_p (lhs10, rhs10, OEP_ASSUME_WRAPV)); > + ASSERT_FALSE (operand_equal_p (lhs10, rhs10, 0)); > +} > +} > + > namespace test_fold_vec_perm_cst { > > /* Build a VECTOR_CST corresponding to VMODE, and has > @@ -18269,6 +18413,7 @@ fold_const_cc_tests () > test_vector_folding (); > test_vec_duplicate_folding (); > test_fold_vec_perm_cst::test (); > + test_operand_equality::test (); > } > > } // namespace selftest > diff --git a/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 > b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 > new file mode 100644 > index > 0000000000000000000000000000000000000000..8eee4be3dc4d69fecfacd4c2e24a4973c8539fae > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/addressing-modes_2.f90 > @@ -0,0 +1,20 @@ > +! { dg-do compile { target aarch64-*-* } } > +! { dg-additional-options "-w -Ofast -fdump-tree-ivopts-all" } > + > +module a > +integer, parameter :: b=3, c=b > +contains > +subroutine d(block) > +integer f, col , block(:, :, :), e > +do f = 1, c > + do col = 1, c > + block(:f, :, e()) = do > + end do > + end do > + end > + end > + > +! { dg-final { scan-tree-dump-not {Selected IV set for loop .+ niters, 3 > IVs:} ivopts } } > +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 2 > IVs:} 1 ivopts } } > +! { dg-final { scan-tree-dump-times {Selected IV set for loop .+ niters, 1 > IVs:} 1 ivopts } } > + > diff --git a/gcc/tree-core.h b/gcc/tree-core.h > index > 1776c154d94e12e6efaf846d1b60c3170d469605..30bada4776d8363f7fa355caa9bd6c2608d3074a > 100644 > --- a/gcc/tree-core.h > +++ b/gcc/tree-core.h > @@ -1007,7 +1007,14 @@ enum operand_equal_flag { > /* In conjunction with OEP_LEXICOGRAPHIC considers names of declarations > of the same kind. Used to compare VLA bounds involving parameters > across redeclarations of the same function. */ > - OEP_DECL_NAME = 512 > + OEP_DECL_NAME = 512, > + /* Check if two expressions result in the same bit values while possibly > + ignoring the sign of the expressions and any differences in undefined > + behaviour. The compared expressions must however perform the same > + operations. Because this comparison ignores any possible UB it cannot > + be used blindly without ensuring that the context you are using it in > + itself doesn't guarantee that there will be no UB. */ > + OEP_ASSUME_WRAPV = 1024 > }; > > /* Enum and arrays used for tree allocation stats. > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc > index > b38bc6971a8ebfa16c30329bc69d8e3add77f8d0..0f5dd089a0e6335410acc8bd0a466b9c525551c6 > 100644 > --- a/gcc/tree-ssa-loop-ivopts.cc > +++ b/gcc/tree-ssa-loop-ivopts.cc > @@ -1624,8 +1624,8 @@ record_group_use (struct ivopts_data *data, tree *use_p, > > /* Check if it has the same stripped base and step. */ > if (operand_equal_p (iv->base_object, use->iv->base_object, 0) > - && operand_equal_p (iv->step, use->iv->step, 0) > - && operand_equal_p (addr_base, use->addr_base, 0)) > + && operand_equal_p (iv->step, use->iv->step, OEP_ASSUME_WRAPV) > + && operand_equal_p (addr_base, use->addr_base, OEP_ASSUME_WRAPV)) > break; > } > if (i == data->vgroups.length ()) > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)