On Thu, 27 Jan 2022, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, reassoc1 miscompiles following testcase. > We have: > if (a.1_1 >= 0) > goto <bb 5>; [59.00%] > else > goto <bb 4>; [41.00%] > > <bb 4> [local count: 440234144]: > _3 = -2147483647 - a.1_1; > _9 = a.1_1 != -2147479551; > _4 = _3 == 1; > _8 = _4 | _9; > if (_8 != 0) > goto <bb 5>; [34.51%] > else > goto <bb 3>; [65.49%] > > and the inter-bb range test optimization treats it as: > if ((a.1_1 >= 0) > | (-2147483647 - a.1_1 == 1) > | (a.1_1 != -2147479551)) > goto bb5; > else > goto bb3; > and recognizes that a.1_1 >= 0 is redundant with a.1_1 != -2147479551 > and so will optimize it into: > if (0 > | (-2147483647 - a.1_1 == 1) > | (a.1_1 != -2147479551)) > goto bb5; > else > goto bb3; > When merging 2 comparisons, we use update_range_test which picks one > of the comparisons as the one holding the result (currently always > the RANGE one rather than all the OTHERRANGE* ones) and adjusts the > others to be true or false. > The problem with doing that is that means the > _3 = -2147483647 - a.1_1; > stmt with undefined behavior on overflow used to be conditional before > but now is unconditional. reassoc performs a no_side_effect_bb check > which among other checks punts on gimple_has_side_effects and > gimple_assign_rhs_could_trap_p stmts as well as ones that have uses of > their lhs outside of the same bb, but it doesn't punt for this potential > signed overflow case. > > Now, for this testcase, it can be fixed in update_range_test by being > smarter and choosing the other comparison to modify. This is achieved > by storing into oe->id index of the bb with GIMPLE_COND the > comparison feeds into not just for the cases where the comparison is > the GIMPLE_COND itself, but in all cases, and then finding oe->id that > isn't dominated by others. If we find such, use that oe for the merge > test and if successful, swap range->idx and swap_with->idx. > So for the above case we optimize it into: > if ((a.1_1 != -2147479551) > | (-2147483647 - a.1_1 == 1) > | 0) > goto bb5; > else > goto bb3; > instead. > > Unfortunately, this doesn't work in all cases, > optimize_range_tests_to_bit_test and > optimize_range_tests_cmp_bitwise optimizations use non-NULL seq > to update_range_test and they carefully choose a particular comparison > because the sequence then uses SSA_NAMEs that may be defined only in > their blocks. For that case, the patch keeps using the chosen comparison > but if the merge is successful, rewrites stmts with signed overflow behavior > into defined overflow. > For this I ran into a problem, rewrite_to_defined_overflow returns a > sequence that includes the original stmt with modified arguments, this means > it needs to be gsi_remove first. Unfortunately, gsi_remove regardless of > the remove_permanently argument creates debug temps for the lhs, which I > think is quite undesirable here. So I've added an argument (default to > false) to rewrite_to_defined_overflow to do the modification in place > without the need to remove the stmt. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK. Thanks, Richard. > 2022-01-27 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/104196 > * gimple-fold.h (rewrite_to_defined_overflow): Add IN_PLACE argument. > * gimple-fold.cc (rewrite_to_defined_overflow): Likewise. If true, > return NULL and emit needed stmts before and after stmt. > * tree-ssa-reassoc.cc (update_range_test): For inter-bb range opt > pick as operand_entry that will hold the merged test the one feeding > earliest condition, ensure that by swapping range->idx with some > other range's idx if needed. If seq is non-NULL, don't actually swap > it but instead rewrite stmts with undefined overflow in between > the two locations. > (maybe_optimize_range_tests): Set ops[]->id to bb->index with the > corresponding condition even if they have non-NULL ops[]->op. > Formatting fix. > > * gcc.c-torture/execute/pr104196.c: New test. > > --- gcc/gimple-fold.h.jj 2022-01-18 11:58:59.615981585 +0100 > +++ gcc/gimple-fold.h 2022-01-26 18:46:04.524911097 +0100 > @@ -62,7 +62,7 @@ extern tree gimple_fold_indirect_ref (tr > extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *); > extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *); > extern bool arith_code_with_undefined_signed_overflow (tree_code); > -extern gimple_seq rewrite_to_defined_overflow (gimple *); > +extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false); > extern void replace_call_with_value (gimple_stmt_iterator *, tree); > extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, > tree); > > --- gcc/gimple-fold.cc.jj 2022-01-21 11:01:12.473449208 +0100 > +++ gcc/gimple-fold.cc 2022-01-26 18:52:52.094085990 +0100 > @@ -8539,11 +8539,12 @@ arith_code_with_undefined_signed_overflo > its operand, carrying out the operation in the corresponding unsigned > type and converting the result back to the original type. > > - Returns a sequence of statements that replace STMT and also contain > - a modified form of STMT itself. */ > + If IN_PLACE is true, adjust the stmt in place and return NULL. > + Otherwise returns a sequence of statements that replace STMT and also > + contain a modified form of STMT itself. */ > > gimple_seq > -rewrite_to_defined_overflow (gimple *stmt) > +rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -8568,9 +8569,24 @@ rewrite_to_defined_overflow (gimple *stm > if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) > gimple_assign_set_rhs_code (stmt, PLUS_EXPR); > gimple_set_modified (stmt, true); > - gimple_seq_add_stmt (&stmts, stmt); > + if (in_place) > + { > + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); > + if (stmts) > + gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT); > + stmts = NULL; > + } > + else > + gimple_seq_add_stmt (&stmts, stmt); > gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs > (stmt)); > - gimple_seq_add_stmt (&stmts, cvt); > + if (in_place) > + { > + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); > + gsi_insert_after (&gsi, cvt, GSI_SAME_STMT); > + update_stmt (stmt); > + } > + else > + gimple_seq_add_stmt (&stmts, cvt); > > return stmts; > } > --- gcc/tree-ssa-reassoc.cc.jj 2022-01-26 11:57:37.106698099 +0100 > +++ gcc/tree-ssa-reassoc.cc 2022-01-26 18:59:55.993027492 +0100 > @@ -2768,7 +2768,49 @@ update_range_test (struct range_entry *r > vec<operand_entry *> *ops, tree exp, gimple_seq seq, > bool in_p, tree low, tree high, bool strict_overflow_p) > { > - operand_entry *oe = (*ops)[range->idx]; > + unsigned int idx = range->idx; > + struct range_entry *swap_with = NULL; > + basic_block rewrite_bb_first = NULL, rewrite_bb_last = NULL; > + if (opcode == ERROR_MARK) > + { > + /* For inter-bb range test optimization, pick from the range tests > + the one which is tested in the earliest condition (one dominating > + the others), because otherwise there could be some UB (e.g. signed > + overflow) in following bbs that we'd expose which wasn't there in > + the original program. See PR104196. */ > + basic_block orig_range_bb = BASIC_BLOCK_FOR_FN (cfun, (*ops)[idx]->id); > + basic_block range_bb = orig_range_bb; > + for (unsigned int i = 0; i < count; i++) > + { > + struct range_entry *this_range; > + if (otherrange) > + this_range = otherrange + i; > + else > + this_range = otherrangep[i]; > + operand_entry *oe = (*ops)[this_range->idx]; > + basic_block this_bb = BASIC_BLOCK_FOR_FN (cfun, oe->id); > + if (range_bb != this_bb > + && dominated_by_p (CDI_DOMINATORS, range_bb, this_bb)) > + { > + swap_with = this_range; > + range_bb = this_bb; > + idx = this_range->idx; > + } > + } > + /* If seq is non-NULL, it can contain statements that use SSA_NAMEs > + only defined in later blocks. In this case we can't move the > + merged comparison earlier, so instead check if there are any stmts > + that might trigger signed integer overflow in between and rewrite > + them. But only after we check if the optimization is possible. */ > + if (seq && swap_with) > + { > + rewrite_bb_first = range_bb; > + rewrite_bb_last = orig_range_bb; > + idx = range->idx; > + swap_with = NULL; > + } > + } > + operand_entry *oe = (*ops)[idx]; > tree op = oe->op; > gimple *stmt = op ? SSA_NAME_DEF_STMT (op) > : last_stmt (BASIC_BLOCK_FOR_FN (cfun, oe->id)); > @@ -2805,6 +2847,9 @@ update_range_test (struct range_entry *r > return false; > } > > + if (swap_with) > + std::swap (range->idx, swap_with->idx); > + > if (strict_overflow_p && issue_strict_overflow_warning (wc)) > warning_at (loc, OPT_Wstrict_overflow, > "assuming signed overflow does not occur " > @@ -2839,6 +2884,42 @@ update_range_test (struct range_entry *r > fprintf (dump_file, "\n"); > } > > + /* In inter-bb range optimization mode, if we have a seq, we can't > + move the merged comparison to the earliest bb from the comparisons > + being replaced, so instead rewrite stmts that could trigger signed > + integer overflow. */ > + for (basic_block bb = rewrite_bb_last; > + bb != rewrite_bb_first; bb = single_pred (bb)) > + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); > + !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple *stmt = gsi_stmt (gsi); > + if (is_gimple_assign (stmt)) > + if (tree lhs = gimple_assign_lhs (stmt)) > + if ((INTEGRAL_TYPE_P (TREE_TYPE (lhs)) > + || POINTER_TYPE_P (TREE_TYPE (lhs))) > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (lhs))) > + { > + enum tree_code code = gimple_assign_rhs_code (stmt); > + if (arith_code_with_undefined_signed_overflow (code)) > + { > + gimple_stmt_iterator gsip = gsi; > + gimple_stmt_iterator gsin = gsi; > + gsi_prev (&gsip); > + gsi_next (&gsin); > + rewrite_to_defined_overflow (stmt, true); > + unsigned uid = gimple_uid (stmt); > + if (gsi_end_p (gsip)) > + gsip = gsi_after_labels (bb); > + else > + gsi_next (&gsip); > + for (; gsi_stmt (gsip) != gsi_stmt (gsin); > + gsi_next (&gsip)) > + gimple_set_uid (gsi_stmt (gsip), uid); > + } > + } > + } > + > if (opcode == BIT_IOR_EXPR > || (opcode == ERROR_MARK && oe->rank == BIT_IOR_EXPR)) > tem = invert_truthvalue_loc (loc, tem); > @@ -4755,7 +4836,7 @@ maybe_optimize_range_tests (gimple *stmt > && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) > != tcc_comparison) > && !get_ops (rhs, code, &ops, > - loop_containing_stmt (stmt)) > + loop_containing_stmt (stmt)) > && has_single_use (rhs)) > { > /* Otherwise, push the _234 range test itself. */ > @@ -4792,6 +4873,8 @@ maybe_optimize_range_tests (gimple *stmt > bb_ent.op = rhs; > } > bbinfo.safe_push (bb_ent); > + for (unsigned int i = bb_ent.first_idx; i < bb_ent.last_idx; ++i) > + ops[i]->id = bb->index; > continue; > } > else if (bb == last_bb) > @@ -4855,6 +4938,8 @@ maybe_optimize_range_tests (gimple *stmt > bb_ent.last_idx = ops.length (); > } > bbinfo.safe_push (bb_ent); > + for (unsigned int i = bb_ent.first_idx; i < bb_ent.last_idx; ++i) > + ops[i]->id = bb->index; > if (bb == first_bb) > break; > } > --- gcc/testsuite/gcc.c-torture/execute/pr104196.c.jj 2022-01-26 > 14:19:37.449144097 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr104196.c 2022-01-26 > 14:19:37.449144097 +0100 > @@ -0,0 +1,19 @@ > +/* PR tree-optimization/104196 */ > + > +int a = 6; > + > +int > +main () > +{ > + while (1) > + { > + int b = a < 0 && 0 < -__INT_MAX__ - a ? 0 : a; > + if (b != 4096 - __INT_MAX__) > + { > + if (a < 6) > + __builtin_abort (); > + break; > + } > + } > + return 0; > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)