On Thu, 27 Jan 2022, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, reassoc1 miscompiles following testcase.
> We have:
>   if (a.1_1 >= 0)
>     goto <bb 5>; [59.00%]
>   else
>     goto <bb 4>; [41.00%]
> 
>   <bb 4> [local count: 440234144]:
>   _3 = -2147483647 - a.1_1;
>   _9 = a.1_1 != -2147479551;
>   _4 = _3 == 1;
>   _8 = _4 | _9;
>   if (_8 != 0)
>     goto <bb 5>; [34.51%]
>   else
>     goto <bb 3>; [65.49%]
> 
> and the inter-bb range test optimization treats it as:
>   if ((a.1_1 >= 0)
>       | (-2147483647 - a.1_1 == 1)
>       | (a.1_1 != -2147479551))
>     goto bb5;
>   else
>     goto bb3;
> and recognizes that a.1_1 >= 0 is redundant with a.1_1 != -2147479551
> and so will optimize it into:
>   if (0
>       | (-2147483647 - a.1_1 == 1)
>       | (a.1_1 != -2147479551))
>     goto bb5;
>   else
>     goto bb3;
> When merging 2 comparisons, we use update_range_test which picks one
> of the comparisons as the one holding the result (currently always
> the RANGE one rather than all the OTHERRANGE* ones) and adjusts the
> others to be true or false.
> The problem with doing that is that means the
>   _3 = -2147483647 - a.1_1;
> stmt with undefined behavior on overflow used to be conditional before
> but now is unconditional.  reassoc performs a no_side_effect_bb check
> which among other checks punts on gimple_has_side_effects and
> gimple_assign_rhs_could_trap_p stmts as well as ones that have uses of
> their lhs outside of the same bb, but it doesn't punt for this potential
> signed overflow case.
> 
> Now, for this testcase, it can be fixed in update_range_test by being
> smarter and choosing the other comparison to modify.  This is achieved
> by storing into oe->id index of the bb with GIMPLE_COND the
> comparison feeds into not just for the cases where the comparison is
> the GIMPLE_COND itself, but in all cases, and then finding oe->id that
> isn't dominated by others.  If we find such, use that oe for the merge
> test and if successful, swap range->idx and swap_with->idx.
> So for the above case we optimize it into:
>   if ((a.1_1 != -2147479551)
>       | (-2147483647 - a.1_1 == 1)
>       | 0)
>     goto bb5;
>   else
>     goto bb3;
> instead.
> 
> Unfortunately, this doesn't work in all cases,
> optimize_range_tests_to_bit_test and
> optimize_range_tests_cmp_bitwise optimizations use non-NULL seq
> to update_range_test and they carefully choose a particular comparison
> because the sequence then uses SSA_NAMEs that may be defined only in
> their blocks.  For that case, the patch keeps using the chosen comparison
> but if the merge is successful, rewrites stmts with signed overflow behavior
> into defined overflow.
> For this I ran into a problem, rewrite_to_defined_overflow returns a
> sequence that includes the original stmt with modified arguments, this means
> it needs to be gsi_remove first.  Unfortunately, gsi_remove regardless of
> the remove_permanently argument creates debug temps for the lhs, which I
> think is quite undesirable here.  So I've added an argument (default to
> false) to rewrite_to_defined_overflow to do the modification in place
> without the need to remove the stmt.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-01-27  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/104196
>       * gimple-fold.h (rewrite_to_defined_overflow): Add IN_PLACE argument.
>       * gimple-fold.cc (rewrite_to_defined_overflow): Likewise.  If true,
>       return NULL and emit needed stmts before and after stmt.
>       * tree-ssa-reassoc.cc (update_range_test): For inter-bb range opt
>       pick as operand_entry that will hold the merged test the one feeding
>       earliest condition, ensure that by swapping range->idx with some
>       other range's idx if needed.  If seq is non-NULL, don't actually swap
>       it but instead rewrite stmts with undefined overflow in between
>       the two locations.
>       (maybe_optimize_range_tests): Set ops[]->id to bb->index with the
>       corresponding condition even if they have non-NULL ops[]->op.
>       Formatting fix.
> 
>       * gcc.c-torture/execute/pr104196.c: New test.
> 
> --- gcc/gimple-fold.h.jj      2022-01-18 11:58:59.615981585 +0100
> +++ gcc/gimple-fold.h 2022-01-26 18:46:04.524911097 +0100
> @@ -62,7 +62,7 @@ extern tree gimple_fold_indirect_ref (tr
>  extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *);
>  extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *);
>  extern bool arith_code_with_undefined_signed_overflow (tree_code);
> -extern gimple_seq rewrite_to_defined_overflow (gimple *);
> +extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false);
>  extern void replace_call_with_value (gimple_stmt_iterator *, tree);
>  extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, 
> tree);
>  
> --- gcc/gimple-fold.cc.jj     2022-01-21 11:01:12.473449208 +0100
> +++ gcc/gimple-fold.cc        2022-01-26 18:52:52.094085990 +0100
> @@ -8539,11 +8539,12 @@ arith_code_with_undefined_signed_overflo
>     its operand, carrying out the operation in the corresponding unsigned
>     type and converting the result back to the original type.
>  
> -   Returns a sequence of statements that replace STMT and also contain
> -   a modified form of STMT itself.  */
> +   If IN_PLACE is true, adjust the stmt in place and return NULL.
> +   Otherwise returns a sequence of statements that replace STMT and also
> +   contain a modified form of STMT itself.  */
>  
>  gimple_seq
> -rewrite_to_defined_overflow (gimple *stmt)
> +rewrite_to_defined_overflow (gimple *stmt, bool in_place /* = false */)
>  {
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -8568,9 +8569,24 @@ rewrite_to_defined_overflow (gimple *stm
>    if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
>      gimple_assign_set_rhs_code (stmt, PLUS_EXPR);
>    gimple_set_modified (stmt, true);
> -  gimple_seq_add_stmt (&stmts, stmt);
> +  if (in_place)
> +    {
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      if (stmts)
> +     gsi_insert_seq_before (&gsi, stmts, GSI_SAME_STMT);
> +      stmts = NULL;
> +    }
> +  else
> +    gimple_seq_add_stmt (&stmts, stmt);
>    gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs 
> (stmt));
> -  gimple_seq_add_stmt (&stmts, cvt);
> +  if (in_place)
> +    {
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      gsi_insert_after (&gsi, cvt, GSI_SAME_STMT);
> +      update_stmt (stmt);
> +    }
> +  else
> +    gimple_seq_add_stmt (&stmts, cvt);
>  
>    return stmts;
>  }
> --- gcc/tree-ssa-reassoc.cc.jj        2022-01-26 11:57:37.106698099 +0100
> +++ gcc/tree-ssa-reassoc.cc   2022-01-26 18:59:55.993027492 +0100
> @@ -2768,7 +2768,49 @@ update_range_test (struct range_entry *r
>                  vec<operand_entry *> *ops, tree exp, gimple_seq seq,
>                  bool in_p, tree low, tree high, bool strict_overflow_p)
>  {
> -  operand_entry *oe = (*ops)[range->idx];
> +  unsigned int idx = range->idx;
> +  struct range_entry *swap_with = NULL;
> +  basic_block rewrite_bb_first = NULL, rewrite_bb_last = NULL;
> +  if (opcode == ERROR_MARK)
> +    {
> +      /* For inter-bb range test optimization, pick from the range tests
> +      the one which is tested in the earliest condition (one dominating
> +      the others), because otherwise there could be some UB (e.g. signed
> +      overflow) in following bbs that we'd expose which wasn't there in
> +      the original program.  See PR104196.  */
> +      basic_block orig_range_bb = BASIC_BLOCK_FOR_FN (cfun, (*ops)[idx]->id);
> +      basic_block range_bb = orig_range_bb;
> +      for (unsigned int i = 0; i < count; i++)
> +     {
> +       struct range_entry *this_range;
> +       if (otherrange)
> +         this_range = otherrange + i;
> +       else
> +         this_range = otherrangep[i];
> +       operand_entry *oe = (*ops)[this_range->idx];
> +       basic_block this_bb = BASIC_BLOCK_FOR_FN (cfun, oe->id);
> +       if (range_bb != this_bb
> +           && dominated_by_p (CDI_DOMINATORS, range_bb, this_bb))
> +         {
> +           swap_with = this_range;
> +           range_bb = this_bb;
> +           idx = this_range->idx;
> +         }
> +     }
> +      /* If seq is non-NULL, it can contain statements that use SSA_NAMEs
> +      only defined in later blocks.  In this case we can't move the
> +      merged comparison earlier, so instead check if there are any stmts
> +      that might trigger signed integer overflow in between and rewrite
> +      them.  But only after we check if the optimization is possible.  */
> +      if (seq && swap_with)
> +     {
> +       rewrite_bb_first = range_bb;
> +       rewrite_bb_last = orig_range_bb;
> +       idx = range->idx;
> +       swap_with = NULL;
> +     }
> +    }
> +  operand_entry *oe = (*ops)[idx];
>    tree op = oe->op;
>    gimple *stmt = op ? SSA_NAME_DEF_STMT (op)
>                   : last_stmt (BASIC_BLOCK_FOR_FN (cfun, oe->id));
> @@ -2805,6 +2847,9 @@ update_range_test (struct range_entry *r
>       return false;
>      }
>  
> +  if (swap_with)
> +    std::swap (range->idx, swap_with->idx);
> +
>    if (strict_overflow_p && issue_strict_overflow_warning (wc))
>      warning_at (loc, OPT_Wstrict_overflow,
>               "assuming signed overflow does not occur "
> @@ -2839,6 +2884,42 @@ update_range_test (struct range_entry *r
>        fprintf (dump_file, "\n");
>      }
>  
> +  /* In inter-bb range optimization mode, if we have a seq, we can't
> +     move the merged comparison to the earliest bb from the comparisons
> +     being replaced, so instead rewrite stmts that could trigger signed
> +     integer overflow.  */
> +  for (basic_block bb = rewrite_bb_last;
> +       bb != rewrite_bb_first; bb = single_pred (bb))
> +    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> +      !gsi_end_p (gsi); gsi_next (&gsi))
> +      {
> +     gimple *stmt = gsi_stmt (gsi);
> +     if (is_gimple_assign (stmt))
> +       if (tree lhs = gimple_assign_lhs (stmt))
> +         if ((INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +              || POINTER_TYPE_P (TREE_TYPE (lhs)))
> +             && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (lhs)))
> +           {
> +             enum tree_code code = gimple_assign_rhs_code (stmt);
> +             if (arith_code_with_undefined_signed_overflow (code))
> +               {
> +                 gimple_stmt_iterator gsip = gsi;
> +                 gimple_stmt_iterator gsin = gsi;
> +                 gsi_prev (&gsip);
> +                 gsi_next (&gsin);
> +                 rewrite_to_defined_overflow (stmt, true);
> +                 unsigned uid = gimple_uid (stmt);
> +                 if (gsi_end_p (gsip))
> +                   gsip = gsi_after_labels (bb);
> +                 else
> +                   gsi_next (&gsip);
> +                 for (; gsi_stmt (gsip) != gsi_stmt (gsin);
> +                      gsi_next (&gsip))
> +                   gimple_set_uid (gsi_stmt (gsip), uid);
> +               }
> +           }
> +      }
> +
>    if (opcode == BIT_IOR_EXPR
>        || (opcode == ERROR_MARK && oe->rank == BIT_IOR_EXPR))
>      tem = invert_truthvalue_loc (loc, tem);
> @@ -4755,7 +4836,7 @@ maybe_optimize_range_tests (gimple *stmt
>             && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
>                 != tcc_comparison)
>             && !get_ops (rhs, code, &ops,
> -                     loop_containing_stmt (stmt))
> +                        loop_containing_stmt (stmt))
>             && has_single_use (rhs))
>           {
>             /* Otherwise, push the _234 range test itself.  */
> @@ -4792,6 +4873,8 @@ maybe_optimize_range_tests (gimple *stmt
>             bb_ent.op = rhs;
>           }
>         bbinfo.safe_push (bb_ent);
> +       for (unsigned int i = bb_ent.first_idx; i < bb_ent.last_idx; ++i)
> +         ops[i]->id = bb->index;
>         continue;
>       }
>        else if (bb == last_bb)
> @@ -4855,6 +4938,8 @@ maybe_optimize_range_tests (gimple *stmt
>         bb_ent.last_idx = ops.length ();
>       }
>        bbinfo.safe_push (bb_ent);
> +      for (unsigned int i = bb_ent.first_idx; i < bb_ent.last_idx; ++i)
> +     ops[i]->id = bb->index;
>        if (bb == first_bb)
>       break;
>      }
> --- gcc/testsuite/gcc.c-torture/execute/pr104196.c.jj 2022-01-26 
> 14:19:37.449144097 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr104196.c    2022-01-26 
> 14:19:37.449144097 +0100
> @@ -0,0 +1,19 @@
> +/* PR tree-optimization/104196 */
> +
> +int a = 6;
> +
> +int
> +main ()
> +{
> +  while (1)
> +    {
> +      int b = a < 0 && 0 < -__INT_MAX__ - a ? 0 : a;
> +      if (b != 4096 - __INT_MAX__)
> +     {
> +       if (a < 6)
> +         __builtin_abort ();
> +       break;
> +     }
> +    }
> +  return 0;
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Reply via email to