Hi!

The PR38359 change made the -1 >> x to -1 optimization less useful by
requiring that x be non-negative.
Shifts by a negative amount are UB, but for historic reasons we had in some
(but not all) places a hack that treated a shift by a negative amount as a
shift in the other direction by the negated amount.
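
For reference, the simplification in question as a minimal sketch (my
illustration, not part of the patch):

int
foo (int x)
{
  /* A negative x is UB, so the C/C++ rules would allow folding this to
     -1 unconditionally.  But with the old hack, constant folding of
     e.g. -1 >> -3 as -1 << 3 == -8 would have contradicted that, hence
     the tree_expr_nonnegative_p (@1) guard this patch removes.  */
  return -1 >> x;
}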

The following patch just removes that special handling; instead we punt on
optimizing those (and ideally path isolation should catch those and turn
them into __builtin_unreachable, perhaps with __builtin_warning next to
it).  Folding the shifts in some places as if they were rotates and in
others as if they were saturating just leads to inconsistencies.
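
To make the inconsistency concrete, here is a hedged sketch of the two
interpretations (hypothetical helpers, not actual GCC code; 32-bit int
assumed):

/* Direction-swapping ("rotate-like"): a negative count shifts the
   other way.  */
static int
rshift_swap (int v, int n)
{
  return n < 0 ? v << -n : v >> n;
}

/* Saturating: an out-of-range count shifts everything out.  */
static int
rshift_sat (int v, int n)
{
  return (n < 0 || n > 31) ? (v < 0 ? -1 : 0) : v >> n;
}

/* rshift_swap (1, -2) == 4 but rshift_sat (1, -2) == 0, so the same
   (undefined) input folded differently depending on which spot in the
   compiler saw it.  */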

For C++ constexpr diagnostics and -fpermissive, I've added code to pretend
fold-const.c has not changed; without -fpermissive it will be an error
anyway, and I think it is better not to change all the diagnostics.

During x86_64-linux and i686-linux bootstrap/regtest, my statistics
gathering patch noted 185 unique -m32/-m64 x TU x function_name x shift_kind
x fold-const/tree-ssa-ccp cases.  I have investigated the
64 ../../gcc/config/i386/i386.c x86_output_aligned_bss LSHIFT_EXPR wide_int_bitop
64 ../../gcc/config/i386/i386-expand.c emit_memmov LSHIFT_EXPR wide_int_bitop
64 ../../gcc/config/i386/i386-expand.c ix86_expand_carry_flag_compare LSHIFT_EXPR wide_int_bitop
64 ../../gcc/expmed.c expand_divmod LSHIFT_EXPR wide_int_bitop
64 ../../gcc/lra-lives.c process_bb_lives LSHIFT_EXPR wide_int_bitop
64 ../../gcc/rtlanal.c nonzero_bits1 LSHIFT_EXPR wide_int_bitop
64 ../../gcc/varasm.c optimize_constant_pool.isra LSHIFT_EXPR wide_int_bitop
cases and all of them are either during jump threading (dom) or during PRE.
For jump threading, the most common case is 1 << floor_log2 (whatever), where
floor_log2 is return HOST_BITS_PER_WIDE_INT - 1 - clz_hwi (x); and clz_hwi is
if (x == 0) return HOST_BITS_PER_WIDE_INT; return __builtin_clz* (x);
so floor_log2 has range [-1, 63], and the comparison against == 0 makes the
threader think it might be nice to jump thread the path leading to 1 << -1.
I think it is better to keep the 1 << -1 cases in the IL for this and let path
isolation turn them into __builtin_unreachable () if the user wishes so.
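
Spelled out, a self-contained sketch of those helpers (modeled on the
gcc/hwint.h definitions quoted above; HOST_BITS_PER_WIDE_INT == 64
assumed):

#define HOST_BITS_PER_WIDE_INT 64

static inline int
clz_hwi (unsigned long long x)
{
  if (x == 0)
    return HOST_BITS_PER_WIDE_INT;  /* 64 when x == 0.  */
  return __builtin_clzll (x);       /* [0, 63] otherwise.  */
}

static inline int
floor_log2 (unsigned long long x)
{
  /* 63 - 64 == -1 when x == 0, otherwise in [0, 63].  */
  return HOST_BITS_PER_WIDE_INT - 1 - clz_hwi (x);
}

Once the threader duplicates the x == 0 path, 1 << floor_log2 (x)
becomes a literal 1 << -1 there.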

An earlier version of this patch has been bootstrapped/regtested on
x86_64-linux and i686-linux; is this one ok for trunk if it passes the same
testing?

2020-11-23  Jakub Jelinek  <ja...@redhat.com>

        PR tree-optimization/96929
        * fold-const.c (wide_int_binop) <case LSHIFT_EXPR, case RSHIFT_EXPR>:
        Return false on negative second argument rather than trying to handle
        it as shift in the other direction.
        * tree-ssa-ccp.c (bit_value_binop) <case LSHIFT_EXPR,
        case RSHIFT_EXPR>: Punt on negative shift count rather than trying
        to handle it as shift in the other direction.
        * match.pd (-1 >> x to -1): Remove tree_expr_nonnegative_p check.

        * constexpr.c (cxx_eval_binary_expression): For shifts by constant
        with MSB set, emulate older wide_int_binop behavior to preserve
        diagnostics and -fpermissive behavior.

        * gcc.dg/tree-ssa/pr96929.c: New test.

--- gcc/fold-const.c.jj 2020-11-20 12:26:16.921936188 +0100
+++ gcc/fold-const.c    2020-11-23 12:42:13.784315448 +0100
@@ -992,26 +992,19 @@ wide_int_binop (wide_int &res,
       res = wi::bit_and (arg1, arg2);
       break;
 
-    case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
-       {
-         tmp = -arg2;
-         if (code == RSHIFT_EXPR)
-           code = LSHIFT_EXPR;
-         else
-           code = RSHIFT_EXPR;
-       }
-      else
-        tmp = arg2;
+       return false;
+      res = wi::lshift (arg1, arg2);
+      break;
 
-      if (code == RSHIFT_EXPR)
-       /* It's unclear from the C standard whether shifts can overflow.
-          The following code ignores overflow; perhaps a C standard
-          interpretation ruling is needed.  */
-       res = wi::rshift (arg1, tmp, sign);
-      else
-       res = wi::lshift (arg1, tmp);
+    case RSHIFT_EXPR:
+      if (wi::neg_p (arg2))
+       return false;
+      /* It's unclear from the C standard whether shifts can overflow.
+        The following code ignores overflow; perhaps a C standard
+        interpretation ruling is needed.  */
+      res = wi::rshift (arg1, arg2, sign);
       break;
 
     case RROTATE_EXPR:
--- gcc/tree-ssa-ccp.c.jj       2020-11-20 12:26:16.946935910 +0100
+++ gcc/tree-ssa-ccp.c  2020-11-23 12:42:13.785315436 +0100
@@ -1432,13 +1432,7 @@ bit_value_binop (enum tree_code code, si
          else
            {
              if (wi::neg_p (shift))
-               {
-                 shift = -shift;
-                 if (code == RSHIFT_EXPR)
-                   code = LSHIFT_EXPR;
-                 else
-                   code = RSHIFT_EXPR;
-               }
+               break;
              if (code == RSHIFT_EXPR)
                {
                  *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
--- gcc/match.pd.jj     2020-11-20 12:26:16.941935966 +0100
+++ gcc/match.pd        2020-11-23 12:42:13.809315167 +0100
@@ -2900,8 +2900,7 @@ (define_operator_list COND_TERNARY
 /* Optimize -1 >> x for arithmetic right shifts.  */
 (simplify
  (rshift integer_all_onesp@0 @1)
- (if (!TYPE_UNSIGNED (type)
-      && tree_expr_nonnegative_p (@1))
+ (if (!TYPE_UNSIGNED (type))
   @0))
 
 /* Optimize (x >> c) << c into x & (-1<<c).  */
--- gcc/cp/constexpr.c.jj       2020-11-22 19:11:44.053589081 +0100
+++ gcc/cp/constexpr.c  2020-11-23 13:08:16.815759331 +0100
@@ -3162,6 +3162,21 @@ cxx_eval_binary_expression (const conste
   if (r == NULL_TREE)
     r = fold_binary_loc (loc, code, type, lhs, rhs);
 
+  if (r == NULL_TREE
+      && (code == LSHIFT_EXPR || code == RSHIFT_EXPR)
+      && TREE_CODE (lhs) == INTEGER_CST
+      && TREE_CODE (rhs) == INTEGER_CST
+      && wi::neg_p (wi::to_wide (rhs)))
+    {
+      /* For diagnostics and -fpermissive emulate previous behavior of
+        handling shifts by negative amount.  */
+      tree nrhs = const_unop (NEGATE_EXPR, TREE_TYPE (rhs), rhs);
+      if (nrhs)
+       r = fold_binary_loc (loc,
+                            code == LSHIFT_EXPR ? RSHIFT_EXPR : LSHIFT_EXPR,
+                            type, lhs, nrhs);
+    }
+
   if (r == NULL_TREE)
     {
       if (lhs == orig_lhs && rhs == orig_rhs)
--- gcc/testsuite/gcc.dg/tree-ssa/pr96929.c.jj  2020-11-23 12:42:13.809315167 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr96929.c     2020-11-23 12:42:13.809315167 +0100
@@ -0,0 +1,21 @@
+/* PR tree-optimization/96929 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump "baz \\\(\\\);" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return -1;" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " >> " "optimized" } } */
+
+int baz (void);
+
+int
+foo (void)
+{
+  return -1 >> baz ();
+}
+
+int
+bar (int y)
+{
+  int z = -1;
+  return z >> y;
+}

        Jakub
