Excessive use of __builtin_assume_aligned can cause missed optimizations
because those calls act as propagation barriers.  The following removes
calls that are redundant and provide no extra information; on the
testcase this allows store-merging to apply.

Since the bit lattice and the const/copy lattice are merged,
we cannot track this during CCP propagation (by turning the
redundant call into a copy and propagating that out).  Thus I
apply it in the CCP-specific folding routine called during
substitute_and_fold only.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-08-20  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/91482
        * tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
        BUILT_IN_ASSUME_ALIGNED calls.

        * gcc.dg/tree-ssa/pr91482.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c  (revision 274670)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -2315,6 +2315,27 @@ ccp_folder::fold_stmt (gimple_stmt_itera
              }
           }
 
+       /* If there's no extra info from an assume_aligned call,
+          drop it so it doesn't act as an otherwise useless dataflow
+          barrier.  */
+       if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+         {
+           tree ptr = gimple_call_arg (stmt, 0);
+           ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+           ccp_prop_value_t val = bit_value_assume_aligned (stmt, NULL_TREE,
+                                                            ptrval, false);
+           unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+           unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+           if (ptralign == align
+               && ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+                   == (TREE_INT_CST_LOW (val.value) & (align - 1))))
+             {
+               bool res = update_call_from_tree (gsi, ptr);
+               gcc_assert (res);
+               return true;
+             }
+         }
+
        /* Propagate into the call arguments.  Compared to replace_uses_in
           this can use the argument slot types for type verification
           instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c     (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c     (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target lp64 } } } */
