For the loop in the testcase we currently fail to hoist the guard
check of the inner loop (m > 0) out of the outer loop because
find_loop_guard checks all blocks of the outer loop for side-effects,
including those that are skipped by the guard.  This is usually
harmless because the guard does not skip any blocks in the outer loop,
but in this case store motion was applied to the inner loop and thus
there is now a skipped store in the outer loop.

The following makes the side-effect check skip blocks that are
dominated by the entry to the skipped region, that is, by the
destination of the edge that is not the guard edge.
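
As a rough source-level illustration (not actual GIMPLE, and not part of
the patch), the situation can be sketched as below.  The function names,
the temporary tmp and the helper same_result are hypothetical and only
serve the illustration; the shape after store motion is an assumption
about what the pass produces for this testcase.

```c
#include <string.h>

/* The loop nest from the testcase, as written.  */
static void f_orig (int n, int m, double *a)
{
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      a[i] += 2*a[i] + j;
}

/* Assumed shape after store motion on the inner loop: the guard
   (m > 0) sits inside the outer loop and skips the load, the inner
   loop and the store to a[i].  That skipped store is the side-effect
   that used to block hoisting.  */
static void f_store_motion (int n, int m, double *a)
{
  for (int i = 0; i < n; i++)
    if (m > 0)
      {
        double tmp = a[i];
        for (int j = 0; j < m; j++)
          tmp += 2*tmp + j;
        a[i] = tmp;
      }
}

/* Shape after hoisting the guard out of the outer loop.  */
static void f_guard_hoisted (int n, int m, double *a)
{
  if (m > 0)
    for (int i = 0; i < n; i++)
      {
        double tmp = a[i];
        for (int j = 0; j < m; j++)
          tmp += 2*tmp + j;
        a[i] = tmp;
      }
}

/* Hypothetical helper: returns 1 if all three variants produce
   bit-identical arrays for n <= 8 elements, 0 otherwise.  Small
   integer inputs keep the arithmetic exact.  */
static int same_result (int n, int m)
{
  double x[8], y[8], z[8];
  if (n > 8)
    return 0;
  for (int i = 0; i < 8; i++)
    x[i] = y[i] = z[i] = i + 1;
  f_orig (n, m, x);
  f_store_motion (n, m, y);
  f_guard_hoisted (n, m, z);
  return memcmp (x, y, sizeof x) == 0 && memcmp (x, z, sizeof x) == 0;
}
```

The only blocks with side-effects in f_store_motion are the ones the
guard skips, so moving the guard in front of the outer loop does not
change behavior, including for m <= 0.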

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

        PR tree-optimization/117510
        * tree-ssa-loop-unswitch.cc (find_loop_guard): Only check
        not skipped blocks for side-effects.

        * gcc.dg/vect/vect-outer-pr117510.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/vect-outer-pr117510.c | 13 +++++++++++++
 gcc/tree-ssa-loop-unswitch.cc                   |  6 +++++-
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-outer-pr117510.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-pr117510.c b/gcc/testsuite/gcc.dg/vect/vect-outer-pr117510.c
new file mode 100644
index 00000000000..e50b67ce040
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-pr117510.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-O3" } */
+
+void f(int n, int m, double *a)
+{
+  a = __builtin_assume_aligned (a, __BIGGEST_ALIGNMENT__);
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < m; j++)
+      a[i] += 2*a[i] + j;
+}
+
+/* { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index 847f7ac739f..88516fdb0a1 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -1256,7 +1256,11 @@ find_loop_guard (class loop *loop, vec<gimple *> &dbg_to_reset)
          guard_edge = NULL;
          goto end;
        }
-      if (!empty_bb_without_guard_p (loop, bb, dbg_to_reset))
+      /* If any of the not skipped blocks has side-effects or defs with
+        uses outside of the loop we cannot hoist the guard.  */
+      if (!dominated_by_p (CDI_DOMINATORS,
+                          bb, guard_edge == te ? fe->dest : te->dest)
+         && !empty_bb_without_guard_p (loop, bb, dbg_to_reset))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
-- 
2.43.0
