On 26/04/13 16:27, Tom de Vries wrote:
> On 25/04/13 16:19, Richard Biener wrote:
> <SNIP>
>> and compared to the previous patch changed the tree-ssa-tailmerge.c
>> part to deal with merging of loop latch and loop preheader (even
>> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
>> Any suggestion on how to improve that part welcome.
<SNIP>
> So I think this is really a cornercase, and we should disregard it if that 
> makes
> things simpler.
> 
> Rather than fixing up the loop structure, we could prevent tail-merge in these
> cases.
> 
> The current fix tests for current_loops == NULL, and I'm not sure that can 
> still
> happen there, given that we have PROP_loops.

Richard,

I've found that it happens in these g++ test-cases:
  g++.dg/ext/mv1.C
  g++.dg/ext/mv12.C
  g++.dg/ext/mv2.C
  g++.dg/ext/mv5.C
  g++.dg/torture/covariant-1.C
  g++.dg/torture/pr43068.C
  g++.dg/torture/pr47714.C
This seems rare enough to just bail out of tail-merge in those cases.

> It's not evident to me that the test bb2->loop_father->latch == bb2 is
> sufficient. Before calling tail_merge_optimize, we call 
> loop_optimizer_finalize
> in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
> theory we might miss some latches.
> 
> But I guess that pre (having started out with simple latches) maintains simple
> latches throughout, and that tail-merge does the same.

I've added a comment related to this in the patch.

Bootstrapped and reg-tested (ada inclusive) on x86_64.

OK for trunk?

Thanks,
- Tom

2013-04-28  Tom de Vries  <t...@codesourcery.com>

        * tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs.
        (replace_block_by): Don't set LOOPS_NEED_FIXUP.
        (tail_merge_optimize): Handle current_loops == NULL.

        * gcc.dg/pr50763.c: Update test.

diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index f2ab7444..5467ae5 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -689,7 +689,15 @@ find_same_succ_bb (basic_block bb, same_succ *same_p)
   edge_iterator ei;
   edge e;
 
-  if (bb == NULL)
+  if (bb == NULL
+      /* Be conservative with loop structure.  It's not evident that this test
+	 is sufficient.  Before tail-merge, we've just called
+	 loop_optimizer_finalize, and LOOPS_MAY_HAVE_MULTIPLE_LATCHES is now
+	 set, so there's no guarantee that the loop->latch value is still valid.
+	 But we assume that, since we've forced LOOPS_HAVE_SIMPLE_LATCHES at the
+	 start of pre, we've kept that property intact throughout pre, and are
+	 keeping it throughout tail-merge using this test.  */
+      || bb->loop_father->latch == bb)
     return;
   bitmap_set_bit (same->bbs, bb->index);
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -1460,17 +1468,6 @@ replace_block_by (basic_block bb1, basic_block bb2)
   /* Mark the basic block as deleted.  */
   mark_basic_block_deleted (bb1);
 
-  /* ???  If we merge the loop preheader with the loop latch we are creating
-     additional entries into the loop, eventually rotating it.
-     Mark loops for fixup in this case.
-     ???  This is a completely unwanted transform and will wreck most
-     loops at this point - but with just not considering loop latches as
-     merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
-     A better fix to avoid that regression is needed.  */
-  if (current_loops
-      && bb2->loop_father->latch == bb2)
-    loops_state_set (LOOPS_NEED_FIXUP);
-
   /* Redirect the incoming edges of bb1 to bb2.  */
   for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
     {
@@ -1612,7 +1609,19 @@ tail_merge_optimize (unsigned int todo)
   int iteration_nr = 0;
   int max_iterations = PARAM_VALUE (PARAM_MAX_TAIL_MERGE_ITERATIONS);
 
-  if (!flag_tree_tail_merge || max_iterations == 0)
+  if (!flag_tree_tail_merge
+      || max_iterations == 0
+      /* We try to be conservative with respect to loop structure, since:
+	 - the cases where tail-merging could both affect loop structure and be
+	   benificial are rare,
+	 - it prevents us from having to fixup the loops using
+	   loops_state_set (LOOPS_NEED_FIXUP), and
+	 - keeping loop structure may allow us to simplify the pass.
+	 In order to be conservative, we need loop information.	 In rare cases
+	 (about 7 test-cases in the g++ testsuite) there is none (because
+	 loop_optimizer_finalize has been called before tail-merge, and
+	 PROP_loops is not set), so we bail out.  */
+      || current_loops == NULL)
     return 0;
 
   timevar_push (TV_TREE_TAIL_MERGE);
diff --git a/gcc/testsuite/gcc.dg/pr50763.c b/gcc/testsuite/gcc.dg/pr50763.c
index 60025e3..695b61c 100644
--- a/gcc/testsuite/gcc.dg/pr50763.c
+++ b/gcc/testsuite/gcc.dg/pr50763.c
@@ -12,5 +12,5 @@ foo (int c, int d)
   while (c == d);
 }
 
-/* { dg-final { scan-tree-dump-times "== 33" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "== 33" 2 "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */

Reply via email to