On 23/11/15 12:31, Richard Biener wrote:
 From the dump below I understand you want no memory references in
> >the outer loop?
> >So the issue seems to be that store motion fails
> >to insert the preheader load / exit store to the outermost loop
> >possible and thus another LIM pass is needed to "store motion" those
> >again?
>
>Yep.
>
> >  But a simple testcase
> >
> >int a;
> >int *p = &a;
> >int foo (int n)
> >{
> >    for (int i = 0; i < n; ++i)
> >      for (int j = 0; j < 100; ++j)
> >        *p += j + i;
> >    return a;
> >}
> >
> >shows that LIM can do this in one step.
>
>I've filed a FTR PR68465 - "pass_lim doesn't detect identical loop entry
>conditions" for a test-case where that doesn't happen (when using
>-fno-tree-dominator-opts).
>
> >Which means it should
> >be investigated why it doesn't do this properly for your testcase
> >(store motion of *_25).
>
>There seems to be two related problems:
>1. the store has tree_could_trap_p (ref->mem.ref) true, which should be
>    false. I'll work on a fix for this.
>2. Give that the store can trap, I  was running into PR68465. I managed
>    to eliminate the 2nd pass_lim by moving the pass_dominator instance
>    before the pass_lim instance.
>
>Attached patch shows the pass group with only one pass_lim. I hope to be able
>to eliminate the first pass_dominator instance before pass_lim once I fix 1.
>
> >Simply adding two LIM passes either papers over a wrong-code
> >bug (in LIM or in DOM) or over a missed-optimization in LIM.
>
>AFAIU now, it's PR68465, a missed optimization in LIM.
Ok, it's not really LIMs job to cleanup loop header copying that way.

DOM performs jump-threading for this but FRE should also be able
to handle this just fine.  Ah, it doesn't because the outer loop
header directly contains the condition

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 230737)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -4357,20 +4402,32 @@ sccvn_dom_walker::before_dom_children (b

    /* If we have a single predecessor record the equivalence from a
       possible condition on the predecessor edge.  */
-  if (single_pred_p (bb))
+  edge pred_e = NULL;
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      if (e->flags & EDGE_DFS_BACK)
+       continue;
+      if (! pred_e)
+       pred_e = e;
+      else
+       {
+         pred_e = NULL;
+         break;
+       }
+    }
+  if (pred_e)
      {
-      edge e = single_pred_edge (bb);
        /* Check if there are multiple executable successor edges in
          the source block.  Otherwise there is no additional info
          to be recorded.  */
        edge e2;
-      FOR_EACH_EDGE (e2, ei, e->src->succs)
-       if (e2 != e
+      FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
+       if (e2 != pred_e
             && e2->flags & EDGE_EXECUTABLE)
           break;
        if (e2 && (e2->flags & EDGE_EXECUTABLE))
         {
-         gimple *stmt = last_stmt (e->src);
+         gimple *stmt = last_stmt (pred_e->src);
           if (stmt
               && gimple_code (stmt) == GIMPLE_COND)
             {
@@ -4378,11 +4435,11 @@ sccvn_dom_walker::before_dom_children (b
               tree lhs = gimple_cond_lhs (stmt);
               tree rhs = gimple_cond_rhs (stmt);
               record_conds (bb, code, lhs, rhs,
-                           (e->flags & EDGE_TRUE_VALUE) != 0);
+                           (pred_e->flags & EDGE_TRUE_VALUE) != 0);
               code = invert_tree_comparison (code, HONOR_NANS (lhs));
               if (code != ERROR_MARK)
                 record_conds (bb, code, lhs, rhs,
-                             (e->flags & EDGE_TRUE_VALUE) == 0);
+                             (pred_e->flags & EDGE_TRUE_VALUE) == 0);
             }
         }
      }

fixes this for me (for a small testcase).  Does it help yours?


Yes, it has the desired effect (of not needing pass_dominator before pass_lim) . But, patch "Mark by_ref mem_ref in build_receiver_ref as non-trapping" committed as r230738, also has that effect, so AFAIU I don't require this tree-ssa-sccvn.c fix.

Thanks,
- Tom

Otherwise untested of course (I hope EDGE_DFS_BACK is good enough,
it's supposed to match edges that have the src dominated by the dest).
Testing the above now.

Reply via email to