On 23/11/15 12:31, Richard Biener wrote:
From the dump below I understand you want no memory references in
> >the outer loop?
> >So the issue seems to be that store motion fails
> >to insert the preheader load / exit store to the outermost loop
> >possible and thus another LIM pass is needed to "store motion" those
> >again?
>
>Yep.
>
> > But a simple testcase
> >
> >int a;
> >int *p = &a;
> >int foo (int n)
> >{
> > for (int i = 0; i < n; ++i)
> > for (int j = 0; j < 100; ++j)
> > *p += j + i;
> > return a;
> >}
> >
> >shows that LIM can do this in one step.
>
>I've filed a FTR PR68465 - "pass_lim doesn't detect identical loop entry
>conditions" for a test-case where that doesn't happen (when using
>-fno-tree-dominator-opts).
>
> >Which means it should
> >be investigated why it doesn't do this properly for your testcase
> >(store motion of *_25).
>
>There seems to be two related problems:
>1. the store has tree_could_trap_p (ref->mem.ref) true, which should be
> false. I'll work on a fix for this.
>2. Give that the store can trap, I was running into PR68465. I managed
> to eliminate the 2nd pass_lim by moving the pass_dominator instance
> before the pass_lim instance.
>
>Attached patch shows the pass group with only one pass_lim. I hope to be able
>to eliminate the first pass_dominator instance before pass_lim once I fix 1.
>
> >Simply adding two LIM passes either papers over a wrong-code
> >bug (in LIM or in DOM) or over a missed-optimization in LIM.
>
>AFAIU now, it's PR68465, a missed optimization in LIM.
Ok, it's not really LIMs job to cleanup loop header copying that way.
DOM performs jump-threading for this but FRE should also be able
to handle this just fine. Ah, it doesn't because the outer loop
header directly contains the condition
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 230737)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -4357,20 +4402,32 @@ sccvn_dom_walker::before_dom_children (b
/* If we have a single predecessor record the equivalence from a
possible condition on the predecessor edge. */
- if (single_pred_p (bb))
+ edge pred_e = NULL;
+ FOR_EACH_EDGE (e, ei, bb->preds)
+ {
+ if (e->flags & EDGE_DFS_BACK)
+ continue;
+ if (! pred_e)
+ pred_e = e;
+ else
+ {
+ pred_e = NULL;
+ break;
+ }
+ }
+ if (pred_e)
{
- edge e = single_pred_edge (bb);
/* Check if there are multiple executable successor edges in
the source block. Otherwise there is no additional info
to be recorded. */
edge e2;
- FOR_EACH_EDGE (e2, ei, e->src->succs)
- if (e2 != e
+ FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
+ if (e2 != pred_e
&& e2->flags & EDGE_EXECUTABLE)
break;
if (e2 && (e2->flags & EDGE_EXECUTABLE))
{
- gimple *stmt = last_stmt (e->src);
+ gimple *stmt = last_stmt (pred_e->src);
if (stmt
&& gimple_code (stmt) == GIMPLE_COND)
{
@@ -4378,11 +4435,11 @@ sccvn_dom_walker::before_dom_children (b
tree lhs = gimple_cond_lhs (stmt);
tree rhs = gimple_cond_rhs (stmt);
record_conds (bb, code, lhs, rhs,
- (e->flags & EDGE_TRUE_VALUE) != 0);
+ (pred_e->flags & EDGE_TRUE_VALUE) != 0);
code = invert_tree_comparison (code, HONOR_NANS (lhs));
if (code != ERROR_MARK)
record_conds (bb, code, lhs, rhs,
- (e->flags & EDGE_TRUE_VALUE) == 0);
+ (pred_e->flags & EDGE_TRUE_VALUE) == 0);
}
}
}
fixes this for me (for a small testcase). Does it help yours?
Yes, it has the desired effect (of not needing pass_dominator before
pass_lim) . But, patch "Mark by_ref mem_ref in build_receiver_ref as
non-trapping" committed as r230738, also has that effect, so AFAIU I
don't require this tree-ssa-sccvn.c fix.
Thanks,
- Tom
Otherwise untested of course (I hope EDGE_DFS_BACK is good enough,
it's supposed to match edges that have the src dominated by the dest).
Testing the above now.