On Mon, 16 Aug 2021, Xiong Hu Luo wrote: > It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for > nested loops. inn_loop is updated to inner loop, so it need be restored > when exiting from innermost loop. With this patch, the store instruction > in outer loop could also be moved out of outer loop by store motion. > Any comments? Thanks.
> gcc/ChangeLog: > > * tree-ssa-loop-im.c (fill_always_executed_in_1): Restore > inn_loop when exiting from innermost loop. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-lim-19.c: New test. > --- > gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 24 ++++++++++++++++++++++ > gcc/tree-ssa-loop-im.c | 6 +++++- > 2 files changed, 29 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c > new file mode 100644 > index 00000000000..097a5ee4a4b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c > @@ -0,0 +1,24 @@ > +/* PR/101293 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-lim2-details" } */ > + > +struct X { int i; int j; int k;}; > + > +void foo(struct X *x, int n, int l) > +{ > + for (int j = 0; j < l; j++) > + { > + for (int i = 0; i < n; ++i) > + { > + int *p = &x->j; > + int tem = *p; > + x->j += tem * i; > + } > + int *r = &x->k; > + int tem2 = *r; > + x->k += tem2 * j; > + } > +} > + > +/* { dg-final { scan-tree-dump-times "Executing store motion" 2 "lim2" } } */ > + > diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c > index b24bc64f2a7..5ca4738b20e 100644 > --- a/gcc/tree-ssa-loop-im.c > +++ b/gcc/tree-ssa-loop-im.c > @@ -3211,6 +3211,10 @@ fill_always_executed_in_1 (class loop *loop, sbitmap > contains_call) > if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) > last = bb; > > + if (inn_loop != loop > + && flow_loop_nested_p (bb->loop_father, inn_loop)) > + inn_loop = bb->loop_father; > + The comment says /* In a loop that is always entered we may proceed anyway. But record that we entered it and stop once we leave it. */ inn_loop = bb->loop_father; and your change would defeat that early return, no? > if (bitmap_bit_p (contains_call, bb->index)) > break; > > @@ -3238,7 +3242,7 @@ fill_always_executed_in_1 (class loop *loop, sbitmap > contains_call) > > if (bb->loop_father->header == bb) > { > - if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) > + if (!dominated_by_p (CDI_DOMINATORS, bb->loop_father->latch, bb)) > break; That's now a always false condition - a loops latch is always dominated by its header. The condition as written tries to verify whether the loop is always entered - mind we visit all blocks, not only those always executed. In fact for your testcase the x->j ref is _not_ always executed since the inner loop is conditional on n > 0. Richard.