On Tue, Jul 25, 2017 at 2:38 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Tue, Jul 25, 2017 at 12:48 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Mon, Jul 10, 2017 at 10:24 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Tue, Jun 27, 2017 at 11:49 AM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>> Hi,
>>>> This is a followup patch better handling below case:
>>>>     for (i = 0; i < n; i++)
>>>>       {
>>>>         a[i] = 1;
>>>>         a[i+2] = 2;
>>>>       }
>>>> Instead of generating root variables by loading from memory and
>>>> propagating with PHI nodes, like:
>>>>     t0 = a[0];
>>>>     t1 = a[1];
>>>>     for (i = 0; i < n; i++)
>>>>       {
>>>>         a[i] = 1;
>>>>         t2 = 2;
>>>>         t0 = t1;
>>>>         t1 = t2;
>>>>       }
>>>>     a[n] = t0;
>>>>     a[n+1] = t1;
>>>> We can simply store loop invariant values after loop body if we know
>>>> loop iterates more than chain->length times, like:
>>>>     for (i = 0; i < n; i++)
>>>>       {
>>>>         a[i] = 1;
>>>>       }
>>>>     a[n] = 2;
>>>>     a[n+1] = 2;
>>>>
>>>> Bootstrap(O2/O3) in patch series on x86_64 and AArch64.  Is it OK?
>>> Update patch wrto changes in previous patch.
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> +      if (TREE_CODE (val) == INTEGER_CST || TREE_CODE (val) == REAL_CST)
>> +       continue;
>>
>> Please use CONSTANT_CLASS_P (val) instead.  I suppose VECTOR_CST or
>> FIXED_CST would be ok as well for example.
>>
>> Ok with that change.  Did we eventually optimize this in followup
>> passes previously?
> Probably not?
> Given below test:
>
> int a[10000], b[10000], c[10000];
> int f(void)
> {
>   int i, n = 100;
>   int t0 = a[0];
>   int t1 = a[1];
>   for (i = 0; i < n; i++)
>     {
>       a[i] = 1;
>       int t2 = 2;
>       t0 = t1;
>       t1 = t2;
>     }
>   a[n] = t0;
>   a[n+1] = t1;
>   return 0;
> }
> The optimized dump is as:
>
>   <bb 2> [1.00%] [count: INV]:
>   t1_8 = a[1];
>   ivtmp.9_17 = (unsigned long) &a;
>   _16 = ivtmp.9_17 + 400;
>
>   <bb 3> [99.00%] [count: INV]:
>   # t1_20 = PHI <2(3), t1_8(2)>
>   # ivtmp.9_2 = PHI <ivtmp.9_1(3), ivtmp.9_17(2)>
>   _15 = (void *) ivtmp.9_2;
>   MEM[base: _15, offset: 0B] = 1;
>   ivtmp.9_1 = ivtmp.9_2 + 4;
>   if (ivtmp.9_1 != _16)
>     goto <bb 3>; [98.99%] [count: INV]
>   else
>     goto <bb 4>; [1.01%] [count: INV]
>
>   <bb 4> [1.00%] [count: INV]:
>   a[100] = t1_20;
>   a[101] = 2;
>   return 0;
>
> We now eliminate one phi and leave another behind.  It is vrp1/dce2
> when the phi is eliminated.

Ok, I see.  Maybe worth filing a missed optimization PR.

Richard.

> Thanks,
> bin
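
The store-sinking transformation discussed in this thread can be checked by hand with a small C sketch. This is not code from the patch: the helper names (store_in_loop, store_after_loop, same_result) and the buffer size are hypothetical, chosen only for illustration. It compares the original loop with the form where the invariant stores are placed after the loop, and shows why a minimum iteration count is needed.

```c
#include <assert.h>
#include <string.h>

#define N 100          /* largest iteration count tried; the thread's test uses n = 100 */
#define LEN (N + 2)    /* the loops touch at most a[0] .. a[N+1] */

/* The loop from the first message: two stores per iteration. */
static void store_in_loop(int *a, int n)
{
    for (int i = 0; i < n; i++) {
        a[i] = 1;
        a[i + 2] = 2;
    }
}

/* Hand-written equivalent of the proposed result: the invariant
   value 2 is stored once after the loop instead of every iteration. */
static void store_after_loop(int *a, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = 1;
    a[n] = 2;
    a[n + 1] = 2;
}

/* Returns 1 when both forms leave identical array contents for n iterations. */
static int same_result(int n)
{
    int x[LEN], y[LEN];

    assert(n >= 0 && n <= N);   /* keep all accesses inside the buffers */
    memset(x, 0, sizeof x);
    memset(y, 0, sizeof y);
    store_in_loop(x, n);
    store_after_loop(y, n);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling same_result (100) reports identical contents, while same_result (0) and same_result (1) do not, because the sunk stores write a[n] and a[n+1] even when the original loop never reached them. That is the reason the transformation is guarded by an iteration-count condition.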