On Fri, Jul 8, 2016 at 4:07 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Hi Richard, > > Thanks for your help - your patch looks much better. > Here is new patch in which additional argument was added to determine > source loop of reference. > > Bootstrap and regression testing did not show any new failures. > > Is it OK for trunk?
Yes. Thanks, Richard. > ChangeLog: > 2016-07-08 Yuri Rumyantsev <ysrum...@gmail.com> > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which > contains REF, use it to check safelen, assume that safelen value > must be greater 1, fix style. > (ref_indep_loop_p_2): Add REF_LOOP argument. > (ref_indep_loop_p): Pass LOOP as additional argument to > ref_indep_loop_p_2. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. > > 2016-07-08 11:18 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> I checked simd3.f90 and found out that my additional check reject >>> independence of references >>> >>> REF is independent in loop#3 >>> .istart0.19, .iend0.20 >>> which are defined in loop#1 which is outer for loop#3. >>> Note that these references are defined by >>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >>> which is in loop#1. >>> It is clear that both these references can not be independent for loop#3. >> >> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops >> of LOOP to catch memory references in those as well. So the issue is really >> that we look at the wrong loop for safelen and we _do_ want to apply safelen >> to inner loops as well. >> >> So better track the loop we are ultimately asking the question for, like in >> the >> attached patch (fixes the testcase for me). >> >> Richard. >> >> >> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>>> I Added this check because of new failures in libgomp.fortran suite. >>>>> Here is copy of Jakub message: >>>>> --- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> --- >>>>> The #c27 r237844 change looks bogus to me. >>>>> First of all, IMNSHO you can argue this way only if ref is a reference >>>>> seen in >>>>> loop LOOP, >>>> >>>> or inner loops of LOOP I guess. I _think_ we never call >>>> ref_indep_loop_p_1 with >>>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not >>>> make >>>> sense to do that, it would be a waste of time). >>>> >>>> So only if "or inner loops of LOOP" is not correct the check would be >>>> needed >>>> but then my issue with unrolling an inner loop and turning a ref that >>>> safelen >>>> does not apply to into a ref that it now applies to arises. >>>> >>>> I don't fully get what Jakub is hinting at. >>>> >>>> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you >>>> explain that bitmap check with a simple testcase? >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 >>>>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - >>>>> the >>>>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer >>>>> loop >>>>> obviously can be dependent on many of the loads and/or stores in the >>>>> loop, be >>>>> it "omp simd array" or not. >>>>> Say for >>>>> void >>>>> foo (int *p, int *q) >>>>> { >>>>> #pragma omp simd >>>>> for (int i = 0; i < 1024; i++) >>>>> p[i] += q[0]; >>>>> } >>>>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could >>>>> write >>>>> something that changes its value, and then it would behave differently >>>>> from >>>>> using VF = 1024, where everything is performed in parallel. >>>>> Though, actually, it can alias, just it would have to write the same >>>>> value as >>>>> was there. So, if this is used to determine if it is safe to hoist the >>>>> load >>>>> before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] >>>>> && >>>>> &q[0] <= &p[1023], then it is not fine. >>>>> >>>>> For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a >>>>> valid program. #pragma omp simd I think guarantees that the last >>>>> iteration is >>>>> executed last, it isn't necessarily executed last alone, it could be, or >>>>> together with one before last iteration, or (for simdlen INT_MAX) even all >>>>> iterations can be done concurrently, in hw or sw, so it is fine if it is >>>>> transformed into: >>>>> int temp[1024], temp2[1024], temp3[1024]; >>>>> for (int i = 0; i < 1024; i++) >>>>> temp[i] = p[i]; >>>>> for (int i = 0; i < 1024; i++) >>>>> temp2[i] = q[0]; >>>>> /* The above two loops can be also swapped, or intermixed. */ >>>>> for (int i = 0; i < 1024; i++) >>>>> temp3[i] = temp[i] + temp2[i]; >>>>> for (int i = 0; i < 1024; i++) >>>>> p[i] = temp3[i]; >>>>> /* Or the above loop reversed etc. */ >>>>> >>>>> If you have: >>>>> int >>>>> bar (int *p, int *q) >>>>> { >>>>> q[0] = 0; >>>>> #pragma omp simd >>>>> for (int i = 0; i < 1024; i++) >>>>> p[i]++; >>>>> return q[0]; >>>>> } >>>>> i.e. something similar to what misbehaves in simd3.f90 with the change, >>>>> then >>>>> the answer is that q[0] isn't guaranteed to be independent of any >>>>> references in >>>>> the simd loop. >>>>> >>>>> 2016-07-07 16:57 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>> On Thu, Jul 7, 2016 at 3:24 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> wrote: >>>>>>> Richard, >>>>>>> >>>>>>> Did you meas the following check enhancement: >>>>>>> >>>>>>> outer = loop_outer (loop); >>>>>>> /* We consider REF defined in LOOP as independent if at least 2 loop >>>>>>> iterations are not dependent. */ >>>>>>> if ((loop->safelen > 1 >>>>>>> || (outer && outer->safelen > 1)) >>>>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>>>> ref->id)) >>>>>>> return true; >>>>>>> which allows us to consider reference x[0] as invariant for the test >>>>>>> #pragma omp simd >>>>>>> for (i = 0; i< 100; i++) >>>>>>> { >>>>>>> y[i] = i * 2; >>>>>>> for (j = 0; j < 100; j++) >>>>>>> z[j] += x[0]; >>>>>>> } >>>>>>> >>>>>>> Moving statement >>>>>>> _9 = *x_19(D); >>>>>>> (cost 20) out of loop 1. >>>>>> >>>>>> outer loops will be automatically considered by outermost_indep_loop >>>>>> which eventually >>>>>> calls the function you are patching. >>>>>> >>>>>> I was questioning the added test >>>>>> >>>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>>> ref->id)) >>>>>> >>>>>> and still do not understand why you need that. It makes us only >>>>>> consider the >>>>>> loop of ref itself when looking at safelen (but not refs that belong >>>>>> to inner loops >>>>>> of loop). >>>>>> >>>>>> Richard. >>>>>> >>>>>> >>>>>>> 2016-07-07 11:46 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>> On Wed, Jul 6, 2016 at 4:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> wrote: >>>>>>>>> Richard, >>>>>>>>> >>>>>>>>> I don't understand your question and how it is related to proposed >>>>>>>>> patch. >>>>>>>>> >>>>>>>>> 2016-07-06 16:55 GMT+03:00 Richard Biener >>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>> On Wed, Jul 6, 2016 at 3:50 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>> See for example Jakub comments for 70729. >>>>>>>>>> >>>>>>>>>> But how can the check be valid given a >>>>>>>>>> >>>>>>>>>> #pragma omp simd >>>>>>>>>> for (;;) >>>>>>>>>> { >>>>>>>>>> for (;;) >>>>>>>>>> ...ref in question... >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> can be transformed to >>>>>>>>>> >>>>>>>>>> #pragma omp simd >>>>>>>>>> for (;;) >>>>>>>>>> { >>>>>>>>>> ...ref in question... >>>>>>>>>> ...ref in question.. >>>>>>>>>> ... >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> by means of unrolling? >>>>>>>> >>>>>>>> The bitmap check would return false for the not unrolled inner loop >>>>>>>> and true for the inner unrolled loop. >>>>>>>> So it cannot be correct. >>>>>>>> >>>>>>>> (not sure why this is all off-list btw) >>>>>>>> >>>>>>>> Richard. >>>>>>>> >>>>>>>>>> Richard. >>>>>>>>>> >>>>>>>>>>> 2016-07-06 16:25 GMT+03:00 Richard Biener >>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:43 PM, Yuri Rumyantsev >>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>> I did not learn LIM detailed but I see on the following example >>>>>>>>>>>>> #pragma omp simd >>>>>>>>>>>>> for (i = 0; i< 100; i++) >>>>>>>>>>>>> { >>>>>>>>>>>>> y[i] = i * 2; >>>>>>>>>>>>> for (j = 0; j < 100; j++) >>>>>>>>>>>>> z[j] += x[0]; >>>>>>>>>>>>> } >>>>>>>>>>>>> that only inner,ost loop is handled: >>>>>>>>>>>>> >>>>>>>>>>>>> Memory reference in loop#2 1: *_7 >>>>>>>>>>>>> Memory reference in loop#2 2: *x_19(D) >>>>>>>>>>>>> Memory reference in loop#1 3: *_3 >>>>>>>>>>>>> Basic block 3 (loop 1 -- depth 1): >>>>>>>>>>>>> >>>>>>>>>>>>> Basic block 4 (loop 2 -- depth 2): >>>>>>>>>>>>> >>>>>>>>>>>>> Loop#2 is innermost >>>>>>>>>>>>> Querying dependency of refs 2 and 1: dependent. >>>>>>>>>>>>> Querying dependencies of ref 2 in loop 2: dependent >>>>>>>>>>>> >>>>>>>>>>>> well, LIM doesn't bother to check against refs in an outer loop if >>>>>>>>>>>> refs in the inner loop are already dependent. >>>>>>>>>>>> >>>>>>>>>>>>> Loop unroll of inner loop does not invalidate 'safelen' of outer >>>>>>>>>>>>> one >>>>>>>>>>>>> accordingly with safelen definition. >>>>>>>>>>>> >>>>>>>>>>>> So why do you need the bitmap_bit_p check then? >>>>>>>>>>>> >>>>>>>>>>>> Richard. >>>>>>>>>>>> >>>>>>>>>>>>> 2016-07-06 14:23 GMT+03:00 Richard Biener >>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:17 PM, Yuri Rumyantsev >>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> LIM phase considers only innermost loops and does not handle >>>>>>>>>>>>>>> loop >>>>>>>>>>>>>>> nests, so in your case the innermost loop (j) does not have >>>>>>>>>>>>>>> non-zero >>>>>>>>>>>>>>> safelen. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Huh? LIM considers loop nests just fine. And yes, the >>>>>>>>>>>>>> innermost loop >>>>>>>>>>>>>> does not have safelen but shouldn't it? Consider the inner loop >>>>>>>>>>>>>> being unrolled >>>>>>>>>>>>>> by GCC so it is removed - should that then invalidate the outer >>>>>>>>>>>>>> loop safelen >>>>>>>>>>>>>> because innermost loop refs now appear in the outer loop? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2016-07-06 13:34 GMT+03:00 Richard Biener >>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 11:50 AM, Yuri Rumyantsev >>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I pointed out in the commentary that REF is defined inside >>>>>>>>>>>>>>>>> loop and >>>>>>>>>>>>>>>>> this check is related to this statement. Should I clarify it? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> + /* We consider REF defined in LOOP as independent if at >>>>>>>>>>>>>>>>> least 2 loop >>>>>>>>>>>>>>>>> + iterations are not dependent. */ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes, I fail to see why x[0] should not be disambiguated >>>>>>>>>>>>>>>> against y[i] in >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> #pragma GCC loop ivdep >>>>>>>>>>>>>>>> for (i...) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> y[i] = ...; >>>>>>>>>>>>>>>> for (j...) >>>>>>>>>>>>>>>> ... = x[0]; >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> REF is always inside the loop nest with LOOP being the >>>>>>>>>>>>>>>> outermost loop. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2016-07-06 12:38 GMT+03:00 Richard Biener >>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>>> On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev >>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Here is a simple fix to cure regressions introduced by my >>>>>>>>>>>>>>>>>>> fix for >>>>>>>>>>>>>>>>>>> 70729. Patch also contains minor changes in test found by >>>>>>>>>>>>>>>>>>> Jakub. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Bootstrapping and regression testing did not show any new >>>>>>>>>>>>>>>>>>> failures. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Is it OK for trunk? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> + && bitmap_bit_p >>>>>>>>>>>>>>>>>> (&memory_accesses.refs_in_loop[loop->num], ref->id)) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So safelen does not apply to refs in nested loops? The >>>>>>>>>>>>>>>>>> added comment only >>>>>>>>>>>>>>>>>> explains the safelen check change but this also requires >>>>>>>>>>>>>>>>>> explanation. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ChangeLog: >>>>>>>>>>>>>>>>>>> 2016-07-05 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF >>>>>>>>>>>>>>>>>>> defined in >>>>>>>>>>>>>>>>>>> LOOP as independent if at least two loop iterations are not >>>>>>>>>>>>>>>>>>> dependent. >>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>>>>>>>> * g++.dg/vect/pr70729.cc: Delete redundant dg >>>>>>>>>>>>>>>>>>> options, fix style.