Hi Richard, Thanks for your help - your patch looks much better. Here is new patch in which additional argument was added to determine source loop of reference.
Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-08 Yuri Rumyantsev <ysrum...@gmail.com> PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which contains REF, use it to check safelen, assume that safelen value must be greater 1, fix style. (ref_indep_loop_p_2): Add REF_LOOP argument. (ref_indep_loop_p): Pass LOOP as additional argument to ref_indep_loop_p_2. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. 2016-07-08 11:18 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: > On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >> I checked simd3.f90 and found out that my additional check reject >> independence of references >> >> REF is independent in loop#3 >> .istart0.19, .iend0.20 >> which are defined in loop#1 which is outer for loop#3. >> Note that these references are defined by >> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >> which is in loop#1. >> It is clear that both these references can not be independent for loop#3. > > Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops > of LOOP to catch memory references in those as well. So the issue is really > that we look at the wrong loop for safelen and we _do_ want to apply safelen > to inner loops as well. > > So better track the loop we are ultimately asking the question for, like in > the > attached patch (fixes the testcase for me). > > Richard. > > > >> 2016-07-07 17:11 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>> I Added this check because of new failures in libgomp.fortran suite. >>>> Here is copy of Jakub message: >>>> --- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> --- >>>> The #c27 r237844 change looks bogus to me. >>>> First of all, IMNSHO you can argue this way only if ref is a reference >>>> seen in >>>> loop LOOP, >>> >>> or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 >>> with >>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not >>> make >>> sense to do that, it would be a waste of time). >>> >>> So only if "or inner loops of LOOP" is not correct the check would be needed >>> but then my issue with unrolling an inner loop and turning a ref that >>> safelen >>> does not apply to into a ref that it now applies to arises. >>> >>> I don't fully get what Jakub is hinting at. >>> >>> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you >>> explain that bitmap check with a simple testcase? >>> >>> Thanks, >>> Richard. >>> >>>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 >>>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - >>>> the >>>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer >>>> loop >>>> obviously can be dependent on many of the loads and/or stores in the loop, >>>> be >>>> it "omp simd array" or not. >>>> Say for >>>> void >>>> foo (int *p, int *q) >>>> { >>>> #pragma omp simd >>>> for (int i = 0; i < 1024; i++) >>>> p[i] += q[0]; >>>> } >>>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could write >>>> something that changes its value, and then it would behave differently from >>>> using VF = 1024, where everything is performed in parallel. >>>> Though, actually, it can alias, just it would have to write the same value >>>> as >>>> was there. So, if this is used to determine if it is safe to hoist the >>>> load >>>> before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] >>>> && >>>> &q[0] <= &p[1023], then it is not fine. >>>> >>>> For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a >>>> valid program. #pragma omp simd I think guarantees that the last >>>> iteration is >>>> executed last, it isn't necessarily executed last alone, it could be, or >>>> together with one before last iteration, or (for simdlen INT_MAX) even all >>>> iterations can be done concurrently, in hw or sw, so it is fine if it is >>>> transformed into: >>>> int temp[1024], temp2[1024], temp3[1024]; >>>> for (int i = 0; i < 1024; i++) >>>> temp[i] = p[i]; >>>> for (int i = 0; i < 1024; i++) >>>> temp2[i] = q[0]; >>>> /* The above two loops can be also swapped, or intermixed. */ >>>> for (int i = 0; i < 1024; i++) >>>> temp3[i] = temp[i] + temp2[i]; >>>> for (int i = 0; i < 1024; i++) >>>> p[i] = temp3[i]; >>>> /* Or the above loop reversed etc. */ >>>> >>>> If you have: >>>> int >>>> bar (int *p, int *q) >>>> { >>>> q[0] = 0; >>>> #pragma omp simd >>>> for (int i = 0; i < 1024; i++) >>>> p[i]++; >>>> return q[0]; >>>> } >>>> i.e. something similar to what misbehaves in simd3.f90 with the change, >>>> then >>>> the answer is that q[0] isn't guaranteed to be independent of any >>>> references in >>>> the simd loop. >>>> >>>> 2016-07-07 16:57 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>> On Thu, Jul 7, 2016 at 3:24 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> Richard, >>>>>> >>>>>> Did you meas the following check enhancement: >>>>>> >>>>>> outer = loop_outer (loop); >>>>>> /* We consider REF defined in LOOP as independent if at least 2 loop >>>>>> iterations are not dependent. */ >>>>>> if ((loop->safelen > 1 >>>>>> || (outer && outer->safelen > 1)) >>>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>>> ref->id)) >>>>>> return true; >>>>>> which allows us to consider reference x[0] as invariant for the test >>>>>> #pragma omp simd >>>>>> for (i = 0; i< 100; i++) >>>>>> { >>>>>> y[i] = i * 2; >>>>>> for (j = 0; j < 100; j++) >>>>>> z[j] += x[0]; >>>>>> } >>>>>> >>>>>> Moving statement >>>>>> _9 = *x_19(D); >>>>>> (cost 20) out of loop 1. >>>>> >>>>> outer loops will be automatically considered by outermost_indep_loop >>>>> which eventually >>>>> calls the function you are patching. >>>>> >>>>> I was questioning the added test >>>>> >>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>> ref->id)) >>>>> >>>>> and still do not understand why you need that. It makes us only consider >>>>> the >>>>> loop of ref itself when looking at safelen (but not refs that belong >>>>> to inner loops >>>>> of loop). >>>>> >>>>> Richard. >>>>> >>>>> >>>>>> 2016-07-07 11:46 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>> On Wed, Jul 6, 2016 at 4:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> wrote: >>>>>>>> Richard, >>>>>>>> >>>>>>>> I don't understand your question and how it is related to proposed >>>>>>>> patch. >>>>>>>> >>>>>>>> 2016-07-06 16:55 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>>> On Wed, Jul 6, 2016 at 3:50 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> See for example Jakub comments for 70729. >>>>>>>>> >>>>>>>>> But how can the check be valid given a >>>>>>>>> >>>>>>>>> #pragma omp simd >>>>>>>>> for (;;) >>>>>>>>> { >>>>>>>>> for (;;) >>>>>>>>> ...ref in question... >>>>>>>>> } >>>>>>>>> >>>>>>>>> can be transformed to >>>>>>>>> >>>>>>>>> #pragma omp simd >>>>>>>>> for (;;) >>>>>>>>> { >>>>>>>>> ...ref in question... >>>>>>>>> ...ref in question.. >>>>>>>>> ... >>>>>>>>> } >>>>>>>>> >>>>>>>>> by means of unrolling? >>>>>>> >>>>>>> The bitmap check would return false for the not unrolled inner loop >>>>>>> and true for the inner unrolled loop. >>>>>>> So it cannot be correct. >>>>>>> >>>>>>> (not sure why this is all off-list btw) >>>>>>> >>>>>>> Richard. >>>>>>> >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>>> 2016-07-06 16:25 GMT+03:00 Richard Biener >>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>> On Wed, Jul 6, 2016 at 1:43 PM, Yuri Rumyantsev >>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>> I did not learn LIM detailed but I see on the following example >>>>>>>>>>>> #pragma omp simd >>>>>>>>>>>> for (i = 0; i< 100; i++) >>>>>>>>>>>> { >>>>>>>>>>>> y[i] = i * 2; >>>>>>>>>>>> for (j = 0; j < 100; j++) >>>>>>>>>>>> z[j] += x[0]; >>>>>>>>>>>> } >>>>>>>>>>>> that only inner,ost loop is handled: >>>>>>>>>>>> >>>>>>>>>>>> Memory reference in loop#2 1: *_7 >>>>>>>>>>>> Memory reference in loop#2 2: *x_19(D) >>>>>>>>>>>> Memory reference in loop#1 3: *_3 >>>>>>>>>>>> Basic block 3 (loop 1 -- depth 1): >>>>>>>>>>>> >>>>>>>>>>>> Basic block 4 (loop 2 -- depth 2): >>>>>>>>>>>> >>>>>>>>>>>> Loop#2 is innermost >>>>>>>>>>>> Querying dependency of refs 2 and 1: dependent. >>>>>>>>>>>> Querying dependencies of ref 2 in loop 2: dependent >>>>>>>>>>> >>>>>>>>>>> well, LIM doesn't bother to check against refs in an outer loop if >>>>>>>>>>> refs in the inner loop are already dependent. >>>>>>>>>>> >>>>>>>>>>>> Loop unroll of inner loop does not invalidate 'safelen' of outer >>>>>>>>>>>> one >>>>>>>>>>>> accordingly with safelen definition. >>>>>>>>>>> >>>>>>>>>>> So why do you need the bitmap_bit_p check then? >>>>>>>>>>> >>>>>>>>>>> Richard. >>>>>>>>>>> >>>>>>>>>>>> 2016-07-06 14:23 GMT+03:00 Richard Biener >>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:17 PM, Yuri Rumyantsev >>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>> >>>>>>>>>>>>>> LIM phase considers only innermost loops and does not handle loop >>>>>>>>>>>>>> nests, so in your case the innermost loop (j) does not have >>>>>>>>>>>>>> non-zero >>>>>>>>>>>>>> safelen. >>>>>>>>>>>>> >>>>>>>>>>>>> Huh? LIM considers loop nests just fine. And yes, the innermost >>>>>>>>>>>>> loop >>>>>>>>>>>>> does not have safelen but shouldn't it? Consider the inner loop >>>>>>>>>>>>> being unrolled >>>>>>>>>>>>> by GCC so it is removed - should that then invalidate the outer >>>>>>>>>>>>> loop safelen >>>>>>>>>>>>> because innermost loop refs now appear in the outer loop? >>>>>>>>>>>>> >>>>>>>>>>>>> Richard. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> 2016-07-06 13:34 GMT+03:00 Richard Biener >>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 11:50 AM, Yuri Rumyantsev >>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I pointed out in the commentary that REF is defined inside >>>>>>>>>>>>>>>> loop and >>>>>>>>>>>>>>>> this check is related to this statement. Should I clarify it? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> + /* We consider REF defined in LOOP as independent if at >>>>>>>>>>>>>>>> least 2 loop >>>>>>>>>>>>>>>> + iterations are not dependent. */ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes, I fail to see why x[0] should not be disambiguated against >>>>>>>>>>>>>>> y[i] in >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> #pragma GCC loop ivdep >>>>>>>>>>>>>>> for (i...) >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> y[i] = ...; >>>>>>>>>>>>>>> for (j...) >>>>>>>>>>>>>>> ... = x[0]; >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> REF is always inside the loop nest with LOOP being the >>>>>>>>>>>>>>> outermost loop. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2016-07-06 12:38 GMT+03:00 Richard Biener >>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>> On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev >>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Here is a simple fix to cure regressions introduced by my >>>>>>>>>>>>>>>>>> fix for >>>>>>>>>>>>>>>>>> 70729. Patch also contains minor changes in test found by >>>>>>>>>>>>>>>>>> Jakub. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Bootstrapping and regression testing did not show any new >>>>>>>>>>>>>>>>>> failures. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Is it OK for trunk? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> + && bitmap_bit_p >>>>>>>>>>>>>>>>> (&memory_accesses.refs_in_loop[loop->num], ref->id)) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So safelen does not apply to refs in nested loops? The added >>>>>>>>>>>>>>>>> comment only >>>>>>>>>>>>>>>>> explains the safelen check change but this also requires >>>>>>>>>>>>>>>>> explanation. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ChangeLog: >>>>>>>>>>>>>>>>>> 2016-07-05 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF >>>>>>>>>>>>>>>>>> defined in >>>>>>>>>>>>>>>>>> LOOP as independent if at least two loop iterations are not >>>>>>>>>>>>>>>>>> dependent. >>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>>>>>>> * g++.dg/vect/pr70729.cc: Delete redundant dg >>>>>>>>>>>>>>>>>> options, fix style.
71734.patch
Description: Binary data