Richard, I checked that this move helps. Does it mean that I've got approval to integrate it to trunk?
2016-08-09 14:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: > On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >> Richard, >> >> The patch proposed by you does not work properly for >> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has >> been cached as dependent for outer loop and loop is not vectorized: >> >> g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c >> -fdump-tree-vect-details >> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect >> <not found> >> >> You missed additional check I added before check on cached dependence. > > Ok, but it should get the correctness right? > > I suppose that if you move the cache checks inside the else clause it > would work? > I'd be ok with that change. > > Richard. > >> 2016-08-09 13:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>> Yes it is impossible since all basic blocks are handled from outer >>>> loops to innermost so we can not have the sequence with wrong >>>> dependence, i.e. we cached that reference is independent (due to >>>> safelen) but the same reference in outer loop must be evaluated as >>>> dependent. So we must re-evaluate only dependent references in loops >>>> having non-zero safelen attribute. >>> >>> Hmm. I don't like depending on this implementation detail. Does the >>> attached patch work >>> which simply avoids any positive/negative caching on safelen affected >>> refs? It also makes >>> the query cheaper by avoiding the dive into child loops. >>> >>> Richard. >>> >>>> 2016-08-09 11:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> Richard, >>>>>> >>>>>> I added additional check before caching dependencies since (1) all >>>>>> statements in loop are handled in loop postorder order, i.e. form >>>>>> outer to inner; (2) we can change dependency for reference in subloops >>>>>> which have non-zero safelen attribute. So I propose to re-evaluate it >>>>>> in such cases. I don't see why we need to avoid dependence caching for >>>>>> all loop nests since pragma omp simd is used very rarely. >>>>> >>>>> You think it is impossible to construct a testcase which hits the >>>>> correctness issue? >>>>> "very rarely" is not a good argument to generate wrong code. >>>>> >>>>> Richard. >>>>> >>>>>> 2016-08-05 16:50 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> wrote: >>>>>>>> Richard, >>>>>>>> >>>>>>>> Here is updated patch which implements your proposal - I pass loop >>>>>>>> instead of stmt to determine either REF is defined inside LOOP nest or >>>>>>>> not. I checked that for pr70729-nest.cc the reference this->S_n for >>>>>>>> statements which are out of innermost loop are not considered as >>>>>>>> independent as you pointed out. >>>>>>>> >>>>>>>> Regression testing did not show any new failures and both failed tests >>>>>>>> from libgomp.fortran suite now passed. >>>>>>>> >>>>>>>> Is it OK for trunk? >>>>>>> >>>>>>> I don't quite understand >>>>>>> >>>>>>> + /* Ignore dependence for loops having greater safelen. */ >>>>>>> + if (new_safelen == safelen >>>>>>> + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, >>>>>>> stored_p))) >>>>>>> return false; >>>>>>> >>>>>>> this seems to suggest (correctly I think) that we cannot rely on the >>>>>>> caching >>>>>>> for safelen, neither for optimal results (you seem to address that) but >>>>>>> also >>>>>>> not for correctness (we cache the no-dep result from a safelen run and >>>>>>> then happily re-use that info for a ref that is not safe for safelen). >>>>>>> >>>>>>> It seems to me we need to avoid any caching if we made things >>>>>>> independent >>>>>>> because of safelen and simply not do the dep test afterwards. this >>>>>>> means >>>>>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great >>>>>>> way >>>>>>> to do this w/o confusing the control flow). >>>>>>> >>>>>>> Richard. >>>>>>> >>>>>>>> ChangeLog: >>>>>>>> 2016-08-05 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> >>>>>>>> PR tree-optimization/71734 >>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. >>>>>>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to >>>>>>>> ref_indep_loop_p. >>>>>>>> (ref_indep_loop_p_1): Fix commentary. >>>>>>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new >>>>>>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore >>>>>>>> dependencde for loop having greater safelen value, pass REF_LOOP in >>>>>>>> recursive call. >>>>>>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. >>>>>>>> >>>>>>>> 2016-08-03 16:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> Hi Richard. >>>>>>>>>> >>>>>>>>>> It turned out that the fix proposed by you does not work for liggomp >>>>>>>>>> tests simd3 and simd4. >>>>>>>>>> The reason is that we can't change safelen value for references not >>>>>>>>>> defined inside loop. So I add missed check on it to patch. >>>>>>>>>> Is it OK for trunk? >>>>>>>>> >>>>>>>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as >>>>>>>>> that operation can end up being quadratic in the loop depth/width. >>>>>>>>> >>>>>>>>> But I also wonder about correctness given that LIM "commons" >>>>>>>>> references. So we can have >>>>>>>>> >>>>>>>>> for (;;) >>>>>>>>> .. = ref; (1) >>>>>>>>> for (;;) // safelen == 2 (2) >>>>>>>>> ... = ref; >>>>>>>>> >>>>>>>>> and when looking at the ref at (1) which according to you should not >>>>>>>>> have safelen applied your function will happily return that ref is >>>>>>>>> defined >>>>>>>>> in the inner loop. >>>>>>>>> >>>>>>>>> So it looks like to be able to apply safelen the caller of >>>>>>>>> ref_indep_loop_p >>>>>>>>> needs to pass down a ref plus a location (a stmt). In which case your >>>>>>>>> function can simply use flow_loop_nested_p (loop, gimple_bb >>>>>>>>> (stmt)->loop_father); >>>>>>>>> >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>>> ChangeLog: >>>>>>>>>> 2016-07-29 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>> >>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. >>>>>>>>>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside >>>>>>>>>> LOOP. >>>>>>>>>> >>>>>>>>>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>>>>> Sorry H.J. >>>>>>>>>>> >>>>>>>>>>> I checked both these tests manually but forgot to pass "-fopenmp" >>>>>>>>>>> option. >>>>>>>>>>> I will fix the issue asap. >>>>>>>>>>> >>>>>>>>>>> 2016-07-29 0:33 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev >>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>> Richard, >>>>>>>>>>>>> >>>>>>>>>>>>> I prepare a patch which is based on yours. New test is also >>>>>>>>>>>>> included. >>>>>>>>>>>>> Bootstrapping and regression testing did not show any new >>>>>>>>>>>>> failures. >>>>>>>>>>>>> Is it OK for trunk? >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks. >>>>>>>>>>>>> ChangeLog: >>>>>>>>>>>>> 2016-07-28 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>> >>>>>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen >>>>>>>>>>>>> attribute instead of REF_LOOP and use it. >>>>>>>>>>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and >>>>>>>>>>>>> set it for Loops having non-zero safelen attribute. >>>>>>>>>>>>> (ref_indep_loop_p): Pass zero as initial value for safelen. >>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>> * g++.dg/vect/pr70729-nest.cc: New test. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Does this cause >>>>>>>>>>>> >>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer >>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>>> test >>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test >>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer >>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>>> test >>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test >>>>>>>>>>>> >>>>>>>>>>>> on AVX machines and >>>>>>>>>>>> >>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer >>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>>> test >>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test >>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer >>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>>> test >>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test >>>>>>>>>>>> >>>>>>>>>>>> on non-AVX machines? >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> H.J.