On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: > Richard, > > The patch proposed by you does not work properly for > g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has > been cached as dependent for outer loop and loop is not vectorized: > > g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c > -fdump-tree-vect-details > grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect > <not found> > > You missed additional check I added before check on cached dependence.
Ok, but it should get the correctness right? I suppose that if you move the cache checks inside the else clause it would work? I'd be ok with that change. Richard. > 2016-08-09 13:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>> Yes it is impossible since all basic blocks are handled from outer >>> loops to innermost so we can not have the sequence with wrong >>> dependence, i.e. we cached that reference is independent (due to >>> safelen) but the same reference in outer loop must be evaluated as >>> dependent. So we must re-evaluate only dependent references in loops >>> having non-zero safelen attribute. >> >> Hmm. I don't like depending on this implementation detail. Does the >> attached patch work >> which simply avoids any positive/negative caching on safelen affected >> refs? It also makes >> the query cheaper by avoiding the dive into child loops. >> >> Richard. >> >>> 2016-08-09 11:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>>> Richard, >>>>> >>>>> I added additional check before caching dependencies since (1) all >>>>> statements in loop are handled in loop postorder order, i.e. form >>>>> outer to inner; (2) we can change dependency for reference in subloops >>>>> which have non-zero safelen attribute. So I propose to re-evaluate it >>>>> in such cases. I don't see why we need to avoid dependence caching for >>>>> all loop nests since pragma omp simd is used very rarely. >>>> >>>> You think it is impossible to construct a testcase which hits the >>>> correctness issue? >>>> "very rarely" is not a good argument to generate wrong code. >>>> >>>> Richard. >>>> >>>>> 2016-08-05 16:50 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> wrote: >>>>>>> Richard, >>>>>>> >>>>>>> Here is updated patch which implements your proposal - I pass loop >>>>>>> instead of stmt to determine either REF is defined inside LOOP nest or >>>>>>> not. I checked that for pr70729-nest.cc the reference this->S_n for >>>>>>> statements which are out of innermost loop are not considered as >>>>>>> independent as you pointed out. >>>>>>> >>>>>>> Regression testing did not show any new failures and both failed tests >>>>>>> from libgomp.fortran suite now passed. >>>>>>> >>>>>>> Is it OK for trunk? >>>>>> >>>>>> I don't quite understand >>>>>> >>>>>> + /* Ignore dependence for loops having greater safelen. */ >>>>>> + if (new_safelen == safelen >>>>>> + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, >>>>>> stored_p))) >>>>>> return false; >>>>>> >>>>>> this seems to suggest (correctly I think) that we cannot rely on the >>>>>> caching >>>>>> for safelen, neither for optimal results (you seem to address that) but >>>>>> also >>>>>> not for correctness (we cache the no-dep result from a safelen run and >>>>>> then happily re-use that info for a ref that is not safe for safelen). >>>>>> >>>>>> It seems to me we need to avoid any caching if we made things independent >>>>>> because of safelen and simply not do the dep test afterwards. this means >>>>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great >>>>>> way >>>>>> to do this w/o confusing the control flow). >>>>>> >>>>>> Richard. >>>>>> >>>>>>> ChangeLog: >>>>>>> 2016-08-05 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> >>>>>>> PR tree-optimization/71734 >>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. >>>>>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to >>>>>>> ref_indep_loop_p. >>>>>>> (ref_indep_loop_p_1): Fix commentary. >>>>>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new >>>>>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore >>>>>>> dependencde for loop having greater safelen value, pass REF_LOOP in >>>>>>> recursive call. >>>>>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. >>>>>>> >>>>>>> 2016-08-03 16:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> wrote: >>>>>>>>> Hi Richard. >>>>>>>>> >>>>>>>>> It turned out that the fix proposed by you does not work for liggomp >>>>>>>>> tests simd3 and simd4. >>>>>>>>> The reason is that we can't change safelen value for references not >>>>>>>>> defined inside loop. So I add missed check on it to patch. >>>>>>>>> Is it OK for trunk? >>>>>>>> >>>>>>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as >>>>>>>> that operation can end up being quadratic in the loop depth/width. >>>>>>>> >>>>>>>> But I also wonder about correctness given that LIM "commons" >>>>>>>> references. So we can have >>>>>>>> >>>>>>>> for (;;) >>>>>>>> .. = ref; (1) >>>>>>>> for (;;) // safelen == 2 (2) >>>>>>>> ... = ref; >>>>>>>> >>>>>>>> and when looking at the ref at (1) which according to you should not >>>>>>>> have safelen applied your function will happily return that ref is >>>>>>>> defined >>>>>>>> in the inner loop. >>>>>>>> >>>>>>>> So it looks like to be able to apply safelen the caller of >>>>>>>> ref_indep_loop_p >>>>>>>> needs to pass down a ref plus a location (a stmt). In which case your >>>>>>>> function can simply use flow_loop_nested_p (loop, gimple_bb >>>>>>>> (stmt)->loop_father); >>>>>>>> >>>>>>>> Richard. >>>>>>>> >>>>>>>>> ChangeLog: >>>>>>>>> 2016-07-29 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>> >>>>>>>>> PR tree-optimization/71734 >>>>>>>>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. >>>>>>>>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside >>>>>>>>> LOOP. >>>>>>>>> >>>>>>>>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>>>> Sorry H.J. >>>>>>>>>> >>>>>>>>>> I checked both these tests manually but forgot to pass "-fopenmp" >>>>>>>>>> option. >>>>>>>>>> I will fix the issue asap. >>>>>>>>>> >>>>>>>>>> 2016-07-29 0:33 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>: >>>>>>>>>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev >>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>> Richard, >>>>>>>>>>>> >>>>>>>>>>>> I prepare a patch which is based on yours. New test is also >>>>>>>>>>>> included. >>>>>>>>>>>> Bootstrapping and regression testing did not show any new failures. >>>>>>>>>>>> Is it OK for trunk? >>>>>>>>>>>> >>>>>>>>>>>> Thanks. >>>>>>>>>>>> ChangeLog: >>>>>>>>>>>> 2016-07-28 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>> >>>>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen >>>>>>>>>>>> attribute instead of REF_LOOP and use it. >>>>>>>>>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and >>>>>>>>>>>> set it for Loops having non-zero safelen attribute. >>>>>>>>>>>> (ref_indep_loop_p): Pass zero as initial value for safelen. >>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>> * g++.dg/vect/pr70729-nest.cc: New test. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Does this cause >>>>>>>>>>> >>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer >>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>> test >>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test >>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer >>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>> test >>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test >>>>>>>>>>> >>>>>>>>>>> on AVX machines and >>>>>>>>>>> >>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer >>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>> test >>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test >>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer >>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>>>>>>>>>> test >>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test >>>>>>>>>>> >>>>>>>>>>> on non-AVX machines? >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> H.J.