On Fri, Jun 23, 2017 at 12:19 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Mon, Jun 19, 2017 at 4:20 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Mon, Jun 19, 2017 at 3:40 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Wed, Jun 14, 2017 at 2:54 PM, Richard Biener
>>> <richard.guent...@gmail.com> wrote:
>>>> On Mon, Jun 12, 2017 at 7:03 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>>> Hi,
>>>>> Current primitive cost model merges partitions with data references
>>>>> sharing the same base address.  I believe it's designed to maximize
>>>>> data reuse in distribution, but that should be done by dedicated
>>>>> data reusing algorithm.  At this stage of merging, we should be
>>>>> conservative and only merge partitions with the same references.
>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>
>>>> Well, I'd say "conservative" is merging more, not less.  For example
>>>> splitting a[i+1] from a[i] would be bad(?), so I'd see to allow
>>>> unequal DR_INIT as "equal" for merging.  Maybe DR_INIT within a
>>>> cacheline or so.
>>>>
>>>> How many extra distributions in say SPEC do you get from this change alone?
>>> Hi,
>>> I collected data for spec2006 only with/without this patch.  I am a
>>> bit surprised that it doesn't change the number of distributed loops.
>>>>
>>>> It shows also that having partition->reads_and_writes would be nice
>>>> ... the code duplication
>>> Yeah, I merged read/write data references in previous patch, now this
>>> duplication is gone.  Update patch attached.  Is it OK?
>>
>> +      gcc_assert (i < datarefs_vec.length ());
>> +      dr1 = datarefs_vec[i];
>>
>> these asserts are superfluous -- vec::operator[] does them as well.
>>
>> Ok if you remove them.
> Done.
> I realized I made mistakes when measuring the impact of this patch.
> This patch only apparently causes failure of
> gcc.dg/tree-ssa/ldist-6.c, so here is the updated patch.
> I also collected the number of distributed loops in spec2k6 as below:
>   trunk:               5882
>   only this patch:     7130
>   whole patch series:  5237
> So the conclusion is, this patch does aggressive distribution like
> ldist-6.c, which means worse data-locality.  The following patch does
> more fusion which mitigates impact of this patch and results in
> conservative distribution overall.

What changed in the patch?  Did you attach the correct one?  I'm not sure
ldist-6.c is a "valid" testcase, but I didn't try to see where it was
reduced from.

> But as we lack a data locality cost model, ldist-6.c still fails even
> after applying the whole patch series.  Hmm, a cache-sensitive cost
> model is needed for several passes now: distribution, prefetch and
> (possibly) interchange.
> Richard, do you have further comments based on the new data?

I somewhat expected the "only this patch" result; as I said, I'd have
allowed "related" references to fuse by not requiring equal DR_INIT,
for example.

I suggest going forward with it in its current form.  We can tweak the
cost model later.

Thanks,
Richard.

> Thanks,
> bin
> 2017-06-20  Bin Cheng  <bin.ch...@arm.com>
>
>         * tree-loop-distribution.c (ref_base_address): Delete.
>         (similar_memory_accesses): Rename ...
>         (share_memory_accesses): ... to this.  Check if partitions access
>         the same memory reference.
>         (distribute_loop): Call share_memory_accesses.
>
> gcc/testsuite/ChangeLog
> 2017-06-20  Bin Cheng  <bin.ch...@arm.com>
>
>         * gcc.dg/tree-ssa/ldist-6.c: XFAIL.
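
To make the two merging criteria discussed above concrete, here is a toy
sketch.  It is not code from the patch or from GCC: toy_ref, same_ref,
related_ref and share_any are made-up names, the 64-byte line size is an
assumption, and "offsets less than one line apart" is only a rough stand-in
for "DR_INIT within a cacheline".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a data reference: a symbolic base plus a constant
// byte offset playing the role of DR_INIT.  Hypothetical type, not
// GCC's struct data_reference.
struct toy_ref
{
  std::string base;
  std::int64_t init;
};

using toy_partition = std::vector<toy_ref>;

// Strict criterion: the two references have the same base and the
// same constant offset.
bool
same_ref (const toy_ref &a, const toy_ref &b)
{
  return a.base == b.base && a.init == b.init;
}

// Relaxed criterion: same base, and the constant offsets are less than
// LINE bytes apart.  The 64-byte default is an assumption; alignment is
// ignored, so this only approximates "same cache line".
bool
related_ref (const toy_ref &a, const toy_ref &b, std::int64_t line = 64)
{
  std::int64_t dist = a.init > b.init ? a.init - b.init : b.init - a.init;
  return a.base == b.base && dist < line;
}

// Return true if any reference in P1 matches any reference in P2
// under the predicate PRED.
template <typename Pred>
bool
share_any (const toy_partition &p1, const toy_partition &p2, Pred pred)
{
  for (const toy_ref &r1 : p1)
    for (const toy_ref &r2 : p2)
      if (pred (r1, r2))
        return true;
  return false;
}
```

With one partition touching a[i] (offset 0) and another touching a[i+1]
(offset 4, assuming 4-byte elements), share_any with same_ref returns false,
so the partitions stay separate; share_any with a lambda wrapping related_ref
returns true, so they would be fused.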