On Fri, 22 Sep 2017, Sebastian Pop wrote:

> On Fri, Sep 22, 2017 at 8:03 AM, Richard Biener <rguent...@suse.de> wrote:
>
> >
> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> > when reworking canonicalize_loop_closed_ssa.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >
> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > -Ofast -march=haswell -floop-nest-optimize are
> >
> >  61 loop nests "optimized"
> >  45 loop nest transforms cancelled because of code generation issues
> >  21 loop nest optimizations timed out the 350000 ISL "operations" we allow
> >
> > I say "optimized" because the usual transform I've seen is static tiling
> > as enforced by GRAPHITE according to --param loop-block-tile-size.
> > There's no way to automagically figure what kind of transform ISL did
> >
>
> Here is how to automate (without magic) the detection
> of the transform that isl did.
>
> The problem solved by isl is the minimization of strides
> in memory, and to do this, we need to tell the isl scheduler
> the validity dependence graph, in graphite-optimize-isl.c
> see the validity (RAW, WAR, WAW) and the proximity
> (RAR + validity) maps.  The proximity does include the
> read after read, as the isl scheduler needs to minimize
> strides between consecutive reads.
>
> When you apply the schedule to the dependence graph,
> one can tell from the result the strides in memory, a good
> way to say whether a transform was beneficial is to sum up
> all memory strides, and make sure that the sum of all strides
> decreases after transform.  We could add a printf with the
> sum of strides before and after transforms, and have the
> testcases check for that.
Interesting.  Can you perhaps show me in code how to do that?

Thanks,
Richard.
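
For illustration only, here is a minimal sketch of the "sum of strides"
check described in the quoted text.  It uses neither isl nor any GRAPHITE
code.  It assumes every array reference has an affine, linearized address
in the loop iterators, models a schedule as a plain permutation of the
loops, and measures the stride only along the innermost scheduled loop.
The names struct access, access_stride, sum_of_strides and
report_transform are hypothetical and exist only in this sketch.  An
actual implementation would presumably apply the isl schedule to the
access relations instead.

```c
#include <stdio.h>
#include <stdlib.h>

#define DEPTH 3  /* Loop nest depth: iterators i, j, k.  */

/* Hypothetical model of one array reference.  Its linearized address,
   in elements, is coef[0]*i + coef[1]*j + coef[2]*k.  */
struct access
{
  const char *name;
  long coef[DEPTH];
};

/* Distance in memory between two consecutive iterations of the innermost
   loop.  ORDER lists the original iterators from outermost to innermost,
   so ORDER[DEPTH - 1] is the iterator the schedule places innermost.  */
static long
access_stride (const struct access *a, const int order[DEPTH])
{
  return labs (a->coef[order[DEPTH - 1]]);
}

/* Sum the innermost strides of all N accesses under the loop order ORDER.  */
static long
sum_of_strides (const struct access *accs, int n, const int order[DEPTH])
{
  long sum = 0;
  for (int i = 0; i < n; i++)
    sum += access_stride (&accs[i], order);
  return sum;
}

/* Print the sum of strides before and after a transform to F, in a form
   a testcase could scan for.  Return 1 if the sum decreased, else 0.  */
static int
report_transform (FILE *f, const struct access *accs, int n,
                  const int before[DEPTH], const int after[DEPTH])
{
  long s0 = sum_of_strides (accs, n, before);
  long s1 = sum_of_strides (accs, n, after);
  fprintf (f, "sum of strides: before %ld, after %ld\n", s0, s1);
  return s1 < s0;
}
```

As a usage example under the same assumptions, take a row-major matrix
multiply C[i][j] += A[i][k] * B[k][j] with N = 1024.  The coefficient
vectors are {N, 1, 0} for C, {N, 0, 1} for A and {0, 1, N} for B.  With
loop order i, j, k the innermost strides are 0, 1 and 1024, which sum to
1025.  After interchanging to i, k, j they are 1, 0 and 1, which sum to 2,
so report_transform returns 1 for that interchange.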