On Fri, Sep 15, 2017 at 5:44 AM, Andrew Pinski <pins...@gmail.com> wrote: > On Thu, Sep 14, 2017 at 6:30 PM, Kugan Vivekanandarajah > <kugan.vivekanandara...@linaro.org> wrote: >> This patch prevent tree unroller from completely unrolling inner loops if >> that >> results in excessive strided-loads in outer loop. > > Same comments from the RTL version. > > Though one more comment here: > + if (!INDIRECT_REF_P (op)
There's no INDIRECT_REF in GIMPLE. > + && TREE_CODE (op) != MEM_REF > + && TREE_CODE (op) != TARGET_MEM_REF) > + continue; > > This does not handle ARRAY_REF which might be/should be handled. It looks like he wants to do op = get_base_address (op); first. But OTOH the routine looks completely bogus to me ... You want to do find_data_references_in_stmt () and then look at the data-refs and the evolution of their access fns. The function needs _way_ more comments though, you have to apply excessive guessing as to what it computes. It also feels like this should be a target hook but part of some generic cost modeling infrastructure and the target should instead provide the number of load/store streams it can handle well (aka HW-prefetch). That would be also (very) useful information for the loop distribution pass. Related information that is missing is for the vectorizer peeling cost model the number of store buffers when deciding whether to peel stores for alignment for example. Richard. > > + if ((loop_father = loop_outer (loop))) > > Since you don't use loop_father outside of the if statement use the > following (allowed) format > if (struct loop *loop_father = loop_outer (loop)) > > Thinking about this more, hw_prefetchers_avail might not be equivalent > to num_slots (PARAM_SIMULTANEOUS_PREFETCHES) but the name does not fit > what it means if I understand your hardware correctly. > Maybe hw_load_non_cacheline_prefetcher_avail since if I understand the > micro-arch is that the prefetchers are not based on the cacheline > being loaded. > > Thanks, > Andrew > >> >> Thanks, >> Kugan >> >> gcc/ChangeLog: >> >> 2017-09-12 Kugan Vivekanandarajah <kug...@linaro.org> >> >> * config/aarch64/aarch64.c (count_mem_load_streams): New. >> (aarch64_ok_to_unroll): New. >> * doc/tm.texi (ok_to_unroll): Define new target hook. >> * doc/tm.texi.in (ok_to_unroll): Likewise. >> * target.def (ok_to_unroll): Likewise. >> * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use >> ok_to_unroll while unrolling.