On Thu, Oct 5, 2017 at 3:17 PM, Bin Cheng <bin.ch...@arm.com> wrote:
> Hi,
> For now, the distribution pass only handles innermost loops. This patch
> extends the pass to cover two-level innermost loop nests. It also refactors
> the code in pass_loop_distribution::execute for better readability. Note that
> I restrict it to two-level loop nests on purpose because of the high cost of
> data dependence computation. Some compilation-time optimizations, like reusing
> data reference finding and data dependence computation, would require a
> rewrite of this pass like the proposed loop interchange implementation. But
> that's another task.
>
> This patch introduces a temporary TODO for loop nest builtin partitions, which
> is covered by the next two patches.
>
> With this patch, the kernel loop in bwaves can now be distributed and thus
> exposed for further interchange. This patch adds a new test for matrix
> multiplication and adjusts the test strings of existing tests.
> Bootstrapped and tested in the patch set on x86_64 and AArch64. Is it OK?
@@ -714,9 +719,11 @@ ssa_name_has_uses_outside_loop_p (tree def, loop_p loop)
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
     {
-      gimple *use_stmt = USE_STMT (use_p);
-      if (!is_gimple_debug (use_stmt)
-          && loop != loop_containing_stmt (use_stmt))
+      if (is_gimple_debug (USE_STMT (use_p)))
+        continue;
+
+      basic_block use_bb = gimple_bb (USE_STMT (use_p));
+      if (use_bb == NULL || !flow_bb_inside_loop_p (loop, use_bb))
         return true;

use_bb should never be NULL.

+  /* Don't support loop nest distribution under runtime alias check
+     since it's not likely to enable many vectorization opportunities.  */
+  if (loop->inner)
+    {
+      merge_dep_scc_partitions (rdg, &partitions, false);
+    }

extra {}

+  /* Support loop nest distribution enclosing current innermost loop.
+     For the moment, we only support the innermost two-level loop nest.  */
+  if (flag_tree_loop_distribution
+      && outer->num > 0 && outer->inner == loop && loop->next == NULL

The canonical check for is-this-non-root is loop_outer (outer) instead of
outer->num > 0.

+      && single_exit (outer)

not sure how exits are counted but if the inner loop exits also the outer loop
do we correctly handle/reject this case?

-  if (nb_generated_loops + nb_generated_calls > 0)
-    {
-      changed = true;
-      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
-                       loc, "Loop %d distributed: split to %d loops "
-                       "and %d library calls.\n",
-                       num, nb_generated_loops, nb_generated_calls);
+  if (nb_generated_loops + nb_generated_calls > 0)
+    {
+      changed = true;
+      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
+                       loc, "Loop%s %d distributed: split to %d loops "
+                       "and %d library calls.\n",
+                       loop_nest_p ? " nest" : "", loop->num,
+                       nb_generated_loops, nb_generated_call

... can you adjust the printfs to say "loop nest distributed" in case we
distributed a nest?

Can you rewrite the iteration over the nest so it would theoretically support
arbitrary deep perfect nests? Thus simply initialize loop_nest_p less
cleverly...

Otherwise looks ok to me.

Thanks,
Richard.

> Thanks,
> bin
> 2017-10-04  Bin Cheng  <bin.ch...@arm.com>
>
>     * tree-loop-distribution.c: Adjust the general comment.
>     (NUM_PARTITION_THRESHOLD): New macro.
>     (ssa_name_has_uses_outside_loop_p): Support loop nest distribution.
>     (classify_partition): Skip builtin pattern of loop nest's inner loop.
>     (merge_dep_scc_partitions): New parameter ignore_alias_p and use it
>     in call to build_partition_graph.
>     (finalize_partitions): New parameter.  Make loop distribution more
>     conservative by fusing more partitions.
>     (distribute_loop): Don't do runtime alias check in case of loop nest
>     distribution.
>     (find_seed_stmts_for_distribution): New function.
>     (pass_loop_distribution::execute): Refactor code finding seed stmts
>     into above function.  Support loop nest distribution for two-level
>     innermost loop nest.  Adjust dump information.
>
> gcc/testsuite/ChangeLog
> 2017-10-04  Bin Cheng  <bin.ch...@arm.com>
>
>     * gcc.dg/tree-ssa/ldist-7.c: Adjust test string.
>     * gcc.dg/tree-ssa/ldist-16.c: Ditto.
>     * gcc.dg/tree-ssa/ldist-25.c: Ditto.
>     * gcc.dg/tree-ssa/ldist-33.c: Ditto.
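
For readers unfamiliar with the transformation under review, here is a minimal
sketch of what distributing a two-level loop nest means at the source level.
It is not taken from the patch or its testsuite: the array names, the size N
and all function names are illustrative assumptions, and whether GCC actually
distributes a nest like this depends on its dependence analysis and cost model.

```c
#include <assert.h>
#include <string.h>

#define N 8   /* illustrative size, not from the patch */

/* Original form: a single two-level nest carrying two independent
   statements, S1 and S2.  */
static void fused_nest(int a[N][N], int b[N][N], int c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = 0;            /* S1 */
            b[i][j] += c[i][j];     /* S2 */
        }
}

/* Distributed form: the whole nest is split into two nests, one per
   statement.  This is legal here because S1 and S2 touch disjoint
   arrays, so there is no dependence between them.  */
static void distributed_nest(int a[N][N], int b[N][N], int c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = 0;            /* S1 only */

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            b[i][j] += c[i][j];     /* S2 only */
}

/* Runs both forms on identical inputs and returns 1 if they agree.  */
int distribution_preserves_result(void)
{
    int a1[N][N], b1[N][N], a2[N][N], b2[N][N], c[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a1[i][j] = a2[i][j] = i + j + 1;
            b1[i][j] = b2[i][j] = i * N + j;
            c[i][j] = i - j;
        }

    fused_nest(a1, b1, c);
    distributed_nest(a2, b2, c);

    int same = memcmp(a1, a2, sizeof a1) == 0
               && memcmp(b1, b2, sizeof b1) == 0;
    assert(same);   /* the two forms must compute the same arrays */
    return same;
}
```

Once split, each resulting nest contains a single statement, which is the kind
of shape the cover letter describes as being exposed for further interchange.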