On Wed, Jun 7, 2017 at 4:46 AM, Richard Biener <rguent...@suse.de> wrote: > > When the order of visiting dominator children in domwalk changed > GRAPHITE falls foul of relying on a particular order BBs are visited > when computing the original schedule from the vector of pbbs. > > The following restores an order that I believe might work. > > In the end the original schedule should probably be computed > not relying on the order of pbbs in the pbb array but by > visiting the SESE region in an edge walk that "works" > (build_original_schedule). We seem to lack a BB -> pbb mapping > though. > > So the patch somewhat feels like a hack - not fixing the real > problem in the design of build_original_schedule, but it seems > to work ... > > Boostrapped and tested on x86_64-unknown-linux-gnu, ok? > > Thanks, > Richard. > > 2017-06-07 Richard Biener <rguent...@suse.de> > > PR tree-optimization/79483 > * graphite-scop-detection.c (order): New global. > (get_order): Compute bb to order mapping that satisfies code > generation constraints. > (cmp_pbbs): New helper. > (build_scops): Start domwalk at entry block, sort generated > pbbs.
I think the change looks good. Thanks, Sebastian > > * gcc.dg/graphite/pr79483.c: New testcase. > > Index: gcc/graphite-scop-detection.c > =================================================================== > --- gcc/graphite-scop-detection.c (revision 248914) > +++ gcc/graphite-scop-detection.c (working copy) > @@ -1999,6 +1999,46 @@ gather_bbs::after_dom_children (basic_bl > } > } > > + > +/* Compute sth like an execution order, dominator order with first executing > + edges that stay inside the current loop, delaying processing exit edges. > */ > + > +static vec<unsigned> order; > + > +static void > +get_order (scop_p scop, basic_block bb, vec<unsigned> *order, unsigned > *dfs_num) > +{ > + if (! bb_in_sese_p (bb, scop->scop_info->region)) > + return; > + > + (*order)[bb->index] = (*dfs_num)++; > + for (basic_block son = first_dom_son (CDI_DOMINATORS, bb); > + son; > + son = next_dom_son (CDI_DOMINATORS, son)) > + if (flow_bb_inside_loop_p (bb->loop_father, son)) > + get_order (scop, son, order, dfs_num); > + for (basic_block son = first_dom_son (CDI_DOMINATORS, bb); > + son; > + son = next_dom_son (CDI_DOMINATORS, son)) > + if (! flow_bb_inside_loop_p (bb->loop_father, son)) > + get_order (scop, son, order, dfs_num); > +} > + > +/* Helper for qsort, sorting after order above. */ > + > +static int > +cmp_pbbs (const void *pa, const void *pb) > +{ > + poly_bb_p bb1 = *((const poly_bb_p *)pa); > + poly_bb_p bb2 = *((const poly_bb_p *)pb); > + if (order[bb1->black_box->bb->index] < order[bb2->black_box->bb->index]) > + return -1; > + else if (order[bb1->black_box->bb->index] > > order[bb2->black_box->bb->index]) > + return 1; > + else > + return 0; > +} > + > /* Find Static Control Parts (SCoP) in the current function and pushes > them to SCOPS. */ > > @@ -2022,7 +2062,18 @@ build_scops (vec<scop_p> *scops) > scop_p scop = new_scop (s->entry, s->exit); > > /* Record all basic blocks and their conditions in REGION. */ > - gather_bbs (CDI_DOMINATORS, scop).walk (cfun->cfg->x_entry_block_ptr); > + gather_bbs (CDI_DOMINATORS, scop).walk (s->entry->dest); > + > + /* domwalk does not fulfil our code-generations constraints on the > + order of pbb which is to produce sth like execution order, delaying > + exection of loop exit edges. So compute such order and sort after > + that. */ > + order.create (last_basic_block_for_fn (cfun)); > + order.quick_grow (last_basic_block_for_fn (cfun)); > + unsigned dfs_num = 0; > + get_order (scop, s->entry->dest, &order, &dfs_num); > + scop->pbbs.qsort (cmp_pbbs); > + order.release (); > > build_alias_set (scop); > > Index: gcc/testsuite/gcc.dg/graphite/pr79483.c > =================================================================== > --- gcc/testsuite/gcc.dg/graphite/pr79483.c (nonexistent) > +++ gcc/testsuite/gcc.dg/graphite/pr79483.c (working copy) > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fgraphite-identity" } */ > + > +int *a; > +extern int b[]; > +int c; > +void d () > +{ > + double e[2][3] = {0.0, 0.0, 1.0}; > + for (int f = 0; f < 2; ++f) > + for (int g = 0; g < 6; ++g) > + b[0] = a[g] * e[f][2]; > + c = b[0]; > +}