On Thu, Oct 31, 2019 at 3:38 PM Feng Xue OS <f...@os.amperecomputing.com> wrote: > > Hi, Richard > > This is a new patch to support more generalized semi-invariant condition, > which uses > control dependence analysis.
Uh. Note it's not exactly helpful to change algorithms between reviews, that makes it just harder :/ Btw, I notice you use post-dominance info. Note that we generally do not keep that up-to-date with CFG manipulations (and for dominators fast queries are disabled). Probably the way we walk & transform loops makes this safe but it's something to remember when extending that. Possibly doing analysis of all candidates first and then applying the transform for all wanted cases would avoid this (and maybe also can reduce the number of update_ssa calls). I guess this can be done as followup. The patch is OK. Thanks, Richard. > Thanks, > Feng > > ________________________________________ > From: Feng Xue OS <f...@os.amperecomputing.com> > Sent: Friday, October 25, 2019 11:43 AM > To: Richard Biener > Cc: Michael Matz; Philipp Tomsich; gcc-patches@gcc.gnu.org; Christoph > Müllner; erick.oc...@theobroma-systems.com > Subject: Re: [PATCH V3] Loop split upon semi-invariant condition (PR > tree-optimization/89134) > > Richard, > > Thanks for your comments. > > >+ /* For PHI node that is not in loop header, its source operands should > >+ be defined inside the loop, which are seen as loop variant. */ > >+ if (def_bb != loop->header || !skip_head) > >+ return false; > > > so if we have > > > > for (;;) > > { > > if (x) > > a = ..; > > else > > a = ...; > > if (cond-to-split-on dependent on a) > > ... > > } > > > > the above is too restrictive in case 'x' is semi-invariant as well, correct? > In above case, cond-on-a will not be identified as semi-invariant, in that > a is defined by PHI with real multi-sources. To handle it, besides each > source value, we should add extra check on each source's control > dependence node (x in the case), which might have not a little code expansion. > Anyway, I'll have a try. > > > >+ /* A new value comes from outside of loop. */ > >+ if (!bb || !flow_bb_inside_loop_p (loop, bb)) > >+ return false; > > > but that means starting from the second iteration the value is invariant. > No. Traversal direction is reverse to loop execution. In the following, > start from "x_1 = ", extract latch value x_3, and get x_3 definition, and > finally reach "x_1 =". > > Loop: > x_1 = PHI (x_0, x_3) > ... > x_3 = > ... > goto Loop; > > > >+ /* Don't consider redefinitions in excluded basic blocks. > >*/ > >+ if (!dominated_by_p (CDI_DOMINATORS, e->src, skip_head)) > >+ { > >+ /* There are more than one source operands that can > >+ provide value to the SSA name, it is variant. */ > >+ if (from) > >+ return false; > > > > they might be the same though, for PHIs with > 2 arguments. > OK. Will add value equivalence check. > > > > In the cycle handling you are not recursing via stmt_semi_invariant_p > > but only handle SSA name copies - any particular reason for that? > The cycle handling is specified for ssa that crosses iteration. It is > semi-invariant if it remains unchanged after certain iteration, which > means its value in previous iteration (coming from latch edge) is just > a copy of its self, nothing else. So, recursion via stmt_semi_invariant_p > is unnecessary. > > Loop: > x_1 = PHI (x_0, x_3); > x_2 = PHI(x_1, value defined in excluded branch); > x_3 = x_2; > goto Loop; > > > >+static bool > >+branch_removable_p (basic_block branch_bb) > >+{ > >+ if (single_pred_p (branch_bb)) > >+ return true; > > > > I'm not sure what this function tests - at least the single_pred_p check > > looks odd to me given the dominator checks later. The single predecessor > > could simply be a forwarder. I wonder if you are looking for branches > > forming > > an irreducible loop? I think you can then check EDGE_IRREDUCIBLE_LOOP > > or BB_IRREDUCIBLE_LOOP on the condition block (btw, I don't see > > testcases covering the appearant special-cases in the patch - refering to > > existing ones via a comment often helps understanding the code). > > Upon condition evaluation, if a branch is not selected, > This function test a branch is reachable from other place other than its > conditional statement. This ensure that when the branch is not selected > upon condition evaluation, trace path led by the branch will never > be executed so that it can be excluded during semi-invariantness analysis. > > If single_pred_p, only condition statement can reach the branch. > > If not, consider a half diamond condition control graph, with a back-edge to > true branch. > > condition > | \ > | \ > | false branch > .--->----. | / > | | | / > other true branch > | | > '---<----' > > If there is an edge from false branch, true branch can not be excluded even it > is not selected. And back edge from "other" (dominated by true branch) does > not have any impact. > > > >+ > >+ return EDGE_SUCC (cond_bb, (unsigned) invar[1]); > >+} > > > > magic ensures that invar[1] is always the invariant edge? Oh, it's a bool. > > Ick. I wonder if logic with int invariant_edge = -1; and the loop setting > > it to either 0 or 1 would be easier to follow... > OK. > > > > Note your stmt_semi_invariant_p check is exponential for a condition > > like > > > > _1 = 1; > > _2 = _1 + _1; > > _3 = _2 + _2; > > if (_3 != param_4(D)) > > > > because you don't track ops you already proved semi-invariant. We've > > run into such situation repeatedly in SCEV analysis so I doubt it can be > > disregarded as irrelevant in practice. A worklist approach could then > > also get rid of the recursion. You are already computing the stmts > > forming the condition in compute_added_num_insns so another option > > is to re-use that. > OK. > > > > Btw, I wonder if we can simply re-use PARAM_MAX_PEELED_INSNS > > instead of adding yet another param (it also happens to have the same > > size). Because we are "peeling" the loop. > I'll check that. > > >+ edge invar_branch = get_cond_invariant_branch (loop, cond); > >+ > >+ if (!invar_branch) > >+ return NULL; > > > > extra vertical space is unwanted in such cases. > OK. > > >+ if (dump_file && (dump_flags & TDF_DETAILS)) > >+ { > >+ fprintf (dump_file, "In %s(), split loop %d at branch<%s>, BB %d\n", > >+ current_function_name (), loop1->num, > >+ true_invar ? "T" : "F", cond_bb->index); > >+ print_gimple_stmt (dump_file, cond, 0, TDF_SLIM | TDF_VOPS); > >+ } > > > > can you please use sth like > > > > if (dump_enabled_p ()) > > dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, > > cond, "loop split on semi-invariant condition"); > > > > so -fopt-info-loop will show it? > OK. > > > >+ /* Generate a bool type temporary to hold result of the condition. */ > >+ tree tmp = make_ssa_name (boolean_type_node); > >+ gimple_stmt_iterator gsi = gsi_last_bb (cond_bb); > >+ gimple *stmt = gimple_build_assign (tmp, > >+ gimple_cond_code (cond), > >+ gimple_cond_lhs (cond), > >+ gimple_cond_rhs (cond)); > > > > shorter is > > > > gimple_seq stmts = NULL; > > tree tmp = gimple_build (&stmts, gimple_cond_code (cond), > > boolean_type_node, > > gimple_cond_lhs (cond), gimple_cond_rhs (cond)); > > gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT); > OK. > > > >+ gsi_insert_before (&gsi, stmt, GSI_NEW_STMT); > >+ gimple_cond_set_condition (cond, EQ_EXPR, tmp, boolean_true_node); > > > > but I wonder what's the point here to move the condition computation to > > a temporary? Why not just build the original condition again for > > break_cond? > OK. > > > > in split_loop_on_cond you'll find the first semi-invariant condition > > to split on, > > but we'll not visit the split loop again (also for original splitting I > > guess). > > Don't we eventually want to recurse on that? > Currently, we only do a round of loop-split. It is a TODO to enable more than > one loop-splits on a loop.