Richard Biener <richard.guent...@gmail.com> writes: > On Thu, Oct 15, 2015 at 3:17 PM, Richard Sandiford > <richard.sandif...@arm.com> wrote: >> This patch adds a pass that collects information that is common to all >> uses of an SSA name X and back-propagates that information up the statements >> that generate X. The general idea is to use the information to simplify >> instructions (rather than a pure DCE) so I've simply called it >> tree-ssa-backprop.c, to go with tree-ssa-forwprop.c. >> >> At the moment the only use of the pass is to remove unnecessry sign >> operations, so that it's effectively a global version of >> fold_strip_sign_ops. I'm hoping it could be extended in future to >> record which bits of an integer are significant. There are probably >> other potential uses too. >> >> A later patch gets rid of fold_strip_sign_ops. >> >> Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi. >> OK to install? >> >> Thanks, >> Richard >> >> >> gcc/ >> * doc/invoke.texi (-fdump-tree-backprop, -ftree-backprop): Document. >> * Makefile.in (OBJS): Add tree-ssa-backprop.o. >> * common.opt (ftree-backprop): New option. >> * fold-const.h (negate_mathfn_p): Declare. >> * fold-const.c (negate_mathfn_p): Make public. >> * timevar.def (TV_TREE_BACKPROP): New. >> * tree-passes.h (make_pass_backprop): Declare. >> * passes.def (pass_backprop): Add. >> * tree-ssa-backprop.c: New file. >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/backprop-1.c, gcc.dg/tree-ssa/backprop-2.c, >> gcc.dg/tree-ssa/backprop-3.c, gcc.dg/tree-ssa/backprop-4.c, >> gcc.dg/tree-ssa/backprop-5.c, gcc.dg/tree-ssa/backprop-6.c: New >> tests. >> >> diff --git a/gcc/Makefile.in b/gcc/Makefile.in >> index 783e4c9..69e669d 100644 >> --- a/gcc/Makefile.in >> +++ b/gcc/Makefile.in >> @@ -1445,6 +1445,7 @@ OBJS = \ >> tree-switch-conversion.o \ >> tree-ssa-address.o \ >> tree-ssa-alias.o \ >> + tree-ssa-backprop.o \ >> tree-ssa-ccp.o \ >> tree-ssa-coalesce.o \ >> tree-ssa-copy.o \ >> diff --git a/gcc/common.opt b/gcc/common.opt >> index 5060208..5aef625 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -2364,6 +2364,10 @@ ftree-pta >> Common Report Var(flag_tree_pta) Optimization >> Perform function-local points-to analysis on trees. >> >> +ftree-backprop >> +Common Report Var(flag_tree_backprop) Init(1) Optimization >> +Enable backward propagation of use properties at the tree level. > > Don't add new -ftree-* "tree" doesn't add any info for our users. I'd > also refer to SSA level rather than "tree" level. Not sure if -fbackprop > is good, but let's go for it.
OK. >> diff --git a/gcc/fold-const.c b/gcc/fold-const.c >> index de45a2c..7f00e72 100644 >> --- a/gcc/fold-const.c >> +++ b/gcc/fold-const.c >> @@ -319,7 +318,7 @@ fold_overflow_warning (const char* gmsgid, enum > warn_strict_overflow_code wc) >> /* Return true if the built-in mathematical function specified by CODE >> is odd, i.e. -f(x) == f(-x). */ >> >> -static bool >> +bool >> negate_mathfn_p (enum built_in_function code) >> { >> switch (code) > > Belongs more to builtins.[ch] if exported. The long-term plan is to abstract away whether it's a built-in function or an internal function, in which case I hope to have a single predicate that handles both. I'm not sure where the code should go after that change. Maybe a new file? >> diff --git a/gcc/fold-const.h b/gcc/fold-const.h >> index ee74dc8..4d5b24b 100644 >> --- a/gcc/fold-const.h >> +++ b/gcc/fold-const.h >> @@ -173,6 +173,7 @@ extern tree sign_bit_p (tree, const_tree); >> extern tree exact_inverse (tree, tree); >> extern tree const_unop (enum tree_code, tree, tree); >> extern tree const_binop (enum tree_code, tree, tree, tree); >> +extern bool negate_mathfn_p (enum built_in_function); >> >> /* Return OFF converted to a pointer offset type suitable as offset for >> POINTER_PLUS_EXPR. Use location LOC for this conversion. */ >> diff --git a/gcc/passes.def b/gcc/passes.def >> index dc3f44c..36d2b3b 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -159,6 +159,7 @@ along with GCC; see the file COPYING3. If not see >> /* After CCP we rewrite no longer addressed locals into SSA >> form if possible. */ >> NEXT_PASS (pass_complete_unrolli); >> + NEXT_PASS (pass_backprop); >> NEXT_PASS (pass_phiprop); >> NEXT_PASS (pass_forwprop); > > Any reason to not put this later? I was thinking before reassoc. I think we're relying on FRE to notice the redundancy in the builtins-*.c tests, once this pass has converted the version with redundant sign ops to make it look like the version without. reassoc is likely to be too late. I also thought it should go before rather than after some instance of forwprop because the pass might expose more forward folding opportunities. E.g. if the sign of A = -B * B doesn't matter, we'll end up with A = B * B, which might be foldable with uses of A. It seems less likely that forwprop would expose more backprop opportunities. >> diff --git a/gcc/tree-ssa-backprop.c b/gcc/tree-ssa-backprop.c >> new file mode 100644 >> index 0000000..bf2dcd9 >> --- /dev/null >> +++ b/gcc/tree-ssa-backprop.c > > gimple-ssa-backprop.c OK. >> +/* Return usage information for general operand OP, or null if none. */ >> + >> +const usage_info * >> +backprop::lookup_operand (tree op) >> +{ >> + if (op && TREE_CODE (op) == SSA_NAME) >> + if (usage_info **slot = m_info_map.get (op)) >> + return *slot; > > Is this usually fully populated at the end? If so a direct mapping using > SSA_NAME_VERSION would be cheaper. No, in practice it should be very sparse. >> +/* Make INFO describe all uses of RHS in ASSIGN. */ >> + >> +void >> +backprop::process_assign_use (gassign *assign, tree rhs, usage_info *info) >> +{ >> + tree lhs = gimple_assign_lhs (assign); >> + switch (gimple_assign_rhs_code (assign)) >> + { >> + case ABS_EXPR: >> + /* The sign of the input doesn't matter. */ >> + info->flags.ignore_sign = true; >> + break; >> + >> + case COND_EXPR: >> + /* For A = B ? C : D, propagate information about all uses of A >> + to B and C. */ >> + if (rhs != gimple_assign_rhs1 (assign)) >> + if (const usage_info *lhs_info = lookup_operand (lhs)) > > Use && instead of nested if That means introducing an extra level of braces just for something that that isn't needed by the first statement, i.e.: { const usage_info *lhs_info; if (rhs != gimple_assign_rhs1 (assign) && (lhs_info = lookup_operand (lhs))) *info = *lhs_info; break; } There also used to be a strong preference for not embedding assignments in && and || conditions. If there had been some other set-up for the lookup_operand call, we would have had: if (rhs != gimple_assign_rhs1 (assign)) { ... if (const usage_info *lhs_info = lookup_operand (lhs)) .. } and presumably that would have been OK. So if the original really isn't, acceptable, I'd rather write it as: if (rhs != gimple_assign_rhs1 (assign)) { const usage_info *lhs_info = lookup_operand (lhs); if (lhs_info) .. } I thought we'd embraced the idea of declarations in if conditions though. >> +/* Process all uses of VAR and record or update the result in >> + M_INFO_MAP and M_VARS. */ >> + >> +void >> +backprop::process_var (tree var) >> +{ >> + if (has_zero_uses (var)) >> + return; >> + >> + usage_info info; >> + intersect_uses (var, &info); > > It feels somewhat backward that we do a backward walk over all stmts > (thus processing all use-stmts before defs) but then "forward" process > uses of defs. Wouldn't it be easier(?) to "intersect" incrementally > by processing only uses for all stmts we walk? Or is this about > avoiding to allocate a 'info' for DEFs that end up with no interesting > info? Yeah, that was the idea. In practice very few SSA names end up with any useful information. >> +/* Process all statements and phis in BB, during the first post-order > walk. */ >> + >> +void >> +backprop::process_block (basic_block bb) >> +{ >> + for (gimple_stmt_iterator gsi = gsi_last_bb (bb); !gsi_end_p (gsi); >> + gsi_prev (&gsi)) >> + { >> + tree lhs = gimple_get_lhs (gsi_stmt (gsi)); >> + if (lhs && TREE_CODE (lhs) == SSA_NAME) >> + process_var (lhs); > > I think it would be cleaner to walk over al SSA defs (that includes asms which > you seem to miss, no idea if that's important) The idea was the same here. The point of the pass is to back-propagate information from uses to definitions, so there's no point recording information about SSA names whose definitions could never benefit. >> + } >> + for (gphi_iterator gpi = gsi_start_phis (bb); !gsi_end_p (gpi); >> + gsi_next (&gpi)) >> + process_var (gimple_phi_result (gpi.phi ())); >> +} >> + >> +/* Delete the definition of VAR, which has no uses. */ >> + >> +static void >> +remove_unused_var (tree var) >> +{ >> + gimple *stmt = SSA_NAME_DEF_STMT (var); >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> + { >> + fprintf (dump_file, "Deleting "); >> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); >> + } >> + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); >> + gsi_remove (&gsi, true); >> + release_ssa_name (var); > > What if var wasn't the only def? Please use release_defs () > so in case this is a call with a virtual definition it is released. I didn't think the calls involved here could have those, but maybe they could because of errno? Ugh. Are there any other valid reasons for having VDEFs on these calls? >> +/* Strip all sign operations from the rvalue at *RHS_PTR in STMT. >> + Return true if something changed. The caller is responsible >> + for the necessary bookkeeping. */ >> + >> +static bool >> +strip_sign_op (gimple *stmt, tree *rhs_ptr) >> +{ >> + if (tree new_rhs = strip_sign_op (*rhs_ptr)) >> + { >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> + note_replacement (stmt, *rhs_ptr, new_rhs); >> + *rhs_ptr = new_rhs; > > So it looks you are only changing stmts when the stmt result produces > the same value. Just double-checking, as otherwise you'd need to care > about debug stmts ... No, it can change values, like the case you saw later for phis. This applies to all the cases where the optimisation depends on the propagated info for the lhs, rather than being inherent to the operation. So e.g. we can change the value of A in A = B * C, if all uses of A don't care about the sign. At the moment the only change we can make is that the result could be the negative of its original value. >> +/* STMT has been changed. Give the fold machinery a chance to simplify >> + and canonicalize it (e.g. by ensuring that commutative operands have >> + the right order), then record the updates. */ >> + >> +void >> +backprop::complete_change (gimple *stmt) >> +{ >> + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); >> + if (fold_stmt (&gsi)) >> + { >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> + { >> + fprintf (dump_file, " which folds to: "); >> + print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_SLIM); >> + } > > I suppose it can't (yet) happen that the result no longer throws. Yeah, think so. >> +/* Optimize PHI on the basis that INFO describes all uses of the result. */ >> + >> +void >> +backprop::optimize_phi (gphi *phi, const usage_info *info) >> +{ >> + /* If the sign of the result doesn't matter, strip sign operations >> + from all arguments. */ >> + if (info->flags.ignore_sign) >> + { >> + use_operand_p use; >> + ssa_op_iter oi; >> + FOR_EACH_PHI_ARG (use, phi, oi, SSA_OP_USE) >> + if (tree new_arg = strip_sign_op (USE_FROM_PTR (use))) >> + { >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> + note_replacement (phi, USE_FROM_PTR (use), new_arg); >> + replace_exp (use, new_arg); > > So this seems to be a case where the value _does_ change? Why bother > to substitute in PHIs btw.? For: double f (double *a, int n, double start) { double x = fabs (start); for (int i = 0; i < n; ++i) x *= a[i]; return __builtin_cos (x); } it's the loop phi for x that has the redundant sign op as argument. We want to rewrite it to "start" even if we need to keep "fabs (start)" around for some other uses (assuming an extended example). Thanks, Richard