resend the patch for stage1: https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538186.html
The performance of exchange2 built with PGO will decrease ~28% by r278808 due to profile count set incorrectly. The cloned nodes are updated to a very small count caused later pass cunroll fail to unroll the recursive function in exchange2. digits_2 -> digits_2.constprop.0, digits_2.constprop.1, etc. 1. Enable proportion orig_sum to the new nodes for self recursive node (k means multiple self recursive calls): new_sum = (orig_sum + new_sum) * 1 / k \ * self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth). 2. Two mutually recursive functions are not supported in self recursive clone yet so also not supported in update_profiling_info here. 3. Improve value range for param_ipa_cp_max_recursive_depth to (0, 8). If it calls itself two different sets, usually recursive boudary limit will stop the specialize first, otherwise it is slow even without recursive specialize. gcc/ChangeLog: 2020-05-29 Xionghu Luo <luo...@linux.ibm.com> * ipa-cp.c (update_profiling_info): Check self_scc node. * params.opt (ipa-cp-max-recursive-depth): Set param range. gcc/testsuite/ChangeLog: 2020-05-29 Xionghu Luo <luo...@linux.ibm.com> * gcc.dg/ipa/ipa-clone-3.c: Update count. --- gcc/ipa-cp.c | 33 +++++++++++++++++++++++++- gcc/params.opt | 2 +- gcc/testsuite/gcc.dg/ipa/ipa-clone-3.c | 2 +- 3 files changed, 34 insertions(+), 3 deletions(-) diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index c64e9104a94..919c741ecfa 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -2046,7 +2046,7 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs, /* Recursively generate lattice values with a limited count. */ FOR_EACH_VEC_ELT (val_seeds, i, src_val) { - for (int j = 1; j < max_recursive_depth; j++) + for (int j = 0; j < max_recursive_depth; j++) { tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2, src_val, res_type); @@ -4345,6 +4345,37 @@ update_profiling_info (struct cgraph_node *orig_node, false); new_sum = stats.count_sum; + class ipa_node_params *info = IPA_NODE_REF (orig_node); + if (info && info->node_is_self_scc) + { + profile_count self_recursive_count = profile_count::zero (); + unsigned k = 0; + + /* The self recursive edge is on the orig_node. */ + for (cs = orig_node->callees; cs; cs = cs->next_callee) + if (ipa_edge_within_scc (cs)) + { + cgraph_node *callee = cs->callee->function_symbol (); + if (callee && cs->caller == cs->callee) + { + self_recursive_count += cs->count; + k++; + } + } + + /* Proportion count for multiple self recursive node. */ + if (k) + self_recursive_count.apply_scale (1, k); + + /* Proportion count for self recursive node from all callers. */ + new_sum + = (orig_sum + new_sum).apply_scale (self_recursive_count, orig_sum); + + /* Proportion count for specialized cloned node. */ + if (param_ipa_cp_max_recursive_depth) + new_sum = new_sum.apply_scale (1, param_ipa_cp_max_recursive_depth); + } + if (orig_node_count < orig_sum + new_sum) { if (dump_file) diff --git a/gcc/params.opt b/gcc/params.opt index 4aec480798b..ad000145dfb 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -199,7 +199,7 @@ Common Joined UInteger Var(param_ipa_cp_loop_hint_bonus) Init(64) Param Optimiza Compile-time bonus IPA-CP assigns to candidates which make loop bounds or strides known. -param=ipa-cp-max-recursive-depth= -Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) Param Optimization +Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(7) IntegerRange(0, 8) Param Optimization Maximum depth of recursive cloning for self-recursive function. -param=ipa-cp-min-recursive-probability= diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-clone-3.c b/gcc/testsuite/gcc.dg/ipa/ipa-clone-3.c index a73cb8b63fc..1efa23ae109 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-clone-3.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-clone-3.c @@ -41,4 +41,4 @@ int main () return recur_fn (&v); } -/* { dg-final { scan-ipa-dump-times "Creating a specialized node of recur_fn/\[0-9\]*\\." 8 "cp" } } */ +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of recur_fn/\[0-9\]*\\." 9 "cp" } } */ -- 2.21.0.777.g83232e3864