In AutoFDO, we increase einline iterations. This could lead to extensive code bloat if we have recursive calls like:
dtor() { destroy(node); } destroy(node) { destroy(left) destroy(right) } In this case, the size growth will be around 8 which is smaller than threshold (11). However, if we allow this to happen for 2 iterations, it will expand the size by 1024X. To fix this problem, we want to set a much smaller threshold in the AutoFDO case. This is because AutoFDO do not not rely on aggressive einline to gain more profile context. And also, in AutoFDO pass, after we processed a function, we need to recompute inline parameters because rebuild_cgraph_edges will zero out all inline parameters. The patch is attached below, bootstrapped and perf test on-going. OK for google-4_9? Thanks, Dehao Index: gcc/auto-profile.c =================================================================== --- gcc/auto-profile.c (revision 217523) +++ gcc/auto-profile.c (working copy) @@ -1771,6 +1771,7 @@ auto_profile (void) free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); rebuild_cgraph_edges (); + compute_inline_parameters (cgraph_get_node (current_function_decl), true); pop_cfun (); } Index: gcc/opts.c =================================================================== --- gcc/opts.c (revision 217523) +++ gcc/opts.c (working copy) @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts, maybe_set_param_value ( PARAM_EARLY_INLINER_MAX_ITERATIONS, 10, opts->x_param_values, opts_set->x_param_values); + maybe_set_param_value ( + PARAM_EARLY_INLINING_INSNS, 4, + opts->x_param_values, opts_set->x_param_values); + maybe_set_param_value ( + PARAM_EARLY_INLINING_INSNS_COMDAT, 4, + opts->x_param_values, opts_set->x_param_values); value = true; /* No break here - do -fauto-profile processing. */ case OPT_fauto_profile: