The patch was updated to ignore comdat einline tuning for AutoFDO. Performance testing is green.
OK for google-4_9? Thanks, Dehao Index: gcc/auto-profile.c =================================================================== --- gcc/auto-profile.c (revision 217523) +++ gcc/auto-profile.c (working copy) @@ -1771,6 +1771,7 @@ auto_profile (void) free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); rebuild_cgraph_edges (); + compute_inline_parameters (cgraph_get_node (current_function_decl), true); pop_cfun (); } Index: gcc/ipa-inline.c =================================================================== --- gcc/ipa-inline.c (revision 217523) +++ gcc/ipa-inline.c (working copy) @@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge * growth); want_inline = false; } - else if (DECL_COMDAT (callee->decl) + else if (!flag_auto_profile && DECL_COMDAT (callee->decl) && growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT)) ; else if ((n = num_calls (callee)) != 0 On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <de...@google.com> wrote: > We do not do sophisticated recursive call detection in einline phase. > It only happens in ipa-inline phase. > > Dehao > > On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davi...@google.com> wrote: >> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <de...@google.com> wrote: >>> IIRC, AutoFDO the actual iteration for AutoFDO is mostly <3. But it >>> should not harm to set max iter as 10. >>> >>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davi...@google.com> >>> wrote: >>>> After inline summary is recomputed, the large code growth problem will >>>> also be better controlled, right? >>> >>> For this case, recomputing inline summary does not help because the >>> code was bloated in first einline phase. >> >> For recursive inlining, the inline summary for the cloned edges need >> to be updated to prevent the growth? >> >> david >> >>> >>> Dehao >>> >>>> >>>> David >>>> >>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davi...@google.com> >>>> wrote: >>>>> Is there a need to have 10 iterations of early inline for autofdo? >>>>> >>>>> David >>>>> >>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <de...@google.com> wrote: >>>>>> In AutoFDO, we increase einline iterations. This could lead to >>>>>> extensive code bloat if we have recursive calls like: >>>>>> >>>>>> dtor() { >>>>>> destroy(node); >>>>>> } >>>>>> >>>>>> destroy(node) { >>>>>> destroy(left) >>>>>> destroy(right) >>>>>> } >>>>>> >>>>>> In this case, the size growth will be around 8 which is smaller than >>>>>> threshold (11). However, if we allow this to happen for 2 iterations, >>>>>> it will expand the size by 1024X. To fix this problem, we want to set >>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO >>>>>> do not not rely on aggressive einline to gain more profile context. >>>>>> >>>>>> And also, in AutoFDO pass, after we processed a function, we need to >>>>>> recompute inline parameters because rebuild_cgraph_edges will zero out >>>>>> all inline parameters. >>>>>> >>>>>> The patch is attached below, bootstrapped and perf test on-going. OK >>>>>> for google-4_9? >>>>>> >>>>>> Thanks, >>>>>> Dehao >>>>>> >>>>>> Index: gcc/auto-profile.c >>>>>> =================================================================== >>>>>> --- gcc/auto-profile.c (revision 217523) >>>>>> +++ gcc/auto-profile.c (working copy) >>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void) >>>>>> free_dominance_info (CDI_DOMINATORS); >>>>>> free_dominance_info (CDI_POST_DOMINATORS); >>>>>> rebuild_cgraph_edges (); >>>>>> + compute_inline_parameters (cgraph_get_node >>>>>> (current_function_decl), true); >>>>>> pop_cfun (); >>>>>> } >>>>>> >>>>>> Index: gcc/opts.c >>>>>> =================================================================== >>>>>> --- gcc/opts.c (revision 217523) >>>>>> +++ gcc/opts.c (working copy) >>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts, >>>>>> maybe_set_param_value ( >>>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10, >>>>>> opts->x_param_values, opts_set->x_param_values); >>>>>> + maybe_set_param_value ( >>>>>> + PARAM_EARLY_INLINING_INSNS, 4, >>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>> + maybe_set_param_value ( >>>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4, >>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>> value = true; >>>>>> /* No break here - do -fauto-profile processing. */ >>>>>> case OPT_fauto_profile: