We do not do sophisticated recursive call detection in einline phase. It only happens in ipa-inline phase.
Dehao On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davi...@google.com> wrote: > On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <de...@google.com> wrote: >> IIRC, AutoFDO the actual iteration for AutoFDO is mostly <3. But it >> should not harm to set max iter as 10. >> >> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davi...@google.com> >> wrote: >>> After inline summary is recomputed, the large code growth problem will >>> also be better controlled, right? >> >> For this case, recomputing inline summary does not help because the >> code was bloated in first einline phase. > > For recursive inlining, the inline summary for the cloned edges need > to be updated to prevent the growth? > > david > >> >> Dehao >> >>> >>> David >>> >>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davi...@google.com> >>> wrote: >>>> Is there a need to have 10 iterations of early inline for autofdo? >>>> >>>> David >>>> >>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <de...@google.com> wrote: >>>>> In AutoFDO, we increase einline iterations. This could lead to >>>>> extensive code bloat if we have recursive calls like: >>>>> >>>>> dtor() { >>>>> destroy(node); >>>>> } >>>>> >>>>> destroy(node) { >>>>> destroy(left) >>>>> destroy(right) >>>>> } >>>>> >>>>> In this case, the size growth will be around 8 which is smaller than >>>>> threshold (11). However, if we allow this to happen for 2 iterations, >>>>> it will expand the size by 1024X. To fix this problem, we want to set >>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO >>>>> do not not rely on aggressive einline to gain more profile context. >>>>> >>>>> And also, in AutoFDO pass, after we processed a function, we need to >>>>> recompute inline parameters because rebuild_cgraph_edges will zero out >>>>> all inline parameters. >>>>> >>>>> The patch is attached below, bootstrapped and perf test on-going. OK >>>>> for google-4_9? >>>>> >>>>> Thanks, >>>>> Dehao >>>>> >>>>> Index: gcc/auto-profile.c >>>>> =================================================================== >>>>> --- gcc/auto-profile.c (revision 217523) >>>>> +++ gcc/auto-profile.c (working copy) >>>>> @@ -1771,6 +1771,7 @@ auto_profile (void) >>>>> free_dominance_info (CDI_DOMINATORS); >>>>> free_dominance_info (CDI_POST_DOMINATORS); >>>>> rebuild_cgraph_edges (); >>>>> + compute_inline_parameters (cgraph_get_node >>>>> (current_function_decl), true); >>>>> pop_cfun (); >>>>> } >>>>> >>>>> Index: gcc/opts.c >>>>> =================================================================== >>>>> --- gcc/opts.c (revision 217523) >>>>> +++ gcc/opts.c (working copy) >>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts, >>>>> maybe_set_param_value ( >>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10, >>>>> opts->x_param_values, opts_set->x_param_values); >>>>> + maybe_set_param_value ( >>>>> + PARAM_EARLY_INLINING_INSNS, 4, >>>>> + opts->x_param_values, opts_set->x_param_values); >>>>> + maybe_set_param_value ( >>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4, >>>>> + opts->x_param_values, opts_set->x_param_values); >>>>> value = true; >>>>> /* No break here - do -fauto-profile processing. */ >>>>> case OPT_fauto_profile: