> > > > Are you sure? I thought we compute that up-front ... at least > > we make sure we can_inline_edge_p all calls before even trying > > to inline all calls - that looks somewhat redundant then if it > > can happen that we give up anyway. Ah, so can_inline_edge_p has > > > > /* Check if caller growth allows the inlining. */ > > else if (!DECL_DISREGARD_INLINE_LIMITS (callee->decl) > > && !disregard_limits > > && !lookup_attribute ("flatten", > > DECL_ATTRIBUTES (caller->decl)) > > && !caller_growth_limits (e)) > > inlinable = false; > > > > and we check that from when we do the inlining and upfront from
> > want_inline_function_to_all_callers_p ... we also give up > > if we inline a recursive function it seems. Yep, there is early check for recursion and sizes. But you can still hit the size limits later. If a function A calls 1000 times function B, each of the calls individually may be inlinable, but not all together. > > > > > For next stage1 I plan to finish the overahaul to sreals that will let > > > me to make update_overall_summary incremental (i.e. > > > O(size_of_inlined_function) > > > and not O(size_of_inlined_function+size_of_function_inlined_to)) that will > > > also solve these issues. > > > > Hopefully. > > > > > One can probably do a pesimistic estimate on code size of the function > > > (i.e. add the size of inlined function) and only if that hits the large > > > function growth do the exact calculation. Or we can just decide that it > > > is OK to ignore large function growht in this partiuclar case. > > > > Ignoring it is probably not a good idea and will just lead to a different > > degenerate case. As we update the summary afterwards it's probably > > ok to do incremental (pessimistic) updates on the size anyway? In > > inline_merge_summary I mean? Or should I open-code that into > > inline_call if ! update_overall_summary? > > So like below - I update self-size by the growth estimate which is > supposed to match, this should keep the overall function growth > limits working, not sure about the stack stuff but that doesn't seem > to be updated by update_overall_summaries either. Not really, the sizes are also computed in fixed point math, because we estimate probabilities when insn is going to be removed. There are some cases wher estimate_growth is not as precise as the real thing, you can see the #if 0 out asstert in ipa-inline-transform abou tthat. Well, for inline functions called once, I suppose precision of these details is not as important, so we may just go with the patch. I will finish the sreal conversion next stage1 (I got stuck on this patch series on gengtype change to allow sreal inline_summary which I do not think ever got reviewed) and do the incremental update. So if you fell this PR is important, go ahead with the patch. Honza > > This also fixes a similar issue in early inlining where we can then > trivially delay update_overall_summaries. I remember seeing early > inline analysis behave similarly quadratic. > > It leaves alone recursive and IPA small function inlining as those > update overall unit size as well - I first thought about somehow > making overall summary update lazy, but that looks quite complicated > and IIRC the round-off errors issue was when re-computing the > key for the fibheap after doing one inlining, right? > > I hope you can get to use sreals early in next stage1 though I wonder > how that will avoid all corner-cases. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. Ok for trunk > and branch? > > Thanks, > Richard. > > 2016-02-18 Richard Biener <rguent...@suse.de> > > PR ipa/37448 > * ipa-inline-transform.c (inline_call): When not updating > overall summaries adjust self size by the growth estimate. > * ipa-inline.c (inline_to_all_callers_1): Add to the callers > hash-set, do not update overall summaries here. Renamed from ... > (inline_to_all_callers): ... this which is now wrapping the > above and performing delayed overall summary update. > (early_inline_small_functions): Delay updating of the overall > summary. > > Index: gcc/ipa-inline-transform.c > =================================================================== > --- gcc/ipa-inline-transform.c (revision 233498) > +++ gcc/ipa-inline-transform.c (working copy) > @@ -300,10 +300,12 @@ inline_call (struct cgraph_edge *e, bool > struct cgraph_edge *curr = e; > struct cgraph_node *callee = e->callee->ultimate_alias_target (); > bool new_edges_found = false; > + int estimated_growth; > > + if (! update_overall_summary) > + estimated_growth = estimate_edge_growth (e); > /* This is used only for assert bellow. */ > #if 0 > - int estimated_growth = estimate_edge_growth (e); > bool predicated = inline_edge_summary (e)->predicate != NULL; > #endif > > @@ -373,7 +375,13 @@ inline_call (struct cgraph_edge *e, bool > new_edges_found = ipa_propagate_indirect_call_infos (curr, new_edges); > check_speculations (e->callee); > if (update_overall_summary) > - inline_update_overall_summary (to); > + inline_update_overall_summary (to); > + else > + /* Update self size by the estimate so overall function growth limits > + work for further inlining into this function. Before inlining > + the function we inlined to again we expect the caller to update > + the overall summary. */ > + inline_summaries->get (to)->size += estimated_growth; > new_size = inline_summaries->get (to)->size; > > if (callee->calls_comdat_local) > Index: gcc/ipa-inline.c > =================================================================== > --- gcc/ipa-inline.c (revision 233498) > +++ gcc/ipa-inline.c (working copy) > @@ -2163,7 +2163,8 @@ flatten_function (struct cgraph_node *no > recursion. */ > > static bool > -inline_to_all_callers (struct cgraph_node *node, void *data) > +inline_to_all_callers_1 (struct cgraph_node *node, void *data, > + hash_set<cgraph_node *> *callers) > { > int *num_calls = (int *)data; > bool callee_removed = false; > @@ -2193,7 +2194,10 @@ inline_to_all_callers (struct cgraph_nod > inline_summaries->get (node->callers->caller)->size); > } > > - inline_call (node->callers, true, NULL, NULL, true, &callee_removed); > + /* Remember which callers we inlined to, delaying updating the > + overall summary. */ > + callers->add (node->callers->caller); > + inline_call (node->callers, true, NULL, NULL, false, &callee_removed); > if (dump_file) > fprintf (dump_file, > " Inlined into %s which now has %i size\n", > @@ -2211,6 +2215,23 @@ inline_to_all_callers (struct cgraph_nod > return false; > } > > +/* Wrapper around inline_to_all_callers_1 doing delayed overall summary > + update. */ > + > +static bool > +inline_to_all_callers (struct cgraph_node *node, void *data) > +{ > + hash_set<cgraph_node *> callers; > + bool res = inline_to_all_callers_1 (node, data, &callers); > + /* Perform the delayed update of the overall summary of all callers > + processed. This avoids quadratic behavior in the cases where > + we have a lot of calls to the same function. */ > + for (hash_set<cgraph_node *>::iterator i = callers.begin (); > + i != callers.end (); ++i) > + inline_update_overall_summary (*i); > + return res; > +} > + > /* Output overall time estimate. */ > static void > dump_overall_stats (void) > @@ -2590,10 +2611,13 @@ early_inline_small_functions (struct cgr > fprintf (dump_file, " Inlining %s into %s.\n", > xstrdup_for_dump (callee->name ()), > xstrdup_for_dump (e->caller->name ())); > - inline_call (e, true, NULL, NULL, true); > + inline_call (e, true, NULL, NULL, false); > inlined = true; > } > > + if (inlined) > + inline_update_overall_summary (node); > + > return inlined; > } >