On 9/26/19 7:23 AM, luoxhu wrote:
> Thanks Martin,
>
>
> On 2019/9/25 18:57, Martin Liška wrote:
>> On 9/25/19 5:45 AM, luoxhu wrote:
>>> Hi,
>>>
>>> Sorry for replying so late due to cauldron conference and other LTO issues
>>> I was working on.
>>
>> Hello.
>>
>> That's fine, we still have plenty of time for patch review.
>>
>> Not fixed issues which I reported in v3 (and still valid in v4):
>> - please come up with indirect_target_info::indirect_target_info and use it
> Sorry for miss out.

Hello.

Sure, please use constructor initialization (see my patch).

>
>
>> - do you need to stream out indirect_call_targets when common_target_id == 0?
>
> No need to stream out items with common_target_id == 0, removed the if
> condition in lto-cgraph.c.

Fine. Do we have a guarantee that item->common_target_id is always != 0?
Please add an assert there.

>
>>
>> Then I'm suggesting to use vec::is_empty (please see my patch).
> OK. But has_multiple_indirect_call_p should return different than
> has_indirect_call_p as it checks more that one targets?

Sure, that was a mistake in my patch from the previous reply.

>
> gcc/cgraph.c
> /* Return true if this edge has multiple indirect call targets. */
> bool
> cgraph_edge::has_multiple_indirect_call_p (void)
> {
> -  return indirect_info && indirect_info->indirect_call_targets
> -         && indirect_info->indirect_call_targets->length () > 1;
> +  return (indirect_info && indirect_info->indirect_call_targets
> +          && indirect_info->indirect_call_targets->length () > 1);
> }
>
> /* Return true if this edge has at least one indirect call target. */
> bool
> cgraph_edge::has_indirect_call_p (void)
> {
> -  return indirect_info && indirect_info->indirect_call_targets
> -         && indirect_info->indirect_call_targets->length ();
> +  return (indirect_info && indirect_info->indirect_call_targets
> +          && !indirect_info->indirect_call_targets->is_empty ());
> }
>
>>
>> I see following failures for the tests provided:
>> FAIL: gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c compilation,
>> -fprofile-generate -D_PROFILE_GENERATE
>> FAIL: gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c compilation,
>> -fprofile-generate -D_PROFILE_GENERATE
>> FAIL: gcc.dg/tree-prof/indir-call-prof-topn.c compilation,
>> -fprofile-generate -D_PROFILE_GENERATE
>
> Sorry that I forgot to remove the deprecated build option in the 3 cases
> (also updated the scan exp check):
> -/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param
> indir-call-topn-profile=1" } */
> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate" } */
>
>
> The new patch is attached. Thanks.

Hm, looking at the gimple_ic_transform function: I think it should always
return false, as it never does a GIMPLE transformation.

Apart from that, I'm fine with the patch. Note that I'm not the maintainer,
but I bet we have simplified the patch review for Honza significantly.

The last missing piece is probably the updated ChangeLog.

Thank you for working on that,
Martin

>
>
> Xiong Hu
>
>>
>> Next comments follow directly in the email body:
>>
>>>
>>> v4 Changes:
>>> 1. Rebase to trunk.
>>> 2. Remove num_of_ics and use vector's length to avoid redundancy.
>>> 3. Update the code in ipa-profile.c to improve review feasibility.
>>> 4. Add function has_indirect_call_p and has_multiple_indirect_call_p.
>>> 5. For parameter control, I will leave it to next patch as it is a
>>> relative independent function.
>>> Currently, maximum number of
>>> promotions is GCOV_TOPN_VALUES as only 4 profiling value limited
>>> from profile-generate, therefore minimum probability is adjusted to
>>> 25% in value-prof.c, it was 75% also by hard code for single
>>> indirect target. No control to minimal number of edge
>>> executions yet. What's more, this patch is a bit large now.
>>>
>>> This patch aims to fix PR69678 caused by PGO indirect call profiling
>>> performance issues.
>>> The bug that profiling data is never working was fixed by Martin's pull
>>> back of topN patches, performance got GEOMEAN ~1% improvement(+24% for
>>> 511.povray_r specifically).
>>> Still, currently the default profile only generates SINGLE indirect target
>>> that called more than 75%. This patch leverages MULTIPLE indirect
>>> targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
>>> specialization, profiling, partial devirtualization, inlining and
>>> cloning could be done successfully based on it.
>>> Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
>>> Details are:
>>> 1. PGO with topn is enabled by default now, but only one indirect
>>> target edge will be generated in ipa-profile pass, so add variables to
>>> enable
>>> multiple speculative edges through passes, speculative_id will record the
>>> direct edge index bind to the indirect edge, indirect_call_targets length
>>> records how many direct edges owned by the indirect edge, postpone
>>> gimple_ic
>>> to ipa-profile like default as inline pass will decide whether it is
>>> benefit
>>> to transform indirect call.
>>> 2. Use speculative_id to track and search the reference node matched
>>> with the direct edge's callee for multiple targets. Actually, it is the
>>> caller's responsibility to handle the direct edges mapped to same
>>> indirect
>>> edge. speculative_call_info will return one of the direct edge
>>> specified,
>>> this will leverage current IPA edge process framework mostly.
>>> 3. Enable LTO WPA/LTRANS stage multiple indirect call targets analysis
>>> for
>>> profile full support in ipa passes and cgraph_edge functions.
>>> speculative_id
>>> can be set by make_speculative id when multiple targets are binded to
>>> one indirect edge, and cloned if new edge is cloned. speculative_id
>>> is streamed out and stream int by lto like lto_stmt_uid.
>>> 4. Add 1 in module testcase and 2 cross module testcases.
>>> 5. Bootstrap and regression test passed on Power8-LE. No function
>>> and performance regression for SPEC2017.
>>>
>>> gcc/ChangeLog
>>>
>>> 2019-09-25 Xiong Hu Luo <luo...@linux.ibm.com>
>>>
>>> PR ipa/69678
>>> * cgraph.c (symbol_table::create_edge): Init speculative_id.
>>> (cgraph_edge::make_speculative): Add param for setting speculative_id.
>>> (cgraph_edge::speculative_call_info): Find reference by
>>> speculative_id for multiple indirect targets.
>>> (cgraph_edge::resolve_speculation): Decrease the speculations
>>> for indirect edge, drop it's speculative if not direct target
>>> left.
>>> (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
>>> (cgraph_node::verify_node): Don't report error if speculative
>>> edge not include statement.
>>> (cgraph_edge::has_multiple_indirect_call_p): New function.
>>> (cgraph_edge::has_indirect_call_p): New function.
>>> * cgraph.h (struct indirect_target_info): New struct.
>>> (indirect_call_targets): New vector variable.
>>> (make_speculative): Add param for setting speculative_id.
>>> (cgraph_edge::has_multiple_indirect_call_p): New declare.
>>> (cgraph_edge::has_indirect_call_p): New declare.
>>> (speculative_id): New variable.
>>> * cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
>>> * ipa-inline.c (inline_small_functions): Fix iterator update.
>>> * ipa-profile.c (ipa_profile_generate_summary): Add indirect
>>> multiple targets logic.
>>> (ipa_profile): Likewise.
>>> * ipa-ref.h (speculative_id): New variable.
>>> * ipa.c (process_references): Fix typo.
>>> * lto-cgraph.c (lto_output_edge): Add indirect multiple targets
>>> logic. Stream out speculative_id.
>>> (input_edge): Likewise.
>>> * predict.c (dump_prediction): Remove edges count assert to be
>>> precise.
>>> * symtab.c (symtab_node::create_reference): Init speculative_id.
>>> (symtab_node::clone_references): Clone speculative_id.
>>> (symtab_node::clone_referring): Clone speculative_id.
>>> (symtab_node::clone_reference): Clone speculative_id.
>>> (symtab_node::clear_stmts_in_references): Clear speculative_id.
>>> * tree-inline.c (copy_bb): Duplicate all the speculative edges
>>> if indirect call contains multiple speculative targets.
>>> * tree-profile.c (gimple_gen_ic_profiler): Use the new variable
>>> __gcov_indirect_call.counters and __gcov_indirect_call.callee.
>>> (gimple_gen_ic_func_profiler): Likewise.
>>> (pass_ipa_tree_profile::gate): Fix comment typos.
>>> * value-prof.c (gimple_ic_transform): Handle topn case.
>>> Fix comment typos.
>>>
>>> gcc/testsuite/ChangeLog
>>>
>>> 2019-09-25 Xiong Hu Luo <luo...@linux.ibm.com>
>>>
>>> PR ipa/69678
>>> * gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
>>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
>>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
>>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
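
As an aside on points 1-3 of the cover letter above: the target selection it describes (walk the top-N profiled values, skip a zero id, scale the count to a probability, cap it, keep the survivors in order) can be sketched in isolation. This is only an illustration under assumptions, not code from the patch: target_info, profiled_target, select_targets, kProbBase and the min_prob cut-off are hypothetical names, and plain truncating division stands in for GCOV_COMPUTE_SCALE.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the role REG_BR_PROB_BASE plays in GCC.
constexpr int kProbBase = 10000;

// Hypothetical analogue of indirect_target_info, with the constructor
// requested in the review so that entries can be built in one step.
struct target_info
{
  unsigned profile_id;  // id of the candidate callee
  int probability;      // share of all calls, scaled to kProbBase
  target_info (unsigned id, int prob) : profile_id (id), probability (prob) {}
};

// One histogram entry as it might come out of the profile (assumed shape).
struct profiled_target
{
  unsigned profile_id;
  int64_t count;
};

// Keep every target with a non-zero id whose share of ALL calls reaches
// MIN_PROB.  The scaled probability is capped at kProbBase.  The position
// of an entry in the returned vector is what a caller could use as its
// speculation index.
std::vector<target_info>
select_targets (const std::vector<profiled_target> &hist, int64_t all,
                int min_prob)
{
  std::vector<target_info> out;
  if (all <= 0)
    return out;
  for (const profiled_target &t : hist)
    {
      if (t.profile_id == 0)
        continue;  // unknown target, nothing to speculate on
      int64_t prob = t.count * kProbBase / all;  // truncating, by assumption
      if (prob > kProbBase)
        prob = kProbBase;
      if (prob >= min_prob)
        out.emplace_back (t.profile_id, static_cast<int> (prob));
    }
  return out;
}
```

With a cut-off of kProbBase / 4 this mirrors the 25% minimum mentioned for value-prof.c. The ipa-profile.c hunk in the patch compares against REG_BR_PROB_BASE / 2 instead, which is why the cut-off is a parameter here.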
>>> ---
>>>  gcc/cgraph.c                                  | 90 ++++++++++++++++-
>>>  gcc/cgraph.h                                  | 29 +++++-
>>>  gcc/cgraphclones.c                            |  1 +
>>>  gcc/ipa-inline.c                              | 15 +--
>>>  gcc/ipa-profile.c                             | 96 ++++++++++++++-----
>>>  gcc/ipa-ref.h                                 |  1 +
>>>  gcc/ipa.c                                     |  2 +-
>>>  gcc/lto-cgraph.c                              | 57 +++++++++--
>>>  gcc/predict.c                                 |  1 -
>>>  gcc/symtab.c                                  |  5 +
>>>  .../tree-prof/crossmodule-indir-call-topn-1.c | 35 +++++++
>>>  .../crossmodule-indir-call-topn-1a.c          | 22 +++++
>>>  .../tree-prof/crossmodule-indir-call-topn-2.c | 42 ++++++++
>>>  .../gcc.dg/tree-prof/indir-call-prof-topn.c   | 38 ++++++++
>>>  gcc/tree-inline.c                             | 19 ++++
>>>  gcc/tree-profile.c                            | 12 +--
>>>  gcc/value-prof.c                              | 86 +++++++++--------
>>>  17 files changed, 452 insertions(+), 99 deletions(-)
>>>  create mode 100644
>>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
>>>  create mode 100644
>>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
>>>  create mode 100644
>>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
>>>
>>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>>> index 843891e9e56..9a28aca435f 100644
>>> --- a/gcc/cgraph.c
>>> +++ b/gcc/cgraph.c
>>> @@ -860,6 +860,7 @@ symbol_table::create_edge (cgraph_node *caller,
>>> cgraph_node *callee,
>>>    edge->prev_callee = NULL;
>>>    edge->next_callee = NULL;
>>>    edge->lto_stmt_uid = 0;
>>> +  edge->speculative_id = 0;
>>>    edge->count = count;
>>> @@ -1051,7 +1052,8 @@ cgraph_edge::remove (void)
>>>     Return direct edge created. */
>>>  cgraph_edge *
>>> -cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
>>> +cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count,
>>> +                               unsigned int speculative_id)
>>>  {
>>>    cgraph_node *n = caller;
>>>    ipa_ref *ref = NULL;
>>> @@ -1069,11 +1071,13 @@ cgraph_edge::make_speculative (cgraph_node *n2,
>>> profile_count direct_count)
>>>    else
>>>      e2->can_throw_external = can_throw_external;
>>>    e2->lto_stmt_uid = lto_stmt_uid;
>>> +  e2->speculative_id = speculative_id;
>>>    e2->in_polymorphic_cdtor = in_polymorphic_cdtor;
>>>    count -= e2->count;
>>>    symtab->call_edge_duplication_hooks (this, e2);
>>>    ref = n->create_reference (n2, IPA_REF_ADDR, call_stmt);
>>>    ref->lto_stmt_uid = lto_stmt_uid;
>>> +  ref->speculative_id = speculative_id;
>>>    ref->speculative = speculative;
>>>    n2->mark_address_taken ();
>>>    return e2;
>>> @@ -1087,6 +1091,38 @@ cgraph_edge::make_speculative (cgraph_node *n2,
>>> profile_count direct_count)
>>>     call) and if one of them exists, all of them must exist.
>>>     Given speculative call edge, return all three components.
>>> +
>>> +   For some indirect edge, it may maps to multiple direct edges, i.e. 1:N.
>>> +   check the speculative_id to return all the three components for
>>> specified
>>> +   direct edge or indirect edge.
>>> +   If input is indirect, caller of this function will get the direct edge
>>> one by
>>> +   one, get_edge will just return one of the direct edge mapped to the
>>> indirect
>>> +   edge, the returned direct edge will be resolved or redirected by the
>>> caller,
>>> +   then number of indirect calls (speculations) is deceased in each access.
>>> +   If input is direct, this function will get the indirect edge and
>>> reference
>>> +   with matched speculative_id, the returned edge will also be resolved or
>>> +   redirected, decrease the speculations accordingly.
>>> +   Speculations of indirect edge will be dropped only if all direct edges
>>> +   be handled.
>>> +
>>> +   e.g. for indirect edge E statement "call call_dest":
>>> +
>>> +   Redirect N3 after redirected N2:
>>> +
>>> +   if (call_dest == N2)
>>> +     n2 ();
>>> +   else if (call_dest == N3)
>>> +     n3 ();
>>> +   else
>>> +     call call_dest
>>> +
>>> +   Resolve N3 and only redirect N2:
>>> +
>>> +   if (call_dest == N2)
>>> +     n2 ();
>>> +   else
>>> +     call call_dest
>>> +
>>>   */
>>>  void
>>> @@ -1126,7 +1162,7 @@ cgraph_edge::speculative_call_info (cgraph_edge
>>> *&direct,
>>>    reference = NULL;
>>>    for (i = 0; e->caller->iterate_reference (i, ref); i++)
>>> -    if (ref->speculative
>>> +    if (ref->speculative && ref->speculative_id == e->speculative_id
>>>          && ((ref->stmt && ref->stmt == e->call_stmt)
>>>              || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
>>>        {
>>> @@ -1187,7 +1223,21 @@ cgraph_edge::resolve_speculation (tree callee_decl)
>>>           in the functions inlined through it. */
>>>      }
>>>    edge->count += e2->count;
>>> -  edge->speculative = false;
>>> +  /* edge is indirect, e2 is direct. If edge contains multiple
>>> speculations,
>>> +     remove one of speculations for this indirect edge, then if edge still
>>> +     contains direct target, keep the speculation, next direct target
>>> +     will continue use it. Give up speculation completely if no direct
>>> +     target is left for this indirect edge. */
>>> +  if (edge->has_indirect_call_p ())
>>> +    {
>>> +      /* As the direct targets are sorted by decrease, delete the first
>>> target
>>> +         when it is resolved. */
>>> +      edge->indirect_info->indirect_call_targets->ordered_remove (0);
>>> +      if (!edge->indirect_info->indirect_call_targets->length ())
>>> +        edge->speculative = false;
>>> +    }
>>> +  else
>>> +    edge->speculative = false;
>>>    e2->speculative = false;
>>>    ref->remove_reference ();
>>>    if (e2->indirect_unknown_callee || e2->inline_failed)
>>> @@ -1321,7 +1371,21 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
>>>        e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
>>> false);
>>>        e->count = gimple_bb (e->call_stmt)->count;
>>> -      e2->speculative = false;
>>> +      /* edge is direct, e2 is indirect here. If e2 contains multiple
>>> +         speculations, remove one of speculations for this indirect edge,
>>> +         then if e2 still contains direct target, keep the speculation,
>>> +         next direct target will continue use it. Give up speculation
>>> +         completely if no direct target is left for this indirect e2. */
>>> +      if (e2->has_indirect_call_p ())
>>> +        {
>>> +          /* As the direct targets are sorted by decrease, delete the first
>>> +             target when it is redirected. */
>>> +          e2->indirect_info->indirect_call_targets->ordered_remove (0);
>>> +          if (!e2->indirect_info->indirect_call_targets->length ())
>>> +            e2->speculative = false;
>>> +        }
>>> +      else
>>> +        e2->speculative = false;
>>>        e2->count = gimple_bb (e2->call_stmt)->count;
>>>        ref->speculative = false;
>>>        ref->stmt = NULL;
>>> @@ -3445,7 +3509,7 @@ cgraph_node::verify_node (void)
>>>        for (e = callees; e; e = e->next_callee)
>>>          {
>>> -          if (!e->aux)
>>> +          if (!e->aux && !e->speculative)
>>>              {
>>>                error ("edge %s->%s has no corresponding call_stmt",
>>>                       identifier_to_locale (e->caller->name ()),
>>> @@ -3872,6 +3936,22 @@ cgraph_edge::possibly_call_in_translation_unit_p
>>> (void)
>>>    return node->get_availability () >= AVAIL_AVAILABLE;
>>>  }
>>> +/* Return true if this edge has multiple indirect call targets. */
>>> +bool
>>> +cgraph_edge::has_multiple_indirect_call_p (void)
>>> +{
>>> +  return indirect_info && indirect_info->indirect_call_targets
>>> +         && indirect_info->indirect_call_targets->length () > 1;
>>> +}
>>
>> For multiline && expression, we typically wrap the whole condition
>> in '(' and ')'.
>>
>>> +
>>> +/* Return true if this edge has at least one indirect call target. */
>>> +bool
>>> +cgraph_edge::has_indirect_call_p (void)
>>> +{
>>> +  return indirect_info && indirect_info->indirect_call_targets
>>> +         && indirect_info->indirect_call_targets->length ();
>>> +}
>>
>> Likewise here.
>>
>>> +
>>>  /* A stashed copy of "symtab" for use by selftest::symbol_table_test.
>>>     This needs to be a global so that it can be a GC root, and thus
>>>     prevent the stashed copy from being garbage-collected if the GC runs
>>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
>>> index 4c54210123a..33c8454c4e0 100644
>>> --- a/gcc/cgraph.h
>>> +++ b/gcc/cgraph.h
>>> @@ -1636,6 +1636,16 @@ private:
>>>    void make_speculative (tree otr_type = NULL);
>>>  };
>>> +/* Structure containing indirect target information from profile. */
>>> +
>>> +struct GTY (()) indirect_target_info
>>> +{
>>> +  /* Profile_id of common target obtained from profile. */
>>> +  unsigned int common_target_id;
>>> +  /* Probability that call will land in function with COMMON_TARGET_ID. */
>>> +  int common_target_probability;
>>> +};
>>> +
>>>  /* Structure containing additional information about an indirect call. */
>>>  class GTY(()) cgraph_indirect_call_info
>>> @@ -1654,10 +1664,9 @@ public:
>>>    int param_index;
>>>    /* ECF flags determined from the caller. */
>>>    int ecf_flags;
>>> -  /* Profile_id of common target obtrained from profile. */
>>> -  int common_target_id;
>>> -  /* Probability that call will land in function with COMMON_TARGET_ID. */
>>> -  int common_target_probability;
>>> +
>>> +  /* An indirect call may contain one or multiple call targets. */
>>> +  vec<indirect_target_info, va_gc> *indirect_call_targets;
>>>    /* Set when the call is a virtual call with the parameter being the
>>>       associated object pointer rather than a simple direct call. */
>>> @@ -1714,7 +1723,8 @@ public:
>>>    /* Turn edge into speculative call calling N2. Update
>>>       the profile so the direct call is taken COUNT times
>>>       with FREQUENCY. */
>>> -  cgraph_edge *make_speculative (cgraph_node *n2, profile_count
>>> direct_count);
>>> +  cgraph_edge *make_speculative (cgraph_node *n2, profile_count
>>> direct_count,
>>> +                                 unsigned int speculative_id = 0);
>>>    /* Given speculative call edge, return all three components. */
>>>    void speculative_call_info (cgraph_edge *&direct, cgraph_edge
>>> *&indirect,
>>> @@ -1773,6 +1783,12 @@ public:
>>>       be internal to the current translation unit. */
>>>    bool possibly_call_in_translation_unit_p (void);
>>> +  /* Return true if this edge has multiple indirect call targets. */
>>> +  bool has_multiple_indirect_call_p (void);
>>> +
>>> +  /* Return true if this edge has at least one indirect call target. */
>>> +  bool has_indirect_call_p (void);
>>> +
>>>    /* Expected number of executions: calculated in profile.c. */
>>>    profile_count count;
>>>    cgraph_node *caller;
>>> @@ -1792,6 +1808,9 @@ public:
>>>    /* The stmt_uid of call_stmt. This is used by LTO to recover the
>>> call_stmt
>>>       when the function is serialized in. */
>>>    unsigned int lto_stmt_uid;
>>> +  /* speculative id is used by multiple indirect targets when the function
>>> is
>>> +     speculated. */
>>> +  unsigned int speculative_id;
>>>    /* Whether this edge was made direct by indirect inlining. */
>>>    unsigned int indirect_inlining_edge : 1;
>>>    /* Whether this edge describes an indirect call with an undetermined
>>> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
>>> index fa753697c78..5dbd8d90b77 100644
>>> --- a/gcc/cgraphclones.c
>>> +++ b/gcc/cgraphclones.c
>>> @@ -128,6 +128,7 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt,
>>> unsigned stmt_uid,
>>>    new_edge->inline_failed = inline_failed;
>>>    new_edge->indirect_inlining_edge = indirect_inlining_edge;
>>>    new_edge->lto_stmt_uid = stmt_uid;
>>> +  new_edge->speculative_id = speculative_id;
>>>    /* Clone flags that depend on call_stmt availability manually. */
>>>    new_edge->can_throw_external = can_throw_external;
>>>    new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p;
>>> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
>>> index b62d280eb25..6136214f9ac 100644
>>> --- a/gcc/ipa-inline.c
>>> +++ b/gcc/ipa-inline.c
>>> @@ -1881,12 +1881,15 @@ inline_small_functions (void)
>>>          }
>>>        if (has_speculative)
>>>          for (edge = node->callees; edge; edge = next)
>>> -          if (edge->speculative && !speculation_useful_p (edge,
>>> -                                                          edge->aux != NULL))
>>> -            {
>>> -              edge->resolve_speculation ();
>>> -              update = true;
>>> -            }
>>> +          {
>>> +            next = edge->next_callee;
>>> +            if (edge->speculative
>>> +                && !speculation_useful_p (edge, edge->aux != NULL))
>>> +              {
>>> +                edge->resolve_speculation ();
>>> +                update = true;
>>> +              }
>>> +          }
>>>        if (update)
>>>          {
>>>            struct cgraph_node *where = node->global.inlined_to
>>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
>>> index 970dba39c80..342e8ea05d1 100644
>>> --- a/gcc/ipa-profile.c
>>> +++ b/gcc/ipa-profile.c
>>> @@ -192,23 +192,35 @@ ipa_profile_generate_summary (void)
>>>            if (h)
>>>              {
>>>                gcov_type val, count, all;
>>> -              if (get_nth_most_common_value (NULL, "indirect call", h,
>>> -                                             &val, &count, &all))
>>> +              struct cgraph_edge *e = node->get_edge (stmt);
>>> +              if (e && !e->indirect_unknown_callee)
>>> +                continue;
>>> +
>>> +              struct indirect_target_info item;
>>> +              for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
>>>                  {
>>> -                  struct cgraph_edge * e = node->get_edge (stmt);
>>> -                  if (e && !e->indirect_unknown_callee)
>>> +                  if (!get_nth_most_common_value (NULL, "indirect call",
>>> +                                                  h, &val, &count, &all,
>>> +                                                  j))
>>> +                    continue;
>>> +
>>> +                  if (val == 0)
>>>                      continue;
>>> -                  e->indirect_info->common_target_id = val;
>>> -                  e->indirect_info->common_target_probability
>>> +                  item.common_target_id = val;
>>> +                  item.common_target_probability
>>>                      = GCOV_COMPUTE_SCALE (count, all);
>>
>> There's one of the places where you can use the constructor.
>>
>>> -                  if (e->indirect_info->common_target_probability >
>>> REG_BR_PROB_BASE)
>>> +                  if (item.common_target_probability > REG_BR_PROB_BASE)
>>>                      {
>>>                        if (dump_file)
>>> -                        fprintf (dump_file, "Probability capped to 1\n");
>>> -                      e->indirect_info->common_target_probability =
>>> REG_BR_PROB_BASE;
>>> +                        fprintf (dump_file,
>>> +                                 "Probability capped to 1\n");
>>> +                      item.common_target_probability = REG_BR_PROB_BASE;
>>>                      }
>>> +                  vec_safe_push (
>>> +                    e->indirect_info->indirect_call_targets, item);
>>>                  }
>>> +
>>>                gimple_remove_histogram_value (DECL_STRUCT_FUNCTION
>>> (node->decl),
>>>                                               stmt, h);
>>>              }
>>> @@ -492,6 +504,7 @@ ipa_profile (void)
>>>    int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted
>>> = 0;
>>>    int nmismatch = 0, nimpossible = 0;
>>>    bool node_map_initialized = false;
>>> +  gcov_type threshold;
>>>    if (dump_file)
>>>      dump_histogram (dump_file, histogram);
>>> @@ -500,14 +513,12 @@ ipa_profile (void)
>>>        overall_time += histogram[i]->count * histogram[i]->time;
>>>        overall_size += histogram[i]->size;
>>>      }
>>> +  threshold = 0;
>>>    if (overall_time)
>>>      {
>>> -      gcov_type threshold;
>>> -
>>>        gcc_assert (overall_size);
>>>        cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) +
>>> 500) / 1000;
>>> -      threshold = 0;
>>>        for (i = 0; cumulated < cutoff; i++)
>>>          {
>>>            cumulated += histogram[i]->count * histogram[i]->time;
>>> @@ -543,7 +554,7 @@ ipa_profile (void)
>>>    histogram.release ();
>>>    histogram_pool.release ();
>>> -  /* Produce speculative calls: we saved common traget from porfiling
>>> into
>>> +  /* Produce speculative calls: we saved common target from profiling into
>>>       e->common_target_id. Now, at link time, we can look up corresponding
>>>       function node and produce speculative call. */
>>> @@ -558,13 +569,37 @@ ipa_profile (void)
>>>          {
>>>            if (n->count.initialized_p ())
>>>              nindirect++;
>>> -          if (e->indirect_info->common_target_id)
>>> +          if (e->has_indirect_call_p ())
>>>              {
>>>                if (!node_map_initialized)
>>> -                init_node_map (false);
>>> +                init_node_map (false);
>>>                node_map_initialized = true;
>>>                ncommon++;
>>> -              n2 = find_func_by_profile_id
>>> (e->indirect_info->common_target_id);
>>> +
>>> +              if (in_lto_p)
>>> +                {
>>> +                  if (dump_file)
>>> +                    {
>>> +                      fprintf (dump_file,
>>> +                               "Updating hotness threshold in LTO mode.\n");
>>> +                      fprintf (dump_file, "Updated min count: %" PRId64 "\n",
>>> +                               (int64_t) threshold);
>>> +                    }
>>> +                  set_hot_bb_threshold (threshold
>>> +                    / e->indirect_info->indirect_call_targets->length ());
>>> +                }
>>> +
>>> +              unsigned speculative_id = 0;
>>> +              struct indirect_target_info *item;
>>> +              /* The code below is not formatted yet for review convenience.
>>> +                 Move to a seprate small function is not easy as too many local
>>> +                 variables used in it. Need format and remove this comments
>>> +                 once got approved. */
>>> +              FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets,
>>> i,
>>> +                                     item)
>>> +                {
>>> +                  bool speculative_found = false;
>>> +                  n2 = find_func_by_profile_id (item->common_target_id);
>>>                if (n2)
>>>                  {
>>>                    if (dump_file)
>>> @@ -573,11 +608,10 @@ ipa_profile (void)
>>>                                 " other module %s => %s, prob %3.2f\n",
>>>                                 n->dump_name (),
>>>                                 n2->dump_name (),
>>> -                               e->indirect_info->common_target_probability
>>> -                               / (float)REG_BR_PROB_BASE);
>>> +                               item->common_target_probability
>>> +                               / (float) REG_BR_PROB_BASE);
>>>                      }
>>> -                  if (e->indirect_info->common_target_probability
>>> -                      < REG_BR_PROB_BASE / 2)
>>> +                  if (item->common_target_probability < REG_BR_PROB_BASE / 2)
>>>                      {
>>>                        nuseless++;
>>>                        if (dump_file)
>>> @@ -613,7 +647,7 @@ ipa_profile (void)
>>>                        if (dump_file)
>>>                          fprintf (dump_file,
>>>                                   "Not speculating: "
>>> -                                 "parameter count mistmatch\n");
>>> +                                 "parameter count mismatch\n");
>>>                      }
>>>                    else if (e->indirect_info->polymorphic
>>>                             && !opt_for_fn (n->decl, flag_devirtualize)
>>> @@ -640,20 +674,30 @@ ipa_profile (void)
>>>                            n2 = alias;
>>>                        }
>>>                        nconverted++;
>>> -                      e->make_speculative
>>> -                        (n2,
>>> -                         e->count.apply_probability
>>> -                           (e->indirect_info->common_target_probability));
>>> +                      e->make_speculative (n2,
>>> +                                           e->count.apply_probability (
>>> +                                             item->common_target_probability),
>>> +                                           speculative_id);
>>>                        update = true;
>>> +                      speculative_id++;
>>> +                      speculative_found = true;
>>>                      }
>>>                  }
>>>                else
>>>                  {
>>>                    if (dump_file)
>>>                      fprintf (dump_file, "Function with profile-id %i not
>>> found.\n",
>>> -                             e->indirect_info->common_target_id);
>>> +                             item->common_target_id);
>>>                    nunknown++;
>>>                  }
>>> +              if (!speculative_found)
>>> +                {
>>> +                  /* Remove item from indirect_call_targets if no
>>> +                     speculative edge generated, rollback the iteration. */
>>> +                  e->indirect_info->indirect_call_targets->ordered_remove (i);
>>> +                  i--;
>>> +                }
>>> +                }
>>>              }
>>>          }
>>>        if (update)
>>> diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
>>> index 0d8e509c932..3e6562ec9d1 100644
>>> --- a/gcc/ipa-ref.h
>>> +++ b/gcc/ipa-ref.h
>>> @@ -59,6 +59,7 @@ public:
>>>    symtab_node *referred;
>>>    gimple *stmt;
>>>    unsigned int lto_stmt_uid;
>>> +  unsigned int speculative_id;
>>>    unsigned int referred_index;
>>>    ENUM_BITFIELD (ipa_ref_use) use:3;
>>>    unsigned int speculative:1;
>>> diff --git a/gcc/ipa.c b/gcc/ipa.c
>>> index 6b84e1f9bda..a10b0603f14 100644
>>> --- a/gcc/ipa.c
>>> +++ b/gcc/ipa.c
>>> @@ -166,7 +166,7 @@ process_references (symtab_node *snode,
>>>     devirtualization happens. After inlining still keep their declarations
>>>     around, so we can devirtualize to a direct call.
>>> -   Also try to make trivial devirutalization when no or only one target
>>> is
>>> +   Also try to make trivial devirtualization when no or only one target is
>>>     possible. */
>>>  static void
>>> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
>>> index bc0f0107333..61380dcc7b8 100644
>>> --- a/gcc/lto-cgraph.c
>>> +++ b/gcc/lto-cgraph.c
>>> @@ -238,6 +238,7 @@ lto_output_edge (struct lto_simple_output_block *ob,
>>> struct cgraph_edge *edge,
>>>    unsigned int uid;
>>>    intptr_t ref;
>>>    struct bitpack_d bp;
>>> +  unsigned len;
>>>    if (edge->indirect_unknown_callee)
>>>      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
>>> LTO_symtab_last_tag,
>>> @@ -265,6 +266,7 @@ lto_output_edge (struct lto_simple_output_block *ob,
>>> struct cgraph_edge *edge,
>>>    bp_pack_enum (&bp, cgraph_inline_failed_t,
>>>                  CIF_N_REASONS, edge->inline_failed);
>>>    bp_pack_var_len_unsigned (&bp, uid);
>>> +  bp_pack_var_len_unsigned (&bp, edge->speculative_id);
>>>    bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
>>>    bp_pack_value (&bp, edge->speculative, 1);
>>>    bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
>>> @@ -291,11 +293,27 @@ lto_output_edge (struct lto_simple_output_block *ob,
>>> struct cgraph_edge *edge,
>>>    streamer_write_bitpack (&bp);
>>>    if (edge->indirect_unknown_callee)
>>>      {
>>> -      streamer_write_hwi_stream (ob->main_stream,
>>> -                                 edge->indirect_info->common_target_id);
>>> -      if (edge->indirect_info->common_target_id)
>>> -        streamer_write_hwi_stream
>>> -          (ob->main_stream, edge->indirect_info->common_target_probability);
>>> +      struct indirect_target_info *item;
>>> +      unsigned int i;
>>> +      len = edge->has_indirect_call_p ()
>>> +              ? edge->indirect_info->indirect_call_targets->length ()
>>> +              : 0;
>>> +      gcc_assert (len <= GCOV_TOPN_VALUES);
>>> +
>>> +      streamer_write_hwi_stream (ob->main_stream, len);
>>> +
>>> +      if (len)
>>> +        {
>>> +          FOR_EACH_VEC_SAFE_ELT (edge->indirect_info->indirect_call_targets, i,
>>> +                                 item)
>>> +            {
>>> +              streamer_write_hwi_stream (ob->main_stream,
>>> +                                         item->common_target_id);
>>> +              if (item->common_target_id)
>>> +                streamer_write_hwi_stream (ob->main_stream,
>>> +                                           item->common_target_probability);
>>> +            }
>>> +        }
>>>      }
>>>  }
>>> @@ -688,6 +706,7 @@ lto_output_ref (struct lto_simple_output_block *ob,
>>> struct ipa_ref *ref,
>>>        if (ref->stmt)
>>>          uid = gimple_uid (ref->stmt) + 1;
>>>        streamer_write_hwi_stream (ob->main_stream, uid);
>>> +      streamer_write_hwi_stream (ob->main_stream, ref->speculative_id);
>>>      }
>>>  }
>>> @@ -1419,7 +1438,10 @@ input_ref (class lto_input_block *ib,
>>>    ref = referring_node->create_reference (node, use);
>>>    ref->speculative = speculative;
>>>    if (is_a <cgraph_node *> (referring_node))
>>> -    ref->lto_stmt_uid = streamer_read_hwi (ib);
>>> +    {
>>> +      ref->lto_stmt_uid = streamer_read_hwi (ib);
>>> +      ref->speculative_id = streamer_read_hwi (ib);
>>> +    }
>>>  }
>>>  /* Read an edge from IB. NODES points to a vector of previously read
>>> nodes for
>>> @@ -1433,11 +1455,12 @@ input_edge (class lto_input_block *ib,
>>> vec<symtab_node *> nodes,
>>>  {
>>>    struct cgraph_node *caller, *callee;
>>>    struct cgraph_edge *edge;
>>> -  unsigned int stmt_id;
>>> +  unsigned int stmt_id, speculative_id;
>>>    profile_count count;
>>>    cgraph_inline_failed_t inline_failed;
>>>    struct bitpack_d bp;
>>>    int ecf_flags = 0;
>>> +  unsigned i, len;
>>>    caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
>>>    if (caller == NULL || caller->decl == NULL_TREE)
>>> @@ -1457,6 +1480,7 @@ input_edge (class lto_input_block *ib,
>>> vec<symtab_node *> nodes,
>>>    bp = streamer_read_bitpack (ib);
>>>    inline_failed = bp_unpack_enum (&bp, cgraph_inline_failed_t,
>>> CIF_N_REASONS);
>>>    stmt_id = bp_unpack_var_len_unsigned (&bp);
>>> +  speculative_id = bp_unpack_var_len_unsigned (&bp);
>>>    if (indirect)
>>>      edge = caller->create_indirect_edge (NULL, 0, count);
>>> @@ -1466,6 +1490,7 @@ input_edge (class lto_input_block *ib,
>>> vec<symtab_node *> nodes,
>>>    edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
>>>    edge->speculative = bp_unpack_value (&bp, 1);
>>>    edge->lto_stmt_uid = stmt_id;
>>> +  edge->speculative_id = speculative_id;
>>>    edge->inline_failed = inline_failed;
>>>    edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
>>>    edge->can_throw_external = bp_unpack_value (&bp, 1);
>>> @@ -1485,9 +1510,21 @@ input_edge (class lto_input_block *ib,
>>> vec<symtab_node *> nodes,
>>>        if (bp_unpack_value (&bp, 1))
>>>          ecf_flags |= ECF_RETURNS_TWICE;
>>>        edge->indirect_info->ecf_flags = ecf_flags;
>>> -      edge->indirect_info->common_target_id = streamer_read_hwi (ib);
>>> -      if (edge->indirect_info->common_target_id)
>>> -        edge->indirect_info->common_target_probability = streamer_read_hwi
>>> (ib);
>>> +
>>> +      len = streamer_read_hwi (ib);
>>> +
>>> +      gcc_assert (len <= GCOV_TOPN_VALUES);
>>> +
>>> +      if (len)
>>> +        {
>>> +          indirect_target_info item;
>>> +          for (i = 0; i < len; i++)
>>> +            {
>>> +              item.common_target_id = streamer_read_hwi (ib);
>>> +              item.common_target_probability = streamer_read_hwi (ib);
>>> +              vec_safe_push (edge->indirect_info->indirect_call_targets, item);
>>> +            }
>>> +        }
>>>      }
>>>  }
>>> diff --git a/gcc/predict.c b/gcc/predict.c
>>> index 915f0806b11..3f56fa3a74a 100644
>>> --- a/gcc/predict.c
>>> +++ b/gcc/predict.c
>>> @@ -762,7 +762,6 @@ dump_prediction (FILE *file, enum br_predictor
>>> predictor, int probability,
>>>        && bb->count.precise_p ()
>>>        && reason == REASON_NONE)
>>>      {
>>> -      gcc_assert (e->count ().precise_p ());
>>>        fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
>>>                 predictor_info[predictor].name,
>>>                 bb->count.to_gcov_type (), e->count ().to_gcov_type (),
>>> diff --git a/gcc/symtab.c b/gcc/symtab.c
>>> index ee9723c3453..d4c36fd3e5a 100644
>>> --- a/gcc/symtab.c
>>> +++ b/gcc/symtab.c
>>> @@ -603,6 +603,7 @@ symtab_node::create_reference (symtab_node
>>> *referred_node,
>>>    ref->referred = referred_node;
>>>    ref->stmt = stmt;
>>>    ref->lto_stmt_uid = 0;
>>> +  ref->speculative_id = 0;
>>>    ref->use = use_type;
>>>    ref->speculative = 0;
>>> @@ -660,6 +661,7 @@ symtab_node::clone_references (symtab_node *node)
>>>        ref2 = create_reference (ref->referred, ref->use, ref->stmt);
>>>        ref2->speculative = speculative;
>>>        ref2->lto_stmt_uid = stmt_uid;
>>> +      ref2->speculative_id = ref->speculative_id;
>>>      }
>>>  }
>>> @@ -678,6 +680,7 @@ symtab_node::clone_referring (symtab_node *node)
>>>        ref2 = ref->referring->create_reference (this, ref->use, ref->stmt);
>>>        ref2->speculative = speculative;
>>>        ref2->lto_stmt_uid = stmt_uid;
>>> +      ref2->speculative_id = ref->speculative_id;
>>>      }
>>>  }
>>> @@ -693,6 +696,7 @@ symtab_node::clone_reference (ipa_ref *ref, gimple
>>> *stmt)
>>>    ref2 = create_reference (ref->referred, ref->use, stmt);
>>>    ref2->speculative = speculative;
>>>    ref2->lto_stmt_uid = stmt_uid;
>>> +  ref2->speculative_id = ref->speculative_id;
>>>    return ref2;
>>>  }
>>> @@ -747,6
+751,7 @@ symtab_node::clear_stmts_in_references (void) >>> { >>> r->stmt = NULL; >>> r->lto_stmt_uid = 0; >>> + r->speculative_id = 0; >>> } >>> } >>> diff --git >>> a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> new file mode 100644 >>> index 00000000000..e0a83c2e067 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> @@ -0,0 +1,35 @@ >>> +/* { dg-require-effective-target lto } */ >>> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param >>> indir-call-topn-profile=1" } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a); >>> + >>> +int >>> +two (int a); >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int >>> +main() >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + x = one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); >>> + p = table[x]; >>> + } >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile_estimate" } } */ >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile_estimate" } } */ >>> + >>> diff --git >>> a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> new file mode 100644 >>> index 00000000000..a8c6e365fb9 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> @@ -0,0 +1,22 @@ >>> +/* It seems there is no way to avoid the other source of mulitple >>> + source testcase from being compiled independently. Just avoid >>> + error. 
*/ >>> +#ifdef DOJOB >>> +int >>> +one (int a) >>> +{ >>> + return 1; >>> +} >>> + >>> +int >>> +two (int a) >>> +{ >>> + return 0; >>> +} >>> +#else >>> +int >>> +main() >>> +{ >>> + return 0; >>> +} >>> +#endif >>> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> new file mode 100644 >>> index 00000000000..aa3887fde83 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> @@ -0,0 +1,42 @@ >>> +/* { dg-require-effective-target lto } */ >>> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param >>> indir-call-topn-profile=1" } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a); >>> + >>> +int >>> +two (int a); >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int foo () >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + x = one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); >>> + p = table[x]; >>> + } >>> + return x; >>> +} >>> + >>> +int >>> +main() >>> +{ >>> + int x = foo (); >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile_estimate" } } */ >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile_estimate" } } */ >>> + >>> + >>> diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> new file mode 100644 >>> index 00000000000..951bc7ddd19 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> @@ -0,0 +1,38 @@ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 
-fdump-ipa-profile --param indir-call-topn-profile=1" >>> } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a) >>> +{ >>> + return 1; >>> +} >>> + >>> +int >>> +two (int a) >>> +{ >>> + return 0; >>> +} >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int >>> +main() >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); >>> + p = table[x]; >>> + } >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile" } } */ >>> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile" } } */ >>> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c >>> index b9c1a3b1456..fe3e172fbd1 100644 >>> --- a/gcc/tree-inline.c >>> +++ b/gcc/tree-inline.c >>> @@ -2167,6 +2167,25 @@ copy_bb (copy_body_data *id, basic_block bb, >>> gcc_assert (!edge->indirect_unknown_callee); >>> old_edge->speculative_call_info (direct, indirect, ref); >>> + while (old_edge->next_callee >>> + && old_edge->next_callee->speculative >>> + && indirect->has_multiple_indirect_call_p ()) >>> + { >>> + /* Some speculative calls may contain more than >>> + one direct target, loop iterate it to clone all >>> + related direct edges before cloning the related >>> + indirect edge. 
*/ >>> + id->dst_node->clone_reference (ref, stmt); >>> + >>> + edge = old_edge->next_callee; >>> + edge = edge->clone (id->dst_node, call_stmt, >>> + gimple_uid (stmt), num, den, >>> + true); >>> + old_edge = old_edge->next_callee; >>> + gcc_assert (!edge->indirect_unknown_callee); >>> + old_edge->speculative_call_info (direct, indirect, >>> + ref); >>> + } >>> profile_count indir_cnt = indirect->count; >>> indirect = indirect->clone (id->dst_node, call_stmt, >>> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c >>> index 4c1ead5781f..ef7748668f8 100644 >>> --- a/gcc/tree-profile.c >>> +++ b/gcc/tree-profile.c >>> @@ -74,8 +74,8 @@ static GTY(()) tree ic_tuple_callee_field; >>> /* Do initialization work for the edge profiler. */ >>> /* Add code: >>> - __thread gcov* __gcov_indirect_call_counters; // pointer to actual >>> counter >>> - __thread void* __gcov_indirect_call_callee; // actual callee address >>> + __thread gcov* __gcov_indirect_call.counters; // pointer to actual >>> counter >>> + __thread void* __gcov_indirect_call.callee; // actual callee address >>> __thread int __gcov_function_counter; // time profiler function counter >>> */ >>> static void >>> @@ -382,7 +382,7 @@ gimple_gen_ic_profiler (histogram_value value, unsigned >>> tag) >>> f_1 = foo; >>> __gcov_indirect_call.counters = &__gcov4.main[0]; >>> PROF_9 = f_1; >>> - __gcov_indirect_call_callee = PROF_9; >>> + __gcov_indirect_call.callee = PROF_9; >>> _4 = f_1 (); >>> */ >>> @@ -445,11 +445,11 @@ gimple_gen_ic_func_profiler (void) >>> /* Insert code: >>> - if (__gcov_indirect_call_callee != NULL) >>> + if (__gcov_indirect_call.callee != NULL) >>> __gcov_indirect_call_profiler_v3 (profile_id, >>> ¤t_function_decl); >>> The function __gcov_indirect_call_profiler_v3 is responsible for >>> - resetting __gcov_indirect_call_callee to NULL. */ >>> + resetting __gcov_indirect_call.callee to NULL. 
*/ >>> gimple_stmt_iterator gsi = gsi_start_bb (cond_bb); >>> void0 = build_int_cst (ptr_type_node, 0); >>> @@ -891,7 +891,7 @@ pass_ipa_tree_profile::gate (function *) >>> { >>> /* When profile instrumentation, use or test coverage shall be >>> performed. >>> But for AutoFDO, this there is no instrumentation, thus this pass is >>> - diabled. */ >>> + disabled. */ >>> return (!in_lto_p && !flag_auto_profile >>> && (flag_branch_probabilities || flag_test_coverage >>> || profile_arc_flag)); >>> diff --git a/gcc/value-prof.c b/gcc/value-prof.c >>> index 55ea0973a03..0588df0fce9 100644 >>> --- a/gcc/value-prof.c >>> +++ b/gcc/value-prof.c >>> @@ -1406,11 +1406,10 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node >>> *direct_call, >>> return dcall_stmt; >>> } >>> -/* >>> - For every checked indirect/virtual call determine if most common pid of >>> - function/class method has probability more than 50%. If yes modify code >>> of >>> - this call to: >>> - */ >>> +/* There maybe multiple indirect targets in histogram. Check every >>> + indirect/virtual call if callee function exists, if not exist, leave it >>> to >>> + LTO stage for later process. Modify code of this indirect call to an >>> if-else >>> + structure in ipa-profile finally. */ >>> static bool >>> gimple_ic_transform (gimple_stmt_iterator *gsi) >>> @@ -1434,48 +1433,57 @@ gimple_ic_transform (gimple_stmt_iterator *gsi) >>> if (!histogram) >>> return false; >> >> The function is not correct, note that the function can now return true >> when this transformation happens: >> "Indirect call -> direct call from other " >> "module %T=> %i (will resolve only with LTO)\n", >> >> Current trunk returns false in that case. 
>> >>> - if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val, >>> - &count, &all)) >>> - return false; >>> + count = 0; >>> + all = histogram->hvalue.counters[0]; >>> - if (4 * count <= 3 * all) >>> - return false; >>> + for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++) >>> + { >>> + if (!get_nth_most_common_value (NULL, "indirect call", histogram, >>> &val, >>> + &count, &all, j)) >>> + continue; >> >> You should break here as get_nth_most_common_value (..., j + 1) will also >> return >> false. >> >>> - direct_call = find_func_by_profile_id ((int)val); >>> + /* Minimum probability. should be higher than 25%. */ >>> + if (4 * count <= all) >>> + continue; >> >> You can break here as well. >> >> Thank you, >> Martin >> >>> - if (direct_call == NULL) >>> - { >>> - if (val) >>> + direct_call = find_func_by_profile_id ((int) val); >>> + >>> + if (direct_call == NULL) >>> + { >>> + if (val) >>> + { >>> + if (dump_enabled_p ()) >>> + dump_printf_loc ( >>> + MSG_MISSED_OPTIMIZATION, stmt, >>> + "Indirect call -> direct call from other " >>> + "module %T=> %i (will resolve only with LTO)\n", >>> + gimple_call_fn (stmt), (int) val); >>> + } >>> + continue; >>> + } >>> + >>> + if (!check_ic_target (stmt, direct_call)) >>> { >>> if (dump_enabled_p ()) >>> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, stmt, >>> - "Indirect call -> direct call from other " >>> - "module %T=> %i (will resolve only with LTO)\n", >>> - gimple_call_fn (stmt), (int)val); >>> + dump_printf_loc ( >>> + MSG_MISSED_OPTIMIZATION, stmt, >>> + "Indirect call -> direct call %T => %T " >>> + "transformation skipped because of type mismatch: %G", >>> + gimple_call_fn (stmt), direct_call->decl, stmt); >>> + gimple_remove_histogram_value (cfun, stmt, histogram); >>> + return false; >>> } >>> - return false; >>> - } >>> - if (!check_ic_target (stmt, direct_call)) >>> - { >>> if (dump_enabled_p ()) >>> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, stmt, >>> - "Indirect call -> direct call %T => %T " 
>>> - "transformation skipped because of type mismatch: %G", >>> - gimple_call_fn (stmt), direct_call->decl, stmt); >>> - gimple_remove_histogram_value (cfun, stmt, histogram); >>> - return false; >>> - } >>> - >>> - if (dump_enabled_p ()) >>> - { >>> - dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, stmt, >>> - "Indirect call -> direct call " >>> - "%T => %T transformation on insn postponed\n", >>> - gimple_call_fn (stmt), direct_call->decl); >>> - dump_printf_loc (MSG_NOTE, stmt, >>> - "hist->count %" PRId64 >>> - " hist->all %" PRId64"\n", count, all); >>> + { >>> + dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, stmt, >>> + "Indirect call -> direct call " >>> + "%T => %T transformation on insn postponed\n", >>> + gimple_call_fn (stmt), direct_call->decl); >>> + dump_printf_loc (MSG_NOTE, stmt, >>> + "hist->count %" PRId64 " hist->all %" PRId64 "\n", >>> + count, all); >>> + } >>> } >>> return true; >>> >>
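To illustrate the two "break" remarks on the loop in gimple_ic_transform, here is a minimal standalone sketch. It is not GCC code: topn_entry and select_hot_targets are hypothetical names, std::vector stands in for the histogram, and the sketch assumes the top-N entries are ordered by descending count with unused slots (value 0) at the end. Under that assumption the first failing entry ends the scan, which is why break can replace continue.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one top-N histogram entry; not a GCC type.
struct topn_entry
{
  std::int64_t value;  // profile id of the callee, 0 if the slot is unused
  std::int64_t count;  // how often this callee was observed
};

// Return the profile ids whose share of ALL is above 25%.
// Assumption: ENTRIES is sorted by descending count and unused slots sit
// at the end, so nothing after the first rejected entry can qualify.
std::vector<std::int64_t>
select_hot_targets (const std::vector<topn_entry> &entries, std::int64_t all)
{
  assert (all >= 0);
  std::vector<std::int64_t> hot;
  for (const topn_entry &e : entries)
    {
      if (e.value == 0)
        break;  // no valid entry at this index, so none at later ones
      if (4 * e.count <= all)
        break;  // at or below 25%; later entries are not hotter
      hot.push_back (e.value);
    }
  return hot;
}
```

With counts 60, 30 and 5 out of all = 100, the scan keeps the first two targets and stops at the third, since 4 * 5 <= 100.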
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 43187bd7a19..e38cf69716d 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1651,16 +1651,14 @@ private:
 
 struct GTY (()) indirect_target_info
 {
+  indirect_target_info (unsigned int id, int prob):
+    common_target_id (id), common_target_probability (prob)
+  {}
+
   /* Profile_id of common target obtained from profile.  */
   unsigned int common_target_id;
   /* Probability that call will land in function with COMMON_TARGET_ID.  */
   int common_target_probability;
-
-  indirect_target_info (unsigned int id, int prob)
-  {
-    common_target_id = id;
-    common_target_probability = prob;
-  }
 };
 
 /* Structure containing additional information about an indirect call.  */
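
A standalone sketch of the constructor form from the patch above, together with the assert on common_target_id requested for the streaming code. This is only an illustration under stated assumptions: the GTY marker is omitted, std::vector stands in for GCC's vec, and record_target is a hypothetical helper, not an existing GCC function.

```cpp
#include <cassert>
#include <vector>

// Simplified copy of the struct from the patch; GTY (()) is left out so
// the sketch compiles outside of GCC.
struct indirect_target_info
{
  // The member-initializer list initializes both fields directly instead
  // of assigning them in the constructor body.
  indirect_target_info (unsigned int id, int prob)
    : common_target_id (id), common_target_probability (prob)
  {}

  /* Profile_id of common target obtained from profile.  */
  unsigned int common_target_id;
  /* Probability that call will land in function with COMMON_TARGET_ID.  */
  int common_target_probability;
};

// Hypothetical helper (not part of GCC): append one target and check
// that its profile id is never 0.
void
record_target (std::vector<indirect_target_info> &targets,
               unsigned int id, int prob)
{
  assert (id != 0);
  targets.emplace_back (id, prob);
}
```

A struct that declares only this constructor has no default constructor, so every element has to be built with both values, as emplace_back does here.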