On 6/18/19 11:02 AM, luoxhu wrote:
> Hi,
>
> On 2019/6/18 13:51, Martin Liška wrote:
>> On 6/18/19 3:45 AM, Xiong Hu Luo wrote:
>>
>> Hello.
>>
>> Thank you for the interest in the area.
>>
>>> This patch aims to fix PR69678 caused by PGO indirect call profiling bugs.
>>> Currently the default instrument function can only find the indirect
>>> function that is called more than 50% of the time, with an incorrect
>>> count number returned.
>> Can you please explain what you mean by 'an incorrect count number returned'?
>
> For the test case indir-call-topn.c, which includes 2 indirect calls, "one"
> and "two", the profiling data with trunk code is shown below. (This
> includes your patch: count[0] and count[2] are switched by your code.
> count[0] is used in ipa-profile, which only supports the top1 format; my
> patch adds support for the topn format. count[0] was incorrect: WITHOUT
> your patch it is 0. Things get better with your fix, as count[0] is
> 350000000, but that is still not correct; in fact, "one" runs 175000000
> times and "two" runs the other 175000000 times.)
>
> indir-call-topn.gcda: 22: 01a90000: 18:COUNTERS indirect_call 9 counts
> indir-call-topn.gcda: 24: 0: *350000000 1868707024 0* 0 0 0 0 0
>
> Running with "--param indir-call-topn-profile=1" gives the profile data
> below. My patch is based on this profile result and does the optimization
> for multiple indirect targets. Performance improves considerably on this
> test case and on some SPEC2017 benchmarks (LLVM has supported this for
> several years...).
>
> indir-call-topn.gcda: 26: 01b10000: 18:COUNTERS indirect_call_topn 9 counts
> indir-call-topn.gcda: 28: 0: *0 969338501 175000000 1868707024 175000000* 0 0 0
>
>
> test case indir-call-topn.c:
>
> #include <stdio.h>
>
>
> typedef int (*fptr) (int);
> int
> one (int a)
> {
>   return 1;
> }
>
> int
> two (int a)
> {
>   return 0;
> }
>
> fptr table[] = {&one, &two};
>
> int
> main()
> {
>   int i, x;
>   fptr p = &one;
>
>   one (3);
>
>   for (i = 0; i < 350000000; i++)
>     {
>       x = (*p) (3);
>       p = table[x];
>     }
>   printf ("done:%d\n", x);
> }

I've got it. So this is a situation where the distribution is exactly 50%
and 50%. Note that it's the only valid situation where both edges will be
>= 50%, which is the threshold at which we speculatively devirtualize edges.
That said, you don't need a generic topn counter, but probably only a top2
counter, which can be generalized from the single-value counter type.
I'm saying that because I removed TOPN, mainly due to:
https://github.com/gcc-mirror/gcc/commit/5cb221f2b9c268df47c97b4837230b15e65f9c14#diff-d003c64ae14449d86df03508de98bde7L179

which is an over-complicated profiling function. The changes that I've made
recently are motivated by preserving stable builds. That's achieved by
noticing that a single-value counter can't handle all seen values.

>
>>
>>> This patch
>>> leverages the "--param indir-call-topn-profile=1" and enables multiple
>>> indirect
>> Note that I removed indir-call-topn-profile last week, so the patch will
>> not apply on current trunk. However, I can help you adapt single-value
>> counters to support tracking of multiple values.
>
> It will be very useful if you help me to track multiple values similarly on
> trunk code. I will rebase to your code once topn is ready again. Actually
> topn is more general and top1 is included in it; I thought that top1 should
> be removed instead of topn, though topn will take longer than top1 in
> profile-generate.

As mentioned earlier, I really don't want to put TOPN back. I can help you
once Honza agrees with the general IPA changes.

>
>>
>>> targets profiling and use in LTO-WPA and LTO-LTRANS stage, as a result,
>>> function specialization, profiling, partial devirtualization, inlining
>>> and cloning could be done successfully based on it.
>> This decision is definitely a big question for Honza?
>>
>>> Performance can get improved 3x (1.7 sec -> 0.4 sec) on simple tests.
>>> Details are:
>>> 1. When doing PGO with indir-call-topn-profile, the gcda data format is
>>> not supported in ipa-profile pass,
>> If you take a look at gcc/ipa-profile.c:195 you can see how the probability
>> is propagated to IPA passes. Why is that not sufficient?
>
> Current code only supports a single indirect target; I need to track
> multiple indirect targets and create multiple speculative edges on a single
> indirect call statement.
>
> What's more, many ICEs happened in later stages due to the single
> speculative target design; part of this patch is to solve the ICEs in the
> handling of multiple speculative target edges.

Well, to be honest I don't like the patch much. It brings another level of
complexity for quite a rare situation where one calls 2 functions via an
indirect call. And as mentioned, the current IPA optimizations are not happy
about multiple indirect branches.

Martin

>
>
> Thanks
> Xionghu
>
>>
>> Martin
>>
>>> so add variables to pass the information
>>> through passes, and postpone gimple_ic to ipa-profile like default as
>>> inline pass will decide whether it is beneficial to transform the
>>> indirect call.
>>> 2. Enable LTO WPA/LTRANS stage multiple indirect call targets analysis
>>> for profile full support in ipa passes and cgraph_edge functions.
>>> 3. Fix various hidden speculative call ICEs exposed after enabling this
>>> feature when running SPEC2017.
>>> 4. Add 1 in module testcase and 2 cross module testcases.
>>> 5. TODOs:
>>> 5.1. Some reference info will be dropped from WPA to LTRANS, so
>>> reference check will be difficult in LTRANS, need to replace the strstr
>>> with reference compare.
>>> 5.2. Some duplicate code needs to be removed as top1 and topn share the
>>> same logic.
>>> Actually top1 related logic could be eliminated totally as topn
>>> includes it.
>>> 5.3. Splitting the patch may be needed as it is too big, but not sure
>>> how many parts would be reasonable.
>>> 6. Performance result for ppc64le:
>>> 6.1. Representative test: indir-call-prof-topn.c runtime improved from
>>> 1.7s to 0.4s.
>>> 6.2. SPEC2017 peakrate:
>>> 523.xalancbmk_r (+4.87%); 538.imagick_r (+4.59%); 511.povray_r
>>> (+13.33%);
>>> 525.x264_r (-5.29%).
>>> No big changes of other benchmarks.
>>> Option: -Ofast -mcpu=power8
>>> PASS1_OPTIMIZE: -fprofile-generate --param
>>> indir-call-topn-profile=1 -flto
>>> PASS2_OPTIMIZE: -fprofile-use --param indir-call-topn-profile=1
>>> -flto -fprofile-correction
>>> 6.3. No performance change on PHP benchmark.
>>> 7. Bootstrap and regression test passed on Power8-LE.
>>>
>>> gcc/ChangeLog
>>>
>>>     2019-06-17  Xiong Hu Luo  <luo...@linux.ibm.com>
>>>
>>>     PR ipa/69678
>>>     * cgraph.c (cgraph_node::get_create): Copy profile_id.
>>>     (cgraph_edge::speculative_call_info): Find real
>>>     reference for indirect targets.
>>>     (cgraph_edge::resolve_speculation): Add speculative code process
>>>     for indirect targets.
>>>     (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
>>>     (cgraph_node::verify_node): Likewise.
>>>     * cgraph.h (common_target_ids): New variable.
>>>     (common_target_probabilities): Likewise.
>>>     (num_of_ics): Likewise.
>>>     * cgraphclones.c (cgraph_node::create_clone): Copy profile_id.
>>>     * ipa-inline.c (inline_small_functions): Add iterator update.
>>>     * ipa-profile.c (ipa_profile_generate_summary): Add indirect
>>>     multiple targets logic.
>>>     (ipa_profile): Likewise.
>>>     * ipa-utils.c (ipa_merge_profiles): Clone speculative src's
>>>     referrings to dst.
>>>     * ipa.c (process_references): Fix typo.
>>>     * lto-cgraph.c (lto_output_edge): Add indirect multiple targets
>>>     logic.
>>>     (input_edge): Likewise.
>>>     * predict.c (dump_prediction): Remove edges count assert to be
>>>     precise.
>>>     * tree-profile.c (gimple_gen_ic_profiler): Use the new variable
>>>     __gcov_indirect_call.counters and __gcov_indirect_call.callee.
>>>     (gimple_gen_ic_func_profiler): Likewise.
>>>     (pass_ipa_tree_profile::gate): Fix comment typos.
>>> * tree-inline.c (copy_bb): Duplicate all the speculative edges >>> if indirect call contains multiple speculative targets. >>> * value-prof.c (check_counter): Proportion the counter for >>> multiple targets. >>> (ic_transform_topn): New function. >>> (gimple_ic_transform): Handle topn case, fix comment typos. >>> >>> gcc/testsuite/ChangeLog >>> >>> 2019-06-17 Xiong Hu Luo <luo...@linux.ibm.com> >>> >>> PR ipa/69678 >>> * gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase. >>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase. >>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase. >>> * gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase. >>> --- >>> gcc/cgraph.c | 38 +++- >>> gcc/cgraph.h | 9 +- >>> gcc/cgraphclones.c | 1 + >>> gcc/ipa-inline.c | 3 + >>> gcc/ipa-profile.c | 185 +++++++++++++++++- >>> gcc/ipa-utils.c | 5 + >>> gcc/ipa.c | 2 +- >>> gcc/lto-cgraph.c | 38 ++++ >>> gcc/predict.c | 1 - >>> .../tree-prof/crossmodule-indir-call-topn-1.c | 35 ++++ >>> .../crossmodule-indir-call-topn-1a.c | 22 +++ >>> .../tree-prof/crossmodule-indir-call-topn-2.c | 42 ++++ >>> .../gcc.dg/tree-prof/indir-call-prof-topn.c | 38 ++++ >>> gcc/tree-inline.c | 97 +++++---- >>> gcc/tree-profile.c | 12 +- >>> gcc/value-prof.c | 146 +++++++++++++- >>> 16 files changed, 606 insertions(+), 68 deletions(-) >>> create mode 100644 >>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> create mode 100644 >>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> create mode 100644 >>> gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> >>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c >>> index de82316d4b1..0d373a67d1b 100644 >>> --- a/gcc/cgraph.c >>> +++ b/gcc/cgraph.c >>> @@ -553,6 +553,7 @@ cgraph_node::get_create (tree decl) >>> fprintf (dump_file, "Introduced new external node " >>> "(%s) and turned into root of the clone 
tree.\n", >>> node->dump_name ()); >>> + node->profile_id = first_clone->profile_id; >>> } >>> else if (dump_file) >>> fprintf (dump_file, "Introduced new external node " >>> @@ -1110,6 +1111,7 @@ cgraph_edge::speculative_call_info (cgraph_edge >>> *&direct, >>> int i; >>> cgraph_edge *e2; >>> cgraph_edge *e = this; >>> + cgraph_node *referred_node; >>> if (!e->indirect_unknown_callee) >>> for (e2 = e->caller->indirect_calls; >>> @@ -1142,8 +1144,20 @@ cgraph_edge::speculative_call_info (cgraph_edge >>> *&direct, >>> && ((ref->stmt && ref->stmt == e->call_stmt) >>> || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid))) >>> { >>> - reference = ref; >>> - break; >>> + if (e2->indirect_info && e2->indirect_info->num_of_ics) >>> + { >>> + referred_node = dyn_cast<cgraph_node *> (ref->referred); >>> + if (strstr (e->callee->name (), referred_node->name ())) >>> + { >>> + reference = ref; >>> + break; >>> + } >>> + } >>> + else >>> + { >>> + reference = ref; >>> + break; >>> + } >>> } >>> /* Speculative edge always consist of all three components - direct >>> edge, >>> @@ -1199,7 +1213,14 @@ cgraph_edge::resolve_speculation (tree callee_decl) >>> in the functions inlined through it. 
*/ >>> } >>> edge->count += e2->count; >>> - edge->speculative = false; >>> + if (edge->indirect_info && edge->indirect_info->num_of_ics) >>> + { >>> + edge->indirect_info->num_of_ics--; >>> + if (edge->indirect_info->num_of_ics == 0) >>> + edge->speculative = false; >>> + } >>> + else >>> + edge->speculative = false; >>> e2->speculative = false; >>> ref->remove_reference (); >>> if (e2->indirect_unknown_callee || e2->inline_failed) >>> @@ -1333,7 +1354,14 @@ cgraph_edge::redirect_call_stmt_to_callee (void) >>> e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt, >>> false); >>> e->count = gimple_bb (e->call_stmt)->count; >>> - e2->speculative = false; >>> + if (e2->indirect_info && e2->indirect_info->num_of_ics) >>> + { >>> + e2->indirect_info->num_of_ics--; >>> + if (e2->indirect_info->num_of_ics == 0) >>> + e2->speculative = false; >>> + } >>> + else >>> + e2->speculative = false; >>> e2->count = gimple_bb (e2->call_stmt)->count; >>> ref->speculative = false; >>> ref->stmt = NULL; >>> @@ -3407,7 +3435,7 @@ cgraph_node::verify_node (void) >>> for (e = callees; e; e = e->next_callee) >>> { >>> - if (!e->aux) >>> + if (!e->aux && !e->speculative) >>> { >>> error ("edge %s->%s has no corresponding call_stmt", >>> identifier_to_locale (e->caller->name ()), >>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h >>> index c294602d762..ed0fbc60432 100644 >>> --- a/gcc/cgraph.h >>> +++ b/gcc/cgraph.h >>> @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see >>> #include "profile-count.h" >>> #include "ipa-ref.h" >>> #include "plugin-api.h" >>> +#include "gcov-io.h" >>> extern void debuginfo_early_init (void); >>> extern void debuginfo_init (void); >>> @@ -1638,11 +1639,17 @@ struct GTY(()) cgraph_indirect_call_info >>> int param_index; >>> /* ECF flags determined from the caller. */ >>> int ecf_flags; >>> - /* Profile_id of common target obtrained from profile. */ >>> + /* Profile_id of common target obtained from profile. 
*/ >>> int common_target_id; >>> /* Probability that call will land in function with COMMON_TARGET_ID. >>> */ >>> int common_target_probability; >>> + /* Profile_id of common target obtained from profile. */ >>> + int common_target_ids[GCOV_ICALL_TOPN_NCOUNTS / 2]; >>> + /* Probabilities that call will land in function with COMMON_TARGET_IDS. >>> */ >>> + int common_target_probabilities[GCOV_ICALL_TOPN_NCOUNTS / 2]; >>> + unsigned num_of_ics; >>> + >>> /* Set when the call is a virtual call with the parameter being the >>> associated object pointer rather than a simple direct call. */ >>> unsigned polymorphic : 1; >>> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c >>> index 15f7e119d18..94f424bc10c 100644 >>> --- a/gcc/cgraphclones.c >>> +++ b/gcc/cgraphclones.c >>> @@ -467,6 +467,7 @@ cgraph_node::create_clone (tree new_decl, profile_count >>> prof_count, >>> new_node->icf_merged = icf_merged; >>> new_node->merged_comdat = merged_comdat; >>> new_node->thunk = thunk; >>> + new_node->profile_id = profile_id; >>> new_node->clone.tree_map = NULL; >>> new_node->clone.args_to_skip = args_to_skip; >>> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c >>> index 360c3de3289..ef2b217b3f9 100644 >>> --- a/gcc/ipa-inline.c >>> +++ b/gcc/ipa-inline.c >>> @@ -1866,12 +1866,15 @@ inline_small_functions (void) >>> } >>> if (has_speculative) >>> for (edge = node->callees; edge; edge = next) >>> + { >>> + next = edge->next_callee; >>> if (edge->speculative && !speculation_useful_p (edge, >>> edge->aux != NULL)) >>> { >>> edge->resolve_speculation (); >>> update = true; >>> } >>> + } >>> if (update) >>> { >>> struct cgraph_node *where = node->global.inlined_to >>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c >>> index de9563d808c..d04476295a0 100644 >>> --- a/gcc/ipa-profile.c >>> +++ b/gcc/ipa-profile.c >>> @@ -168,6 +168,10 @@ ipa_profile_generate_summary (void) >>> struct cgraph_node *node; >>> gimple_stmt_iterator gsi; >>> basic_block bb; >>> + enum hist_type type; 
>>> + >>> + type = PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) ? >>> HIST_TYPE_INDIR_CALL_TOPN >>> + : HIST_TYPE_INDIR_CALL; >>> hash_table<histogram_hash> hashtable (10); >>> @@ -186,10 +190,10 @@ ipa_profile_generate_summary (void) >>> histogram_value h; >>> h = gimple_histogram_value_of_type >>> (DECL_STRUCT_FUNCTION (node->decl), >>> - stmt, HIST_TYPE_INDIR_CALL); >>> + stmt, type); >>> /* No need to do sanity check: gimple_ic_transform already >>> takes away bad histograms. */ >>> - if (h) >>> + if (h && type == HIST_TYPE_INDIR_CALL) >>> { >>> /* counter 0 is target, counter 1 is number of execution we >>> called target, >>> counter 2 is total number of executions. */ >>> @@ -212,6 +216,46 @@ ipa_profile_generate_summary (void) >>> gimple_remove_histogram_value (DECL_STRUCT_FUNCTION >>> (node->decl), >>> stmt, h); >>> } >>> + else if (h && type == HIST_TYPE_INDIR_CALL_TOPN) >>> + { >>> + unsigned j; >>> + struct cgraph_edge *e = node->get_edge (stmt); >>> + if (e && !e->indirect_unknown_callee) >>> + continue; >>> + >>> + e->indirect_info->num_of_ics = 0; >>> + for (j = 1; j < h->n_counters; j += 2) >>> + { >>> + if (h->hvalue.counters[j] == 0) >>> + continue; >>> + >>> + e->indirect_info->common_target_ids[j / 2] >>> + = h->hvalue.counters[j]; >>> + e->indirect_info->common_target_probabilities[j / 2] >>> + = GCOV_COMPUTE_SCALE ( >>> + h->hvalue.counters[j + 1], >>> + gimple_bb (stmt)->count.ipa ().to_gcov_type ()); >>> + if (e->indirect_info >>> + ->common_target_probabilities[j / 2] >>> + > REG_BR_PROB_BASE) >>> + { >>> + if (dump_file) >>> + fprintf (dump_file, >>> + "Probability capped to 1\n"); >>> + e->indirect_info >>> + ->common_target_probabilities[j / 2] >>> + = REG_BR_PROB_BASE; >>> + } >>> + e->indirect_info->num_of_ics++; >>> + } >>> + >>> + gcc_assert (e->indirect_info->num_of_ics >>> + <= GCOV_ICALL_TOPN_NCOUNTS / 2); >>> + >>> + gimple_remove_histogram_value (DECL_STRUCT_FUNCTION ( >>> + node->decl), >>> + stmt, h); >>> + } >>> } >>> time += 
estimate_num_insns (stmt, &eni_time_weights); >>> size += estimate_num_insns (stmt, &eni_size_weights); >>> @@ -492,6 +536,7 @@ ipa_profile (void) >>> int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted >>> = 0; >>> int nmismatch = 0, nimpossible = 0; >>> bool node_map_initialized = false; >>> + gcov_type threshold; >>> if (dump_file) >>> dump_histogram (dump_file, histogram); >>> @@ -500,14 +545,12 @@ ipa_profile (void) >>> overall_time += histogram[i]->count * histogram[i]->time; >>> overall_size += histogram[i]->size; >>> } >>> + threshold = 0; >>> if (overall_time) >>> { >>> - gcov_type threshold; >>> - >>> gcc_assert (overall_size); >>> cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) + >>> 500) / 1000; >>> - threshold = 0; >>> for (i = 0; cumulated < cutoff; i++) >>> { >>> cumulated += histogram[i]->count * histogram[i]->time; >>> @@ -543,7 +586,7 @@ ipa_profile (void) >>> histogram.release (); >>> histogram_pool.release (); >>> - /* Produce speculative calls: we saved common traget from porfiling >>> into >>> + /* Produce speculative calls: we saved common target from profiling into >>> e->common_target_id. Now, at link time, we can look up corresponding >>> function node and produce speculative call. 
*/ >>> @@ -558,7 +601,8 @@ ipa_profile (void) >>> { >>> if (n->count.initialized_p ()) >>> nindirect++; >>> - if (e->indirect_info->common_target_id) >>> + if (e->indirect_info->common_target_id >>> + || (e->indirect_info && e->indirect_info->num_of_ics == 1)) >>> { >>> if (!node_map_initialized) >>> init_node_map (false); >>> @@ -613,7 +657,7 @@ ipa_profile (void) >>> if (dump_file) >>> fprintf (dump_file, >>> "Not speculating: " >>> - "parameter count mistmatch\n"); >>> + "parameter count mismatch\n"); >>> } >>> else if (e->indirect_info->polymorphic >>> && !opt_for_fn (n->decl, flag_devirtualize) >>> @@ -655,7 +699,130 @@ ipa_profile (void) >>> nunknown++; >>> } >>> } >>> - } >>> + if (e->indirect_info && e->indirect_info->num_of_ics > 1) >>> + { >>> + if (in_lto_p) >>> + { >>> + if (dump_file) >>> + { >>> + fprintf (dump_file, >>> + "Updating hotness threshold in LTO mode.\n"); >>> + fprintf (dump_file, "Updated min count: %" PRId64 "\n", >>> + (int64_t) threshold); >>> + } >>> + set_hot_bb_threshold (threshold >>> + / e->indirect_info->num_of_ics); >>> + } >>> + if (!node_map_initialized) >>> + init_node_map (false); >>> + node_map_initialized = true; >>> + ncommon++; >>> + unsigned speculative = 0; >>> + for (i = 0; i < (int)e->indirect_info->num_of_ics; i++) >>> + { >>> + n2 = find_func_by_profile_id ( >>> + e->indirect_info->common_target_ids[i]); >>> + if (n2) >>> + { >>> + if (dump_file) >>> + { >>> + fprintf ( >>> + dump_file, >>> + "Indirect call -> direct call from" >>> + " other module %s => %s, prob %3.2f\n", >>> + n->dump_name (), n2->dump_name (), >>> + e->indirect_info->common_target_probabilities[i] >>> + / (float) REG_BR_PROB_BASE); >>> + } >>> + if (e->indirect_info->common_target_probabilities[i] >>> + < REG_BR_PROB_BASE / 2) >>> + { >>> + nuseless++; >>> + if (dump_file) >>> + fprintf ( >>> + dump_file, >>> + "Not speculating: probability is too low.\n"); >>> + } >>> + else if (!e->maybe_hot_p ()) >>> + { >>> + nuseless++; >>> + if 
(dump_file) >>> + fprintf (dump_file, >>> + "Not speculating: call is cold.\n"); >>> + } >>> + else if (n2->get_availability () <= AVAIL_INTERPOSABLE >>> + && n2->can_be_discarded_p ()) >>> + { >>> + nuseless++; >>> + if (dump_file) >>> + fprintf (dump_file, >>> + "Not speculating: target is overwritable " >>> + "and can be discarded.\n"); >>> + } >>> + else if (ipa_node_params_sum && ipa_edge_args_sum >>> + && (!vec_safe_is_empty ( >>> + IPA_NODE_REF (n2)->descriptors)) >>> + && ipa_get_param_count (IPA_NODE_REF (n2)) >>> + != ipa_get_cs_argument_count ( >>> + IPA_EDGE_REF (e)) >>> + && (ipa_get_param_count (IPA_NODE_REF (n2)) >>> + >= ipa_get_cs_argument_count ( >>> + IPA_EDGE_REF (e)) >>> + || !stdarg_p (TREE_TYPE (n2->decl)))) >>> + { >>> + nmismatch++; >>> + if (dump_file) >>> + fprintf (dump_file, "Not speculating: " >>> + "parameter count mismatch\n"); >>> + } >>> + else if (e->indirect_info->polymorphic >>> + && !opt_for_fn (n->decl, flag_devirtualize) >>> + && !possible_polymorphic_call_target_p (e, n2)) >>> + { >>> + nimpossible++; >>> + if (dump_file) >>> + fprintf (dump_file, >>> + "Not speculating: " >>> + "function is not in the polymorphic " >>> + "call target list\n"); >>> + } >>> + else >>> + { >>> + /* Target may be overwritable, but profile says that >>> + control flow goes to this particular implementation >>> + of N2. Speculate on the local alias to allow >>> + inlining. 
>>> + */ >>> + if (!n2->can_be_discarded_p ()) >>> + { >>> + cgraph_node *alias; >>> + alias = dyn_cast<cgraph_node *> ( >>> + n2->noninterposable_alias ()); >>> + if (alias) >>> + n2 = alias; >>> + } >>> + nconverted++; >>> + e->make_speculative ( >>> + n2, e->count.apply_probability ( >>> + e->indirect_info >>> + ->common_target_probabilities[i])); >>> + update = true; >>> + speculative++; >>> + } >>> + } >>> + else >>> + { >>> + if (dump_file) >>> + fprintf (dump_file, >>> + "Function with profile-id %i not found.\n", >>> + e->indirect_info->common_target_ids[i]); >>> + nunknown++; >>> + } >>> + } >>> + if (speculative < e->indirect_info->num_of_ics) >>> + e->indirect_info->num_of_ics = speculative; >>> + } >>> + } >>> if (update) >>> ipa_update_overall_fn_summary (n); >>> } >>> diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c >>> index 79b250c3943..30347691029 100644 >>> --- a/gcc/ipa-utils.c >>> +++ b/gcc/ipa-utils.c >>> @@ -587,6 +587,11 @@ ipa_merge_profiles (struct cgraph_node *dst, >>> update_max_bb_count (); >>> compute_function_frequency (); >>> pop_cfun (); >>> + /* When src is speculative, clone the referrings. */ >>> + if (src->indirect_call_target) >>> + for (e = src->callers; e; e = e->next_caller) >>> + if (e->callee == src && e->speculative) >>> + dst->clone_referring (src); >>> for (e = dst->callees; e; e = e->next_callee) >>> { >>> if (e->speculative) >>> diff --git a/gcc/ipa.c b/gcc/ipa.c >>> index 2496694124c..c1fe081a72d 100644 >>> --- a/gcc/ipa.c >>> +++ b/gcc/ipa.c >>> @@ -166,7 +166,7 @@ process_references (symtab_node *snode, >>> devirtualization happens. After inlining still keep their declarations >>> around, so we can devirtualize to a direct call. >>> - Also try to make trivial devirutalization when no or only one target >>> is >>> + Also try to make trivial devirtualization when no or only one target is >>> possible. 
*/ >>> static void >>> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c >>> index 4dfa2862be3..0c8f547d44e 100644 >>> --- a/gcc/lto-cgraph.c >>> +++ b/gcc/lto-cgraph.c >>> @@ -238,6 +238,7 @@ lto_output_edge (struct lto_simple_output_block *ob, >>> struct cgraph_edge *edge, >>> unsigned int uid; >>> intptr_t ref; >>> struct bitpack_d bp; >>> + unsigned i; >>> if (edge->indirect_unknown_callee) >>> streamer_write_enum (ob->main_stream, LTO_symtab_tags, >>> LTO_symtab_last_tag, >>> @@ -296,6 +297,25 @@ lto_output_edge (struct lto_simple_output_block *ob, >>> struct cgraph_edge *edge, >>> if (edge->indirect_info->common_target_id) >>> streamer_write_hwi_stream >>> (ob->main_stream, edge->indirect_info->common_target_probability); >>> + >>> + gcc_assert (edge->indirect_info->num_of_ics >>> + <= GCOV_ICALL_TOPN_NCOUNTS / 2); >>> + >>> + streamer_write_hwi_stream (ob->main_stream, >>> + edge->indirect_info->num_of_ics); >>> + >>> + if (edge->indirect_info->num_of_ics) >>> + { >>> + for (i = 0; i < edge->indirect_info->num_of_ics; i++) >>> + { >>> + streamer_write_hwi_stream ( >>> + ob->main_stream, edge->indirect_info->common_target_ids[i]); >>> + if (edge->indirect_info->common_target_ids[i]) >>> + streamer_write_hwi_stream ( >>> + ob->main_stream, >>> + edge->indirect_info->common_target_probabilities[i]); >>> + } >>> + } >>> } >>> } >>> @@ -1438,6 +1458,7 @@ input_edge (struct lto_input_block *ib, >>> vec<symtab_node *> nodes, >>> cgraph_inline_failed_t inline_failed; >>> struct bitpack_d bp; >>> int ecf_flags = 0; >>> + unsigned i; >>> caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]); >>> if (caller == NULL || caller->decl == NULL_TREE) >>> @@ -1488,6 +1509,23 @@ input_edge (struct lto_input_block *ib, >>> vec<symtab_node *> nodes, >>> edge->indirect_info->common_target_id = streamer_read_hwi (ib); >>> if (edge->indirect_info->common_target_id) >>> edge->indirect_info->common_target_probability = >>> streamer_read_hwi (ib); >>> + >>> + 
edge->indirect_info->num_of_ics = streamer_read_hwi (ib); >>> + >>> + gcc_assert (edge->indirect_info->num_of_ics >>> + <= GCOV_ICALL_TOPN_NCOUNTS / 2); >>> + >>> + if (edge->indirect_info->num_of_ics) >>> + { >>> + for (i = 0; i < edge->indirect_info->num_of_ics; i++) >>> + { >>> + edge->indirect_info->common_target_ids[i] >>> + = streamer_read_hwi (ib); >>> + if (edge->indirect_info->common_target_ids[i]) >>> + edge->indirect_info->common_target_probabilities[i] >>> + = streamer_read_hwi (ib); >>> + } >>> + } >>> } >>> } >>> diff --git a/gcc/predict.c b/gcc/predict.c >>> index 43ee91a5b13..b7f38891c72 100644 >>> --- a/gcc/predict.c >>> +++ b/gcc/predict.c >>> @@ -763,7 +763,6 @@ dump_prediction (FILE *file, enum br_predictor >>> predictor, int probability, >>> && bb->count.precise_p () >>> && reason == REASON_NONE) >>> { >>> - gcc_assert (e->count ().precise_p ()); >>> fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n", >>> predictor_info[predictor].name, >>> bb->count.to_gcov_type (), e->count ().to_gcov_type (), >>> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> new file mode 100644 >>> index 00000000000..e0a83c2e067 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c >>> @@ -0,0 +1,35 @@ >>> +/* { dg-require-effective-target lto } */ >>> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param >>> indir-call-topn-profile=1" } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a); >>> + >>> +int >>> +two (int a); >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int >>> +main() >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + x = one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); >>> + 
p = table[x]; >>> + } >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile_estimate" } } */ >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile_estimate" } } */ >>> + >>> diff --git >>> a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> new file mode 100644 >>> index 00000000000..a8c6e365fb9 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c >>> @@ -0,0 +1,22 @@ >>> +/* It seems there is no way to avoid the other source of mulitple >>> + source testcase from being compiled independently. Just avoid >>> + error. */ >>> +#ifdef DOJOB >>> +int >>> +one (int a) >>> +{ >>> + return 1; >>> +} >>> + >>> +int >>> +two (int a) >>> +{ >>> + return 0; >>> +} >>> +#else >>> +int >>> +main() >>> +{ >>> + return 0; >>> +} >>> +#endif >>> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> new file mode 100644 >>> index 00000000000..aa3887fde83 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c >>> @@ -0,0 +1,42 @@ >>> +/* { dg-require-effective-target lto } */ >>> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param >>> indir-call-topn-profile=1" } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a); >>> + >>> +int >>> +two (int a); >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int foo () >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + x = one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); 
>>> + p = table[x]; >>> + } >>> + return x; >>> +} >>> + >>> +int >>> +main() >>> +{ >>> + int x = foo (); >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile_estimate" } } */ >>> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile_estimate" } } */ >>> + >>> + >>> diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> new file mode 100644 >>> index 00000000000..951bc7ddd19 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c >>> @@ -0,0 +1,38 @@ >>> +/* { dg-require-profiling "-fprofile-generate" } */ >>> +/* { dg-options "-O2 -fdump-ipa-profile --param indir-call-topn-profile=1" >>> } */ >>> + >>> +#include <stdio.h> >>> + >>> +typedef int (*fptr) (int); >>> +int >>> +one (int a) >>> +{ >>> + return 1; >>> +} >>> + >>> +int >>> +two (int a) >>> +{ >>> + return 0; >>> +} >>> + >>> +fptr table[] = {&one, &two}; >>> + >>> +int >>> +main() >>> +{ >>> + int i, x; >>> + fptr p = &one; >>> + >>> + one (3); >>> + >>> + for (i = 0; i < 350000000; i++) >>> + { >>> + x = (*p) (3); >>> + p = table[x]; >>> + } >>> + printf ("done:%d\n", x); >>> +} >>> + >>> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct >>> call.* one transformation on insn" "profile" } } */ >>> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct >>> call.* two transformation on insn" "profile" } } */ >>> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c >>> index 9017da878b1..f69b31b197e 100644 >>> --- a/gcc/tree-inline.c >>> +++ b/gcc/tree-inline.c >>> @@ -2028,43 +2028,66 @@ copy_bb (copy_body_data *id, basic_block bb, >>> switch (id->transform_call_graph_edges) >>> { >>> case CB_CGE_DUPLICATE: >>> - edge = id->src_node->get_edge (orig_stmt); >>> - if (edge) >>> - { >>> 
- struct cgraph_edge *old_edge = edge; >>> - profile_count old_cnt = edge->count; >>> - edge = edge->clone (id->dst_node, call_stmt, >>> - gimple_uid (stmt), >>> - num, den, >>> - true); >>> - >>> - /* Speculative calls consist of two edges - direct and >>> - indirect. Duplicate the whole thing and distribute >>> - frequencies accordingly. */ >>> - if (edge->speculative) >>> - { >>> - struct cgraph_edge *direct, *indirect; >>> - struct ipa_ref *ref; >>> - >>> - gcc_assert (!edge->indirect_unknown_callee); >>> - old_edge->speculative_call_info (direct, indirect, ref); >>> - >>> - profile_count indir_cnt = indirect->count; >>> - indirect = indirect->clone (id->dst_node, call_stmt, >>> - gimple_uid (stmt), >>> - num, den, >>> - true); >>> - >>> - profile_probability prob >>> - = indir_cnt.probability_in (old_cnt + indir_cnt); >>> - indirect->count >>> - = copy_basic_block->count.apply_probability (prob); >>> - edge->count = copy_basic_block->count - indirect->count; >>> - id->dst_node->clone_reference (ref, stmt); >>> - } >>> - else >>> - edge->count = copy_basic_block->count; >>> - } >>> + { >>> + edge = id->src_node->get_edge (orig_stmt); >>> + struct cgraph_edge *old_edge = edge; >>> + struct cgraph_edge *direct, *indirect; >>> + bool next_speculative; >>> + do >>> + { >>> + next_speculative = false; >>> + if (edge) >>> + { >>> + profile_count old_cnt = edge->count; >>> + edge >>> + = edge->clone (id->dst_node, call_stmt, >>> + gimple_uid (stmt), num, den, true); >>> + >>> + /* Speculative calls consist of two edges - direct >>> + and indirect. Duplicate the whole thing and >>> + distribute frequencies accordingly. 
*/ >>> + if (edge->speculative) >>> + { >>> + struct ipa_ref *ref; >>> + >>> + gcc_assert (!edge->indirect_unknown_callee); >>> + old_edge->speculative_call_info (direct, >>> + indirect, ref); >>> + >>> + profile_count indir_cnt = indirect->count; >>> + indirect >>> + = indirect->clone (id->dst_node, call_stmt, >>> + gimple_uid (stmt), num, >>> + den, true); >>> + >>> + profile_probability prob >>> + = indir_cnt.probability_in (old_cnt >>> + + indir_cnt); >>> + indirect->count >>> + = copy_basic_block->count.apply_probability ( >>> + prob); >>> + edge->count >>> + = copy_basic_block->count - indirect->count; >>> + id->dst_node->clone_reference (ref, stmt); >>> + } >>> + else >>> + edge->count = copy_basic_block->count; >>> + } >>> + /* If the indirect call contains more than one indirect >>> + targets, need clone all speculative edges here. */ >>> + if (old_edge && old_edge->next_callee >>> + && old_edge->speculative && indirect >>> + && indirect->indirect_info >>> + && indirect->indirect_info->num_of_ics > 1) >>> + { >>> + edge = old_edge->next_callee; >>> + old_edge = old_edge->next_callee; >>> + if (edge->speculative) >>> + next_speculative = true; >>> + } >>> + } >>> + while (next_speculative); >>> + } >>> break; >>> case CB_CGE_MOVE_CLONES: >>> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c >>> index 1c3034aac10..4964dbdebb5 100644 >>> --- a/gcc/tree-profile.c >>> +++ b/gcc/tree-profile.c >>> @@ -74,8 +74,8 @@ static GTY(()) tree ic_tuple_callee_field; >>> /* Do initialization work for the edge profiler. 
*/ >>> /* Add code: >>> - __thread gcov* __gcov_indirect_call_counters; // pointer to actual >>> counter >>> - __thread void* __gcov_indirect_call_callee; // actual callee address >>> + __thread gcov* __gcov_indirect_call.counters; // pointer to actual >>> counter >>> + __thread void* __gcov_indirect_call.callee; // actual callee address >>> __thread int __gcov_function_counter; // time profiler function counter >>> */ >>> static void >>> @@ -395,7 +395,7 @@ gimple_gen_ic_profiler (histogram_value value, unsigned >>> tag, unsigned base) >>> f_1 = foo; >>> __gcov_indirect_call.counters = &__gcov4.main[0]; >>> PROF_9 = f_1; >>> - __gcov_indirect_call_callee = PROF_9; >>> + __gcov_indirect_call.callee = PROF_9; >>> _4 = f_1 (); >>> */ >>> @@ -458,11 +458,11 @@ gimple_gen_ic_func_profiler (void) >>> /* Insert code: >>> - if (__gcov_indirect_call_callee != NULL) >>> + if (__gcov_indirect_call.callee != NULL) >>> __gcov_indirect_call_profiler_v3 (profile_id, >>> &current_function_decl); >>> The function __gcov_indirect_call_profiler_v3 is responsible for >>> - resetting __gcov_indirect_call_callee to NULL. */ >>> + resetting __gcov_indirect_call.callee to NULL. */ >>> gimple_stmt_iterator gsi = gsi_start_bb (cond_bb); >>> void0 = build_int_cst (ptr_type_node, 0); >>> @@ -904,7 +904,7 @@ pass_ipa_tree_profile::gate (function *) >>> { >>> /* When profile instrumentation, use or test coverage shall be >>> performed. >>> But for AutoFDO, this there is no instrumentation, thus this pass is >>> - diabled. */ >>> + disabled. */ >>> return (!in_lto_p && !flag_auto_profile >>> && (flag_branch_probabilities || flag_test_coverage >>> || profile_arc_flag)); >>> diff --git a/gcc/value-prof.c b/gcc/value-prof.c >>> index 5013956cf86..4869ab8ccd6 100644 >>> --- a/gcc/value-prof.c >>> +++ b/gcc/value-prof.c >>> @@ -579,8 +579,8 @@ free_histograms (struct function *fn) >>> somehow. 
*/ >>> static bool >>> -check_counter (gimple *stmt, const char * name, >>> - gcov_type *count, gcov_type *all, profile_count bb_count_d) >>> +check_counter (gimple *stmt, const char *name, gcov_type *count, gcov_type >>> *all, >>> + profile_count bb_count_d, float ratio = 1.0f) >>> { >>> gcov_type bb_count = bb_count_d.ipa ().to_gcov_type (); >>> if (*all != bb_count || *count > *all) >>> @@ -599,7 +599,7 @@ check_counter (gimple *stmt, const char * name, >>> "count (%d)\n", name, (int)*all, >>> (int)bb_count); >>> *all = bb_count; >>> if (*count > *all) >>> - *count = *all; >>> + *count = *all * ratio; >>> return false; >>> } >>> else >>> @@ -1410,9 +1410,132 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node >>> *direct_call, >>> return dcall_stmt; >>> } >>> +/* If --param=indir-call-topn-profile=1 is specified when compiling, >>> there maybe >>> + multiple indirect targets in histogram. Check every indirect/virtual >>> call >>> + if callee function exists, if not exit, leave it to LTO stage for later >>> + process. Modify code of this indirect call to an if-else structure in >>> + ipa-profile finally. */ >>> +static bool >>> +ic_transform_topn (gimple_stmt_iterator *gsi) >>> +{ >>> + unsigned j; >>> + gcall *stmt; >>> + histogram_value histogram; >>> + gcov_type val, count, count_all, all, bb_all; >>> + struct cgraph_node *d_call; >>> + profile_count bb_count; >>> + >>> + stmt = dyn_cast<gcall *> (gsi_stmt (*gsi)); >>> + if (!stmt) >>> + return false; >>> + >>> + if (gimple_call_fndecl (stmt) != NULL_TREE) >>> + return false; >>> + >>> + if (gimple_call_internal_p (stmt)) >>> + return false; >>> + >>> + histogram >>> + = gimple_histogram_value_of_type (cfun, stmt, >>> HIST_TYPE_INDIR_CALL_TOPN); >>> + if (!histogram) >>> + return false; >>> + >>> + count = 0; >>> + all = 0; >>> + bb_all = gimple_bb (stmt)->count.ipa ().to_gcov_type (); >>> + bb_count = gimple_bb (stmt)->count; >>> + >>> + /* n_counters need be odd to avoid access violation. 
*/ >>> + gcc_assert (histogram->n_counters % 2 == 1); >>> + >>> + /* For indirect call topn, accumulate all the counts first. */ >>> + for (j = 1; j < histogram->n_counters; j += 2) >>> + { >>> + val = histogram->hvalue.counters[j]; >>> + count = histogram->hvalue.counters[j + 1]; >>> + if (val) >>> + all += count; >>> + } >>> + >>> + count_all = all; >>> + /* Do the indirect call conversion if function body exists, or else >>> leave it >>> + to LTO stage. */ >>> + for (j = 1; j < histogram->n_counters; j += 2) >>> + { >>> + val = histogram->hvalue.counters[j]; >>> + count = histogram->hvalue.counters[j + 1]; >>> + if (val) >>> + { >>> + /* The order of CHECK_COUNTER calls is important >>> + since check_counter can correct the third parameter >>> + and we want to make count <= all <= bb_count. */ >>> + if (check_counter (stmt, "ic", &all, &bb_all, bb_count) >>> + || check_counter (stmt, "ic", &count, &all, >>> + profile_count::from_gcov_type (all), >>> + (float) count / count_all)) >>> + { >>> + gimple_remove_histogram_value (cfun, stmt, histogram); >>> + return false; >>> + } >>> + >>> + d_call = find_func_by_profile_id ((int) val); >>> + >>> + if (d_call == NULL) >>> + { >>> + if (val) >>> + { >>> + if (dump_file) >>> + { >>> + fprintf ( >>> + dump_file, >>> + "Indirect call -> direct call from other module"); >>> + print_generic_expr (dump_file, gimple_call_fn (stmt), >>> + TDF_SLIM); >>> + fprintf (dump_file, >>> + "=> %i (will resolve only with LTO)\n", >>> + (int) val); >>> + } >>> + } >>> + return false; >>> + } >>> + >>> + if (!check_ic_target (stmt, d_call)) >>> + { >>> + if (dump_file) >>> + { >>> + fprintf (dump_file, "Indirect call -> direct call "); >>> + print_generic_expr (dump_file, gimple_call_fn (stmt), >>> + TDF_SLIM); >>> + fprintf (dump_file, "=> "); >>> + print_generic_expr (dump_file, d_call->decl, TDF_SLIM); >>> + fprintf (dump_file, >>> + " transformation skipped because of type mismatch"); >>> + print_gimple_stmt (dump_file, stmt, 0, 
TDF_SLIM); >>> + } >>> + gimple_remove_histogram_value (cfun, stmt, histogram); >>> + return false; >>> + } >>> + >>> + if (dump_file) >>> + { >>> + fprintf (dump_file, "Indirect call -> direct call "); >>> + print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM); >>> + fprintf (dump_file, "=> "); >>> + print_generic_expr (dump_file, d_call->decl, TDF_SLIM); >>> + fprintf (dump_file, >>> + " transformation on insn postponed to ipa-profile"); >>> + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); >>> + fprintf (dump_file, "hist->count %" PRId64 >>> + " hist->all %" PRId64"\n", count, all); >>> + } >>> + } >>> + } >>> + >>> + return true; >>> +} >>> /* >>> For every checked indirect/virtual call determine if most common pid of >>> - function/class method has probability more than 50%. If yes modify code >>> of >>> + function/class method has probability more than 50%. If yes modify code >>> of >>> this call to: >>> */ >>> @@ -1423,6 +1546,7 @@ gimple_ic_transform (gimple_stmt_iterator *gsi) >>> histogram_value histogram; >>> gcov_type val, count, all, bb_all; >>> struct cgraph_node *direct_call; >>> + enum hist_type type; >>> stmt = dyn_cast <gcall *> (gsi_stmt (*gsi)); >>> if (!stmt) >>> @@ -1434,18 +1558,24 @@ gimple_ic_transform (gimple_stmt_iterator *gsi) >>> if (gimple_call_internal_p (stmt)) >>> return false; >>> - histogram = gimple_histogram_value_of_type (cfun, stmt, >>> HIST_TYPE_INDIR_CALL); >>> + type = PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) ? 
>>> HIST_TYPE_INDIR_CALL_TOPN >>> + : HIST_TYPE_INDIR_CALL; >>> + >>> + histogram = gimple_histogram_value_of_type (cfun, stmt, type); >>> if (!histogram) >>> return false; >>> + if (type == HIST_TYPE_INDIR_CALL_TOPN) >>> + return ic_transform_topn (gsi); >>> + >>> val = histogram->hvalue.counters [0]; >>> count = histogram->hvalue.counters [1]; >>> all = histogram->hvalue.counters [2]; >>> bb_all = gimple_bb (stmt)->count.ipa ().to_gcov_type (); >>> - /* The order of CHECK_COUNTER calls is important - >>> + /* The order of CHECK_COUNTER calls is important >>> since check_counter can correct the third parameter >>> - and we want to make count <= all <= bb_all. */ >>> + and we want to make count <= all <= bb_all. */ >>> if (check_counter (stmt, "ic", &all, &bb_all, gimple_bb (stmt)->count) >>> || check_counter (stmt, "ic", &count, &all, >>> profile_count::from_gcov_type (all))) >>> @@ -1494,7 +1624,7 @@ gimple_ic_transform (gimple_stmt_iterator *gsi) >>> print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM); >>> fprintf (dump_file, "=> "); >>> print_generic_expr (dump_file, direct_call->decl, TDF_SLIM); >>> - fprintf (dump_file, " transformation on insn postponned to >>> ipa-profile"); >>> + fprintf (dump_file, " transformation on insn postponed to >>> ipa-profile"); >>> print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); >>> fprintf (dump_file, "hist->count %" PRId64 >>> " hist->all %" PRId64"\n", count, all); >>> >> >