Here is the patch updated to use the new parameter from r203830. Passed bootstrap and regression tests.
2013-10-18 Jan Hubicka <j...@suse.cz> Teresa Johnson <tejohn...@google.com> * predict.c (handle_missing_profiles): New function. (counts_to_freqs): Don't overwrite estimated frequencies when function has no profile counts. * predict.h (handle_missing_profiles): Declare. * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. Index: predict.c =================================================================== --- predict.c (revision 203830) +++ predict.c (working copy) @@ -2758,6 +2758,40 @@ estimate_loops (void) BITMAP_FREE (tovisit); } +void +handle_missing_profiles (void) +{ + struct cgraph_node *node; + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); + /* See if 0 count function has non-0 count callers. In this case we + lost some profile. Drop its function profile to PROFILE_GUESSED. */ + FOR_EACH_DEFINED_FUNCTION (node) + { + struct cgraph_edge *e; + gcov_type call_count = 0; + struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl); + if (node->count) + continue; + for (e = node->callers; e; e = e->next_caller) + call_count += e->count; + if (call_count + && fn && fn->cfg + && (call_count * unlikely_count_fraction >= profile_info->runs)) + { + bool maybe_hot = maybe_hot_count_p (NULL, call_count); + if (dump_file) + fprintf (dump_file, + "Dropping 0 profile for %s/%i. %s based on calls.\n", + cgraph_node_name (node), node->symbol.order, + maybe_hot ? "Function is hot" : "Function is normal"); + profile_status_for_function (fn) + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); + node->frequency + = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; + } + } +} + /* Convert counts measured by profile driven feedback to frequencies. Return nonzero iff there was any nonzero execution count. */ @@ -2767,6 +2801,9 @@ counts_to_freqs (void) gcov_type count_max, true_count_max = 0; basic_block bb; + if (!ENTRY_BLOCK_PTR->count) + return 0; + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) true_count_max = MAX (bb->count, true_count_max); Index: predict.h =================================================================== --- predict.h (revision 203830) +++ predict.h (working copy) @@ -37,6 +37,7 @@ enum prediction extern void predict_insn_def (rtx, enum br_predictor, enum prediction); extern int counts_to_freqs (void); +extern void handle_missing_profiles (void); extern void estimate_bb_frequencies (bool); extern const char *predictor_name (enum br_predictor); extern tree build_predict_expr (enum br_predictor, enum prediction); Index: tree-profile.c =================================================================== --- tree-profile.c (revision 203830) +++ tree-profile.c (working copy) @@ -607,6 +607,8 @@ tree_profiling (void) pop_cfun (); } + handle_missing_profiles (); + del_node_map (); return 0; } On Wed, Oct 16, 2013 at 12:02 PM, Teresa Johnson <tejohn...@google.com> wrote: > On Tue, Oct 15, 2013 at 3:39 PM, Teresa Johnson <tejohn...@google.com> wrote: >> On Tue, Oct 15, 2013 at 2:57 PM, Jan Hubicka <hubi...@ucw.cz> wrote: >>>> On Tue, Oct 15, 2013 at 2:03 PM, Jan Hubicka <hubi...@ucw.cz> wrote: >>>> >> Yes, that would work. So let's discard this patch because the fix for >>>> >> comdat can also fix this problem. >>>> > >>>> > Unforutnately ipa-profile-estimate is an IPA pass and as such it does >>>> > not have access to profile_status_to_function. >>>> >>>> I see. I was coding that up today but hadn't tested it yet. >>>> >>>> > You can probably just factor this out into a function that can be called >>>> > and for normal FDO we call it where the loop stis now and for auto-FDO >>>> > we can >>>> > probably have another invocation from before early passes where auto-FDO >>>> > is collected. >>>> >>>> Ok, let's go with that approach for now. It won't address the 0 count >>>> COMDAT calling another 0 count COMDAT problem, but I will probably >>>> just find a way to deal with this when inlining. >>> >>> You can still propagate, since tree-profile is an simple-ipa pass. >> >> Ok, I think I will leave that for the second patch, since I need to do >> some testing on the effects - i.e. the propagation I had in mind would >> make any 0-count routine called by a routine that was dropped to >> guessed to also be dropped to guessed (and no longer unlikely). It may >> be too aggressive, I need to check. >> >>>> >>>> >> >>> + if (node->count) >>>> >> >>> + continue; >>>> > Also here we should sum the counts and consider function non unlikely >>>> > executed >>>> > in the same way as probably_never_executed does. >>>> >>>> I assume you mean by doing the same comparison to the number of >>>> profile->runs. Yes, this makes sense. >>> >>> Yes. >>>> >>>> > >>>> > I can prepare updated patch, but i am currently travelling, so i would >>>> > not >>>> > be disapointed if you beat me ;) >>>> >>>> I'm working on it, and I think based on Dehao's needs I am going to >>>> split up the patch into two phases, the one that does just the part >>>> you had sent a patch for (making sure 0 count routines with non-zero >>>> calls are marked guessed and have their node frequency set >>>> appropriately), and a subsequent one to do the count application when >>>> we inline a 0-count routine into a non-zero callsite. I'll shoot for >>>> getting this ready by tomorrow. >>>> >>>> BTW, in your original patch you are checking for both e->count or >>>> cgraph_maybe_hot_edge_p(e), but AFAICT the call to >>>> cgraph_maybe_hot_edge_p will never return true when e->count is zero. >>>> When there is a profile it will return false via maybe_hot_count_p >>>> since e->count == 0. When there is no profile it will return false >>>> when the callee has NODE_FREQUENCY_UNLIKELY_EXECUTED. So I think just >>>> checking for e->count >0 is sufficient here. >>> >>> I think I was checking caller count here (that is read) and the code >>> was supposed to make functoin with hot caller to be hot... >> >> The code in your patch was just checking the edge count, not the >> caller count. cgraph_maybe_hot_edge_p also doesn't check the caller >> count, just the edge count. Do we want to make all functions with hot >> callers also be hot, even when the edge count is not hot? This may be >> to aggressive. I was simply going to make the code check e->count. > > Here is the patch from Honza, that I revised based on discussions and > testing. Once this related patch goes in, I can change the check against the > profile_info->runs that hardcodes the threshold with the new parameter > proposed for probably_never_executed there: > http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00743.html > > Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? > > Thanks, Teresa > > 2013-10-16 Jan Hubicka <j...@suse.cz> > Teresa Johnson <tejohn...@google.com> > > * predict.c (handle_missing_profiles): New function. > (counts_to_freqs): Don't overwrite estimated frequencies > when function has no profile counts. > * predict.h (handle_missing_profiles): Declare. > * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. > > Index: predict.c > =================================================================== > --- predict.c (revision 203633) > +++ predict.c (working copy) > @@ -2742,6 +2742,39 @@ estimate_loops (void) > BITMAP_FREE (tovisit); > } > > +void > +handle_missing_profiles (void) > +{ > + struct cgraph_node *node; > + /* See if 0 count function has non-0 count callers. In this case we > + lost some profile. Drop its function profile to PROFILE_GUESSED. */ > + FOR_EACH_DEFINED_FUNCTION (node) > + { > + struct cgraph_edge *e; > + gcov_type call_count = 0; > + struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl); > + if (node->count) > + continue; > + for (e = node->callers; e; e = e->next_caller) > + call_count += e->count; > + if (call_count > + && fn && fn->cfg > + && (call_count * 4 >= profile_info->runs)) > + { > + bool maybe_hot = maybe_hot_count_p (NULL, call_count); > + if (dump_file) > + fprintf (dump_file, > + "Dropping 0 profile for %s/%i. %s based on calls.\n", > + cgraph_node_name (node), node->symbol.order, > + maybe_hot ? "Function is hot" : "Function is normal"); > + profile_status_for_function (fn) > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > + node->frequency > + = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > + } > + } > +} > + > /* Convert counts measured by profile driven feedback to frequencies. > Return nonzero iff there was any nonzero execution count. */ > > @@ -2751,6 +2784,9 @@ counts_to_freqs (void) > gcov_type count_max, true_count_max = 0; > basic_block bb; > > + if (!ENTRY_BLOCK_PTR->count) > + return 0; > + > FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) > true_count_max = MAX (bb->count, true_count_max); > > Index: predict.h > =================================================================== > --- predict.h (revision 203633) > +++ predict.h (working copy) > @@ -37,6 +37,7 @@ enum prediction > > extern void predict_insn_def (rtx, enum br_predictor, enum prediction); > extern int counts_to_freqs (void); > +extern void handle_missing_profiles (void); > extern void estimate_bb_frequencies (bool); > extern const char *predictor_name (enum br_predictor); > extern tree build_predict_expr (enum br_predictor, enum prediction); > Index: tree-profile.c > =================================================================== > --- tree-profile.c (revision 203633) > +++ tree-profile.c (working copy) > @@ -607,6 +607,8 @@ tree_profiling (void) > pop_cfun (); > } > > + handle_missing_profiles (); > + > del_node_map (); > return 0; > } > > >> >> Teresa >> >>> >>> Honza >>>> >>>> Thanks, >>>> Teresa >>>> >>>> > >>>> > Honza >>>> >>>> >>>> >>>> -- >>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413