Ping. Thanks, Teresa On Wed, Oct 30, 2013 at 1:15 PM, Teresa Johnson <tejohn...@google.com> wrote: > On Fri, Oct 18, 2013 at 2:17 PM, Jan Hubicka <hubi...@ucw.cz> wrote: >>> Here is the patch updated to use the new parameter from r203830. >>> Passed bootstrap and regression tests. >>> >>> 2013-10-18 Jan Hubicka <j...@suse.cz> >>> Teresa Johnson <tejohn...@google.com> >>> >>> * predict.c (handle_missing_profiles): New function. >>> (counts_to_freqs): Don't overwrite estimated frequencies >>> when function has no profile counts. >>> * predict.h (handle_missing_profiles): Declare. >>> * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. >>> >>> Index: predict.c >>> =================================================================== >>> --- predict.c (revision 203830) >>> +++ predict.c (working copy) >>> @@ -2758,6 +2758,40 @@ estimate_loops (void) >>> BITMAP_FREE (tovisit); >>> } >>> >> >> You need block comment. It should explain the problem of COMDATs and how >> they are handled. >> It is not an obvious thing. > > Done. > >> >>> +void >>> +handle_missing_profiles (void) >>> +{ >>> + struct cgraph_node *node; >>> + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); >> extra line >>> + /* See if 0 count function has non-0 count callers. In this case we >>> + lost some profile. Drop its function profile to PROFILE_GUESSED. */ >>> + FOR_EACH_DEFINED_FUNCTION (node) >>> + { >>> + struct cgraph_edge *e; >>> + gcov_type call_count = 0; >>> + struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl); >> Extra line >>> + if (node->count) >>> + continue; >>> + for (e = node->callers; e; e = e->next_caller) >>> + call_count += e->count; >> What about checking if the sum is way off even for non-0 counts. >> I.e. for case where function was inlined to some calls but not to others? In >> that case we may want to scale up the counts (with some resonable care for >> capping) > > In this patch I am not changing any counts, so I am leaving this one > for follow-on work (even for the routines missing counts completely > like these, we don't apply any counts, just mark them as guessed. I > have a follow-on patch to send once this goes in that does apply > counts to these 0-count routines only when we decide to inline as we > discussed). > >> >> Also I think the code really should propagate - i.e. have simple worklist and >> look for functions that are COMDAT and are called by other COMDAT functions >> where we decided to drop the profile. Having non-trivial chains of comdats >> is >> common thing. > > Done. > >> >> What about outputting user visible warning/error on the incosnsistency when >> no COMDAT function is involved same way as we do for BB profile? > > Done. This one caught the fact that we have this situation for "extern > template" functions as well when I tested on cpu2006. I added in a > check to ignore those when issuing the warning (they are not emitted > thus don't get any profile counts). > > Updated patch below. > > Bootstrapped and tested on x86-64-unknown-linux-gnu. Also tested on > profiledbootstrap and profile-use build of SPEC cpu2006. Ok for trunk? > > Thanks, > Teresa > > 2013-10-30 Teresa Johnson <tejohn...@google.com> > > * predict.c (drop_profile): New function. > (handle_missing_profiles): Ditto. > (counts_to_freqs): Don't overwrite estimated frequencies > when function has no profile counts. > * predict.h (handle_missing_profiles): Declare. > * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. > > Index: predict.c > =================================================================== > --- predict.c (revision 204213) > +++ predict.c (working copy) > @@ -2765,6 +2765,107 @@ estimate_loops (void) > BITMAP_FREE (tovisit); > } > > +/* Drop the profile for NODE to guessed, and update its frequency based on > + whether it is expected to be HOT. */ > + > +static void > +drop_profile (struct cgraph_node *node, bool hot) > +{ > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + if (dump_file) > + fprintf (dump_file, > + "Dropping 0 profile for %s/%i. %s based on calls.\n", > + cgraph_node_name (node), node->order, > + hot ? "Function is hot" : "Function is normal"); > + /* We only expect to miss profiles for functions that are reached > + via non-zero call edges in cases where the function may have > + been linked from another module or library (COMDATs and extern > + templates). See the comments below for handle_missing_profiles. */ > + if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl)) > + warning (0, > + "Missing counts for called function %s/%i", > + cgraph_node_name (node), node->order); > + > + profile_status_for_function (fn) > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > + node->frequency > + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > +} > + > +/* In the case of COMDAT routines, multiple object files will contain the > same > + function and the linker will select one for the binary. In that case > + all the other copies from the profile instrument binary will be missing > + profile counts. Look for cases where this happened, due to non-zero > + call counts going to 0-count functions, and drop the profile to guessed > + so that we can use the estimated probabilities and avoid optimizing only > + for size. > + > + The other case where the profile may be missing is when the routine > + is not going to be emitted to the object file, e.g. for "extern template" > + class methods. Those will be marked DECL_EXTERNAL. Emit a warning in > + all other cases of non-zero calls to 0-count functions. */ > + > +void > +handle_missing_profiles (void) > +{ > + struct cgraph_node *node; > + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); > + vec<struct cgraph_node *> worklist; > + worklist.create (64); > + > + /* See if 0 count function has non-0 count callers. In this case we > + lost some profile. Drop its function profile to PROFILE_GUESSED. */ > + FOR_EACH_DEFINED_FUNCTION (node) > + { > + struct cgraph_edge *e; > + gcov_type call_count = 0; > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + if (node->count) > + continue; > + for (e = node->callers; e; e = e->next_caller) > + call_count += e->count; > + if (call_count > + && fn && fn->cfg > + && (call_count * unlikely_count_fraction >= profile_info->runs)) > + { > + bool maybe_hot = maybe_hot_count_p (NULL, call_count); > + > + drop_profile (node, maybe_hot); > + worklist.safe_push (node); > + } > + } > + > + /* Propagate the profile dropping to other 0-count COMDATs that are > + potentially called by COMDATs we already dropped the profile on. */ > + while (worklist.length () > 0) > + { > + struct cgraph_edge *e; > + > + node = worklist.pop (); > + for (e = node->callees; e; e = e->next_caller) > + { > + struct cgraph_node *callee = e->callee; > + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); > + > + if (callee->count > 0) > + continue; > + if (DECL_COMDAT (callee->decl) && fn && fn->cfg > + && profile_status_for_function (fn) == PROFILE_READ) > + { > + /* Since there are no non-0 call counts to this function, > + we don't know for sure whether it is hot. Indicate to > + the drop_profile routine that function should be marked > + normal, rather than hot. */ > + drop_profile (node, false); > + worklist.safe_push (callee); > + } > + } > + } > + worklist.release (); > +} > + > /* Convert counts measured by profile driven feedback to frequencies. > Return nonzero iff there was any nonzero execution count. */ > > @@ -2774,6 +2875,9 @@ counts_to_freqs (void) > gcov_type count_max, true_count_max = 0; > basic_block bb; > > + if (!ENTRY_BLOCK_PTR->count) > + return 0; > + > FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) > true_count_max = MAX (bb->count, true_count_max); > > Index: predict.h > =================================================================== > --- predict.h (revision 204213) > +++ predict.h (working copy) > @@ -37,6 +37,7 @@ enum prediction > > extern void predict_insn_def (rtx, enum br_predictor, enum prediction); > extern int counts_to_freqs (void); > +extern void handle_missing_profiles (void); > extern void estimate_bb_frequencies (bool); > extern const char *predictor_name (enum br_predictor); > extern tree build_predict_expr (enum br_predictor, enum prediction); > Index: tree-profile.c > =================================================================== > --- tree-profile.c (revision 204213) > +++ tree-profile.c (working copy) > @@ -612,6 +612,8 @@ tree_profiling (void) > pop_cfun (); > } > > + handle_missing_profiles (); > + > del_node_map (); > return 0; > } > >> >> Honza > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
-- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413