On Mon, Oct 7, 2013 at 10:49 PM, Teresa Johnson <tejohn...@google.com> wrote: > On Sun, Oct 6, 2013 at 6:51 AM, Jan Hubicka <hubi...@ucw.cz> wrote: >>> The second part ensures that frequencies are correctly updated after >>> inlining. The problem is that after inlining the frequencies were >>> being recomputed directly from the corresponding bb counts in >>> rebuild_frequencies. If the counts had been truncated to 0, then the >>> recomputed frequencies would become 0 as well (often leading to >>> inconsistencies in the frequencies between blocks). Then the above >>> change to probably_never_executed would not help identify these blocks >>> as non-cold from the frequencies. The fix was to use the existing >>> logic used during static profile rebuilding to also >>> recompute/propagate the frequencies from the branch probabilities in >>> the profile feedback case. I also renamed a number of estimate_* >>> routines to compute_* and updated the comments to reflect the fact >>> that these routines are not doing estimation (from a static profile), >>> but in fact recomputing/propagating frequencies from the existing >>> (either guessed or profile-feedback-based) probabilities. >> >> I see where your plan - you want frequencies to be computed purely on >> probabilities that are there. This however will lead to unnecesary loss of >> precisions for functions with large counts: probabilities are truncated in >> precision and the propagation won't give correct answers and has serious >> truncating issues. >> >> So perhaps we want to execute this only if maximal count in function is low? >> (i.e. less than REG_BR_PROB_BASE or even REG_BR_PROB_BASE/10 - if counts >> are close enough the roundoff errors of count updating should be less >> terrible >> than errors introduced by the propagation). > > Yes, I think that is a better approach. > >> >>> -/* Estimate probabilities of loopback edges in loops at same nest level. >>> */ >>> +/* Compute frequencies in loops at same nest level. */ >>> >>> static void >>> -estimate_loops_at_level (struct loop *first_loop) >>> +compute_loops_at_level (struct loop *first_loop) >> >> I would keep "estimate". The whole logic is not safe and it will lead to >> mistakes for irreducible loops and many other cases... > > Ok > >> >>> @@ -3027,18 +3042,16 @@ void >>> rebuild_frequencies (void) >>> { >>> timevar_push (TV_REBUILD_FREQUENCIES); >>> - if (profile_status == PROFILE_GUESSED) >>> + if (profile_status == PROFILE_GUESSED || profile_status == PROFILE_READ) >>> { >>> loop_optimizer_init (0); >>> add_noreturn_fake_exit_edges (); >>> mark_irreducible_loops (); >>> connect_infinite_loops_to_exit (); >>> - estimate_bb_frequencies (); >>> + compute_bb_frequencies (true); >> >> Here please add the check about maximal count in functiona and commnet >> (especially for case we will remember to remove this hack if we switch to >> sreal >> based counts/frequencies/probabilities) >> >> OK, with those changes (if it fixes the problem) > > Ok thanks. > Teresa
This has been committed as r203395: 2013-10-10 Teresa Johnson <tejohn...@google.com> * predict.c (tree_estimate_probability): Add new parameter for estimate_bb_frequencies. (estimate_bb_frequencies): Add new parameter to force estimation. (rebuild_frequencies): When max frequency in function is small, recompute counts from frequencies. * predict.h (estimate_bb_frequencies): New parameter. Index: predict.c =================================================================== --- predict.c (revision 203389) +++ predict.c (working copy) @@ -2389,7 +2389,7 @@ tree_estimate_probability (void) pointer_map_destroy (bb_predictions); bb_predictions = NULL; - estimate_bb_frequencies (); + estimate_bb_frequencies (false); free_dominance_info (CDI_POST_DOMINATORS); remove_fake_exit_edges (); } @@ -2692,7 +2692,7 @@ propagate_freq (basic_block head, bitmap tovisit) } } -/* Estimate probabilities of loopback edges in loops at same nest level. */ +/* Estimate frequencies in loops at same nest level. */ static void estimate_loops_at_level (struct loop *first_loop) @@ -2801,15 +2801,17 @@ expensive_function_p (int threshold) return false; } -/* Estimate basic blocks frequency by given branch probabilities. */ +/* Estimate and propagate basic block frequencies using the given branch + probabilities. If FORCE is true, the frequencies are used to estimate + the counts even when there are already non-zero profile counts. */ void -estimate_bb_frequencies (void) +estimate_bb_frequencies (bool force) { basic_block bb; sreal freq_max; - if (profile_status != PROFILE_READ || !counts_to_freqs ()) + if (force || profile_status != PROFILE_READ || !counts_to_freqs ()) { static int real_values_initialized = 0; @@ -2846,8 +2848,8 @@ void } } - /* First compute probabilities locally for each loop from innermost - to outermost to examine probabilities for back edges. */ + /* First compute frequencies locally for each loop from innermost + to outermost to examine frequencies for back edges. */ estimate_loops (); memcpy (&freq_max, &real_zero, sizeof (real_zero)); @@ -3028,13 +3030,29 @@ void rebuild_frequencies (void) { timevar_push (TV_REBUILD_FREQUENCIES); - if (profile_status == PROFILE_GUESSED) + + /* When the max bb count in the function is small, there is a higher + chance that there were truncation errors in the integer scaling + of counts by inlining and other optimizations. This could lead + to incorrect classification of code as being cold when it isn't. + In that case, force the estimation of bb counts/frequencies from the + branch probabilities, rather than computing frequencies from counts, + which may also lead to frequencies incorrectly reduced to 0. There + is less precision in the probabilities, so we only do this for small + max counts. */ + gcov_type count_max = 0; + basic_block bb; + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) + count_max = MAX (bb->count, count_max); + + if (profile_status == PROFILE_GUESSED + || (profile_status == PROFILE_READ && count_max < REG_BR_PROB_BASE/10)) { loop_optimizer_init (0); add_noreturn_fake_exit_edges (); mark_irreducible_loops (); connect_infinite_loops_to_exit (); - estimate_bb_frequencies (); + estimate_bb_frequencies (true); remove_fake_exit_edges (); loop_optimizer_finalize (); } Index: predict.h =================================================================== --- predict.h (revision 203389) +++ predict.h (working copy) @@ -37,7 +37,7 @@ enum prediction extern void predict_insn_def (rtx, enum br_predictor, enum prediction); extern int counts_to_freqs (void); -extern void estimate_bb_frequencies (void); +extern void estimate_bb_frequencies (bool); extern const char *predictor_name (enum br_predictor); extern tree build_predict_expr (enum br_predictor, enum prediction); extern void tree_estimate_probability (void); > >> >> Thanks, >> Honza > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413