> 2013-10-03  Teresa Johnson  <tejohn...@google.com>
>
> 	* params.def (PARAM_MIN_HOT_RUN_RATIO): New parameter.
I would not mention HOT in the parameter name. Blocks are hot/normal/unlikely
and we already have HOT_BB_FREQUENCY_FRACTION. So perhaps
UNLIKELY_BB_COUNT_FRACTION, with an explanation that it is relative to the
number of train runs?

> +DEFPARAM(PARAM_MIN_HOT_RUN_RATIO,
> +	 "min-hot-run-ratio",
> +	 "The minimum ratio of profile runs to basic block execution count "
> +	 "for the block to be considered hot",

Perhaps as:

> +	 "The maximum ratio of profile runs to basic block execution count "
> +	 "for the block to be considered unlikely",

> +	 4, 0, 10000)

The lower bound should be 1, or we divide by 0.

> Index: predict.c
> ===================================================================
> --- predict.c	(revision 203152)
> +++ predict.c	(working copy)
> @@ -237,17 +237,31 @@ probably_never_executed (struct function *fun,
>    gcc_checking_assert (fun);
>    if (profile_status_for_function (fun) == PROFILE_READ)
>      {
> -      if ((count * 4 + profile_info->runs / 2) / profile_info->runs > 0)
> +      int min_run_ratio = PARAM_VALUE (PARAM_MIN_HOT_RUN_RATIO);
> +      if (RDIV (count * min_run_ratio, profile_info->runs) > 0)

This way, if count is 1, runs needs to be min_run_ratio * 2.  So perhaps

  if (count * min_run_ratio >= profile_info->runs)

instead of the division.

> -      if (ENTRY_BLOCK_PTR->count && ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE)
> +      if (ENTRY_BLOCK_PTR->count)
>         {
> -          return (RDIV (frequency * ENTRY_BLOCK_PTR->count,
> -                        ENTRY_BLOCK_PTR->frequency)
> -                  < REG_BR_PROB_BASE / 4);
> +          gcov_type scaled_count
> +              = frequency * ENTRY_BLOCK_PTR->count * min_run_ratio;
> +          gcov_type computed_count;
> +          /* Check for overflow, in which case ENTRY_BLOCK_PTR->count should
> +             be large enough to do the division first without losing much
> +             precision.  */
> +          if (scaled_count/frequency/min_run_ratio != ENTRY_BLOCK_PTR->count)

I do not like the undefined behaviour triggered before the check (sure,
unsigned arithmetic would fix that, but it seems all strange).  What about
guarding the whole code with

  if (ENTRY_BLOCK_PTR->count && count < REG_BR_PROB_BASE)

For large counts the roundoff problems in the first test should not be a
problem.

> +            {
> +              computed_count = RDIV (ENTRY_BLOCK_PTR->count,
> +                                     ENTRY_BLOCK_PTR->frequency);
> +              computed_count *= frequency * min_run_ratio;
> +            }
> +          else
> +            computed_count = RDIV (scaled_count, ENTRY_BLOCK_PTR->frequency);
> +          if (RDIV (computed_count * min_run_ratio, profile_info->runs) > 0)

So on the non-overflow path you compute

  ((frequency * entry_bb_count * min_run_ratio) / entry_bb_frequency)
    * min_run_ratio / runs > 0.5

I think you want

  (frequency * entry_bb_count * min_run_ratio) / entry_bb_frequency >= runs

If entry_bb_frequency is 0, you can just return true iff frequency is > min_run_ratio.

I think the safe way to compute this is

  if (ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE * REG_BR_PROB_BASE)
    scaled_count = (frequency * entry_bb_count * min_run_ratio)
                   / entry_bb_frequency;
  else
    scaled_count = ((frequency * entry_bb_count) / entry_bb_frequency)
                   * min_run_ratio;

In the first path we have at most 10000*10000*10000*10000, which still fits
in a 64bit value.  In the second path the base of the fixed point arithmetic
is large enough that the division should not matter.  We never get over
entry_count * 10000, which is safe.  The first part of the statement, however,
just computes a value that should match count, so count > 10000 unless the
profile got misupdated, and the low-count roundoff errors should not matter.
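To make that arithmetic concrete, here is a small standalone model of the
suggestion (plain C, not GCC code; entry_count, entry_freq, runs and
min_run_ratio are stand-ins for ENTRY_BLOCK_PTR->count,
ENTRY_BLOCK_PTR->frequency, profile_info->runs and the new parameter).
It is only a sketch of the review comments above, not the patch itself:

  #include <stdint.h>
  #include <stdbool.h>

  #define REG_BR_PROB_BASE 10000

  /* Model of the suggested check: return true if the block looks executed
     in at least one out of min_run_ratio train runs, i.e. it should NOT be
     classified as probably never executed.  */
  bool
  executed_often_enough (int64_t count, int64_t frequency,
                         int64_t entry_count, int64_t entry_freq,
                         int64_t runs, int64_t min_run_ratio)
  {
    /* Plain comparison instead of RDIV, so count == 1 already passes
       whenever runs <= min_run_ratio.  */
    if (count * min_run_ratio >= runs)
      return true;

    if (!entry_count || !entry_freq)
      return false;

    int64_t scaled_count;
    if (entry_count < (int64_t) REG_BR_PROB_BASE * REG_BR_PROB_BASE)
      /* frequency and min_run_ratio are at most 10000 and entry_count is
         below 10000 * 10000, so the product stays around 10^16 and fits
         in 64 bits.  */
      scaled_count = frequency * entry_count * min_run_ratio / entry_freq;
    else
      /* Huge entry_count: do the division before multiplying by
         min_run_ratio; the roundoff is negligible at this magnitude.  */
      scaled_count = entry_count / entry_freq * frequency * min_run_ratio;

    return scaled_count >= runs;
  }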
So perhaps the whole code can execute only under

  if (ENTRY_BLOCK_PTR->count && count < REG_BR_PROB_BASE
      && ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE * REG_BR_PROB_BASE)

and we do not need any conditionals for scaled_count.

Does such code work with the ratio changed to 16? I think the double
multiplication in your patch may have caused problems. I will check the
second part of the patch separately.

Thanks!
Honza
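For illustration, a standalone model of that fully guarded variant (again
plain C with the same stand-in names, not a buildable GCC patch), with a
tiny driver that tries the ratio at 4 and at 16:

  #include <stdint.h>
  #include <stdbool.h>
  #include <stdio.h>

  #define REG_BR_PROB_BASE 10000

  /* Model of probably_never_executed where one up-front guard makes the
     per-case overflow handling unnecessary.  */
  bool
  probably_never_executed_model (int64_t count, int64_t frequency,
                                 int64_t entry_count, int64_t entry_freq,
                                 int64_t runs, int64_t min_run_ratio)
  {
    if (count * min_run_ratio >= runs)
      return false;

    /* Guard: skip the estimate for large counts (the direct test above is
       accurate then); with frequency at most 10000 and entry_count below
       10000 * 10000, frequency * entry_count * min_run_ratio stays around
       10^16, so no overflow branches are needed.  */
    if (entry_count && entry_freq
        && count < REG_BR_PROB_BASE
        && entry_count < (int64_t) REG_BR_PROB_BASE * REG_BR_PROB_BASE)
      {
        int64_t scaled_count
          = frequency * entry_count * min_run_ratio / entry_freq;
        if (scaled_count >= runs)
          return false;
      }
    return true;
  }

  int
  main (void)
  {
    /* A block with count 1 over 10 train runs and a small frequency:
       classified as never executed with the ratio at 4, but not at 16.  */
    printf ("%d\n", probably_never_executed_model (1, 100, 2, 10000, 10, 4));
    printf ("%d\n", probably_never_executed_model (1, 100, 2, 10000, 10, 16));
    return 0;
  }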