On Sun, Oct 6, 2013 at 6:43 AM, Jan Hubicka <hubi...@ucw.cz> wrote: >> 2013-10-03 Teresa Johnson <tejohn...@google.com> >> >> * params.def (PARAM_MIN_HOT_RUN_RATIO): New parameter. > > I would not mention HOT in the parameter name. Blocks are hot/normal/unlikely > and we HOT_BB_FREQUENCY_FRACTION. > > So perhaps UNLIKELY_BB_COUNT_FRACTION with an explanation that it is relative > to number of train runs?
Ok. >> +DEFPARAM(PARAM_MIN_HOT_RUN_RATIO, >> + "min-hot-run-ratio", >> + "The minimum ratio of profile runs to basic block execution count " >> + "for the block to be considered hot", > Perhaps as: >> + "The maximum ratio of profile runs to basic block execution count " >> + "for the block to be considered unlikely", > >> + 4, 0, 10000) > > lower bound should be 1 or we divide by 0. Yep, will fix. >> Index: predict.c >> =================================================================== >> --- predict.c (revision 203152) >> +++ predict.c (working copy) >> @@ -237,17 +237,31 @@ probably_never_executed (struct function *fun, >> gcc_checking_assert (fun); >> if (profile_status_for_function (fun) == PROFILE_READ) >> { >> - if ((count * 4 + profile_info->runs / 2) / profile_info->runs > 0) >> + int min_run_ratio = PARAM_VALUE (PARAM_MIN_HOT_RUN_RATIO); >> + if (RDIV (count * min_run_ratio, profile_info->runs) > 0) > > This way if count is 1, runs needs to be n_run_ratio * 2 > So perhaps if (count * min_run_ratio >= profile_info->runs) instead of > division. Ok. >> - if (ENTRY_BLOCK_PTR->count && ENTRY_BLOCK_PTR->count < >> REG_BR_PROB_BASE) >> + if (ENTRY_BLOCK_PTR->count) >> { >> - return (RDIV (frequency * ENTRY_BLOCK_PTR->count, >> - ENTRY_BLOCK_PTR->frequency) >> - < REG_BR_PROB_BASE / 4); >> + gcov_type scaled_count >> + = frequency * ENTRY_BLOCK_PTR->count * min_run_ratio; >> + gcov_type computed_count; >> + /* Check for overflow, in which case ENTRY_BLOCK_PTR->count should >> + be large enough to do the division first without losing much >> + precision. */ >> + if (scaled_count/frequency/min_run_ratio != >> ENTRY_BLOCK_PTR->count) > > I do not like the undefined behaviour triggered before the check (sure > unsigned > arithmetic would fix that, but it seems all strange). What about guarding the > whole code. > > if (ENTRY_BLOCK_PTR->count && count < REG_BR_PROB_BASE) > For large counts the roudoff problems in first test should not be problem. Sure, that sounds reasonable. >> + { >> + computed_count = RDIV (ENTRY_BLOCK_PTR->count, >> + ENTRY_BLOCK_PTR->frequency); >> + computed_count *= frequency * min_run_ratio; >> + } >> + else >> + computed_count = RDIV (scaled_count, >> ENTRY_BLOCK_PTR->frequency); >> + if (RDIV (computed_count * min_run_ratio, profile_info->runs) > 0) > > So at nonoverflow path you compute > > ((frequency * entry_bb_count * min_run_ratio) / entry_bb_frequency) * > min_run_ratio / runs > 0.5 Yes, there is an extra min_run_ratio factor in there that I need to remove! > > I think you want > > ((frequency * entry_bb_count * min_run_ratio) / entry_bb_frequency >= runs Right, will change that too. > > If entry_bb_frequency is 0, you can just return true iff frequency is > > min_run_ratio. We already return false conservatively if entry_bb_frequency is 0. Do we really want to change this - I'm not sure it makes sense to compare the frequency to the run ratio directly. > > I think safe way to compute this is > > if (ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE * REG_BR_PROB_BASE) > scaled_count = ((frequency * entry_bb_count * min_run_ratio) / > entry_bb_frequency; > else > scaled_count = ((frequency * entry_bb_count) / entry_bb_frequency) * > min_run_ratio > > In first path, we have at most 10000*10000*10000*10000 that still fits in > 64bit > value. > > In the second path the base of fixed point arithmetic is large enough so the > division should not matter. We never get over entry_count * 10000 that is > safe. > The first part of statement however just compute value that should match count > so count > 10000 unless profile got misupated and the low-count roundoff > errors should > not matter. So perhaps the whole code can execute only if > if (ENTRY_BLOCK_PTR->count && count < REG_BR_PROB_BASE > && ENTRY_BLOCK_PTR->count < REG_BR_PROB_BASE * REG_BR_PROB_BASE) > and we do not need any conditionals for scaled_count > > Does such code work with ratio changed to 16? I think the double > multiplication in your > patch may have caused problems. Let me fix this up and will get back. Thanks, Teresa > > I will check the second part of patch separately. > Thanks! > Honza -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413