Hi Bill, Thanks for the review!
on 2021/9/18 上午12:27, Bill Schmidt wrote: > Hi Kewen, > > On 9/15/21 3:52 AM, Kewen.Lin wrote: >> Hi, >> >> This patch follows the discussion here[1], where Segher suggested >> parameterizing those exact magic constants for density heuristics, >> to make it easier to tweak if need. >> >> Since these heuristics are quite internal, I make these parameters >> as undocumented and be mainly used by developers. >> >> The change here should be "No Functional Change". But I verified >> it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8, >> the result is neutral as expected. >> >> Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9. >> >> Is it ok for trunk? >> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html >> >> BR, >> Kewen >> ----- >> gcc/ChangeLog: >> >> * config/rs6000/rs6000.opt (rs6000-density-pct-threshold, >> rs6000-density-size-threshold, rs6000-density-penalty, >> rs6000-density-load-pct-threshold, >> rs6000-density-load-num-threshold): New parameter. >> * config/rs6000/rs6000.c (rs6000_density_test): Adjust with >> corresponding parameters. >> >> --- >> gcc/config/rs6000/rs6000.c | 22 +++++++--------------- >> gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++ >> 2 files changed, 28 insertions(+), 15 deletions(-) >> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c >> index 9bc826e3a50..4ab23b0ab33 100644 >> --- a/gcc/config/rs6000/rs6000.c >> +++ b/gcc/config/rs6000/rs6000.c >> @@ -5284,9 +5284,6 @@ struct rs6000_cost_data >> static void >> rs6000_density_test (rs6000_cost_data *data) >> { >> - const int DENSITY_PCT_THRESHOLD = 85; >> - const int DENSITY_SIZE_THRESHOLD = 70; >> - const int DENSITY_PENALTY = 10; >> struct loop *loop = data->loop_info; >> basic_block *bbs = get_loop_body (loop); >> int nbbs = loop->num_nodes; >> @@ -5322,26 +5319,21 @@ rs6000_density_test (rs6000_cost_data *data) >> free (bbs); >> density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost); >> >> - if (density_pct > DENSITY_PCT_THRESHOLD >> - && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD) >> + if (density_pct > rs6000_density_pct_threshold >> + && vec_cost + not_vec_cost > rs6000_density_size_threshold) >> { >> - data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100; >> + data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / >> 100; >> if (dump_enabled_p ()) >> dump_printf_loc (MSG_NOTE, vect_location, >> "density %d%%, cost %d exceeds threshold, penalizing " >> - "loop body cost by %d%%\n", density_pct, >> - vec_cost + not_vec_cost, DENSITY_PENALTY); >> + "loop body cost by %u%%\n", density_pct, >> + vec_cost + not_vec_cost, rs6000_density_penalty); >> } >> >> /* Check whether we need to penalize the body cost to account >> for excess strided or elementwise loads. */ >> if (data->extra_ctor_cost > 0) >> { >> - /* Threshold for load stmts percentage in all vectorized stmts. */ >> - const int DENSITY_LOAD_PCT_THRESHOLD = 45; >> - /* Threshold for total number of load stmts. */ >> - const int DENSITY_LOAD_NUM_THRESHOLD = 20; >> - >> gcc_assert (data->nloads <= data->nstmts); >> unsigned int load_pct = (data->nloads * 100) / data->nstmts; >> >> @@ -5355,8 +5347,8 @@ rs6000_density_test (rs6000_cost_data *data) >> the loads. >> One typical case is the innermost loop of the hotspot of SPEC2017 >> 503.bwaves_r without loop interchange. */ >> - if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD >> - && load_pct > DENSITY_LOAD_PCT_THRESHOLD) >> + if (data->nloads > (unsigned int) rs6000_density_load_num_threshold >> + && load_pct > (unsigned int) rs6000_density_load_pct_threshold) >> { >> data->cost[vect_body] += data->extra_ctor_cost; >> if (dump_enabled_p ()) >> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt >> index 0538db387dc..563983f3269 100644 >> --- a/gcc/config/rs6000/rs6000.opt >> +++ b/gcc/config/rs6000/rs6000.opt >> @@ -639,3 +639,24 @@ Enable instructions that guard against return-oriented >> programming attacks. >> mprivileged >> Target Var(rs6000_privileged) Init(0) >> Generate code that will run in privileged state. >> + >> +-param=rs6000-density-pct-threshold= >> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) >> Init(85) IntegerRange(0, 99) Param >> +When costing for loop vectorization, we probably need to penalize the loop >> body cost if the existing cost model may not adequately reflect delays from >> unavailable vector resources. We collect the cost for vectorized statements >> and non-vectorized statements separately, check the proportion of vec_cost >> to total cost of vec_cost and non vec_cost, and penalize only if the >> proportion exceeds the threshold specified by this parameter. The default >> value is 85. >> + >> +-param=rs6000-density-size-threshold= >> +Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) >> Init(70) IntegerRange(0, 99) Param > > I think 99 is not a sufficient upper bound. This is a counting value that > could in theory get much higher. Can you set it to something ridiculous like > IntegerRange(0, 1000)? > Thanks for catching! It's a copy & paste typo. :( Will fix! >> +Like parameter rs6000-density-pct-threshold, we also check the total sum of >> vec_cost and non vec_cost, and penalize only if the sum exceeds the >> threshold specified by this parameter. The default value is 70. >> + >> +-param=rs6000-density-penalty= >> +Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) >> IntegerRange(0, 1000) Param >> +When both heuristics with rs6000-density-pct-threshold and >> rs6000-density-size-threshold are satisfied, we decide to penalize the loop >> body cost by the value which is specified by this parameter. The default >> value is 10. >> + >> +-param=rs6000-density-load-pct-threshold= >> +Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) >> Init(45) IntegerRange(0, 99) Param >> +When costing for loop vectorization, we probably need to penalize the loop >> body cost by accounting for excess strided or elementwise loads. We collect >> the numbers for general statements and load statements according to the >> information for statement to be vectorized, check the proportion of load >> statements, and penalize only if the proportion exceeds the threshold >> specified by this parameter. The default value is 45. > > "according to the information for statement*s* to be vectorized" > Good catch, will fix. > Otherwise, these descriptions are quite excellent, by the way. > > This looks good to me with those minor things addressed. Recommend that > maintainers approve. > Thanks again! BR, Kewen > Thanks for the patch! > Bill >> + >> +-param=rs6000-density-load-num-threshold= >> +Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) >> Init(20) IntegerRange(0, 1000) Param >> +Like parameter rs6000-density-load-pct-threshold, we also check if the >> total number of load statements exceeds the threshold specified by this >> parameter, and penalize only if it's satisfied. The default value is 20. >> + >> -- >> 2.25.1 >> >