Hi Bill,

Thanks for the review!

on 2021/9/18 上午12:27, Bill Schmidt wrote:
> Hi Kewen,
> 
> On 9/15/21 3:52 AM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch follows the discussion here[1], where Segher suggested
>> parameterizing those exact magic constants for density heuristics,
>> to make it easier to tweak if need.
>>
>> Since these heuristics are quite internal, I make these parameters
>> as undocumented and be mainly used by developers.
>>
>> The change here should be "No Functional Change".  But I verified
>> it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8,
>> the result is neutral as expected.
>>
>> Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9.
>>
>> Is it ok for trunk?
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>>     * config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
>>     rs6000-density-size-threshold, rs6000-density-penalty,
>>     rs6000-density-load-pct-threshold,
>>     rs6000-density-load-num-threshold): New parameter.
>>     * config/rs6000/rs6000.c (rs6000_density_test): Adjust with
>>     corresponding parameters.
>>
>> ---
>>   gcc/config/rs6000/rs6000.c   | 22 +++++++---------------
>>   gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++
>>   2 files changed, 28 insertions(+), 15 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 9bc826e3a50..4ab23b0ab33 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -5284,9 +5284,6 @@ struct rs6000_cost_data
>>   static void
>>   rs6000_density_test (rs6000_cost_data *data)
>>   {
>> -  const int DENSITY_PCT_THRESHOLD = 85;
>> -  const int DENSITY_SIZE_THRESHOLD = 70;
>> -  const int DENSITY_PENALTY = 10;
>>     struct loop *loop = data->loop_info;
>>     basic_block *bbs = get_loop_body (loop);
>>     int nbbs = loop->num_nodes;
>> @@ -5322,26 +5319,21 @@ rs6000_density_test (rs6000_cost_data *data)
>>     free (bbs);
>>     density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
>>
>> -  if (density_pct > DENSITY_PCT_THRESHOLD
>> -      && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
>> +  if (density_pct > rs6000_density_pct_threshold
>> +      && vec_cost + not_vec_cost > rs6000_density_size_threshold)
>>       {
>> -      data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
>> +      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 
>> 100;
>>         if (dump_enabled_p ())
>>       dump_printf_loc (MSG_NOTE, vect_location,
>>                "density %d%%, cost %d exceeds threshold, penalizing "
>> -             "loop body cost by %d%%\n", density_pct,
>> -             vec_cost + not_vec_cost, DENSITY_PENALTY);
>> +             "loop body cost by %u%%\n", density_pct,
>> +             vec_cost + not_vec_cost, rs6000_density_penalty);
>>       }
>>
>>     /* Check whether we need to penalize the body cost to account
>>        for excess strided or elementwise loads.  */
>>     if (data->extra_ctor_cost > 0)
>>       {
>> -      /* Threshold for load stmts percentage in all vectorized stmts.  */
>> -      const int DENSITY_LOAD_PCT_THRESHOLD = 45;
>> -      /* Threshold for total number of load stmts.  */
>> -      const int DENSITY_LOAD_NUM_THRESHOLD = 20;
>> -
>>         gcc_assert (data->nloads <= data->nstmts);
>>         unsigned int load_pct = (data->nloads * 100) / data->nstmts;
>>
>> @@ -5355,8 +5347,8 @@ rs6000_density_test (rs6000_cost_data *data)
>>             the loads.
>>        One typical case is the innermost loop of the hotspot of SPEC2017
>>        503.bwaves_r without loop interchange.  */
>> -      if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
>> -      && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
>> +      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
>> +      && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
>>       {
>>         data->cost[vect_body] += data->extra_ctor_cost;
>>         if (dump_enabled_p ())
>> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
>> index 0538db387dc..563983f3269 100644
>> --- a/gcc/config/rs6000/rs6000.opt
>> +++ b/gcc/config/rs6000/rs6000.opt
>> @@ -639,3 +639,24 @@ Enable instructions that guard against return-oriented 
>> programming attacks.
>>   mprivileged
>>   Target Var(rs6000_privileged) Init(0)
>>   Generate code that will run in privileged state.
>> +
>> +-param=rs6000-density-pct-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) 
>> Init(85) IntegerRange(0, 99) Param
>> +When costing for loop vectorization, we probably need to penalize the loop 
>> body cost if the existing cost model may not adequately reflect delays from 
>> unavailable vector resources.  We collect the cost for vectorized statements 
>> and non-vectorized statements separately, check the proportion of vec_cost 
>> to total cost of vec_cost and non vec_cost, and penalize only if the 
>> proportion exceeds the threshold specified by this parameter.  The default 
>> value is 85.
>> +
>> +-param=rs6000-density-size-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) 
>> Init(70) IntegerRange(0, 99) Param
> 
> I think 99 is not a sufficient upper bound.  This is a counting value that 
> could in theory get much higher.  Can you set it to something ridiculous like 
> IntegerRange(0, 1000)?
> 

Thanks for catching!  It's a copy & paste typo.  :(  Will fix!
>> +Like parameter rs6000-density-pct-threshold, we also check the total sum of 
>> vec_cost and non vec_cost, and penalize only if the sum exceeds the 
>> threshold specified by this parameter.  The default value is 70.
>> +
>> +-param=rs6000-density-penalty=
>> +Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) 
>> IntegerRange(0, 1000) Param
>> +When both heuristics with rs6000-density-pct-threshold and 
>> rs6000-density-size-threshold are satisfied, we decide to penalize the loop 
>> body cost by the value which is specified by this parameter.  The default 
>> value is 10.
>> +
>> +-param=rs6000-density-load-pct-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) 
>> Init(45) IntegerRange(0, 99) Param
>> +When costing for loop vectorization, we probably need to penalize the loop 
>> body cost by accounting for excess strided or elementwise loads.  We collect 
>> the numbers for general statements and load statements according to the 
>> information for statement to be vectorized, check the proportion of load 
>> statements, and penalize only if the proportion exceeds the threshold 
>> specified by this parameter.  The default value is 45.
> 
> "according to the information for statement*s* to be vectorized"
> 

Good catch, will fix.

> Otherwise, these descriptions are quite excellent, by the way.
> 
> This looks good to me with those minor things addressed.  Recommend that 
> maintainers approve.
> 

Thanks again!

BR,
Kewen

> Thanks for the patch!
> Bill
>> +
>> +-param=rs6000-density-load-num-threshold=
>> +Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) 
>> Init(20) IntegerRange(0, 1000) Param
>> +Like parameter rs6000-density-load-pct-threshold, we also check if the 
>> total number of load statements exceeds the threshold specified by this 
>> parameter, and penalize only if it's satisfied.  The default value is 20.
>> +
>> -- 
>> 2.25.1
>>
> 

Reply via email to