Hi Richard,

Thanks for the comments!

on 2020/7/27 下午9:40, Richard Sandiford wrote:
> "Kewen.Lin" <li...@linux.ibm.com> writes:

[snip]

>> +      bool niters_known_p = LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo);
>> +      bool need_iterate_p
>> +    = (!LOOP_VINFO_EPILOGUE_P (loop_vinfo)
>> +       && !vect_known_niters_smaller_than_vf (loop_vinfo));
>> +
>> +      /* Init min/max, shift and minus cost relative to single
>> +     scalar_stmt. For now we only use length-based partial vectors on
>> +     Power, target specific cost tweaking may be needed for other
>> +     ports in future.  */
>> +      unsigned int min_max_cost = 2;
>> +      unsigned int shift_cost = 1, minus_cost = 1;
> 
> Please instead add a scalar_min_max to vect_cost_for_stmt, and use
> scalar_stmt for shift and minus.  There shouldn't be any Power things
> hard-coded into target-independent code.
> 

Agree!  It's not good to leave them there.  I thought to wait and see
if other targets which support vector with length can reuse this, or
move it to target specific codes then if not sharable.  But anyway
it looks not good, let's fix it.

I had some concerns on vect_cost_for_stmt way, since it seems to allow
more computations similar to min/max to be added like this, in long
term it probably leads to the situtation like: scalar_min_max,
scalar_div_expr, scalar_mul_expr ... an enum (cost types) bloat, it
seems not good to maintain.

I noticed that i386 port ix86_add_stmt_cost will check stmt_info->stmt,
whether is assignment and the subcode of the expression, it provides the
chance to check the statement more fine-grain, not just as normal
scalar_stmt/vector_stmt.

For the case here, we don't have the stmt_info, but we know the type
of computation(expression), how about to extend the hook add_stmt_cost
with one extra tree_code type argument, by default it can be some
unmeaningful code, for some needs like here, we specify the tree_code
as the code of computation, like {MIN,MAX}_EXPR, then target specific
add_stmt_cost can check this tree_code and adjust the cost accordingly.

>> +      /* Init cost relative to single scalar_stmt.  */
>> +      unsigned int prologue_cnt = 0;
>> +      unsigned int body_cnt = 0;
> 
> Maybe _stmts instead of _cnt, to make it clearer what we're counting.
> At first it looked like this was actually the raw cost.  I guess with
> the above change we'd also want separate counters for min_max.
> 

Good point, will fix it in v5.

>> +      rgroup_controls *rgc;
>> +      unsigned int num_vectors_m1;
>> +      FOR_EACH_VEC_ELT (LOOP_VINFO_LENS (loop_vinfo), num_vectors_m1, rgc)
>> +    if (rgc->type)
>> +      {
>> +        unsigned nitems = rgc->max_nscalars_per_iter * rgc->factor;
>> +
>> +        /* May need one shift for nitems_total computation.  */
>> +        if (nitems != 1 && !niters_known_p)
>> +          prologue_cnt += shift_cost;
>> +
>> +        /* Need to handle wrap around.  */
>> +        if (iv_limit == -1
>> +            || (wi::min_precision (iv_limit * nitems, UNSIGNED)
>> +                > compare_precision))
>> +          prologue_cnt += (min_max_cost + minus_cost);
> 
> I think we should instead split this condition out into a subroutine
> that both this function and vect_set_loop_condition_partial_vectors
> can use.  It would take the loop_vec_info and rgroup_controls as
> arguments.

OK, will split it in v5.

BR,
Kewen

Reply via email to