Hi Hans,

on 2020/9/6 10:47 AM, Hans-Peter Nilsson wrote:
> On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote:
>>> Great idea!  With explicitly specified -funroll-loops, it's bootstrapped
>>> but the regression testing did show one failure (the only one):
>>>
>>>   PASS->FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms "SMS succeeded" 1
>>>
>>> It exposes two issues:
>>>
>>> 1) Currently address_cost hook on rs6000 always return zero, but at least
>>> from Power7, pre_inc/pre_dec kind instructions are cracked, it means we
>>> have to take the address update into account (scalar normal operation).
>>> Since IVOPTs reduces the cost_step for ainc candidates, it makes us prefer
>>> ainc candidates.  In this case, the cand/group cost is -4 (minus cost_step),
>>> with scaling up, the off becomes much.  With one simple hack on for pre_inc/
>>> pre_dec in rs6000 address_cost, the case passed.  It should be handled in
>>> one separated issue.
>>>
>>> 2) This case makes me think we should exclude ainc candidates in function
>>> mark_reg_offset_candidates.  The justification is that: ainc candidate
>>> handles step update itself and when we calculate the cost for it against
>>> its ainc_use, the cost_step has been reduced. When unrolling happens,
>>> the ainc computations are replicated and it doesn't save step updates
>>> like normal reg_offset_p candidates.
>> Though auto-inc candidate embeds stepping operation into memory
>> access, we might want to avoid it in case of unroll if there are many
>> sequences of memory accesses, and if the unroll factor is big.  The
>> rationale is embedded stepping is a u-arch operation and does have its
>> cost.
> 
> Forgive me for barging in here (though the context is powerpc,
> the dialogue and the patch seems to be generic ivopts), but
> that's not a general remark I hope, about auto-inc (always)
> having a cost?
> 
> For some architectures, auto-inc *is* free, as free as
> register-indirect, so the more auto-inc use, the better.  All
> this should be reflected by the address-cost, IMHO, and not
> hardcoded into ivopts.
> 

Yeah, ivopts currently doesn't hardcode a cost for auto-inc (always);
instead it lets targets set the cost themselves through the
address_cost hook.  As the function get_address_cost_ainc shows, it
checks whether the auto-inc operations are supported and, if so,
takes the cost from the address_cost hook.
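
To illustrate the idea only, here is a minimal sketch; it is not the
actual GCC code.  The names ainc_address_cost, free_ainc_hook and
cracked_ainc_hook are hypothetical, and the cost value 4 is an assumed
number for a target where the address update is not free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a target's address_cost hook: it returns
   the cost the target assigns to an auto-inc address.  */
typedef int (*address_cost_fn) (void);

/* Assumed target where auto-inc is as cheap as register-indirect.  */
static int free_ainc_hook (void) { return 0; }

/* Assumed target where the address update costs something (the value
   4 is made up for this sketch).  */
static int cracked_ainc_hook (void) { return 4; }

/* Sketch of the decision described above: if the target supports the
   auto-inc form, store the cost the hook reports and return true;
   otherwise return false so the caller uses the normal address cost.
   Nothing here hardcodes a cost on the ivopts side.  */
static bool
ainc_address_cost (bool target_supports_ainc, address_cost_fn hook,
                   int *cost)
{
  assert (hook && cost);
  if (!target_supports_ainc)
    return false;
  *cost = hook ();
  return true;
}
```

With free_ainc_hook the stored cost is 0, so a target where auto-inc
is free only has to say so in its own hook.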

One example on Power is listed below:

Group 0:
  cand  cost    compl.  inv.expr.       inv.vars
  1     4       1       NIL;    1
  3     0       0       NIL;    NIL;
  4     0       1       NIL;    1
  5     0       1       NIL;    NIL;
  13    0       1       NIL;    NIL;
  18    -4      0       NIL;    NIL;

Cand 18 is an auto-inc candidate whose group 0/cand cost is -4
(minus step_cost).  The iv_cost of cand 18 is 5 (step_cost +
non-original iv cost).  When it is selected, the two step_cost parts
cancel out, and the remaining cost (1) is for the non-original iv.
This shows that ivopts doesn't put any hardcoded cost on this ainc
candidate.
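
The arithmetic for cand 18 can be written out as a small sketch.  The
helper name ainc_net_cost is hypothetical and not part of GCC; the
numbers are the ones from the dump above (step_cost 4, non-original
iv cost 1).

```c
#include <assert.h>

/* Hypothetical helper: net cost of selecting an auto-inc candidate
   for one group, following the accounting described above.  */
static int
ainc_net_cost (int step_cost, int non_original_iv_cost)
{
  assert (step_cost >= 0);

  /* Group/cand cost: the step is folded into the memory access, so
     step_cost is subtracted here.  */
  int group_cand_cost = -step_cost;

  /* iv_cost of the candidate itself still includes step_cost.  */
  int iv_cost = step_cost + non_original_iv_cost;

  /* The two step_cost terms cancel out.  */
  return group_cand_cost + iv_cost;
}
```

For cand 18, ainc_net_cost (4, 1) is -4 + 5 = 1, i.e. only the
non-original iv cost remains.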

I guess the misunderstanding came from some of the discussion
above.  Sorry if some of my previous comments misled you.

BR,
Kewen