On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote:
> > Great idea!  With explicitly specified -funroll-loops, it's bootstrapped
> > but the regression testing did show one failure (the only one):
> >
> >   PASS->FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms "SMS succeeded" 1
> >
> > It exposes two issues:
> >
> > 1) Currently address_cost hook on rs6000 always return zero, but at least
> > from Power7, pre_inc/pre_dec kind instructions are cracked, it means we
> > have to take the address update into account (scalar normal operation).
> > Since IVOPTs reduces the cost_step for ainc candidates, it makes us prefer
> > ainc candidates. In this case, the cand/group cost is -4 (minus cost_step),
> > with scaling up, the off becomes much. With one simple hack on for pre_inc/
> > pre_dec in rs6000 address_cost, the case passed. It should be handled in
> > one separated issue.
> >
> > 2) This case makes me think we should exclude ainc candidates in function
> > mark_reg_offset_candidates. The justification is that: ainc candidate
> > handles step update itself and when we calculate the cost for it against
> > its ainc_use, the cost_step has been reduced. When unrolling happens,
> > the ainc computations are replicated and it doesn't save step updates
> > like normal reg_offset_p candidates.
> Though auto-inc candidate embeds stepping operation into memory
> access, we might want to avoid it in case of unroll if there are many
> sequences of memory accesses, and if the unroll factor is big.  The
> rationale is embedded stepping is a u-arch operation and does have its
> cost.

Forgive me for barging in here (though the context is powerpc, the
dialogue and the patch seem to be generic ivopts), but that's not a
general remark I hope, about auto-inc (always) having a cost?

For some architectures, auto-inc *is* free, as free as
register-indirect, so the more auto-inc use, the better.  All this
should be reflected by the address-cost, IMHO, and not hardcoded
into ivopts.

brgds, H-P
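
As an illustration of the point about keeping the decision in the address cost, here is a minimal toy sketch in C. It is not GCC code: `addr_kind`, `target_costs`, and the three functions are hypothetical names made up for this example, and the cost model (one explicit pointer update per unrolled body for reg+offset, none for auto-inc) is a simplifying assumption. The only thing it is meant to show is that a generic comparison can reach opposite answers for different targets purely from the per-target cost table, without any hardcoded bias against auto-inc.

```c
#include <assert.h>

/* Hypothetical addressing kinds; not GCC's internal representation. */
enum addr_kind { ADDR_REG_INDIRECT, ADDR_REG_OFFSET, ADDR_AUTO_INC };

/* Hypothetical per-target cost table, standing in for what a target's
   address-cost hook would report. */
struct target_costs {
  int addr_cost[3]; /* indexed by enum addr_kind */
  int add_cost;     /* cost of one explicit pointer update */
};

/* Cost of the memory accesses in a loop body unrolled `unroll` times
   when every access uses auto-inc (the step is embedded in each access). */
int autoinc_body_cost(const struct target_costs *t, int unroll) {
  assert(unroll >= 1);
  return unroll * t->addr_cost[ADDR_AUTO_INC];
}

/* Same body using reg+offset accesses plus a single explicit pointer
   update at the end of the unrolled body (assumed model). */
int offset_body_cost(const struct target_costs *t, int unroll) {
  assert(unroll >= 1);
  return unroll * t->addr_cost[ADDR_REG_OFFSET] + t->add_cost;
}

/* Generic decision: nonzero if auto-inc is no more expensive than
   reg+offset under the target's own costs. */
int prefer_autoinc(const struct target_costs *t, int unroll) {
  return autoinc_body_cost(t, unroll) <= offset_body_cost(t, unroll);
}
```

With a table where `addr_cost[ADDR_AUTO_INC]` is 0 (auto-inc as free as register-indirect), `prefer_autoinc` keeps choosing auto-inc at any unroll factor. With a table where it is 1 (say, a target that cracks the instruction), the same function switches to reg+offset once the unroll factor exceeds the cost of the single explicit update.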