Hi Bin,

On Fri, Sep 04, 2020 at 04:27:32PM +0800, Bin.Cheng wrote:
> On Fri, Sep 4, 2020 at 6:37 AM Segher Boessenkool
> <seg...@kernel.crashing.org> wrote:
> > It should have cost, certainly, but not address_cost I think.  The total
> > cost of an ldu should be a tiny bit less than that of ld + that of addi;
> > the address_cost of ldu should be the same as that of ld.
> Hi Segher,
> In simple cases, yes, and it is also the (rough) idea of modeling
> auto-inc addressing mode in ivopts, however, things are different if
> loop gets complicated.

The address_cost function is used for many other things, not just ivopts,
so this shouldn't be done there.  That is all :-)

> Considering the case choosing 10 auto-inc
> addressing_mode/candidate vs. [base_x + iv_index].  The latter only
> needs one add instruction, while the former needs 10 embedded auto-inc
> operations.

Yeah.

> Another issue is register pressure, choosing auto-inc candidates could
> result in more IV, while choosing IV_index results in one IV (and more
> Base pointers), however, spilling base pointer (which is loop
> invariant) is usually cheaper than IV.
> Another issue is auto-inc candidates probably lead to more bloated
> setup code in the preheader BB, due to problems in expression
> canonicalization, CSE, etc..
>
> So it's not that easy to answer the question for complicated cases.
> As for simple cases, the current model works fine with auto-inc
> (somehow) preferred.

Right, I wasn't saying that at all, sorry if I confused things.

Thanks,

Segher