Hi,

On Wed, 6 Jul 2011, Richard Sandiford wrote:

> But what I mean is: even with your starting loop, I'm comparing the
> transformation that this code does with the alternative, but rejected,
> transformation of simply treating both addresses as separate ivs.  I.e.:
>
>    i=0; i < end; i+=1
>      p + i * step;
>      q + i * step;
> -->
>    n=p; n < p+end; n+=step
>      n;
>      (q-p) + n;
>
> vs.
>
>    i=0; i < end; i+=1
>      p + i * step;
>      q + i * step;
> -->
>    n=p; n < p+end; n+=step, m+=step
>      n;
>      m;
>
> It seems like, with this extra code, we're going out of our way to do
> the first, "clever", transformation, instead of doing the second, even
> though both seem to have the same cost in terms of loop operations and
> live registers.  So what I'm not sure of is when the first
> transformation is a win over the second.

It's only a strict win on targets where the addition in "(q-p) + n" can be
hidden in either address generation, or combined with other arithmetic, or
on all targets if (q-p) is a constant.  Otherwise it merely has the same
number of adds and live variables.  But if it weren't for deficiencies in
downstream optimizers (not hoisting the subtraction) the first variant is
at least not worse than the second on targets without autoinc.  On targets
with autoinc obviously the second variant is better (if the autoinc really
comes for free).

So, sometimes the first is better, sometimes just the same, sometimes
worse :-)  Probably the cost function in ivopts could use some
improvements taking at least autoinc into account.  The valid address
forms (i.e. if reg+reg is as cheap as reg) should be taken into account
already.


Ciao,
Michael.
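
For concreteness, the three loop shapes discussed above can be written out by hand in C. This is only a source-level sketch, not what ivopts emits: the function names are hypothetical, the elements are assumed to be bytes, step is assumed to be greater than zero, and p and q are assumed to point into the same array so that (q-p) is well defined in C. The exit bound is written as p + end*step here so that the trip count matches the counted loop.

```c
#include <assert.h>
#include <stddef.h>

/* Starting loop: both addresses are recomputed from the counter i. */
long sum_indexed(const char *p, const char *q, size_t end, size_t step)
{
    long s = 0;
    for (size_t i = 0; i < end; i += 1)
        s += p[i * step] + q[i * step];
    return s;
}

/* First ("clever") variant: a single pointer iv n; the second address
   is formed as (q-p) + n.  The subtraction is loop invariant and is
   hoisted by hand here.  Assumes p and q point into the same array. */
long sum_one_iv(const char *p, const char *q, size_t end, size_t step)
{
    assert(step > 0);                     /* otherwise n never advances */
    long s = 0;
    ptrdiff_t diff = q - p;               /* the (q-p) term */
    const char *limit = p + end * step;
    for (const char *n = p; n < limit; n += step)
        s += *n + *(n + diff);            /* one extra add per iteration */
    return s;
}

/* Second variant: two separate pointer ivs n and m, each stepped by
   'step'.  On a target with autoincrement addressing the two
   increments could be folded into the memory accesses. */
long sum_two_ivs(const char *p, const char *q, size_t end, size_t step)
{
    assert(step > 0);
    long s = 0;
    const char *limit = p + end * step;
    const char *m = q;
    for (const char *n = p; n < limit; n += step, m += step)
        s += *n + *m;
    return s;
}
```

Both pointer variants keep two values live across the loop besides the bound: n and diff in the first, n and m in the second. They also do the same number of additions per iteration, unless the target can fold "n + diff" into a reg+reg address or fold the increments into autoincrement accesses.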