Hi,

On Wed, 6 Jul 2011, Richard Sandiford wrote:

> But what I mean is: even with your starting loop, I'm comparing the
> transformation that this code does with the alternative, but rejected,
> transformation of simply treating both addresses as separate ivs.  I.e.:
> 
>   i=0; i < end; i+=1
>     p + i * step;
>     q + i * step;
> -->
>   n=p; n < p+end; n+=step
>     n;
>     (q-p) + n;
> 
> vs.
> 
>   i=0; i < end; i+=1 
>     p + i * step;
>     q + i * step;
> -->
>   n=p; n < p+end; n+=step, m+=step
>     n;
>     m;
> 
> It seems like, with this extra code, we're going out of our way to do 
> the first, "clever", transformation, instead of doing the second, even 
> though both seem to have the same cost in terms of loop operations and 
> live registers.  So what I'm not sure of is when the first 
> transformation is a win over the second.

It's a strict win only on targets where the addition in "(q-p) + n" can be 
hidden in address generation or combined with other arithmetic, or, on 
all targets, if (q-p) is a constant.
Otherwise it merely has the same number of adds and live variables.  And 
if it weren't for deficiencies in downstream optimizers (not hoisting the 
subtraction), the first variant would be at least no worse than the second 
on targets without autoinc.  On targets with autoinc the second variant is 
obviously better (if the autoinc really comes for free).
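
To make the comparison concrete, here is a rough C sketch of the three 
loop shapes.  The function names are made up for illustration, the 
subtraction is hoisted by hand, and p and q are assumed to point into the 
same array (so that q - p is well defined in C).  It only illustrates the 
shapes being compared; it is not what ivopts actually emits.

```c
#include <stddef.h>

/* Original form: one counter i, both addresses computed from it. */
long sum_indexed(const int *p, const int *q, size_t end, size_t step)
{
    long s = 0;
    for (size_t i = 0; i < end; i++)
        s += p[i * step] + q[i * step];
    return s;
}

/* First variant: a single pointer iv n; the second address is formed
   as (q - p) + n.  Assumes p and q point into the same array and that
   p + end * step stays within (or one past) that array. */
long sum_one_iv(const int *p, const int *q, size_t end, size_t step)
{
    long s = 0;
    ptrdiff_t diff = q - p;              /* loop invariant, hoisted by hand */
    const int *limit = p + end * step;
    for (const int *n = p; n < limit; n += step)
        s += *n + *(n + diff);           /* extra add, possibly folded into
                                            a reg+reg address */
    return s;
}

/* Second variant: two separate pointer ivs n and m, each bumped by
   step.  Same array-bounds assumptions as above, for both pointers. */
long sum_two_ivs(const int *p, const int *q, size_t end, size_t step)
{
    long s = 0;
    const int *limit = p + end * step;
    const int *m = q;
    for (const int *n = p; n < limit; n += step, m += step)
        s += *n + *m;                    /* both bumps are candidates for
                                            autoinc */
    return s;
}
```

Both rewritten loops keep two values live besides the limit (n and diff 
versus n and m) and do two adds per iteration; the difference is only in 
whether the target can absorb the add into the address or into an 
autoincrement.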

So, sometimes the first is better, sometimes just the same, sometimes 
worse :-)  Probably the cost function in ivopts could use some 
improvements, taking at least autoinc into account.  The valid address 
forms (i.e. whether reg+reg is as cheap as reg) should be taken into 
account already.


Ciao,
Michael.
