Michael Matz <m...@suse.de> writes:
> On Wed, 6 Jul 2011, Richard Sandiford wrote:
>> The individual difference_cost and add_cost seem reasonable (4 in each 
>> case). I don't understand the reasoning behind the division though.  Is 
>> the idea that this should be hoisted?
>
> Yes, it should be hoisted outside the loop.  The difference is between two 
> loop-invariant values (the bases), and hence is also loop-invariant.  Some 
> tree optimizer should do this already, possibly the casts confuse us.

OK, thanks, suspected as much.

>> If so, then:
>> 
>> (a) That doesn't happen at the tree level.  The subtraction is still inside
>>     the loop at RTL generation time.
>> 
>> (b) What's the advantage of introducing a new hoisted subtraction that
>>     is going to be live throughout the loop, and then adding another IV
>>     to it inside the loop, over using the original IV and incrementing it
>>     in the normal way?
>
> It can reduce address complexity for one of the addresses.  E.g. given:
>
>  i=0; i < end; i+=4 
>    p[i];
>    q[i];
>
> -->
>
>  n=p; n < p+end; n+=4
>    [n];
>    (q-p)[n];
>
> Here (q-p) is loop-invariant, and the complexity of the first address is 
> lower (no offset).  In fact the register pressure is lower by one too 
> (three instead of four, including the end/p+end bound).

But your second loop isn't what I was comparing it with.  I was comparing
it with:

n=p, m=q; n < p+end; n+=4, m+=4
  [n]
  [m]

That has the same number of registers (3) and the same number of
additions (2).  And the [m] is what we started with, so it was
actually:

  i=0; i<count; i+=1, n+=4, m+=4
    [n]
    [m]

-->

  i=0; i<count; i+=1, n+=4
    [n]
    (q-p)[n]

(We don't get rid of "i" or "count" in this case.)
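
For concreteness, here is a rough C sketch of the two loop shapes being compared. The function names are made up for illustration, the loop body (a simple sum) is just a stand-in for whatever uses the two addresses, and p and q are assumed to point into the same array so that q - p is well defined in C:

```c
#include <stddef.h>

/* Two pointer IVs, n and m, each incremented in the loop. */
long sum_two_ivs(const int *p, const int *q, size_t count)
{
    long total = 0;
    const int *n = p, *m = q;
    for (size_t i = 0; i < count; i++, n++, m++)
        total += *n + *m;          /* [n] and [m] */
    return total;
}

/* One pointer IV plus a hoisted, loop-invariant difference. */
long sum_diff_iv(const int *p, const int *q, size_t count)
{
    long total = 0;
    ptrdiff_t diff = q - p;        /* live throughout the loop */
    const int *n = p;
    for (size_t i = 0; i < count; i++, n++)
        total += n[0] + n[diff];   /* [n] and (q-p)[n] */
    return total;
}
```

Both keep "i" and "count"; the second form trades the m increment for a register+register address on the second access.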

If the target allows (q-p)[n] to be used directly as an address, and if
the target has no post-increment instruction, then it might be better.
But I think it's a loss on other targets.  It might even be a loss on
targets (like PowerPC, IIRC) that need base+index addresses to have
the "real" base first.  This sort of transformation seems to make us
lose track of which register is the base.

Richard
