Re: Unnecessary PRE optimization

Xinliang David Li Wed, 23 Dec 2009 11:41:31 -0800

A similar situation arises outside loops as well. PRE commons the
address computation without knowing that the target has advanced
addressing modes, which results in an unnecessary address computation
instruction.  The forward substitution code uses local heuristics and
looks at each use individually -- it does not know whether the
propagation will happen for all uses and therefore expose a DCE
opportunity -- so a precise cost estimate is not available. Even so,
for such cases, a simple change of 'gain > 0' into 'gain >= 0' in
should_replace_address_p can do the job.
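
To make the effect of that one-character change concrete, here is a
minimal sketch. It is not the fwprop.c code: struct addr_costs, the
cost numbers, and both helper names are hypothetical, and the real
costs come from the target. It only shows that the two tests disagree
in exactly one case, a tie between the old and the new address cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost pair for one use: the address as it is now (a
   plain register holding the commoned value) versus the address after
   the computation has been substituted back into the memory operand.  */
struct addr_costs {
    int old_cost;
    int new_cost;
};

/* Gain is positive when the substituted address is cheaper.  */
static int address_gain(struct addr_costs c)
{
    assert(c.old_cost >= 0 && c.new_cost >= 0);
    return c.old_cost - c.new_cost;
}

/* Strict test: substitute only when the new address is cheaper.  */
static bool replace_if_cheaper(struct addr_costs c)
{
    return address_gain(c) > 0;
}

/* Relaxed test: also substitute on a tie, betting that the commoned
   computation becomes dead once every use has been rewritten.  */
static bool replace_if_not_worse(struct addr_costs c)
{
    return address_gain(c) >= 0;
}
```

With equal costs the strict test keeps the separate address
computation and the relaxed test folds it into the memory operand;
for any other cost pair the two agree.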

For the LIM case discussed in this thread, it is trickier to estimate
the cost of forward substitution without knowing the register pressure
-- forward prop MAY increase the live range of the propagated value
(RHS), even though in this case it does not, and it actually shrinks
the LR of the LHS temps, thus reducing register pressure overall.  I
submitted a live range shrinking (LRS) patch a while back, but it
was not accepted.  This address computation propagation can easily be
implemented in the LRS pass with precise knowledge of the change in
register pressure.
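
As a rough illustration of the pressure argument, here is a toy model.
It is not taken from the LRS patch: the interval representation, the
program points, and all names are assumptions made for the example. A
base pointer p is live across the whole region, and two temps t1 and
t2 hold addresses computed from p. Propagating those computations into
their uses removes the temps without lengthening p's range.

```c
#include <assert.h>
#include <stddef.h>

/* A value is treated as live from its definition point through its
   last use, inclusive.  Points are just instruction indices.  */
struct live_range {
    int def;
    int last_use;
};

/* Maximum number of simultaneously live values over points
   0 .. npoints-1.  Quadratic, which is fine for a toy example.  */
static int max_pressure(const struct live_range *r, size_t n, int npoints)
{
    int max = 0;
    for (int p = 0; p < npoints; p++) {
        int live = 0;
        for (size_t i = 0; i < n; i++) {
            assert(r[i].def <= r[i].last_use);
            if (r[i].def <= p && p <= r[i].last_use)
                live++;
        }
        if (live > max)
            max = live;
    }
    return max;
}

/* Before propagation: p, t1 = p + 4, t2 = p + 8 (hypothetical).  */
static const struct live_range before_prop[] = {
    {0, 9},   /* p  */
    {1, 8},   /* t1 */
    {2, 8},   /* t2 */
};

/* After propagation: the uses address memory through p directly, so
   only p remains, and its range is unchanged.  */
static const struct live_range after_prop[] = {
    {0, 9},   /* p  */
};
```

Calling max_pressure on before_prop and after_prop over 10 points
gives 3 and 1 respectively.  If p had otherwise died at point 1, the
propagation would have stretched its range instead, which is the MAY
case above.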

In general it will be tricky for later passes to clean up the mess.
The fundamental problem is that the address computation is exposed to
PRE prematurely (for a given target) at the GIMPLE level. In this case,
if the INDIRECT_REFs were expressed as MEM_REFs, such problems might be
avoided.  A similar issue (for ARM) is reported in bug 40956.

Thanks,

David

On Wed, Dec 23, 2009 at 10:06 AM, Paolo Bonzini <bonz...@gnu.org> wrote:
>
> On 12/23/2009 06:47 PM, H.J. Lu wrote:
>>
>> On Wed, Dec 23, 2009 at 8:41 AM, Paolo Bonzini<bonz...@gnu.org>  wrote:
>>>
>>> On 12/23/2009 04:19 PM, Bingfeng Mei wrote:
>>>>
>>>> It seems that just commenting out this check in fwprop.c should work.
>>>
>>> Yes, but it would pessimize x86.
>>>
>>
>> Is there a bug open for x86? Can't we make it target dependent, something
>> like
>>
>>  /* Do not propagate loop invariant definitions inside the loop.  */
>>  if (targetm.foobar
>>     &&  DF_REF_BB (def)->loop_father != DF_REF_BB (use)->loop_father)
>>    return;
>
> I'll open a bug.  The solution is to actually understand what the address 
> costs are on x86 (apparently it's not true that the more complex addressing 
> modes are always better, probably because of instruction sizes), not to add a 
> target macro.
>
> Paolo
