Re: [PATCH ARM]Refine scaled address expression on ARM

Bin.Cheng Fri, 29 Nov 2013 10:04:18 -0800

On Sat, Nov 30, 2013 at 12:34 AM, Richard Earnshaw <rearn...@arm.com> wrote:
> On 29/11/13 11:46, Yufeng Zhang wrote:
>> On 11/29/13 07:52, Bin.Cheng wrote:
>>> On second thought, I think we should not re-associate
>>> addresses during expansion, because we lack context information.
>>> Take base + scaled_index + offset as an example in PR57540: we just
>>> don't know whether "base+offset" is loop invariant from either the
>>> backend or the RTL expander.
>>
>> I'm getting less convinced by re-associating base with offset
>> unconditionally.  One counter example is
>>
>> typedef int arr_1[20];
>> void foo (arr_1 a1, int i)
>> {
>>    a1[i+10] = 1;
>> }
>>
>> I'm experimenting with a patch to get the immediate offset in the above
>> example to be the last addend in the address computation (as mentioned in
>> http://gcc.gnu.org/ml/gcc/2013-11/msg00581.html), aiming to get the
>> following code-gen:
>>
>>          add     r1, r0, r1, asl #2
>>          mov     r3, #1
>>          str     r3, [r1, #40]
>>
>> With your patch applied, the effort will be reverted to
>>
>>          add     r0, r0, #40
>>          mov     r3, #1
>>          str     r3, [r0, r1, asl #2]
>>
>
> And another one is:
>
>
>
> typedef int arr_1[20];
> void foo (arr_1 a1, int i)
> {
>    a1[i+10] = 1;
>    a1[i+11] = 1;
> }
>
> This should compile to:
>
>         add     r1, r0, r1, asl #2
>         mov     r3, #1
>         str     r3, [r1, #40]
>         str     r3, [r1, #44]
>
> Which on Thumb2 should then collapse to:
>
>         add     r1, r0, r1, asl #2
>         mov     r3, #1
>         strd    r3, r3, [r1, #40]
>
> With your patch I don't see any chance of being able to get to this
> situation.
>
> (BTW, we currently generate:
>
>         mov     r3, #1
>         add     r1, r1, #10
>         add     r2, r0, r1, asl #2
>         str     r3, [r0, r1, asl #2]
>         str     r3, [r2, #4]
>
> which is insane).
The two memory references share common sub-expressions. SLSR is
designed to handle this case, and it should be improved to do so.
The original patch is only meant to pick up cases not handled by SLSR
and IVOPT.  Anyway, as you saw from the previous message, doing the
refactoring during expand is not good practice: without enough
CSE/invariant information, there will always be both caught and missed
opportunities.  That's why I think another lowering pass on gimple,
besides SLSR/IVOPT, might be a win.
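
As a rough illustration of the shared sub-expression (this is not GCC code; the helper names are hypothetical and a 4-byte int is assumed), the address arithmetic for a1[i+10] and a1[i+11] can be sketched in C. Unsigned arithmetic is used so the two forms agree even on wrap-around:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only; assumes sizeof(int) == 4. */

/* Offset folded into the index first: base + ((i + k) << 2), as in the
   "add r1, r1, #10" sequence quoted above. */
static uintptr_t addr_index_first(uintptr_t base, uintptr_t i, uintptr_t k)
{
    return base + ((i + k) << 2);
}

/* Shared part base + (i << 2) is computed once by the caller; the
   immediate offset is added last, as in "str r3, [r1, #40]". */
static uintptr_t addr_offset_last(uintptr_t shared, uintptr_t k)
{
    return shared + (k << 2);
}

/* Check that both stores in foo() can reuse one shared sub-expression. */
static void check_shared_base(uintptr_t base, uintptr_t i)
{
    uintptr_t shared = base + (i << 2);  /* common to a1[i+10] and a1[i+11] */
    assert(addr_offset_last(shared, 10) == addr_index_first(base, i, 10));
    assert(addr_offset_last(shared, 11) == addr_index_first(base, i, 11));
}
```

Calling check_shared_base(0x1000, 3), for example, exercises both forms. Only the offset-last form exposes base + (i << 2) as a single value that both stores can reuse.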


Thanks,
bin

>
> I think I see where you're coming from on the original testcase, but I
> think you're trying to solve the wrong problem.   In your test case the
> base is an eliminable register, which is likely to be replaced with an
> offset expression during register allocation.  The problem then, I
> think, is that the cost of these virtual registers is treated the same
> as any other pseudo register, when it may really have the cost of a PLUS
> expression.
>
> Perhaps the cost of using an eliminable register should be raised in
> rtx_costs() (treating them as equivalent to (PLUS (reg) (CONST_INT
> (TBD)))), so that loop optimizations will try to hoist suitable
> sub-expressions out of the loop and replace them with real pseudos.
>
> R.
>



-- 
Best Regards.
