Jiong Wang writes:

> Current IRA still uses both target macros in a few places.
>
> Tell IRA to use the order we defined rather than its own cost
> calculation: allocate caller-saved registers first, then callee-saved.
>
> This is especially useful for LR/x30, as it is free to allocate and is
> purely caller saved when used in a leaf function.
>
> I haven't noticed a significant impact on benchmarks, but grepping for
> keywords like "Spilling" and "Push.*spill" in the ira rtl dump shows
> smaller numbers.
>
> OK for trunk?
>
> 2015-05-19  Jiong Wang  <jiong.w...@arm.com>
>
> gcc/
>   PR 63521
>   * config/aarch64/aarch64.h (REG_ALLOC_ORDER): Define.
>   (HONOR_REG_ALLOC_ORDER): Define.
>
> Regards,
> Jiong

Ping.

I know it's hard to notice the register allocation improvements from this
hook, as the current IRA/LRA already does register allocation quite well.

But consider the example below:

test.c
======

double dec (double, double);

int cal (int a, int b, double d, double e)
{
  double sum = dec (a, a + b);
  sum = dec (b, a - b);
  sum = dec (sum, a * b);
  return d + e + sum;
}

Although the instruction count is the same before and after this patch,
the instruction scheduling looks better afterwards: because w7 is
allocated instead of w0, there are fewer instruction dependencies.

Before Patch (-O2)
======
cal:
        stp     x29, x30, [sp, -48]!
        add     x29, sp, 0
        stp     x19, x20, [sp, 16]
        stp     d8, d9, [sp, 32]
        mov     w19, w0
        add     w0, w0, w1
        fmov    d9, d1
        mov     w20, w1
        fmov    d8, d0
        scvtf   d1, w0
        scvtf   d0, w19
        bl      dec
        scvtf   d0, w20 
        sub     w0, w19, w20
        mul     w19, w19, w20
        scvtf   d1, w0
        bl      dec
        scvtf   d1, w19
        bl      dec
        fadd    d8, d8, d9
        ldp     x19, x20, [sp, 16]
        fadd    d0, d8, d0
        ldp     d8, d9, [sp, 32]
        ldp     x29, x30, [sp], 48
        fcvtzs  w0, d0
        ret

After Patch
===========
cal:    
        stp     x29, x30, [sp, -48]!
        add     w7, w0, w1
        add     x29, sp, 0
        stp     d8, d9, [sp, 32]
        fmov    d9, d1
        fmov    d8, d0
        scvtf   d1, w7
        scvtf   d0, w0
        stp     x19, x20, [sp, 16]
        mov     w20, w1 
        mov     w19, w0
        bl      dec
        scvtf   d0, w20
        sub     w7, w19, w20
        mul     w19, w19, w20
        scvtf   d1, w7
        bl      dec 
        scvtf   d1, w19
        bl      dec
        fadd    d8, d8, d9
        ldp     x19, x20, [sp, 16]
        fadd    d0, d8, d0
        ldp     d8, d9, [sp, 32]
        ldp     x29, x30, [sp], 48
        fcvtzs  w0, d0
        ret
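
In case it helps review, the two macros take roughly the following shape
in a target header.  The list below is only a shortened, illustrative
sketch (a real REG_ALLOC_ORDER has to enumerate every hard register,
FP/SIMD registers included); it is not the hunk from the patch:

aarch64.h (sketch)
==================

/* Illustrative, incomplete ordering: caller-saved general registers,
   including LR/x30 (cheap to use in leaf functions), are listed ahead
   of the callee-saved ones so the allocator prefers them.  */
#define REG_ALLOC_ORDER                                         \
  {                                                             \
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,       \
    16, 17, 18, 30,   /* caller saved, 30 is LR */              \
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28,                     \
    29, 31            /* callee saved, then FP and SP */        \
  }

/* Make IRA honor the order above instead of its own cost-based
   ordering.  */
#define HONOR_REG_ALLOC_ORDER 1
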
-- 
Regards,
Jiong
