On Wed, Jul 27, 2011 at 11:59 PM, Sandra Loosemore
<san...@codesourcery.com> wrote:
> Consider this bit of code:
>
> #include <math.h>
>
> extern double a[20];
>
> double test1 (int n)
> {
>  double accum = 0.0;
>  int i;
>
>  for (i=0; i<n; i++) { accum -= a[i]; }
>  accum = fabs (accum);
>  return accum;
> }
>
> which is compiled for MIPS using
>
> mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S abstest.c
>
> With a GCC 4.6 compiler, this produces:
> ...
> .L3:
>        mtc1    $3,$f2
>        ldc1    $f0,0($5)
>        addiu   $5,$5,8
>        mtc1    $2,$f3
>        sub.d   $f2,$f2,$f0
>        mfc1    $3,$f2
>        bne     $5,$4,.L3
>        mfc1    $2,$f3
>
>        ext     $5,$2,0,31
>        move    $4,$3
> .L2:
>        mtc1    $4,$f0
>        j       $31
>        mtc1    $5,$f1
> ...
>
> This is terrible code, with all that pointless register-shuffling inside the
> loop -- what's gone wrong?  Well, the bit-twiddling expansion of "fabs"
> produced by optabs.c uses subreg expressions, and on MIPS
> CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer
> operations.  And, when IRA sees that, it decides it cannot alloc "accum" to
> a FP reg at all, even if it obviously makes sense to put it there for the
> rest of its lifetime.
>
> On mainline trunk, things are even worse as it's spilling to memory, not
> just shuffling between registers:
>
> .L3:
>        ldc1    $f0,0($2)
>        addiu   $2,$2,8
>        sub.d   $f2,$f2,$f0
>        bne     $2,$3,.L3
>        sdc1    $f2,0($sp)
>
>        lw      $2,0($sp)
>        ext     $3,$2,0,31
>        lw      $2,4($sp)
> .L2:
>        sw      $2,4($sp)
>        sw      $3,0($sp)
>        lw      $3,4($sp)
>        lw      $2,0($sp)
>        addiu   $sp,$sp,8
>        mtc1    $3,$f0
>        j       $31
>        mtc1    $2,$f1
>
> I've been experimenting with a patch to the MIPS backend to add
> define_insn_and_split patterns for floating-point abs -- the idea is to
> attach some constraints to the insns to tell IRA it needs a GP reg for these
> operations, so it can apply its usual cost analysis and reload logic instead
> of giving up.  Then the split to introduce the subreg expansion happens
> after reload when we already have the right register class.  This seems to
> work well enough on 4.6; for this particular example, I'm getting:
>
> .L3:
>        ldc1    $f2,0($2)
>        addiu   $2,$2,8
>        bne     $2,$4,.L3
>        sub.d   $f0,$f0,$f2
>
>        mfc1    $2,$f1
>        ext     $2,$2,0,31
>        j       $31
>        mtc1    $2,$f1
>
> However, the same patch on mainline still gives spills to memory.  :-(
>
> So, here's my question.  Is it worthwhile for me to continue this approach
> of trying to make the MIPS backend smarter?  Or is the way IRA deals with
> CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of fixing in a
> target-independent way?  And/or is there some other regression in IRA on
> mainline that's causing it to spill to memory when it didn't in 4.6?
>
> BTW, the unary "neg" operator has the same problem as "abs" on MIPS; can't
> use the hardware instruction because it does the wrong thing with NaNs, and
> can't twiddle the sign bit directly in a FP register.  With both abs/neg now
> generating unnecessary memory spills, this seems like a fairly important
> performance regression....
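
For readers following along, here is a rough C-level sketch of the sign-bit twiddling that the quoted message describes for "abs" and "neg". It is only an illustration under stated assumptions: it assumes a 64-bit IEEE 754 double, the helper names (double_bits, bits_double, fabs_bits, neg_bits) are made up for this sketch, and it is not the code that optabs.c actually emits. Clearing the top bit corresponds to the "ext $2,$2,0,31" applied to the high word in the listings above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: double is a 64-bit IEEE 754 value. */
static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

/* Bit 63 of the 64-bit pattern is the sign bit. */
#define SIGN_MASK (UINT64_C(1) << 63)

/* Hypothetical helper: view the bits of a double as an integer. */
static uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Hypothetical helper: view an integer bit pattern as a double. */
static double bits_double(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* "abs" as an integer operation: clear the sign bit, touch nothing else. */
static double fabs_bits(double x)
{
    return bits_double(double_bits(x) & ~SIGN_MASK);
}

/* "neg" as an integer operation: flip the sign bit, touch nothing else. */
static double neg_bits(double x)
{
    return bits_double(double_bits(x) ^ SIGN_MASK);
}
```

Because both helpers change only the sign bit, every other bit of the input, including a NaN payload, passes through unchanged. The cost is the one discussed in the message: the value has to be visible as an integer at that point, which on MIPS means moving it out of the FP registers.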

It sounds like IRA would benefit from properly split live ranges here.
You could try to make the fabs expansion in optabs use a new pseudo
(well, and hope that survives ...)

Richard.

> -Sandra
