IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions?

Sandra Loosemore Wed, 27 Jul 2011 15:00:53 -0700

Consider this bit of code:

#include <math.h>

extern double a[20];

double test1 (int n)
{
  double accum = 0.0;
  int i;

  for (i=0; i<n; i++) { accum -= a[i]; }
  accum = fabs (accum);
  return accum;
}

which is compiled for MIPS using

mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S abstest.c


With a GCC 4.6 compiler, this produces:
...
.L3:
        mtc1    $3,$f2
        ldc1    $f0,0($5)
        addiu   $5,$5,8
        mtc1    $2,$f3
        sub.d   $f2,$f2,$f0
        mfc1    $3,$f2
        bne     $5,$4,.L3
        mfc1    $2,$f3

        ext     $5,$2,0,31
        move    $4,$3
.L2:
        mtc1    $4,$f0
        j       $31
        mtc1    $5,$f1
...

This is terrible code, with all that pointless register-shuffling inside the loop -- what's gone wrong? Well, the bit-twiddling expansion of "fabs" produced by optabs.c uses subreg expressions, and on MIPS CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer operations. And, when IRA sees that, it decides it cannot allocate "accum" to an FP reg at all, even if it obviously makes sense to put it there for the rest of its lifetime.

On mainline trunk, things are even worse as it's spilling to memory, not just shuffling between registers:


.L3:
        ldc1    $f0,0($2)
        addiu   $2,$2,8
        sub.d   $f2,$f2,$f0
        bne     $2,$3,.L3
        sdc1    $f2,0($sp)

        lw      $2,0($sp)
        ext     $3,$2,0,31
        lw      $2,4($sp)
.L2:
        sw      $2,4($sp)
        sw      $3,0($sp)
        lw      $3,4($sp)
        lw      $2,0($sp)
        addiu   $sp,$sp,8
        mtc1    $3,$f0
        j       $31
        mtc1    $2,$f1

I've been experimenting with a patch to the MIPS backend to add define_insn_and_split patterns for floating-point abs -- the idea is to attach some constraints to the insns to tell IRA it needs a GP reg for these operations, so it can apply its usual cost analysis and reload logic instead of giving up. Then the split to introduce the subreg expansion happens after reload when we already have the right register class. This seems to work well enough on 4.6; for this particular example, I'm getting:


.L3:
        ldc1    $f2,0($2)
        addiu   $2,$2,8
        bne     $2,$4,.L3
        sub.d   $f0,$f0,$f2

        mfc1    $2,$f1
        ext     $2,$2,0,31
        j       $31
        mtc1    $2,$f1

However, the same patch on mainline is still giving spills to memory.  :-(

So, here's my question. Is it worthwhile for me to continue this approach of trying to make the MIPS backend smarter? Or is the way IRA deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of fixing in a target-independent way? And/or is there some other regression in IRA on mainline that's causing it to spill to memory when it didn't used to in 4.6?

BTW, the unary "neg" operator has the same problem as "abs" on MIPS; can't use the hardware instruction because it does the wrong thing with NaNs, and can't twiddle the sign bit directly in an FP register. With both abs/neg now generating unnecessary memory spills, this seems like a fairly important performance regression....


-Sandra
