On Wed, Jul 27, 2011 at 11:59 PM, Sandra Loosemore <san...@codesourcery.com> wrote: > Consider this bit of code: > > extern double a[20]; > > double test1 (int n) > { > double accum = 0.0; > int i; > > for (i=0; i<n; i++) { accum -= a[i]; } > accum = fabs (accum); > return accum; > } > > which is compiled for MIPS using > > mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops -march=74kf1_1 -S > abstest.c > > With a GCC 4.6 compiler, this produces: > ... > .L3: > mtc1 $3,$f2 > ldc1 $f0,0($5) > addiu $5,$5,8 > mtc1 $2,$f3 > sub.d $f2,$f2,$f0 > mfc1 $3,$f2 > bne $5,$4,.L3 > mfc1 $2,$f3 > > ext $5,$2,0,31 > move $4,$3 > .L2: > mtc1 $4,$f0 > j $31 > mtc1 $5,$f1 > ... > > This is terrible code, with all that pointless register-shuffling inside the > loop -- what's gone wrong? Well, the bit-twiddling expansion of "fabs" > produced by optabs.c uses subreg expressions, and on MIPS > CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for integer > operations. And, when IRA sees that, it decides it cannot alloc "accum" to > a FP reg at all, even if it obviously makes sense to put it there for the > rest of its lifetime. > > On mainline trunk, things are even worse as it's spilling to memory, not > just shuffling between registers: > > .L3: > ldc1 $f0,0($2) > addiu $2,$2,8 > sub.d $f2,$f2,$f0 > bne $2,$3,.L3 > sdc1 $f2,0($sp) > > lw $2,0($sp) > ext $3,$2,0,31 > lw $2,4($sp) > .L2: > sw $2,4($sp) > sw $3,0($sp) > lw $3,4($sp) > lw $2,0($sp) > addiu $sp,$sp,8 > mtc1 $3,$f0 > j $31 > mtc1 $2,$f1 > > I've been experimenting with a patch to the MIPS backend to add > define_insn_and_split patterns for floating-point abs -- the idea is to > attach some constraints to the insns to tell IRA it needs a GP reg for these > operations, so it can apply its usual cost analysis and reload logic instead > of giving up. Then the split to introduce the subreg expansion happens > after reload when we already have the right register class. This seems to > work well enough on 4.6; for this particular example, I'm getting: > > .L3: > ldc1 $f2,0($2) > addiu $2,$2,8 > bne $2,$4,.L3 > sub.d $f0,$f0,$f2 > > mfc1 $2,$f1 > ext $2,$2,0,31 > j $31 > mtc1 $2,$f1 > > However, same patch on mainline is still giving spills to memory. :-( > > So, here's my question. Is it worthwhile for me to continue this approach > of trying to make the MIPS backend smarter? Or is the way IRA deals with > CANNOT_CHANGE_MODE_CLASS fundamentally broken and in need of fixing in a > target-inspecific way? And/or is there some other regression in IRA on > mainline that's causing it to spill to memory when it didn't used to in 4.6? > > BTW, the unary "neg" operator has the same problem as "abs" on MIPS; can't > use the hardware instruction because it does the wrong thing with NaNs, and > can't twiddle the sign bit directly in a FP register. With both abs/neg now > generating unnecessary memory spills, this seems like a fairly important > performance regression....
It sounds like IRA would benefit from properly split live-ranges here. You could try to make the fabs optabs magic make sure to use a new pseudo (well, and hope that survives ...) Richard. > -Sandra > >