FLOAT_EXPR and FIX_TRUNC_EXPR are both expensive, so they really
should be added to this list.
Take the following code:
void f(float a, int *b)
{
    int i;
    for (i = 0; i < 1000; i++)
        *b = a;   /* implicit float-to-int conversion (FIX_TRUNC_EXPR) */
}
Compiling with -O1 on PPC, we get:
_f:
    fctiwz f0,f1
    addi r2,r1,-4
    stfiwx f0,0,r2
    lwz r2,-4(r1)
    li r0,1000
    mtctr r0
L2:
    stw r2,-8(r1)
    bdnz L2
    stw r2,0(r4)
    blr
Notice the loop and the extra store (to the stack). If we change LIM to
pull out the cast (FIX_TRUNC_EXPR), we get:
_f:
    fctiwz f0,f1
    stfiwx f0,0,r4
    blr
This is much better, and there is no loop. PRE already does this at
-O2, but it would be nice if LIM did it as well. I should mention that
PRE does not really care about cost and always pulls out the
calculation, so maybe LIM should be doing that instead.
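For reference, here is roughly what pulling out the cast amounts to at
the source level. This is only a sketch to illustrate the
transformation, not what LIM emits internally; the names f_original,
f_hoisted and tmp are made up for this example and the conversion is
written as an explicit cast.

```c
/* Form from the testcase: the float-to-int conversion is inside the
   loop even though 'a' never changes. */
void f_original(float a, int *b)
{
    int i;
    for (i = 0; i < 1000; i++)
        *b = (int)a;   /* FIX_TRUNC_EXPR, loop invariant */
}

/* Hypothetical hand-hoisted form: the conversion is done once before
   the loop, so only the store is left in the body. */
void f_hoisted(float a, int *b)
{
    int tmp = (int)a;  /* converted once, outside the loop */
    int i;
    for (i = 0; i < 1000; i++)
        *b = tmp;
}
```

Both functions store the same value through b. Once the conversion is
out of the loop, the remaining store is itself invariant, which is what
lets the loop disappear in the second listing above.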
Thanks,
Andrew Pinski