FLOAT_EXPR and FIX_TRUNC_EXPR are both expensive, so they really
should be added to this list.

Take the following code:
void f(float a, int *b)
{
  int i;
  for (i = 0; i < 1000; i++)
    *b = a;  /* float-to-int cast (FIX_TRUNC_EXPR) is loop invariant */
}

Compiling with -O1 on PPC gives:
_f:
        fctiwz f0,f1
        addi r2,r1,-4
        stfiwx f0,0,r2
        lwz r2,-4(r1)
        li r0,1000
        mtctr r0
L2:
        stw r2,-8(r1)
        bdnz L2
        stw r2,0(r4)
        blr

Notice the loop and the extra store (to the stack). If we change LIM to
pull out the cast (FIX_TRUNC_EXPR), we get:
_f:
        fctiwz f0,f1
        stfiwx f0,0,r4
        blr

This is much better, and there is no loop. PRE already does this at -O2,
but it would be nice if LIM did it too. I should mention that PRE does
not really care about cost and always pulls out the calculation, so maybe
LIM should be doing that instead.
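
For illustration, here is a rough source-level sketch of the transformation I would like LIM to perform. The names f_loop, f_hoisted and tmp are made up for this example and do not come from GCC. The real pass works on the tree representation, not on source, so this only shows the intended effect:

```c
/* Original form: the float-to-int conversion sits inside the loop. */
void f_loop(float a, int *b)
{
  int i;
  for (i = 0; i < 1000; i++)
    *b = a;        /* conversion written in the loop body */
}

/* Hand-hoisted form: the invariant conversion is done once, up front. */
void f_hoisted(float a, int *b)
{
  int i;
  int tmp = a;     /* hypothetical temporary holding the converted value */
  for (i = 0; i < 1000; i++)
    *b = tmp;      /* only an invariant store remains in the loop */
}
```

Both functions store the same value to *b. Once the conversion is out of the loop, the remaining stores are redundant, which is presumably why the second assembly listing above has no loop at all.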

Thanks,
Andrew Pinski
