Hello,
I looked at an inefficient code sequence for a simple program using GCC's picochip port (not yet submitted to mainline). Basically, a program like

long carray[10];
void fn (long c, int i)
{
 carray[i] = c;
}

produces good assembly code. But if I were to do

struct complex16
{
int re,im;
};

struct complex16 carray[10];

void fn (struct complex16 c, int i)
{
 carray[i] = c;
}

GCC generates poor code. It has an extra save and restore of the frame pointer, even though we don't use the frame.

I dug a bit further and found that the get_frame_size() call returns 4 in this case, and hence the port's prologue generation code emits the frame-pointer update.

It seems to me that each element of the struct is copied to the stack from the parameter registers, and that the stack copy is then used in the function. This is the RTL I see just after expansion to RTL:

(insn 6 2 7 2 (set (reg:HI 26)
       (reg:HI 0 R0 [ c ])) -1 (nil)
   (nil))

(insn 7 6 10 2 (set (reg:HI 27)
       (reg:HI 1 R1 [ c+2 ])) -1 (nil)
   (nil))

(insn 10 7 8 2 (set (reg/v:HI 28 [ i ])
       (reg:HI 2 R2 [ i ])) -1 (nil)
   (nil))

(insn 8 10 9 2 (set (mem/s/c:HI (reg/f:HI 21 virtual-stack-vars) [3 c+0 S2 A16])
       (reg:HI 26)) -1 (nil)
   (nil))

(insn 9 8 11 2 (set (mem/s/c:HI (plus:HI (reg/f:HI 21 virtual-stack-vars)
               (const_int 2 [0x2])) [3 c+2 S2 A16])
       (reg:HI 27)) -1 (nil)
   (nil))

Note that the parameter is written to the frame in the last two instructions above. I am guessing this is the reason get_frame_size() returns 4 later on, even though the actual save of the struct parameter on the stack is eliminated in later optimization phases (CSE and DCE, I believe).

Why does the compiler do this? I vaguely remember x86 storing all parameter values on the stack. Is that the reason for this behaviour? Is there anything I can do in the port to get around this problem?

Note: In our port, "int" is 16 bits and "long" is 32 bits.
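
As a sanity check on the numbers, here is a small host-side sketch of the struct layout. It uses int16_t to stand in for the port's 16-bit "int"; that mapping and the names complex16_model and complex16_model_size are assumptions for illustration only, not anything from the port. The 4-byte size would match the value returned by get_frame_size(), and the offset of 2 for "im" would match the c+2 slot in the RTL above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of struct complex16. int16_t stands in for the
   port's 16-bit "int" (an assumption made for illustration). */
struct complex16_model
{
  int16_t re, im;
};

/* Two 16-bit members with no padding: 4 bytes in total, the same
   number that get_frame_size() reports for the spilled parameter. */
static_assert (sizeof (struct complex16_model) == 4,
               "modelled struct should occupy 4 bytes");

/* "im" sits 2 bytes into the struct, like the c+2 frame slot. */
static_assert (offsetof (struct complex16_model, im) == 2,
               "im should be at byte offset 2");

/* Hypothetical helper: size in bytes of the modelled parameter. */
static size_t complex16_model_size (void)
{
  return sizeof (struct complex16_model);
}
```

This only models the layout on a host compiler; it says nothing about why the parameter gets a frame slot in the first place.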

Thanks in advance,

Regards
Hari
