Hi,

I'm still working on a new gcc-4.5.2 backend for a private processor. I have run into a strange behavior and I cannot find its cause. In short, it seems that the dse2 pass removes insns that it should not remove (at -O2 and -O3).

Here is the code giving me headaches. It returns 0 when it should return 0x3F800000 (the hex representation of 1.0f):

    void f1(int *ret2)
    {
      *ret2 = 2;
    }

    float f2(float par1)
    {
      return par1;
    }

    void (*ff)() = f1;

    int main()
    {
      int x;
      float af;

      ff(&x);
      af = f2(1.0f);
      return *((int *)(&af));
    }

When I try to simplify this sample code further, the problem disappears, even though there is apparently no relation between the f2 call and the ff call...

I looked at the RTL dumps. Here is an extract of the interesting part, the one dealing with *((int *)(&af)), with f2 already inlined (-O2, -O3):

* in the expand dump:

LC0->$72
The 1.0f constant (LC0) is in the data section. This is normal, because I have no insn to load a constant into the $FP regs, so I have to load the address of LC0 first.

    (insn 9 8 10 3 (set:SI (reg:SI 72)
            (symbol_ref/u:SI ("*.LC0") [flags 0x2])) -1 (nil))

M($72)->$73
The memory pointed to by reg 72 (LC0) is loaded into reg 73.

    (insn 10 9 11 3 (set:SF (reg:SF 73)
            (mem:SF (reg:SI 72) [0 S4 A32])) -1 (nil))

$73->M($62+4)
reg 73 is moved into mem(stack+4). (There are no memory-to-memory moves, so using the temp reg 73 to move M($72) to M($62+4) is normal.)

    (insn 11 10 12 3 (set:SF (mem/c/i:SF (plus:SI (reg/f:SI 62 virtual-stack-vars)
                    (const_int 4 [0x4])) [3 af+0 S4 A32])
            (reg:SF 73)) -1 (nil))

$62+4->$75
The address stack+4 is moved to reg 75.

    (insn 12 11 13 3 (set (reg:SI 75)
            (plus:SI (reg/f:SI 62 virtual-stack-vars)
                (const_int 4 [0x4]))))

M($75)->$76
The memory pointed to by reg 75 is loaded into reg 76.

    (insn 13 12 14 3 (set (reg:SI 76)
            (mem:SI (reg:SI 75) [4 S4 A32])) -1 (nil))

$76->$69
reg 76 is moved to reg 69, which is the fixed return register. (I don't know why insn 12 doesn't move directly to reg 69... nevertheless it's correct.)

    (insn 14 13 15 3 (set (reg:SI 69 [ <retval> ])
            (reg:SI 76)) -1 (nil))

* in the pro_and_epilogue dump (just before dse2):

$C0+8->$C3

    (insn 24 12 27 2 (set (reg/f:SI 3 $C3 [77])
            (plus:SI (reg/f:SI 0 $C0)
                (const_int 8 [0x8]))))

[...]

LC0->$C2

    (insn 9 8 29 2 (set:SI (reg/f:SI 2 $C2 [72])
            (symbol_ref/u:SI ("*.LC0") [flags 0x2])) 1 {movsi_internal} (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC0") [flags 0x2])
            (nil)))

M($C2)->$FP1

    (insn 29 9 11 2 (set (reg:SF 41 $FP1)
            (mem:SF (reg/f:SI 2 $C2 [72]) [0 S4 A32])) 10 {movsf_internal} (nil))

$FP1->M($C0+12)

    (insn 11 29 13 2 (set:SF (mem/c/i:SF (plus:SI (reg/f:SI 0 $C0)
                    (const_int 12 [0xc])) [3 af+0 S4 A32])
            (reg:SF 41 $FP1)) 10 {movsf_internal} (nil))

M($C3+4)->$R0

    (insn 13 11 21 2 (set (reg/i:SI 16 $R0)
            (mem:SI (plus:SI (reg/f:SI 3 $C3 [77])
                    (const_int 4 [0x4])) [4 S4 A32])) 1 {movsi_internal} (nil))

Everything seems normal, except that $FP1 is stored to M($C0+12) and the stored value is then read back through M($C3+4), where $C3=$C0+8. This comes from an optimisation (factorization) done by earlier passes on the code preceding the inlined f2 call.

* in the dse2 dump (we die here):

$C0+8->$C3

    (insn 24 12 27 2 (set (reg/f:SI 3 $C3 [77])
            (plus:SI (reg/f:SI 0 $C0)
                (const_int 8 [0x8]))))

M($C3+4)->$R0

    (insn 13 8 21 2 (set (reg/i:SI 16 $R0)
            (mem:SI (plus:SI (reg/f:SI 3 $C3 [77])
                    (const_int 4 [0x4])) [4 S4 A32])) 1 {movsi_internal} (expr_list:REG_DEAD (reg/f:SI 3 $C3 [77])
            (nil)))

insns 9, 29 and 11 have been deleted!
:)

Here is what the dump says concerning insns 9, 29, 11 and 13:

    **scanning insn=9
    mems_found = 0, cannot_delete = true

    **scanning insn=29
      mem: (reg/f:SI 2 $C2 [72])
      after canon_rtx address: (reg/f:SI 2 $C2 [72])
      after cselib_expand address: (symbol_ref/u:SI ("*.LC0") [flags 0x2])
      after canon_rtx address: (symbol_ref/u:SI ("*.LC0") [flags 0x2])
      gid=1 offset=0
     processing const load gid=1[0..4)
    mems_found = 0, cannot_delete = true

    **scanning insn=11
      mem: (plus:SI (reg/f:SI 0 $C0) (const_int 12 [0xc]))
      after canon_rtx address: (plus:SI (reg/f:SI 0 $C0) (const_int 12 [0xc]))
      gid=2 offset=12
     processing const base store gid=2[12..16)
    mems_found = 1, cannot_delete = false

    **scanning insn=13
      mem: (plus:SI (reg/f:SI 3 $C3 [77]) (const_int 4 [0x4]))
      after canon_rtx address: (plus:SI (reg/f:SI 3 $C3 [77]) (const_int 4 [0x4]))
      after cselib_expand address: (plus:SI (reg/f:SI 3 $C3 [77]) (const_int 4 [0x4]))
      after canon_rtx address: (plus:SI (reg/f:SI 3 $C3 [77]) (const_int 4 [0x4]))
      varying cselib base=10:7900 offset = 4
     processing cselib load mem:(mem:SI (plus:SI (reg/f:SI 3 $C3 [77]) (const_int 4 [0x4])) [4 S4 A32])
     processing cselib load against insn 11
    mems_found = 0, cannot_delete = true

    Locally deleting insn 11
    deferring deletion of insn with uid = 11.
    starting the processing of deferred insns
    deleting insn with uid = 11.
    ending the processing of deferred insns
    DCE: Deleting insn 29
    deleting insn with uid = 29.
    DCE: Deleting insn 9
    deleting insn with uid = 9.
    DCE: Deleting insn 36
    deleting insn with uid = 36

I searched for pending bugs concerning dse2 in gcc 4.5.2 but did not find anything. I don't know whether I forgot a macro or did something else wrong, nor whether the problem comes from the dse2 pass itself or is the consequence of an error made earlier.

Can someone help me find out why my lovely insns are removed?

Regards,
Selim