Hello,

I have a problem with delete_output_reload. It sometimes deletes instructions which are needed. Here is an analysis of a recent case (in a private version of the S390 port). The original S390 port shows almost the same reloads, but chooses different registers.

Before reload we have

(insn 1597 1697 1598 0 x.c:238 (set (reg:DI 1393)
        (ashift:DI (reg:DI 1391)
            (const_int 8 [0x8]))) 349 {*ashldi3_31} (insn_list:REG_DEP_TRUE 1596 (nil))
    (nil))

(insn 1598 1597 1623 0 x.c:238 (parallel [
            (set (reg:DI 1393)
                (plus:DI (reg:DI 1391)
                    (reg:DI 1393)))
            (clobber (reg:CC 33 %cc))
        ]) 177 {*adddi3_31} (insn_list:REG_DEP_OUTPUT 1594 (insn_list:REG_DEP_TRUE 1597 (insn_list:REG_DEP_TRUE 1596 (nil))))
    (expr_list:REG_DEAD (reg:DI 1391)
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (expr_list:REG_EQUAL (mult:DI (reg:DI 1384 [+114 ])
                    (const_int 9934259357961 [0x90900000909]))
                (nil)))))

Both registers 1391 and 1393 will be put on the stack. The offset is more than 7000, so we need a secondary reload. The report in *.greg is

Reloads for insn # 1597
Reload 0: reload_in (SI) = (const_int 4080 [0xff0])
        ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0)
        reload_in_reg: (const_int 4080 [0xff0])
        reload_reg_rtx: (reg:SI 4 4)
Reload 1: reload_in (SI) = (const_int 4080 [0xff0])
        ADDR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
        reload_in_reg: (const_int 4080 [0xff0])
        reload_reg_rtx: (reg:SI 4 4)
Reload 2: reload_in (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
                    (const_int 4080 [0xff0]))
                (const_int 3144 [0xc48])) [0 S8 A8])
        reload_out (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
                    (const_int 4080 [0xff0]))
                (const_int 3136 [0xc40])) [0 S8 A8])
        GENERAL_REGS, RELOAD_OTHER (opnum = 0), can't combine
        reload_in_reg: (reg:DI 1391)
        reload_out_reg: (reg:DI 1393)
        reload_reg_rtx: (reg:DI 2 2)

Reloads for insn # 1598
Reload 0: reload_in (SI) = (const_int 4080 [0xff0])
        ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0)
        reload_in_reg: (const_int 4080 [0xff0])
        reload_reg_rtx: (reg:SI 2 2)
Reload 1: reload_in (SI) = (const_int 4080 [0xff0])
        ADDR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
        reload_in_reg: (const_int 4080 [0xff0])
        reload_reg_rtx: (reg:SI 2 2)
Reload 2: reload_in (SI) = (const_int 4080 [0xff0])
        ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2)
        reload_in_reg: (const_int 4080 [0xff0])
        reload_reg_rtx: (reg:SI 2 2)
Reload 3: reload_in (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
                    (const_int 4080 [0xff0]))
                (const_int 3144 [0xc48])) [0 S8 A8])
        reload_out (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
                    (const_int 4080 [0xff0]))
                (const_int 3136 [0xc40])) [0 S8 A8])
        GENERAL_REGS, RELOAD_OTHER (opnum = 0), can't combine
        reload_in_reg: (reg:DI 1391)
        reload_out_reg: (reg:DI 1393)
        reload_reg_rtx: (reg:DI 0 0)
Reload 4: ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2), can't combine, secondary_reload_p
        reload_reg_rtx: (reg:SI 3 3)
Reload 5: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 15 15)
                (const_int 4080 [0xff0]))
            (const_int 3136 [0xc40]))
        ADDR_REGS, RELOAD_FOR_INPUT (opnum = 2), inc by 8
        reload_in_reg: (plus:SI (plus:SI (reg/f:SI 15 15)
                (const_int 4080 [0xff0]))
            (const_int 3136 [0xc40]))
        reload_reg_rtx: (reg:SI 2 2)
        secondary_in_reload = 4
        secondary_in_icode = reload_insi

These reloads are ok. In do_output_reload it is noted that both insn_1597.Reload_2 and insn_1598.Reload_3 write to the same stack slot. So the compiler decides to remove the first reload and use register (reg:DI 2) directly. In this analysis it misses the fact that (reg:SI 2) is used for input reloads of insn 1598.

After reload the generated instructions are:

(insn 1597 16833 16836 0 x.c:238 (set (reg:DI 2 2)
        (ashift:DI (reg:DI 2 2)
            (const_int 8 [0x8]))) 349 {*ashldi3_31} (insn_list:REG_DEP_TRUE 1596 (nil))
    (nil))

(insn 16836 1597 16838 0 x.c:238 (set (reg:SI 2 2)
        (const_int 4080 [0xff0])) 56 {*movsi_esa} (nil)
    (nil))

(insn 16838 16836 16837 0 x.c:238 (set (reg:DI 0 0)
        (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
                    (reg:SI 2 2))
                (const_int 3144 [0xc48])) [0 S8 A8])) 52 {*movdi_31} (nil)
    (nil))

(insn 16837 16838 16840 0 x.c:238 (set (reg:SI 2 2)
        (const_int 4080 [0xff0])) 56 {*movsi_esa} (nil)
    (nil))

(insn 16840 16837 1598 0 x.c:238 (parallel [
            (set (reg:SI 2 2)
                (plus:SI (plus:SI (reg/f:SI 15 15)
                        (reg:SI 2 2))
                    (const_int 3136 [0xc40])))
            (use (const_int 0 [0x0]))
        ]) 58 {force_la_31} (nil)
    (nil))

(insn 1598 16840 16835 0 x.c:238 (parallel [
            (set (reg:DI 0 0)
                (plus:DI (reg:DI 0 0)
                    (mem/c:DI (reg:SI 2 2) [0 S8 A8])))
            (clobber (reg:CC 33 %cc))
        ]) 177 {*adddi3_31} (insn_list:REG_DEP_OUTPUT 1594 (insn_list:REG_DEP_TRUE 1597 (insn_list:REG_DEP_TRUE 1596 (nil))))
    (expr_list:REG_EQUAL (mult:DI (mem/c:DI (plus:SI (reg/f:SI 15 15)
                    (const_int 7232 [0x1c40])) [0 S8 A8])
            (const_int 9934259357961 [0x90900000909]))
        (nil)))

Further optimization will remove insn 1597 as dead, resulting in wrong code generation.

One critical point is the timing of the updates to the variables reg_reloaded_valid and spill_reg_store. Within the function emit_reload_insns they are first checked (within do_output_reload) and later updated (after the reload instructions are written). So they reflect the state before the "reload sequence". Not all uses respect these semantics. In particular, the check within delete_output_reload is not correct.

I see two possibilities to solve this problem. The first is to extend the check within delete_output_reload so that it also looks at the input reloads of the current insn.

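To make the first possibility a bit more concrete, here is a rough sketch of the kind of check I mean. It is a standalone toy model in C, not a patch and not GCC's real data structures: struct toy_reload, toy_regs_overlap, toy_safe_to_reuse_reg and toy_insn_1598_example are all names made up for this illustration, and the caller decides which reloads count as "loaded before the insn". The example data is taken from reloads 2 to 5 of insn 1598 above, under the assumption that a DI value occupies two consecutive hard registers on the 31-bit target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one reload of the current insn.  */
struct toy_reload {
  bool loads_before_insn; /* reload register is written before the insn body */
  int regno;              /* first hard register of the reload register */
  int nregs;              /* number of consecutive hard registers used */
};

/* True if hard register ranges [a, a + a_n) and [b, b + b_n) intersect.  */
bool toy_regs_overlap(int a, int a_n, int b, int b_n)
{
  assert(a_n > 0 && b_n > 0);
  return a < b + b_n && b < a + a_n;
}

/* The value of the earlier output reload lives in val_regno..val_regno +
   val_nregs - 1.  Dropping the store and reusing that register is only
   acceptable if no reload loaded before the current insn overwrites it.  */
bool toy_safe_to_reuse_reg(int val_regno, int val_nregs,
                           const struct toy_reload *rl, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (rl[i].loads_before_insn
        && toy_regs_overlap(val_regno, val_nregs, rl[i].regno, rl[i].nregs))
      return false;
  return true;
}

/* Reloads 2..5 of insn 1598 from the report above.  The register counts
   for DI (2) and SI (1) are an assumption about the 31-bit target.  */
bool toy_insn_1598_example(void)
{
  const struct toy_reload rl[] = {
    { true, 2, 1 }, /* Reload 2: RELOAD_FOR_INPUT_ADDRESS, (reg:SI 2) */
    { true, 0, 2 }, /* Reload 3: RELOAD_OTHER with reload_in, (reg:DI 0) */
    { true, 3, 1 }, /* Reload 4: secondary reload, (reg:SI 3) */
    { true, 2, 1 }, /* Reload 5: RELOAD_FOR_INPUT, (reg:SI 2) */
  };
  /* insn 1597 left its result in (reg:DI 2).  */
  return toy_safe_to_reuse_reg(2, 2, rl, sizeof rl / sizeof rl[0]);
}
```

In this model toy_insn_1598_example reports that reusing (reg:DI 2) is not safe, because the input address reload already claims (reg:SI 2). A real check would of course have to work on the actual reload arrays and respect the ordering of the different reload types.
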
The second is to rewrite emit_reload_insns: first clear reg_reloaded_valid/spill_reg_store as appropriate, then prepare and emit the reload instructions, then set reg_reloaded_valid/spill_reg_store as appropriate.

Erwin

P.S.: I have no copyright assignment. I gave up on the company lawyer some time ago.

Erwin Unruh, Fujitsu Siemens Computers, C/C++ compiler group