http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41993
--- Comment #4 from Uros Bizjak <ubizjak at gmail dot com> 2012-11-04 16:45:51
UTC ---
I have looked a bit into this problem, since AVX vzeroupper insertion now
depends on MODE_EXIT functionality. IMO, the patch in Comment #1 is correct for
all optimization levels. The problem is triggered only at -O0 because
__builtin_return loads from memory, and at -O0 gcc computes the offset memory
address in a pseudo:
...
(insn 9 8 11 2 (set (reg:SI 0 r0)
(mem:SI (reg/f:SI 163) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
(nil))
(insn 11 9 12 2 (set (reg:SI 165)
(mem/f/c:SI (plus:SI (reg/f:SI 162)
(const_int 60 [0x3c])) [0 rframe+0 S4 A32])) pr41933.c:3 238
{movsi_ie}
(nil))
(insn 12 11 13 2 (set (reg/f:SI 164)
(plus:SI (reg:SI 165)
(const_int 4 [0x4]))) pr41933.c:3 62 {*addsi3_compact}
(nil))
(insn 13 12 10 2 (set (reg:SI 64 fr0)
(mem:SI (reg/f:SI 164) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
(nil))
(insn 10 13 14 2 (use (reg:SI 0 r0)) pr41933.c:3 -1
(nil))
(insn 14 10 22 2 (use (reg:SI 64 fr0)) pr41933.c:3 -1
(nil))
(insn 22 14 0 2 (use (reg/i:SI 0 r0)) pr41933.c:4 -1
(nil))
This additional pseudo is what breaks the compilation. At -O2, we enter
mode-switching with:
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v/f:SI 161 [ rframe ])
(reg:SI 4 r4 [ rframe ])) pr41933.c:2 238 {movsi_ie}
(expr_list:REG_DEAD (reg:SI 4 r4 [ rframe ])
(nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 8 2 (set (reg:SI 0 r0)
(mem:SI (reg/v/f:SI 161 [ rframe ]) [0 S4 A8])) pr41933.c:3 238
{movsi_ie}
(nil))
(insn 8 6 7 2 (set (reg:SI 64 fr0)
(mem:SI (plus:SI (reg/v/f:SI 161 [ rframe ])
(const_int 4 [0x4])) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
(expr_list:REG_DEAD (reg/v/f:SI 161 [ rframe ])
(nil)))
(insn 7 8 9 2 (use (reg:SI 0 r0)) pr41933.c:3 -1
(nil))
(insn 9 7 17 2 (use (reg:SI 64 fr0)) pr41933.c:3 -1
(expr_list:REG_DEAD (reg:SI 64 fr0)
(nil)))
(insn 17 9 0 2 (use (reg/i:SI 0 r0)) pr41933.c:4 -1
(nil))
In this case, we find many return registers (due to __builtin_return), and
consequently lower nregs to zero. This satisfies the following assert in the
(!nregs) and (nregs != hard_regno_nregs[ret_start][GET_MODE (ret_reg)]) cases.
In the -O0 case, we break out of the discovery loop too early, so we can't find
all return regs. I would argue that we should ignore non-relevant pseudos with:
--cut here--
Index: mode-switching.c
===================================================================
--- mode-switching.c (revision 193133)
+++ mode-switching.c (working copy)
@@ -324,7 +324,10 @@ create_pre_exit (int n_entities, int *entity_map,
               else
                 break;
               if (copy_start >= FIRST_PSEUDO_REGISTER)
-                break;
+                {
+                  last_insn = return_copy;
+                  continue;
+                }
               copy_num
                 = hard_regno_nregs[copy_start][GET_MODE (copy_reg)];
--cut here--
This is the same way that e.g. UNSPEC_VOLATILE is handled in the preceding code.
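
To illustrate the difference between "break" and "continue" here, below is a
toy model of the backward scan. It is only a sketch and not GCC code: the
struct, the function name count_hard_copies, the array o0_copies and the
constant TOY_FIRST_PSEUDO (standing in for the target's FIRST_PSEUDO_REGISTER)
are all made up for illustration. The sample data mirrors the destinations of
the set insns 9, 11, 12 and 13 from the -O0 dump above.

```c
/* Toy model only; none of these names exist in GCC.  */

/* Hypothetical stand-in for the target's FIRST_PSEUDO_REGISTER.  */
#define TOY_FIRST_PSEUDO 100

/* A register copy, reduced to the number of its destination register.  */
struct toy_copy
{
  int dest_regno;
};

/* Destinations of insns 9, 11, 12, 13 in the -O0 dump:
   r0, pseudo 165, pseudo 164, fr0.  */
static const struct toy_copy o0_copies[] = {
  { 0 }, { 165 }, { 164 }, { 64 }
};

/* Walk the copies backward from the end of the block and count those
   that set a hard register.  With skip_pseudos == 0 the walk stops at
   the first pseudo destination (the "break" behaviour); otherwise it
   steps over pseudos and keeps looking (the "continue" behaviour).  */
static int
count_hard_copies (const struct toy_copy *copies, int n, int skip_pseudos)
{
  int found = 0;

  for (int i = n - 1; i >= 0; i--)
    {
      if (copies[i].dest_regno >= TOY_FIRST_PSEUDO)
        {
          if (skip_pseudos)
            continue;
          break;
        }
      found++;
    }

  return found;
}
```

With this data, count_hard_copies (o0_copies, 4, 0) sees only fr0 before it
stops at pseudo 164, while count_hard_copies (o0_copies, 4, 1) steps over both
pseudos and also reaches r0.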