Trying to track faulty code generation because of a missing input reload, I got lost in reload and need some help.
The insn to reload (insn 7) is (set (subreg:QI (reg:HI 28) 0) (const_int 0)) This insn generates one output reload (.ira dump) Reloads for insn # 7 Reload 0: reload_out (HI) = (reg/v:HI 28 r28 [orig:43 y ] [43]) GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0) reload_out_reg: (reg/v:HI 28 r28 [orig:43 y ] [43]) reload_reg_rtx: (reg:HI 24 r24) which eventually generates code (insn 7 6 17 2 (set (reg:QI 24 r24) (const_int 0 [0])) pr46779-1.c:34 4 {*movqi} (nil)) (insn 17 7 8 2 (set (reg/v:HI 28 r28 [orig:43 y ] [43]) (reg:HI 24 r24)) pr46779-1.c:34 10 {*movhi} (nil)) so there is a missing input reload, i.e. prior to insn 7 there must be something like (set (reg:HI 28) (reg:HI 24)) find_reloads for insn 7 calls push_reload just once with in = 0 out = (subreg:QI (reg:HI 28) 0) outmode = QImode strict_low = 0 optional = 0 type = RELOAD_FOR_OUTPUT Is that correct so far? Or should push_reload also be called for type = RELOAD_FOR_INPUT or with another reload type? find_reloads has a comment that push_reload will do some fixes: /* Any constants that aren't allowed and can't be reloaded into registers are here changed into memory references. */ for (i = 0; i < noperands; i++) if (! goal_alternative_win[i]) { rtx op = recog_data.operand[i]; rtx subreg = NULL_RTX; rtx plus = NULL_RTX; enum machine_mode mode = operand_mode[i]; /* Reloads of SUBREGs of CONSTANT RTXs are handled later in push_reload so we have to let them pass here. */ if (GET_CODE (op) == SUBREG) { subreg = op; op = SUBREG_REG (op); mode = GET_MODE (op); } the code actually enters the last SUBREG block for insn 7 ===================== gcc 4.7 configured ../../gcc.gnu.org/trunk/configure --target=avr --prefix=/local/gnu/install/gcc-4.7 --disable-nls --disable-shared --enable-languages=c,c++ ===================== gcc called with -Os -dp -save-temps -mmcu=atmega128 -S -v ===================== The C source is: struct S { unsigned char a, b; } ab; void yoo (struct S y) { asm volatile ("ldi %B0, 56" : "+y" (y)); y.a = 0; asm ("; y = %0" : "+y" (y)); ab = y; } ===================== insn 7 is generated from y.a = 0. The C source does some asm hacking to produce this bug. This is because otherwise the bug is hard to reproduce and the source would be rather lengthy and much harder to debug. Register y=r29:r28 is the frame pointer, which is not needed for this function. The avr backend had problems ever since because the frame pointer spans two registers but many places in GCC just do their tests like if (x == FRAME_POINTER_REGNUM) The current implementation of HARD_REGNO_MODE_OK allows HI for r28 but denies QI for r28 and r29. Older versions of avr-gcc tried several other approaches to work around all of this like checking for frame_pointer_needed, reload_completed, reload_in_progress etc. in avr.c:avr_hard_regno_mode_ok. Each approach lead to some problems like faulty code or spill failures. The problem disappears when QI is allowed in r28/r28, but I do not know enough of reload and fp elimination to know if that would be an appropriate fix. Anyway, all this shows that there is something going wrong in reload, both in 4.7, 4.6 and 4.5. Moreover, the first asm needs a "volatile" in order not to get thrown away. This indicates that there is an error in life analysis because some part fails to detect that y.a = 0 just changes only a part of the register. The life analysis is correct for the QI part, i.e. for r24 where r28 gets spilled to, but it is wrong for the overall action.