------- Comment #7 from matz at gcc dot gnu dot org 2008-07-01 19:17 ------- Blaeh. The bug is in code that is there since the dawn of revision control, under a comment that starts with "... This is tricky ..." and ends with "I am not sure whether the algorithm here is always right ...".
One thing after another. The problem is in insn 15 (with Jakubs testcase), that effectively is: p58 <- p63 << p63:QI It so happens that p63 is allocated to $edx and p58 to $ecx. This is all fine and would result in such insn: %ecx = %edx << %dl Then find_reloads comes and determines that the insn as is is invalid: 1) The output and first input have to match and 2) the second input needs to be in So, it generates the first reload to make op0 and op1 match. in = inreg = (reg:SI dx) out = outreg = (reg:SI cx) inmode = outmode = SImode class = GENERAL_REGS type = RELOAD_OTHER perfectly fine. find_reloads will also already set reg_rtx (i.e. the register that is used to carry out the reload) to (reg:SI cx), also quite fine and correct. Then it tries to generate another reload to fit the "c" constraint for operand 2: in = (subreg:QI (reg:SI dx)) out = 0 inmode = QImode class = CREG type = RELOAD_FOR_INPUT Fairly early push_reload will substitute in (still a SUBREG) with the real hardreg: in = (reg:QI dx). Then it tries to find a mergable reload, and indeed finds the first one. That is also correct, the classes overlap, the already assigned reg_rtx is member of that class (CREG) the types of reloads are okay and so on. Okay, so we don't generate a second reload for this operand, but simply reuse the first one, so we have to merge the info of this to-be reload with the one we have already. That for instance would widen the existing mode if our new reload would be wider. See push_reload around line 1369 for what exactly that merging does. Among the things it does, it also overwrites .in: if (in != 0) { ... rld[i].in = in; rld[i].in_reg = in_reg; } This might look strange, but in itself is okay. It looks strange, because our reload now looks like: in = inreg = (reg:QI dx) out = outreg = (reg:SI cx) inmode = outmode = SImode class = CREG type = RELOAD_OTHER reg_rtx = (reg:SI cx) Note in particular that the .in reg is a QImode register, while the inmode (and outmode) are still SImode. This in itself is still not incorrect, if this reload would be emitted as a SImode register move (from dx to cx), as requested by the modes of this reload all would work. I.e. the .in (and hence .out) registers are not to be used as real register references, but merely as container for the hardreg _numbers_ we want to use. Now, if we're going to emit this reload we go over do_input_reload, and right at the beginning we have this: [ here old = rl->in; ] if (old && reg_rtx) { enum machine_mode mode; /* Determine the mode to reload in. This is very tricky because we have three to choose from. ... I am not sure whether the algorithm here is always right, but it does the right things in those cases. */ mode = GET_MODE (old); if (mode == VOIDmode) mode = rl->inmode; So, it determines the mode _from the .in member_ by default, and only uses the mode from the reload when that one is a constant. Argh! This means we are now emitting a QImode move only loading %cl from %dl, whereas we would need to load all of %ecx from %edx. We emit this wrong reload insn, which then later get's removed because we already have another reload %cl <- %dl in the basic block before this one, which we are inheriting then. This just confuses the analysis somewhat, but the problem really is this too narrow reload. This code originally comes from do_input_reload, which in turn has it from emit_reload_insns, and had this comment already in r120 of reload1.c . I see basically two ways to fix this: 1) make push_reload "merge" also the .in member, instead of just overwriting it. By merging I mean that if rld.in and in are both registers (in which case they have the same numbers already, as that is checked by find_reusable_reload) that leave in the larger of the two. Same for .out. 2) Change do_input_reload (and also do_output_reload) to listen to inmode and outmode first, before looking at the mode of .in. The comment above this looks scary and lists some problematic cases, but given it's age I'm not at all sure if these indeed are still a problem. I actually think that inmode and outmode are currently the most precise descriptions of what this reload is about, unlike .in and .out whose significance lies more or less in the reg number only. (2) might be the harder change, as it would require also reinterpreting anything in .in or .out in the mode, in case they don't match. I checked that (1) fixes the testcase, maybe I'm regstrapping that, it's the simpler change. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36613