4.4 Regression] likely codegen bug

matz at gcc dot gnu dot org Tue, 01 Jul 2008 12:18:24 -0700


------- Comment #7 from matz at gcc dot gnu dot org  2008-07-01 19:17 -------
Blaeh.  The bug is in code that is there since the dawn of revision control,
under a comment that starts with "... This is tricky ..." and ends with
"I am not sure whether the algorithm here is always right ...".


One thing after another.  The problem is in insn 15 (with Jakubs testcase),
that effectively is:
   p58 <- p63 << p63:QI
It so happens that p63 is allocated to $edx and p58 to $ecx.  This is all
fine and would result in such insn:  %ecx = %edx << %dl

Then find_reloads comes and determines that the insn as is is invalid:
1) The output and first input have to match and
2) the second input needs to be in

So, it generates the first reload to make op0 and op1 match.
  in = inreg = (reg:SI dx)
  out = outreg = (reg:SI cx)
  inmode = outmode = SImode
  class = GENERAL_REGS
  type = RELOAD_OTHER
perfectly fine.  find_reloads will also already set reg_rtx (i.e. the register
that is used to carry out the reload) to (reg:SI cx), also quite fine and
correct.

Then it tries to generate another reload to fit the "c" constraint for operand
2:
  in = (subreg:QI (reg:SI dx))
  out = 0
  inmode = QImode
  class = CREG
  type = RELOAD_FOR_INPUT
Fairly early push_reload will substitute in (still a SUBREG) with the real
hardreg: in = (reg:QI dx).  Then it tries to find a mergable reload, and
indeed finds the first one.  That is also correct, the classes overlap,
the already assigned reg_rtx is member of that class (CREG) the types of
reloads are okay and so on.

Okay, so we don't generate a second reload for this operand, but simply reuse
the first one, so we have to merge the info of this to-be reload with the one
we have already.  That for instance would widen the existing mode if our new
reload would be wider.  See push_reload around line 1369 for what exactly
that merging does.  Among the things it does, it also overwrites .in:

   if (in != 0) {
     ...
     rld[i].in = in;
     rld[i].in_reg = in_reg;
   }

This might look strange, but in itself is okay.  It looks strange, because
our reload now looks like:
  in = inreg = (reg:QI dx)
  out = outreg = (reg:SI cx)
  inmode = outmode = SImode
  class = CREG
  type = RELOAD_OTHER
  reg_rtx = (reg:SI cx)

Note in particular that the .in reg is a QImode register, while the inmode
(and outmode) are still SImode.  This in itself is still not incorrect,
if this reload would be emitted as a SImode register move (from dx to cx),
as requested by the modes of this reload all would work.

I.e. the .in (and hence .out) registers are not to be used as real register
references, but merely as container for the hardreg _numbers_ we want to use.

Now, if we're going to emit this reload we go over do_input_reload,
and right at the beginning we have this:

  [ here old = rl->in; ]
  if (old && reg_rtx)
    {
      enum machine_mode mode;
      /* Determine the mode to reload in.
         This is very tricky because we have three to choose from.
      ...
         I am not sure whether the algorithm here is always right,
         but it does the right things in those cases.  */
      mode = GET_MODE (old);
      if (mode == VOIDmode)
        mode = rl->inmode;

So, it determines the mode _from the .in member_ by default, and only uses
the mode from the reload when that one is a constant.  Argh!  This means we
are now emitting a QImode move only loading %cl from %dl, whereas we would
need to load all of %ecx from %edx.  We emit this wrong reload insn, which
then later get's removed because we already have another reload %cl <- %dl
in the basic block before this one, which we are inheriting then.  This just
confuses the analysis somewhat, but the problem really is this too narrow
reload.
This code originally comes from do_input_reload, which in turn has it from
emit_reload_insns, and had this comment already in r120 of reload1.c .

I see basically two ways to fix this:
1) make push_reload "merge" also the .in member, instead of just overwriting
   it.  By merging I mean that if rld.in and in are both registers (in which
   case they have the same numbers already, as that is checked by
   find_reusable_reload) that leave in the larger of the two.  Same for
   .out.

2) Change do_input_reload (and also do_output_reload) to listen
   to inmode and outmode first, before looking at the mode of .in.
   The comment above this looks scary and lists some problematic cases, but
   given it's age I'm not at all sure if these indeed are still a problem.
   I actually think that inmode and outmode are currently the most precise
   descriptions of what this reload is about, unlike .in and .out whose
   significance lies more or less in the reg number only.

(2) might be the harder change, as it would require also reinterpreting
anything in .in or .out in the mode, in case they don't match.

I checked that (1) fixes the testcase, maybe I'm regstrapping that, it's the
simpler change.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36613

[Bug target/36613] [4.1/4.2/4.3/4.4 Regression] likely codegen bug

Reply via email to