https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89837

            Bug ID: 89837
           Summary: __builtin_longjmp failure with instruction scheduling
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wilson at gcc dot gnu.org
  Target Milestone: ---

The RISC-V port, with the just committed sifive-7-series support, with the
restore_stack_nonlocal pattern disabled, fails gcc.c-torture/execute/pr64242.c
at -O2 and higher.  This appears to be a latent bug in the __builtin_longjmp
support.

Before the first insn sched pass, I see

;;      |    8 |    7 | r73=frame-0x14                 sifive_7_A|sifive_7_B
;;      |   12 |    5 | a2=0x14                        sifive_7_A|sifive_7_B
;;      |   13 |    6 | a1=r81                         sifive_7_A|sifive_7_B
;;      |   14 |    6 | a0=r73                         sifive_7_A|sifive_7_B
;;      |   15 |    4 | {a0=call [`memcpy'];clobber ra;} sifive_7_B
;;      |   17 |    0 | debug_marker                   nothing
;;      |   19 |    4 | r79=[r73+0x4]                  sifive_7_A
;;      |   20 |    2 | clobber [scratch]              nothing
;;      |   21 |    2 | clobber [s0]                   nothing
;;      |   22 |    2 | r80=[frame-0x14]               sifive_7_A
;;      |   23 |    1 | clobber [scratch]              nothing
;;      |   24 |    1 | clobber [sp]                   nothing
;;      |   25 |    1 | sp=[r73+0x8]                   sifive_7_A
;;      |   27 |    1 | s0=r80                         sifive_7_A|sifive_7_B

So insn 25 is using a copy of the soft fp, and insn 27 is storing to the hard
fp.  After register elimination and register allocation, these two instructions
will be using the same register, the hard fp, s0.  However, in the first sched
pass, there is no obvious dependency between the two instructions, and it is
possible for the instruction scheduler to move insn 27 before insn 25, which
will cause the testcase to fail.
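
The hazard can be illustrated with a toy model in C.  This is only a sketch,
not GCC code: the struct, the helper names, the word-indexed memory, and all
addresses and values are made up for illustration.  It assumes, as described
above, that after register elimination insn 25 addresses the jump buffer
through s0, so running insn 27 first makes it read from the wrong frame.

```c
#include <assert.h>

/* Toy model only: word-indexed memory and made-up addresses, not GCC code. */
enum { MEM_WORDS = 64, BUF_OFF = -5, OLD_FP = 40, SAVED_FP = 20, SAVED_SP = 10 };

struct machine {
    long mem[MEM_WORDS];
    long s0;  /* hard frame pointer */
    long sp;  /* stack pointer */
};

/* Bounds-checked load from the toy memory. */
static long load(const struct machine *m, long idx)
{
    assert(idx >= 0 && idx < MEM_WORDS);
    return m->mem[idx];
}

/* Returns the sp value after the restore sequence.
   If hoist_insn27 is nonzero, "s0 = r80" runs before the sp load,
   which models the reordering the scheduler is allowed to make. */
long restore_sp(int hoist_insn27)
{
    struct machine m = { {0}, OLD_FP, 0 };
    m.mem[OLD_FP + BUF_OFF + 0] = SAVED_FP;  /* buf[0]: saved frame pointer */
    m.mem[OLD_FP + BUF_OFF + 2] = SAVED_SP;  /* buf[2]: saved stack pointer */

    long r80 = load(&m, m.s0 + BUF_OFF);     /* insn 22: r80 = [frame-0x14] */

    if (hoist_insn27)
        m.s0 = r80;                          /* insn 27 moved before insn 25 */
    m.sp = load(&m, m.s0 + BUF_OFF + 2);     /* insn 25, addressed via s0 */
    if (!hoist_insn27)
        m.s0 = r80;                          /* insn 27 in its original place */

    return m.sp;
}
```

In this model restore_sp(0) returns the saved stack pointer, while
restore_sp(1) loads sp relative to the already-restored s0 and returns an
unrelated word.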

There needs to be some sort of clobber between the two instructions to prevent
the bad optimization in the first sched pass.  In the RISC-V port, I added a
restore_stack_nonlocal pattern that adds after insn 25

;;      |   26 |    1 | clobber [s0]                   nothing

which prevents the bad scheduler optimization.

I think either expand_builtin_longjmp in builtins.c or emit_stack_restore in
explow.c should be emitting this clobber or a similar one.  There is already
code in expand_builtin_longjmp to emit a clobber of a MEM using the hard fp
before emit_stack_restore is called.  Adding one after the emit_stack_restore
call would solve the problem.  Or alternatively adding it at the end of
emit_stack_restore would work.
