https://gcc.gnu.org/g:e7393cbb5f2cae50b42713e71984064073aa378a
commit r15-4406-ge7393cbb5f2cae50b42713e71984064073aa378a Author: Denis Chertykov <cherty...@gmail.com> Date: Thu Oct 17 11:12:38 2024 +0400 The detailed explanation from PR116550: Test file: udivmoddi.c problem insn: 484 Before LRA pass we have: (insn 484 483 485 72 (parallel [ (set (reg/v:SI 143 [ __q1 ]) (plus:SI (reg/v:SI 143 [ __q1 ]) (const_int -2 [0xfffffffffffffffe]))) (clobber (scratch:QI)) ]) "udivmoddi.c":163:405 discrim 5 186 {addsi3} (nil)) LRA substitute all scratches with new pseudos, so we have: (insn 484 483 485 72 (parallel [ (set (reg/v:SI 143 [ __q1 ]) (plus:SI (reg/v:SI 143 [ __q1 ]) (const_int -2 [0xfffffffffffffffe]))) (clobber (reg:QI 619)) ]) "/mnt/d/avr-lra/udivmoddi.c":163:405 discrim 5 186 {addsi3} (expr_list:REG_UNUSED (reg:QI 619) (nil))) Pseudo 619 is a special scratch register generated by LRA which is marked in `scratch_bitmap' and can be tested by call `ira_former_scratch_p(regno)'. In dump file (udivmoddi.c.317r.reload) we have: Creating newreg=619 Removing SCRATCH to p619 in insn #484 (nop 3) rescanning insn with uid = 484. After that LRA tries to spill (reg:QI 619) It's a bug because (reg:QI 619) is an output scratch register which is already something like spill register. Fragment from udivmoddi.c.317r.reload: Choosing alt 2 in insn 484: (0) r (1) 0 (2) nYnn (3) &d {addsi3} Creating newreg=728 from oldreg=619, assigning class LD_REGS to r728 IMHO: the bug is in lra-constraints.cc in function `get_reload_reg' fragment of `get_reload_reg': if (type == OP_OUT) { /* Output reload registers tend to start out with a conservative choice of register class. Usually this is ALL_REGS, although a target might narrow it (for performance reasons) through targetm.preferred_reload_class. It's therefore quite common for a reload instruction to require a more restrictive class than the class that was originally assigned to the reload register. In these situations, it's more efficient to refine the choice of register class rather than create a second reload register. This also helps to avoid cycling for registers that are only used by reload instructions. */ if (REG_P (original) && (int) REGNO (original) >= new_regno_start && INSN_UID (curr_insn) >= new_insn_uid_start __________________________________^^ && in_class_p (original, rclass, &new_class, true)) { unsigned int regno = REGNO (original); if (lra_dump_file != NULL) { fprintf (lra_dump_file, " Reuse r%d for output ", regno); dump_value_slim (lra_dump_file, original, 1); } This condition incorrectly limits register reuse to ONLY newly generated instructions. i.e. LRA can reuse registers only from insns generated by himself. IMHO:It's wrong. Scratch registers generated by LRA also have to be reused. The patch is very simple. On x86_64, it bootstraps+regtests fine. gcc/ PR target/116550 * lra-constraints.cc (get_reload_reg): Reuse scratch registers generated by LRA. Diff: --- gcc/lra-constraints.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc index fdcc07764a2e..1f63113f3210 100644 --- a/gcc/lra-constraints.cc +++ b/gcc/lra-constraints.cc @@ -680,7 +680,8 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx original, used by reload instructions. */ if (REG_P (original) && (int) REGNO (original) >= new_regno_start - && INSN_UID (curr_insn) >= new_insn_uid_start + && (INSN_UID (curr_insn) >= new_insn_uid_start + || ira_former_scratch_p (REGNO (original))) && in_class_p (original, rclass, &new_class, true)) { unsigned int regno = REGNO (original);