[gcc r15-4406] The detailed explanation from PR116550:

Denis Chertykov via Gcc-cvs Thu, 17 Oct 2024 00:15:26 -0700

https://gcc.gnu.org/g:e7393cbb5f2cae50b42713e71984064073aa378a


commit r15-4406-ge7393cbb5f2cae50b42713e71984064073aa378a
Author: Denis Chertykov <cherty...@gmail.com>
Date:   Thu Oct 17 11:12:38 2024 +0400

    The detailed explanation from PR116550:
    
    Test file: udivmoddi.c
    problem insn: 484
    
    Before LRA pass we have:
    (insn 484 483 485 72 (parallel [
                (set (reg/v:SI 143 [ __q1 ])
                    (plus:SI (reg/v:SI 143 [ __q1 ])
                        (const_int -2 [0xfffffffffffffffe])))
                (clobber (scratch:QI))
            ]) "udivmoddi.c":163:405 discrim 5 186 {addsi3}
         (nil))
    
    LRA substitute all scratches with new pseudos, so we have:
    (insn 484 483 485 72 (parallel [
                (set (reg/v:SI 143 [ __q1 ])
                    (plus:SI (reg/v:SI 143 [ __q1 ])
                        (const_int -2 [0xfffffffffffffffe])))
                (clobber (reg:QI 619))
            ]) "/mnt/d/avr-lra/udivmoddi.c":163:405 discrim 5 186 {addsi3}
         (expr_list:REG_UNUSED (reg:QI 619)
            (nil)))
    
    Pseudo 619 is a special scratch register generated by LRA which is marked 
in `scratch_bitmap' and can be tested by call `ira_former_scratch_p(regno)'.
    
    In dump file (udivmoddi.c.317r.reload) we have:
          Creating newreg=619
    Removing SCRATCH to p619 in insn #484 (nop 3)
    rescanning insn with uid = 484.
    
    After that LRA tries to spill (reg:QI 619)
    It's a bug because (reg:QI 619) is an output scratch register which is 
already something like spill register.
    
    Fragment from udivmoddi.c.317r.reload:
          Choosing alt 2 in insn 484:  (0) r  (1) 0  (2) nYnn  (3) &d {addsi3}
          Creating newreg=728 from oldreg=619, assigning class LD_REGS to r728
    
    IMHO: the bug is in lra-constraints.cc in function `get_reload_reg'
    fragment of `get_reload_reg':
      if (type == OP_OUT)
        {
          /* Output reload registers tend to start out with a conservative
             choice of register class.  Usually this is ALL_REGS, although
             a target might narrow it (for performance reasons) through
             targetm.preferred_reload_class.  It's therefore quite common
             for a reload instruction to require a more restrictive class
             than the class that was originally assigned to the reload register.
    
             In these situations, it's more efficient to refine the choice
             of register class rather than create a second reload register.
             This also helps to avoid cycling for registers that are only
             used by reload instructions.  */
          if (REG_P (original)
              && (int) REGNO (original) >= new_regno_start
              && INSN_UID (curr_insn) >= new_insn_uid_start
    __________________________________^^
              && in_class_p (original, rclass, &new_class, true))
            {
              unsigned int regno = REGNO (original);
              if (lra_dump_file != NULL)
                {
                  fprintf (lra_dump_file, "  Reuse r%d for output ", regno);
                  dump_value_slim (lra_dump_file, original, 1);
                }
    
    This condition incorrectly limits register reuse to ONLY newly generated 
instructions.
    i.e. LRA can reuse registers only from insns generated by himself.
    
    IMHO:It's wrong.
    Scratch registers generated by LRA also have to be reused.
    
    The patch is very simple.
    On x86_64, it bootstraps+regtests fine.
    
    gcc/
            PR target/116550
            * lra-constraints.cc (get_reload_reg): Reuse scratch registers
            generated by LRA.

Diff:
---
 gcc/lra-constraints.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index fdcc07764a2e..1f63113f3210 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -680,7 +680,8 @@ get_reload_reg (enum op_type type, machine_mode mode, rtx 
original,
         used by reload instructions.  */
       if (REG_P (original)
          && (int) REGNO (original) >= new_regno_start
-         && INSN_UID (curr_insn) >= new_insn_uid_start
+         && (INSN_UID (curr_insn) >= new_insn_uid_start
+             || ira_former_scratch_p (REGNO (original)))
          && in_class_p (original, rclass, &new_class, true))
        {
          unsigned int regno = REGNO (original);

[gcc r15-4406] The detailed explanation from PR116550:

Reply via email to