On 7/13/23 05:27, SenthilKumar.Selvaraj--- via Gcc wrote:
Hi,

   I've been spending some (spare) time checking what it would take to
   make LRA work for the avr target.

   Right after I removed the TARGET_LRA_P hook disabling LRA, building
   libgcc failed with a weird ICE.

   On the avr, the stack pointer (SP) is not used to access stack slots
It is a very uncommon target then.
   - TARGET_CAN_ELIMINATE returns false
   if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true
   if get_frame_size() > 0.

   With LRA, however, reload generates

(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
         (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
      (nil))

   and the backend code errors out when it finds SP is being used as a
   pointer register.

   Digging through the RTL dumps, I found the following. For the
   following insn sequence in *.ira

(insn 189 128 159 7 (set (reg:HI 58 [ b ])
         (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0)
         (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1)
         (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split}
      (nil))

   1. For r58, IRA picks R28:R29, which is the frame pointer for avr.

       Popping a13(r58,l0)  --         assign reg 28

   2. LRA sees the subreg in insn 159 and generates a reload reg
   (r125).  simplify_subreg_regno (lra-constraints.cc:1810) however
   bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and
   reload isn't completed yet. LRA therefore decides rclass for the
   pseudo reg is NO_REGS.

<snip>
Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg r125
   159: r125:HI#0=r86:QI

   4. As rclass is NO_REGS, LRA picks an insn alternative that involves memory.
   That is my understanding, please correct me if I'm wrong.
<snip>
             0 Small class reload: reject+=3
             0 Non input pseudo reload: reject++
             Cycle danger: overall += LRA_MAX_REJECT
           alt=0,overall=610,losers=1,rld_nregs=1
             0 Small class reload: reject+=3
             0 Non input pseudo reload: reject++
             alt=1: Bad operand -- refuse
             0 Non pseudo reload: reject++
           alt=2,overall=1,losers=0,rld_nregs=0
         Choosing alt 2 in insn 159:  (0) Qm  (1) rY00 {movqi_insn_split}

   5. LRA creates stack slots, and then uses the FP register to access
   the slots. This is despite r58 already being assigned R28:R29.

   6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore
      frame_pointer_needed is not set, despite the creation of stack
      slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP,
      and this eventually causes the ICE when the avr backend sees SP being
      used as a pointer register.
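
The ordering problem in steps 5 and 6 can be sketched as a small standalone model. This is not GCC or avr backend code: `FunctionState`, `frame_pointer_required`, `can_eliminate_fp_to_sp` and `spill_to_stack_slot` are hypothetical names, and the two hook models only reproduce the conditions quoted in the message above (the real hooks may check more).

```cpp
#include <cassert>

// Hypothetical, simplified per-function state; not GCC's real data structures.
struct FunctionState {
    int frame_size = 0;                 // bytes of stack slots allocated so far
    bool frame_pointer_needed = false;  // cached answer of the hook below
};

// Models TARGET_FRAME_POINTER_REQUIRED as described above for avr.
bool frame_pointer_required(const FunctionState &fn) {
    return fn.frame_size > 0;
}

// Models TARGET_CAN_ELIMINATE as described above for avr.
bool can_eliminate_fp_to_sp(const FunctionState &fn) {
    return !fn.frame_pointer_needed;
}

// Models step 5: a spill adds a stack slot without re-querying the hook.
void spill_to_stack_slot(FunctionState &fn, int bytes) {
    fn.frame_size += bytes;
}

// Walks through the order of events in steps 5 and 6.
bool elimination_allowed_after_late_spill() {
    FunctionState fn;
    fn.frame_pointer_needed = frame_pointer_required(fn);  // false: no slots yet
    spill_to_stack_slot(fn, 2);                            // HImode slot for r58
    assert(frame_pointer_required(fn));                    // hook would now say yes
    return can_eliminate_fp_to_sp(fn);                     // cached flag is stale
}
```

In this model the cached `frame_pointer_needed` is computed before the slot exists, so FP-to-SP elimination is still allowed afterwards, which is the mismatch step 6 describes.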

   This is the relevant sequence after reload
<snip>
(insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
         (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
         (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
         (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
         (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
         (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 2 [0x2])) [2 %sfp+2 S1 A8])
         (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
         (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                 (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
      (nil))

   For choices other than FP, simplify_subreg_regno returns the correct part
   of the wider HImode reg, so rclass is not NO_REGS, and things work out fine.
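
The contrast between FP and other registers can be illustrated with a minimal model of the behaviour described in step 2. This is a sketch under stated assumptions, not the code in GCC: `simplify_subreg_regno_model`, `reload_class_for_subreg` and the `RegClass` enum are hypothetical stand-ins, and the only condition modelled is the one quoted above (FP is refused while reload has not completed).

```cpp
#include <cassert>

// Hypothetical stand-ins for the register classes mentioned in the dump.
enum class RegClass { NO_REGS, GENERAL_REGS };

constexpr int FP_REGNUM = 28;  // R28:R29, the avr frame pointer pair

// Models step 2: a QImode subreg of a hard register resolves to the hard
// register holding that byte, except that the frame pointer is refused
// (-1) until reload has completed.
int simplify_subreg_regno_model(int hard_regno, int byte_offset,
                                bool reload_completed) {
    assert(byte_offset >= 0);
    if (hard_regno == FP_REGNUM && !reload_completed)
        return -1;
    return hard_regno + byte_offset;  // avr registers are one byte wide
}

// Models the consequence: a failed simplification leaves the reload
// pseudo with NO_REGS, so only a memory alternative can match.
RegClass reload_class_for_subreg(int hard_regno, int byte_offset,
                                 bool reload_completed) {
    int regno = simplify_subreg_regno_model(hard_regno, byte_offset,
                                            reload_completed);
    return regno < 0 ? RegClass::NO_REGS : RegClass::GENERAL_REGS;
}
```

With these definitions, a subreg of a pseudo assigned to R24:R25 resolves to R24 or R25, while the same subreg of a pseudo assigned to R28:R29 yields `NO_REGS` before reload completes.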

   I checked what classic reload does in the same situation - it picks a
   different register (R25) instead of spilling to a stack slot.

<snip>
(insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
         (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
      (nil))
(insn 159 189 226 7 (set (reg:QI 25 r25)
         (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 226 159 160 7 (set (reg:QI 28 r28)
         (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 160 226 227 7 (set (reg:QI 25 r25)
         (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
      (nil))
(insn 227 160 33 7 (set (reg:QI 29 r29)
         (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
      (nil))


   My questions:

   1. Is there something obvious the avr backend is doing wrong that is
   causing this?
No.  In my opinion if it worked with reload, it should work with LRA because LRA is a substitute for reload.
   2. Shouldn't LRA ask the backend for frame_pointer_required_p and
   update frame_pointer_needed if it creates stack slots?
I think so.
   3. Even if (2) works, I see that lra-eliminations.cc:update_reg_eliminate
   asserts that if the backend said elimination to SP is ok first up, it
   cannot reject that elimination later (line 1165). If the only reason
   FP is required is because LRA created stack slots, what should the
   backend do?
I think it can be relaxed for avr, on which sp cannot be used to access stack memory.
   4. When simplify_subreg_regno bails for FP, lra-constraints.cc:1815
   sets rclass = NO_REGS and forces a spill to memory. The comment says
   it is to prevent infinite looping, but for this case, doesn't it
   make sense to look for other regs?
I guess it can be relaxed for avr and the frame pointer too, as avr does not use sp for accessing memory.
   5. I can work around the problem by disabling elimination from FP to SP
   when lra_in_progress, but I think it prevents IRA/LRA from using
   R28:R29 even when FP is not required at all?
Yes, it is better to avoid disabling elimination
   6. Basic question, but does FP to SP elimination mean any operation
   possible with FP should be doable with SP as well?

Hard to say. There are a lot of undocumented RA assumptions.

If you send me the preprocessed test, I could start working on it to fix the problems.  I think it is hard for a person with little experience with LRA to fix them properly.


