Re: weird impact of lower-subreg on IRA/reload

Georg-Johann Lay Thu, 16 Feb 2012 06:11:14 -0800

Vladimir Makarov wrote:
> On 02/15/2012 09:21 AM, Georg-Johann Lay wrote:
>> This is a question on SUBREGs generated by lower-subreg.c and whether the
>> register allocator is supposed to handle them efficiently.
>>
>> Consider the following small function, compiled for AVR.
>> Remember that AVR is an 8-bit machine with int = HImode and
>> UNITS_PER_WORD = 1.
>>
>> int add (int val)
>> {
>>      return val + 1;
>> }
>>
>> The addition can be performed in one insn; val and the return value are
>> both passed in HI:24, as you can see in the .ira dump:
>>
>>
>> (insn 6 3 19 2 (parallel [
>>              (set (reg:HI 45)
>>                  (plus:HI (reg:HI 24 r24 [ val ])
>>                      (const_int 1 [0x1])))
>>              (clobber (scratch:QI))
>>          ]) add.c:3 42 {addhi3_clobber}
>>       (expr_list:REG_DEAD (reg:HI 24 r24 [ val ])
>>          (nil)))
>>
>> (insn 19 6 20 2 (set (reg:QI 24 r24)
>>          (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}
>>       (nil))
>>
>> (insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
>>          (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
>>       (expr_list:REG_DEAD (reg:HI 45)
>>          (nil)))
>>
>> (insn 14 20 0 2 (use (reg/i:HI 24 r24)) add.c:4 -1
>>       (nil))
>>
>> IRA writes:
>>
>>        Pushing a0(r45,l0)(cost 0)
>>        Popping a0(r45,l0)  -- assign reg 18
>> Disposition:
>>      0:r45  l0    18
>>
>> i.e. it assigns pseudo HI:45 to hard register HI:18 and thus causes
>> inefficient code because values are moved around without need.
>>
>> .reload generates additional move insns to satisfy the constraints of
>> addhi3, which are basically "=r, %0, rn", i.e. addition is a 2-operand
>> insn where op0 and op1 must be in the same hard register:
>>
>> (insn 23 3 6 2 (set (reg:HI 18 r18 [45])
>>          (reg:HI 24 r24 [ val ])) add.c:3 22 {*movhi}
>>       (nil))
>>
>> (insn 6 23 19 2 (parallel [
>>              (set (reg:HI 18 r18 [45])
>>                  (plus:HI (reg:HI 18 r18 [45])
>>                      (const_int 1 [0x1])))
>>              (clobber (scratch:QI))
>>          ]) add.c:3 42 {addhi3_clobber}
>>       (nil))
>>
>> (insn 19 6 20 2 (set (reg:QI 24 r24)
>>          (reg:QI 18 r18 [45])) add.c:4 18 {movqi_insn}
>>       (nil))
>>
>> (insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
>>          (reg:QI 19 r19 [+1 ])) add.c:4 18 {movqi_insn}
>>       (nil))
>>
>>
>> However, the machine could just as well do the addition in HI:24
>> directly like so:
>>
>>
>> (parallel [(set (reg:HI 24 r24)
>>                  (plus:HI (reg:HI 24)
>>                           (const_int 1)))
>>             (clobber (scratch:QI))])  {addhi3_clobber}
>>
>>
>> Question: Is IRA supposed to detect SUBREGs like the ones above and avoid
>> code bloat?
>> Sequences like
>>
>>
>> (insn 19 6 20 2 (set (reg:QI 24 r24)
>>          (subreg:QI (reg:HI 45) 0)) add.c:4 18 {movqi_insn}
>>       (nil))
>>
>> (insn 20 19 14 2 (set (reg:QI 25 r25 [+1 ])
>>          (subreg:QI (reg:HI 45) 1)) add.c:4 18 {movqi_insn}
>>       (expr_list:REG_DEAD (reg:HI 45)
>>          (nil)))
>>
>> obviously create some kind of early-clobber situation for IRA that
>> prevents HI:45 from being allocated to HI:24.
>>
>> Is IRA a school book implementation that does not know anything about
>> SUBREGs?
> 
> No, it is not a school book implementation.
> 
>> Or should IRA be smart enough to detect and allocate SUBREGs
>> efficiently by some "subreg fusion" mechanism?
> 
> No, it is not smart enough.
> 
> IRA deals well with subregs of multi-register pseudos but not with
> subregs of one-register pseudos.
> 
> By the way, the old register allocator did not deal with subregs at all.
>> The code above is just a small example to show the problem, but the
>> issue also
>> occurs with more complex code and not only for return and parameter
>> registers.
> 
> Thanks for reporting this.  
> I might work on this.  But I don't know when I can start.
> This platform is not on my high priority list.


Thanks for improving the situation.

It's already good news to hear it is feasible and not a
"no-go because too much rework in register allocator".

Filed it as PR52278 for reference:

http://gcc.gnu.org/PR52278

Johann
