Re: Code bloat due to silly IRA cost model?

Georg-Johann Lay Fri, 13 Dec 2019 03:59:19 -0800

On 11.12.19 at 18:55, Richard Sandiford wrote:

Georg-Johann Lay <g...@gcc.gnu.org> writes:

Hi, doesn't anybody actually know how to make memory more expensive
than registers when it comes to allocating registers?


Whatever I am trying for TARGET_MEMORY_MOVE_COST and
TARGET_REGISTER_MOVE_COST, ira-costs.c always makes registers more
expensive than mem and therefore allocates values to stack slots instead
of keeping them in registers.

Test case (for avr) is as simple as it gets:

float func (float);

float call (float f)
{
      return func (f);
}

What am I missing?

Johann


Georg-Johann Lay wrote:

Hi,

I am trying to track down a code bloat issue and am stuck because I do
not understand IRA's cost model.

The test case is as simple as it gets:

float func (float);

float call (float f)
{
     return func (f);
}

IRA dump shows the following insns:


(insn 14 4 2 2 (set (reg:SF 44)
         (reg:SF 22 r22 [ f ])) "bloat.c":4:1 85 {*movsf}
      (expr_list:REG_DEAD (reg:SF 22 r22 [ f ])
         (nil)))
(insn 2 14 3 2 (set (reg/v:SF 43 [ f ])
         (reg:SF 44)) "bloat.c":4:1 85 {*movsf}
      (expr_list:REG_DEAD (reg:SF 44)
         (nil)))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SF 22 r22)
         (reg/v:SF 43 [ f ])) "bloat.c":5:12 85 {*movsf}
      (expr_list:REG_DEAD (reg/v:SF 43 [ f ])
         (nil)))
(call_insn/j 7 6 8 2 (parallel [

#14 sets pseudo 44 from arg register R22.
#2 moves it to pseudo 43
#6 moves it to R22 as it prepares for call_insn #7.

There are 2 allocnos and cost:

Pass 0 for finding pseudo/allocno costs

     a1 (r44,l0) best NO_REGS, allocno NO_REGS
     a0 (r43,l0) best NO_REGS, allocno NO_REGS

   a0(r43,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000
   a1(r44,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000

which is quite odd, because on AVR memory is in reality way more
expensive than any register.
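
The "best NO_REGS" lines in the dump are consistent with a selection that
simply takes the cheapest entry, memory included. A minimal sketch,
assuming that simplified rule; the struct, the function name and the -1
convention are hypothetical and not taken from ira-costs.c:

```c
#include <stddef.h>

/* Hypothetical record: one entry of the cost dump (class name plus cost). */
struct class_cost {
    const char *name;
    int cost;
};

/* Return the index of the cheapest register class, or -1 (standing for
   NO_REGS) when the memory cost is not beaten by any class.  This is an
   assumed simplification, not the actual code in ira-costs.c.  */
static int
best_class (const struct class_cost *classes, size_t n, int mem_cost)
{
    int best = -1;
    int best_cost = mem_cost;

    for (size_t i = 0; i < n; i++)
        if (classes[i].cost < best_cost) {
            best = (int) i;
            best_cost = classes[i].cost;
        }
    return best;
}

/* Costs copied from the dump above for a0 (r43). */
static const struct class_cost a0_costs[] = {
    { "ADDW_REGS", 32000 },  { "SIMPLE_LD_REGS", 32000 },
    { "LD_REGS", 32000 },    { "NO_LD_REGS", 32000 },
    { "GENERAL_REGS", 32000 },
};
```

With MEM:9000 as in the dump, best_class (a0_costs, 5, 9000) returns -1,
i.e. NO_REGS; a register class would only win under this rule once its
cost dropped below the memory cost.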

Okay, so let's boost the MEM cost (TARGET_MEMORY_MOVE_COST) by a factor
of 100:

     a1 (r44,l0) best NO_REGS, allocno NO_REGS
     a0 (r43,l0) best NO_REGS, allocno NO_REGS

   a0(r43,l0) costs: ADDW_REGS:3200000 SIMPLE_LD_REGS:3200000
LD_REGS:3200000 NO_LD_REGS:3200000 GENERAL_REGS:3200000 MEM:801000
   a1(r44,l0) costs: ADDW_REGS:3200000 SIMPLE_LD_REGS:3200000
LD_REGS:3200000 NO_LD_REGS:3200000 GENERAL_REGS:3200000 MEM:801000

What??? The REG costs are 100 times higher, and still higher than the
MEM costs.  What the heck is going on?

Setting TARGET_REGISTER_MOVE_COST and also TARGET_MEMORY_MOVE_COST to 0
yields:

   a0(r43,l0) costs: ADDW_REGS:0 SIMPLE_LD_REGS:0 LD_REGS:0 NO_LD_REGS:0
GENERAL_REGS:0 MEM:0
   a1(r44,l0) costs: ADDW_REGS:0 SIMPLE_LD_REGS:0 LD_REGS:0 NO_LD_REGS:0
GENERAL_REGS:0 MEM:0

as expected, i.e. there is no other hidden source of costs considered by
IRA.  And even TARGET_REGISTER_MOVE_COST = 0  and
TARGET_MEMORY_MOVE_COST = original gives:

   a0(r43,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000
   a1(r44,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000

How the heck do I tell ira-costs that registers are way cheaper than MEM?


I think this is coming from:

   /* FIXME: Ideally, the following test is not needed.
         However, it turned out that it can reduce the number
         of spill fails.  AVR and it's poor endowment with
         address registers is extreme stress test for reload.  */

   if (GET_MODE_SIZE (mode) >= 4
       && regno >= REG_X)
     return false;


This was introduced to "fix" an "unable to find a register to spill" ICE.

What I do not understand is that the code with long (which is SImode on
avr) is fine:


long lunc (long);

long callL (long f)
{
    return lunc (f);
}

callL:
        rjmp lunc        ;  7   [c=24 l=1]  call_value_insn/3

in avr_hard_regno_mode_ok.  This forbids SFmode in r26+ and means that
moves between pointer registers and general registers have the highest
possible cost (65535) to prevent them from being used for SFmode.  So:

    ira_register_move_cost[SFmode][POINTER_REGS][GENERAL_REGS] = 65535;

The costs for union classes are the maximum (worst-case) cost of
each subclass, so this means that:

    ira_register_move_cost[SFmode][GENERAL_REGS][GENERAL_REGS] = 65535;

as well.
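
The worst-case rule described above can be sketched as follows. This is a
simplified, hypothetical model and not GCC's data structures: the function
and array names are invented for illustration, and the value 4 is an
arbitrary "cheap" cost assumed for r0-r25. Only 65535 comes from the text.

```c
/* Hypothetical, simplified model of the worst-case rule; the names are
   illustrative and do not exist in GCC.  */
enum { FORBIDDEN_COST = 65535 };

/* Cost of a union class: the maximum over the costs of its subclasses. */
static int
union_class_cost (const int *subclass_costs, int n)
{
    int worst = 0;

    for (int i = 0; i < n; i++)
        if (subclass_costs[i] > worst)
            worst = subclass_costs[i];
    return worst;
}

/* Assumed SFmode move costs.  4 is an arbitrary cheap value for r0-r25;
   65535 is the value quoted above for POINTER_REGS (r26 and up).  */
static const int general_regs_sf[] = { 4, FORBIDDEN_COST }; /* r0-r25 + r26+ */
static const int low_regs_sf[]     = { 4 };                 /* r0-r25 only  */
```

Under these assumptions, union_class_cost (general_regs_sf, 2) is 65535,
while union_class_cost (low_regs_sf, 1) stays at 4: a single forbidden
subclass is enough to push the whole union to the maximum, whereas a class
that excludes it keeps the cheap cost.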

This means that, when there is an expensive class (because it only
contains one register for example), then it will blow the cost of
GENERAL_REGS to crazy values no matter what?

What's also strange is that the register allocator would not need to
allocate a register at all: The incoming parameter comes in SI:22 and
is just passed through to the callee, which also receives the value
in SI:22. Why would one move that value to memory? Even if memory was
cheaper, moving the value to mem just to load it again to the same
register is not very sensible... because in almost any case, /no/
instruction is cheaper than /some/ instructions?

Removing the code above fixes it.  If you don't want to do that, an
alternative might be to add a class for r0-r25 (but I've not tested that).


Is there a way that it would use a similar path like SImode?


Thanks,
Richard


Johann


p.s.

test case compiled with

$ avr-gcc bloat.c -S -Os -dp -da -fsplit-wide-types-early -v

Target: avr
Configured with: ../../gcc.gnu.org/trunk/configure --target=avr
--prefix=/local/gnu/install/gcc-10 --disable-shared --disable-nls
--with-dwarf2 --enable-target-optspace=yes --with-gnu-as --with-gnu-ld
--enable-checking=release --enable-languages=c,c++ --disable-gcov
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 10.0.0 20191021 (experimental) (GCC)
