https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706
--- Comment #5 from Georg-Johann Lay <gjl at gcc dot gnu.org> ---
Created attachment 47173
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47173&action=edit
bloat.c: A trivial test case demonstrating the problem.

A (small) part of the overhead can be worked around with
-fsplit-wide-types-early, but the major problem is from the register
allocator, ira specifically.

Compile:

$ avr-gcc -S bloat.c -Os -mmcu=atmega128 -dp -da -fsplit-wide-types-early

Generated code:

call:
        push r28                 ;  17  [c=4 l=1]  pushqi1/0
        push r29                 ;  18  [c=4 l=1]  pushqi1/0
         ; SP -= 4               ;  22  [c=4 l=2]  *addhi3_sp
        rcall .
        rcall .
        in r28,__SP_L__          ;  23  [c=4 l=2]  *movhi/7
        in r29,__SP_H__
/* prologue: function */
/* frame size = 4 */
/* stack size = 6 */
.L__stack_usage = 6
        std Y+1,r22              ;  14  [c=4 l=4]  *movsf/3
        std Y+2,r23
        std Y+3,r24
        std Y+4,r25
/* epilogue start */
         ; SP += 4               ;  34  [c=4 l=4]  *addhi3_sp
        pop __tmp_reg__
        pop __tmp_reg__
        pop __tmp_reg__
        pop __tmp_reg__
        pop r29                  ;  35  [c=4 l=1]  popqi
        pop r28                  ;  36  [c=4 l=1]  popqi
        jmp func                 ;  7   [c=24 l=2]  call_value_insn/3

Optimal code:

call:
        jmp func

The problem is that IRA concludes that register moves are always more
expensive than memory moves, i.e. whatever costs you assign to
TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST, memory will *always*
win.

From bloat.c.278r.ira:

Pass 0 for finding pseudo/allocno costs

    a1 (r44,l0) best NO_REGS, allocno NO_REGS
    a0 (r43,l0) best NO_REGS, allocno NO_REGS

  a0(r43,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000
  a1(r44,l0) costs: ADDW_REGS:32000 SIMPLE_LD_REGS:32000 LD_REGS:32000
NO_LD_REGS:32000 GENERAL_REGS:32000 MEM:9000

== configure ==

--target=avr --disable-shared --disable-nls --with-dwarf2
--enable-target-optspace=yes --with-gnu-as --with-gnu-ld
--enable-checking=release --enable-languages=c,c++
