https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90705

            Bug ID: 90705
           Summary: Suboptimal register allocation on ARM when compiling
                    for size
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fredrik.hederstie...@securitas-direct.com
  Target Milestone: ---

Created attachment 46441
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46441&action=edit
test.c

When compiling this simple example for ARM (-mcpu=cortex-m0) with gcc-9.1.0,
the generated code looks OK with -O2, but the register allocation looks odd
with -Os: registers are pushed onto the stack, and the code actually gets
a lot bigger.

Example:

int k;
int test(int i)
{
  int r = 0;
  for (; i >= 0; i--) {
    k = i;
    r += k;
  }
  return r;
}


Compiling with gcc-9.1.0 and -mcpu=cortex-m0, using -O2:

00000000 <test>:
   0:   0003            movs    r3, r0
   2:   2000            movs    r0, #0
   4:   2b00            cmp     r3, #0
   6:   db05            blt.n   14 <test+0x14>
   8:   18c0            adds    r0, r0, r3
   a:   3b01            subs    r3, #1
   c:   d2fc            bcs.n   8 <test+0x8>
   e:   2200            movs    r2, #0
  10:   4b01            ldr     r3, [pc, #4]    ; (18 <test+0x18>)
  12:   601a            str     r2, [r3, #0]
  14:   4770            bx      lr
  16:   46c0            nop                     ; (mov r8, r8)
  18:   00000000        .word   0x00000000


But when compiling with the same compiler using -Os:

00000000 <test>:
   0:   2200            movs    r2, #0
   2:   b530            push    {r4, r5, lr}
   4:   0003            movs    r3, r0
   6:   2501            movs    r5, #1
   8:   0010            movs    r0, r2
   a:   4906            ldr     r1, [pc, #24]   ; (24 <test+0x24>)
   c:   680c            ldr     r4, [r1, #0]
   e:   2b00            cmp     r3, #0
  10:   da03            bge.n   1a <test+0x1a>
  12:   2a00            cmp     r2, #0
  14:   d000            beq.n   18 <test+0x18>
  16:   600c            str     r4, [r1, #0]
  18:   bd30            pop     {r4, r5, pc}
  1a:   001c            movs    r4, r3
  1c:   18c0            adds    r0, r0, r3
  1e:   002a            movs    r2, r5
  20:   3b01            subs    r3, #1
  22:   e7f4            b.n     e <test+0xe>
  24:   00000000        .word   0x00000000

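The -Os version uses two callee-saved registers (r4 and r5) that have to be
pushed to the stack, and the code is significantly larger: 40 bytes versus
28 bytes for -O2, counting the literal pool word in both cases.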
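
For comparison, here is a rough source-level reading of the -O2 listing. It is
a sketch of what the disassembly appears to do, not a statement about GCC's
internals. The names test_sunk and check are hypothetical and were added only
for illustration. Since k is not volatile, the per-iteration store can be
replaced by a single store of 0 after the loop, and no store is needed when
the loop body never runs.

```c
#include <assert.h>

int k;

/* Original function from the report. */
int test(int i)
{
  int r = 0;
  for (; i >= 0; i--) {
    k = i;
    r += k;
  }
  return r;
}

/* Hypothetical hand-written equivalent of the -O2 listing:
   the store to k is moved out of the loop. */
int test_sunk(int i)
{
  int r = 0;
  if (i < 0)
    return r;            /* loop never runs, k is left untouched */
  do {
    r += i;              /* adds r0, r0, r3 */
  } while (i-- != 0);    /* subs r3, #1 ; bcs */
  k = 0;                 /* last value written by the original loop */
  return r;
}

/* Hypothetical helper: both versions must agree on the return value
   and on the final value of k. */
void check(int i)
{
  int r1, k1, r2, k2;
  k = 123; r1 = test(i);      k1 = k;
  k = 123; r2 = test_sunk(i); k2 = k;
  assert(r1 == r2);
  assert(k1 == k2);
}
```

The -Os listing seems to perform the same store sinking, but it carries the
pending value of k (r4) and a "store needed" flag (r2, set from r5) through
the loop, which looks like the source of the extra register pressure.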
