https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115947

            Bug ID: 115947
           Summary: ARM Thumb2 baremetal poor optimization
           Product: gcc
           Version: 13.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: seedofonan at yahoo dot com
  Target Milestone: ---

Created attachment 58676
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58676&action=edit
Minimized example of the bugs

A long enough sequence of __asm() store statements that write various
constants to constant addresses starts to produce the following perverse
instruction sequence:

  mov.w ip, #68
  mov r5, ip
  str r5, [r3, #32]

 and also:

  ldr.w ip, [pc, #84]
  mov r5, ip
  str r5, [r3, #36]


The value to be stored should be loaded into r5, not first into ip (or lr) only
to be moved to r5.

A minimized example is attached. The assembly listing for it is in the
subfolder bug1. The problem first appeared in GCC 4.9.

In the subfolder bug2 is the same code but performing these same stores without
the __asm() workaround (see bug2/GpioInitProb.c where the macros STR, STRH, and
STRB are adjusted). "Workaround what?" you ask? Well, GCC (and LLVM 18.1.8,
too, fwiw) generates significantly worse assembly when it has the freedom to
change the small constant offsets (the middle macro parameter). This has been
a problem since I first started working with Cortex-M
on GCC version 4.7 (and llvm version 3.4). Notice that, if bug1 is fixed, it
will be 20 instruction bytes shorter, but bug2/GpioInitProb.c will remain at
224 bytes.
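
For readers without the attachment, here is a rough sketch of what the two
macro styles might look like. It is not the code from the attachment: the
macro body, the function name gpio_init_sketch, and the second constant are
all made up for illustration. Only the macro name STR, the offsets 32 and 36,
and the value 68 come from the listing above. The asm variant assumes GCC
extended asm on an ARM/Thumb target; on any other target the sketch falls
back to the plain volatile store, which is roughly the bug2 style.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: STR(base, off, val) stores the 32-bit value `val`
   at byte offset `off` from `base`. The middle parameter is the small
   constant offset discussed above. */
#if defined(__arm__) || defined(__thumb__)
/* bug1 style (assumed form): the offset is an immediate operand of the
   asm statement, so the compiler cannot fold it into another address. */
#define STR(base, off, val)                                          \
    __asm volatile ("str %1, [%0, %2]"                               \
                    : /* no outputs */                               \
                    : "r" (base), "r" ((uint32_t)(val)), "i" (off)   \
                    : "memory")
#else
/* bug2 style, also used as the fallback on non-ARM hosts: a plain
   volatile store whose base and offset the compiler may rearrange. */
#define STR(base, off, val) \
    (*(volatile uint32_t *)((uintptr_t)(base) + (off)) = (uint32_t)(val))
#endif

/* Hypothetical init routine: two stores relative to one base address. */
static void gpio_init_sketch(volatile void *base)
{
    assert(base != 0);
    STR(base, 32, 68);           /* small constant, fits a mov immediate */
    STR(base, 36, 0x12345678u);  /* arbitrary constant, likely a literal-pool load */
}
```

With a real peripheral, base would be the register block's address. The
point is only that each store names its own offset from a shared base, so
the ideal code is one constant load followed by one str per statement.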

BTW, I have a number of cases where I have to use this kind of workaround to
keep the compiler from "optimizing" things stupidly. If there's any
interest in fixing bug2, I'll dust off the others and submit them, too. But I'm
not expecting it. Basically, these problems stem from the way that a C program,
even one that has a trivial translation to what would be optimal assembly,
first gets de-optimized into various normal forms, after which the compiler
cannot rediscover anything as efficient.

But I hope someone can fix bug1.

Thank you,
-Gary Fuehrer
