https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115947
Bug ID: 115947
Summary: ARM Thumb2 baremetal poor optimization
Product: gcc
Version: 13.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: seedofonan at yahoo dot com
Target Milestone: ---

Created attachment 58676
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58676&action=edit
Minimized example of the bugs

A (long enough) sequence of __asm() store statements of various constants to
constant addresses begins to produce the following perverse sequence of
instructions:

    mov.w   ip, #68
    mov     r5, ip
    str     r5, [r3, #32]

and also:

    ldr.w   ip, [pc, #84]
    mov     r5, ip
    str     r5, [r3, #36]

The value to be stored should be loaded directly into r5, not first into ip
(or lr) only to be moved to r5.

A minimized example is attached. The assembly listing for it is in the
subfolder bug1. This started happening in version 4.9.

In the subfolder bug2 is the same code, but performing these same stores
without the __asm() workaround (see bug2/GpioInitProb.c, where the macros STR,
STRH, and STRB are adjusted).

"Workaround for what?" you ask? Well, GCC (and LLVM 18.1.8, too, fwiw)
generates significantly less optimal assembly when it has the freedom to
change the small constant offsets (the middle macro parameter). This has been
a problem since I first started working with Cortex-M on GCC version 4.7 (and
LLVM version 3.4). Notice that, if bug1 is fixed, its code will be 20
instruction bytes shorter, but bug2/GpioInitProb.c will remain at 224 bytes.

BTW, I have a number of cases where I have to use this kind of workaround to
prevent the compiler from "optimizing" things stupidly. If there's any
interest in fixing bug2, I'll dust off the others and submit them, too. But
I'm not expecting it.

Basically, these problems stem from the way that a C program, even one that
has a trivial translation to what would be optimal assembly, first gets
de-optimized into various normal forms, after which the compiler cannot
rediscover anything as efficient.

But I hope someone can fix bug1.

Thank you,
-Gary Fuehrer
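
For readers without the attachment, the two store styles contrasted in this report might look roughly like the sketch below. It is not the attached code. The macro bodies, the STORE_VIA_ASM switch, the function name gpio_init_sketch, and every offset and value other than #68 at offset 32 are assumptions made for illustration. The __asm() variant pins the small offset as an immediate operand (bug1 style), while the plain variant leaves the compiler free to pick its own base and offset (bug2 style).

```c
#include <stdint.h>

/* Hypothetical stand-ins for the STR/STRH/STRB macros named in the report.
   The middle parameter is the small constant offset from the base address. */
#if defined(__thumb2__) && defined(STORE_VIA_ASM)
/* bug1 style (assumed shape): the offset is passed as an "i" immediate, so
   the compiler cannot fold it into a different base register. */
#define STORE_(insn, base, off, val)                                      \
    do {                                                                  \
        __asm volatile (insn " %1, [%0, %2]"                              \
                        : /* no outputs */                                \
                        : "r"(base), "r"((uint32_t)(val)), "i"(off)       \
                        : "memory");                                      \
    } while (0)
#define STR(base, off, val)   STORE_("str",  base, off, val)
#define STRH(base, off, val)  STORE_("strh", base, off, val)
#define STRB(base, off, val)  STORE_("strb", base, off, val)
#else
/* bug2 style (assumed shape): ordinary volatile stores; the compiler is free
   to choose how the address is formed. Also used on non-ARM hosts. */
#define STORE_(type, base, off, val)                                      \
    do {                                                                  \
        *(volatile type *)((uintptr_t)(base) + (off)) = (type)(val);      \
    } while (0)
#define STR(base, off, val)   STORE_(uint32_t, base, off, val)
#define STRH(base, off, val)  STORE_(uint16_t, base, off, val)
#define STRB(base, off, val)  STORE_(uint8_t,  base, off, val)
#endif

/* A short run of constant stores to constant offsets from one base.
   Only the first store (68 at offset 32) mirrors the listing in the report;
   the rest are made-up values. */
void gpio_init_sketch(void *base)
{
    STR (base, 32, 68);            /* small constant, fits a mov immediate */
    STR (base, 36, 0x12345678u);   /* wide constant, typically a literal load */
    STRH(base, 40, 0x1234);
    STRB(base, 42, 0x5A);
}
```

Building this for a Thumb-2 target once with -DSTORE_VIA_ASM and once without, then comparing the listings, should show whether a given compiler version exhibits the extra register move or the offset rewriting; the attachment remains the authoritative reproducer.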