https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122227
Bug ID: 122227
Summary: Storing to volatile array creates spurious loads.
Product: gcc
Version: 15.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: trufflenose at proton dot me
Target Milestone: ---
When optimizations are enabled, on architectures without fast unaligned
load/store instructions, storing to a constant offset within a volatile
array whose alignment MIGHT be insufficient for the store causes gcc to
split the store into byte accesses and emit a spurious load before each
byte store.
For example, on ARM with -O2 -march=armv4 -fno-PIC, the following code:
extern volatile unsigned char data[8];
void test_write_a(void)
{
    *(volatile unsigned int*)data = 0;
}
void test_write_b(void)
{
    *(volatile unsigned int*)__builtin_assume_aligned((void*)data, 4) = 0;
}
...produces assembly like:
(...)
ldrb r1, [r3]
strb r2, [r3]
ldrb r1, [r3, #1]
strb r2, [r3, #1]
ldrb r1, [r3, #2]
strb r2, [r3, #2]
ldrb r1, [r3, #3]
strb r2, [r3, #3]
(...)
...when it should be more like:
strb r2, [r3]
strb r2, [r3, #1]
strb r2, [r3, #2]
strb r2, [r3, #3]
...or just simply:
str r2, [r3]
See: https://godbolt.org/z/WsbEYa6an
On a related note, splitting a volatile access because its alignment can't
be proven correct at compile time, regardless of the actual alignment when
the machine code is executed, seems very wrong, at least in a practical
sense for embedded system developers. At the very least, simple volatile
memory accesses like this should compile to the same sequence of accesses
whether optimizations are disabled or enabled.
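For comparison, here is a minimal sketch of one way to make the alignment
visible to the compiler, assuming the declaration of the buffer can be
changed. The names `union aligned_data`, `data_u` and `test_write_c` are
hypothetical and not part of the test case above. Whether this avoids the
spurious loads on the affected version is not verified here.

```c
#include <assert.h>  /* static_assert (C11) */

/* Hypothetical replacement for `data`: the unsigned int member gives the
   whole object the alignment required for word access. */
union aligned_data {
    unsigned char bytes[8];
    unsigned int  words[2];
};

/* Always true for a union; stated only to document the intent. */
static_assert(_Alignof(union aligned_data) >= _Alignof(unsigned int),
              "union must be aligned for word stores");

volatile union aligned_data data_u;

/* Hypothetical counterpart to test_write_a(): one volatile word store,
   written through the union member instead of a pointer cast. */
void test_write_c(void)
{
    data_u.words[0] = 0;
}
```

This only sidesteps the problem for objects whose declaration is under the
developer's control. It does not help when the address comes from a linker
script or a vendor header.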