https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119989
Bug ID: 119989
Summary: [AVR] Incorrect code generation with __memx pointers when optimization is enabled (-O1 and above) on AVR (ATmega328P)
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: gilhad at seznam dot cz
Target Milestone: ---

Created attachment 61232
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61232&action=edit
simple example wrong optimisation of:
DT = *IP;   // DT gets the correct value
DT = *IP--; // DT gets the wrong value

---

When a __memx pointer is dereferenced more than once in a function, the generated code calls the __xload_3 / __xload_4 helper functions, which internally modify r30 (Z) by using Z+. However, the compiler incorrectly reuses the (now incremented) r30 without reloading it from the original pointer (IP). This causes a read from an unintended memory address.

The problem occurs at optimization levels -O1, -O2, -O3, and -Os. It does not occur at -O0.

Minimal example demonstrating the issue:

#include <avr/pgmspace.h>

const __memx uint32_t some_data[] = {1,2,3,4,5};
const __memx uint32_t * IP;
uint32_t DT,a,b;

void do_test1() {
    DT = *IP;
    DT = *IP--;
}

void do_test2() {
    DT = *IP;
    asm volatile ("" ::: "memory"); // Prevents unwanted optimization
    DT = *IP--;
}

uint32_t difference(void) {
    IP = &some_data[3];
    do_test1();
    a = DT;
    IP = &some_data[3];
    do_test2();
    b = DT;
    return (a - b); // Expected: 0
}

Expected result: difference() should return 0.
Actual result: difference() returns a nonzero value (incorrect).

Root cause analysis:

Compiling with:

avr-gcc -mmcu=atmega328p -Os -S demo.c -o demo.Os.s

shows that:
- Functions like __xload_3 and __xload_4 use r30:r31 (Z) and r21 as arguments.
- These functions perform memory reads using ld/lpm with post-increment (Z+), modifying r30.
- After returning from __xload_3/__xload_4, r30:r31 no longer points to the same memory location.

However, the optimized code does not reload IP into r30:r31. It reuses the already incremented register contents, which leads to wrong memory accesses.

Inserting an inline assembly memory clobber (asm volatile ("" ::: "memory");) between the accesses forces the compiler to reload IP correctly, which avoids the issue.

The same happens when IP++ is used instead of IP--.

I found this when a program worked with a debugging function call in place of the asm statement, but failed without it.

Environment:
Target: AVR ATmega328P
Compiler: gcc version 14.2.1 20241221 (Gentoo 14.2.1_p20241221 p7)
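
For readers without an AVR toolchain, the effect described above can be imitated on a host machine. The following is only a rough model in plain C, not AVR code and not what avr-gcc emits: model_mem, model_z, model_ip, model_xload_4, model_test_stale, model_test_reload and model_difference are all hypothetical names invented for this sketch, and it assumes the load helper leaves Z advanced by the four bytes it read.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model only: none of these names exist in avr-gcc or libgcc. */

/* Byte image of some_data[] = {1,2,3,4,5}, little-endian as on AVR. */
static const uint8_t model_mem[20] = {
    1, 0, 0, 0,  2, 0, 0, 0,  3, 0, 0, 0,  4, 0, 0, 0,  5, 0, 0, 0
};

static size_t model_z;  /* stands in for r30:r31 (Z), as a byte offset */
static size_t model_ip; /* stands in for the global IP, as a byte offset */

/* Models a 4-byte load through Z with post-increment.
 * ASSUMPTION: Z is left advanced by 4 when the helper returns. */
static uint32_t model_xload_4(void)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v |= (uint32_t)model_mem[model_z++] << (8 * i);
    }
    return v;
}

/* Behaviour described in the report: Z is set once from IP and then
 * reused for the second load even though the helper has changed it. */
uint32_t model_test_stale(void)
{
    uint32_t dt;
    model_ip = 12;          /* IP = &some_data[3] */
    model_z = model_ip;
    dt = model_xload_4();   /* DT = *IP; */
    dt = model_xload_4();   /* DT = *IP--; but Z was not reloaded */
    model_ip -= 4;
    return dt;
}

/* Expected behaviour: Z is reloaded from IP before every load. */
uint32_t model_test_reload(void)
{
    uint32_t dt;
    model_ip = 12;          /* IP = &some_data[3] */
    model_z = model_ip;
    dt = model_xload_4();   /* DT = *IP; */
    model_z = model_ip;     /* reload Z from IP */
    dt = model_xload_4();   /* DT = *IP--; */
    model_ip -= 4;
    return dt;
}

/* Mirrors difference() from the report; nonzero means the loads disagree. */
uint32_t model_difference(void)
{
    return model_test_stale() - model_test_reload();
}
```

In this model the reloading variant reads some_data[3] both times, while the stale variant reads the following element on the second load, so model_difference() is nonzero. The exact value returned on real hardware depends on what the generated code and the libgcc helper actually do, which this sketch does not claim to reproduce.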