https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119989

            Bug ID: 119989
           Summary: [AVR] Incorrect code generation with __memx pointers
                    when optimization is enabled (-O1 and above) on AVR
                    (ATmega328P)
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gilhad at seznam dot cz
  Target Milestone: ---

Created attachment 61232
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61232&action=edit
simple example

Wrong code is generated for the second statement of:
    DT = *IP;   // DT gets the correct value
    DT = *IP--; // DT gets a wrong value

---

When using __memx pointers and dereferencing them multiple times in a function,
the generated code calls __xload_3 / __xload_4 helper functions, which
internally modify r30 (Z) by using Z+. However, the compiler incorrectly reuses
the (now incremented) r30 without reloading it from the original pointer (IP).

This causes reading from an unintended memory address.

This problem occurs with optimization levels -O1, -O2, -O3, and -Os.
It does not occur with -O0.

Minimal example demonstrating the issue:

#include <avr/pgmspace.h>

const __memx uint32_t some_data[] = {1,2,3,4,5};
const __memx uint32_t * IP;
uint32_t DT,a,b;

void do_test1() {
    DT = *IP;
    DT = *IP--;
}

void do_test2() {
    DT = *IP;
    asm volatile ("" ::: "memory"); // Prevents unwanted optimization
    DT = *IP--;
}

uint32_t difference(void) {
    IP = &some_data[3];
    do_test1();
    a = DT;
    IP = &some_data[3];
    do_test2();
    b = DT;
    return (a - b); // Expected: 0
}

Expected result:
difference() should return 0.

Actual result:
difference() returns a nonzero value (incorrect).

Root cause analysis:
Compiling with:

avr-gcc -mmcu=atmega328p -Os -S demo.c -o demo.Os.s

shows that:

- Functions like __xload_3 and __xload_4 use r30:r31 (Z) and r21 as arguments.
- These functions perform memory reads using ld/lpm with post-increment (Z+),
  modifying r30.
- After returning from __xload_3/__xload_4, r30:r31 no longer points to the
  same memory location.

However, the optimized version of the code does not reload IP into r30:r31, but
incorrectly reuses the already incremented register contents, leading to wrong
memory accesses.
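
The effect described above can be modelled on a host machine. The following
is only a sketch: flash[], z, model_xload_4() and the two read_twice_*()
functions are hypothetical names, not libgcc code, and the model assumes the
helper leaves Z advanced by the full four bytes (the real helper may advance
it by a different amount). It shows only why reusing Z without a reload
yields a different value.

```c
#include <stdint.h>

/* Model of the table {1,2,3,4,5} as little-endian bytes. */
static const uint8_t flash[] = {
    1, 0, 0, 0,  2, 0, 0, 0,  3, 0, 0, 0,  4, 0, 0, 0,  5, 0, 0, 0
};

/* Stand-in for the Z register pair (r30:r31); here just an index into flash[]. */
static uint16_t z;

/* Hypothetical model of a 4-byte load helper that reads through Z with
   post-increment, so Z is left advanced when the helper returns. */
static uint32_t model_xload_4(void)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= (uint32_t)flash[z++] << (8 * i);
    }
    return value;
}

/* Expected sequence: Z is reloaded from the pointer before each dereference. */
uint32_t read_twice_reloading(uint16_t ip)
{
    uint32_t dt;
    z = ip;
    dt = model_xload_4();
    z = ip;                 /* reload Z from the original pointer */
    dt = model_xload_4();
    return dt;
}

/* Reported sequence: the second load reuses Z as the first load left it. */
uint32_t read_twice_stale(uint16_t ip)
{
    uint32_t dt;
    z = ip;
    dt = model_xload_4();
    dt = model_xload_4();   /* no reload: reads from the advanced address */
    return dt;
}
```

With ip = 12 (the byte offset of element 3), the reloading variant returns
the value of element 3, while the stale variant returns whatever follows it.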

Inserting an inline assembly memory clobber (asm volatile ("" ::: "memory");)
between accesses forces the compiler to reload IP correctly, avoiding the
issue.
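
For code that has to work around this until a fix is available, the barrier
can be wrapped in a macro. This is a sketch of the same workaround as in
do_test2, not a verified general fix: COMPILER_BARRIER and
do_test_with_barrier are hypothetical names, and the empty __memx definition
is an assumption that only lets the file build on a non-AVR host.

```c
#include <stdint.h>

#ifndef __AVR__
#define __memx  /* assumption: no address-space qualifier on a host compiler */
#endif

/* Hypothetical name; expands to the same statement used in do_test2. */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")

const __memx uint32_t some_data[] = {1, 2, 3, 4, 5};
const __memx uint32_t *IP;
uint32_t DT;

void do_test_with_barrier(void)
{
    DT = *IP;
    COMPILER_BARRIER();  /* compiler must re-read IP from memory after this */
    DT = *IP--;          /* reads *IP, then steps IP back by one element */
}
```

The barrier emits no instructions; it only stops the compiler from carrying
values cached in registers across it.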

The same happens when IP++ is used instead of IP--.

I found this when a program worked with a call to a debugging function in
place of the asm statement, but failed once that call was removed.

Environment:

    Target: AVR ATmega328P
    Compiler: gcc version 14.2.1 20241221 (Gentoo 14.2.1_p20241221 p7)
