https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103500

            Bug ID: 103500
           Summary: Stack slots for overaligned stack temporaries are not
                    properly aligned
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

gcc/testsuite/gcc.target/aarch64/aapcs64/rec_align-8.c has the following struct
declaration:

/* The alignment also gives this size 32, so will be passed by reference.  */
typedef struct __attribute__ ((__aligned__ (32)))
  {
    long x;
    long y;
  } overaligned;

and as the comment suggests, AAPCS64 requires that the struct is passed
by reference. The test proceeds to check that the copy of the passed
struct is 32-byte aligned (as required by the PCS), with:

  long addr = ((long) &x1) & 31;
  if (addr != 0)
    {
      __builtin_printf ("Alignment was %d\n", addr);
      abort ();
    }

but because GCC "knows" the struct is aligned, the expression assigned
to addr is folded to zero by the frontend (even at -O0). With
-fdump-tree-original I see:

long int addr = 0;

Moreover, it turns out that GCC is not actually aligning the struct copy
properly in the call here. Consider the simplified testcase:

typedef struct __attribute__((aligned(32))) {
  long x,y;
} S;
S x;
void f(S);
void g(void) { f(x); }

for which we currently generate (at -O2):

g:
        adrp    x1, .LANCHOR0
        add     x1, x1, :lo12:.LANCHOR0
        stp     x29, x30, [sp, -48]!
        mov     x29, sp
        ldp     q0, q1, [x1]
        add     x0, sp, 16
        stp     q0, q1, [sp, 16]
        bl      f
        ldp     x29, x30, [sp], 48
        ret

i.e. the struct copy is stored at sp + 16, but the stack pointer is only
guaranteed to be 16-byte aligned, so the stack slot here is only
guaranteed to be 16-byte aligned, not 32-byte aligned as the PCS requires.
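
To make the arithmetic concrete, here is a minimal sketch (not part of the
testcase) that computes the slot's residue modulo 32 for a given stack
pointer value. The helper name slot_misalignment and the sample sp values
are hypothetical; only the offset of 16 comes from the generated code for g.

```c
#include <stdint.h>

/* Offset of the outgoing struct copy from sp in the -O2 code for g. */
enum { SLOT_OFFSET = 16 };

/* Hypothetical helper: how far the slot at sp + 16 is from a 32-byte
   boundary, for an assumed (16-byte aligned) stack pointer value.  */
static unsigned slot_misalignment(uintptr_t sp)
{
    return (unsigned)((sp + SLOT_OFFSET) & 31);
}
```

For a 16-byte aligned sp that also happens to be 32-byte aligned the result
is 16 (misaligned); for one that is 16 modulo 32 the result is 0. Whether
the slot ends up aligned therefore depends on the incoming stack pointer.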

In fact, if the testcase (rec_align-8.c) is tweaked to __builtin_snprintf
the pointer into a buffer and __builtin_sscanf it back out again before
performing the alignment check (to prevent the folding by the frontend),
the execution test fails (sporadically, if ASLR is enabled) on aarch64
linux.
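
A rough sketch of that kind of tweak follows. It is not the exact change
made to rec_align-8.c: the helper names launder_ptr and misalignment are
hypothetical, and it uses the plain snprintf/sscanf library functions
rather than the __builtin_ variants. It also assumes that the "%p" text
form round-trips through sscanf, as the C standard specifies within a
single program execution.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: round-trip a pointer through its "%p" text form so
   that the compiler can no longer see where the value came from and fold
   an alignment check on it at compile time.  */
static void *launder_ptr(void *p)
{
    char buf[32];
    void *out = NULL;

    int n = snprintf(buf, sizeof buf, "%p", p);
    assert(n > 0 && (size_t)n < sizeof buf);

    int matched = sscanf(buf, "%p", &out);
    assert(matched == 1);
    return out;
}

/* Residue of the laundered address modulo align (a power of two).  */
static unsigned long misalignment(void *p, unsigned long align)
{
    return (unsigned long)(uintptr_t)launder_ptr(p) & (align - 1);
}
```

In the test, the check would then read something like
misalignment (&x1, 32) != 0 in place of the folded ((long) &x1) & 31.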

Note that for the related:

void f2(S*);
void g2(void) {
    S x;
    f2(&x);
}

we generate:

g2:
        stp     x29, x30, [sp, -64]!
        add     x0, sp, 47
        mov     x29, sp
        and     x0, x0, -32
        bl      f2
        ldp     x29, x30, [sp], 64
        ret

i.e. we actually align the stack slot properly. We should do the same
for the PCS-mandated passed-by-reference struct.
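
For reference, the add/and pair in g2 is the usual round-up idiom. A small
sketch of the address it computes (the function name g2_slot_address is
hypothetical, and sp stands for the stack pointer after the stp has
allocated the 64-byte frame):

```c
#include <stdint.h>

/* Mirrors "add x0, sp, 47; and x0, x0, -32": round sp + 16 up to the
   next 32-byte boundary.  sp is an assumed 16-byte aligned value.  */
static uintptr_t g2_slot_address(uintptr_t sp)
{
    return (sp + 47) & ~(uintptr_t)31;
}
```

For any 16-byte aligned sp the result is 32-byte aligned, lies at or above
sp + 16 (past the saved x29/x30 pair), and leaves room for the 32-byte
struct within the 64-byte frame.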

I have a patch to fix the issue in the mid-end which I will post to the
list shortly to get some feedback.
