https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124697

--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 31 Mar 2026, hjl.tools at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124697
> 
> --- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
> (In reply to Richard Biener from comment #4)
> > (In reply to H.J. Lu from comment #3)
> > > [hjl@gnu-tgl-3 pr124697]$ cat foo.c
> > > typedef double v4df __attribute__((vector_size(32)));
> > > typedef double v2df __attribute__((vector_size(16)));
> > > typedef struct {
> > >   v2df a[2];
> > > } c __attribute__((aligned(32)));
> > > extern v4df d;
> > > void
> > > e (float a1, float a2, float a3, float a4, float a5, float a6, c f)
> > > {
> > >   d = *(v4df *) &f;
> > > }
> > > [hjl@gnu-tgl-3 pr124697]$ make foo.s
> > > /export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
> > > -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ 
> > > -O2
> > > -march=x86-64-v4 -S foo.c
> > > [hjl@gnu-tgl-3 pr124697]$ cat foo.s
> > >   .file   "foo.c"
> > >   .text
> > >   .p2align 4
> > >   .globl  e
> > >   .type   e, @function
> > > e:
> > > .LFB0:
> > >   .cfi_startproc
> > >   pushq   %rbp
> > >   .cfi_def_cfa_offset 16
> > >   .cfi_offset 6, -16
> > >   movq    %rsp, %rbp
> > >   .cfi_def_cfa_register 6
> > >   vmovapd 16(%rbp), %ymm0  <<<<<<< f is aligned at 16 bytes.
> > 
> > Yes.  This is wrong code.  My patch would have fixed it, doing
> > effectively (but restricted to x86 at this point)
> > 
> > diff --git a/gcc/function.cc b/gcc/function.cc
> > index 46c0d8b54c2..d44815afc16 100644
> > --- a/gcc/function.cc
> > +++ b/gcc/function.cc
> > @@ -2840,7 +2840,7 @@ assign_parm_adjust_stack_rtl (tree parm, struct
> > assign_parm_data_one *data)
> >                                                  MEM_ALIGN (stack_parm))))
> >           || (data->nominal_type
> >               && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
> > -             && (MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY
> > +             && (MEM_ALIGN (stack_parm) < BIGGEST_ALIGNMENT
> 
> What happens if BIGGEST_ALIGNMENT > PREFERRED_STACK_BOUNDARY and
> BIGGEST_ALIGNMENT > MAX_SUPPORTED_STACK_ALIGNMENT?

The latter would be an unsupported config.  But what _actually_
happens, as you can see on aarch64, is that we then allocate
an aligned stack slot not by re-aligning the stack pointer but
with alloca-like code: we round up the size and then use the
aligned portion of the slot.
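
To illustrate the alloca-like approach, here is a minimal C sketch.  It
is not GCC code: the names round_up, aligned_portion and demo are made
up for this example, and the 32-byte figure is only assumed to match
the v4df case above.  The idea is to reserve size + align - 1 bytes and
then use the first suitably aligned address inside that buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round ADDR up to the next multiple of ALIGN.  ALIGN must be a
   power of two.  (Hypothetical helper, not a GCC function.)  */
static uintptr_t
round_up (uintptr_t addr, uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

/* Return the first ALIGN-aligned address inside RAW.  The caller must
   have reserved at least size + ALIGN - 1 bytes so that the aligned
   portion still holds the whole object.  */
static void *
aligned_portion (void *raw, size_t align)
{
  return (void *) round_up ((uintptr_t) raw, align);
}

/* Copy a 32-byte object from SRC, which may be only 16-byte aligned,
   into a 32-byte aligned slot carved out of an over-sized local
   buffer.  Returns 1 if the slot is aligned and holds the same bytes
   as SRC.  */
int
demo (const void *src)
{
  unsigned char raw[32 + 32 - 1];   /* size rounded up by align - 1 */
  void *slot = aligned_portion (raw, 32);

  memcpy (slot, src, 32);
  return (uintptr_t) slot % 32 == 0 && memcmp (slot, src, 32) == 0;
}
```

The stack pointer itself is never re-aligned here; the extra
align - 1 bytes are the price for that.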

IIRC only x86 can do re-alignment of the stack pointer at entry/exit.
