Christophe Lyon via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > While working on enabling DFP for AArch64, I noticed new failures in > gcc.dg/compat/struct-layout-1.exp (t028) which were not actually > caused by DFP types handling. These tests are generated during 'make > check' and enabling DFP made generation different (not sure if new > non-DFP tests are generated, or if existing ones are generated > differently, the tests in question are huge and difficult to compare). > > Anyway, I reduced the problem to what I attach at the end of the new > gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same > scheme as other va_arg* AArch64 tests. Richard Sandiford further > reduced this to a non-vararg function, added as a second testcase. > > This is a tough case mixing bitfields and alignment, where > aarch64_function_arg_alignment did not follow what its descriptive > comment says: we want to use the natural alignment of the bitfield > type only if the user didn't override the alignment for the bitfield > itself. > > The fix is thus very small, and this patch adds two new tests > (va_arg-17.c and pr105549.c). va_arg-17.c contains the reduced > offending testcase from struct-layout-1.exp for reference. > > We also take the opportunity to fix the comment above > aarch64_function_arg_alignment since the value of the abi_break > parameter was changed in a previous commit, no longer match the > description. > > 2022-06-02 Christophe Lyon <christophe.l...@arm.com> > > gcc/ > PR target/105549 > * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): > Check DECL_USER_ALIGN for bitfield. > > gcc/testsuite/ > PR target/105549 > * gcc.target/aarch64/aapcs64/va_arg-17.c: New. > * gcc.target/aarch64/pr105549.c: New. > > > ############### Attachment also inlined for ease of reply > ############### > > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index > 40fc5e633992036a2c06867857a681792178ef00..2c6ccce7cb5dc32097d24514ee525729efb6b7ff > 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -7262,9 +7262,9 @@ aarch64_vfp_is_call_candidate (cumulative_args_t > pcum_v, machine_mode mode, > /* Given MODE and TYPE of a function argument, return the alignment in > bits. The idea is to suppress any stronger alignment requested by > the user and opt for the natural alignment (specified in AAPCS64 \S > - 4.1). ABI_BREAK is set to true if the alignment was incorrectly > - calculated in versions of GCC prior to GCC-9. This is a helper > - function for local use only. */ > + 4.1). ABI_BREAK is set to the old alignment if the alignment was > + incorrectly calculated in versions of GCC prior to GCC-9. This is > + a helper function for local use only. */ > > static unsigned int > aarch64_function_arg_alignment (machine_mode mode, const_tree type, > @@ -7304,7 +7304,10 @@ aarch64_function_arg_alignment (machine_mode mode, > const_tree type, > "s" contains only one Fundamental Data Type (the int field) > but gains 8-byte alignment and size thanks to "e". */ > alignment = std::max (alignment, DECL_ALIGN (field)); > - if (DECL_BIT_FIELD_TYPE (field)) > + > + /* Take bit-field type's alignment into account only if the > + user didn't override this field's alignment. */ > + if (DECL_BIT_FIELD_TYPE (field) && !DECL_USER_ALIGN (field))
I think we need to check DECL_PACKED instead. On its own, an alignment attribute on the field can only increase alignment, not decrease it. In constrast, the packed attribute effectively forces the alignment to 1 byte, so has an effect even without an alignment attribute. Adding an explicit alignment on top can then increase the alignment from 1 to any value (bigger or smaller than the original underlying type). E.g. for: --------------------------------------------------------------------- typedef unsigned long long ull __attribute__((aligned(ALIGN))); #ifndef EXTRA #define EXTRA unsigned long long x; #endif struct S1 { __attribute__((aligned(1))) ull i : 1; EXTRA }; struct S2 { __attribute__((aligned(2))) ull i : 1; EXTRA }; struct S4 { __attribute__((aligned(4))) ull i : 1; EXTRA }; struct S8 { __attribute__((aligned(8))) ull i : 1; EXTRA }; struct S16 { __attribute__((aligned(16))) ull i : 1; EXTRA }; struct Sp { ull i : 1; EXTRA }__attribute__((packed)); struct S1p { __attribute__((packed, aligned(1))) ull i : 1; EXTRA }; struct S2p { __attribute__((packed, aligned(2))) ull i : 1; EXTRA }; struct S4p { __attribute__((packed, aligned(4))) ull i : 1; EXTRA }; struct S8p { __attribute__((packed, aligned(8))) ull i : 1; EXTRA }; struct S16p { __attribute__((packed, aligned(16))) ull i : 1; EXTRA }; int f1(int a, struct S1 s) { return s.i; } int f2(int a, struct S2 s) { return s.i; } int f4(int a, struct S4 s) { return s.i; } int f8(int a, struct S8 s) { return s.i; } int f16(int a, struct S16 s) { return s.i; } int fp(int a, struct Sp s) { return s.i; } int f1p(int a, struct S1p s) { return s.i; } int f2p(int a, struct S2p s) { return s.i; } int f4p(int a, struct S4p s) { return s.i; } int f8p(int a, struct S8p s) { return s.i; } int f16p(int a, struct S16p s) { return s.i; } --------------------------------------------------------------------- there are at least 4 interesting cases: {-DALIGN=8,-DALIGN=16} x {-DEXTRA=,} >From my reading of the ABI, clang gets all of them right. The case GCC currently gets wrong is: fp f1p f2p f4p f8p for -DALIGN=16 [A] f1 to f16 are equivalent for -DALIGN=16: all the structures have 16-byte alignment, despite the user alignments on the fields. We currently handle those correctly. Unfortunately, fixing [A] means that this is an ABI break from GCC 12, so we'll need (yet another) -Wpsabi warning. In this case I guess we're actually restoring the pre-GCC 9.1 behaviour. An interesting oddity with these rules is that for: --------------------------------------------------------------------- struct S1 { unsigned long long a, b; } __attribute__((aligned(16))); struct S2 { struct S1 s; }; int f1(int a, struct S1 s1) { return s1.a; } int f2(int a, struct S2 s2) { return s2.s.a; } --------------------------------------------------------------------- S1 is not treated as 16-byte aligned, but S2 is (because the analysis isn't recursive). Clang and GCC seem to be consistent about that though. Thanks, Richard > bitfield_alignment > = std::max (bitfield_alignment, > TYPE_ALIGN (DECL_BIT_FIELD_TYPE (field))); > diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c > b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..24895c3ab48309b601f6f22c176f1e52350c2257 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c > @@ -0,0 +1,105 @@ > +/* Test AAPCS64 layout and __builtin_va_arg. > + > + This test covers a corner case where a composite type parameter fits in > one > + register: we do not need a double-word alignment when accessing it in the > + va_arg stack area. */ > + > +/* { dg-do run { target aarch64*-*-* } } */ > + > +#ifndef IN_FRAMEWORK > +#define AAPCS64_TEST_STDARG > +#define TESTFILE "va_arg-17.c" > +#include "type-def.h" > + > +enum E6 { e6_0, e6_1, e6_2, e6_3, e6_65533 = 65533, e6_65534, e6_65535 }; > +typedef enum E6 Tal16E6 __attribute__((aligned (16))); > +typedef unsigned int Tuint; > + > +int fails; > + > +union S2844 { > + Tuint a:((((10) - 1) & 31) + 1); > + Tal16E6 __attribute__((aligned (2), packed)) b:31; > + struct{}c[0]; > +} ; > +union S2844 s2844; > +union S2844 a2844[5]; > + > +#define HAS_DATA_INIT_FUNC > +void init_data () > +{ > + memset (&s2844, '\0', sizeof (s2844)); > + memset (a2844, '\0', sizeof (a2844)); > + s2844.a = 799U; > + a2844[2].a = 586U; > +} > + > +#include "abitest.h" > +#else > + ARG (int , 1 , W0 , LAST_NAMED_ARG_ID) > + DOTS > + ANON_PROMOTED (float , 1.0f, double, 1.0, D0, 1) > + ANON (union S2844 , s2844 , X1 , 2) > + ANON (long long , 2LL , X2 , 3) > + ANON (union S2844 , a2844[2] , X3 , 4) > + LAST_ANON (union S2844 , a2844[2] , X4 , 5) > +#endif > + > +#if 0 > + /* This test is derived from a case generated by struct-layout-1.exp: */ > + > +enum E6 { e6_0, e6_1, e6_2, e6_3, e6_65533 = 65533, e6_65534, e6_65535 }; > +typedef enum E6 Tal16E6 __attribute__((aligned (16))); > +typedef unsigned int Tuint; > + > +int fails; > + > +union S2844 { > + Tuint a:((((10) - 1) & 31) + 1); > + Tal16E6 __attribute__((aligned (2), packed)) b:31; > + struct{}c[0]; > +} ; > +union S2844 s2844; > +union S2844 a2844[5]; > + > +typedef __builtin_va_list __gnuc_va_list; > +typedef __gnuc_va_list va_list; > + > +void check2844va (int z, ...) { > + union S2844 arg, *p; > + va_list ap; > + > + __builtin_va_start(ap,z); > + if (__builtin_va_arg(ap,double) != 1.0) > + printf ("fail %d.%d\n", 2844, 0), ++fails; > + > + p = &s2844; > + arg = __builtin_va_arg(ap,union S2844); /* This would fail. */ > + if (p->a != arg.a) > + printf ("fail %d.%d\n", 2844, 1), ++fails; > + > + if (__builtin_va_arg(ap,long long) != 3LL) > + printf ("fail %d.%d\n", 2844, 2), ++fails; > + > + p = &a2844[2]; > + arg = __builtin_va_arg(ap,union S2844); /* This would fail. */ > + if (p->a != arg.a) > + printf ("fail %d.%d\n", 2844, 3), ++fails; > + > + arg = __builtin_va_arg(ap,union S2844); /* This would fail. */ > + if (p->a != arg.a) > + printf ("fail %d.%d\n", 2844, 4), ++fails; > + > + __builtin_va_end(ap); > +} > + > +int main (void) { > + int i, j; > + memset (&s2844, '\0', sizeof (s2844)); > + memset (a2844, '\0', sizeof (a2844)); > + s2844.a = 799U; > + a2844[2].a = 586U; > + check2844va (1, 1.0, s2844, 2LL, a2844[2], a2844[2]); > + exit (fails != 0); > +} > +#endif /* 0 */ > diff --git a/gcc/testsuite/gcc.target/aarch64/pr105549.c > b/gcc/testsuite/gcc.target/aarch64/pr105549.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..55a91ed6bc48b56eed61d668971e890dbd3250ef > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/pr105549.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-save-temps" } */ > + > +enum e { E1 }; > +typedef enum e e __attribute__((aligned(16))); > +union u { > + __attribute__((aligned(2), packed)) e a : 1; > + int x[4]; > +}; > +union u g(int a, union u u2) { return u2; } > + > +/* { dg-final { scan-assembler "stp\tx1, x2, \\\[sp, 8\\\]" } } */