apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> From: Andrew Pinski <apin...@marvell.com>
>
> The problem here is the aarch64_expand_setmem code did not check
> STRICT_ALIGNMENT if it is creating an overlapping store.
> This patch adds that check and the testcase works.
>
> gcc/ChangeLog:
>
>       PR target/101934
>       * config/aarch64/aarch64.c (aarch64_expand_setmem):
>       Check STRICT_ALIGNMENT before creating an overlapping
>       store.
>
> gcc/testsuite/ChangeLog:
>
>       PR target/101934
>       * gcc.target/aarch64/memset-strict-align-1.c: New test.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64.c                  |  4 +--
>  .../aarch64/memset-strict-align-1.c           | 28 +++++++++++++++++++
>  2 files changed, 30 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3213585a588..26d59ba1e13 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -23566,8 +23566,8 @@ aarch64_expand_setmem (rtx *operands)
>        /* Do certain trailing copies as overlapping if it's going to be
>           cheaper. i.e. less instructions to do so. For instance doing a 15
>           byte copy it's more efficient to do two overlapping 8 byte copies than
> -         8 + 4 + 2 + 1. */
> -      if (n > 0 && n < copy_limit / 2)
> +         8 + 4 + 2 + 1. Only do this when -mstrict-align is not supplied. */
> +      if (n > 0 && n < copy_limit / 2 && !STRICT_ALIGNMENT)
>          {
>            next_mode = smallest_mode_for_size (n, MODE_INT);
>            int n_bits = GET_MODE_BITSIZE (next_mode).to_constant ();
> diff --git a/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
> b/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
> new file mode 100644
> index 00000000000..5cdc8a44968
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os -mstrict-align" } */
> +
> +struct s { char x[95]; };
> +void foo (struct s *);
> +void bar (void) { struct s s1 = {}; foo (&s1); }
> +
> +/* memset (s1 = {}, sizeof = 95) should be expanded out
> +   such that there are no overlap stores when -mstrict-align
> +   is in use.
> +   so 2 pair 16 bytes stores (64 bytes).
> +   1 16 byte stores
> +   1 8 byte store
> +   1 4 byte store
> +   1 2 byte store
> +   1 1 byte store
> +   */
> +
> +/* { dg-final { scan-assembler-times "stp\tq" 2 } } */
> +/* { dg-final { scan-assembler-times "str\tq" 1 } } */
> +/* { dg-final { scan-assembler-times "str\txzr" 1 } } */
> +/* { dg-final { scan-assembler-times "str\twzr" 1 } } */
> +/* { dg-final { scan-assembler-times "strh\twzr" 1 } } */
> +/* { dg-final { scan-assembler-times "strb\twzr" 1 } } */
> +
> +/* Also one store pair for the frame-pointer and the LR. */
> +/* { dg-final { scan-assembler-times "stp\tx" 1 } } */
> +
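
The store counts the new test scans for follow from splitting 95 bytes into non-overlapping power-of-two stores, largest first. The standalone C sketch below only checks that arithmetic. It is not the code in aarch64_expand_setmem: the names store_sizes and split_stores are hypothetical, and treating a pair of 16-byte stores as a single 32-byte unit is an assumption taken from the test comment.

```c
#include <stddef.h>

/* Hypothetical model, not GCC code.  Store widths in bytes, largest
   first.  32 stands for one pair of 16-byte stores (stp q), following
   the comment in the testcase.  */
enum { NSIZES = 6 };
static const size_t store_sizes[NSIZES] = { 32, 16, 8, 4, 2, 1 };

/* Split N bytes into non-overlapping stores, as a strict-alignment
   expansion has to.  counts[i] receives the number of stores of width
   store_sizes[i].  */
static void
split_stores (size_t n, unsigned counts[NSIZES])
{
  for (size_t i = 0; i < NSIZES; i++)
    {
      counts[i] = (unsigned) (n / store_sizes[i]);
      n %= store_sizes[i];
    }
}
```

Calling split_stores (95, counts) leaves counts as { 2, 1, 1, 1, 1, 1 }, which lines up with the scan-assembler-times directives for stp q, str q, str xzr, str wzr, strh and strb. For the 15-byte tail mentioned in the source comment, the same split gives 8 + 4 + 2 + 1, that is four stores where the overlapping form would need two.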