"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes: > Hi, > > I noticed we were missing out on LD1 + UXT combinations in some cases > and found it was because of inconsistent use of the unspec enum > UNSPEC_LD1_SVE. The combine pattern for LD1[S][BHWD] uses UNSPEC_LD1_SVE > whereas one of the LD1 expanders was using UNSPEC_PRED_X. I wasn't sure > whether to change the UNSPEC_LD1_SVE into UNSPEC_PRED_X as the enum > doesn't seem to be used for anything in particular, though I decided > against it for now as it is easier to rename UNSPEC_LD1_SVE to > UNSPEC_PRED_X if there is no use for it than it is to rename only > specific instances of UNSPEC_PRED_X. > > If there is a firm belief the UNSPEC_LD1_SVE will not be used for > anything I am also happy to refactor it out.

Yeah, I think removing uses of UNSPEC_LD1_SVE is the way to go. ld1
support was (unsurprisingly) one of the first things we added, so I
think it predates the UNSPEC_PRED_X thing. maskload<mode><vpred>
should then just become a define_expand.

The problem with the current patch is that @aarch64_pred_mov<mode>
includes register moves and stores, so using LD1 in the unspec name
would be misleading.

It would be better to change the dg-finals to scan-assembler-nots
rather than remove them entirely.

Thanks,
Richard

> Bootstrapped and regression tested aarch64-none-linux-gnu.
>
> Is this OK for trunk?
>
> Kind regards,
> Andre Vieira
>
> gcc/ChangeLog:
> 2021-05-14  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>
>     * config/aarch64/aarch64-sve.md: Use UNSPEC_LD1_SVE instead of
>     UNSPEC_PRED_X.
>
> gcc/testsuite/ChangeLog:
> 2021-05-14  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>
>     * gcc.target/aarch64/sve/logical_unpacked_and_2.c: Remove
>     superfluous uxtb.
>     * gcc.target/aarch64/sve/logical_unpacked_and_3.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_and_4.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_and_6.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_and_7.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_eor_2.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_eor_3.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_eor_4.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_eor_6.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_eor_7.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_orr_2.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_orr_4.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_orr_6.c: Likewise.
>     * gcc.target/aarch64/sve/logical_unpacked_orr_7.c: Likewise.
>     * gcc.target/aarch64/sve/ld1_extend.c: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> index 7db2938bb84e04d066a7b07574e5cf344a3a8fb6..5fd74fcf3e0a984b5b40b8128ad9354fb899ce5f 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -747,7 +747,7 @@ (define_insn_and_split "@aarch64_pred_mov<mode>"
>          (unspec:SVE_ALL
>            [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl")
>             (match_operand:SVE_ALL 2 "nonimmediate_operand" "w, m, w")]
> -          UNSPEC_PRED_X))]
> +          UNSPEC_LD1_SVE))]
>    "TARGET_SVE
>     && (register_operand (operands[0], <MODE>mode)
>         || register_operand (operands[2], <MODE>mode))"
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/ld1_extend.c b/gcc/testsuite/gcc.target/aarch64/sve/ld1_extend.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..7f78cb4b3e4445c4da93b00ae78d6ef6fec1b2de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/ld1_extend.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 --param vect-partial-vector-usage=1" } */
> +
> +void foo (signed char * __restrict__ a, signed char * __restrict__ b, short * __restrict__ c, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    c[i] = a[i] + b[i];
> +}
> +
> +/* { dg-final { scan-assembler-times {\tld1sb\t} 4 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c
> index 08b274512e1c6ce8f5845084a664b2fa0456dafe..cb6029e90ffc815e75092624f611c4631cbd9fd6 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_2.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_3.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_3.c
> index c823470ca925ee66929475f74fa8d94bc4735594..02fc5460e5ce89c8a3fef611aac561145ddd0f39 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_3.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_3.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_4.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_4.c
> index 52c92911d9b548662d43b23816e4d450a9e67846..8a441300ba59da6f96365bc6ad5482911ed605f8 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_4.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint16_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_6.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_6.c
> index 1552ed85302373bb16ad8265f5c84cea71ccbc66..2f657736f37162c1c422b318e6f23242c68fea48 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_6.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_6.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_7.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_7.c
> index 484d9daf38f0779109484eab5a2a03a626c16fe8..d67bdb7233e906581517463386fabe293a8d7170 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_7.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_and_7.c
> @@ -10,7 +10,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint8_t *restrict src2){
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tand\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_2.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_2.c
> index 23ddeb9f9b11f80783e1b173696a15d1d73762a3..594e4117477ffc78665eac617bb7c3452217cacd 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_2.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_3.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_3.c
> index 4dd1e085646c6d0e6d3d4e02534c94ea592ea7be..1f684feade878fda293e55fca24541d1fa4133ef 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_3.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_3.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_4.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_4.c
> index a31a2d425faa11c6faa5306d4c0a7e5a5fa086d8..b051eb28c51a02a5240a81a33cbcbd3d163098b3 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_4.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint16_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_6.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_6.c
> index 416567b21f703d6e0ff3792d2089b923ecde0441..cd42405690b39c651ce76416ea3b371f5611cfe6 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_6.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_6.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_7.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_7.c
> index 3f7c3ddbba8a986e01fbdbe51b307bdc3990da37..2154ae8f2485fb81cc77286ae8f1293050378173 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_7.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_eor_7.c
> @@ -10,7 +10,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint8_t *restrict src2){
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\teor\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_2.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_2.c
> index 593de65a02cd2a16acb48a6dd05163a4e66b7b27..890b7e64f965678c2fe2e5d0a395520d9189b6dc 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_2.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_4.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_4.c
> index 561a104a23f00759af457e03e8e175589282aeb5..2b419a65d60abd7b7ef330ea538c02fe2ade8f9a 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_4.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint16_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_6.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_6.c
> index 3ce1c3fb1e636c7f8ebcf2a7fbaafe20f3b7cfa0..214a3c7efbb9a776477907e2516d5dd112c77d7d 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_6.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_6.c
> @@ -11,7 +11,6 @@ f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2)
>
>  /* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.h,} 1 } } */
>  /* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxth\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_7.c b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_7.c
> index e6a429167ea38dbbbc7e653cd6d47ffbe0bc768d..66e24058ae2e657bcb804d23b58d7c722e3e2dd5 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_7.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/logical_unpacked_orr_7.c
> @@ -10,7 +10,6 @@ f (uint64_t *restrict dst, uint32_t *restrict src1, uint8_t *restrict src2){
>
>  /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.d,} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuxtb\tz[0-9]+\.s,} 1 } } */
>  /* { dg-final { scan-assembler-times {\torr\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
>  /* { dg-final { scan-assembler-times {\tuxtw\tz[0-9]+\.d,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d,} 2 } } */
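
For reference, the scan-assembler-not suggestion in the reply could look roughly like the sketch below. This is only an illustration under stated assumptions: the bodies of the logical_unpacked_*.c tests are not shown in this thread, so the function f, its extra n parameter, and the dg-options (borrowed from ld1_extend.c in the patch) are hypothetical stand-ins, and the directive has not been checked against real compiler output.

```c
/* { dg-do compile } */
/* Assumption: options borrowed from ld1_extend.c in the quoted patch; the
   real logical_unpacked_* tests may use different ones.  */
/* { dg-options "-O3 --param vect-partial-vector-usage=1" } */

#include <stdint.h>

/* Hypothetical stand-in for a logical_unpacked_and_* test body.  The
   signature mirrors the hunk headers in the patch, plus an assumed
   trip-count parameter n.  */
void
f (uint64_t *restrict dst, uint16_t *restrict src1, uint8_t *restrict src2,
   int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src1[i] & src2[i];
}

/* Rather than deleting the old scan-assembler-times line, keep a check
   that the separate byte extension is no longer emitted.  */
/* { dg-final { scan-assembler-not {\tuxtb\tz[0-9]+\.h,} } } */
```

Keeping a negative check like this means the test would start failing again if the separate uxtb ever reappeared, which simply removing the line would not catch.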