On Fri, Jan 26, 2018 at 01:50:40PM +0000, Richard Sandiford wrote:
> The fallback way of handling a repeated 128-bit constant vector for SVE
> is to force the 128 bits to the constant pool and use LD1RQ to load it.
> Previously the code always used the byte variant of LD1RQ (LD1RQB),
> with a preceding BSWAP for big-endian targets.  However, that BSWAP
> doesn't handle all cases correctly.
>
> The simplest fix seemed to be to use the LD1RQ appropriate for the
> element size.
>
> This helps to fix some of the sve/slp_*.c tests for aarch64_be,
> although a later patch is needed as well.
>
> Tested on aarch64_be-elf and aarch64-linux-gnu.  OK to install?

OK.

Thanks,
James

> 2018-01-26  Richard Sandiford  <richard.sandif...@linaro.org>
>
> gcc/
> 	* config/aarch64/aarch64-sve.md (sve_ld1rq): Replace with...
> 	(*sve_ld1rq<Vesize>): ... this new pattern.  Handle all element sizes,
> 	not just bytes.
> 	* config/aarch64/aarch64.c (aarch64_expand_sve_widened_duplicate):
> 	Remove BSWAP handling for big-endian targets and use the form of
> 	LD1RQ appropriate for the mode.
>
> gcc/testsuite/
> 	* gcc.target/aarch64/sve/slp_2.c: Expect LD1RQD rather than LD1RQB.
> 	* gcc.target/aarch64/sve/slp_3.c: Expect LD1RQW rather than LD1RQB.
> 	* gcc.target/aarch64/sve/slp_4.c: Expect LD1RQH rather than LD1RQB.