Ping

On 02/12/24 2:20 pm, Surya Kumari Jangala wrote:
> I have incorporated review comments in this patch.
> 
> Regards,
> Surya
> 
> 
> rs6000: Inefficient vector splat of small V2DI constants [PR107757]
> 
> On P8, for vector splat of double word constants, specifically -1 and 1,
> gcc generates inefficient code. For -1, gcc generates two instructions
> (vspltisw and vupkhsw) whereas only one instruction (vspltisw) is
> sufficient. For constant 1, gcc generates a load of the constant from
> .rodata instead of the instructions vspltisw and vupkhsw.
> 
> The routine vspltisw_vupkhsw_constant_p() returns true if the constant
> can be synthesized with instructions vspltisw and vupkhsw. However, for
> constant 1, this routine returns false.
> 
> For constant -1, this routine returns true. Vector splat of -1 can be
> done with only one instruction, i.e., vspltisw. We do not need two
> instructions. Hence this routine should return false for -1.
> 
> With this patch, gcc generates only one instruction (vspltisw)
> for -1. And for constant 1, this patch generates two instructions
> (vspltisw and vupkhsw).
> 
> 2024-11-20  Surya Kumari Jangala  <jskum...@linux.ibm.com>
> 
> gcc/
> 	PR target/107757
> 	* config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p):
> 	Return false for -1 and return true for 1.
> 
> gcc/testsuite/
> 	PR target/107757
> 	* gcc.target/powerpc/pr107757-1.c: New.
> 	* gcc.target/powerpc/pr107757-2.c: New.
> ---
>  gcc/config/rs6000/rs6000.cc                   |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr107757-1.c | 14 ++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr107757-2.c | 13 +++++++++++++
>  3 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 02a2f1152db..d0c528f4d5f 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -6652,7 +6652,7 @@ vspltisw_vupkhsw_constant_p (rtx op, machine_mode mode, int *constant_ptr)
>      return false;
>  
>    value = INTVAL (elt);
> -  if (value == 0 || value == 1
> +  if (value == 0 || value == -1
>        || !EASY_VECTOR_15 (value))
>      return false;
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-1.c b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
> new file mode 100644
> index 00000000000..49076fba255
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx } */
> +/* { dg-final { scan-assembler {\mvspltisw\M} } } */
> +/* { dg-final { scan-assembler {\mvupkhsw\M} } } */
> +/* { dg-final { scan-assembler-not {\mlvx\M} } } */
> +
> +#include <altivec.h>
> +
> +vector long long
> +foo ()
> +{
> +  return vec_splats (1LL);
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-2.c b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
> new file mode 100644
> index 00000000000..4955696f11d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
> +/* { dg-require-effective-target powerpc_vsx } */
> +/* { dg-final { scan-assembler {\mvspltisw\M} } } */
> +/* { dg-final { scan-assembler-not {\mvupkhsw\M} } } */
> +
> +#include <altivec.h>
> +
> +vector long long
> +foo ()
> +{
> +  return vec_splats (~0LL);
> +}
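
For anyone following the reasoning in the description above, here is a
minimal host-side C sketch of why -1 needs only vspltisw while 1 needs
vspltisw followed by vupkhsw. It is only a model, not code from the patch:
the types and helpers (v4si, v2di, model_vspltisw, model_vupkhsw,
model_as_v2di, splat_ok) are hypothetical names made up for this
illustration, and it assumes big-endian element numbering, where word 0 is
the high half of doubleword 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical models of a 128-bit vector register: four 32-bit words,
   or two 64-bit doublewords.  */
typedef struct { int32_t w[4]; } v4si;
typedef struct { int64_t d[2]; } v2di;

/* Model of vspltisw: splat a 5-bit signed immediate (-16..15) into
   every word.  */
static v4si
model_vspltisw (int simm)
{
  assert (simm >= -16 && simm <= 15);
  v4si r;
  for (int i = 0; i < 4; i++)
    r.w[i] = simm;
  return r;
}

/* Model of vupkhsw: sign-extend the two high-order words to
   doublewords.  */
static v2di
model_vupkhsw (v4si a)
{
  v2di r;
  r.d[0] = a.w[0];
  r.d[1] = a.w[1];
  return r;
}

/* Reinterpret the same 128 bits as two doublewords (assumes big-endian
   element numbering, as noted above).  */
static v2di
model_as_v2di (v4si a)
{
  v2di r;
  for (int i = 0; i < 2; i++)
    r.d[i] = (int64_t) (((uint64_t) (uint32_t) a.w[2 * i] << 32)
			| (uint32_t) a.w[2 * i + 1]);
  return r;
}

/* True if both doublewords equal VALUE.  */
static bool
splat_ok (v2di v, int64_t value)
{
  return v.d[0] == value && v.d[1] == value;
}
```

In this model, model_as_v2di (model_vspltisw (-1)) is already a V2DI splat
of -1, because all-ones words form all-ones doublewords. For 1, the same
reinterpretation gives 0x0000000100000001 in each doubleword, so the extra
sign-extending vupkhsw step is needed to get a splat of 1.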