On Thu, Oct 28, 2021 at 1:39 AM Xionghu Luo <luo...@linux.ibm.com> wrote:
>
> On 2021/10/27 21:24, David Edelsohn wrote:
> > On Sun, Oct 24, 2021 at 10:51 PM Xionghu Luo <luo...@linux.ibm.com> wrote:
> >>
> >> If the second operand of __builtin_shuffle is const vector 0, and with
> >> specific mask, it can be optimized to vspltisw+xxpermdi instead of lxv.
> >>
> >> gcc/ChangeLog:
> >>
> >>     * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Add
> >>     patterns match and emit for VSX xxpermdi.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>     * gcc.target/powerpc/pr102868.c: New test.
> >> ---
> >>  gcc/config/rs6000/rs6000.c                  | 47 ++++++++++++++++--
> >>  gcc/testsuite/gcc.target/powerpc/pr102868.c | 53 +++++++++++++++++++++
> >>  2 files changed, 97 insertions(+), 3 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102868.c
> >>
> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> >> index d0730253bcc..5d802c1fa96 100644
> >> --- a/gcc/config/rs6000/rs6000.c
> >> +++ b/gcc/config/rs6000/rs6000.c
> >> @@ -23046,7 +23046,23 @@ altivec_expand_vec_perm_const (rtx target, rtx
> >> op0, rtx op1,
> >>      {OPTION_MASK_P8_VECTOR,
> >>       BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
> >>                        : CODE_FOR_p8_vmrgew_v4sf_direct,
> >> -     {4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31}}};
> >> +     {4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31}},
> >> +    {OPTION_MASK_VSX,
> >> +     (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> +                       : CODE_FOR_vsx_xxpermdi_v16qi),
> >> +     {0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23}},
> >> +    {OPTION_MASK_VSX,
> >> +     (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> +                       : CODE_FOR_vsx_xxpermdi_v16qi),
> >> +     {8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23}},
> >> +    {OPTION_MASK_VSX,
> >> +     (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> +                       : CODE_FOR_vsx_xxpermdi_v16qi),
> >> +     {0, 1, 2, 3, 4, 5, 6, 7, 24, 25, 26, 27, 28, 29, 30, 31}},
> >> +    {OPTION_MASK_VSX,
> >> +     (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_xxpermdi_v16qi
> >> +                       : CODE_FOR_vsx_xxpermdi_v16qi),
> >> +     {8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31}}};
> >
> > If the insn_code is the same for big endian and little endian, why
> > does the new code test BYTES_BIG_ENDIAN to set the same value
> > (CODE_FOR_vsx_xxpermdi_v16qi)?
> >
>
> Thanks for the catch, updated the patch as below:
>
> [PATCH v2] rs6000: Optimize __builtin_shuffle when it's used to zero the
> upper bits [PR102868]
>
> If the second operand of __builtin_shuffle is const vector 0, and with
> specific mask, it can be optimized to vspltisw+xxpermdi instead of lxv.
>
> gcc/ChangeLog:
>
>     * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Add
>     patterns match and emit for VSX xxpermdi.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/powerpc/pr102868.c: New test.

Okay.

Thanks, David
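
For readers following the thread, here is a minimal sketch of the kind of source the patch description refers to: a __builtin_shuffle whose second operand is a constant zero vector, with a mask that keeps one 64-bit half of the first operand. It is not the contents of pr102868.c, which the thread does not show. The type v4si and the function names are made up for this illustration. In element terms, the masks {0, 1, 4, 5} and {2, 3, 4, 5} on four 32-bit lanes appear to correspond to the byte selectors {0..7, 16..23} and {8..15, 16..23} in the new table entries, leaving aside any endianness adjustment the backend makes. The sketch only checks the language-level result; what instructions come out depends on the target and compiler version.

```c
#include <assert.h>
#include <string.h>

/* Four 32-bit lanes in one 128-bit vector (GCC vector extension).
   The typedef name is an assumption for this sketch.  */
typedef unsigned int v4si __attribute__ ((vector_size (16)));

/* Mask indices 0..3 select lanes of the first operand and 4..7 select
   lanes of the second.  Here the second operand is all zeros, so the
   result is lanes 0-1 of A followed by two zero lanes.  */
v4si
first_half_then_zero (v4si a)
{
  const v4si zero = {0, 0, 0, 0};
  const v4si mask = {0, 1, 4, 5};
  return __builtin_shuffle (a, zero, mask);
}

/* Same idea, keeping lanes 2-3 of A instead.  */
v4si
second_half_then_zero (v4si a)
{
  const v4si zero = {0, 0, 0, 0};
  const v4si mask = {2, 3, 4, 5};
  return __builtin_shuffle (a, zero, mask);
}

/* Compare two vectors lane by lane via their object representation.  */
int
same_lanes (v4si x, v4si y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}

/* Hypothetical self-check of the two shuffles above.  */
void
self_check (void)
{
  const v4si a = {1, 2, 3, 4};
  const v4si want_first = {1, 2, 0, 0};
  const v4si want_second = {3, 4, 0, 0};
  assert (same_lanes (first_half_then_zero (a), want_first));
  assert (same_lanes (second_half_then_zero (a), want_second));
}
```

Calling self_check () from main is enough to exercise both functions. To see whether a given compiler picks vspltisw+xxpermdi rather than a constant load, one would compile for a VSX-capable PowerPC target with optimization and inspect the assembly.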