Hi Haochen, on 2024/6/12 10:47, HAO CHEN GUI wrote: > Hi, > This patch creates an insn_and_split pattern which helps the duplicated > constant vector replace the source pseudo of store insn in fwprop pass. > Thus the store can be implemented by a single stxvd2x and it eliminates the > unnecessary byte swap insn on P8 LE. The test case shows the optimization. > > The patch depends on the first generic patch which uses insn cost in fwprop. > https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654276.html > > Compared to previous version, the main change is to remove the predict and > put the check in insn condition and gcc assertion. > > Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no > regressions. > > Thanks > Gui Haochen > > > ChangeLog > rs6000: Eliminate unnecessary byte swaps for duplicated constant vector store > > gcc/ > PR target/113325 > * config/rs6000/vsx.md (vsx_stxvd2x4_le_const_<mode>): New. > > gcc/testsuite/ > PR target/113325 > * gcc.target/powerpc/pr113325.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index f135fa079bd..89eb32a0758 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -3368,6 +3368,32 @@ (define_insn "*vsx_stxvd2x4_le_<mode>" > "stxvd2x %x1,%y0" > [(set_attr "type" "vecstore")]) > > +(define_insn_and_split "vsx_stxvd2x4_le_const_<mode>" > + [(set (match_operand:VSX_W 0 "memory_operand" "=Z") > + (match_operand:VSX_W 1 "immediate_operand" "W"))] > + "!BYTES_BIG_ENDIAN > + && VECTOR_MEM_VSX_P (<MODE>mode) > + && !TARGET_P9_VECTOR > + && const_vec_duplicate_p (operands[1])" > + "#" > + "&& 1" > + [(set (match_dup 2) > + (match_dup 1)) > + (set (match_dup 0) > + (vec_select:VSX_W > + (match_dup 2) > + (parallel [(const_int 2) (const_int 3) > + (const_int 0) (const_int 1)])))] > +{ > + /* Here all the constants must be loaded without memory. */ > + gcc_assert (easy_altivec_constant (operands[1], <MODE>mode)); > + operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) > + : operands[1];
For the case of !can_create_pseudo_p (), operands[2] would be a constant vector, does it match any existing pattern? If no, I think we want to add can_create_pseudo_p () to the condition as well. The others look good to me, thanks! BR, Kewen > + > +} > + [(set_attr "type" "vecstore") > + (set_attr "length" "8")]) > + > (define_insn "*vsx_stxvd2x8_le_V8HI" > [(set (match_operand:V8HI 0 "memory_operand" "=Z") > (vec_select:V8HI > diff --git a/gcc/testsuite/gcc.target/powerpc/pr113325.c > b/gcc/testsuite/gcc.target/powerpc/pr113325.c > new file mode 100644 > index 00000000000..3ca1fcbc9ba > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr113325.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power8 -mvsx" } */ > +/* { dg-require-effective-target powerpc_vsx } */ > +/* { dg-final { scan-assembler-not {\mxxpermdi\M} } } */ > + > +void* foo (void* s1) > +{ > + return __builtin_memset (s1, 0, 32); > +}