Hi Haochen,

on 2024/6/12 10:47, HAO CHEN GUI wrote:
> Hi,
>   This patch adds an insn_and_split pattern which lets the fwprop pass
> replace the source pseudo of a store insn with the duplicated constant
> vector. The store can then be implemented by a single stxvd2x, which
> eliminates the unnecessary byte swap insn on P8 LE. The test case shows
> the optimization.
> 
>   The patch depends on the first generic patch which uses insn cost in fwprop.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654276.html
> 
>   Compared to the previous version, the main change is to remove the
> predicate and put the check in the insn condition and a gcc assertion.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions.
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> rs6000: Eliminate unnecessary byte swaps for duplicated constant vector store
> 
> gcc/
>       PR target/113325
>       * config/rs6000/vsx.md (vsx_stxvd2x4_le_const_<mode>): New.
> 
> gcc/testsuite/
>       PR target/113325
>       * gcc.target/powerpc/pr113325.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index f135fa079bd..89eb32a0758 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -3368,6 +3368,32 @@ (define_insn "*vsx_stxvd2x4_le_<mode>"
>    "stxvd2x %x1,%y0"
>    [(set_attr "type" "vecstore")])
> 
> +(define_insn_and_split "vsx_stxvd2x4_le_const_<mode>"
> +  [(set (match_operand:VSX_W 0 "memory_operand" "=Z")
> +     (match_operand:VSX_W 1 "immediate_operand" "W"))]
> +  "!BYTES_BIG_ENDIAN
> +   && VECTOR_MEM_VSX_P (<MODE>mode)
> +   && !TARGET_P9_VECTOR
> +   && const_vec_duplicate_p (operands[1])"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 2)
> +     (match_dup 1))
> +   (set (match_dup 0)
> +     (vec_select:VSX_W
> +       (match_dup 2)
> +       (parallel [(const_int 2) (const_int 3)
> +                  (const_int 0) (const_int 1)])))]
> +{
> +  /* Here all the constants must be loaded without memory.  */
> +  gcc_assert (easy_altivec_constant (operands[1], <MODE>mode));
> +  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1])
> +                                      : operands[1];

In the !can_create_pseudo_p () case, operands[2] would be the constant vector
itself, so the second set would be a vec_select of a constant.  Does that match
any existing pattern?  If not, I think we want to add can_create_pseudo_p ()
to the insn condition as well.
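
As a rough, untested sketch of what I mean: only the insn condition and the
preparation statements change.  Calling gen_reg_rtx_and_attrs unconditionally
is my assumption about how the rest would then look, not something I have
bootstrapped.

```
  "!BYTES_BIG_ENDIAN
   && VECTOR_MEM_VSX_P (<MODE>mode)
   && !TARGET_P9_VECTOR
   && const_vec_duplicate_p (operands[1])
   && can_create_pseudo_p ()"
  ;; The "#" output template, the "&& 1" split condition and the
  ;; replacement pattern stay as in the patch.
{
  /* Here all the constants must be loaded without memory.  */
  gcc_assert (easy_altivec_constant (operands[1], <MODE>mode));
  /* The insn condition now guarantees that a new pseudo can be created,
     so the fallback to operands[1] is no longer needed.  */
  operands[2] = gen_reg_rtx_and_attrs (operands[1]);
}
```

With that, the pattern can no longer match once pseudos are unavailable, so
the split never has to produce a vec_select of a constant.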

The rest looks good to me, thanks!

BR,
Kewen

> +
> +}
> +  [(set_attr "type" "vecstore")
> +   (set_attr "length" "8")])
> +
>  (define_insn "*vsx_stxvd2x8_le_V8HI"
>    [(set (match_operand:V8HI 0 "memory_operand" "=Z")
>          (vec_select:V8HI
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr113325.c b/gcc/testsuite/gcc.target/powerpc/pr113325.c
> new file mode 100644
> index 00000000000..3ca1fcbc9ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr113325.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8 -mvsx" } */
> +/* { dg-require-effective-target powerpc_vsx } */
> +/* { dg-final { scan-assembler-not {\mxxpermdi\M} } } */
> +
> +void* foo (void* s1)
> +{
> +  return __builtin_memset (s1, 0, 32);
> +}
