RE: [PATCH] RISC-V: Allow Vector IOR(V1, NOT V1) optimization

Li, Pan2 via Gcc-patches Tue, 18 Apr 2023 01:08:57 -0700

Thanks, Richard, for the comments. CIL; I will try the suggestions.

Pan


-----Original Message-----
From: Richard Biener <richard.guent...@gmail.com> 
Sent: Tuesday, April 18, 2023 4:00 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; 
rguent...@suse.de; Wang, Yanzhang <yanzhang.w...@intel.com>; 
richard.sandif...@arm.com
Subject: Re: [PATCH] RISC-V: Allow Vector IOR(V1, NOT V1) optimization

On Tue, Apr 18, 2023 at 3:31 AM Li, Pan2 via Gcc-patches 
<gcc-patches@gcc.gnu.org> wrote:
>
> Passed the X86 bootstrap and regression tests.
>
> Pan
>
> -----Original Message-----
> From: Li, Pan2 <pan2...@intel.com>
> Sent: Monday, April 17, 2023 10:50 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rguent...@suse.de; 
> Li, Pan2 <pan2...@intel.com>; Wang, Yanzhang 
> <yanzhang.w...@intel.com>; richard.sandif...@arm.com
> Subject: [PATCH] RISC-V: Allow Vector IOR(V1, NOT V1) optimization
>
> From: Pan Li <pan2...@intel.com>
>
> This patch adds the optimization for the vector IOR(V1, NOT V1). Assume we 
> have the sample code below.
>
> vbool32_t test_shortcut_for_riscv_vmorn_case_5(vbool32_t v1, size_t vl) {
>   return __riscv_vmorn_mm_b32(v1, v1, vl);
> }
>
> Before this patch:
> vsetvli  a5,zero,e8,mf4,ta,ma
> vlm.v    v24,0(a1)
> vsetvli  zero,a2,e8,mf4,ta,ma
> vmorn.mm v24,v24,v24
> vsetvli  a5,zero,e8,mf4,ta,ma
> vsm.v    v24,0(a0)
> ret
>
> After this patch:
> vsetvli zero,a2,e8,mf4,ta,ma
> vmset.m v24
> vsetvli a5,zero,e8,mf4,ta,ma
> vsm.v   v24,0(a0)
> ret
>
> Or, from the RTL perspective,
> from:
> (ior:VNx2BI (reg/v:VNx2BI 137 [ v1 ]) (not:VNx2BI (reg/v:VNx2BI 137 [ v1 ])))
> to:
> (const_vector:VNx2BI repeat [ (const_int 1 [0x1]) ])
>
> A similar optimization for VMANDN is already enabled. Apart from the operator, 
> there should be no difference between VMORN and VMANDN for this kind of 
> optimization. The patch allows the IOR(V1, NOT V1) simplification for 
> VECTOR_BOOL modes in addition to the existing SCALAR_INT modes.
>
> gcc/ChangeLog:
>
>         * machmode.h (VECTOR_BOOL_MODE_P):
>         * simplify-rtx.cc (valid_mode_for_ior_simplification_p):
>         (simplify_context::simplify_binary_operation_1):
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/base/mask_insn_shortcut.c:
>         * gcc.target/riscv/simplify_ior_optimization.c: New test.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>  gcc/machmode.h                                |  4 ++
>  gcc/simplify-rtx.cc                           | 10 +++-
>  .../riscv/rvv/base/mask_insn_shortcut.c       |  3 +-
>  .../riscv/simplify_ior_optimization.c         | 50 +++++++++++++++++++
>  4 files changed, 63 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/simplify_ior_optimization.c
>
> diff --git a/gcc/machmode.h b/gcc/machmode.h
> index f1865c1ef42..771bae89cb7 100644
> --- a/gcc/machmode.h
> +++ b/gcc/machmode.h
> @@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
>     || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM       \
>     || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
>
> +/* Nonzero if MODE is a vector bool mode.  */
> +#define VECTOR_BOOL_MODE_P(MODE)                       \
> +  (GET_MODE_CLASS (MODE) == MODE_VECTOR_BOOL)          \
> +
>  /* Nonzero if MODE is a scalar integral mode.  */
>  #define SCALAR_INT_MODE_P(MODE)                        \
>    (GET_MODE_CLASS (MODE) == MODE_INT           \
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index ee75079917f..eff27b835bf 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -57,6 +57,12 @@ neg_poly_int_rtx (machine_mode mode, const_rtx i)
>    return immed_wide_int_const (-wi::to_poly_wide (i, mode), mode);
>  }
>
> +static bool
> +valid_mode_for_ior_simplification_p (machine_mode mode)
> +{
> +  return SCALAR_INT_MODE_P (mode) || VECTOR_BOOL_MODE_P (mode);
> +}
> +
>  /* Test whether expression, X, is an immediate constant that represents
>     the most significant bit of machine mode MODE.  */
>
> @@ -3332,8 +3338,8 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>        if (((GET_CODE (op0) == NOT && rtx_equal_p (XEXP (op0, 0), op1))
>            || (GET_CODE (op1) == NOT && rtx_equal_p (XEXP (op1, 0), op0)))
>           && ! side_effects_p (op0)
> -         && SCALAR_INT_MODE_P (mode))
> -       return constm1_rtx;
> +         && valid_mode_for_ior_simplification_p (mode))

for simple predicates like this please do not split them out, it makes 
understanding the code more difficult.
[pan]: Sure, will update this part.

> +       return CONST1_RTX (mode);

shouldn't this be CONSTM1_RTX (mode)?  Why is this only valid for VECTOR_BOOL 
and not also for VECTOR_INT?  You're citing AND and that does
[pan]: I will try CONSTM1_RTX. I was not sure whether there is some ad-hoc 
reason for the difference from AND, so I only added VECTOR_BOOL, which is 
covered by the test. I will update this to follow the same approach as AND.

      /* A & (~A) -> 0 */
      if (((GET_CODE (op0) == NOT && rtx_equal_p (XEXP (op0, 0), op1))
           || (GET_CODE (op1) == NOT && rtx_equal_p (XEXP (op1, 0), op0)))
          && ! side_effects_p (op0)
          && GET_MODE_CLASS (mode) != MODE_CC)
        return CONST0_RTX (mode);

so why differ and not use the same GET_MODE_CLASS (mode) != MODE_CC condition?

Richard.
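
As an aside on the CONST1_RTX versus CONSTM1_RTX question above, here is a 
minimal C sketch of the two identities being compared. It is not GCC code, and 
the function names are hypothetical, chosen only for illustration. It shows 
that a | ~a sets every bit of a multi-bit element (so the value 1 would be 
wrong there), while for a one-bit element "all bits set" and 1 are the same 
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a | ~a on an 8-bit element sets every bit.  */
uint8_t ior_with_not_u8 (uint8_t a)
{
  return (uint8_t) (a | ~a);
}

/* Hypothetical helper: a & ~a on an 8-bit element clears every bit.  */
uint8_t and_with_not_u8 (uint8_t a)
{
  return (uint8_t) (a & ~a);
}

/* Hypothetical helper: the same IOR identity on a single-bit value,
   modelling one element of a mask.  */
bool ior_with_not_bit (bool b)
{
  return b | !b;
}

/* Checks the identities for a few sample inputs.  */
void check_identities (void)
{
  assert (ior_with_not_u8 (0x00) == 0xFF);
  assert (ior_with_not_u8 (0x5A) == 0xFF);
  assert (and_with_not_u8 (0x5A) == 0x00);
  assert (ior_with_not_bit (false) == true);
  assert (ior_with_not_bit (true) == true);
}
```

Calling check_identities () from any small driver exercises all three helpers. 
The 8-bit case is the one where returning 1 instead of the all-ones value would 
give a different result.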

>
>        /* (ior A C) is C if all bits of A that might be nonzero are on in C.  */
>        if (CONST_INT_P (op1)
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/mask_insn_shortcut.c b/gcc/testsuite/gcc.target/riscv/rvv/base/mask_insn_shortcut.c
> index 83cc4a1b5a5..57d0241675a 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/mask_insn_shortcut.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/mask_insn_shortcut.c
> @@ -233,9 +233,8 @@ vbool64_t test_shortcut_for_riscv_vmxnor_case_6(vbool64_t v1, size_t vl) {
>  /* { dg-final { scan-assembler-not {vmxor\.mm\s+v[0-9]+,\s*v[0-9]+} } } */
>  /* { dg-final { scan-assembler-not {vmor\.mm\s+v[0-9]+,\s*v[0-9]+} } } */
>  /* { dg-final { scan-assembler-not {vmnor\.mm\s+v[0-9]+,\s*v[0-9]+} } } */
> -/* { dg-final { scan-assembler-times {vmorn\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 7 } } */
>  /* { dg-final { scan-assembler-not {vmxnor\.mm\s+v[0-9]+,\s*v[0-9]+} } } */
>  /* { dg-final { scan-assembler-times {vmclr\.m\s+v[0-9]+} 14 } } */
> -/* { dg-final { scan-assembler-times {vmset\.m\s+v[0-9]+} 7 } } */
> +/* { dg-final { scan-assembler-times {vmset\.m\s+v[0-9]+} 14 } } */
>  /* { dg-final { scan-assembler-times {vmmv\.m\s+v[0-9]+,\s*v[0-9]+} 14 } } */
>  /* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 14 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/simplify_ior_optimization.c b/gcc/testsuite/gcc.target/riscv/simplify_ior_optimization.c
> new file mode 100644
> index 00000000000..ec3bd0baf03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/simplify_ior_optimization.c
> @@ -0,0 +1,50 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc -mabi=lp64 -O2" } */
> +
> +#include <stdint.h>
> +
> +uint8_t test_simplify_ior_scalar_case_0 (uint8_t a) {
> +  return a | ~a;
> +}
> +
> +uint16_t test_simplify_ior_scalar_case_1 (uint16_t a) {
> +  return a | ~a;
> +}
> +
> +uint32_t test_simplify_ior_scalar_case_2 (uint32_t a) {
> +  return a | ~a;
> +}
> +
> +uint64_t test_simplify_ior_scalar_case_3 (uint64_t a) {
> +  return a | ~a;
> +}
> +
> +int8_t test_simplify_ior_scalar_case_4 (int8_t a) {
> +  return a | ~a;
> +}
> +
> +int16_t test_simplify_ior_scalar_case_5 (int16_t a) {
> +  return a | ~a;
> +}
> +
> +int32_t test_simplify_ior_scalar_case_6 (int32_t a) {
> +  return a | ~a;
> +}
> +
> +int64_t test_simplify_ior_scalar_case_7 (int64_t a) {
> +  return a | ~a;
> +}
> +
> +/* { dg-final { scan-assembler-times {li\s+a[0-9]+,\s*-1} 6 } } */
> +/* { dg-final { scan-assembler-times {li\s+a[0-9]+,\s*255} 1 } } */
> +/* { dg-final { scan-assembler-times {li\s+a[0-9]+,\s*65536} 1 } } */
> +/* { dg-final { scan-assembler-not {or\s+a[0-9]+} } } */
> +/* { dg-final { scan-assembler-not {not\s+a[0-9]+} } } */
> --
> 2.34.1
>
