VECTOR_FLOAT_MODE_P referenced from expand, will remove it as it will be removed shortly.
Pan From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai> Sent: Friday, June 16, 2023 3:48 PM To: Li, Pan2 <pan2...@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org> Cc: Robin Dapp <rdapp....@gmail.com>; jeffreyalaw <jeffreya...@gmail.com>; Li, Pan2 <pan2...@intel.com>; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com> Subject: Re: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64. +/* Nonzero if MODE is a vector float mode. */ +#define VECTOR_FLOAT_MODE_P(MODE) \ + (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT) Why you add this? Remove it. Otherwise, LGTM. ________________________________ juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai> From: pan2.li<mailto:pan2...@intel.com> Date: 2023-06-16 15:28 To: gcc-patches<mailto:gcc-patches@gcc.gnu.org> CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; rdapp.gcc<mailto:rdapp....@gmail.com>; jeffreyalaw<mailto:jeffreya...@gmail.com>; pan2.li<mailto:pan2...@intel.com>; yanzhang.wang<mailto:yanzhang.w...@intel.com>; kito.cheng<mailto:kito.ch...@gmail.com> Subject: [PATCH v1] RISC-V: Bugfix for RVV integer reduction in ZVE32/64. From: Pan Li <pan2...@intel.com<mailto:pan2...@intel.com>> The rvv integer reduction has 3 different patterns for zve128+, zve64 and zve32. They take the same iterator with different attributions. However, we need the generated function code_for_reduc (code, mode1, mode2). The implementation of code_for_reduc may look like below. code_for_reduc (code, mode1, mode2) { if (code == max && mode1 == VNx1QI && mode2 == VNx1QI) return CODE_FOR_pred_reduc_maxvnx1qivnx16qi; // ZVE128+ if (code == max && mode1 == VNx1QI && mode2 == VNx1QI) return CODE_FOR_pred_reduc_maxvnx1qivnx8qi; // ZVE64 if (code == max && mode1 == VNx1QI && mode2 == VNx1QI) return CODE_FOR_pred_reduc_maxvnx1qivnx4qi; // ZVE32 } Thus there will be a problem here. For example zve32, we will have code_for_reduc (max, VNx1QI, VNx1QI) which will return the code of the ZVE128+ instead of the ZVE32 logically. This patch will merge the 3 patterns into one pattern, and pass both the input_vector and the ret_vector of code_for_reduc. For example, ZVE32 will be code_for_reduc (max, VNx1Q1, VNx4QI), then the correct code of ZVE32 will be returned as expectation. Signed-off-by: Pan Li <pan2...@intel.com<mailto:pan2...@intel.com>> Co-Authored by: Juzhe-Zhong <juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>> PR 110265 gcc/ChangeLog: PR target/110265 * config/riscv/riscv-vector-builtins-bases.cc: Add ret_mode for integer reduction expand. * config/riscv/vector-iterators.md: Add VQI, VHI, VSI and VDI, and the LMUL1 attr respectively. * config/riscv/vector.md. (@pred_reduc_<reduc><mode><vlmul1>): Removed. (@pred_reduc_<reduc><mode><vlmul1_zve64>): Likewise. (@pred_reduc_<reduc><mode><vlmul1_zve32>): Likewise. (@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>): New pattern. (@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>): Likewise. (@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>): Likewise. (@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>): Likewise. * machmode.h (VECTOR_FLOAT_MODE_P): New macro. gcc/testsuite/ChangeLog: PR target/110265 * gcc.target/riscv/rvv/base/pr110265-1.c: New test. * gcc.target/riscv/rvv/base/pr110265-1.h: New test. * gcc.target/riscv/rvv/base/pr110265-2.c: New test. * gcc.target/riscv/rvv/base/pr110265-2.h: New test. * gcc.target/riscv/rvv/base/pr110265-3.c: New test. --- .../riscv/riscv-vector-builtins-bases.cc | 13 +- gcc/config/riscv/vector-iterators.md | 61 +++++ gcc/config/riscv/vector.md | 208 +++++++++++++----- gcc/machmode.h | 4 + .../gcc.target/riscv/rvv/base/pr110265-1.c | 13 ++ .../gcc.target/riscv/rvv/base/pr110265-1.h | 65 ++++++ .../gcc.target/riscv/rvv/base/pr110265-2.c | 14 ++ .../gcc.target/riscv/rvv/base/pr110265-2.h | 57 +++++ .../gcc.target/riscv/rvv/base/pr110265-3.c | 14 ++ 9 files changed, 389 insertions(+), 60 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc index 87a684dd127..a77933d60d5 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc @@ -1396,8 +1396,17 @@ public: rtx expand (function_expander &e) const override { - return e.use_exact_insn ( - code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ())); + machine_mode mode = e.vector_mode (); + machine_mode ret_mode = e.ret_mode (); + + /* TODO: we will use ret_mode after all types of PR110265 are addressed. */ + if (VECTOR_FLOAT_MODE_P (mode) + || GET_MODE_INNER (mode) != GET_MODE_INNER (ret_mode)) + return e.use_exact_insn ( + code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ())); + else + return e.use_exact_insn ( + code_for_pred_reduc (CODE, e.vector_mode (), e.ret_mode ())); } }; diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 8c71c9e22cc..e2c8ade98eb 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -929,6 +929,67 @@ (define_mode_iterator V64T [ (VNx2x64QI "TARGET_MIN_VLEN >= 128") ]) +(define_mode_iterator VQI [ + (VNx1QI "TARGET_MIN_VLEN < 128") + VNx2QI + VNx4QI + VNx8QI + VNx16QI + VNx32QI + (VNx64QI "TARGET_MIN_VLEN > 32") + (VNx128QI "TARGET_MIN_VLEN >= 128") +]) + +(define_mode_iterator VHI [ + (VNx1HI "TARGET_MIN_VLEN < 128") + VNx2HI + VNx4HI + VNx8HI + VNx16HI + (VNx32HI "TARGET_MIN_VLEN > 32") + (VNx64HI "TARGET_MIN_VLEN >= 128") +]) + +(define_mode_iterator VSI [ + (VNx1SI "TARGET_MIN_VLEN < 128") + VNx2SI + VNx4SI + VNx8SI + (VNx16SI "TARGET_MIN_VLEN > 32") + (VNx32SI "TARGET_MIN_VLEN >= 128") +]) + +(define_mode_iterator VDI [ + (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") + (VNx2DI "TARGET_VECTOR_ELEN_64") + (VNx4DI "TARGET_VECTOR_ELEN_64") + (VNx8DI "TARGET_VECTOR_ELEN_64") + (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") +]) + +(define_mode_iterator VQI_LMUL1 [ + (VNx16QI "TARGET_MIN_VLEN >= 128") + (VNx8QI "TARGET_MIN_VLEN == 64") + (VNx4QI "TARGET_MIN_VLEN == 32") +]) + +(define_mode_iterator VHI_LMUL1 [ + (VNx8HI "TARGET_MIN_VLEN >= 128") + (VNx4HI "TARGET_MIN_VLEN == 64") + (VNx2HI "TARGET_MIN_VLEN == 32") +]) + +(define_mode_iterator VSI_LMUL1 [ + (VNx4SI "TARGET_MIN_VLEN >= 128") + (VNx2SI "TARGET_MIN_VLEN == 64") + (VNx1SI "TARGET_MIN_VLEN == 32") +]) + +(define_mode_iterator VDI_LMUL1 [ + (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") + (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN == 64") +]) + (define_mode_attr VLMULX2 [ (VNx1QI "VNx2QI") (VNx2QI "VNx4QI") (VNx4QI "VNx8QI") (VNx8QI "VNx16QI") (VNx16QI "VNx32QI") (VNx32QI "VNx64QI") (VNx64QI "VNx128QI") (VNx1HI "VNx2HI") (VNx2HI "VNx4HI") (VNx4HI "VNx8HI") (VNx8HI "VNx16HI") (VNx16HI "VNx32HI") (VNx32HI "VNx64HI") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 1d1847bd85a..d396e278503 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -7244,76 +7244,168 @@ (define_insn "@pred_rod_trunc<mode>"<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; -------------------------------------------------------------------------------<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; For reduction operations, we should have seperate patterns for<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> -;; TARGET_MIN_VLEN == 32 and TARGET_MIN_VLEN > 32.<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> +;; different types. For each type, we will cover MIN_VLEN == 32, MIN_VLEN == 64<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> +;; and the MIN_VLEN >= 128 from the well defined iterators.<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; Since reduction need LMUL = 1 scalar operand as the input operand<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; and they are different.<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64*<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1> -(define_insn "@pred_reduc_<reduc><mode><vlmul1<mailto:%22%0d;;%20-------------------------------------------------------------------------------%0d;;%20For%20reduction%20operations,%20we%20should%20have%20seperate%20patterns%20for%0d-;;%20TARGET_MIN_VLEN%20==%2032%20and%20TARGET_MIN_VLEN%20%3e%2032.%0d+;;%20different%20types.%20For%20each%20type,%20we%20will%20cover%20MIN_VLEN%20==%2032,%20MIN_VLEN%20==%2064%0d+;;%20and%20the%20MIN_VLEN%20%3e=%20128%20from%20the%20well%20defined%20iterators.%0d;;%20Since%20reduction%20need%20LMUL%20=%201%20scalar%20operand%20as%20the%20input%20operand%0d;;%20and%20they%20are%20different.%0d;;%20For%20example,%20The%20LMUL%20=%201%20corresponding%20mode%20of%20VNx16QImode%20is%20VNx4QImode%0d;;%20for%20-march=rv*zve32*%20wheras%20VNx8QImode%20for%20-march=rv*zve64*%0d-(define_insn%20%22@pred_reduc_%3creduc%3e%3cmode%3e%3cvlmul1>>" - [(set (match_operand:<VLMUL1> 0 "register_operand" "=vr, vr") - (unspec:<VLMUL1> - [(unspec:<VM> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") + +;; Integer Reduction for QI +(define_insn "@pred_reduc_<reduc><VQI:mode><VQI_LMUL1:mode>" + [ + (set + (match_operand:VQI_LMUL1 0 "register_operand" "=vr, vr") + (unspec:VQI_LMUL1 + [ + (unspec:<VQI:VM> + [ + (match_operand:<VQI:VM> 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_reduc:VI - (vec_duplicate:VI - (vec_select:<VEL> - (match_operand:<VLMUL1> 4 "register_operand" " vr, vr") - (parallel [(const_int 0)]))) - (match_operand:VI 3 "register_operand" " vr, vr")) - (match_operand:<VLMUL1> 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] - "TARGET_VECTOR && TARGET_MIN_VLEN >= 128" + (reg:SI VTYPE_REGNUM) + ] UNSPEC_VPREDICATE + ) + (any_reduc:VQI + (vec_duplicate:VQI + (vec_select:<VEL> + (match_operand:VQI_LMUL1 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]) + ) + ) + (match_operand:VQI 3 "register_operand" " vr, vr") + ) + (match_operand:VQI_LMUL1 2 "vector_merge_operand" " vu, 0") + ] UNSPEC_REDUC + ) + ) + ] + "TARGET_VECTOR" "vred<reduc>.vs\t%0,%3,%4%p1" - [(set_attr "type" "vired") - (set_attr "mode" "<MODE>")]) + [ + (set_attr "type" "vired") + (set_attr "mode" "<VQI:MODE>") + ] +) -(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve64>" - [(set (match_operand:<VLMUL1_ZVE64> 0 "register_operand" "=vr, vr") - (unspec:<VLMUL1_ZVE64> - [(unspec:<VM> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") +;; Integer Reduction for HI +(define_insn "@pred_reduc_<reduc><VHI:mode><VHI_LMUL1:mode>" + [ + (set + (match_operand:VHI_LMUL1 0 "register_operand" "=vr, vr") + (unspec:VHI_LMUL1 + [ + (unspec:<VHI:VM> + [ + (match_operand:<VHI:VM> 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_reduc:VI_ZVE64 - (vec_duplicate:VI_ZVE64 - (vec_select:<VEL> - (match_operand:<VLMUL1_ZVE64> 4 "register_operand" " vr, vr") - (parallel [(const_int 0)]))) - (match_operand:VI_ZVE64 3 "register_operand" " vr, vr")) - (match_operand:<VLMUL1_ZVE64> 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] - "TARGET_VECTOR && TARGET_MIN_VLEN == 64" + (reg:SI VTYPE_REGNUM) + ] UNSPEC_VPREDICATE + ) + (any_reduc:VHI + (vec_duplicate:VHI + (vec_select:<VEL> + (match_operand:VHI_LMUL1 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]) + ) + ) + (match_operand:VHI 3 "register_operand" " vr, vr") + ) + (match_operand:VHI_LMUL1 2 "vector_merge_operand" " vu, 0") + ] UNSPEC_REDUC + ) + ) + ] + "TARGET_VECTOR" "vred<reduc>.vs\t%0,%3,%4%p1" - [(set_attr "type" "vired") - (set_attr "mode" "<MODE>")]) + [ + (set_attr "type" "vired") + (set_attr "mode" "<VHI:MODE>") + ] +) -(define_insn "@pred_reduc_<reduc><mode><vlmul1_zve32>" - [(set (match_operand:<VLMUL1_ZVE32> 0 "register_operand" "=vd, vd, vr, vr") - (unspec:<VLMUL1_ZVE32> - [(unspec:<VM> - [(match_operand:<VM> 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") +;; Integer Reduction for SI +(define_insn "@pred_reduc_<reduc><VSI:mode><VSI_LMUL1:mode>" + [ + (set + (match_operand:VSI_LMUL1 0 "register_operand" "=vr, vr") + (unspec:VSI_LMUL1 + [ + (unspec:<VSI:VM> + [ + (match_operand:<VSI:VM> 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (any_reduc:VI_ZVE32 - (vec_duplicate:VI_ZVE32 - (vec_select:<VEL> - (match_operand:<VLMUL1_ZVE32> 4 "register_operand" " vr, vr, vr, vr") - (parallel [(const_int 0)]))) - (match_operand:VI_ZVE32 3 "register_operand" " vr, vr, vr, vr")) - (match_operand:<VLMUL1_ZVE32> 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] - "TARGET_VECTOR && TARGET_MIN_VLEN == 32" + (reg:SI VTYPE_REGNUM) + ] UNSPEC_VPREDICATE + ) + (any_reduc:VSI + (vec_duplicate:VSI + (vec_select:<VEL> + (match_operand:VSI_LMUL1 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]) + ) + ) + (match_operand:VSI 3 "register_operand" " vr, vr") + ) + (match_operand:VSI_LMUL1 2 "vector_merge_operand" " vu, 0") + ] UNSPEC_REDUC + ) + ) + ] + "TARGET_VECTOR" "vred<reduc>.vs\t%0,%3,%4%p1" - [(set_attr "type" "vired") - (set_attr "mode" "<MODE>")]) + [ + (set_attr "type" "vired") + (set_attr "mode" "<VSI:MODE>") + ] +) + +;; Integer Reduction for DI +(define_insn "@pred_reduc_<reduc><VDI:mode><VDI_LMUL1:mode>" + [ + (set + (match_operand:VDI_LMUL1 0 "register_operand" "=vr, vr") + (unspec:VDI_LMUL1 + [ + (unspec:<VDI:VM> + [ + (match_operand:<VDI:VM> 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM) + ] UNSPEC_VPREDICATE + ) + (any_reduc:VDI + (vec_duplicate:VDI + (vec_select:<VEL> + (match_operand:VDI_LMUL1 4 "register_operand" " vr, vr") + (parallel [(const_int 0)]) + ) + ) + (match_operand:VDI 3 "register_operand" " vr, vr") + ) + (match_operand:VDI_LMUL1 2 "vector_merge_operand" " vu, 0") + ] UNSPEC_REDUC + ) + ) + ] + "TARGET_VECTOR" + "vred<reduc>.vs\t%0,%3,%4%p1" + [ + (set_attr "type" "vired") + (set_attr "mode" "<VDI:MODE>") + ] +) (define_insn "@pred_widen_reduc_plus<v_su><mode><vwlmul1>" [(set (match_operand:<VWLMUL1> 0 "register_operand" "=&vr, &vr") diff --git a/gcc/machmode.h b/gcc/machmode.h index a22df60dc20..8ecfc2a656e 100644 --- a/gcc/machmode.h +++ b/gcc/machmode.h @@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES]; || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \ || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM) +/* Nonzero if MODE is a vector float mode. */ +#define VECTOR_FLOAT_MODE_P(MODE) \ + (GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT) \ + /* Nonzero if MODE is a scalar integral mode. */ #define SCALAR_INT_MODE_P(MODE) \ (GET_MODE_CLASS (MODE) == MODE_INT \ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c new file mode 100644 index 00000000000..2e4aeb5b90b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve32f -mabi=ilp32f -O3 -Wno-psabi" } */ + +#include "pr110265-1.h" + +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h new file mode 100644 index 00000000000..ade44cc27ea --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-1.h @@ -0,0 +1,65 @@ +#include "riscv_vector.h" + +vint8m1_t test_vredand_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredand_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredand_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredand_vs_u32m8_u32m1(vector, scalar, vl); +} + +vint8m1_t test_vredmax_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredmax_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vint32m1_t test_vredmax_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) { + return __riscv_vredmax_vs_i32m8_i32m1(vector, scalar, vl); +} + +vuint8m1_t test_vredmaxu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) { + return __riscv_vredmaxu_vs_u8mf4_u8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredmaxu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredmaxu_vs_u32m8_u32m1(vector, scalar, vl); +} + +vint8m1_t test_vredmin_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredmin_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vint32m1_t test_vredmin_vs_i32m8_i32m1(vint32m8_t vector, vint32m1_t scalar, size_t vl) { + return __riscv_vredmin_vs_i32m8_i32m1(vector, scalar, vl); +} + +vuint8m1_t test_vredminu_vs_u8mf4_u8m1(vuint8mf4_t vector, vuint8m1_t scalar, size_t vl) { + return __riscv_vredminu_vs_u8mf4_u8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredminu_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredminu_vs_u32m8_u32m1(vector, scalar, vl); +} + +vint8m1_t test_vredor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredor_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredor_vs_u32m8_u32m1(vector, scalar, vl); +} + +vint8m1_t test_vredsum_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredsum_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredsum_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredsum_vs_u32m8_u32m1(vector, scalar, vl); +} + +vint8m1_t test_vredxor_vs_i8mf4_i8m1(vint8mf4_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredxor_vs_i8mf4_i8m1(vector, scalar, vl); +} + +vuint32m1_t test_vredxor_vs_u32m8_u32m1(vuint32m8_t vector, vuint32m1_t scalar, size_t vl) { + return __riscv_vredxor_vs_u32m8_u32m1(vector, scalar, vl); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c new file mode 100644 index 00000000000..7454c1cc918 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64d -mabi=ilp32d -O3 -Wno-psabi" } */ + +#include "pr110265-1.h" +#include "pr110265-2.h" + +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */ +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */ +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h new file mode 100644 index 00000000000..6a7e14e51f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-2.h @@ -0,0 +1,57 @@ +#include "riscv_vector.h" + +vint8m1_t test_vredand_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredand_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vint8m1_t test_vredmax_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredmax_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vuint8m1_t test_vredmaxu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) { + return __riscv_vredmaxu_vs_u8mf8_u8m1(vector, scalar, vl); +} + +vint8m1_t test_vredmin_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredmin_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vuint8m1_t test_vredminu_vs_u8mf8_u8m1(vuint8mf8_t vector, vuint8m1_t scalar, size_t vl) { + return __riscv_vredminu_vs_u8mf8_u8m1(vector, scalar, vl); +} + +vint8m1_t test_vredor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredor_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vint8m1_t test_vredsum_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredsum_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vint8m1_t test_vredxor_vs_i8mf8_i8m1(vint8mf8_t vector, vint8m1_t scalar, size_t vl) { + return __riscv_vredxor_vs_i8mf8_i8m1(vector, scalar, vl); +} + +vuint64m1_t test_vredand_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredand_vs_u64m8_u64m1(vector, scalar, vl); +} + +vuint64m1_t test_vredmaxu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredmaxu_vs_u64m8_u64m1(vector, scalar, vl); +} + +vuint64m1_t test_vredminu_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredminu_vs_u64m8_u64m1(vector, scalar, vl); +} + +vuint64m1_t test_vredor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredor_vs_u64m8_u64m1(vector, scalar, vl); +} + +vuint64m1_t test_vredsum_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredsum_vs_u64m8_u64m1(vector, scalar, vl); +} + +vuint64m1_t test_vredxor_vs_u64m8_u64m1(vuint64m8_t vector, vuint64m1_t scalar, size_t vl) { + return __riscv_vredxor_vs_u64m8_u64m1(vector, scalar, vl); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c new file mode 100644 index 00000000000..0ed1fbae35a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110265-3.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zve64f -mabi=ilp32f -O3 -Wno-psabi" } */ + +#include "pr110265-1.h" +#include "pr110265-2.h" + +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */ +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */ +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */ -- 2.34.1