LGTM too, thanks :)
On Mon, Jun 5, 2023 at 4:27 PM juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai> wrote:
>
> LGTM,
>
> ________________________________
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2023-06-05 16:20
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
> Subject: [PATCH v2] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API
> From: Pan Li <pan2...@intel.com>
>
> This patch supports the intrinsic API of FP16 ZVFH floating-point, aka
> SEW=16, for the below instructions:
>
> vfadd vfsub vfrsub vfwadd vfwsub
> vfmul vfdiv vfrdiv vfwmul
> vfmacc vfnmacc vfmsac vfnmsac vfmadd
> vfnmadd vfmsub vfnmsub vfwmacc vfwnmacc vfwmsac vfwnmsac
> vfsqrt vfrsqrt7 vfrec7
> vfmin vfmax
> vfsgnj vfsgnjn vfsgnjx
> vmfeq vmfne vmflt vmfle vmfgt vmfge
> vfclass vfmerge
> vfmv
> vfcvt vfwcvt vfncvt
>
> Users can then leverage the intrinsic APIs to perform the FP16-related
> operations. Please note that not all the intrinsic APIs are covered in
> the test files; only some typical ones are picked, since there are too
> many. We will test the FP16-related intrinsic APIs entirely soon.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
>
> gcc/ChangeLog:
>
> 	* config/riscv/riscv-vector-builtins-types.def
> 	(vfloat32mf2_t): New type for DEF_RVV_WEXTF_OPS.
> 	(vfloat32m1_t): Ditto.
> 	(vfloat32m2_t): Ditto.
> 	(vfloat32m4_t): Ditto.
> 	(vfloat32m8_t): Ditto.
> 	(vint16mf4_t): New type for DEF_RVV_CONVERT_I_OPS.
> 	(vint16mf2_t): Ditto.
> 	(vint16m1_t): Ditto.
> 	(vint16m2_t): Ditto.
> 	(vint16m4_t): Ditto.
> 	(vint16m8_t): Ditto.
> 	(vuint16mf4_t): New type for DEF_RVV_CONVERT_U_OPS.
> 	(vuint16mf2_t): Ditto.
> 	(vuint16m1_t): Ditto.
> 	(vuint16m2_t): Ditto.
> 	(vuint16m4_t): Ditto.
> 	(vuint16m8_t): Ditto.
> 	(vint32mf2_t): New type for DEF_RVV_WCONVERT_I_OPS.
> 	(vint32m1_t): Ditto.
> 	(vint32m2_t): Ditto.
> 	(vint32m4_t): Ditto.
> 	(vint32m8_t): Ditto.
> 	(vuint32mf2_t): New type for DEF_RVV_WCONVERT_U_OPS.
> 	(vuint32m1_t): Ditto.
> 	(vuint32m2_t): Ditto.
> 	(vuint32m4_t): Ditto.
> 	(vuint32m8_t): Ditto.
> 	* config/riscv/vector-iterators.md: Add FP16 support for V,
> 	VWCONVERTI, VCONVERT, VNCONVERT, VMUL1 and vlmul1.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: New test.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>  .../riscv/riscv-vector-builtins-types.def |  32 ++
>  gcc/config/riscv/vector-iterators.md      |  21 +
>  .../riscv/rvv/base/zvfh-intrinsic.c       | 418 ++++++++++++++++++
>  3 files changed, 471 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 9cb3aca992e..1e2491de6d6 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -518,11 +518,24 @@ DEF_RVV_FULL_V_U_OPS (vuint64m2_t, RVV_REQUIRE_FULL_V)
> DEF_RVV_FULL_V_U_OPS (vuint64m4_t, RVV_REQUIRE_FULL_V)
> DEF_RVV_FULL_V_U_OPS (vuint64m8_t, RVV_REQUIRE_FULL_V)
> +DEF_RVV_WEXTF_OPS (vfloat32mf2_t, TARGET_ZVFH | RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_WEXTF_OPS (vfloat32m1_t, TARGET_ZVFH | RVV_REQUIRE_ELEN_FP_32)
> +DEF_RVV_WEXTF_OPS (vfloat32m2_t, TARGET_ZVFH | RVV_REQUIRE_ELEN_FP_32)
> +DEF_RVV_WEXTF_OPS (vfloat32m4_t, TARGET_ZVFH | RVV_REQUIRE_ELEN_FP_32)
> +DEF_RVV_WEXTF_OPS (vfloat32m8_t, TARGET_ZVFH | RVV_REQUIRE_ELEN_FP_32)
> +
> DEF_RVV_WEXTF_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
> DEF_RVV_WEXTF_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
> DEF_RVV_WEXTF_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
> DEF_RVV_WEXTF_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64)
> +DEF_RVV_CONVERT_I_OPS (vint16mf4_t, TARGET_ZVFH | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_CONVERT_I_OPS (vint16mf2_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_I_OPS (vint16m1_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_I_OPS (vint16m2_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_I_OPS (vint16m4_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_I_OPS (vint16m8_t, TARGET_ZVFH)
> +
> DEF_RVV_CONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
> DEF_RVV_CONVERT_I_OPS (vint32m1_t, 0)
> DEF_RVV_CONVERT_I_OPS (vint32m2_t, 0)
> @@ -533,6 +546,13 @@ DEF_RVV_CONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_64)
> DEF_RVV_CONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_64)
> DEF_RVV_CONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, TARGET_ZVFH | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_U_OPS (vuint16m1_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_U_OPS (vuint16m2_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_U_OPS (vuint16m4_t, TARGET_ZVFH)
> +DEF_RVV_CONVERT_U_OPS (vuint16m8_t, TARGET_ZVFH)
> +
> DEF_RVV_CONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
> DEF_RVV_CONVERT_U_OPS (vuint32m1_t, 0)
> DEF_RVV_CONVERT_U_OPS (vuint32m2_t, 0)
> @@ -543,11 +563,23 @@ DEF_RVV_CONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
> DEF_RVV_CONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
> DEF_RVV_CONVERT_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, TARGET_ZVFH | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_WCONVERT_I_OPS (vint32m1_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_I_OPS (vint32m2_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_I_OPS (vint32m4_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_I_OPS (vint32m8_t, TARGET_ZVFH)
> +
> DEF_RVV_WCONVERT_I_OPS (vint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> DEF_RVV_WCONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> DEF_RVV_WCONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> DEF_RVV_WCONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> +DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, TARGET_ZVFH | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, TARGET_ZVFH)
> +DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, TARGET_ZVFH)
> +
> DEF_RVV_WCONVERT_U_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> DEF_RVV_WCONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> DEF_RVV_WCONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index 90743ed76c5..e4f2ba90799 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -296,6 +296,14 @@ (define_mode_iterator VWI_ZVE32 [
> ])
> (define_mode_iterator VF [
> + (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
> + (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
> + (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
> + (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
> + (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
> + (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
> + (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
> +
> (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
> (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
> (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
> @@ -496,6 +504,13 @@ (define_mode_iterator VWEXTF [
> ])
> (define_mode_iterator VWCONVERTI [
> + (VNx1SI "TARGET_MIN_VLEN < 128 && TARGET_VECTOR_ELEN_FP_16")
> + (VNx2SI "TARGET_VECTOR_ELEN_FP_16")
> + (VNx4SI "TARGET_VECTOR_ELEN_FP_16")
> + (VNx8SI "TARGET_VECTOR_ELEN_FP_16")
> + (VNx16SI "TARGET_MIN_VLEN > 32 && TARGET_VECTOR_ELEN_FP_16")
> + (VNx32SI "TARGET_MIN_VLEN >= 128 && TARGET_VECTOR_ELEN_FP_16")
> +
> (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
> (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
> (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
> @@ -1239,17 +1254,21 @@ (define_mode_attr VINDEX_OCT_EXT [
> ])
> (define_mode_attr VCONVERT [
> + (VNx1HF "VNx1HI") (VNx2HF "VNx2HI") (VNx4HF "VNx4HI") (VNx8HF "VNx8HI") (VNx16HF "VNx16HI") (VNx32HF "VNx32HI") (VNx64HF "VNx64HI")
> (VNx1SF "VNx1SI") (VNx2SF "VNx2SI") (VNx4SF "VNx4SI") (VNx8SF "VNx8SI") (VNx16SF "VNx16SI") (VNx32SF "VNx32SI")
> (VNx1DF "VNx1DI") (VNx2DF "VNx2DI") (VNx4DF "VNx4DI") (VNx8DF "VNx8DI") (VNx16DF "VNx16DI")
> ])
> (define_mode_attr vconvert [
> + (VNx1HF "vnx1hi") (VNx2HF "vnx2hi") (VNx4HF "vnx4hi") (VNx8HF "vnx8hi") (VNx16HF "vnx16hi") (VNx32HF "vnx32hi") (VNx64HF "vnx64hi")
> (VNx1SF "vnx1si") (VNx2SF "vnx2si") (VNx4SF "vnx4si") (VNx8SF "vnx8si") (VNx16SF "vnx16si") (VNx32SF "vnx32si")
> (VNx1DF "vnx1di") (VNx2DF "vnx2di") (VNx4DF "vnx4di") (VNx8DF "vnx8di") (VNx16DF "vnx16di")
> ])
> (define_mode_attr VNCONVERT [
> + (VNx1HF "VNx1QI") (VNx2HF "VNx2QI") (VNx4HF "VNx4QI") (VNx8HF "VNx8QI") (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
> (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI") (VNx16SF "VNx16HI") (VNx32SF "VNx32HI")
> + (VNx1SI "VNx1HF") (VNx2SI "VNx2HF") (VNx4SI "VNx4HF") (VNx8SI "VNx8HF") (VNx16SI "VNx16HF") (VNx32SI "VNx32HF")
> (VNx1DI "VNx1SF") (VNx2DI "VNx2SF") (VNx4DI "VNx4SF") (VNx8DI "VNx8SF") (VNx16DI "VNx16SF")
> (VNx1DF "VNx1SI") (VNx2DF "VNx2SI") (VNx4DF "VNx4SI") (VNx8DF "VNx8SI") (VNx16DF "VNx16SI")
> ])
> @@ -1263,6 +1282,7 @@ (define_mode_attr VLMUL1 [
> (VNx8SI "VNx4SI") (VNx16SI "VNx4SI") (VNx32SI "VNx4SI")
> (VNx1DI "VNx2DI") (VNx2DI "VNx2DI")
> (VNx4DI "VNx2DI") (VNx8DI "VNx2DI") (VNx16DI "VNx2DI")
> + (VNx1HF "VNx8HF") (VNx2HF "VNx8HF") (VNx4HF "VNx8HF") (VNx8HF "VNx8HF") (VNx16HF "VNx8HF") (VNx32HF "VNx8HF") (VNx64HF "VNx8HF")
> (VNx1SF "VNx4SF") (VNx2SF "VNx4SF")
> (VNx4SF "VNx4SF") (VNx8SF "VNx4SF") (VNx16SF "VNx4SF") (VNx32SF "VNx4SF")
> (VNx1DF "VNx2DF") (VNx2DF "VNx2DF")
> @@ -1333,6 +1353,7 @@ (define_mode_attr vlmul1 [
> (VNx8SI "vnx4si") (VNx16SI "vnx4si") (VNx32SI "vnx4si")
> (VNx1DI "vnx2di") (VNx2DI "vnx2di")
> (VNx4DI "vnx2di") (VNx8DI "vnx2di") (VNx16DI "vnx2di")
> + (VNx1HF "vnx8hf") (VNx2HF "vnx8hf") (VNx4HF "vnx8hf") (VNx8HF "vnx8hf") (VNx16HF "vnx8hf") (VNx32HF "vnx8hf") (VNx64HF "vnx8hf")
> (VNx1SF "vnx4sf") (VNx2SF "vnx4sf")
> (VNx4SF "vnx4sf") (VNx8SF "vnx4sf") (VNx16SF "vnx4sf") (VNx32SF "vnx4sf")
> (VNx1DF "vnx2df") (VNx2DF "vnx2df")
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
> new file mode 100644
> index 00000000000..0d244aac9ec
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
> @@ -0,0 +1,418 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +typedef _Float16 float16_t;
> +
> +vfloat16mf4_t test_vfadd_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfadd_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfadd_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfadd_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfsub_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfsub_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfsub_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfrsub_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfrsub_vf_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfrsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfrsub_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwadd_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfwadd_vv_f32mf2(op1, op2, vl);
> +}
> +
> +vfloat32m8_t test_vfwadd_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
> +  return __riscv_vfwadd_vv_f32m8(op1, op2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwadd_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfwadd_wv_f32mf2(op1, op2, vl);
> +}
> +
> +vfloat32m8_t test_vfwadd_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
> +  return __riscv_vfwadd_wv_f32m8(op1, op2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwsub_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfwsub_vv_f32mf2(op1, op2, vl);
> +}
> +
> +vfloat32m8_t test_vfwsub_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
> +  return __riscv_vfwsub_vv_f32m8(op1, op2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwsub_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfwsub_wv_f32mf2(op1, op2, vl);
> +}
> +
> +vfloat32m8_t test_vfwsub_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
> +  return __riscv_vfwsub_wv_f32m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmul_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfmul_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfmul_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfmul_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfdiv_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfdiv_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfdiv_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfrdiv_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfrdiv_vf_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfrdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfrdiv_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwmul_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfwmul_vv_f32mf2(op1, op2, vl);
> +}
> +
> +vfloat32m8_t test_vfwmul_vf_f32m8(vfloat16m4_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfwmul_vf_f32m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfmacc_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfmacc_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfnmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfnmacc_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfnmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfnmacc_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfmsac_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfmsac_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfnmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfnmsac_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfnmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfnmsac_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfmadd_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfmadd_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfnmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfnmadd_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfnmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfnmadd_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfmsub_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfmsub_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfnmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfnmsub_vv_f16mf4(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16m8_t test_vfnmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
> +  return __riscv_vfnmsub_vf_f16m8(vd, rs1, vs2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfwmacc_vv_f32mf2(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32m8_t test_vfwmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
> +  return __riscv_vfwmacc_vf_f32m8(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwnmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfwnmacc_vv_f32mf2(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32m8_t test_vfwnmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
> +  return __riscv_vfwnmacc_vf_f32m8(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfwmsac_vv_f32mf2(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32m8_t test_vfwmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
> +  return __riscv_vfwmsac_vf_f32m8(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32mf2_t test_vfwnmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
> +  return __riscv_vfwnmsac_vv_f32mf2(vd, vs1, vs2, vl);
> +}
> +
> +vfloat32m8_t test_vfwnmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
> +  return __riscv_vfwnmsac_vf_f32m8(vd, vs1, vs2, vl);
> +}
> +
> +vfloat16mf4_t test_vfsqrt_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
> +  return __riscv_vfsqrt_v_f16mf4(op1, vl);
> +}
> +
> +vfloat16m8_t test_vfsqrt_v_f16m8(vfloat16m8_t op1, size_t vl) {
> +  return __riscv_vfsqrt_v_f16m8(op1, vl);
> +}
> +
> +vfloat16mf4_t test_vfrsqrt7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
> +  return __riscv_vfrsqrt7_v_f16mf4(op1, vl);
> +}
> +
> +vfloat16m8_t test_vfrsqrt7_v_f16m8(vfloat16m8_t op1, size_t vl) {
> +  return __riscv_vfrsqrt7_v_f16m8(op1, vl);
> +}
> +
> +vfloat16mf4_t test_vfrec7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
> +  return __riscv_vfrec7_v_f16mf4(op1, vl);
> +}
> +
> +vfloat16m8_t test_vfrec7_v_f16m8(vfloat16m8_t op1, size_t vl) {
> +  return __riscv_vfrec7_v_f16m8(op1, vl);
> +}
> +
> +vfloat16mf4_t test_vfmin_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfmin_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfmin_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfmin_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfmax_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfmax_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfmax_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfmax_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfsgnj_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfsgnj_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfsgnj_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfsgnj_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfsgnjn_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfsgnjn_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfsgnjn_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfsgnjn_vf_f16m8(op1, op2, vl);
> +}
> +
> +vfloat16mf4_t test_vfsgnjx_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vfsgnjx_vv_f16mf4(op1, op2, vl);
> +}
> +
> +vfloat16m8_t test_vfsgnjx_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vfsgnjx_vf_f16m8(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmfeq_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmfeq_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmfeq_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmfeq_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmfne_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmfne_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmfne_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmfne_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmflt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmflt_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmflt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmflt_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmfle_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmfle_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmfle_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmfle_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmfgt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmfgt_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmfgt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmfgt_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vbool64_t test_vmfge_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
> +  return __riscv_vmfge_vv_f16mf4_b64(op1, op2, vl);
> +}
> +
> +vbool2_t test_vmfge_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
> +  return __riscv_vmfge_vf_f16m8_b2(op1, op2, vl);
> +}
> +
> +vuint16mf4_t test_vfclass_v_u16mf4(vfloat16mf4_t op1, size_t vl) {
> +  return __riscv_vfclass_v_u16mf4(op1, vl);
> +}
> +
> +vuint16m8_t test_vfclass_v_u16m8(vfloat16m8_t op1, size_t vl) {
> +  return __riscv_vfclass_v_u16m8(op1, vl);
> +}
> +
> +vfloat16mf4_t test_vfmerge_vfm_f16mf4(vfloat16mf4_t op1, float16_t op2, vbool64_t mask, size_t vl) {
> +  return __riscv_vfmerge_vfm_f16mf4(op1, op2, mask, vl);
> +}
> +
> +vfloat16m8_t test_vfmerge_vfm_f16m8(vfloat16m8_t op1, float16_t op2, vbool2_t mask, size_t vl) {
> +  return __riscv_vfmerge_vfm_f16m8(op1, op2, mask, vl);
> +}
> +
> +vfloat16mf4_t test_vfmv_v_f_f16mf4(float16_t src, size_t vl) {
> +  return __riscv_vfmv_v_f_f16mf4(src, vl);
> +}
> +
> +vfloat16m8_t test_vfmv_v_f_f16m8(float16_t src, size_t vl) {
> +  return __riscv_vfmv_v_f_f16m8(src, vl);
> +}
> +
> +vint16mf4_t test_vfcvt_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
> +  return __riscv_vfcvt_x_f_v_i16mf4(src, vl);
> +}
> +
> +vuint16m8_t test_vfcvt_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
> +  return __riscv_vfcvt_xu_f_v_u16m8(src, vl);
> +}
> +
> +vfloat16mf4_t test_vfcvt_f_x_v_f16mf4(vint16mf4_t src, size_t vl) {
> +  return __riscv_vfcvt_f_x_v_f16mf4(src, vl);
> +}
> +
> +vfloat16m8_t test_vfcvt_f_xu_v_f16m8(vuint16m8_t src, size_t vl) {
> +  return __riscv_vfcvt_f_xu_v_f16m8(src, vl);
> +}
> +
> +vint16mf4_t test_vfcvt_rtz_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
> +  return __riscv_vfcvt_rtz_x_f_v_i16mf4(src, vl);
> +}
> +
> +vuint16m8_t test_vfcvt_rtz_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
> +  return __riscv_vfcvt_rtz_xu_f_v_u16m8(src, vl);
> +}
> +
> +vfloat16mf4_t test_vfwcvt_f_x_v_f16mf4(vint8mf8_t src, size_t vl) {
> +  return __riscv_vfwcvt_f_x_v_f16mf4(src, vl);
> +}
> +
> +vuint32m8_t test_vfwcvt_xu_f_v_u32m8(vfloat16m4_t src, size_t vl) {
> +  return __riscv_vfwcvt_xu_f_v_u32m8(src, vl);
> +}
> +
> +vint8mf8_t test_vfncvt_x_f_w_i8mf8(vfloat16mf4_t src, size_t vl) {
> +  return __riscv_vfncvt_x_f_w_i8mf8(src, vl);
> +}
> +
> +vfloat16m4_t test_vfncvt_f_xu_w_f16m4(vuint32m8_t src, size_t vl) {
> +  return __riscv_vfncvt_f_xu_w_f16m4(src, vl);
> +}
> +
> +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 43 } } */
> +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au]} 11 } } */
> +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 34 } } */
> +/* { dg-final { scan-assembler-times {vfadd\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfsub\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfrsub\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwadd\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vfwsub\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vfmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfdiv\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfrdiv\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfnmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfnmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfwnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfsqrt\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfrsqrt7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfrec7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmin\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmax\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfsgnj\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfsgnjn\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfsgnjx\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmfeq\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmfne\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmflt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmfle\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmfgt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vmfge\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfclass\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*fa[0-9]+,\s*v0} 2 } } */
> +/* { dg-final { scan-assembler-times {vfmv\.v\.f\s+v[0-9]+,\s*fa[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.f\.xu\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.rtz\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfcvt\.rtz\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfwcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfwcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfncvt\.x\.f\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */
> --
> 2.34.1
>
>