> So this isn't a regression, but I can also understand the desire to fix
> this fairly significant performance issue.

I'd argue it is a regression, as the match.pd pattern that merges the
permutes was introduced after GCC 14.

After giving it a bit more thought, I'd still like to send the attached v2
because it excludes fewer cases and, consequently, requires fewer changes
to the test suite.

Regtested on rv64gcv_zvl512b.

Regards
 Robin

[PATCH v2] RISC-V: Disable two-source permutes for now [PR117173].

After testing on the BPI (4.2% improvement for x264 input 1, 4.4% for
input 2) and the discussion in PR117173 I figured it's best to disable
the two-source permutes by default for now.

The patch adds a parameter "riscv-two-source-permutes" which restores
the old behavior.

	PR target/117173

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_generic_patterns): Only
	support single-source permutes by default.
	* config/riscv/riscv.opt: New param "riscv-two-source-permutes".

gcc/testsuite/ChangeLog:

	* gcc.dg/fold-perm-2.c: Run with two-source permutes.
	* gcc.dg/pr54346.c: Ditto.
---
 gcc/config/riscv/riscv-v.cc        | 13 ++++++++++++-
 gcc/config/riscv/riscv.opt         |  4 ++++
 gcc/testsuite/gcc.dg/fold-perm-2.c |  1 +
 gcc/testsuite/gcc.dg/pr54346.c     |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e1172e9c7d2..9847439ca77 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3947,11 +3947,22 @@ shuffle_generic_patterns (struct expand_vec_perm_d *d)
   if (!get_gather_index_mode (d).exists (&sel_mode))
     return false;
 
+  rtx sel = vec_perm_indices_to_rtx (sel_mode, d->perm);
+  poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
+  rtx elt;
+
+  bool is_simple = d->one_vector_p
+    || const_vec_duplicate_p (sel, &elt)
+    || (nunits.is_constant ()
+	&& const_vec_all_in_range_p (sel, 0, nunits - 1));
+
+  if (!is_simple && !riscv_two_source_permutes)
+    return false;
+
   /* Success!  */
   if (d->testing_p)
     return true;
 
-  rtx sel = vec_perm_indices_to_rtx (sel_mode, d->perm);
   /* Some FIXED-VLMAX/VLS vector permutation situations call targethook
      instead of expand vec_perm<mode>, we handle it directly.  */
   expand_vec_perm (d->target, d->op0, d->op1, sel);
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index f51f8fd1cdf..ed0695e20d3 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -622,6 +622,10 @@ Enum(vsetvl_strategy) String(optim-no-fusion) Value(VSETVL_OPT_NO_FUSION)
 Target Undocumented RejectNegative Joined Enum(vsetvl_strategy) Var(vsetvl_strategy) Init(VSETVL_OPT)
 -param=vsetvl-strategy=<string>	Set the optimization level of VSETVL insert pass.
 
+-param=riscv-two-source-permutes
+Target Undocumented Uinteger Var(riscv_two_source_permutes) Init(0)
+-param=riscv-two-source-permutes	Enable permutes/gathers with two sources vectors.
+
 Enum
 Name(stringop_strategy) Type(enum stringop_strategy_enum)
 Valid arguments to -mstringop-strategy=:
diff --git a/gcc/testsuite/gcc.dg/fold-perm-2.c b/gcc/testsuite/gcc.dg/fold-perm-2.c
index 1a4ab4065de..9fd809ee296 100644
--- a/gcc/testsuite/gcc.dg/fold-perm-2.c
+++ b/gcc/testsuite/gcc.dg/fold-perm-2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O -fdump-tree-fre1" } */
+/* { dg-additional-options "--param=riscv-two-source-permutes" { target riscv*-*-* } } */
 
 typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
 typedef unsigned int vecu __attribute__ ((vector_size (4 * sizeof (unsigned int))));
diff --git a/gcc/testsuite/gcc.dg/pr54346.c b/gcc/testsuite/gcc.dg/pr54346.c
index 5ec0609f1e5..b78e0533ac2 100644
--- a/gcc/testsuite/gcc.dg/pr54346.c
+++ b/gcc/testsuite/gcc.dg/pr54346.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O -fdump-tree-dse1 -Wno-psabi" } */
+/* { dg-additional-options "--param=riscv-two-source-permutes" { target riscv*-*-* } } */
 
 typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
 
--
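
For readers skimming the shuffle_generic_patterns hunk, the gating logic can be modeled outside of GCC roughly as follows. This is only a standalone sketch under stated assumptions, not the patch's code: PermuteRequest, is_simple_permute and allow_generic_permute are hypothetical names, the selector is a plain std::vector instead of an rtx, and the element count is assumed to be a known constant (the nunits.is_constant () case).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parts of expand_vec_perm_d the check reads.
struct PermuteRequest {
  bool one_vector_p;          // true if both inputs are the same vector
  std::vector<uint64_t> sel;  // constant selector, one index per element
};

// Models the is_simple condition: a permute counts as simple if it has a
// single source, if every selector index is identical (a broadcast), or if
// every index lies in [0, nunits - 1] and so only reads the first input.
bool is_simple_permute(const PermuteRequest &req) {
  const uint64_t nunits = req.sel.size();  // assumed constant element count

  if (req.one_vector_p)
    return true;

  const bool is_duplicate =
      std::all_of(req.sel.begin(), req.sel.end(),
                  [&](uint64_t idx) { return idx == req.sel.front(); });

  const bool all_in_first_source =
      std::all_of(req.sel.begin(), req.sel.end(),
                  [&](uint64_t idx) { return idx < nunits; });

  return is_duplicate || all_in_first_source;
}

// Models the early return: non-simple permutes are only accepted when the
// (hypothetical) two_source_permutes flag, standing in for the new param,
// is set.
bool allow_generic_permute(const PermuteRequest &req,
                           bool two_source_permutes) {
  return is_simple_permute(req) || two_source_permutes;
}
```

In this model, with the flag off, an interleaving selector such as {0, 4, 1, 5} over two distinct four-element inputs is rejected, while {0, 1, 2, 3} or a broadcast selector is still accepted.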