The following adds a new --param for debugging the vectorizer's alignment
peeling by increasing the cost of aligned loads and stores.
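
To illustrate the intended effect, here is a minimal standalone sketch of
how the bias enters the cost of an aligned access.  It is not the GCC code:
the function name aligned_access_cost and the scalar base_cost are
hypothetical stand-ins for what record_stmt_cost would contribute, and the
bias argument stands in for param_vect_aligned_ldst_cost_bias.

```cpp
// Simplified model, assuming the base cost scales linearly with ncopies.
// The bias is added once per vector copy, mirroring
//   *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
unsigned
aligned_access_cost (unsigned base_cost, unsigned ncopies, unsigned bias)
{
  // Hypothetical stand-in for the cost recorded by record_stmt_cost.
  unsigned inside_cost = base_cost * ncopies;

  // Debugging bias, applied only on the aligned load/store path.
  inside_cost += bias * ncopies;
  return inside_cost;
}
```

With the default bias of 0 the cost is unchanged, so the param should have
no effect unless it is set explicitly.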

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

This makes the PR115843 testcase fail again on trunk (but not on the
branch), seemingly uncovering another backend issue.  It makes the
testcase get alignment peeling even with the zen4 costs fixed.

Any objection?

        * params.opt (--param=vect-aligned-ldst-cost-bias): New.
        * doc/invoke.texi (--param=vect-aligned-ldst-cost-bias): Document.
        * tree-vect-stmts.cc (vect_get_store_cost): Honor
        param_vect_aligned_ldst_cost_bias.
        (vect_get_load_cost): Likewise.

        * gcc.dg/vect/pr115843.c: Use it.
---
 gcc/doc/invoke.texi                  | 4 ++++
 gcc/params.opt                       | 4 ++++
 gcc/testsuite/gcc.dg/vect/pr115843.c | 1 +
 gcc/tree-vect-stmts.cc               | 2 ++
 4 files changed, 11 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1360cae3986..e542cefbb4a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16914,6 +16914,10 @@ permit performing redundancy elimination after reload.
 The maximum number of insns in loop header duplicated
 by the copy loop headers pass.
 
+@item vect-aligned-ldst-cost-bias
+Bias to apply to the cost of aligned loads and stores.  This
+is useful for debugging only.
+
 @item vect-epilogues-nomask
 Enable loop epilogue vectorization using smaller vector size.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index 3c4369fa052..5f86d564421 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1166,6 +1166,10 @@ Use direct poisoning/unpoisoning instructions for variables smaller or equal to
 Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1) Param
 Whether to use canonical types.
 
+-param=vect-aligned-ldst-cost-bias=
+Common Joined UInteger Var(param_vect_aligned_ldst_cost_bias) Init(0) Param Optimization
+Bias to apply to the cost of aligned loads and stores.
+
 -param=vect-epilogues-nomask=
 Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop epilogue vectorization using smaller vector size.
diff --git a/gcc/testsuite/gcc.dg/vect/pr115843.c b/gcc/testsuite/gcc.dg/vect/pr115843.c
index 1b3fe277209..6701fa3499a 100644
--- a/gcc/testsuite/gcc.dg/vect/pr115843.c
+++ b/gcc/testsuite/gcc.dg/vect/pr115843.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "--param vect-aligned-ldst-cost-bias=100" } */
 /* { dg-additional-options "-mavx512vl --param vect-partial-vector-usage=2" { target { avx512f_runtime && avx512vl } } } */
 
 #include "tree-vect.h"
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index fc02e84b4b4..2502dbd5413 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -997,6 +997,7 @@ vect_get_store_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
        *inside_cost += record_stmt_cost (body_cost_vec, ncopies,
                                          vector_store, stmt_info, 0,
                                          vect_body);
+       *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
 
         if (dump_enabled_p ())
           dump_printf_loc (MSG_NOTE, vect_location,
@@ -1049,6 +1050,7 @@ vect_get_load_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
       {
        *inside_cost += record_stmt_cost (body_cost_vec, ncopies, vector_load,
                                          stmt_info, 0, vect_body);
+       *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
 
         if (dump_enabled_p ())
           dump_printf_loc (MSG_NOTE, vect_location,
-- 
2.35.3
