The following adds a new --param for debugging the vectorizer's alignment
peeling by increasing the cost of aligned loads and stores.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

This makes the PR115843 testcase fail again on trunk (but not on the
branch), seemingly uncovering another backend issue.  It makes the
testcase get alignment peeling even with the zen4 costs fixed.

Any objection?

	* params.opt (--param=vect-aligned-ldst-cost-bias): New.
	* doc/invoke.texi (--param=vect-aligned-ldst-cost-bias): Document.
	* tree-vect-stmts.cc (vect_get_store_cost): Honor
	param_vect_aligned_ldst_cost_bias.
	(vect_get_load_cost): Likewise.

	* gcc.dg/vect/pr115843.c: Use it.
---
 gcc/doc/invoke.texi                  | 4 ++++
 gcc/params.opt                       | 4 ++++
 gcc/testsuite/gcc.dg/vect/pr115843.c | 1 +
 gcc/tree-vect-stmts.cc               | 2 ++
 4 files changed, 11 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1360cae3986..e542cefbb4a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16914,6 +16914,10 @@ permit performing redundancy elimination after reload.
 The maximum number of insns in loop header duplicated by the copy loop headers
 pass.
 
+@item vect-aligned-ldst-cost-bias
+Bias to apply to the cost of aligned loads and stores. This
+is useful for debugging only.
+
 @item vect-epilogues-nomask
 Enable loop epilogue vectorization using smaller vector size.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index 3c4369fa052..5f86d564421 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1166,6 +1166,10 @@ Use direct poisoning/unpoisoning instructions for variables smaller or equal to
 Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1) Param
 Whether to use canonical types.
 
+-param=vect-aligned-ldst-cost-bias=
+Common Joined UInteger Var(param_vect_aligned_ldst_cost_bias) Init(0) Param Optimization
+Bias to apply to the cost of aligned loads and stores.
+
 -param=vect-epilogues-nomask=
 Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop epilogue vectorization using smaller vector size.
diff --git a/gcc/testsuite/gcc.dg/vect/pr115843.c b/gcc/testsuite/gcc.dg/vect/pr115843.c
index 1b3fe277209..6701fa3499a 100644
--- a/gcc/testsuite/gcc.dg/vect/pr115843.c
+++ b/gcc/testsuite/gcc.dg/vect/pr115843.c
@@ -1,3 +1,4 @@
+/* { dg-additional-options "--param vect-aligned-ldst-cost-bias=100" } */
 /* { dg-additional-options "-mavx512vl --param vect-partial-vector-usage=2" { target { avx512f_runtime && avx512vl } } } */
 
 #include "tree-vect.h"
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index fc02e84b4b4..2502dbd5413 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -997,6 +997,7 @@ vect_get_store_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
         *inside_cost += record_stmt_cost (body_cost_vec, ncopies,
                                           vector_store, stmt_info, 0,
                                           vect_body);
+        *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
 
         if (dump_enabled_p ())
           dump_printf_loc (MSG_NOTE, vect_location,
@@ -1049,6 +1050,7 @@ vect_get_load_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
       {
         *inside_cost += record_stmt_cost (body_cost_vec, ncopies, vector_load,
                                           stmt_info, 0, vect_body);
+        *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
 
         if (dump_enabled_p ())
           dump_printf_loc (MSG_NOTE, vect_location,
-- 
2.35.3