This removes --param vect-inner-loop-cost-factor in favor of using the estimated number of iterations of the inner loop when available; otherwise it assumes a single inner iteration, which is conservative on the side of not vectorizing.
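
A rough standalone sketch of that selection rule, outside of GCC, might look like the following. The function name inner_loop_cost_factor_for, the constant kCostFactorCap and the use of std::optional to model a missing estimate are assumptions made for illustration only; the cap of 10000 mirrors the value of REG_BR_PROB_BASE, and the estimate is assumed to be non-negative.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for GCC's REG_BR_PROB_BASE (10000).
constexpr std::int64_t kCostFactorCap = 10000;

// Hypothetical helper: choose the factor applied to the cost of statements
// in the inner loop.  An empty optional models the case where no iteration
// estimate is available.  The estimate is assumed to be non-negative.
unsigned inner_loop_cost_factor_for(std::optional<std::int64_t> estimated_iters) {
  if (!estimated_iters)
    return 1;  // No estimate: conservatively assume one inner iteration.
  // Cap the estimate so the factor stays within unsigned costing arithmetic.
  return static_cast<unsigned>(std::min(*estimated_iters, kCostFactorCap));
}
```

With no estimate the sketch returns 1, which is the conservative fallback case.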

The alternative is to retain the --param for exactly that case; I'm not sure whether the result is better. The --param is new on head; it was a static '50' before.

Any strong opinions?

Richard.

2021-08-23  Richard Biener  <rguent...@suse.de>

	* doc/invoke.texi (vect-inner-loop-cost-factor): Remove
	documentation.
	* params.opt (--param vect-inner-loop-cost-factor): Remove.
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
	inner_loop_cost_factor to 1.
	(vect_analyze_loop_form): Initialize inner_loop_cost_factor from
	the estimated number of iterations of the inner loop.
---
 gcc/doc/invoke.texi  |  5 -----
 gcc/params.opt       |  4 ----
 gcc/tree-vect-loop.c | 12 +++++++++++-
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c057cc1e4ae..054950132f6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14385,11 +14385,6 @@ code to iterate. 2 allows partial vector loads and stores in all loops.
 The parameter only has an effect on targets that support partial
 vector loads and stores.
 
-@item vect-inner-loop-cost-factor
-The factor which the loop vectorizer applies to the cost of statements
-in an inner loop relative to the loop being vectorized. The default
-value is 50.
-
 @item avoid-fma-max-bits
 Maximum number of bits for which we avoid creating FMAs.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index f9264887b40..f7b19fa430d 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1113,8 +1113,4 @@ Bound on number of runtime checks inserted by the vectorizer's loop versioning f
 Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization
 Controls how loop vectorizer uses partial vectors. 0 means never, 1 means only for loops whose need to iterate can be removed, 2 means for all loops. The default value is 2.
 
--param=vect-inner-loop-cost-factor=
-Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 999999) Param Optimization
-The factor which the loop vectorizer applies to the cost of statements in an inner loop relative to the loop being vectorized.
-
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c521b43a47c..cb48717f20e 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -841,7 +841,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
     single_scalar_iteration_cost (0),
     vec_outside_cost (0),
     vec_inside_cost (0),
-    inner_loop_cost_factor (param_vect_inner_loop_cost_factor),
+    inner_loop_cost_factor (1),
     vectorizable (false),
     can_use_partial_vectors_p (param_vect_partial_vector_usage != 0),
     using_partial_vectors_p (false),
@@ -1519,6 +1519,16 @@ vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
       stmt_vec_info inner_loop_cond_info
         = loop_vinfo->lookup_stmt (inner_loop_cond);
       STMT_VINFO_TYPE (inner_loop_cond_info) = loop_exit_ctrl_vec_info_type;
+      /* If we have an estimate on the number of iterations of the inner
+         loop use that as the scale for costing, otherwise conservatively
+         assume a single inner iteration.  */
+      widest_int nit;
+      if (get_estimated_loop_iterations (loop->inner, &nit))
+        LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo)
+          /* Since costing is done on unsigned int cap the scale on
+             some large number consistent with what we'd see in
+             CFG counts.  */
+          = wi::smin (nit, REG_BR_PROB_BASE).to_uhwi ();
     }
 
   gcc_assert (!loop->aux);
--
2.31.1