This removes --param vect-inner-loop-cost-factor in favor of using
the estimated number of iterations of the inner loop when available.
Otherwise it assumes a single inner iteration, which is conservative
on the side of not vectorizing.
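
As a rough standalone illustration of that selection (not the GCC code itself), the sketch below picks the factor from an optional iteration estimate and caps it as the comment in the patch describes. The names inner_loop_cost_factor and kScaleCap are hypothetical; kScaleCap stands in for REG_BR_PROB_BASE, which is 10000 in GCC.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for GCC's REG_BR_PROB_BASE (10000 in GCC).
constexpr std::uint64_t kScaleCap = 10000;

// Hypothetical helper, not part of GCC: choose the factor by which
// inner-loop statement costs are scaled relative to the outer loop.
unsigned inner_loop_cost_factor(std::optional<std::uint64_t> estimated_iterations)
{
    // Without an estimate, conservatively assume a single inner iteration.
    if (!estimated_iterations)
        return 1;
    // With an estimate, use it as the scale, capped so that the
    // unsigned cost arithmetic stays within a sane range.
    return static_cast<unsigned>(std::min(*estimated_iterations, kScaleCap));
}
```

With this shape, an inner loop estimated at 8 iterations is costed 8 times, a missing estimate gives a factor of 1, and a very large estimate is limited to the cap.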

The alternative is to retain the --param for exactly that case, but
I'm not sure whether the result would be better.  The --param is new
on head; the factor was a static '50' before.

Any strong opinions?

Richard.

2021-08-23  Richard Biener  <rguent...@suse.de>

        * doc/invoke.texi (vect-inner-loop-cost-factor): Remove
        documentation.
        * params.opt (--param vect-inner-loop-cost-factor): Remove.
        * tree-vect-loop.c (_loop_vec_info::_loop_vec_info):
        Initialize inner_loop_cost_factor to 1.
        (vect_analyze_loop_form): Initialize inner_loop_cost_factor
        from the estimated number of iterations of the inner loop.
---
 gcc/doc/invoke.texi  |  5 -----
 gcc/params.opt       |  4 ----
 gcc/tree-vect-loop.c | 12 +++++++++++-
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c057cc1e4ae..054950132f6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14385,11 +14385,6 @@ code to iterate.  2 allows partial vector loads and stores in all loops.
 The parameter only has an effect on targets that support partial
 vector loads and stores.
 
-@item vect-inner-loop-cost-factor
-The factor which the loop vectorizer applies to the cost of statements
-in an inner loop relative to the loop being vectorized.  The default
-value is 50.
-
 @item avoid-fma-max-bits
 Maximum number of bits for which we avoid creating FMAs.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index f9264887b40..f7b19fa430d 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1113,8 +1113,4 @@ Bound on number of runtime checks inserted by the vectorizer's loop versioning f
 Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization
 Controls how loop vectorizer uses partial vectors.  0 means never, 1 means only for loops whose need to iterate can be removed, 2 means for all loops.  The default value is 2.
 
--param=vect-inner-loop-cost-factor=
-Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 999999) Param Optimization
-The factor which the loop vectorizer applies to the cost of statements in an inner loop relative to the loop being vectorized.
-
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c521b43a47c..cb48717f20e 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -841,7 +841,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared)
     single_scalar_iteration_cost (0),
     vec_outside_cost (0),
     vec_inside_cost (0),
-    inner_loop_cost_factor (param_vect_inner_loop_cost_factor),
+    inner_loop_cost_factor (1),
     vectorizable (false),
     can_use_partial_vectors_p (param_vect_partial_vector_usage != 0),
     using_partial_vectors_p (false),
@@ -1519,6 +1519,16 @@ vect_analyze_loop_form (class loop *loop, vec_info_shared *shared)
       stmt_vec_info inner_loop_cond_info
        = loop_vinfo->lookup_stmt (inner_loop_cond);
       STMT_VINFO_TYPE (inner_loop_cond_info) = loop_exit_ctrl_vec_info_type;
+      /* If we have an estimate on the number of iterations of the inner
+        loop use that as the scale for costing, otherwise conservatively
+        assume a single inner iteration.  */
+      widest_int nit;
+      if (get_estimated_loop_iterations (loop->inner, &nit))
+       LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo)
+         /* Since costing is done on unsigned int cap the scale on
+            some large number consistent with what we'd see in
+            CFG counts.  */
+         = wi::smin (nit, REG_BR_PROB_BASE).to_uhwi ();
     }
 
   gcc_assert (!loop->aux);
-- 
2.31.1
