I noticed the following while looking for a way to fix PR81303.
We happily compute a runtime cost model threshold that executes the
vectorized variant even though no vector iteration takes place because
of the number of prologue/epilogue iterations.  The following fixes
that.  Note that if we do not know the prologue/epilogue counts
statically they are estimated at vf/2, which means there is still a
chance the vector iteration won't execute.  To fix that we'd have to
estimate them as vf-1 instead, something we might consider doing anyway
given that we regularly completely peel the epilogues vf-1 times
in that case.  Maybe as a followup.
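
To make the arithmetic concrete, here is a small standalone sketch of the
old and new clamp.  It is not GCC code: the helper names are made up for
illustration, and vector_iters assumes a deliberately simplified model in
which the prologue and epilogue always consume exactly the given number of
scalar iterations.

```c
#include <assert.h>

/* Old clamp: only guarantees the threshold is at least vf.  */
static int
old_min_iters (int min_profitable_iters, int vf)
{
  assert (vf > 0);
  return min_profitable_iters < vf ? vf : min_profitable_iters;
}

/* New clamp: the threshold must also cover the peeled prologue and
   epilogue iterations so that one full vector iteration remains.  */
static int
new_min_iters (int min_profitable_iters, int vf,
               int peel_prologue, int peel_epilogue)
{
  assert (vf > 0 && peel_prologue >= 0 && peel_epilogue >= 0);
  int needed = vf + peel_prologue + peel_epilogue;
  return min_profitable_iters < needed ? needed : min_profitable_iters;
}

/* Simplified model (an assumption, not how GCC computes it): number of
   full vector iterations left once the peeled iterations are taken out
   of niters scalar iterations.  */
static int
vector_iters (int niters, int vf, int peel_prologue, int peel_epilogue)
{
  assert (vf > 0);
  int remaining = niters - peel_prologue - peel_epilogue;
  return remaining > 0 ? remaining / vf : 0;
}
```

With vf = 4 and three peeled iterations on each side, the old clamp yields
a threshold of 4, at which this model runs no vector iteration at all; the
new clamp yields 10, which leaves exactly one.  If the peel counts are
unknown and estimated at vf/2 = 2 each, the new threshold is 8, and a loop
that actually peels 3 + 3 iterations still runs no vector iteration, which
is the remaining gap described above.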

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-07-21  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/81303
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Take
        into account prologue and epilogue iterations when raising
        min_profitable_iters to at least cover one vector iteration.

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c        (revision 250384)
+++ gcc/tree-vect-loop.c        (working copy)
@@ -3702,8 +3702,9 @@ vect_estimate_min_profitable_iters (loop
               "  Calculated minimum iters for profitability: %d\n",
               min_profitable_iters);
 
-  min_profitable_iters =
-       min_profitable_iters < vf ? vf : min_profitable_iters;
+  /* We want the vectorized loop to execute at least once.  */
+  if (min_profitable_iters < (vf + peel_iters_prologue + peel_iters_epilogue))
+    min_profitable_iters = vf + peel_iters_prologue + peel_iters_epilogue;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
