On Tue, Nov 1, 2016 at 12:38 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Hi All,
>
> I re-send all patches sent by Ilya earlier for review which support
> vectorization of loop epilogues and loops with low trip count. We
> assume that the only patch - vec-tails-07-combine-tail.patch - was not
> approved by Jeff.
>
> I did re-base of all patches and performed bootstrapping and
> regression testing that did not show any new failures. Also all
> changes related to new vect_do_peeling algorithm have been changed
> accordingly.
>
> Is it OK for trunk?

Hi,
I can't approve patches, but had some comments after going through the
implementation.

One confusing part is cost model change, as well as the way it's used
to decide how epilogue loop should be vectorized.  Given vect-tail is
disabled at the moment and the cost change needs further tuning, is it
reasonable to split this part out and get vectorization part
reviewed/submitted independently?  For example, let user specified
parameters make the decision for now.  Cost and target dependent
changes should go in at last, this could make the patch easier to
read.

The implementation computes/shares quite amount information from main
loop to epilogue loop vectorization.  Furthermore, variables/fields
for such information are somehow named in a misleading way.  For
example. LOOP_VINFO_MASK_EPILOGUE gives me the impression this is the
flag controlling whether epilogue loop should be vectorized with
masking.  However, it's actually controlled by exactly the same flag
as whether epilogue loop should be combined into the main loop with
masking:
@@ -7338,6 +8013,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)

   slpeel_make_loop_iterate_ntimes (loop, niters_vector);

+  if (LOOP_VINFO_COMBINE_EPILOGUE (loop_vinfo))
+    vect_combine_loop_epilogue (loop_vinfo);
+
   /* Reduce loop iterations by the vectorization factor.  */
   scale_loop_profile (loop, GCOV_COMPUTE_SCALE (1, vf),
               expected_iterations / vf);

IMHO, we should decouple main loop vectorization and epilogue
vectorization as much as possible by sharing as few information as we
can.  The general idea is to handle epilogue loop just like normal
short-trip loop.  For example, we can rename
LOOP_VINFO_COMBINE_EPILOGUE into LOOP_VINFO_VECT_MASK (or something
else), and we don't differentiate its meaning between main and
epilogue(short-trip) loop.  It only indicates the current loop should
be vectorized with masking no matter it's a main loop or epilogue
loop, and it works just like the current implementation.

After this change, we can refine vectorization and make it more
general for normal loop and epilogue(short trip) loop.  For example,
this implementation sets LOOP_VINFO_PEELING_FOR_NITER  for epilogue
loop and use it to control how it should be vectorized:
+  if (!LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo))
+    {
+      LOOP_VINFO_MASK_EPILOGUE (loop_vinfo) = false;
+      LOOP_VINFO_COMBINE_EPILOGUE (loop_vinfo) = false;
+    }
+  else if (LOOP_VINFO_CAN_BE_MASKED (loop_vinfo)
+       && min_profitable_combine_iters >= 0)
+    {

This works, but not that good for understanding or maintaining.

Thanks,
bin

Reply via email to