Epilogue vectorisation currently uses the vectorisation factor of the main loop as the maximum vectorisation factor allowed for correctness. That is a conservatively correct value, since the vectorisation factor chosen for the epilogue will be strictly less than the main loop's VF anyway. However, once the VF itself becomes variable, it is easier to carry across the original loop's maximum VF instead.
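
As a rough illustration of the idea (not the patch itself), here is a minimal standalone sketch. ToyLoopInfo, toy_is_epilogue and toy_max_vf are hypothetical stand-ins for loop_vec_info, LOOP_VINFO_EPILOGUE_P and the max_vf selection in vect_analyze_data_ref_dependences; the real structures carry far more state.

```cpp
#include <cstdint>

// Hypothetical, heavily reduced model of loop_vec_info.
struct ToyLoopInfo {
  unsigned vectorization_factor = 0;            // VF actually chosen
  std::uint64_t max_vectorization_factor = 0;   // limit required for correctness
  const ToyLoopInfo *orig_loop_info = nullptr;  // non-null for an epilogue loop
};

// Models LOOP_VINFO_EPILOGUE_P: an epilogue points back at its main loop.
bool toy_is_epilogue(const ToyLoopInfo &info) {
  return info.orig_loop_info != nullptr;
}

// An epilogue inherits the main loop's *maximum* VF rather than its
// chosen VF; any other loop uses the limit found by its own dependence
// analysis (passed in here as analysed_limit, an assumption of this sketch).
std::uint64_t toy_max_vf(const ToyLoopInfo &info, std::uint64_t analysed_limit) {
  if (toy_is_epilogue(info))
    return info.orig_loop_info->max_vectorization_factor;
  return analysed_limit;
}
```

With a main loop whose chosen VF is 4 and whose recorded maximum is 16, the epilogue in this model is bounded by 16 rather than by 4, so the bound no longer depends on knowing the main loop's VF as a compile-time constant.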

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2017-09-14  Richard Sandiford  <richard.sandif...@linaro.org>
	    Alan Hayward  <alan.hayw...@arm.com>
	    David Sherwood  <david.sherw...@arm.com>

gcc/
	* tree-vectorizer.h (_loop_vec_info): Add max_vectorization_factor.
	(LOOP_VINFO_MAX_VECT_FACTOR): New macro.
	(LOOP_VINFO_ORIG_VECT_FACTOR): Replace with...
	(LOOP_VINFO_ORIG_MAX_VECT_FACTOR): ...this new macro.
	* tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Update
	accordingly.
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
	max_vectorization_factor.
	(vect_analyze_loop_2): Set LOOP_VINFO_MAX_VECT_FACTOR.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2017-09-14 11:28:27.080519923 +0100
+++ gcc/tree-vectorizer.h	2017-09-14 11:30:06.064254417 +0100
@@ -241,6 +241,10 @@ typedef struct _loop_vec_info : public v
   /* Unrolling factor  */
   int vectorization_factor;
 
+  /* Maximum runtime vectorization factor, or MAX_VECTORIZATION_FACTOR
+     if there is no particular limit.  */
+  unsigned HOST_WIDE_INT max_vectorization_factor;
+
   /* Unknown DRs according to which loop was peeled.  */
   struct data_reference *unaligned_dr;
 
@@ -355,6 +359,7 @@ #define LOOP_VINFO_NITERS_ASSUMPTIONS(L)
 #define LOOP_VINFO_COST_MODEL_THRESHOLD(L) (L)->th
 #define LOOP_VINFO_VECTORIZABLE_P(L)       (L)->vectorizable
 #define LOOP_VINFO_VECT_FACTOR(L)          (L)->vectorization_factor
+#define LOOP_VINFO_MAX_VECT_FACTOR(L)      (L)->max_vectorization_factor
 #define LOOP_VINFO_PTR_MASK(L)             (L)->ptr_mask
 #define LOOP_VINFO_LOOP_NEST(L)            (L)->loop_nest
 #define LOOP_VINFO_DATAREFS(L)             (L)->datarefs
@@ -400,8 +405,8 @@ #define LOOP_VINFO_NITERS_KNOWN_P(L)
 #define LOOP_VINFO_EPILOGUE_P(L) \
   (LOOP_VINFO_ORIG_LOOP_INFO (L) != NULL)
 
-#define LOOP_VINFO_ORIG_VECT_FACTOR(L) \
-  (LOOP_VINFO_VECT_FACTOR (LOOP_VINFO_ORIG_LOOP_INFO (L)))
+#define LOOP_VINFO_ORIG_MAX_VECT_FACTOR(L) \
+  (LOOP_VINFO_MAX_VECT_FACTOR (LOOP_VINFO_ORIG_LOOP_INFO (L)))
 
 static inline loop_vec_info
 loop_vec_info_for_loop (struct loop *loop)
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	2017-09-14 11:29:19.649870912 +0100
+++ gcc/tree-vect-data-refs.c	2017-09-14 11:30:06.063347272 +0100
@@ -509,7 +509,7 @@ vect_analyze_data_ref_dependences (loop_
      was applied to original loop.  Therefore we may just get max_vf
      using VF of original loop.  */
   if (LOOP_VINFO_EPILOGUE_P (loop_vinfo))
-    *max_vf = LOOP_VINFO_ORIG_VECT_FACTOR (loop_vinfo);
+    *max_vf = LOOP_VINFO_ORIG_MAX_VECT_FACTOR (loop_vinfo);
   else
     FOR_EACH_VEC_ELT (LOOP_VINFO_DDRS (loop_vinfo), i, ddr)
       if (vect_analyze_data_ref_dependence (ddr, loop_vinfo, max_vf))
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2017-09-14 11:28:27.079519923 +0100
+++ gcc/tree-vect-loop.c	2017-09-14 11:30:06.064254417 +0100
@@ -1111,6 +1111,7 @@ _loop_vec_info::_loop_vec_info (struct l
     num_iters_assumptions (NULL_TREE),
     th (0),
     vectorization_factor (0),
+    max_vectorization_factor (0),
     unaligned_dr (NULL),
     peeling_for_alignment (0),
     ptr_mask (0),
@@ -1920,6 +1921,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
 			     "bad data dependence.\n");
       return false;
     }
+  LOOP_VINFO_MAX_VECT_FACTOR (loop_vinfo) = max_vf;
 
   ok = vect_determine_vectorization_factor (loop_vinfo);
   if (!ok)