On 06/28/2016 06:24 AM, Ilya Enkovich wrote:
Here is an updated patch version.
Thanks,
Ilya
--
gcc/
2016-05-28 Ilya Enkovich <ilya.enkov...@intel.com>
* dbgcnt.def (vect_tail_combine): New.
* params.def (PARAM_VECT_COST_INCREASE_COMBINE_THRESHOLD): New.
* tree-vect-data-refs.c (vect_get_new_ssa_name): Support vect_mask_var.
* tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): Support
epilogue combined with loop body.
(vect_do_peeling_for_loop_bound): LIkewise.
(vect_do_peeling_for_alignment): ???
* tree-vect-loop.c Include alias.h and dbgcnt.h.
(vect_estimate_min_profitable_iters): Add
ret_min_profitable_combine_niters
arg, compute number of iterations for which loop epilogue combining is
profitable.
(vect_generate_tmps_on_preheader): Support combined apilogue.
(vect_gen_ivs_for_masking): New.
(vect_get_mask_index_for_elems): New.
(vect_get_mask_index_for_type): New.
(vect_gen_loop_masks): New.
(vect_mask_reduction_stmt): New.
(vect_mask_mask_load_store_stmt): New.
(vect_mask_load_store_stmt): New.
(vect_combine_loop_epilogue): New.
(vect_transform_loop): Support combined apilogue.
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 41b9380..08fad82 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6895,6 +7044,489 @@ vect_generate_tmps_on_preheader (loop_vec_info
loop_vinfo,
return;
}
+
+/* Function vect_gen_loop_masks.
+
+ Create masks to mask a loop described by LOOP_VINFO. Masks
+ are created according to LOOP_VINFO_REQUIRED_MASKS and are stored
+ into MASKS vector.
+
+ Index of a mask in a vector is computed according to a number
+ of masks's elements. Masks are sorted by number of its elements
+ in descending order. Index 0 is used to access a mask with
+ current_vector_size elements. Among masks with the same number
+ of elements the one with lower index is used to mask iterations
+ with smaller iteration counter. Note that you may get NULL elements
+ for masks which are not required. Use vect_get_mask_index_for_elems
+ or vect_get_mask_index_for_type to access resulting vector. */
+
+static void
+vect_gen_loop_masks (loop_vec_info loop_vinfo, vec<tree> *masks)
I find myself wondering if this ought to be broken down a bit (without
changing the underlying semantics).
+
+ /* Create narrowed masks. */
+ cur_mask_elems = iv_elems;
+ nmasks = ivs.length ();
+ while (cur_mask_elems < max_mask_elems)
+ {
+ prev_mask = vect_get_mask_index_for_elems (cur_mask_elems);
+
+ cur_mask_elems <<= 1;
+ nmasks >>= 1;
+
+ cur_mask = vect_get_mask_index_for_elems (cur_mask_elems);
+
+ mask_type = build_truth_vector_type (cur_mask_elems, vec_size);
+
+ for (unsigned i = 0; i < nmasks; i++)
+ {
+ tree mask_low = (*masks)[prev_mask++];
+ tree mask_hi = (*masks)[prev_mask++];
+ mask = vect_get_new_ssa_name (mask_type, vect_mask_var);
+ stmt = gimple_build_assign (mask, VEC_PACK_TRUNC_EXPR,
+ mask_low, mask_hi);
+ gsi_insert_before (&gsi, stmt, GSI_SAME_STMT);
+ (*masks)[cur_mask++] = mask;
+ }
+ }
For example, pull this into its own function as well as the code to
create widened masks. In fact, didn't I see those functions in one of
the other patches as their own separate subroutines?
It's not a huge deal and I don't think it requires another round of
review. I just found myself scrolling through multiple pages of this
function and thought it'd be slightly easier to grok if were simply smaller.
+
+/* Function vect_mask_reduction_stmt.
+
+ Mask given vectorized reduction statement STMT using
+ MASK. In case scalar reduction statement is vectorized
+ into several vector statements then PREV holds a
+ preceding vetor statement copy for STMT.
s/vetor/vector/
With the one function split up and the typo fix I think this is OK for
the trunk when the set as a whole is ready.
jeff