"Kewen.Lin" <li...@linux.ibm.com> writes: > static void > -vect_set_loop_mask (class loop *loop, tree mask, tree init_mask, > - tree next_mask) > +vect_set_loop_control (class loop *loop, tree ctrl, tree init_ctrl, > + tree next_ctrl)
This argument needs reindenting below "class loop". > [...] > @@ -384,47 +384,47 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, > rgroup_masks *dest_rgm, > return false; > } > > -/* Helper for vect_set_loop_condition_masked. Generate definitions for > - all the masks in RGM and return a mask that is nonzero when the loop > - needs to iterate. Add any new preheader statements to PREHEADER_SEQ. > - Use LOOP_COND_GSI to insert code before the exit gcond. > +/* Helper for vect_set_loop_condition_partial_vectors. Generate definitions > + for all the rgroup controls in RGC and return a control that is nonzero > + when the loop needs to iterate. Add any new preheader statements to > + PREHEADER_SEQ. Use LOOP_COND_GSI to insert code before the exit gcond. > > - RGM belongs to loop LOOP. The loop originally iterated NITERS > + RGC belongs to loop LOOP. The loop originally iterated NITERS > times and has been vectorized according to LOOP_VINFO. > > If NITERS_SKIP is nonnull, the first iteration of the vectorized loop > starts with NITERS_SKIP dummy iterations of the scalar loop before > - the real work starts. The mask elements for these dummy iterations > + the real work starts. The control elements for these dummy iterations > must be 0, to ensure that the extra iterations do not have an effect. The NITERS_SKIP case would remain specific to masks, so I think we should leave out the final change above. > @@ -436,8 +436,8 @@ vect_set_loop_masks_directly (class loop *loop, > loop_vec_info loop_vinfo, > tree nscalars_skip = niters_skip; > if (nscalars_per_iter != 1) > { > - /* We checked before choosing to use a fully-masked loop that these > - multiplications don't overflow. */ > + /* We checked before choosing to use a partial vectorization loop that > + these multiplications don't overflow. */ Maybe: /* We checked before setting LOOP_VINFO_USING_PARTIAL_VECTORS_P that these multiplications don't overflow. */ > [...] > -/* Make LOOP iterate NITERS times using masking and WHILE_ULT calls. > - LOOP_VINFO describes the vectorization of LOOP. NITERS is the > - number of iterations of the original scalar loop that should be > - handled by the vector loop. NITERS_MAYBE_ZERO and FINAL_IV are > - as for vect_set_loop_condition. > +/* Make LOOP iterate NITERS times using controls and its appropriate > + operations eg: masking and WHILE_ULT calls. LOOP_VINFO describes > + the vectorization of LOOP. NITERS is the number of iterations of > + the original scalar loop that should be handled by the vector loop. > + NITERS_MAYBE_ZERO and FINAL_IV are as for vect_set_loop_condition. Reading it back now, the comment in current sources is less than stellar, since it contradicts itself about what NITERS means. How about replacing the first sentence with: /* Set up the iteration condition and rgroup controls for LOOP, given that LOOP_VINFO_USING_PARTIAL_VECTORS_P is true for the vectorized loop. > [...] > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c > index d15a523752f..7ea75d6d095 100644 > --- a/gcc/tree-vect-loop.c > +++ b/gcc/tree-vect-loop.c > @@ -874,16 +874,16 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, > vec_info_shared *shared) > epilogue_vinfos.create (6); > } > > -/* Free all levels of MASKS. */ > +/* Free all levels of rgroup CONTROLS. */ > > void > -release_vec_loop_masks (vec_loop_masks *masks) > +release_vec_loop_controls (auto_vec<rgroup_controls> *controls) Now that we're expanding the typedef, I guess this should just be vec<rgroup_controls>, since it it doesn't matter whether the vector is auto_vec or not. > [...] > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > index efcf8e8ae60..8f3499cc811 100644 > --- a/gcc/tree-vectorizer.h > +++ b/gcc/tree-vectorizer.h > @@ -409,19 +409,20 @@ is_a_helper <_bb_vec_info *>::test (vec_info *i) > > In classical vectorization, each iteration of the vector loop would > handle exactly VF iterations of the original scalar loop. However, > - in a fully-masked loop, a particular iteration of the vector loop > - might handle fewer than VF iterations of the scalar loop. The vector > - lanes that correspond to iterations of the scalar loop are said to be > - "active" and the other lanes are said to be "inactive". > - > - In a fully-masked loop, many rgroups need to be masked to ensure that > - they have no effect for the inactive lanes. Each such rgroup needs a > - sequence of booleans in the same order as above, but with each (i,j) > - replaced by a boolean that indicates whether iteration i is active. > - This sequence occupies nV vector masks that again have nL lanes each. > - Thus the mask sequence as a whole consists of VF independent booleans > - that are each repeated nS times. > - > + in a partial vectorization loop, a particular iteration of the vector Maybe “in vector loops that are able to operate on partial vectors, …“? > + loop might handle fewer than VF iterations of the scalar loop. The > + vector lanes that correspond to iterations of the scalar loop are > + said to be "active" and the other lanes are said to be "inactive". > + > + In a partial vectorization loop, many rgroups need to be controlled to And here maybe “In such vector loops, …” > + ensure that they have no effect for the inactive lanes. Each such > + rgroup needs a sequence of booleans in the same order as above, but > + with each (i,j) replaced by a boolean that indicates whether iteration > + i is active. This sequence occupies nV vector controls that again have > + nL lanes each. Thus the control sequence as a whole consists of VF > + independent booleans that are each repeated nS times. This description only really applies to masks, but I suppose the length case could be viewed as a restricted version of the same idea. It might be better in that case to say “Conceptually, each such rgroup needs …”. OK with those changes, thanks. Richard