"Kewen.Lin" <li...@linux.ibm.com> writes:
>  static void
> -vect_set_loop_mask (class loop *loop, tree mask, tree init_mask,
> -                 tree next_mask)
> +vect_set_loop_control (class loop *loop, tree ctrl, tree init_ctrl,
> +                 tree next_ctrl)

This argument needs reindenting below "class loop".

> [...]
> @@ -384,47 +384,47 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, 
> rgroup_masks *dest_rgm,
>    return false;
>  }
>  
> -/* Helper for vect_set_loop_condition_masked.  Generate definitions for
> -   all the masks in RGM and return a mask that is nonzero when the loop
> -   needs to iterate.  Add any new preheader statements to PREHEADER_SEQ.
> -   Use LOOP_COND_GSI to insert code before the exit gcond.
> +/* Helper for vect_set_loop_condition_partial_vectors.  Generate definitions
> +   for all the rgroup controls in RGC and return a control that is nonzero
> +   when the loop needs to iterate.  Add any new preheader statements to
> +   PREHEADER_SEQ.  Use LOOP_COND_GSI to insert code before the exit gcond.
>  
> -   RGM belongs to loop LOOP.  The loop originally iterated NITERS
> +   RGC belongs to loop LOOP.  The loop originally iterated NITERS
>     times and has been vectorized according to LOOP_VINFO.
>  
>     If NITERS_SKIP is nonnull, the first iteration of the vectorized loop
>     starts with NITERS_SKIP dummy iterations of the scalar loop before
> -   the real work starts.  The mask elements for these dummy iterations
> +   the real work starts.  The control elements for these dummy iterations
>     must be 0, to ensure that the extra iterations do not have an effect.

The NITERS_SKIP case would remain specific to masks, so I think we
should leave out the final change above.

> @@ -436,8 +436,8 @@ vect_set_loop_masks_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>    tree nscalars_skip = niters_skip;
>    if (nscalars_per_iter != 1)
>      {
> -      /* We checked before choosing to use a fully-masked loop that these
> -      multiplications don't overflow.  */
> +      /* We checked before choosing to use a partial vectorization loop that
> +      these multiplications don't overflow.  */

Maybe:

      /* We checked before setting LOOP_VINFO_USING_PARTIAL_VECTORS_P that
         these multiplications don't overflow.  */

> [...]
> -/* Make LOOP iterate NITERS times using masking and WHILE_ULT calls.
> -   LOOP_VINFO describes the vectorization of LOOP.  NITERS is the
> -   number of iterations of the original scalar loop that should be
> -   handled by the vector loop.  NITERS_MAYBE_ZERO and FINAL_IV are
> -   as for vect_set_loop_condition.
> +/* Make LOOP iterate NITERS times using controls and its appropriate
> +   operations eg: masking and WHILE_ULT calls.  LOOP_VINFO describes
> +   the vectorization of LOOP.  NITERS is the number of iterations of
> +   the original scalar loop that should be handled by the vector loop.
> +   NITERS_MAYBE_ZERO and FINAL_IV are as for vect_set_loop_condition.

Reading it back now, the comment in current sources is less than stellar,
since it contradicts itself about what NITERS means.  How about replacing
the first sentence with:

/* Set up the iteration condition and rgroup controls for LOOP, given that
   LOOP_VINFO_USING_PARTIAL_VECTORS_P is true for the vectorized loop.

> [...]
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index d15a523752f..7ea75d6d095 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -874,16 +874,16 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, 
> vec_info_shared *shared)
>    epilogue_vinfos.create (6);
>  }
>  
> -/* Free all levels of MASKS.  */
> +/* Free all levels of rgroup CONTROLS.  */
>  
>  void
> -release_vec_loop_masks (vec_loop_masks *masks)
> +release_vec_loop_controls (auto_vec<rgroup_controls> *controls)

Now that we're expanding the typedef, I guess this should just be
vec<rgroup_controls>, since it it doesn't matter whether the vector
is auto_vec or not.

> [...]
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index efcf8e8ae60..8f3499cc811 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -409,19 +409,20 @@ is_a_helper <_bb_vec_info *>::test (vec_info *i)
>  
>     In classical vectorization, each iteration of the vector loop would
>     handle exactly VF iterations of the original scalar loop.  However,
> -   in a fully-masked loop, a particular iteration of the vector loop
> -   might handle fewer than VF iterations of the scalar loop.  The vector
> -   lanes that correspond to iterations of the scalar loop are said to be
> -   "active" and the other lanes are said to be "inactive".
> -
> -   In a fully-masked loop, many rgroups need to be masked to ensure that
> -   they have no effect for the inactive lanes.  Each such rgroup needs a
> -   sequence of booleans in the same order as above, but with each (i,j)
> -   replaced by a boolean that indicates whether iteration i is active.
> -   This sequence occupies nV vector masks that again have nL lanes each.
> -   Thus the mask sequence as a whole consists of VF independent booleans
> -   that are each repeated nS times.
> -
> +   in a partial vectorization loop, a particular iteration of the vector

Maybe “in vector loops that are able to operate on partial vectors, …“?

> +   loop might handle fewer than VF iterations of the scalar loop.  The
> +   vector lanes that correspond to iterations of the scalar loop are
> +   said to be "active" and the other lanes are said to be "inactive".
> +
> +   In a partial vectorization loop, many rgroups need to be controlled to

And here maybe “In such vector loops, …”

> +   ensure that they have no effect for the inactive lanes.  Each such
> +   rgroup needs a sequence of booleans in the same order as above, but
> +   with each (i,j) replaced by a boolean that indicates whether iteration
> +   i is active.  This sequence occupies nV vector controls that again have
> +   nL lanes each.  Thus the control sequence as a whole consists of VF
> +   independent booleans that are each repeated nS times.

This description only really applies to masks, but I suppose the length
case could be viewed as a restricted version of the same idea.  It might
be better in that case to say “Conceptually, each such rgroup needs …”.

OK with those changes, thanks.

Richard

Reply via email to