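vectorizable_load had a curious "force_peeling" variable, with no
comment explaining why it was needed for single-element interleaving
but not for other cases.  I think it's simply because we weren't
initialising GROUP_GAP correctly for single loads.  This patch sets
GROUP_GAP to groupsize - 1 in that case, so the existing GROUP_GAP
check requests the peeling and force_peeling can go.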
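
To illustrate the reasoning, here is a small standalone model of the
peeling condition before and after the change.  It is only a sketch:
struct group_model and the helper functions are hypothetical stand-ins,
not GCC data structures, and it assumes that single-element interleaving
always has groupsize >= 2 (which is what the new gcc_assert relies on).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the group fields used by the check;
   this is not the real stmt_vec_info.  */
struct group_model
{
  unsigned size;        /* models GROUP_SIZE */
  unsigned gap;         /* models GROUP_GAP */
  bool single_element;  /* first_stmt == stmt && !GROUP_NEXT_ELEMENT */
};

/* Model of what the patch does for single-element interleaving:
   the gap is every element of the group except the one loaded.
   Assumption: groupsize >= 2 for this kind of access.  */
static struct group_model
make_single_element_group (unsigned groupsize)
{
  assert (groupsize >= 2);
  struct group_model g = { groupsize, groupsize - 1, true };
  return g;
}

/* Old condition: force_peeling was set for single-element groups.  */
static bool
needs_peeling_old (const struct group_model *g, bool slp, unsigned vf)
{
  bool force_peeling = g->single_element;
  return force_peeling || g->gap != 0 || (!slp && vf % g->size != 0);
}

/* New condition: rely on the gap alone.  */
static bool
needs_peeling_new (const struct group_model *g, bool slp, unsigned vf)
{
  return g->gap != 0 || (!slp && vf % g->size != 0);
}

/* Return true if both conditions agree for every vf in [1, max_vf]
   and for both values of slp.  */
static bool
conditions_agree (unsigned groupsize, unsigned max_vf)
{
  struct group_model g = make_single_element_group (groupsize);
  for (unsigned vf = 1; vf <= max_vf; vf++)
    for (int slp = 0; slp <= 1; slp++)
      if (needs_peeling_old (&g, slp, vf) != needs_peeling_new (&g, slp, vf))
        return false;
  return true;
}
```

In this model, once the gap is groupsize - 1 (and so nonzero), the
force_peeling term never changes the result.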

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
        * tree-vect-data-refs.c (vect_analyze_group_access_1): Set
        GROUP_GAP for single-element interleaving.
        * tree-vect-stmts.c (vectorizable_load): Remove force_peeling
        variable.

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 7652e21..36d302a 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -2233,6 +2233,7 @@ vect_analyze_group_access_1 (struct data_reference *dr)
        {
          GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = stmt;
          GROUP_SIZE (vinfo_for_stmt (stmt)) = groupsize;
+         GROUP_GAP (stmt_info) = groupsize - 1;
          if (dump_enabled_p ())
            {
              dump_printf_loc (MSG_NOTE, vect_location,
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 9ab4af4..585c073 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -6319,7 +6319,6 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
          that leaves unused vector loads around punt - we at least create
         very sub-optimal code in that case (and blow up memory,
         see PR65518).  */
-      bool force_peeling = false;
       if (first_stmt == stmt
          && !GROUP_NEXT_ELEMENT (stmt_info))
        {
@@ -6333,7 +6332,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
            }
 
          /* Single-element interleaving requires peeling for gaps.  */
-         force_peeling = true;
+         gcc_assert (GROUP_GAP (stmt_info));
        }
 
       /* If there is a gap in the end of the group or the group size cannot
@@ -6341,8 +6340,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
         elements in the last iteration and thus need to peel that off.  */
       if (loop_vinfo
          && ! STMT_VINFO_STRIDED_P (stmt_info)
-         && (force_peeling
-             || GROUP_GAP (vinfo_for_stmt (first_stmt)) != 0
+         && (GROUP_GAP (vinfo_for_stmt (first_stmt)) != 0
              || (!slp && vf % GROUP_SIZE (vinfo_for_stmt (first_stmt)) != 0)))
        {
          if (dump_enabled_p ())
