On Fri, May 20, 2016 at 5:24 PM, Richard Sandiford <richard.sandif...@arm.com> wrote:
> vectorizable_load had a curious "force_peeling" variable, with no
> comment explaining why we need it for single-element interleaving
> but not for other cases. I think it's simply because we weren't
> initialising the GROUP_GAP correctly for single loads.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
>         * tree-vect-data-refs.c (vect_analyze_group_access_1): Set
>         GROUP_GAP for single-element interleaving.
>         * tree-vect-stmts.c (vectorizable_load): Remove force_peeling
>         variable.
>
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 7652e21..36d302a 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -2233,6 +2233,7 @@ vect_analyze_group_access_1 (struct data_reference *dr)
>         {
>           GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = stmt;
>           GROUP_SIZE (vinfo_for_stmt (stmt)) = groupsize;
> +         GROUP_GAP (stmt_info) = groupsize - 1;
>           if (dump_enabled_p ())
>             {
>               dump_printf_loc (MSG_NOTE, vect_location,
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 9ab4af4..585c073 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -6319,7 +6319,6 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator
> *gsi, gimple **vec_stmt,
>          that leaves unused vector loads around punt - we at least create
>          very sub-optimal code in that case (and blow up memory,
>          see PR65518). */
> -      bool force_peeling = false;
>        if (first_stmt == stmt
>           && !GROUP_NEXT_ELEMENT (stmt_info))
>         {
> @@ -6333,7 +6332,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator
> *gsi, gimple **vec_stmt,
>             }
>
>           /* Single-element interleaving requires peeling for gaps. */
> -         force_peeling = true;
> +         gcc_assert (GROUP_GAP (stmt_info));
>         }
>
>        /* If there is a gap in the end of the group or the group size cannot
> @@ -6341,8 +6340,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator
> *gsi, gimple **vec_stmt,
>          elements in the last iteration and thus need to peel that off. */
>        if (loop_vinfo
>           && ! STMT_VINFO_STRIDED_P (stmt_info)
> -         && (force_peeling
> -             || GROUP_GAP (vinfo_for_stmt (first_stmt)) != 0
> +         && (GROUP_GAP (vinfo_for_stmt (first_stmt)) != 0
>               || (!slp && vf % GROUP_SIZE (vinfo_for_stmt (first_stmt)) !=
> 0)))
>         {
>           if (dump_enabled_p ())
>
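
For readers following the thread, here is a rough sketch of why a single-element interleaving group carries a gap of groupsize - 1 and why that gap implies peeling. It is an illustration only, not GCC code: every helper name below is hypothetical, and it assumes a loop of the form "for (i = 0; i < niters; i++) sum += a[i * groupsize];" with groupsize > 1 and niters >= 1.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function: the gap that follows the one
   element actually used in a single-element interleaving group.  This
   matches the value the patch stores in GROUP_GAP.  */
static int
single_element_gap (int groupsize)
{
  return groupsize - 1;
}

/* Hypothetical helper: number of elements from a[0] up to and including
   the last element the scalar loop reads, for
     for (i = 0; i < niters; i++) sum += a[i * groupsize];  */
static int
scalar_extent (int niters, int groupsize)
{
  return (niters - 1) * groupsize + 1;
}

/* Hypothetical helper: number of elements read if every iteration loads
   its whole group, which is what contiguous vector loads covering the
   groups would do (an assumption made for this illustration).  */
static int
whole_group_extent (int niters, int groupsize)
{
  return niters * groupsize;
}

/* Elements read beyond the scalar extent if the last iteration is not
   peeled off.  */
static int
overread_without_peeling (int niters, int groupsize)
{
  int gap = single_element_gap (groupsize);

  /* Mirrors the intent of the new gcc_assert: with groupsize > 1 the
     gap of a single-element group is never zero.  */
  assert (gap != 0);

  return whole_group_extent (niters, groupsize)
         - scalar_extent (niters, groupsize);
}
```

With groupsize 4 and 8 iterations, the scalar loop touches elements up to index 28 while whole-group loads would reach index 31, so the over-read equals the gap of 3. That is the case the existing "GROUP_GAP (...) != 0" test already covers once the gap is initialised, which is why the separate flag becomes redundant.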