On Wed, 26 Feb 2025, Tamar Christina wrote: > > > > > > > > No, I don't think so. The code that eventually performs a > > > > contiguous sub-group access directly should never extend > > > > the load beyond GROUP_SIZE - or should be gated on the DR > > > > not executed speculatively. That is, we should "fix" this > > > > elsewhere. > > > > > > > > > > It doesn't, it's just not aligned within the range of GROUP_SIZE > > > from what I remember. > > > > > > > If you have an updated patch I can look at what's wrong here if you > > > > tell me how to reproduce (after applying the patch I suppose). > > > > > > Yes, applying the patch and running: > > > > > > /work/build/gcc/xgcc -B/work/build/gcc/ > > /work/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c -m64 > > -fdiagnostics- > > plain-output -flto -ffat-lto-objects -msse2 -ftree-vectorize > > -fno-tree-loop- > > distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect- > > details -msse4.1 -lm -o ./vect-early-break_26.exe > > > > So it works as in executing fine. We have a VF of 4 and > > > > note: recording new base alignment for &b > > alignment: 32 > > misalignment: 0 > > based on: _1 = b[i_32]; > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: recording new base alignment for &a > > alignment: 32 > > misalignment: 0 > > based on: _2 = a[i_32]; > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: vect_compute_data_ref_alignment: > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: alignment increased due to early break to 32 bytes. > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > missed: misalign = 8 bytes of ref b[i_32] > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: vect_compute_data_ref_alignment: > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: alignment increased due to early break to 32 bytes. > > > > so no peeling necessary. But we also have like > > > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > missed: misalign = 12 bytes of ref b[_6] > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: vect_compute_data_ref_alignment: > > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21: > > note: alignment increased due to early break to 32 bytes. > > > > and we are correctly saying we vectorize an unaligned access. > > > > The "issue" is we're having SLP nodes with a load permutation, their > > expansion might not happen with the whole DR group in mind. I'd say > > we simply refuse to do early break speculative load vectorization > > for SLP nodes with a load permutation. > > This is what I was trying to say on IRC when I mentioned that the permutes > can end up creating an unaligned access wrt to the original address. > > But the reason I was still trying to allow this case is because conceptually > my assumption was that the permutes still maintain the access within > the group. After all, they're just shifting elements around. > > In other words, I was assuming that the group a[i] - a[i-2] still stays within > the group alignment of 32-bytes, even if the permute can make the second > load in the group start at say, byte 28. My assumption was though that it > can't > make it start at byte 36. > > Are you saying that this is the case? that it can? Then I agree the load > permutations > on group loads are unsafe to speculate for unmasked loops...
Yes, looking at the code generation the loads do not stay within the original properly aligned boundary. Richard. > Thanks, > Tamar > > > > It looks like a latent issue to me which could also interfere with > > gap peeling, I have to dig a bit further what code is responsible > > for the current behavior ... > > > > > > > > > Thanks, > > > Tamar > > > > > > > > > > > > Enforcing the alignment on every group member would be wrong I think > > > > > since > > > > > that ends up higher overall alignment than they need. > > > > > > > > > > > So besides these issues in get_load_store_type the change looks > > > > > > good now. > > > > > > > > > > > > > > > > Thanks for the reviews. > > > > > > > > > > Tamar > > > > > > Richard. > > > > > > > > > > > > > + else > > > > > > > + *alignment_support_scheme = dr_aligned; > > > > > > > + } > > > > > > > + > > > > > > > if (*alignment_support_scheme == dr_unaligned_unsupported) > > > > > > > { > > > > > > > if (dump_enabled_p ()) > > > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > > > > > > > index > > > > > > > > > > > > b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca > > > > > > 64797435b46fc 100644 > > > > > > > --- a/gcc/tree-vectorizer.h > > > > > > > +++ b/gcc/tree-vectorizer.h > > > > > > > @@ -1281,7 +1281,11 @@ public: > > > > > > > > > > > > > > /* Set by early break vectorization when this DR needs peeling > > > > > > > for > > alignment > > > > > > > for correctness. */ > > > > > > > - bool need_peeling_for_alignment; > > > > > > > + bool safe_speculative_read_required; > > > > > > > + > > > > > > > + /* Set by early break vectorization when this DR's scalar > > > > > > > accesses are > > known > > > > > > > + to be inbounds of a known bounds loop. */ > > > > > > > + bool scalar_access_known_in_bounds; > > > > > > > > > > > > > > tree base_decl; > > > > > > > > > > > > > > @@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info) > > > > > > > return dr_info->target_alignment; > > > > > > > } > > > > > > > #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR) > > > > > > > +#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)- > > > > > > >scalar_access_known_in_bounds > > > > > > > + > > > > > > > +/* Return if the stmt_vec_info requires peeling for alignment. > > > > > > > */ > > > > > > > +inline bool > > > > > > > +dr_safe_speculative_read_required (stmt_vec_info stmt_info) > > > > > > > +{ > > > > > > > + dr_vec_info *dr_info; > > > > > > > + if (STMT_VINFO_GROUPED_ACCESS (stmt_info)) > > > > > > > + dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT > > > > (stmt_info)); > > > > > > > + else > > > > > > > + dr_info = STMT_VINFO_DR_INFO (stmt_info); > > > > > > > + > > > > > > > + return dr_info->safe_speculative_read_required; > > > > > > > +} > > > > > > > + > > > > > > > +/* Set the safe_speculative_read_required for the the > > > > > > > stmt_vec_info, if > > group > > > > > > > + access then set on the fist element otherwise set on DR > > > > > > > directly. */ > > > > > > > +inline void > > > > > > > +dr_set_safe_speculative_read_required (stmt_vec_info stmt_info, > > > > > > > + bool requires_alignment) > > > > > > > +{ > > > > > > > + dr_vec_info *dr_info; > > > > > > > + if (STMT_VINFO_GROUPED_ACCESS (stmt_info)) > > > > > > > + dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT > > > > (stmt_info)); > > > > > > > + else > > > > > > > + dr_info = STMT_VINFO_DR_INFO (stmt_info); > > > > > > > + > > > > > > > + dr_info->safe_speculative_read_required = requires_alignment; > > > > > > > +} > > > > > > > > > > > > > > inline void > > > > > > > set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Richard Biener <rguent...@suse.de> > > > > > > SUSE Software Solutions Germany GmbH, > > > > > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > > > Nuernberg) > > > > > > > > > > > > > -- > > > > Richard Biener <rguent...@suse.de> > > > > SUSE Software Solutions Germany GmbH, > > > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > > Nuernberg) > > > > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE Software Solutions Germany GmbH, > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)