On Wed, 26 Feb 2025, Tamar Christina wrote:

> > > >
> > > > No, I don't think so.  The code that eventually performs a
> > > > contiguous sub-group access directly should never extend
> > > > the load beyond GROUP_SIZE - or should be gated on the DR
> > > > not executed speculatively.  That is, we should "fix" this
> > > > elsewhere.
> > > >
> > >
> > > It doesn't, it's just not aligned within the range of GROUP_SIZE
> > > from what I remember.
> > >
> > > > If you have an updated patch I can look at what's wrong here if you
> > > > tell me how to reproduce (after applying the patch I suppose).
> > >
> > > Yes, applying the patch and running:
> > >
> > > /work/build/gcc/xgcc -B/work/build/gcc/
> > /work/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c  -m64   
> > -fdiagnostics-
> > plain-output  -flto -ffat-lto-objects -msse2 -ftree-vectorize 
> > -fno-tree-loop-
> > distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-
> > details -msse4.1  -lm  -o ./vect-early-break_26.exe
> > 
> > So it works as in executing fine.  We have a VF of 4 and
> > 
> > note:   recording new base alignment for &b
> >   alignment:    32
> >   misalignment: 0
> >   based on:     _1 = b[i_32];
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   recording new base alignment for &a
> >   alignment:    32
> >   misalignment: 0
> >   based on:     _2 = a[i_32];
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   alignment increased due to early break to 32 bytes.
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > missed:   misalign = 8 bytes of ref b[i_32]
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   alignment increased due to early break to 32 bytes.
> > 
> > so no peeling necessary.  But we also have like
> > 
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > missed:   misalign = 12 bytes of ref b[_6]
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   vect_compute_data_ref_alignment:
> > /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> > note:   alignment increased due to early break to 32 bytes.
> > 
> > and we are correctly saying we vectorize an unaligned access.
> > 
> > The "issue" is we're having SLP nodes with a load permutation, their
> > expansion might not happen with the whole DR group in mind.  I'd say
> > we simply refuse to do early break speculative load vectorization
> > for SLP nodes with a load permutation.
> 
> This is what I was trying to say on IRC when I mentioned that the permutes
> can end up creating an unaligned access wrt to the original address.
> 
> But the reason I was still trying to allow this case is because conceptually
> my assumption was that the permutes still maintain the access within
> the group.  After all, they're just shifting elements around.
> 
> In other words, I was assuming that the group a[i] - a[i-2] still stays within
> the group alignment of 32-bytes, even if the permute can make the second
> load in the group start at say, byte 28.  My assumption was though that it 
> can't
> make it start at byte 36.
> 
> Are you saying that this is the case? that it can? Then I agree the load 
> permutations
> on group loads are unsafe to speculate for unmasked loops...

Yes, looking at the code generation the loads do not stay within the
original properly aligned boundary.

Richard.

> Thanks,
> Tamar
> > 
> > It looks like a latent issue to me which could also interfere with
> > gap peeling, I have to dig a bit further what code is responsible
> > for the current behavior ...
> > 
> > 
> > 
> > > Thanks,
> > > Tamar
> > >
> > > >
> > > > > Enforcing the alignment on every group member would be wrong I think 
> > > > > since
> > > > > that ends up higher overall alignment than they need.
> > > > >
> > > > > > So besides these issues in get_load_store_type the change looks 
> > > > > > good now.
> > > > > >
> > > > >
> > > > > Thanks for the reviews.
> > > > >
> > > > > Tamar
> > > > > > Richard.
> > > > > >
> > > > > > > +      else
> > > > > > > + *alignment_support_scheme = dr_aligned;
> > > > > > > +    }
> > > > > > > +
> > > > > > >    if (*alignment_support_scheme == dr_unaligned_unsupported)
> > > > > > >      {
> > > > > > >        if (dump_enabled_p ())
> > > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > > > > index
> > > > > >
> > > >
> > b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca
> > > > > > 64797435b46fc 100644
> > > > > > > --- a/gcc/tree-vectorizer.h
> > > > > > > +++ b/gcc/tree-vectorizer.h
> > > > > > > @@ -1281,7 +1281,11 @@ public:
> > > > > > >
> > > > > > >    /* Set by early break vectorization when this DR needs peeling 
> > > > > > > for
> > alignment
> > > > > > >       for correctness.  */
> > > > > > > -  bool need_peeling_for_alignment;
> > > > > > > +  bool safe_speculative_read_required;
> > > > > > > +
> > > > > > > +  /* Set by early break vectorization when this DR's scalar 
> > > > > > > accesses are
> > known
> > > > > > > +     to be inbounds of a known bounds loop.  */
> > > > > > > +  bool scalar_access_known_in_bounds;
> > > > > > >
> > > > > > >    tree base_decl;
> > > > > > >
> > > > > > > @@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info)
> > > > > > >    return dr_info->target_alignment;
> > > > > > >  }
> > > > > > >  #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
> > > > > > > +#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)-
> > > > > > >scalar_access_known_in_bounds
> > > > > > > +
> > > > > > > +/* Return if the stmt_vec_info requires peeling for alignment.  
> > > > > > > */
> > > > > > > +inline bool
> > > > > > > +dr_safe_speculative_read_required (stmt_vec_info stmt_info)
> > > > > > > +{
> > > > > > > +  dr_vec_info *dr_info;
> > > > > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > > (stmt_info));
> > > > > > > +  else
> > > > > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > > +
> > > > > > > +  return dr_info->safe_speculative_read_required;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* Set the safe_speculative_read_required for the the 
> > > > > > > stmt_vec_info, if
> > group
> > > > > > > +   access then set on the fist element otherwise set on DR 
> > > > > > > directly.  */
> > > > > > > +inline void
> > > > > > > +dr_set_safe_speculative_read_required (stmt_vec_info stmt_info,
> > > > > > > +                                bool requires_alignment)
> > > > > > > +{
> > > > > > > +  dr_vec_info *dr_info;
> > > > > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > > (stmt_info));
> > > > > > > +  else
> > > > > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > > +
> > > > > > > +  dr_info->safe_speculative_read_required = requires_alignment;
> > > > > > > +}
> > > > > > >
> > > > > > >  inline void
> > > > > > >  set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguent...@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH,
> > > > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > > Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE Software Solutions Germany GmbH,
> > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to