Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

Alex Coplan Thu, 21 Nov 2024 03:20:18 -0800

On 19/11/2024 20:12, Richard Sandiford wrote:
> Alex Coplan <alex.cop...@arm.com> writes:
> > On 19/11/2024 17:02, Richard Sandiford wrote:
> >> Sorry for the slow review.  Finally catching up on backlog.
> >> 
> >> Richard Biener <rguent...@suse.de> writes:
> >> > On Mon, 28 Oct 2024, Alex Coplan wrote:
> >> >
> >> >> This allows us to vectorize more loops with early exits by forcing
> >> >> peeling for alignment to make sure that we're guaranteed to be able to
> >> >> safely read an entire vector iteration without crossing a page boundary.
> >> >> 
> >> >> To make this work for VLA architectures we have to allow compile-time
> >> >> non-constant target alignments.  We also have to override the result of
> >> >> the target's preferred_vector_alignment hook if it isn't a power-of-two
> >> >> multiple of the TYPE_SIZE of the chosen vector type.
> >> >> 
> >> >> There is currently an implicit assumption that the TYPE_SIZE of the
> >> >> vector type is itself a power of two.  For non-VLA types this
> >> >> could be checked directly in the vectorizer.  For VLA types I
> >> >> had discussed offline with Richard S about adding a target hook to allow
> >> >> the vectorizer to query the backend to confirm that a given VLA type
> >> >> is known to have a power-of-two size at runtime.
> >> >
> >> > GCC assumes all vectors have power-of-two size, so I don't think we
> >> > need to check anything but we'd instead have to make sure the
> >> > target constrains the hardware when this assumption doesn't hold
> >> > in silicon.
> >> 
> >> We did at one point support non-power-of-2 for VLA only.  But things
> >> might have crept in since that break it even for VLA.  It's no longer
> >> something that matters for SVE because the architecture has been
> >> tightened to remove the non-power-of-2 option.
> >> 
> >> My main comment on the patch is about:
> >> 
> >> +  /* Below we reject compile-time non-constant target alignments, but if
> >> +     our misalignment is zero, then we are known to already be aligned
> >> +     w.r.t. any such possible target alignment.  */
> >> +  if (known_eq (misalignment, 0))
> >> +    return 0;
> >> 
> >> When is that true for VLA?  It seems surprising that we can guarantee
> >> alignment to an unknown boundary :)  However, I agree that it's the
> >> natural consequence of the formula.
> >
> > My vague memory is that the alignment peeling machinery forces the
> > dr_info->misalignment to 0 after we've decided to peel for alignment
> > (for DRs which we know we will have made aligned by peeling).  So the
> > check is designed to handle that case.
> 
> Ah, yeah, of course.  Sorry for the dumb question.  I'd forgotten that
> that was what the misalignment represented here, rather than the
> incoming/"natural" misalignment.


Not a dumb question at all, it is quite non-obvious.  Thanks for taking
a look at the patch.
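
For anyone reading the archive later, the intuition behind forcing
alignment peeling can be sketched in plain C.  This is only an
illustration under stated assumptions, not vectorizer code: the helper
names peel_iters and crosses_page are hypothetical, and the target
alignment is assumed to be a power of two no larger than the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of scalar iterations to peel so that
   ADDR becomes aligned to TARGET_ALIGN bytes.  Assumes TARGET_ALIGN is
   a power of two and ADDR is already aligned to ELEM_SIZE.  */
static size_t
peel_iters (uintptr_t addr, size_t target_align, size_t elem_size)
{
  assert (target_align != 0 && (target_align & (target_align - 1)) == 0);
  assert (elem_size != 0 && addr % elem_size == 0);

  size_t misalign = addr & (target_align - 1);
  /* Zero misalignment means no peeling is needed, whatever the
     alignment boundary happens to be.  */
  if (misalign == 0)
    return 0;
  return (target_align - misalign) / elem_size;
}

/* Hypothetical helper: nonzero if reading NBYTES bytes starting at
   ADDR touches more than one page of PAGE_SIZE bytes.  */
static int
crosses_page (uintptr_t addr, size_t nbytes, size_t page_size)
{
  assert (nbytes != 0 && page_size != 0);
  return addr / page_size != (addr + nbytes - 1) / page_size;
}
```

With 4-byte elements and a 16-byte vector, an access starting at 0x1008
needs two peeled iterations to reach 0x1010.  Once the address is
aligned to the vector size, a full-vector read stays within one page
under the assumptions above, whereas an unaligned 16-byte read at 0xff8
would straddle a 4096-byte page boundary.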

Alex

> 
> Thanks,
> Richard
