Yes. Note that I don't see that we guarantee element alignment for
gather/scatter either, nor do the IFNs seem to have encoding space for
alignment. The effective type for TBAA also seems to be missing there ...
Regarding vector_vector_composition_type I had a try and attached a preliminary
V3. I'm not really happy with it (and I suppose you won't be either) because
it's now essentially two closely related functions in one with different
argument requirements (I needed four additional ones).
Indeed :/
I'm not sure whether handling this case as part of VMAT_STRIDED_SLP is
wise. IIRC we do already choose VMAT_GATHER_SCATTER for some
strided loads, so why not do strided load/store handling as part of
gather/scatter handling?
Hmm yeah so we already use strided loads for "stride-like" gathers, but I guess
not for group sizes > 1? But how would we get there for cases that currently
choose VMAT_STRIDED_SLP? Do you mean to change the access type in certain
cases?
IMHO it's not such a bad fit for strided SLP, maybe not perfect but the general
idea of using a larger element to construct a group stays the same. Of course
it's kind of in between but we still load individual groups, just without the
final "vec_init"/vec_construct step. I concede that strided loads are more
limited than the other two options because we cannot load groups larger than
64 bits.
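
To illustrate the "larger element" idea, here is a minimal scalar sketch in
plain C. It is not GCC code: the function name, its parameters and the fixed
group size of two 32-bit elements are assumptions made up for illustration.
Each group is fetched as one 64-bit element per stride step, so no separate
vec_construct step is needed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a strided load with group size 2:
   each group of two 32-bit elements is fetched as a single 64-bit
   element, with consecutive groups STRIDE_BYTES apart.  */
static void
strided_group_load (uint32_t *dst, const unsigned char *base,
                    ptrdiff_t stride_bytes, size_t ngroups)
{
  /* The whole group must fit into the widest available element,
     which is the 64-bit limit mentioned above.  */
  assert (2 * sizeof (uint32_t) <= sizeof (uint64_t));

  for (size_t i = 0; i < ngroups; i++)
    {
      uint64_t group;
      /* memcpy makes no alignment or aliasing assumptions, which
         sidesteps the correctness questions raised in this thread.  */
      memcpy (&group, base + (ptrdiff_t) i * stride_bytes, sizeof group);
      memcpy (dst + 2 * i, &group, sizeof group);
    }
}
```

With a stride of 16 bytes over an array of 32-bit values, this picks up
elements 0, 1, then 4, 5, then 8, 9, and so on.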
I think the spotted correctness issues wrt alignment/aliasing should be
addressed up-front. In the end the gather/stride-load is probably an
UNSPEC, so there's no MEM RTX with wrong info? How would we
query the target on whether it can handle the alignment here? Usually
we go through vect_supportable_dr_alignment which asks
targetm.vectorize.support_vector_misalignment which in turn gets
packed_p as true in case the scalar load involved isn't aligned according
to its size. But I'm not sure we'll end up there for gather/scatter or
strided loads.
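
As a rough model of the "packed" condition described above, here is a small
sketch in plain C. Both function names are hypothetical and this is not the
actual hook implementation; the second function only stands in for one
possible target policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the packed condition: the access counts as
   packed when its known alignment is smaller than the element size.  */
static bool
access_is_packed (size_t known_align_bytes, size_t elem_size_bytes)
{
  assert (elem_size_bytes > 0);
  return known_align_bytes < elem_size_bytes;
}

/* Hypothetical target policy: accept a misaligned vector access only
   if every element is still aligned to at least its own size.  */
static bool
target_accepts_misalignment (bool is_packed)
{
  return !is_packed;
}
```

A gather or strided load would need some equivalent per-element query,
because the whole-vector alignment says nothing about the individual
element addresses.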
OK, going to have a look.
--
Regards
Robin