> I'm not sure whether handling this case as part of VMAT_STRIDED_SLP is wise. IIRC we do already choose VMAT_GATHER_SCATTER for some strided loads, so why not do strided load/store handling as part of gather/scatter handling?
Now that we can deal with gather/scatter misalignment I think we can come back to this.
IMHO strided loads are an ok fit for STRIDED_SLP because of the support for composing larger elements from individual bytes. To my understanding this is the only part of the vectorizer where we attempt these composition types.
Regular gather/scatter handling doesn't have this capability yet, and we already use strided loads/stores if we can determine that the gather/scatter index is simple.
Would refactoring this strategy perhaps help make it suitable for strided loads/stores?
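To illustrate what I mean by composing larger elements, here is a rough scalar sketch in C. It is not what the vectorizer emits, and the function names are made up for this mail. The first loop reads a group of four adjacent bytes at each strided position. The second treats each group as a single 32-bit element, which is the shape a strided load with a wider element type could handle directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: byte-wise copy of 4-byte groups that sit
   `stride` bytes apart in the source.  */
void
copy_groups_bytes (uint8_t *dst, const uint8_t *src, size_t n, size_t stride)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < 4; j++)
      dst[4 * i + j] = src[i * stride + j];
}

/* Same copy, but each group is handled as one composed 32-bit element.
   memcpy keeps the accesses valid for unaligned addresses.  */
void
copy_groups_composed (uint8_t *dst, const uint8_t *src, size_t n,
                      size_t stride)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t elt;
      memcpy (&elt, src + i * stride, sizeof elt);
      memcpy (dst + 4 * i, &elt, sizeof elt);
    }
}
```

Both functions produce the same bytes in dst. The second form only needs one strided access per group instead of four byte accesses.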
-- 
Regards
 Robin