https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94092
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, it was expected that the patch cannot handle all cases, since the ldist transform most definitely loses information on the access that is otherwise used to improve alignment info. I suggested adding __builtin_mem{set,cpy,move} variants with an extra argument specifying the 'alignment size' (common to both pointers for cpy/move).

Note that for loops without a known upper bound we want to dispatch to libc memset even when we know much about alignment, since libc is expected to have optimal code sequences for a variety of alignment/size combinations. As said elsewhere, I believe we have code that can dynamically dispatch between a short inline sequence and a libcall depending on the actual length, but I don't remember whether this is/was in generic code or in target-specific code (I think it is done only when value-profile data is available).
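
For illustration only, the length-based dispatch between a short inline sequence and a libcall could look roughly like the C sketch below. It is not GCC's actual expansion code. The names memset_dispatch and INLINE_MEMSET_MAX and the threshold value of 16 are hypothetical; a real compiler would pick the cutoff per target and possibly from value-profile data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff between the inline path and the libcall. */
#define INLINE_MEMSET_MAX 16

/* Hypothetical helper: choose the strategy from the run-time length. */
static void *memset_dispatch(void *dst, int c, size_t n)
{
    if (n <= INLINE_MEMSET_MAX) {
        /* Short case: simple byte stores, no call overhead. */
        unsigned char *p = dst;
        for (size_t i = 0; i < n; i++)
            p[i] = (unsigned char)c;
        return dst;
    }
    /* Long or unbounded case: rely on libc's tuned memset. */
    return memset(dst, c, n);
}
```

The branch costs one comparison on each call, which is presumably why such dispatch would only pay off when profile data suggests that short lengths are common.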