http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53726
--- Comment #16 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-20 15:09:46 UTC ---
What we could do for the case in question is look at the maximum possible value of c, derived from number-of-iterations analysis, which should tell us 8 because of the size of the tem array. But I am not sure whether a good library implementation shouldn't always be preferable to a byte-wise copy. We could at least try to envision a way to retain and use the knowledge that the size is at most 8 when expanding the memcpy (with AVX we could use a masked store, for example - quite fancy).
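
To illustrate what "using the knowledge that the size is at most 8" could buy, here is a rough hand-written sketch in C. It is not what GCC emits and not a proposed patch; the helper name copy_upto8 is made up for this example. It handles any n <= 8 with two overlapping fixed-size moves instead of a byte loop or a variable-size library call. The fixed-size memcpy calls are there only to express unaligned loads and stores portably; compilers typically lower them to single moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dst, where the caller
   guarantees n <= 8 (the bound that iteration analysis would provide). */
static void copy_upto8(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n <= 8);                     /* precondition assumed by this sketch */

    if (n == 8) {
        memcpy(dst, src, 8);            /* one 8-byte move */
    } else if (n >= 4) {
        /* 4..7 bytes: the first 4 and the last 4 bytes overlap in the middle. */
        uint32_t head, tail;
        memcpy(&head, src, 4);
        memcpy(&tail, src + n - 4, 4);
        memcpy(dst, &head, 4);
        memcpy(dst + n - 4, &tail, 4);
    } else if (n >= 2) {
        /* 2..3 bytes: same trick with 2-byte moves. */
        uint16_t head, tail;
        memcpy(&head, src, 2);
        memcpy(&tail, src + n - 2, 2);
        memcpy(dst, &head, 2);
        memcpy(dst + n - 2, &tail, 2);
    } else if (n == 1) {
        dst[0] = src[0];
    }
    /* n == 0: nothing to copy */
}
```

The AVX masked store mentioned above would be a different way to get the same effect with a single store; it is not shown here. Whether either form actually beats a call into a well-tuned library memcpy is the open question raised in the comment.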