https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #3)
> Seems most of the *by_pieces code actually uses widest_int_mode_for_size
> which already handles even the wider modes as long as they have a mov<mode>
> instruction. With this completely untested patch I get roughly the same
> code with -mavx and better with -mavx512f, just as a drawback for some
> reason the functions have frame pointer (dunno if that is caused by the
> OI/XImode, while vector modes can be handled or what else). Tried memset
> with zero too, but haven't tried other memsets (those could be problematic
> already) or comparisons.
>
> Thoughts on this? Not a GCC9 material though. Perhaps it should also
> depend on the selected preferred vector width, so that we don't e.g. enable
> AVX512F if that is undesirable from power consumption POV.
>
> --- gcc/config/i386/i386.h.jj 2019-01-01 12:37:32.988715207 +0100
> +++ gcc/config/i386/i386.h 2019-02-06 21:13:01.047765193 +0100
> @@ -1886,7 +1886,9 @@ typedef struct ix86_args {
>     && TARGET_SSE2 \
>     && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
>     && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> -   ? GET_MODE_SIZE (TImode) : UNITS_PER_WORD)
> +   ? (TARGET_AVX512F ? GET_MODE_SIZE (XImode) \
> +      : TARGET_AVX ? GET_MODE_SIZE (OImode) \
> +      : GET_MODE_SIZE (TImode)) : UNITS_PER_WORD)

We need to take prefer_vector_width_type into account.
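
To make the point concrete, here is a rough standalone C model of the selection logic, not a patch against i386.h. The enum only mirrors the PVW_* names used for prefer_vector_width_type; struct target_model and move_max_bytes are hypothetical names made up for this sketch, and treating PVW_NONE as "no cap" is an assumption about the intended behavior.

```c
#include <stdbool.h>

/* Stand-in for the preferred vector width setting; the names mirror
   GCC's PVW_* values, but this is a toy model, not GCC code.  */
enum pvw { PVW_NONE, PVW_AVX128, PVW_AVX256, PVW_AVX512 };

/* Hypothetical bundle of the target flags the quoted macro tests.  */
struct target_model
{
  bool unaligned_sse2_ok; /* TARGET_SSE2 and both unaligned tunings.  */
  bool has_avx;           /* TARGET_AVX.  */
  bool has_avx512f;       /* TARGET_AVX512F.  */
  enum pvw prefer;        /* Preferred vector width.  */
  int units_per_word;     /* UNITS_PER_WORD, e.g. 8 in 64-bit mode.  */
};

/* Return the widest move size in bytes, capped by the preferred
   vector width.  Assumption: PVW_NONE imposes no cap.  */
static int
move_max_bytes (const struct target_model *t)
{
  if (!t->unaligned_sse2_ok)
    return t->units_per_word;

  /* 64 bytes (XImode) only if 512-bit vectors are not disfavored.  */
  if (t->has_avx512f
      && t->prefer != PVW_AVX128
      && t->prefer != PVW_AVX256)
    return 64;

  /* 32 bytes (OImode) unless 128-bit vectors are preferred.  */
  if (t->has_avx && t->prefer != PVW_AVX128)
    return 32;

  /* 16 bytes (TImode) otherwise.  */
  return 16;
}
```

With this model, an AVX512F target that prefers 256-bit vectors would get 32-byte moves instead of 64-byte ones, which is the power-consumption concern raised in comment #3.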