https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90204
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
It seems such code generation is the intention of r254855:

  /* Use 256-bit AVX instructions instead of 512-bit AVX instructions
     in the auto-vectorizer.  */
  if (ix86_tune_features[X86_TUNE_AVX256_OPTIMAL]
      && !(opts_set->x_ix86_target_flags & OPTION_MASK_PREFER_AVX256))
    opts->x_ix86_target_flags |= OPTION_MASK_PREFER_AVX256;

I know there is a frequency reduction issue when many zmm registers are used,
but which exact situation was r254855 meant to deal with?
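
For readers unfamiliar with the opts/opts_set pattern in the quoted snippet, here is a minimal standalone sketch of the same defaulting logic. It is not GCC code: the struct, the mask value, and the function name are hypothetical stand-ins, chosen only to show that the tuning default is applied solely when the user has not set the flag explicitly.

```c
#include <stdbool.h>

/* Hypothetical stand-in for OPTION_MASK_PREFER_AVX256; the real value differs. */
#define MASK_PREFER_AVX256 (1u << 0)

/* Hypothetical model of the two option records in the quoted snippet. */
struct target_opts {
    unsigned flags;      /* current flag values (models opts->x_ix86_target_flags) */
    unsigned flags_set;  /* bits the user set explicitly (models opts_set->x_ix86_target_flags) */
};

/* Turn on the 256-bit preference when the tuning asks for it,
   unless the user already made an explicit choice for that bit. */
static void apply_avx256_default(struct target_opts *o, bool tune_avx256_optimal)
{
    if (tune_avx256_optimal && !(o->flags_set & MASK_PREFER_AVX256))
        o->flags |= MASK_PREFER_AVX256;
}
```

With this model, a record whose flags_set bit is clear ends up preferring 256-bit vectors under an AVX256-optimal tuning, while a record whose flags_set bit is set keeps whatever value the user chose.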