On Thu, Oct 10, 2013 at 08:40:05PM +0200, Jan Hubicka wrote:
> --- config/i386/x86-tune.def (revision 203387)
> +++ config/i386/x86-tune.def (working copy)
> +/* X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL: if true, unaligned loads are
> +   split.  */
> +DEF_TUNE (X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL,
> "256_unaligned_load_optimal",
> +          ~(m_COREI7 | m_GENERIC))
> +
> +/* X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL: if true, unaligned loads are

s/loads/stores/

> +   split.  */
> +DEF_TUNE (X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL,
> "256_unaligned_load_optimal",
> +          ~(m_COREI7 | m_BDVER | m_GENERIC))

s/load/store/

Also, I wonder if we couldn't improve the generated code for
-mavx2 -mtune=generic or -march=core-avx2 -mtune=generic etc. - m_GENERIC
is included clearly because vmovup{s,d} was really bad on SandyBridge
(am I right here?), but if the ISA includes AVX2, then the code will not
run on that chip at all, so can't we override it?

> @@ -3946,10 +3933,10 @@ ix86_option_override_internal (bool main
>        if (flag_expensive_optimizations
>            && !(target_flags_explicit & MASK_VZEROUPPER))
>          target_flags |= MASK_VZEROUPPER;
> -      if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
> +      if (!ix86_tune_features[X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL]

Didn't you mean to use X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL here?

>            && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>          target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
> -      if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
> +      if (!ix86_tune_features[X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL]

And similarly for STORE here?

>            && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
>          target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
>        /* Enable 128-bit AVX instruction generation

	Jakub
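
P.S. To make the override question concrete, here is a minimal standalone
sketch in C. It is not GCC code and not a patch: every name prefixed with
sketch_ or SKETCH_ is hypothetical, the struct only models the relevant
bits of ix86_tune_features, target_flags and target_flags_explicit, and
"ignore the generic tuning when the ISA includes AVX2" is just one possible
reading of the suggestion above. It also assumes the split decision is keyed
on the AVX256 tuning flags rather than the SSE ones.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for MASK_AVX256_SPLIT_UNALIGNED_{LOAD,STORE}.  */
enum
{
  SKETCH_MASK_SPLIT_LOAD = 1 << 0,
  SKETCH_MASK_SPLIT_STORE = 1 << 1
};

/* Hypothetical model of the state the quoted hunk consults.  */
struct sketch_opts
{
  bool load_optimal;       /* models the AVX256 unaligned-load tuning flag */
  bool store_optimal;      /* models the AVX256 unaligned-store tuning flag */
  bool tune_generic;       /* true when tuning for generic */
  bool isa_avx2;           /* true when the selected ISA includes AVX2 */
  unsigned flags;          /* models target_flags */
  unsigned explicit_flags; /* models target_flags_explicit */
};

/* Return the flags after the sketched override logic has run.  */
unsigned
sketch_override (struct sketch_opts o)
{
  /* Assumption: with generic tuning and an AVX2 ISA the code cannot run
     on the chip that motivated splitting, so the tuning is ignored.  */
  bool ignore_generic = o.tune_generic && o.isa_avx2;

  bool split_load = !o.load_optimal && !ignore_generic;
  bool split_store = !o.store_optimal && !ignore_generic;

  /* As in the quoted hunk, an explicit user choice is never overridden.  */
  if (split_load && !(o.explicit_flags & SKETCH_MASK_SPLIT_LOAD))
    o.flags |= SKETCH_MASK_SPLIT_LOAD;
  if (split_store && !(o.explicit_flags & SKETCH_MASK_SPLIT_STORE))
    o.flags |= SKETCH_MASK_SPLIT_STORE;

  return o.flags;
}
```

With tune_generic and isa_avx2 both set, sketch_override leaves both split
bits clear; with tune_generic set and isa_avx2 clear, both bits get set
unless the corresponding bit is present in explicit_flags.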