On Thu, Oct 10, 2013 at 08:40:05PM +0200, Jan Hubicka wrote:
> --- config/i386/x86-tune.def  (revision 203387)
> +++ config/i386/x86-tune.def  (working copy)

> +/* X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL: if true, unaligned loads are
> +   split.  */
> +DEF_TUNE (X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL, "256_unaligned_load_optimal",
> +          ~(m_COREI7 | m_GENERIC))
> +
> +/* X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL: if true, unaligned loads are

s/loads/stores/

> +   split.  */
> +DEF_TUNE (X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL, "256_unaligned_load_optimal",
> +          ~(m_COREI7 | m_BDVER | m_GENERIC))

s/load/store/

Also, I wonder whether we could improve the generated code for
-mavx2 -mtune=generic or -march=core-avx2 -mtune=generic etc.
m_GENERIC is clearly included because vmovup{s,d} was really bad
on SandyBridge (am I right here?), but if the ISA includes AVX2, then
the code will not run on that chip at all, so can't we override it?
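
To make the idea concrete, here is a rough, standalone sketch of the
override.  Everything in it is hypothetical: struct tune_state and
should_set_split_load_mask are made-up names, not anything in i386.c, and
whether every AVX2-capable chip really copes well with unaligned 256-bit
loads is exactly the open question above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real GCC state; none of these names
   exist in i386.c in this form.  */
struct tune_state
{
  bool avx256_unaligned_load_optimal; /* value from the tuning table */
  bool isa_has_avx2;                  /* -mavx2 or an -march= implying it */
  bool split_load_explicit;           /* user chose the split flag himself */
};

/* Return true if the "split unaligned 256-bit loads" mask should be
   turned on by default.  */
static bool
should_set_split_load_mask (const struct tune_state *s)
{
  if (s->split_load_explicit)
    return false;  /* never touch an explicit user choice */
  if (s->isa_has_avx2)
    return false;  /* assumption: such code cannot run on SandyBridge */
  return !s->avx256_unaligned_load_optimal;
}
```

With this, -mtune=generic alone would still split the loads, while
-mavx2 -mtune=generic would not.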

> @@ -3946,10 +3933,10 @@ ix86_option_override_internal (bool main
>        if (flag_expensive_optimizations
>         && !(target_flags_explicit & MASK_VZEROUPPER))
>       target_flags |= MASK_VZEROUPPER;
> -      if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
> +      if (!ix86_tune_features[X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL]

Didn't you mean to use X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL here?

>         && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>       target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
> -      if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
> +      if (!ix86_tune_features[X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL]

And similarly for STORE here?

>         && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
>       target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
>        /* Enable 128-bit AVX instruction generation

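The two index questions above matter whenever the SSE and AVX256 flags
differ for the selected tuning.  A minimal standalone illustration, using a
cut-down enum and a made-up feature row (not the real x86-tune.def
contents):

```c
#include <stdbool.h>

/* Hypothetical, cut-down copy of the tuning indices.  */
enum tune_index
{
  TUNE_SSE_UNALIGNED_LOAD_OPTIMAL,
  TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL,
  TUNE_LAST
};

/* Made-up feature row for a CPU where 128-bit unaligned loads are fine
   but 256-bit ones are not.  */
static const bool example_features[TUNE_LAST] =
{
  true,   /* TUNE_SSE_UNALIGNED_LOAD_OPTIMAL */
  false   /* TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL */
};

/* Condition as written in the posted hunk: tests the SSE flag.  */
static bool
split_load_as_posted (const bool *features)
{
  return !features[TUNE_SSE_UNALIGNED_LOAD_OPTIMAL];
}

/* Condition as presumably intended: tests the AVX256 flag.  */
static bool
split_load_as_intended (const bool *features)
{
  return !features[TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL];
}
```

For such a row the posted condition would leave the 256-bit loads unsplit,
while the intended one would split them.
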
        Jakub
