On Thu, Feb 4, 2021 at 4:52 PM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Thu, Feb 4, 2021 at 7:45 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Thu, Feb 4, 2021 at 5:28 AM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > > > > > GCC11 will be the system GCC 2 years from now, and for the
> > > > > > processors then, they shouldn't even need to split a 256-bit vector
> > > > > > into 2 128-bit vectors.
> > > > > > I.e., testing SPEC2017 with the 2 options below on Zen3/ICL shows
> > > > > > Option B is better than Option A.
> > > > > > Option A:
> > > > > > -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast
> > > > > >
> > > > > > Option B:
> > > > > > Option A +
> > > > > > -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal"
> > > > > >
> > > > > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > > >
> > > > > Given the explicit list for unaligned loads it's a no-brainer to
> > > > > change that
> > > > > for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL. Given both
> > > > > BDVER and ZNVER1 are listed for
> > > > > X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL
> > > > > we should try to benchmark the effect on ZNVER1 - Martin, do we still
> > > > > have a znver1 machine around?
> > > >
> > > > They are also turned on for Sandybridge. I don't believe we should
> > > > keep it
> > > > in GCC 11 to penalize today's CPUs as well as CPUs in 2024.
> > > >
> > > I agree with H.J., and I would also like to hear Uros' opinion.
> >
> > I don't have any benchmark data to form my opinion on, but I
> > definitely agree that the compiler should tune for the newer processors
> > where speed matters the most, and 10-year-old processors are
> > irrelevant as far as speed is concerned.
> >
> > So, if it is expected that gcc-11 will be most used 2-3 years from
> > now, it should by default target the architecture that will be most
> > used at that time. But I think that distribution maintainers should
> > decide here.
>
> I'm all for the change - the case it could regress is odd anyway as it needs
> AVX2 enabled and on CPUs with a 128-bit data path those shouldn't be
> preferred multilibs (thinking of this new x86_64-v2/v3 stuff).
>
I'm going to check in the patch.
> Richard.
>
> > Uros.
--
BR,
Hongtao
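
For anyone who wants to look at the effect of the two option sets locally, here is a minimal sketch. The file name saxpy.c and the function are purely illustrative and are not part of the patch or its testsuite. Compiling it with `gcc -S` once with Option A and once with Option B and diffing the loop body should show the difference: with the split tuning one would expect the unaligned 256-bit accesses to be emitted as two 128-bit halves (e.g. via vinsertf128/vextractf128), and with Option B as single 256-bit moves. The exact output depends on the GCC version, so treat this as a starting point rather than a guaranteed result.

```c
#include <stddef.h>

/* Illustrative kernel only (hypothetical, not from the patch).
   x and y carry no alignment guarantee, so a vectorizer targeting
   AVX2 has to use unaligned 256-bit loads and stores here. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

A plain function with unknown pointer alignment is used on purpose: it is the case where the unaligned load/store tuning decides the code shape, and it needs no intrinsics.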