On Thu, Feb 4, 2021 at 4:52 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Thu, Feb 4, 2021 at 7:45 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Thu, Feb 4, 2021 at 5:28 AM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > > > > > > GCC 11 will be the system GCC 2 years from now, and the processors
> > > > > > > of that time shouldn't even need to split a 256-bit vector into
> > > > > > > two 128-bit vectors.
> > > > > > > For example, testing SPEC2017 with the 2 options below on Zen3/ICL
> > > > > > > shows that option B is better than option A.
> > > > > > Option A:
> > > > > > -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast
> > > > > >
> > > > > > Option B:
> > > > > > > Option A + -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal"
> > > > > >
> > > > > > >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > > >
> > > > > > Given the explicit list for unaligned loads it's a no-brainer to change
> > > > > > that for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL.  Given both BDVER and
> > > > > > ZNVER1 are listed for X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL
> > > > > we should try to benchmark the effect on ZNVER1 - Martin, do we still
> > > > > have a znver1 machine around?
> > > >
> > > > > They are also turned on for Sandybridge.  I don't believe we should
> > > > > keep it in GCC 11 to penalize today's CPUs as well as CPUs in 2024.
> > > >
> > > > I agree with H.J., and I would also like to hear Uros' opinion.
> >
> > > I don't have any benchmark data to form my opinion on, but I
> > > definitely agree that the compiler should tune for the newer processors,
> > > where speed matters the most, and 10-year-old processors are
> > > irrelevant as far as speed is concerned.
> >
> > > So, if it is expected that gcc-11 will be most used 2-3 years from
> > > now, it should by default target the architecture that will be most
> > > used at that time. But I think that distribution maintainers should
> > > decide here.
>
> I'm all for the change - the case where it could regress is odd anyway, as it
> needs AVX2 enabled, and on CPUs with a 128-bit data path those shouldn't be
> preferred multilibs (thinking of this new x86_64-v2/v3 stuff).
>
I'm going to check in the patch.
> Richard.
>
> > Uros.



-- 
BR,
Hongtao
