> -----Original Message-----
> From: qia...@fujitsu.com <qia...@fujitsu.com>
> Sent: 18 March 2021 01:52
> To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>
> Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
>
> Hello Kyrill,
>
> Sorry for the slow response.
> The performance on a64fx is not impacted with this patch.

Thank you very much for testing, Qian.
Glad to see there is no impact on A64FX.
I will push the patch to master then.

Kyrill

>
> Regards,
> Qian
>
> > -----Original Message-----
> > From: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> > Sent: Wednesday, March 10, 2021 10:56 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford <richard.sandif...@arm.com>; Qian, Jianhua/钱 建华
> > <qia...@fujitsu.com>
> > Subject: [PATCH] aarch64: Improve generic SVE tuning defaults
> >
> > Hi all,
> >
> > This patch adds the recently-added tweak to split some SVE VL-based scalar
> > operations [1] to the generic tuning used for SVE, as enabled by adding +sve to
> > the -march flag, for example -march=armv8.2-a+sve.
> >
> > The recommendation for best performance on a particular CPU remains
> > unchanged:
> > use the -mcpu option for that CPU, where possible. -mcpu=native makes this
> > straightforward for native compilation.
> >
> > The tweak to split out SVE VL-based scalar operations is a consistent win for
> > the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of
> > SPEC2017 on A64FX with this tweak on didn't show any non-noise differences.
> > It is also expected to be neutral on SVE2 implementations.
> >
> > Therefore, the patch enables the tweak for generic +sve tuning e.g.
> > -march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
> > therefore the tweak is disabled for generic tuning when +sve2 is in -march e.g.
> > -march=armv8.2-a+sve2.
> >
> > The implementation of this approach requires a bit of custom logic in
> > aarch64_override_options_internal to handle these kinds of
> > architecture-dependent decisions, but we do believe the user-facing principle
> > here is important to implement.
> >
> > Qian, as you've contributed the A64FX support to GCC, I would be grateful for
> > your feedback on this approach and in particular on the performance evaluation
> > of this change.
> >
> > In general, for the generic target we're using a decision framework that looks
> > like:
> >
> > * If all cores that are known to benefit from an optimization are of architecture X,
> > and all other cores that implement X or above are not impacted, or have a very
> > slight impact, we will consider it for generic tuning for architecture X.
> > * We will not enable that optimisation for generic tuning for architecture X+1 if
> > no known cores of architecture X+1 or above will benefit.
> >
> > This framework allows us to improve generic tuning for CPUs of generation X
> > while avoiding accumulating tweaks for future CPUs of generation X+1, X+2...
> > that do not need them, and thus avoid even the slight negative effects of these
> > optimisations if the user is willing to tell us the desired architecture accurately.
> >
> > X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-a
> > etc) or optional architecture extensions (like SVE, SVE2).
> >
> > We think that this patch fits that framework, so would like to propose it for the
> > trunk default tunings for SVE.
> >
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > Thanks,
> > Kyrill
> >
> > [1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning):
> > Define.
> > (aarch64_override_options_internal): Use it.
> > (generic_tunings): Add AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
> > tune_flags.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.target/aarch64/sve/aarch64-sve.exp: Add -moverride=tune=none to
> > sve_flags.
> > * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> > * gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
> > * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> >
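
For readers who want the architecture-dependent decision from the patch description in concrete form, here is a minimal standalone sketch in C. It is not the code from the patch: every SKETCH_* macro and sketch_* function is a hypothetical name made up for illustration, and the sketch assumes that +sve2 is visible as an ISA bit alongside +sve. It only models the stated rule: generic tuning carries the CSE_SVE_VL_CONSTANTS tweak by default, and the tweak is dropped when +sve2 is present in -march.

```c
#include <assert.h>

/* Hypothetical ISA feature bits, for illustration only.
   These are not GCC's real definitions.  */
#define SKETCH_ISA_SVE   (1u << 0)
#define SKETCH_ISA_SVE2  (1u << 1)

/* Hypothetical stand-in for AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.  */
#define SKETCH_TUNE_CSE_SVE_VL_CONSTANTS  (1u << 0)

/* Generic tuning starts with the tweak enabled, mirroring the ChangeLog
   entry that adds the flag to the tune_flags of generic_tunings.  */
static const unsigned sketch_generic_tune_flags
  = SKETCH_TUNE_CSE_SVE_VL_CONSTANTS;

/* Return the extra tuning flags to use for generic tuning, given the
   ISA bits requested through -march.  Assumption: no known SVE2 core
   benefits from the tweak, so it is cleared when +sve2 is requested.  */
static unsigned
sketch_adjust_generic_arch_tuning (unsigned isa_flags)
{
  unsigned flags = sketch_generic_tune_flags;

  if (isa_flags & SKETCH_ISA_SVE2)
    flags &= ~SKETCH_TUNE_CSE_SVE_VL_CONSTANTS;

  return flags;
}

/* Small self-check covering the two -march examples from the thread.  */
static void
sketch_self_check (void)
{
  /* Models -march=armv8.2-a+sve: the tweak stays on.  */
  assert (sketch_adjust_generic_arch_tuning (SKETCH_ISA_SVE)
          & SKETCH_TUNE_CSE_SVE_VL_CONSTANTS);

  /* Models -march=armv8.2-a+sve2: the tweak is turned off.  */
  assert (!(sketch_adjust_generic_arch_tuning (SKETCH_ISA_SVE
                                               | SKETCH_ISA_SVE2)
            & SKETCH_TUNE_CSE_SVE_VL_CONSTANTS));
}
```

Starting from the default and clearing the flag for X+1 keeps the "enable for architecture X, do not carry forward to X+1" rule in one place. The sketch covers generic tuning only; tunings selected with -mcpu are outside its scope.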