On 30/04/2020 11:56, Kyrylo Tkachov wrote:
> [Moving to gcc-patches]
>
>> -----Original Message-----
>> From: Gcc <gcc-boun...@gcc.gnu.org> On Behalf Of Andrew Pinski via Gcc
>> Sent: 30 April 2020 07:21
>> To: Florian Weimer <fwei...@redhat.com>
>> Cc: GCC Mailing List <g...@gcc.gnu.org>; nmeye...@amzn.com
>> Subject: Re: Should ARMv8-A generic tuning default to -moutline-atomics
>>
>> On Wed, Apr 29, 2020 at 6:25 AM Florian Weimer via Gcc <g...@gcc.gnu.org>
>> wrote:
>>>
>>> Distributions are receiving requests to build things with
>>> -moutline-atomics:
>>>
>>> <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=956418>
>>>
>>> Should this be reflected in the GCC upstream defaults for ARMv8-A
>>> generic tuning? It does not make much sense to me if every distribution
>>> has to override these flags, either in their build system or by patching
>>> GCC.
>>
>> At least we should make it a configure option.
>> I do want the ability to default it for our (Marvell) toolchain for
>> Linux (our bare metal toolchain will be defaulting to ARMv8.2-a
>> anyways).
>
> After some internal discussions, I am open to having it on as a default.
> Here are two versions. One has it as a tuning setting that CPUs can override,
> the other just switches it on by default always unless overridden by
> -mno-outline-atomics.
> I slightly prefer the second one as it's cleaner and simpler, but happy to
> take either.
> Any preferences?
> Thanks,
> Kyrill
>
> ChangeLogs:
>
> 2020-04-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
> * config/aarch64/aarch64-tuning-flags.def (no_outline_atomics): Declare.
> * config/aarch64/aarch64.h (TARGET_OUTLINE_ATOMICS): Define.
> * config/aarch64/aarch64.opt (moutline-atomics): Change to Int variable.
>
> 2020-04-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
> * config/aarch64/aarch64.h (TARGET_OUTLINE_ATOMICS): Define.
> * config/aarch64/aarch64.opt (moutline-atomics): Change to Int variable.
> * doc/invoke.texi (moutline-atomics): Document as on by default.
>
I think I prefer the second option.

LSE is called the "Large System Extensions" precisely because the v8.0
atomics have known scaling issues on systems with many cores. That is not
a property of the core itself; it is primarily a property of the number of
cores in the system. The real problem is that we don't have a tuning
parameter for that, nor can we tell at compile time how many threads might
actually be in use.

So I don't think this is something that can or should be selected via the
existing -mtune tables.

R.
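
For reference, a minimal sketch of the kind of source this option affects.
The counter and function names are hypothetical, chosen only for
illustration; nothing here is taken from the patches above. The point is
that the source is identical whichever way the atomics are lowered; only
the generated code for the atomic add changes.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter; in a real program many threads would
   update it concurrently, which is where v8.0 atomics scale poorly. */
static _Atomic uint64_t hits;

/* Relaxed atomic add. Returns the counter value from before the add. */
uint64_t record_hits(uint64_t n)
{
    return atomic_fetch_add_explicit(&hits, n, memory_order_relaxed);
}

/* Relaxed atomic read of the current counter value. */
uint64_t read_hits(void)
{
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

Assuming an AArch64 GCC that supports the option, compiling with
"-O2 -march=armv8-a -moutline-atomics -S" should show the add in
record_hits as a call to a libgcc helper (something like
__aarch64_ldadd8_relax) that picks LSE or the load/store-exclusive loop at
run time, while -mno-outline-atomics should give the inline exclusive loop.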