> -----Original Message----- > From: Richard Sandiford <richard.sandif...@arm.com> > Sent: Wednesday, December 11, 2024 9:50 AM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw > <richard.earns...@arm.com>; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that > support it > > Tamar Christina <tamar.christ...@arm.com> writes: > >> -----Original Message----- > >> From: Richard Sandiford <richard.sandif...@arm.com> > >> Sent: Wednesday, December 11, 2024 9:32 AM > >> To: Tamar Christina <tamar.christ...@arm.com> > >> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw > >> <richard.earns...@arm.com>; ktkac...@gcc.gnu.org > >> Subject: Re: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that > >> support it > >> > >> Tamar Christina <tamar.christ...@arm.com> writes: > >> > Hi All, > >> > > >> > GCC 15 added two new fusions CMP+CSEL and CMP+CSET. > >> > > >> > This patch enables them for cores that support based on their Software > >> > Optimization Guides and generically on Armv9-A. Even if a core does not > >> > support it there's no negative performance impact. > >> > > >> > > >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > >> > > >> > Ok for master? > >> > > >> > Thanks, > >> > Tamar > >> > > >> > gcc/ChangeLog: > >> > > >> > * config/aarch64/aarch64-fusion-pairs.def > >> (AARCH64_FUSE_NEOVERSE_BASE): > >> > New. > >> > * config/aarch64/tuning_models/cortexx925.h: Use it. > >> > * config/aarch64/tuning_models/generic_armv9_a.h: Use it. > >> > * config/aarch64/tuning_models/neoverse512tvb.h: Use it. > >> > * config/aarch64/tuning_models/neoversen2.h: Use it. > >> > * config/aarch64/tuning_models/neoversen3.h: Use it. > >> > * config/aarch64/tuning_models/neoversev1.h: Use it. > >> > * config/aarch64/tuning_models/neoversev2.h: Use it. > >> > * config/aarch64/tuning_models/neoversev3.h: Use it. > >> > * config/aarch64/tuning_models/neoversev3ae.h: Use it. > >> > > >> > --- > >> > > >> > diff --git a/gcc/config/aarch64/aarch64-fusion-pairs.def > >> b/gcc/config/aarch64/aarch64-fusion-pairs.def > >> > index > >> > f8413ab0c802c28290ebcc171bfd131622cb33be..0123430d988b96b20d20376 > >> df9ae7a196031286d 100644 > >> > --- a/gcc/config/aarch64/aarch64-fusion-pairs.def > >> > +++ b/gcc/config/aarch64/aarch64-fusion-pairs.def > >> > @@ -45,4 +45,8 @@ AARCH64_FUSION_PAIR ("cmp+cset", CMP_CSET) > >> > /* Baseline fusion settings suitable for all cores. */ > >> > #define AARCH64_FUSE_BASE (AARCH64_FUSE_CMP_BRANCH | > >> AARCH64_FUSE_AES_AESMC) > >> > > >> > +/* Baseline fusion settings suitable for all Neoverse cores. */ > >> > +#define AARCH64_FUSE_NEOVERSE_BASE (AARCH64_FUSE_BASE | > >> AARCH64_FUSE_CMP_CSEL \ > >> > + | AARCH64_FUSE_CMP_CSET) > >> > + > >> > #define AARCH64_FUSE_MOVK (AARCH64_FUSE_MOV_MOVK | > >> AARCH64_FUSE_MOVK_MOVK) > >> > diff --git a/gcc/config/aarch64/tuning_models/cortexx925.h > >> b/gcc/config/aarch64/tuning_models/cortexx925.h > >> > index > >> > eb9b89984b0472858bc08dba924c962ec4ba53bd..b3ae1576ade1f701b775496 > >> def277230e193d20f 100644 > >> > --- a/gcc/config/aarch64/tuning_models/cortexx925.h > >> > +++ b/gcc/config/aarch64/tuning_models/cortexx925.h > >> > @@ -205,7 +205,7 @@ static const struct tune_params cortexx925_tunings = > >> > 2 /* store_pred. */ > >> > }, /* memmov_cost. */ > >> > 10, /* issue_rate */ > >> > - AARCH64_FUSE_BASE, /* fusible_ops */ > >> > + AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */ > >> > "32:16", /* function_align. */ > >> > "4", /* jump_align. */ > >> > "32:16", /* loop_align. */ > >> > diff --git a/gcc/config/aarch64/tuning_models/generic_armv9_a.h > >> b/gcc/config/aarch64/tuning_models/generic_armv9_a.h > >> > index > >> > 48353a59939d84647c6981d6d0551af7ce9df751..e971e645dfc4b805b7f994a1 > >> 5a5df7803ff4dc6c 100644 > >> > --- a/gcc/config/aarch64/tuning_models/generic_armv9_a.h > >> > +++ b/gcc/config/aarch64/tuning_models/generic_armv9_a.h > >> > @@ -236,7 +236,7 @@ static const struct tune_params > >> generic_armv9_a_tunings = > >> > 1 /* store_pred. */ > >> > }, /* memmov_cost. */ > >> > 3, /* issue_rate */ > >> > - AARCH64_FUSE_BASE, /* fusible_ops */ > >> > + AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */ > >> > "32:16", /* function_align. */ > >> > "4", /* jump_align. */ > >> > "32:16", /* loop_align. */ > >> > >> Having AARCH64_FUSE_NEOVERSE_BASE seems like a good thing, but I think > >> we have to be careful about using it for generic tuning. generic-armv9-a > >> mustn't become "generic Neoverse", since it's supposed to be good for > >> non-Neoverse (and non-Arm) Armv9-A cores as well. > >> > >> So perhaps here we should expand the macro. Alternatively, we could add > >> a comment saying that fusion macros for other CPUs can be added here as > >> well, if they are unlikely to have a negative impact on other cores. > >> > >> Perhaps we should expand the macro for cortex-x925 as well. > > > > Sorry that certainly was not the intention, I thought about naming it > > AARCH64_FUSE_ARMV9_BASE instead, but since I was also using it for > > non-Armv9 cores I thought this would be equally confusing. But perhaps > > In light of your comments it may make more sense naming wise? > > Yeah, I'd also wondered about suggesting that, but then saw that there > were quite a few Armv9-A cores that don't have the flag. > > So I think AARCH64_FUSE_NEOVERSE_BASE is a good name, and is what we > should use for the Neoverse tunings. But perhaps we should expand the > macro for the two files above. If anyone adds any new _FUSE flags for > Neoverse cores later, there should still be an explicit decision whether > to include that flag for cortex-x925 and generic-armv9-a. > > If we end up with multiple cores of the same family that want the same > flags, we can add a new macro for that family then, like you're doing > here for Neoverse. > > (All IMO of course.)
Understood, works for me. I'll leave it a bit for people to comment on otherwise will proceed with the expansion for generic. Cheers, Tamar > > Thanks, > Richard