Tamar Christina <tamar.christ...@arm.com> writes: >> -----Original Message----- >> From: Richard Sandiford <richard.sandif...@arm.com> >> Sent: Wednesday, December 11, 2024 9:32 AM >> To: Tamar Christina <tamar.christ...@arm.com> >> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw >> <richard.earns...@arm.com>; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that >> support it >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> > Hi All, >> > >> > GCC 15 added two new fusions CMP+CSEL and CMP+CSET. >> > >> > This patch enables them for cores that support based on their Software >> > Optimization Guides and generically on Armv9-A. Even if a core does not >> > support it there's no negative performance impact. >> > >> > >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >> > >> > Ok for master? >> > >> > Thanks, >> > Tamar >> > >> > gcc/ChangeLog: >> > >> > * config/aarch64/aarch64-fusion-pairs.def >> (AARCH64_FUSE_NEOVERSE_BASE): >> > New. >> > * config/aarch64/tuning_models/cortexx925.h: Use it. >> > * config/aarch64/tuning_models/generic_armv9_a.h: Use it. >> > * config/aarch64/tuning_models/neoverse512tvb.h: Use it. >> > * config/aarch64/tuning_models/neoversen2.h: Use it. >> > * config/aarch64/tuning_models/neoversen3.h: Use it. >> > * config/aarch64/tuning_models/neoversev1.h: Use it. >> > * config/aarch64/tuning_models/neoversev2.h: Use it. >> > * config/aarch64/tuning_models/neoversev3.h: Use it. >> > * config/aarch64/tuning_models/neoversev3ae.h: Use it. >> > >> > --- >> > >> > diff --git a/gcc/config/aarch64/aarch64-fusion-pairs.def >> b/gcc/config/aarch64/aarch64-fusion-pairs.def >> > index >> f8413ab0c802c28290ebcc171bfd131622cb33be..0123430d988b96b20d20376 >> df9ae7a196031286d 100644 >> > --- a/gcc/config/aarch64/aarch64-fusion-pairs.def >> > +++ b/gcc/config/aarch64/aarch64-fusion-pairs.def >> > @@ -45,4 +45,8 @@ AARCH64_FUSION_PAIR ("cmp+cset", CMP_CSET) >> > /* Baseline fusion settings suitable for all cores. */ >> > #define AARCH64_FUSE_BASE (AARCH64_FUSE_CMP_BRANCH | >> AARCH64_FUSE_AES_AESMC) >> > >> > +/* Baseline fusion settings suitable for all Neoverse cores. */ >> > +#define AARCH64_FUSE_NEOVERSE_BASE (AARCH64_FUSE_BASE | >> AARCH64_FUSE_CMP_CSEL \ >> > + | AARCH64_FUSE_CMP_CSET) >> > + >> > #define AARCH64_FUSE_MOVK (AARCH64_FUSE_MOV_MOVK | >> AARCH64_FUSE_MOVK_MOVK) >> > diff --git a/gcc/config/aarch64/tuning_models/cortexx925.h >> b/gcc/config/aarch64/tuning_models/cortexx925.h >> > index >> eb9b89984b0472858bc08dba924c962ec4ba53bd..b3ae1576ade1f701b775496 >> def277230e193d20f 100644 >> > --- a/gcc/config/aarch64/tuning_models/cortexx925.h >> > +++ b/gcc/config/aarch64/tuning_models/cortexx925.h >> > @@ -205,7 +205,7 @@ static const struct tune_params cortexx925_tunings = >> > 2 /* store_pred. */ >> > }, /* memmov_cost. */ >> > 10, /* issue_rate */ >> > - AARCH64_FUSE_BASE, /* fusible_ops */ >> > + AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */ >> > "32:16", /* function_align. */ >> > "4", /* jump_align. */ >> > "32:16", /* loop_align. */ >> > diff --git a/gcc/config/aarch64/tuning_models/generic_armv9_a.h >> b/gcc/config/aarch64/tuning_models/generic_armv9_a.h >> > index >> 48353a59939d84647c6981d6d0551af7ce9df751..e971e645dfc4b805b7f994a1 >> 5a5df7803ff4dc6c 100644 >> > --- a/gcc/config/aarch64/tuning_models/generic_armv9_a.h >> > +++ b/gcc/config/aarch64/tuning_models/generic_armv9_a.h >> > @@ -236,7 +236,7 @@ static const struct tune_params >> generic_armv9_a_tunings = >> > 1 /* store_pred. */ >> > }, /* memmov_cost. */ >> > 3, /* issue_rate */ >> > - AARCH64_FUSE_BASE, /* fusible_ops */ >> > + AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */ >> > "32:16", /* function_align. */ >> > "4", /* jump_align. */ >> > "32:16", /* loop_align. */ >> >> Having AARCH64_FUSE_NEOVERSE_BASE seems like a good thing, but I think >> we have to be careful about using it for generic tuning. generic-armv9-a >> mustn't become "generic Neoverse", since it's supposed to be good for >> non-Neoverse (and non-Arm) Armv9-A cores as well. >> >> So perhaps here we should expand the macro. Alternatively, we could add >> a comment saying that fusion macros for other CPUs can be added here as >> well, if they are unlikely to have a negative impact on other cores. >> >> Perhaps we should expand the macro for cortex-x925 as well. > > Sorry that certainly was not the intention, I thought about naming it > AARCH64_FUSE_ARMV9_BASE instead, but since I was also using it for > non-Armv9 cores I thought this would be equally confusing. But perhaps > In light of your comments it may make more sense naming wise?
Yeah, I'd also wondered about suggesting that, but then saw that there were quite a few Armv9-A cores that don't have the flag. So I think AARCH64_FUSE_NEOVERSE_BASE is a good name, and is what we should use for the Neoverse tunings. But perhaps we should expand the macro for the two files above. If anyone adds any new _FUSE flags for Neoverse cores later, there should still be an explicit decision whether to include that flag for cortex-x925 and generic-armv9-a. If we end up with multiple cores of the same family that want the same flags, we can add a new macro for that family then, like you're doing here for Neoverse. (All IMO of course.) Thanks, Richard