Hi Richard,

> Sorry to be awkward, but I don't think we should put
> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT in base.
> CHEAP_SHIFT_EXTEND is a good base flag because it means we can make full
> use of a certain group of instructions.  FULLY_PIPELINED_FMA similarly
> means that FMA chains behave as one would expect.

So does that imply you're happy with
[2/3] https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673224.html ?

> But MATCHED_VECTOR_THROUGHPUT feels to me more like a property of
> a particular uarch.  I don't see a reason in principle why future
> cores must provide the same Advanced SIMD bandwidth as SVE bandwidth.

These are really all microarchitecture tuning flags; it's just that some are
so standard that they can be the default for all modern cores. Making them the
default removes the repeated clutter in the tuning models and reduces the
chance of new CPUs accidentally using incorrect settings.

Note that many older cores don't use the base setting, and for exceptions like
A64FX one could remove particular tunings or add a new tune in the future.
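
To illustrate the idea, here is a minimal standalone sketch of a shared base
mask with a per-core exception. The flag names, bit values and the two example
cores are hypothetical and only mimic the pattern; they are not the
definitions used in the aarch64 backend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only.  */
enum
{
  TUNE_CHEAP_SHIFT_EXTEND = 1 << 0,
  TUNE_FULLY_PIPELINED_FMA = 1 << 1,
  TUNE_MATCHED_VECTOR_THROUGHPUT = 1 << 2,
  TUNE_AVOID_PRED_RMW = 1 << 3,

  /* Settings assumed to be standard enough to be shared by modern cores.  */
  TUNE_BASE = TUNE_CHEAP_SHIFT_EXTEND | TUNE_FULLY_PIPELINED_FMA,

  /* A modern core starts from the base and adds its uarch-specific flags.  */
  MODERN_CORE_FLAGS = (TUNE_BASE
                       | TUNE_MATCHED_VECTOR_THROUGHPUT
                       | TUNE_AVOID_PRED_RMW),

  /* An exception keeps the base but masks out one flag it does not want.  */
  EXCEPTION_CORE_FLAGS = MODERN_CORE_FLAGS & ~TUNE_MATCHED_VECTOR_THROUGHPUT
};

/* Compile-time check that the base really is part of the modern core.  */
static_assert ((MODERN_CORE_FLAGS & TUNE_BASE) == TUNE_BASE,
               "modern core must include the base flags");

/* Return true if every bit of FLAG is set in FLAGS.  */
static bool
tune_flag_enabled (unsigned int flags, unsigned int flag)
{
  return (flags & flag) == flag;
}
```

With this shape a new core that starts from the base cannot accidentally miss
a standard setting, and an exception remains a one-line change.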

> The AVOID_PRED_RMW is a good catch though, thanks.  +1 to Kyrill's ok
> for that part.

I've updated the patch to just fix the neoverse512tvb tuning. I also spotted
that it wasn't yet using AARCH64_EXTRA_TUNE_BASE either, so now at least the
tuning flags are finally more consistent!

Cheers,
Wilco


v2: Update to just improve neoverse512tvb tuning

AArch64: Update neoverse512tvb tuning

Fix the neoverse512tvb tuning to be like Neoverse V1/V2 and add the missing
AARCH64_EXTRA_TUNE_BASE and AARCH64_EXTRA_TUNE_AVOID_PRED_RMW.

gcc:
        * config/aarch64/tuning_models/neoverse512tvb.h (tune_flags): Update.

---

diff --git a/gcc/config/aarch64/tuning_models/neoverse512tvb.h b/gcc/config/aarch64/tuning_models/neoverse512tvb.h
index 50eb058e23d1a824d925f6258654f9c3c7abbdff..964b4ac284a895cbea4bf889894dd662374f0d2a 100644
--- a/gcc/config/aarch64/tuning_models/neoverse512tvb.h
+++ b/gcc/config/aarch64/tuning_models/neoverse512tvb.h
@@ -155,8 +155,10 @@ static const struct tune_params neoverse512tvb_tunings =
   2,   /* min_div_recip_mul_df.  */
   0,   /* max_case_values.  */
   tune_params::AUTOPREFETCHER_WEAK,    /* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS
-   | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT),    /* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_BASE
+   | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS
+   | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
+   | AARCH64_EXTRA_TUNE_AVOID_PRED_RMW),       /* tune_flags.  */
   &generic_armv9a_prefetch_tune,
   AARCH64_LDP_STP_POLICY_ALWAYS,   /* ldp_policy_model.  */
   AARCH64_LDP_STP_POLICY_ALWAYS           /* stp_policy_model.  */
