> On 15 Nov 2024, at 12:33, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
>
> Hi Kyrill,
>
>> This would make USE_NEW_VECTOR_COSTS effectively the default.
>> Jennifer has been trying to do that as well and then to remove it (as it
>> would be always true) but there are some codegen regressions that still
>> need to be addressed.
>
> Yes, that's the goal - we should use good tuning settings by default,
> especially if they work well on modern cores. I noticed a huge gap between
> -mcpu=neoverse-v2 and -march=armv9-a, so the idea is to make the tunings
> more similar. Note this particular patch won't make a difference since both
> of these tunings already use the new vector costs and throughput setting.
>
>> See the threads “[RFC][PATCH] AArch64: Remove
>> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS” from October and September.
>> Do those regressions go away if you also specify
>> AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT at the same time?
>
> I believe we always use both of those settings together. Removing the
> settings by making them the default looks like a good idea indeed. We have
> too many tune settings...

In principle the only SVE-enabled core that
AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT wouldn’t apply to is A64FX, but
that tuning was also not validated with
AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS, so indeed in all current uses they
appear together.

I wouldn’t mind assuming AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT in the
generic tuning if others agree, but I don’t think we should remove the
!AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT paths just yet.

Thanks,
Kyrill

>
> Cheers,
> Wilco