Hi Jennifer, > -----Original Message----- > From: Jennifer Schmitz <jschm...@nvidia.com> > Sent: Friday, August 23, 2024 1:07 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford <richard.sandif...@arm.com>; Kyrylo Tkachov > <ktkac...@exchange.nvidia.com> > Subject: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS > > This patch removes the AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS > tunable and > use_new_vector_costs entry in aarch64-tuning-flags.def and makes the > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS paths in the backend the > default. > To that end, the function aarch64_use_new_vector_costs_p and its uses were > removed. Additionally, guards were added prevent nullpointer dereferences of > fields in cpu_vector_cost. >
I'm not against this change, but it does mean that we now switch old Adv. SIMD cost models as well to the new throughput based cost models. That means that -mcpu=generic now behaves differently, and -mcpu=neoverse-n1 and I think some distros explicitly use this (I believe yocto for instance does). Have we validated that the old generic cost model still behaves sensibly with this change? > The patch was bootstrapped and regtested on aarch64-linux-gnu: > No problems bootstrapping, but several test files (in aarch64-sve.exp: > gather_load_extend_X.c > where X is 1 to 4, strided_load_2.c, strided_store_2.c) fail because of small > differences > in codegen that make some of the scan-assembler-times tests fail. > > Kyrill suggested to add a -fvect-cost-model=unlimited flag to these tests and > add > some I don't personally like unlimited here as unlimited means just vectorize at any cost. This means that costing between modes are also disabled. A lot of these testcases are intended to test not just that we vectorize but that we vectorize with efficient code. I'd prefer to use -fvect-cost-model=dynamic if that fixes the testcases. Thanks, Tamar > logic to aarch64_vector_costs::add_stmt_cost to disable the changes in vector > instructions > when flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED. If you agree with > that > suggestion, I propose prepending the current patch by one that implements this > logic and adding > -fvect-cost-model=unlimited to the failing tests. Please advise. > > Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> > > gcc/ > * config/aarch64/aarch64-tuning-flags.def: Remove > use_new_vector_costs as tuning option. > * config/aarch64/aarch64.cc (aarch64_use_new_vector_costs_p): > Remove. > (aarch64_in_loop_reduction_latency): Add nullpointer dereference > guard. > (aarch64_detect_vector_stmt_subtype): Likewise. > (aarch64_vector_costs::add_stmt_cost): Remove use of > aarch64_use_new_vector_costs_p and add nullpointer dereference > guards. > (aarch64_vector_costs::finish_cost): Remove use of > aarch64_use_new_vector_costs_p. > * config/aarch64/tuning_models/cortexx925.h: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS. > * config/aarch64/tuning_models/generic_armv8_a.h: Likewise. > * config/aarch64/tuning_models/generic_armv9_a.h: Likewise. > * config/aarch64/tuning_models/neoverse512tvb.h: Likewise. > * config/aarch64/tuning_models/neoversen2.h: Likewise. > * config/aarch64/tuning_models/neoversen3.h: Likewise. > * config/aarch64/tuning_models/neoversev1.h: Likewise. > * config/aarch64/tuning_models/neoversev2.h: Likewise. > * config/aarch64/tuning_models/neoversev3.h: Likewise.