Hi Jennifer,

> -----Original Message-----
> From: Jennifer Schmitz <jschm...@nvidia.com>
> Sent: Friday, August 23, 2024 1:07 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>; Kyrylo Tkachov
> <ktkac...@exchange.nvidia.com>
> Subject: [RFC][PATCH] AArch64: Remove
> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS
> 
> This patch removes the AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS
> tunable and
> use_new_vector_costs entry in aarch64-tuning-flags.def and makes the
> AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS paths in the backend the
> default.
> To that end, the function aarch64_use_new_vector_costs_p and its uses were
> removed. Additionally, guards were added prevent nullpointer dereferences of
> fields in cpu_vector_cost.
> 

I'm not against this change, but it does mean that we now switch old Adv. SIMD
cost models as well to the new throughput based cost models.  That means that
-mcpu=generic now behaves differently, and -mcpu=neoverse-n1 and I think
some distros explicitly use this (I believe yocto for instance does).

Have we validated that the old generic cost model still behaves sensibly with 
this change?

> The patch was bootstrapped and regtested on aarch64-linux-gnu:
> No problems bootstrapping, but several test files (in aarch64-sve.exp:
> gather_load_extend_X.c
> where X is 1 to 4, strided_load_2.c, strided_store_2.c) fail because of small
> differences
> in codegen that make some of the scan-assembler-times tests fail.
> 
> Kyrill suggested to add a -fvect-cost-model=unlimited flag to these tests and 
> add
> some

I don't personally like unlimited here as unlimited means just vectorize at any
cost.  This means that costing between modes are also disabled. A lot of these
testcases are intended to test not just that we vectorize but that we vectorize
with efficient code.

I'd prefer to use -fvect-cost-model=dynamic if that fixes the testcases.

Thanks,
Tamar

> logic to aarch64_vector_costs::add_stmt_cost to disable the changes in vector
> instructions
> when flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED. If you agree with
> that
> suggestion, I propose prepending the current patch by one that implements this
> logic and adding
> -fvect-cost-model=unlimited to the failing tests. Please advise.
> 
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
> 
> gcc/
>       * config/aarch64/aarch64-tuning-flags.def: Remove
>       use_new_vector_costs as tuning option.
>       * config/aarch64/aarch64.cc (aarch64_use_new_vector_costs_p):
>       Remove.
>       (aarch64_in_loop_reduction_latency): Add nullpointer dereference
>       guard.
>       (aarch64_detect_vector_stmt_subtype): Likewise.
>       (aarch64_vector_costs::add_stmt_cost): Remove use of
>       aarch64_use_new_vector_costs_p and add nullpointer dereference
>       guards.
>       (aarch64_vector_costs::finish_cost): Remove use of
>       aarch64_use_new_vector_costs_p.
>       * config/aarch64/tuning_models/cortexx925.h: Remove
>       AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS.
>       * config/aarch64/tuning_models/generic_armv8_a.h: Likewise.
>       * config/aarch64/tuning_models/generic_armv9_a.h: Likewise.
>       * config/aarch64/tuning_models/neoverse512tvb.h: Likewise.
>       * config/aarch64/tuning_models/neoversen2.h: Likewise.
>       * config/aarch64/tuning_models/neoversen3.h: Likewise.
>       * config/aarch64/tuning_models/neoversev1.h: Likewise.
>       * config/aarch64/tuning_models/neoversev2.h: Likewise.
>       * config/aarch64/tuning_models/neoversev3.h: Likewise.

Reply via email to