According to the Neoverse V2 Software Optimization Guide (section 4.14), the instruction pairs CMP+CSEL and CMP+CSET can be fused, which had not been implemented so far. This patch implements and tests the two fusion pairs.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. There was also no non-noise impact on SPEC CPU2017 benchmark. OK for mainline? Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> gcc/ * config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Implement fusion logic. * config/aarch64/aarch64-fusion-pairs.def (cmp+csel): New entry. (cmp+cset): Likewise. * config/aarch64/tuning_models/neoversev2.h: Enable logic in field fusible_ops. gcc/testsuite/ * gcc.target/aarch64/fuse_cmp_csel.c: New test. * gcc.target/aarch64/fuse_cmp_cset.c: Likewise.
0001-aarch64-Fuse-CMP-CSEL-and-CMP-CSET-for-mcpu-neoverse.patch
Description: Binary data
smime.p7s
Description: S/MIME cryptographic signature