Hi Richard,

> If you're able to say for the record which cores you tested, then that'd
> be good.
I've mostly checked it on Cortex-A57 - if there is any effect, it would be on older cores.

> OK, thanks. I agree there doesn't seem to be an obvious reason why this
> would pessimise any cores significantly. And it looked from a quick
> check like all AArch64 cores give these compares the lowest in-use
> latency (as expected).

Indeed.

> We can revisit this if anyone finds any counterexamples.

Yes - it's unlikely there are any though!

Cheers,
Wilco

>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    2, /* issue_rate  */
> -  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "16:12",	/* function_align.  */
>    "4",	/* jump_align.  */
>    "8",	/* loop_align.  */
> @@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    3, /* issue_rate  */
> -  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "32:16",	/* function_align.  */
>    "32:16",	/* jump_align.  */
>    "32:16",	/* loop_align.  */
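
For anyone following the thread, a rough sketch of the kind of sequence
AARCH64_FUSE_CMP_BRANCH is about (not part of the patch; the function below
and the exact codegen shown in the comment are just an illustration). With
the flag set, the scheduler keeps the flag-setting compare and the following
conditional branch adjacent so cores that support compare+branch
macro-fusion can pair them:

/* Illustrative example only - not from the patch.  */
extern void report (int);

void
check (int x)
{
  /* At -O2 this typically becomes roughly:
       cmp   w0, 100
       b.le  .L1          <- adjacent cmp + b.cond pair the fusion keeps together
       b     report
     .L1:
       ret                                                              */
  if (x > 100)
    report (x);
}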