Hi Richard,

> If you're able to say for the record which cores you tested, then that'd
> be good.

I've mostly checked it on Cortex-A57 - if there is any effect, it would be on
older cores.

> OK, thanks.  I agree there doesn't seem to be an obvious reason why this
> would pessimise any cores significantly.  And it looked from a quick
> check like all AArch64 cores give these compares the lowest in-use
> latency (as expected).

Indeed.

> We can revisit this if anyone finds any counterexamples.

Yes - it's unlikely there are any though!

Cheers,
Wilco

>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    2, /* issue_rate  */
> -  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "16:12",/* function_align.  */
>    "4",/* jump_align.  */
>    "8",/* loop_align.  */
> @@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    3, /* issue_rate  */
> -  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "32:16",/* function_align.  */
>    "32:16",/* jump_align.  */
>    "32:16",/* loop_align.  */
