[PATCH, V2] Disable generating load/store vector pairs for block copies.

Testing has found that using store vector pair instructions for block copies can result in a slowdown on power10. This patch disables the vector pair instructions for block copies when tuning for power10.
This is version 2 of the patch.

| Date: Mon, 6 Jun 2022 20:55:55 -0400
| Subject: [PATCH 2/3] Disable generating load/store vector pairs for block copies.
| Message-ID: <yp6igwu03vjrs...@toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596275.html

Compared to version 1, this patch is a stand-alone patch, and it doesn't depend on a new switch (-mno-store-vector-pair). Instead, this patch just sets the default for -mblock-ops-vector-pair to be off if the cpu currently being tuned for is power10. It is anticipated that the option will automatically be enabled when tuning for a future cpu.

I have tested this patch on:

    little endian power10 using --with-cpu=power10
    little endian power9 using --with-cpu=power9
    big endian power8 using --with-cpu=power8, both 32/64-bit tested

There were no regressions. Can I apply this to the master branch, and then apply it to the GCC 12 branch after a burn-in period?

2022-06-09   Michael Meissner  <meiss...@linux.ibm.com>

gcc/
        * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do
        not generate block copies with vector pair instructions if we are
        tuning for power10.
---
 gcc/config/rs6000/rs6000.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 0af2085adc0..59481d9ac70 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4141,7 +4141,10 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
     {
-      if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX)
+      /* Do not generate lxvp and stxvp on power10 since there are some
+         performance issues.  */
+      if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX
+          && rs6000_tune != PROCESSOR_POWER10)
         rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
       else
         rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
-- 
2.35.3

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
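
For readers who want to experiment with the default selection outside of GCC, the logic in the hunk can be modelled as a small stand-alone C sketch. Everything here is an assumption made for illustration: `struct target_state`, `enum tune_cpu`, `MASK_BLOCK_OPS_VECTOR_PAIR` and both functions are hypothetical stand-ins, not the real rs6000 definitions. The sketch only mirrors the shape of the test: an explicit user setting is left alone, otherwise the bit is turned on only when both hardware features are present and the tuning target is not power10.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for GCC internals; the names and values are
   illustrative only and are not the real rs6000 definitions.  */
enum tune_cpu { TUNE_POWER9, TUNE_POWER10, TUNE_FUTURE };

#define MASK_BLOCK_OPS_VECTOR_PAIR ((uint64_t) 1 << 0)

struct target_state
{
  uint64_t isa_flags;           /* Currently enabled option bits.  */
  uint64_t isa_flags_explicit;  /* Bits the user set explicitly.  */
  bool has_mma;                 /* Models TARGET_MMA.  */
  bool efficient_unaligned_vsx; /* Models TARGET_EFFICIENT_UNALIGNED_VSX.  */
  enum tune_cpu tune;           /* Models rs6000_tune.  */
};

/* Pick the default for the block-ops vector pair bit.  An explicit user
   choice is left untouched; otherwise the bit is set only when both
   features are present and we are not tuning for power10.  */
static void
set_block_ops_vector_pair_default (struct target_state *st)
{
  assert (st);

  if (st->isa_flags_explicit & MASK_BLOCK_OPS_VECTOR_PAIR)
    return;

  if (st->has_mma && st->efficient_unaligned_vsx
      && st->tune != TUNE_POWER10)
    st->isa_flags |= MASK_BLOCK_OPS_VECTOR_PAIR;
  else
    st->isa_flags &= ~MASK_BLOCK_OPS_VECTOR_PAIR;
}

/* Convenience wrapper: apply the default to a copy of the state and
   report whether the bit ends up set.  */
static bool
vector_pair_enabled (struct target_state st)
{
  set_block_ops_vector_pair_default (&st);
  return (st.isa_flags & MASK_BLOCK_OPS_VECTOR_PAIR) != 0;
}
```

In this model, a state with both features present and `TUNE_POWER10` ends with the bit clear, the same state with `TUNE_FUTURE` ends with it set, and a state whose explicit mask already contains the bit keeps whatever value the user chose regardless of the tuning target.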