[PATCH, V2] Disable generating load/store vector pairs for block copies.

Testing has found that using store vector pair instructions for block copies
can result in a slowdown on power10.  This patch disables using the vector
pair instructions for block copies if we are tuning for power10.

This is version 2 of the patch.

| Date: Mon, 6 Jun 2022 20:55:55 -0400
| Subject: [PATCH 2/3] Disable generating load/store vector pairs for block copies.
| Message-ID: <yp6igwu03vjrs...@toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596275.html

Compared to version 1, this patch is a stand-alone patch, and it doesn't depend
on a new switch (-mno-store-vector-pair).  Instead, this patch just sets the
default for -mblock-ops-vector-pair to be off if the cpu currently being tuned
for is power10.  I anticipate that it would automatically be enabled when
tuning for a future cpu.
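
As a rough illustration of the behaviour described above, the following
stand-alone C sketch models how the default is chosen.  It is not the GCC
code: the enum tune_cpu values and the function block_ops_vector_pair_p are
hypothetical stand-ins for rs6000_tune, rs6000_isa_flags_explicit, TARGET_MMA
and TARGET_EFFICIENT_UNALIGNED_VSX, and it ignores any later option
processing.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tuning targets; not the real GCC enum.  */
enum tune_cpu { TUNE_POWER9, TUNE_POWER10, TUNE_FUTURE };

/* Model of the default selection for -mblock-ops-vector-pair.

   explicit_p        the user passed -m[no-]block-ops-vector-pair
   explicit_value    the value the user asked for (only used if explicit_p)
   have_mma          stands in for TARGET_MMA
   efficient_unaligned_vsx
                     stands in for TARGET_EFFICIENT_UNALIGNED_VSX
   tune              the cpu being tuned for

   Returns true if block copies would use vector pair instructions.  */
static bool
block_ops_vector_pair_p (bool explicit_p, bool explicit_value,
                         bool have_mma, bool efficient_unaligned_vsx,
                         enum tune_cpu tune)
{
  /* An explicit user choice is left alone.  */
  if (explicit_p)
    return explicit_value;

  /* Otherwise the default is on only when the hardware support is there
     and we are not tuning for power10.  */
  return have_mma && efficient_unaligned_vsx && tune != TUNE_POWER10;
}
```

In this model, tuning for power10 without an explicit option yields false,
tuning for the hypothetical future cpu yields true, and an explicit
-mblock-ops-vector-pair still yields true when tuning for power10.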

I have tested this patch on:

        little endian power10 using --with-cpu=power10
        little endian power9 using --with-cpu=power9
        big endian power8 using --with-cpu=power8, both 32/64-bit tested

There were no regressions.  Can I apply this to the master branch, and then
apply it to the GCC 12 branch after a burn-in period?


2022-06-09   Michael Meissner  <meiss...@linux.ibm.com>

gcc/
        * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do
        not generate block copies with vector pair instructions if we are
        tuning for power10.
---
 gcc/config/rs6000/rs6000.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 0af2085adc0..59481d9ac70 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4141,7 +4141,10 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
     {
-      if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX)
+      /* Do not generate lxvp and stxvp on power10 since there are some
+        performance issues.  */
+      if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX
+         && rs6000_tune != PROCESSOR_POWER10)
        rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
       else
        rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
