Although movgr2cf really does take 1 cycle on LA664, setting the correct
value in the cost model seems to confuse the register allocator and
severely hurts performance, especially for workloads such as OpenSSL
3.5.1 SHA512 and SPEC CPU 2017 exchange_r.

As movgr2cf is very rarely used (we could not even construct a test case
that makes the compiler emit it), just remove the LA664 customization
for it as a temporary workaround.

gcc/ChangeLog:

        PR target/120476
        * config/loongarch/loongarch-def.cc
        (loongarch_cpu_rtx_cost_data): Remove movgr2cf cost
        customization for LA664.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
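
For reviewers unfamiliar with the tuning table, here is a rough,
standalone sketch of what dropping the override means.  It is not the
GCC code: cost_data, tune_table and make_table_after_patch are
hypothetical stand-ins, and the default costs are placeholders rather
than the values in loongarch-def.cc.  Only the COSTS_N_INSNS scaling
(N * 4) follows rtl.h.

```cpp
#include <array>

// Same scaling as GCC's COSTS_N_INSNS macro in rtl.h.
constexpr int costs_n_insns (int n) { return n * 4; }

// Hypothetical stand-in for loongarch_rtx_cost_data, reduced to the two
// fields this patch touches.  The defaults are placeholders.
struct cost_data
{
  int movcf2gr = costs_n_insns (7);
  int movgr2cf = costs_n_insns (15);

  // Chainable setters: each returns a modified copy.
  cost_data movcf2gr_ (int c) const
  {
    cost_data r = *this;
    r.movcf2gr = c;
    return r;
  }

  cost_data movgr2cf_ (int c) const
  {
    cost_data r = *this;
    r.movgr2cf = c;
    return r;
  }
};

enum tune { TUNE_GENERIC, TUNE_LA664, N_TUNE };

// Hypothetical stand-in for array_tune<T>: one entry per tune target,
// each starting from the defaults above.
struct tune_table
{
  std::array<cost_data, N_TUNE> data{};

  tune_table set (tune t, const cost_data &v) const
  {
    tune_table r = *this;
    r.data[t] = v;
    return r;
  }
};

// Shape of the table after this patch: LA664 overrides only movcf2gr,
// so movgr2cf falls back to the (pessimistic) default.
tune_table make_table_after_patch ()
{
  return tune_table ().set (TUNE_LA664,
                            cost_data ().movcf2gr_ (costs_n_insns (1)));
}
```

In this sketch an entry that is not explicitly set keeps its default, so
removing the .movgr2cf_ () call is enough to fall back to the generic
cost for LA664.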

 gcc/config/loongarch/loongarch-def.cc | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
index dcd8d905c5f..dfb12eb946d 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -147,8 +147,10 @@ array_tune<loongarch_rtx_cost_data> loongarch_cpu_rtx_cost_data =
   array_tune<loongarch_rtx_cost_data> ()
     .set (TUNE_LA664,
          loongarch_rtx_cost_data ()
-           .movcf2gr_ (COSTS_N_INSNS (1))
-           .movgr2cf_ (COSTS_N_INSNS (1)));
+           .movcf2gr_ (COSTS_N_INSNS (1)));
+
+/* FIXME: LA664 has 1-cycle movgr2cf as well, but setting the real value
+   here would pessimize the performance for some reason.  See PR120476.  */
 
 /* RTX costs to use when optimizing for size.
    We use a value slightly larger than COSTS_N_INSNS (1) for all of them
-- 
2.50.1
