This is a hack I tried in order to avoid having to poke at lra code
for pr71680.

The idea of adding force_reg here was that it removes subregs from
fix_trunc, emitting the subreg mode conversion in a separate set insn.
That avoids the underlying lra issue, by virtue of combine merging the
SF subreg with the SI mem load, at least for -m64.

For -m32, combine rejects the combination because the mem address is
a lo_sum, which is a mode-dependent address.  Of course even for -m64,
combine probably shouldn't allow this combination, and wouldn't if the
rs6000 rtx_costs function said that SLOW_UNALIGNED_ACCESS mems
actually cost more.
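
For reference, here is a minimal sketch of the kind of source that can
reach this expander with a possibly unaligned float load.  It is not
the testcase from the PR; the struct and function names are made up
for illustration, and it assumes a compiler that accepts the GNU
packed attribute.

```c
/* Hypothetical illustration only, not the pr71680 testcase.
   The packed attribute places F at offset 1, so the float load is
   not naturally aligned.  */
struct __attribute__ ((packed)) packed_float
{
  char pad;
  float f;
};

/* The float -> int cast truncates toward zero, which is the
   operation a fix_trunc pattern implements.  */
int
load_and_truncate (const struct packed_float *p)
{
  return (int) p->f;
}
```

Whether a given compiler version and option set turns this into the
SI load plus SF subreg discussed above is an assumption to check in
the RTL dumps, not something the sketch guarantees.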

So this patch isn't a particularly good solution to pr71680, but
a) force_reg for an operand that must be a reg is 100% safe, and
b) it's good to expose more combine opportunities.

Bootstrapped and regression tested powerpc64le-linux and
powerpc64-linux.

        * config/rs6000/rs6000.md (fix_trunc<mode>si2): Force source operand
        to a reg.  Localize vars.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5afae92..45ad661 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5357,15 +5357,15 @@
 {
   if (!<E500_CONVERT>)
     {
-      rtx tmp, stack;
+      rtx src = force_reg (<MODE>mode, operands[1]);
 
       if (TARGET_STFIWX)
-       emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], operands[1]));
+       emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], src));
       else
        {
-         tmp = gen_reg_rtx (DImode);
-         stack = rs6000_allocate_stack_temp (DImode, true, false);
-         emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], operands[1],
+         rtx tmp = gen_reg_rtx (DImode);
+         rtx stack = rs6000_allocate_stack_temp (DImode, true, false);
+         emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], src,
                                                      tmp, stack));
        }
       DONE;

-- 
Alan Modra
Australia Development Lab, IBM
