This is a hack I tried in order to avoid having to poke at lra code for pr71680. The idea of adding force_reg here is that it removes subregs from fix_trunc, emitting the subreg mode conversion in a separate set insn. That avoids the underlying lra issue, by virtue of combine merging the SF subreg with the SI mem load, at least for -m64.
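For illustration only, here is a small C sketch of the sort of source where a float is read from a field that need not be naturally aligned and then converted to int. On rs6000 I'd expect that conversion to go through the fix_trunc<mode>si2 expander. This is not the testcase from the PR; the struct and function names are made up, and the packed attribute is the GNU C extension.

```c
/* Hypothetical illustration, not the testcase from the PR.  */
struct __attribute__ ((packed)) rec
{
  char tag;
  float f;	/* At offset 1, so not naturally aligned.  */
};

/* Float to int conversion of a value read from a packed field.
   C semantics truncate toward zero.  */
int
trunc_packed (const struct rec *r)
{
  return (int) r->f;
}
```

Compiling something like this with -O2 for both -m32 and -m64 and comparing the combine dumps should show whether the subreg got merged with the load.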
For -m32 combine rejects the combination due to the mem address being a lo_sum which is a mode dependent address. Of course even for -m64, combine probably shouldn't allow this combination, and wouldn't if the rs6000 rtx_costs function said that SLOW_UNALIGNED_ACCESS mems actually cost more. So this patch isn't a particularly good solution to pr71680, but
a) force_reg for an operand that must be a reg is 100% safe, and
b) it's good to expose more combine opportunities.

Bootstrapped and regression tested powerpc64le-linux and powerpc64-linux.

	* config/rs6000/rs6000.md (fix_trunc<mode>si2): Force source operand
	to a reg.  Localize vars.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5afae92..45ad661 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5357,15 +5357,15 @@
 {
   if (!<E500_CONVERT>)
     {
-      rtx tmp, stack;
+      rtx src = force_reg (SFmode, operands[1]);
 
       if (TARGET_STFIWX)
-	emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], operands[1]));
+	emit_insn (gen_fix_trunc<mode>si2_stfiwx (operands[0], src));
       else
 	{
-	  tmp = gen_reg_rtx (DImode);
-	  stack = rs6000_allocate_stack_temp (DImode, true, false);
-	  emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], operands[1],
+	  rtx tmp = gen_reg_rtx (DImode);
+	  rtx stack = rs6000_allocate_stack_temp (DImode, true, false);
+	  emit_insn (gen_fix_trunc<mode>si2_internal (operands[0], src,
 						      tmp, stack));
 	}
       DONE;

-- 
Alan Modra
Australia Development Lab, IBM