The recent fix for mul_widen_cost revealed an interesting
quirk of ira/reload register allocation on x86_64.  As shown in
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551648.html
for gcc.target/i386/pr71321.c we generate the following code that
performs unnecessary register shuffling.

        movl    $-51, %edx
        movl    %edx, %eax
        mulb    %dil

which is caused by reload generating the following instructions
(note that the register set by the first insn dies in the second):

(insn 7 4 36 2 (set (reg:QI 1 dx [94])
        (const_int -51 [0xffffffffffffffcd])) {*movqi_internal}
     (expr_list:REG_EQUIV (const_int -51 [0xffffffffffffffcd])
        (nil)))
(insn 36 7 8 2 (set (reg:QI 0 ax [93])
        (reg:QI 1 dx [94])) {*movqi_internal}
     (expr_list:REG_DEAD (reg:QI 1 dx [94])
        (nil)))

Various discussions in bugzilla seem to point to reload preferring
not to load constants directly into CLASS_LIKELY_SPILLED_P registers.
Whatever the cause, one workaround that doesn't involve rewriting
a register allocator is to use peephole2 to spot this pattern and
eliminate it.  In fact, this use case is (probably) the reason
peephole optimizers were originally developed, but it's a little
disappointing that this application of them is still required today.
On a positive note, this clean-up is cheap, as peephole2 is already
traversing the instruction stream with liveness (REG_DEAD notes)
calculated.

With this peephole2 the above three instructions (from pr71321.c)
are replaced with:

        movl    $-51, %eax
        mulb    %dil
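
For readers less familiar with peephole2, the rewrite can be modelled
outside of GCC.  The following is only a toy sketch in C, not code from
the patch: the toy_insn type, its fields and toy_peephole are all
hypothetical names, and the src_dies flag merely stands in for the
REG_DEAD note that peep2_reg_dead_p consults in the real pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an RTL insn.  Hypothetical types, not GCC's.  */
enum toy_kind { TOY_LOAD_IMM, TOY_MOVE_REG, TOY_OTHER };

struct toy_insn {
    enum toy_kind kind;
    int dest;        /* destination hard register number */
    int src;         /* source register (TOY_MOVE_REG only) */
    long imm;        /* immediate value (TOY_LOAD_IMM only) */
    bool src_dies;   /* models a REG_DEAD note for src on this insn */
};

/* Rewrite "load imm -> rA; move rA -> rB, rA dead" as "load imm -> rB".
   Compacts the array in place and returns the new insn count.  */
size_t toy_peephole(struct toy_insn *insns, size_t n)
{
    assert(insns != NULL || n == 0);

    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n
            && insns[i].kind == TOY_LOAD_IMM
            && insns[i + 1].kind == TOY_MOVE_REG
            && insns[i + 1].src == insns[i].dest
            && insns[i + 1].src_dies) {
            /* Load the constant straight into the copy's destination.  */
            struct toy_insn merged = insns[i];
            merged.dest = insns[i + 1].dest;
            insns[out++] = merged;
            i++;     /* the register copy is no longer needed */
        } else {
            insns[out++] = insns[i];
        }
    }
    return out;
}
```

Feeding this model the pr71321.c sequence (load -51 into register 1,
copy register 1 to register 0 with register 1 dying, then the multiply
as an opaque third insn) leaves two insns, the first of which loads -51
directly into register 0.  If the intermediate register were still
live, the pair would be left alone, which is the role of the
peep2_reg_dead_p condition in the actual pattern.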

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  This peephole triggers
1435 times during stage2 and stage3 of a bootstrap, and a further 1274
times during "make check".  The most common case is DX_REG->AX_REG
(as above) which occurs 421 times.  I've restricted this pattern to
immediate constant loads into general registers, which fixes this
particular problem, but broader predicates may help similar cases.
Ok for mainline?

2020-08-11  Roger Sayle  <ro...@nextmovesoftware.com>

        * config/i386/i386.md (peephole2): Reduce unnecessary
        register shuffling produced by register allocation.

Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4e916bf..34a8946 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18946,6 +18946,16 @@
   operands[2] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
   ix86_expand_clear (operands[1]);
 })
+
+;; Reload dislikes loading constants directly into class_likely_spilled
+;; hard registers.  Try to tidy things up here.
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "x86_64_immediate_operand"))
+   (set (match_operand:SWI 2 "general_reg_operand")
+       (match_dup 0))]
+  "peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2) (match_dup 1))])
 
 ;; Misc patterns (?)
 
