> > while working on the GCN port I ended up with many redundant register copies
> > of the form
> >  mov reg, exec
> >  do something
> >  mov reg, exec
> >  do something
> >  ...
> > these copies are generated by LRA because exec is a small register class and
> > needs a lot of reloading (it could be improved too, but I do not care
> > because I want to handle exec specially later anyway).
> >
> > I was however surprised this garbage survives postreload optimizations.  It
> > is easy to fix in regcprop which already does some noop copy elimination,
> > but only of the form mov reg, reg after substituting.
>
> Right, this ought to be dealt with during postreload CSE, there is roughly the
> same code as yours:
>
> /* See whether a single set SET is a noop.  */
> static int
> reload_cse_noop_set_p (rtx set)
> {
>   if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
>     return 0;
>
>   return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
> }
>
> Any idea about why this doesn't work in your case?

Thanks for pointing that code out.  I looked for noop_set in the postreload
passes and found the one in regcprop first.  My case is not optimized because
my IRA move pattern contains a USE, which postreload CSE does not handle.
I am testing the attached patch and plan to commit it as obvious.

regcprop uses single_set, which may be a bit more natural, but it won't be
stronger here because REG_DEAD notes are not computed yet, and the current way
noop moves are discovered runs in parallel with simplification, which is
probably a bit cheaper.

I think elimination in both places makes sense: postreload CSE runs before
splitting and regcprop afterwards.  It seems that, at least for x86, we get
quite a few noop moves from splitting.  I will get statistics from an x86_64
bootstrap for the regcprop part of the elimination.

	* postreload.c (reload_cse_simplify): Also accept USE in noop move
	patterns.

diff --git a/gcc/postreload.c b/gcc/postreload.c
index 61c1ce8..4f3a526 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -153,7 +153,8 @@ reload_cse_simplify (rtx_insn *insn, rtx testreg)
 		      value = SET_DEST (part);
 		    }
 		}
-	      else if (GET_CODE (part) != CLOBBER)
+	      else if (GET_CODE (part) != CLOBBER
+		       && GET_CODE (part) != USE)
 		break;
 	    }
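
For anyone not familiar with the loop the patch touches, here is a standalone
toy model of the check.  It is only a sketch and not GCC code: the toy_* names,
the register numbers and the accept_use flag are all made up for illustration,
and the noop test is simplified to "destination equals source", whereas the
real reload_cse_noop_set_p compares cselib values.  The function-value
handling in reload_cse_simplify is left out as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_USE, TOY_OTHER };

struct toy_part
{
  enum toy_code code;
  int dest;   /* Register written; meaningful for TOY_SET only.  */
  int src;    /* Register read; meaningful for TOY_SET only.  */
};

/* Hypothetical analogue of reload_cse_noop_set_p: in this model a set is
   a noop only when it copies a register to itself.  */
static bool
toy_noop_set_p (const struct toy_part *part)
{
  return part->dest == part->src;
}

/* Return true if every element of the parallel is a noop SET, a CLOBBER,
   or (when ACCEPT_USE is true) a USE.  ACCEPT_USE == false models the
   behaviour before the patch, ACCEPT_USE == true the behaviour after.  */
static bool
toy_parallel_noop_p (const struct toy_part *parts, size_t n, bool accept_use)
{
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_part *part = &parts[i];

      if (part->code == TOY_SET)
        {
          if (!toy_noop_set_p (part))
            return false;
        }
      else if (part->code != TOY_CLOBBER
               && !(accept_use && part->code == TOY_USE))
        return false;
    }
  return true;
}
```

In this model a parallel made of a self-copy plus a USE is rejected with
accept_use == false and accepted with accept_use == true, while a parallel
containing a real move is rejected either way.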