After some digging, I can confirm local-alloc.c is creating OR Rx,0
instructions but not simplifying them.
local-alloc.c is not the problem - but right now it is the only help I'm
getting for post-split optimization.
This occurs when source registers are replaced with equivalent constants
using validate_replace_rtx() (which performs only very minimal
simplification).
I added validate_simplify_rtx() after the normal
update_equiv_regs/validate_replace_rtx() and the OR Rx,0 got removed.
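
To illustrate the difference between replacing and simplifying, here is a
minimal sketch in Python. It is a toy model only: tuples stand in for RTL,
and substitute() and simplify() are hypothetical names, not GCC functions.
It shows a bare substitution leaving "Rx | 0" behind and a separate
simplification step folding it away.

```python
# Toy model only: tuples stand in for RTL; none of this is GCC code.

def substitute(expr, equiv):
    """Replace registers by their known constant, without simplifying."""
    if isinstance(expr, str):
        return equiv.get(expr, expr)
    if isinstance(expr, tuple):
        op, a, b = expr
        return (op, substitute(a, equiv), substitute(b, equiv))
    return expr


def simplify(expr):
    """Fold the identities x | 0 -> x and 0 | x -> x."""
    if isinstance(expr, tuple):
        op, a, b = expr
        a, b = simplify(a), simplify(b)
        if op == "ior":
            if b == 0:
                return a
            if a == 0:
                return b
        return (op, a, b)
    return expr


replaced = substitute(("ior", "Rx", "Ry"), {"Ry": 0})
print(replaced)            # ('ior', 'Rx', 0)
print(simplify(replaced))  # Rx
```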
I also found that the limited propagation of constants is due to
limitations of local-alloc.c, in particular two restrictions:
1) Constants are not propagated into operands that are both input and
output. For example:
Ra = 0
Ra = Ra | Rb
Not sure why - maybe just deemed too difficult.
2) The method used only replaces the first use in a daisy chain of
moves. So if we have
Ra = 0
Rb = Ra
Rc = Rc | Rb
it will only reduce to:
Rb = 0
Rc = Rc | Rb
rather than
Rc = Rc | 0
and ideally
*NOTHING*
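
The two cases above can be sketched with a small forward pass, again as a
toy Python model rather than anything resembling the real local-alloc.c
code. The names propagate() and remove_dead() are hypothetical, and the
sketch assumes straight-line code in which Ra and Rb are dead after the
sequence (that assumption is what allows the result to be empty).

```python
# Toy model only: (dest, src) pairs stand in for insns; this is not GCC code.

def propagate(insns):
    """Forward-propagate constants, following register-to-register moves."""
    known = {}  # register -> constant it currently holds
    out = []
    for dest, src in insns:
        if isinstance(src, tuple):            # (op, a, b)
            op, a, b = src
            a, b = known.get(a, a), known.get(b, b)
            if op == "ior" and b == 0:
                src = a
            elif op == "ior" and a == 0:
                src = b
            else:
                src = (op, a, b)
        else:
            src = known.get(src, src)
        if src == dest:                       # Rc = Rc is a no-op, drop it
            continue
        if isinstance(src, int):
            known[dest] = src                 # dest now holds a constant
        else:
            known.pop(dest, None)             # dest no longer a known constant
        out.append((dest, src))
    return out


def remove_dead(insns, live_out):
    """Backward pass: drop sets of registers that are never read later."""
    live = set(live_out)
    kept = []
    for dest, src in reversed(insns):
        if dest not in live:
            continue
        live.discard(dest)
        operands = src[1:] if isinstance(src, tuple) else (src,)
        live.update(r for r in operands if isinstance(r, str))
        kept.append((dest, src))
    kept.reverse()
    return kept


chain = [("Ra", 0), ("Rb", "Ra"), ("Rc", ("ior", "Rc", "Rb"))]
step = propagate(chain)
print(step)                       # [('Ra', 0), ('Rb', 0)]
print(remove_dead(step, {"Rc"}))  # []
```

The same pass also covers case 1: with Ra = 0 followed by Ra = Ra | Rb,
propagate() rewrites the second insn to the plain move Ra = Rb.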
Propagating REG_EQUIV notes across register-register moves would seem
to be an obviously simple way to fix this. Thoughts?
I am not sure local-alloc is the best place to address the overall
problem; I doubt it is intended to provide such optimizations.
Perhaps an additional CSE pass after split would be a better way?
Andy