After some digging, I can confirm local-alloc.c is creating OR Rx,0 instructions but not simplifying them. local-alloc.c is not the problem, but right now it is the only help I'm getting for post-split optimization.

This occurs when source registers are replaced with an equivalent constant using validate_replace_rtx(), which performs only very minimal simplification.

I added validate_simplify_rtx() after the normal update_equiv_regs/validate_replace_rtx() and the OR Rx,0 got removed.
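
To illustrate the replace-then-simplify sequence described above, here is a minimal sketch in Python. It is a toy model only: the tuple expression form and the names replace_reg and simplify are made up for this example and are not GCC's RTL structures or functions.

```python
# Toy model only: these are not GCC's RTL structures or functions.
# An expression is an int constant, a register name (str),
# or a tuple ("ior", lhs, rhs) meaning lhs | rhs.

def replace_reg(expr, reg, const):
    """Substitute a known constant for every use of `reg` in `expr`."""
    if expr == reg:
        return const
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return (op, replace_reg(lhs, reg, const), replace_reg(rhs, reg, const))
    return expr

def simplify(expr):
    """Fold the identities x | 0 == x and 0 | x == x, bottom-up."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        lhs, rhs = simplify(lhs), simplify(rhs)
        if op == "ior":
            if rhs == 0:
                return lhs
            if lhs == 0:
                return rhs
        return (op, lhs, rhs)
    return expr

# Replacement alone leaves the useless OR behind ...
replaced = replace_reg(("ior", "Rx", "Ry"), "Ry", 0)
# ... and a separate simplification step is what removes it.
simplified = simplify(replaced)
```

Here replaced is still an OR with a zero operand, while simplified is just the register Rx, so an instruction of the form Rx = Rx | 0 degenerates into a no-op move that can be deleted.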

I also found that the limited propagation of constants is due to limitations of local-alloc.c, in particular two restrictions:

1) Constants are not propagated into operands that are both input and output. For example:
Ra = 0
Ra = Ra | Rb

Not sure why - maybe just deemed too difficult.

2) The method used only replaces the first use in a daisy chain of moves. So if we have

Ra = 0
Rb = Ra
Rc = Rc | Rb

it will only reduce to:

Rb = 0
Rc = Rc | Rb

rather than

Rc = Rc | 0

and ideally

*NOTHING*
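
The reduction above can be sketched as a small straight-line pass. This is a hedged illustration in Python, not GCC code: the instruction tuples, the propagate function, and the live_out parameter are all assumptions made for this example. It carries known constants through register-register moves, folds OR with zero, and then drops sets of registers that are never read again.

```python
# Toy straight-line pass, not GCC code.
# Operands are register names (str) or int constants.
# Instructions: ("mov", dst, src) or ("ior", dst, a, b), meaning dst = a | b.

def propagate(insns, live_out):
    known = {}   # register -> constant it is currently known to hold
    out = []
    for op, dst, *srcs in insns:
        # Replace every source register whose constant value is known.
        srcs = [known.get(s, s) for s in srcs]
        if op == "ior":
            a, b = srcs
            if isinstance(a, int) and isinstance(b, int):
                op, srcs = "mov", [a | b]
            elif b == 0:
                op, srcs = "mov", [a]
            elif a == 0:
                op, srcs = "mov", [b]
        if op == "mov" and isinstance(srcs[0], int):
            known[dst] = srcs[0]      # carries the constant down a chain of moves
        else:
            known.pop(dst, None)      # dst no longer holds a known constant
        if op == "mov" and srcs[0] == dst:
            continue                  # Rc = Rc is a no-op
        out.append((op, dst, *srcs))
    # Backward sweep: drop sets of registers that are never read afterwards.
    live = set(live_out)
    kept = []
    for op, dst, *srcs in reversed(out):
        if dst not in live:
            continue
        live.discard(dst)
        live.update(s for s in srcs if isinstance(s, str))
        kept.append((op, dst, *srcs))
    return kept[::-1]

# The daisy chain from case 2, assuming only Rc is live afterwards.
chain = [("mov", "Ra", 0), ("mov", "Rb", "Ra"), ("ior", "Rc", "Rc", "Rb")]
chain_result = propagate(chain, live_out={"Rc"})

# The input/output operand from case 1, assuming only Ra is live afterwards.
inout = [("mov", "Ra", 0), ("ior", "Ra", "Ra", "Rb")]
inout_result = propagate(inout, live_out={"Ra"})
```

Under those liveness assumptions the daisy chain reduces to an empty list, and the input/output case reduces to the single move Ra = Rb. A real pass would also have to respect basic-block boundaries and instruction constraints, which this sketch ignores.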

Propagating REG_EQUIV notes across register-register moves would seem to be an obviously simple way to fix this. Thoughts? I am not sure local-alloc is the best place to address the overall problem; I doubt it is intended to provide such optimizations.
Perhaps an additional cse pass after split would be a better way?

Andy



