On Thu, 2021-01-21 at 10:49 +0000, Richard Sandiford wrote: > Ilya Leoshkevich via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > On Tue, 2021-01-19 at 09:41 +0100, Richard Biener wrote: > > > On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches > > > <gcc-patches@gcc.gnu.org> wrote: > > > Suppose we have: > > > > (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62))) > > > > (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0)) > > > > > > > > It is clearly profitable to propagate the first insn into the > > > > second > > > > one and get: > > > > > > > > (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62))) > > > > > > > > fwprop actually manages to perform this, but doesn't think the > > > > result is > > > > worth it, which results in unnecessary store/load sequences on > > > > s390. > > > > Improve the situation by classifying SUBREG -> MEM changes as > > > > profitable. > > > > > > IIRC fwprop also propagates into multiple uses and replacing a > > > non- > > > MEM > > > with a MEM is only good when the original MEM goes away - is that > > > properly > > > dealt with here? > > > > This is because of efficiency and not correctness reasons, > > right? For > > correctness I already check MEM_VOLATILE_P (new_rtx). For > > efficiency I > > think it would be reasonable to add def_insn->num_uses () == 1 > > check > > (this passes my tests, I'm yet to do a full regtest though). > > That sounds plausible, but I think there's also the issue that the > mem could be in a less frequently executed block. > > A potential problem with checking num_uses is that it might make the > boundary between fwprop and combine more fuzzy. If the propagation > makes the original instruction redundant then we should remove it > and take the cost of the removal into account when costing the > propagation (as combine does). fwprop is instead set up for cases > in which propagations are profitable even if the original instruction > is kept. > > What prevents combine from handling this? Are the instructions in > different blocks?
I wanted to do this before combine, because in __ieee754_sqrtl case fwprop turns this (example from the commit message + the insn after it): (set (reg:TF 63) (mem:TF (reg:DI 62))) (set (reg:FPRX2 66) (subreg:FPRX2 (reg:TF 63) 0)) (set (reg:FPRX2 65) (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0 [(reg:FPRX2 66)] [(asm_input:FPRX2 ("f"))] [])) into this: (set (reg:TF 63) (mem:TF (reg:DI 62))) (set (reg:FPRX2 65) (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0 [(subreg:FPRX2 (reg:TF 63) 0)] [(asm_input:FPRX2 ("f"))] [])) by propagating (reg:FPRX2 66), and there is not much combine can do about this anymore: (set (reg:FPRX2 65) (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0 [(mem:FPRX2 (reg:DI 62))] [(asm_input:FPRX2 ("f"))] [])) is not a valid insn.