On Thu, 2021-01-21 at 10:49 +0000, Richard Sandiford wrote:
> Ilya Leoshkevich via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Tue, 2021-01-19 at 09:41 +0100, Richard Biener wrote:
> > > On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > Suppose we have:
> > > >     (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
> > > >     (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
> > > > 
> > > > It is clearly profitable to propagate the first insn into the
> > > > second
> > > > one and get:
> > > > 
> > > >     (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
> > > > 
> > > > fwprop actually manages to perform this, but doesn't think the
> > > > result is
> > > > worth it, which results in unnecessary store/load sequences on
> > > > s390.
> > > > Improve the situation by classifying SUBREG -> MEM changes as
> > > > profitable.
> > > 
> > > IIRC fwprop also propagates into multiple uses and replacing a
> > > non-
> > > MEM
> > > with a MEM is only good when the original MEM goes away - is that
> > > properly
> > > dealt with here?
> > 
> > This is because of efficiency and not correctness reasons,
> > right?  For
> > correctness I already check MEM_VOLATILE_P (new_rtx).  For
> > efficiency I
> > think it would be reasonable to add def_insn->num_uses () == 1
> > check
> > (this passes my tests, I'm yet to do a full regtest though).
> 
> That sounds plausible, but I think there's also the issue that the
> mem could be in a less frequently executed block.
> 
> A potential problem with checking num_uses is that it might make the
> boundary between fwprop and combine more fuzzy.  If the propagation
> makes the original instruction redundant then we should remove it
> and take the cost of the removal into account when costing the
> propagation (as combine does).  fwprop is instead set up for cases
> in which propagations are profitable even if the original instruction
> is kept.
> 
> What prevents combine from handling this?  Are the instructions in
> different blocks?

I wanted to do this before combine, because in __ieee754_sqrtl case
fwprop turns this (example from the commit message + the insn after
it):

(set (reg:TF 63) (mem:TF (reg:DI 62)))
(set (reg:FPRX2 66) (subreg:FPRX2 (reg:TF 63) 0))
(set (reg:FPRX2 65)
     (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0
                         [(reg:FPRX2 66)]
                         [(asm_input:FPRX2 ("f"))]
                         []))

into this:

(set (reg:TF 63) (mem:TF (reg:DI 62)))
(set (reg:FPRX2 65)
     (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0
                         [(subreg:FPRX2 (reg:TF 63) 0)]
                         [(asm_input:FPRX2 ("f"))]
                         []))

by propagating (reg:FPRX2 66), and there is not much combine can do
about this anymore:

(set (reg:FPRX2 65)
     (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0
                         [(mem:FPRX2 (reg:DI 62))]
                         [(asm_input:FPRX2 ("f"))]
                         []))

is not a valid insn.

Reply via email to