On Mon, Jan 18, 2021 at 11:04 PM Ilya Leoshkevich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
> and s390x-redhat-linux.  I realize it might be too late for a change
> like this, but it's desirable to have this in conjunction with the
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563799.html s390
> regression fix, which otherwise produces unnecessary store/load
> sequences in certain glibc routines, e.g. __ieee754_sqrtl.  Ok for
> master?
>
>
>
> Suppose we have:
>
>     (set (reg/v:TF 63) (mem/c:TF (reg/v:DI 62)))
>     (set (reg:FPRX2 66) (subreg:FPRX2 (reg/v:TF 63) 0))
>
> It is clearly profitable to propagate the first insn into the second
> one and get:
>
>     (set (reg:FPRX2 66) (mem/c:FPRX2 (reg/v:DI 62)))
>
> fwprop actually manages to perform this, but doesn't think the result is
> worth it, which results in unnecessary store/load sequences on s390.
> Improve the situation by classifying SUBREG -> MEM changes as
> profitable.

IIRC fwprop also propagates into multiple uses and replacing a non-MEM
with a MEM is only good when the original MEM goes away - is that
properly dealt with here?

Richard.

> gcc/ChangeLog:
>
> 2021-01-15  Ilya Leoshkevich  <i...@linux.ibm.com>
>
>         * fwprop.c (fwprop_propagation::classify_result): Allow
>         (subreg (mem)) simplifications.
>
> gcc/testsuite/ChangeLog:
>
> 2021-01-15  Ilya Leoshkevich  <i...@linux.ibm.com>
>
>         * gcc.target/s390/vector/long-double-to-i64.c: Expect that
>         float-vector moves do *not* happen.
> ---
>  gcc/fwprop.c                                              | 5 +++++
>  gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c | 3 +--
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index eff8f7cc141..46b8ec7eccf 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -262,6 +262,11 @@ fwprop_propagation::classify_result (rtx old_rtx, rtx new_rtx)
>        && GET_MODE (new_rtx) == GET_MODE_INNER (GET_MODE (from)))
>      return PROFITABLE;
>
> +  /* Allow (subreg (mem)) -> (mem) simplifications.  However, do not allow
> +     creating new (mem/v)s, since DCE will not remove the old ones.  */
> +  if (SUBREG_P (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx))
> +    return PROFITABLE;
> +
>    return 0;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> index 2dbbb5d1c03..8f4e377ed72 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> @@ -10,8 +10,7 @@ long_double_to_i64 (long double x)
>    return x;
>  }
>
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } */
> +/* { dg-final { scan-assembler-not {\n\tvpdi\t} } } */
>  /* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */
>
>  int
> --
> 2.26.2
>
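
To make the multiple-use question above concrete, here is a rough toy
count of memory loads before and after propagation.  This is only a
sketch under assumptions: it is not fwprop code, the names LoadCounts,
propagate_load and no_extra_loads are hypothetical, and it ignores
everything except how many loads end up in the insn stream.

```cpp
#include <cassert>

// Hypothetical toy model, not GCC code: a single load defines a
// register, and `uses` later insns read that register.
struct LoadCounts {
  int before;  // memory loads before propagation
  int after;   // memory loads after propagation
};

// Propagate the load into `replaced` of the `uses` insns.  The original
// load can only be deleted if every use was replaced.
LoadCounts propagate_load(int uses, int replaced) {
  assert(replaced >= 0 && replaced <= uses);
  LoadCounts c;
  c.before = 1;
  int original_survives = (replaced < uses) ? 1 : 0;
  c.after = replaced + original_survives;
  return c;
}

// True when propagation does not increase the number of loads.
bool no_extra_loads(int uses, int replaced) {
  LoadCounts c = propagate_load(uses, replaced);
  return c.after <= c.before;
}
```

In this model, one use that gets replaced keeps the count at one load,
while two uses end up with two loads whether one or both of them are
replaced, which is the case the question is about.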