On Wed, May 25, 2022 at 2:30 AM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > On Tue, May 24, 2022 at 10:11 PM H.J. Lu <hjl.to...@gmail.com> wrote: > >> > >> On Mon, May 23, 2022 at 11:42 PM Richard Biener > >> <richard.guent...@gmail.com> wrote: > >> > > >> > On Mon, May 23, 2022 at 8:34 PM H.J. Lu <hjl.to...@gmail.com> wrote: > >> > > > >> > > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > >> > > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > >> > > > <gcc-patches@gcc.gnu.org> wrote: > >> > > > > > >> > > > > When recording store for RTL dead store elimination, check if the > >> > > > > source > >> > > > > register is set only once to a constant. If yes, record the > >> > > > > constant > >> > > > > as the store source. It eliminates unrolled zero stores after > >> > > > > memset 0 > >> > > > > in a loop where a vector register is used as the zero store source. > >> > > > > > >> > > > > gcc/ > >> > > > > > >> > > > > PR rtl-optimization/105638 > >> > > > > * dse.cc (record_store): Use the constant source if the > >> > > > > source > >> > > > > register is set only once. > >> > > > > > >> > > > > gcc/testsuite/ > >> > > > > > >> > > > > PR rtl-optimization/105638 > >> > > > > * g++.target/i386/pr105638.C: New test. > >> > > > > --- > >> > > > > gcc/dse.cc | 19 ++++++++++ > >> > > > > gcc/testsuite/g++.target/i386/pr105638.C | 44 > >> > > > > ++++++++++++++++++++++++ > >> > > > > 2 files changed, 63 insertions(+) > >> > > > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >> > > > > > >> > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc > >> > > > > index 30c11cee034..0433dd3d846 100644 > >> > > > > --- a/gcc/dse.cc > >> > > > > +++ b/gcc/dse.cc > >> > > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > >> > > > > > >> > > > > if (tem && CONSTANT_P (tem)) > >> > > > > const_rhs = tem; > >> > > > > + else > >> > > > > + { > >> > > > > + /* If RHS is set only once to a constant, set > >> > > > > CONST_RHS > >> > > > > + to the constant. */ > >> > > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >> > > > > + if (def != nullptr > >> > > > > + && !DF_REF_IS_ARTIFICIAL (def) > >> > > > > + && !DF_REF_NEXT_REG (def)) > >> > > > > + { > >> > > > > + rtx_insn *def_insn = DF_REF_INSN (def); > >> > > > > + rtx def_body = PATTERN (def_insn); > >> > > > > + if (GET_CODE (def_body) == SET) > >> > > > > + { > >> > > > > + rtx def_src = SET_SRC (def_body); > >> > > > > + if (CONSTANT_P (def_src)) > >> > > > > + const_rhs = def_src; > >> > > > > >> > > > doesn't DSE have its own tracking of stored values? Shouldn't we > >> > > > >> > > It tracks stored values only within the basic block. When RTL loop > >> > > invariant motion hoists a constant initialization out of the loop into > >> > > a separate basic block, the constant store value becomes unknown > >> > > within the original basic block. > >> > > > >> > > > improve _that_ if it is not enough? I also wonder if you need to > >> > > > >> > > My patch extends DSE stored value tracking to include the constant > >> > > which > >> > > is set only once in another basic block. > >> > > > >> > > > verify the SET isn't partial? > >> > > > > >> > > > >> > > Here is the v2 patch to check that the constant is set by a non-partial > >> > > unconditional load. > >> > > > >> > > OK for master? > >> > > > >> > > Thanks. > >> > > > >> > > H.J. > >> > > --- > >> > > RTL DSE tracks redundant constant stores within a basic block. When > >> > > RTL > >> > > loop invariant motion hoists a constant initialization out of the loop > >> > > into a separate basic block, the constant store value becomes unknown > >> > > within the original basic block. When recording store for RTL DSE, > >> > > check > >> > > if the source register is set only once to a constant by a non-partial > >> > > unconditional load. If yes, record the constant as the constant store > >> > > source. It eliminates unrolled zero stores after memset 0 in a loop > >> > > where a vector register is used as the zero store source. > >> > > > >> > > gcc/ > >> > > > >> > > PR rtl-optimization/105638 > >> > > * dse.cc (record_store): Use the constant source if the source > >> > > register is set only once. > >> > > > >> > > gcc/testsuite/ > >> > > > >> > > PR rtl-optimization/105638 > >> > > * g++.target/i386/pr105638.C: New test. > >> > > --- > >> > > gcc/dse.cc | 22 ++++++++++++ > >> > > gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > >> > > 2 files changed, 66 insertions(+) > >> > > create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >> > > > >> > > diff --git a/gcc/dse.cc b/gcc/dse.cc > >> > > index 30c11cee034..af8e88dac32 100644 > >> > > --- a/gcc/dse.cc > >> > > +++ b/gcc/dse.cc > >> > > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > >> > > > >> > > if (tem && CONSTANT_P (tem)) > >> > > const_rhs = tem; > >> > > + else > >> > > + { > >> > > + /* If RHS is set only once to a constant, set CONST_RHS > >> > > + to the constant. */ > >> > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >> > > + if (def != nullptr > >> > > + && !DF_REF_IS_ARTIFICIAL (def) > >> > > + && !(DF_REF_FLAGS (def) > >> > > + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > >> > > + && !DF_REF_NEXT_REG (def)) > >> > > >> > Can we really use df-chain here and rely that a single definition is > >> > the only one? If rhs is a hardreg does df-chain include implicit > >> > sets of function argument registers for example? Don't we need RD > >> > here or at least verify the single df-chain definition dominates the > >> > use here (if we can rely on the reg otherwise be uninitialized and thus > >> > the use invoking undefined behavior we could use the constant even > >> > in non-dominating context, but WRT the function args & hardregs I'm not > >> > sure we can tell). > >> > >> Does the hard register have a regular DEF? Won't the function args & > >> hardregs have DF_REF_ARTIFICIAL? > > > > I don't know - do they? > > They won't be DF_REF_ARTIFICIAL. I think the code should work ok for > hard registers though. All sets of argument registers, and all clobbers > of registers by function calls, are recorded as df defs. (This is one
In these cases, will SET_SRC be constant? > of the things that bloats the representation compared to rtx insns, > where most call clobbers are implicit. But it's also one of the things > that makes df easier to use & more reliable than ad hoc liveness tracking.) > > Thanks, > Richard -- H.J.