On Sat, May 28, 2022 at 11:37 AM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > > On 5/26/2022 2:43 PM, H.J. Lu via Gcc-patches wrote: > > On Thu, May 26, 2022 at 04:14:17PM +0100, Richard Sandiford wrote: > >> "H.J. Lu" <hjl.to...@gmail.com> writes: > >>> On Wed, May 25, 2022 at 12:30 AM Richard Sandiford > >>> <richard.sandif...@arm.com> wrote: > >>>> "H.J. Lu via Gcc-patches" <gcc-patches@gcc.gnu.org> writes: > >>>>> On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote: > >>>>>> On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches > >>>>>> <gcc-patches@gcc.gnu.org> wrote: > >>>>>>> When recording store for RTL dead store elimination, check if the > >>>>>>> source > >>>>>>> register is set only once to a constant. If yes, record the constant > >>>>>>> as the store source. It eliminates unrolled zero stores after memset > >>>>>>> 0 > >>>>>>> in a loop where a vector register is used as the zero store source. > >>>>>>> > >>>>>>> gcc/ > >>>>>>> > >>>>>>> PR rtl-optimization/105638 > >>>>>>> * dse.cc (record_store): Use the constant source if the > >>>>>>> source > >>>>>>> register is set only once. > >>>>>>> > >>>>>>> gcc/testsuite/ > >>>>>>> > >>>>>>> PR rtl-optimization/105638 > >>>>>>> * g++.target/i386/pr105638.C: New test. > >>>>>>> --- > >>>>>>> gcc/dse.cc | 19 ++++++++++ > >>>>>>> gcc/testsuite/g++.target/i386/pr105638.C | 44 > >>>>>>> ++++++++++++++++++++++++ > >>>>>>> 2 files changed, 63 insertions(+) > >>>>>>> create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >>>>>>> > >>>>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc > >>>>>>> index 30c11cee034..0433dd3d846 100644 > >>>>>>> --- a/gcc/dse.cc > >>>>>>> +++ b/gcc/dse.cc > >>>>>>> @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info) > >>>>>>> > >>>>>>> if (tem && CONSTANT_P (tem)) > >>>>>>> const_rhs = tem; > >>>>>>> + else > >>>>>>> + { > >>>>>>> + /* If RHS is set only once to a constant, set CONST_RHS > >>>>>>> + to the constant. */ > >>>>>>> + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >>>>>>> + if (def != nullptr > >>>>>>> + && !DF_REF_IS_ARTIFICIAL (def) > >>>>>>> + && !DF_REF_NEXT_REG (def)) > >>>>>>> + { > >>>>>>> + rtx_insn *def_insn = DF_REF_INSN (def); > >>>>>>> + rtx def_body = PATTERN (def_insn); > >>>>>>> + if (GET_CODE (def_body) == SET) > >>>>>>> + { > >>>>>>> + rtx def_src = SET_SRC (def_body); > >>>>>>> + if (CONSTANT_P (def_src)) > >>>>>>> + const_rhs = def_src; > >>>>>> doesn't DSE have its own tracking of stored values? Shouldn't we > >>>>> It tracks stored values only within the basic block. When RTL loop > >>>>> invariant motion hoists a constant initialization out of the loop into > >>>>> a separate basic block, the constant store value becomes unknown > >>>>> within the original basic block. > >>>>> > >>>>>> improve _that_ if it is not enough? I also wonder if you need to > >>>>> My patch extends DSE stored value tracking to include the constant which > >>>>> is set only once in another basic block. > >>>>> > >>>>>> verify the SET isn't partial? > >>>>>> > >>>>> Here is the v2 patch to check that the constant is set by a non-partial > >>>>> unconditional load. > >>>>> > >>>>> OK for master? > >>>>> > >>>>> Thanks. > >>>>> > >>>>> H.J. > >>>>> --- > >>>>> RTL DSE tracks redundant constant stores within a basic block. When RTL > >>>>> loop invariant motion hoists a constant initialization out of the loop > >>>>> into a separate basic block, the constant store value becomes unknown > >>>>> within the original basic block. When recording store for RTL DSE, > >>>>> check > >>>>> if the source register is set only once to a constant by a non-partial > >>>>> unconditional load. If yes, record the constant as the constant store > >>>>> source. It eliminates unrolled zero stores after memset 0 in a loop > >>>>> where a vector register is used as the zero store source. > >>>>> > >>>>> gcc/ > >>>>> > >>>>> PR rtl-optimization/105638 > >>>>> * dse.cc (record_store): Use the constant source if the source > >>>>> register is set only once. > >>>>> > >>>>> gcc/testsuite/ > >>>>> > >>>>> PR rtl-optimization/105638 > >>>>> * g++.target/i386/pr105638.C: New test. > >>>>> --- > >>>>> gcc/dse.cc | 22 ++++++++++++ > >>>>> gcc/testsuite/g++.target/i386/pr105638.C | 44 ++++++++++++++++++++++++ > >>>>> 2 files changed, 66 insertions(+) > >>>>> create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C > >>>>> > >>>>> diff --git a/gcc/dse.cc b/gcc/dse.cc > >>>>> index 30c11cee034..af8e88dac32 100644 > >>>>> --- a/gcc/dse.cc > >>>>> +++ b/gcc/dse.cc > >>>>> @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info) > >>>>> > >>>>> if (tem && CONSTANT_P (tem)) > >>>>> const_rhs = tem; > >>>>> + else > >>>>> + { > >>>>> + /* If RHS is set only once to a constant, set CONST_RHS > >>>>> + to the constant. */ > >>>>> + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >>>>> + if (def != nullptr > >>>>> + && !DF_REF_IS_ARTIFICIAL (def) > >>>>> + && !(DF_REF_FLAGS (def) > >>>>> + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)) > >>>>> + && !DF_REF_NEXT_REG (def)) > >>>> Can we introduce a helper for this? There are already similar tests > >>>> in ira and loop-iv, and it seems a bit too complex to have to open-code > >>>> each time. > >>> I can use find_single_def_src in loop-iv.cc: > >>> > >>> /* If REGNO has a single definition, return its known value, otherwise > >>> return > >>> null. */ > >>> > >>> rtx > >>> find_single_def_src (unsigned int regno) > >> Yeah, reusing that sounds good. Perhaps we should move it into df-core.cc, > >> alongside the df_reg_used group of functions. > >> > >> I think the mode check in your original patch should be kept though, > >> so how about we change the parameter to an rtx reg and use rtx_equal in: > >> > >> rtx set = single_set (DF_REF_INSN (adef)); > >> if (set == NULL || !REG_P (SET_DEST (set)) > >> || REGNO (SET_DEST (set)) != regno) > >> return NULL_RTX; > > Fixed. > > > >> rather than the existing !REG_P and REGNO checks. We should also add: > >> > >>> and do > >>> > >>> /* If RHS is set only once to a constant, set CONST_RHS > >>> to the constant. */ > >>> rtx def_src = find_single_def_src (REGNO (rhs)); > >>> if (def_src != nullptr && CONSTANT_P (def_src)) > >>> { > >>> df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs)); > >>> if (!(DF_REF_FLAGS (def) > >>> & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) > >> …this check to the function, since it's needed for correctness even > >> in the loop-iv.cc use. > > Fixed. > > > >> Thanks, > >> Richard > > Here is the v3 patch. OK for master? > > > > Thanks. > > > > H.J. > > --- > > RTL DSE tracks redundant constant stores within a basic block. When RTL > > loop invariant motion hoists a constant initialization out of the loop > > into a separate basic block, the constant store value becomes unknown > > within the original basic block. When recording store for RTL DSE, check > > if the source register is set only once to a constant by a non-partial > > unconditional load. If yes, record the constant as the constant store > > source. It eliminates unrolled zero stores after memset 0 in a loop > > where a vector register is used as the zero store source. > > > > Extract find_single_def_src from loop-iv.cc and move it to df-core.cc: > > > > 1. Rename to df_find_single_def_src. > > 2. Change the argument to rtx and use rtx_equal_p. > > 3. Return null for partial or conditional defs. > > > > gcc/ > > > > PR rtl-optimization/105638 > > * df-core.cc (df_find_single_def_sr): Moved and renamed from > > find_single_def_src in loop-iv.cc. Change the argument to rtx > > and use rtx_equal_p. Return null for partial or conditional > > defs. > > * df.h (df_find_single_def_src): New prototype. > > * dse.cc (record_store): Use the constant source if the source > > register is set only once. > > * loop-iv.cc (find_single_def_src): Moved to df-core.cc. > > (replace_single_def_regs): Replace find_single_def_src with > > df_find_single_def_src. > > > > gcc/testsuite/ > > > > PR rtl-optimization/105638 > > * g++.target/i386/pr105638.C: New test. > Avoiding REG_EQUAL and only handling REG_EQUIV notes would be better > here. REG_EQUIV indicates the destination could be replaced with the > source of the note at any point and semantically the code would still be > valid. REG_EQUAL doesn't give us that guarantee. > > To allow REG_EQUAL you have to check that the block with the note > dominates the use.
When a use only has one non-conditional def which doesn't dominate the use, isn't its behavior undefined? -- H.J.