On Fri, 25 Feb 2022, Jiufu Guo wrote:

> Richard Biener <rguent...@suse.de> writes:
> 
> > On Fri, 25 Feb 2022, Jiufu Guo wrote:
> >
> >> Richard Biener <rguent...@suse.de> writes:
> >> 
> >> > On Thu, 24 Feb 2022, Jiufu Guo wrote:
> >> >
> >> >> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >> >> 
> >> >> > Segher Boessenkool <seg...@kernel.crashing.org> writes:
> >> >> >
> >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
> >> >> >>> I'm assuming we're always dealing with
> >> >> >>> 
> >> >> >>>   (set (reg:MODE ..) <src_folded>)
> >> >> >>> 
> >> >> >>> here and CSE is not substituting into random places of an
> >> >> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
> >> >> >>> to for a constant, if it should implicitely evaluate the cost
> >> >> >>> of putting the result into a register for example.
> >> >> >>
> >> >> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
> >> >> >> for anything that is used as input in a machine instruction -- but 
> >> >> >> you
> >> >> >> need much more context to determine that.  insn_cost is much simpler 
> >> >> >> and
> >> >> >> much easier to use.
> >> >> >>
> >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using
> >> >> >>> your proposed new target hook and comparing it with the original
> >> >> >>> unfolded src (again with SET and 1).
> >> >> >>
> >> >> >> It is required to generate valid instructions no matter what, before
> >> >> >> the pass has finished that is.  On all more modern architectures it 
> >> >> >> is
> >> >> >> futile to think you can usefully consider the cost of an RTL 
> >> >> >> expression
> >> >> >> and derive a real-world cost of the generated code from that.
> >> >> >
> >> >> > Thanks Segher for pointing out these!  Here is  another reason that I
> >> >> > did not use rtx_cost: in a few passes, there are codes to check the
> >> >> > constants and store them in constant pool.  I'm thinking to integerate
> >> >> > those codes in a consistent way.
> >> >> 
> >> >> Hi Segher, Richard!
> >> >> 
> >> >> I'm thinking the way like: For a constant,
> >> >> 1. if the constant could be used as an immediate for the
> >> >> instruction, then retreated as an operand;
> >> >> 2. otherwise if the constant can not be stored into a
> >> >> constant pool, then handle through instructions;
> >> >> 3. if it is faster to access constant from pool, then emit
> >> >> constant as data(.rodata);
> >> >> 4. otherwise, handle the constant by instructions.
> >> >> 
> >> >> And to store the constant into a pool, besides force_const_mem,
> >> >> create reference (toc) may be needed on some platforms.
> >> >> 
> >> >> For this particular issue in CSE, there is already code that
> >> >> tries to put constant into a pool (invoke force_const_mem).
> >> >> While the code is too late.  So, we may check the constant
> >> >> earlier and store it into constant pool if profitable.
> >> >> 
> >> >> And another thing as Segher pointed out, CSE is doing too
> >> >> much work.  It may be ok to separate the constant handling
> >> >> logic from CSE.
> >> >
> >> > Not sure - CSE just is value numbering, I don't see that it does
> >> > more than that.  Yes, it might have developed "heuristics" over
> >> > the years what to CSE and to what and where to substitute and
> >> > where not.  But in the end it does just value numbering.
> >> >
> >> >> 
> >> >> I update a new version patch as follow (did not seprate CSE):
> >> >
> >> > How is the new target hook better in any way compared to rtx_cost
> >> > or insn_cost?  It looks like a total hack.
> >> >
> >> > I suppose the actual way of materializing a constant is done
> >> > behind GCCs back and not exposed anywhere?  But instead you
> >> > claim the constants are valid when they actually are not?
> >> > Isn't the problem then that the rs6000 backend lies?
> >> 
> >> Hi Richard,
> >> 
> >> Thanks for your comments and sugguestions!
> >> 
> >> Materializing a constant should be done behind GCC.
> >> On rs6000, in expand pass, during emit_move, the constant is
> >> checked and store into constant pool if necessary.
> >> Some other platforms are doing a similar thing, e.g.
> >> ix86_expand_vector_move, alpha_expand_mov,...
> >> mips_legitimize_const_move.
> >> 
> >> But, it does not as we expect, force_const_mem is also 
> >> exposed other places (not only ira/reload for stack reference).
> >> 
> >> CSE is one place, for example, CSE first retrieve the constant
> >> from insn's equal, but it also tries to put constant into
> >> pool for some condition (the condition was introduced at
> >> early age of cse.c, and it is rare to run into in trunk).
> >> In some aspects, IMHO, this seems not a great work of CSE.
> >> 
> >> And this is how the 'invalid(or say slow)' constant occurs.
> >> e.g.  before cse:
> >>     7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
> >>       REG_EQUAL 0x100803004101001
> >> after cse: 
> >>     7: r119:DI=0x100803004101001
> >>       REG_EQUAL 0x100803004101001
> >> 
> >> As you pointed out: we can also avoid this transformation through
> >> rtx_cost/insn_cost by estimating the COST more accurately for
> >>  "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not
> >> be a real final instruction.)
> >
> > At which point does this become the final instruction?  I suppose
> > CSE belives this constant is legitimate and the insn is recognized
> > just fine in CSE?  (that's what I mean with "behind GCCs back")
> >
> It would become final instructions during split pass on rs6000,
> other target, e.g. alpha, seem also do split it.
> >> Is it necessary to refine this constant handling for CSE, or even
> >> to eliminate the logic on constant extracting for an insn, beside
> >> updating rtx_cost/insn_cost?
> >
> > So if CSE really just transforms
> >
> >>     7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
> >>       REG_EQUAL 0x100803004101001
> >
> > to
> >
> >>     7: r119:DI=0x100803004101001
> >>       REG_EQUAL 0x100803004101001
> >
> > then why can't rtx_cost (SET, 1) or insn_cost () be used to
> > accurately tell it that the immediate is going to be a lot
> > more expensive?
> >
> > That is, rtx_cost (CONST_INT-rtx, DI, SET, 1, ...) is accurate
> > enough to be treated as an actual insn (it _might_ be part of
> > a parallel I guess, but that's unlikely) and thus the backend
> > should have a very good idea of the many instruction it needs
> > for this.  Likewise rtx_cost ([unspec[`*.LC0',%r2:DI] 47], DI, SET, 1, 
> > ...)
> > should receive accurate cost that can be compared to other
> > rtx_cost (..., DI, SET, 1, ...)
> >
> > And CSE doesn't even need to create fake insns here since IMHO
> > rtx_cost is good enough to capture the full insn.  Targets can
> > choose to split out a set_cost from their rtx_cost/insn_cost
> > hooks for this case for example.
> 
> Hi Richard,
> 
> Right, we can fix this issue by updating rtx_cost/insn_cost to
> tell that this kind of constants is a lot of expansive.
> 
> To update rtx_cost, we can use a trivial patch (shown as end of
> this mail) to fix this particular issue.
> To use insn_cost, additional work is replacing rtx_cost with
> insn_cost for CSE, maybe more suitable for stage1.
> 
> So, it would be too far to fix this by refactoring the logic of
> constant handling. :-)
> 
> Thanks for your comments!
> 
> BR,
> Jiufu
> 
> >
> > Richard.
> >
> >> Any sugguestions?
> >> 
> >> >
> >> > Btw, all of this is of course not appropriate for stage4 and changes
> >> > to CSE need testing on more than one target.
> >> I would do more evaluation, thanks!
> >> 
> >> Jiufu
> >> 
> >> >
> >> > Richard.
> >> >
> >> >> Thanks for the comments and suggestions again!
> >> >> 
> >> >> 
> >> >> BR,
> >> >> Jiufu
> >> >> 
> 
> Part of the experimental patch for rs6000, which could help
> to mitigate inaccurate rtx cost on constant.

Yes, that sort of thing is exactly what I had in mind.  I assume
the corresponding insn_costs hook does the same?

Richard.

> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index e82a47f4c0e..e429ae2bcf0 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -21884,6 +21884,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
> outer_code,
>  
>      case CONST_DOUBLE:
>      case CONST_WIDE_INT:
> +      /* Set const to a reg, it may needs a few insns.  */
> +      if (outer_code == SET)
> +     {
> +       *total = COSTS_N_INSNS (num_insns_constant (x, mode));
> +       return true;
> +     }
> +      /* FALLTHRU */
> +
>      case CONST:
>      case HIGH:
>      case SYMBOL_REF:

Reply via email to