On Fri, 25 Feb 2022, Jiufu Guo wrote: > Richard Biener <rguent...@suse.de> writes: > > > On Fri, 25 Feb 2022, Jiufu Guo wrote: > > > >> Richard Biener <rguent...@suse.de> writes: > >> > >> > On Thu, 24 Feb 2022, Jiufu Guo wrote: > >> > > >> >> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > >> >> > >> >> > Segher Boessenkool <seg...@kernel.crashing.org> writes: > >> >> > > >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote: > >> >> >>> I'm assuming we're always dealing with > >> >> >>> > >> >> >>> (set (reg:MODE ..) <src_folded>) > >> >> >>> > >> >> >>> here and CSE is not substituting into random places of an > >> >> >>> instruction(?). I don't know what 'rtx_cost' should evaluate > >> >> >>> to for a constant, if it should implicitely evaluate the cost > >> >> >>> of putting the result into a register for example. > >> >> >> > >> >> >> rtx_cost is no good here (and in most places). rtx_cost should be 0 > >> >> >> for anything that is used as input in a machine instruction -- but > >> >> >> you > >> >> >> need much more context to determine that. insn_cost is much simpler > >> >> >> and > >> >> >> much easier to use. > >> >> >> > >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using > >> >> >>> your proposed new target hook and comparing it with the original > >> >> >>> unfolded src (again with SET and 1). > >> >> >> > >> >> >> It is required to generate valid instructions no matter what, before > >> >> >> the pass has finished that is. On all more modern architectures it > >> >> >> is > >> >> >> futile to think you can usefully consider the cost of an RTL > >> >> >> expression > >> >> >> and derive a real-world cost of the generated code from that. > >> >> > > >> >> > Thanks Segher for pointing out these! Here is another reason that I > >> >> > did not use rtx_cost: in a few passes, there are codes to check the > >> >> > constants and store them in constant pool. I'm thinking to integerate > >> >> > those codes in a consistent way. > >> >> > >> >> Hi Segher, Richard! > >> >> > >> >> I'm thinking the way like: For a constant, > >> >> 1. if the constant could be used as an immediate for the > >> >> instruction, then retreated as an operand; > >> >> 2. otherwise if the constant can not be stored into a > >> >> constant pool, then handle through instructions; > >> >> 3. if it is faster to access constant from pool, then emit > >> >> constant as data(.rodata); > >> >> 4. otherwise, handle the constant by instructions. > >> >> > >> >> And to store the constant into a pool, besides force_const_mem, > >> >> create reference (toc) may be needed on some platforms. > >> >> > >> >> For this particular issue in CSE, there is already code that > >> >> tries to put constant into a pool (invoke force_const_mem). > >> >> While the code is too late. So, we may check the constant > >> >> earlier and store it into constant pool if profitable. > >> >> > >> >> And another thing as Segher pointed out, CSE is doing too > >> >> much work. It may be ok to separate the constant handling > >> >> logic from CSE. > >> > > >> > Not sure - CSE just is value numbering, I don't see that it does > >> > more than that. Yes, it might have developed "heuristics" over > >> > the years what to CSE and to what and where to substitute and > >> > where not. But in the end it does just value numbering. > >> > > >> >> > >> >> I update a new version patch as follow (did not seprate CSE): > >> > > >> > How is the new target hook better in any way compared to rtx_cost > >> > or insn_cost? It looks like a total hack. > >> > > >> > I suppose the actual way of materializing a constant is done > >> > behind GCCs back and not exposed anywhere? But instead you > >> > claim the constants are valid when they actually are not? > >> > Isn't the problem then that the rs6000 backend lies? > >> > >> Hi Richard, > >> > >> Thanks for your comments and sugguestions! > >> > >> Materializing a constant should be done behind GCC. > >> On rs6000, in expand pass, during emit_move, the constant is > >> checked and store into constant pool if necessary. > >> Some other platforms are doing a similar thing, e.g. > >> ix86_expand_vector_move, alpha_expand_mov,... > >> mips_legitimize_const_move. > >> > >> But, it does not as we expect, force_const_mem is also > >> exposed other places (not only ira/reload for stack reference). > >> > >> CSE is one place, for example, CSE first retrieve the constant > >> from insn's equal, but it also tries to put constant into > >> pool for some condition (the condition was introduced at > >> early age of cse.c, and it is rare to run into in trunk). > >> In some aspects, IMHO, this seems not a great work of CSE. > >> > >> And this is how the 'invalid(or say slow)' constant occurs. > >> e.g. before cse: > >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47] > >> REG_EQUAL 0x100803004101001 > >> after cse: > >> 7: r119:DI=0x100803004101001 > >> REG_EQUAL 0x100803004101001 > >> > >> As you pointed out: we can also avoid this transformation through > >> rtx_cost/insn_cost by estimating the COST more accurately for > >> "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not > >> be a real final instruction.) > > > > At which point does this become the final instruction? I suppose > > CSE belives this constant is legitimate and the insn is recognized > > just fine in CSE? (that's what I mean with "behind GCCs back") > > > It would become final instructions during split pass on rs6000, > other target, e.g. alpha, seem also do split it. > >> Is it necessary to refine this constant handling for CSE, or even > >> to eliminate the logic on constant extracting for an insn, beside > >> updating rtx_cost/insn_cost? > > > > So if CSE really just transforms > > > >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47] > >> REG_EQUAL 0x100803004101001 > > > > to > > > >> 7: r119:DI=0x100803004101001 > >> REG_EQUAL 0x100803004101001 > > > > then why can't rtx_cost (SET, 1) or insn_cost () be used to > > accurately tell it that the immediate is going to be a lot > > more expensive? > > > > That is, rtx_cost (CONST_INT-rtx, DI, SET, 1, ...) is accurate > > enough to be treated as an actual insn (it _might_ be part of > > a parallel I guess, but that's unlikely) and thus the backend > > should have a very good idea of the many instruction it needs > > for this. Likewise rtx_cost ([unspec[`*.LC0',%r2:DI] 47], DI, SET, 1, > > ...) > > should receive accurate cost that can be compared to other > > rtx_cost (..., DI, SET, 1, ...) > > > > And CSE doesn't even need to create fake insns here since IMHO > > rtx_cost is good enough to capture the full insn. Targets can > > choose to split out a set_cost from their rtx_cost/insn_cost > > hooks for this case for example. > > Hi Richard, > > Right, we can fix this issue by updating rtx_cost/insn_cost to > tell that this kind of constants is a lot of expansive. > > To update rtx_cost, we can use a trivial patch (shown as end of > this mail) to fix this particular issue. > To use insn_cost, additional work is replacing rtx_cost with > insn_cost for CSE, maybe more suitable for stage1. > > So, it would be too far to fix this by refactoring the logic of > constant handling. :-) > > Thanks for your comments! > > BR, > Jiufu > > > > > Richard. > > > >> Any sugguestions? > >> > >> > > >> > Btw, all of this is of course not appropriate for stage4 and changes > >> > to CSE need testing on more than one target. > >> I would do more evaluation, thanks! > >> > >> Jiufu > >> > >> > > >> > Richard. > >> > > >> >> Thanks for the comments and suggestions again! > >> >> > >> >> > >> >> BR, > >> >> Jiufu > >> >> > > Part of the experimental patch for rs6000, which could help > to mitigate inaccurate rtx cost on constant.
Yes, that sort of thing is exactly what I had in mind. I assume the corresponding insn_costs hook does the same? Richard. > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index e82a47f4c0e..e429ae2bcf0 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -21884,6 +21884,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > outer_code, > > case CONST_DOUBLE: > case CONST_WIDE_INT: > + /* Set const to a reg, it may needs a few insns. */ > + if (outer_code == SET) > + { > + *total = COSTS_N_INSNS (num_insns_constant (x, mode)); > + return true; > + } > + /* FALLTHRU */ > + > case CONST: > case HIGH: > case SYMBOL_REF: