On Wed, 9 Mar 2022, Jiufu Guo wrote:

> 
> Hi!
> 
> Richard Biener <rguent...@suse.de> writes:
> 
> > On Tue, 8 Mar 2022, Jiufu Guo wrote:
> >
> >> Jiufu Guo <guoji...@linux.ibm.com> writes:
> >> 
> >> Hi!
> >> 
> >> > Hi Sehger,
> >> >
> >> > Segher Boessenkool <seg...@kernel.crashing.org> writes:
> >> >
> >> >> On Tue, Mar 01, 2022 at 10:28:57PM +0800, Jiufu Guo wrote:
> >> >>> Segher Boessenkool <seg...@kernel.crashing.org> writes:
> >> >>> > No.  insn_cost is only for correct, existing instructions, not for
> >> >>> > made-up nonsense.  I created insn_cost precisely to get away from 
> >> >>> > that
> >> >>> > aspect of rtx_cost (and some other issues, like, it is incredibly 
> >> >>> > hard
> >> >>> > and cumbersome to write a correct rtx_cost).
> >> >>> 
> >> >>> Thanks! The implementations of hook insn_cost are align with this
> >> >>> design, they are  checking insn's attributes and COSTS_N_INSNS.
> >> >>> 
> >> >>> One question on the speciall case: 
> >> >>> For instruction: "r119:DI=0x100803004101001"
> >> >>> Would we treat it as valid instruction?
> >> >>
> >> >> Currently we do, alternative 6 in *movdi_internal64: we allow any r<-n.
> >> >> This is costed as 5 insns (cost=20).
> >> >>
> >> >> It generally is better to split things into patterns close to the
> >> >> eventual machine isntructions as early as possible: all the more generic
> >> >> optimisations can take advantage of that then.
> >> > Get it!
> >> >>
> >> >>> A patch, which is attached the end of this mail, accepts
> >> >>> "r119:DI=0x100803004101001" as input of insn_cost.
> >> >>> In this patch, 
> >> >>> - A tmp instruction is generated via make_insn_raw.
> >> >>> - A few calls to rtx_cost (in cse_insn) is replaced by insn_cost.
> >> >>> - In hook of insn_cost, checking the special 'constant' instruction.
> >> >>> Are these make sense?
> >> >>
> >> >> I'll review that patch inline.
> >> 
> >> I drafted a new patch that replace rtx_cost with insn_cost for cse.cc.
> >> Different from the previous partial patch, this patch replaces all usage
> >> of rtx_cost. It may be better/aggressive than previous one.
> >
> > I think there's no advantage for using insn_cost over rtx_cost for
> > the simple SET case.
> 
> Thanks for your comments and raise this concern.
> 
> For those targets which do not implement insn_cost, insn_cost calls
> rtx_cost through pattern_cost, then insn_cost is equal to rtx_cost.
> 
> While, for those targets which have insn_cost, it seems insn_cost would
> be better(or say more accurate/consistent?) than rtx_cost. Since:
> - insn_cost recog the insn first, and compute cost through something

target hooks are expected to call recog on the insn but the generic
fallback does not!?  Or do you say a target _could_ call recog?
I think it would be valid to only expect recognized insns here
and thus cse.cc obligation would be to call regoc on the "fake"
instruction which then raises the obvious issue whether you should
ever call recog on something "fake".

I also see that rs6000_insn_cost does

static int
rs6000_insn_cost (rtx_insn *insn, bool speed)
{
  if (recog_memoized (insn) < 0)
    return 0;

so not recognized insns become quite cheap - it looks like
insn_cost has no documented failure mode and initial implementors
chickened out asserting that an insn is / can be recognized.
In fact I'd assert that INSN_CODE (insn) >= 0 simply because there's
no failure mode and the _caller_ should have made sure the
insn can be recognized.  That said, the insn_cost should do that.
(is INSN_CODE () == 0 a real thing?)

> (like length/cost attributes from .md file) for the 'machine insn'.
> - rtx_cost estimates the cost through analyzing the 'rtx content'.
> The accurate estimation relates to the context.
> 
> For a special example: "%r100 = C", as a previous patch, by tunning
> target's rtx_cost hook, cost could be computed according to the value
> of C. insn_cost may just model the cost in the define of the machine
> instruction.
> 
> These reasons are my initial thoughts.  Segher may have better
> explain. :-) 

OK, so in theory the targets insn_cost can use insn attributes
which allows to keep the cost parts of an instruction close to the
instruction patterns which is probably a good thing for
maintainability.  But as you point out this requires recognizing
of possibly generated instructions.  And before reload it has
the issue that the alternative is not determined which means the
true cost cannot be determined without walking the pattern,
like on x86 where many instruction operands have both register
and memory alternatives.

> To replace rtx_cost with insn_cost, this patch build a SET instruction:
> "%r = rtx_expr", then using "%r = rtx_expr" from insn_cost to simulate
> the cost of "rtx_expr" from rtx_cost.
> 
> 
> BR,
> Jiufu
> 
> >
> > Richard.
> >
> >> With this patch, bootstrap pass.
> >> From regtest, only output of fusion-p10-ldcmpi.c is changed, and the
> >> change seems as expected.
> >> 
> >> 
> >> BR,
> >> Jiufu
> >> 
> >> diff --git a/gcc/cse.cc b/gcc/cse.cc
> >> index a18b599d324..e623ad298db 100644
> >> --- a/gcc/cse.cc
> >> +++ b/gcc/cse.cc
> >> @@ -262,6 +262,9 @@ static struct qty_table_elem *qty_table;
> >>  static rtx_insn *this_insn;
> >>  static bool optimize_this_for_speed_p;
> >>  
> >> +/* Used for insn_cost. */
> >> +static rtx_insn *estimate_insn;
> >> +
> >>  /* Index by register number, gives the number of the next (or
> >>     previous) register in the chain of registers sharing the same
> >>     value.
> >> @@ -445,7 +448,7 @@ struct table_elt
> >>  /* Compute cost of X, as stored in the `cost' field of a table_elt.  Fixed
> >>     hard registers and pointers into the frame are the cheapest with a cost
> >>     of 0.  Next come pseudos with a cost of one and other hard registers 
> >> with
> >> -   a cost of 2.  Aside from these special cases, call `rtx_cost'.  */
> >> +   a cost of 2.  Aside from these special cases, call `insn_cost'.  */
> >>  
> >>  #define CHEAP_REGNO(N)                                                    
> >> \
> >>    (REGNO_PTR_FRAME_P (N)                                          \
> >> @@ -698,18 +701,33 @@ preferable (int cost_a, int regcost_a, int cost_b, 
> >> int regcost_b)
> >>     from COST macro to keep it simple.  */
> >>  
> >>  static int
> >> -notreg_cost (rtx x, machine_mode mode, enum rtx_code outer, int opno)
> >> +notreg_cost (rtx x, machine_mode mode, enum rtx_code /*outer*/, int 
> >> /*opno*/)
> >>  {
> >>    scalar_int_mode int_mode, inner_mode;
> >> -  return ((GET_CODE (x) == SUBREG
> >> -     && REG_P (SUBREG_REG (x))
> >> -     && is_int_mode (mode, &int_mode)
> >> -     && is_int_mode (GET_MODE (SUBREG_REG (x)), &inner_mode)
> >> -     && GET_MODE_SIZE (int_mode) < GET_MODE_SIZE (inner_mode)
> >> -     && subreg_lowpart_p (x)
> >> -     && TRULY_NOOP_TRUNCATION_MODES_P (int_mode, inner_mode))
> >> -    ? 0
> >> -    : rtx_cost (x, mode, outer, opno, optimize_this_for_speed_p) * 2);
> >> +  if (GET_CODE (x) == SUBREG && REG_P (SUBREG_REG (x))
> >> +      && is_int_mode (mode, &int_mode)
> >> +      && is_int_mode (GET_MODE (SUBREG_REG (x)), &inner_mode)
> >> +      && GET_MODE_SIZE (int_mode) < GET_MODE_SIZE (inner_mode)
> >> +      && subreg_lowpart_p (x)
> >> +      && TRULY_NOOP_TRUNCATION_MODES_P (int_mode, inner_mode))
> >> +    return 0;
> >> +
> >> +  if (estimate_insn == NULL)
> >> +    {
> >> +      estimate_insn = make_insn_raw (
> >> +  gen_rtx_SET (gen_rtx_REG (mode, LAST_VIRTUAL_REGISTER + 1), x));
> >> +      SET_PREV_INSN (estimate_insn) = NULL_RTX;
> >> +      SET_NEXT_INSN (estimate_insn) = NULL_RTX;
> >> +      INSN_LOCATION (estimate_insn) = 0;
> >> +    }
> >> +  else
> >> +    {
> >> +      /* Update for new context.  */
> >> +      INSN_CODE (estimate_insn) = -1;
> >> +      PUT_MODE (SET_DEST (PATTERN (estimate_insn)), mode);
> >> +      SET_SRC (PATTERN (estimate_insn)) = x;
> >> +    }
> >> +  return insn_cost (estimate_insn, optimize_this_for_speed_p);
> >>  }
> >>  
> >>  
> >> @@ -6667,6 +6685,7 @@ cse_main (rtx_insn *f ATTRIBUTE_UNUSED, int nregs)
> >>  
> >>    init_recog ();
> >>    init_alias_analysis ();
> >> +  estimate_insn = NULL;
> >>  
> >>    reg_eqv_table = XNEWVEC (struct reg_eqv_elem, nregs);
> >>  
> >> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Reply via email to