RE: A problem with one instruction multiple latencies and pipelines

Qian, Jianhua Sun, 13 Sep 2020 22:42:10 -0700

Hi Richard and Segher

I don't know if I exactly understood your discussion.
If I misunderstood, please let me know.


I am trying to test these two cases.
Case 1. keep the TYPE attribute unchanged, add new attributes
    It works well as below.
    (define_attr "shift_imm_value" 
"shift_imm_none,shift_imm_0,shift_imm_1to4,shift_imm_ge5" (const_string 
"shift_imm_none"))

    (define_insn "*add_<shift>_<mode>"
      [(set (match_operand:GPI 0 "register_operand" "=r")
            (plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
                                  (match_operand:QI 2 
"aarch64_shift_imm_<mode>" "n"))
                      (match_operand:GPI 3 "register_operand" "r")))]
      ""
      "add\\t%<w>0, %<w>3, %<w>1, <shift> %2"
      [(set_attr "type" "alu_shift_imm")
       (set (attr "shift_imm_value")
            (cond [(eq (symbol_ref "XINT(operands[2],0)") (const_int 0)) 
(const_string "shift_imm_0")
                   (and (ge (symbol_ref "XINT(operands[2],0)") (const_int 1)) 
(le (symbol_ref "XINT(operands[2],0)") (const_int 4))) (const_string 
"shift_imm_1to4")]
                   (const_string "shift_imm_ge5")))]
    )

    (define_insn_reservation "a64fx_alu_shift1to4" 2
      (and (eq_attr "tune" "a64fx")
           (eq_attr "type" "alu_shift_imm")
           (eq_attr "shift_imm_value" "shift_imm_1to4"))
      "(a64fx_exa,a64fx_exa)|(a64fx_exb,a64fx_exb)")


Case2. expand the TYPE attribute
    I replaced the alu_shift_imm to alu_shift_imm, alu_shift_imm1to4, and I got 
the error message " bad number of entries in SET_ATTR_ALTERNATIVE, was 2 
expected 1"
    (define_attr "type"
      ...
      alu_shift_imm,\
      alu_shift_imm1to4,\
      ...

    (define_insn "*add_<shift>_<mode>"
      [(set (match_operand:GPI 0 "register_operand" "=r")
            (plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
                                  (match_operand:QI 2 
"aarch64_shift_imm_<mode>" "n"))
                      (match_operand:GPI 3 "register_operand" "r")))]
      ""
      "add\\t%<w>0, %<w>3, %<w>1, <shift> %2"
      [(set_attr "type" "alu_shift_imm,alu_shift_imm1to4")]
    )

    It means that one "type" value should be matched by one operand constraint 
pattern. So this will raise two problems.
      1. One instruction should be classified to multiple patterns
         - define multiple operand constraints, such as "(match_operand:QI 2 
"aarch64_shift_imm_<mode>" "0, 1to4, ...")"
         - or, use "cond" at set_attr just like Case1.
      2. Whatever you do with the first problem, the type attribute cannot be 
set with "alu_shift_imm"
        and "alu_shift_imm1to4" at the same time by one operand constraint 
pattern.
            - If we cannot resolve it, the existing CPUs' descriptions need to 
be changed. This is not what I expected.
            - If we want to add new attribute to resolve this problem, why not 
use the Case1 directly?

> It is very much not what I am saying.  I *am* saying that if people trying to
> improve one CPU's modelling have to edit over 20 models for CPUs that they do
> not care about, mistakes happen. 

That's what I'm worried about.

Regards
Qian

> -----Original Message-----
> From: Segher Boessenkool <seg...@kernel.crashing.org>
> Sent: Friday, September 11, 2020 9:59 PM
> To: Qian, Jianhua/钱 建华 <qia...@cn.fujitsu.com>; gcc@gcc.gnu.org;
> richard.sandif...@arm.com
> Subject: Re: A problem with one instruction multiple latencies and pipelines
> 
> Hi!
> 
> On Fri, Sep 11, 2020 at 08:44:54AM +0100, Richard Sandiford wrote:
> > Segher Boessenkool <seg...@kernel.crashing.org> writes:
> > > Consider cores that do not care about the "subtype" at all: when
> > > using just "type", all cores have to test for "foo,foo_subtype",
> > > while with a separate attribute they can just test for "foo".
> >
> > Right.  But that was exactly the case I was considering with the sed
> > comment above :-)
> >
> > I don't see adding a new attribute value to a set_attr as extra
> > complication.  It's just adding a value to a set.  To me extra
> > complication is adding (ior …)s, (and …)s, etc.
> 
> Combinatorial explosion is real.
> 
> (ior)s etc. are easier to read, and importantly it is much more obvious if 
> any case
> is missed with them.
> 
> Here is the very first diff in d839f53b7dfd, where I started this for
> rs6000:
> 
>  (define_insn_reservation "ppc403-load" 2
> -  (and (eq_attr "type"
> "load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,\
> -                       load_l,store_c,sync")
> +  (and (eq_attr "type" "load,load_l,store_c,sync")
>         (eq_attr "cpu" "ppc403,ppc405"))
>    "iu_40x")
> 
> We started off with 77 insn types.  Now we have 63 (it was 57 before, we added
> some more).
> 
> > > Yeas, exactly.  And for rs6000 we *did* have many more types before,
> > > and very frequently one was missed (in a scheduling description usually).
> > > Especially common was when some new type attribute was added, not
> > > all existing cpu descriptions were updated correctly.  Maybe this is
> > > the strongest argument "for" actually :-)
> >
> > The update shouldn't be done by hand.
> 
> No, it very much had to, because the existing code missed many cases and was
> inconsistent in others.
> 
> > What I'm really wary of with taking the line above is that it feeds
> > into the attitude that existing scheduling descriptions should become
> > fossilised at the point they're added: noone working on a different
> > core should change the declarations in the file later in case they
> > miss something.  I don't think that's what you're saying, but it could
> > feed into that general attitude.
> 
> It is very much not what I am saying.  I *am* saying that if people trying to
> improve one CPU's modelling have to edit over 20 models for CPUs that they do
> not care about, mistakes happen.  Such churn is conceptually wrong as well,
> of course (the new types are just the same as the old types on most older
> CPUs!)
> 
> > > Yes.
> > >
> > > One example of what we do:
> > >
> > > ;; Is this instruction using operands[2] as shift amount, and can
> > > that be a ;; register?
> > > ;; This is used for shift insns.
> > > (define_attr "maybe_var_shift" "no,yes" (const_string "no"))
> > >
> > > ;; Is this instruction using a shift amount from a register?
> > > ;; This is used for shift insns.
> > > (define_attr "var_shift" "no,yes"
> > >   (if_then_else (and (eq_attr "type" "shift")
> > >                      (eq_attr "maybe_var_shift" "yes"))
> > >                 (if_then_else (match_operand 2 "gpc_reg_operand")
> > >                               (const_string "yes")
> > >                               (const_string "no"))
> > >                 (const_string "no")))
> > >
> > > define_insns only use maybe_var_shift.  Only the power6 and e*500*
> > > cpu descriptions ever look at var_shift.  (Directly.  There is some
> > > other scheduling code that looks at it too -- and all but the power6
> > > ones seem to be incorrect!  That is all a direct translation of
> > > "type-only" code there was before...)
> >
> > Right.  This is similar to the:
> >
> >       (define_attr "shift_imm_type" "none,alu_shift_op2")
> >
> > thing I suggested upthread.  I think it's better for the define_insn
> > attribute to specify the operand number directly, since there might
> > well be cases where the shift isn't operand 2.
> 
> Not in this case; all such insns always have it as op 2 in the machine
> instruction, and RTL almost always uses the same order.
> 
> > Anyway, I think we're in violent agreement on how this works, we're
> > just taking different conclusions from it.
> 
> Yeah :-)  And not even so different: I think you agree it is a trade-off, you 
> just
> see it tilt more to one side, while I see it go to the other side often.
> 
> We still have over fifty instruction types.  We (hopefully :-) ) use the 
> "extra
> attribute" method only where that helps more than it hurts.  It is just 
> another
> tool in the toolbox.
> 
> 
> Segher
>

RE: A problem with one instruction multiple latencies and pipelines

Reply via email to