Re: [PATCH 2/2]AArch64 Support new tbranch optab.

Richard Sandiford via Gcc-patches Tue, 22 Nov 2022 06:00:43 -0800

Tamar Christina <[email protected]> writes:
>> -----Original Message-----
>> From: Richard Sandiford <[email protected]>
>> Sent: Tuesday, November 15, 2022 11:34 AM
>> To: Tamar Christina <[email protected]>
>> Cc: [email protected]; Richard Earnshaw
>> <[email protected]>; nd <[email protected]>; Marcus Shawcroft
>> <[email protected]>
>> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab.
>> 
>> Tamar Christina <[email protected]> writes:
>> >> -----Original Message-----
>> >> From: Richard Sandiford <[email protected]>
>> >> Sent: Tuesday, November 15, 2022 11:15 AM
>> >> To: Tamar Christina <[email protected]>
>> >> Cc: [email protected]; Richard Earnshaw
>> >> <[email protected]>; nd <[email protected]>; Marcus Shawcroft
>> >> <[email protected]>
>> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab.
>> >>
>> >> Tamar Christina <[email protected]> writes:
>> >> >> -----Original Message-----
>> >> >> From: Richard Sandiford <[email protected]>
>> >> >> Sent: Tuesday, November 15, 2022 10:51 AM
>> >> >> To: Tamar Christina <[email protected]>
>> >> >> Cc: [email protected]; Richard Earnshaw
>> >> >> <[email protected]>; nd <[email protected]>; Marcus
>> Shawcroft
>> >> >> <[email protected]>
>> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab.
>> >> >>
>> >> >> Tamar Christina <[email protected]> writes:
>> >> >> >> -----Original Message-----
>> >> >> >> From: Richard Sandiford <[email protected]>
>> >> >> >> Sent: Tuesday, November 15, 2022 10:36 AM
>> >> >> >> To: Tamar Christina <[email protected]>
>> >> >> >> Cc: [email protected]; Richard Earnshaw
>> >> >> >> <[email protected]>; nd <[email protected]>; Marcus
>> >> Shawcroft
>> >> >> >> <[email protected]>
>> >> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab.
>> >> >> >>
>> >> >> >> Tamar Christina <[email protected]> writes:
>> >> >> >> > Hello,
>> >> >> >> >
>> >> >> >> > Ping and updated patch.
>> >> >> >> >
>> >> >> >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no
>> issues.
>> >> >> >> >
>> >> >> >> > Ok for master?
>> >> >> >> >
>> >> >> >> > Thanks,
>> >> >> >> > Tamar
>> >> >> >> >
>> >> >> >> > gcc/ChangeLog:
>> >> >> >> >
>> >> >> >> >         * config/aarch64/aarch64.md (*tb<optab><mode>1):
>> >> >> >> > Rename
>> >> to...
>> >> >> >> >         (*tb<optab><ALLI:mode><GPI:mode>1): ... this.
>> >> >> >> >         (tbranch<mode>4): New.
>> >> >> >> >
>> >> >> >> > gcc/testsuite/ChangeLog:
>> >> >> >> >
>> >> >> >> >         * gcc.target/aarch64/tbz_1.c: New test.
>> >> >> >> >
>> >> >> >> > --- inline copy of patch ---
>> >> >> >> >
>> >> >> >> > diff --git a/gcc/config/aarch64/aarch64.md
>> >> >> >> > b/gcc/config/aarch64/aarch64.md index
>> >> >> >> >
>> >> >> >>
>> >> >>
>> >>
>> 2bc2684b82c35a44e0a2cea6e3aaf32d939f8cdf..d7684c93fba5b717d568e1a4fd
>> >> >> >> 71
>> >> >> >> > 2bde55c7c72e 100644
>> >> >> >> > --- a/gcc/config/aarch64/aarch64.md
>> >> >> >> > +++ b/gcc/config/aarch64/aarch64.md
>> >> >> >> > @@ -943,12 +943,29 @@ (define_insn "*cb<optab><mode>1"
>> >> >> >> >                       (const_int 1)))]
>> >> >> >> >  )
>> >> >> >> >
>> >> >> >> > -(define_insn "*tb<optab><mode>1"
>> >> >> >> > +(define_expand "tbranch<mode>4"
>> >> >> >> >    [(set (pc) (if_then_else
>> >> >> >> > -             (EQL (zero_extract:DI (match_operand:GPI 0
>> >> "register_operand"
>> >> >> >> "r")
>> >> >> >> > -                                   (const_int 1)
>> >> >> >> > -                                   (match_operand 1
>> >> >> >> > -                                     
>> >> >> >> > "aarch64_simd_shift_imm_<mode>" "n"))
>> >> >> >> > +               (match_operator 0 "aarch64_comparison_operator"
>> >> >> >> > +                [(match_operand:ALLI 1 "register_operand")
>> >> >> >> > +                 (match_operand:ALLI 2
>> >> >> >> "aarch64_simd_shift_imm_<ALLI:mode>")])
>> >> >> >> > +               (label_ref (match_operand 3 "" ""))
>> >> >> >> > +               (pc)))]
>> >> >> >> > +  "optimize > 0"
>> >> >> >>
>> >> >> >> Why's the pattern conditional on optimize?  Seems a valid
>> >> >> >> choice at -O0
>> >> >> too.
>> >> >> >>
>> >> >> >
>> >> >> > Hi,
>> >> >> >
>> >> >> > I had explained the reason why in the original patch, just
>> >> >> > didn't repeat it in
>> >> >> the ping:
>> >> >> >
>> >> >> > Instead of emitting the instruction directly I've chosen to
>> >> >> > expand the pattern using a zero extract and generating the
>> >> >> > existing pattern for comparisons for two
>> >> >> > reasons:
>> >> >> >
>> >> >> >   1. Allows for CSE of the actual comparison.
>> >> >> >   2. It looks like the code in expand makes the label as unused
>> >> >> > and removed
>> >> >> it
>> >> >> >      if it doesn't see a separate reference to it.
>> >> >> >
>> >> >> > Because of this expansion though I disable the pattern at -O0
>> >> >> > since we
>> >> >> have no combine in that case so we'd end up with worse code.  I
>> >> >> did try emitting the pattern directly, but as mentioned in no#2
>> >> >> expand would then kill the label.
>> >> >> >
>> >> >> > Basically I emit the pattern directly, immediately during expand
>> >> >> > the label is
>> >> >> marked as dead for some weird reason.
>> >> >>
>> >> >> Isn't #2 a bug though?  It seems like something we should fix
>> >> >> rather than work around.
>> >> >
>> >> > Yes it's a bug ☹ ok if I'm going to fix that bug then do I need to
>> >> > split the optabs still? Isn't the problem atm that I need the split?
>> >> > If I'm emitting the instruction directly then the recog pattern for
>> >> > it can just be (eq (vec_extract x 1) 0) which is the correct semantics?
>> >>
>> >> What rtx does the code that uses the optab pass for operand 0?
>> >
>> > It gets passed the full comparison:
>> >
>> > (eq (reg/v:SI 92 [ x ])
>> >     (const_int 0 [0]))
>> >
>> > of which we only look at the operator.
>> 
>> OK, that's what I thought.  The problem is then the one I mentioned above.
>> This rtx doesn't describe the operation that the optab is supposed to
>> perform, so it can never be used in the instruction pattern.  (This is 
>> different
>> from something like cbranch, where operand 0 can be used directly if the
>> target supports a very general compare-and-branch instruction.)
>
> So I was wrong before about which RTL it gets passed.  Deep in the expansion
> Code the rtl operation 
>
> (eq (reg/v:SI 92 [ x ])
>       (const_int 0 [0]))
>
> Gets broken up and passed piecewise.
>
> First thing it does it explicitly check that the first argument in RTL is an 
> operator:
>
> gcc_assert (insn_operand_matches (icode, 0, test));
>
> and then the jump is emitted by breaking apart the rtl into it's operands:
>
> 4646      insn = emit_jump_insn (GEN_FCN (icode) (test, XEXP (test, 0),
> 4647                          XEXP (test, 1), label));


Yeah, but the question was what the code that generates the tbranch
optab passes for operand 0 ("test" in the call above).  And like you
said, it's the EQ rtx above, with XEXPs 0 and 1 being passed as operands
1 and 2.  I think the point still stands that that EQ rtx doesn't
describe the correct operation.

> And so the operands are:
>
>>>> p debug (operand0)
> (reg/v:SI 92 [ xD.4391 ])
>
>>>> p debug (operand1)
> (const_int 0 [0])
>
>>>> p debug (operand2)
> (code_label 0 0 0 2 (nil) [0 uses])
>
> And targets never get to see the equality check.

But the .md pattern was:

(define_expand "tbranch<mode>4"
  [(set (pc) (if_then_else
                (match_operator 0 "aarch64_comparison_operator"
                 [(match_operand:ALLI 1 "register_operand")
                  (match_operand:ALLI 2 "aarch64_simd_shift_imm_<ALLI:mode>")])
                (label_ref (match_operand 3 "" ""))
                (pc)))]
  "optimize > 0"
{
  rtx bitvalue = gen_reg_rtx (DImode);
  rtx tmp = simplify_gen_subreg (DImode, operands[1], GET_MODE (operands[1]), 
0);
  emit_insn (gen_extzv (bitvalue, tmp, const1_rtx, operands[2]));
  operands[2] = const0_rtx;
  operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), bitvalue,
                                         operands[2]);
})

where the EQ/NE rtx is passed and matched as operand 0.

> If the documentation of the optab is
> Updated to say that the target operand1 is to be used in a zero_extract with 
> operand0
> and compared with 0 then that should be fine no?  that's the semantic of the 
> optab itself.
>
> Based on that I don't think we need to split this optab do we?  Just update 
> the docs to
> clarify the zero extract semantics?

Well, the point of...

>> If we want to use a single optab, the code that generates the optab should
>> pass something like:
>> 
>>   (eq/ne (zero_extract op0 (const_int 1) op1) (const_int 0))
>> 
>> as operand 0, so that operand 0 specifies the real test condition.
>> 
>> Thanks,
>> Richard

...was that we should either (a) split the optab or (b) keep the single
optab and pass a "proper" description of the operation as operand 0.

Thanks,
Richard

Re: [PATCH 2/2]AArch64 Support new tbranch optab.

Reply via email to