> -----Original Message----- > From: Richard Sandiford <richard.sandif...@arm.com> > Sent: Tuesday, November 15, 2022 11:34 AM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft > <marcus.shawcr...@arm.com> > Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > > Tamar Christina <tamar.christ...@arm.com> writes: > >> -----Original Message----- > >> From: Richard Sandiford <richard.sandif...@arm.com> > >> Sent: Tuesday, November 15, 2022 11:15 AM > >> To: Tamar Christina <tamar.christ...@arm.com> > >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft > >> <marcus.shawcr...@arm.com> > >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > >> > >> Tamar Christina <tamar.christ...@arm.com> writes: > >> >> -----Original Message----- > >> >> From: Richard Sandiford <richard.sandif...@arm.com> > >> >> Sent: Tuesday, November 15, 2022 10:51 AM > >> >> To: Tamar Christina <tamar.christ...@arm.com> > >> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > >> >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus > Shawcroft > >> >> <marcus.shawcr...@arm.com> > >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > >> >> > >> >> Tamar Christina <tamar.christ...@arm.com> writes: > >> >> >> -----Original Message----- > >> >> >> From: Richard Sandiford <richard.sandif...@arm.com> > >> >> >> Sent: Tuesday, November 15, 2022 10:36 AM > >> >> >> To: Tamar Christina <tamar.christ...@arm.com> > >> >> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > >> >> >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus > >> Shawcroft > >> >> >> <marcus.shawcr...@arm.com> > >> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > >> >> >> > >> >> >> Tamar Christina <tamar.christ...@arm.com> writes: > >> >> >> > Hello, > >> >> >> > > >> >> >> > Ping and updated patch. > >> >> >> > > >> >> >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no > issues. > >> >> >> > > >> >> >> > Ok for master? > >> >> >> > > >> >> >> > Thanks, > >> >> >> > Tamar > >> >> >> > > >> >> >> > gcc/ChangeLog: > >> >> >> > > >> >> >> > * config/aarch64/aarch64.md (*tb<optab><mode>1): > >> >> >> > Rename > >> to... > >> >> >> > (*tb<optab><ALLI:mode><GPI:mode>1): ... this. > >> >> >> > (tbranch<mode>4): New. > >> >> >> > > >> >> >> > gcc/testsuite/ChangeLog: > >> >> >> > > >> >> >> > * gcc.target/aarch64/tbz_1.c: New test. > >> >> >> > > >> >> >> > --- inline copy of patch --- > >> >> >> > > >> >> >> > diff --git a/gcc/config/aarch64/aarch64.md > >> >> >> > b/gcc/config/aarch64/aarch64.md index > >> >> >> > > >> >> >> > >> >> > >> > 2bc2684b82c35a44e0a2cea6e3aaf32d939f8cdf..d7684c93fba5b717d568e1a4fd > >> >> >> 71 > >> >> >> > 2bde55c7c72e 100644 > >> >> >> > --- a/gcc/config/aarch64/aarch64.md > >> >> >> > +++ b/gcc/config/aarch64/aarch64.md > >> >> >> > @@ -943,12 +943,29 @@ (define_insn "*cb<optab><mode>1" > >> >> >> > (const_int 1)))] > >> >> >> > ) > >> >> >> > > >> >> >> > -(define_insn "*tb<optab><mode>1" > >> >> >> > +(define_expand "tbranch<mode>4" > >> >> >> > [(set (pc) (if_then_else > >> >> >> > - (EQL (zero_extract:DI (match_operand:GPI 0 > >> "register_operand" > >> >> >> "r") > >> >> >> > - (const_int 1) > >> >> >> > - (match_operand 1 > >> >> >> > - > >> >> >> > "aarch64_simd_shift_imm_<mode>" "n")) > >> >> >> > + (match_operator 0 "aarch64_comparison_operator" > >> >> >> > + [(match_operand:ALLI 1 "register_operand") > >> >> >> > + (match_operand:ALLI 2 > >> >> >> "aarch64_simd_shift_imm_<ALLI:mode>")]) > >> >> >> > + (label_ref (match_operand 3 "" "")) > >> >> >> > + (pc)))] > >> >> >> > + "optimize > 0" > >> >> >> > >> >> >> Why's the pattern conditional on optimize? Seems a valid > >> >> >> choice at -O0 > >> >> too. > >> >> >> > >> >> > > >> >> > Hi, > >> >> > > >> >> > I had explained the reason why in the original patch, just > >> >> > didn't repeat it in > >> >> the ping: > >> >> > > >> >> > Instead of emitting the instruction directly I've chosen to > >> >> > expand the pattern using a zero extract and generating the > >> >> > existing pattern for comparisons for two > >> >> > reasons: > >> >> > > >> >> > 1. Allows for CSE of the actual comparison. > >> >> > 2. It looks like the code in expand makes the label as unused > >> >> > and removed > >> >> it > >> >> > if it doesn't see a separate reference to it. > >> >> > > >> >> > Because of this expansion though I disable the pattern at -O0 > >> >> > since we > >> >> have no combine in that case so we'd end up with worse code. I > >> >> did try emitting the pattern directly, but as mentioned in no#2 > >> >> expand would then kill the label. > >> >> > > >> >> > Basically I emit the pattern directly, immediately during expand > >> >> > the label is > >> >> marked as dead for some weird reason. > >> >> > >> >> Isn't #2 a bug though? It seems like something we should fix > >> >> rather than work around. > >> > > >> > Yes it's a bug ☹ ok if I'm going to fix that bug then do I need to > >> > split the optabs still? Isn't the problem atm that I need the split? > >> > If I'm emitting the instruction directly then the recog pattern for > >> > it can just be (eq (vec_extract x 1) 0) which is the correct semantics? > >> > >> What rtx does the code that uses the optab pass for operand 0? > > > > It gets passed the full comparison: > > > > (eq (reg/v:SI 92 [ x ]) > > (const_int 0 [0])) > > > > of which we only look at the operator. > > OK, that's what I thought. The problem is then the one I mentioned above. > This rtx doesn't describe the operation that the optab is supposed to > perform, so it can never be used in the instruction pattern. (This is > different > from something like cbranch, where operand 0 can be used directly if the > target supports a very general compare-and-branch instruction.)
So I was wrong before about which RTL it gets passed. Deep in the expansion Code the rtl operation (eq (reg/v:SI 92 [ x ]) (const_int 0 [0])) Gets broken up and passed piecewise. First thing it does it explicitly check that the first argument in RTL is an operator: gcc_assert (insn_operand_matches (icode, 0, test)); and then the jump is emitted by breaking apart the rtl into it's operands: 4646 insn = emit_jump_insn (GEN_FCN (icode) (test, XEXP (test, 0), 4647 XEXP (test, 1), label)); And so the operands are: >>> p debug (operand0) (reg/v:SI 92 [ xD.4391 ]) >>> p debug (operand1) (const_int 0 [0]) >>> p debug (operand2) (code_label 0 0 0 2 (nil) [0 uses]) And targets never get to see the equality check. If the documentation of the optab is Updated to say that the target operand1 is to be used in a zero_extract with operand0 and compared with 0 then that should be fine no? that's the semantic of the optab itself. Based on that I don't think we need to split this optab do we? Just update the docs to clarify the zero extract semantics? Thanks, Tamar > > If we want to use a single optab, the code that generates the optab should > pass something like: > > (eq/ne (zero_extract op0 (const_int 1) op1) (const_int 0)) > > as operand 0, so that operand 0 specifies the real test condition. > > Thanks, > Richard