Tamar Christina <tamar.christ...@arm.com> writes: >> -----Original Message----- >> From: Richard Sandiford <richard.sandif...@arm.com> >> Sent: Tuesday, November 15, 2022 11:34 AM >> To: Tamar Christina <tamar.christ...@arm.com> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft >> <marcus.shawcr...@arm.com> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> -----Original Message----- >> >> From: Richard Sandiford <richard.sandif...@arm.com> >> >> Sent: Tuesday, November 15, 2022 11:15 AM >> >> To: Tamar Christina <tamar.christ...@arm.com> >> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw >> >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft >> >> <marcus.shawcr...@arm.com> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. >> >> >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> >> -----Original Message----- >> >> >> From: Richard Sandiford <richard.sandif...@arm.com> >> >> >> Sent: Tuesday, November 15, 2022 10:51 AM >> >> >> To: Tamar Christina <tamar.christ...@arm.com> >> >> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw >> >> >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus >> Shawcroft >> >> >> <marcus.shawcr...@arm.com> >> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. >> >> >> >> >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> >> >> -----Original Message----- >> >> >> >> From: Richard Sandiford <richard.sandif...@arm.com> >> >> >> >> Sent: Tuesday, November 15, 2022 10:36 AM >> >> >> >> To: Tamar Christina <tamar.christ...@arm.com> >> >> >> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw >> >> >> >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus >> >> Shawcroft >> >> >> >> <marcus.shawcr...@arm.com> >> >> >> >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. >> >> >> >> >> >> >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> >> >> > Hello, >> >> >> >> > >> >> >> >> > Ping and updated patch. >> >> >> >> > >> >> >> >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no >> issues. >> >> >> >> > >> >> >> >> > Ok for master? >> >> >> >> > >> >> >> >> > Thanks, >> >> >> >> > Tamar >> >> >> >> > >> >> >> >> > gcc/ChangeLog: >> >> >> >> > >> >> >> >> > * config/aarch64/aarch64.md (*tb<optab><mode>1): >> >> >> >> > Rename >> >> to... >> >> >> >> > (*tb<optab><ALLI:mode><GPI:mode>1): ... this. >> >> >> >> > (tbranch<mode>4): New. >> >> >> >> > >> >> >> >> > gcc/testsuite/ChangeLog: >> >> >> >> > >> >> >> >> > * gcc.target/aarch64/tbz_1.c: New test. >> >> >> >> > >> >> >> >> > --- inline copy of patch --- >> >> >> >> > >> >> >> >> > diff --git a/gcc/config/aarch64/aarch64.md >> >> >> >> > b/gcc/config/aarch64/aarch64.md index >> >> >> >> > >> >> >> >> >> >> >> >> >> >> 2bc2684b82c35a44e0a2cea6e3aaf32d939f8cdf..d7684c93fba5b717d568e1a4fd >> >> >> >> 71 >> >> >> >> > 2bde55c7c72e 100644 >> >> >> >> > --- a/gcc/config/aarch64/aarch64.md >> >> >> >> > +++ b/gcc/config/aarch64/aarch64.md >> >> >> >> > @@ -943,12 +943,29 @@ (define_insn "*cb<optab><mode>1" >> >> >> >> > (const_int 1)))] >> >> >> >> > ) >> >> >> >> > >> >> >> >> > -(define_insn "*tb<optab><mode>1" >> >> >> >> > +(define_expand "tbranch<mode>4" >> >> >> >> > [(set (pc) (if_then_else >> >> >> >> > - (EQL (zero_extract:DI (match_operand:GPI 0 >> >> "register_operand" >> >> >> >> "r") >> >> >> >> > - (const_int 1) >> >> >> >> > - (match_operand 1 >> >> >> >> > - >> >> >> >> > "aarch64_simd_shift_imm_<mode>" "n")) >> >> >> >> > + (match_operator 0 "aarch64_comparison_operator" >> >> >> >> > + [(match_operand:ALLI 1 "register_operand") >> >> >> >> > + (match_operand:ALLI 2 >> >> >> >> "aarch64_simd_shift_imm_<ALLI:mode>")]) >> >> >> >> > + (label_ref (match_operand 3 "" "")) >> >> >> >> > + (pc)))] >> >> >> >> > + "optimize > 0" >> >> >> >> >> >> >> >> Why's the pattern conditional on optimize? Seems a valid >> >> >> >> choice at -O0 >> >> >> too. >> >> >> >> >> >> >> > >> >> >> > Hi, >> >> >> > >> >> >> > I had explained the reason why in the original patch, just >> >> >> > didn't repeat it in >> >> >> the ping: >> >> >> > >> >> >> > Instead of emitting the instruction directly I've chosen to >> >> >> > expand the pattern using a zero extract and generating the >> >> >> > existing pattern for comparisons for two >> >> >> > reasons: >> >> >> > >> >> >> > 1. Allows for CSE of the actual comparison. >> >> >> > 2. It looks like the code in expand makes the label as unused >> >> >> > and removed >> >> >> it >> >> >> > if it doesn't see a separate reference to it. >> >> >> > >> >> >> > Because of this expansion though I disable the pattern at -O0 >> >> >> > since we >> >> >> have no combine in that case so we'd end up with worse code. I >> >> >> did try emitting the pattern directly, but as mentioned in no#2 >> >> >> expand would then kill the label. >> >> >> > >> >> >> > Basically I emit the pattern directly, immediately during expand >> >> >> > the label is >> >> >> marked as dead for some weird reason. >> >> >> >> >> >> Isn't #2 a bug though? It seems like something we should fix >> >> >> rather than work around. >> >> > >> >> > Yes it's a bug ☹ ok if I'm going to fix that bug then do I need to >> >> > split the optabs still? Isn't the problem atm that I need the split? >> >> > If I'm emitting the instruction directly then the recog pattern for >> >> > it can just be (eq (vec_extract x 1) 0) which is the correct semantics? >> >> >> >> What rtx does the code that uses the optab pass for operand 0? >> > >> > It gets passed the full comparison: >> > >> > (eq (reg/v:SI 92 [ x ]) >> > (const_int 0 [0])) >> > >> > of which we only look at the operator. >> >> OK, that's what I thought. The problem is then the one I mentioned above. >> This rtx doesn't describe the operation that the optab is supposed to >> perform, so it can never be used in the instruction pattern. (This is >> different >> from something like cbranch, where operand 0 can be used directly if the >> target supports a very general compare-and-branch instruction.) > > So I was wrong before about which RTL it gets passed. Deep in the expansion > Code the rtl operation > > (eq (reg/v:SI 92 [ x ]) > (const_int 0 [0])) > > Gets broken up and passed piecewise. > > First thing it does it explicitly check that the first argument in RTL is an > operator: > > gcc_assert (insn_operand_matches (icode, 0, test)); > > and then the jump is emitted by breaking apart the rtl into it's operands: > > 4646 insn = emit_jump_insn (GEN_FCN (icode) (test, XEXP (test, 0), > 4647 XEXP (test, 1), label));
Yeah, but the question was what the code that generates the tbranch optab passes for operand 0 ("test" in the call above). And like you said, it's the EQ rtx above, with XEXPs 0 and 1 being passed as operands 1 and 2. I think the point still stands that that EQ rtx doesn't describe the correct operation. > And so the operands are: > >>>> p debug (operand0) > (reg/v:SI 92 [ xD.4391 ]) > >>>> p debug (operand1) > (const_int 0 [0]) > >>>> p debug (operand2) > (code_label 0 0 0 2 (nil) [0 uses]) > > And targets never get to see the equality check. But the .md pattern was: (define_expand "tbranch<mode>4" [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator" [(match_operand:ALLI 1 "register_operand") (match_operand:ALLI 2 "aarch64_simd_shift_imm_<ALLI:mode>")]) (label_ref (match_operand 3 "" "")) (pc)))] "optimize > 0" { rtx bitvalue = gen_reg_rtx (DImode); rtx tmp = simplify_gen_subreg (DImode, operands[1], GET_MODE (operands[1]), 0); emit_insn (gen_extzv (bitvalue, tmp, const1_rtx, operands[2])); operands[2] = const0_rtx; operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), bitvalue, operands[2]); }) where the EQ/NE rtx is passed and matched as operand 0. > If the documentation of the optab is > Updated to say that the target operand1 is to be used in a zero_extract with > operand0 > and compared with 0 then that should be fine no? that's the semantic of the > optab itself. > > Based on that I don't think we need to split this optab do we? Just update > the docs to > clarify the zero extract semantics? Well, the point of... >> If we want to use a single optab, the code that generates the optab should >> pass something like: >> >> (eq/ne (zero_extract op0 (const_int 1) op1) (const_int 0)) >> >> as operand 0, so that operand 0 specifies the real test condition. >> >> Thanks, >> Richard ...was that we should either (a) split the optab or (b) keep the single optab and pass a "proper" description of the operation as operand 0. Thanks, Richard