On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote: > Hi Richard, > > Thanks for the reviews, I'm making these changes but just a heads up. > > When hardcoding LR_REGNUM like this we need to change the way we compare the > register in doloop_condition_get. This function currently compares the rtx > nodes by address, which I think happens to be fine before we assign hard > registers, as I suspect we always share the rtx node for the same pseudo, but > when assigning registers it seems like we create copies, so things like: > `XEXP (inc_src, 0) == reg` will fail for > inc_src: (plus (reg LR) (const_int -n)' > reg: (reg LR) > > Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, > op2, NULL)'.
Yes, that's fine. R. > > Sound good? > > Kind regards, > Andre > > ________________________________________ > From: Richard Earnshaw (lists) <richard.earns...@arm.com> > Sent: Tuesday, January 30, 2024 11:36 AM > To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov; Stam Markianos-Wright > Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low > Overhead Loops > > On 19/01/2024 14:40, Andre Vieira wrote: >> >> Respin after comments from Kyrill and rebase. I also removed an if-then-else >> construct in arm_mve_check_reg_origin_is_num_elems similar to the other >> functions >> Kyrill pointed out. >> >> After an earlier comment from Richard Sandiford I also added comments to the >> two tail predication patterns added to explain the need for the unspecs. > > [missing ChangeLog] > > I'm just going to focus on loop-doloop.c in this reply, I'll respond to the > other bits in a follow-up. > > 2) (set (reg) (plus (reg) (const_int -1)) > - (set (pc) (if_then_else (reg != 0) > - (label_ref (label)) > - (pc))). > + (set (pc) (if_then_else (reg != 0) > + (label_ref (label)) > + (pc))). > > Some targets (ARM) do the comparison before the branch, as in the > following form: > > - 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0))) > - (set (reg) (plus (reg) (const_int -1)))]) > - (set (pc) (if_then_else (cc == NE) > ... > > > This comment is becoming confusing. Really the text leading up to 3)... > should be inside 3. Something like: > > 3) Some targets (ARM) do the comparison before the branch, as in the > following form: > > (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0)) > (set (reg) (plus (reg) (const_int -1)))]) > (set (pc) (if_then_else (cc == NE) > (label_ref (label)) > (pc)))]) > > > The same issue on the comment structure also applies to the new point 4... > > + The ARM target also supports a special case of a counter that > decrements > + by `n` and terminating in a GTU condition. In that case, the compare > and > + branch are all part of one insn, containing an UNSPEC: > + > + 4) (parallel [ > + (set (pc) > + (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr) > + (const_int -n))]) > + (const_int n-1])) > + (label_ref) > + (pc))) > + (set (reg:SI 14 lr) > + (plus:SI (reg:SI 14 lr) > + (const_int -n))) > + */ > > I think this needs a bit more clarification. Specifically that this > construct supports a predicated vectorized do loop. Also, the placement of > the unspec inside the comparison is ugnly and unnecessary. It should be > sufficient to have the unspec inside a USE expression, which the mid-end can > then ignore entirely. So > > (parallel > [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n)) > (const_int n-1)) > (label_ref) (pc))) > (set (reg) (plus (reg) (const_int -n))) > (additional clobbers and uses)]) > > For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to > this pattern to stop anything else from matching it. > > Note that we don't need to mention that the register is 'LR' or the modes, > those are specific to a particular backend, not the generic pattern we want > to match. > > + || !CONST_INT_P (XEXP (inc_src, 1)) > + || INTVAL (XEXP (inc_src, 1)) >= 0) > return 0; > + int dec_num = abs (INTVAL (XEXP (inc_src, 1))); > > We can just use '-INTVAL(...)' here, we've verified just above that the > constant is negative. > > - if ((XEXP (condition, 0) == reg) > + /* For the ARM special case of having a GTU: re-form the condition without > + the unspec for the benefit of the middle-end. */ > + if (GET_CODE (condition) == GTU) > + { > + condition = gen_rtx_fmt_ee (GTU, VOIDmode, inc_src, > + GEN_INT (dec_num - 1)); > + return condition; > + } > > If you make the change I mentioned above, this re-forming isn't needed any > more, so the arm-specific comment goes away > > - { > + { > if (GET_CODE (pattern) != PARALLEL) > /* For the second form we expect: > > You've fixed the indentation of the brace (good), but the body of the braced > expression needs re-indenting as well. > > R. >