Re: [PATCH 6/8] gcn: Add else operand to masked loads.

Palmer Dabbelt Thu, 05 Sep 2024 16:59:55 -0700

On Thu, 05 Sep 2024 14:57:06 PDT (-0700), a...@baylibre.com wrote:

On Thu, 5 Sept 2024, 21:10 Robin Dapp, <rdapp....@gmail.com> wrote:

> > +(define_predicate "maskload_else_operand"
> > +  (and (match_code "const_int,const_vector")
> > +       (match_test "op == CONST0_RTX (GET_MODE (op))")))
>
> This forces maskload and mask_gather_load to only accept zero here, but
> in fact the hardware would allow us to accept any value (including
> undefined).
>
> Here's the expand code:
>
>    /* Masked lanes are required to hold zero.  */
>    emit_move_insn (operands[0], gcn_vec_constant (<MODE>mode, 0));
>
>    emit_insn (gen_gather<mode>_expr_exec (operands[0], addr, as, v,
>                                           operands[0], exec));

Where does this requirement come from?  Did you face problems when not
setting the masked lanes to zero?  I also realized the original problem
with a gather load, but, as Richi noticed in the meanwhile, only happens
with padding types.


There were absolutely problems without this. It's a while ago now, so I'm
struggling with the details, but as GCC only applies the mask to selected
operations there were all sorts of issues that crept in. Zeroing the
undefined lanes seemed to match the middle end assumptions (or, at least it
made the UB consistent?) or maybe I read it in the code somewhere. Sorry,
it's years since I wrote that.

Sorry to just randomly chime in here, but I bet Robin's asleep and Ithink you guys might just be talking past each other (and for somereason this one's poking my phone, so I kept seeing the messages).

[Also something's a bit broken with the thread in general, it looks likesome bits didn't make inbox.sourceware.org. They're all in my gmail, sohopefully they're possible to dig up on your end. Here's the besttop-level I could find:https://inbox.sourceware.org/gcc-patches/d3ddv8vm4ar5.usibt8rv5...@gmail.com/]

We ran into this unwritten "masked elements are zeroed" requirement inRISC-V as well, which resulted in a handful of bugs. The initialproposed workaround was to jam a bunch of zeroing into the backend, butthat's both a bit inefficient and appears to have not fully worked dueto some quirks of how the RISC-V vector extension configuration works<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115336>.

So Robin now has this patch set to adujst the middle end to be explicitabout what values it assumes for the unmasked elements, which we're thentaking advantage of on RISC-V by not inserting the pre-zeroing.

Generally the idea for the patch is to not require unconditional zeroing

everything but also enable undefined (like riscv) and leave the decision
to the vectorizer.  So it would only zero if needed.



This sounds like a generally good plan. Better than just zero it and hope
that's right anyway. ;)

So, in theory, is it better if amdgcn allows both? Or is that one little
move immediate instruction in the backend going to produce better/cleaner
middle end code?

IIUC the best code would come frome backends matching what the HW doesand relying on the middle end to insert explicit ops to set the unmaskedvalues where necessary. Unless the hardware has some faster way toinsert the zeros on masked elements (which IIUC is how x86 and aarch64work).

I think Robin was just trying to adjust the backends to have nofunctional changes while still respecting the new on-mask elementpattern flavors.


Robin would know better, though.


Andrew

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

Reply via email to