> > +(define_predicate "maskload_else_operand"
> > +  (and (match_code "const_int,const_vector")
> > +       (match_test "op == CONST0_RTX (GET_MODE (op))")))
>
> This forces maskload and mask_gather_load to only accept zero here, but 
> in fact the hardware would allow us to accept any value (including 
> undefined).
>
> Here's the expand code:
>
>    /* Masked lanes are required to hold zero.  */
>    emit_move_insn (operands[0], gcn_vec_constant (<MODE>mode, 0));
>
>    emit_insn (gen_gather<mode>_expr_exec (operands[0], addr, as, v,
>                                           operands[0], exec));

Where does this requirement come from?  Did you face problems when not
setting the masked lanes to zero?  I also realized the original problem
with a gather load, but, as Richi noticed in the meanwhile, only happens
with padding types.

Generally the idea for the patch is to not require unconditional zeroing
everything but also enable undefined (like riscv) and leave the decision
to the vectorizer.  So it would only zero if needed.

-- 
Regards
 Robin

Reply via email to