Hi Robin,
Thanks for these nice comments!
> -  emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));
> +  /* Swap op0 and op1 since the order is opposite to pred_merge.  */
> +  rtx ops2[] = {d->target, d->op1, d->op0, mask};
> +  emit_vlmax_merge_insn (code_for_pred_merge (vmode),
> +			 riscv_vector::RVV_MERGE_OP, ops2);
>    return true;
>  }
> This seems a separate, general fix that just surfaced in the course of
> this patch? Would be nice to have this factored out but as we already have
> it, no need I guess.
Yes, it's because I changed @vcond_mask_<mode><vm> from define_expand to
define_insn_and_split. If I didn't change it, I would need to manually make
sure that d->target, d->op1 and d->op0 satisfy the predicates of
@vcond_mask (the vregs pass checks this, so mem operands must be forbidden).
If I use emit_vlmax_merge_insn directly, it uses expand_insn internally,
which automatically converts the operands for me so that they satisfy the
predicate conditions. This is one difference between gen_xxx and
expand_insn. And I think calling emit_vlmax_merge_insn to generate
pred_merge is the most appropriate and uniform way.
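Roughly, the difference between the two paths looks like this (a
GCC-internal sketch, not standalone-runnable; simplified from what
emit_vlmax_merge_insn does internally, operand order as in the
pred_merge comment above):

```
/* gen_* builds the pattern verbatim, so every operand must already
   match the insn's predicates:  */
emit_insn (gen_vcond_mask (vmode, vmode, d->target, d->op0, d->op1, mask));

/* expand_insn goes through expand_operand, which legitimizes each
   operand first (e.g. forces a MEM into a register) before emitting:  */
expand_operand ops[4];
create_output_operand (&ops[0], d->target, vmode);
create_input_operand (&ops[1], d->op1, vmode);
create_input_operand (&ops[2], d->op0, vmode);
create_input_operand (&ops[3], mask, GET_MODE (mask));
expand_insn (code_for_pred_merge (vmode), 4, ops);
```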
> +  if (is_dummy_mask)
> +    {
> +      /* Use TU, MASK ANY policy.  */
> +      if (needs_fp_rounding (code, mode))
> +	emit_nonvlmax_fp_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +      else
> +	emit_nonvlmax_tu_insn (icode, RVV_UNOP_TU, cond_ops, len);
> +    }
> We have quite a bit of code duplication across the expand_cond_len functions
> now (binop, ternop, unop). Not particular to your patch but I'd suggest to
> unify this later.
Indeed. Leave it to me and I'll send another patch later to reduce this
duplicated code.
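One possible shape for that unification (the name and exact signature below
are hypothetical, sketch only; the binop/ternop/unop variants differ mainly
in the operand count passed down):

```
/* Hypothetical shared helper for expand_cond_len_{unop,binop,ternop}.  */
static void
expand_cond_len_op (rtx_code code, machine_mode mode, insn_code icode,
		    int op_num, rtx *cond_ops, rtx len, bool is_dummy_mask)
{
  if (is_dummy_mask)
    {
      /* Use TU, MASK ANY policy.  */
      if (needs_fp_rounding (code, mode))
	emit_nonvlmax_fp_tu_insn (icode, op_num, cond_ops, len);
      else
	emit_nonvlmax_tu_insn (icode, op_num, cond_ops, len);
    }
  else
    {
      /* ... real-mask path, likewise shared ...  */
    }
}
```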
> +TEST_ALL (DEF_LOOP)
> +
> +/* NOTE: int abs operator is converted to vmslt + vneg.v */
> +/* { dg-final { scan-assembler-times {\tvneg\.v\tv[0-9]+,v[0-9]+,v0\.t} 12 { xfail { any-opts "--param riscv-autovec-lmul=m2" } } } } */
> Why does this fail with LMUL == 2 (also in the following tests)? A comment
> would be nice here.
This is because the iteration count of 5 in the testcase causes GCC to
remove the loop and turn it into two straight-line basic blocks, which
doubles the number of vnegs. I'm going to increase the iteration count
(it should be big enough that this doesn't happen even when LMUL=m8) so
that the optimization is no longer triggered.
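For reference, the loops these tests instantiate (via the TEST_ALL/DEF_LOOP
macros) have roughly this shape; the plain C++ rendition below is an
illustration, not the actual testcase:

```cpp
#include <cstdlib>

// Conditional abs: where pred[i] is true take |a[i]|, else keep b[i].
// With RVV autovectorization the abs becomes a masked compare (vmslt)
// plus a masked negate (vneg.v ...,v0.t), which is what the
// scan-assembler-times pattern counts.
void cond_abs (int *r, const int *a, const int *b, const int *pred, int n)
{
  for (int i = 0; i < n; ++i)
    r[i] = pred[i] ? std::abs (a[i]) : b[i];
}
```

With only 5 iterations, a large enough LMUL lets GCC drop the loop
entirely and emit the vectorized body twice, so the vneg.v count doubles
and the fixed count in the scan-assembler directive no longer matches.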
V2 patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628210.html
--
Best,
Lehua