On 8/30/24 16:16, LIU Zhiwei wrote:
From: TANG Tiancheng <tangtiancheng....@alibaba-inc.com>

1. Address immediate value constraints in the RISC-V Vector Extension 1.0
for comparison instructions.

2. Extend comparison results from mask registers to SEW-width elements,
   following the recommendations in the RISC-V Instruction Set Manual,
   Volume I (Version 20240411).

This aligns with TCG's cmp_vec behavior by expanding compare results to
full element width: all 1s for true, all 0s for false.
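
(For context, a minimal scalar model of this widening; purely
illustrative, not code from the patch:)

static inline uint64_t widen_mask_bit(bool set, unsigned elem_bits)
{
    /* all 1s for true, all 0s for false, at the element's width */
    return set ? ~0ULL >> (64 - elem_bits) : 0;
}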

Signed-off-by: TANG Tiancheng <tangtiancheng....@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_...@linux.alibaba.com>
---
  tcg/riscv/tcg-target-con-set.h |   6 +-
  tcg/riscv/tcg-target.c.inc     | 240 +++++++++++++++++++++++++++++++++
  tcg/riscv/tcg-target.opc.h     |   5 +
  3 files changed, 250 insertions(+), 1 deletion(-)

diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index 7277cb9af8..6c9ad5188b 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -21,7 +21,11 @@ C_O1_I2(r, rZ, rZ)
  C_N1_I2(r, r, rM)
  C_O1_I4(r, r, rI, rM, rM)
  C_O2_I4(r, r, rZ, rZ, rM, rM)
+C_O0_I1(v)
+C_O0_I2(v, v)
  C_O0_I2(v, r)
-C_O0_I2(v, vK)

This removes vK, which was only just added in the previous patch.

+static bool expand_vec_cmp_vi(TCGType type, unsigned vece,
+                              TCGv_vec v1, TCGArg a2, TCGCond cond)
+{
+    int64_t arg2 = arg_temp(a2)->val;
+    bool invert = false;
+
+    if (!tcg_vec_cmp_can_do_vi(cond, arg2)) {
...
+static bool expand_vec_cmp_noinv(TCGType type, unsigned vece, TCGv_vec v1,
+                                 TCGArg a2, TCGCond cond)
+{
+    bool invert = false;
+    TCGTemp *t1 = arg_temp(a2);
+
+    if (t1->kind == TEMP_CONST) {
+        invert = expand_vec_cmp_vi(type, vece, v1, a2, cond);

This will not work as you intend, primarily because vector constants are stored in expanded form: e.g. an MO_8 constant 1 is stored as 0x0101010101010101.

This is handled transparently *if* you use tcg_target_const_match instead.
Otherwise one must (sign-)extract the low vece bits and then verify that replicating them reproduces the complete 'a2' value.
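
Something along these lines would do (a sketch only; check_replicated_const
is a hypothetical name, while sextract64 and dup_const are the existing
helpers):

static bool check_replicated_const(unsigned vece, int64_t val, int64_t *elem)
{
    /* Sign-extract the low vece bits of 'val'... */
    int64_t e = sextract64(val, 0, 8 << vece);

    /* ...then confirm that replicating them regenerates the full
       expanded constant. */
    if (dup_const(vece, e) != (uint64_t)val) {
        return false;
    }
    *elem = e;
    return true;
}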

I agree that we should be prepared for more vector x scalar operations, but that needs to happen during generic expansion rather than very late in the backend.

I think the first implementation should be simpler:

CONST('C', TCG_CT_CONST_CMP_VI)

static bool tcg_target_const_match(int64_t val, int ct, TCGType type,
                                   TCGCond cond, int vece)
{
    ...
    if ((ct & TCG_CT_CONST_CMP_VI) &&
        val >= tcg_cmpcond_to_rvv_vi[cond].min &&
        val <= tcg_cmpcond_to_rvv_vi[cond].max) {
        return true;
    }
}
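
The table it indexes might look something like this (a sketch; the opcode
names and ranges here are my assumptions, e.g. there is no vmslt.vi, so LT
is encoded as vmsle.vi with the immediate reduced by one):

/* 'adjust' is subtracted from the immediate at emission time,
   e.g. a < imm  ==>  a <= imm - 1. */
static const struct {
    RISCVInsn op;
    int min, max;
    int adjust;
} tcg_cmpcond_to_rvv_vi[] = {
    [TCG_COND_EQ] = { OPC_VMSEQ_VI, -16, 15, 0 },
    [TCG_COND_NE] = { OPC_VMSNE_VI, -16, 15, 0 },
    [TCG_COND_LE] = { OPC_VMSLE_VI, -16, 15, 0 },
    [TCG_COND_GT] = { OPC_VMSGT_VI, -16, 15, 0 },
    [TCG_COND_LT] = { OPC_VMSLE_VI, -15, 16, 1 },
    [TCG_COND_GE] = { OPC_VMSGT_VI, -15, 16, 1 },
    /* LTU/LEU/GTU/GEU are analogous, using the unsigned opcodes. */
};

Then, in tcg_out_vec_op: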

    case INDEX_op_cmp_vec:
        riscv_set_vec_config_vl_vece(s, type, vece);
        cond = args[3];
        if (c2) {
            tcg_out_opc_vi(s, tcg_cmpcond_to_rvv_vi[cond].op, a0, a1,
                           a2 - tcg_cmpcond_to_rvv_vi[cond].adjust);
        } else if (tcg_cmpcond_to_rvv_vv[cond].swap) {
            tcg_out_opc_vv(s, tcg_cmpcond_to_rvv_vv[cond].op, a0, a2, a1);
        } else {
            tcg_out_opc_vv(s, tcg_cmpcond_to_rvv_vv[cond].op, a0, a1, a2);
        }
        break;
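
A vv-form table along these lines completes the picture (again a sketch;
RVV has no direct vv encodings for GT/GE, hence the operand swap):

static const struct {
    RISCVInsn op;
    bool swap;
} tcg_cmpcond_to_rvv_vv[] = {
    [TCG_COND_EQ] = { OPC_VMSEQ_VV, false },
    [TCG_COND_NE] = { OPC_VMSNE_VV, false },
    [TCG_COND_LT] = { OPC_VMSLT_VV, false },
    [TCG_COND_LE] = { OPC_VMSLE_VV, false },
    [TCG_COND_GT] = { OPC_VMSLT_VV, true  },  /* a > b   ==>  b < a   */
    [TCG_COND_GE] = { OPC_VMSLE_VV, true  },  /* a >= b  ==>  b <= a  */
    /* unsigned variants use OPC_VMSLTU_VV / OPC_VMSLEU_VV likewise. */
};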

This appears not to require any expansion in tcg_expand_vec_op at all.


r~
