On 9/11/24 06:26, LIU Zhiwei wrote:
+static bool lmul_check(int lmul, MemOp vsew) +{ + /* + * For a given supported fractional LMUL setting, implementations must + * support SEW settings between SEW_MIN and LMUL * ELEN, inclusive. + * So if ELEN = 64, LMUL = 1/2, then SEW will support e8, e16, e32, + * but e64 may not be supported. + */ + if (lmul < 0) { + return (8 << vsew) <= (64 / (1 << (-lmul))); + } else { + return true; + } +}
While the spec uses language like "may not be supported", but it then goes on to use an example of VLEN=32 and LMUL=1/8 not being valid because that leaves only one 4 bit element.
In our case...
+ +static void set_vtype(TCGContext *s, TCGType type, MemOp vsew) +{ + unsigned vtype, insn, avl; + int lmul; + RISCVVlmul vlmul; + bool lmul_eq_avl; + + s->riscv_cur_type = type; + s->riscv_cur_vsew = vsew; + + /* Match riscv_lg2_vlenb to TCG_TYPE_V64. */ + QEMU_BUILD_BUG_ON(TCG_TYPE_V64 != 3); + + lmul = type - riscv_lg2_vlenb;
We know VLEN, and LMUL is bounded by TCG_TYPE_V64. Since SEW=64 will never be smaller than LMUL*VLEN, I expect the lmul_check function to be entirely unneeded: all SEW should always work.
If for some strange reason that is not the case, the correct solution not to *assume* that it might not work, as you are doing, but to *probe* for it at startup. For instance, it would be easy to loop over each SEW to find the minimal LMUL for which VSETVL returns a positive VL, i.e. VILL not set.
+ if (lmul < -3) { + /* Host VLEN >= 1024 bits. */ + vlmul = VLMUL_M1; + lmul_eq_avl = false; + } else if (lmul < 3) { + /* 1/8, 1/4, 1/2, 1, 2, 4 */ + if (lmul_check(lmul, vsew)) { + vlmul = lmul & 7; + } else { + vlmul = VLMUL_M1; + } + lmul_eq_avl = true;
lmul_eq_avl incorrectly set here for !lmul_check.
+ if (type >= riscv_lg2_vlenb) { + static const RISCVInsn whole_reg_ld[] = { + OPC_VL1RE64_V, OPC_VL2RE64_V, OPC_VL4RE64_V, OPC_VL8RE64_V + }; + unsigned idx = type - riscv_lg2_vlenb; + + tcg_debug_assert(idx < sizeof(whole_reg_ld)); + insn = whole_reg_ld[idx]; + } else { + static const RISCVInsn unit_stride_ld[] = { + OPC_VLE8_V, OPC_VLE16_V, OPC_VLE32_V, OPC_VLE64_V + }; + MemOp prev_vsew = set_vtype_len(s, type); + + tcg_debug_assert(prev_vsew < sizeof(unit_stride_ld));
Both sizeof are incorrect; you need ARRAY_SIZE(). Likewise in tcg_out_st.
static void tcg_out_tb_start(TCGContext *s) { + s->riscv_cur_type = TCG_TYPE_COUNT; /* nothing to do */ }
Remove the out-of-date comment. r~