Rebased to riscv-to-apply.next and submitted v19. Thank you WeiWei, Frank and Alistair for the reviews along the way.

Regards,

eop Chen

> Alistair Francis <alistai...@gmail.com> wrote on June 6, 2022, at 9:37 AM:
>
> On Fri, May 13, 2022 at 9:55 PM ~eopxd <eo...@git.sr.ht> wrote:
>>
>> According to v-spec, tail agnostic behavior can either keep the tail
>> elements undisturbed or set the elements' bits to all 1s. To
>> distinguish the difference of tail policies, QEMU should be able to
>> simulate the tail agnostic behavior as "set tail elements' bits to
>> all 1s". An option 'rvv_ta_all_1s' is added to enable the behavior;
>> it is disabled by default.
>>
>> There are multiple possibilities for agnostic elements according to
>> v-spec. The main intent of this patch-set is to add an option that
>> can distinguish between tail policies. Setting agnostic elements to
>> all 1s makes things simple and allows QEMU to express this.
>>
>> We may explore other possibilities of agnostic behavior by adding
>> other options in the future. Please understand that this patch-set
>> is limited.
>>
>> v2 updates:
>> - Addressed comments from Weiwei Li
>> - Added commit tail agnostic on load / store instructions (which
>>   I forgot to include into the patch-set)
>>
>> v3 updates:
>> - Missed the very 1st commit, adding it back
>>
>> v4 updates:
>> - Renamed vlmax to total_elems
>> - Deal with tail element when vl_eq_vlmax == true
>>
>> v5 updates:
>> - Let `vext_get_total_elems` take `desc` and `esz`
>> - Utilize `simd_maxsz(desc)` to get `vlenb`
>> - Fix alignments to code
>>
>> v6 updates:
>> - Fix `vext_get_total_elems`
>>
>> v7 updates:
>> - Reuse `max_elems` for vector load / store helper functions. The
>>   translation sets desc's `lmul` to `min(1, lmul)`, making
>>   `vext_max_elems` equivalent to `vext_get_total_elems`.
>>
>> v8 updates:
>> - Simplify `vext_set_elems_1s`, don't need `vext_set_elems_1s_fns`
>> - Fix `vext_get_total_elems`, it should derive upon EMUL instead
>>   of LMUL
>>
>> v9 updates:
>> - Let instructions that are tail agnostic regardless of vta respect
>>   the option and not the vta.
>>
>> v10 updates:
>> - Correct range to set element to 1s for load instructions
>>
>> v11 updates:
>> - Separate addition of option 'rvv_ta_all_1s' as a new (last) commit
>> - Add description to show intent of the option in first commit for the
>>   optional tail agnostic behavior
>> - Tag WeiWei as Reviewed-by for all commits
>> - Tag Alistair as Reviewed-by for commit 01, 02
>> - Tag Alistair as Acked-by for commit 03
>>
>> v12 updates:
>> - Add missing space in WeiWei's "Reviewed-by" tag
>>
>> v13 updates:
>> - Fix tail agnostic for vext_ldst_us. The function operates on input
>>   parameter 'evl' rather than 'env->vl'.
>> - Fix tail elements for vector segment load / store instructions.
>>   A vector segment load / store instruction may contain fractional
>>   lmul with nf * lmul > 1. The rest of the elements in the last
>>   register should be treated as tail elements.
>> - Fix tail agnostic length for instructions with mask destination
>>   register. Instructions with mask destination register should have
>>   'vlen - vl' tail elements.
>>
>> v14 updates:
>> - Pass lmul information into vector helper function.
>>   `vext_get_total_elems` needs it.
>>
>> v15 updates:
>> - Rebase to latest `master`
>> - Tag Alistair as Acked-by for commit 04 ~ 14
>> - Tag Alistair as Acked-by for commit 15
>>
>> v16 updates:
>> - Fix bug: when lmul < 0 and vl_eq_vlmax, the original version will
>>   overwrite `vd`, but the computation will overwrite it again,
>>   meaning the tail elements will not be set correctly. Now, we don't
>>   use TCG functions if we are trying to simulate all 1s for agnostic
>>   and use vector helpers instead.
>>
>> v17 updates:
>> - Add "Prune access_type parameter" commit to clean up vector load /
>>   store functions. Then add parameter `is_load` in vector helper
>>   functions to enable vta behavior in the commit for adding vta on
>>   vector load / store functions.
>>
>> v18 updates:
>> - Don't use `is_load` parameter in vector helper.
>>   Don't let vta pass through in `trans_rvv.inc`
>>
>> eopXD (16):
>>   target/riscv: rvv: Prune redundant ESZ, DSZ parameter passed
>>   target/riscv: rvv: Prune redundant access_type parameter passed
>>   target/riscv: rvv: Rename ambiguous esz
>>   target/riscv: rvv: Early exit when vstart >= vl
>>   target/riscv: rvv: Add tail agnostic for vv instructions
>>   target/riscv: rvv: Add tail agnostic for vector load / store
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vx, vvm, vxm instructions
>>   target/riscv: rvv: Add tail agnostic for vector integer shift
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vector integer comparison
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vector integer merge and move
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vector fix-point arithmetic
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vector floating-point
>>     instructions
>>   target/riscv: rvv: Add tail agnostic for vector reduction instructions
>>   target/riscv: rvv: Add tail agnostic for vector mask instructions
>>   target/riscv: rvv: Add tail agnostic for vector permutation
>>     instructions
>>   target/riscv: rvv: Add option 'rvv_ta_all_1s' to enable optional tail
>>     agnostic behavior
>
> Do you mind rebasing this on:
> https://github.com/alistair23/qemu/tree/riscv-to-apply.next
>
> Alistair
>
>>
>>  target/riscv/cpu.c                      |    1 +
>>  target/riscv/cpu.h                      |    2 +
>>  target/riscv/cpu_helper.c               |    2 +
>>  target/riscv/insn_trans/trans_rvv.c.inc |   94 +-
>>  target/riscv/internals.h                |    6 +-
>>  target/riscv/translate.c                |    4 +
>>  target/riscv/vector_helper.c            | 1587 ++++++++++++++---------
>>  7 files changed, 1053 insertions(+), 643 deletions(-)
>>
>> --
>> 2.34.2
>>
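
For readers following the thread, the "set tail elements' bits to all 1s" behavior described in the cover letter can be illustrated with a small standalone sketch. This is not the code from the series: the names `sketch_total_elems` and `sketch_set_tail_all_1s` and their parameters are hypothetical, and the sketch assumes that a fractional EMUL still occupies one whole register, so that the rest of that register counts as tail (as the v8 and v13 notes suggest).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch: number of elements in the destination register
 * group. vlenb is the register length in bytes, esz the element size in
 * bytes, emul_log2 the log2 of EMUL. Assumption: a fractional EMUL
 * (emul_log2 < 0) is rounded up to one full register.
 */
static uint32_t sketch_total_elems(uint32_t vlenb, uint32_t esz,
                                   int emul_log2)
{
    assert(esz != 0);
    uint32_t regs = emul_log2 > 0 ? (1u << emul_log2) : 1u;
    return vlenb * regs / esz;
}

/*
 * Hypothetical sketch: when the all-1s option is enabled, fill the bytes
 * of elements [vl, total_elems) with 0xFF. Elements below vl are left
 * untouched; nothing happens when the option is off or there is no tail.
 */
static void sketch_set_tail_all_1s(void *vd, int ta_all_1s, uint32_t vl,
                                   uint32_t esz, uint32_t total_elems)
{
    if (!ta_all_1s || vl >= total_elems) {
        return;
    }
    memset((uint8_t *)vd + (size_t)vl * esz, 0xFF,
           (size_t)(total_elems - vl) * esz);
}
```

With a 16-byte register, 4-byte elements and vl = 3, the sketch computes 4 total elements and fills only the last 4 bytes with 0xFF. Leaving the buffer unchanged when the option is off corresponds to the other behavior the spec permits, keeping tail elements undisturbed.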