On 2020/6/18 12:25, Richard Henderson wrote:
I know this patch set is too big, and that there are parts that
can be split out that are preparatory rather than specifically sve2.
It's also not 100% tested. I have done some amount of testing
vs ArmIE, but because of bugs and missing features therein, that
testing has been somewhat limited. I understand a new version
of FVP has just been released containing SVE2 support, but I have
not yet tried that.
However, I believe this finally contains all of the instructions
in sve2 and its optional extensions. Excluding BFloat16, since
that extension is supposed to implement AdvSIMD at the same time.
Hi Richard,
I tried to merge this patch set into the master branch. As some MTE instructions
have been merged since this patch set was posted,
it can no longer be merged as-is.
Would you mind rebasing it onto the master branch and sending the patch set again?
Best Regards,
Zhiwei
r~
Richard Henderson (81):
tcg: Save/restore vecop_list around minmax fallback
qemu/int128: Add int128_lshift
target/arm: Split out gen_gvec_fn_zz
target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn
target/arm: Rearrange {sve,fp}_check_access assert
target/arm: Merge do_vector2_p into do_mov_p
target/arm: Clean up 4-operand predicate expansion
target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp
target/arm: Split out gen_gvec_ool_zzzp
target/arm: Merge helper_sve_clr_* and helper_sve_movz_*
target/arm: Split out gen_gvec_ool_zzp
target/arm: Split out gen_gvec_ool_zzz
target/arm: Split out gen_gvec_ool_zz
target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
target/arm: Enable SVE2 and some extensions
target/arm: Implement SVE2 Integer Multiply - Unpredicated
target/arm: Implement SVE2 integer pairwise add and accumulate long
target/arm: Implement SVE2 integer unary operations (predicated)
target/arm: Split out saturating/rounding shifts from neon
target/arm: Implement SVE2 saturating/rounding bitwise shift left
(predicated)
target/arm: Implement SVE2 integer halving add/subtract (predicated)
target/arm: Implement SVE2 integer pairwise arithmetic
target/arm: Implement SVE2 saturating add/subtract (predicated)
target/arm: Implement SVE2 integer add/subtract long
target/arm: Implement SVE2 integer add/subtract interleaved long
target/arm: Implement SVE2 integer add/subtract wide
target/arm: Implement SVE2 integer multiply long
target/arm: Implement PMULLB and PMULLT
target/arm: Tidy SVE tszimm shift formats
target/arm: Implement SVE2 bitwise shift left long
target/arm: Implement SVE2 bitwise exclusive-or interleaved
target/arm: Implement SVE2 bitwise permute
target/arm: Implement SVE2 complex integer add
target/arm: Implement SVE2 integer absolute difference and accumulate
long
target/arm: Implement SVE2 integer add/subtract long with carry
target/arm: Implement SVE2 bitwise shift right and accumulate
target/arm: Implement SVE2 bitwise shift and insert
target/arm: Implement SVE2 integer absolute difference and accumulate
target/arm: Implement SVE2 saturating extract narrow
target/arm: Implement SVE2 SHRN, RSHRN
target/arm: Implement SVE2 SQSHRUN, SQRSHRUN
target/arm: Implement SVE2 UQSHRN, UQRSHRN
target/arm: Implement SVE2 SQSHRN, SQRSHRN
target/arm: Implement SVE2 WHILEGT, WHILEGE, WHILEHI, WHILEHS
target/arm: Implement SVE2 WHILERW, WHILEWR
target/arm: Implement SVE2 bitwise ternary operations
target/arm: Implement SVE2 saturating multiply-add long
target/arm: Generalize inl_qrdmlah_* helper functions
target/arm: Implement SVE2 saturating multiply-add high
target/arm: Implement SVE2 integer multiply-add long
target/arm: Implement SVE2 complex integer multiply-add
target/arm: Implement SVE2 XAR
target/arm: Fix sve_uzp_p vs odd vector lengths
target/arm: Fix sve_zip_p vs odd vector lengths
target/arm: Fix sve_punpk_p vs odd vector lengths
target/arm: Pass separate addend to {U,S}DOT helpers
target/arm: Pass separate addend to FCMLA helpers
target/arm: Split out formats for 2 vectors + 1 index
target/arm: Split out formats for 3 vectors + 1 index
target/arm: Implement SVE2 integer multiply (indexed)
target/arm: Use helper_gvec_mul_idx_* for aa64 advsimd
target/arm: Implement SVE2 integer multiply-add (indexed)
target/arm: Use helper_gvec_ml{a,s}_idx_* for aa64 advsimd
target/arm: Implement SVE2 saturating multiply-add high (indexed)
target/arm: Implement SVE2 saturating multiply-add (indexed)
target/arm: Implement SVE2 integer multiply long (indexed)
target/arm: Implement SVE2 saturating multiply (indexed)
target/arm: Implement SVE2 signed saturating doubling multiply high
target/arm: Use helper_neon_sq{,r}dmul_* for aa64 advsimd
target/arm: Implement SVE2 saturating multiply high (indexed)
target/arm: Implement SVE2 multiply-add long (indexed)
target/arm: Implement SVE2 complex integer multiply-add (indexed)
target/arm: Implement SVE mixed sign dot product (indexed)
target/arm: Implement SVE mixed sign dot product
target/arm: Implement SVE2 crypto unary operations
target/arm: Implement SVE2 crypto destructive binary operations
target/arm: Implement SVE2 crypto constructive binary operations
tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
target/arm: Share table of sve load functions
target/arm: Implement SVE2 LD1RO
target/arm: Implement 128-bit ZIP, UZP, TRN
Stephen Long (19):
target/arm: Implement SVE2 floating-point pairwise
target/arm: Implement SVE2 MATCH, NMATCH
target/arm: Implement SVE2 ADDHNB, ADDHNT
target/arm: Implement SVE2 RADDHNB, RADDHNT
target/arm: Implement SVE2 SUBHNB, SUBHNT
target/arm: Implement SVE2 RSUBHNB, RSUBHNT
target/arm: Implement SVE2 HISTCNT, HISTSEG
target/arm: Implement SVE2 scatter store insns
target/arm: Implement SVE2 gather load insns
target/arm: Implement SVE2 FMMLA
target/arm: Implement SVE2 SPLICE, EXT
target/arm: Implement SVE2 TBL, TBX
target/arm: Implement SVE2 FCVTNT
target/arm: Implement SVE2 FCVTLT
target/arm: Implement SVE2 FCVTXNT, FCVTX
softfloat: Add float16_is_normal
target/arm: Implement SVE2 FLOGB
target/arm: Implement SVE2 bitwise shift immediate
target/arm: Implement SVE2 fp multiply-add long
include/fpu/softfloat.h | 5 +
include/qemu/int128.h | 16 +
target/arm/cpu.h | 56 +
target/arm/helper-sve.h | 717 ++++++-
target/arm/helper.h | 132 +-
target/arm/translate-a64.h | 3 +
target/arm/translate.h | 1 +
target/arm/vec_internal.h | 143 ++
target/arm/sve.decode | 594 +++++-
target/arm/cpu64.c | 11 +
target/arm/helper.c | 3 +-
target/arm/kvm64.c | 11 +
target/arm/neon_helper.c | 507 +----
target/arm/sve_helper.c | 2123 +++++++++++++++++--
target/arm/translate-a64.c | 177 +-
target/arm/translate-neon.inc.c | 20 +-
target/arm/translate-sve.c | 3364 ++++++++++++++++++++++++++++---
target/arm/vec_helper.c | 942 +++++++--
tcg/tcg-op-gvec.c | 52 +-
tcg/tcg-op-vec.c | 2 +
20 files changed, 7698 insertions(+), 1181 deletions(-)