Big fat pullreq this time around, because it has all of RTH's SVE2 emulation patchset in it.

-- PMM

The following changes since commit 0dab1d36f55c3ed649bb8e4c74b9269ef3a63049:

  Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging (2021-05-24 15:48:08 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20210525

for you to fetch changes up to f8680aaa6e5bfc6022b75157c23db7d2ea98ab11:

  target/arm: Enable SVE2 and related extensions (2021-05-25 16:01:44 +0100)

----------------------------------------------------------------
target-arm queue:
 * Implement SVE2 emulation
 * Implement integer matrix multiply accumulate
 * Implement FEAT_TLBIOS
 * Implement FEAT_TLBRANGE
 * disas/libvixl: Protect C system header for C++ compiler
 * Use correct SP in M-profile exception return
 * AN524, AN547: Correct modelling of internal SRAMs
 * hw/intc/arm_gicv3_cpuif: Fix EOIR write access check logic
 * hw/arm/smmuv3: Another range invalidation fix

----------------------------------------------------------------
Eric Auger (1):
      hw/arm/smmuv3: Another range invalidation fix

Peter Maydell (8):
      hw/intc/arm_gicv3_cpuif: Fix EOIR write access check logic
      hw/arm/mps2-tz: Don't duplicate modelling of SRAM in AN524
      hw/arm/mps2-tz: Make SRAM_ADDR_WIDTH board-specific
      hw/arm/armsse.c: Correct modelling of SSE-300 internal SRAMs
      hw/arm/armsse: Convert armsse_realize() to use ERRP_GUARD
      hw/arm/mps2-tz: Allow board to specify a boot RAM size
      hw/arm: Model TCMs in the SSE-300, not the AN547
      target/arm: Use correct SP in M-profile exception return

Philippe Mathieu-Daudé (1):
      disas/libvixl: Protect C system header for C++ compiler

Rebecca Cran (3):
      target/arm: Add support for FEAT_TLBIRANGE
      target/arm: Add support for FEAT_TLBIOS
      target/arm: set ID_AA64ISAR0.TLB to 2 for max AARCH64 CPU type

Richard Henderson (84):
      accel/tcg: Replace g_new() + memcpy() by g_memdup()
      accel/tcg: Pass length argument to tlb_flush_range_locked()
      accel/tlb: Rename TLBFlushPageBitsByMMUIdxData -> TLBFlushRangeData
      accel/tcg: Remove {encode,decode}_pbm_to_runon
      accel/tcg: Add tlb_flush_range_by_mmuidx()
      accel/tcg: Add tlb_flush_range_by_mmuidx_all_cpus()
      accel/tlb: Add tlb_flush_range_by_mmuidx_all_cpus_synced()
      accel/tcg: Rename tlb_flush_page_bits -> range]_by_mmuidx_async_0
      accel/tlb: Rename tlb_flush_[page_bits > range]_by_mmuidx_async_[2 > 1]
      target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
      target/arm: Implement SVE2 Integer Multiply - Unpredicated
      target/arm: Implement SVE2 integer pairwise add and accumulate long
      target/arm: Implement SVE2 integer unary operations (predicated)
      target/arm: Split out saturating/rounding shifts from neon
      target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated)
      target/arm: Implement SVE2 integer halving add/subtract (predicated)
      target/arm: Implement SVE2 integer pairwise arithmetic
      target/arm: Implement SVE2 saturating add/subtract (predicated)
      target/arm: Implement SVE2 integer add/subtract long
      target/arm: Implement SVE2 integer add/subtract interleaved long
      target/arm: Implement SVE2 integer add/subtract wide
      target/arm: Implement SVE2 integer multiply long
      target/arm: Implement SVE2 PMULLB, PMULLT
      target/arm: Implement SVE2 bitwise shift left long
      target/arm: Implement SVE2 bitwise exclusive-or interleaved
      target/arm: Implement SVE2 bitwise permute
      target/arm: Implement SVE2 complex integer add
      target/arm: Implement SVE2 integer absolute difference and accumulate long
      target/arm: Implement SVE2 integer add/subtract long with carry
      target/arm: Implement SVE2 bitwise shift right and accumulate
      target/arm: Implement SVE2 bitwise shift and insert
      target/arm: Implement SVE2 integer absolute difference and accumulate
      target/arm: Implement SVE2 saturating extract narrow
      target/arm: Implement SVE2 SHRN, RSHRN
      target/arm: Implement SVE2 SQSHRUN, SQRSHRUN
      target/arm: Implement SVE2 UQSHRN, UQRSHRN
      target/arm: Implement SVE2 SQSHRN, SQRSHRN
      target/arm: Implement SVE2 WHILEGT, WHILEGE, WHILEHI, WHILEHS
      target/arm: Implement SVE2 WHILERW, WHILEWR
      target/arm: Implement SVE2 bitwise ternary operations
      target/arm: Implement SVE2 saturating multiply-add long
      target/arm: Implement SVE2 saturating multiply-add high
      target/arm: Implement SVE2 integer multiply-add long
      target/arm: Implement SVE2 complex integer multiply-add
      target/arm: Implement SVE2 XAR
      target/arm: Use correct output type for gvec_sdot_*_b
      target/arm: Pass separate addend to {U, S}DOT helpers
      target/arm: Pass separate addend to FCMLA helpers
      target/arm: Split out formats for 2 vectors + 1 index
      target/arm: Split out formats for 3 vectors + 1 index
      target/arm: Implement SVE2 integer multiply (indexed)
      target/arm: Implement SVE2 integer multiply-add (indexed)
      target/arm: Implement SVE2 saturating multiply-add high (indexed)
      target/arm: Implement SVE2 saturating multiply-add (indexed)
      target/arm: Implement SVE2 saturating multiply (indexed)
      target/arm: Implement SVE2 signed saturating doubling multiply high
      target/arm: Implement SVE2 saturating multiply high (indexed)
      target/arm: Implement SVE2 multiply-add long (indexed)
      target/arm: Implement SVE2 integer multiply long (indexed)
      target/arm: Implement SVE2 complex integer multiply-add (indexed)
      target/arm: Implement SVE2 complex integer dot product
      target/arm: Macroize helper_gvec_{s,u}dot_{b,h}
      target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}
      target/arm: Implement SVE mixed sign dot product (indexed)
      target/arm: Implement SVE mixed sign dot product
      target/arm: Implement SVE2 crypto unary operations
      target/arm: Implement SVE2 crypto destructive binary operations
      target/arm: Implement SVE2 crypto constructive binary operations
      target/arm: Implement SVE2 FCVTNT
      target/arm: Share table of sve load functions
      target/arm: Tidy do_ldrq
      target/arm: Implement SVE2 LD1RO
      target/arm: Implement 128-bit ZIP, UZP, TRN
      target/arm: Move endian adjustment macros to vec_internal.h
      target/arm: Implement aarch64 SUDOT, USDOT
      target/arm: Split out do_neon_ddda_fpst
      target/arm: Remove unused fpst from VDOT_scalar
      target/arm: Fix decode for VDOT (indexed)
      target/arm: Split out do_neon_ddda
      target/arm: Split decode of VSDOT and VUDOT
      target/arm: Implement aarch32 VSUDOT, VUSDOT
      target/arm: Implement integer matrix multiply accumulate
      linux-user/aarch64: Enable hwcap bits for sve2 and related extensions
      target/arm: Enable SVE2 and related extensions

Stephen Long (17):
      target/arm: Implement SVE2 floating-point pairwise
      target/arm: Implement SVE2 MATCH, NMATCH
      target/arm: Implement SVE2 ADDHNB, ADDHNT
      target/arm: Implement SVE2 RADDHNB, RADDHNT
      target/arm: Implement SVE2 SUBHNB, SUBHNT
      target/arm: Implement SVE2 RSUBHNB, RSUBHNT
      target/arm: Implement SVE2 HISTCNT, HISTSEG
      target/arm: Implement SVE2 scatter store insns
      target/arm: Implement SVE2 gather load insns
      target/arm: Implement SVE2 FMMLA
      target/arm: Implement SVE2 SPLICE, EXT
      target/arm: Implement SVE2 TBL, TBX
      target/arm: Implement SVE2 FCVTLT
      target/arm: Implement SVE2 FCVTXNT, FCVTX
      target/arm: Implement SVE2 FLOGB
      target/arm: Implement SVE2 bitwise shift immediate
      target/arm: Implement SVE2 fp multiply-add long

 disas/libvixl/vixl/code-buffer.h |    2 +-
 disas/libvixl/vixl/globals.h     |   16 +-
 disas/libvixl/vixl/invalset.h    |    2 +-
 disas/libvixl/vixl/platform.h    |    2 +
 disas/libvixl/vixl/utils.h       |    2 +-
 include/exec/exec-all.h          |   44 +
 include/hw/arm/armsse.h          |    2 +
 target/arm/cpu.h                 |   76 +
 target/arm/helper-sve.h          |  722 ++++++++-
 target/arm/helper.h              |  110 +-
 target/arm/translate-a64.h       |    3 +
 target/arm/vec_internal.h        |  167 ++
 target/arm/neon-shared.decode    |   24 +-
 target/arm/sve.decode            |  574 ++++++-
 accel/tcg/cputlb.c               |  231 ++-
 hw/arm/armsse.c                  |   35 +-
 hw/arm/mps2-tz.c                 |   39 +-
 hw/arm/smmuv3.c                  |   50 +-
 hw/intc/arm_gicv3_cpuif.c        |   48 +-
 linux-user/elfload.c             |   10 +
 target/arm/cpu.c                 |    2 +
 target/arm/cpu64.c               |   14 +
 target/arm/cpu_tcg.c             |    1 +
 target/arm/helper.c              |  327 +++-
 target/arm/kvm64.c               |   21 +-
 target/arm/m_helper.c            |    3 +-
 target/arm/neon_helper.c         |  507 +-----
 target/arm/sve_helper.c          | 2110 +++++++++++++++++++++++--
 target/arm/translate-a64.c       |  111 +-
 target/arm/translate-neon.c      |  231 +--
 target/arm/translate-sve.c       | 3200 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c          |  887 ++++++++---
 disas/libvixl/vixl/utils.cc      |    2 +-
 33 files changed, 8275 insertions(+), 1300 deletions(-)