[PATCH v8 5/5] powerpc:selftest update memcmp_64 selftest for VMX implementation

2018-06-06 Thread wei . guo . simon
From: Simon Guo This patch reworked selftest memcmp_64 so that memcmp selftest can cover more test cases. It adds testcases for: - memcmp over 4K bytes size. - s1/s2 with different/random offset on 16 bytes boundary. - enter/exit_vmx_ops pairness. Signed-off-by: Simon Guo --- .../selftests/po

[PATCH v8 4/5] powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp()

2018-06-06 Thread wei . guo . simon
From: Simon Guo This patch is based on the previous VMX patch on memcmp(). To optimize ppc64 memcmp() with VMX instruction, we need to think about the VMX penalty brought with: If kernel uses VMX instruction, it needs to save/restore current thread's VMX registers. There are 32 x 128 bits VMX re
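The idea named in the subject line can be sketched in plain C. This is only an illustration under stated assumptions: `memcmp_precheck_sketch`, `vmx_compare_stub` and the two constants are hypothetical names, the stub simply defers to `memcmp()` rather than using VMX, and the 4K threshold is taken from patch 3/5 of this series. The real implementation is in powerpc assembly.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_VMX_THRESHOLD 4096 /* patch 3/5: VMX path for sizes >= 4K */
#define SKETCH_PRECHECK_BYTES 32  /* size of the scalar pre-check */

/* Hypothetical stand-in for the VMX compare loop; it just calls memcmp(). */
static int vmx_compare_stub(const void *s1, const void *s2, size_t n)
{
    return memcmp(s1, s2, n);
}

int memcmp_precheck_sketch(const void *s1, const void *s2, size_t n)
{
    int r;

    /* Small compares never take the (assumed expensive) VMX path. */
    if (n < SKETCH_VMX_THRESHOLD)
        return memcmp(s1, s2, n);

    /* Cheap scalar check of the first 32 bytes. If the buffers already
     * differ here, return without paying the VMX save/restore cost. */
    r = memcmp(s1, s2, SKETCH_PRECHECK_BYTES);
    if (r)
        return r;

    /* Only now hand the remainder to the vector path. */
    return vmx_compare_stub((const char *)s1 + SKETCH_PRECHECK_BYTES,
                            (const char *)s2 + SKETCH_PRECHECK_BYTES,
                            n - SKETCH_PRECHECK_BYTES);
}
```

The point of the ordering is that buffers which mismatch early never enter the vector path, so the register save/restore penalty is only paid when a long compare is likely.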

[PATCH v8 3/5] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

2018-06-06 Thread wei . guo . simon
From: Simon Guo This patch adds VMX primitives to do memcmp() when the compare size is equal to or greater than 4K bytes. The KSM feature can benefit from this. Test result with the following test program (replace the "^>" with ""): -- ># cat tools/testing/selftests/powerpc/stringloops/memcmp.c >#incl

[PATCH v8 2/5] powerpc: add vcmpequd/vcmpequb ppc instruction macro

2018-06-06 Thread wei . guo . simon
From: Simon Guo Some old tool chains don't know about instructions like vcmpequd. This patch adds .long macro for vcmpequd and vcmpequb, which is a preparation to optimize ppc64 memcmp with VMX instructions. Signed-off-by: Simon Guo --- arch/powerpc/include/asm/ppc-opcode.h | 11 +++

[PATCH v8 1/5] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()

2018-06-06 Thread wei . guo . simon
From: Simon Guo Currently memcmp() 64bytes version in powerpc will fall back to .Lshort (compare per byte mode) if either src or dst address is not 8 bytes aligned. It can be optimized in 2 situations: 1) if both addresses have the same offset from an 8-byte boundary: memcmp() can compare the

[PATCH v8 0/5] powerpc/64: memcmp() optimization

2018-06-06 Thread wei . guo . simon
From: Simon Guo There is some room to optimize memcmp() in the powerpc 64-bit version for the following 2 cases: (1) Even if src/dst addresses are not aligned to 8 bytes at the beginning, memcmp() can align them and use .Llong comparison mode without falling back to .Lshort comparison mode do compare b
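Case (1) can be illustrated with a small C sketch. It assumes the simplest situation, where both pointers share the same offset within an 8-byte boundary; `memcmp_align_sketch` and `cmp_bytes` are hypothetical names, and the byte loop and word loop only stand in for the .Lshort and .Llong assembly paths.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte-by-byte compare, standing in for .Lshort. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

int memcmp_align_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    /* Same offset within an 8-byte boundary: the low 3 address bits match. */
    if ((((uintptr_t)a ^ (uintptr_t)b) & 7) == 0) {
        /* Consume the unaligned head so both pointers become 8-byte aligned. */
        size_t head = (8 - ((uintptr_t)a & 7)) & 7;
        int r;

        if (head > n)
            head = n;
        r = cmp_bytes(a, b, head);
        if (r)
            return r;
        a += head;
        b += head;
        n -= head;

        /* Compare 8 bytes at a time, standing in for .Llong mode. */
        while (n >= 8) {
            uint64_t x, y;

            memcpy(&x, a, 8);
            memcpy(&y, b, 8);
            if (x != y)
                break; /* the byte loop below locates the differing byte */
            a += 8;
            b += 8;
            n -= 8;
        }
    }

    /* Tail bytes, a mismatching word, or pointers with different offsets. */
    return cmp_bytes(a, b, n);
}
```

Leaving the final ordering decision to the byte loop keeps the sketch independent of endianness, at the cost of re-reading at most 8 bytes.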

[PATCH v7 5/5] powerpc:selftest update memcmp_64 selftest for VMX implementation

2018-05-30 Thread wei . guo . simon
From: Simon Guo This patch reworked selftest memcmp_64 so that memcmp selftest can cover more test cases. It adds testcases for: - memcmp over 4K bytes size. - s1/s2 with different/random offset on 16 bytes boundary. - enter/exit_vmx_ops pairness. Signed-off-by: Simon Guo --- .../selftests/po

[PATCH v7 4/5] powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp()

2018-05-30 Thread wei . guo . simon
From: Simon Guo This patch is based on the previous VMX patch on memcmp(). To optimize ppc64 memcmp() with VMX instruction, we need to think about the VMX penalty brought with: If kernel uses VMX instruction, it needs to save/restore current thread's VMX registers. There are 32 x 128 bits VMX re

[PATCH v7 3/5] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

2018-05-30 Thread wei . guo . simon
From: Simon Guo This patch adds VMX primitives to do memcmp() when the compare size is equal to or greater than 4K bytes. The KSM feature can benefit from this. Test result with the following test program (replace the "^>" with ""): -- ># cat tools/testing/selftests/powerpc/stringloops/memcmp.c >#incl

[PATCH v7 2/5] powerpc: add vcmpequd/vcmpequb ppc instruction macro

2018-05-30 Thread wei . guo . simon
From: Simon Guo Some old tool chains don't know about instructions like vcmpequd. This patch adds .long macro for vcmpequd and vcmpequb, which is a preparation to optimize ppc64 memcmp with VMX instructions. Signed-off-by: Simon Guo --- arch/powerpc/include/asm/ppc-opcode.h | 11 +++

[PATCH v7 1/5] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()

2018-05-30 Thread wei . guo . simon
From: Simon Guo Currently memcmp() 64bytes version in powerpc will fall back to .Lshort (compare per byte mode) if either src or dst address is not 8 bytes aligned. It can be optimized in 2 situations: 1) if both addresses have the same offset from an 8-byte boundary: memcmp() can compare the

[PATCH v7 0/5] powerpc/64: memcmp() optimization

2018-05-30 Thread wei . guo . simon
From: Simon Guo There is some room to optimize memcmp() in the powerpc 64-bit version for the following 2 cases: (1) Even if src/dst addresses are not aligned to 8 bytes at the beginning, memcmp() can align them and use .Llong comparison mode without falling back to .Lshort comparison mode do compare b

[PATCH v2] KVM: PPC: remove mmio_vsx_tx_sx_enabled in KVM MMIO emulation

2018-05-27 Thread wei . guo . simon
From: Simon Guo Originally PPC KVM MMIO emulation uses only 0~31#(5 bits) for VSR reg number, and use mmio_vsx_tx_sx_enabled field together for 0~63# VSR regs. Currently PPC KVM MMIO emulation is reimplemented with analyse_instr() assistance. analyse_instr() returns 0~63 for VSR register number,

[PATCH v6 4/4] powerpc:selftest update memcmp_64 selftest for VMX implementation

2018-05-24 Thread wei . guo . simon
From: Simon Guo This patch reworked selftest memcmp_64 so that memcmp selftest can cover more test cases. It adds testcases for: - memcmp over 4K bytes size. - s1/s2 with different/random offset on 16 bytes boundary. - enter/exit_vmx_ops pairness. Signed-off-by: Simon Guo --- .../selftests/po

[PATCH v6 3/4] powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp()

2018-05-24 Thread wei . guo . simon
From: Simon Guo This patch is based on the previous VMX patch on memcmp(). To optimize ppc64 memcmp() with VMX instruction, we need to think about the VMX penalty brought with: If kernel uses VMX instruction, it needs to save/restore current thread's VMX registers. There are 32 x 128 bits VMX re

[PATCH v6 2/4] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

2018-05-24 Thread wei . guo . simon
From: Simon Guo This patch adds VMX primitives to do memcmp() when the compare size is equal to or greater than 4K bytes. The KSM feature can benefit from this. Test result with the following test program (replace the "^>" with ""): -- ># cat tools/testing/selftests/powerpc/stringloops/memcmp.c >#incl

[PATCH v6 1/4] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()

2018-05-24 Thread wei . guo . simon
From: Simon Guo Currently memcmp() 64bytes version in powerpc will fall back to .Lshort (compare per byte mode) if either src or dst address is not 8 bytes aligned. It can be optimized in 2 situations: 1) if both addresses have the same offset from an 8-byte boundary: memcmp() can compare the

[PATCH v6 0/4] powerpc/64: memcmp() optimization

2018-05-24 Thread wei . guo . simon
From: Simon Guo There is some room to optimize memcmp() in the powerpc 64-bit version for the following 2 cases: (1) Even if src/dst addresses are not aligned to 8 bytes at the beginning, memcmp() can align them and use .Llong comparison mode without falling back to .Lshort comparison mode do compare b

[PATCH] KVM: PPC: remove mmio_vsx_tx_sx_enabled in PR KVM MMIO emulation

2018-05-24 Thread wei . guo . simon
From: Simon Guo Originally PR KVM MMIO emulation uses only 0~31#(5 bits) for VSR reg number, and use mmio_vsx_tx_sx_enabled field together for 0~63# VSR regs. Currently PR KVM MMIO emulation is reimplemented with analyse_instr() assistance. analyse_instr() returns 0~63 for VSR register number, s

[PATCH v4 29/29] KVM: PPC: Book3S PR: enable kvmppc_get/set_one_reg_pr() for HTM registers

2018-05-23 Thread wei . guo . simon
From: Simon Guo We need to migrate PR KVM during transaction and qemu will use kvmppc_get_one_reg_pr()/kvmppc_set_one_reg_pr() APIs to get/set transaction checkpoint state. This patch adds support for that. So far PPC PR qemu doesn't fully function for migration but the savevm/loadvm can be done

[PATCH v4 28/29] KVM: PPC: remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS

2018-05-23 Thread wei . guo . simon
From: Simon Guo In both HV/PR KVM, the KVM_SET_REGS/KVM_GET_REGS ioctl should be able to perform without load vcpu. This patch adds KVM_SET_ONE_REG/KVM_GET_ONE_REG implementation to async ioctl function. Due to the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_SET_

[PATCH v4 27/29] KVM: PPC: remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl

2018-05-23 Thread wei . guo . simon
From: Simon Guo Due to the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_GET_ONE_REG and KVM_SET_ONE_REG doesn't need to do ioctl with loading vcpu anymore. This patch removes vcpu_load()/vcpu_put() from KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctl. Signed-off-by: Sim

[PATCH v4 26/29] KVM: PPC: move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl

2018-05-23 Thread wei . guo . simon
From: Simon Guo Although we already have kvm_arch_vcpu_async_ioctl() which doesn't require ioctl to load vcpu, the sync ioctl code need to be cleaned up when CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL is not configured. This patch moves vcpu_load/vcpu_put down to each ioctl switch case so that each ioctl

[PATCH v4 25/29] KVM: PPC: Book3S PR: enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl

2018-05-23 Thread wei . guo . simon
From: Simon Guo With current patch set, PR KVM now supports HTM. So this patch turns it on for PR KVM. Tested with: https://github.com/justdoitqd/publicFiles/blob/master/test_kvm_htm_cap.c Signed-off-by: Simon Guo --- arch/powerpc/kvm/powerpc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 de

[PATCH v4 24/29] KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM.

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently guest kernel doesn't handle TAR fac unavailable and it always runs with TAR bit on. PR KVM will lazily enable TAR. TAR is not a frequent-use reg and it is not included in SVCPU struct. Due to the above, the checkpointed TAR val might be a bogus TAR val. To solve this is

[PATCH v4 23/29] KVM: PPC: Book3S PR: add guard code to prevent returning to guest with PR=0 and Transactional state

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently PR KVM doesn't support transaction memory at guest privilege state. This patch adds a check at setting guest msr, so that we can never return to guest with PR=0 and TS=0b10. A tabort will be emulated to indicate this and fail transaction immediately. Signed-off-by: Sim

[PATCH v4 22/29] KVM: PPC: Book3S PR: add emulation for tabort. for privilege guest

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently privilege guest will be run with TM disabled. Although the privilege guest cannot initiate a new transaction, it can use tabort to terminate its problem state's transaction. So it is still necessary to emulate tabort. for privilege guest. This patch adds emulation for

[PATCH v4 21/29] KVM: PPC: Book3S PR: add emulation for trechkpt in PR KVM.

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds host emulation when guest PR KVM executes "trechkpt.", which is a privileged instruction and will trap into host. We firstly copy vcpu ongoing content into vcpu tm checkpoint content, then perform kvmppc_restore_tm_pr() to do trechkpt. with updated vcpu tm checkpo

[PATCH v4 20/29] KVM: PPC: Book3S PR: adds emulation for treclaim.

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds support for "treclaim." emulation when PR KVM guest executes treclaim. and traps to host. We will first do treclaim. and save the TM checkpoint. Then it is necessary to update vcpu current reg content with checkpointed vals. When rfid into guest again, those vcpu

[PATCH v4 19/29] KVM: PPC: Book3S PR: enable NV reg restore for reading TM SPR at guest privilege state

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently kvmppc_handle_fac() will not update NV GPRs and thus it can return with GUEST_RESUME. However PR KVM guest always disables MSR_TM bit at privilege state. If PR privilege guest are trying to read TM SPRs, it will trigger TM facility unavailable exception and fall into kv

[PATCH v4 18/29] KVM: PPC: Book3S PR: always fail transaction in guest privilege state

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently kernel doesn't use transaction memory. And there is an issue for privilege guest that: tbegin/tsuspend/tresume/tabort TM instructions can impact MSR TM bits without trap into PR host. So following code will lead to a false mfmsr result: tbegin <- MSR bits update

[PATCH v4 17/29] KVM: PPC: Book3S PR: make mtspr/mfspr emulation behavior based on active TM SPRs

2018-05-23 Thread wei . guo . simon
From: Simon Guo The mfspr/mtspr on TM SPRs(TEXASR/TFIAR/TFHAR) are non-privileged instructions and can be executed at PR KVM guest without trapping into host in problem state. We only emulate mtspr/mfspr texasr/tfiar/tfhar at guest PR=0 state. When we are emulating mtspr tm sprs at guest PR=0 st

[PATCH v4 16/29] KVM: PPC: Book3S PR: add math support for PR KVM HTM

2018-05-23 Thread wei . guo . simon
From: Simon Guo The math registers will be saved into vcpu->arch.fp/vr and corresponding vcpu->arch.fp_tm/vr_tm area. We flush or giveup the math regs into vcpu->arch.fp/vr before saving transaction. After transaction is restored, the math regs will be loaded back into regs. If there is a FP/VE

[PATCH v4 15/29] KVM: PPC: Book3S PR: add transaction memory save/restore skeleton for PR KVM

2018-05-23 Thread wei . guo . simon
From: Simon Guo The transaction memory checkpoint area save/restore behavior is triggered when VCPU qemu process is switching out/into CPU. ie. at kvmppc_core_vcpu_put_pr() and kvmppc_core_vcpu_load_pr(). MSR TM active state is determined by TS bits: active: 10(transactional) or 01 (suspende
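The TS encoding mentioned here (10 for transactional, 01 for suspended) can be decoded with a few helpers. This is a sketch, not the kernel's code: the `SK_` names are hypothetical, and the bit positions 33 and 34 are an assumption modelled on the kernel's MSR_TS_S_LG and MSR_TS_T_LG definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the two-bit TS field within the 64-bit MSR. */
#define SK_MSR_TS_S_LG 33
#define SK_MSR_TS_T_LG 34

#define SK_MSR_TS_S    (1ULL << SK_MSR_TS_S_LG) /* TS = 0b01, suspended */
#define SK_MSR_TS_T    (1ULL << SK_MSR_TS_T_LG) /* TS = 0b10, transactional */
#define SK_MSR_TS_MASK (SK_MSR_TS_S | SK_MSR_TS_T)

/* TS = 0b10: a transaction is running. */
static inline bool sk_tm_transactional(uint64_t msr)
{
    return (msr & SK_MSR_TS_MASK) == SK_MSR_TS_T;
}

/* TS = 0b01: a transaction exists but is suspended. */
static inline bool sk_tm_suspended(uint64_t msr)
{
    return (msr & SK_MSR_TS_MASK) == SK_MSR_TS_S;
}

/* "Active" means either of the two states above, i.e. TS is non-zero. */
static inline bool sk_tm_active(uint64_t msr)
{
    return (msr & SK_MSR_TS_MASK) != 0;
}
```

A save/restore path along the lines described above would presumably test something like `sk_tm_active()` on the guest MSR before touching the checkpoint area.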

[PATCH v4 14/29] KVM: PPC: Book3S PR: add kvmppc_save/restore_tm_sprs() APIs

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds 2 new APIs kvmppc_save_tm_sprs()/kvmppc_restore_tm_sprs() for the purpose of TEXASR/TFIAR/TFHAR save/restore. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/kvm/book3s_pr.c | 22 ++ 1 file changed, 22 insertions(+) di

[PATCH v4 13/29] KVM: PPC: Book3S PR: adds new kvmppc_copyto_vcpu_tm/kvmppc_copyfrom_vcpu_tm API for PR KVM.

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds 2 new APIs: kvmppc_copyto_vcpu_tm() and kvmppc_copyfrom_vcpu_tm(). These 2 APIs will be used to copy from/to TM data between VCPU_TM/VCPU area. PR KVM will use these APIs for treclaim. or trechkpt. emulation. Signed-off-by: Simon Guo --- arch/powerpc/kvm/book3s

[PATCH v4 12/29] KVM: PPC: Book3S PR: prevent TS bits change in kvmppc_interrupt_pr()

2018-05-23 Thread wei . guo . simon
From: Simon Guo PR KVM host usually equipped with enabled TM in its host MSR value, and with non-transactional TS value. When a guest with TM active traps into PR KVM host, the rfid at the tail of kvmppc_interrupt_pr() will try to switch TS bits from S0 (Suspended & TM disabled) to N1 (Non-trans

[PATCH v4 11/29] KVM: PPC: Book3S PR: implement RFID TM behavior to suppress change from S0 to N0

2018-05-23 Thread wei . guo . simon
From: Simon Guo According to the ISA specification for RFID, in MSR TM disabled and TS suspended state (S0), if the target MSR is TM disabled and TS state is inactive (N0), rfid should suppress this update. This patch makes RFID emulation of PR KVM consistent with this. Signed-off-by: Simon Gu
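The S0 to N0 rule described in this message can be written as a small pure function. This is a sketch under assumptions: `rfid_sketch_target_msr` and the `RS_` macros are hypothetical names, the bit positions (32 for TM, 33 and 34 for TS) are modelled on the kernel's MSR definitions, and the real emulation handles more than this one case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR bit layout for the TM enable bit and the TS field. */
#define RS_MSR_TM      (1ULL << 32) /* TM facility enabled */
#define RS_MSR_TS_S    (1ULL << 33) /* TS = 0b01, suspended */
#define RS_MSR_TS_T    (1ULL << 34) /* TS = 0b10, transactional */
#define RS_MSR_TS_MASK (RS_MSR_TS_S | RS_MSR_TS_T)

/* Returns the MSR value rfid should install, given the current MSR and
 * the requested target (SRR1). */
uint64_t rfid_sketch_target_msr(uint64_t cur_msr, uint64_t srr1)
{
    /* S0: TM disabled and TS suspended in the current MSR. */
    bool cur_s0 = !(cur_msr & RS_MSR_TM) &&
                  (cur_msr & RS_MSR_TS_MASK) == RS_MSR_TS_S;
    /* N0: TM disabled and TS inactive in the target MSR. */
    bool tgt_n0 = !(srr1 & RS_MSR_TM) &&
                  (srr1 & RS_MSR_TS_MASK) == 0;

    /* Suppress the S0 -> N0 change by keeping the suspended state. */
    if (cur_s0 && tgt_n0)
        srr1 |= RS_MSR_TS_S;

    return srr1;
}
```

For example, `rfid_sketch_target_msr(RS_MSR_TS_S, 0)` keeps the suspended bit set, while the same call with TM enabled in the current MSR returns the target unchanged.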

[PATCH v4 10/29] KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state guest

2018-05-23 Thread wei . guo . simon
From: Simon Guo MSR TS bits can be modified with non-privileged instruction like tbegin./tend. That means guest can change MSR value "silently" without notifying host. It is necessary to sync the TM bits to host so that host can calculate shadow msr correctly. note privilege guest will always

[PATCH v4 09/29] KVM: PPC: Book3S PR: PR KVM pass through MSR TM/TS bits to shadow_msr.

2018-05-23 Thread wei . guo . simon
From: Simon Guo PowerPC TM functionality needs MSR TM/TS bits support in hardware level. Guest TM functionality can not be emulated with "fake" MSR (msr in magic page) TS bits. This patch syncs TM/TS bits in shadow_msr with the MSR value in magic page, so that the MSR TS value which guest sees i

[PATCH v4 08/29] KVM: PPC: Book3S PR: In PR KVM suspends Transactional state when inject an interrupt.

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch simulates interrupt behavior per Power ISA while injecting interrupt in PR KVM: - When interrupt happens, transactional state should be suspended. kvmppc_mmu_book3s_64_reset_msr() will be invoked when injecting an interrupt. This patch performs this ISA logic in kvmppc

[PATCH v4 07/29] KVM: PPC: Book3S PR: add C function wrapper for _kvmppc_save/restore_tm()

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently _kvmppc_save/restore_tm() APIs can only be invoked from assembly function. This patch adds C function wrappers for them so that they can be safely called from C function. Signed-off-by: Simon Guo --- arch/powerpc/include/asm/asm-prototypes.h | 6 ++ arch/powerpc/kvm/

[PATCH v4 06/29] KVM: PPC: Book3S PR: turn on FP/VSX/VMX MSR bits in kvmppc_save_tm()

2018-05-23 Thread wei . guo . simon
From: Simon Guo kvmppc_save_tm() invokes store_fp_state/store_vr_state(). So it is mandatory to turn on FP/VSX/VMX MSR bits for its execution, just like what kvmppc_restore_tm() did. Previously HV KVM turned the bits on outside of function kvmppc_save_tm(). Now we include this bit change i

[PATCH v4 05/29] KVM: PPC: Book3S PR: add new parameter (guest MSR) for kvmppc_save_tm()/kvmppc_restore_tm()

2018-05-23 Thread wei . guo . simon
From: Simon Guo HV KVM and PR KVM need different MSR sources to indicate whether treclaim. or trechkpt. is necessary. This patch adds a new parameter (guest MSR) for the kvmppc_save_tm()/kvmppc_restore_tm() APIs: - For HV KVM, it is VCPU_MSR - For PR KVM, it is current host MSR or VCPU_SHADOW_

[PATCH v4 04/29] KVM: PPC: Book3S PR: Move kvmppc_save_tm/kvmppc_restore_tm to separate file

2018-05-23 Thread wei . guo . simon
From: Simon Guo It is a simple patch just for moving kvmppc_save_tm/kvmppc_restore_tm() functionalities to tm.S. There is no logic change. The reconstruct of those APIs will be done in later patches to improve readability. It is for preparation of reusing those APIs on both HV/PR PPC KVM. Some

[PATCH v4 03/29] powerpc: export tm_enable()/tm_disable/tm_abort() APIs

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch exports tm_enable()/tm_disable/tm_abort() APIs, which will be used for PR KVM transaction memory logic. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/include/asm/asm-prototypes.h | 3 +++ arch/powerpc/include/asm/tm.h | 2 -- ar

[PATCH v4 02/29] powerpc: add TEXASR related macros

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds some macros for CR0/TEXASR bits so that PR KVM TM logic (tbegin./treclaim./tabort.) can make use of them later. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/include/asm/reg.h | 32 +++-- arch/powerpc

[PATCH v4 01/29] powerpc: export symbol msr_check_and_set().

2018-05-23 Thread wei . guo . simon
From: Simon Guo PR KVM will need to reuse msr_check_and_set(). This patch exports this API for reuse. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/kernel/process.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/

[PATCH v4 00/29] KVM: PPC: Book3S PR: Transaction memory support on PR KVM

2018-05-23 Thread wei . guo . simon
From: Simon Guo Nowadays, many OS distributions use transaction memory functionality. On PowerPC, HV KVM supports TM, but PR KVM does not. The driver for transaction memory support in PR KVM is the OpenStack Continuous Integration testing - they run a HV(hypervisor) KVM(as l

[PATCH v5 4/4] powerpc:selftest update memcmp_64 selftest for VMX implementation

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch reworked selftest memcmp_64 so that memcmp selftest can cover more test cases. It adds testcases for: - memcmp over 4K bytes size. - s1/s2 with different/random offset on 16 bytes boundary. - enter/exit_vmx_ops pairness. Signed-off-by: Simon Guo --- .../selftests/po

[PATCH v5 3/4] powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp()

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch is based on the previous VMX patch on memcmp(). To optimize ppc64 memcmp() with VMX instruction, we need to think about the VMX penalty brought with: If kernel uses VMX instruction, it needs to save/restore current thread's VMX registers. There are 32 x 128 bits VMX re

[PATCH v5 2/4] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

2018-05-23 Thread wei . guo . simon
From: Simon Guo This patch adds VMX primitives to do memcmp() when the compare size exceeds 4K bytes. The KSM feature can benefit from this. Test result with the following test program (replace the "^>" with ""): -- ># cat tools/testing/selftests/powerpc/stringloops/memcmp.c >#include >#include >

[PATCH v5 1/4] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()

2018-05-23 Thread wei . guo . simon
From: Simon Guo Currently memcmp() 64bytes version in powerpc will fall back to .Lshort (compare per byte mode) if either src or dst address is not 8 bytes aligned. It can be optimized in 2 situations: 1) if both addresses have the same offset from an 8-byte boundary: memcmp() can compare the

[PATCH v5 0/4] powerpc/64: memcmp() optimization

2018-05-23 Thread wei . guo . simon
From: Simon Guo There is some room to optimize memcmp() in the powerpc 64-bit version for the following 2 cases: (1) Even if src/dst addresses are not aligned to 8 bytes at the beginning, memcmp() can align them and use .Llong comparison mode without falling back to .Lshort comparison mode do compare b

[PATCH v3 7/7] KVM: PPC: reimplements LOAD_VMX/STORE_VMX instruction mmio emulation with analyse_intr() input

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch reimplements LOAD_VMX/STORE_VMX MMIO emulation with analyse_instr() input. When emulating the store, the VMX reg will need to be flushed so that the right reg val can be retrieved before writing to IO MEM. This patch also adds support for lvebx/lvehx/lvewx/stvebx/stvehx

[PATCH v3 6/7] KVM: PPC: expand mmio_vsx_copy_type to mmio_copy_type to cover VMX load/store elem types

2018-05-20 Thread wei . guo . simon
From: Simon Guo VSX MMIO emulation uses mmio_vsx_copy_type to represent VSX emulated element size/type, such as KVMPPC_VSX_COPY_DWORD_LOAD, etc. This patch expands mmio_vsx_copy_type to cover VMX copy type, such as KVMPPC_VMX_COPY_BYTE(stvebx/lvebx), etc. As a result, mmio_vsx_copy_type is also r

[PATCH v3 5/7] KVM: PPC: reimplements LOAD_VSX/STORE_VSX instruction mmio emulation with analyse_intr() input

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with analyse_instr() input. It utilizes VSX_FPCONV/VSX_SPLAT/SIGNEXT exported by analyse_instr() and handles them accordingly. When emulating VSX store, the VSX reg will need to be flushed so that the right reg val ca

[PATCH v3 4/7] KVM: PPC: reimplement LOAD_FP/STORE_FP instruction mmio emulation with analyse_intr() input

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch reimplements LOAD_FP/STORE_FP instruction MMIO emulation with analyse_instr() input. It utilizes the FPCONV/UPDATE properties exported by analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store() accordingly. For FP store MMIO emulation, the FP regs need t

[PATCH v3 3/7] KVM: PPC: add giveup_ext() hook for PPC KVM ops

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently HV will save math regs(FP/VEC/VSX) when trap into host. But PR KVM will only save math regs when qemu task switch out of CPU, or when returning from qemu code. To emulate FP/VEC/VSX mmio load, PR KVM need to make sure that math regs were flushed firstly and then be able

[PATCH v3 2/7] KVM: PPC: reimplement non-SIMD LOAD/STORE instruction mmio emulation with analyse_intr() input

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch reimplements non-SIMD LOAD/STORE instruction MMIO emulation with analyse_instr() input. It utilizes the BYTEREV/UPDATE/SIGNEXT properties exported by analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store() accordingly. It also moves CACHEOP type handling

[PATCH v3 1/7] KVM: PPC: add KVMPPC_VSX_COPY_WORD_LOAD_DUMP type support for mmio emulation

2018-05-20 Thread wei . guo . simon
From: Simon Guo Some VSX instruction like lxvwsx will splat word into VSR. This patch adds VSX copy type KVMPPC_VSX_COPY_WORD_LOAD_DUMP to support this. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/kvm/powerpc.c | 2

[PATCH v3 0/7] KVM: PPC: reimplement mmio emulation with analyse_instr()

2018-05-20 Thread wei . guo . simon
From: Simon Guo We already have analyse_instr() which analyzes instructions for the instruction type, size, additional flags, etc. What kvmppc_emulate_loadstore() did is somewhat duplicated and it will be good to utilize analyse_instr() to reimplement the code. The advantage is that the code logic

[PATCH v3 29/29] KVM: PPC: Book3S PR: enable kvmppc_get/set_one_reg_pr() for HTM registers

2018-05-20 Thread wei . guo . simon
From: Simon Guo We need to migrate PR KVM during transaction and qemu will use kvmppc_get_one_reg_pr()/kvmppc_set_one_reg_pr() APIs to get/set transaction checkpoint state. This patch adds support for that. So far PPC PR qemu doesn't fully function for migration but the savevm/loadvm can be done

[PATCH v3 28/29] KVM: PPC: remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS

2018-05-20 Thread wei . guo . simon
From: Simon Guo In both HV/PR KVM, the KVM_SET_REGS/KVM_GET_REGS ioctl should be able to perform without load vcpu. This patch adds KVM_SET_ONE_REG/KVM_GET_ONE_REG implementation to async ioctl function. Due to the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_SET_

[PATCH v3 27/29] KVM: PPC: remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo Due to the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_GET_ONE_REG and KVM_SET_ONE_REG doesn't need to do ioctl with loading vcpu anymore. This patch removes vcpu_load()/vcpu_put() from KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctl. Signed-off-by: Sim

[PATCH v3 26/29] KVM: PPC: move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo Although we already have kvm_arch_vcpu_async_ioctl() which doesn't require ioctl to load vcpu, the sync ioctl code need to be cleaned up when CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL is not configured. This patch moves vcpu_load/vcpu_put down to each ioctl switch case so that each ioctl

[PATCH v3 25/29] KVM: PPC: Book3S PR: enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo With current patch set, PR KVM now supports HTM. So this patch turns it on for PR KVM. Tested with: https://github.com/justdoitqd/publicFiles/blob/master/test_kvm_htm_cap.c Signed-off-by: Simon Guo --- arch/powerpc/kvm/powerpc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 de

[PATCH v3 24/29] KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently guest kernel doesn't handle TAR fac unavailable and it always runs with TAR bit on. PR KVM will lazily enable TAR. TAR is not a frequent-use reg and it is not included in SVCPU struct. Due to the above, the checkpointed TAR val might be a bogus TAR val. To solve this is

[PATCH v3 23/29] KVM: PPC: Book3S PR: add guard code to prevent returning to guest with PR=0 and Transactional state

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently PR KVM doesn't support transaction memory at guest privilege state. This patch adds a check at setting guest msr, so that we can never return to guest with PR=0 and TS=0b10. A tabort will be emulated to indicate this and fail transaction immediately. Signed-off-by: Sim

[PATCH v3 22/29] KVM: PPC: Book3S PR: add emulation for tabort. for privilege guest

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently privilege guest will be run with TM disabled. Although the privilege guest cannot initiate a new transaction, it can use tabort to terminate its problem state's transaction. So it is still necessary to emulate tabort. for privilege guest. This patch adds emulation for

[PATCH v3 21/29] KVM: PPC: Book3S PR: add emulation for trechkpt in PR KVM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch adds host emulation when guest PR KVM executes "trechkpt.", which is a privileged instruction and will trap into host. We firstly copy vcpu ongoing content into vcpu tm checkpoint content, then perform kvmppc_restore_tm_pr() to do trechkpt. with updated vcpu tm checkpo

[PATCH v3 20/29] KVM: PPC: Book3S PR: adds emulation for treclaim.

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch adds support for "treclaim." emulation when PR KVM guest executes treclaim. and traps to host. We will first do treclaim. and save the TM checkpoint. Then it is necessary to update vcpu current reg content with checkpointed vals. When rfid into guest again, those vcpu

[PATCH v3 19/29] KVM: PPC: Book3S PR: enable NV reg restore for reading TM SPR at guest privilege state

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently kvmppc_handle_fac() will not update NV GPRs and thus it can return with GUEST_RESUME. However PR KVM guest always disables MSR_TM bit at privilege state. If PR privilege guest are trying to read TM SPRs, it will trigger TM facility unavailable exception and fall into kv

[PATCH v3 18/29] KVM: PPC: Book3S PR: always fail transaction in guest privilege state

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently kernel doesn't use transaction memory. And there is an issue for privilege guest that: tbegin/tsuspend/tresume/tabort TM instructions can impact MSR TM bits without trap into PR host. So following code will lead to a false mfmsr result: tbegin <- MSR bits update

[PATCH v3 17/29] KVM: PPC: Book3S PR: make mtspr/mfspr emulation behavior based on active TM SPRs

2018-05-20 Thread wei . guo . simon
From: Simon Guo The mfspr/mtspr on TM SPRs(TEXASR/TFIAR/TFHAR) are non-privileged instructions and can be executed at PR KVM guest without trapping into host in problem state. We only emulate mtspr/mfspr texasr/tfiar/tfhar at guest PR=0 state. When we are emulating mtspr tm sprs at guest PR=0 st

[PATCH v3 16/29] KVM: PPC: Book3S PR: add math support for PR KVM HTM

2018-05-20 Thread wei . guo . simon
From: Simon Guo The math registers will be saved into vcpu->arch.fp/vr and corresponding vcpu->arch.fp_tm/vr_tm area. We flush or giveup the math regs into vcpu->arch.fp/vr before saving transaction. After transaction is restored, the math regs will be loaded back into regs. If there is a FP/VE

[PATCH v3 15/29] KVM: PPC: Book3S PR: add transaction memory save/restore skeleton for PR KVM

2018-05-20 Thread wei . guo . simon
From: Simon Guo The transaction memory checkpoint area save/restore behavior is triggered when VCPU qemu process is switching out/into CPU. ie. at kvmppc_core_vcpu_put_pr() and kvmppc_core_vcpu_load_pr(). MSR TM active state is determined by TS bits: active: 10(transactional) or 01 (suspende

[PATCH v3 14/29] KVM: PPC: Book3S PR: add kvmppc_save/restore_tm_sprs() APIs

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch adds 2 new APIs kvmppc_save_tm_sprs()/kvmppc_restore_tm_sprs() for the purpose of TEXASR/TFIAR/TFHAR save/restore. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/kvm/book3s_pr.c | 22 ++ 1 file changed, 22 insertions(+) di

[PATCH v3 13/29] KVM: PPC: Book3S PR: adds new kvmppc_copyto_vcpu_tm/kvmppc_copyfrom_vcpu_tm API for PR KVM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch adds 2 new APIs: kvmppc_copyto_vcpu_tm() and kvmppc_copyfrom_vcpu_tm(). These 2 APIs will be used to copy TM data between the VCPU_TM and VCPU areas. PR KVM will use these APIs for treclaim. or trechkpt. emulation. Signed-off-by: Simon Guo --- arch/powerpc/kvm/book3s

[PATCH v3 12/29] KVM: PPC: Book3S PR: prevent TS bits change in kvmppc_interrupt_pr()

2018-05-20 Thread wei . guo . simon
From: Simon Guo The PR KVM host usually has TM enabled in its host MSR value, with a non-transactional TS value. When a guest with TM active traps into the PR KVM host, the rfid at the tail of kvmppc_interrupt_pr() will try to switch the TS bits from S0 (Suspended & TM disabled) to N1 (Non-trans

[PATCH v3 11/29] KVM: PPC: Book3S PR: implement RFID TM behavior to suppress change from S0 to N0

2018-05-20 Thread wei . guo . simon
From: Simon Guo According to the ISA specification for RFID, in the MSR TM disabled and TS suspended state (S0), if the target MSR is TM disabled and the TS state is inactive (N0), rfid should suppress this update. This patch makes the RFID emulation of PR KVM consistent with this. Signed-off-by: Simon Gu
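
The S0 to N0 rule in this entry can be sketched in C roughly as follows. This is not the patch's code: the SKETCH_ names, the bit positions, and the reading of "suppress" as "keep TS suspended while the other MSR bits are still updated" are all assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define SKETCH_MSR_TM      (UINT64_C(1) << 32)  /* TM facility enabled */
#define SKETCH_MSR_TS_S    (UINT64_C(1) << 33)  /* TS = 01: suspended */
#define SKETCH_MSR_TS_T    (UINT64_C(1) << 34)  /* TS = 10: transactional */
#define SKETCH_MSR_TS_MASK (SKETCH_MSR_TS_S | SKETCH_MSR_TS_T)

/* Hypothetical helper: the MSR value an emulated rfid would install. */
static uint64_t sketch_rfid_new_msr(uint64_t cur_msr, uint64_t target_msr)
{
    /* S0: TS suspended and TM disabled in the current MSR. */
    int cur_s0 = (cur_msr & SKETCH_MSR_TS_MASK) == SKETCH_MSR_TS_S &&
                 !(cur_msr & SKETCH_MSR_TM);
    /* N0: TS inactive and TM disabled in the target MSR. */
    int tgt_n0 = (target_msr & (SKETCH_MSR_TS_MASK | SKETCH_MSR_TM)) == 0;

    if (cur_s0 && tgt_n0)
        return target_msr | SKETCH_MSR_TS_S;  /* keep the suspended state */

    return target_msr;
}
```

Any other combination of current and target state passes the target MSR through unchanged in this sketch.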

[PATCH v3 10/29] KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state guest

2018-05-20 Thread wei . guo . simon
From: Simon Guo MSR TS bits can be modified with non-privileged instructions like tbegin./tend. That means the guest can change the MSR value "silently" without notifying the host. It is necessary to sync the TM bits to the host so that the host can calculate the shadow msr correctly. Note that a privileged guest will always

[PATCH v3 09/29] KVM: PPC: Book3S PR: PR KVM pass through MSR TM/TS bits to shadow_msr.

2018-05-20 Thread wei . guo . simon
From: Simon Guo PowerPC TM functionality needs MSR TM/TS bits support at the hardware level. Guest TM functionality cannot be emulated with "fake" MSR (msr in magic page) TS bits. This patch syncs the TM/TS bits in shadow_msr with the MSR value in the magic page, so that the MSR TS value which the guest sees i

[PATCH v3 08/29] KVM: PPC: Book3S PR: In PR KVM suspends Transactional state when inject an interrupt.

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch simulates interrupt behavior per the Power ISA when injecting an interrupt in PR KVM: - When an interrupt happens, the transactional state should be suspended. kvmppc_mmu_book3s_64_reset_msr() will be invoked when injecting an interrupt. This patch performs this ISA logic in kvmppc

[PATCH v3 07/29] KVM: PPC: Book3S PR: add C function wrapper for _kvmppc_save/restore_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo Currently the _kvmppc_save/restore_tm() APIs can only be invoked from assembly functions. This patch adds C function wrappers for them so that they can be safely called from C functions. Signed-off-by: Simon Guo --- arch/powerpc/include/asm/asm-prototypes.h | 6 ++ arch/powerpc/kvm/

[PATCH v3 06/29] KVM: PPC: Book3S PR: turn on FP/VSX/VMX MSR bits in kvmppc_save_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo kvmppc_save_tm() invokes store_fp_state/store_vr_state(). So it is mandatory to turn on the FP/VSX/VMX MSR bits for its execution, just as kvmppc_restore_tm() does. Previously HV KVM turned the bits on outside of kvmppc_save_tm(). Now we include this bit change i

[PATCH v3 05/29] KVM: PPC: Book3S PR: add new parameter (guest MSR) for kvmppc_save_tm()/kvmppc_restore_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo HV KVM and PR KVM need different MSR sources to indicate whether treclaim. or trechkpt. is necessary. This patch adds a new parameter (guest MSR) for the kvmppc_save_tm()/kvmppc_restore_tm() APIs: - For HV KVM, it is VCPU_MSR - For PR KVM, it is current host MSR or VCPU_SHADOW_

[PATCH v3 04/29] KVM: PPC: Book3S PR: Move kvmppc_save_tm/kvmppc_restore_tm to separate file

2018-05-20 Thread wei . guo . simon
From: Simon Guo This is a simple patch that just moves the kvmppc_save_tm()/kvmppc_restore_tm() functions to tm.S. There is no logic change. The restructuring of those APIs will be done in later patches to improve readability. It prepares for reusing those APIs in both HV and PR PPC KVM. Some

[PATCH v3 03/29] powerpc: export tm_enable()/tm_disable/tm_abort() APIs

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch exports the tm_enable()/tm_disable()/tm_abort() APIs, which will be used for PR KVM transactional memory logic. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/include/asm/asm-prototypes.h | 3 +++ arch/powerpc/include/asm/tm.h | 2 -- ar

[PATCH v3 02/29] powerpc: add TEXASR related macros

2018-05-20 Thread wei . guo . simon
From: Simon Guo This patch adds some macros for CR0/TEXASR bits so that PR KVM TM logic (tbegin./treclaim./tabort.) can make use of them later. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/include/asm/reg.h | 32 +++-- arch/powerpc

[PATCH v3 01/29] powerpc: export symbol msr_check_and_set().

2018-05-20 Thread wei . guo . simon
From: Simon Guo PR KVM will need to reuse msr_check_and_set(). This patch exports this API for reuse. Signed-off-by: Simon Guo Reviewed-by: Paul Mackerras --- arch/powerpc/kernel/process.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/

[PATCH v3 00/29] KVM: PPC: Book3S PR: Transaction memory support on PR KVM

2018-05-20 Thread wei . guo . simon
From: Simon Guo Nowadays, many OS distributions make use of transactional memory functionality. On PowerPC, HV KVM supports TM, but PR KVM does not. The driver for transactional memory support in PR KVM is the OpenStack Continuous Integration testing - they run an HV (hypervisor) KVM (as l

[PATCH v4 4/4] powerpc:selftest update memcmp_64 selftest for VMX implementation

2018-05-16 Thread wei . guo . simon
From: Simon Guo This patch reworks the memcmp_64 selftest so that it can cover more test cases. It adds test cases for: - memcmp over 4K bytes in size. - s1/s2 with different/random offsets on a 16-byte boundary. - enter/exit_vmx_ops pairing. Signed-off-by: Simon Guo --- .../selftests/po

[PATCH v4 3/4] powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp()

2018-05-16 Thread wei . guo . simon
From: Simon Guo This patch is based on the previous VMX patch on memcmp(). To optimize ppc64 memcmp() with VMX instructions, we need to consider the VMX penalty this brings: if the kernel uses VMX instructions, it needs to save/restore the current thread's VMX registers. There are 32 x 128 bits VMX re

[PATCH v4 2/4] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

2018-05-16 Thread wei . guo . simon
From: Simon Guo This patch adds VMX primitives to do memcmp() in case the compare size exceeds 4K bytes. The KSM feature can benefit from this. Test result with the following test program (replace the "^>" with ""): -- ># cat tools/testing/selftests/powerpc/stringloops/memcmp.c >#include >#include >

[PATCH v4 1/4] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()

2018-05-16 Thread wei . guo . simon
From: Simon Guo Currently the 64-bit version of memcmp() in powerpc will fall back to .Lshort (compare per byte mode) if either the src or dst address is not 8-byte aligned. It can be optimized in 2 situations: 1) if both addresses have the same offset from an 8-byte boundary: memcmp() can compare the

[PATCH v4 0/4] powerpc/64: memcmp() optimization

2018-05-16 Thread wei . guo . simon
From: Simon Guo There is some room to optimize the powerpc 64-bit version of memcmp() for the following 2 cases: (1) Even if src/dst addresses are not 8-byte aligned at the beginning, memcmp() can align them and go with .Llong comparison mode without falling back to .Lshort comparison mode to compare b
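
The alignment idea in case (1) can be sketched in portable C as below. It is only an illustration, not the assembly from the series: the function names are hypothetical, and staying bytewise when the two pointers have different offsets is a simplification chosen for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time comparison, standing in for the ".Lshort" mode. */
static int sketch_cmp_bytes(const unsigned char *a, const unsigned char *b,
                            size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Hypothetical memcmp() that aligns first, then compares 8 bytes at a time. */
static int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    /* Different offsets from an 8-byte boundary: this sketch stays bytewise. */
    if (((uintptr_t)a & 7) != ((uintptr_t)b & 7))
        return sketch_cmp_bytes(a, b, n);

    /* Same offset: compare the leading bytes until both pointers are aligned. */
    size_t head = (8 - ((uintptr_t)a & 7)) & 7;
    if (head > n)
        head = n;
    int r = sketch_cmp_bytes(a, b, head);
    if (r)
        return r;
    a += head;
    b += head;
    n -= head;

    /* Aligned body: 8 bytes per iteration, standing in for ".Llong". */
    while (n >= 8) {
        uint64_t wa, wb;
        memcpy(&wa, a, 8);
        memcpy(&wb, b, 8);
        if (wa != wb)
            return sketch_cmp_bytes(a, b, 8);  /* locate the differing byte */
        a += 8;
        b += 8;
        n -= 8;
    }

    /* Tail shorter than 8 bytes. */
    return sketch_cmp_bytes(a, b, n);
}
```

Re-scanning a mismatching word bytewise keeps the sign of the result independent of endianness, at the cost of a few extra byte compares on the mismatch path.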

[PATCH v2 10/10] KVM: PPC: reimplements LOAD_VMX/STORE_VMX instruction mmio emulation with analyse_intr() input

2018-05-06 Thread wei . guo . simon
From: Simon Guo This patch reimplements LOAD_VMX/STORE_VMX MMIO emulation with analyse_instr() input. When emulating the store, the VMX reg will need to be flushed so that the right reg val can be retrieved before writing to IO MEM. This patch also adds support for lvebx/lvehx/lvewx/stvebx/stvehx

[PATCH v2 09/10] KVM: PPC: expand mmio_vsx_copy_type to mmio_copy_type to cover VMX load/store elem types

2018-05-06 Thread wei . guo . simon
From: Simon Guo VSX MMIO emulation uses mmio_vsx_copy_type to represent VSX emulated element size/type, such as KVMPPC_VSX_COPY_DWORD_LOAD, etc. This patch expands mmio_vsx_copy_type to cover VMX copy type, such as KVMPPC_VMX_COPY_BYTE(stvebx/lvebx), etc. As a result, mmio_vsx_copy_type is also r

[PATCH v2 08/10] KVM: PPC: reimplements LOAD_VSX/STORE_VSX instruction mmio emulation with analyse_intr() input

2018-05-06 Thread wei . guo . simon
From: Simon Guo This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with analyse_instr() input. It utilizes VSX_FPCONV/VSX_SPLAT/SIGNEXT exported by analyse_instr() and handles them accordingly. When emulating a VSX store, the VSX reg will need to be flushed so that the right reg val ca
