When waking from a CPU idle instruction (e.g., nap or stop), the sync for ordering the KVM secondary thread state can be avoided if there wakeup is coming from a kernel context rather than KVM context.
This improves performance for ping-pong benchmark with the stop0 idle state by 0.46% for 2 threads in the same core, and 1.02% for different cores. Signed-off-by: Nicholas Piggin <npig...@gmail.com> --- arch/powerpc/kernel/idle_book3s.S | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S index 2f8364e7b489..07a306173c5a 100644 --- a/arch/powerpc/kernel/idle_book3s.S +++ b/arch/powerpc/kernel/idle_book3s.S @@ -532,6 +532,9 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) mr r3,r12 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + lbz r0,HSTATE_HWTHREAD_STATE(r13) + cmpwi r0,KVM_HWTHREAD_IN_KERNEL + beq 1f li r0,KVM_HWTHREAD_IN_KERNEL stb r0,HSTATE_HWTHREAD_STATE(r13) /* Order setting hwthread_state vs. testing hwthread_req */ -- 2.15.0