Hi Paul,

[Added Shreyas's current e-mail address ]

On Fri, Oct 21, 2016 at 08:03:05PM +1100, Paul Mackerras wrote:
> Commit 8117ac6a6c2f ("powerpc/powernv: Switch off MMU before entering
> nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one
> thread entering a KVM guest could switch the MMU context to the guest
> while another thread was still in host kernel context with the MMU on.
> That commit moved the point where a thread entering a power-saving
> mode set its kvm_hstate.hwthread_state field in its PACA to
> KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the
> MMU had been switched off.  That commit also added a comment
> explaining that we have to switch to real mode before setting
> hwthread_state to avoid this race.
> 
> Nevertheless, commit 4eae2c9ae54a ("powerpc/powernv: Make
> pnv_powersave_common more generic", 2016-07-08) subsequently moved
> the setting of hwthread_state back to a point where the MMU is on,
> thus reintroducing the race, despite the comment saying that this
> should not be done being included in full in the context lines of
> the patch that did it.
>

Sorry about missing that part. I am at fault, since I reviewed
4eae2c9ae54a patch. Will keep this in mind in the future.

> This fixes the race again and adds a bigger and shoutier comment
> explaining the potential race condition.
> 
> Cc: sta...@vger.kernel.org # v4.8
> Fixes: 4eae2c9ae54a
> Signed-off-by: Paul Mackerras <pau...@ozlabs.org>
> ---
>  arch/powerpc/kernel/idle_book3s.S | 32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/idle_book3s.S 
> b/arch/powerpc/kernel/idle_book3s.S
> index bd739fe..0d8712a 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -163,12 +163,6 @@ _GLOBAL(pnv_powersave_common)
>       std     r9,_MSR(r1)
>       std     r1,PACAR1(r13)
> 
> -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> -     /* Tell KVM we're entering idle */
> -     li      r4,KVM_HWTHREAD_IN_IDLE
> -     stb     r4,HSTATE_HWTHREAD_STATE(r13)
> -#endif
> -
>       /*
>        * Go to real mode to do the nap, as required by the architecture.
>        * Also, we need to be in real mode before setting hwthread_state,
> @@ -185,6 +179,26 @@ _GLOBAL(pnv_powersave_common)
> 
>       .globl pnv_enter_arch207_idle_mode
>  pnv_enter_arch207_idle_mode:
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +     /* Tell KVM we're entering idle */
> +     li      r4,KVM_HWTHREAD_IN_IDLE
> +     /******************************************************/
> +     /*  N O T E   W E L L    ! ! !    N O T E   W E L L   */
> +     /* The following store to HSTATE_HWTHREAD_STATE(r13)  */
> +     /* MUST occur in real mode, i.e. with the MMU off,    */
> +     /* and the MMU must stay off until we clear this flag */
> +     /* and test HSTATE_HWTHREAD_REQ(r13) in the system    */
> +     /* reset interrupt vector in exceptions-64s.S.        */
> +     /* The reason is that another thread can switch the   */
> +     /* MMU to a guest context whenever this flag is set   */
> +     /* to KVM_HWTHREAD_IN_IDLE, and if the MMU was on,    */
> +     /* that would potentially cause this thread to start  */
> +     /* executing instructions from guest memory in        */
> +     /* hypervisor mode, leading to a host crash or data   */
> +     /* corruption, or worse.                              */
> +     /******************************************************/
> +     stb     r4,HSTATE_HWTHREAD_STATE(r13)
> +#endif
>       stb     r3,PACA_THREAD_IDLE_STATE(r13)
>       cmpwi   cr3,r3,PNV_THREAD_SLEEP
>       bge     cr3,2f
> @@ -250,6 +264,12 @@ enter_winkle:
>   * r3 - requested stop state
>   */
>  power_enter_stop:
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +     /* Tell KVM we're entering idle */
> +     li      r4,KVM_HWTHREAD_IN_IDLE
> +     /* DO THIS IN REAL MODE!  See comment above. */
> +     stb     r4,HSTATE_HWTHREAD_STATE(r13)
> +#endif
>  /*
>   * Check if the requested state is a deep idle state.
>   */
> -- 
> 2.7.4
> 

Reply via email to