On 14/02/19 03:48, Luwei Kang wrote:
> Some posted interrupts from passthrough devices may be lost or
> overwritten when the vCPU is in the runnable state.
> 
> The SN (Suppress Notification) bit of the PID (Posted Interrupt
> Descriptor) is set when the vCPU is preempted (vCPU in
> KVM_MP_STATE_RUNNABLE state but not running on a physical CPU). If a
> posted interrupt arrives at this time, the IRQ remapping facility sets
> the corresponding bit in PIR (Posted Interrupt Requests) without
> setting ON (Outstanding Notification). The interrupt is therefore never
> synced to the APIC virtualization registers and is not handled by the
> guest, because ON is zero.
> 
> Signed-off-by: Luwei Kang <luwei.k...@intel.com>

Queued, thanks.

Paolo

> ---
>  arch/x86/kvm/vmx/vmx.c | 26 +++++++++++---------------
>  arch/x86/kvm/vmx/vmx.h |  6 ++++++
>  arch/x86/kvm/x86.c     |  2 +-
>  3 files changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f6915f1..fe59199 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1192,21 +1192,6 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>       if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
>               return;
>  
> -     /*
> -      * First handle the simple case where no cmpxchg is necessary; just
> -      * allow posting non-urgent interrupts.
> -      *
> -      * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> -      * PI.NDST: pi_post_block will do it for us and the wakeup_handler
> -      * expects the VCPU to be on the blocked_vcpu_list that matches
> -      * PI.NDST.
> -      */
> -     if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR ||
> -         vcpu->cpu == cpu) {
> -             pi_clear_sn(pi_desc);
> -             return;
> -     }
> -
>       /* The full case.  */
>       do {
>               old.control = new.control = pi_desc->control;
> @@ -1221,6 +1206,17 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>               new.sn = 0;
>       } while (cmpxchg64(&pi_desc->control, old.control,
>                          new.control) != old.control);
> +
> +     /*
> +      * Clear SN before reading the bitmap.  The VT-d firmware
> +      * writes the bitmap and reads SN atomically (5.2.3 in the
> +      * spec), so it doesn't really have a memory barrier that
> +      * pairs with this, but we cannot do that and we need one.
> +      */
> +     smp_mb__after_atomic();
> +
> +     if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
> +             pi_set_on(pi_desc);
>  }
>  
>  /*
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9932895..a4527e1 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -349,6 +349,12 @@ static inline void pi_set_sn(struct pi_desc *pi_desc)
>                       (unsigned long *)&pi_desc->control);
>  }
>  
> +static inline void pi_set_on(struct pi_desc *pi_desc)
> +{
> +     set_bit(POSTED_INTR_ON,
> +             (unsigned long *)&pi_desc->control);
> +}
> +
>  static inline void pi_clear_on(struct pi_desc *pi_desc)
>  {
>       clear_bit(POSTED_INTR_ON,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3d32b8f..ebd6737 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7795,7 +7795,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>        * 1) We should set ->mode before checking ->requests.  Please see
>        * the comment in kvm_vcpu_exiting_guest_mode().
>        *
> -      * 2) For APICv, we should set ->mode before checking PIR.ON.  This
> +      * 2) For APICv, we should set ->mode before checking PID.ON. This
>        * pairs with the memory barrier implicit in pi_test_and_set_on
>        * (see vmx_deliver_posted_interrupt).
>        *
> 
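
For readers following the thread, the lost-interrupt sequence described in
the commit message can be modeled in user space. The following is a minimal
sketch, not the kernel code: the sketch_* names, the simplified descriptor
layout, and vector 0x31 are all hypothetical. The hardware's atomic update of
PIR and ON is approximated by two separate operations, which is only adequate
for this single-threaded walk-through.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the kernel's struct pi_desc. */
#define SKETCH_NR_VECTORS 256
#define SKETCH_WORDS      (SKETCH_NR_VECTORS / 64)
#define SKETCH_ON         (1u << 0) /* Outstanding Notification */
#define SKETCH_SN         (1u << 1) /* Suppress Notification */

struct sketch_pi_desc {
    _Atomic uint64_t pir[SKETCH_WORDS]; /* Posted Interrupt Requests */
    _Atomic uint32_t control;           /* ON and SN bits */
};

/* Models the remapping hardware posting a vector: the PIR bit is always
 * set, but ON is set only while SN is clear. */
static void sketch_post(struct sketch_pi_desc *d, unsigned vec)
{
    assert(vec < SKETCH_NR_VECTORS);
    atomic_fetch_or(&d->pir[vec / 64], UINT64_C(1) << (vec % 64));
    if (!(atomic_load(&d->control) & SKETCH_SN))
        atomic_fetch_or(&d->control, SKETCH_ON);
}

static bool sketch_pir_empty(struct sketch_pi_desc *d)
{
    for (int i = 0; i < SKETCH_WORDS; i++)
        if (atomic_load(&d->pir[i]))
            return false;
    return true;
}

/* Models loading the vCPU: clear SN, then (with the fix) re-check PIR
 * and set ON if a vector was posted while SN was set. */
static void sketch_load(struct sketch_pi_desc *d, bool with_fix)
{
    atomic_fetch_and(&d->control, ~SKETCH_SN);
    if (!with_fix)
        return;
    /* Full fence between clearing SN and reading the bitmap. */
    atomic_thread_fence(memory_order_seq_cst);
    if (!sketch_pir_empty(d))
        atomic_fetch_or(&d->control, SKETCH_ON);
}

/* Models the sync step: PIR is only examined when ON is set.
 * Returns true if at least one pending vector was picked up. */
static bool sketch_sync(struct sketch_pi_desc *d)
{
    bool any = false;

    if (!(atomic_load(&d->control) & SKETCH_ON))
        return false;
    atomic_fetch_and(&d->control, ~SKETCH_ON);
    for (int i = 0; i < SKETCH_WORDS; i++)
        if (atomic_exchange(&d->pir[i], 0))
            any = true;
    return any;
}

/* Runs the sequence preempt -> post -> load -> sync and reports whether
 * the posted vector was delivered. */
static bool sketch_run(bool with_fix)
{
    struct sketch_pi_desc d = {0};

    atomic_fetch_or(&d.control, SKETCH_SN); /* vCPU preempted */
    sketch_post(&d, 0x31);                  /* interrupt arrives */
    sketch_load(&d, with_fix);              /* vCPU scheduled back in */
    return sketch_sync(&d);
}
```

Under these assumptions, sketch_run(false) models the behaviour before the
patch: the vector stays in PIR with ON clear, so the sync step skips it.
sketch_run(true) models the re-check added at the end of vmx_vcpu_pi_load,
after which the sync step picks the vector up.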
