On 28/07/2017 04:31, Longpeng (Mike) wrote:
> Hi Paolo,
> 
> On 2017/6/6 18:57, Paolo Bonzini wrote:
> 
>> In some cases, for example involving hot-unplug of assigned
>> devices, pi_post_block can forget to remove the vCPU from the
>> blocked_vcpu_list.  When this happens, the next call to
>> pi_pre_block corrupts the list.
>>
>> Fix this in two ways.  First, check vcpu->pre_pcpu in pi_pre_block
>> and WARN instead of adding the element twice in the list.  Second,
>> always do the list removal in pi_post_block if vcpu->pre_pcpu is
>> set (not -1).
>>
>> The new code keeps interrupts disabled for the whole duration of
>> pi_pre_block/pi_post_block.  This is not strictly necessary, but
>> easier to follow.  For the same reason, PI.ON is checked only
>> after the cmpxchg, and to handle it we just call the post-block
>> code.  This removes duplication of the list removal code.
>>
>> Cc: Longpeng (Mike) <longpe...@huawei.com>
>> Cc: Huangweidong <weidong.hu...@huawei.com>
>> Cc: Gonglei <arei.gong...@huawei.com>
>> Cc: wangxin <wangxinxin.w...@huawei.com>
>> Cc: Radim Krčmář <rkrc...@redhat.com>
>> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
>> ---
>>  arch/x86/kvm/vmx.c | 62 ++++++++++++++++++++++--------------------------------
>>  1 file changed, 25 insertions(+), 37 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 747d16525b45..0f4714fe4908 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11236,10 +11236,11 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>      struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>>      struct pi_desc old, new;
>>      unsigned int dest;
>> -    unsigned long flags;
>>  
>>      do {
>>              old.control = new.control = pi_desc->control;
>> +            WARN(old.nv != POSTED_INTR_WAKEUP_VECTOR,
>> +                 "Wakeup handler not enabled while the VCPU is blocked\n");
>>  
>>              dest = cpu_physical_id(vcpu->cpu);
>>  
>> @@ -11256,14 +11257,10 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>      } while (cmpxchg(&pi_desc->control, old.control,
>>                      new.control) != old.control);
>>  
>> -    if(vcpu->pre_pcpu != -1) {
>> -            spin_lock_irqsave(
>> -                    &per_cpu(blocked_vcpu_on_cpu_lock,
>> -                    vcpu->pre_pcpu), flags);
>> +    if (!WARN_ON_ONCE(vcpu->pre_pcpu == -1)) {
> 
> 
> __pi_post_block is only called by pi_post_block/pi_pre_block now, it seems
> that both of them would make sure "vcpu->pre_pcpu != -1" before
> __pi_post_block is called, so maybe the above check is useless, right?

It's because a WARN is better than a double-add.  And even if the caller
broke the invariant you'd have to do the cmpxchg loop above to make
things not break too much.

Paolo
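
For readers following the thread, the defensive check discussed above can
be illustrated with a small userspace sketch. It is not the kernel code:
struct toy_vcpu, toy_pre_block(), toy_post_block(), toy_warn_on() and
NR_CPUS are hypothetical names made up for this example, and the
posted-interrupt descriptor, the cmpxchg loop and the per-CPU spinlock
are left out entirely. It only shows why re-checking pre_pcpu before
touching the list is cheap insurance: a broken caller produces a warning
instead of a corrupted list.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS 4                 /* hypothetical CPU count for the sketch */

struct list_node { struct list_node *prev, *next; };

struct toy_vcpu {
	int pre_pcpu;             /* -1 means "not on any blocked list" */
	struct list_node link;
};

struct list_node blocked[NR_CPUS];    /* one circular list head per CPU */
int warnings;                         /* number of invariant violations seen */

void toy_init(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		blocked[i].prev = blocked[i].next = &blocked[i];
	warnings = 0;
}

/* Stand-in for WARN_ON(): report the problem and return the condition. */
int toy_warn_on(int cond, const char *msg)
{
	if (cond) {
		warnings++;
		fprintf(stderr, "WARN: %s\n", msg);
	}
	return cond;
}

void toy_pre_block(struct toy_vcpu *v, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* Warn and bail out rather than adding the element twice. */
	if (toy_warn_on(v->pre_pcpu != -1, "vCPU already on a blocked list"))
		return;
	v->link.next = blocked[cpu].next;
	v->link.prev = &blocked[cpu];
	blocked[cpu].next->prev = &v->link;
	blocked[cpu].next = &v->link;
	v->pre_pcpu = cpu;
}

void toy_post_block(struct toy_vcpu *v)
{
	/* Warn and bail out rather than unlinking an element that is not linked. */
	if (toy_warn_on(v->pre_pcpu == -1, "vCPU not on any blocked list"))
		return;
	v->link.prev->next = v->link.next;
	v->link.next->prev = v->link.prev;
	v->pre_pcpu = -1;
}

/* Count the entries on one CPU's blocked list. */
int toy_list_len(int cpu)
{
	int n = 0;
	for (struct list_node *p = blocked[cpu].next; p != &blocked[cpu]; p = p->next)
		n++;
	return n;
}
```

In this sketch, calling toy_pre_block() twice on the same vCPU leaves
exactly one entry on the list and bumps the warnings counter once; a
second toy_post_block() is likewise reduced to a warning.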
