On 27/03/20 16:04, Peter Xu wrote:
> That makes perfect sense to me, however... What if the vcpu generates
> dirty bits _after_ we do KVM_GET_DIRTY_LOG but _before_ we send
> KVM_SET_USER_MEMORY_REGION to remove the memslot?  If the vcpu is in
> the userspace I think it's fine because BQL is needed so it won't be
> able to, however the vcpus running inside KVM should not be restricted
> to that.  I think those dirty bits will still get lost, but it should
> be extremely hard to trigger.

Yes, you've found a bug.

> I'm not sure whether I missed something above, but if I'm correct, I
> think the solution should be a flag for KVM_SET_USER_MEMORY_REGION to
> set the memslot as invalid (KVM_MEM_INVALID), then when removing the
> memslot which has KVM_MEM_LOG_DIRTY_PAGES enabled, we should:
> 
>   - send KVM_SET_USER_MEMORY_REGION with KVM_MEM_INVALID to invalidate
>     the memslot, but keep the slot and bitmap in KVM
> 
>   - send KVM_GET_DIRTY_LOG to fetch the bitmap for the slot
> 
>   - send KVM_SET_USER_MEMORY_REGION with size==0 to remove the slot

Or KVM_MEM_READONLY.
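
Roughly, from the QEMU side it could look like this (an untested sketch with
error handling omitted; it assumes KVM accepts a slot that is both read-only
and dirty-logged for the short window before deletion):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: quiesce a dirty-logged memslot before deleting it.
 * "bitmap" must be large enough for the slot's dirty bitmap. */
static void delete_dirty_logged_slot(int vm_fd, __u32 slot_id, __u64 gpa,
                                     __u64 size, __u64 hva, void *bitmap)
{
    struct kvm_userspace_memory_region region = {
        .slot            = slot_id,
        .flags           = KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY,
        .guest_phys_addr = gpa,
        .memory_size     = size,
        .userspace_addr  = hva,
    };
    struct kvm_dirty_log log = {
        .slot         = slot_id,
        .dirty_bitmap = bitmap,
    };

    /* 1. Make the slot read-only so vCPUs stop generating dirty bits. */
    ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);

    /* 2. Fetch the final dirty bitmap while the slot still exists. */
    ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);

    /* 3. Only now delete the slot. */
    region.flags = 0;
    region.memory_size = 0;
    ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

The read-only step closes the window you describe: between steps 2 and 3 the
vCPUs can no longer write to the slot, so no dirty bits can be lost.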

> However I don't know whether that'll worth it.

Yes, especially in the light of the dirty ring issue below.

> (Side question which is irrelevant to this: for kvm dirty ring we now
>  need to do similar thing to flush dirty bits before removing a
>  memslot, however that's even trickier because flushing dirty ring
>  needs to kick all vcpu out, currently the RFC series is using
>  run_on_cpu() which will release the BQL and wait for all vcpus to
>  quit into userspace, however that cannot be done inside
>  kvm_set_phys_mem because it needs the BQL.  I'm still thinking about
>  a good way to fix this, but any idea is greatly welcomed :)

The problem here is also that the GFN is not a unique identifier of the
QEMU ram_addr_t.  However, you don't really need to kick all vCPUs out,
do you?  You can protect the dirty ring with its own per-vCPU mutex and
harvest the pages from the main thread.
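
Something along these lines, just to sketch the locking (DirtyRingVCPU and
dirty_ring_collect_one are made-up placeholders for whatever the RFC uses):

#include <pthread.h>

/* Each vCPU's ring gets its own mutex; whoever holds it may harvest. */
typedef struct DirtyRingVCPU {
    pthread_mutex_t ring_lock;   /* serializes access to this vCPU's ring */
    void *ring;                  /* the mmap'ed per-vCPU dirty ring */
} DirtyRingVCPU;

/* Placeholder: walk the ring and transfer its entries into QEMU's
 * dirty bitmap. */
static void dirty_ring_collect_one(DirtyRingVCPU *cpu)
{
    (void)cpu;
    /* ... read the dirty GFN entries and mark them collected ... */
}

/* Callable from the vCPU thread (on a ring-full exit) or from the main
 * thread, e.g. inside kvm_set_phys_mem before removing a memslot. */
static void dirty_ring_harvest(DirtyRingVCPU *cpu)
{
    pthread_mutex_lock(&cpu->ring_lock);
    dirty_ring_collect_one(cpu);
    pthread_mutex_unlock(&cpu->ring_lock);
}

The point is that the mutex, not "the vCPU has exited to userspace",
provides the exclusion, so the main thread can harvest while holding the
BQL and without run_on_cpu().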

Paolo

