Hi all,

After applying the patch below, the time taken by
memory_global_dirty_log_stop() for a 4T memory guest drops to
milliseconds, but I'm not sure whether the patch will trigger other
problems. Does this patch make sense?

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 464da93..fe26ee5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8313,6 +8313,8 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				enum kvm_mr_change change)
 {
 	int nr_mmu_pages = 0;
+	int i;
+	struct kvm_vcpu *vcpu;
 
 	if (!kvm->arch.n_requested_mmu_pages)
 		nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
@@ -8328,14 +8330,15 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * in the source machine (for example if live migration fails), small
 	 * sptes will remain around and cause bad performance.
 	 *
-	 * Scan sptes if dirty logging has been stopped, dropping those
-	 * which can be collapsed into a single large-page spte. Later
-	 * page faults will create the large-page sptes.
+	 * Reset each vcpu's mmu, then page faults will create the large-page
+	 * sptes later.
 	 */
 	if ((change != KVM_MR_DELETE) &&
 		(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-		kvm_mmu_zap_collapsible_sptes(kvm, new);
+		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			kvm_mmu_reset_context(vcpu);
+	}
 
 	/*
 	 * Set up write protection and/or dirty logging for the new slot.

> * Yang Hongyang (yanghongy...@huawei.com) wrote:
> >
> >
> > On 2017/4/24 20:06, Juan Quintela wrote:
> > > Yang Hongyang <yanghongy...@huawei.com> wrote:
> > >> Hi all,
> > >>
> > >> We found dirty log switch costs more then 13 seconds while
> > >> migrating a 4T memory guest, and dirty log switch is currently
> > >> protected by QEMU BQL. This causes guest freeze for a long time
> > >> when switching dirty log on, and the migration downtime is unacceptable.
> > >> Are there any chance to optimize the time cost for dirty log switch operation?
> > >> Or move the time consuming operation out of the QEMU BQL?
> > >
> > > Hi
> > >
> > > Could you specify what do you mean by dirty log switch?
> > > The one inside kvm?
> > > The merge between kvm one and migration bitmap?
> >
> > The call of the following functions:
> > memory_global_dirty_log_start/stop();
>
> I suppose there's a few questions;
> a) Do we actually need the BQL - and if so why
> b) What actually takes 13s? It's probably worth figuring out where it
> goes, the whole bitmap is only 1GB isn't it even on a 4TB machine, and
> even the simplest way to fill that takes way less than 13s.
>
> Dave
>
> >
> > >
> > > Thanks, Juan.
> > >
> >
> > --
> > Thanks,
> > Yang
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Regards,
Jay Zhou
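
P.S. On Dave's point (b), a rough size check for the dirty bitmap. This
is only a sketch: it assumes one bit per guest page and 4 KiB pages,
ignores any per-memslot alignment padding, and dirty_bitmap_bytes() is a
hypothetical helper written for this mail, not a QEMU or KVM function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a dirty bitmap that tracks guest memory at one bit
 * per page. Assumption: no per-slot padding or alignment is added.
 */
static uint64_t dirty_bitmap_bytes(uint64_t guest_bytes, uint64_t page_size)
{
    assert(page_size != 0);

    /* Number of pages, rounded up to cover a partial last page. */
    uint64_t pages = (guest_bytes + page_size - 1) / page_size;

    /* Eight pages are tracked per byte, rounded up. */
    return (pages + 7) / 8;
}
```

Under those assumptions, dirty_bitmap_bytes(4ULL << 40, 4096) comes out
at 2^27 bytes, i.e. 128 MiB for a 4 TiB guest. That is below the 1GB
figure mentioned above, so the bitmap size alone does not seem to explain
a 13s stall.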