Hi Zhoujian,

2017-05-17 10:20 GMT+08:00 Zhoujian (jay) <jianjay.z...@huawei.com>:
> Hi Wanpeng,
>
>> > On 11/05/2017 14:07, Zhoujian (jay) wrote:
>> >> -  * Scan sptes if dirty logging has been stopped, dropping those
>> >> -  * which can be collapsed into a single large-page spte.  Later
>> >> -  * page faults will create the large-page sptes.
>> >> +  * Reset each vcpu's mmu, then page faults will create the large-page
>> >> +  * sptes later.
>> >>    */
>> >>   if ((change != KVM_MR_DELETE) &&
>> >>           (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
>> >> -         !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
>> >> -         kvm_mmu_zap_collapsible_sptes(kvm, new);
>>
>> This is an unlikely branch (unless guest live migration fails and the guest
>> continues to run on the source machine) rather than a hot path. Do you have
>> any performance numbers for your real workloads?
>>
>
> Sorry to bother you again.
>
> Recently, I have tested the performance before migration and after migration
> failure using spec cpu2006 (https://www.spec.org/cpu2006/), which is a
> standard performance evaluation tool.
>
> These are the results:
> ******
> Before migration the score is 153, and the TLB miss statistics of the
> qemu process are:
>
> linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses, \
>   dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10
>
> Performance counter stats for process id '26463':
>
>            698,938      dTLB-load-misses     #  0.13% of all dTLB cache hits  (50.46%)
>        543,303,875      dTLB-loads                                            (50.43%)
>            199,597      dTLB-store-misses                                     (16.51%)
>         60,128,561      dTLB-stores                                           (16.67%)
>             69,986      iTLB-load-misses     #  6.17% of all iTLB cache hits  (16.67%)
>          1,134,097      iTLB-loads                                            (33.33%)
>
>       10.000684064 seconds time elapsed
>
> After migration failure the score is 149, and the TLB miss statistics of
> the qemu process are:
>
> linux-sjrfac:/mnt/zhoujian # perf stat -e dTLB-load-misses,dTLB-loads,dTLB-store-misses, \
>   dTLB-stores,iTLB-load-misses,iTLB-loads -p 26463 sleep 10
>
> Performance counter stats for process id '26463':
>
>            765,400      dTLB-load-misses     #  0.14% of all dTLB cache hits  (50.50%)
>        540,972,144      dTLB-loads                                            (50.47%)
>            207,670      dTLB-store-misses                                     (16.50%)
>         58,363,787      dTLB-stores                                           (16.67%)
>            109,772      iTLB-load-misses     #  9.52% of all iTLB cache hits  (16.67%)
>          1,152,784      iTLB-loads                                            (33.32%)
>
>       10.000703078 seconds time elapsed
> ******
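For reference, with the patch applied the dirty-logging-stopped path no longer
calls kvm_mmu_zap_collapsible_sptes(); it issues
kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD), and each vCPU then rebuilds
its mappings through later page faults. An abridged sketch of how that request
is consumed on the vCPU side in the 4.x-era x86 code (the wrapper name is
hypothetical and the flow is simplified, not the exact upstream code):

    /*
     * Abridged sketch: on the next entry the vCPU sees the reload request,
     * drops its current MMU roots, and allocates fresh roots; the guest
     * then re-faults its working set, so the sptes are rebuilt and can use
     * large pages again.
     */
    static int consume_mmu_reload_sketch(struct kvm_vcpu *vcpu)
    {
            if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
                    kvm_mmu_unload(vcpu);        /* drop current roots/sptes */

            /* later, shortly before entering the guest: */
            return kvm_mmu_reload(vcpu);         /* allocate fresh roots */
    }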
Could you comment out the original "lazy collapse small sptes into large
sptes" code in the function kvm_arch_commit_memory_region() and post the
results here?

Regards,
Wanpeng Li

>
> These are the steps:
> ======
> (1) The version of kmod is 4.4.11 (slightly modified) and the version of
>     qemu is 2.6.0 (slightly modified); the kmod is applied with the
>     following patch, according to Paolo's advice:
>
> diff --git a/source/x86/x86.c b/source/x86/x86.c
> index 054a7d3..75a4bb3 100644
> --- a/source/x86/x86.c
> +++ b/source/x86/x86.c
> @@ -8550,8 +8550,10 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>           */
>          if ((change != KVM_MR_DELETE) &&
>                  (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
> -                !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
> -                kvm_mmu_zap_collapsible_sptes(kvm, new);
> +                !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
> +                printk(KERN_ERR "zj make KVM_REQ_MMU_RELOAD request\n");
> +                kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
> +        }
>
>          /*
>           * Set up write protection and/or dirty logging for the new slot.
>
> (2) I started up a memory-preoccupied 10G VM (suse11sp3), which means its
>     "RES column" in top is 10G, in order to set up the EPT table in advance.
> (3) Then I ran the test case 429.mcf of spec cpu2006 before migration and
>     after migration failure. 429.mcf is a memory-intensive workload, and the
>     migration failure is constructed deliberately with the following patch
>     to qemu:
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 5d725d0..88dfc59 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -625,6 +625,9 @@ static void process_incoming_migration_co(void *opaque)
>                        MIGRATION_STATUS_ACTIVE);
>      ret = qemu_loadvm_state(f);
>
> +    // deliberately construct the migration failure
> +    exit(EXIT_FAILURE);
> +
>      ps = postcopy_state_get();
>      trace_process_incoming_migration_co_end(ret, ps);
>      if (ps != POSTCOPY_INCOMING_NONE) {
> ======
>
> Results of the score and TLB miss rate are almost the same, and I am
> confused. May I ask which tool you use to evaluate the performance?
> And if my test steps are wrong, please let me know. Thank you.
>
> Regards,
> Jay Zhou
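For concreteness, "commenting out the lazy collapse" would presumably amount to
something like the untested sketch below against the same kmod 4.4.11 source:
keep the branch in kvm_arch_commit_memory_region(), but let it do nothing, so
that neither kvm_mmu_zap_collapsible_sptes() nor the KVM_REQ_MMU_RELOAD request
runs when dirty logging is stopped and the small sptes are simply left in place
as a baseline:

        /*
         * Untested sketch of the requested baseline: the condition is kept,
         * but both the original zap and the patched MMU-reload request are
         * commented out, so nothing happens when dirty logging is stopped.
         */
        if ((change != KVM_MR_DELETE) &&
            (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
            !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
                /* kvm_mmu_zap_collapsible_sptes(kvm, new); */
                /* kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD); */
        }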