Hello, I used Juan's latest bits to try out live migration on large guest configurations. Workload: a slightly modified SpecJBB.
The following is some very preliminary data (for the non-XBZRLE case). FYI... Vinod

Configuration:
------------------
Source & Target Host Hardware: 8-socket Westmere + 1TB each, HT off.
(10Gb back-to-back connection dedicated for live migration traffic)
Source & Target Host OS: 3.5-rc6+ (picked from kvm.git)
Guest OS: 3.4.1
qemu: git://repo.or.cz/qemu/quintela.git -b migration-next-v3

(A rough QMP equivalent of the monitor settings used below is appended at the end of this mail.)

----

1) 10 VCPUs / 128GB guest
------------------------
(qemu) migrate_set_speed 10G
(qemu) migrate_set_downtime 2

a) Idle guest:
transferred ram: 2841554 kbytes
total ram: 134226368 kbytes
total time: 119350 milliseconds
Number of pre-copy iterations: 1766
Stage_3_time: ~3271 ms

b) SpecJBB (10 warehouse threads made to run for 10 mins):
transferred ram: 236383717 kbytes
total ram: 134226368 kbytes
total time: 619110 milliseconds
Number of pre-copy iterations: 145515
Stage_3_time: ~3469 ms

2) 20 VCPUs / 256GB guest
------------------------
(qemu) migrate_set_speed 10G
(qemu) migrate_set_downtime 2

a) Idle guest:
transferred ram: 5257340 kbytes
total ram: 268444096 kbytes
total time: 256496 milliseconds
Number of pre-copy iterations: 3379
Stage_3_time: 4281 ms

b) SpecJBB (20 warehouse threads made to run for 10 mins):
transferred ram: 151607814 kbytes
total ram: 268444096 kbytes
total time: 653578 milliseconds
Number of pre-copy iterations: 28433
Stage_3_time: ~2670 ms

3) 40 VCPUs / 512GB guest
------------------------
(qemu) migrate_set_speed 10G
(qemu) migrate_set_downtime 2

a) Idle guest:
transferred ram: 9534968 kbytes
total ram: 536879552 kbytes
total time: 665557 milliseconds
Number of pre-copy iterations: 11541
Stage_3_time: ~6210 ms

b) SpecJBB (40 warehouse threads made to run for 10 mins):
transferred ram: 47845021 kbytes
total ram: 536879552 kbytes
total time: 760423 milliseconds
Number of pre-copy iterations: 15963
Stage_3_time: ~6180 ms

------

Note 1: The Stage 3 time (aka "down" time) listed above is an approximation; I measured the time spent in the ram_save_complete() routine. Notice that the Stage 3 duration exceeded the 2 second downtime limit that was specified. The Stage 3 time also seems to vary quite a bit from run to run with the same configuration/workload.

Note 2: SpecJBB was modified to run for a 10 minute duration with a fixed number of warehouse threads and a 24GB heap. [No NUMA or other tuning was attempted for these runs.]

Some observations:

- In all cases the live guest migration converged only after the workload had finished running.

- I did not observe any guest freezes during Stage 2 (which happened a lot with the earlier versions). The ssh sessions to the guest stayed up for the entire duration. (I forgot to run a ping to the guest; I will do that next time.)

- Although SpecJBB was not run as a typical benchmark here, only as a sample workload, I did compare its throughput (Bops) during live migration against a normal run in a guest of the same size: during live guest migration the Bops dropped by ~20%. This needs further analysis. I am curious to hear what performance impact others have experienced with either SpecJBB or other workloads during KVM live guest migration. [As was suggested earlier, perhaps having a migration thread along with other optimizations around dirty page tracking would help.]

- Migration (TX) traffic [observed via the "iftop" utility] ranged between 1.5Gb/s and 3.0Gb/s (occasionally a bit higher) over the dedicated 10Gb link. IOW, the dedicated link was nowhere near saturated to line rate. (A sampling sketch for logging this more precisely follows below.)
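To quantify that last point on the next run, a small sampler along the lines below could log the TX rate of the migration NIC for the whole migration instead of watching iftop. This is only a sketch; the interface name and sampling interval are placeholders I picked, not details from the setup above.

#!/usr/bin/env python3
# Minimal TX-rate sampler for the dedicated migration NIC, to capture a time
# series of the migration traffic instead of eyeballing iftop.
# Placeholders (not from the setup in this mail): interface name, interval.
import time

IFACE = "eth2"        # assumed name of the dedicated 10Gb interface
INTERVAL = 2.0        # seconds between samples

def tx_bytes(iface):
    """Read the cumulative TX byte counter from sysfs."""
    with open("/sys/class/net/%s/statistics/tx_bytes" % iface) as f:
        return int(f.read())

prev = tx_bytes(IFACE)
while True:
    time.sleep(INTERVAL)
    cur = tx_bytes(IFACE)
    rate_gbit = (cur - prev) * 8 / INTERVAL / 1e9
    print("%s  %.2f Gbit/s" % (time.strftime("%H:%M:%S"), rate_gbit))
    prev = cur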
I need to investigate this low link utilization further, i.e. I am not sure whether it is all due to the overhead of tracking dirty pages or something else. Any ideas?
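As mentioned in the configuration section above, here is a rough QMP equivalent of the monitor settings used for these runs; it also polls query-migrate every few seconds so the progress numbers do not have to be read off the monitor by hand. This is only a sketch: the QMP socket path, the destination URI and the polling interval are placeholders I made up, and the command/field names should be double-checked against the qemu version actually used.

#!/usr/bin/env python3
# Sketch of a QMP driver: applies the same speed/downtime settings as the
# monitor commands above, starts the migration, and logs query-migrate output
# until the migration finishes.
# Assumptions (not from this mail): QMP socket path, destination URI, interval.
import json
import socket
import time

QMP_SOCK = "/tmp/qmp-src.sock"       # assumes -qmp unix:/tmp/qmp-src.sock,server,nowait
DEST_URI = "tcp:192.168.100.2:4444"  # assumed address on the dedicated 10Gb link

def qmp_cmd(chan, name, args=None):
    """Send one QMP command and return its reply, skipping async events."""
    cmd = {"execute": name}
    if args is not None:
        cmd["arguments"] = args
    chan.write(json.dumps(cmd) + "\n")
    chan.flush()
    while True:
        reply = json.loads(chan.readline())
        if "return" in reply or "error" in reply:
            return reply

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect(QMP_SOCK)
chan = sock.makefile("rw")
chan.readline()                                                # QMP greeting banner
qmp_cmd(chan, "qmp_capabilities")
qmp_cmd(chan, "migrate_set_speed", {"value": 10 * 1024 ** 3})  # 10G, in bytes/sec
qmp_cmd(chan, "migrate_set_downtime", {"value": 2})            # 2 second target
qmp_cmd(chan, "migrate", {"uri": DEST_URI})

start = time.time()
while True:
    info = qmp_cmd(chan, "query-migrate")["return"]
    ram = info.get("ram", {})   # QMP reports these in bytes (HMP "info migrate" shows kbytes)
    print("%7.1fs  status=%-10s  transferred=%s  remaining=%s  total=%s"
          % (time.time() - start, info.get("status"),
             ram.get("transferred"), ram.get("remaining"), ram.get("total")))
    if info.get("status") in ("completed", "failed", "cancelled"):
        break
    time.sleep(5)

Running something like this alongside the NIC sampler above would make it easier to correlate the pre-copy iterations with the observed link utilization from run to run.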