[ Sorry I've lost this thread with email setup changes on my side; catching up ]
On Tue, Mar 15, 2016 at 06:50:45PM +0530, Jitendra Kolhe wrote:
> On 3/11/2016 8:09 PM, Jitendra Kolhe wrote:
> > Here is what
> >I tried, let’s say we have 3 versions of qemu (below timings are for
> >16GB idle guest with 12GB ballooned out)
> >
> >v1. Unmodified qemu – absolutely not code change – Total Migration time
> >= ~7600ms (I rounded this one to ~8000ms)
> >v2. Modified qemu 1 – with proposed patch set (which skips both zero
> >pages scan and migrating control information for ballooned out pages) -
> >Total Migration time = ~5700ms
> >v3. Modified qemu 2 – only with changes to save_zero_page() as discussed
> >in previous mail (and of course using proposed patch set only to
> >maintain bitmap for ballooned out pages) – Total migration time is
> >irrelevant in this case.
> >Total Zero page scan time = ~1789ms
> >Total (save_page_header + qemu_put_byte(f, 0)) = ~556ms.
> >Everything seems to add up here (may not be exact) – 5700+1789+559 =
> >~8000ms
> >
> >I see 2 factors that we have not considered in this add up a. overhead
> >for migrating balloon bitmap to target and b. as you mentioned below
> >overhead of qemu_clock_get_ns().
>
> Missed one more factor of testing each page against balloon bitmap during
> migration, which is consuming around ~320ms for same configuration. If we
> remove this overhead which is introduced by proposed patch set from above
> calculation we almost get total migration time for unmodified qemu
> (5700-320+1789+559=~7700ms)

I'm a bit lost in the numbers you quote, so let me try a
back-of-the-envelope calculation.

First off, the way you identify pages that don't need to be sent is
basically orthogonal to how you optimize the protocol to send them. So
teaching is_zero_range() to consult the map of unmapped or ballooned-out
pages looks like low-hanging fruit that may reduce the migration time by
avoiding scanning the memory, without protocol changes.
[And vice versa, if sending the zero page bitmap brought such a big
benefit, it would make sense to apply it to pages found by scanning, too.]

Now regarding the protocol:

- as a first approximation, let's speak in terms of transferred data size

- consider a VM using 1/10 of its memory (I think this can be considered
  an extreme of over-provisioning)

- a whiteout is 3 decimal orders of magnitude smaller than a page, so with
  zero pages replaced by whiteouts (current protocol) the overall
  transferred data size for zero pages is on the order of a percent of the
  total transferred data size

- a zero page bitmap would reduce that further by a couple of orders of
  magnitude

So, if this calculation is not totally off, extending the protocol to use
zero page bitmaps is unlikely to give an improvement of more than a percent
or so.

I'm not sure that pays for the extra code paths and incompatible protocol
changes...

Roman.
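
P.S. For what it's worth, here is the estimate above spelled out as a
small Python sketch. The figures in it are my assumptions, not
measurements: 4 KiB pages, a whiteout of about 9 bytes (an 8-byte page
header plus one byte, ignoring any block name), a bitmap costing one bit
per guest page, and a 16 GiB guest with 1/10 of its memory in use. The
function name is made up for illustration.

```python
PAGE_SIZE = 4096      # bytes; assumes 4 KiB target pages
WHITEOUT_SIZE = 9     # bytes per zero-page marker; assumed 8-byte header + 1 byte


def zero_page_overhead(total_pages, used_fraction):
    """Return the share of the stream spent describing zero pages,
    first with per-page whiteouts, then with a one-bit-per-page bitmap."""
    used_pages = int(total_pages * used_fraction)
    zero_pages = total_pages - used_pages

    payload = used_pages * PAGE_SIZE        # non-zero pages sent in full
    whiteouts = zero_pages * WHITEOUT_SIZE  # one marker per zero page
    bitmap = total_pages / 8                # one bit per guest page

    whiteout_share = whiteouts / (payload + whiteouts)
    bitmap_share = bitmap / (payload + bitmap)
    return whiteout_share, bitmap_share


total_pages = 16 * 2**30 // PAGE_SIZE       # 16 GiB guest
whiteout_share, bitmap_share = zero_page_overhead(total_pages, 0.1)
print(f"whiteouts: {whiteout_share:.2%} of transferred data")
print(f"bitmap:    {bitmap_share:.2%} of transferred data")
```

With these assumptions the whiteouts come out at about 2% of the
transferred data and the bitmap at about 0.03%, so the bitmap could save
at most a couple of percent of the data size. This only counts bytes on
the wire; it says nothing about the time spent scanning memory.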