On Mon, 10 May 2010 15:24:20 -0500 Anthony Liguori <anth...@codemonkey.ws> wrote:
> On 04/13/2010 04:33 AM, Izik Eidus wrote: > > From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00 > > 2001 From: Izik Eidus<iei...@redhat.com> > > Date: Tue, 13 Apr 2010 12:24:57 +0300 > > Subject: [PATCH] fix migration with large mem > > > > In cases of guests with large mem that have pages > > that all their bytes content are the same, we will > > spend alot of time reading the memory from the guest > > (is_dup_page()) > > > > It is happening beacuse ram_save_live() function have > > limit of how much we can send to the dest but not how > > much we read from it, and in cases we have many is_dup_page() > > hits, we might read huge amount of data without updating important > > stuff like the timers... > > > > The guest lose all its repsonsibility and have many softlock ups > > inside itself. > > > > this patch add limit on the size we can read from the guest each > > iteration. > > > > Thanks. > > > > Signed-off-by: Izik Eidus<iei...@redhat.com> > > --- > > arch_init.c | 6 +++++- > > 1 files changed, 5 insertions(+), 1 deletions(-) > > > > diff --git a/arch_init.c b/arch_init.c > > index cfc03ea..e27b1a0 100644 > > --- a/arch_init.c > > +++ b/arch_init.c > > @@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH; > > #define RAM_SAVE_FLAG_PAGE 0x08 > > #define RAM_SAVE_FLAG_EOS 0x10 > > > > +#define MAX_SAVE_BLOCK_READ 10 * 1024 * 1024 > > + > > static int is_dup_page(uint8_t *page, uint8_t ch) > > { > > uint32_t val = ch<< 24 | ch<< 16 | ch<< 8 | ch; > > @@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, > > int stage, void *opaque) uint64_t bytes_transferred_last; > > double bwidth = 0; > > uint64_t expected_time = 0; > > + int data_read = 0; > > > > if (stage< 0) { > > cpu_physical_memory_set_dirty_tracking(0); > > @@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, > > int stage, void *opaque) bytes_transferred_last = bytes_transferred; > > bwidth = qemu_get_clock_ns(rt_clock); > > > > - while (!qemu_file_rate_limit(f)) { > > + while (!qemu_file_rate_limit(f)&& data_read< > > MAX_SAVE_BLOCK_READ) { > > The effect of this patch is that we'll never send more than 10mb/s > during live migration? If so, it's totally wrong as a fix to the > problem. It is 100mb/s... (if I remember correct the migration code will run this thing 10 times for each iteration) My feeling is that limit it with the network 32mb/s limit is too low, reading memory for 100mb/s is not such a problem as long as we don`t read gigas out of memory every sec... (Still we want to optimize the billion of zeros cases of windows guests) Anyway if the above does not make sense to you, I will just change it into what you suggested So ? > > It would be better to account the deduplicated pages as part of the > rate limiting calculations. > > Regards, > > Anthony Liguori > > > int ret; > > > > ret = ram_save_block(f); > > + data_read += ret * TARGET_PAGE_SIZE; > > bytes_transferred += ret * TARGET_PAGE_SIZE; > > if (ret == 0) { /* no more blocks */ > > break; > > >