Reproduce the problem: migrate migrate_cancel migrate Error happen for memory migration
The reason as follows: 1. qemu start, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] all set to 1 by a series of cpu_physical_memory_set_dirty_range 2. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero 3. migration data... 4. migrate_cancel, will stop log dirty 5. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero Here RAMBlock.bmap only have new logged dirty pages, don't contain the whole guest pages. Signed-off-by: Ivan Ren <ivan...@tencent.com> --- migration/ram.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 908517fc2b..bbebaee0c1 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3173,11 +3173,11 @@ static int ram_state_init(RAMState **rsp) QSIMPLEQ_INIT(&(*rsp)->src_page_requests); /* + * Count the total number of pages used by ram blocks not including any + * gaps due to alignment or unplugs. * This must match with the initial values of dirty bitmap. - * Currently we initialize the dirty bitmap to all zeros so - * here the total dirty page count is zero. */ - (*rsp)->migration_dirty_pages = 0; + (*rsp)->migration_dirty_pages = ram_bytes_total() >> TARGET_PAGE_BITS; ram_state_reset(*rsp); return 0; @@ -3196,12 +3196,13 @@ static void ram_list_init_bitmaps(void) * The initial dirty bitmap for migration must be set with all * ones to make sure we'll migrate every guest RAM page to * destination. - * Here we didn't set RAMBlock.bmap simply because it is already - * set in ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] in - * ram_block_add, and that's where we'll sync the dirty bitmaps. - * Here setting RAMBlock.bmap would be fine too but not necessary. + * Here we set RAMBlock.bmap all to 1 because when rebegin a + * new migration after a failed migration, ram_list. + * dirty_memory[DIRTY_MEMORY_MIGRATION] don't include the whole + * guest memory. */ block->bmap = bitmap_new(pages); + bitmap_set(block->bmap, 0, pages); if (migrate_postcopy_ram()) { block->unsentmap = bitmap_new(pages); bitmap_set(block->unsentmap, 0, pages); -- 2.17.2 (Apple Git-113)