On 21 July 2017 at 10:13, Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
> I don't fully understand the way memory_region_do_invalidate_mmio_ptr
> works; I see it dropping the memory region; if that's also dropping
> the RAMBlock then it will upset migration. Even if the CPU is stopped
> I don't think that stops the migration thread walking through the list of
> RAMBlocks.
memory_region_do_invalidate_mmio_ptr() calls memory_region_unref(), which
will eventually result in memory_region_finalize() being called, which will
call the MR destructor, which in this case is memory_region_destructor_ram(),
which calls qemu_ram_free() on the RAMBlock, which removes the RAMBlock from
the list (after taking the ramlist lock).

> Even then, the problem is migration keeps a 'dirty_pages' count which is
> calculated at the start of migration and updated as we dirty and send
> pages; if we add/remove a RAMBlock then that dirty_pages count is wrong
> and we either never finish migration (since dirty_pages never reaches
> zero) or finish early with some unsent data.
>
> And then there's the 'received' bitmap currently being added for
> postcopy which tracks each page that's been received (that's not in yet
> though).

It sounds like we really need to make migration robust against RAMBlock
changes -- in the hotplug case it's certainly possible for RAMBlocks to be
newly created or destroyed while migration is in progress.

thanks
-- PMM
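PS: for anyone following along, the teardown chain above can be made concrete
with a minimal C sketch. This is a hypothetical toy model, not QEMU's actual
code -- the struct names, the `ram_free()`/`mr_unref()` helpers and the global
`ram_list` here are simplified stand-ins for qemu_ram_free(),
memory_region_unref()/memory_region_finalize() and the real RAMBlock list
(and it omits the ramlist lock entirely):

```c
#include <stdlib.h>

/* Stand-in for a RAMBlock on the global singly-linked list. */
struct ram_block {
    struct ram_block *next;
    const char *name;
};

static struct ram_block *ram_list;  /* head of the global list */

/* Stand-in for a MemoryRegion with a refcount and a destructor hook. */
struct memory_region {
    int refcount;
    struct ram_block *block;
    void (*destructor)(struct memory_region *);
};

/* Analogue of qemu_ram_free(): unlink the block from the global list.
 * (The real code takes the ramlist lock around this.) */
static void ram_free(struct ram_block *blk)
{
    struct ram_block **p;
    for (p = &ram_list; *p; p = &(*p)->next) {
        if (*p == blk) {
            *p = blk->next;
            break;
        }
    }
    free(blk);
}

/* Analogue of memory_region_destructor_ram(). */
static void mr_destructor_ram(struct memory_region *mr)
{
    ram_free(mr->block);
}

/* Analogue of memory_region_unref()/finalize: dropping the last
 * reference runs the destructor, so the RAMBlock vanishes from the
 * list even while a migration thread may still be walking it. */
static void mr_unref(struct memory_region *mr)
{
    if (--mr->refcount == 0) {
        mr->destructor(mr);
        free(mr);
    }
}
```

The point of the sketch is just that nothing in the unref path consults
migration state: once the refcount hits zero the block is gone from the list,
which is exactly why the dirty_pages accounting quoted above can go stale.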