On Wednesday, July 7, 2021 1:40 AM, Peter Xu wrote:
> On Tue, Jul 06, 2021 at 12:05:49PM +0200, David Hildenbrand wrote:
> > On 06.07.21 11:41, Wang, Wei W wrote:
> > > On Monday, July 5, 2021 9:42 PM, David Hildenbrand wrote:
> > > > On 03.07.21 04:53, Wang, Wei W wrote:
> > > > > On Friday, July 2, 2021 3:07 PM, David Hildenbrand wrote:
> > > > > > On 02.07.21 04:48, Wang, Wei W wrote:
> > > > > > > On Thursday, July 1, 2021 10:22 PM, David Hildenbrand wrote:
> > > > > > > > On 01.07.21 14:51, Peter Xu wrote:
> > > > > >
> > > > > > I think that clearly shows the issue.
> > > > > >
> > > > > > My theory, which I did not verify yet: Assume we have 1GB chunks in
> > > > > > the clear bmap. Assume the VM reports all pages within a 1GB chunk
> > > > > > as free (easy with a fresh VM). While processing hints, we will
> > > > > > clear the bits from the dirty bmap in the RAMBlock. As we will never
> > > > > > migrate any page of that 1GB chunk, we will not actually clear the
> > > > > > dirty bitmap of the memory region. When re-syncing, we will set all
> > > > > > bits in the dirty bmap again from the dirty bitmap in the memory
> > > > > > region. Thus, many of our hints end up being mostly ignored. The
> > > > > > smaller the clear bmap chunk, the more extreme the issue.
> > > > >
> > > > > OK, that looks possible. We need to clear the related bits from the
> > > > > memory region bitmap before skipping the free pages. Could you try
> > > > > with the below patch:
> > > >
> > > > I did a quick test (with the memhog example) and it seems like it
> > > > mostly works. However, we're now doing the bitmap clearing from another
> > > > thread, racing with the migration thread. Especially:
> > > >
> > > > 1. Racing with clear_bmap_set() via
> > > >    cpu_physical_memory_sync_dirty_bitmap()
> > > > 2. Racing with migration_bitmap_clear_dirty()
> > > >
> > > > So that might need some thought, if I'm not wrong.
> > > I think this is similar to the non-clear_bmap case, where rb->bmap is
> > > set or cleared by the migration thread and qemu_guest_free_page_hint.
> > > For example, the migration thread could find a bit set in rb->bmap
> > > before qemu_guest_free_page_hint gets that bit cleared in time. The
> > > result is that the free page is transferred, which isn't necessary, but
> > > won't cause any issue. This is very rare in practice.
> >
> > Here are my concerns regarding races:
> >
> > a) We now end up calling migration_clear_memory_region_dirty_bitmap()
> > without holding the bitmap_mutex. We have to clarify if that is ok. At
> > least migration_bitmap_clear_dirty() holds it *while* clearing the log,
> > and migration_bitmap_sync() holds it while setting bits in the
> > clear_bmap. I think we also have to hold the mutex on the new path.
>
> Makes sense; I think we can let bitmap_mutex protect both the dirty and
> clear bitmaps, and also the dirty pages counter. I'll comment in Wei's
> patch too later.
Btw, what would you think if we change the mutex to QemuSpin? That would
also reduce the overhead, I think.

Best,
Wei