* Denis Plotnikov (dplotni...@virtuozzo.com) wrote:
>
>
> On 27.07.2020 19:48, Dr. David Alan Gilbert wrote:
> > * Denis Plotnikov (dplotni...@virtuozzo.com) wrote:
> ...
> > > +static void page_fault_thread_stop(void)
> > > +{
> > > +    if (page_fault_fd) {
> > > +        close(page_fault_fd);
> > > +        page_fault_fd = 0;
> > > +    }
> > I think you need to do that after you've done the quit and join,
> > otherwise the fault thread might still be reading this.
>
> Seems to be so
>
> > > +    if (thread_quit_fd) {
> > > +        uint64_t val = 1;
> > > +        int ret;
> > > +
> > > +        ret = write(thread_quit_fd, &val, sizeof(val));
> > > +        assert(ret == sizeof(val));
> > > +
> > > +        qemu_thread_join(&page_fault_thread);
> > > +        close(thread_quit_fd);
> > > +        thread_quit_fd = 0;
> > > +    }
> > > +}
> ...
> > >   /**
> > >    * ram_find_and_save_block: finds a dirty page and sends it to f
> > >    *
> > > @@ -1782,6 +2274,7 @@ static int ram_find_and_save_block(RAMState *rs, bool last_stage)
> > >       pss.block = rs->last_seen_block;
> > >       pss.page = rs->last_page;
> > >       pss.complete_round = false;
> > > +    pss.page_copy = NULL;
> > >       if (!pss.block) {
> > >           pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
> > > @@ -1794,11 +2287,30 @@ static int ram_find_and_save_block(RAMState *rs, bool last_stage)
> > >           if (!found) {
> > >               /* priority queue empty, so just search for something dirty */
> > >               found = find_dirty_block(rs, &pss, &again);
> > > +
> > > +            if (found && migrate_background_snapshot()) {
> > > +                /*
> > > +                 * make a copy of the page and
> > > +                 * pass it to the page search status
> > > +                 */
> > > +                int ret;
> > > +                ret = ram_copy_page(pss.block, pss.page, &pss.page_copy);
> > I'm a bit confused about why we hit this; the way I'd thought about your
> > code was we turn on the write faulting, do one big save and then fix up
> > the faults as the save is happening (doing the copies) as the writes
> > hit; so when does this case hit?
>
> To make it more clear, let me draw the whole picture:
>
> When we do a background snapshot, the VM is paused until all of the vmstate
> EXCEPT RAM is saved. RAM isn't written at all at that point; that part of
> the vmstate is saved into a temporary buffer.
>
> Then all of the RAM is marked read-only and the VM is un-paused. Note that
> from this moment all of the VM's vCPUs are running and can touch any part
> of memory.
>
> After that, the migration thread starts writing out the RAM content. Once a
> memory chunk has been written, the write protection is removed for that
> chunk. If a vCPU wants to write to a memory page which is still write
> protected (i.e. hasn't been written yet), the write is intercepted, the
> memory page is copied and queued for writing, and then write access to the
> page is restored. The intention behind this is to let the vCPU work with
> the memory page again as soon as possible.
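
(An aside for readers following the thread: the intercept / copy / queue /
un-protect path Denis describes maps onto the Linux userfaultfd
write-protect API roughly as sketched below. This is only an illustration
of the mechanism, not code from the patch; in particular
queue_protected_page() is a made-up placeholder for whatever the series
uses to hand the copy to the migration thread.)

#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: hand the copied page to the migration thread. */
void queue_protected_page(uint64_t addr, void *copy);

static void handle_wp_fault(int uffd, size_t page_size)
{
    struct uffd_msg msg;

    /* The fault thread blocks here until a vCPU hits a protected page. */
    if (read(uffd, &msg, sizeof(msg)) != sizeof(msg)) {
        return;
    }
    if (msg.event != UFFD_EVENT_PAGEFAULT ||
        !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)) {
        return;
    }

    uint64_t addr = msg.arg.pagefault.address & ~(uint64_t)(page_size - 1);

    /* Copy the page aside and queue the copy for saving. */
    void *copy = malloc(page_size);
    if (!copy) {
        return;
    }
    memcpy(copy, (void *)(uintptr_t)addr, page_size);
    queue_protected_page(addr, copy);

    /* Drop write protection (mode 0) so the vCPU can continue at once. */
    struct uffdio_writeprotect wp = {
        .range = { .start = addr, .len = page_size },
        .mode  = 0,
    };
    ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}
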
So I think I'm confusing this description with the code I'm seeing above.
That code, being in ram_find_and_save_block, makes me think it's calling
ram_copy_page for every page just before writing it - I'm not seeing how
that corresponds to what you're saying about the page being queued when the
CPU tries to write to it.

> Once all the RAM has been written, the rest of the vmstate is written from
> the buffer. This needs to be so because some of the emulated devices saved
> in that buffered vmstate part expect the RAM content to already be
> available when they are loaded.

Right, same type of problem as postcopy.

Dave

>
> I hope this description makes things more clear.
> If not, please let me know and I'll add more details.
>
> Denis
>
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
>
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
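
(For reference, the reordering Dave asks for at the top of the thread -
quit and join the fault thread first, and only then close page_fault_fd -
would look roughly like the sketch below. It only rearranges the code
visible in the quoted hunk and is not the final version of the patch.)

static void page_fault_thread_stop(void)
{
    /* Ask the fault thread to quit and wait for it to finish ... */
    if (thread_quit_fd) {
        uint64_t val = 1;
        int ret;

        ret = write(thread_quit_fd, &val, sizeof(val));
        assert(ret == sizeof(val));

        qemu_thread_join(&page_fault_thread);
        close(thread_quit_fd);
        thread_quit_fd = 0;
    }

    /* ... and only then close the fd it may still be reading from. */
    if (page_fault_fd) {
        close(page_fault_fd);
        page_fault_fd = 0;
    }
}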