On Fri, Nov 20, 2020 at 07:53:34PM +0300, Andrey Gruzdev wrote:
> On 20.11.2020 19:43, Peter Xu wrote:
> > On Fri, Nov 20, 2020 at 07:15:07PM +0300, Andrey Gruzdev wrote:
> > > Yeah, I think we can re-use the postcopy queue code for faulting pages.
> > > I'm worrying a little about some additional overhead dealing with urgent
> > > request semaphore. Also, the code won't change a lot, something like:
> > >
> > > [...]
> > >     /* In case of 'write-tracking' migration we first try
> > >      * to poll UFFD and see if we have write page fault event */
> > >     poll_fault_page(rs);
> > >
> > >     again = true;
> > >     found = get_queued_page(rs, &pss);
> > >
> > >     if (!found) {
> > >         /* priority queue empty, so just search for something dirty */
> > >         found = find_dirty_block(rs, &pss, &again);
> > >     }
> > > [...]
> >
> > Could I ask what's the "urgent request semaphore"?  Thanks,
> >
>
> These functions use it (the correct name is 'rate_limit_sem'):
>
> void migration_make_urgent_request(void)
> {
>     qemu_sem_post(&migrate_get_current()->rate_limit_sem);
> }
>
> void migration_consume_urgent_request(void)
> {
>     qemu_sem_wait(&migrate_get_current()->rate_limit_sem);
> }
>
> They are called from ram_save_queue_pages and unqueue_page, respectively, to
> control the migration rate limiter.
>
> bool migration_rate_limit(void)
> {
> [...]
>         /*
>          * Wait for a delay to do rate limiting OR
>          * something urgent to post the semaphore.
>          */
>         int ms = s->iteration_start_time + BUFFER_DELAY - now;
>         trace_migration_rate_limit_pre(ms);
>         if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
>             /*
>              * We were woken by one or more urgent things but
>              * the timedwait will have consumed one of them.
>              * The service routine for the urgent wake will dec
>              * the semaphore itself for each item it consumes,
>              * so add this one we just eat back.
>              */
>             qemu_sem_post(&s->rate_limit_sem);
>             urgent = true;
>         }
> [...]
> }
>

Hmm... Why would its overhead be a problem?  If it's an overhead that can be
avoided, then postcopy might want that too.

The thing is I really feel like the snapshot logic can leverage the whole idea
of existing postcopy (like get_queued_page/unqueue_page; it's probably due to
the fact that both of them want to "migrate some more urgent pages than the
background migration, due to either missing/wrprotected pages"), but I might
be missing something.

Thanks,

-- 
Peter Xu