* Peter Xu (pet...@redhat.com) wrote: > With preemption mode on, when we see a postcopy request that was requesting > for exactly the page that we have preempted before (so we've partially sent > the page already via PRECOPY channel and it got preempted by another > postcopy request), currently we drop the request so that after all the > other postcopy requests are serviced then we'll go back to precopy stream > and start to handle that. > > We dropped the request because we can't send it via postcopy channel since > the precopy channel already contains partial of the data, and we can only > send a huge page via one channel as a whole. We can't split a huge page > into two channels. > > That's a very corner case and that works, but there's a change on the order > of postcopy requests that we handle since we're postponing this (unlucky) > postcopy request to be later than the other queued postcopy requests. The > problem is there's a possibility that when the guest was very busy, the > postcopy queue can be always non-empty, it means this dropped request will > never be handled until the end of postcopy migration. So, there's a chance > that there's one dest QEMU vcpu thread waiting for a page fault for an > extremely long time just because it's unluckily accessing the specific page > that was preempted before. > > The worst case time it needs can be as long as the whole postcopy migration > procedure. It's extremely unlikely to happen, but when it happens it's not > good. > > The root cause of this problem is because we treat pss->postcopy_requested > variable as with two meanings bound together, as the variable shows: > > 1. Whether this page request is urgent, and, > 2. Which channel we should use for this page request. > > With the old code, when we set postcopy_requested it means either both (1) > and (2) are true, or both (1) and (2) are false. We can never have (1) > and (2) to have different values. > > However it doesn't necessarily need to be like that. It's very legal that > there's one request that has (1) very high urgency, but (2) we'd like to > use the precopy channel. Just like the corner case we were discussing > above. > > To differenciate the two meanings better, introduce a new field called > postcopy_target_channel, showing which channel we should use for this page > request, so as to cover the old meaning (2) only. Then we leave the > postcopy_requested variable to stand only for meaning (1), which is the > urgency of this page request. > > With this change, we can easily boost priority of a preempted precopy page > as long as we know that page is also requested as a postcopy page. So with > the new approach in get_queued_page() instead of dropping that request, we > send it right away with the precopy channel so we get back the ordering of > the page faults just like how they're requested on dest. > > Alongside, I touched up find_dirty_block() to only set the postcopy fields > in the pss section if we're going through a postcopy migration. That's a > very light optimization and shouldn't affect much. > > Reported-by: manish.mis...@nutanix.com > Signed-off-by: Peter Xu <pet...@redhat.com>
So I think this is OK; getting a bit complicated! Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > --- > migration/ram.c | 69 +++++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 55 insertions(+), 14 deletions(-) > > diff --git a/migration/ram.c b/migration/ram.c > index 9d76db8491..fdcd9984fa 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -441,8 +441,28 @@ struct PageSearchStatus { > unsigned long page; > /* Set once we wrap around */ > bool complete_round; > - /* Whether current page is explicitly requested by postcopy */ > + /* > + * [POSTCOPY-ONLY] Whether current page is explicitly requested by > + * postcopy. When set, the request is "urgent" because the dest QEMU > + * threads are waiting for us. > + */ > bool postcopy_requested; > + /* > + * [POSTCOPY-ONLY] The target channel to use to send current page. > + * > + * Note: This may _not_ match with the value in postcopy_requested > + * above. Let's imagine the case where the postcopy request is exactly > + * the page that we're sending in progress during precopy. In this case > + * we'll have postcopy_requested set to true but the target channel > + * will be the precopy channel (so that we don't split brain on that > + * specific page since the precopy channel already contains partial of > + * that page data). > + * > + * Besides that specific use case, postcopy_target_channel should > + * always be equal to postcopy_requested, because by default we send > + * postcopy pages via postcopy preempt channel. > + */ > + bool postcopy_target_channel; > }; > typedef struct PageSearchStatus PageSearchStatus; > > @@ -496,6 +516,9 @@ static QemuCond decomp_done_cond; > static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock > *block, > ram_addr_t offset, uint8_t *source_buf); > > +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss, > + bool postcopy_requested); > + > static void *do_data_compress(void *opaque) > { > CompressParam *param = opaque; > @@ -1516,8 +1539,14 @@ retry: > */ > static bool find_dirty_block(RAMState *rs, PageSearchStatus *pss, bool > *again) > { > - /* This is not a postcopy requested page */ > - pss->postcopy_requested = false; > + if (migration_in_postcopy()) { > + /* > + * This is not a postcopy requested page, mark it "not urgent", and > + * use precopy channel to send it. > + */ > + pss->postcopy_requested = false; > + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY; > + } Do you need the 'if' here? Dave > pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page); > if (pss->complete_round && pss->block == rs->last_seen_block && > @@ -2038,15 +2067,20 @@ static bool get_queued_page(RAMState *rs, > PageSearchStatus *pss) > RAMBlock *block; > ram_addr_t offset; > > -again: > block = unqueue_page(rs, &offset); > > if (block) { > /* See comment above postcopy_preempted_contains() */ > if (postcopy_preempted_contains(rs, block, offset)) { > trace_postcopy_preempt_hit(block->idstr, offset); > - /* This request is dropped */ > - goto again; > + /* > + * If what we preempted previously was exactly what we're > + * requesting right now, restore the preempted precopy > + * immediately, boosting its priority as it's requested by > + * postcopy. > + */ > + postcopy_preempt_restore(rs, pss, true); > + return true; > } > } else { > /* > @@ -2070,7 +2104,9 @@ again: > * really rare. > */ > pss->complete_round = false; > + /* Mark it an urgent request, meanwhile using POSTCOPY channel */ > pss->postcopy_requested = true; > + pss->postcopy_target_channel = RAM_CHANNEL_POSTCOPY; > } > > return !!block; > @@ -2324,7 +2360,8 @@ static bool postcopy_preempt_triggered(RAMState *rs) > return rs->postcopy_preempt_state.preempted; > } > > -static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss) > +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss, > + bool postcopy_requested) > { > PostcopyPreemptState *state = &rs->postcopy_preempt_state; > > @@ -2332,8 +2369,15 @@ static void postcopy_preempt_restore(RAMState *rs, > PageSearchStatus *pss) > > pss->block = state->ram_block; > pss->page = state->ram_page; > - /* This is not a postcopy request but restoring previous precopy */ > - pss->postcopy_requested = false; > + > + /* Whether this is a postcopy request? */ > + pss->postcopy_requested = postcopy_requested; > + /* > + * When restoring a preempted page, the old data resides in PRECOPY > + * slow channel, even if postcopy_requested is set. So always use > + * PRECOPY channel here. > + */ > + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY; > > trace_postcopy_preempt_restored(pss->block->idstr, pss->page); > > @@ -2344,12 +2388,9 @@ static void postcopy_preempt_restore(RAMState *rs, > PageSearchStatus *pss) > static void postcopy_preempt_choose_channel(RAMState *rs, PageSearchStatus > *pss) > { > MigrationState *s = migrate_get_current(); > - unsigned int channel; > + unsigned int channel = pss->postcopy_target_channel; > QEMUFile *next; > > - channel = pss->postcopy_requested ? > - RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY; > - > if (channel != rs->postcopy_channel) { > if (channel == RAM_CHANNEL_PRECOPY) { > next = s->to_dst_file; > @@ -2505,7 +2546,7 @@ static int ram_find_and_save_block(RAMState *rs) > * preempted precopy. Otherwise find the next dirty bit. > */ > if (postcopy_preempt_triggered(rs)) { > - postcopy_preempt_restore(rs, &pss); > + postcopy_preempt_restore(rs, &pss, false); > found = true; > } else { > /* priority queue empty, so just search for something dirty > */ > -- > 2.32.0 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK