On Wed, Jun 05, 2019 at 05:38:19PM +0800, Peter Xu wrote:
>On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
>> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>> >> In case we gets a queued page, the order of block is interrupted. We may
>> >> not rely on the complete_round flag to say we have already searched the
>> >> whole blocks on the list.
>> >>
>> >> Signed-off-by: Wei Yang <richardw.y...@linux.intel.com>
>> >> ---
>> >>  migration/ram.c | 6 ++++++
>> >>  1 file changed, 6 insertions(+)
>> >>
>> >> diff --git a/migration/ram.c b/migration/ram.c
>> >> index d881981876..e9b40d636d 100644
>> >> --- a/migration/ram.c
>> >> +++ b/migration/ram.c
>> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>> >>           */
>> >>          pss->block = block;
>> >>          pss->page = offset >> TARGET_PAGE_BITS;
>> >> +
>> >> +        /*
>> >> +         * This unqueued page would break the "one round" check, even is
>> >> +         * really rare.
>> >
>> >Why this is needed?  Could you help explain the problem first?
>>
>> Peter, Thanks for your question.
>>
>> I found this issue during code review and I believe this is a corner case.
>>
>> Below is a draft chart for ram_find_and_save_block:
>>
>>     ram_find_and_save_block
>>         do
>>             get_queued_page()
>>             find_dirty_block()
>>             ram_save_host_page()
>>         while
>>
>> The basic logic here is : get a page need to migrate and migrate it.
>>
>> In case we don't have get_queued_page(), find_dirty_block() will search the
>> whole ram_list.blocks by order. pss->complete_round is used to indicate
>> whether this search has looped.
>>
>> Everything works fine after get_queued_page() involved. The block unqueued in
>> get_queued_page() could be any block in the ram_list.blocks. This means we
>> have very little chance to break the looped indicator.
>>
>>                     unqueue_page()     last_seen_block
>>                                  |     |
>> ram_list.blocks                  v     v
>> ---------------------------------+=====+---
>>
>>
>> Just draw a raw picture to demonstrate a corner case.
>>
>> For example, we start from last_seen_block and search till the end of
>> ram_list.blocks. At this moment, pss->complete_round is set to true. Then we
>> get a queued page from unqueue_page() at the point I pointed. So the loop
>> continues may just continue the range as I marked as "=". We will skip all
>> the other ranges.
>
>Ah I see your point, but I don't think there is a problem - note that
>complete_round will be reset for each ram_find_and_save_block(), so
>even if we have that iteration of ram_find_and_save_block() to return
>we'll still know we have dirty pages to migrate and in the next call
>we'll be fine, no?
>
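
To make the corner case from my picture concrete, here is a toy model in C. It
is only a sketch and not QEMU code: struct search, blocks_visited(), NBLOCKS
and jump_to are all made-up names, each block stands for a single page, and
the "queued page" is modelled as one jump of the cursor right after the search
wraps around. It only counts how many blocks get looked at before the "one
round" check fires within a single call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 8   /* hypothetical number of RAM blocks in the toy list */

/* Hypothetical, heavily simplified stand-in for PageSearchStatus. */
struct search {
    size_t block;
    bool complete_round;
};

/*
 * Count the distinct blocks visited before the "one round" check stops
 * the search. The search starts at last_seen. If jump_to < NBLOCKS, the
 * cursor is moved to jump_to once, right after the first wrap-around,
 * which models a queued page being picked up at that moment.
 * Pass jump_to == NBLOCKS for "no queued page".
 */
size_t blocks_visited(size_t last_seen, size_t jump_to)
{
    struct search pss;
    bool seen[NBLOCKS] = { false };
    bool jumped = false;
    size_t visited = 0;

    assert(last_seen < NBLOCKS);

    pss.block = last_seen;
    pss.complete_round = false;

    for (;;) {
        /* The "one round" check: we wrapped and are back at the start. */
        if (pss.complete_round && pss.block == last_seen) {
            break;
        }

        if (!seen[pss.block]) {
            seen[pss.block] = true;
            visited++;
        }

        /* Move on to the next block, wrapping at the end of the list. */
        pss.block++;
        if (pss.block == NBLOCKS) {
            pss.block = 0;
            pss.complete_round = true;

            /* Toy version of a queued page moving the cursor. */
            if (!jumped && jump_to < NBLOCKS) {
                pss.block = jump_to;
                jumped = true;
            }
        }
    }

    return visited;
}
```

With last_seen = 6 and no jump, all 8 blocks are visited. With a jump back to
block 3 right after the wrap, only blocks 6, 7, 3, 4 and 5 are visited, so
blocks 0 to 2 are not looked at in that call. As you say, complete_round is
reset on every call, so in this model the next call would start over and
reach them.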

This really is a rare case, and it is hard to say whether it would be harmful,
but the chance still exists.

>--
>Peter Xu

-- 
Wei Yang
Help you, Help me