"Dr. David Alan Gilbert (git)" <dgilb...@redhat.com> wrote: > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com> > > Where postcopy is preceeded by a period of precopy, the destination will > have received pages that may have been dirtied on the source after the > page was sent. The destination must throw these pages away before > starting it's CPUs. > > Maintain a 'sentmap' of pages that have already been sent. > Calculate list of sent & dirty pages > Provide helpers on the destination side to discard these. > > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > Reviewed-by: Amit Shah <amit.s...@redhat.com>
Hi > /* Flag set once the migration has been asked to enter postcopy */ > bool start_postcopy; This is from a previous patch, but .... Change the name of the variable or the comment? From the comment it sholud be "in_postcopy", no? > + > + /* bitmap of pages that have been sent at least once > + * only maintained and used in postcopy at the moment > + * where it's used to send the dirtymap at the start > + * of the postcopy phase > + */ > + unsigned long *sentmap; > }; I *think* that patch would be easier if you put this one inside migration_bitmap_rcu. If you put it there, yoqu could do the > + if (ms->sentmap) { > + set_bit(dirty_ram_abs >> TARGET_PAGE_BITS, ms->sentmap); > + } And you wouldn't have to change all callers to have an ram_addr_abs address parameter, right? > +struct PostcopyDiscardState { > + const char *name; Iht is not obvious to me what name means here. I assume ram block name, change it to block_name, ramblock? > + * returns: 0 on success. > + */ > +int postcopy_ram_discard_range(MigrationIncomingState *mis, uint8_t *start, > + size_t length) > +{ > + trace_postcopy_ram_discard_range(start, length); > + if (madvise(start, length, MADV_DONTNEED)) { > + error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno)); > + return -1; > + } > + > + return 0; > +} > + > #else > /* No target OS support, stubs just fail */ > bool postcopy_ram_supported_by_host(void) > @@ -153,5 +192,95 @@ bool postcopy_ram_supported_by_host(void) > return false; > } > > +int postcopy_ram_discard_range(MigrationIncomingState *mis, uint8_t *start, > + size_t length) > +{ > + assert(0); I will assume that just returning -1 would work here. But yes, I understand that this code shouldn't be reach ... > +} > #endif > > +/* ------------------------------------------------------------------------- > */ > + > +/** > + * postcopy_discard_send_init: Called at the start of each RAMBlock before > + * asking to discard individual ranges. > + * > + * @ms: The current migration state. > + * @offset: the bitmap offset of the named RAMBlock in the migration > + * bitmap. > + * @name: RAMBlock that discards will operate on. > + * > + * returns: a new PDS. > + */ > +PostcopyDiscardState *postcopy_discard_send_init(MigrationState *ms, > + unsigned long offset, > + const char *name) > +{ > + PostcopyDiscardState *res = g_try_malloc(sizeof(PostcopyDiscardState)); Why are we using here g_try_malloc instead of g_malloc()? Even g_malloc0()? Specially when we don't check if res is NULL on return. Please change. > + > + if (res) { > + res->name = name; > + res->cur_entry = 0; > + res->nsentwords = 0; > + res->nsentcmds = 0; With the zero variant, this three can be removed. > + res->offset = offset; > + } > + > + return res; > +} > -/* Called with rcu_read_lock() to protect migration_bitmap */ > +/* Called with rcu_read_lock() to protect migration_bitmap > + * mr: The region to search for dirty pages in Haha, you forgot to update the comment when you moved the function to use ram blocks O:-) > @@ -662,6 +672,24 @@ static int save_zero_page(QEMUFile *f, RAMBlock *block, > ram_addr_t offset, > } > > /** > + * ram_find_block_by_id: Find a ramblock by name. > + * > + * Returns: The RAMBlock with matching ID, or NULL. > + */ > +static RAMBlock *ram_find_block_by_id(const char *id) > +{ > + RAMBlock *block; > + > + QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { > + if (!strcmp(id, block->idstr)) { > + return block; > + } > + } > + > + return NULL; > +} We don't have this function already..... Once here, could we split it in its own patch and use it in ram_load? QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { if (!strncmp(id, block->idstr, sizeof(id))) { if (length != block->used_length) { Error *local_err = NULL; ret = qemu_ram_resize(block->offset, length, &local_err); if (local_err) { error_report_err(local_err); } } ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG, block->idstr); break; } } if (!block) { error_report("Unknown ramblock \"%s\", cannot " "accept migration", id); ret = -EINVAL; } We could also use it in: host_from_stream_offset > +/* **** functions for postcopy ***** */ > + > +/* > + * Callback from postcopy_each_ram_send_discard for each RAMBlock > + * start,end: Indexes into the bitmap for the first and last bit > + * representing the named block > + */ > +static int postcopy_send_discard_bm_ram(MigrationState *ms, > + PostcopyDiscardState *pds, > + unsigned long start, unsigned long > end) > +{ > + unsigned long current; > + > + for (current = start; current <= end; ) { > + unsigned long set = find_next_bit(ms->sentmap, end + 1, current); > + > + if (set <= end) { > + unsigned long zero = find_next_zero_bit(ms->sentmap, > + end + 1, set + 1); > + > + if (zero > end) { > + zero = end + 1; > + } > + postcopy_discard_send_range(ms, pds, set, zero - 1); > + current = zero + 1; > + } else { > + current = set; > + } > + } I think I undrestood the logic here at the end.... But if we change the meaning of postcopy_discard_send_range() from (begin, end), to (begin, length), I think everything goes clearer, no? if (set <= end) { unsigned long zero = find_next_zero_bit(ms->sentmap, end + 1, set + 1); unsigned long length; if (zero > end) { length = end - set; } else { lenght = zero - 1 - set; current = zero + 1; } postcopy_discard_send_range(ms, pds, set, len); } else { current = set; } } Y would clame that if we call one zero, the other would be called one. Or change to set/unset, but that is just me. Yes, I haven't tested, and it is possible that there is a off-by-one somewhere... Looking at postocpy_eand_ram_send_discard, I even think that it would be a good idea to pass length to this function. > +/* > + * Transmit the set of pages to be discarded after precopy to the target > + * these are pages that: > + * a) Have been previously transmitted but are now dirty again > + * b) Pages that have never been transmitted, this ensures that > + * any pages on the destination that have been mapped by background > + * tasks get discarded (transparent huge pages is the specific > concern) > + * Hopefully this is pretty sparse > + */ > +int ram_postcopy_send_discard_bitmap(MigrationState *ms) > +{ > + int ret; > + > + rcu_read_lock(); > + > + /* This should be our last sync, the src is now paused */ > + migration_bitmap_sync(); > + > + /* > + * Update the sentmap to be sentmap = ~sentmap | dirty > + */ > + bitmap_complement(ms->sentmap, ms->sentmap, > + last_ram_offset() >> TARGET_PAGE_BITS); > + > + bitmap_or(ms->sentmap, ms->sentmap, migration_bitmap, > + last_ram_offset() >> TARGET_PAGE_BITS); This bitmaps are really big, I don't know how long would take to do this operations here, but I think that we can avoid (at least) the bitmap_complement. We can change the bitmap name to notsentbitmap, init it to one and clear it each time that we sent a page, no? We can also do the bitmap_or() at migration_sync_bitmap() time, at that point, we shouldn't be on the critical path? Later, Juan.