Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> wrote:
> On 30.01.23 11:03, Juan Quintela wrote:
>> Signed-off-by: Juan Quintela <quint...@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
>> ---
>>   migration/migration.c | 24 ++++++++++++------------
>>   1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 594a42f085..644c61e91d 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3764,23 +3764,23 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>>                           pend_pre, pend_compat, pend_post);
>>       }
>>
>> -    if (pending_size && pending_size >= s->threshold_size) {
>> -        /* Still a significant amount to transfer */
>> -        if (!in_postcopy && pend_pre <= s->threshold_size &&
>> -            qatomic_read(&s->start_postcopy)) {
>> -            if (postcopy_start(s)) {
>> -                error_report("%s: postcopy failed to start", __func__);
>> -            }
>> -            return MIG_ITERATE_SKIP;
>> -        }
>> -        /* Just another iteration step */
>> -        qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
>> -    } else {
>> +    if (pending_size < s->threshold_size) {
>
> To keep the logic, formally it should be
> "if (!pending_size || pending_size < s->threshold_size)"...
> Actually, could s->threshold_size be 0 here? Or, worth an assertion
> assert(s->threshold_size)?

It is weird, but it could:

    bandwidth = (double)transferred / time_spent;
    s->threshold_size = bandwidth * s->parameters.downtime_limit;

That is the only (real) place where we set it, and if the network was
completely down, bandwidth could be zero.

But I think it is better to explain it the other way around. What the
current code does is:

    if (too much dirty memory to converge)
        do another iteration
    else
        do final iteration

What the new code does is:

    if (low enough memory to converge)
        do final iteration

    do another iteration

So, how do we decide if we converge?

    if pending_size < s->threshold_size

We "could" change it to

    pending_size <= s->threshold_size

for the zero case. But I think we would be shooting ourselves in the
foot, because we are having network trouble (the only reason that
s->threshold_size could be zero) and we still have all the device state
to migrate. And once we enter the final iteration, there is no way
back: the guest is stopped on the source and we are not able to send
anything else.

So I think it is better this way.

What do you think?

Later, Juan.