> -----Original Message-----
> From: Juan Quintela <quint...@redhat.com>
> Sent: Wednesday, November 17, 2021 12:28 AM
> To: Zhang, Chen <chen.zh...@intel.com>
> Cc: Hailiang Zhang <zhang.zhanghaili...@huawei.com>; Dr . David Alan
> Gilbert <dgilb...@redhat.com>; qemu-dev <qemu-devel@nongnu.org>
> Subject: Re: [PATCH 1/2] migration/colo: Optimize COLO start code path
>
> Zhang Chen <chen.zh...@intel.com> wrote:
> > There is no need to start COLO through MIGRATION_STATUS_ACTIVE.
>
> Hi
>
> I don't understand what you are trying to do. In my reading, at least the
> commit message is wrong:
>
> void migrate_start_colo_process(MigrationState *s) {
> ...
> migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_COLO);
> ...
> }
>
> and
>
> void *colo_process_incoming_thread(void *opaque) {
> ...
> migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_COLO);
>
> So colo starts with MIGRATION_STATUS_ACTIVE.
Yes, this patch just optimized COLO primary code
path(migrate_start_colo_process()).
We can see this patch removed the
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
In the migrate_start_colo_process().
Current COLO status path:
MIGRATION_STATUS_XXX ---> MIGRATION_STATUS_ACTIVE --->
MIGRATION_STATUS_COLO ---> MIGRATION_STATUS_COMPLETED
This patch try to remove redundant " MIGRATION_STATUS_ACTIVE " in COLO start.
MIGRATION_STATUS_XXX ---> MIGRATION_STATUS_COLO --->
MIGRATION_STATUS_COMPLETED
Actually COLO primary code did nothing when running on
"MIGRATION_STATUS_ACTIVE".
But for COLO secondary (void *colo_process_incoming_thread()), it shared some
code with normal migration. No need to do this.
So, I will fix commit message to:
Optimize COLO primary start path to:
MIGRATION_STATUS_XXX ---> MIGRATION_STATUS_COLO --->
MIGRATION_STATUS_COMPLETED
No need to start primary COLO through "MIGRATION_STATUS_ACTIVE".
How about it?
>
>
> > Signed-off-by: Zhang Chen <chen.zh...@intel.com>
> > ---
> > migration/colo.c | 2 --
> > migration/migration.c | 18 +++++++++++-------
> > 2 files changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/migration/colo.c b/migration/colo.c index
> > 2415325262..ad1a4426b3 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -667,8 +667,6 @@ void migrate_start_colo_process(MigrationState *s)
> > colo_checkpoint_notify, s);
> >
> > qemu_sem_init(&s->colo_exit_sem, 0);
> > - migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> > - MIGRATION_STATUS_COLO);
> > colo_process_checkpoint(s);
> > qemu_mutex_lock_iothread();
> > }
> > diff --git a/migration/migration.c b/migration/migration.c index
> > abaf6f9e3d..4c8662a839 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -3222,7 +3222,10 @@ static void migration_completion(MigrationState
> *s)
> > goto fail_invalidate;
> > }
> >
> > - if (!migrate_colo_enabled()) {
> > + if (migrate_colo_enabled()) {
> > + migrate_set_state(&s->state, current_active_state,
> > + MIGRATION_STATUS_COLO);
> > + } else {
> > migrate_set_state(&s->state, current_active_state,
> > MIGRATION_STATUS_COMPLETED);
> > }
>
> This moves the setup to MIGRATION_STATUS_COLO to completion time
> instead of the beggining of the process. I have no clue why. I guess you can
> put a comment/commit message to say what you ar.e trynig to do.
You are right, no need to setup here.
I will remove this in next version.
>
> > @@ -3607,12 +3610,7 @@ static void
> migration_iteration_finish(MigrationState *s)
> > migration_calculate_complete(s);
> > runstate_set(RUN_STATE_POSTMIGRATE);
> > break;
> > -
> > - case MIGRATION_STATUS_ACTIVE:
> > - /*
> > - * We should really assert here, but since it's during
> > - * migration, let's try to reduce the usage of assertions.
> > - */
> > + case MIGRATION_STATUS_COLO:
> > if (!migrate_colo_enabled()) {
> > error_report("%s: critical error: calling COLO code without "
> > "COLO enabled", __func__); @@ -3622,6
> > +3620,12 @@ static void migration_iteration_finish(MigrationState *s)
> > * Fixme: we will run VM in COLO no matter its old running state.
> > * After exited COLO, we will keep running.
> > */
> > + /* Fallthrough */
> > + case MIGRATION_STATUS_ACTIVE:
> > + /*
> > + * We should really assert here, but since it's during
> > + * migration, let's try to reduce the usage of assertions.
> > + */
> > s->vm_was_running = true;
> > /* Fallthrough */
> > case MIGRATION_STATUS_FAILED:
>
> I guess this change is related to the previous one, but I don't understand
> colo
> enough to review it.
I think this patch is the general code, little background needed.
You can simple understand COLO is two VMs(primary node and secondary node)
entered a state of cyclic migration.
Thanks your comments.
Thanks
Chen
>
> Later, Juan.