On Fri, Dec 06, 2024 at 07:24:58PM +0100, Maciej S. Szmigiero wrote: > On 5.12.2024 20:46, Zhang Chen wrote: > > On Thu, Dec 5, 2024 at 5:30 AM Peter Xu <pet...@redhat.com> wrote: > > > > > > On Sun, Nov 17, 2024 at 08:20:00PM +0100, Maciej S. Szmigiero wrote: > > > > diff --git a/migration/colo.c b/migration/colo.c > > > > index 9590f281d0f1..a75c2c41b464 100644 > > > > --- a/migration/colo.c > > > > +++ b/migration/colo.c > > > > @@ -452,6 +452,9 @@ static int > > > > colo_do_checkpoint_transaction(MigrationState *s, > > > > bql_unlock(); > > > > goto out; > > > > } > > > > + > > > > + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); > > > > + > > > > /* Note: device state is saved into buffer */ > > > > ret = qemu_save_device_state(fb); > > > > > > Looks all good, except I'm not sure whether we should touch colo. IIUC it > > > should be safer to remove it. > > > > > > > Agree with Peter's comments. > > If I understand correctly, the current COLO doesn't support multifd > > migration. > > This patch adds a generic migration bit stream command, which could be used > for other purposes than multifd device state migration too. > > It just so happens we make use of it for VFIO driver multifd device state > migration currently since we need a way to achieve the same functionality > as save_live_complete_precopy_{begin,end} handlers did in the previous > versions of this patch set. > > Since adding this bit stream command to COLO does not cost anything > (it's already behind a compatibility migration property) and it may be > useful in the future I would advise to keep it there. > > On the other hand, if we don't add it to COLO now but it turns out it > will be needed there to implement some functionality in the future then > we'll need to add yet another compatibility migration property for that.
There's one thing still slightly off for COLO, where IIUC COLO runs that in a loop to synchronize device states (colo_do_checkpoint_transaction()) to the other side, so that's not exactly where the "switchover" (in COLO's wording, I think it's called "failover") happens for COLO.. Hence the name qemu_savevm_maybe_send_switchover_start() may be slightly misleading in COLO's case.. But that's not a huge deal. At least I checked and I agree the code should work for COLO too, and I think COLO should need something like machine type to work properly across upgrades, in that case I think COLO is safe. So I'm OK with keeping this, as long as Chen doesn't object. -- Peter Xu