Hi all,
following discussions yesterday with Juan Quintela and Marcelo Tosatti,
here is my humble proposal: remove block migration from qemu master. It
seems to me that keeping block migration is going to slow down further
improvements on migration. The main problems are:
1) there are very good reasons to move migration to a separate thread.
Only a limited amount of extra locking, perhaps none, is needed in order
to do so for RAM and devices. But the block drivers pretty much need to
run under the I/O thread lock, and coroutines will not help if that lock
is taken by another thread. It's hard/unreliable/pointless to
ping-pong migration between threads.
2) there already are plans to reimplement block migration... it's called
streaming :) and not coincidentally it reuses some of the block
migration code.
Here is how it would go:
1) remotely stream the block devices from machine A to machine B. Keep
them in sync via the mirroring block device when the streaming is finished.
2) As soon as they are in sync, start live migration.
3) When the source QEMU exits, the destination machine runs and
switches over to its local storage.
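
To make step 1 a bit more concrete, here is a toy model of "stream, then
keep mirroring". It is only a sketch: the class and method names are made
up for illustration and are not QEMU code, and it assumes streaming copies
blocks in order while guest writes to already-copied blocks are mirrored
to the destination.

```python
class MirroredStream:
    """Toy model: background streaming plus write mirroring (not QEMU code)."""

    def __init__(self, source):
        self.source = list(source)          # blocks on machine A
        self.dest = [None] * len(source)    # blocks on machine B
        self.next_block = 0                 # streaming cursor

    def stream_step(self):
        """Copy one more block from A to B; return True while work remains."""
        if self.next_block < len(self.source):
            self.dest[self.next_block] = self.source[self.next_block]
            self.next_block += 1
        return self.next_block < len(self.source)

    def guest_write(self, index, data):
        """A guest write lands on A and, if already streamed, is mirrored to B."""
        self.source[index] = data
        if index < self.next_block:
            self.dest[index] = data

    def in_sync(self):
        """True once every block was streamed and both sides hold the same data."""
        return self.next_block == len(self.source) and self.dest == self.source


disk = MirroredStream(["a", "b", "c", "d"])
disk.stream_step()            # block 0 copied
disk.guest_write(0, "A")      # already streamed: mirrored to B
disk.guest_write(3, "D")      # not yet streamed: picked up by streaming later
while disk.stream_step():
    pass
disk.guest_write(1, "B")      # streaming is done, mirroring keeps B in sync
```

The point of the model is that `in_sync()` stays true after streaming ends
only because writes keep being mirrored, which is why step 2 can start at
any moment after that.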
Now, how do you do remote streaming? NBD. It does everything you need
except perhaps TRIM, which is also easily added.
1a) Start qemu-nbd on the remote machine, writing to the destination image.
1b) Start (local) live streaming of the source block device to an NBD
client that points to the qemu-nbd instance.
2a) Start qemu -S -incoming on the remote machine. Use the same image
you used in (1a) for the disk.
2b) As soon as streaming finishes, _keep mirroring_ on the source and
start live migration.
3) Wait until after the NBD server closes the connection with the client
before starting the destination machine.
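
Since the ordering is what matters here, the steps above can be sketched
as a small checker. The event names are hypothetical labels for steps
(1a) to (3), not QEMU or libvirt commands, and two of the constraints
(server before client, incoming QEMU before migration) are my reading of
what the steps imply rather than something spelled out.

```python
# Hypothetical event labels for the steps above; not real commands.
REQUIRED_ORDER = [
    ("nbd_server_started", "streaming_started"),       # 1a before 1b (implied)
    ("streaming_started", "streaming_finished"),
    ("streaming_finished", "migration_started"),       # 2b
    ("incoming_qemu_started", "migration_started"),    # 2a before 2b (implied)
    ("nbd_connection_closed", "destination_started"),  # 3
]


def check_sequence(events):
    """Return the violated (earlier, later) pairs; an empty list means OK."""
    position = {name: i for i, name in enumerate(events)}
    violations = []
    for earlier, later in REQUIRED_ORDER:
        if earlier not in position or later not in position:
            violations.append((earlier, later))   # a required step is missing
        elif position[earlier] > position[later]:
            violations.append((earlier, later))   # steps happened out of order
    return violations


good = [
    "nbd_server_started",
    "streaming_started",
    "incoming_qemu_started",
    "streaming_finished",
    "migration_started",
    "nbd_connection_closed",
    "destination_started",
]

# Same steps, but the destination is started before the NBD connection closes.
bad = good[:5] + ["destination_started", "nbd_connection_closed"]
```

A management layer would presumably enforce something like this itself; the
sketch only shows which orderings the manual procedure depends on.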
Advantages:
1) Streaming and migration currently cannot be done at the same time;
this proposal should make that limitation much easier to fix. I don't
know if Marcelo's code can stream multiple devices at a time but, even
if it can't, taking migration out of the picture should make that easier.
2) Easier to make improvements to migration.
3) Perhaps separating block/RAM migration would make convergence of
pre-copy migration easier?
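
One possible reading of advantage 3, as a back-of-the-envelope model: if
block data no longer competes with RAM for the migration link, RAM
pre-copy sees more bandwidth per round. The function below is a
simplified sketch under that assumption (constant dirty rate, constant
bandwidth, made-up units), not a description of QEMU's actual algorithm.

```python
def precopy_rounds(total, bandwidth, dirty_rate, threshold, max_rounds=100):
    """Count pre-copy rounds until the remaining data fits under threshold.

    Each round sends everything that is still dirty; while it is being sent,
    the guest dirties more data at dirty_rate. Returns None if the model does
    not converge within max_rounds.
    """
    remaining = float(total)
    for rounds in range(max_rounds + 1):
        if remaining <= threshold:
            return rounds
        seconds = remaining / bandwidth                  # time to send this round
        remaining = min(float(total), dirty_rate * seconds)
    return None


# RAM has the link to itself: bandwidth above the dirty rate, so it converges.
dedicated = precopy_rounds(total=1000, bandwidth=100, dirty_rate=50, threshold=10)

# RAM shares the link with block data: effective bandwidth below the dirty rate.
shared = precopy_rounds(total=1000, bandwidth=40, dirty_rate=50, threshold=10)
```

In this model the remaining data shrinks by a factor of dirty_rate/bandwidth
per round, so it only converges when the bandwidth left for RAM exceeds the
dirty rate.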
Disadvantages:
1) It requires one listening port per disk on the destination machine
but, if desired, libvirt can tunnel the NBD connection just like it
tunnels the migration data---it can even encrypt it and whatnot.
2) Complicated to do by hand. If you want to simplify this, qemu-nbd
can be embedded in qemu itself as -incoming-nbd or as a new monitor
command, and similar syntactic sugar can be provided on the source. But
this requires more changes and the sequence would in practice be managed
by libvirt, of course. The manual steps still allow the idea to be
demonstrated with zero changes to the code besides streaming.
3) ... put yours here.
How does it sound?
Paolo