* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Wed, Oct 06, 2021 at 09:09:30AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Tue, Oct 05, 2021 at 12:10:08PM -0400, Eduardo Habkost wrote:
> > > > On Tue, Oct 05, 2021 at 03:01:05PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > > On Tue, Oct 05, 2021 at 02:18:40AM +0300, Roman Kagan wrote:
> > > > > > > On Mon, Oct 04, 2021 at 11:11:00AM -0400, Michael S. Tsirkin wrote:
> > > > > > > > On Mon, Oct 04, 2021 at 06:07:29PM +0300, Denis Plotnikov wrote:
> > > > > > > > > It might be useful for the cases when a slow block layer should be replaced
> > > > > > > > > with a more performant one on running VM without stopping, i.e. with very low
> > > > > > > > > downtime comparable with the one on migration.
> > > > > > > > >
> > > > > > > > > It's possible to achieve that for two reasons:
> > > > > > > > >
> > > > > > > > > 1. The VMStates of "virtio-blk" and "vhost-user-blk" are almost the same.
> > > > > > > > >    They consist of the identical VMSTATE_VIRTIO_DEVICE and differ from
> > > > > > > > >    each other in the values of migration service fields only.
> > > > > > > > > 2. The device driver used in the guest is the same: virtio-blk
> > > > > > > > >
> > > > > > > > > In the series cross-migration is achieved by adding a new type.
> > > > > > > > > The new type uses virtio-blk VMState instead of vhost-user-blk specific
> > > > > > > > > VMstate, also it implements migration save/load callbacks to be compatible
> > > > > > > > > with migration stream produced by "virtio-blk" device.
> > > > > > > > >
> > > > > > > > > Adding the new type instead of modifying the existing one is convenient.
> > > > > > > > > It is easy to distinguish the new virtio-blk-compatible vhost-user-blk
> > > > > > > > > device from the existing non-compatible one using qemu machinery without any
> > > > > > > > > other modifications. That gives all the variety of qemu device related
> > > > > > > > > constraints out of the box.
> > > > > > > >
> > > > > > > > Hmm I'm not sure I understand. What is the advantage for the user?
> > > > > > > > What if vhost-user-blk became an alias for vhost-user-virtio-blk?
> > > > > > > > We could add some hacks to make it compatible for old machine types.
> > > > > > >
> > > > > > > The point is that virtio-blk and vhost-user-blk are not
> > > > > > > migration-compatible ATM. OTOH they are the same device from the guest
> > > > > > > POV so there's nothing fundamentally preventing the migration between
> > > > > > > the two. In particular, we see it as a means to switch between the
> > > > > > > storage backend transports via live migration without disrupting the
> > > > > > > guest.
> > > > > > >
> > > > > > > Migration-wise virtio-blk and vhost-user-blk have in common
> > > > > > >
> > > > > > > - the content of the VMState -- VMSTATE_VIRTIO_DEVICE
> > > > > > >
> > > > > > > The two differ in
> > > > > > >
> > > > > > > - the name and the version of the VMStateDescription
> > > > > > >
> > > > > > > - virtio-blk has an extra migration section (via .save/.load callbacks
> > > > > > >   on VirtioDeviceClass) containing requests in flight
> > > > > > >
> > > > > > > It looks like to become migration-compatible with virtio-blk,
> > > > > > > vhost-user-blk has to start using VMStateDescription of virtio-blk and
> > > > > > > provide compatible .save/.load callbacks.
> > > > > > > It isn't entirely obvious how to make this machine-type-dependent,
> > > > > > > so we came up with a simpler idea of defining a new device that
> > > > > > > shares most of the implementation with the original vhost-user-blk
> > > > > > > except for the migration stuff. We're certainly open to suggestions
> > > > > > > on how to reconcile this under a single vhost-user-blk device, as
> > > > > > > this would be more user-friendly indeed.
> > > > > > >
> > > > > > > We considered using a class property for this and defining the
> > > > > > > respective compat clause, but IIUC the class constructors (where .vmsd
> > > > > > > and .save/.load are defined) are not supposed to depend on class
> > > > > > > properties.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Roman.
> > > > > >
> > > > > > So the question is how to make vmsd depend on machine type.
> > > > > > CC Eduardo who poked at this kind of compat stuff recently,
> > > > > > paolo who looked at qom things most recently and dgilbert
> > > > > > for advice on migration.
> > > > >
> > > > > I don't think I've seen anyone change vmsd name dependent on machine
> > > > > type; making fields appear/disappear is easy - that just ends up as a
> > > > > property on the device that's checked; I guess if that property is
> > > > > global (rather than per instance) then you can check it in
> > > > > vhost_user_blk_class_init and swing the dc->vmsd pointer?
> > > >
> > > > class_init can be called very early during QEMU initialization,
> > > > so it's too early to make decisions based on machine type.
> > > >
> > > > Making a specific vmsd appear/disappear based on machine
> > > > configuration or state is "easy", by implementing
> > > > VMStateDescription.needed. But this would require registering
> > > > both vmsds (one of them would need to be registered manually
> > > > instead of using DeviceClass.vmsd).
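
A toy sketch of the .needed idea quoted above, just to make the selection logic concrete. It is plain C and does not use QEMU headers: ToyVMSD, ToyBlkDev, the virtio_blk_compat flag and toy_pick_section() are all made up for illustration, and the section names are only placeholders. The manual registration that the quoted text says would be required is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a migration state description; NOT QEMU's real struct. */
typedef struct ToyVMSD {
    const char *name;
    bool (*needed)(void *opaque);   /* NULL means "always sent" */
} ToyVMSD;

/* Hypothetical device state; the flag would come from a compat property. */
typedef struct ToyBlkDev {
    bool virtio_blk_compat;
} ToyBlkDev;

static bool compat_needed(void *opaque)
{
    return ((ToyBlkDev *)opaque)->virtio_blk_compat;
}

static bool legacy_needed(void *opaque)
{
    return !((ToyBlkDev *)opaque)->virtio_blk_compat;
}

/* Both descriptions are "registered"; .needed decides which one is used. */
static const ToyVMSD toy_vmsds[] = {
    { "vhost-user-blk", legacy_needed },
    { "virtio-blk",     compat_needed },
};

/* Return the name of the first section whose .needed accepts this device. */
const char *toy_pick_section(ToyBlkDev *dev)
{
    assert(dev != NULL);
    for (size_t i = 0; i < sizeof(toy_vmsds) / sizeof(toy_vmsds[0]); i++) {
        const ToyVMSD *v = &toy_vmsds[i];
        if (v->needed == NULL || v->needed(dev)) {
            return v->name;
        }
    }
    return NULL;
}
```

With virtio_blk_compat set, toy_pick_section() returns "virtio-blk"; with it clear, it returns "vhost-user-blk". The decision is made per device at save time rather than in class_init, which is the point of the suggestion quoted above.
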
> > > >
> > > > I don't remember what are the consequences of not using
> > > > DeviceClass.vmsd to register a vmsd, I only remember it was
> > > > subtle. See commit b170fce3dd06 ("cpu: Register
> > > > VMStateDescription through CPUState") and related threads. CCing
> > > > Philippe, who might remember the details here.
> > > >
> > > > If that's an important use case, I would suggest allowing devices
> > > > to implement a DeviceClass.get_vmsd method, which would override
> > > > DeviceClass.vmsd if necessary. Is the problem we're trying to
> > > > address worth the additional complexity?
> > >
> > > The tricky part is that we generally don't support migration when
> > > command line is different on source and destination ...
> >
> > The reality has always been a bit more subtle than that.
> > For example, it's fine if the path to a block device is different on the
> > source and destination; or if it's accessed by iSCSI on the destination
> > say. As long as what the guest sees, and the migration stream carries
> > are the same, then in principle it's OK - but that does start getting
> > trickier; also it would probably get interesting to let libvirt know
> > that this combo is OK.
>
> I agree, but that's not the same as specifying a different
> device. Yes, internally they are compatible, but
> this is a detail users/tools generally won't be able to
> figure out.

Yeh.

> > > So maybe the actual answer is that vhost-user-blk should really
> > > be a drive supplied to a virtio blk device, not a device
> > > itself?
> > > This way it's sane, and also matches what we do e.g. for net.
> >
> > Hmm a bit of a fudge; it's not quite the same as a drive is it; there's
> > almost another layer split in there.
> >
> > Dave
>
> We can make it something else, not "drive=". Maybe simply "vhost-user=" ?
> Point is if we promise it looks the same to guest it should be the
> same -device.

To me it feels the same as the distinction between vhost-kernel and
QEMU-backed virtio that we get in net and others - in principle it's
just another implementation.
A tricky part is guaranteeing the set of visible virtio features between
implementations; we have that problem when we use vhost-kernel and run
on a newer/older kernel and gain virtio features; the same will be true
with vhost-user implementations.

But this would make the structure of a vhost-user implementation quite
different.

Dave

>
> > > --
> > > MST
> > >
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK