On Wed, Sep 08, 2021 at 03:41:35PM +0200, Stefano Garzarella wrote:
> On Tue, Sep 07, 2021 at 03:47:56PM +0200, Stefano Garzarella wrote:
> > On Tue, Sep 07, 2021 at 02:22:24PM +0100, Daniel P. Berrangé wrote:
> > > On Tue, Sep 07, 2021 at 02:49:35PM +0200, Stefano Garzarella wrote:
> > > > Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
> > > > enabled the SEQPACKET feature bit.
> > > > This commit is released with QEMU 6.1, so if we try to migrate a VM
> > > > where the host kernel supports SEQPACKET but machine type version is
> > > > less than 6.1, we get the following errors:
> > > >
> > > >     Features 0x130000002 unsupported. Allowed features: 0x179000000
> > > >     Failed to load virtio-vhost_vsock:virtio
> > > >     error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
> > > >     load of migration failed: Operation not permitted
> > > >
> > > > Let's disable the feature bit for machine types < 6.1, adding a
> > > > `features` field to VHostVSock to simplify the handling of upcoming
> > > > features we will support.
> > >
> > > IIUC, this will still leave migration broken for anyone migrating
> > > a >= 6.1 machine type between a kernel that supports SEQPACKET and
> > > a kernel lacking that, or vice-versa.
> >
> > This should be true for migrating from kernel that supports SEQPACKET to
> > a kernel lacking that.
> >
> > For vice-versa I'm not sure, since vhost_get_features() will disable
> > that feature if the host kernel doesn't support it, and the guest will
> > not have acked it.
>
> I did some testing and the migration is only broken in the case of
> kernel 5.14+ (SEQPACKET supported) -> kernel 5.13 (SEQPACKET not supported).
>
> Vice-versa works well because the feature is not acked.
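
For reference, the two masks in the quoted error can be decoded with a
few lines of Python. This is only a sketch: it assumes the first value is
the feature set carried by the incoming migration stream, that the second
is what the destination allows, and that bit 1 is
VIRTIO_VSOCK_F_SEQPACKET as in the Linux virtio_vsock.h UAPI header. The
helper name unsupported_bits is made up for illustration.

```python
# Values copied from the migration error quoted above.
OFFERED = 0x130000002  # assumed: features carried by the incoming stream
ALLOWED = 0x179000000  # assumed: features the destination can provide

# Assumption: bit number of SEQPACKET, per linux/virtio_vsock.h.
VIRTIO_VSOCK_F_SEQPACKET = 1


def unsupported_bits(offered, allowed):
    """Return the bit positions set in `offered` but missing from `allowed`."""
    extra = offered & ~allowed
    return [bit for bit in range(extra.bit_length()) if (extra >> bit) & 1]


if __name__ == "__main__":
    for bit in unsupported_bits(OFFERED, ALLOWED):
        name = "SEQPACKET" if bit == VIRTIO_VSOCK_F_SEQPACKET else "unknown"
        print(f"bit {bit} ({name}) is not allowed on the destination")
```

Under those assumptions the only offending bit is the SEQPACKET one, which
matches the failure seen in the 5.14+ -> 5.13 direction.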
>
> > >
> > > If a feature is dependent on a host kernel feature we can't turn
> > > that on automatically as part of the machine type, as we need
> > > ABI stability across migration independent of kernel version.
> > >
> >
> > How do we typically handle this?
> >
> > I wrongly thought it was an expected behavior that migrating a guest
> > using a vhost device from a new kernel to an old one can fail if not all
> > features are supported.
> >
> > I need to take a look at the other vhost devices.
>
> I took a look at vhost-net and vhost-scsi and we don't seem to handle this
> case. Maybe I'm missing something...

We've never done very well at having a consistent story wrt deps on
kernel features. So I wouldn't be surprised to see differences or
omissions anywhere, with people not noticing the issue.

> So following your advice, the best thing would be to have this feature
> disabled by default and require the user to enable it explicitly so we are
> sure it is needed. At this point a migration to a kernel that doesn't
> support it is rightly broken.
>
> Or is there something better we can do?
>
> @Michael @Jason any thoughts?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|