On Tue, Feb 09, 2021 at 10:04:30AM -0500, Michael S. Tsirkin wrote:
> On Tue, Feb 09, 2021 at 02:51:05PM +0000, Daniel P. Berrangé wrote:
> > On Tue, Feb 09, 2021 at 09:34:20AM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Feb 04, 2021 at 10:29:12PM +0200, Yuri Benditovich wrote:
> > > > This set of patches introduces graceful switch from tap-vhost to
> > > > tap-no-vhost depending on guest features. Before that the features
> > > > that vhost does not support were silently cleared in get_features.
> > > > This creates potential problem of migration from the machine where
> > > > some of virtio-net features are supported by the vhost kernel to the
> > > > machine where they are not supported (packed ring as an example).
> > >
> > > I still worry that adding new features will silently disable vhost for
> > > people.
> > > Can we limit the change to when a VM is migrated in?
> >
> > Some management applications expect bi-directional live migration to
> > work, so taking specific actions on incoming migration only feels
> > dangerous.
>
> Could you be more specific?
>
> Bi-directional migration is currently broken
> when migrating new kernel->old kernel.
>
> This seems to be the motivation for this patch, though I wish
> it was spelled out more explicitly.
>
> People don't complain much, but I'm fine with fixing that
> with a userspace fallback.
>
>
> I'd rather not force the fallback on others though: vhost is generally
> specified explicitly by user while features are generally set
> automatically, so this patch will make us override what user specified,
> not nice.
>
>
> > IMHO if the features we're adding cannot be expected to exist in
> > host kernels in general, then the feature should default to off
> > and require explicit user config to enable.
> > Downstream distros which can guarantee newer kernels can flip the
> > default in their custom machine types if they desire.
> >
> > Regards,
> > Daniel
>
> Unfortunately that will basically mean we are stuck with no new features
> for years. We did what this patch is trying to change for years now, in
> particular KVM also seems to happily disable CPU features not supported
> by kernel so I wonder why we can't keep doing it, with tweaks for some
> corner cases.

I should say that the kernel's continual changes to the set of CPU
features it exposes have been responsible for a *huge* number of bugs
with live migration compatibility. libvirt, QEMU & apps have needed to
introduce a lot of extra code to try to cope with the changing CPU
features across migration, and it still goes wrong to this very day,
because we have to migrate from prehistoric QEMU versions to quite
modern versions.

IOW, the CPU features approach is a perfect example of why we should
*not* introduce a kernel dependency in more areas of QEMU feature
enablement, and instead should strictly tie feature defaults to the
machine type versions.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|