On 10/15/24 11:13, LIU Yulong wrote:
> Hi community and experts,
>
> We have recently attempted to upgrade OVS 2.12+DPDK 18.11 to OVS
> 2.13.11+DPDK 19.11.14. And then we encountered a state where some
> virtual machine network cards are down, and users were not able to
> start the network cards inside the guest VM.
> After investigating, we found that qemu reported errors (many many
> times), which means virtIO feature negotiation failed:
> 2024-10-15T06:25:16.986398Z qemu-kvm: failed to init vhost_net for queue 0
> vhost lacks feature mask 16384 for backend
>
> Which means the backend of virtIO, aka vhostuser, does not support
> 16384 (the 14th in feature bits).
> Source code definition bit:
> #define VIRTIO_NET_F_HOST_UFO 14 /* Host can handle UFO in. */
>
> In the same host, if the HOST_UFO bit of some virtual machines is set
> to 1, the network card cannot start. While some are 0, it can be
> started.
>
> We found some useful series of links:
> https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/405829.html
> https://bugzilla.redhat.com/show_bug.cgi?id=1845488#c5
>
> The conclusion seems to be that such hot upgrade is impossible to
> achieve. If the guest VM is not restarted or the network card is not
> redo hot unplug and plug, the user's network card will not be able to
> work properly. This situation is unacceptable for a cloud environment
> because we cannot require all user VMs to be restarted.
>
> Therefore, I'm asking here if there is a possible work around to
> achieve such an upgrade?
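For reference, the mask in the QEMU error can be decoded mechanically. The snippet below is only a small illustrative sketch: the helper name decode_feature_mask is made up for this mail, and the table covers just the checksum/offload bits relevant here, with bit positions taken from the virtio-net feature definitions.

```python
# Bit positions of a subset of virtio-net features (checksum and
# segmentation offloads only). The helper below is hypothetical and
# exists purely for illustration; it is not part of OVS, DPDK or QEMU.
VIRTIO_NET_FEATURE_BITS = {
    0: "VIRTIO_NET_F_CSUM",
    1: "VIRTIO_NET_F_GUEST_CSUM",
    7: "VIRTIO_NET_F_GUEST_TSO4",
    8: "VIRTIO_NET_F_GUEST_TSO6",
    9: "VIRTIO_NET_F_GUEST_ECN",
    10: "VIRTIO_NET_F_GUEST_UFO",
    11: "VIRTIO_NET_F_HOST_TSO4",
    12: "VIRTIO_NET_F_HOST_TSO6",
    13: "VIRTIO_NET_F_HOST_ECN",
    14: "VIRTIO_NET_F_HOST_UFO",
}


def decode_feature_mask(mask):
    """Return the names of all feature bits set in `mask`, lowest bit first."""
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            # Bits outside the table are reported by number only.
            names.append(VIRTIO_NET_FEATURE_BITS.get(bit, "bit %d" % bit))
        bit += 1
    return names


# 16384 == 1 << 14, i.e. only the HOST_UFO bit is set.
print(decode_feature_mask(16384))
```

So "feature mask 16384" is exactly bit 14 (counting from zero), which matches the VIRTIO_NET_F_HOST_UFO definition quoted above.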

Hi, unfortunately, I don't think there is a way forward that doesn't
involve cold migration / restart / port hot-replug.

The issue is that at some point we accidentally exposed UFO and a few
other features for negotiation due to a combination of different
factors. Ideally, those features would not be acked / negotiated,
because we did not advertise their prerequisite features. However,
AFAIU, none of the virtio/vhost-net implementations, including DPDK,
QEMU and the kernel, actually complies with the virtio-net spec: they
all accept feature flags for which dependencies are not satisfied. So,
these features end up acked by QEMU and the guest driver even though
they are not allowed to use them.

Unfortunately for us, that means that if we do the right thing and turn
these features off on the OVS side, we will not be able to connect to a
QEMU that has already exposed these features to the guest.

As I mentioned, at some point we did expose UFO to the guest by mistake.
Then it was fixed by the following commit:
https://github.com/openvswitch/ovs/commit/514950d37dabebbdfa40ddf87596a7293de2d87c

You may see that this patch also makes the wrong assumption for the TSO
case: that disabling checksum offload will result in TSO/UFO not being
enabled. Later this was fixed and worked around while we were figuring
out how to enable checksum offload by default, but we still can't really
work around unsupported ECN. At least, nobody seems to use ECN, so that
hasn't been a huge problem so far.

Unfortunately again, the fact that commit 514950d37dab breaks live
migration and upgrades was discovered too late, and reverting this
commit wasn't an option. Also because reverting it would mean that we
would start advertising incorrect features again, which is not good.

The only way to make your VMs work without restarting / re-plugging is
to remove VIRTIO_NET_F_HOST_UFO from the vhost_unsup_flags.
But once you do that, you'll have to keep that broken workaround
literally forever, as all the newly started VMs will have it negotiated
and hence will have the same problem. This will also become a big
problem once you go to OVS 3.2+, where checksum offload is enabled by
default: the guest will then be allowed to use your negotiated UFO, and
that will break OVS, because we do not support UFO on the OVS side and,
unlike ECN, we can't really ignore it.

The best available solution, I think, is to plan the upgrade and
gradually cold-migrate (not live) VMs from nodes with the old OVS to
nodes with the upgraded one. I'd also suggest migrating to a supported
version of OVS instead of 2.13. OVS 3.3 LTS might be a good choice.

FWIW, while an upgrade from pre-2.13 to post-2.13 is not possible
without a restart, upgrades from 2.13+ forward should not have such
issues.

I had an idea that the issue could be solved by QEMU not acking features
that do not have satisfied dependencies, and by clearing features with
unsatisfied dependencies from the acked feature set during live
migration. Since the guest is not allowed to use those anyway, it should
not cause problems. And if the guest re-negotiates, it will receive an
updated feature set without those unsatisfied features and we can move
on with our lives...

But this requires a lot of consideration and discussion with the QEMU /
virtio maintainers. I'll start a thread on qemu-devel to check if there
are issues with such a solution, or if it is even possible or
acceptable. Either way, such a change is unlikely to be backported to
older versions of QEMU.

Best regards, Ilya Maximets.
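
P.S. To make the "clear features with unsatisfied dependencies" idea a bit more concrete, here is a rough sketch. It is not QEMU code and not a proposed patch: the function name clear_unsatisfied and the REQUIRES_ANY table are made up for illustration, and the table only covers the checksum/TSO/UFO/ECN dependencies as I read them in the "Feature bit requirements" part of the virtio-net spec.

```python
# Feature bit positions (subset of virtio-net offload features).
CSUM, GUEST_CSUM = 0, 1
GUEST_TSO4, GUEST_TSO6, GUEST_ECN, GUEST_UFO = 7, 8, 9, 10
HOST_TSO4, HOST_TSO6, HOST_ECN, HOST_UFO = 11, 12, 13, 14

# Hypothetical dependency table: a feature may stay set only if at least
# one of the listed prerequisite bits is also set.
REQUIRES_ANY = {
    GUEST_TSO4: (GUEST_CSUM,),
    GUEST_TSO6: (GUEST_CSUM,),
    GUEST_UFO: (GUEST_CSUM,),
    GUEST_ECN: (GUEST_TSO4, GUEST_TSO6),
    HOST_TSO4: (CSUM,),
    HOST_TSO6: (CSUM,),
    HOST_UFO: (CSUM,),
    HOST_ECN: (HOST_TSO4, HOST_TSO6),
}


def clear_unsatisfied(features):
    """Drop every feature bit whose prerequisites are not present.

    Repeats until nothing changes, because removing one feature (e.g.
    HOST_TSO4) can leave another one (e.g. HOST_ECN) without its
    prerequisite.
    """
    changed = True
    while changed:
        changed = False
        for bit, needs in REQUIRES_ANY.items():
            is_set = features & (1 << bit)
            satisfied = any(features & (1 << n) for n in needs)
            if is_set and not satisfied:
                features &= ~(1 << bit)
                changed = True
    return features


# A set with HOST_UFO acked but no CSUM loses the HOST_UFO bit.
acked = 1 << HOST_UFO
print(clear_unsatisfied(acked))
```

With CSUM present the set is left untouched, so a backend that genuinely supports these offloads would not be affected. Whether something like this is acceptable on the QEMU side is exactly what needs to be discussed.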