On 10/15/24 11:13, LIU Yulong wrote:
> Hi community and experts,
> 
> We have recently attempted to upgrade OVS 2.12+DPDK 18.11 to OVS
> 2.13.11+DPDK 19.11.14. And then we encountered a state where some
> virtual machine network cards are down, and users were not able to
> start the network cards inside the guest VM.
> After investigating, we found that qemu reported errors (many
> times), which means virtio feature negotiation failed:
> 2024-10-15T06:25:16.986398Z qemu-kvm: failed to init vhost_net for queue 0
> vhost lacks feature mask 16384 for backend
> 
> This means the virtio backend, i.e. vhost-user, does not support
> feature mask 16384 (bit 14 of the feature bits).
> Source code definition bit:
> #define  VIRTIO_NET_F_HOST_UFO   14 /* Host can handle UFO in. */
> 
> On the same host, VMs that have the HOST_UFO bit set to 1 cannot
> start their network cards, while VMs that have it set to 0 can.
> 
> We found some useful series of links:
> https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/405829.html
> https://bugzilla.redhat.com/show_bug.cgi?id=1845488#c5
> 
> The conclusion seems to be that such a hot upgrade is impossible to
> achieve. Unless the guest VM is restarted or the network card is
> hot-unplugged and plugged back in, the user's network card will not
> work properly. This situation is unacceptable for a cloud environment
> because we cannot require all user VMs to be restarted.
> 
> Therefore, I'm asking here whether there is a possible workaround to
> achieve such an upgrade.

Hi, unfortunately, I don't think there is a way forward that doesn't involve
cold migration / restart / port hot-replug.

The issue is that at some point we accidentally exposed UFO and a few other
features for negotiation due to a combination of factors.  Ideally, those
features would not be acked / negotiated, because we did not advertise their
prerequisite features.  However, AFAIU, none of the virtio/vhost-net
implementations, including DPDK, QEMU and the kernel, actually comply with the
virtio-net spec: they all accept feature flags for which dependencies are not
satisfied.  So, these features end up acked by QEMU and the guest driver even
though they are not allowed to use them.  Unfortunately for us, that means that
if we do the right thing and turn these features off on the OVS side, we will
not be able to connect to a QEMU that has already exposed these features to
the guest.
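
To make the failure mode concrete, here is a minimal C sketch of the kind of
check that produces the "vhost lacks feature mask 16384 for backend" error
quoted above.  It is a simplified model, not QEMU's actual code: the helper
names feature_mask() and missing_features() are hypothetical, and only the
bit numbers come from the virtio-net spec.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_HOST_UFO   14
#define VIRTIO_NET_F_MRG_RXBUF  15

/* Hypothetical helper: convert a feature bit number into a bit mask. */
static uint64_t feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}

/* Simplified model of the reconnect check: every feature the guest has
 * already acked must still be offered by the vhost-user backend.
 * Returns the mask of acked features that the backend no longer offers;
 * a non-zero result corresponds to the "lacks feature mask" error. */
static uint64_t missing_features(uint64_t backend_features,
                                 uint64_t acked_features)
{
    return acked_features & ~backend_features;
}
```

With a guest that acked HOST_UFO against the old OVS and a new backend that no
longer offers it, missing_features() returns 1 << 14 = 16384, which matches the
mask in the QEMU log.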

As I mentioned, at some point we did expose UFO to the guest by mistake.
Then it was fixed by the following commit:
  https://github.com/openvswitch/ovs/commit/514950d37dabebbdfa40ddf87596a7293de2d87c

You may see that this patch also makes the wrong assumption for the TSO case
that disabling checksum offload will end up with TSO/UFO not being enabled.
Later it was fixed + worked around while trying to figure out enabling checksum
offload by default, but we still can't really work around unsupported ECN.
At least, nobody seems to use ECN, so that hasn't been a huge problem so far.

Unfortunately again, the fact that commit 514950d37dab breaks live migration
and upgrades was discovered too late, and reverting this commit wasn't an
option, also because reverting it would mean that we would start advertising
incorrect features again, which is not good.

The only way to make your VMs work without restarting / re-plugging is to
remove VIRTIO_NET_F_HOST_UFO from the vhost_unsup_flags.  But once you do that,
you'll have to keep that broken workaround literally forever, as all the newly
started VMs will have it negotiated and hence will have the same problem.

This will also become a big problem once you go to OVS 3.2+, where checksum
offload is enabled by default, so your negotiated UFO will now be allowed to
be used by the guest and that will break OVS, because we do not support UFO
on the OVS side and, unlike ECN, we can't really ignore it.

The best available solution, I think, is to plan the upgrade and gradually
cold-migrate (not live) VMs from nodes with the old OVS to nodes with the
upgraded one.  I'd also suggest migrating to some supported version of OVS
instead of 2.13.  OVS 3.3 LTS might be a good choice.

FWIW, while an upgrade from pre-2.13 to post-2.13 is not possible without a
restart, upgrades from 2.13+ forward should not have such issues.

I had an idea that the issue could be solved by QEMU not acking features that
do not have satisfied dependencies and clearing features with unsatisfied
dependencies from the acked feature set during live migration.  Since the guest
is not allowed to use those anyway, it should not cause problems.  And if the
guest re-negotiates, it will receive an updated feature set without those
features and we can move on with our lives...  But this requires a lot of
consideration and discussion with QEMU / virtio maintainers.  I'll start a
thread on qemu-devel to check if there are issues with such a solution or if
it is even possible or acceptable.  Either way, such a change is unlikely to
be backported to older versions of QEMU.
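
As a rough illustration of the idea above (not a proposed QEMU patch), the
following C sketch drops every feature whose dependencies are not met from an
acked feature set.  The dependency table covers only a few host-offload
features from the virtio-net spec, and the names feature_dep, needs_any and
clear_unsatisfied() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (UINT64_C(1) << (n))

/* Bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_CSUM       0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12
#define VIRTIO_NET_F_HOST_ECN   13
#define VIRTIO_NET_F_HOST_UFO   14

/* Hypothetical table entry: 'feature' is only valid if at least one of
 * the bits in 'needs_any' is also present in the feature set. */
struct feature_dep {
    uint64_t feature;
    uint64_t needs_any;
};

/* Partial dependency list, for illustration only. */
static const struct feature_dep deps[] = {
    { BIT(VIRTIO_NET_F_HOST_TSO4), BIT(VIRTIO_NET_F_CSUM) },
    { BIT(VIRTIO_NET_F_HOST_TSO6), BIT(VIRTIO_NET_F_CSUM) },
    { BIT(VIRTIO_NET_F_HOST_UFO),  BIT(VIRTIO_NET_F_CSUM) },
    { BIT(VIRTIO_NET_F_HOST_ECN),
      BIT(VIRTIO_NET_F_HOST_TSO4) | BIT(VIRTIO_NET_F_HOST_TSO6) },
};

/* Remove features with unsatisfied dependencies.  Clearing one feature
 * can invalidate another (ECN depends on TSO), so repeat until the set
 * stops changing. */
static uint64_t clear_unsatisfied(uint64_t features)
{
    int changed;

    do {
        changed = 0;
        for (size_t i = 0; i < sizeof deps / sizeof deps[0]; i++) {
            if ((features & deps[i].feature)
                && !(features & deps[i].needs_any)) {
                features &= ~deps[i].feature;
                changed = 1;
            }
        }
    } while (changed);

    return features;
}
```

For a guest that acked HOST_UFO without CSUM, clear_unsatisfied() removes bit
14, so the remaining set would be acceptable to a backend that no longer
offers UFO.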

Best regards, Ilya Maximets.