On 06/26/2015 12:54 AM, Marcel Apfelbaum wrote: > On 06/24/2015 11:00 AM, Jason Wang wrote: >> >> >> On 06/19/2015 02:05 AM, Marcel Apfelbaum wrote: >>> Clear host multi-queue related features if the peer >>> doesn't support it. >>> >>> Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> >>> --- >>> Notes: >>> This fixes a guest CPU soft lock, however the virtio-net >>> device will not work correctly. It seems that is >>> peer's "fault", not knowing how to handle the situation. >>> However, I submit this patch since it corrects the negotiation >>> and saves us from a guest crash (!). >> >> Could you please describe how to reproduce this issue? > Hi Jason, > Sorry for the late reply. > I was hoping that a vhost-user multi-queue "insider" will ask > questions :) > > This happens when we have OVS as backend without multi-queue support > and qemu/guest with multi-queue-support. > > >> >>> >>> Any ideas from the virtio/multi-queue developers on how to debug >>> this further are welcomed. >>> >>> hw/net/virtio-net.c | 13 ++++++++++++- >>> 1 file changed, 12 insertions(+), 1 deletion(-) >>> >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >>> index 9281aa1..63e59e8 100644 >>> --- a/hw/net/virtio-net.c >>> +++ b/hw/net/virtio-net.c >>> @@ -367,6 +367,11 @@ static int peer_has_ufo(VirtIONet *n) >>> return n->has_ufo; >>> } >>> >>> +static int peer_has_multiqueue(VirtIONet *n) >>> +{ >>> + return n->multiqueue; >>> +} >> >> The name is confusing, this is in fact whether or not guest support >> multiqueue. To check peer's ability, you need check >> n->nic_conf.peers.queues instead. > I just wanted to mimic similar code, I have no issue against > this approach. > >> >>> + >>> static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int >>> mergeable_rx_bufs, >>> int version_1) >>> { >>> @@ -469,6 +474,13 @@ static uint64_t >>> virtio_net_get_features(VirtIODevice *vdev, uint64_t features) >>> virtio_clear_feature(&features, VIRTIO_NET_F_HOST_UFO); >>> } >>> >>> + if (!peer_has_multiqueue(n)) { >>> + virtio_clear_feature(&features, VIRTIO_NET_F_MQ); >> >> I'm not quite understand this, VIRTIO_NET_F_MQ should work if peer has >> only 1 queue. > Please explain, I thought VIRTIO_NET_F_MQ flag *enables* > multi-queue, and in our case we pass queue=2 to vhost. > >> >>> + virtio_clear_feature(&features, VIRTIO_NET_F_GUEST_ANNOUNCE); >>> + virtio_clear_feature(&features, VIRTIO_NET_F_CTRL_VQ); >>> + virtio_clear_feature(&features, VIRTIO_NET_F_CTRL_RX); >>> + } >>> + >> >> Those features don't depend on multiqueue, why clear them? > Once VIRTIO_NET_F_MQ is cleared, the virtio driver from guest is > complaining > about those other flags too. > > Any explanations about how *it should* work are welcomed. > Again, the scenario is: backend doesn't support multi-queue, > QEMU/guest do, > and queues=2 is passed on command line. >
In this case, backend should just fail the initialization like what we've done for tap. I believe vhost-user should check whether or not its backend can support up to 2 queues and fail if not. Thanks > > Thanks, > Marcel >> >>> if (!get_vhost_net(nc->peer)) { >>> virtio_add_feature(&features, VIRTIO_F_VERSION_1); >>> return features; >>> @@ -1314,7 +1326,6 @@ static void virtio_net_tx_bh(void *opaque) >>> static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue) >>> { >>> n->multiqueue = multiqueue; >>> - >>> virtio_net_set_queues(n); >>> } >>> >> >