Hi Akihiko,

On 04/06/2024 09:37, Jason Wang wrote:
From: Akihiko Odaki <akihiko.od...@daynix.com>

Multiqueue usage is not negotiated yet when realizing. If more than
one queue is added and the guest never requests to enable multiqueue,
the extra queues will not be deleted when unrealizing and leak.

Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com>
Signed-off-by: Jason Wang <jasow...@redhat.com>
---
  hw/net/virtio-net.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3cee2ef3ac..a8db8bfd9c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3743,9 +3743,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
      n->net_conf.tx_queue_size = MIN(virtio_net_max_tx_queue_size(n),
                                      n->net_conf.tx_queue_size);
-    for (i = 0; i < n->max_queue_pairs; i++) {
-        virtio_net_add_queue(n, i);
-    }
+    virtio_net_add_queue(n, 0);
     n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl);
     qemu_macaddr_default_if_unset(&n->nic_conf.macaddr);

This change breaks virtio net migration when multiqueue is enabled.

I think this is because virtqueues are half-initialized after migration: they are initialized on the guest side (the kernel is using them) but not on the QEMU side (realize has only initialized one). After migration, they are not initialized by the call to virtio_net_set_multiqueue() from virtio_net_set_features(), because virtio_get_num_queues() already reports n->max_queue_pairs, as this value comes from the source guest memory.

I don't think we have a way to half-initialize a virtqueue (to initialize it only on the QEMU side when it is already initialized on the kernel side).

I think this change should be reverted to fix the migration issue.

How to reproduce the problem:

Source:

qemu-system-x86_64 -serial mon:stdio -accel kvm -cpu host -m 2G -display none -hda vm3.qcow2 -netdev tap,vhost=false,queues=2,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:49:47:db,mq=true

Destination:

qemu-system-x86_64 -serial mon:stdio -accel kvm -cpu host -m 2G -display none -hda vm3.qcow2 -netdev tap,vhost=false,queues=2,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:49:47:db,mq=true -incoming tcp:localhost:4444

In monitor:

migrate tcp:localhost:4444

Result on destination side:

(hangs and then: )
[   44.175916] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [kworker/0:0:8]
...
I think we get this error because the control virtqueue is #3 for QEMU, whereas the kernel is using the control virtqueue index set by multiqueue (max_queue_pairs * 2 + 1). There is a mismatch between queues...

Thanks,
Laurent
