Hi,

I managed to fix a few issues while testing this patch series.
There is still one issue that I am unable to resolve. I thought
I would send this patch series for review in case I have missed
something.

The issue is that this patch series does not work reliably. When it
does work, I am able to ping L0 from L2 and vice versa via the packed
SVQ.

When it doesn't work, ping reports "Destination Host Unreachable" in
both VMs. This is sometimes (not always) accompanied by the following
error from the L2 kernel:

virtio_net virtio1: output.0:id 1 is not a head!

This error is not always thrown, and when it is, the id varies. It is
invariably followed by a soft lockup:

[  284.662292] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [swapper/1:0]
[  284.662292] Modules linked in: rfkill intel_rapl_msr intel_rapl_common 
intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class 
vfg
[  284.662292] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.7-200.fc39.x86_64 
#1
[  284.662292] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[  284.662292] RIP: 0010:virtqueue_enable_cb_delayed+0x115/0x150
[  284.662292] Code: 44 77 04 0f ae f0 48 8b 42 70 0f b7 40 02 66 2b 42 50 66 
39 c1 0f 93 c0 c3 cc cc cc cc 66 87 44 77 04 eb e2 f0 83 44 24 fc 00 <e9> 5a f1
[  284.662292] RSP: 0018:ffffb8f000100cb0 EFLAGS: 00000246
[  284.662292] RAX: 0000000000000000 RBX: ffff96f20204d800 RCX: ffff96f206f5e000
[  284.662292] RDX: ffff96f2054fd900 RSI: ffffb8f000100c7c RDI: ffff96f2054fd900
[  284.662292] RBP: ffff96f2078bb000 R08: 0000000000000001 R09: 0000000000000001
[  284.662292] R10: ffff96f2078bb000 R11: 0000000000000005 R12: ffff96f207bb4a00
[  284.662292] R13: 0000000000000000 R14: 0000000000000000 R15: ffff96f20452fd00
[  284.662292] FS:  0000000000000000(0000) GS:ffff96f27bc80000(0000) 
knlGS:0000000000000000
[  284.662292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  284.662292] CR2: 00007f2a9ca191e8 CR3: 0000000136422003 CR4: 0000000000770ef0
[  284.662292] PKRU: 55555554
[  284.662292] Call Trace:
[  284.662292]  <IRQ>
[  284.662292]  ? watchdog_timer_fn+0x1e6/0x270
[  284.662292]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  284.662292]  ? __hrtimer_run_queues+0x10f/0x2b0
[  284.662292]  ? hrtimer_interrupt+0xf8/0x230
[  284.662292]  ? __sysvec_apic_timer_interrupt+0x4d/0x140
[  284.662292]  ? sysvec_apic_timer_interrupt+0x39/0x90
[  284.662292]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  284.662292]  ? virtqueue_enable_cb_delayed+0x115/0x150
[  284.662292]  start_xmit+0x2a6/0x4f0 [virtio_net]
[  284.662292]  ? netif_skb_features+0x98/0x300
[  284.662292]  dev_hard_start_xmit+0x61/0x1d0
[  284.662292]  sch_direct_xmit+0xa4/0x390
[  284.662292]  __dev_queue_xmit+0x84f/0xdc0
[  284.662292]  ? nf_hook_slow+0x42/0xf0
[  284.662292]  ip_finish_output2+0x2b8/0x580
[  284.662292]  igmp_ifc_timer_expire+0x1d5/0x430
[  284.662292]  ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[  284.662292]  call_timer_fn+0x21/0x130
[  284.662292]  ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[  284.662292]  __run_timers+0x21f/0x2b0
[  284.662292]  run_timer_softirq+0x1d/0x40
[  284.662292]  __do_softirq+0xc9/0x2c8
[  284.662292]  __irq_exit_rcu+0xa6/0xc0
[  284.662292]  sysvec_apic_timer_interrupt+0x72/0x90
[  284.662292]  </IRQ>
[  284.662292]  <TASK>
[  284.662292]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  284.662292] RIP: 0010:pv_native_safe_halt+0xf/0x20
[  284.662292] Code: 22 d7 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 
90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 53 75 3f 00 fb f4 <c3> cc c0
[  284.662292] RSP: 0018:ffffb8f0000b3ed8 EFLAGS: 00000212
[  284.662292] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
[  284.662292] RDX: 4000000000000000 RSI: 0000000000000083 RDI: 00000000000289ec
[  284.662292] RBP: ffff96f200810000 R08: 0000000000000000 R09: 0000000000000001
[  284.662292] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  284.662292] R13: 0000000000000000 R14: ffff96f200810000 R15: 0000000000000000
[  284.662292]  default_idle+0x9/0x20
[  284.662292]  default_idle_call+0x2c/0xe0
[  284.662292]  do_idle+0x226/0x270
[  284.662292]  cpu_startup_entry+0x2a/0x30
[  284.662292]  start_secondary+0x11e/0x140
[  284.662292]  secondary_startup_64_no_verify+0x184/0x18b
[  284.662292]  </TASK>

The soft lockup seems to happen in
drivers/net/virtio_net.c:start_xmit() [1].
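
From what I understand, "id X is not a head!" means the driver was
handed a used id for which it holds no in-flight buffer. A rough,
self-contained reconstruction of the guest-side check (modelled on
virtqueue_get_buf_ctx_packed() in drivers/virtio/virtio_ring.c;
simplified, not a verbatim copy):

#include <stddef.h>
#include <stdint.h>

/* Per-id bookkeeping the driver keeps for every buffer it exposes. */
struct desc_state {
    void *data; /* non-NULL only while the id is in flight */
};

/* Called when the driver sees a used descriptor with this id.  If the
 * slot is empty, the device returned an id that was never made
 * available, or returned the same id twice; the kernel then prints
 * "id %u is not a head!", and with the ring stuck in that state
 * start_xmit() can spin until the soft-lockup watchdog fires. */
static void *get_used_buf(struct desc_state *state, uint16_t num,
                          uint16_t id)
{
    if (id >= num || state[id].data == NULL) {
        return NULL; /* kernel: BAD_RING(vq, "id %u is not a head!\n", id); */
    }
    void *buf = state[id].data;
    state[id].data = NULL; /* the id may now be reused */
    return buf;
}

If that reading is right, the SVQ side is presumably publishing an id
that was never made available, publishing the same id twice, or
marking descriptors used with the wrong wrap state.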

I don't think the issue is in the kernel: I haven't seen any problems
when testing my changes with split vqs. Only packed vqs are affected.

L0 kernel version: 6.12.13-1-lts

QEMU command to boot L1:

$ sudo ./qemu/build/qemu-system-x86_64 \
-enable-kvm \
-drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \
-net nic,model=virtio \
-net user,hostfwd=tcp::2222-:22 \
-device intel-iommu,snoop-control=on \
-device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,mq=off,ctrl_vq=off,ctrl_rx=off,ctrl_vlan=off,ctrl_mac_addr=off,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \
-netdev tap,id=net0,script=no,downscript=no,vhost=off \
-nographic \
-m 8G \
-smp 4 \
-M q35 \
-cpu host 2>&1 | tee vm.log

L1 kernel version: 6.8.5-201.fc39.x86_64

I have been following the "Hands on vDPA - Part 2" blog
to set up the environment in L1 [2].

QEMU command to boot L2:

# ./qemu/build/qemu-system-x86_64 \
-nographic \
-m 4G \
-enable-kvm \
-M q35 \
-drive file=//root/L2.qcow2,media=disk,if=virtio \
-netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \
-device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=off,ctrl_rx=off,ctrl_vlan=off,ctrl_mac_addr=off,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \
-smp 4 \
-cpu host \
2>&1 | tee vm.log

L2 kernel version: 6.8.7-200.fc39.x86_64

I confirmed that packed vqs are enabled in L2 by checking feature
bit 34 (VIRTIO_F_RING_PACKED); the sysfs features file prints one
character per bit starting with bit 0, so bit 34 is character 35:

# cut -c35 /sys/devices/pci0000\:00/0000\:00\:07.0/virtio1/features 
1

I may be wrong, but I think the issue in my implementation might be
related to:

1. incorrect endianness conversions.
2. the implementation of "vhost_svq_more_used_packed" in commit #5
   (see the sketch after this list).
3. the implementation of "vhost_svq_(en|dis)able_notification" in
   commit #5 (also covered in the sketch below).
4. something else?
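
To make points 2 and 3 concrete, here is a minimal, self-contained
sketch of the behaviour I believe the virtio 1.x spec requires for
packed rings. The names mirror the functions in commit #5, but the
bodies are illustrative, not the actual patch code:

#include <stdbool.h>
#include <stdint.h>

/* Flag bit positions and event suppression values from the virtio
 * 1.x spec for packed virtqueues. */
#define VRING_PACKED_DESC_F_AVAIL       7
#define VRING_PACKED_DESC_F_USED        15
#define VRING_PACKED_EVENT_FLAG_ENABLE  0x0
#define VRING_PACKED_EVENT_FLAG_DISABLE 0x1

struct vring_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

struct vring_packed_desc_event {
    uint16_t off_wrap;
    uint16_t flags;
};

/* The packed ring is always little-endian; these are identity
 * functions on LE hosts.  QEMU's le16_to_cpu()/cpu_to_le16() handle
 * this properly on both byte orders. */
static inline uint16_t le16_to_cpu_sketch(uint16_t v) { return v; }
static inline uint16_t cpu_to_le16_sketch(uint16_t v) { return v; }

/* Point 2: a descriptor is used when its AVAIL and USED bits are
 * equal to each other and to the wrap counter the SVQ expects.
 * Reading id or len before this check succeeds (or without an
 * acquire barrier after it) can hand stale ids to the guest, which
 * would look exactly like the "id is not a head" error above. */
static bool svq_more_used_packed(const struct vring_packed_desc *desc,
                                 uint16_t last_used_idx,
                                 bool used_wrap_counter)
{
    uint16_t flags = le16_to_cpu_sketch(desc[last_used_idx].flags);
    bool avail = flags & (1 << VRING_PACKED_DESC_F_AVAIL);
    bool used = flags & (1 << VRING_PACKED_DESC_F_USED);

    return avail == used && used == used_wrap_counter;
}

/* Point 3: with event_idx off (as in the command lines above),
 * notifications are controlled purely through the driver event
 * suppression structure. */
static void
svq_disable_notification_packed(struct vring_packed_desc_event *ev)
{
    ev->flags = cpu_to_le16_sketch(VRING_PACKED_EVENT_FLAG_DISABLE);
}

/* Returns true when it is safe to wait for a notification, i.e.
 * nothing was completed in the race window around re-enabling. */
static bool
svq_enable_notification_packed(struct vring_packed_desc_event *ev,
                               const struct vring_packed_desc *desc,
                               uint16_t last_used_idx,
                               bool used_wrap_counter)
{
    ev->flags = cpu_to_le16_sketch(VRING_PACKED_EVENT_FLAG_ENABLE);
    /* A full memory barrier belongs here: the flags store must be
     * visible before re-checking the ring, or a completion that
     * races with the enable is lost and the guest stalls. */
    return !svq_more_used_packed(desc, last_used_idx, used_wrap_counter);
}

The two places I would double-check in my own code are the barrier
between reading flags and reading id/len in the used check, and the
barrier after re-enabling notifications.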

Thanks,
Sahil

[1] https://github.com/torvalds/linux/blob/master/drivers/net/virtio_net.c#L3245
[2] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2

Sahil Siddiq (7):
  vhost: Refactor vhost_svq_add_split
  vhost: Data structure changes to support packed vqs
  vhost: Forward descriptors to device via packed SVQ
  vdpa: Allocate memory for SVQ and map them to vdpa
  vhost: Forward descriptors to guest via packed vqs
  vhost: Validate transport device features for packed vqs
  vdpa: Support setting vring_base for packed SVQ

 hw/virtio/vhost-shadow-virtqueue.c | 396 ++++++++++++++++++++++-------
 hw/virtio/vhost-shadow-virtqueue.h |  88 ++++---
 hw/virtio/vhost-vdpa.c             |  52 +++-
 3 files changed, 404 insertions(+), 132 deletions(-)

-- 
2.48.1

