Hi Sairaj,

On 5/16/25 4:07 AM, Sairaj Kodilkar wrote:


On 5/2/2025 7:45 AM, Alejandro Jimenez wrote:

Hi Alejandro,

I tested v2 and everything looks good when I boot the guest with an
upstream kernel. However, I observed that the NVMe driver fails to load
with guest kernel version 4.15.0-213-generic, which is the default kernel
that comes with the Ubuntu image.

Thank you for the additional testing and for the report. I wanted to investigate and, if possible, solve the issue before replying, but since it is taking me some time I wanted to ACK your message. Minor comments below...

This is what I see in dmesg:

[   26.702381] nvme nvme0: pci function 0000:00:04.0
[   26.817847] nvme nvme0: missing or invalid SUBNQN field.

There are multiple reports of that warning, which would indicate that it is not caused by an issue with the IOMMU emulation, but it is interesting that you don't see it with "regular passthrough" (I assume that means with the guest kernel in pt mode).


I am using the following QEMU command line:

-enable-kvm -m 10G -smp cpus=$NUM_VCPUS  \
-device amd-iommu,dma-remap=on \
-netdev user,id=USER0,hostfwd=tcp::3333-:22 \
-device virtio-net-pci,id=vnet0,iommu_platform=on,disable-legacy=on,romfile=,netdev=USER0 \
-cpu EPYC-Genoa,x2apic=on,kvm-msi-ext-dest-id=on,+kvm-pv-unhalt,kvm-pv-tlb-flush,kvm-pv-ipi,kvm-pv-sched-yield \
-name guest=my-vm,debug-threads=on \
-machine q35,kernel_irqchip=split \
-global kvm-pit.lost_tick_policy=discard \
-nographic -vga none -chardev stdio,id=STDIO0,signal=off,mux=on \
-device isa-serial,id=isa-serial0,chardev=STDIO0 \
-smbios type=0,version=2.8 \
-blockdev node-name=drive0,driver=qcow2,file.driver=file,file.filename=$IMG \
-device virtio-blk-pci,num-queues=8,drive=drive0 \
-chardev socket,id=SOCKET1,server=on,wait=off,path=qemu.mon.user3333 \
-mon chardev=SOCKET1,mode=control \
-device vfio-pci,host=0000:44:00.0

Do you have any idea what might trigger this?

There are some parameters above that are unnecessary and perhaps conflicting. For example, we don't need kvm-msi-ext-dest-id=on, since the vIOMMU provides interrupt remapping (plus you are likely not using more than 255 vCPUs). We also don't need kvm-pit.lost_tick_policy when using split irqchip, since the PIT is not emulated by KVM. But to be fair, I don't believe those are likely to be causing the problem...

My main suspicion is that the guest IOMMU driver is too old and is missing lots of fixes, so it could be skipping some essential operations that the emulation requires to work. For example, if the guest driver does not comply with the spec and fails to issue a DEVTAB_INVALIDATE after changing the DTE, the vIOMMU code never gets the chance to enable the IOMMU memory region, and it all goes wrong from that point on. But I need to reproduce the problem and figure out where/when the emulation is failing. I've tested as far back as 5.15-based kernels.
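
To make that hypothetical failure mode concrete, here is a toy Python model. It is not the QEMU amd-iommu code: the class, the field names, and the way the region is switched are all made up for illustration, and the only assumption taken from the description above is that the invalidation command is the emulation's sole cue to re-read the DTE.

```python
class ToyViommu:
    """Toy model only; names and behaviour are illustrative assumptions."""

    def __init__(self, guest_device_table):
        # The device table lives in guest memory, so the guest driver can
        # change an entry without the emulation noticing.
        self.guest_device_table = guest_device_table
        # Device IDs whose DMA currently goes through the translated region.
        self.translated = set()

    def devtab_invalidate(self, devid):
        # The invalidation command is the only notification in this model:
        # re-read the entry and switch the device's region accordingly.
        dte = self.guest_device_table.get(devid, {})
        if dte.get("valid") and dte.get("translation_enabled"):
            self.translated.add(devid)
        else:
            self.translated.discard(devid)

    def dma_is_translated(self, devid):
        return devid in self.translated


def demo():
    # Hypothetical device ID: 00:04.0 encoded as bus << 8 | dev << 3 | fn.
    devid = (0 << 8) | (4 << 3) | 0
    table = {devid: {"valid": False, "translation_enabled": False}}
    iommu = ToyViommu(table)

    # The guest driver updates the DTE in memory but sends no command.
    table[devid] = {"valid": True, "translation_enabled": True}
    before = iommu.dma_is_translated(devid)

    # A compliant driver follows the update with an invalidation.
    iommu.devtab_invalidate(devid)
    after = iommu.dma_is_translated(devid)
    return before, after
```

In this sketch, demo() reports the device as untranslated until the invalidation arrives, which is the kind of stale state I have in mind. Whether the 4.15 driver actually behaves this way is still unverified.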

I would argue that while it is something that I am definitely going to address if possible, this issue should not be a blocker. I'll update as soon as I have more data on the cause.

Thank you,
Alejandro


I see the error only when I am using the emulated AMD IOMMU with a
passthrough device. Regular passthrough works fine.

Regards
Sairaj Kodilkar

P.S. I know that the guest kernel is quite old but still wanted to make you aware.


