On Feb 15 12:01, Major Saheb wrote:
> > Assuming you are *not* explicitly configuring shadow doorbells, then I
> > think you might have a broken driver that does not properly reset the
> > controller before using it (are you toggling CC.EN?). That could
> > explain the admin queue size of 32 (default admin queue depth for the
> > Linux nvme driver) as well as the db/ei_addrs being left over. And
> > behavior wrt. how the Linux driver disables the device might have
> > changed between the kernel version used in Ubuntu 20.04 and 22.04.
>
> Thanks Klaus, I didn't have the driver source, so I acquired it and
> looked into it; the driver was neither toggling CC.EN nor waiting for
> CSTS.RDY the right way. So I implemented that and it started working
> perfectly.
> - R
>
> On Tue, Feb 14, 2023 at 8:26 PM Klaus Jensen <i...@irrelevant.dk> wrote:
> >
> > On Feb 14 14:05, Klaus Jensen wrote:
> > > On Feb 14 17:34, Major Saheb wrote:
> > > > Thanks Peter for the reply. I tried to connect gdb to qemu and was
> > > > able to break on 'vtd_iova_to_slpte()'. I dumped the following with
> > > > both the Ubuntu 20.04 base image container (the success case) and
> > > > the Ubuntu 22.04 base image container (the failure case).
> > > > One thing I observed is that NvmeSQueue::dma_addr is correctly set
> > > > to 0x800000000; however, in the failure case this value is
> > > > 0x1196b1000. A closer look indicates more fields in NvmeSQueue
> > > > might be corrupted; for example, we set the admin queue size to
> > > > 512, but in the failure case it shows 32.
> > > >
> > >
> > > Hi Major,
> > >
> > > It's obviously pretty bad if hw/nvme somehow corrupts the SQ
> > > structure, but it's difficult to say from this output.
> > >
> > > Are you configuring shadow doorbells (the db_addr and ei_addrs are
> > > set in both cases)?
> > >
> > > > Following is the partial qemu command line that I am using
> > > >
> > > > -device
> > > > intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on,aw-bits=48
> > > >
> > >
> > > I'm not sure if caching-mode=on and device-iotlb=on leads to any
> > > issues here? As far as I understand, this is mostly used with stuff
> > > like vhost. I've tested and developed vfio-based drivers against
> > > hw/nvme extensively and I'm not using anything besides `-device
> > > intel-iommu`.
> > >
> > > Do I understand correctly that your setup is "just" an Ubuntu 22.04
> > > guest with a container and a user-space driver to interact with the
> > > nvme devices available on the guest? No nested virtualization with
> > > vfio passthrough?
> >
> > Assuming you are *not* explicitly configuring shadow doorbells, then I
> > think you might have a broken driver that does not properly reset the
> > controller before using it (are you toggling CC.EN?). That could
> > explain the admin queue size of 32 (default admin queue depth for the
> > Linux nvme driver) as well as the db/ei_addrs being left over. And
> > behavior wrt. how the Linux driver disables the device might have
> > changed between the kernel version used in Ubuntu 20.04 and 22.04.
Awesome. Occam's Razor strikes again ;)
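
For anyone landing here with the same symptom, the missing piece was the
controller disable/enable handshake. Below is a minimal sketch of that
sequence, assuming a memory-mapped BAR0 and hypothetical MMIO/poll
helpers; the register offsets (CC at 0x14, CSTS at 0x1c) are from the
NVMe base specification, everything else is illustrative and not taken
from any real driver.

```c
/*
 * Minimal sketch of an NVMe controller reset/enable sequence. `regs`
 * points at the memory-mapped BAR0. Helper names are hypothetical.
 */
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

#define NVME_REG_CC    0x14            /* Controller Configuration */
#define NVME_REG_CSTS  0x1c            /* Controller Status */
#define NVME_CC_EN     (1u << 0)
#define NVME_CSTS_RDY  (1u << 0)

static uint32_t mmio_read32(volatile uint8_t *regs, uint32_t off)
{
    return *(volatile uint32_t *)(regs + off);
}

static void mmio_write32(volatile uint8_t *regs, uint32_t off, uint32_t val)
{
    *(volatile uint32_t *)(regs + off) = val;
}

/* Poll CSTS.RDY until it matches `want` (0 or NVME_CSTS_RDY). A real
 * driver would derive the timeout from CAP.TO (500 ms units). */
static bool wait_csts_rdy(volatile uint8_t *regs, uint32_t want)
{
    for (int i = 0; i < 5000; i++) {
        if ((mmio_read32(regs, NVME_REG_CSTS) & NVME_CSTS_RDY) == want) {
            return true;
        }
        usleep(1000);
    }
    return false;
}

static bool nvme_reset_and_enable(volatile uint8_t *regs)
{
    /* 1. Clear CC.EN and wait for CSTS.RDY to drop to 0. Skipping this
     *    step leaves state from the previous driver behind. */
    mmio_write32(regs, NVME_REG_CC,
                 mmio_read32(regs, NVME_REG_CC) & ~NVME_CC_EN);
    if (!wait_csts_rdy(regs, 0)) {
        return false;
    }

    /* 2. Program AQA/ASQ/ACQ (admin queue size and base addresses)
     *    here, while the controller is disabled. */

    /* 3. Set CC.EN and wait for CSTS.RDY to come up. */
    mmio_write32(regs, NVME_REG_CC,
                 mmio_read32(regs, NVME_REG_CC) | NVME_CC_EN);
    return wait_csts_rdy(regs, NVME_CSTS_RDY);
}
```

Without step 1, a controller previously enabled by the kernel nvme
driver keeps its old state (admin queue depth 32, shadow doorbell and
eventidx buffers still configured), which is exactly the apparent
"corruption" observed above.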
