On Thu, Dec 16, 2021 at 12:33 PM Bruce Richardson
<bruce.richard...@intel.com> wrote:
>
> On Thu, Dec 16, 2021 at 11:34:25AM -0500, Lance Richardson wrote:
> > On Thu, Dec 16, 2021 at 11:20 AM Bruce Richardson
> > <bruce.richard...@intel.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 11:04:54AM -0500, Lance Richardson wrote:
> > > > Hi Bruce,
> > > >
> > > > I've been looking into using the IOAT PMD, initially with dma_autotest
> > > > and the dpdk-dma example application. These seem to work fine on
> > > > SKX with the current main branch, but when I try the same procedure
> > > > on ICX (binding all 8 devices to vfio-pci in both cases), I get the
> > > > following output for each device when probed. Is something different
> > > > needed when using IOAT on ICX vs. SKX?
> > > >
> > > > Thanks,
> > > > Lance
> > > >
> > > > EAL: Probe PCI driver: dmadev_ioat (8086:b00) device: 0000:80:01.0 (socket 2)
> > > > IOAT: ioat_dmadev_probe(): Init 0000:80:01.0 on NUMA node 2
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: Channel count == 255
> > > >
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: Channel appears locked
> > > >
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: cannot reset device.
> > > > CHANCMD=0xff, CHANSTS=0xffffffffffffffff, CHANERR=0xffffffff
> > > >
> > > > EAL: Releasing PCI mapped resource for 0000:80:01.0
> > > > EAL: Calling pci_unmap_resource for 0000:80:01.0 at 0x4102430000
> > > > EAL: Requested device 0000:80:01.0 cannot be used
> > >
> > > That is strange, the same PMD should work ok on both platforms. This is
> > > all on latest branch, right? Let me attempt to reproduce and get back
> > > to you.
> >
> > Hi Bruce,
> >
> > That's correct, I'm using the current tip of the main branch, which
> > seems to be identical to 21.11.0.
> > >
> > > /Bruce
> > >
> > > PS: Is this a 4-socket system you are running on, since I see "socket 2"
> > > being described as the socket number for device 80:01.0?
> > >
> >
> > It is a two-socket system with sub-NUMA enabled, so it appears as four
> > NUMA nodes. I'm only binding the devices on the second socket.
> >
>
> Ok, [not that that should affect anything to do with ioat, AFAIK]
>
> Tried quickly reproducing the issue on some of our systems and failed to do
> so. Does this error appear consistently, especially after a reboot?
>
> Thanks,
> /Bruce

It fails consistently after every warm reboot or power cycle. The kernel
ioatdma driver always loads successfully at boot time for both sockets,
but it also fails once I have bound the devices to vfio-pci and attempted
to run examples/dpdk-dma. The kernel log messages are similar; both seem
to read all-ones values.

However, I have found that it works when binding to igb_uio instead of
vfio-pci, so maybe that's some kind of clue (vfio-pci does work for the
NIC ports).

I'll continue to experiment with igb_uio, but I'm happy to gather any
debug info for the vfio-pci case if that would help.

Thanks,
Lance
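
One piece of debug info that might be worth capturing for the vfio case is the PCI command register of an affected device, since all-ones register reads can (as an assumption, not a diagnosis) mean the BAR is not being decoded or the device is not responding. The sketch below is only illustrative: the helper names decode_pci_command() and read_config() are hypothetical, and it assumes the standard Linux sysfs path /sys/bus/pci/devices/<BDF>/config and the standard PCI config header layout (vendor ID at offset 0, device ID at offset 2, command register at offset 4).

```python
import struct


def decode_pci_command(config: bytes) -> dict:
    """Decode vendor/device ID and command-register bits from raw
    PCI config space bytes (standard header layout, little-endian)."""
    if len(config) < 6:
        raise ValueError("need at least 6 bytes of config space")
    vendor, device, command = struct.unpack_from("<HHH", config, 0)
    return {
        "vendor": vendor,
        "device": device,
        # A vendor ID of 0xffff means the config read itself returned all ones.
        "present": vendor != 0xFFFF,
        "io_space": bool(command & 0x1),      # command bit 0
        "memory_space": bool(command & 0x2),  # command bit 1: BAR decoding on
        "bus_master": bool(command & 0x4),    # command bit 2
    }


def read_config(bdf: str) -> bytes:
    """Read the first 64 bytes of config space from sysfs (hypothetical
    helper; assumes a Linux host and a BDF such as '0000:80:01.0')."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(64)
```

Calling decode_pci_command(read_config("0000:80:01.0")) once with the device bound to vfio-pci and once with igb_uio would show whether memory_space or bus_master differs between the two cases.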