On Thu, Dec 16, 2021 at 12:33 PM Bruce Richardson
<bruce.richard...@intel.com> wrote:
>
> On Thu, Dec 16, 2021 at 11:34:25AM -0500, Lance Richardson wrote:
> > On Thu, Dec 16, 2021 at 11:20 AM Bruce Richardson
> > <bruce.richard...@intel.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 11:04:54AM -0500, Lance Richardson wrote:
> > > > Hi Bruce,
> > > >
> > > > I've been looking into using the IOAT PMD, initially with dma_autotest
> > > > and the dpdk-dma example application. These seem to work fine on
> > > > SKX with the current main branch, but when I try the same procedure
> > > > on ICX (binding all 8 devices to vfio-pci in both cases), I get the
> > > > following output for each device when probed. Is something different
> > > > needed when using IOAT on ICX vs. SKX?
> > > >
> > > > Thanks,
> > > >      Lance
> > > >
> > > > EAL: Probe PCI driver: dmadev_ioat (8086:b00) device: 0000:80:01.0 (socket 2)
> > > > IOAT: ioat_dmadev_probe(): Init 0000:80:01.0 on NUMA node 2
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: Channel count == 255
> > > >
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: Channel appears locked
> > > >
> > > > IOAT: ioat_dmadev_create(): ioat_dmadev_create: cannot reset device.
> > > > CHANCMD=0xff, CHANSTS=0xffffffffffffffff, CHANERR=0xffffffff
> > > >
> > > > EAL: Releasing PCI mapped resource for 0000:80:01.0
> > > > EAL: Calling pci_unmap_resource for 0000:80:01.0 at 0x4102430000
> > > > EAL: Requested device 0000:80:01.0 cannot be used
> > >
> > > That is strange, the same PMD should work ok on both platforms. This is
> > > all on latest branch, right? Let me attempt to reproduce and get back to
> > > you.
> >
> > Hi Bruce,
> >
> > That's correct, I'm using the current tip of the main branch, which
> > seems to be identical to 21.11.0.
> > >
> > > /Bruce
> > >
> > > PS: Is this a 4-socket system you are running on, since I see "socket 2"
> > > being described as the socket number for device 80:01.0?
> > >
> > It is a two-socket system with sub-NUMA enabled, so it appears as four
> > NUMA nodes. I'm only binding the devices on the second socket.
> >
>
> Ok, [not that that should affect anything to do with ioat, AFAIK]
>
> Tried quickly reproducing the issue on some of our systems and failed to do
> so. Does this error appear consistently, especially after a reboot?
>
> Thanks,
> /Bruce

It fails consistently after every warm reboot or power cycle. The kernel ioatdma
driver always loads successfully at boot time for both sockets, but it also
fails once I have bound the devices to vfio-pci and attempted to run
examples/dpdk-dma. The kernel log messages are similar; both seem to read
all-ones values.

However, I have found that it works when binding to igb_uio instead of vfio, so
maybe that's some kind of clue (vfio does work for the NIC ports).

I'll continue to experiment with igb_uio, but I'm happy to gather any debug info
for the vfio case if that would help.
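
As one example of debug info for the vfio case: all-ones reads from CHANCMD,
CHANSTS and CHANERR are what an MMIO read typically returns when the device
does not respond, so it may be worth checking whether memory-space decoding is
still enabled in the PCI command register after binding to vfio-pci. Below is a
minimal Python sketch, not part of DPDK. The function names are hypothetical,
and it assumes the standard Linux sysfs config file and the standard PCI header
layout (command register at offset 0x04, bit 1 = Memory Space Enable, bit 2 =
Bus Master Enable). It only reports state and does not explain the failure.

```python
import struct
from pathlib import Path


def decode_pci_command(config: bytes) -> dict:
    """Decode vendor ID, device ID and command register bits.

    `config` is the start of a device's PCI configuration space
    (at least the first 6 bytes, little-endian).
    """
    if len(config) < 6:
        raise ValueError("need at least 6 bytes of config space")
    vendor, device, command = struct.unpack_from("<HHH", config, 0)
    return {
        "vendor": vendor,
        "device": device,
        "memory_space": bool(command & 0x2),  # bit 1: Memory Space Enable
        "bus_master": bool(command & 0x4),    # bit 2: Bus Master Enable
        "all_ones": vendor == 0xFFFF,         # config reads also returning all-ones
    }


def read_pci_command(bdf: str) -> dict:
    """Read config space for a device such as '0000:80:01.0' from sysfs.

    Assumption: Linux exposes it at /sys/bus/pci/devices/<BDF>/config.
    """
    raw = Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()
    return decode_pci_command(raw)
```

Calling read_pci_command("0000:80:01.0") once while bound to igb_uio and once
while bound to vfio-pci would show whether the command register differs
between the two cases.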

Thanks,
    Lance
