Hi, I am seeing "Failed to init adminq: -54" (admin queue timeouts) while initializing the admin queue for an Intel XL710 NIC with the i40e PMD. (The server has an Intel E5-2670.)
First things first: I am running the latest firmware, the i40e kernel module is not loaded, and yes, the card works with the i40e kernel driver (latest or otherwise). The problem occurs with DPDK 2.0/2.1 as well as the latest stable.

I have done a fair amount of debugging; here are my findings. With the card configured in either 2x40G or 4x10G mode, PCI function 0 (port 0) _ALWAYS_ initializes successfully, and every subsequent port fails. Even if I unbind igb_uio from port 0 and bind only port 1 (or only ports 2/3/4 in 4x10G mode), initialization still fails.

Since the card works with the kernel driver, I compared how the registers are set up by the kernel i40e driver versus DPDK. They look mostly the same, with a few subtle differences; I brought the DPDK code in sync with the little I found, and it still failed.

Stepping through gdb from the EAL PCI code, through the PCI UIO mapping, to eth_i40e_dev_init and the failure to obtain the firmware revision for port 1 in i40e_init_adminq, I confirmed that the PCI memory map is correct: hw->hw_addr for port 1 matches the uio1 map and the physical address from lspci (and from the kernel driver, which works). However, the admin queue does not seem to process any request for port 1. Note that port 0 always works, and the same code runs for the other ports, just with a different EAL device/hw instance. For the other ports, after the adminq registers and memory map are set up correctly, obtaining the firmware revision always fails: i40e_asq_done reads 0 from the head register at 0x80300, which never matches next_to_use (starting at 1), so it keeps returning pending/false. i40e_init_adminq resets the AQ and retries a number of times, but ultimately gives up.

Thoughts? Wondering if you have seen this and have a fix or patch that is not upstream yet.

The failure is enclosed below, as mentioned above. (This run is with the card in 4x10G mode, but 2x40G mode fails the same way; no difference. Port 0 always succeeds and subsequent ports fail, and the result is the same even with port 0 not bound, starting with the initialization of ports 2/3/4, which always fails.)

EAL: lcore 1 is ready (tid=6bd30700;cpuset=[1])
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:1521 rte_igb_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:01:00.1 on NUMA socket 0
EAL:   probe driver: 8086:1521 rte_igb_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:83:00.0 on NUMA socket 1
EAL:   probe driver: 8086:1583 rte_i40e_pmd
EAL:   PCI memory mapped at 0x7f2f80000000
EAL:   PCI memory mapped at 0x7f2f80800000
PMD: eth_i40e_dev_init(): FW 4.40 API 1.4 NVM 04.05.03 eetrack 80001dca
PMD: i40e_pf_parameter_init(): Max supported VSIs:34
PMD: i40e_pf_parameter_init(): PF queue pairs:64
PMD: i40e_pf_parameter_init(): Max VMDQ VSI num:34
PMD: i40e_pf_parameter_init(): VMDQ queue pairs:4
EAL: PCI device 0000:83:00.1 on NUMA socket 1
EAL:   probe driver: 8086:1583 rte_i40e_pmd
EAL:   PCI memory mapped at 0x7f2f80808000
EAL:   PCI memory mapped at 0x7f2f81008000
PMD: eth_i40e_dev_init(): Failed to init adminq: -54
EAL: Error - exiting with code: 1
  Cause: Requested device 0000:83:00.1 cannot be used
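To make the failing check concrete, here is a stripped-down sketch of the completion test as I understand it from i40e_asq_done(). This is my own reconstruction of the logic, not the driver source; mmio_read32() and asq_done() are stand-in names for the driver's rd32() and i40e_asq_done(), with bar0 playing the role of hw->hw_addr:

#include <stdint.h>
#include <stdbool.h>

#define I40E_PF_ATQH  0x00080300  /* admin send queue head register, the 0x80300 above */

/* Stand-in for the driver's rd32(): a 32-bit read from the port's BAR0 mapping. */
static inline uint32_t mmio_read32(volatile void *bar0, uint32_t reg)
{
    return *(volatile uint32_t *)((volatile uint8_t *)bar0 + reg);
}

/* bar0 corresponds to hw->hw_addr for the port; next_to_use is the
 * descriptor index the driver advanced to after posting the command
 * (1 for the first command). */
static bool asq_done(volatile void *bar0, uint16_t next_to_use)
{
    /* On port 0 the hardware advances ATQH to match next_to_use once the
     * command completes. On the failing ports ATQH stays at 0, so this
     * never returns true; i40e_init_adminq resets the AQ, retries, and
     * eventually gives up with -54 (I40E_ERR_ADMIN_QUEUE_TIMEOUT). */
    return mmio_read32(bar0, I40E_PF_ATQH) == (uint32_t)next_to_use;
}

In other words, the software side of the handshake looks fine for port 1; the hardware simply never advances the send-queue head.

Regards,
-Karthick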