Hi all,

[Short version]

I'm trying to enable 16 VFs on a Mellanox ConnectX-3 (Gen3 x8) PCIe card, which only enumerates <=8 VFs when installed in the CPU's x16 PCIe slot, but enumerates 16 VFs (or more) when placed in a Southbridge PCIe slot. I've tried two different generations of Xeon: (1) Xeon E3-1240V3 / ASRock E3C222D4U and (2) Xeon E3-1240LV5 / ASRock E3C236D4U, with the same result on both -- 16 VFs fail in the PCIe x16 slot but work on Southbridge PCIe.
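For concreteness, the VFs are enabled like this (a sketch of my setup; the mst device name is just what "mst status" reports for the card on my box):

  # Firmware side, via the Mellanox MFT tools (set once, then cold reboot):
  mst start
  mlxconfig -d /dev/mst/mt4099_pci_cr0 set SRIOV_EN=1 NUM_OF_VFS=16

  # Driver side -- the in-box mlx4_core driver takes the VF count as a
  # module parameter (e.g. in /etc/modprobe.d/mlx4.conf):
  options mlx4_core num_vfs=16 probe_vf=0

Nothing about this changes between the two cases; only the physical slot the card sits in differs between working and failing.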
To cover the bases: the CX3 card is on the latest firmware (2.42.5000) and configured for dual 10/40 GbE (not IB), SR-IOV and 16 VFs are enabled in the card's firmware, VT-x/VT-d are enabled in the BIOS, and the kernel is booted with intel_iommu=on. This is on Debian 10 (Buster), also tested with Arch Linux, with both the default in-kernel driver and the latest (4.9-2.2.4.0) driver from Mellanox; the errors are the same.

Is this a kernel bug, a BIOS issue, a driver bug, or a hardware limitation of the x16 PCIe root complex on these Xeon CPUs? Could it be related to a lack of ACS on the x16 interface, or to ARI not being properly enabled? (The checks behind the ACS/ARI guess are in the P.S. below.)

[Long version]

When using the x16 slot, after the first 8 total Virtual Functions dmesg shows errors enumerating the remaining VFs: there are complaints about "INTx" interrupts for these VFs (they should be using MSI-X), and lspci shows (ff) for the additional VF devices. The CPUs have no integrated graphics, the system's ASPEED GFX/IPMI chip is attached via the Southbridge, and nothing else occupies the CPU's x16 PCIe lanes. I have also tried the motherboard's other x8 slot, which can be bifurcated from the x16 port, with no difference.

However, if the CX3 card is placed in the Southbridge's PCIe slot, I CAN successfully allocate the 16 VFs I'm looking for: lspci shows the devices as expected, there are no complaints about INTx interrupts, and /proc/interrupts shows MSI-X for everything.

To further rule out a user/driver configuration problem, I used the same Linux install across the two machines above and a third, a Dell Precision Tower 5810 (Xeon E5), which could also enumerate >8 VFs. I have also tried building the Mellanox/NVIDIA "proprietary" driver and making sure the module is loaded with msi_x=1; the failure mode is exactly the same.

I can share dmesg, modprobe, and lspci output, but figured I'd keep the mail short in case the answer is obvious.

Thank you!
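P.S. The ACS/ARI question above comes from checks along these lines (the PCI addresses are just examples from my box; substitute the root port above the slot and the CX3 physical function):

  # ARI forwarding on the root port above the card (ARIFwd in DevCap2/DevCtl2),
  # and whether the port exposes an ACS capability at all:
  lspci -vvv -s 00:01.0 | grep -E 'ARIFwd|Access Control'

  # ARI and SR-IOV capabilities on the ConnectX-3 physical function:
  lspci -vvv -s 01:00.0 | grep -E 'Alternative Routing|SR-IOV|VFs'

  # Interrupt mode actually in use by the mlx4 functions:
  grep mlx4 /proc/interrupts

If I understand the PCIe spec correctly, a root port without ARI forwarding enabled will terminate config requests to any device number other than 0 on its secondary bus, so VFs whose routing IDs fall past function 7 (i.e., at device 1 and up) read back as all-ones -- which would line up with both the 8-VF ceiling and the (ff) devices in lspci. I may well be wrong about that, hence the question.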