There are two problems:

1. HW CRC strip needs to be enabled for VFs; that is why DPDK is failing to init the device.
2. VFs are dropping packets when .max_rx_pkt_len is set to 9216.
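For reference, both knobs live in DPDK's struct rte_eth_conf rxmode configuration. Below is a minimal sketch of the corresponding settings, assuming the DPDK 17.x API (where the rxmode flags are still bitfields); the helper name and port id are placeholders for illustration only, not the actual VPP patch:

#include <rte_ethdev.h>

/* Illustrative helper only; the real change lives in VPP's dpdk plugin
 * (src/plugins/dpdk/device/init.c), not here. */
static int
configure_82599_vf (uint8_t port_id)
{
  struct rte_eth_conf conf = { 0 };

  /* Problem 1: the 82599 VF cannot disable CRC stripping, so leave it
   * enabled rather than have ixgbevf_dev_configure() complain. */
  conf.rxmode.hw_strip_crc = 1;

  /* Problem 2 workaround: cap the RX frame size at standard Ethernet
   * instead of 9216, which (likely) gives up jumbo frames on the VF. */
  conf.rxmode.max_rx_pkt_len = 1518;
  conf.rxmode.jumbo_frame = 0;

  /* 1 RX queue / 1 TX queue, matching the "show hardware" output below. */
  return rte_eth_dev_configure (port_id, 1, 1, &conf);
}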
Problem 1 is easily fixable by changing .hw_strip_crc to 1 in src/plugins/dpdk/device/init.c. Problem 2 seems to be outside of VPP's control, but it can be worked around by setting .max_rx_pkt_len to 1518; the consequence of doing this is that we will (likely) lose jumbo frame support on VFs.

I'm going to submit a patch which fixes both issues soon (actually works around 2.), I need to play a bit more with 2. first...

On 12 May 2017, at 12:08, Tomas Brännström <tomas.a.brannst...@tieto.com> wrote:

Unfortunately my MTU seems to be at 1500 already. I did an upgrade to release 1704 and now none of the interfaces are discovered anymore. It seems suspicious, since there are basically no log printouts at startup either; below are 17.01 vs 17.04 for comparison. I don't think this is exclusive to this "SR-IOV" machine either: when running later VPP in, for example, VirtualBox I get the same problems, so I guess there's something additional that must be done that's maybe not documented on the wiki yet.

17.01:
------
vlib_plugin_early_init:213: plugin path /usr/lib/vpp_plugins
vpp[5066]: vlib_pci_bind_to_uio: Skipping PCI device 0000:00:03.0 as host interface eth0 is up
EAL: Detected 4 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   Device is blacklisted, not initializing
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL:   probe driver: 8086:10ed net_ixgbe_vf
EAL: PCI device 0000:00:07.0 on NUMA socket -1
EAL:   probe driver: 8086:10ed net_ixgbe_vf
DPDK physical memory layout:
Segment 0: phys:0x5cc00000, len:2097152, virt:0x7f7e0b800000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
Segment 1: phys:0x5d000000, len:266338304, virt:0x7f7db1600000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip
PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip

17.04:
------
vlib_plugin_early_init:360: plugin path /usr/lib/vpp_plugins

/Tomas

On 12 May 2017 at 11:49, Gonsalves, Avinash (Nokia - IN/Bangalore) <avinash.gonsal...@nokia.com> wrote:

I faced a similar issue with SR-IOV, and for some reason setting the MTU size to 1500 on the interface helped with ARP resolution.

Thanks,
Avinash

================================================================================

Thanks. I can try to use a later VPP version. A thing to note is that when we did try to use the master release before, VPP failed to discover interfaces, even when they were whitelisted. Not sure if something has changed in the way VPP discovers interfaces in later versions. I will try with 1704 though.

Ah OK, sorry, I should have realized what PF was in this context. I'm not sure of all the info that might be needed, but here's what I could think of:

root@node-4:~# lspci -t -v
[...]
 +-03.0-[0b-0c]--+-00.0  Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
 |               +-00.1  Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
 |               +-10.0  Intel Corporation 82599 Ethernet Controller Virtual Function
 |               +-10.1  Intel Corporation 82599 Ethernet Controller Virtual Function
[...]
root@node-4:~# lshw -class network -businfo
Bus info          Device     Class    Description
============================================================
pci@0000:0b:00.0  enp11s0f0  network  82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:0b:00.1  enp11s0f1  network  82599ES 10-Gigabit SFI/SFP+ Network Connection

root@node-4:~# cat /sys/class/net/enp11s0f0/device/sriov_totalvfs
63
root@node-4:~# cat /sys/class/net/enp11s0f0/device/sriov_numvfs
16
root@node-4:~# ethtool -i enp11s0f0
driver: ixgbe
version: 3.15.1-k
firmware-version: 0x61c10001
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

/Tomas

On 12 May 2017 at 10:44, Damjan Marion (damarion) <damarion@cisco.com> wrote:
>
> On 12 May 2017, at 08:01, Tomas Brännström <tomas.a.brannstrom@tieto.com> wrote:
>
> *(I forgot to mention before, this is running with VPP installed from
> binaries with release .stable.1701)*
>
>
> I strongly suggest that you use the 17.04 release at least.
>
>
> With PF do you mean packet filter? I don't think we have any such
> configuration. If there is anything else I should provide then please tell
> :)
>
>
> PF = SR-IOV Physical Function
>
>
> I decided to try to attach to the VPP process with gdb and I actually get
> a crash when trying to do "ip probe":
>
> vpp# ip probe 10.0.1.1 TenGigabitEthernet0/6/0
> exec error: Misc
>
> Program received signal SIGSEGV, Segmentation fault.
> ip4_probe_neighbor (vm=vm@entry=0x7f681533e720 <vlib_global_main>,
>     dst=dst@entry=0x7f67d345cc50, sw_if_index=sw_if_index@entry=1)
>     at /w/workspace/vpp-merge-1701-ubuntu1404/build-data/../vnet/vnet/ip/ip4_forward.c:2223
> 2223    /w/workspace/vpp-merge-1701-ubuntu1404/build-data/../vnet/vnet/ip/ip4_forward.c:
> No such file or directory.
> (gdb) bt
> #0  ip4_probe_neighbor (vm=vm@entry=0x7f681533e720 <vlib_global_main>,
>     dst=dst@entry=0x7f67d345cc50, sw_if_index=sw_if_index@entry=1)
>     at /w/workspace/vpp-merge-1701-ubuntu1404/build-data/../vnet/vnet/ip/ip4_forward.c:2223
>
> Whether this is related or not I'm not sure, because yesterday I could do
> the probe but got "Resolution failed". I've attached the stack trace at any
> rate.
>
> /Tomas
>
> On 11 May 2017 at 20:25, Damjan Marion (damarion) <damarion@cisco.com> wrote:
>
>> Dear Tomas,
>>
>> Can you please share your PF configuration so I can try to reproduce?
>>
>> Thanks,
>>
>> Damjan
>>
>> On 11 May 2017, at 17:07, Tomas Brännström <tomas.a.brannstrom@tieto.com> wrote:
>>
>> Hello
>> Since the last mail I sent I've managed to get our test client working
>> and VPP running in a KVM VM.
>>
>> We are still facing some problems though.
>> We have two servers, one where the virtual machines are running and one
>> we use as the OpenStack controller. They are connected to each other with
>> a 10G NIC. We have SR-IOV configured for the 10G NIC.
>>
>> So VPP is installed in a VM, and all interfaces work OK; they can be
>> reached from outside the VM etc. Following the basic examples on the wiki,
>> we configure VPP to take over the interfaces:
>>
>> vpp# set int ip address TenGigabitEthernet0/6/0 10.0.1.101/24
>> vpp# set int ip address TenGigabitEthernet0/7/0 10.0.2.101/24
>> vpp# set int state TenGigabitEthernet0/6/0 up
>> vpp# set int state TenGigabitEthernet0/7/0 up
>>
>> But when trying to ping, for example, the physical NIC on the other server,
>> we get no reply:
>>
>> vpp# ip probe 10.0.1.1 TenGigabitEthernet0/6/0
>> ip probe-neighbor: Resolution failed for 10.0.1.1
>>
>> If I do a tcpdump on the physical interface when trying to ping, I see
>> ARP packets being sent, so -something- is happening, but it seems that
>> packets are not correctly arriving to VPP... I can't ping from the physical
>> host either, but the ARP cache is updated on the host when trying to ping
>> from VPP.
>>
>> I've tried dumping counters etc. but I can't really see anything. The
>> trace does not show anything either. This is the output from "show
>> hardware":
>>
>> vpp# show hardware
>>               Name                Idx   Link  Hardware
>> TenGigabitEthernet0/6/0            1     up   TenGigabitEthernet0/6/0
>>   Ethernet address fa:16:3e:04:42:d1
>>   Intel 82599 VF
>>     carrier up full duplex speed 10000 mtu 9216
>>     rx queues 1, rx desc 1024, tx queues 1, tx desc 1024
>>
>>     tx frames ok                                           3
>>     tx bytes ok                                          126
>>     extended stats:
>>       tx good packets                                      3
>>       tx good bytes                                      126
>> TenGigabitEthernet0/7/0            2     up   TenGigabitEthernet0/7/0
>>   Ethernet address fa:16:3e:f2:15:a5
>>   Intel 82599 VF
>>     carrier up full duplex speed 10000 mtu 9216
>>     rx queues 1, rx desc 1024, tx queues 1, tx desc 1024
>>
>> I've tried a similar setup between two VirtualBox VMs and that worked
>> OK, so I'm thinking it might have something to do with SR-IOV for some
>> reason. I'm having a hard time troubleshooting this since I'm not sure how
>> to check where the packets actually get lost...
>>
>> /Tomas
>>
>
> <vpp_stacktrace.txt>
>
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev