On 08/01/2016 11:56 AM, Erez Shitrit wrote:
> The GID (9000:0:2800:0:bc00:7500:6e:d8a4) is not regular, not from
> local subnet prefix.
> why is that?
So I managed to debug this, and it turns out the problem lies in the interaction between veth and ipoib. I have discovered the following strange behaviour: if I have a veth pair whose two devices sit in different network namespaces, as set up in the attached scripts, then sending a file that originates from the veth interface inside the non-init namespace and goes out across the ipoib interface is very slow (around 100 KB/s).

For simple reproduction I'm attaching two scripts which have to be run on two machines, with the respective IP addresses set on them. The sending node then initiates a simple file copy over nc. I've observed this behaviour on upstream 4.4, 4.5.4 and 4.7.0 kernels, with both IPv4 and IPv6 addresses.

Here is what the debug log of the ipoib module shows:

ib%d: max_srq_sge=128
ib%d: max_cm_mtu = 0xfff0, num_frags=16
ib0: enabling connected mode will cause multicast packet drops
ib0: mtu > 4092 will cause multicast packet drops.
ib0: bringing up interface
ib0: starting multicast thread
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: restarting multicast task
ib0: adding multicast entry for mgid ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
ib0: Created ah ffff88081063ea80
ib0: MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff AV ffff88081063ea80, LID 0xc000, SL 0
ib0: joining MGID ff12:601b:ffff:0000:0000:0000:0000:0001
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:0000:0001
ib0: successfully started all multicast joins
ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff880839084680
ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff880839084680, LID 0xc002, SL 0
ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
ib0: Created ah ffff88081063e280
ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081063e280, LID 0xc004, SL 0

When the transfer is initiated I can see the following errors on the sending node:

ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: Start path record lookup for 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: PathRec status -22 for GID 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36
ib0: neigh free for 000003 0401:0000:1400:0000:a0a8:ffff:1c01:4d36

The port GUID of the sending node is 0x0011750000772664, and that of the receiving node is 0x0011750000774d36.

Here is what the paths look like on the sending node; the path records being requested for traffic originating from the veth interface clearly never complete:

cat /sys/kernel/debug/ipoib/ib0_path
GID: 401:0:1400:0:a0a8:ffff:1c01:4d36
  complete: no

GID: 401:0:1400:0:a410:ffff:1c01:4d36
  complete: no

GID: fe80:0:0:0:11:7500:77:2a1a
  complete: yes
  DLID:     0x0004
  SL:       0
  rate:     40.0 Gb/sec

GID: fe80:0:0:0:11:7500:77:4d36
  complete: yes
  DLID:     0x000a
  SL:       0
  rate:     40.0 Gb/sec

Testing the same scenario, but creating the device in the non-init network namespace as an ipoib child interface instead of using veth devices, I can achieve sensible speeds:

ip link add link ib0 name ip1 type ipoib
ip link set dev ip1 netns test-netnamespace

[Snipped a lot of useless stuff]
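Roughly, the working (ipoib child device) test on the sending node looks like the sketch below; the 192.168.100.0/24 addresses are placeholders rather than the ones actually used on the fabric, and the nc listener syntax varies between netcat flavours:

ip netns add test-netnamespace
ip link add link ib0 name ip1 type ipoib          # ipoib child interface
ip link set dev ip1 netns test-netnamespace
ip netns exec test-netnamespace ip addr add 192.168.100.1/24 dev ip1
ip netns exec test-netnamespace ip link set dev ip1 up
# receiver side: nc -l -p 5001 > received-file
ip netns exec test-netnamespace nc 192.168.100.2 5001 < testfile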
receive-node.sh
Description: application/shellscript
sending-node.sh
Description: application/shellscript
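For reference, the veth-based setup in the attached scripts amounts to roughly the following on the sending side. The interface names, addresses and the forwarding step below are placeholders showing one possible arrangement, not the exact contents of the scripts:

ip netns add test-netnamespace
ip link add veth0 type veth peer name veth1        # veth pair
ip link set dev veth1 netns test-netnamespace      # one end in the non-init netns
ip addr add 192.168.200.1/24 dev veth0
ip link set dev veth0 up
ip netns exec test-netnamespace ip addr add 192.168.200.2/24 dev veth1
ip netns exec test-netnamespace ip link set dev veth1 up
ip netns exec test-netnamespace ip route add default via 192.168.200.1
# let traffic from the veth pair be forwarded out via the ipoib interface;
# the receiving node needs a return route to 192.168.200.0/24 (or NAT on the sender)
sysctl -w net.ipv4.ip_forward=1
# push a file from inside the namespace, across ib0, to the receiver
# (receiver side: nc -l -p 5001 > received-file)
ip netns exec test-netnamespace nc <receiver-ib0-address> 5001 < testfile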