Hi Pim,

The logic there (I think) is that since the source port is the one you send with, it is also the one you receive on, so the source-port setting determines the key used to match incoming traffic, i.e. it’s the peer’s dst-port.
/neale

From: Pim van Pelt <p...@ipng.nl>
Date: Friday, 14 January 2022 at 11:28
To: Neale Ranns <ne...@graphiant.com>
Cc: vpp-dev <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] VXLAN and RSS

Hoi,

Neale, thank you for pointing that out! I verified the intent, and I can confirm that VXLAN uses random source ports [1], and so does GENEVE [2], so this is WAI (working as intended). I mirrored traffic between the two VPP hosts while running a T-Rex bench.py with vm=var2 to scramble the src/dst IP addresses of the inner payload. This scrambled the outer src_port as well, which is great. So it seems I was reading the wrong part of the vxlan source code, and the thing I wanted is already there.

I'll update my article with this info. I'm only left wondering why there is an option to set src_port when creating a VXLAN tunnel (no such option is present in GENEVE):

create vxlan tunnel src <local-vtep-addr> {dst <remote-vtep-addr>|group <mcast-vtep-addr> <intf-name>} vni <nn> [instance <id>] [encap-vrf-id <nn>] [decap-next [l2|node <name>]] [del] [l3] [src_port <local-vtep-udp-port>] [dst_port <remote-vtep-udp-port>]

create geneve tunnel local <local-vtep-addr> {remote <remote-vtep-addr>|group <mcast-vtep-addr> <intf-name>} vni <nn> [encap-vrf-id <nn>] [decap-next [l2|node <name>]] [l3-mode] [del]

groet,
Pim

[1] (on the mirrored interface, watching traffic between rhino:Hu12/0/1 and hippo:Hu12/0/1)
tcpdump -ni enp5s0f3 port 4789
11:19:54.887763 IP 10.0.0.1.4452 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.888283 IP 10.0.0.1.42537 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.888285 IP 10.0.0.0.17895 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.899353 IP 10.0.0.1.40751 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.899355 IP 10.0.0.0.35475 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.904642 IP 10.0.0.0.60633 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.908642 IP 10.0.0.1.54881 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.910201 IP 10.0.0.1.11787 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.910204 IP 10.0.0.0.13300 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.919702 IP 10.0.0.1.55752 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.919714 IP 10.0.0.0.22122 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.944301 IP 10.0.0.0.42756 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.944303 IP 10.0.0.1.8992 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.954043 IP 10.0.0.1.49613 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.954045 IP 10.0.0.0.16483 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.954411 IP 10.0.0.0.37118 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.954412 IP 10.0.0.1.26825 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298
11:19:54.959725 IP 10.0.0.1.5643 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 8298

[2] (on the mirrored interface, watching traffic between rhino:Hu12/0/1 and hippo:Hu12/0/1)
tcpdump -ni enp5s0f3 port 6081
11:20:55.802406 IP 10.0.0.0.32299 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.45.1025 > 48.0.0.45.12: UDP, length 18
11:20:55.802409 IP 10.0.0.1.44011 > 10.0.0.0.6081: Geneve, Flags [none], vni 0x206a: IP 48.0.0.45.1025 > 16.0.0.45.12: UDP, length 18
11:20:55.807711 IP 10.0.0.1.45503 > 10.0.0.0.6081: Geneve, Flags [none], vni 0x206a: IP 48.0.0.47.1025 > 16.0.0.47.12: UDP, length 18
11:20:55.807712 IP 10.0.0.0.45532 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.47.1025 > 48.0.0.47.12: UDP, length 18
11:20:55.841494 IP 10.0.0.1.10795 > 10.0.0.0.6081: Geneve, Flags [none], vni 0x206a: IP 48.0.0.50.1025 > 16.0.0.50.12: UDP, length 18
11:20:55.841495 IP 10.0.0.0.61694 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.50.1025 > 48.0.0.50.12: UDP, length 18
11:20:55.851719 IP 10.0.0.1.47581 > 10.0.0.0.6081: Geneve, Flags [none], vni 0x206a: IP 48.0.0.48.1025 > 16.0.0.48.12: UDP, length 18
11:20:55.851719 IP 10.0.0.0.52458 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.48.1025 > 48.0.0.48.12: UDP, length 18
11:20:55.851772 IP 10.0.0.0.12360 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.52.1025 > 48.0.0.52.12: UDP, length 18
11:20:55.855768 IP 10.0.0.1.39531 > 10.0.0.0.6081: Geneve, Flags [none], vni 0x206a: IP 48.0.0.52.1025 > 16.0.0.52.12: UDP, length 18
11:20:55.856296 IP 10.0.0.0.28635 > 10.0.0.1.6081: Geneve, Flags [none], vni 0x206a: IP 16.0.0.51.1025 > 48.0.0.51.12: UDP, length 18

On Fri, Jan 14, 2022 at 10:40 AM Neale Ranns <ne...@graphiant.com> wrote:

Hi Pim,

For VXLAN the intention is to use random source ports. The code you cite builds the ‘static’ portion of the imposed header. The source ports are overwritten with the hash of the encapped packet in encap.c:246

/neale

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io>
Date: Thursday, 13 January 2022 at 23:37
To: vpp-dev <vpp-dev@lists.fd.io>
Subject: [vpp-dev] VXLAN and RSS

Hoi folks,

I did a deep dive today on VXLAN, GENEVE and compared it to GRE and L2XC - the full read is here: https://ipng.ch/s/articles/2022/01/13/vpp-l2.html

One thing that I observed is that both VXLAN and GENEVE use static source ports. In the case of VLLs (an l2 xconnect from a customer ethernet interface into a tunnel), the customer port will be receiving IPv4 or IPv6 traffic (either tagged or untagged) and this allows the NIC to use RSS to assign this inbound traffic to multiple queues, and thus multiple CPU threads. That’s great, it means linear encapsulation performance.
However, once the traffic is encapsulated, it becomes a single flow with respect to the remote host, i.e. we're sending from 10.0.0.0:4789 to the remote 10.0.0.1:4789, and for this reason all decapsulation is single threaded.

One common approach is to use an ingress hash algorithm to choose from a pool of source ports, or possibly a simpler round-robin over a pool of ports 4000-5000, say, based on the inner payload. That way, the remote would be able to use multiple RSS queues. However, VPP currently does not implement that. I think the original author had this in mind as a future improvement, based on the comment on L295 in vxlan.c:

  /* UDP header, randomize src port on something, maybe? */
  udp->src_port = clib_host_to_net_u16 (t->src_port);
  udp->dst_port = clib_host_to_net_u16 (t->dst_port);

What would it take for src_port to not be static? It would greatly improve VXLAN (and similarly, GENEVE) throughput on ingress.

groet,
Pim

--
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/
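
To illustrate the point discussed in this thread (a static outer source port collapses all tunnel traffic into one flow at the receiver, while a source port derived from a hash of the inner packet spreads it out), here is a minimal sketch. It is not VPP's code: the hash (truncated SHA-256), the names stable_hash, outer_src_port, rss_queue and queues_used, and the modulo queue selection are all assumptions made for illustration. Real NICs and VPP use their own hash functions.

```python
import hashlib

VXLAN_DST_PORT = 4789  # destination port seen in the tcpdump captures in this thread


def stable_hash(*fields) -> int:
    """Stand-in 32-bit hash (truncated SHA-256). Assumption: not the hash VPP or a NIC uses."""
    data = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def outer_src_port(inner_flow: tuple) -> int:
    """Derive a 16-bit outer UDP source port from the inner flow tuple."""
    return stable_hash(*inner_flow) & 0xFFFF


def rss_queue(src_ip, src_port, dst_ip, dst_port, n_queues: int) -> int:
    """Hypothetical receive-side queue choice based on the outer 4-tuple."""
    return stable_hash(src_ip, src_port, dst_ip, dst_port) % n_queues


def queues_used(inner_flows, n_queues: int, static_port=None) -> set:
    """Return the set of receive queues hit by the encapsulated flows."""
    used = set()
    for flow in inner_flows:
        # Either a fixed source port for every packet, or one derived per inner flow.
        sport = static_port if static_port is not None else outer_src_port(flow)
        used.add(rss_queue("10.0.0.0", sport, "10.0.0.1", VXLAN_DST_PORT, n_queues))
    return used


# Inner flows shaped like the T-Rex traffic in the captures:
# 16.0.0.x:1025 > 48.0.0.x:12, UDP (protocol 17).
flows = [(f"16.0.0.{i}", f"48.0.0.{i}", 17, 1025, 12) for i in range(256)]

static_queues = queues_used(flows, n_queues=4, static_port=4789)
hashed_queues = queues_used(flows, n_queues=4)
print("static source port ->", len(static_queues), "queue(s)")
print("hashed source port ->", len(hashed_queues), "queue(s)")
```

Because the port is a pure function of the inner flow, packets of the same inner flow keep the same outer source port, so they stay on one queue and are not reordered across threads.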
View/Reply Online (#20716): https://lists.fd.io/g/vpp-dev/message/20716