Hi Pim,

The logic there (I think) is that since the source port is the one you send 
with, it is also the one you receive on, so the source-port setting determines 
the key to match incoming traffic, i.e. it’s the peer’s dst-port.
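
To illustrate that reading, here is a minimal sketch of a receive-side match 
where the locally configured src_port doubles as the port expected in the 
dst-port field of incoming packets. The struct tunnel_cfg_t and the function 
rx_matches_tunnel are hypothetical names made up for this example; they are 
not VPP's data structures or lookup code, and the real key may well include 
more fields (addresses, FIB index).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tunnel config (not VPP's vxlan_tunnel_t). */
typedef struct
{
  uint16_t src_port; /* local UDP port: configured as "source" port */
  uint16_t dst_port; /* UDP port on the peer that we send to */
  uint32_t vni;      /* VXLAN network identifier */
} tunnel_cfg_t;

/* An incoming packet belongs to the tunnel when its UDP destination port
 * equals our configured source port and the VNI agrees. The UDP source
 * port of the incoming packet is deliberately not part of the match. */
static inline bool
rx_matches_tunnel (const tunnel_cfg_t *t, uint16_t rx_dst_port,
                   uint32_t rx_vni)
{
  assert (t != NULL);
  return rx_dst_port == t->src_port && rx_vni == t->vni;
}
```

With src_port set to 4789 and vni 8298, a packet arriving for port 4789 
matches whatever UDP source port the peer chose to send from.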

/neale

From: Pim van Pelt <p...@ipng.nl>
Date: Friday, 14 January 2022 at 11:28
To: Neale Ranns <ne...@graphiant.com>
Cc: vpp-dev <vpp-dev@lists.fd.io>
Subject: Re: [vpp-dev] VXLAN and RSS
Hi,

Neale, thank you for pointing that out! I verified the intent, and I can 
confirm that VXLAN uses random source ports [1], and so does GENEVE [2], so 
this is WAI. I mirrored traffic between the two VPP hosts, while running a 
T-Rex bench.py with vm=var2 to scramble the src/dst IP addresses of the inner 
payload. It resulted in the outer src_port scrambling, which is great. So it 
seems I was reading the wrong part of the vxlan source code, and the thing I 
wanted is already there. I'll update my article with this info, and I'm only 
left wondering why there would be an option to set src_port when creating a 
VXLAN tunnel (such an option is not present in GENEVE)?

create vxlan tunnel src <local-vtep-addr> {dst <remote-vtep-addr>|group 
<mcast-vtep-addr> <intf-name>} vni <nn> [instance <id>] [encap-vrf-id <nn>] 
[decap-next [l2|node <name>]] [del] [l3] [src_port <local-vtep-udp-port>] 
[dst_port <remote-vtep-udp-port>]
create geneve tunnel local <local-vtep-addr> {remote <remote-vtep-addr>|group 
<mcast-vtep-addr> <intf-name>} vni <nn> [encap-vrf-id <nn>] [decap-next 
[l2|node <name>]] [l3-mode] [del]

Regards,
Pim

[1] (on the mirrored interface, watching traffic between rhino:Hu12/0/1 and 
hippo:Hu12/0/1) tcpdump -ni enp5s0f3 port 4789
11:19:54.887763 IP 10.0.0.1.4452 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.888283 IP 10.0.0.1.42537 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.888285 IP 10.0.0.0.17895 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.899353 IP 10.0.0.1.40751 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.899355 IP 10.0.0.0.35475 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.904642 IP 10.0.0.0.60633 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.908642 IP 10.0.0.1.54881 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.910201 IP 10.0.0.1.11787 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.910204 IP 10.0.0.0.13300 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.919702 IP 10.0.0.1.55752 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.919714 IP 10.0.0.0.22122 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.944301 IP 10.0.0.0.42756 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.944303 IP 10.0.0.1.8992 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.954043 IP 10.0.0.1.49613 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.954045 IP 10.0.0.0.16483 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.954411 IP 10.0.0.0.37118 > 10.0.0.1.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.954412 IP 10.0.0.1.26825 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298
11:19:54.959725 IP 10.0.0.1.5643 > 10.0.0.0.4789: VXLAN, flags [I] (0x08), vni 
8298

[2] (on the mirrored interface, watching traffic between rhino:Hu12/0/1 and 
hippo:Hu12/0/1) tcpdump -ni enp5s0f3 port 6081
11:20:55.802406 IP 10.0.0.0.32299 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.45.1025 > 48.0.0.45.12: UDP, length 18
11:20:55.802409 IP 10.0.0.1.44011 > 10.0.0.0.6081: Geneve, Flags [none], vni 
0x206a: IP 48.0.0.45.1025 > 16.0.0.45.12: UDP, length 18
11:20:55.807711 IP 10.0.0.1.45503 > 10.0.0.0.6081: Geneve, Flags [none], vni 
0x206a: IP 48.0.0.47.1025 > 16.0.0.47.12: UDP, length 18
11:20:55.807712 IP 10.0.0.0.45532 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.47.1025 > 48.0.0.47.12: UDP, length 18
11:20:55.841494 IP 10.0.0.1.10795 > 10.0.0.0.6081: Geneve, Flags [none], vni 
0x206a: IP 48.0.0.50.1025 > 16.0.0.50.12: UDP, length 18
11:20:55.841495 IP 10.0.0.0.61694 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.50.1025 > 48.0.0.50.12: UDP, length 18
11:20:55.851719 IP 10.0.0.1.47581 > 10.0.0.0.6081: Geneve, Flags [none], vni 
0x206a: IP 48.0.0.48.1025 > 16.0.0.48.12: UDP, length 18
11:20:55.851719 IP 10.0.0.0.52458 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.48.1025 > 48.0.0.48.12: UDP, length 18
11:20:55.851772 IP 10.0.0.0.12360 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.52.1025 > 48.0.0.52.12: UDP, length 18
11:20:55.855768 IP 10.0.0.1.39531 > 10.0.0.0.6081: Geneve, Flags [none], vni 
0x206a: IP 48.0.0.52.1025 > 16.0.0.52.12: UDP, length 18
11:20:55.856296 IP 10.0.0.0.28635 > 10.0.0.1.6081: Geneve, Flags [none], vni 
0x206a: IP 16.0.0.51.1025 > 48.0.0.51.12: UDP, length 18

On Fri, Jan 14, 2022 at 10:40 AM Neale Ranns <ne...@graphiant.com> wrote:
Hi Pim,

For VXLAN the intention is to use random source ports. The code you cite 
builds the ‘static’ portion of the imposed header. The source ports are 
overwritten with the hash of the encapped packet in encap.c:246.
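
As a rough sketch of that two-step pattern (a static header template built 
once per tunnel, then a per-packet overwrite of the source port), something 
like the following would do. The names udp_hdr_t, fold_hash16 and 
apply_rewrite are made up for illustration, and folding the hash with XOR is 
an assumption of this example; it is not a copy of what encap.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified UDP header layout for illustration only. */
typedef struct
{
  uint16_t src_port;
  uint16_t dst_port;
  uint16_t length;
  uint16_t checksum;
} udp_hdr_t;

/* Fold a 32-bit flow hash into 16 bits by XOR-ing its two halves. */
static inline uint16_t
fold_hash16 (uint32_t h)
{
  return (uint16_t) (h ^ (h >> 16));
}

/* Copy the static per-tunnel template, then overwrite only the source
 * port with a value derived from the inner packet's flow hash. The hash
 * is arbitrary, so no byte swap is applied: any 16-bit value spreads
 * flows equally well. */
static inline void
apply_rewrite (udp_hdr_t *out, const udp_hdr_t *tmpl,
               uint32_t inner_flow_hash)
{
  assert (out != NULL && tmpl != NULL);
  memcpy (out, tmpl, sizeof (*out));
  out->src_port = fold_hash16 (inner_flow_hash);
}
```

Packets of the same inner flow hash to the same value, so they keep the same 
outer source port and stay in order, while different inner flows spread out.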

/neale


From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Pim van Pelt via 
lists.fd.io <pim=ipng...@lists.fd.io>
Date: Thursday, 13 January 2022 at 23:37
To: vpp-dev <vpp-dev@lists.fd.io>
Subject: [vpp-dev] VXLAN and RSS
Hi folks,

I did a deep dive today on VXLAN, GENEVE and compared it to GRE and L2XC - the 
full read is here:
https://ipng.ch/s/articles/2022/01/13/vpp-l2.html

One thing that I observed is that both VXLAN and GENEVE use static source 
ports. In the case of VLLs (an l2 xconnect from a customer ethernet interface 
into a tunnel), the customer port will be receiving IPv4 or IPv6 traffic 
(either tagged or untagged) and this allows the NIC to use RSS to assign this 
inbound traffic to multiple queues, and thus multiple CPU threads. That’s 
great, it means linear encapsulation performance.
However, once the traffic is encapsulated, it’ll become a single flow with 
respect to the remote host, i.e. we're sending from 10.0.0.0:4789 to the remote 
10.0.0.1:4789, and it is for this reason that all decapsulation is single 
threaded.
One common approach is to use an ingress hash algorithm to choose from a pool 
of source ports, or possibly a simpler round-robin over a pool of ports 
4000-5000, say, based on the inner payload. That way, the remote would be able 
to use multiple RSS queues. However, VPP currently does not implement that.
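
A minimal sketch of the hash-into-a-pool idea, assuming a 32-bit flow hash of 
the inner payload is already available. The macro names, the helper 
pick_src_port and the 4000-5000 bounds are hypothetical and only mirror the 
example range mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool bounds, taken from the "4000-5000, say" example. */
#define SRC_PORT_MIN 4000u
#define SRC_PORT_MAX 5000u

/* Map a 32-bit hash of the inner packet onto the source-port pool.
 * The same inner flow always yields the same port, which preserves
 * per-flow ordering while spreading distinct flows over the pool. */
static inline uint16_t
pick_src_port (uint32_t inner_flow_hash)
{
  const uint32_t pool_size = SRC_PORT_MAX - SRC_PORT_MIN + 1;
  assert (pool_size > 0 && SRC_PORT_MAX <= 65535u);
  return (uint16_t) (SRC_PORT_MIN + inner_flow_hash % pool_size);
}
```

A hash is preferable to plain round-robin here because round-robin would 
scatter packets of one inner flow across several receive queues and could 
reorder them.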

I think the original author has this in mind as a future improvement based on 
the comment on L295 in vxlan.c
  /* UDP header, randomize src port on something, maybe? */
  udp->src_port = clib_host_to_net_u16 (t->src_port);
  udp->dst_port = clib_host_to_net_u16 (t->dst_port);

What would it take for src_port to not be static? It would greatly improve 
VXLAN (and similarly, GENEVE) throughput on ingress.

Regards,
Pim
--
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/

View/Reply Online (#20716): https://lists.fd.io/g/vpp-dev/message/20716