We have evaluated the performance of L3-VPN termination traffic (Phy --> VM) with MPLSoGRE encapsulation, with and without recirculation, to quantify the performance impact of the recent change mentioned above.

OVS versions tested:
• branch-2.5 (74f1fc4d25fa8)
• master (87530bc1c1c)
• master (87530bc1c1c) plus reverted commit 8bf009bf8ab (always recirculate after pop_mpls)

All tests were performed on a dual-socket Xeon E5-2680 v3 @ 2.50GHz. The system under test was OVS with the DPDK netdev datapath running on a single CPU core. The VPN traffic was generated on an external server and injected, already encapsulated, through an Intel 82599 10G NIC. The traffic was terminated by DPDK testpmd in rxonly mode on a 4-core VM with 4 vhost-user ports. NIC, OVS and VM were all on the same CPU socket.

With the crucial commit 8bf009bf8ab (always recirculate after pop_mpls), the performance of MPLSoGRE decapsulation and forwarding drops to 1.4 Mpps, a reduction of 35% compared to current master (2.2 Mpps) and 39% compared to branch-2.5 (2.3 Mpps).

In our view, a performance drop of 35% for this highly relevant cloud use case should be sufficient reason to reconsider the decision to simplify the code for recirculation after pop_mpls.
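
As a rough cross-check of the percentages quoted above, the following Python sketch recomputes the relative throughput loss from three of the 64-byte rows of the "Detailed measurements" table in the appendix. The names `MEASURED_PPS`, `reduction_percent` and `summarize` are hypothetical helpers chosen for illustration; they are not part of OVS or of our test tooling.

```python
# Throughput in packets per second, copied from the "Detailed measurements"
# table (64-byte packets), keyed by the number of parallel L4 flows.
# Tuple order: (branch-2.5, master with recirc, master without recirc).
MEASURED_PPS = {
    8: (2377324, 1428012, 2205425),
    1000: (2310414, 1470200, 2121747),
    500000: (2301091, 1401154, 2191662),
}


def reduction_percent(baseline_pps, measured_pps):
    """Relative throughput loss of measured_pps against baseline_pps, in percent."""
    return 100.0 * (1.0 - measured_pps / baseline_pps)


def summarize(rows):
    """Return {flows: (loss vs master without recirc, loss vs branch-2.5)}."""
    return {
        flows: (
            reduction_percent(no_recirc, recirc),
            reduction_percent(branch_25, recirc),
        )
        for flows, (branch_25, recirc, no_recirc) in rows.items()
    }


if __name__ == "__main__":
    for flows, (vs_master, vs_25) in summarize(MEASURED_PPS).items():
        print(f"{flows:>7} flows: -{vs_master:.1f}% vs master (no recirc), "
              f"-{vs_25:.1f}% vs branch-2.5")
```

The loss varies somewhat from row to row, so the figures in the text above should be read as approximate summaries of the table rather than exact values.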

Best regards,
Jan

Appendix:

The details of the measured DPIF datapath:

root@dl380-668:~# appctl-25 dpif/show
netdev@ovs-netdev: hit:38333441319 missed:3547
    br_int:
        br_int 65534/13: (tap)
        gre0 101/8: (gre: key=flow, remote_ip=10.1.2.9)
        vhost811 11/1: (dpdkvhostuser: configured_rx_queues=1, configured_tx_queues=1, requested_tx_queues=49)
        vhost812 12/6: (dpdkvhostuser: configured_rx_queues=1, configured_tx_queues=1, requested_tx_queues=49)
        vhost813 13/9: (dpdkvhostuser: configured_rx_queues=1, configured_tx_queues=1, requested_tx_queues=49)
        vhost814 14/7: (dpdkvhostuser: configured_rx_queues=1, configured_tx_queues=1, requested_tx_queues=49)
        vhost815 15/10: (dpdkvhostuser: configured_rx_queues=1, configured_tx_queues=1, requested_tx_queues=49)
    br_phy:
        br_phy 65534/15: (tap)
        dpdk0 1/14: (dpdk: configured_rx_queues=1, configured_tx_queues=49, requested_tx_queues=49)

MPLS/GRE encap without recirc:

root@dl380-668:~# appctl-25 dpif/dump-flows br_phy; appctl-25 dpif/dump-flows br_int
recirc_id(0),in_port(14),eth(src=8c:dc:d4:ab:58:48,dst=8c:dc:d4:ab:5b:f0),eth_type(0x0800),ipv4(dst=10.1.2.8,proto=47,frag=no), packets:329751228, bytes:34953630168, used:0.000s, actions:tnl_pop(8)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,ttl=64,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth(dst=52:54:00:a0:91:02),eth_type(0x8847),mpls(label=813/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:71863832, bytes:4599285248, used:0.000s, actions:set(eth(dst=52:54:00:a0:81:04)),pop_mpls(eth_type=0x800),9
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,ttl=64,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth(dst=52:54:00:a0:91:03),eth_type(0x8847),mpls(label=812/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:10595417, bytes:678106688, used:0.001s, actions:set(eth(dst=52:54:00:a0:81:03)),pop_mpls(eth_type=0x800),6
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,ttl=64,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth(dst=52:54:00:a0:91:04),eth_type(0x8847),mpls(label=815/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:10465873, bytes:669815872, used:0.000s, actions:set(eth(dst=52:54:00:a0:81:06)),pop_mpls(eth_type=0x800),10
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,ttl=64,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth(dst=52:54:00:a0:91:05),eth_type(0x8847),mpls(label=814/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:72711248, bytes:4653519872, used:0.000s, actions:set(eth(dst=52:54:00:a0:81:05)),pop_mpls(eth_type=0x800),7

MPLS/GRE encap with recirc:

root@dl380-668:~# appctl-25 dpif/dump-flows br_phy; appctl-25 dpif/dump-flows br_int
recirc_id(0),in_port(14),eth(src=8c:dc:d4:ab:58:48,dst=8c:dc:d4:ab:5b:f0),eth_type(0x0800),ipv4(dst=10.1.2.8,proto=47,frag=no), packets:584293179, bytes:61935076974, used:0.000s, actions:tnl_pop(8)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth_type(0x8847),mpls(label=812/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:35344994, bytes:2262079616, used:0.000s, actions:pop_mpls(eth_type=0x800),recirc(0x8)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth_type(0x8847),mpls(label=813/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:164328240, bytes:10517007360, used:0.000s, actions:pop_mpls(eth_type=0x800),recirc(0x12)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth_type(0x8847),mpls(label=814/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:166271192, bytes:10641356288, used:0.000s, actions:pop_mpls(eth_type=0x800),recirc(0x14)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0),in_port(8),eth_type(0x8847),mpls(label=815/0xfffff,tc=0/0,ttl=4/0x0,bos=1/1), packets:32353131, bytes:2070600384, used:0.000s, actions:pop_mpls(eth_type=0x800),recirc(0x16)
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0x8),in_port(8),eth(dst=52:54:00:a0:91:03),eth_type(0x0800),ipv4(frag=no), packets:34268739, bytes:2056124340, used:0.000s, flags:., actions:set(eth(dst=52:54:00:a0:81:03)),6
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0x12),in_port(8),eth(dst=52:54:00:a0:91:02),eth_type(0x0800),ipv4(frag=no), packets:162167251, bytes:9730035060, used:0.000s, flags:., actions:set(eth(dst=52:54:00:a0:81:04)),9
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0x14),in_port(8),eth(dst=52:54:00:a0:91:05),eth_type(0x0800),ipv4(frag=no), packets:164069211, bytes:9844152660, used:0.000s, flags:., actions:set(eth(dst=52:54:00:a0:81:05)),7
tunnel(tun_id=0x0,src=10.1.2.9,dst=10.1.2.8,flags(-df-csum+key)),skb_mark(0),recirc_id(0x16),in_port(8),eth(dst=52:54:00:a0:91:04),eth_type(0x0800),ipv4(frag=no), packets:31327117, bytes:1879627020, used:0.000s, flags:., actions:set(eth(dst=52:54:00:a0:81:06)),10

Detailed measurements (packets per second):

Packet size  # parallel L4 flows  branch-2.5  master (recirc)  master (no recirc)
64           8                    2377324     1428012          2205425
64           100                  2358810     1387175          2209671
64           1000                 2310414     1470200          2121747
64           5000                 2291150     1443560          2096412
64           10000                2299286     1409919          2177457
64           20000                2298938     1406937          2164322
64           30000                2303101     1399930          2182350
64           50000                2304269     1399208          2187825
64           100000               2303166     1401328          2187800
64           500000               2301091     1401154          2191662
300          8                    2279299     1446896          2159220
300          100                  2273545     1394192          2150617
300          1000                 2219312     1414551          2044742
300          5000                 2267291     1387401          2144859
300          10000                2265424     1394049          2159700
300          20000                2268109     1398392          2169303
300          30000                2266996     1399446          2173401
300          50000                2264847     1399667          2170199
300          100000               2266520     1401113          2168330
300          500000               2267000     1401134          2170788

> -----Original Message-----
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Jan Scheurich
>
> I would like to confirm that
> Thomas's use case is key also for the BGP/MPLS VPN implementation in
> OpenDaylight. In our case OVS is forwarding MPLSoGRE-encapsulated L3 VPN
> traffic from physical ports as IP packets to VMs on vhost-user ports.
>
> Even without the additional recirculation after pop_mpls, the throughput of
> the DPDK datapath in OVS 2.5 for incoming traffic from the GRE tunnel is less
> than half of the performance for outgoing traffic to the GRE tunnel. The
> additional, unnecessary recirculation will shift that imbalance further. We
> would be very happy to help find a satisfactory way of implementing logic
> to insert recirculation only when needed.
> _______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev