Hello list,

I am doing some performance tests in preparation for upgrading Open vSwitch from 1.11.0 to 2.3.2. However, with TCP_CRR I can only achieve about 130k tps (last time I got only 40k because I was running a .debug kernel), nowhere near the 680k tps reported in the blog post [0]. I also found other reports [1, 2], but those results were even worse and not consistent with each other.
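For concreteness, each individual measurement is a plain netperf/netserver TCP_CRR run, roughly like the following (a minimal sketch only; the exact options are in the scripts in [3], and the address, port and CPU numbers here are just placeholders):

    # on Host B: a netserver instance pinned to a core, listening on its own port
    # (core 2 and port 12865 are placeholders)
    taskset -c 2 netserver -p 12865

    # on Host A: a netperf instance pinned to a core, run inside one of the net
    # namespaces (ns1, 10.0.0.2 and the port are placeholders)
    ip netns exec ns1 taskset -c 2 netperf -H 10.0.0.2 -p 12865 -t TCP_CRR -l 60

The transaction rates reported by the individual netperf instances are what I sum up below.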
I guess differences in the test environment/method are the main cause, and I would like to know what other optimizations might make 680k tps possible. I cleaned up my test scripts a bit and posted them on GitHub [3]. Below is a summary of the method used in my tests, in case you can shed some light on it or share your own test setup for reference. There are a lot of details, so sorry that it's going to be a little verbose :)

I have 2 physical hosts, each with the following configuration:

 - 2 Intel Xeon E5-2650 v2 @ 2.60GHz (2 sockets)
 - 256 (16 * 16) GB memory
 - Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 03)
 - CentOS 6.6 with kernel 2.6.32-431.20.3.el6.xxxx.x86_64
 - Open vSwitch 2.3.2
 - Netperf 2.6.0

On both hosts:

 - HyperThreading was disabled with the kernel parameter maxcpus=16
 - the interrupts of the NIC's 16 queues were distributed across all online cores with set_irq_affinity.sh
 - rx/tx checksum offload, sg and the other offload features were all turned off
 - rx-flow-hash for tcp4 was set to sdfn
 - tcp_tw_recycle and tcp_tw_reuse were enabled
 - connection tracking was disabled with iptables for all flows involved in the tests

Then on Host A (see the P.S. at the end for a rough sketch of the setup commands):

 - an OVS bridge ovsbr was created
 - 127 OVS internal ports were created and placed into 32 different net namespaces
 - those 127 internal ports were also members of the bridge ovsbr
 - no special flow rules were added to the bridge; it is just a plain bridge with the default settings

Then:

 - addresses and routes were configured
 - 1, 16, 32 or 127 netserver instances were started on Host B, each listening on a different port
 - 1, 16, 32 or 127 netperf instances were started inside the net namespaces created earlier, each connecting to one netserver instance
 - each netserver and netperf instance was bound to a core with taskset

The results from the netperf instances were then collected and summed, and they never came close to 680k tps.

Tests were conducted on a Linux bridge with veth, and on Open vSwitch 1.11.0 and 2.3.2 with internal ports. The upper limits attained in my TCP_CRR tests were as follows (details can be found in [4]):

    Linux bridge           110k
    Open vSwitch 1.11.0     19k
    Open vSwitch 2.3.2     130k

Regards

[0] Accelerating Open vSwitch to “Ludicrous Speed”,
    http://networkheresy.com/2014/11/13/accelerating-open-vswitch-to-ludicrous-speed/
[1] Reports for netperf's TCP_CRR test (i.e. TCP accept() performance),
    https://gist.github.com/cgbystrom/985475
[2] [ovs-dev] [TCP_CRR 0/6] improve TCP_CRR performance by 87%, for Open vSwitch 1.3,
    http://openvswitch.org/pipermail/dev/2011-November/013183.html
[3] brtest, scripts mainly for testing bridge implementations (Linux bridge and Open vSwitch),
    https://github.com/yousong/brtest
[4] https://github.com/yousong/brtest/blob/master/out/yousong-X540-AT2.md

yousong
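P.S. In case it helps when reading the scripts in [3], below is a rough sketch of the host tuning and of the per-port setup on Host A described above. It is reduced to a single internal port/namespace; eth0, p1, ns1 and the address are only placeholders, the real loops and values are in the scripts, and attaching the uplink NIC and configuring routes is omitted here.

    # on both hosts: spread the NIC's queue IRQs over all cores, turn off offloads,
    # set rx-flow-hash of tcp4 to sdfn, enable tcp_tw_recycle/tcp_tw_reuse and skip
    # conntrack for the test traffic (eth0 is a placeholder for the X540-AT2 port)
    ./set_irq_affinity.sh eth0
    ethtool -K eth0 rx off tx off sg off tso off gso off gro off
    ethtool -N eth0 rx-flow-hash tcp4 sdfn
    sysctl -w net.ipv4.tcp_tw_recycle=1 net.ipv4.tcp_tw_reuse=1
    iptables -t raw -A PREROUTING -j NOTRACK
    iptables -t raw -A OUTPUT -j NOTRACK

    # on Host A: the bridge, plus one internal port moved into its own namespace
    # (p1, ns1 and 10.0.0.11/16 are placeholders)
    ovs-vsctl add-br ovsbr
    ovs-vsctl add-port ovsbr p1 -- set Interface p1 type=internal
    ip netns add ns1
    ip link set p1 netns ns1
    ip netns exec ns1 ip link set lo up
    ip netns exec ns1 ip link set p1 up
    ip netns exec ns1 ip addr add 10.0.0.11/16 dev p1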