Hello, list

I am doing some performance tests in preparation for upgrading
Open vSwitch from 1.11.0 to 2.3.2.  However, with TCP_CRR, I can only
achieve about 130k tps (last time I got only 40k because of a debug
(.debug) kernel), not even close to the 680k reported in the blog post
[0].  I also found other reports [1, 2], but those results were even
worse and not consistent with each other.

I suspect differences in the test environment/method are the main
cause, and I would like to know what other optimizations could make
680k tps achievable.

I cleaned up my test scripts a bit and posted them on GitHub [3].
Below I summarize the method used in my tests, in case you can shed
some light on it or share your own test setup for reference.  There
are a lot of details, so sorry that this is going to be a little
verbose :)

I have 2 physical hosts with the following configuration

 - 2 Intel Xeon E5-2650 v2 @ 2.60GHz on 2 sockets
 - 256 (16 * 16) GB memory
 - Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 03)
 - CentOS 6.6 with 2.6.32-431.20.3.el6.xxxx.x86_64
 - Open vSwitch 2.3.2
 - Netperf 2.6.0

On both hosts,

 - HyperThreading was disabled with the kernel parameter maxcpus=16
 - Interrupts of the NIC's 16 queues were distributed to all online
cores with set_irq_affinity.sh
 - rx, tx, sg and other offload features were all turned off
 - rx-flow-hash of tcp4 was set to sdfn
 - tcp_tw_recycle and tcp_tw_reuse were enabled
 - connection tracking was disabled with iptables for all flows
involved in the tests (see the command sketch after this list)
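
Roughly, the per-host tuning boiled down to commands like these (a
sketch only; eth0 stands for the X540-AT2 interface, and the exact
invocations are in the scripts at [3]):

    # maxcpus=16 is a boot parameter on the kernel command line, not set here
    ./set_irq_affinity.sh eth0                # spread the 16 queue IRQs over all cores
    ethtool -K eth0 rx off tx off sg off tso off gso off gro off
    ethtool -N eth0 rx-flow-hash tcp4 sdfn    # hash tcp4 flows on src/dst IP and ports
    sysctl -w net.ipv4.tcp_tw_recycle=1
    sysctl -w net.ipv4.tcp_tw_reuse=1
    iptables -t raw -A PREROUTING -j NOTRACK  # skip conntrack for the test traffic
    iptables -t raw -A OUTPUT -j NOTRACK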

Then on Host A

 - OVS bridge ovsbr was created
 - 127 OVS internal ports were created, placed into 32 different net namespaces
 - Those 127 internal ports were also added as members of the bridge ovsbr
 - No special flow rules were added to the bridge.  It's just a plain
bridge with the default settings (see the command sketch after this list)
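
The bridge and port setup was essentially the following (a sketch;
port and namespace names like p1/ns0 are illustrative, the real loop
is in the scripts at [3]):

    ovs-vsctl add-br ovsbr
    for i in $(seq 1 127); do
        ns="ns$(( (i - 1) % 32 ))"            # 127 ports spread over 32 namespaces
        ip netns add "$ns" 2>/dev/null || true
        ovs-vsctl add-port ovsbr "p$i" -- set Interface "p$i" type=internal
        ip link set "p$i" netns "$ns"
        ip netns exec "$ns" ip link set "p$i" up
    done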

Then

 - addresses and routes were configured
 - 1, 16, 32, or 127 netserver instances were started on Host B, each
listening on a different port
 - 1, 16, 32, or 127 netperf instances were started within the net
namespaces created earlier, each connected to a netserver instance
 - Each netserver and netperf instance was bound to a core with
taskset (see the command sketch after this list)
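
One round with N concurrent pairs looked roughly like this (a sketch;
the control ports, core numbers, output paths and Host B's address
10.0.0.2 are placeholders, see [3] for the real scripts):

    N=32
    # Host B: one netserver per control port, each pinned to a core
    for i in $(seq 1 "$N"); do
        taskset -c $(( (i - 1) % 16 )) netserver -p $(( 12000 + i ))
    done
    # Host A: one netperf per namespace, pinned to a core, each
    # targeting its own netserver instance on Host B
    mkdir -p out
    for i in $(seq 1 "$N"); do
        ip netns exec "ns$(( (i - 1) % 32 ))" \
            taskset -c $(( (i - 1) % 16 )) \
            netperf -H 10.0.0.2 -p $(( 12000 + i )) -t TCP_CRR -l 60 \
            > "out/netperf.$i.txt" &
    done
    wait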

Then the results from the netperf instances were collected and summed,
and the total never came close to 680k tps.
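
The summing was just a matter of picking the transaction rate out of
each netperf output file, e.g. (a sketch, assuming the out/netperf.*.txt
paths from the previous sketch and netperf's classic TCP_CRR output,
where the result line is the only one with six numeric fields):

    awk '$1 ~ /^[0-9]/ && NF == 6 { sum += $6 }
         END { printf "total: %.0f tps\n", sum }' out/netperf.*.txt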

Tests were conducted on a Linux bridge with veth, and on Open vSwitch
1.11.0 and 2.3.2 with internal ports.  The upper limits attained in my
TCP_CRR tests were as follows, in tps (details can be found in link [4]):

    Linux bridge                  110k
    Open vSwitch 1.11.0            19k
    Open vSwitch 2.3.2            130k

Regards

 [0] Accelerating Open vSwitch to “Ludicrous Speed”,
http://networkheresy.com/2014/11/13/accelerating-open-vswitch-to-ludicrous-speed/
 [1] Reports for netperf's TCP_CRR test (i.e. TCP accept()
performance), https://gist.github.com/cgbystrom/985475
 [2] `[ovs-dev] [TCP_CRR 0/6] improve TCP_CRR performance by 87%`, for
Open vSwitch 1.3,
http://openvswitch.org/pipermail/dev/2011-November/013183.html
 [3] brtest, Scripts mainly for testing bridge implementations (Linux
bridge and Open vSwitch), https://github.com/yousong/brtest
 [4] https://github.com/yousong/brtest/blob/master/out/yousong-X540-AT2.md


                yousong