Hi Justin, Oliver,
So I switched to the newer version around 11 hours ago. Here are some
observations:
1. The number of flows has come down to a couple of thousand (from
12-15k). However, we would like to let the setup run for one whole day,
through both peak and lean times, before counting the flows again.
2. A smaller percentage of the flows now have low packet counts. I have
attached a dump for reference.
3. The CPU usage is still around the same, which means we still see
misses in the kernel flow table.
$ sudo ovs-dpctl dump-flows br0 | grep -e "packets:[0123]," | wc -l
764
$ sudo ovs-dpctl show
system@br0:
lookups: hit:117426873 missed:87741549 lost:0
flows: 2145
port 0: br0 (internal)
port 1: eth3
port 2: eth4
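For context: 764 of the 2145 flows (about 36%) now have fewer than four
packets, versus roughly 85% (10143 of ~12000) before the switch, and the
counters above work out to a miss rate of
87741549 / (117426873 + 87741549), i.e. about 43%.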
- Kaushal
On Tue, Jun 5, 2012 at 12:49 PM, Kaushal Shubhank <kshubh...@gmail.com> wrote:
We will certainly try the 1.7.0 version. Since this is a
production setup, we will only be able to try it during off-peak
hours. We will update you with the results as soon as possible.
Thanks a lot; we look forward to contributing to the project in
any way possible.
Kaushal
On Tue, Jun 5, 2012 at 12:36 PM, Justin Pettit <jpet...@nicira.com> wrote:
Of your nearly 12,000 flows, over 10,000 had fewer than four
packets:
[jpettit@timber-2 Desktop] grep -e "packets:[0123]," live_flows_20120604 | wc -l
10143
Short-lived flows are really difficult for OVS, since there's
a lot of overhead in setting up and maintaining the kernel
flow table. We made *substantial* improvements for handling
just this scenario in the forthcoming 1.7.0 release. The code
should be stable, but it hasn't gone through a full QA
regression. However, if you're willing to give it a shot, you
can download a snapshot of the tip of the 1.7 branch:
http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=snapshot;h=04a67c083458784d1fed689bcb7ed904026d2352;sf=tgz
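In case it's useful, building from a git snapshot generally looks
like the following; the tarball name below is a placeholder, and the
--with-linux path assumes your Ubuntu kernel headers (remember to
reinstall and reload the kernel module afterward):
$ tar xzf openvswitch-snapshot.tar.gz && cd openvswitch-*
$ ./boot.sh
$ ./configure --with-linux=/lib/modules/`uname -r`/build
$ make && sudo make install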
We've only been able to test it with generated traffic, so
seeing how much it improves performance with real traffic
would be invaluable. If you're able to give it a try and let
us know, we'd really appreciate it.
--Justin
On Jun 4, 2012, at 11:39 PM, Kaushal Shubhank wrote:
> Hi Justin,
>
> This is how the connections are made, so I guess eth3 and
eth4 are not in the same network segment:
> Router--->eth4==eth3--->switch
>
> We tried with an eviction threshold of 10000, but saw high
packet losses. I am pasting a few kernel flows (ovs-dpctl
dump-flows) here, and attaching the whole dump (11k flows). I
don't see any pattern. About 800 of the 11k flows matched the
port-80 filtering rules, which means the rest were just
non-port-80 packets that we simply forward from eth3 to eth4 or
vice versa (a quick way to count them is sketched after the dump
below).
>
> If there is any way to reduce those (11k - 800) flows, we
could reduce CPU usage.
>
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.195,dst=1.2.138.199,proto=17,tos=0,ttl=127,frag=no),udp(src=62294,dst=16464), packets:1, bytes:60, used:3.170s, actions:2
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=94.194.158.115,dst=110.172.18.250,proto=6,tos=0,ttl=22,frag=no),tcp(src=62760,dst=47868), packets:0, bytes:0, used:never, actions:1
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.134,dst=209.85.148.139,proto=6,tos=0,ttl=126,frag=no),tcp(src=64741,dst=80), packets:1, bytes:60, used:2.850s, actions:set(eth(src=00:15:17:44:03:6e,dst=00:e0:ed:15:24:4a)),0
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.137,dst=219.90.100.27,proto=6,tos=0,ttl=127,frag=no),tcp(src=49504,dst=12758), packets:67603, bytes:4060369, used:0.360s, actions:2
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=189.63.179.72,dst=203.188.231.195,proto=17,tos=0,ttl=110,frag=no),udp(src=60414,dst=16464), packets:1, bytes:60, used:0.620s, actions:1
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=213.57.230.226,dst=110.172.18.8,proto=17,tos=0,ttl=101,frag=no),udp(src=59274,dst=24844), packets:0, bytes:0, used:never, actions:1
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=195.35.128.105,dst=110.172.18.250,proto=6,tos=0,ttl=15,frag=no),tcp(src=54303,dst=47868), packets:3, bytes:222, used:5.300s, actions:2
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.154,dst=76.186.139.105,proto=6,tos=0,ttl=126,frag=no),tcp(src=10369,dst=61585), packets:1, bytes:60, used:0.290s, actions:2
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=78.92.118.9,dst=110.172.18.80,proto=17,tos=0,ttl=23,frag=no),udp(src=44779,dst=59357), packets:0, bytes:0, used:never, actions:2
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=89.216.130.134,dst=203.188.231.206,proto=17,tos=0,ttl=33,frag=no),udp(src=52342,dst=30291), packets:0, bytes:0, used:never, actions:1
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=76.226.72.157,dst=110.172.18.250,proto=6,tos=0,ttl=36,frag=no),tcp(src=46637,dst=47868), packets:2, bytes:148, used:2.730s, actions:1
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=89.211.162.95,dst=110.172.18.80,proto=17,tos=0,ttl=92,frag=no),udp(src=19442,dst=59357), packets:0, bytes:0, used:never, actions:2
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=86.179.231.157,dst=110.172.18.11,proto=17,tos=0,ttl=109,frag=no),udp(src=58240,dst=23813), packets:7, bytes:1181, used:1.700s, actions:1
>
> in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=72.201.71.66,dst=203.188.231.195,proto=17,tos=0,ttl=115,frag=no),udp(src=1025,dst=16464), packets:1, bytes:60, used:2.620s, actions:1
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=95.165.107.21,dst=110.172.18.80,proto=17,tos=0,ttl=96,frag=no),udp(src=49400,dst=59357), packets:1, bytes:72, used:3.360s, actions:2
>
> in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.203,dst=212.96.161.246,proto=6,tos=0,ttl=127,frag=no),tcp(src=49172,dst=80), packets:2, bytes:735, used:0.240s, actions:set(eth(src=00:15:17:44:03:6e,dst=00:e0:ed:15:24:4a)),0
>
> in_port(0),eth(src=00:e0:ed:15:24:4a,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.54,dst=111.119.15.31,proto=6,tos=0,ttl=64,frag=no),tcp(src=47463,dst=80), packets:6, bytes:928, used:4.440s, actions:2
>
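> As a rough way to put numbers on that split (assuming the dump
format above, where only the port-80 flows carry a set() action, and
using "flows_dump" as a placeholder for the attached file):
>
> $ grep -c "actions:set" flows_dump
>
> compared against "wc -l flows_dump" for the total.
>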
> Thanks,
> Kaushal
>
> On Tue, Jun 5, 2012 at 11:29 AM, Justin Pettit <jpet...@nicira.com> wrote:
> Are eth3 and eth4 on the same network segment? If so, I'd
guess you've introduced a loop.
>
> I wouldn't recommend setting your eviction threshold so
high, since OVS is going to have to do a lot of work to
maintain that many kernel flows. I wouldn't go above tens of
thousands of flows. What do your kernel flows look like? You
have too many to post here, but maybe you can provide a
sampling of a couple hundred. Do you see any patterns?
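>
> (For reference, the threshold is set with something along these
lines; the exact key name is worth double-checking against the
documentation for your version:
>
> $ ovs-vsctl set bridge br0 other_config:flow-eviction-threshold=10000
>
> and "sudo ovs-dpctl dump-flows br0 | head -200" would give such a
sampling.)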
>
> --Justin
>
>
> On Jun 4, 2012, at 10:40 PM, Kaushal Shubhank wrote:
>
> > Hello,
> >
> > We have a simple setup in which a server running a
transparent proxy needs to intercept HTTP port-80 traffic. We
have installed Open vSwitch (1.4.1) on the same server (running
Ubuntu natty, 2.6.38-12-server, 64-bit) to feed the proxy with
the corresponding type of packets while bridging all other
types of packets. The functionality is working properly, but
the CPU usage is quite high (~30% for 20 Mbps of traffic). The
total load we need to deploy under is around 350 Mbps, and as
soon as we plug in, the CPU usage shoots up to 100% (on a quad-core
Intel(R) Xeon(R) CPU E5420 @ 2.50GHz), even when we simply
allow all packets to flow through br0. Packet loss also
starts to occur.
> >
> > After reading similar discussions in previous threads, I
made my bridge stp-enabled and increased the
flow-eviction-threshold to "1000000" (the commands used are
sketched below, after the flow dump). The CPU load is still
high due to misses in the kernel flow table. I have defined only
the following flows:
> >
> > $ ovs-ofctl dump-flows br0
> >
> > NXST_FLOW reply (xid=0x4):
> > cookie=0x0, duration=80105.621s, table=0, n_packets=61978784, n_bytes=7438892513, priority=100,tcp,in_port=1,tp_dst=80 actions=mod_dl_dst:00:e0:ed:15:24:4a,LOCAL
> > cookie=0x0, duration=80105.501s, table=0, n_packets=49343241, n_bytes=113922939324, priority=100,tcp,dl_src=00:e0:ed:15:24:4a,tp_src=80 actions=output:1
> > cookie=0x0, duration=518332.577s, table=0, n_packets=3052099665, n_bytes=2041603012562, priority=0 actions=NORMAL
> > cookie=0x0, duration=80105.586s, table=0, n_packets=46209782, n_bytes=109671221356, priority=100,tcp,in_port=2,tp_src=80 actions=mod_dl_dst:00:e0:ed:15:24:4a,LOCAL
> > cookie=0x0, duration=80105.601s, table=0, n_packets=40389137, n_bytes=5660094662, priority=100,tcp,dl_src=00:e0:ed:15:24:4a,tp_dst=80 actions=output:2
> >
> > where 00:e0:ed:15:24:4a is br0's MAC address
> >
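> > (For completeness, the stp and eviction settings mentioned
earlier were applied with roughly the following commands; exact
syntax may vary by version:
> >
> > $ ovs-vsctl set bridge br0 stp_enable=true
> > $ ovs-vsctl set bridge br0 other_config:flow-eviction-threshold=1000000 )
> >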
> > $ ovs-dpctl show
> >
> > system@br0:
> > lookups: hit:3105457869 missed:792488043 lost:903955 (the lost packets occur under the 350 Mbps load and do not change at 20 Mbps)
> > flows: 12251
> > port 0: br0 (internal)
> > port 1: eth3
> > port 2: eth4
> >
> > As far as we can tell, the missed packets here cause a
context switch to userspace and increase CPU usage. Let
me know if any other detail about the setup is required.
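> >
> > (That is roughly 792488043 / (3105457869 + 792488043) ≈ 20% of
lookups missing the kernel flow table and being handled in
userspace.)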
> >
> > Is there anything else we can do to reduce CPU usage?
> > Can the flows above be improved in some way?
> > Is there any other configuration for deployment in
production that we missed?
> >
> > Regards,
> > Kaushal
>
>
> <flows.tgz>
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss