Re: [ovs-discuss] Packet drops with high rate of Packet_In

Anton Matsiuk Fri, 22 Nov 2013 08:31:24 -0800

>
> Does the controller get any error replies from Open vSwitch?


No. Open vSwitch accepts all 500 rules and installs them without sending
any errors to the controller, but only part of the ingress packets are
processed through them:
ovs-ofctl dump-aggregate br0
NXST_AGGREGATE reply (xid=0x4): packet_count=324 byte_count=20736
flow_count=500
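
As a rough cross-check of these counters, here is a minimal Python sketch (not part of the original test setup) that parses the aggregate reply pasted above. The helper name parse_aggregate is hypothetical, and the shortfall calculation assumes each installed flow should have matched exactly one packet, which holds in this test only because every datagram uses a distinct UDP port.

```python
import re


def parse_aggregate(reply: str) -> dict:
    """Extract the integer counters from an NXST_AGGREGATE reply (hypothetical helper)."""
    pairs = re.findall(r"(packet_count|byte_count|flow_count)=(\d+)", reply)
    return {name: int(value) for name, value in pairs}


# Reply text copied from the ovs-ofctl dump-aggregate output above.
reply = (
    "NXST_AGGREGATE reply (xid=0x4): packet_count=324 byte_count=20736\n"
    "flow_count=500"
)

stats = parse_aggregate(reply)

# Assumption: one packet expected per flow, so the difference is the
# number of installed flows that never matched a packet.
unmatched = stats["flow_count"] - stats["packet_count"]
bytes_per_packet = stats["byte_count"] // stats["packet_count"]

print(unmatched)         # 176
print(bytes_per_packet)  # 64
```

With the reply above this gives 176 flows that never saw a packet, and 64 bytes per counted packet, which agrees with the n_bytes=64 value in the dump-flows output below.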

For some of these 500 rules it shows that a packet was processed:
ovs-ofctl dump-flows br0
cookie=0x0, duration=10.686s, table=0, n_packets=1, n_bytes=64,
hard_timeout=30, idle_age=10, udp,in_port=1,tp_dst=50272
actions=mod_vlan_vid:20,output:2

For others it shows that none was:
cookie=0x0, duration=10.509s, table=0, n_packets=0, n_bytes=0,
hard_timeout=30, idle_age=10, udp,in_port=1,tp_dst=50475
actions=mod_vlan_vid:20,output:2
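
To list which rules stayed at zero without reading 500 entries by eye, something like the following sketch could be run over saved ovs-ofctl dump-flows output. The function name split_hit_and_missed is made up for illustration, and the parsing relies only on the fields visible in the two entries above (cookie, n_packets, tp_dst); other output layouts are not handled.

```python
import re


def split_hit_and_missed(dump: str):
    """Return (hit, missed) lists of tp_dst ports from dump-flows text.

    Hypothetical helper: assumes every flow entry starts with "cookie="
    and carries n_packets= and tp_dst= fields, as in the entries above.
    """
    hit, missed = [], []
    # Collapsing whitespace undoes the line wrapping added by mail clients.
    flat = " ".join(dump.split())
    for entry in flat.split("cookie="):
        packets = re.search(r"n_packets=(\d+)", entry)
        port = re.search(r"tp_dst=(\d+)", entry)
        if not (packets and port):
            continue  # header line or an entry without the expected fields
        target = hit if int(packets.group(1)) > 0 else missed
        target.append(int(port.group(1)))
    return hit, missed


# The two entries quoted above, wrapped exactly as they appear in the mail.
sample = """cookie=0x0, duration=10.686s, table=0, n_packets=1, n_bytes=64,
hard_timeout=30, idle_age=10, udp,in_port=1,tp_dst=50272
actions=mod_vlan_vid:20,output:2
cookie=0x0, duration=10.509s, table=0, n_packets=0, n_bytes=0,
hard_timeout=30, idle_age=10, udp,in_port=1,tp_dst=50475
actions=mod_vlan_vid:20,output:2"""

hit, missed = split_hit_and_missed(sample)
print(hit)     # [50272]
print(missed)  # [50475]
```

Comparing the missed ports with the order in which the datagrams were sent might show whether the unmatched flows cluster in time, for example at the end of the burst.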

ovs-dpctl show -s br0 shows that the kernel datapath sends 500 packets to
user space, 500 packets enter the ingress eth interface, and 324 leave the
egress eth interface.

> What's in the ovs-vswitchd log?  (Not in debug mode, that's too big.)


2013-11-22T15:49:39Z|00001|vlog|INFO|opened log file
/var/log/openvswitch/ovs-vswitchd.log
2013-11-22T15:49:39Z|00002|worker(worker)|INFO|worker process started
2013-11-22T15:49:39Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connecting...
2013-11-22T15:49:39Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connected
2013-11-22T15:49:39Z|00004|bridge|INFO|bridge br0: using datapath ID
0000000000000002
2013-11-22T15:49:39Z|00005|connmgr|INFO|br0: added service controller
"punix:/var/run/openvswitch/br0.mgmt"
2013-11-22T15:49:39Z|00006|connmgr|INFO|br0: added primary controller "tcp:
192.168.168.2:6633"
2013-11-22T15:49:39Z|00007|rconn|INFO|br0<->tcp:192.168.168.2:6633:
connecting...
2013-11-22T15:49:39Z|00008|bridge|INFO|ovs-vswitchd (Open vSwitch) 1.9.3
2013-11-22T15:49:39Z|00009|rconn|INFO|br0<->tcp:192.168.168.2:6633:
connected
2013-11-22T15:49:39Z|00010|ofp_util|INFO|normalization changed ofp_match,
details:
2013-11-22T15:49:39Z|00011|ofp_util|INFO| pre:
nw_src=0.0.0.0,nw_dst=0.0.0.0,nw_proto=0,nw_tos=0,tp_src=0,tp_dst=0
2013-11-22T15:49:39Z|00012|ofp_util|INFO|post:
2013-11-22T15:49:39Z|00013|ofp_util|INFO|normalization changed ofp_match,
details:
2013-11-22T15:49:39Z|00014|ofp_util|INFO| pre:
nw_src=0.0.0.0,nw_dst=0.0.0.0,nw_proto=0,nw_tos=0,tp_src=0,tp_dst=0
2013-11-22T15:49:39Z|00015|ofp_util|INFO|post:
2013-11-22T15:49:49Z|00016|memory|INFO|7040 kB peak resident set size after
10.0 seconds
2013-11-22T15:49:49Z|00017|memory|INFO|ofconns:1 ports:3 rules:3
2013-11-22T15:49:50Z|00018|ofproto|INFO|br0: 2 flow_mods 10 s ago (2
deletes)
2013-11-22T15:50:49Z|00019|ofproto|INFO|br0: 500 flow_mods 25 s ago (500
adds)
2013-11-22T15:51:53Z|00020|ofproto|INFO|br0: 500 flow_mods 10 s ago (500
adds)
2013-11-22T15:52:54Z|00021|ofproto_dpif_governor|INFO|br0: engaging
governor with 16 kB hash table
2013-11-22T15:52:59Z|00022|ofproto_dpif_governor|INFO|br0: disengaging


On 22 November 2013 16:44, Ben Pfaff <b...@nicira.com> wrote:

> Does the controller get any error replies from Open vSwitch?
> What's in the ovs-vswitchd log?  (Not in debug mode, that's too big.)
>
> On Fri, Nov 22, 2013 at 04:15:20PM +0100, Anton Matsiuk wrote:
> > Dear Ben,
> >
> > I figured out that the drops occur inside OVS. I see all packets entering
> > one interface of OVS, a Packet_In generated for every packet, then
> > Flow_Mods (or Packet_Out in other tests) generated and sent for every
> > Packet_In by the external controller, and all these rules are installed in
> > OVS. Namely, 500 Packet_In --> 500 flows in OVS, but only part of the
> > ingress packets are processed through their corresponding flow rules and
> > leave OVS (dump-ports and dump-flows in both the kernel and user-space
> > modules show this).
> > Drops occur only above some threshold of Packet_In per msec, which is why
> > it seems like OVS drops some packets due to buffer overloads (or probably
> > due to expired timeouts for arrived packets).
> >
> > I read the logs up to dbg level, but the only thing I found (in
> > ovs-vswitchd.log) is that the governor periodically expands its hash
> > table in response to the increasing frequency of flow_mods.
> >
> > Is there a possibility to track drops in the internal buffers of OVS, or
> > to debug it somehow?
> >
> > Or does OVS perhaps drop packets after a timeout expires for a Packet_In
> > residing in a buffer? And what is the default value for such a timeout, if
> > any?
> >
> > --
> > Best regards,
> > Anton Matsiuk
> >
> > On 21 November 2013 17:56, Ben Pfaff <b...@nicira.com> wrote:
> >
> > > Please don't drop the mailing list.
> > >
> > > You have begun to narrow down where the drops occur, but it's still not
> > > clear exactly where.  I suggest following the troubleshooting procedure
> > > in the FAQ.
> > >
> > > Q: I have a sophisticated network setup involving Open vSwitch, VMs or
> > >    multiple hosts, and other components.  The behavior isn't what I
> > >    expect.  Help!
> > >
> > > A: To debug network behavior problems, trace the path of a packet,
> > >    hop-by-hop, from its origin in one host to a remote host.  If
> > >    that's correct, then trace the path of the response packet back to
> > >    the origin.
> > >
> > >    Usually a simple ICMP echo request and reply ("ping") packet is
> > >    good enough.  Start by initiating an ongoing "ping" from the origin
> > >    host to a remote host.  If you are tracking down a connectivity
> > >    problem, the "ping" will not display any successful output, but
> > >    packets are still being sent.  (In this case the packets being sent
> > >    are likely ARP rather than ICMP.)
> > >
> > >    Tools available for tracing include the following:
> > >
> > >        - "tcpdump" and "wireshark" for observing hops across network
> > >          devices, such as Open vSwitch internal devices and physical
> > >          wires.
> > >
> > >        - "ovs-appctl dpif/dump-flows <br>" in Open vSwitch 1.10 and
> > >          later or "ovs-dpctl dump-flows <br>" in earlier versions.
> > >          These tools allow one to observe the actions being taken on
> > >          packets in ongoing flows.
> > >
> > >          See ovs-vswitchd(8) for "ovs-appctl dpif/dump-flows"
> > >          documentation, ovs-dpctl(8) for "ovs-dpctl dump-flows"
> > >          documentation, and "Why are there so many different ways to
> > >          dump flows?" above for some background.
> > >
> > >        - "ovs-appctl ofproto/trace" to observe the logic behind how
> > >          ovs-vswitchd treats packets.  See ovs-vswitchd(8) for
> > >          documentation.  You can find out more details about a given flow
> > >          that "ovs-dpctl dump-flows" displays, by cutting and pasting
> > >          a flow from the output into an "ovs-appctl ofproto/trace"
> > >          command.
> > >
> > >        - SPAN, RSPAN, and ERSPAN features of physical switches, to
> > >          observe what goes on at these physical hops.
> > >
> > >    Starting at the origin of a given packet, observe the packet at
> > >    each hop in turn.  For example, in one plausible scenario, you
> > >    might:
> > >
> > >        1. "tcpdump" the "eth" interface through which an ARP egresses
> > >           a VM, from inside the VM.
> > >
> > >        2. "tcpdump" the "vif" or "tap" interface through which the ARP
> > >           ingresses the host machine.
> > >
> > >        3. Use "ovs-dpctl dump-flows" to spot the ARP flow and observe
> > >           the host interface through which the ARP egresses the
> > >           physical machine.  You may need to use "ovs-dpctl show" to
> > >           interpret the port numbers.  If the output seems surprising,
> > >           you can use "ovs-appctl ofproto/trace" to observe details of
> > >           how ovs-vswitchd determined the actions in the "ovs-dpctl
> > >           dump-flows" output.
> > >
> > >        4. "tcpdump" the "eth" interface through which the ARP egresses
> > >           the physical machine.
> > >
> > >        5. "tcpdump" the "eth" interface through which the ARP
> > >           ingresses the physical machine, at the remote host that
> > >           receives the ARP.
> > >
> > >        6. Use "ovs-dpctl dump-flows" to spot the ARP flow on the
> > >           remote host that receives the ARP and observe the VM "vif"
> > >           or "tap" interface to which the flow is directed.  Again,
> > >           "ovs-dpctl show" and "ovs-appctl ofproto/trace" might help.
> > >
> > >        7. "tcpdump" the "vif" or "tap" interface to which the ARP is
> > >           directed.
> > >
> > >        8. "tcpdump" the "eth" interface through which the ARP
> > >           ingresses a VM, from inside the VM.
> > >
> > >    It is likely that during one of these steps you will figure out the
> > >    problem.  If not, then follow the ARP reply back to the origin, in
> > >    reverse.
> > >
> > >
> > > On Thu, Nov 21, 2013 at 04:55:13PM +0100, Anton Matsiuk wrote:
> > > > I requested log files up to debug level, namely:
> > > > ovs-vswitchd.log
> > > > ovs-dpctl.log
> > > > ovs-ofctl.log
> > > > but none of them shows any messages related to packet drops. All the
> > > > statistics show that the correct number of flows was installed and only
> > > > part of the packets were processed.
> > > > That's why I am asking: are there any other possibilities (beyond log
> > > > files) to track packet drops in input buffers and possibly to fix them?
> > > > Or at least, in which direction should I search for a solution?
> > > >
> > > >
> > > > On 20 November 2013 18:13, Ben Pfaff <b...@nicira.com> wrote:
> > > >
> > > > > On Wed, Nov 20, 2013 at 12:35:25PM +0100, Anton Matsiuk wrote:
> > > > > > > I am testing Open vSwitch in the following setup: 2 hosts directly
> > > > > > > connected to OVS and an external OpenFlow controller. Host1
> > > > > > > generates UDP datagrams with sequential ports towards Host2, and
> > > > > > > Host2 listens for these UDP datagrams. In response to every UDP
> > > > > > > datagram OVS generates a Packet_In and the controller sends a
> > > > > > > Flow_Mod back with L4 granularity (so for every pair of UDP port
> > > > > > > numbers it installs a separate flow). I send a bunch of UDP
> > > > > > > datagrams from Host1 and count how many of them arrive at Host2. I
> > > > > > > tried both with a detached controller and with one running on the
> > > > > > > same machine as OVS. I tested it on different machines (in Mininet
> > > > > > > and with separate real hosts). I use the out-of-band option for the
> > > > > > > controller and disable-in-band=true.
> > > > > >
> > > > > >
> > > > > > > Starting at some number of packets (around 300 or more), packet
> > > > > > > drops are observed. For instance, if I generate 500 UDP packets in
> > > > > > > 120 ms, only around 350 of them arrive at Host2. (Subsequent
> > > > > > > packets of the same flow can arrive at Host2, but first packets of
> > > > > > > flows always experience drops.)
> > > > > >
> > > > > >
> > > > > > > ovs-ofctl dump-aggregate shows that all the flows are installed but
> > > > > > > only part of the packets are processed through them:
> > > > > >
> > > > > > NXST_AGGREGATE reply (xid=0x4): packet_count=356 byte_count=42364
> > > > > > flow_count=500
> > > > > >
> > > > > >
> > > > > > > ovs-ofctl dump-ports also shows that 500 packets arrive on the
> > > > > > > ingress interface and only 356 leave the egress.
> > > > > >
> > > > > >
> > > > > > > ovs-dpctl show -s shows the same: 500 flows installed and 356
> > > > > > > packets processed.
> > > > > >
> > > > > >
> > > > > > > Also I tried to replace Flow_Mods with Packet_Out messages for
> > > > > > > every packet, but I experienced the same drops. It seems like OVS
> > > > > > > starts dropping packets after some threshold (or buffer overload).
> > > > > >
> > > > > >
> > > > > > > Is there any possibility to debug these drops and maybe to
> > > > > > > manipulate ingress buffer sizes (or queue priorities) in order to
> > > > > > > avoid such drops?
> > > > >
> > > > > > Yes, I think you will have to do the initial debugging yourself, to
> > > > > > find out where the drop is occurring.  When you report that back to
> > > > > > us, we can help you figure out how to fix it.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Anton Matsiuk
> > >
>



-- 
Best regards,
Anton Matsiuk

_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
