Does the controller get any error replies from Open vSwitch? What's in the ovs-vswitchd log? (Not in debug mode, that's too big.)
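
As a rough cross-check of the counters quoted below, here is a minimal Python sketch (not part of Open vSwitch; the helper name parse_counters is made up for illustration). It pulls the decimal key=value counters out of the NXST_AGGREGATE reply quoted in this thread and reports how many installed flows cannot have matched a packet, assuming exactly one packet was sent per flow as in the test described.

```python
import re


def parse_counters(line):
    """Return the decimal key=value counters found in one line of output.

    Hypothetical helper for illustration only. Hex values such as
    xid=0x4 are skipped because the value must be purely decimal.
    """
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)\b", line)}


# Aggregate reply copied from the message quoted in this thread.
aggregate = parse_counters(
    "NXST_AGGREGATE reply (xid=0x4): "
    "packet_count=356 byte_count=42364 flow_count=500"
)

# Assumption: the test sends exactly one packet per installed flow, so the
# gap between flows and packets is the number of first packets that never
# went through their flow.
unmatched = aggregate["flow_count"] - aggregate["packet_count"]
print(f"flows={aggregate['flow_count']} "
      f"packets={aggregate['packet_count']} unmatched={unmatched}")
```

The same subtraction can be repeated on the rx/tx counters from "ovs-ofctl dump-ports". It may also be worth looking at the "lost" figure in the "lookups" line of "ovs-dpctl show", which counts packets that were destined for userspace but dropped before reaching it.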

On Fri, Nov 22, 2013 at 04:15:20PM +0100, Anton Matsiuk wrote:
> Dear Ben,
>
> I figured out that drops occur inside OVS. I see all packets entering one
> interface of OVS, Packet_In generated for every packet, then Flow_Mods (or
> Packet_Out in other tests) generated and sent for every Packet_In by
> external controller and all these rules are installed to OVS. Namely 500
> Packet_In --> 500 flows in OVS, but only part of ingress packets is
> processed through their corresponding flow rules and leaves OVS.
> (dump-ports and dump-flows both in kernel and user-space modules show this).
> Drops occur only after some threshold of Packet_In per msec, that's why it
> seems like OVS drops some packets due to buffer overloads (or probably due
> to expired timeouts for arrived packets).
>
> I read logs up to dbg level but the only thing that I figured out (in
> ovs-vswitchd.log) is that governor periodically expands hash table in
> response to flow_mods increasing frequency.
>
> Is there possibility to track drops in internal buffers of OVS or somehow
> to debug it?
>
> Or, probably, does OVS drop packets after expired timeout for Packet_In
> residing in buffer? And what is the default value for such timeout if any?
>
> --
> Best regards,
> Anton Matsiuk
>
> On 21 November 2013 17:56, Ben Pfaff <b...@nicira.com> wrote:
>
> > Please don't drop the mailing list.
> >
> > You have begun to narrow down where the drops occur, but it's still not
> > clear exactly where. I suggest following the troubleshooting procedure
> > in the FAQ.
> >
> > Q: I have a sophisticated network setup involving Open vSwitch, VMs or
> >    multiple hosts, and other components. The behavior isn't what I
> >    expect. Help!
> >
> > A: To debug network behavior problems, trace the path of a packet,
> >    hop-by-hop, from its origin in one host to a remote host. If
> >    that's correct, then trace the path of the response packet back to
> >    the origin.
> >
> >    Usually a simple ICMP echo request and reply ("ping") packet is
> >    good enough. Start by initiating an ongoing "ping" from the origin
> >    host to a remote host. If you are tracking down a connectivity
> >    problem, the "ping" will not display any successful output, but
> >    packets are still being sent. (In this case the packets being sent
> >    are likely ARP rather than ICMP.)
> >
> >    Tools available for tracing include the following:
> >
> >        - "tcpdump" and "wireshark" for observing hops across network
> >          devices, such as Open vSwitch internal devices and physical
> >          wires.
> >
> >        - "ovs-appctl dpif/dump-flows <br>" in Open vSwitch 1.10 and
> >          later or "ovs-dpctl dump-flows <br>" in earlier versions.
> >          These tools allow one to observe the actions being taken on
> >          packets in ongoing flows.
> >
> >          See ovs-vswitchd(8) for "ovs-appctl dpif/dump-flows"
> >          documentation, ovs-dpctl(8) for "ovs-dpctl dump-flows"
> >          documentation, and "Why are there so many different ways to
> >          dump flows?" above for some background.
> >
> >        - "ovs-appctl ofproto/trace" to observe the logic behind how
> >          ovs-vswitchd treats packets. See ovs-vswitchd(8) for
> >          documentation. You can find out more details about a given
> >          flow that "ovs-dpctl dump-flows" displays, by cutting and
> >          pasting a flow from the output into an "ovs-appctl
> >          ofproto/trace" command.
> >
> >        - SPAN, RSPAN, and ERSPAN features of physical switches, to
> >          observe what goes on at these physical hops.
> >
> >    Starting at the origin of a given packet, observe the packet at
> >    each hop in turn. For example, in one plausible scenario, you
> >    might:
> >
> >        1. "tcpdump" the "eth" interface through which an ARP egresses
> >           a VM, from inside the VM.
> >
> >        2. "tcpdump" the "vif" or "tap" interface through which the ARP
> >           ingresses the host machine.
> >
> >        3. Use "ovs-dpctl dump-flows" to spot the ARP flow and observe
> >           the host interface through which the ARP egresses the
> >           physical machine. You may need to use "ovs-dpctl show" to
> >           interpret the port numbers. If the output seems surprising,
> >           you can use "ovs-appctl ofproto/trace" to observe details of
> >           how ovs-vswitchd determined the actions in the "ovs-dpctl
> >           dump-flows" output.
> >
> >        4. "tcpdump" the "eth" interface through which the ARP egresses
> >           the physical machine.
> >
> >        5. "tcpdump" the "eth" interface through which the ARP
> >           ingresses the physical machine, at the remote host that
> >           receives the ARP.
> >
> >        6. Use "ovs-dpctl dump-flows" to spot the ARP flow on the
> >           remote host that receives the ARP and observe the VM "vif"
> >           or "tap" interface to which the flow is directed. Again,
> >           "ovs-dpctl show" and "ovs-appctl ofproto/trace" might help.
> >
> >        7. "tcpdump" the "vif" or "tap" interface to which the ARP is
> >           directed.
> >
> >        8. "tcpdump" the "eth" interface through which the ARP
> >           ingresses a VM, from inside the VM.
> >
> >    It is likely that during one of these steps you will figure out the
> >    problem. If not, then follow the ARP reply back to the origin, in
> >    reverse.
> >
> >
> > On Thu, Nov 21, 2013 at 04:55:13PM +0100, Anton Matsiuk wrote:
> > > I requested log files up to debug level, namely:
> > > ovs-vswitchd.log
> > > ovs-dpctl.log
> > > ovs-ofctl.log
> > > but none of them shows any messages related to packet drops. All the
> > > statistics show that the correct number of flows was installed and
> > > only part of the packets was processed.
> > > That's why I am asking, are there any other possibilities (beyond log
> > > files) to track packet drops in input buffers and probably to fix
> > > them? Or at least in which direction I should search for a solution?
> > >
> > >
> > > On 20 November 2013 18:13, Ben Pfaff <b...@nicira.com> wrote:
> > >
> > > > On Wed, Nov 20, 2013 at 12:35:25PM +0100, Anton Matsiuk wrote:
> > > > > I test Open vSwitch in the following scheme: I use 2 hosts directly
> > > > > connected to OVS and external OpenFlow Controller. Host1 generates
> > > > > UDP datagrams with sequential ports towards Host2, Host2 listens
> > > > > for these UDP datagrams. In response to every UDP datagram OVS
> > > > > generates Packet_In and Controller sends Flow_Mod back with L4
> > > > > granularity (so for every pair of UDP port numbers it installs
> > > > > separate flow). I send a bunch of UDP datagrams from Host1 and
> > > > > calculate how many of them arrived at Host2. I tried both with
> > > > > detached controller and running in the same machine as OVS. I
> > > > > tested it on different machines (in Mininet and with separated
> > > > > real hosts). I use out-of-band option for controller and
> > > > > disable-in-band=true.
> > > > >
> > > > >
> > > > > Starting from some number of packets (around >300), packet drops
> > > > > are observed. For instance, if I generate 500 UDP packets in 120 ms
> > > > > only around 350 of them arrive at Host2 (Subsequent packets of the
> > > > > same flow can arrive at Host2, but first packets of flows always
> > > > > experience drops)
> > > > >
> > > > >
> > > > > ovs-ofctl dump-aggregate shows that all the flows are installed but
> > > > > only part of packets are processed through them:
> > > > >
> > > > > NXST_AGGREGATE reply (xid=0x4): packet_count=356 byte_count=42364
> > > > > flow_count=500
> > > > >
> > > > >
> > > > > ovs-ofctl dump-ports also shows that 500 packets arrive on ingress
> > > > > interface and only 356 leave egress.
> > > > >
> > > > >
> > > > > ovs-dpctl show -s shows the same: 500 flows installed and 356
> > > > > packets processed.
> > > > >
> > > > >
> > > > > Also I tried to replace Flow_Mods with Packet_Out messages for every
> > > > > packet, but I experienced the same drops. It seems like OVS starts
> > > > > dropping packets after some threshold (or buffer overload).
> > > > >
> > > > >
> > > > > Is there any possibility to debug these drops and maybe to manipulate
> > > > > ingress buffer sizes (or queue priorities) in order to avoid such
> > > > > drops?
> > > >
> > > > Yes, I think you will have to do the initial debugging yourself, to
> > > > find out where the drop is occurring. When you report that back to
> > > > us, we can help you figure out how to fix it.
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Anton Matsiuk
> >

_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss