I don't have STP enabled.

#ovs-vsctl show
350424e8-f03c-4f5c-aaa9-fd78ec78f97b
    Bridge br-int
        fail_mode: secure
        Port "int-br-eth1"
            Interface "int-br-eth1"
                type: patch
                options: {peer="phy-br-eth1"}
        Port "tap5c0d4002-26"
            tag: 1
            Interface "tap5c0d4002-26"
        Port "tap4954ce53-06"
            tag: 3
            Interface "tap4954ce53-06"
        Port "tap6631c08d-91"
            tag: 3
            Interface "tap6631c08d-91"
        Port "tapdfd3962f-3a"
            tag: 2
            Interface "tapdfd3962f-3a"
        Port br-int
            Interface br-int
                type: internal
        Port "tapb2cd1333-03"
            tag: 1
            Interface "tapb2cd1333-03"
        Port "tap0c1153d7-bb"
            tag: 1
            Interface "tap0c1153d7-bb"
    Bridge "br-eth1"
        Port "br-eth1"
            Interface "br-eth1"
                type: internal
        Port "eth3"
            Interface "eth3"
        Port "phy-br-eth1"
            Interface "phy-br-eth1"
                type: patch
                options: {peer="int-br-eth1"}
    ovs_version: "2.3.1"
eth3 is a physical NIC. The OVS bridges are created with:

ovs-vsctl add-br br-int
ovs-vsctl add-br br-eth1
ovs-vsctl --timeout=10 -- set-fail-mode br-int secure

WanLi Yang

JD.COM

*****************************************************************

From: Joe Stringer
Date: 2015-05-27 05:05
To: yangwanli
CC: bugs
Subject: Re: Re: [ovs-discuss] |timeval|WARN|Unreasonably long 2020ms poll interval (365882ms user, 464906ms system)

Are you using STP?

On 22 May 2015 at 21:54, yangwanli <yangwa...@jd.com> wrote:
> Thank you for your response!
>
> My servers all run CentOS 6.6, kernel 2.6.32-504.el6.x86_64, with 32
> vCPUs and 256G RAM.
> #lscpu
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                32
> On-line CPU(s) list:   0-31
> Thread(s) per core:    2
> Core(s) per socket:    8
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 63
> Stepping:              2
> CPU MHz:               2599.971
> BogoMIPS:              5199.24
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              20480K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
>
> The Open vSwitch kernel module (openvswitch.ko) and the user-space
> utilities are all built from the unmodified OpenvSwitch-2.3.1 source
> (http://openvswitch.org/releases/openvswitch-2.3.1.tar.gz).
> #modinfo openvswitch
> /lib/modules/2.6.32-504.el6.x86_64/kernel/net/openvswitch/openvswitch.ko
> version:        2.3.1
> license:        GPL
> description:    Open vSwitch switching datapath
> srcversion:     4D7CD38A83A9A4A782F73A1
> depends:        libcrc32c
> vermagic:       2.6.32-504.el6.x86_64 SMP mod_unload modversions
> parm:           vlan_tso:Enable TSO for VLAN packets (int)
>
> I have 300 servers, all with the same hardware, OS, and
> OpenvSwitch-2.3.1. Since May 15, 2015, six of them have shown this
> problem: when I execute ovs-ofctl dump-flows br-int, it does not
> always respond, the OVS ports stop forwarding traffic, and as a
> result the virtual machines cannot be reached.
> When the problem occurs the server load is not high, and when I run
> netstat -anp I find some unix sockets in CONNECTING state.
> After I restart ovs-vswitchd and ovsdb-server, everything returns to
> normal.
> I suspect a thread deadlock in ovs-vswitchd.
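For anyone hitting a similar hang, here is a minimal way to check for a stuck or spinning thread while the problem is occurring. This is only a sketch: it assumes a single ovs-vswitchd process, and the backtrace step assumes gdb plus matching debug symbols are installed.

Per-thread CPU usage (a spinning thread sits near 100%):
#top -H -p $(pidof ovs-vswitchd)

Unix sockets stuck in CONNECTING, as described above:
#netstat -anp | grep -E 'ovs-vswitchd|ovsdb-server'

Backtraces of every thread; several threads parked on the same mutex would support the deadlock theory:
#gdb --batch -p $(pidof ovs-vswitchd) -ex 'thread apply all bt'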
>
> All servers have the following log info:
> /var/log/openvswitch/ovs-vswitchd.log:
> 2015-05-18T12:13:04.491Z|00138|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=9e13ffc8:
> 2015-05-18T12:13:04.491Z|00139|coverage|INFO|nln_changed 0.0/sec 0.000/sec 0.0000/sec total: 21
> 2015-05-18T12:13:04.491Z|00140|coverage|INFO|netlink_received 59.4/sec 53.850/sec 53.6072/sec total: 1159232
> 2015-05-18T12:13:04.491Z|00141|coverage|INFO|netlink_recv_jumbo 3.8/sec 2.767/sec 2.6536/sec total: 53239
> 2015-05-18T12:13:04.491Z|00142|coverage|INFO|netlink_sent 68.0/sec 62.783/sec 62.9394/sec total: 1353375
> 2015-05-18T12:13:04.491Z|00143|coverage|INFO|netdev_set_policing 0.0/sec 0.000/sec 0.0000/sec total: 12
> 2015-05-18T12:13:04.491Z|00144|coverage|INFO|netdev_get_ifindex 0.0/sec 0.000/sec 0.0000/sec total: 6
> 2015-05-18T12:13:04.491Z|00145|coverage|INFO|netdev_get_hwaddr 0.0/sec 0.000/sec 0.0000/sec total: 6
> 2015-05-18T12:13:04.491Z|00146|coverage|INFO|netdev_set_hwaddr 0.0/sec 0.000/sec 0.0000/sec total: 3
> 2015-05-18T12:13:04.491Z|00147|coverage|INFO|netdev_get_ethtool 0.0/sec 0.000/sec 0.0000/sec total: 22
> 2015-05-18T12:13:04.491Z|00148|coverage|INFO|netdev_set_ethtool 0.0/sec 0.000/sec 0.0000/sec total: 4
> 2015-05-18T12:13:04.491Z|00149|coverage|INFO|vconn_received 2.4/sec 2.000/sec 1.9067/sec total: 41642
> 2015-05-18T12:13:04.491Z|00150|coverage|INFO|vconn_sent 1.8/sec 1.500/sec 1.4300/sec total: 31228
> 2015-05-18T12:13:04.491Z|00151|coverage|INFO|util_xalloc 1987.2/sec 1867.150/sec 1817.8008/sec total: 37808073
> 2015-05-18T12:13:04.491Z|00152|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5
> 2015-05-18T12:13:04.491Z|00153|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1
> 2015-05-18T12:13:04.491Z|00154|coverage|INFO|rconn_queued 1.2/sec 1.000/sec 0.9533/sec total: 20815
> 2015-05-18T12:13:04.491Z|00155|coverage|INFO|rconn_sent 1.2/sec 1.000/sec 0.9533/sec total: 20815
> 2015-05-18T12:13:04.491Z|00156|coverage|INFO|poll_fd_wait 537.2/sec 481.217/sec 462.1683/sec total: 9762985
> 2015-05-18T12:13:04.491Z|00157|coverage|INFO|poll_zero_timeout 5.2/sec 4.333/sec 3.8581/sec total: 83823
> 2015-05-18T12:13:04.491Z|00158|coverage|INFO|txn_unchanged 0.0/sec 0.000/sec 0.0000/sec total: 11
> 2015-05-18T12:13:04.491Z|00159|coverage|INFO|txn_incomplete 8.8/sec 8.450/sec 8.3044/sec total: 170811
> 2015-05-18T12:13:04.491Z|00160|coverage|INFO|txn_success 0.2/sec 0.200/sec 0.1989/sec total: 4276
> 2015-05-18T12:13:04.491Z|00161|coverage|INFO|netdev_get_stats 1.6/sec 1.600/sec 1.5911/sec total: 32833
> 2015-05-18T12:13:04.491Z|00162|coverage|INFO|mac_learning_learned 0.2/sec 0.133/sec 0.0686/sec total: 1595
> 2015-05-18T12:13:04.491Z|00163|coverage|INFO|mac_learning_expired 0.4/sec 0.150/sec 0.0689/sec total: 1518
> 2015-05-18T12:13:04.491Z|00164|coverage|INFO|hmap_pathological 0.0/sec 0.000/sec 0.0000/sec total: 27
> 2015-05-18T12:13:04.491Z|00165|coverage|INFO|hmap_expand 102.4/sec 92.300/sec 88.6772/sec total: 1853939
> 2015-05-18T12:13:04.491Z|00166|coverage|INFO|flow_extract 23.0/sec 19.883/sec 19.7067/sec total: 416747
> 2015-05-18T12:13:04.491Z|00167|coverage|INFO|dpif_port_add 0.0/sec 0.000/sec 0.0000/sec total: 6
> 2015-05-18T12:13:04.491Z|00168|coverage|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 4
> 2015-05-18T12:13:04.491Z|00169|coverage|INFO|dpif_flow_put 0.0/sec 0.000/sec 0.0000/sec total: 2
> 2015-05-18T12:13:04.491Z|00170|coverage|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 1
> 2015-05-18T12:13:04.491Z|00171|coverage|INFO|dpif_execute 0.0/sec 0.000/sec 0.0000/sec total: 1
> 2015-05-18T12:13:04.491Z|00172|coverage|INFO|xlate_actions 38.8/sec 27.317/sec 22.9731/sec total: 507271
> 2015-05-18T12:13:04.491Z|00173|coverage|INFO|rev_reconfigure 0.0/sec 0.000/sec 0.0000/sec total: 20
> 2015-05-18T12:13:04.491Z|00174|coverage|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0000/sec total: 3
> 2015-05-18T12:13:04.491Z|00175|coverage|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0000/sec total: 11
> 2015-05-18T12:13:04.491Z|00176|coverage|INFO|rev_mac_learning 0.6/sec 0.283/sec 0.1286/sec total: 2813
> 2015-05-18T12:13:04.491Z|00177|coverage|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 3
> 2015-05-18T12:13:04.491Z|00178|coverage|INFO|ofproto_recv_openflow 1.8/sec 1.500/sec 1.4300/sec total: 31229
> 2015-05-18T12:13:04.491Z|00179|coverage|INFO|ofproto_update_port 0.0/sec 0.000/sec 0.0000/sec total: 34
> 2015-05-18T12:13:04.491Z|00180|coverage|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0000/sec total: 15
> 2015-05-18T12:13:04.491Z|00181|coverage|INFO|49 events never hit
> 2015-05-18T14:59:05.014Z|00182|timeval|WARN|Unreasonably long 2019ms poll interval (43448ms user, 50465ms system)
> 2015-05-18T14:59:05.014Z|00183|timeval|WARN|faults: 840 minor, 0 major
> 2015-05-18T14:59:05.014Z|00184|timeval|WARN|disk: 0 reads, 120 writes
> 2015-05-18T14:59:05.014Z|00185|timeval|WARN|context switches: 2167382 voluntary, 529 involuntary
> 2015-05-18T14:59:05.014Z|00186|coverage|INFO|Skipping details of duplicate event coverage for hash=9e13ffc8
> 2015-05-18T17:46:50.578Z|00187|timeval|WARN|Unreasonably long 2018ms poll interval (56724ms user, 65745ms system)
> 2015-05-18T17:46:50.578Z|00188|timeval|WARN|faults: 841 minor, 0 major
> 2015-05-18T17:46:50.578Z|00189|timeval|WARN|disk: 0 reads, 128 writes
> 2015-05-18T17:46:50.578Z|00190|timeval|WARN|context switches: 2853218 voluntary, 538 involuntary
> 2015-05-18T17:46:50.578Z|00191|coverage|INFO|Skipping details of duplicate event coverage for hash=9e13ffc8
> 2015-05-18T20:29:36.092Z|00192|timeval|WARN|Unreasonably long 2018ms poll interval (69561ms user, 80775ms system)
> 2015-05-18T20:29:36.092Z|00193|timeval|WARN|faults: 841 minor, 0 major
> 2015-05-18T20:29:36.092Z|00194|timeval|WARN|disk: 0 reads, 136 writes
> 2015-05-18T20:29:36.092Z|00195|timeval|WARN|context switches: 3519924 voluntary, 544 involuntary
> 2015-05-18T20:29:36.092Z|00196|coverage|INFO|Skipping details of duplicate event coverage for hash=9e13ffc8
> 2015-05-18T23:21:51.729Z|00197|timeval|WARN|Unreasonably long 2021ms poll interval (83029ms user, 96728ms system)
> 2015-05-18T23:21:51.729Z|00198|timeval|WARN|faults: 841 minor, 0 major
> 2015-05-18T23:21:51.729Z|00199|timeval|WARN|disk: 0 reads, 144 writes
> 2015-05-18T23:21:51.729Z|00200|timeval|WARN|context switches: 4224452 voluntary, 553 involuntary
> 2015-05-18T23:21:51.729Z|00201|coverage|INFO|Skipping details of duplicate event coverage for hash=9e13ffc8
>
> ________________________________
>
> WanLi Yang
>
> JD.COM
>
> *****************************************************************
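Those coverage counters can also be dumped on demand, which makes it easier to see what changes between two "Unreasonably long poll interval" warnings. A small sketch using only stock tools; note that strace adds overhead while attached, so detach once a sample is collected.

The same counters the periodic coverage log prints:
#ovs-appctl coverage/show

A per-syscall time summary for the ovs-vswitchd process (Ctrl-C to stop and print):
#strace -c -f -p $(pidof ovs-vswitchd)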
>
> From: Joe Stringer
> Date: 2015-05-23 05:36
> To: yangwanli
> CC: bugs
> Subject: Re: [ovs-discuss] |timeval|WARN|Unreasonably long 2020ms poll interval (365882ms user, 464906ms system)
> On 19 May 2015 at 03:32, yangwanli <yangwa...@jd.com> wrote:
>> Hi,
>> I'm a cloud computing R&D engineer. I use OVS as the virtual switch;
>> all virtual machines and containers communicate through OVS.
>> I use OVS VLANs.
>> OpenvSwitch version 2.3.1 (unmodified)
>> CentOS 6.6
>> In the course of using it I have encountered the following problems.
>> I need your help!
>>
>> 1、ovs-vswitchd.log keeps reporting warnings:
>>
>> 2015-05-17T00:52:50.463Z|00575|timeval|WARN|Unreasonably long 2020ms poll interval (365882ms user, 464906ms system)
>> 2015-05-17T00:52:50.463Z|00576|timeval|WARN|faults: 708 minor, 0 major
>> 2015-05-17T00:52:50.463Z|00577|timeval|WARN|disk: 0 reads, 392 writes
>> 2015-05-17T00:52:50.463Z|00578|timeval|WARN|context switches: 19571142 voluntary, 2080 involuntary
>> 2015-05-17T00:52:50.463Z|00579|coverage|INFO|Skipping details of duplicate event coverage for hash=35d02982
>> 2015-05-17T03:26:36.336Z|00580|timeval|WARN|Unreasonably long 2020ms poll interval (380306ms user, 483259ms system)
>> 2015-05-17T03:26:36.336Z|00581|timeval|WARN|faults: 708 minor, 0 major
>> 2015-05-17T03:26:36.336Z|00582|timeval|WARN|disk: 0 reads, 400 writes
>> 2015-05-17T03:26:36.336Z|00583|timeval|WARN|context switches: 20299734 voluntary, 2174 involuntary
>> 2015-05-17T03:26:36.336Z|00584|coverage|INFO|Skipping details of duplicate event coverage for hash=35d02982
>
> The main loop of ovs-vswitchd usually expects to cycle at least once
> per second. This message appears when a cycle takes longer than that,
> which may indicate that ovs-vswitchd is under heavy load, or some
> sort of bug.
>
>>
>> 2、I also encounter this problem:
>> when I execute
>> ovs-ofctl dump-flows br-int
>> it does not always respond, but ovs-vsctl show is OK!
>
> The main thread of ovs-vswitchd assembles responses to OpenFlow
> requests, and given the above messages it sounds like it's busy doing
> other things. ovs-vsctl, on the other hand, talks to ovsdb-server,
> which looks like it is OK in this scenario.
>
> These both seem like symptoms, but it's not obvious to me what the
> underlying cause is. Is there anything else in the logs? Is
> ovs-vswitchd using a lot of CPU?
>
> Are you using the OVS kernel module from CentOS or one distributed
> with the openvswitch package? ('modinfo openvswitch' should give a
> hint here)
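To make that split concrete, the two commands below are served by two different daemons, so comparing them narrows down which process is wedged (br-int is the integration bridge from this setup):

Served by ovsdb-server over its own unix socket; touches only database state:
#ovs-vsctl show

Served by the ovs-vswitchd main thread via OpenFlow:
#ovs-ofctl dump-flows br-int

If the first responds while the second hangs, the ovs-vswitchd main thread is the prime suspect.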
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss