Hi Ben,

It turns out I was wrong: this appears to be a genuine bug in dpif-netdev.
I sent a fix that I believe might be related to the bug observed here:

http://openvswitch.org/pipermail/dev/2016-January/065073.html

Otherwise it would be interesting to get a backtrace of the main thread
from gdb to investigate further.

Thanks,

Daniele

On 26/01/2016 00:23, "Iezzi, Federico" <federico.ie...@hpe.com> wrote:

>Hi there,
>
>I have the same issue with OVS 2.4 (latest commit in the branch 2.4) and
>DPDK 2.0.0 in a Debian 8 environment.
>After a while it just gets stuck.
>
>Regards,
>Federico
>
>-----Original Message-----
>From: discuss [mailto:discuss-boun...@openvswitch.org] On Behalf Of Ben
>Pfaff
>Sent: Tuesday, January 26, 2016 7:13 AM
>To: Daniele di Proietto <diproiet...@vmware.com>
>Cc: discuss@openvswitch.org
>Subject: Re: [ovs-discuss] dpdk watchdog stuck?
>
>Daniele, I think that you said in our meeting today that there was some
>sort of bug that falsely blames a thread. Can you explain further?
>
>On Mon, Jan 25, 2016 at 09:29:52PM +0100, Patrik Andersson R wrote:
>> Right, that is likely for sure. Will look there first.
>>
>> What do you think of the case where the thread is "main"? I've got
>> examples of this one as well. Have not been able to figure out so far
>> what would cause this.
>>
>> ...
>> ovs-vswitchd.log.1.1.1.1:2016-01-23T01:47:19.026Z|00016|ovs_rcu(urcu2)|WARN|blocked 32768000 ms waiting for main to quiesce
>> ovs-vswitchd.log.1.1.1.1:2016-01-23T10:53:27.026Z|00017|ovs_rcu(urcu2)|WARN|blocked 65536000 ms waiting for main to quiesce
>> ovs-vswitchd.log.1.1.1.1:2016-01-24T05:05:43.026Z|00018|ovs_rcu(urcu2)|WARN|blocked 131072000 ms waiting for main to quiesce
>> ovs-vswitchd.log.1.1.1.1:2016-01-24T18:24:40.826Z|00001|ovs_rcu(urcu1)|WARN|blocked 1092 ms waiting for main to quiesce
>> ovs-vswitchd.log.1.1.1.1:2016-01-24T18:24:41.805Z|00002|ovs_rcu(urcu1)|WARN|blocked 2072 ms waiting for main to quiesce
>> ...
>>
>> Could it be in connection with a deletion of a netdev port?
>>
>> Regards,
>>
>> Patrik
>>
>>
>> On 01/25/2016 07:50 PM, Ben Pfaff wrote:
>> >On Mon, Jan 25, 2016 at 03:09:09PM +0100, Patrik Andersson R wrote:
>> >>during robustness testing, where VMs are booted and deleted using
>> >>nova boot/delete in rather rapid succession, VMs get stuck in
>> >>spawning state after a few test cycles. Presumably this is due to
>> >>the OVS not responding to port additions and deletions anymore, or
>> >>rather that responses to these requests become painfully slow. Other
>> >>requests towards the vswitchd fail to complete in any reasonable
>> >>time frame as well, ovs-appctl vlog/set is one example.
>> >>
>> >>The only conclusion I can draw at the moment is that some thread
>> >>(I've observed main and dpdk_watchdog3) is blocking the
>> >>ovsrcu_synchronize() operation for "infinite" time and there is no
>> >>fall-back to get out of this.
>> >>To recover, the minimum operation seems to be a service restart of the
>> >>openvswitch-switch service but that seems to cause other issues
>> >>longer term.
>> >>
>> >>In the vswitch log when this happens the following can be observed:
>> >>
>> >>2016-01-24T20:36:14.601Z|02742|ovs_rcu(vhost_thread2)|WARN|blocked 1000 ms waiting for dpdk_watchdog3 to quiesce
>> >This looks like a bug somewhere in the DPDK code. The watchdog code
>> >is really simple:
>> >
>> >    static void *
>> >    dpdk_watchdog(void *dummy OVS_UNUSED)
>> >    {
>> >        struct netdev_dpdk *dev;
>> >
>> >        pthread_detach(pthread_self());
>> >
>> >        for (;;) {
>> >            ovs_mutex_lock(&dpdk_mutex);
>> >            LIST_FOR_EACH (dev, list_node, &dpdk_list) {
>> >                ovs_mutex_lock(&dev->mutex);
>> >                check_link_status(dev);
>> >                ovs_mutex_unlock(&dev->mutex);
>> >            }
>> >            ovs_mutex_unlock(&dpdk_mutex);
>> >            xsleep(DPDK_PORT_WATCHDOG_INTERVAL);
>> >        }
>> >
>> >        return NULL;
>> >    }
>> >
>> >Although it looks at first glance like it doesn't quiesce, xsleep()
>> >does that internally, so I guess check_link_status() must be hanging.
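
A note for anyone reading the ovs_rcu warnings quoted in this thread: the reported blocked times (about 1000 ms, about 2000 ms, then 32768000, 65536000 and 131072000 ms for messages 00016 to 00018) are consistent with a warning threshold that starts at 1000 ms and doubles after each message. The sketch below only illustrates that arithmetic. It is an assumption inferred from the log lines, and `warning_threshold_ms()` is a hypothetical helper, not an OVS function.

```c
#include <assert.h>

/* Hypothetical helper (not part of OVS): returns the blocked time, in
 * milliseconds, at which the n-th warning (counting from 1) would be due,
 * ASSUMING the threshold starts at 1000 ms and doubles after every warning.
 * That assumption is inferred from the log excerpts, not from the source. */
unsigned long long
warning_threshold_ms(unsigned int n)
{
    unsigned long long threshold = 1000;
    unsigned int i;

    assert(n >= 1);

    /* Double once for every warning that came before the n-th one. */
    for (i = 1; i < n; i++) {
        threshold *= 2;
    }
    return threshold;
}
```

Under that assumption, warnings 16, 17 and 18 fall at 32768000, 65536000 and 131072000 ms, which matches the urcu2 lines above. So a long silence between messages does not mean the thread recovered; the interval between warnings just keeps growing.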
>_______________________________________________
>discuss mailing list
>discuss@openvswitch.org
>http://openvswitch.org/mailman/listinfo/discuss