On Fri, Mar 29, 2013 at 8:38 PM, Tianpeng Zhang (Gmail)
<tianpeng0...@gmail.com> wrote:
> Hi All,
>
> I hit an issue when running DRBD on XenServer with ovs-1.7.1. DRBD works
> fine when creating and syncing data, but when I try to bring a DRBD
> resource down, ovs-vswitchd hangs for about 20 minutes and then all
> network connections break.
>
> I added some debug tracing and found that ovs-vswitchd ends up stuck in
> sendmsg() while sending a netlink message. The call path is:
> bridge_run_fast()->ofproto_run_fast()->run_fast()->handle_upcalls()->handle_miss_upcalls()->dpif_operate()->dpif_linux_operate__()->nl_sock_transact_multiple()->nl_sock_transact_multiple__()->sendmsg()
>
> ovs-vswitchd stops here because sendmsg() does not return:
>     memset(&msg, 0, sizeof msg);
>     msg.msg_iov = iovs;
>     msg.msg_iovlen = n;
>     do {
>         error = sendmsg(sock->fd, &msg, 0) < 0 ? errno : 0;
>     } while (error == EINTR);
>
> Several people on the Xen/DRBD mailing lists have reported similar issues
> before, but the only solution offered was to stop OVS and use the Linux
> bridge instead. I suspect that DRBD does some cleanup on its netlink
> socket before stopping a resource. Could this conflict with OVS's handling?

It looks like DRBD is also using genetlink for communication with
userspace.  Genetlink has a global lock, so I suspect that DRBD is
holding it for a long time, which is blocking OVS.  There could also be
a deadlock if there is another shared lock that is taken in a different
order, but this seems somewhat less likely since there isn't a lot in
common between DRBD and OVS.
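
To make the first hypothesis concrete, here is a minimal userspace sketch
of that kind of contention.  It is only a toy model built on pthreads, not
kernel, DRBD, or OVS code, and every name in it (global_lock, slow_holder,
blocked_ms) is hypothetical.  One thread takes a single shared lock and
keeps it while it "works"; a second caller that needs the same lock simply
does not return until the first one lets go, which is roughly what a
sendmsg() stuck behind a long-held lock would look like from userspace.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a single lock shared by unrelated users. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Handshake so the waiter only starts timing once the lock is really held. */
static pthread_mutex_t ready_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int holder_ready;

/* Models a slow user that keeps the shared lock for hold_ms milliseconds. */
static void *slow_holder(void *arg)
{
    unsigned int hold_ms = *(unsigned int *)arg;
    struct timespec ts;

    ts.tv_sec = hold_ms / 1000;
    ts.tv_nsec = (long)(hold_ms % 1000) * 1000000L;

    pthread_mutex_lock(&global_lock);

    /* Tell the waiter that the lock is now taken. */
    pthread_mutex_lock(&ready_mtx);
    holder_ready = 1;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&ready_mtx);

    nanosleep(&ts, NULL);               /* "work" done with the lock held */
    pthread_mutex_unlock(&global_lock);
    return NULL;
}

/* Returns how many milliseconds a second user of the lock was blocked
 * while the slow holder kept it, or -1 if the thread could not start. */
long blocked_ms(unsigned int hold_ms)
{
    pthread_t tid;
    struct timespec start, end;

    holder_ready = 0;
    if (pthread_create(&tid, NULL, slow_holder, &hold_ms) != 0) {
        return -1;
    }

    /* Wait until the holder actually owns the lock. */
    pthread_mutex_lock(&ready_mtx);
    while (!holder_ready) {
        pthread_cond_wait(&ready_cond, &ready_mtx);
    }
    pthread_mutex_unlock(&ready_mtx);

    clock_gettime(CLOCK_MONOTONIC, &start);
    pthread_mutex_lock(&global_lock);   /* blocks until the holder is done */
    clock_gettime(CLOCK_MONOTONIC, &end);
    pthread_mutex_unlock(&global_lock);

    pthread_join(tid, NULL);
    return (long)(end.tv_sec - start.tv_sec) * 1000L
           + (end.tv_nsec - start.tv_nsec) / 1000000L;
}
```

Calling blocked_ms() with a hold time of a few hundred milliseconds should
report a wait close to that hold time.  The waiter itself does nothing
wrong; it is stalled purely by how long the other side keeps the lock.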
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
