On Fri, Jul 18, 2014 at 03:10:27PM -0700, Alex Wang wrote:
> Sure, when I tried to delete my br-int, ovs hangs
> 
> Basically, main thread joins the revalidator thread, revalidator threads
> are either blocking at recvmsg() or the mutex.

Thanks, after some reading and experimentation I understand the
problem now.  Here's a suggested patch that explains further:

diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c
index b1e6804..09d3a61 100644
--- a/lib/netlink-socket.c
+++ b/lib/netlink-socket.c
@@ -724,9 +724,15 @@ nl_dump_refill(struct nl_dump *dump, struct ofpbuf *buffer)
     int error;
 
     while (!ofpbuf_size(buffer)) {
-        error = nl_sock_recv__(dump->sock, buffer, true);
+        error = nl_sock_recv__(dump->sock, buffer, false);
         if (error) {
-            /* The kernel shouldn't return EAGAIN while there's data left. */
+            /* The kernel never blocks providing the results of a dump, so
+             * error == EAGAIN means that we've read the whole thing, and
+             * therefore transform it into EOF.  (The kernel always provides
+             * NLMSG_DONE as a sentinel.  Some other thread must have received
+             * that already but not yet signaled it in 'status'.)
+             *
+             * Any other error is just an error. */
             return error == EAGAIN ? EOF : error;
         }
 
Does that make sense?

Anyway,
Acked-by: Ben Pfaff <b...@nicira.com>
on the patch; I just want to document the reason better.
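
For anyone who wants to see the EAGAIN-to-EOF idea outside of OVS, here is a
minimal standalone sketch.  It is not the OVS code: it uses a Unix datagram
socketpair in place of a Netlink socket, and recv_nonblocking_or_eof() and
demo_drained_socket_reports_eof() are hypothetical names made up for this
illustration.  The assumption is that the last argument to nl_sock_recv__()
selects blocking versus non-blocking receive, which MSG_DONTWAIT stands in
for here.

```c
#include <errno.h>
#include <stdio.h>          /* For EOF. */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of OVS.  Receives one datagram from 'fd'
 * without blocking.  Returns 0 and stores the length in '*n_read' on
 * success, EOF if nothing is queued (EAGAIN), otherwise a positive errno. */
int
recv_nonblocking_or_eof(int fd, void *buf, size_t size, ssize_t *n_read)
{
    ssize_t n;

    do {
        n = recv(fd, buf, size, MSG_DONTWAIT);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        /* An empty queue is treated as "end of data", mirroring the patch. */
        return errno == EAGAIN || errno == EWOULDBLOCK ? EOF : errno;
    }
    *n_read = n;
    return 0;
}

/* Hypothetical demo.  Queues one message, reads it, then reads again from
 * the now-empty socket.  Returns 1 if the first read succeeds and the
 * second one reports EOF instead of blocking, 0 otherwise. */
int
demo_drained_socket_reports_eof(void)
{
    int fds[2];
    char buf[16];
    ssize_t n = 0;
    int first, second;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds)) {
        return 0;
    }
    if (send(fds[0], "done", 4, 0) != 4) {
        close(fds[0]);
        close(fds[1]);
        return 0;
    }

    first = recv_nonblocking_or_eof(fds[1], buf, sizeof buf, &n);
    second = recv_nonblocking_or_eof(fds[1], buf, sizeof buf, &n);

    close(fds[0]);
    close(fds[1]);
    return first == 0 && n == 4 && second == EOF;
}
```

With a blocking receive, the second call would wait forever for a message
that some other reader already consumed, which is the same shape as the
revalidator hang described above.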
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
