> -----Original Message-----
> From: Stojaczyk, DariuszX
> Sent: Tuesday, May 16, 2017 4:35 PM
> To: dev@dpdk.org
> Cc: Wodkowski, PawelX <pawelx.wodkow...@intel.com>; Stojaczyk, DariuszX
> <dariuszx.stojac...@intel.com>
> Subject: [PATCH] vhost: fix deadlock on rte_vhost_driver_unregister()
>
> Consider the following scenario, threads A and B:
> (A)
>  * fdset_event_dispatch() start
>  * pfdentry->busy = 1;
>  * vhost_user_read_cb() start
>  * vhost_destroy_device() start
> (B)
>  * rte_vhost_driver_unregister() start
>  * pthread_mutex_lock(&vsocket->conn_mutex);
>  * fdset_del()
>  * endless loop, waiting for pfdentry->busy == 0
> (A)
>  * vhost_destroy_device() end
>  * pthread_mutex_lock(&vsocket->conn_mutex);
>    (mutex already locked - deadlock at this point)
>
> Thread B has locked vsocket->conn_mutex and is in while(1) loop waiting for
> given fd to change it's busy flag to 0.
> Thread A would have to finish vhost_user_read_cb() in order to set busy flag
> back to 0, but that can't happen due to the vsocket->conn_mutex lock.
>
> This patch synchronizes rte_vhost_driver_unregister() with
> vhost_user_read_cb() through vhost_user.mutex.
>
> Signed-off-by: Dariusz Stojaczyk <dariuszx.stojac...@intel.com>
> ---
>  lib/librte_vhost/socket.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index c7f99b0..77e58fe 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -273,6 +273,8 @@ vhost_user_read_cb(int connfd, void *dat, int *remove)
>
>  	ret = vhost_user_msg_handler(conn->vid, connfd);
>  	if (ret < 0) {
> +		pthread_mutex_lock(&vhost_user.mutex);
> +
>  		close(connfd);
>  		*remove = 1;
>  		vhost_destroy_device(conn->vid);
> @@ -287,6 +289,8 @@ vhost_user_read_cb(int connfd, void *dat, int *remove)
>  			create_unix_socket(vsocket);
>  			vhost_user_start_client(vsocket);
>  		}
> +
> +		pthread_mutex_unlock(&vhost_user.mutex);
>  	}
>  }
>
> --
> 2.7.4

I've found other cases where rte_vhost_driver_unregister() can still deadlock.
Please do not merge this patch for now; I will try to come up with a different
solution for this issue.
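
For anyone trying to follow the A/B scenario from the commit message, here is a
minimal standalone sketch of the same lock ordering. It is not DPDK code and
does not call into librte_vhost: conn_mutex and busy are hypothetical stand-ins
for vsocket->conn_mutex and pfdentry->busy, and unregister_thread() and
simulate_unregister_race() are names made up for this example. Thread A uses
pthread_mutex_trylock() where the real callback path uses pthread_mutex_lock(),
so the sketch can report the would-be deadlock as EBUSY and then terminate
instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for vsocket->conn_mutex and pfdentry->busy. */
static pthread_mutex_t conn_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_int busy;
static atomic_int b_holds_mutex;

/* Thread B: models rte_vhost_driver_unregister() calling fdset_del(). */
static void *unregister_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&conn_mutex);
	atomic_store(&b_holds_mutex, 1);
	while (atomic_load(&busy))	/* fdset_del(): wait for busy == 0 */
		sched_yield();
	pthread_mutex_unlock(&conn_mutex);
	return NULL;
}

/*
 * Thread A: models fdset_event_dispatch() -> vhost_user_read_cb().
 * Returns the trylock result; EBUSY means a blocking lock would deadlock.
 */
int simulate_unregister_race(void)
{
	pthread_t b;
	int rc;

	atomic_store(&b_holds_mutex, 0);
	atomic_store(&busy, 1);		/* dispatch marks the fd busy */

	rc = pthread_create(&b, NULL, unregister_thread, NULL);
	assert(rc == 0);

	while (!atomic_load(&b_holds_mutex))	/* let B take conn_mutex first */
		sched_yield();

	/* The real callback path blocks in pthread_mutex_lock() at this point. */
	rc = pthread_mutex_trylock(&conn_mutex);
	if (rc == 0)
		pthread_mutex_unlock(&conn_mutex);

	atomic_store(&busy, 0);		/* break the cycle so the sketch ends */
	pthread_join(b, NULL);
	return rc;
}
```

Calling simulate_unregister_race() from a small main() and building with
-pthread should return EBUSY: B holds the mutex while spinning on busy, and A
cannot clear busy until it gets past the lock. Any fix has to break that cycle
on every path into rte_vhost_driver_unregister(), not only the one touched by
the quoted patch.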