If the vhost-user application (e.g. OVS) deletes the vhost-user port
while QEMU sends a vhost-user request, a deadlock can happen if the
request handler tries to acquire vhost-user's global mutex, which is
also locked by the vhost-user port deletion API
(rte_vhost_driver_unregister()).

This patch prevents the deadlock by making rte_vhost_driver_unregister()
release the mutex and try again if a request is being handled, giving
the request handler a chance to complete.

Fixes: 8b4b949144b8 ("vhost: fix dead lock on closing in server mode")
Fixes: 5fbb3941da9f ("vhost: introduce driver features related APIs")
Cc: sta...@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
---
 lib/librte_vhost/socket.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 633c2cbc27..c57a0c7cdd 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -1052,9 +1052,10 @@ rte_vhost_driver_unregister(const char *path)
 				next = TAILQ_NEXT(conn, next);
 
 				/*
-				 * If r/wcb is executing, release the
-				 * conn_mutex lock, and try again since
-				 * the r/wcb may use the conn_mutex lock.
+				 * If r/wcb is executing, release vsocket's
+				 * conn_mutex and vhost_user's mutex locks, and
+				 * try again since the r/wcb may use the
+				 * conn_mutex and mutex locks.
 				 */
 				if (fdset_try_del(&vhost_user.fdset,
 						  conn->connfd) == -1) {
@@ -1075,8 +1076,17 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				fdset_del(&vhost_user.fdset,
-						vsocket->socket_fd);
+				/*
+				 * If r/wcb is executing, release vhost_user's
+				 * mutex lock, and try again since the r/wcb
+				 * may use the mutex lock.
+				 */
+				if (fdset_try_del(&vhost_user.fdset,
+						vsocket->socket_fd) == -1) {
+					pthread_mutex_unlock(&vhost_user.mutex);
+					goto again;
+				}
+
 				close(vsocket->socket_fd);
 				unlink(path);
 			} else if (vsocket->reconnect) {
-- 
2.21.0