The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. when memory of vsocket is freed in rte_vhost_driver_unregister(), the invalid memory of vsocket is accessd in vhost_user_read_cb(). It's a bug of both mode for vhost as server or client.
Eg vhostuser port is created as server. Thread1 calls rte_vhost_driver_unregister(). Before the listen fd is deleted from poll waiting fds, "vhost-events" thread then calls vhost_user_server_new_connection(), then a new conn fd is added in fdset when trying to reconnect. "vhost-events" thread then calls vhost_user_read_cb() and access invalid memory of socket while thread1 frees the memory of vsocket. Eg vhostuser port is created as client. Thread1 calls rte_vhost_driver_unregister(). Before vsocket of reconn is deleted from reconn list, "vhost_reconn" thread then calls vhost_user_add_connection() then a new conn fd is added in fdset when trying to reconnect. "vhost-events" thread then calls vhost_user_read_cb() and access invalid memory of socket while thread1 frees the memory of vsocket. The fix is to move the "fdset_try_del" in front of free memory of conn, then avoid the race condition. The core trace is: Program terminated with signal 11, Segmentation fault. Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode") Signed-off-by: Gaoxiang Liu <liugaoxi...@huawei.com> v2: * Fix coding style issues. v3: * Add detailed log. v4: * Add the reason when vhostuser port is created as server. v5: * Add detailed log when vhostuser port is created as client. --- lib/vhost/socket.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c index 5d0d728d5..2eb8fcadd 100644 --- a/lib/vhost/socket.c +++ b/lib/vhost/socket.c @@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path) for (i = 0; i < vhost_user.vsocket_cnt; i++) { struct vhost_user_socket *vsocket = vhost_user.vsockets[i]; + if (vsocket->is_server) { + /* + * If r/wcb is executing, release vhost_user's + * mutex lock, and try again since the r/wcb + * may use the mutex lock. + */ + if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) { + pthread_mutex_unlock(&vhost_user.mutex); + goto again; + } + } else if (vsocket->reconnect) { + vhost_user_remove_reconnect(vsocket); + } + if (!strcmp(vsocket->path, path)) { pthread_mutex_lock(&vsocket->conn_mutex); for (conn = TAILQ_FIRST(&vsocket->conn_list); @@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path) pthread_mutex_unlock(&vsocket->conn_mutex); if (vsocket->is_server) { - /* - * If r/wcb is executing, release vhost_user's - * mutex lock, and try again since the r/wcb - * may use the mutex lock. - */ - if (fdset_try_del(&vhost_user.fdset, - vsocket->socket_fd) == -1) { - pthread_mutex_unlock(&vhost_user.mutex); - goto again; - } - close(vsocket->socket_fd); unlink(path); - } else if (vsocket->reconnect) { - vhost_user_remove_reconnect(vsocket); } pthread_mutex_destroy(&vsocket->conn_mutex); -- 2.32.0