When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever.
The reason is: When the vhost-user fail at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. With this patch, the vhost_user_blk_disconnect will call the vhost_dev_cleanup() again, it's safe. All vhost-user devices have this issue, including vhost-user-blk/scsi. Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fen...@smartx.com> --- hw/virtio/vhost-user.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 8dcf049d42..697b403fe2 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -2648,16 +2648,8 @@ typedef struct { static void vhost_user_async_close_bh(void *opaque) { VhostAsyncCallback *data = opaque; - struct vhost_dev *vhost = data->vhost; - /* - * If the vhost_dev has been cleared in the meantime there is - * nothing left to do as some other path has completed the - * cleanup. - */ - if (vhost->vdev) { - data->cb(data->dev); - } + data->cb(data->dev); g_free(data); } -- 2.41.0