Why can't we instead fix this by adding an "event_cb" parameter to vhost_user_async_close() and then calling qemu_chr_fe_set_handlers() in vhost_user_async_close_bh()?
Even if calling vhost_dev_cleanup() twice is safe today, I worry future changes may easily stumble over the reconnect case and introduce crashes or double frees.

> On Aug 4, 2023, at 1:29 AM, Li Feng <fen...@smartx.com> wrote:
>
> When the vhost-user is reconnecting to the backend, and if the vhost-user fails
> at the get_features in vhost_dev_init(), then the reconnect will fail
> and it will not be retriggered forever.
>
> The reason is:
> When the vhost-user fail at get_features, the vhost_dev_cleanup will be called
> immediately.
>
> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>
> The reconnect path is:
> vhost_user_blk_event
>    vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>      qemu_chr_fe_set_handlers <----- clear the notifier callback
>        schedule vhost_user_async_close_bh
>
> The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
> called, then the event fd callback will not be reinstalled.
>
> With this patch, the vhost_user_blk_disconnect will call the
> vhost_dev_cleanup() again, it's safe.
>
> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>
> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>
> Signed-off-by: Li Feng <fen...@smartx.com>
> ---
>  hw/virtio/vhost-user.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8dcf049d42..697b403fe2 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2648,16 +2648,8 @@ typedef struct {
>  static void vhost_user_async_close_bh(void *opaque)
>  {
>      VhostAsyncCallback *data = opaque;
> -    struct vhost_dev *vhost = data->vhost;
>
> -    /*
> -     * If the vhost_dev has been cleared in the meantime there is
> -     * nothing left to do as some other path has completed the
> -     * cleanup.
> -     */
> -    if (vhost->vdev) {
> -        data->cb(data->dev);
> -    }
> +    data->cb(data->dev);
>
>      g_free(data);
>  }
> --
> 2.41.0
>
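
To make the event_cb suggestion at the top of this mail concrete, here is a rough standalone model of the idea. It is not QEMU code and has not been tested against the tree: the types are cut-down stand-ins, set_event_handler() is a hypothetical simplification of qemu_chr_fe_set_handlers(), the event_cb field is the proposed (not existing) addition, and async_close() hands back the callback data rather than scheduling a bottom half. The point is only that the vhost->vdev check stays in place, and the event handler is re-armed when the cleanup has already been done by another path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for QEMU types; only the fields this model needs. */
typedef void EventCb(void *opaque, int event);
typedef struct { EventCb *chr_event; void *opaque; } CharBackend;
struct vhost_dev { void *vdev; };            /* NULL once cleaned up */

typedef struct {
    void *dev;
    CharBackend *cd;
    struct vhost_dev *vhost;
    void (*cb)(void *dev);
    EventCb *event_cb;                       /* hypothetical new field */
} VhostAsyncCallback;

/* Simplified stand-in for qemu_chr_fe_set_handlers(): event handler only. */
static void set_event_handler(CharBackend *chr, EventCb *event, void *opaque)
{
    chr->chr_event = event;
    chr->opaque = opaque;
}

static void vhost_dev_cleanup(struct vhost_dev *hdev)
{
    memset(hdev, 0, sizeof(*hdev));
}

/* Models vhost_user_async_close_bh() with the suggested extra branch. */
static void async_close_bh(void *opaque)
{
    VhostAsyncCallback *data = opaque;

    if (data->vhost->vdev) {
        data->cb(data->dev);
    } else if (data->event_cb) {
        /* Cleanup already ran elsewhere: only re-arm the event handler. */
        set_event_handler(data->cd, data->event_cb, data->dev);
    }
    free(data);
}

/* Models vhost_user_async_close(); returns the data instead of scheduling. */
static VhostAsyncCallback *async_close(void *dev, CharBackend *chr,
                                       struct vhost_dev *vhost,
                                       void (*cb)(void *dev), EventCb *event_cb)
{
    VhostAsyncCallback *data = malloc(sizeof(*data));

    assert(data != NULL);
    data->dev = dev;
    data->cd = chr;
    data->vhost = vhost;
    data->cb = cb;
    data->event_cb = event_cb;
    set_event_handler(chr, NULL, NULL);      /* cleared, as in the quoted path */
    return data;
}

/* Toy device and callbacks, purely for the demo. */
typedef struct { CharBackend chr; struct vhost_dev vhost; } DemoDev;

static void demo_event(void *opaque, int event) { (void)opaque; (void)event; }

static void demo_disconnect(void *dev)
{
    DemoDev *d = dev;

    vhost_dev_cleanup(&d->vhost);
    set_event_handler(&d->chr, demo_event, d);
}

/* Returns true if the event handler is installed again after the close. */
static bool demo_reconnect(bool cleaned_up_early, bool pass_event_cb)
{
    DemoDev d;
    VhostAsyncCallback *data;

    memset(&d, 0, sizeof(d));
    d.vhost.vdev = &d;                       /* non-NULL: device is set up */
    set_event_handler(&d.chr, demo_event, &d);
    data = async_close(&d, &d.chr, &d.vhost, demo_disconnect,
                       pass_event_cb ? demo_event : NULL);
    if (cleaned_up_early) {
        vhost_dev_cleanup(&d.vhost);         /* e.g. get_features failed */
    }
    async_close_bh(data);
    return d.chr.chr_event == demo_event;
}
```

In this model, demo_reconnect(true, false) corresponds to the failure described in the quoted commit message (the handler stays cleared), while demo_reconnect(true, true) gets the handler back without running the disconnect callback, and therefore without a second vhost_dev_cleanup().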