Apologies for mixing up patches last time. This looks good from a vhost-user-blk perspective, but I worry that some of these changes could impact other vhost device types.
I agree with adding notifiers_set to struct vhost_dev, and setting it in vhost_dev_enable/disable notifiers, but is there any reason notifiers_set can’t be checked inside vhost-user-blk? On Thu, Apr 30, 2020 at 9:55 AM Dima Stepanov <dimas...@yandex-team.ru> wrote: > > In case of the vhost-user devices the daemon can be killed at any > moment. Since QEMU supports the reconnet functionality the guest > notifiers should be reset and disabled after "disconnect" event. The > most issues were found if the "disconnect" event happened during vhost > device initialization step. > The disconnect event leads to the call of the vhost_dev_cleanup() > routine. Which memset to 0 a vhost device structure. Because of this, if > device was not started (dev.started == false) and the connection is > broken, then the set_guest_notifier method will produce assertion error. > Also connection can be broken after the dev.started field is set to > true. > A new notifiers_set field is added to the vhost_dev structure to track > the state of the guest notifiers during the initialization process. > >From what I can tell this patch does two things: (1) In vhost.c you’re adding checks to abort early, while still returning successfully, from vhost_dev_drop_guest_notifiers() and vhost_dev_disable_notifiers() if notifiers have not been enabled. This new logic will affect all existing vhost devices. (2) For vhost-user-blk backend disconnect, you are ensuring that notifiers are dropped and disabled if and only if the notifiers are currently enabled. I completely agree with (2), but I don't think we need all of what you've done for (1) to accomplish (2). Either way, please clarify in your commit message. > Signed-off-by: Dima Stepanov <dimas...@yandex-team.ru> > --- > hw/block/vhost-user-blk.c | 8 ++++---- > hw/virtio/vhost.c | 11 +++++++++++ > include/hw/virtio/vhost.h | 1 + > 3 files changed, 16 insertions(+), 4 deletions(-) > > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c > index 70d7842..5a3de0f 100644 > --- a/hw/block/vhost-user-blk.c > +++ b/hw/block/vhost-user-blk.c > @@ -175,7 +175,9 @@ static void vhost_user_blk_stop(VirtIODevice *vdev) > return; > } > > - vhost_dev_stop(&s->dev, vdev); > + if (s->dev.started) { > + vhost_dev_stop(&s->dev, vdev); > + } > Couldn't we check if s->dev.notifiers_set here before calling vhost_dev_drop_guest_notifiers()? > ret = vhost_dev_drop_guest_notifiers(&s->dev, vdev, s->dev.nvqs); > if (ret < 0) { > @@ -337,9 +339,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev) > } > s->connected = false; > > - if (s->dev.started) { > - vhost_user_blk_stop(vdev); > - } > + vhost_user_blk_stop(vdev); > > vhost_dev_cleanup(&s->dev); > } > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index fa3da9c..ddbdc53 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -1380,6 +1380,7 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, > VirtIODevice *vdev) > goto fail_vq; > } > } > + hdev->notifiers_set = true; > > return 0; > fail_vq: > @@ -1407,6 +1408,10 @@ void vhost_dev_disable_notifiers(struct vhost_dev > *hdev, VirtIODevice *vdev) > BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev))); > int i, r; > I’m a little weary of short circuiting logic like this without at least propagating an error up. Couldn’t we leave it to the backends to check notifiers_set before they call vhost_dev_disable_notifiers() or vhost_dev_drop_guest_notifiers()? Then, if anything, maybe make this check an assert? > + if (!hdev->notifiers_set) { > + return; > + } > + > for (i = 0; i < hdev->nvqs; ++i) { > r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + > i, > false); > @@ -1417,6 +1422,8 @@ void vhost_dev_disable_notifiers(struct vhost_dev > *hdev, VirtIODevice *vdev) > virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + > i); > } > virtio_device_release_ioeventfd(vdev); > + > + hdev->notifiers_set = false; > } > > /* > @@ -1449,6 +1456,10 @@ int vhost_dev_drop_guest_notifiers(struct vhost_dev > *hdev, > VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); > int ret; > Same comment as above - I’d prefer vhost-user-blk (and other backends supporting reconnect) check before calling the function instead of changing existing API behavior for other vhost devices. > + if (!hdev->notifiers_set) { > + return 0; > + } > + > ret = k->set_guest_notifiers(qbus->parent, nvqs, false); > if (ret < 0) { > error_report("Error reset guest notifier: %d", -ret); > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > index 4d0d2e2..e3711a7 100644 > --- a/include/hw/virtio/vhost.h > +++ b/include/hw/virtio/vhost.h > @@ -90,6 +90,7 @@ struct vhost_dev { > QLIST_HEAD(, vhost_iommu) iommu_list; > IOMMUNotifier n; > const VhostDevConfigOps *config_ops; > + bool notifiers_set; > }; > > int vhost_dev_init(struct vhost_dev *hdev, void *opaque, > -- > 2.7.4 > >