On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hre...@redhat.com> wrote: > > On 24.07.23 17:48, Eugenio Perez Martin wrote: > > On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hre...@redhat.com> wrote: > >> On 21.07.23 17:25, Eugenio Perez Martin wrote: > >>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hre...@redhat.com> wrote: > >>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev > >>>> struct, so vhost_dev_stop() can check whether the back-end has been > >>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`. If it has, > >>>> there is no need to reset it; the reset is just a fall-back to stop > >>>> device operations for back-ends that do not support suspend. > >>>> > >>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so > >>>> when the device is re-started, we still have to do the reset to have it > >>>> un-suspend. > >>>> > >>>> Signed-off-by: Hanna Czenczek <hre...@redhat.com> > >>>> --- > >>>> include/hw/virtio/vhost-vdpa.h | 2 -- > >>>> include/hw/virtio/vhost.h | 8 ++++++++ > >>>> hw/virtio/vhost-vdpa.c | 11 +++++++---- > >>>> hw/virtio/vhost.c | 8 +++++++- > >>>> 4 files changed, 22 insertions(+), 7 deletions(-) > >>>> > >>>> diff --git a/include/hw/virtio/vhost-vdpa.h > >>>> b/include/hw/virtio/vhost-vdpa.h > >>>> index e64bfc7f98..72c3686b7f 100644 > >>>> --- a/include/hw/virtio/vhost-vdpa.h > >>>> +++ b/include/hw/virtio/vhost-vdpa.h > >>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa { > >>>> bool shadow_vqs_enabled; > >>>> /* Vdpa must send shadow addresses as IOTLB key for data queues, > >>>> not GPA */ > >>>> bool shadow_data; > >>>> - /* Device suspended successfully */ > >>>> - bool suspended; > >>>> /* IOVA mapping used by the Shadow Virtqueue */ > >>>> VhostIOVATree *iova_tree; > >>>> GPtrArray *shadow_vqs; > >>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > >>>> index 6a173cb9fa..69bf59d630 100644 > >>>> --- a/include/hw/virtio/vhost.h > >>>> +++ b/include/hw/virtio/vhost.h > >>>> @@ -120,6 +120,14 @@ struct vhost_dev { > >>>> uint64_t backend_cap; > >>>> /* @started: is the vhost device started? */ > >>>> bool started; > >>>> + /** > >>>> + * @suspended: Whether the vhost device is currently suspended. Set > >>>> + * and reset by implementations (vhost-user, vhost-vdpa, ...), which > >>>> + * are supposed to automatically suspend/resume in their > >>>> + * vhost_dev_start handlers as required. Must also be cleared when > >>>> + * the device is reset. > >>>> + */ > >>>> + bool suspended; > >>>> bool log_enabled; > >>>> uint64_t log_size; > >>>> Error *migration_blocker; > >>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > >>>> index 7b7dee468e..f7fd19a203 100644 > >>>> --- a/hw/virtio/vhost-vdpa.c > >>>> +++ b/hw/virtio/vhost-vdpa.c > >>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct > >>>> vhost_dev *dev, > >>>> > >>>> static int vhost_vdpa_reset_device(struct vhost_dev *dev) > >>>> { > >>>> - struct vhost_vdpa *v = dev->opaque; > >>>> int ret; > >>>> uint8_t status = 0; > >>>> > >>>> ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status); > >>>> trace_vhost_vdpa_reset_device(dev); > >>>> - v->suspended = false; > >>>> + dev->suspended = false; > >>>> return ret; > >>>> } > >>>> > >>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev > >>>> *dev) > >>>> if (unlikely(r)) { > >>>> error_report("Cannot suspend: %s(%d)", g_strerror(errno), > >>>> errno); > >>>> } else { > >>>> - v->suspended = true; > >>>> + dev->suspended = true; > >>>> return; > >>>> } > >>>> } > >>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev > >>>> *dev, bool started) > >>>> return -1; > >>>> } > >>>> vhost_vdpa_set_vring_ready(dev); > >>>> + if (dev->suspended) { > >>>> + /* TODO: When RESUME is available, use it instead of > >>>> resetting */ > >>>> + vhost_vdpa_reset_status(dev); > >>> How is that we reset the status at each vhost_vdpa_dev_start? That > >>> will clean all the vqs configured, features negotiated, etc. in the > >>> vDPA device. Or am I missing something? > >> What alternative do you propose? We don’t have RESUME for vDPA in qemu, > >> but we somehow need to lift the previous SUSPEND so the device will > >> again respond to guest requests, do we not? > >> > > Reset also clears the suspend state in vDPA, and it should be called > > at vhost_dev_stop. So the device should never be in suspended state > > here. Does that solve your concerns? > > My intention with this patch was precisely not to reset in > vhost_dev_stop when suspending is supported. So now I’m more confused > than before. >
At this moment, I think that should be focused as an optimization and not to be included in the main series. > >> But more generally, is this any different from what is done before this > >> patch? Before this patch, vhost_dev_stop() unconditionally invokes > >> vhost_reset_status(), so the device is reset in every stop/start cycle, > >> that doesn’t change. And we still won’t reset it on the first > >> vhost_dev_start(), because dev->suspended will be false then, only on > >> subsequent stop/start cycles, as before. So the only difference is that > >> now the device is reset on start, not on stop. > >> > > The difference is that vhost_vdpa_dev_start is called after features > > ack (via vhost_dev_start, through vhost_dev_set_features call) and vq > > configuration (using vhost_virtqueue_start). A device reset forces the > > device to forget about all of that, and qemu cannot configure them > > again until qemu acks the features again. > > Now I’m completely confused, because I don’t see the point of > implementing suspend at all if we rely on a reset immediately afterwards > anyway. It's not immediate. From vhost_dev_stop, comments added only in this mail: void vhost_virtqueue_stop(struct vhost_dev *dev, struct VirtIODevice *vdev, struct vhost_virtqueue *vq, unsigned idx) { ... // Get each vring indexes, trusting the destination device can continue safely from there r = dev->vhost_ops->vhost_get_vring_base(dev, &state); if (r < 0) { VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); /* Connection to the backend is broken, so let's sync internal * last avail idx to the device used idx. */ virtio_queue_restore_last_avail_idx(vdev, idx); } else { virtio_queue_set_last_avail_idx(vdev, idx, state.num); } ... } void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings) { ... // Suspend the device, so we can trust in vring indexes / vq state if (hdev->vhost_ops->vhost_dev_start) { hdev->vhost_ops->vhost_dev_start(hdev, false); } if (vrings) { vhost_dev_set_vring_enable(hdev, false); } // Fetch each vq index / state and store in vdev->vq[i] for (i = 0; i < hdev->nvqs; ++i) { vhost_virtqueue_stop(hdev, vdev, hdev->vqs + i, hdev->vq_index + i); } // Reset the device, as we don't need it anymore and it can release the resources if (hdev->vhost_ops->vhost_reset_status) { hdev->vhost_ops->vhost_reset_status(hdev); } } --- > It was my impression this whole time that suspending would > remove the need to reset. Well, at least until the device should be > resumed again, i.e. in vhost_dev_start(). > It cannot. vhost_dev_stop is also called when the guest reset the device, so the guest trusts the device to be in a clean state. Also, the reset is the moment when the device frees the unused resources. This would mandate the device to > In addition, I also don’t understand the magnitude of the problem with > ordering. If the order in vhost_dev_start() is wrong, can we not easily > fix it? The order in vhost_dev_start follows the VirtIO standard. The move of the reset should be to remove it from vhost_dev_stop to something like both virtio_reset and the end of virtio_save. I'm not sure if I'm forgetting some other use cases. > E.g. add a full vhost_dev_resume callback to invoke right at > the start of vhost_dev_start(); or check (in the same place) whether the > back-end supports resuming, and if it doesn’t (and it is currently > suspended), reset it there. > > In any case, if we need to reset in vhost_dev_stop(), i.e. immediately > after suspend, I don’t see the point of suspending, indicating to me > that I still fail to understand its purpose. > > Hanna >