On 24.07.23 17:48, Eugenio Perez Martin wrote:
On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hre...@redhat.com> wrote:
On 21.07.23 17:25, Eugenio Perez Martin wrote:
On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hre...@redhat.com> wrote:
Move the `suspended` field from vhost_vdpa into the global vhost_dev
struct, so vhost_dev_stop() can check whether the back-end has been
suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
there is no need to reset it; the reset is just a fall-back to stop
device operations for back-ends that do not support suspend.

Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
when the device is re-started, we still have to do the reset to have it
un-suspend.

Signed-off-by: Hanna Czenczek <hre...@redhat.com>
---
   include/hw/virtio/vhost-vdpa.h |  2 --
   include/hw/virtio/vhost.h      |  8 ++++++++
   hw/virtio/vhost-vdpa.c         | 11 +++++++----
   hw/virtio/vhost.c              |  8 +++++++-
   4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index e64bfc7f98..72c3686b7f 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
       bool shadow_vqs_enabled;
    /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
       bool shadow_data;
-    /* Device suspended successfully */
-    bool suspended;
       /* IOVA mapping used by the Shadow Virtqueue */
       VhostIOVATree *iova_tree;
       GPtrArray *shadow_vqs;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6a173cb9fa..69bf59d630 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -120,6 +120,14 @@ struct vhost_dev {
       uint64_t backend_cap;
       /* @started: is the vhost device started? */
       bool started;
+    /**
+     * @suspended: Whether the vhost device is currently suspended.  Set
+     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
+     * are supposed to automatically suspend/resume in their
+     * vhost_dev_start handlers as required.  Must also be cleared when
+     * the device is reset.
+     */
+    bool suspended;
       bool log_enabled;
       uint64_t log_size;
       Error *migration_blocker;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7b7dee468e..f7fd19a203 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,

   static int vhost_vdpa_reset_device(struct vhost_dev *dev)
   {
-    struct vhost_vdpa *v = dev->opaque;
       int ret;
       uint8_t status = 0;

       ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
       trace_vhost_vdpa_reset_device(dev);
-    v->suspended = false;
+    dev->suspended = false;
       return ret;
   }

@@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
           if (unlikely(r)) {
               error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
           } else {
-            v->suspended = true;
+            dev->suspended = true;
               return;
           }
       }
@@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
               return -1;
           }
           vhost_vdpa_set_vring_ready(dev);
+        if (dev->suspended) {
+            /* TODO: When RESUME is available, use it instead of resetting */
+            vhost_vdpa_reset_status(dev);
How is it that we reset the status at each vhost_vdpa_dev_start? That
will clear all the vqs configured, features negotiated, etc. in the
vDPA device. Or am I missing something?
What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
but we somehow need to lift the previous SUSPEND so the device will
again respond to guest requests, do we not?

Reset also clears the suspend state in vDPA, and it should be called
at vhost_dev_stop. So the device should never be in suspended state
here. Does that solve your concerns?

My intention with this patch was precisely not to reset in vhost_dev_stop when suspending is supported.  So now I’m more confused than before.

But more generally, is this any different from what is done before this
patch?  Before this patch, vhost_dev_stop() unconditionally invokes
vhost_reset_status(), so the device is reset in every stop/start cycle;
that doesn’t change.  And we still won’t reset it on the first
vhost_dev_start(), because dev->suspended will be false then, only on
subsequent stop/start cycles, as before.  So the only difference is that
now the device is reset on start, not on stop.

The difference is that vhost_vdpa_dev_start is called after the
features ack (via vhost_dev_start, through the vhost_dev_set_features
call) and vq configuration (using vhost_virtqueue_start). A device
reset forces the device to forget about all of that, and qemu cannot
configure the virtqueues again until it has acked the features again.

Now I’m completely confused, because I don’t see the point of implementing suspend at all if we rely on a reset immediately afterwards anyway.  It was my impression this whole time that suspending would remove the need to reset.  Well, at least until the device should be resumed again, i.e. in vhost_dev_start().

In addition, I also don’t understand the magnitude of the problem with ordering.  If the order in vhost_dev_start() is wrong, can we not easily fix it?  E.g. add a full vhost_dev_resume callback to invoke right at the start of vhost_dev_start(); or check (in the same place) whether the back-end supports resuming, and if it doesn’t (and it is currently suspended), reset it there.

In any case, if we need to reset in vhost_dev_stop(), i.e. immediately after suspend, I don’t see the point of suspending, indicating to me that I still fail to understand its purpose.

Hanna

