On 24.07.23 20:04, Stefan Hajnoczi wrote:
On Fri, Jul 21, 2023 at 04:16:07PM +0200, Hanna Czenczek wrote:
On 20.07.23 18:03, Stefan Hajnoczi wrote:
On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote:
On 19.07.23 16:11, Hanna Czenczek wrote:
On 18.07.23 17:10, Stefan Hajnoczi wrote:
On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
The only user of vhost_user_reset_status() is vhost_dev_stop(), which
only uses it as a fall-back to stop the back-end if it does not support
SUSPEND.  However, vhost-user's implementation is a no-op unless the
back-end supports SET_STATUS.

vhost-vdpa's implementation instead just calls
vhost_vdpa_reset_device(), implying that it's OK to fully reset the
device if SET_STATUS is not supported.

To be fair, vhost_vdpa_reset_device() does nothing but to set
the status
to zero.  However, that may well be because vhost-vdpa has no method
besides this to reset a device.  In contrast, vhost-user has
RESET_DEVICE and a RESET_OWNER, which can be used instead.

While it is not entirely clear from documentation or git logs, from
discussions and the order of vhost-user protocol features, it
appears to
me as if RESET_OWNER originally had no real meaning for vhost-user, and
was thus used to signal a device reset to the back-end.  Then,
RESET_DEVICE was introduced, to have a well-defined dedicated reset
command.  Finally, vhost-user received full STATUS support, including
SET_STATUS, so setting the device status to 0 is now the preferred way
of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
remain valid as fall-backs.

Therefore, have vhost_user_reset_status() fall back to
vhost_user_reset_device() if the back-end has no STATUS support.

Signed-off-by: Hanna Czenczek <hre...@redhat.com>
---
    hw/virtio/vhost-user.c | 2 ++
    1 file changed, 2 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4507de5a92..53a881ec2a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct
vhost_dev *dev)
        if (virtio_has_feature(dev->protocol_features,
                               VHOST_USER_PROTOCOL_F_STATUS)) {
            vhost_user_set_status(dev, 0);
+    } else {
+        vhost_user_reset_device(dev);
        }
    }
Did you check whether DPDK treats setting the status to 0 as equivalent
to RESET_DEVICE?
If it doesn’t, what’s even the point of using reset_status?
Sorry, I’m being unclear, and I think this may be important because it ties
into the question from patch 1, what qemu is even trying to do by running
SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that
SET_STATUS(0) and RESET_DEVICE should be equivalent:

vhost-vdpa.c runs SET_STATUS(0) in a function called
vhost_vdpa_reset_device().  This is one thing that gave me the impression
that this is about an actual full reset.

Another is the whole discussion that we’ve had.  vhost_dev_stop() does not
call a `vhost_reset_device()` function, it calls `vhost_reset_status()`.
Still, we were always talking about resetting the device.
There is some hacky stuff with struct vhost_dev's vq_index_end and
multi-queue devices. I think it's because multi-queue vhost-net device
consist of many vhost_devs and NetClientStates, so certain vhost
operations are skipped unless this is the "first" or "last" vhost_dev
from a large aggregate vhost-net device. That might be responsible for
part of the weirdness.

It doesn’t make sense to me that vDPA would provide no function to fully
reset a device, while vhost-user does.  Being able to reset a device sounds
vital to me.  This also gave me the impression that SET_STATUS(0) on vDPA at
least is functionally equivalent to a full device reset.


Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on
vhost-user.  That would be a real shame, so I assumed this would not be the
case; that SET_STATUS(0) does the same thing on both protocols.
Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user
it's currently only used by DPDK as a hint for when device
initialization is complete:
https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1
FWIW, now the code is a bit different.
https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054
has added a RESET interpretation for the status field, i.e. when it is 0.
It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0)
is a reset.
That patch adds diagnostics but does not perform any action for
SET_STATUS 0. DPDK's vhost_user_reset_owner() is still the only place
where the device is actually reset.

That’s what I said, it doesn’t do anything, but the diagnostics agree that it is a RESET.

QEMU cannot switch to just
SET_STATUS 0, it still needs to send RESET_DEVICE/RESET_OWNER.

That is what I questioned below: We currently *do not* call RESET_DEVICE/RESET_OWNER.  This patch is not about switching to SET_STATUS(0), it is about having RESET_DEVICE/RESET_OWNER be fallbacks for it.

The virtio specification says “Writing 0 into this field resets the device.”
about the device_status field.

This also makes sense, because the device_status field is basically used to
tell the device that a driver has taken control.  If reset, this indicates
the driver has given up control, and to me this is a point where a device
should fully reset itself.

So all in all, I can’t see the rationale why any implementation that
supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or
a superset of RESET_DEVICE.  I may be wrong, and this might explain a whole
deal about what kind of background operations we hope to stop with
SET_STATUS(0).
I would like vhost-user devices to implement SET_STATUS according to the
VIRTIO specification in the future and they can do that. But I think
front-ends should continue sending RESET_DEVICE in order to support old
devices.
Well, yes, exactly.  That is what I meant to address with this patch,
vhost-user right now does not send RESET_DEVICE in its vhost_reset_status
implementation, so the front-end will not fall back to RESET_DEVICE when it
apparently does intend to reset the device[1].  We do arguably have
vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is
no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi.

So this also begs the question why we even do have vhost_reset_status and
vhost_reset_device as two separate things. The commit introducing
vhost_reset_status (c3716f260bf) doesn’t say.  Maybe the intention was that
vhost_reset_device would leave the status at 0, while vhost_reset_status
would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit,
but that comes back to patch 5 in this series – we don’t need to have
ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need
vhost_reset_status to set those flags.  They should be set in
vhost_dev_start().

[1] This is assuming that SET_STATUS(0) is intended to reset the device, but
it sounds like you agree on that.
I don't know the answers, but I think it's safe to go ahead with a
SET_STATUS sequence that follows the VIRTIO spec, plus a
VHOST_USER_RESET_DEVICE/VHOST_USER_RESET_OWNER.

So what you’re saying is that RESET_DEVICE/RESET_OWNER should not be fallbacks, but be invoked in addition to SET_STATUS(0)?

If so, that would be silly.  I see your point that DPDK resets only in response to RESET_DEVICE/RESET_OWNER, but the diagnostics agree that SET_STATUS(0) is a reset, which is why I find this so silly. It sounds to me as if any properly behaving implementation would fully reset the back-end on SET_STATUS(0), so unconditionally invoking RESET_DEVICE/RESET_OWNER afterwards is just doing a double-reset.

Notably, invoking RESET_DEVICE/RESET_OWNER in addition to SET_STATUS(0) (instead of as a fallback) would be a change in behavior, because we do not call RESET_DEVICE/RESET_OWNER outside of vhost-user-scsi today.

Hanna


Reply via email to