On Mon, May 27, 2024 at 7:27 PM Halil Pasic <pa...@linux.ibm.com> wrote:
>
> On Thu, 16 May 2024 10:39:42 +0200
> Stefano Garzarella <sgarz...@redhat.com> wrote:
>
> [..]
> > >---
> > >
> > >This is a minimal fix, that follows the current patterns in the
> > >codebase, and not necessarily the best one.
> >
> > Yeah, I did something similar with commit 562a7d23bf ("vhost: mask
> > VIRTIO_F_RING_RESET for vhost and vhost-user devices") so I think for
> > now is the right approach.
> >
> > I suggest to check also other devices like we did in that commit (e.g.
> > hw/scsi/vhost-scsi.c, hw/scsi/vhost-user-scsi.c, etc. )
>
> Hi Stefano!
>
> Thank you for chiming in, and sorry for the late response. I was hoping
> that Michael is going to chime in and that I can base my reply on his
> take. Anyway here I  go.
>
> A very valid observation! I do agree that we need this for
> basically every vhost device, and since:
> * net/vhost-vdpa.c
> * hw/net/vhost_net.c
> * hw/virtio/vhost-user-fs.c
> already have it, that translates to shotgun it to the rest. Which
> isn't nice in my opinion, which is why I am hoping for a discussion
> on this topic, and a better solution (even if it turns out to be
> something like a common macro).
> [..]
> > >
> > >The documentation however does kind of state, that feature_bits is
> > >supposed to contain the supported features. And under the assumption
> > >that feature bit not in feature_bits implies that the corresponding bit
> > >must not be set in the 3rd argument (features), then even with the
> > >current implementation we do end up with the intersection of the three
> > >as stated. And then vsock would be at fault for violating that
> > >assumption, and my fix would be the best thing to do -- I guess.
> > >
> > >Is the implementation the way it is for a good reason, I can't judge
> > >that with certainty for myself.
> >
> > Yes, I think we should fix the documentation, and after a few years of
> > not looking at it I'm confused again about what it does.
> >
>
> I would prefer to fix the algorithm and make whole thing less fragile.
>
> > But re-reading my commit for VIRTIO_F_RING_RESET, it seems that I had
> > interpreted `feature_bits` (2nd argument) as a list of features that
> > QEMU doesn't know how to emulate and therefore are required by the
> > backend (vhost/vhost-user/vdpa). Because the problem is that `features`
> > (3rd argument) is a set of features required by the driver that can be
> > provided by both QEMU and the backend.
>
> Hm. I would say, this does sound like the sanest explanation, that might
> justify the current code, but I will argue that for me, it isn't sane
> enough.
>
> Here comes my argument.
>
> 1) The uses is explicitly asking for a vhost device and giving the user
> a non vhost device is not an option.
>
> 2) The whole purpose of vhost is that at least the data plane is
> implemented outside of QEMU (I am maybe a little sloppy here with
> dataplane). That means a rather substantial portion of the device
> implementation is not in QEMU, while QEMU remains in charge of the
> setup.
>
> 3) Thus I would argue, that all the "transport feature bits" from 24 to
> 40 should have a corresponding vhost feature because the vhost part needs
> some sort of a support.
>
> What do we have there in bits from 24 to 40 according to the spec?
> * VIRTIO_F_INDIRECT_DESC
> * VIRTIO_F_EVENT_IDX
> * VIRTIO_F_VERSION_1
> * VIRTIO_F_ACCESS_PLATFORM
> * VIRTIO_F_RING_PACKED
> * VIRTIO_F_IN_ORDER
> * VIRTIO_F_ORDER_PLATFORM
> * VIRTIO_F_SR_IOV
> * VIRTIO_F_NOTIFICATION_DATA
> * VIRTIO_F_NOTIF_CONFIG_DATA
> * VIRTIO_F_RING_RESET
> and for transitional:
> * VIRTIO_F_NOTIFY_ON_EMPTY
> * VIRTIO_F_ANY_LAYOUT
> * UNUSED
>
> I would say, form these only VIRTIO_F_SR_IOV and
> VIRTIO_F_NOTIF_CONFIG_DATA look iffy in a sense things may work out
> for vhost devices without the vhost part doing something for it. And
> even there, I don't think it would hurt to make vhost part of the
> negotiation (I don't think those are supported by QEMU at this point).
>
> I would very much prefer having a consolidated and safe handling for
> these.
>
> 4) I would also argue that a bunch of the device specific feature bits
> should have vhost feature bits as well for the same reason:
> features are also such that for a vhost device, the vhost part needs
> some sort of a support.
>
> Looking through all of these would require a lot of time, so instead
> of that, let me use SCSI as an example. The features are:
> * VIRTIO_SCSI_F_INOUT
> * VIRTIO_SCSI_F_HOTPLUG
> * VIRTIO_SCSI_F_CHANGE
> * VIRTIO_SCSI_F_T10_PI
>
> The in the Linux kernel we have
>         VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL << 
> VIRTIO_SCSI_F_HOTPLUG) |
>                                                (1ULL << VIRTIO_SCSI_F_T10_PI)
> but in QEMU kernel_feature_bits does not have
> VIRTIO_SCSI_F_T10_PI which together does not make much sense to me. And I 
> would
> also expect VIRTIO_SCSI_F_INOUT to be a part of the negotiation, because
> to me that the side that is processing the queue is the one that should
> care about it, but I don't think it is supported by QEMU at all, and
> then not negotiating it is fine. VIRTIO_SCSI_F_HOTPLUG is in QEMU's
> kernel_feature_bits and thus negotiated. in  So the only one that can be
> done purely in QEMU seems to be VIRTIO_SCSI_F_CHANGE.
>
> And for vsock we have
> VIRTIO_VSOCK_F_SEQPACKET and VIRTIO_VSOCK_F_STREAM, with VIRTIO_VSOCK_F_STREAM
> being strange in a sense that the spec says "If no feature bit is set,
> only stream socket type is supported. If VIRTIO_VSOCK_F_SEQPACKET has
> been negotiated, the device MAY act as if VIRTIO_VSOCK_F_STREAM has also
> been negotiated."
>
> VIRTIO_VSOCK_F_SEQPACKET is negotiated with vhost.
>
> It seems for whatever reason we just assume that the vhost device
> supports VIRTIO_VSOCK_F_STREAM and that is why we don't negotiate it. I
> would assume based on what I see that the feature VIRTIO_VSOCK_F_STREAM
> was retrofitted, and that is what we ended up with. Thus I guess now
> we have an implicit rule that any QEMU compatible vhost-vsock
> implementation would have to support that. But in practice we care only
> about Linux, and thus it does not matter.
>
> 5) Based on the following, I would very much prefer a per device list of
> features with the semantic "hey QEMU can do that feature without any
> specialized vhost-device support (e.g. like VIRTIO_SCSI_F_CHANGE)"

Indeed the current code is kind of tricky and may need better
documentation. But the problem is some features were datapath related
and they are supported since the birth of a specific vhost device. For
example, some GSO related features (actually, it's not a feature of
vhost-net but TUN/TAP).

And I've found another interesting thing, for RING_REST, actually we
don't need to do anything but we have the following commits:

313389be06ff6 ("vhost-net: support VIRTIO_F_RING_RESET")
2a3552baafb ("vhost: vhost-kernel: enable vq reset feature")

Technically, they are not necessary as RING_RESET for vhost-kernel
doesn't require any additional new ioctls. But it's too late to change
as the kernel commit has been merged.

> over
> the current list with the semantics "these are the features that
> need vhost backend support, and anything else will just work out". That
> way if an omission is made, we would end up with the usually safer
> under-indication  instead of the current over-indication.
>
>
> @Michael, Jason: Could you guys chime in?

Another issue is that it seems to require a change of the semantic of
VHOST_GET_FEATURES. If my understanding is true, it seems a
non-trivial change which I'm not sure it's worth to bother.

Thanks


Reply via email to