On Mon, May 27, 2024 at 7:27 PM Halil Pasic <pa...@linux.ibm.com> wrote: > > On Thu, 16 May 2024 10:39:42 +0200 > Stefano Garzarella <sgarz...@redhat.com> wrote: > > [..] > > >--- > > > > > >This is a minimal fix, that follows the current patterns in the > > >codebase, and not necessarily the best one. > > > > Yeah, I did something similar with commit 562a7d23bf ("vhost: mask > > VIRTIO_F_RING_RESET for vhost and vhost-user devices") so I think for > > now is the right approach. > > > > I suggest to check also other devices like we did in that commit (e.g. > > hw/scsi/vhost-scsi.c, hw/scsi/vhost-user-scsi.c, etc. ) > > Hi Stefano! > > Thank you for chiming in, and sorry for the late response. I was hoping > that Michael is going to chime in and that I can base my reply on his > take. Anyway here I go. > > A very valid observation! I do agree that we need this for > basically every vhost device, and since: > * net/vhost-vdpa.c > * hw/net/vhost_net.c > * hw/virtio/vhost-user-fs.c > already have it, that translates to shotgun it to the rest. Which > isn't nice in my opinion, which is why I am hoping for a discussion > on this topic, and a better solution (even if it turns out to be > something like a common macro). > [..] > > > > > >The documentation however does kind of state, that feature_bits is > > >supposed to contain the supported features. And under the assumption > > >that feature bit not in feature_bits implies that the corresponding bit > > >must not be set in the 3rd argument (features), then even with the > > >current implementation we do end up with the intersection of the three > > >as stated. And then vsock would be at fault for violating that > > >assumption, and my fix would be the best thing to do -- I guess. > > > > > >Is the implementation the way it is for a good reason, I can't judge > > >that with certainty for myself. > > > > Yes, I think we should fix the documentation, and after a few years of > > not looking at it I'm confused again about what it does. > > > > I would prefer to fix the algorithm and make whole thing less fragile. > > > But re-reading my commit for VIRTIO_F_RING_RESET, it seems that I had > > interpreted `feature_bits` (2nd argument) as a list of features that > > QEMU doesn't know how to emulate and therefore are required by the > > backend (vhost/vhost-user/vdpa). Because the problem is that `features` > > (3rd argument) is a set of features required by the driver that can be > > provided by both QEMU and the backend. > > Hm. I would say, this does sound like the sanest explanation, that might > justify the current code, but I will argue that for me, it isn't sane > enough. > > Here comes my argument. > > 1) The uses is explicitly asking for a vhost device and giving the user > a non vhost device is not an option. > > 2) The whole purpose of vhost is that at least the data plane is > implemented outside of QEMU (I am maybe a little sloppy here with > dataplane). That means a rather substantial portion of the device > implementation is not in QEMU, while QEMU remains in charge of the > setup. > > 3) Thus I would argue, that all the "transport feature bits" from 24 to > 40 should have a corresponding vhost feature because the vhost part needs > some sort of a support. > > What do we have there in bits from 24 to 40 according to the spec? > * VIRTIO_F_INDIRECT_DESC > * VIRTIO_F_EVENT_IDX > * VIRTIO_F_VERSION_1 > * VIRTIO_F_ACCESS_PLATFORM > * VIRTIO_F_RING_PACKED > * VIRTIO_F_IN_ORDER > * VIRTIO_F_ORDER_PLATFORM > * VIRTIO_F_SR_IOV > * VIRTIO_F_NOTIFICATION_DATA > * VIRTIO_F_NOTIF_CONFIG_DATA > * VIRTIO_F_RING_RESET > and for transitional: > * VIRTIO_F_NOTIFY_ON_EMPTY > * VIRTIO_F_ANY_LAYOUT > * UNUSED > > I would say, form these only VIRTIO_F_SR_IOV and > VIRTIO_F_NOTIF_CONFIG_DATA look iffy in a sense things may work out > for vhost devices without the vhost part doing something for it. And > even there, I don't think it would hurt to make vhost part of the > negotiation (I don't think those are supported by QEMU at this point). > > I would very much prefer having a consolidated and safe handling for > these. > > 4) I would also argue that a bunch of the device specific feature bits > should have vhost feature bits as well for the same reason: > features are also such that for a vhost device, the vhost part needs > some sort of a support. > > Looking through all of these would require a lot of time, so instead > of that, let me use SCSI as an example. The features are: > * VIRTIO_SCSI_F_INOUT > * VIRTIO_SCSI_F_HOTPLUG > * VIRTIO_SCSI_F_CHANGE > * VIRTIO_SCSI_F_T10_PI > > The in the Linux kernel we have > VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL << > VIRTIO_SCSI_F_HOTPLUG) | > (1ULL << VIRTIO_SCSI_F_T10_PI) > but in QEMU kernel_feature_bits does not have > VIRTIO_SCSI_F_T10_PI which together does not make much sense to me. And I > would > also expect VIRTIO_SCSI_F_INOUT to be a part of the negotiation, because > to me that the side that is processing the queue is the one that should > care about it, but I don't think it is supported by QEMU at all, and > then not negotiating it is fine. VIRTIO_SCSI_F_HOTPLUG is in QEMU's > kernel_feature_bits and thus negotiated. in So the only one that can be > done purely in QEMU seems to be VIRTIO_SCSI_F_CHANGE. > > And for vsock we have > VIRTIO_VSOCK_F_SEQPACKET and VIRTIO_VSOCK_F_STREAM, with VIRTIO_VSOCK_F_STREAM > being strange in a sense that the spec says "If no feature bit is set, > only stream socket type is supported. If VIRTIO_VSOCK_F_SEQPACKET has > been negotiated, the device MAY act as if VIRTIO_VSOCK_F_STREAM has also > been negotiated." > > VIRTIO_VSOCK_F_SEQPACKET is negotiated with vhost. > > It seems for whatever reason we just assume that the vhost device > supports VIRTIO_VSOCK_F_STREAM and that is why we don't negotiate it. I > would assume based on what I see that the feature VIRTIO_VSOCK_F_STREAM > was retrofitted, and that is what we ended up with. Thus I guess now > we have an implicit rule that any QEMU compatible vhost-vsock > implementation would have to support that. But in practice we care only > about Linux, and thus it does not matter. > > 5) Based on the following, I would very much prefer a per device list of > features with the semantic "hey QEMU can do that feature without any > specialized vhost-device support (e.g. like VIRTIO_SCSI_F_CHANGE)"
Indeed the current code is kind of tricky and may need better documentation. But the problem is some features were datapath related and they are supported since the birth of a specific vhost device. For example, some GSO related features (actually, it's not a feature of vhost-net but TUN/TAP). And I've found another interesting thing, for RING_REST, actually we don't need to do anything but we have the following commits: 313389be06ff6 ("vhost-net: support VIRTIO_F_RING_RESET") 2a3552baafb ("vhost: vhost-kernel: enable vq reset feature") Technically, they are not necessary as RING_RESET for vhost-kernel doesn't require any additional new ioctls. But it's too late to change as the kernel commit has been merged. > over > the current list with the semantics "these are the features that > need vhost backend support, and anything else will just work out". That > way if an omission is made, we would end up with the usually safer > under-indication instead of the current over-indication. > > > @Michael, Jason: Could you guys chime in? Another issue is that it seems to require a change of the semantic of VHOST_GET_FEATURES. If my understanding is true, it seems a non-trivial change which I'm not sure it's worth to bother. Thanks