On Thu, 16 May 2024 10:39:42 +0200 Stefano Garzarella <sgarz...@redhat.com> wrote:
[..] > >--- > > > >This is a minimal fix, that follows the current patterns in the > >codebase, and not necessarily the best one. > > Yeah, I did something similar with commit 562a7d23bf ("vhost: mask > VIRTIO_F_RING_RESET for vhost and vhost-user devices") so I think for > now is the right approach. > > I suggest to check also other devices like we did in that commit (e.g. > hw/scsi/vhost-scsi.c, hw/scsi/vhost-user-scsi.c, etc. ) Hi Stefano! Thank you for chiming in, and sorry for the late response. I was hoping that Michael is going to chime in and that I can base my reply on his take. Anyway here I go. A very valid observation! I do agree that we need this for basically every vhost device, and since: * net/vhost-vdpa.c * hw/net/vhost_net.c * hw/virtio/vhost-user-fs.c already have it, that translates to shotgun it to the rest. Which isn't nice in my opinion, which is why I am hoping for a discussion on this topic, and a better solution (even if it turns out to be something like a common macro). [..] > > > >The documentation however does kind of state, that feature_bits is > >supposed to contain the supported features. And under the assumption > >that feature bit not in feature_bits implies that the corresponding bit > >must not be set in the 3rd argument (features), then even with the > >current implementation we do end up with the intersection of the three > >as stated. And then vsock would be at fault for violating that > >assumption, and my fix would be the best thing to do -- I guess. > > > >Is the implementation the way it is for a good reason, I can't judge > >that with certainty for myself. > > Yes, I think we should fix the documentation, and after a few years of > not looking at it I'm confused again about what it does. > I would prefer to fix the algorithm and make whole thing less fragile. > But re-reading my commit for VIRTIO_F_RING_RESET, it seems that I had > interpreted `feature_bits` (2nd argument) as a list of features that > QEMU doesn't know how to emulate and therefore are required by the > backend (vhost/vhost-user/vdpa). Because the problem is that `features` > (3rd argument) is a set of features required by the driver that can be > provided by both QEMU and the backend. Hm. I would say, this does sound like the sanest explanation, that might justify the current code, but I will argue that for me, it isn't sane enough. Here comes my argument. 1) The uses is explicitly asking for a vhost device and giving the user a non vhost device is not an option. 2) The whole purpose of vhost is that at least the data plane is implemented outside of QEMU (I am maybe a little sloppy here with dataplane). That means a rather substantial portion of the device implementation is not in QEMU, while QEMU remains in charge of the setup. 3) Thus I would argue, that all the "transport feature bits" from 24 to 40 should have a corresponding vhost feature because the vhost part needs some sort of a support. What do we have there in bits from 24 to 40 according to the spec? * VIRTIO_F_INDIRECT_DESC * VIRTIO_F_EVENT_IDX * VIRTIO_F_VERSION_1 * VIRTIO_F_ACCESS_PLATFORM * VIRTIO_F_RING_PACKED * VIRTIO_F_IN_ORDER * VIRTIO_F_ORDER_PLATFORM * VIRTIO_F_SR_IOV * VIRTIO_F_NOTIFICATION_DATA * VIRTIO_F_NOTIF_CONFIG_DATA * VIRTIO_F_RING_RESET and for transitional: * VIRTIO_F_NOTIFY_ON_EMPTY * VIRTIO_F_ANY_LAYOUT * UNUSED I would say, form these only VIRTIO_F_SR_IOV and VIRTIO_F_NOTIF_CONFIG_DATA look iffy in a sense things may work out for vhost devices without the vhost part doing something for it. And even there, I don't think it would hurt to make vhost part of the negotiation (I don't think those are supported by QEMU at this point). I would very much prefer having a consolidated and safe handling for these. 4) I would also argue that a bunch of the device specific feature bits should have vhost feature bits as well for the same reason: features are also such that for a vhost device, the vhost part needs some sort of a support. Looking through all of these would require a lot of time, so instead of that, let me use SCSI as an example. The features are: * VIRTIO_SCSI_F_INOUT * VIRTIO_SCSI_F_HOTPLUG * VIRTIO_SCSI_F_CHANGE * VIRTIO_SCSI_F_T10_PI The in the Linux kernel we have VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL << VIRTIO_SCSI_F_HOTPLUG) | (1ULL << VIRTIO_SCSI_F_T10_PI) but in QEMU kernel_feature_bits does not have VIRTIO_SCSI_F_T10_PI which together does not make much sense to me. And I would also expect VIRTIO_SCSI_F_INOUT to be a part of the negotiation, because to me that the side that is processing the queue is the one that should care about it, but I don't think it is supported by QEMU at all, and then not negotiating it is fine. VIRTIO_SCSI_F_HOTPLUG is in QEMU's kernel_feature_bits and thus negotiated. in So the only one that can be done purely in QEMU seems to be VIRTIO_SCSI_F_CHANGE. And for vsock we have VIRTIO_VSOCK_F_SEQPACKET and VIRTIO_VSOCK_F_STREAM, with VIRTIO_VSOCK_F_STREAM being strange in a sense that the spec says "If no feature bit is set, only stream socket type is supported. If VIRTIO_VSOCK_F_SEQPACKET has been negotiated, the device MAY act as if VIRTIO_VSOCK_F_STREAM has also been negotiated." VIRTIO_VSOCK_F_SEQPACKET is negotiated with vhost. It seems for whatever reason we just assume that the vhost device supports VIRTIO_VSOCK_F_STREAM and that is why we don't negotiate it. I would assume based on what I see that the feature VIRTIO_VSOCK_F_STREAM was retrofitted, and that is what we ended up with. Thus I guess now we have an implicit rule that any QEMU compatible vhost-vsock implementation would have to support that. But in practice we care only about Linux, and thus it does not matter. 5) Based on the following, I would very much prefer a per device list of features with the semantic "hey QEMU can do that feature without any specialized vhost-device support (e.g. like VIRTIO_SCSI_F_CHANGE)" over the current list with the semantics "these are the features that need vhost backend support, and anything else will just work out". That way if an omission is made, we would end up with the usually safer under-indication instead of the current over-indication. @Michael, Jason: Could you guys chime in? > > > > >But I'm pretty convinced that the current approach is fragile, > >especially for the feature bits form the range 24 to 40, as those are > >not specific to a device. > > > >BTW vsock also lacks VIRTIO_F_ACCESS_PLATFORM, and VIRTIO_F_RING_RESET > >as well while vhost-net has both. > > VIRTIO_F_RING_RESET is just above the line added by this patch. > My bad! :) > > > >If our design is indeed to make the individual devices responsible for > >having a complete list of possible features in feature_bits, then at > >least having a common macro for the non-device specific features would > >make sense to me. > > Yeah, I agree on this! I guess, "possible features" shifted towards "possible features that QEMU can not provide without the vhost backend". But some sort of a common list does seem viable. I would however prefer a can do without list, or maybe even two ("can do without because I do it myself and alone" and "can do without, because although I must rely on a vhost capability to provide that feature, the presence of that capability is not indicated via the feature bits and based on knowing how things are I'm assuming the capability is there although not specifically indicated". > > > > >On the other hand, I'm also very happy to send a patch which changes the > >behavior of vhost_get_features(), should the community decide that the > >current behavior does not make all that much sense -- I lean towards: > >probably it does not make much sense, but things like > >VIRTIO_F_ACCESS_PLATFORM, which are mandatory feature bits, need careful > >consideration, because there vhost can't do so we just won't offer it > >and proceed on our merry way is not the right behavior. > > > >Please comment! > > Maybe we should discuss it in another thread, but I agree that we should > fix it in someway. Thank you for raising this discussion! > Hm, I ended up replying in this thread because we still don't have closure on this patch. But if you think it makes, feel free to start a different thread, and please do include me. > > > >Regards, > >Halil > >--- > > hw/virtio/vhost-vsock-common.c | 1 + > > 1 file changed, 1 insertion(+) > > This patch LGTM, but as I mention we should fix other devices as well, > but maybe we can do with the common macro you suggested in another > patch. I agree, this should be definitely another patch or even series depending on how the discussion pans out. Regards, Halil