On Mon, Apr 21, 2025 at 06:00:27PM +0300, Israel Rukshin wrote: > On 4/21/2025 5:12 PM, Michael S. Tsirkin wrote: > > On Mon, Apr 21, 2025 at 03:36:21PM +0300, Israel Rukshin wrote: > > > On 4/21/2025 2:29 PM, Michael S. Tsirkin wrote: > > > > On Mon, Apr 21, 2025 at 02:05:27PM +0300, Israel Rukshin wrote: > > > > > On 4/20/2025 4:07 PM, Michael S. Tsirkin wrote: > > > > > > On Sun, Apr 20, 2025 at 03:31:25PM +0300, Max Gurtovoy wrote: > > > > > > > On 20/04/2025 14:33, Michael S. Tsirkin wrote: > > > > > > > > On Sun, Apr 20, 2025 at 02:19:05PM +0300, Israel Rukshin wrote: > > > > > > > > > Add a safety check to ensure that the length of data written > > > > > > > > > by the > > > > > > > > > device is at least as large as the status length. If this > > > > > > > > > condition is > > > > > > > > > not met, it indicates a potential error in the device's > > > > > > > > > response. > > > > > > > > > > > > > > > > > > This change aligns with the virtio specification, which > > > > > > > > > states: > > > > > > > > > "The driver MUST NOT make assumptions about data in > > > > > > > > > device-writable > > > > > > > > > buffers beyond the first len bytes, and SHOULD ignore this > > > > > > > > > data." > > > > > > > > > > > > > > > > > > By failing the admin command when len is insufficient, we > > > > > > > > > ensure > > > > > > > > > that the driver does not process potentially invalid or > > > > > > > > > incomplete > > > > > > > > > status from the device. > > > > > > > > > > > > > > > > > > Signed-off-by: Israel Rukshin <isra...@nvidia.com> > > > > > > > > > Reviewed-by: Max Gurtovoy <mgurto...@nvidia.com> > > > > > > > > > --- > > > > > > > > > drivers/virtio/virtio_pci_modern.c | 7 +++++-- > > > > > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_pci_modern.c > > > > > > > > > b/drivers/virtio/virtio_pci_modern.c > > > > > > > > > index 7209390a5cbf..0374e86aece3 100644 > > > > > > > > > --- a/drivers/virtio/virtio_pci_modern.c > > > > > > > > > +++ b/drivers/virtio/virtio_pci_modern.c > > > > > > > > > @@ -48,6 +48,7 @@ void vp_modern_avq_done(struct virtqueue > > > > > > > > > *vq) > > > > > > > > > { > > > > > > > > > struct virtio_pci_device *vp_dev = > > > > > > > > > to_vp_device(vq->vdev); > > > > > > > > > struct virtio_pci_admin_vq *admin_vq = > > > > > > > > > &vp_dev->admin_vq; > > > > > > > > > + unsigned int status_size = sizeof(struct > > > > > > > > > virtio_admin_cmd_status); > > > > > > > > > struct virtio_admin_cmd *cmd; > > > > > > > > > unsigned long flags; > > > > > > > > > unsigned int len; > > > > > > > > > @@ -56,8 +57,10 @@ void vp_modern_avq_done(struct virtqueue > > > > > > > > > *vq) > > > > > > > > > do { > > > > > > > > > virtqueue_disable_cb(vq); > > > > > > > > > while ((cmd = virtqueue_get_buf(vq, &len))) { > > > > > > > > > - cmd->result_sg_size = > > > > > > > > > - len - sizeof(struct > > > > > > > > > virtio_admin_cmd_status); > > > > > > > > > + if (len < status_size) > > > > > > > > > + cmd->ret = -EIO; > > > > > > > > > + else > > > > > > > > > + cmd->result_sg_size = len - > > > > > > > > > status_size; > > > > > > > > > complete(&cmd->completion); > > > > > > > > > } > > > > > > > > No this is out of spec: > > > > > > > > Similarly, the driver compares the used buffer length > > > > > > > > of the buffer to what it expects and then silently > > > > > > > > truncates the structure to the used buffer length. > > > > > > > > The driver silently ignores any data falling outside > > > > > > > > the used buffer length reported by the device. Any > > > > > > > > missing > > > > > > > > fields are interpreted as set to zero. > > > > > > > > > > > > > > > > so if status is 0, device can just write a shorter buffer and > > > > > > > > driver must assume 0, which is success, not failure. > > > > > > > maybe we can make a sanity check that the len is at least 4B > > > > > > > (status and > > > > > > > status_qualifier) which is basic size to make sure device is > > > > > > > acting as > > > > > > > expected ? > > > > > > > > > > > > > The spec says nothing about this. Just zero-initialize the buffer > > > > > > beyond len and be done with it. > > > > > What do you think on the bellow change for v2? > > > > > > > > > > @@ -56,7 +57,13 @@ void vp_modern_avq_done(struct virtqueue *vq) > > > > > do { > > > > > virtqueue_disable_cb(vq); > > > > > while ((cmd = virtqueue_get_buf(vq, &len))) { > > > > > - cmd->result_sg_size = len; > > > > > + /* If the length written is less than the > > > > > status > > > > > size, > > > > > + * the remaining status bytes stay > > > > > zero-initialized. > > > > > + */ > > > > > + if (len < status_size) > > > > > + cmd->result_sg_size = 0; > > > > > + else > > > > > + cmd->result_sg_size = len - > > > > > status_size; > > > > > complete(&cmd->completion); > > > > > } > > > > > } while (!virtqueue_enable_cb(vq)); > > > > Can you please explain what exactly is this change doing? > > > > The changelog claims all that is done is adding a check, > > > > but that's not true: previously we reported the whole length. Now you > > > > want to report > > > > length less status_size. This is a userspace-visible change, is it not? > > > > > > > Sorry for the confusion. > > > > > > This change is essentially PATCH 1/2 from this series with the addition of > > > handling the case when len is smaller than status_size to avoid getting a > > > negative value. > > > According to your comments, I will remove PATCH 2/2 and post only PATCH > > > 1/2 > > > with this change. > > > You are correct, this is a userspace-visible change, and the size > > > currently > > > reported in virtio_pci_admin_dev_parts_get() is incorrect. > > > > Sounds good. My only question is, how bad is the current condition? > The migration fails if the state size is exactly 4KB, because the allocated > buffer (4KB) is smaller than the amount the user is trying to read (4KB + 8 > bytes). > For other state sizes, there are an extra 8 bytes of data at the end of the > state. > According to the specification, the device at the destination must fail the > migration because this is not a valid state. > > is there a chance userspace will try to work around it? > I am not aware of.
My concern is userspace might try to work around this bug.. Do you want to add a module attribute to tell userspace it's fixed? > > > > > > > > > > > > > } while (!virtqueue_enable_cb(vq)); > > > > > > > > > -- > > > > > > > > > 2.34.1