On Thu, Nov 28, 2019 at 05:48:00PM +0100, Halil Pasic wrote:
> On Tue, 19 Nov 2019 18:50:03 -0600
> Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
>
> [..]
>
> > I.e. the calling code is only scheduling a one-shot BH for
> > virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
> > an additional virtqueue entry before we get there. This is likely due
> > to the following check in virtio_queue_host_notifier_aio_poll:
> >
> >   static bool virtio_queue_host_notifier_aio_poll(void *opaque)
> >   {
> >       EventNotifier *n = opaque;
> >       VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
> >       bool progress;
> >
> >       if (!vq->vring.desc || virtio_queue_empty(vq)) {
> >           return false;
> >       }
> >
> >       progress = virtio_queue_notify_aio_vq(vq);
> >
> > namely the call to virtio_queue_empty(). In this case, since no new
> > requests have actually been issued, shadow_avail_idx == last_avail_idx,
> > so we actually try to access the vring via vring_avail_idx() to get
> > the latest non-shadowed idx:
> >
> >   int virtio_queue_empty(VirtQueue *vq)
> >   {
> >       bool empty;
> >       ...
> >
> >       if (vq->shadow_avail_idx != vq->last_avail_idx) {
> >           return 0;
> >       }
> >
> >       rcu_read_lock();
> >       empty = vring_avail_idx(vq) == vq->last_avail_idx;
> >       rcu_read_unlock();
> >       return empty;
> >
> > but since the IOMMU region has been disabled we get a bogus value (0
> > usually), which causes virtio_queue_empty() to falsely report that
> > there are entries to be processed, which causes errors such as:
> >
> >   "virtio: zero sized buffers are not allowed"
> >
> > or
> >
> >   "virtio-blk missing headers"
> >
> > and puts the device in an error state.
>
> I've seen something similar on s390x with virtio-ccw-blk under
> protected virtualization, which made me wonder how virtio-blk in
> particular, and virtio in general, handle shutdown and reset.
>
> This makes me wonder whether bus-mastering being disabled is the only
> scenario in which something like vdev->disabled should be used.
>
> In particular, I have the following mechanism in mind:
>
>   qemu_system_reset() --> ... --> qemu_devices_reset() --> ... -->
>   --> virtio_[transport]_reset() --> ... --> virtio_bus_stop_ioeventfd()
>   --> virtio_blk_data_plane_stop()
>
> which in turn triggers the following cascade:
>
>   virtio_blk_data_plane_stop_bh -->
>   virtio_queue_aio_set_host_notifier_handler() -->
>   --> virtio_queue_host_notifier_aio_read()
>
> which however calls virtio_queue_notify_aio_vq() if the notifier tests
> as positive.
>
> Since we still have vq->handle_aio_output, that means we may call
> virtqueue_pop() during the reset procedure.
>
> This was a problem for us because (due to a bug) the shared pages that
> constitute the virtio ring weren't shared any more, and thus we got the
> infamous
>
>   virtio_error(vdev, "virtio: zero sized buffers are not allowed").
>
> Now the bug is no more, and we can tolerate that somewhat late access
> to the virtio ring.
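
For reference, the handler at the end of that cascade looks roughly like
the following (paraphrased from hw/virtio/virtio.c as of around this
time; the comments are mine):

    static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
    {
        VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

        /* "tests as positive" above: a guest kick is still pending */
        if (event_notifier_test_and_clear(n)) {
            /* runs vq->handle_aio_output, which may virtqueue_pop() */
            virtio_queue_notify_aio_vq(vq);
        }
    }

So a kick that is still pending while the data plane is being torn down
gets processed as if it were a fresh request.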

> But it keeps nagging me: is it really OK for the device to access the
> virtio ring during reset? My intuition tells me that the device should
> not look for new requests after it has been told to reset.

Well, it's after it was told to reset, but it's not after it completed
reset. So I think it's fine ...

> Opinions? (Michael, Connie)
>
> Regards,
> Halil
>
> > This patch works around the issue by introducing virtio_set_disabled(),
> > which sets a 'disabled' flag to bypass checks like virtio_queue_empty()
> > when bus-mastering is disabled. Since we'd check this flag at all the
> > same sites as vdev->broken, we replace those checks with an inline
> > function which checks for either vdev->broken or vdev->disabled.
> >
> > The 'disabled' flag is only migrated when set, which should be fairly
> > rare, but to maintain migration compatibility we disable its use for
> > older machine types. Users requiring the use of the flag in conjunction
> > with older machine types can set it explicitly as a virtio-device
> > option.
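
To make the idea concrete, the description suggests check sites of
roughly the following shape (only a sketch of what the text above
implies -- the name of the inline helper is my guess, and
virtio_queue_empty() is just one example site):

    /* one predicate covering both "broken" and "bus-mastering disabled" */
    static inline bool virtio_device_disabled(VirtIODevice *vdev)
    {
        return unlikely(vdev->disabled || vdev->broken);
    }

    /* called by the transport, e.g. when bus-mastering is toggled */
    void virtio_set_disabled(VirtIODevice *vdev, bool disable)
    {
        vdev->disabled = disable;
    }

    int virtio_queue_empty(VirtQueue *vq)
    {
        if (virtio_device_disabled(vq->vdev)) {
            return 1;   /* report empty rather than touch the vring */
        }
        ...
    }

With that in place, a device whose bus-mastering has been disabled
reports its queues as empty instead of reading a bogus avail idx through
the now-disabled IOMMU region.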