On 13/09/2018 08:03, Fam Zheng wrote:
> On Wed, 09/12 14:42, Paolo Bonzini wrote:
>> On 12/09/2018 13:50, Fam Zheng wrote:
>>>> I think it's okay if it is invoked. The sequence is first you stop the
>>>> vq, then you drain the BlockBackends, then you switch AioContext. All
>>>> that matters is the outcome when virtio_scsi_dataplane_stop returns.
>>> Yes, but together with vIOMMU, it also effectively leads to a
>>> virtio_error(),
>>> which is not clean. QEMU stderr when this call happens (with patch 1 but not
>>> this patch):
>>>
>>> 2018-09-12T11:48:10.193023Z qemu-system-x86_64: vtd_iommu_translate:
>>> detected translation failure (dev=02:00:00, iova=0x0)
>>> 2018-09-12T11:48:10.193044Z qemu-system-x86_64: New fault is not recorded
>>> due to compression of faults
>>> 2018-09-12T11:48:10.193061Z qemu-system-x86_64: virtio: zero sized buffers
>>> are not allowed
>>
>> But with iothread, virtio_scsi_dataplane_stop runs in another thread
>> than the iothread; in that case you still have a race where the iothread
>> can process the vq before aio_disable_external and print the error.
>>
>> IIUC the guest has cleared the IOMMU page tables _before_ clearing the
>> DRIVER_OK bit in the status field. Could this be a guest bug?
>
> I'm not sure if it is a bug or not. I think what happens is the device is left
> enabled by Seabios, and then reset by kernel.

That makes sense, though I'm not sure why QEMU needs to process a
request long after SeaBIOS has handed control over to Linux. Maybe it's
just that the messages should not go to QEMU's stderr, and a tracepoint
would be enough instead.

Paolo