> From: Kunkun Jiang <jiangkun...@huawei.com>
> Sent: Friday, September 24, 2021 2:19 PM
>
> Hi all,
>
> I encountered a problem in a vfio device migration test. The
> vCPU may be paused during vfio-pci DMA in iommu nested
> stage mode && vSVA. This may lead to migration failure and
> other problems related to device hardware and driver
> implementation.
>
> It may be a bit early to discuss this issue; after all, the iommu
> nested stage mode and vSVA are not yet mature. But judging
> from the current implementation, we will definitely encounter
> this problem in the future.

Yes, this is a known limitation in supporting migration with vSVA.

>
> This is the current process of vSVA handling a translation fault
> in iommu nested stage mode (take SMMU as an example):
>
> guest os       4.handle translation fault     5.send CMD_RESUME to vSMMU
>
> qemu           3.inject fault into guest os   6.deliver response to host os
> (vfio/vsmmu)
>
> host os        2.notify the qemu              7.send CMD_RESUME to SMMU
> (vfio/smmu)
>
> SMMU           1.address translation fault    8.retry or terminate
>
> The order is 1--->8.
>
> Currently, qemu may pause vCPU at any step. It is possible to
> pause vCPU at step 1-5, that is, in a DMA. This may lead to
> migration failure and other problems related to device hardware
> and driver implementation. For example, the device status
> cannot be changed from RUNNING && SAVING to SAVING,
> because the device DMA is not over.
>
> As far as I can see, vCPU should not be paused during a device
> IO process, such as DMA. However, currently live migration
> does not pay attention to the state of vfio device when pausing
> the vCPU. And if the vCPU is not paused, the vfio device is
> always running. This looks like a *deadlock*.

Basically this requires:

1) stopping vCPU after stopping device (could selectively enable
   this sequence for vSVA);

2) when stopping device, the driver should block new requests
   from vCPU (queued to a pending list) and then drain all
   in-flight requests including faults;
   * blocking new requests further requires switching from the
     fast-path to the slow trap-emulation path for the cmd portal
     before stopping the device;

3) save the pending requests in the vm image and replay them
   after the vm is resumed;
   * finally disable blocking by switching back to the fast-path
     for the cmd portal;

>
> Do you have any ideas to solve this problem?
> Looking forward to your reply.
>

We verified that the above flow works in our internal POC.

Thanks
Kevin
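
For illustration only, the ordering in 1)-3) above can be sketched as a toy model in Python. This is not QEMU, VFIO, or SMMU driver code: the class ToyDevice and all of its methods are hypothetical names made up for this sketch, and "draining" is modelled as instant completion. The only thing it tries to show is why the device has to be stopped while the vCPUs are still running.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyDevice:
    """Hypothetical model of an assigned device's cmd portal (not a real API)."""
    fast_path: bool = True                             # guest writes go straight to hardware
    in_flight: list = field(default_factory=list)      # requests (incl. faults) being processed
    pending: deque = field(default_factory=deque)      # requests trapped while stopping
    completed: list = field(default_factory=list)

    def submit(self, req):
        """A vCPU submits a request through the cmd portal."""
        if self.fast_path:
            self.in_flight.append(req)
        else:
            self.pending.append(req)    # trapped and queued, not issued to hardware

    def block_new_requests(self):
        """Step 2 (*): switch the cmd portal to the slow trap-emulation path."""
        self.fast_path = False

    def drain(self, vcpus_running):
        """Step 2: wait for in-flight requests; faults need the guest to respond."""
        if self.in_flight and not vcpus_running:
            raise RuntimeError("cannot drain: fault handling needs a running vCPU")
        self.completed.extend(self.in_flight)
        self.in_flight.clear()

    def save_pending(self):
        """Step 3: the pending list becomes part of the saved vm image."""
        return list(self.pending)

    def replay(self, saved):
        """Step 3 (*): reissue saved requests, then switch back to the fast-path."""
        self.in_flight.extend(saved)
        self.fast_path = True


# Source side: stop the device first, pause the vCPUs afterwards (step 1).
src = ToyDevice()
src.submit("dma-1")
src.submit("dma-2")
src.block_new_requests()
src.submit("dma-3")                 # arrives while stopping, so it is queued
src.drain(vcpus_running=True)       # vCPUs still run and can resolve faults
saved = src.save_pending()
# ... only now would the vCPUs be paused and the vm image transferred ...

# Destination side: replay what was queued, then resume normal operation.
dst = ToyDevice(fast_path=False)
dst.replay(saved)
```

Calling drain(vcpus_running=False) with requests still in flight raises an error in this model, which corresponds to the deadlock described in the quoted mail.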