> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Monday, September 16, 2019 4:33 PM
>
> On 2019/9/16 9:51 AM, Tian, Kevin wrote:
> > Hi, Jason
> >
> > We had a discussion about dirty page tracking in VFIO, when vIOMMU
> > is enabled:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg02690.html
> >
> > It's actually a similar model as vhost - Qemu cannot interpose the
> > fast-path DMAs, thus it relies on the kernel part to track and report
> > dirty page information. Currently Qemu tracks dirty pages at GFN
> > level, thus demanding a translation from IOVA to GPA. The open
> > question in our discussion is where this translation should happen.
> > Doing the translation in the kernel implies a device-iotlb flavor,
> > which is what vhost implements today. It requires potentially large
> > tracking structures in the host kernel, but leverages the existing
> > log_sync flow in Qemu. On the other hand, Qemu may perform log_sync
> > for every removal of an IOVA mapping and then do the translation
> > itself, avoiding GPA awareness on the kernel side. That needs some
> > change to the current Qemu log_sync flow, and may bring more
> > overhead if IOVAs are frequently unmapped.
> >
> > So we'd like to hear your opinions, especially about how you came
> > down to the current iotlb approach for vhost.
>
>
> We didn't consider it too much when introducing vhost. And before
> IOTLB, vhost already knew the GPA through its mem table (GPA->HVA).
> So it was natural and easier to track dirty pages at GPA level; then
> it doesn't require any changes to the existing ABI.

This is the same situation as VFIO.

> For VFIO case, the only advantage of using GPA is that the log can
> then be shared among all the devices that belong to the VM. Otherwise
> syncing through IOVA is cleaner.

I still worry about the potential performance impact of this approach.
In the current mdev live migration series, multiple system calls are
involved when retrieving the dirty bitmap for a given memory range, and
IOVA mappings may be changed frequently. Though one may argue that
frequent IOVA changes already imply bad performance, it's still not
good to introduce further non-negligible overhead in that situation.

On the other hand, I realized that adding IOVA awareness in VFIO is
actually easy. Today VFIO already maintains the full list of IOVAs and
their associated HVAs in the vfio_dma structure, updated on every
VFIO_MAP and VFIO_UNMAP. As long as we allow those two operations to
accept another parameter (GPA), the IOVA->GPA mapping can be naturally
cached in the existing vfio_dma objects, which are then always kept up
to date by the MAP and UNMAP ioctls. A rough sketch of what that could
look like is below.
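To make this concrete, here is a minimal sketch of what I have in mind.
It is purely illustrative, not a real uAPI proposal: the v2 struct name,
the VFIO_DMA_MAP_FLAG_GPA flag value and the gpa fields are made up for
discussion (today's map payload only carries argsz/flags/vaddr/iova/size).

#include <linux/types.h>

/* Hypothetical flag: tells the kernel that 'gpa' below is valid. */
#define VFIO_DMA_MAP_FLAG_GPA	(1 << 3)

/* Hypothetical v2 payload for the MAP ioctl, adding the guest view. */
struct vfio_iommu_type1_dma_map_v2 {
	__u32	argsz;
	__u32	flags;
	__u64	vaddr;	/* HVA, as today */
	__u64	iova;	/* IOVA, as today */
	__u64	size;
	__u64	gpa;	/* new: guest physical address of this range */
};

/*
 * Sketch of the in-kernel tracking object: vfio_dma already records
 * iova/vaddr/size per mapping, so caching the GPA is one more field,
 * kept up to date by the normal MAP/UNMAP flow.
 */
struct vfio_dma_sketch {
	__u64	iova;	/* device view */
	__u64	vaddr;	/* host view (HVA) */
	__u64	size;
	__u64	gpa;	/* guest view, recorded at (v2) MAP time */
};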
Qemu would then uniformly retrieve the VFIO dirty bitmap for the entire
GPA range in every pre-copy round, regardless of whether vIOMMU is
enabled. There is no need for another IOTLB implementation; the main
ask is a v2 MAP/UNMAP interface.

Alex, your thoughts?

Thanks
Kevin
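P.S. For completeness, another rough sketch (again purely illustrative;
the structure name and bitmap layout are assumptions, not existing
code) of how the cached GPA could be used when logging a dirty IOVA, so
that the bitmap Qemu reads back is already indexed by GFN:

#include <stdint.h>

#define PAGE_SHIFT	12
#define BITS_PER_LONG	(8 * sizeof(unsigned long))

/* Same hypothetical tracking object as in the earlier sketch. */
struct vfio_dma_sketch {
	uint64_t iova;
	uint64_t size;
	uint64_t gpa;	/* cached at (v2) MAP time */
};

/* Mark the page containing 'iova' dirty in a GFN-indexed bitmap. */
static void mark_dirty_gfn(unsigned long *bitmap,
			   const struct vfio_dma_sketch *dma,
			   uint64_t iova)
{
	uint64_t gpa = dma->gpa + (iova - dma->iova);
	uint64_t gfn = gpa >> PAGE_SHIFT;

	bitmap[gfn / BITS_PER_LONG] |= 1UL << (gfn % BITS_PER_LONG);
}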