On Mon, 6 Jun 2016 21:43:17 +0800 Peter Xu <pet...@redhat.com> wrote:
> On Mon, Jun 06, 2016 at 07:11:41AM -0600, Alex Williamson wrote:
> > On Mon, 6 Jun 2016 13:04:07 +0800
> > Peter Xu <pet...@redhat.com> wrote:
> [...]
> > > Besides the reason that there might have guests that do not support
> > > CM=1, will there be performance considerations? When user's
> > > configuration does not require CM capability (e.g., generic VM
> > > configuration, without VFIO), shall we allow user to disable the CM
> > > bit so that we can have better IOMMU performance (avoid extra and
> > > useless invalidations)?
> >
> > With Alexey's proposed patch to have callback ops when the iommu
> > notifier list adds its first entry and removes its last, any of the
> > additional overhead to generate notifies when nobody is listening can
> > be avoided. These same callbacks would be the ones that need to
> > generate a hw_error if a notifier is added while running in CM=0.
>
> Not familar with Alexey's patch

https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg00079.html

>, but is that for VFIO only?

vfio is currently the only user of the iommu notifier, but the
interface is generic, which is how it should (must) be.

> I mean, if
> we configured CMbit=1, guest kernel will send invalidation request
> every time it creates new entries (context entries, or iotlb
> entries). Even without VFIO notifiers, guest need to trap into QEMU
> and process the invalidation requests. This is avoidable if we are not
> using VFIO devices at all (so no need to maintain any mappings),
> right?

CM=1 only defines that not-present and invalid entries can be cached;
any change to an existing entry requires an invalidation regardless of
CM. What you're looking for sounds more like ECAP.C:

  C: Page-walk Coherency
    This field indicates if hardware access to the root, context,
    extended-context and interrupt-remap tables, and second-level
    paging structures for requests-without PASID, are coherent
    (snooped) or not.
      • 0: Indicates hardware accesses to remapping structures are
           non-coherent.
      • 1: Indicates hardware accesses to remapping structures are
           coherent.

Without both CM=0 and C=0, our only virtualization mechanism for
maintaining a hardware cache coherent with the guest view of the iommu
would be to shadow all of the VT-d structures. For purely emulated
devices, maybe we can get away with that, but I doubt the current
ghashes used for the iotlb are prepared for it.

> If we allow user to specify cmbit={0|1}, user can decide whether
> he/she would like to take this benefit.

So long as the *default* gives us the ability to support an external
hardware cache, like vfio, and we generate a hw_error or equivalent to
avoid unsafe combinations, you're free to enable whatever other
shortcuts you want. Thanks,

Alex
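
For illustration only, the first-add/last-remove callback idea discussed above could be sketched roughly as follows. Every name here (IommuModel, iommu_notifier_add, and so on) is hypothetical; this is neither QEMU's actual interface nor Alexey's patch, and it returns an error code where a real device model would presumably raise hw_error() or an equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not QEMU's real data structures. */
typedef struct IommuModel {
    bool caching_mode;    /* CM bit as exposed to the guest */
    size_t n_notifiers;   /* registered listeners, e.g. a vfio container */
    bool notify_enabled;  /* whether map/unmap notifies are generated */
} IommuModel;

/* Runs when the notifier list goes from empty to non-empty. */
static int iommu_notify_started(IommuModel *m)
{
    if (!m->caching_mode) {
        /* Unsafe combination: a listener needs CM=1 to see new mappings.
         * A real device model would raise hw_error() or equivalent. */
        return -1;
    }
    m->notify_enabled = true;
    return 0;
}

/* Runs when the last notifier is removed. */
static void iommu_notify_stopped(IommuModel *m)
{
    m->notify_enabled = false;
}

/* Register a listener; returns 0 on success, -1 if refused. */
int iommu_notifier_add(IommuModel *m)
{
    assert(m != NULL);
    if (m->n_notifiers == 0 && iommu_notify_started(m) < 0) {
        return -1;
    }
    m->n_notifiers++;
    return 0;
}

/* Unregister a listener; stops notifies once nobody is listening. */
void iommu_notifier_remove(IommuModel *m)
{
    assert(m != NULL);
    if (m->n_notifiers > 0 && --m->n_notifiers == 0) {
        iommu_notify_stopped(m);
    }
}
```

With this shape, a configuration without any listener never pays for generating notifies, and adding a listener while CM=0 is refused instead of silently running with a stale external cache.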