On Fri, 2 Sep 2016 17:31:00 +0800 Peter Xu <pet...@redhat.com> wrote:
> On Fri, Sep 02, 2016 at 05:00:28PM +1000, David Gibson wrote:
> > On Fri, 2 Sep 2016 14:18:47 +0800
> > Peter Xu <pet...@redhat.com> wrote:
> >
> > > On Fri, Sep 02, 2016 at 02:15:57PM +0800, Peter Xu wrote:
> > > > > No, implement the full notifier, and a listener which only wants the
> > > > > invalidates can just ignore callbacks which add new mappings.
> > > > >
> > > > > As I said, you'll need this to get VFIO working with vIOMMU which
> > > > > someone is bound to want soon enough anyway.
> > > >
> > > > But for vhost cases, we do not need CM bit enabled. That might be the
> > > > difference?
> > > >
> > > > I think we need to have vhost working even without CM bit. Device
> > > > IOTLB should be able to achieve that.
> > >
> > > The problem is that, IMHO we should be very careful on enabling CM
> > > bit. After enabling it, system might get slower (though I haven't
> > > tried it yet), or even very slow? So maybe we will only enable it when
> > > really needed (e.g., to do device passthrough and build the shadow
> > > table).
> >
> > Um.. what's the CM bit and what does it have to do with anything?
>
> It's used to trace guest IO address space mapping changes.
>
> Pasted from VT-d spec chap 6.1:
>
> The Caching Mode (CM) field in Capability Register indicates if
> the hardware implementation caches not-present or erroneous
> translation-structure entries. When the CM field is reported as
> Set, any software updates to any remapping structures (including
> updates to not-present entries or present entries whose
> programming resulted in translation faults) requires explicit
> invalidation of the caches.
>
> Hardware implementations of this architecture must support
> operation corresponding to CM=0. Operation corresponding to CM=1
> may be supported by software implementations (emulation) of this
> architecture for efficient virtualization of remapping hardware.
> Software managing remapping hardware should be written to handle
> both caching modes.
>
> Software implementations virtualizing the remapping architecture
> (such as a VMM emulating remapping hardware to an operating system
> running within a guest partition) may report CM=1 to efficiently
> virtualize the hardware. Software virtualization typically
> requires the guest remapping structures to be shadowed in the
> host. Reporting the Caching Mode as Set for the virtual hardware
> requires the guest software to explicitly issue invalidation
> operations on the virtual hardware for any/all updates to the
> guest remapping structures. The virtualizing software may trap
> these guest invalidation operations to keep the shadow translation
> structures consistent to guest translation structure
> modifications, without resorting to other less efficient
> techniques (such as write-protecting the guest translation
> structures through the processor’s paging facility).
>
> Currently it is not supported for Intel vIOMMUs.

Maybe memory_region_register_iommu_notifier() could take an
IOMMUAccessFlags argument (filter) that is passed to the
notify_started callback. If a notifier client only cares about
IOMMU_NONE (invalidations), intel-iommu could allow it, regardless of
the CM setting (though I'm dubious whether this is complete in the
generic case or really only for device iotlbs). If a client requires
IOMMU_RW then intel-iommu would currently bomb-out like it does now, or
once that gets fixed it would bomb if CM=0.

Ideally intel-iommu would be fully functional, but somehow it was
allowed into the tree with this massive gap in support for QEMU iommu
interfaces. Thanks,

Alex
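
A rough sketch of the filter check described above, as a standalone C
model rather than actual QEMU code. The enum mirrors the values of
QEMU's IOMMUAccessFlags; the ViommuModel struct, its fields, and
viommu_notifier_allowed() are hypothetical names made up for
illustration, and map_notify_support stands in for "once that gets
fixed". It is not the intel-iommu implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in mirroring the values of QEMU's IOMMUAccessFlags. */
typedef enum {
    IOMMU_NONE = 0,   /* no access: an unmap / invalidation */
    IOMMU_RO   = 1,
    IOMMU_WO   = 2,
    IOMMU_RW   = 3,
} IOMMUAccessFlags;

/* Hypothetical model of the emulated IOMMU state (not a QEMU struct). */
typedef struct {
    bool caching_mode;        /* CM bit reported to the guest */
    bool map_notify_support;  /* assumption: set once mapping notifications exist */
} ViommuModel;

/*
 * Hypothetical decision a notify_started-style callback could make when
 * a client registers with a filter:
 *  - invalidation-only clients (IOMMU_NONE) are accepted whatever CM is;
 *  - clients that want new mappings need both mapping-notification
 *    support and CM=1, otherwise registration is refused.
 */
bool viommu_notifier_allowed(const ViommuModel *s, IOMMUAccessFlags filter)
{
    if (filter == IOMMU_NONE) {
        return true;
    }
    return s->map_notify_support && s->caching_mode;
}

/* Usage example covering the cases from the mail. */
void viommu_filter_example(void)
{
    ViommuModel today   = { .caching_mode = false, .map_notify_support = false };
    ViommuModel fixed0  = { .caching_mode = false, .map_notify_support = true };
    ViommuModel fixed1  = { .caching_mode = true,  .map_notify_support = true };

    /* vhost-style client: invalidations only, fine even with CM=0. */
    assert(viommu_notifier_allowed(&today, IOMMU_NONE));
    /* VFIO-style client: refused today, and refused later if CM=0. */
    assert(!viommu_notifier_allowed(&today, IOMMU_RW));
    assert(!viommu_notifier_allowed(&fixed0, IOMMU_RW));
    /* VFIO-style client with mapping notifications and CM=1: accepted. */
    assert(viommu_notifier_allowed(&fixed1, IOMMU_RW));
}
```

Whether an invalidation-only filter is sufficient outside the device
IOTLB case is left open here, same as in the text above.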