[cc +dgibson] On Thu, 1 Sep 2016 10:29:29 +0800 Peter Xu <pet...@redhat.com> wrote:
> On Wed, Aug 31, 2016 at 10:45:37AM +0800, Jason Wang wrote: > > > > > > On 2016年08月30日 11:37, Alex Williamson wrote: > > >On Tue, 30 Aug 2016 11:06:58 +0800 > > >Jason Wang <jasow...@redhat.com> wrote: > > > > > >>From: Peter Xu <pet...@redhat.com> > > >> > > >>This reverts commit 3cb3b1549f5401dc3a5e1d073e34063dc274136f. Vhost > > >>device IOTLB API will get notified and send invalidation request to > > >>vhost through this notifier. > > >AFAICT this series does not address the original problem for which > > >commit 3cb3b1549f54 was added. We've only addressed the very narrow > > >use case of a device iotlb firing the iommu notifier therefore this > > >change is a regression versus 2.7 since it allows invalid > > >configurations with a physical iommu which will never receive the > > >necessary notifies from intel-iommu emulation to work properly. Thanks, > > > > > >Alex > > > > Looking at vfio, it cares about map but vhost only cares about IOTLB > > invalidation. Then I think we probably need another kind of notifier in this > > case to avoid this. > > Shall we leverage IOMMUTLBEntry.perm == IOMMU_NONE as a sign for > invalidation? If so, we can use the same IOTLB interface as before. > IMHO these two interfaces are not conflicting? > > Alex, > > Do you mean we should still disallow user from passing through devices > while Intel IOMMU enabled? If so, not sure whether patch below can > solve the issue. > > It seems that we need a "name" for either IOMMU notifier > provider/consumer, and we should not allow (provider==Intel && > consumer==VFIO) happen. In the following case, I added a name for > provider, and VFIO checks it. Absolutely not, intel-iommu emulation is simply incomplete, the IOMMU notifier is never called for mappings. There's a whole aspect of iommu notifiers that intel-iommu simply hasn't bothered to implement. Don't punish vfio for actually making use of the interface as it was intended to be used. AFAICT you're implementing the unmap/invalidation half, without the actual mapping half of the interface. It's broken and incompatible with any iommu notifiers that expect to see both sides. Thanks, Alex > --------8<---------- > > diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c > index 883db13..936c2e6 100644 > --- a/hw/alpha/typhoon.c > +++ b/hw/alpha/typhoon.c > @@ -725,6 +725,7 @@ static IOMMUTLBEntry typhoon_translate_iommu(MemoryRegion > *iommu, hwaddr addr, > } > > static const MemoryRegionIOMMUOps typhoon_iommu_ops = { > + .iommu_type = "typhoon", > .translate = typhoon_translate_iommu, > }; > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 28c31a2..f5e3875 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -2347,6 +2347,7 @@ static void vtd_init(IntelIOMMUState *s) > memset(s->w1cmask, 0, DMAR_REG_SIZE); > memset(s->womask, 0, DMAR_REG_SIZE); > > + s->iommu_ops.iommu_type = "intel"; > s->iommu_ops.translate = vtd_iommu_translate; > s->iommu_ops.notify_started = vtd_iommu_notify_started; > s->root = 0; > diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c > index 653e711..9cfbb73 100644 > --- a/hw/pci-host/apb.c > +++ b/hw/pci-host/apb.c > @@ -323,6 +323,7 @@ static IOMMUTLBEntry pbm_translate_iommu(MemoryRegion > *iommu, hwaddr addr, > } > > static MemoryRegionIOMMUOps pbm_iommu_ops = { > + .iommu_type = "pbm", > .translate = pbm_translate_iommu, > }; > > diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c > index 6bc4d4d..e3e8739 100644 > --- a/hw/ppc/spapr_iommu.c > +++ b/hw/ppc/spapr_iommu.c > @@ -244,6 +244,7 @@ static const VMStateDescription vmstate_spapr_tce_table = > { > }; > > static MemoryRegionIOMMUOps spapr_iommu_ops = { > + .iommu_type = "spapr", > .translate = spapr_tce_translate_iommu, > .get_min_page_size = spapr_tce_get_min_page_size, > .notify_started = spapr_tce_notify_started, > diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c > index 9c1c04e..4414462 100644 > --- a/hw/s390x/s390-pci-bus.c > +++ b/hw/s390x/s390-pci-bus.c > @@ -443,6 +443,7 @@ static IOMMUTLBEntry s390_translate_iommu(MemoryRegion > *iommu, hwaddr addr, > } > > static const MemoryRegionIOMMUOps s390_iommu_ops = { > + .iommu_type = "s390", > .translate = s390_translate_iommu, > }; > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index b313e7c..317e08b 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -441,6 +441,11 @@ static void vfio_listener_region_add(MemoryListener > *listener, > if (memory_region_is_iommu(section->mr)) { > VFIOGuestIOMMU *giommu; > > + if (!strcmp(memory_region_iommu_type(section->mr), "intel")) { > + error_report("Device passthrough cannot work with Intel IOMMU"); > + exit(1); > + } > + > trace_vfio_listener_region_add_iommu(iova, end); > /* > * FIXME: For VFIO iommu types which have KVM acceleration to > diff --git a/include/exec/memory.h b/include/exec/memory.h > index 3e4d416..f012f77 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -149,6 +149,8 @@ struct MemoryRegionOps { > typedef struct MemoryRegionIOMMUOps MemoryRegionIOMMUOps; > > struct MemoryRegionIOMMUOps { > + /* Type of IOMMU */ > + const char *iommu_type; > /* Return a TLB entry that contains a given address. */ > IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool > is_write); > /* Returns minimum supported page size */ > @@ -593,6 +595,21 @@ static inline bool memory_region_is_iommu(MemoryRegion > *mr) > return mr->iommu_ops; > } > > +/** > + * memory_region_iommu_type: return type of IOMMU > + * > + * Returns type of IOMMU, empty string ("") if not a IOMMU region. > + * > + * @mr: the memory region being queried > + */ > +static inline const char *memory_region_iommu_type(MemoryRegion *mr) > +{ > + if (mr->iommu_ops && mr->iommu_ops->iommu_type) { > + return mr->iommu_ops->iommu_type; > + } > + > + return ""; > +} > > /** > * memory_region_iommu_get_min_page_size: get minimum supported page size > > ------>8-------- > > Thanks, > > -- peterx