Hi Zhenzhong,

On 7/8/25 1:05 PM, Zhenzhong Duan wrote:
> When vIOMMU is configured x-flts=on in scalable mode, stage-1 page table
> is passed to host to construct nested page table. We need to check
> compatibility of some critical IOMMU capabilities between vIOMMU and
> host IOMMU to ensure guest stage-1 page table could be used by host.
>
> For instance, vIOMMU supports stage-1 1GB huge page mapping, but host
> does not, then this IOMMUFD backed device should fail.
>
> Even of the checks pass, for now we willingly reject the association
> because all the bits are not there yet.
>
> Signed-off-by: Yi Liu <yi.l....@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
> ---
>  hw/i386/intel_iommu.c | 30 +++++++++++++++++++++++++++++-
>  hw/i386/intel_iommu_internal.h | 1 +
>  2 files changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index e90fd2f28f..c57ca02cdd 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -40,6 +40,7 @@
>  #include "kvm/kvm_i386.h"
>  #include "migration/vmstate.h"
>  #include "trace.h"
> +#include "system/iommufd.h"
>
>  /* context entry operations */
>  #define VTD_CE_GET_RID2PASID(ce) \
> @@ -4355,7 +4356,34 @@ static bool vtd_check_hiod(IntelIOMMUState *s, HostIOMMUDevice *hiod,
>          return true;
>      }
>
> -    error_setg(errp, "host device is uncompatible with stage-1 translation");
> +#ifdef CONFIG_IOMMUFD
> +    struct HostIOMMUDeviceCaps *caps = &hiod->caps;
> +    struct iommu_hw_info_vtd *vtd = &caps->vendor_caps.vtd;

I am now confused about how this relates to vtd_get_viommu_cap().
PCIIOMMUOps.set_iommu_device = vtd_dev_set_iommu_device calls
vtd_check_hiod().

The vIOMMU might return HW_NESTED_CAP through PCIIOMMUOps.get_viommu_cap
without making sure the underlying HW IOMMU does support it. Is that a
correct understanding? Maybe we should clarify the calling order between
set_iommu_device and get_viommu_cap?

Could we check the HW IOMMU prerequisites in vtd_get_viommu_cap() by
enforcing that it is called after set_iommu_device? I think we should
clarify the exact semantics of get_viommu_cap().

Thanks

Eric

> +
> +    /* Remaining checks are all stage-1 translation specific */
> +    if (!object_dynamic_cast(OBJECT(hiod), TYPE_HOST_IOMMU_DEVICE_IOMMUFD)) {
> +        error_setg(errp, "Need IOMMUFD backend when x-flts=on");
> +        return false;
> +    }
> +
> +    if (caps->type != IOMMU_HW_INFO_TYPE_INTEL_VTD) {
> +        error_setg(errp, "Incompatible host platform IOMMU type %d",
> +                   caps->type);
> +        return false;
> +    }
> +
> +    if (!(vtd->ecap_reg & VTD_ECAP_NEST)) {
> +        error_setg(errp, "Host IOMMU doesn't support nested translation");
> +        return false;
> +    }
> +
> +    if (s->fs1gp && !(vtd->cap_reg & VTD_CAP_FS1GP)) {
> +        error_setg(errp, "Stage-1 1GB huge page is unsupported by host IOMMU");
> +        return false;
> +    }
> +#endif
> +
> +    error_setg(errp, "host IOMMU is incompatible with stage-1 translation");
>      return false;
>  }
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index 7aba259ef8..18bc22fc72 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -192,6 +192,7 @@
>  #define VTD_ECAP_PT (1ULL << 6)
>  #define VTD_ECAP_SC (1ULL << 7)
>  #define VTD_ECAP_MHMV (15ULL << 20)
> +#define VTD_ECAP_NEST (1ULL << 26)
>  #define VTD_ECAP_SRS (1ULL << 31)
>  #define VTD_ECAP_PASID (1ULL << 40)
>  #define VTD_ECAP_SMTS (1ULL << 43)
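
For reference, the two register checks in the hunk above can be modelled outside QEMU roughly as follows. This is only a standalone sketch, not the patch itself: struct host_caps and check_stage1_compat() are hypothetical names, VTD_ECAP_NEST is taken from the hunk, and the bit position used for VTD_CAP_FS1GP is an assumption because its definition is not part of the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VTD_ECAP_NEST (1ULL << 26) /* as added by the quoted hunk */
#define VTD_CAP_FS1GP (1ULL << 56) /* assumed bit position, not shown in the quoted diff */

/* Hypothetical stand-in for the cap/ecap values reported for the host IOMMU. */
struct host_caps {
    uint64_t cap_reg;
    uint64_t ecap_reg;
};

/*
 * Returns NULL when the host can back the vIOMMU stage-1 configuration,
 * otherwise a string describing the first missing capability.
 */
static const char *check_stage1_compat(bool viommu_fs1gp,
                                       const struct host_caps *host)
{
    assert(host != NULL);

    /* Nested translation is required to use the guest stage-1 table. */
    if (!(host->ecap_reg & VTD_ECAP_NEST)) {
        return "Host IOMMU doesn't support nested translation";
    }

    /* 1GB stage-1 pages only matter if the vIOMMU advertises them. */
    if (viommu_fs1gp && !(host->cap_reg & VTD_CAP_FS1GP)) {
        return "Stage-1 1GB huge page is unsupported by host IOMMU";
    }

    return NULL;
}
```

A caller would pass the vIOMMU's fs1gp setting together with the host register values and treat a non-NULL return as the reason for rejecting the device. The sketch leaves out the IOMMUFD backend and hardware-type checks, and it does not address the ordering question between set_iommu_device and get_viommu_cap.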