Hi Zhenzhong,

On 7/16/25 12:31 PM, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.au...@redhat.com>
>> Subject: Re: [PATCH v3 07/20] intel_iommu: Check for compatibility with
>> IOMMUFD backed device when x-flts=on
>>
>> Hi Zhenzhong,
>>
>> On 7/8/25 1:05 PM, Zhenzhong Duan wrote:
>>> When vIOMMU is configured x-flts=on in scalable mode, stage-1 page table
>>> is passed to host to construct nested page table. We need to check
>>> compatibility of some critical IOMMU capabilities between vIOMMU and
>>> host IOMMU to ensure guest stage-1 page table could be used by host.
>>>
>>> For instance, vIOMMU supports stage-1 1GB huge page mapping, but host
>>> does not, then this IOMMUFD backed device should fail.
>>>
>>> Even of the checks pass, for now we willingly reject the association
>>> because all the bits are not there yet.
>>>
>>> Signed-off-by: Yi Liu <yi.l....@intel.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
>>> ---
>>>  hw/i386/intel_iommu.c          | 30
>> +++++++++++++++++++++++++++++-
>>>  hw/i386/intel_iommu_internal.h |  1 +
>>>  2 files changed, 30 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index e90fd2f28f..c57ca02cdd 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -40,6 +40,7 @@
>>>  #include "kvm/kvm_i386.h"
>>>  #include "migration/vmstate.h"
>>>  #include "trace.h"
>>> +#include "system/iommufd.h"
>>>
>>>  /* context entry operations */
>>>  #define VTD_CE_GET_RID2PASID(ce) \
>>> @@ -4355,7 +4356,34 @@ static bool vtd_check_hiod(IntelIOMMUState *s,
>> HostIOMMUDevice *hiod,
>>>          return true;
>>>      }
>>>
>>> -    error_setg(errp, "host device is uncompatible with stage-1
>> translation");
>>> +#ifdef CONFIG_IOMMUFD
>>> +    struct HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> +    struct iommu_hw_info_vtd *vtd = &caps->vendor_caps.vtd;
>> I am now confused about how this relates to vtd_get_viommu_cap().
>> PCIIOMMUOps.set_iommu_device = vtd_dev_set_iommu_device calls
>> vtd_check_hiod()
>> viommu might return HW_NESTED_CAP through
>> PCIIOMMUOps.get_viommu_cap
>> without making sure the underlying HW IOMMU does support it. Is that a
>> correct understanding? Maybe we should clarify the calling order between
>> set_iommu_device vs get_viommu_cap? Could we check HW IOMMU
>> prerequisites in vtd_get_viommu_cap() by enforcing this is called after
>> set_iommu_device. I think we should clarify the exact semantic of
>> get_viommu_cap().Thanks Eric
> My understanding get_viommu_cap() returns pure vIOMMU's capabilities
> with no host IOMMU's capabilities involved.
>
> For example, returned HW_NESTED_CAP means this vIOMMU has code
> to support creating nested hwpt and attaching, no matter if host IOMMU
> supports nesting or not.

Then I think you need to refine the description in 2/20 to make this clear.
stating explicitly get_viommu_cap returns theoretical capabilities which
are independent on the actual host capabilities they may depend on.
>
> The compatibility check between host IOMMU vs vIOMMU is done in
> set_iommu_device(), see vtd_check_hiod().
>
> It's too late for VFIO to call get_viommu_cap() after set_iommu_device()
> because we need get_viommu_cap() to determine if creating nested parent
> hwpt or not at attaching stage.
isn't it possible to rework the call sequence?

Eric
>
> If host doesn't support nesting, then creating nested parent hwpt will fail
> early in vfio realize before vtd_check_hiod() is called and we fail the 
> hotplug.
>
> Thanks
> Zhenzhong


Reply via email to