>-----Original Message-----
>From: Liu, Yi L <yi.l....@intel.com>
>Subject: Re: [PATCH rfcv3 15/21] intel_iommu: Bind/unbind guest page table to
>host
>
>Hey Nic,
>
>On 2025/5/22 06:49, Nicolin Chen wrote:
>> On Wed, May 21, 2025 at 07:14:45PM +0800, Zhenzhong Duan wrote:
>>> +static const MemoryListener iommufd_s2domain_memory_listener = {
>>> +    .name = "iommufd_s2domain",
>>> +    .priority = 1000,
>>> +    .region_add = iommufd_listener_region_add_s2domain,
>>> +    .region_del = iommufd_listener_region_del_s2domain,
>>> +};
>>
>> Would you mind elaborating when and how vtd does all S2 mappings?
>>
>> On ARM, the default vfio_memory_listener could capture the entire
>> guest RAM and add to the address space. So what we do is basically
>> reusing the vfio_memory_listener:
>> https://lore.kernel.org/qemu-devel/20250311141045.66620-13-shameerali.kolothum.th...@huawei.com/
>
>in concept yes, all the guest ram. but due to an erratum, we need
>to skip the RO mappings.
>
>> The thing is that when a VFIO device is attached to the container
>> upon a nesting configuration, the ->get_address_space op should
>> return the system address space as S1 nested HWPT isn't allocated
>> yet. Then all the iommu as routines in vfio_listener_region_add()
>> would be skipped, ending up with mapping the guest RAM in S2 HWPT
>> correctly. Not until the S1 nested HWPT is allocated by the guest
>> OS (after guest boots), can the ->get_address_space op return the
>> iommu address space.
>
>This seems a bit different between ARM and VT-d emulation. The VT-d
>emulation code returns the iommu address space regardless of what
>translation mode the guest configured. But the MR of the address space
>has two overlapping subregions: one is nodmar, the other is iommu.
>As the naming shows, nodmar is aliased to the system MR. Before the
>guest enables the iommu and sets PGTT to a non-PT mode (e.g. S1 or S2),
>the effective MR alias is nodmar, hence the mappings this address
>space holds are the GPA mappings in the beginning. If the guest sets
>PGTT to S2, then the iommu MR is enabled, hence the mappings are gIOVA
>mappings accordingly. So in VT-d emulation, the address space switch is
>more of an MR alias switch.
>
>In this series, we mainly want to support the S1 translation type for the
>guest. It is based on nested translation, which needs an S2 domain that
>holds the GPA mappings. Besides the S1 translation type, PT is also
>supported. Both types need an S2 domain which already holds the GPA
>mappings. So we have this internal listener. Also, we want to skip RO
>mappings on S2, so that's another reason for it. @Zhenzhong, perhaps
>the commit message can describe why an internal listener is introduced.

Thanks Yi for the accurate explanation. Sure, I will add comments for the
internal listener.

BRs,
Zhenzhong

>
>>
>> With this address space shift, S2 mappings can be simply captured
>> and done by vfio_memory_listener. Then, such an s2domain listener
>> would be largely redundant.
>
>hope the above addressed your question.
>
>> So the second question is:
>> Does vtd have to own this iommufd_s2domain_memory_listener? IOW,
>
>yes, based on the current design. when guest PGTT==PT, attach the device
>to the S2 hwpt; when it goes to S1, attach it to an S1 hwpt whose
>parent is the aforementioned S2 hwpt. This S2 hwpt is always there
>for use.
>
>> does vtd_host_dma_iommu() have to return the iommu address space
>> all the time?
>
>yes, all the time.
>
>--
>Regards,
>Yi Liu
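
For readers following the thread, the two rules Yi describes (which MR alias is effective, and which hwpt a device is attached to) can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not code from the series or from QEMU: the types `Pgtt` and `Hwpt` and the functions `iommu_alias_effective()` and `pick_hwpt()` are hypothetical names invented here. The guest S2 case in `pick_hwpt()` returns NULL simply because the thread does not say which hwpt is used there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the behaviour described in the thread.
 * None of these names exist in QEMU or in the patch series. */

/* Guest-configured translation type (PGTT). */
typedef enum { PGTT_PT, PGTT_S1, PGTT_S2 } Pgtt;

/* A hardware page table; parent is NULL for the S2 table that holds
 * the GPA mappings. */
typedef struct Hwpt {
    const char *name;
    const struct Hwpt *parent;
} Hwpt;

/* The address space always stays the "iommu" one. What changes is which
 * subregion is effective: nodmar (alias of the system MR) until the
 * guest enables the iommu and selects a non-PT mode, then iommu. */
bool iommu_alias_effective(bool dmar_enabled, Pgtt pgtt)
{
    return dmar_enabled && pgtt != PGTT_PT;
}

/* Pick the hwpt a device is attached to:
 *   PT -> the always-present S2 hwpt
 *   S1 -> an S1 hwpt whose parent is that S2 hwpt
 * The guest S2 mode is not covered by the thread, so NULL is returned
 * here as a placeholder (assumption). */
const Hwpt *pick_hwpt(Pgtt pgtt, const Hwpt *s2, const Hwpt *s1)
{
    assert(s2 != NULL && s2->parent == NULL);

    switch (pgtt) {
    case PGTT_PT:
        return s2;
    case PGTT_S1:
        assert(s1 != NULL && s1->parent == s2);
        return s1;
    default:
        return NULL;
    }
}
```

In this model the S2 hwpt exists before any S1 hwpt is created, which matches the point that an internal listener has to populate it with GPA mappings independently of what the guest configures.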