>-----Original Message----- >From: Donald Dutile <ddut...@redhat.com> >Subject: Re: [PATCH rfcv2 00/20] intel_iommu: Enable stage-1 translation for >passthrough device > >Hey Zhenzhong, >Thanks for feedback. replies below. >- Don > >On 5/19/25 4:37 AM, Duan, Zhenzhong wrote: >> Hi Donald, >> >>> -----Original Message----- >>> From: Donald Dutile <ddut...@redhat.com> >>> Subject: Re: [PATCH rfcv2 00/20] intel_iommu: Enable stage-1 translation for >>> passthrough device >>> >>> Zhenzhong, >>> >>> Hi! >>> Eric asked me to review this series. >>> Since it's rather late since you posted will summarize review feedback >>> below/bottom. >>> >>> - Don >>> >>> On 2/19/25 3:22 AM, Zhenzhong Duan wrote: ... >>> Did you ever put some tracing in to capture avg hits in cache? >>> ... if so, >add >>> as a comment. >>> Otherwise, looks good. >>> >>> Patch 11: Apologies, I don't know what 'flts' stands for, and why it is >>> relative >to 2- >>> stage mapping, or SIOV. Could you add verbage to explain the use of it, as >>> the >>> rest of this patch doesn't make any sense to me without the background. >>> The patch introduces hw-info-type (none or intel), and then goes on to add a >>> large set of checks; seems like the caps & this checking should go together >(split >>> for each cap; put all caps together & the check...). >> >> OK, will do. There are some explanations in cover-letter. >> For history reason, old vtd spec define stage-1 as first level then switch >> to first >stage. >> >So 'flts' is 'first level then switch' . Sorry for confusion, it stands for 'first level translation support'. > >>> >>> Patch 12: Why isn't HostIOMMUDevice extended to have another iommu- >specif >>> element, opaque in HostIOMMUDevice, but set to specific IOMMU in use? >e.g. >>> void *hostiommustate; >> >> Yes, that's possible, but we want to make a generic interface between >VFIO/VDPA and vIOMMU. >> >ok. I don't understand how VFIO & VPDA complicate that add. IIUC, the hostiommustate provided by VFIO and VDPA may be different format. By using a general interface like .get_cap(), we hide the resolving under VFIO and VDPA backend. This is like the KVM extension checking between QEMU and KVM. FYI, there was some discuss on the interface before, see https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg02658.html > >>> >>> Patch 13: Isn't PASID just an extension/addition of BDF id? and doesn't each >>> PASID have its own address space? >> >> Yes, it is. >> >>> So, why isn't it handle via a uniqe AS cache like 'any other device'? >>> Maybe I'm >>> thinking too SMMU-StreamID, which can be varying length, depending on >>> subsystem support. I see what appears to be sid+pasid calls to do the AS >lookups; >>> hmm, so maybe this is the generalized BDF+pasid AS lookup? if so, maybe a >>> better description stating this transition to a wider stream-id would set >>> the >code >>> context better. >> >> Not quite get.. >> >I'm looking for a better description that states the AS cache lookup is >broadened >from bdf >to bdf+pasid. Guess you mean vtd_as_from_iommu_pasid(), it's a variant of vtd_find_add_as(). We support AS cache lookup by bdf+pasid for a long time, see vtd_find_add_as(). > >>> As for the rest of the (400 intel-iommu) code, I'm not that in-depth in >>> intel- >iommu >>> to determine if its all correct or not. >>> >>> Patch 14: Define PGTT; the second paragraph seem self-contradicting -- it >>> says >it >>> uses a 2-stage page table in each case, but it implies it should be >>> different. At >580 >>> lines of code changes, you win! ;-) >> >> The host side's using nested or only stage-2 page table depends on PGTT's >setting in guest. >> >Thanks for clarification. > >>> >>> Patch 15: Read-only and Read/write areas have different IOMMUFDs? is that >an >>> intel-iommu requriement? >>> At least this intel-iommu-errata code is only in hw/i386/<> >>> modules. >> >> No, if ERRATA_772415, read-only areas should not be mapped, so we allocate a >new VTDIOASContainer to hold only read/write areas mapping. >> We can use same IOMMUFDs for different VTDIOASContainer. >> >ah yes; I got hung-up on different mappings, and didn't back up to AS-container >split & same IOMMUFD. > >>> >>> Patch 16: Looks reasonable. What does the 'SI' mean after "CACHE_DEV", >>> "CACHE_DOM" & "CACHE_PASID" ? -- stream-invalidation? >> >> VTD_PASID_CACHE_DEVSI stands for 'pasid cache device selective invalidation', >> VTD_PASID_CACHE_DOMSI means 'pasid cache domain selective invalidation'. >> >That explanation helps. :) maybe put a short blurb in the commit log, or code, >so one doesn't have to be a ninja-VTD spec consumer to comprehend those >(important) diffs. Good idea, will do. > >> Thanks >> Zhenzhong >> >Again, thanks for the reply. >Looking fwd to the rfcv3 (on list) or move to v1-POST. Thanks for your comments! BRs, Zhenzhong