>-----Original Message-----
>From: Donald Dutile <ddut...@redhat.com>
>Subject: Re: [PATCH rfcv2 00/20] intel_iommu: Enable stage-1 translation for
>passthrough device
>
>Hey Zhenzhong,
>Thanks for feedback. replies below.
>- Don
>
>On 5/19/25 4:37 AM, Duan, Zhenzhong wrote:
>> Hi Donald,
>>
>>> -----Original Message-----
>>> From: Donald Dutile <ddut...@redhat.com>
>>> Subject: Re: [PATCH rfcv2 00/20] intel_iommu: Enable stage-1 translation for
>>> passthrough device
>>>
>>> Zhenzhong,
>>>
>>> Hi!
>>> Eric asked me to review this series.
>>> Since it's rather late since you posted will summarize review feedback
>>> below/bottom.
>>>
>>> - Don
>>>
>>> On 2/19/25 3:22 AM, Zhenzhong Duan wrote:
...
>>> Did you ever put some tracing in to capture avg hits in cache?
>>> ... if so,
>add
>>> as a comment.
>>> Otherwise, looks good.
>>>
>>> Patch 11: Apologies, I don't know what 'flts' stands for, and why it is
>>> relative
>to 2-
>>> stage mapping, or SIOV. Could you add verbage to explain the use of it, as
>>> the
>>> rest of this patch doesn't make any sense to me without the background.
>>> The patch introduces hw-info-type (none or intel), and then goes on to add a
>>> large set of checks; seems like the caps & this checking should go together
>(split
>>> for each cap; put all caps together & the check...).
>>
>> OK, will do. There are some explanations in cover-letter.
>> For history reason, old vtd spec define stage-1 as first level then switch
>> to first
>stage.
>>
>So 'flts' is 'first level then switch' .
Sorry for confusion, it stands for 'first level translation support'.
>
>>>
>>> Patch 12: Why isn't HostIOMMUDevice extended to have another iommu-
>specif
>>> element, opaque in HostIOMMUDevice, but set to specific IOMMU in use?
>e.g.
>>> void *hostiommustate;
>>
>> Yes, that's possible, but we want to make a generic interface between
>VFIO/VDPA and vIOMMU.
>>
>ok. I don't understand how VFIO & VPDA complicate that add.
IIUC, the hostiommustate provided by VFIO and VDPA may be different format.
By using a general interface like .get_cap(), we hide the resolving under VFIO
and
VDPA backend. This is like the KVM extension checking between QEMU and KVM.
FYI, there was some discuss on the interface before,
see https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg02658.html
>
>>>
>>> Patch 13: Isn't PASID just an extension/addition of BDF id? and doesn't each
>>> PASID have its own address space?
>>
>> Yes, it is.
>>
>>> So, why isn't it handle via a uniqe AS cache like 'any other device'?
>>> Maybe I'm
>>> thinking too SMMU-StreamID, which can be varying length, depending on
>>> subsystem support. I see what appears to be sid+pasid calls to do the AS
>lookups;
>>> hmm, so maybe this is the generalized BDF+pasid AS lookup? if so, maybe a
>>> better description stating this transition to a wider stream-id would set
>>> the
>code
>>> context better.
>>
>> Not quite get..
>>
>I'm looking for a better description that states the AS cache lookup is
>broadened
>from bdf
>to bdf+pasid.
Guess you mean vtd_as_from_iommu_pasid(), it's a variant of vtd_find_add_as().
We support AS cache lookup by bdf+pasid for a long time, see vtd_find_add_as().
>
>>> As for the rest of the (400 intel-iommu) code, I'm not that in-depth in
>>> intel-
>iommu
>>> to determine if its all correct or not.
>>>
>>> Patch 14: Define PGTT; the second paragraph seem self-contradicting -- it
>>> says
>it
>>> uses a 2-stage page table in each case, but it implies it should be
>>> different. At
>580
>>> lines of code changes, you win! ;-)
>>
>> The host side's using nested or only stage-2 page table depends on PGTT's
>setting in guest.
>>
>Thanks for clarification.
>
>>>
>>> Patch 15: Read-only and Read/write areas have different IOMMUFDs? is that
>an
>>> intel-iommu requriement?
>>> At least this intel-iommu-errata code is only in hw/i386/<>
>>> modules.
>>
>> No, if ERRATA_772415, read-only areas should not be mapped, so we allocate a
>new VTDIOASContainer to hold only read/write areas mapping.
>> We can use same IOMMUFDs for different VTDIOASContainer.
>>
>ah yes; I got hung-up on different mappings, and didn't back up to AS-container
>split & same IOMMUFD.
>
>>>
>>> Patch 16: Looks reasonable. What does the 'SI' mean after "CACHE_DEV",
>>> "CACHE_DOM" & "CACHE_PASID" ? -- stream-invalidation?
>>
>> VTD_PASID_CACHE_DEVSI stands for 'pasid cache device selective invalidation',
>> VTD_PASID_CACHE_DOMSI means 'pasid cache domain selective invalidation'.
>>
>That explanation helps. :) maybe put a short blurb in the commit log, or code,
>so one doesn't have to be a ninja-VTD spec consumer to comprehend those
>(important) diffs.
Good idea, will do.
>
>> Thanks
>> Zhenzhong
>>
>Again, thanks for the reply.
>Looking fwd to the rfcv3 (on list) or move to v1-POST.
Thanks for your comments!
BRs,
Zhenzhong