On Tue, Nov 13, 2018 at 01:18:54PM +0800, Peter Xu wrote:
> On Mon, Nov 12, 2018 at 08:38:30PM +0800, Yu Zhang wrote:
> > On Mon, Nov 12, 2018 at 05:36:38PM +0800, Peter Xu wrote:
> > > On Mon, Nov 12, 2018 at 05:25:48PM +0800, Yu Zhang wrote:
> > > > On Mon, Nov 12, 2018 at 04:51:22PM +0800, Peter Xu wrote:
> > > > > On Fri, Nov 09, 2018 at 07:49:47PM +0800, Yu Zhang wrote:
> > > > > > This patch updates vtd_lookup_iotlb() to search cached mappings only
> > > > > > for all page levels supported by address width of current vIOMMU.
> > > > > > Also, to cover 57-bit width, the shift of source id
> > > > > > (VTD_IOTLB_SID_SHIFT) and of page level (VTD_IOTLB_LVL_SHIFT) are
> > > > > > enlarged by 9 - the stride of one paging structure level.
> > > > > >
> > > > > > Signed-off-by: Yu Zhang <yu.c.zh...@linux.intel.com>
> > > > > > ---
> > > > > > Cc: "Michael S. Tsirkin" <m...@redhat.com>
> > > > > > Cc: Marcel Apfelbaum <marcel.apfelb...@gmail.com>
> > > > > > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > > > > > Cc: Richard Henderson <r...@twiddle.net>
> > > > > > Cc: Eduardo Habkost <ehabk...@redhat.com>
> > > > > > Cc: Peter Xu <pet...@redhat.com>
> > > > > > ---
> > > > > >  hw/i386/intel_iommu.c          | 5 +++--
> > > > > >  hw/i386/intel_iommu_internal.h | 7 ++-----
> > > > > >  2 files changed, 5 insertions(+), 7 deletions(-)
> > > > > >
> > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > > > index 9cdf755..ce7e17e 100644
> > > > > > --- a/hw/i386/intel_iommu.c
> > > > > > +++ b/hw/i386/intel_iommu.c
> > > > > > @@ -254,11 +254,12 @@ static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t level)
> > > > > >  static VTDIOTLBEntry *vtd_lookup_iotlb(IntelIOMMUState *s,
> > > > > >                                         uint16_t source_id,
> > > > > >                                         hwaddr addr)
> > > > > >  {
> > > > > > -    VTDIOTLBEntry *entry;
> > > > > > +    VTDIOTLBEntry *entry = NULL;
> > > > > >      uint64_t key;
> > > > > >      int level;
> > > > > > +    int max_level = (s->aw_bits - VTD_PAGE_SHIFT_4K) / VTD_SL_LEVEL_BITS;
> > > > > >
> > > > > > -    for (level = VTD_SL_PT_LEVEL; level < VTD_SL_PML4_LEVEL; level++) {
> > > > > > +    for (level = VTD_SL_PT_LEVEL; level < max_level; level++) {
> > > > >
> > > > > My understanding of current IOTLB is that it only caches the last
> > > > > level of mapping, say:
> > > > >
> > > > >   - level 1: 4K page
> > > > >   - level 2: 2M page
> > > > >   - level 3: 1G page
> > > > >
> > > > > So we don't check against level=4 even if x-aw-bits=48 is specified.
> > > > >
> > > > > Here does it mean that we're going to have... 512G iommu huge pages?
> > > > >
> > > >
> > > > No. My bad, I misunderstood this routine. And now I believe we do not
> > > > need this patch. :-)
> > >
> > > Yeah good to confirm that :-)
> >
> > Sorry, Peter. I still have a question about this part. I agree we do not
> > need to do the extra loop - therefore no need for the max_level part
> > introduced in this patch.
> >
> > But as to the modification of VTD_IOTLB_SID_SHIFT/VTD_IOTLB_LVL_SHIFT, we
> > may still need to do it due to the enlarged gfn: to search an IOTLB entry
> > for a 4K mapping, the pfn itself could be as large as 45-bit.
>
> Agreed.

Thanks~

>
> >
> > Besides, currently vtd_get_iotlb_gfn() is just shifting 12 bits for all
> > different levels, is this necessary? I mean, how about we do the shift
> > based on current level?
> >
> >  static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t level)
> >  {
> > -    return (addr & vtd_slpt_level_page_mask(level)) >> VTD_PAGE_SHIFT_4K;
> > +    uint32_t shift = vtd_slpt_level_shift(level);
> > +    return (addr & vtd_slpt_level_page_mask(level)) >> shift;
> >  }
>
> IMHO we can, but I don't see much gain from it.
>
> If we shift, we still need to use the maximum possible bits that a PFN
> can hold, which is 45 bits (with 4K pages), so we can't gain
> anything out of it (no saved bits on iotlb key). Instead, we'll need
> to call more vtd_slpt_level_shift() for each vtd_get_iotlb_gfn() which
> even seems a bit slower.

Yep, we still need to use 45 bits for 4K mappings. The only benefit I can
think of is it's more intuitive - more aligned to the vtd spec of iotlb
tags. But just like you said, I do not see any runtime gain in it. So I'm
fine to drop this. :)

>
> Regards,
>
> --
> Peter Xu
>

B.R.
Yu
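
As a rough illustration of the bit budget discussed in this thread, the
sketch below packs a gfn, a 16-bit source id and a page level into one
64-bit key, and compares the fixed 12-bit gfn shift with a per-level
shift. All names here (level_shift, gfn_fixed_shift, gfn_level_shift,
max_gfn_bits, make_key) and the derived field positions are assumptions
made for illustration; they are not the QEMU helpers or the VTD_IOTLB_*
macros themselves.

```c
#include <stdint.h>

/* Hypothetical constants mirroring the quantities named in the thread. */
#define PAGE_SHIFT_4K 12u /* 4K base page */
#define LEVEL_BITS    9u  /* stride of one paging structure level */
#define SID_BITS      16u /* source_id is a uint16_t */

/* Address bits covered by one page at a level (1 = 4K, 2 = 2M, 3 = 1G). */
static unsigned level_shift(unsigned level)
{
    return PAGE_SHIFT_4K + (level - 1) * LEVEL_BITS;
}

/* Clears the in-page offset bits for the given level. */
static uint64_t level_page_mask(unsigned level)
{
    return ~((UINT64_C(1) << level_shift(level)) - 1);
}

/* Scheme described as current: mask by level, always shift by 12. */
static uint64_t gfn_fixed_shift(uint64_t addr, unsigned level)
{
    return (addr & level_page_mask(level)) >> PAGE_SHIFT_4K;
}

/* Alternative raised in the thread: shift by the level's own page shift. */
static uint64_t gfn_level_shift(uint64_t addr, unsigned level)
{
    return (addr & level_page_mask(level)) >> level_shift(level);
}

/* Widest gfn the key must hold: a 4K mapping at the given address width. */
static unsigned max_gfn_bits(unsigned aw_bits)
{
    return aw_bits - PAGE_SHIFT_4K;
}

/*
 * Assumed layout: gfn in the low bits, then the source id, then the level.
 * The field positions are derived from aw_bits so the fields never overlap
 * as long as the gfn fits in max_gfn_bits(aw_bits).
 */
static uint64_t make_key(uint64_t gfn, uint16_t sid, unsigned level,
                         unsigned aw_bits)
{
    unsigned sid_shift = max_gfn_bits(aw_bits);
    unsigned lvl_shift = sid_shift + SID_BITS;

    return gfn | ((uint64_t)sid << sid_shift) | ((uint64_t)level << lvl_shift);
}
```

Under these assumptions, a 57-bit address width gives a 45-bit gfn for 4K
mappings, so the source id would start at bit 45 and the level at bit 61,
each 9 bits higher than the 36 and 52 obtained for a 48-bit width. Both
gfn variants return the same 45-bit value at level 1, which is why the
per-level shift does not free any key bits.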