On Fri, Jan 14, 2022 at 03:22:16PM +0800, Jason Wang wrote:
> On Fri, Jan 14, 2022 at 3:13 PM Peter Xu <pet...@redhat.com> wrote:
> >
> > On Fri, Jan 14, 2022 at 01:58:07PM +0800, Jason Wang wrote:
> > > > > Right, but I think you meant to do this only when scalable mode is 
> > > > > disabled.
> > > >
> > > > Yes IMHO it will definitely suite for !scalable case since that's 
> > > > exactly what
> > > > we did before.  What I'm also wondering is even if scalable is enabled 
> > > > but no
> > > > "real" pasid is used, so if all the translations go through the default 
> > > > pasid
> > > > that stored in the device context entry, then maybe we can ignore 
> > > > checking it.
> > > > The latter is the "hacky" part mentioned above.
> > >
> > > The problem I see is that we can't know what PASID is used as default
> > > without reading the context entry?
> >
> > Can the default NO_PASID being used in mixture of !NO_PASID use case on the
> > same device?  If that's possible, then I agree..
> 
> My understanding is that it is possible.

OK.

> 
> >
> > My previous idea should be based on the fact that if NO_PASID is used on one
> > device, then all translations will be based on NO_PASID, but now I'm not 
> > sure
> > of it.
> 
> Actually, what I meant is:
> 
> device 1 using transactions without PASID with RID2PASID 1
> device 2 using transactions without PASID with RID2PASID 2
> 
> Then we can't assume a default pasid here.

This seems fine, because "device N" is still part of the equation when looking
up, so we won't lookup wrong.

But yeah.. it could not really work anyway.

> 
> >
> > >
> > > >
> > > > The other thing to mention is, if we postpone the iotlb lookup to be 
> > > > after
> > > > context entry, then logically we can have per-device iotlb, that means 
> > > > we can
> > > > replace IntelIOMMUState.iotlb with VTDAddressSpace.iotlb in the future, 
> > > > too,
> > > > which can also be more efficient.
> > >
> > > Right but we still need to limit the total slots and ATS is a better
> > > way to deal with the IOTLB bottleneck actually.
> >
> > I think it depends on how the iotlb ghash is implemented.  Logically I 
> > think if
> > we can split the cache to per-device it'll be slightly better because we 
> > don't
> > need to iterate over iotlbs of other devices when lookup anymore; meanwhile
> > each iotlb takes less space too (no devfn needed anymore).
> 
> So we've already used sid in the IOTLB hash, I wonder how much we can
> gain form this.

I think at least we can shrink iotlb structures, e.g.:

struct vtd_iotlb_key {
    uint16_t sid;               <------ not needed
    uint32_t pasid;             <------ not needed
    uint64_t gfn;
    uint32_t level;
};

struct VTDIOTLBEntry {
    uint64_t gfn;
    uint16_t domain_id;
    uint32_t pasid;             <------ not needed
    uint64_t slpte;
    uint64_t mask;
    uint8_t access_flags;
};

Thanks,

-- 
Peter Xu


Reply via email to