Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device

Tian, Kevin Sun, 16 Jul 2017 19:21:10 -0700

> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Friday, July 14, 2017 7:26 PM
> 
> On 14/07/17 08:20, Tian, Kevin wrote:
> >> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> >> Sent: Friday, July 7, 2017 11:15 PM
> >>
> >> On 07/07/17 07:21, Tian, Kevin wrote:
> >>> sorry I didn't quite get this part, and here is my understanding:
> >>>
> >>> Guest programs vIOMMU to map a gIOVA (used by MSI) to a GPA
> >>> of doorbell register of virtual irqchip. vIOMMU then
> >>> triggers VFIO map/unmap to update physical IOMMU page
> >>> table for gIOVA -> HPA of real doorbell of physical irqchip
> >>
> >> At the moment (non-SVM), physical and virtual MSI doorbell are completely
> >> dissociated. VFIO itself maps the doorbell GPA->HPA during container
> >> initialization. The GPA, chosen arbitrarily by the host, is then removed
> >> from the guest GPA space.
> >
> > got you. I also got some basic understanding from below link. :-)
> >
> > https://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
> >
> >>
> >> When the guest programs the vIOMMU to map a gIOVA to the virtual irqchip
> >> doorbell, I suppose Qemu will notice that the GPA doesn't correspond to
> >> RAM and will withhold sending a VFIO_IOMMU_MAP_DMA request.
> >>
> >> (For SVM I don't want to go into the details just now, but we will
> >> probably need a separate VFIO mechanism to update the physical MSI-X
> >> tables with whatever gIOVA the guest mapped in its private stage-1 page
> >> tables.)
> >
> > I guess there may be either a terminology difference or a hardware
> > difference here, since I noted you mentioned IOVA with stage-1
> > multiple times.
> >
> > For Intel VT-d:
> >
> > - stage-1 is only for VA translation, tagged with PASID
> > - stage-2 can be used for IOVA translation on bare metal or GPA/gIOVA
> > translation in virtualization, w/o PASID tagged
> 
> The terminology is indeed a bit confusing, and the hardware slightly
> different. For me IOVA is the address used as input of the pIOMMU, PA is
> the output address, and GPA only exists if there is stage-1 + stage-2. So
> I think what I meant by gIOVA above was VA in your description.


In the Linux kernel, IOVA specifically refers to a pseudo address space
remapped to PA (e.g. from pci_map), while VA is a real CPU virtual
address (so-called SVM). Either IOVA or VA can be the input to the pIOMMU,
depending on the usage. When running inside a VM, the input addresses
become gIOVA or GVA. What about following this convention here and in
future discussions, though I agree that conceptually IOVA can represent
any input of the pIOMMU? :-)
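
As a rough illustration of that naming convention (a sketch only; the
function name and the two boolean flags are hypothetical and are not
kernel or QEMU API), the four names come from two independent choices:
whether the address is a CPU virtual address (SVM) and whether the
device is driven from inside a guest.

```python
def input_address_name(in_guest: bool, svm: bool) -> str:
    """Name the pIOMMU input address under the convention proposed above.

    in_guest: the address is programmed by a guest rather than by the host.
    svm:      the address is a real CPU virtual address (SVM) rather than
              a pseudo address space remapped to PA (e.g. from pci_map).
    """
    if svm:
        return "GVA" if in_guest else "VA"
    return "gIOVA" if in_guest else "IOVA"


if __name__ == "__main__":
    for in_guest in (False, True):
        for svm in (False, True):
            print(in_guest, svm, input_address_name(in_guest, svm))
```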

> 
> I understand your "stage-1" and "stage-2" are named "first-level" and
> "second level" in the VT-d spec?

Yes, VT-d uses first/second level.

> 
> If I read the VT-d spec correctly, I think the main difference on ARM SMMU
> is that stage-2 always follows stage-1 translation, but either stage may
> be disabled (or both, for bypass mode). There is no mode like in VT-d,
> where non-PASID transactions go only through stage-2 and PASID
> transactions go only through stage-1. I believe this is (NESTE=0,
> T=000b/001b) in the Extended-Context-Entry.
> 
> Something equivalent in SMMU is disabling stage-2 and using the entry 0 in
> the PASID table for non-PASID traffic. In this mode, traffic that uses
> PASID#0 would be aborted. So using your terms, the SMMU can have VAs and
> IOVAs be translated by stage-1 and then, if enabled, be translated by
> stage-2 as well.
> 
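
To make the contrast concrete, here is a toy model of the two behaviours
described above. It is a sketch under simplifying assumptions, not a
description of either specification: each translation stage is just a
Python function from address to address, and all names (smmu_translate,
vtd_translate_non_nested, the table dictionaries) are hypothetical.

```python
from typing import Callable, Dict, Optional

# A translation stage is modelled as a plain function: address in, address out.
Stage = Callable[[int], int]


def smmu_translate(addr: int,
                   pasid: Optional[int],
                   stage1_tables: Optional[Dict[int, Stage]],
                   stage2: Optional[Stage]) -> int:
    """SMMU-like flow: stage-2 always follows stage-1; either may be disabled.

    Passing None for a stage disables it; disabling both gives bypass.
    Non-PASID traffic (pasid is None) uses entry 0 of the PASID table.
    """
    if stage1_tables is not None:
        index = 0 if pasid is None else pasid
        addr = stage1_tables[index](addr)
    if stage2 is not None:
        addr = stage2(addr)
    return addr


def vtd_translate_non_nested(addr: int,
                             pasid: Optional[int],
                             first_level: Dict[int, Stage],
                             second_level: Stage) -> int:
    """VT-d-like non-nested mode: PASID traffic goes only through the
    first level, non-PASID traffic goes only through the second level."""
    if pasid is not None:
        return first_level[pasid](addr)
    return second_level(addr)


if __name__ == "__main__":
    # Toy "page tables": fixed offsets instead of real page walks.
    s1 = {0: lambda a: a + 0x1000, 5: lambda a: a + 0x5000}
    s2 = lambda a: a + 0x100000

    print(hex(smmu_translate(0x10, None, s1, None)))          # stage-1 only
    print(hex(smmu_translate(0x10, 5, s1, s2)))               # stage-1 then stage-2
    print(hex(vtd_translate_non_nested(0x10, None, s1, s2)))  # second level only
    print(hex(vtd_translate_non_nested(0x10, 5, s1, s2)))     # first level only
```

The model leaves out the detail that, when entry 0 is reserved for
non-PASID traffic, transactions explicitly tagged PASID#0 would be
aborted.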

Clear to me. Thanks for the explanation.

Kevin
