Re: [Qemu-devel] [PATCH v2 7/7] intel_iommu: support passthrough (PT)

Lan Tianyu Wed, 19 Apr 2017 00:36:56 -0700

Hi All:
        Sorry for later response.
On 2017年04月18日 17:04, Liu, Yi L wrote:
>> -----Original Message-----
>> From: Peter Xu [mailto:pet...@redhat.com]
>> Sent: Tuesday, April 18, 2017 3:27 PM
>> To: Liu, Yi L <yi.l....@intel.com>
>> Cc: qemu-devel@nongnu.org; Lan, Tianyu <tianyu....@intel.com>; Michael S .
>> Tsirkin <m...@redhat.com>; Jason Wang <jasow...@redhat.com>; Marcel
>> Apfelbaum <mar...@redhat.com>; David Gibson <da...@gibson.dropbear.id.au>
>> Subject: Re: [Qemu-devel] [PATCH v2 7/7] intel_iommu: support passthrough 
>> (PT)
>>
>> On Tue, Apr 18, 2017 at 06:02:44AM +0000, Liu, Yi L wrote:
>>>>> Hi Peter,
>>>>>
>>>>> Skip address space switching is a good idea to support Passthru mode.
>>>>> However, without the address space, the vfio notifier would not be
>>>>> registered, thus vIOMMU emulator has no way to connect to host. It
>>>>> is no harm if there is only map/unmap notifier. But if we have
>>>>> more notifiers other than map/unmap, it may be a problem.
>>>>>
>>>>> I think we need to reconsider it here.
>>>>
>>>> For now I think as switching is good to us in general. Could I know
>>>> more context about this? Would it be okay to work on top of this in the 
>>>> future?
>>>>
>>>
>>> Let me explain. For vSVM enabling, it needs to add new notifier to
>>> bind gPASID table to host. If an assigned device uses SVM in guest, as
>>> designed guest IOMMU driver would set "pt" for it. This is to make
>>> sure the host second-level page table stores GPA->HPA mapping. So that
>>> pIOMMU can do nested translation to achieve gVA->GPA GPA->HPA mapping.
>>
>> That should mean that in the guest it will only enable first level 
>> translation, while in
>> the host it'll be nested (first + second)? Then you should be trapping the 
>> guest
>> extended context entry invalidation, then pass the PASID table pointer 
>> downward to
>> the host IOMMU, am I correct?
> 
> exactly.
> 
>>
>>>
>>> So the situation is if we want to keep GPA->HPA mapping, we should
>>> skip address space switch, but the vfio listener would not know vIOMMU
>>> is added, so no vfio notifier would be registered. But if we do the
>>> switch, the GPA->HPA mapping is unmapped. And need another way to build the
>> GPA->HPA mapping.
>>
>> If my understanding above is correct, current IOMMU notifier may not satisfy 
>> your
>> need. If you see the current notify interface, it's delivering 
>> IOMMUTLBEntry. I
>> suspect it only suites for IOTLB notifications, but not if you want to pass 
>> some
>> pointer (one GPA) downward somewhere.
> 
> Yeah, you got it more than absolutely. One of my patch is to modify the para 
> to be
> void * and let each notifiers to translate separately. I think it should be a 
> reasonable
> change.
> 
> Supposedly, I would send RFC for vSVM soon. We may talk more it at that 
> thread.
> 
>>>
>>> I think we may have two choice here. Pls let me know your ideas.
>>>
>>> 1. skip the switch for "pt" mode, find other way to register vfio
>>> notifiers. not sure if this is workable. Just a quick thought.
>>>
>>> 2. do the switch, and rebuild GPA->HPA mapping if device use "pt"
>>> mode. For this option, I also have two way to build the GPA->HPA mapping.
>>> a) walk all the memory region sections in address_space_memory, and build 
>>> the
>> mapping.
>>> address_space_memory.dispatch->map.sections, sections stores all the mapped
>> sections.
>>>
>>> b) let guest iommu driver mimics a 1:1 mapping for the devices use
>>> "pt" mode. in this way, the GPA->HPA mapping could be setup by walking
>>> the guest SL page table. This is consistent with your vIOVA replay solution.
>>
>> I'll prefer (1). Reason explained above.
>>
>>>
>>> Also I'm curious if Tianyu's fault report framework needs to register new 
>>> notifiers.
>>
>> For Tianyu's case, I feel like we need other kind of notifier as well, but 
>> not the
>> current IOTLB one. And, of course in this case it'll be in an reverted 
>> direction, which
>> is from device to vIOMMU.
>>
>> (I am thinking whether it's a good time now to let any PCI device able  to 
>> fetch its
>> direct owner IOMMU object, then it can register anything  it want onto that 
>> object
>> maybe?)
>>
> Hmmm, let's see Tianyu's comment as he's driving fault report. Let's keep in 
> touch on
> this Passthru-Mode enabling.


In my previous RFC patchset of fault event reporting, I registered fault
notifier when there is a VFIO group attached to VFIO container and used
the address space to check whether vIOMMU is added. The address space is
returned by vtd_host_dma_iommu(). vtd_find_add_as() initializes device's
IOMMU memory region and put it into device's address space as root
memory region.
Does this make sense?

@@ -1103,6 +1132,14 @@ static int vfio_connect_container(VFIOGroup
*group, AddressSpace *as,
         goto listener_release_exit;
     }

+    if (memory_region_is_iommu(container->space->as->root)) {
+        if (vfio_set_iommu_fault_notifier(container)) {
+            error_setg_errno(errp, -ret,
+                "Fail to set IOMMU fault notifier");
+            goto listener_release_exit;
+        }
+    }
+
     container->initialized = true;

-- 
Best regards
Tianyu Lan

Re: [Qemu-devel] [PATCH v2 7/7] intel_iommu: support passthrough (PT)

Reply via email to