On 2018/6/1 14:50, Leizhen (ThunderTown) wrote:
> 
> 
> On 2018/5/31 22:25, Robin Murphy wrote:
>> On 31/05/18 14:49, Hanjun Guo wrote:
>>> Hi Robin,
>>>
>>> On 2018/5/31 19:24, Robin Murphy wrote:
>>>> On 31/05/18 08:42, Zhen Lei wrote:
>>>>> In general, an IOMMU unmap operation follows the steps below (a rough
>>>>> model follows the list):
>>>>> 1. remove the mapping in the page table of the specified IOVA range
>>>>> 2. execute a TLBI command to invalidate the mapping cached in the TLB
>>>>> 3. wait for the above TLBI operation to finish
>>>>> 4. free the IOVA resource
>>>>> 5. free the physical memory resource
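
For illustration, here is a minimal userspace model of that strict path, step
for step with the list above; the names (pt_clear, tlbi_range, tlbi_sync,
iova_free, pages_free) are made-up stubs, not kernel APIs:

/* Minimal model of a strict unmap: one TLBI + sync per unmap. */
#include <stddef.h>
#include <stdio.h>

static void pt_clear(unsigned long iova, size_t size)  { printf("1. clear PTEs for %#lx (+%zu)\n", iova, size); }
static void tlbi_range(unsigned long iova)             { printf("2. TLBI for %#lx\n", iova); }
static void tlbi_sync(void)                            { printf("3. wait for the TLBI to complete\n"); }
static void iova_free(unsigned long iova)              { printf("4. free IOVA %#lx\n", iova); }
static void pages_free(unsigned long iova)             { printf("5. free the pages behind %#lx\n", iova); }

static void strict_unmap(unsigned long iova, size_t size)
{
        pt_clear(iova, size);
        tlbi_range(iova);
        tlbi_sync();            /* the expensive wait is paid on every single unmap */
        iova_free(iova);
        pages_free(iova);
}

int main(void)
{
        strict_unmap(0x100000, 0x1000);
        return 0;
}

The point of the strict path is that step 3 (the sync/wait) is paid per unmap.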
>>>>>
>>>>> This may be a problem when unmaps are very frequent: the combination of
>>>>> TLBI and wait operations consumes a lot of time. A feasible method is to
>>>>> defer the TLBI and IOVA-free operations; when they accumulate to a certain
>>>>> number, or a specified time is reached, execute only one tlbi_all command
>>>>> to clean up the TLB, then free the backed-up IOVAs. Mark this as
>>>>> non-strict mode (a second sketch below models the deferral).
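
And a matching userspace model of the deferred (non-strict) path, assuming a
fixed-size flush queue plus a timeout; the names (fq, FQ_SIZE, FQ_TIMEOUT_MS,
tlbi_all_and_sync) are illustrative, not the kernel's dma-iommu/iova.c symbols:

#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define FQ_SIZE       256   /* flush once this many unmaps have accumulated   */
#define FQ_TIMEOUT_MS 10    /* ...or once 10 ms have passed since the first   */

struct fq_entry { unsigned long iova; size_t size; };

static struct fq_entry fq[FQ_SIZE];
static unsigned int fq_count;
static struct timespec fq_first;

static void tlbi_all_and_sync(void) { printf("one TLBI-ALL + sync covers %u unmaps\n", fq_count); }
static void iova_free(unsigned long iova, size_t size) { (void)iova; (void)size; /* return IOVA to allocator */ }

static long ms_since(const struct timespec *t)
{
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - t->tv_sec) * 1000 + (now.tv_nsec - t->tv_nsec) / 1000000;
}

static void fq_flush(void)
{
        if (!fq_count)
                return;
        tlbi_all_and_sync();                    /* a single invalidation for the whole batch */
        for (unsigned int i = 0; i < fq_count; i++)
                iova_free(fq[i].iova, fq[i].size);  /* only now is it safe to recycle the IOVAs */
        fq_count = 0;
}

/* Called after the PTEs are cleared; defers both the TLBI and the IOVA free. */
static void nonstrict_unmap_tail(unsigned long iova, size_t size)
{
        if (fq_count == 0)
                clock_gettime(CLOCK_MONOTONIC, &fq_first);
        fq[fq_count++] = (struct fq_entry){ iova, size };
        if (fq_count == FQ_SIZE || ms_since(&fq_first) >= FQ_TIMEOUT_MS)
                fq_flush();
}

int main(void)
{
        for (unsigned long i = 0; i < 1000; i++)
                nonstrict_unmap_tail(0x100000 + i * 0x1000, 0x1000);
        fq_flush();                             /* drain the remainder */
        return 0;
}

The window between queueing an entry and fq_flush() running is exactly where
the security concern described below applies: the stale TLB entry can still
translate the old IOVA.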
>>>>>
>>>>> But it must be noted that, although the mapping has already been removed
>>>>> from the page table, it may still exist in the TLB, and the freed physical
>>>>> memory may also be reused for others. So an attacker can keep accessing
>>>>> memory through the just-freed IOVA to obtain sensitive data or corrupt
>>>>> memory. So VFIO should always choose the strict mode.
>>>>>
>>>>> Some may consider deferring the physical memory free as well, which would
>>>>> still follow strict mode. But for the map_sg cases, the memory allocation
>>>>> is not controlled by the IOMMU APIs, so it is not enforceable.
>>>>>
>>>>> Fortunately, Intel and AMD have already applied the non-strict mode and
>>>>> put the queue_iova() operation into the common file dma-iommu.c, and my
>>>>> work is based on it. The difference is that the arm-smmu-v3 driver calls
>>>>> the IOMMU common APIs to unmap, while the Intel and AMD IOMMU drivers do
>>>>> not.
>>>>>
>>>>> Below is the performance data of strict vs non-strict for an NVMe device:
>>>>> Random Read  IOPS: 146K (strict) vs 573K (non-strict)
>>>>> Random Write IOPS: 143K (strict) vs 513K (non-strict)
>>>>
>>>> What hardware is this on? If it's SMMUv3 without MSIs (e.g. D05), then 
>>>> you'll still be using the rubbish globally-blocking sync implementation. 
>>>> If that is the case, I'd be very interested to see how much there is to 
>>>> gain from just improving that - I've had a patch kicking around for a 
>>>> while[1] (also on a rebased branch at [2]), but don't have the means for 
>>>> serious performance testing.
> I will try your patch to see how much it can improve. I think the best way
Hi Robin,

I applied your patch and got the improvement below.

Random Read  IOPS: 146K --> 214K
Random Write IOPS: 143K --> 212K

> to resolve the globally-blocking sync is for the hardware to provide a 64-bit
> CONS register, so that it can never wrap, and the spinlock can also be
> removed (a rough illustration follows below).
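
For what it's worth, a small userspace illustration of that idea (purely a
model, not SMMU driver code): with free-running 64-bit producer/consumer
indices the wrap case disappears, so "is there space?" and "has my command
been consumed?" become plain unsigned comparisons with no locking around wrap
handling. Q_SIZE, cmdq_issue() and the other names are made up for the sketch.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define Q_SIZE 256                      /* the real ring slot would be index % Q_SIZE */

static _Atomic uint64_t prod;           /* next slot to produce, free-running          */
static _Atomic uint64_t cons;           /* consumed so far ("hardware"), free-running  */

/* Producer: claim a slot and return the command's absolute 64-bit index. */
static uint64_t cmdq_issue(void)
{
        return atomic_fetch_add(&prod, 1);
}

/* Space check: occupancy is just the difference of the two indices. */
static bool cmdq_has_space(void)
{
        return atomic_load(&prod) - atomic_load(&cons) < Q_SIZE;
}

/* Completion check: with 64 bits this can never be confused by wrap-around. */
static bool cmdq_consumed(uint64_t idx)
{
        return atomic_load(&cons) > idx;
}

int main(void)
{
        uint64_t my_cmd = cmdq_issue();
        atomic_store(&cons, my_cmd + 1);    /* pretend the hardware consumed it */
        printf("space=%d consumed=%d\n", cmdq_has_space(), cmdq_consumed(my_cmd));
        return 0;
}

With a narrower, wrapping index the same checks also have to track how many
times CONS has gone around, which is the part a 64-bit register would make
unnecessary.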
> 
>>>
>>> The hardware is the new D06, whose SMMU supports MSIs,
>>
>> Cool! Now that profiling is fairly useful since we got rid of most of the 
>> locks, are you able to get an idea of how the overhead in the normal case is 
>> distributed between arm_smmu_cmdq_insert_cmd() and 
>> __arm_smmu_sync_poll_msi()? We're always trying to improve our understanding 
>> of where command-queue-related overheads turn out to be in practice, and 
>> there's still potentially room to do nicer things than TLBI_NH_ALL ;)
> Even if the software has no overhead, there may still be a problem, because
> the SMMU needs to execute the commands in sequence, especially before the
> globally-blocking sync has been removed. Based on the actual execution time
> of a single TLBI and SYNC, we can get the theoretical upper limit (rough
> numbers below).
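
To make that concrete with purely illustrative numbers: if a single TLBI plus
its SYNC takes around 1 us to drain through the command queue, then even with
zero software overhead a strictly serialized queue completes at most about
1,000,000 invalidate+sync pairs per second, so per-unmap invalidation would
bound strict-mode unmap throughput at roughly 1M ops/s on such hardware, no
matter how cheap the software path becomes.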
> 
> BTW, I will reply to the rest of the mail next week. I'm busy with other things now.
> 
>>
>> Robin.
>>
>>> it's not D05 :)
>>>
>>> Thanks
>>> Hanjun
>>>
>>
>>
> 

-- 
Thanks!
Best Regards
