On Friday 14 July 2017 04:21 PM, Hemant Agrawal wrote:

> On 7/14/2017 3:59 PM, santosh wrote:
>> On Friday 14 July 2017 03:52 PM, santosh wrote:
>>
>>> On Friday 14 July 2017 03:09 PM, Hemant Agrawal wrote:
>>>
>>>> On 7/14/2017 2:00 PM, santosh wrote:
>>>>> On Friday 14 July 2017 01:37 PM, Hemant Agrawal wrote:
>>>>>
>>>>>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>>>>>> API(rte_bus_get_iommu_class) helps to automatically detect and select
>>>>>>> appropriate iova mapping scheme for iommu capable device on that bus.
>>>>>>>
>>>>>>> Algorithm for iova scheme selection for bus:
>>>>>>> 0. Iterate through bus_list.
>>>>>>> 1. Collect each bus iova mode value and update into 'mode' var.
>>>>>>> 2. Here value '1' is _pa and value '2' is _va mode.
>>>>>>> So mode selection scheme is like:
>>>>>>> if mode == 2 then iova mode is _va.
>>>>>>> if mode == 1 then iova mode is _pa
>>>>>>> if mode == 3 then iova mode is _pa.
>>>>>>>
>>>>>>> So mode != 2 will select the default iova mode (_pa).
>>>>>>>
>>>>>>> Signed-off-by: Santosh Shukla <santosh.shu...@caviumnetworks.com>
>>>>>>> Signed-off-by: Jerin Jacob <jerin.ja...@caviumnetworks.com>
>>>>>>> ---
>>>>>>>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
>>>>>>>  lib/librte_eal/common/eal_common_bus.c          | 23 +++++++++++++++++++++++
>>>>>>>  lib/librte_eal/common/eal_common_pci.c          |  1 +
>>>>>>>  lib/librte_eal/common/include/rte_bus.h         | 22 ++++++++++++++++++++++
>>>>>>>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
>>>>>>>  5 files changed, 48 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> index 33c2c32c0..a2dd65a33 100644
>>>>>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> @@ -202,6 +202,7 @@ DPDK_17.08 {
>>>>>>>      rte_bus_find_by_name;
>>>>>>>      rte_pci_match;
>>>>>>>      rte_pci_get_iommu_class;
>>>>>>> +    rte_bus_get_iommu_class;
>>>>>>>
>>>>>>>  } DPDK_17.05;
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> index 08bec2d93..5d5753ac9 100644
>>>>>>> --- a/lib/librte_eal/common/eal_common_bus.c
>>>>>>> +++ b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
>>>>>>>          c[0] = '\0';
>>>>>>>      return rte_bus_find(NULL, bus_can_parse, name);
>>>>>>>  }
>>>>>>> +
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * Get iommu class of devices on the bus.
>>>>>>> + */
>>>>>>> +enum rte_iova_mode
>>>>>>> +rte_bus_get_iommu_class(void)
>>>>>>> +{
>>>>>>> +    int mode = 0;
>>>>>>> +    struct rte_bus *bus;
>>>>>>> +
>>>>>>> +    TAILQ_FOREACH(bus, &rte_bus_list, next) {
>>>>>>> +
>>>>>>> +        if (bus->get_iommu_class)
>>>>>>> +            mode |= bus->get_iommu_class();
>>>>>>> +    }
>>>>>>> +
>>>>>> If you change the default return value to '0' for buses, this code will work.
>>>>>> E.g. PCI will return '0' when no device is probed and FSL MC will return
>>>>>> VA, so the selected mode will be 'VA'.
>>>>>>
>>>>> I'm confused why it won't work for fslmc case?
>>>>>
>>>>> Let me walk through the code:
>>>>>
>>>>> If no PCI device or (in future) no platform device is probed, then the bus
>>>>> opts to use the default mapping scheme, which is iova_pa.
>>>>>
>>>>> Let's take the PCI bus example:
>>>>> bus->get_iommu_class()
>>>>>     ---> bus->_pci_get_iommu_class()
>>>>>         * Now consider that no interface is bound to any PCI device; then
>>>>>           it will return RTE_IOVA_PA mode to the rte_bus layer (aka
>>>>>           bus->get_iommu_class).
>>>>>           So the iova mapping result from the iommu_class scan is
>>>>>           RTE_IOVA_PA (default).
>>>>>           It works for the PCI bus case; tested for the iova_va, iova_pa
>>>>>           and no-pci-device cases.
>>>>> Now in fslmc bus case:
>>>>> bus->get_iommu_class()
>>>>>     ---> bus->_fslmc_get_iommu_class()
>>>>>
>>>>>         * IIUC your comment: you want the fslmc bus to return RTE_IOVA_VA
>>>>>           if no device is detected, right?
>>>> why?
>>>>
>>> I didn't understand your previous reply:
>>> `e.g. PCI will return '0' - when no device is probed. FSL MC will return
>>> VA. the default mode will be 'VA'`
>>>
>>> So I'm asking: in the fslmc bus case, if no device is found, are you
>>> opting for the _va scheme or not?
>>> It seems _not_, per your comment below.
>>>
>>>
>>>> If a bus is merely present but none of its devices is in use by DPDK, then
>>>> the bus should return 0 and it *should not* participate in the IOMMU class
>>>> decision.
>>>>
>>> I think I understand your point. For example, if no device is found on the
>>> first (PCI) bus but a device is found on the second (platform) bus, then you
>>> don't want to fall back to the default (_pa) mode;
>>> instead you want to use the second bus's mode for mapping, which is _va. Right?
>>>
>>> If so, then in my first version we did introduce a case called _DC.
>>> _DC:0 --> stands for no-device found case.
>>>
>>>> Right now there are only two buses. There can be more buses (e.g. PCI,
>>>> platform, and fslmc in the case of dpaa2 as well).
>>>>
>>>> If a bus is not being used at all, why should it influence the decision of
>>>> other buses?
>>>>
>>> If you're referring to the above case then I agree; we'll re-introduce the
>>> _DC state from v1 in the next revision.
>>> That will look like:
>>> rte_pci_get_iommu_class() {
>>>     int mode = RTE_IOVA_DC; /* '0' */
>>>
>>>     return _DC; /* if no device found */
>>> }
>>>
>>> Right?
>
> Yes! Thanks!
>
> As I explained in the other thread, the PCI devices can be there, but none of
> them may be meant for DPDK:
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL:   probe driver: 8086:10d3 net_e1000_em
> EAL:   Not managed by a supported kernel driver, skipped
>
>
Ok, I will queue the _DC changes in the next version. Thanks for confirming.
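
To make the _DC idea concrete, here is a minimal sketch of how a per-bus
callback could behave. It is not the patch code: the enum values follow the
numbering used in this thread (0 = _DC, 1 = _pa, 2 = _va), and the struct,
field and function names are hypothetical, made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values, following the numbering used in this thread. */
enum iova_mode_sketch {
	IOVA_DC = 0,	/* don't care: no device on this bus is used by DPDK */
	IOVA_PA = 1,
	IOVA_VA = 2
};

/* Hypothetical per-device summary; this is not a DPDK structure. */
struct dev_sketch {
	int bound_to_dpdk_driver;	/* non-zero if a supported kernel driver manages it */
	int iommu_capable;		/* non-zero if that binding can use the IOMMU */
};

/*
 * Return the bus "vote". Devices that DPDK skips do not vote at all, so a
 * bus with only skipped devices returns IOVA_DC. A bus holding both kinds
 * of usable device returns IOVA_PA | IOVA_VA and leaves the decision to
 * the caller.
 */
static int
bus_get_iommu_class_sketch(const struct dev_sketch *devs, size_t n)
{
	int mode = IOVA_DC;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!devs[i].bound_to_dpdk_driver)
			continue;	/* e.g. "Not managed by a supported kernel driver, skipped" */
		mode |= devs[i].iommu_capable ? IOVA_VA : IOVA_PA;
	}
	return mode;
}
```

With this shape, a PCI bus whose devices are all skipped contributes nothing
to the OR done at the bus layer.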

>>>
>>>> If no bus has any device, the system default is PA anyway.
>>>>
>>> Right, if no bus is present then it's also the responsibility of
>>> `rte_bus_get_iommu_class` to use the default mapping scheme, which is _pa,
>>> and it does so.
>>>
>>>>>           If so, then your fslmc bus handler should do something like below:
>>>>>             -- If no device on fslmc bus: return RTE_IOVA_VA.
>>>>>             -- If device detected on fslmc bus and bound to iommu driver:
>>>>>                return RTE_IOVA_VA.
>>>>>             -- If device detected on fslmc bus but not bound to iommu
>>>>>                driver: return RTE_IOVA_PA.
>>>>>
>>>>> Does that make sense? If not, can you describe the fslmc mapping scheme?
>>>>>
>>>>>> If fslmc is not present, the default mode will be PA.
>>>>>>
>>>>>>> +    if (mode != RTE_IOVA_VA) {
>>>>>>> +        /* Use default IOVA mode */
>>>>>>> +        mode = RTE_IOVA_PA;
>>>>>>> +    }
>>>> The system default is anyway PA.
>>>>
>>> No, that check is needed for a case like this: the 1st bus returns _PA and
>>> the 2nd bus returns _VA, so mode = 3 (mix mode), which we don't support.
>>> So (as I mentioned before) it is the responsibility of
>>> rte_bus_get_iommu_class() to return the default mode (_pa). That's why!
>>>
>>>
>> Does your platform support `mix mode`? I asked the same question in thread
>> [04/11] too.
>> Let's say that dpaa2 supports mix mode: is it OK if the bus opts for the
>> default mapping in the mix mode case? Do you see any issue if the bus uses
>> the default scheme for mix mode?
>>
>>
>
> Yes! We can support mix mode. However, with your suggested changes in the
> mempool etc. APIs, DPDK will now not work for us in mix mode (when both PCI
> and DPAA2 devices are available) with VA support only for DPAA2 :)
>
> In case of mix mode, your logic is already there to default to PA. That is
> fine.
>
> But when PCI devices are not hooked to DPDK, we should be able to use VA for
> dpaa2.
>
Ok.
I assume that you'll implement a bus handler for fslmc, something like
`rte_fslmc_get_iommu_class()`, and make sure that it returns:
- _VA in the no-device-found case.
- _VA if an iommu-capable interface is detected for the device.
- _PA if no-iommu.

And the only change you expect at the bus layer (rte_bus_get_iommu_class()) is
to honor the `no device found` situation for the multiple-bus case. Right?
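
For reference, the bus-layer selection discussed in this thread can be
sketched roughly as below. This is only an illustration under assumptions: it
takes the per-bus results as a plain array instead of walking rte_bus_list,
and the names (MODE_DC, MODE_PA, MODE_VA, select_iova_mode) are hypothetical,
with values following the 0/1/2 numbering used above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; values follow the numbering used in this thread. */
enum iova_mode_value {
	MODE_DC = 0,	/* bus has no device in use: does not vote */
	MODE_PA = 1,
	MODE_VA = 2
};

/*
 * OR together the mode reported by each bus and pick the final mode.
 * Only a pure _va result selects _va. A pure _pa result, a mix (1 | 2 == 3),
 * or no vote at all (0) falls back to the default, _pa.
 */
static int
select_iova_mode(const int *bus_modes, size_t n)
{
	int mode = MODE_DC;
	size_t i;

	assert(bus_modes != NULL || n == 0);
	for (i = 0; i < n; i++)
		mode |= bus_modes[i];

	if (mode != MODE_VA)
		mode = MODE_PA;	/* default IOVA mode */

	return mode;
}
```

With a PCI bus that reports _DC and an fslmc bus that reports _VA, the result
is _va; once any bus reports _PA, the result is _pa.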

