On Friday 14 July 2017 04:21 PM, Hemant Agrawal wrote:
> On 7/14/2017 3:59 PM, santosh wrote:
>> On Friday 14 July 2017 03:52 PM, santosh wrote:
>>
>>> On Friday 14 July 2017 03:09 PM, Hemant Agrawal wrote:
>>>
>>>> On 7/14/2017 2:00 PM, santosh wrote:
>>>>> On Friday 14 July 2017 01:37 PM, Hemant Agrawal wrote:
>>>>>
>>>>>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>>>>>> API(rte_bus_get_iommu_class) helps to automatically detect and select
>>>>>>> appropriate iova mapping scheme for iommu capable device on that bus.
>>>>>>>
>>>>>>> Algorithm for iova scheme selection for bus:
>>>>>>> 0. Iterate through bus_list.
>>>>>>> 1. Collect each bus iova mode value and update into 'mode' var.
>>>>>>> 2. Here value '1' is _pa and value '2' is _va mode.
>>>>>>> So mode selection scheme is like:
>>>>>>> if mode == 2 then iova mode is _va.
>>>>>>> if mode == 1 then iova mode is _pa.
>>>>>>> if mode == 3 then iova mode is _pa.
>>>>>>>
>>>>>>> So mode != 2 will be default iova mode.
>>>>>>>
>>>>>>> Signed-off-by: Santosh Shukla <santosh.shu...@caviumnetworks.com>
>>>>>>> Signed-off-by: Jerin Jacob <jerin.ja...@caviumnetworks.com>
>>>>>>> ---
>>>>>>>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
>>>>>>>  lib/librte_eal/common/eal_common_bus.c          | 23 +++++++++++++++++++++++
>>>>>>>  lib/librte_eal/common/eal_common_pci.c          |  1 +
>>>>>>>  lib/librte_eal/common/include/rte_bus.h         | 22 ++++++++++++++++++++++
>>>>>>>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
>>>>>>>  5 files changed, 48 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> index 33c2c32c0..a2dd65a33 100644
>>>>>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>>> @@ -202,6 +202,7 @@ DPDK_17.08 {
>>>>>>>      rte_bus_find_by_name;
>>>>>>>      rte_pci_match;
>>>>>>>      rte_pci_get_iommu_class;
>>>>>>> +    rte_bus_get_iommu_class;
>>>>>>>
>>>>>>> } DPDK_17.05;
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> index 08bec2d93..5d5753ac9 100644
>>>>>>> --- a/lib/librte_eal/common/eal_common_bus.c
>>>>>>> +++ b/lib/librte_eal/common/eal_common_bus.c
>>>>>>> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
>>>>>>>      c[0] = '\0';
>>>>>>>      return rte_bus_find(NULL, bus_can_parse, name);
>>>>>>>  }
>>>>>>> +
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * Get iommu class of devices on the bus.
>>>>>>> + */
>>>>>>> +enum rte_iova_mode
>>>>>>> +rte_bus_get_iommu_class(void)
>>>>>>> +{
>>>>>>> +    int mode = 0;
>>>>>>> +    struct rte_bus *bus;
>>>>>>> +
>>>>>>> +    TAILQ_FOREACH(bus, &rte_bus_list, next) {
>>>>>>> +
>>>>>>> +        if (bus->get_iommu_class)
>>>>>>> +            mode |= bus->get_iommu_class();
>>>>>>> +    }
>>>>>>> +
>>>>>> If you change the default return as '0' for buses. This code will work.
>>>>>> e.g. PCI will return '0' - when no device is probed. FSL MC will return
>>>>>> VA. the default mode will be 'VA'
>>>>>>
>>>>> I'm confused why it won't work for fslmc case?
>>>>>
>>>>> Let me walk through the code:
>>>>>
>>>>> If no-pci device Or (future) no-platform device probed then bus opt
>>>>> to use default mapping scheme .. which is iova_pa(default scheme).
>>>>>
>>>>> Lets take PCI_bus example:
>>>>> bus->get_iommu_class()
>>>>>     ---> bus->_pci_get_iommu_class()
>>>>>     * Now consider that no interface bound to any of PCI device, then
>>>>>       it will return RTE_IOVA_PA mode to rte_bus layer (aka
>>>>>       bus->get_iommu_class).
>>>>>       So the iova mapping result from iommu_class scan is RTE_IOVA_PA
>>>>>       (default).
>>>>>       It works for PCI_bus case, tested for both iova_va and iova_pa
>>>>>       case, no-pci device case.
>>>>>
>>>>> Now in fslmc bus case:
>>>>> bus->get_iommu_class()
>>>>>     ---> bus->_fslmc_get_iommu_class()
>>>>>
>>>>>     * IIUC your comment - You want fslmc bus to return RTE_IOVA_VA if
>>>>>       no device detected, Right?
>>>> why?
>>>>
>>> As I didn't understand your previous reply:
>>> `e.g. PCI will return '0' - when no device is probed. FSL MC will return
>>> VA. the default mode will be 'VA'`
>>>
>>> So, I'm asking you that in fslmc bus case - if no device found then are you
>>> opting _va scheme or not?
>>> Seems like _not_ per your below comment.
>>>
>>>
>>>> If bus is just present but no device is in use for dpdk, then bus should
>>>> return 0 and it *should not* participate in the IOMMU class decision.
>>>>
>>> I think, I understand your point. Example: if you have no-pci on first PCI
>>> bus but device found on 2nd platform bus then you don't want to fall back to
>>> default (/_pa) mode,
>>> instead you want to use 2nd bus mode for mapping, which is _va. Right?
>>>
>>> If so then in my first version we did introduce the case called _DC.
>>> _DC:0 --> stands for no-device found case.
>>>
>>>> Right now there are only two buses. There can be more buses. (e.g. PCI,
>>>> platform, fslmc in case of dpaa2 as well).
>>>>
>>>> If the bus is not being used at all, why should it influence the decision
>>>> of other buses?
>>>>
>>> If you're referring to above case then I agree, we'll re-introduce _DC state
>>> from v1 in next revision.
>>> That will look like
>>> rte_pci_get_iommu_class() {
>>>     int mode = RTE_IOVA_DC; /* '0' */
>>>
>>>     return _DC; /* if no device found */
>>> }
>>>
>>> Right?
>
> Yes! Thanks!
>
> As I explained in the other thread. The PCI devices can be there, but none of
> them is for DPDK:
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL: probe driver: 8086:10d3 net_e1000_em
> EAL: Not managed by a supported kernel driver, skipped
>

Ok, I will queue _DC changes in next version. Thanks for confirming.
>>>
>>>> if no bus has any device, the System default is anyway PA.
>>>>
>>> Right, if no bus is present then it's also the responsibility of
>>> `rte_bus_get_iommu_class` to use default mapping scheme, which is _pa,
>>> and which it does.
>>>
>>>>> if so then your fslmc bus handle should do something like below
>>>>> -- If no device on fslmc bus : return RTE_IOVA_VA.
>>>>> -- If device detected on fslmc bus and bound to iommu driver : return RTE_IOVA_VA
>>>>> -- If device detected fslmc but not bound to iommu drv : return RTE_IOVA_PA..
>>>>>
>>>>> make sense? If not then can you describe fslmc mapping scheme?
>>>>>
>>>>>> if fslmc is not present. The default mode will be PA.
>>>>>>
>>>>>>> +    if (mode != RTE_IOVA_VA) {
>>>>>>> +        /* Use default IOVA mode */
>>>>>>> +        mode = RTE_IOVA_PA;
>>>>>>> +    }
>>>> The system default is anyway PA.
>>>>
>>> No, that check is needed for a case like 1st bus returns _PA and 2nd bus
>>> returns _VA, then mode = 3 (mix mode), which we don't support, so (as I
>>> mentioned before) it's the responsibility of rte_bus_get_iommu_class() to
>>> return default mode (_pa). That's why!
>>>
>>>
>> Does your platform support `mix mode`? I asked same question in thread
>> [04/11] too.
>> Let's say that dpaa2 supports mix mode, then is it Ok if bus chose to opt
>> default mapping for mix mode case? Do you see any issue if bus opts to use
>> default scheme for mix mode?
>>
>>
>
> yes! We can support mix mode. However with your suggested changes in mempool
> etc APIs, now the DPDK will not work for us in mix mode (when both PCI and
> DPAA2 devices are available) with VA support only for DPAA2 :)
>
> In case of mix mode, your logic is already there to default to PA. That is
> fine.
>
> But, when PCI devices are not hooked to dpdk. We should be able to use VA for
> dpaa2.
>

Ok. I assume that you'll implement bus handle for fslmc, something like
`rte_fslmc_get_iommu_class()`, and make sure that you return:
- _VA in no-device found case.
- _VA if an iommu capable interface is detected for the device.
- _PA if no-iommu.

And the only change which you expect at bus layer (rte_bus_get_iommu_class())
is to honor the `no device found` situation for the multiple bus case. Right?
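
To make sure we mean the same thing, here is a rough standalone sketch of the
bus-layer selection with the _DC state folded in. It is only an illustration,
not the patch: `enum iova_mode`, `struct bus`, `select_iova_mode()` and the
three callbacks are hypothetical stand-ins for the rte_ types, and the values
(_DC = 0, _PA = 1, _VA = 2) are the ones discussed above. A bus with no device
in use returns _DC and so adds nothing to the OR; only a pure _VA result
selects VA, everything else (all _DC, _PA only, or mix mode 3) defaults to PA.

```c
#include <stddef.h>

/* Assumed values from this thread: _DC = 0, _PA = 1, _VA = 2. */
enum iova_mode {
    IOVA_DC = 0, /* don't care: no device in use on this bus */
    IOVA_PA = 1,
    IOVA_VA = 2
};

/* Hypothetical stand-in for struct rte_bus; only the callback matters. */
struct bus {
    const char *name;
    enum iova_mode (*get_iommu_class)(void);
};

/* OR together what every bus reports, then pick the final mode. */
enum iova_mode
select_iova_mode(const struct bus *buses, size_t n)
{
    int mode = IOVA_DC;
    size_t i;

    for (i = 0; i < n; i++) {
        /* A bus without a callback, or one returning _DC, adds nothing. */
        if (buses[i].get_iommu_class)
            mode |= buses[i].get_iommu_class();
    }

    /* 0 (no device anywhere), 1 (_PA) and 3 (mix mode) all default to PA. */
    if (mode != IOVA_VA)
        mode = IOVA_PA;

    return (enum iova_mode)mode;
}

/* Hypothetical per-bus callbacks used to exercise the cases above. */
enum iova_mode pci_no_device(void) { return IOVA_DC; }
enum iova_mode pci_pa_only(void)   { return IOVA_PA; }
enum iova_mode fslmc_va(void)      { return IOVA_VA; }
```

With this, PCI present but unused plus fslmc reporting _VA gives VA, while PCI
reporting _PA plus fslmc reporting _VA gives PA (mix mode falls back to the
default), which I think matches what you asked for.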