On 10/18/2019 6:15 PM, Vamsi Krishna Attunuru wrote:
>
>
>> -----Original Message-----
>> From: dev <dev-boun...@dpdk.org> On Behalf Of Ferruh Yigit
>> Sent: Wednesday, October 16, 2019 9:44 PM
>> To: Vamsi Krishna Attunuru <vattun...@marvell.com>; Stephen Hemminger
>> <step...@networkplumber.org>; Yigit, Ferruh
>> <ferruh.yi...@linux.intel.com>
>> Cc: dev@dpdk.org; tho...@monjalon.net; Jerin Jacob Kollanukkaran
>> <jer...@marvell.com>; olivier.m...@6wind.com;
>> anatoly.bura...@intel.com; arybche...@solarflare.com; Kiran Kumar
>> Kokkilagadda <kirankum...@marvell.com>
>> Subject: Re: [dpdk-dev] [EXT] Re: [PATCH v10 4/5] kni: add IOVA=VA support
>> in KNI module
>>
>> On 10/16/2019 12:26 PM, Vamsi Krishna Attunuru wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Stephen Hemminger <step...@networkplumber.org>
>>>> Sent: Tuesday, October 15, 2019 9:16 PM
>>>> To: Yigit, Ferruh <ferruh.yi...@linux.intel.com>
>>>> Cc: Vamsi Krishna Attunuru <vattun...@marvell.com>; dev@dpdk.org;
>>>> tho...@monjalon.net; Jerin Jacob Kollanukkaran <jer...@marvell.com>;
>>>> olivier.m...@6wind.com; ferruh.yi...@intel.com;
>>>> anatoly.bura...@intel.com; arybche...@solarflare.com; Kiran Kumar
>>>> Kokkilagadda <kirankum...@marvell.com>
>>>> Subject: [EXT] Re: [dpdk-dev] [PATCH v10 4/5] kni: add IOVA=VA
>>>> support in KNI module
>>>>
>>>> External Email
>>>>
>>>> ---------------------------------------------------------------------
>>>> -
>>>> On Tue, 15 Oct 2019 16:43:08 +0100
>>>> "Yigit, Ferruh" <ferruh.yi...@linux.intel.com> wrote:
>>>>
>>>>> On 8/16/2019 7:12 AM, vattun...@marvell.com wrote:
>>>>>> From: Kiran Kumar K <kirankum...@marvell.com>
>>>>>>
>>>>>> Patch adds support for kernel module to work in IOVA = VA mode, the
>>>>>> idea is to get physical address from IOVA address using
>>>>>> iommu_iova_to_phys API and later use phys_to_virt API to convert
>>>>>> the physical address to kernel virtual address.
>>>>>>
>>>>>> When compared with IOVA = PA mode, there is no performance drop
>>>>>> with this approach.
>>>>>>
>>>>>> This approach does not work with the kernel versions less than
>>>>>> 4.4.0 because of API compatibility issues.
>>>>>>
>>>>>> Patch also updates these support details in KNI documentation.
>>>>>>
>>>>>> Signed-off-by: Kiran Kumar K <kirankum...@marvell.com>
>>>>>> Signed-off-by: Vamsi Attunuru <vattun...@marvell.com>
>>>>>
>>>>> <...>
>>>>>
>>>>>> @@ -348,15 +351,65 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
>>>>>>  	strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
>>>>>>
>>>>>>  	/* Translate user space info into kernel space info */
>>>>>> -	kni->tx_q = phys_to_virt(dev_info.tx_phys);
>>>>>> -	kni->rx_q = phys_to_virt(dev_info.rx_phys);
>>>>>> -	kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
>>>>>> -	kni->free_q = phys_to_virt(dev_info.free_phys);
>>>>>> -
>>>>>> -	kni->req_q = phys_to_virt(dev_info.req_phys);
>>>>>> -	kni->resp_q = phys_to_virt(dev_info.resp_phys);
>>>>>> -	kni->sync_va = dev_info.sync_va;
>>>>>> -	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
>>>>>> +	if (dev_info.iova_mode) {
>>>>>> +#ifdef HAVE_IOVA_AS_VA_SUPPORT
>>>>>> +		pci = pci_get_device(dev_info.vendor_id,
>>>>>> +				     dev_info.device_id, NULL);
>>>>>> +		if (pci == NULL) {
>>>>>> +			pr_err("pci dev does not exist\n");
>>>>>> +			return -ENODEV;
>>>>>> +		}
>>>>>
>>>>> If there is no PCI device KNI should still work.
>>>>
>>>> Right now it is possible to use KNI with netvsc PMD on Hyper-V/Azure.
>>>> With this patch that won't be possible.
>>>
>>> Hi Ferruh, Stephen,
>>>
>>> These can be fixed by forcing iommu_mode as PA when vdevs are used for
>>> KNI usecase.
>>>
>>> rte_bus_get_iommu_class(void)
>>> {
>>>  	enum rte_iova_mode mode = RTE_IOVA_DC;
>>> +	struct rte_devargs *devargs = NULL;
>>>  	bool buses_want_va = false;
>>>  	bool buses_want_pa = false;
>>>  	struct rte_bus *bus;
>>>
>>> +	if (rte_eal_check_module("rte_kni") == 1) {
>>> +		RTE_EAL_DEVARGS_FOREACH("vdev", devargs) {
>>> +			return RTE_IOVA_PA;
>>> +		}
>>> +	}
>>> +
>>>  	TAILQ_FOREACH(bus, &rte_bus_list, next) {
>>>  		enum rte_iova_mode bus_iova_mode;
>>>
>>> I think this will solve various use cases/combinations like PA or VA mode,
>>> pdev or vdev used for KNI.
>>> Existing use cases would not be affected by this patch series with the
>>> above fix.
>>>
>>
>> Hi Vamsi,
>>
>> I think this is not a problem of using vdev, so I think we can't solve this
>> via a vdev check only.
>>
>> The sample I gave (KNI PMD) is using vdev, but an application can use the
>> KNI library APIs directly to create a kni interface and have kernel/user
>> space communication without any device involved. KNI PMD (vdev) is just a
>> wrapper to make this easy.
>>
>> Just thinking aloud,
>> KNI is sharing a userspace buffer with the kernel, so basically it needs to
>> do virtual address to kernel virtual address translation, in a reasonably
>> fast manner.
>>
>> iova=va breaks KNI because:
>> 1) physical memory is not contiguous anymore, which breaks our address
>> translation logic
>> 2) we were using the physical address of the buffer for the address
>> translation, but we no longer have it, we now have the iova address.
>>
>> I assume 1) is resolved with 'rte_kni_pktmbuf_pool_create()'; it would be
>> helpful though if you can explain how this works?
>
> KNI kernel module uses pa2va, va2pa calls to translate buffer pointers; this
> will only work when both pa & va are 1-to-1 contiguous. In such cases, dpdk
> mbuf memory should not cross a physical page boundary. If it crosses, its
> iova/va will be contiguous, but its pa may or may not be contiguous. The KNI
> pktmbuf pool API ensures that the mempool is not populated with mbufs of
> that type (those that cross a page boundary). Once the pool is created, all
> mbufs in that pool reside within the page limits.

ack.

>>
>> For the second, a simple question: do we need to get PCIe device information
>> to be able to convert iova to kernel virtual address? Can't I get this
>> information from iommu somehow?
>> Think about a case where "--iova-mode=va" is provided but there is no
>> physical device bound to vfio-pci, can I still allocate memory? And how can
>> I use KNI in that case?
>
> We found a way to translate iova/uva to kva using kernel mm subsystem calls;
> using the get_user_pages_remote() call, the KNI module is able to get the
> mapping for iova to kva. This solution worked for pdevs (without sharing any
> dev info) and also for wrapper vdev pmds.

Good to hear this.

>
> I will push the next version of the patches with this solution and other
> command line arg changes. Since the major architectural issue is fixed in
> enabling iova=va for KNI, we are planning to merge this patch series in the
> 19.11 release.
>
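
As an illustration of the page-boundary rule discussed in this thread (an mbuf whose iova/va range is contiguous may still straddle two physical pages), a minimal userspace sketch of the containment check might look like the following. The helper name obj_within_one_page and its parameters are hypothetical and are not part of DPDK or of this patch series; the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): returns true when the object
 * occupying [addr, addr + len) lies entirely inside a single page of
 * size page_sz. page_sz is assumed to be a non-zero power of two.
 */
static bool obj_within_one_page(uintptr_t addr, size_t len, size_t page_sz)
{
	/* Precondition: page size must be a non-zero power of two. */
	assert(page_sz != 0 && (page_sz & (page_sz - 1)) == 0);

	/* An empty object trivially fits. */
	if (len == 0)
		return true;

	/* Clear the in-page offset bits to get the page base address. */
	uintptr_t mask = ~((uintptr_t)page_sz - 1);

	/* First and last byte must share the same page base. */
	return (addr & mask) == ((addr + len - 1) & mask);
}
```

For example, with a 4096-byte page, a 2048-byte object placed at 0x1000 stays inside one page, while the same object placed at 0x1c00 spills into the next page, so a pool builder following this rule would skip the second placement.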