________________________________
From: Ferruh Yigit <ferruh.yi...@intel.com>
Sent: Friday, July 12, 2019 4:40 PM
To: Vamsi Krishna Attunuru; dev@dpdk.org
Cc: olivier.m...@6wind.com; arybche...@solarflare.com; Kiran Kumar Kokkilagadda
Subject: Re: [EXT] Re: [PATCH v6 4/4] kernel/linux/kni: add IOVA support in kni
module
On 7/12/2019 11:38 AM, Vamsi Krishna Attunuru wrote:
>
>
>
> --------------------------------------------------------------------------------
> *From:* Ferruh Yigit <ferruh.yi...@intel.com>
> *Sent:* Thursday, July 11, 2019 10:00 PM
> *To:* Vamsi Krishna Attunuru; dev@dpdk.org
> *Cc:* olivier.m...@6wind.com; arybche...@solarflare.com; Kiran Kumar
> Kokkilagadda
> *Subject:* [EXT] Re: [PATCH v6 4/4] kernel/linux/kni: add IOVA support in kni
> module
>
> External Email
>
> ----------------------------------------------------------------------
> On 6/25/2019 4:57 AM, vattun...@marvell.com wrote:
>> From: Kiran Kumar K <kirankum...@marvell.com>
>>
>> Patch adds support for kernel module to work in IOVA = VA mode,
>> the idea is to get physical address from iova address using
>> iommu_iova_to_phys API and later use phys_to_virt API to
>> convert the physical address to kernel virtual address.
>>
>> When compared with IOVA = PA mode, there is no performance
>> drop with this approach.
>>
>> This approach does not work with the kernel versions less
>> than 4.4.0 because of API compatibility issues.
>>
>> Signed-off-by: Kiran Kumar K <kirankum...@marvell.com>
>> Signed-off-by: Vamsi Attunuru <vattun...@marvell.com>
>
> <...>
>
>> @@ -351,15 +354,56 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
>> strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
>>
>> /* Translate user space info into kernel space info */
>> - kni->tx_q = phys_to_virt(dev_info.tx_phys);
>> - kni->rx_q = phys_to_virt(dev_info.rx_phys);
>> - kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
>> - kni->free_q = phys_to_virt(dev_info.free_phys);
>> -
>> - kni->req_q = phys_to_virt(dev_info.req_phys);
>> - kni->resp_q = phys_to_virt(dev_info.resp_phys);
>> - kni->sync_va = dev_info.sync_va;
>> - kni->sync_kva = phys_to_virt(dev_info.sync_phys);
>> + if (dev_info.iova_mode) {
>> +#if KERNEL_VERSION(4, 4, 0) > LINUX_VERSION_CODE
>
> We have "kni/compat.h" to put the version checks, please use abstracted
> feature
> checks only in the code.
> From experience this goes ugly quickly with the addition to distro kernels and
> their specific versioning, so better to hide these all from the source code.
>
> And this version requirement needs to be documented in kni doc.
>
> ack
>
>> + (void)pci;
>> + pr_err("Kernel version is not supported\n");
>
> Can you please include 'iova_mode' condition into the message log, because
> this
> kernel version is supported if user wants to use via 'iova_mode == 0'
> condition.
>
> ack
>
>> + return -EINVAL;
>> +#else
>> + pci = pci_get_device(dev_info.vendor_id,
>> + dev_info.device_id, NULL);
>> + while (pci) {
>> + if ((pci->bus->number == dev_info.bus) &&
>> + (PCI_SLOT(pci->devfn) == dev_info.devid) &&
>> + (PCI_FUNC(pci->devfn) == dev_info.function)) {
>> + domain = iommu_get_domain_for_dev(&pci->dev);
>> + break;
>> + }
>> + pci = pci_get_device(dev_info.vendor_id,
>> + dev_info.device_id, pci);
>> + }
>
> What if 'pci' is NULL here?
> In kni it is not required to provide a device at all.
>
> Ack, will add a NULL check.
> other point is not clear to me, device info is absolutely required at least
> for IOVA=VA mode, since it requires to procure iommu domain details.
"device info is absolutely required" *only* for IOVA=VA mode, so user may skip
to provide it.
> Any thoughts or ways to address this without device.?
Return error if 'iova_mode' requested but device info not?
But you didn't replied to passing 'iova_mode' from application, I would like
hear what you are thinking about it..
There is a query on that regard in the cover letter mail chain. Anyways below
is that
"""
> I support the idea to remove 'kni' forcing to the IOVA=PA mode, but also not
> sure about forcing all KNI users to update their code to allocate mempool in a
> very specific way.
>
> What about giving more control to the user on this?
>
> Any user want to use IOVA=VA and KNI together can update application to
> justify memory allocation of the KNI and give an explicit "kni iova_mode=1"
> config.
Jerin > Where this config comes, eal or kni sample app or KNI public API?
"""
>
> <...>
>
>> @@ -186,7 +202,10 @@ kni_fifo_trans_pa2va(struct kni_dev *kni,
>> return;
>>
>> for (i = 0; i < num_rx; i++) {
>> - kva = pa2kva(kni->pa[i]);
>> + if (likely(kni->iova_mode == 1))
>> + kva = iova2kva(kni, kni->pa[i]);
>> + else
>> + kva = pa2kva(kni->pa[i]);
>
> To reduce the churn, what about updating the 'pa2kva()' and put the
> "(kni->iova_mode == 1)" check there? Does it help? (not only 'pa2kva()' but
> its
> friends also, and if it makes more sense agree to rename the functions)
>
> No, in VA mode, kni->pa[i] points to iova address, pa2kva() of iova address
> might
> crash, hence the if..else check is added.
I understand that part.
What I am suggestion is something like this:
kva = common_function(kni, kni->pa[i]);
---
common_function() {
if (unlikely(kni->iova_mode == 1))
return iova2kva(kni, kni->pa[i]);
return pa2kva(kni->pa[i]);
}
To hide the check in the function and make code more readable
>
> And btw, why 'likely' case is "kni->iova_mode == 1"?
>
> no specific case other than branch predict, will remove this if it's really
> harmful to PA mode.