On 10/16/2019 12:26 PM, Vamsi Krishna Attunuru wrote:
> 
> 
>> -----Original Message-----
>> From: Stephen Hemminger <step...@networkplumber.org>
>> Sent: Tuesday, October 15, 2019 9:16 PM
>> To: Yigit, Ferruh <ferruh.yi...@linux.intel.com>
>> Cc: Vamsi Krishna Attunuru <vattun...@marvell.com>; dev@dpdk.org;
>> tho...@monjalon.net; Jerin Jacob Kollanukkaran <jer...@marvell.com>;
>> olivier.m...@6wind.com; ferruh.yi...@intel.com; anatoly.bura...@intel.com;
>> arybche...@solarflare.com; Kiran Kumar Kokkilagadda
>> <kirankum...@marvell.com>
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v10 4/5] kni: add IOVA=VA support in KNI
>> module
>>
>> On Tue, 15 Oct 2019 16:43:08 +0100
>> "Yigit, Ferruh" <ferruh.yi...@linux.intel.com> wrote:
>>
>>> On 8/16/2019 7:12 AM, vattun...@marvell.com wrote:
>>>> From: Kiran Kumar K <kirankum...@marvell.com>
>>>>
>>>> Patch adds support for kernel module to work in IOVA = VA mode, the
>>>> idea is to get physical address from IOVA address using
>>>> iommu_iova_to_phys API and later use phys_to_virt API to convert the
>>>> physical address to kernel virtual address.
>>>>
>>>> When compared with IOVA = PA mode, there is no performance drop with
>>>> this approach.
>>>>
>>>> This approach does not work with the kernel versions less than 4.4.0
>>>> because of API compatibility issues.
>>>>
>>>> Patch also updates these support details in KNI documentation.
>>>>
>>>> Signed-off-by: Kiran Kumar K <kirankum...@marvell.com>
>>>> Signed-off-by: Vamsi Attunuru <vattun...@marvell.com>
>>>
>>> <...>
>>>
>>>> @@ -348,15 +351,65 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
>>>>    strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
>>>>
>>>>    /* Translate user space info into kernel space info */
>>>> -  kni->tx_q = phys_to_virt(dev_info.tx_phys);
>>>> -  kni->rx_q = phys_to_virt(dev_info.rx_phys);
>>>> -  kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
>>>> -  kni->free_q = phys_to_virt(dev_info.free_phys);
>>>> -
>>>> -  kni->req_q = phys_to_virt(dev_info.req_phys);
>>>> -  kni->resp_q = phys_to_virt(dev_info.resp_phys);
>>>> -  kni->sync_va = dev_info.sync_va;
>>>> -  kni->sync_kva = phys_to_virt(dev_info.sync_phys);
>>>> +  if (dev_info.iova_mode) {
>>>> +#ifdef HAVE_IOVA_AS_VA_SUPPORT
>>>> +          pci = pci_get_device(dev_info.vendor_id,
>>>> +                               dev_info.device_id, NULL);
>>>> +          if (pci == NULL) {
>>>> +                  pr_err("pci dev does not exist\n");
>>>> +                  return -ENODEV;
>>>> +          }
>>>
>>> If there is no PCI device KNI should still work.
>>
>> Right now it is possible to use KNI with netvsc PMD on Hyper-V/Azure.
>> With this patch that won't be possible.
> 
> Hi Ferruh, Stephen,
> 
> These can be fixed by forcing the IOVA mode to PA when vdevs are used
> for the KNI use case:
> 
> rte_bus_get_iommu_class(void)
>  {
>         enum rte_iova_mode mode = RTE_IOVA_DC;
> +       struct rte_devargs *devargs = NULL;
>         bool buses_want_va = false;
>         bool buses_want_pa = false;
>         struct rte_bus *bus;
> 
> +       if (rte_eal_check_module("rte_kni") == 1) {
> +               RTE_EAL_DEVARGS_FOREACH("vdev", devargs) {
> +                       return RTE_IOVA_PA;
> +               }
> +       }
> +
>         TAILQ_FOREACH(bus, &rte_bus_list, next) {
>                 enum rte_iova_mode bus_iova_mode;
> 
> I think this will cover the various use case combinations, like PA or VA mode
> with either a pdev or a vdev used for KNI.
> With the above fix, existing use cases would not be affected by this patch series.
> 

Hi Vamsi,

I don't think this is a problem specific to using a vdev, so we can't solve it
via a vdev check only.

The example I gave (KNI PMD) uses a vdev, but an application can call the KNI
library APIs directly to create a kni interface and have kernel/user space
communication without any device involved. The KNI PMD (vdev) is just a wrapper
to make this easier.
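
For illustration, a rough sketch of that direct-library path (the interface name,
pool handling and NULL ops here are placeholders of mine, not taken from this
thread or the patch set):

#include <stdio.h>
#include <string.h>
#include <rte_kni.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

static struct rte_kni *
create_plain_kni(struct rte_mempool *pktmbuf_pool)
{
	struct rte_kni_conf conf;

	/* once per process, before the first rte_kni_alloc() */
	rte_kni_init(1);

	memset(&conf, 0, sizeof(conf));
	snprintf(conf.name, RTE_KNI_NAMESIZE, "vEth0");
	conf.mbuf_size = RTE_MBUF_DEFAULT_BUF_SIZE;
	/* conf.addr / conf.id left zero: there is no PCI device behind this */

	/* NULL ops: no MTU/link callbacks, kernel/user data path only */
	return rte_kni_alloc(pktmbuf_pool, &conf, NULL);
}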

Just thinking aloud:
KNI shares userspace buffers with the kernel, so basically it needs to do
userspace virtual address to kernel virtual address translation, in a reasonably
fast manner.

iova=va breaks KNI because:
1) physical memory is no longer contiguous, which breaks our address translation
logic
2) we were using the physical address of the buffer for the address translation,
but we no longer have it; we now have an IOVA address.

I assume 1) is resolved with 'rte_kni_pktmbuf_pool_create()'; it would be helpful,
though, if you could explain how this works.

For the second one, a simple question: do we need PCIe device information to be
able to convert an IOVA to a kernel virtual address? Can't I get this information
from the IOMMU somehow?
Think about a case where "--iova-mode=va" is provided but there is no physical
device bound to vfio-pci: can I still allocate memory? And how can I use KNI in
that case?
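
For context, this is roughly how I understand the translation in the patch (a
sketch of the idea only, not the patch's actual code; the helper name is mine):
the IOMMU API that turns an IOVA back into a physical address is keyed on an
iommu_domain, and the domain lookup needs a struct device, which is why the
module looks up a PCI device first.

#include <linux/iommu.h>
#include <linux/pci.h>
#include <asm/io.h>

static void *
iova_to_kva(struct pci_dev *pdev, dma_addr_t iova)
{
	struct iommu_domain *domain;
	phys_addr_t pa;

	/* the domain lookup is what requires a device */
	domain = iommu_get_domain_for_dev(&pdev->dev);
	if (domain == NULL)
		return NULL;

	pa = iommu_iova_to_phys(domain, iova);  /* IOVA -> physical */
	return phys_to_virt(pa);                /* physical -> kernel VA */
}

If there is a way to reach the right iommu_domain without a device, that would
address the concern above about the PCI dependency.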
