On 10/31/2017 11:54 PM, Jerin Jacob wrote: > -----Original Message----- >> Date: Tue, 31 Oct 2017 23:21:18 -0700 >> From: Ferruh Yigit <ferruh.yi...@intel.com> >> To: Jerin Jacob <jerin.ja...@caviumnetworks.com> >> CC: Thomas Monjalon <tho...@monjalon.net>, Bruce Richardson >> <bruce.richard...@intel.com>, Sergio Gonzalez Monroy >> <sergio.gonzalez.mon...@intel.com>, dev@dpdk.org, Jianfeng Tan >> <jianfeng....@intel.com>, Santosh Shukla >> <santosh.shu...@caviumnetworks.com> >> Subject: Re: [dpdk-dev] [PATCH] eal: disable IOVA mode detection by default >> User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 >> Thunderbird/52.4.0 >> >> On 10/31/2017 9:02 PM, Jerin Jacob wrote: >>> -----Original Message----- >>>> Date: Wed, 1 Nov 2017 01:07:26 +0000 >>>> From: Ferruh Yigit <ferruh.yi...@intel.com> >>>> To: Thomas Monjalon <tho...@monjalon.net>, Bruce Richardson >>>> <bruce.richard...@intel.com>, Sergio Gonzalez Monroy >>>> <sergio.gonzalez.mon...@intel.com> >>>> CC: dev@dpdk.org, Ferruh Yigit <ferruh.yi...@intel.com>, Jianfeng Tan >>>> <jianfeng....@intel.com>, Santosh Shukla >>>> <santosh.shu...@caviumnetworks.com> >>>> Subject: [dpdk-dev] [PATCH] eal: disable IOVA mode detection by default >>>> X-Mailer: git-send-email 2.13.6 >>>> >>>> Fix kernel crash with KNI because KNI requires physical addresses. >>> >>> The actual fix would be to make KNI IOMMU aware based on the DPDK mode. >>> >>> ie. On slow path, >>> >>> /* Get iommu domain for iova to physical addr conversion */ >>> if (rte_eal_iova_mode() == RTE_IOVA_VA) >>> kni->iommu_domain = iommu_get_domain_for_dev(dev); >>> else >>> kni->iommu_domain = NULL; >>> >>> On fast path, >>> >>> static inline u64 kni_iova_to_phys(struct ... *kni, dma_addr_t dma_addr) >>> { >>> >>> /* Translation is installed only when IOMMU is present */ >>> >>> if (kni->iommu_domain) >>> >>> return iommu_iova_to_phys(kni->iommu_domain, dma_addr); >>> >>> return dma_addr; >>> >>> } >>> >>>> >>>> A config option introduced to disable IOVA mode detection and to set it >>>> to physical address by default. Disabling config option will enable IOVA >>>> mode detection. >>>> >>>> When there is no intension to use KNI, it is safe to enable detection. >>>> >>>> Config option disable IOVA mode detection by default to be sure only who >>>> is aware of result enable it. >>>> >>>> Fixes: 72d013644bd6 ("mem: honor IOVA mode in malloc virt2phy") >>>> >>>> Signed-off-by: Ferruh Yigit <ferruh.yi...@intel.com> >>>> --- >>>> Cc: Jianfeng Tan <jianfeng....@intel.com> >>>> Cc: Santosh Shukla <santosh.shu...@caviumnetworks.com> >>>> Cc: Thomas Monjalon <tho...@monjalon.net> >>>> --- >>>> config/common_base | 5 +++++ >>>> lib/librte_eal/bsdapp/eal/eal.c | 4 ++++ >>>> lib/librte_eal/linuxapp/eal/eal.c | 4 ++++ >>>> 3 files changed, 13 insertions(+) >>>> >>>> diff --git a/config/common_base b/config/common_base >>>> index 82ee75456..903e7685b 100644 >>>> --- a/config/common_base >>>> +++ b/config/common_base >>>> @@ -107,6 +107,11 @@ CONFIG_RTE_MALLOC_DEBUG=n >>>> CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n >>>> >>>> # >>>> +# Disabling PHYS_IOVA may crash kernel for KNI, use with caution >>>> +# >>>> +CONFIG_RTE_EAL_USE_PHYS_IOVA=y >>> >>> Defeat the purpose of all dynamic probing scheme. >>> Either we can fix the KNI or revert the following patch for this release. >>> >>> http://dpdk.org/commit/f37dfab2 >> >> This commit just enables IOVA VA mode for Intel drivers, that is how I can >> able >> to observe the issue, but it is not the source of the problem. Reverting that >> commit will not solve KNI crash with any other PMD that enables IOVA VA mode. > > I don't understand why a PMD needs to enable IOVA_VA if it can support > IOVA_PA. > IMO, IOVA_VA should be enabled only for those device it can WORK ONLY on > IOVA_VA mode. Forget about KNI, If we set CONFIG_RTE_EAL_USE_PHYS_IOVA > as y then the normal stuff wont work for if PMD can operate only in > IOVA_VA mode(like octeontx).
Hmm, I wasn't aware that octeontx only operate on IOVA VA mode, I got your concern now. But still that config option still enables choosing between KNI and "IOVA_VA mode only" devices until KNI gets fixed. Unfortunately both won't work at same time for now. > Regarding the KNI crash, it can be avoid by first checking the exiting > mode(rte_eal_iova_mode()). i.e since legacy driver like KNI need real > physical address to work "now", it can grace full exit on the init time if > mode == IOVA_VA; Definitely agree to gracefully exit, but I was looking to make KNI work more than just exit without crash. >> Related to the KNI, iommu is not involved at all, I am not clear with your >> above >> suggestion, physical address is required for kernelspace - userspace >> communication. > > vhost-kernel addressed this case with IOVA as VA. I need to spend cycles > on what takes to remove physical address dependency from KNI. > But someone can add that support if it is required in future. Since the iova patchset breaks the existing KNI, I would like to see KNI fix part of the iova work. It doesn't look right to me that a new feature breaking the existing one. >> KNI uses physical address for two things: >> >> 1- Creates a buffer in userspace, a memzone, and shares its physical address >> to >> the kernel, so that both kernel and app can access same buffer. >> >> The question is, even all devices supports IOVA VA mode, why creating a >> memzone >> that has real physical address info is not possible anymore? >> For KNI case this is not related at all with what devices supports. >> >> 2- For each mbuf that will be sent to kernel, dpdk app puts physical address >> of >> that mbuf into shared buffer, so that kernel can access it. So that kernel >> can >> access to mbuf data that can be coming from any mempool. >> For this the physical address of the mbuf is required. >> >> >> Overall KNI needs to know physical address of mbufs and memzone. >> >> This patch provides a way to have old behavior with a config option, so that >> KNI >> can work and anyone who needs to create memzone with physical address can a >> way >> to have it, and all PMDs will work fine. >> >> If there is no need for these, it is possible to disable config and dynamic >> probing scheme is already there. >> >> Thanks, >> ferruh