On 9/4/2018 1:36 AM, Dan Gora wrote:
> Hi Ferruh,
>
> I remembered now the motivation behind separating rte_kni_release()
> and rte_kni_free().
>
> The problem is that the DPDK thread which calls rte_kni_release()
> _cannot_ be the same thread which handles callbacks from the KNI
> driver via rte_kni_handle_request(). This is because the thread which
> calls rte_kni_release() will be stuck down in
> ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
> RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application. Since
> that thread cannot call rte_kni_handle_request(), the callback would
> then just timeout unless some other thread calls
> rte_kni_handle_request().
>
> So then you are in a bit of a chicken and egg situation. You _have_
> to have a separate thread calling rte_kni_handle_request periodically,
> but that thread also _cannot_ run after rte_kni_release returns
> (actually it's worse than that because it's actually after the
> ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).

I see, so we have problems on both ends, the userspace side and the
kernel side.

Agreed that separating release and free may help, but I am not sure
about adding a new API for KNI.

Very simply, what about preventing kni_net_release() from sending the
callback to userspace? This is already not working, and removing it
resolves the issues you mentioned. The sample application calls
rte_eth_dev_stop() itself after release, so the behavior will be the same.

But the issues in the kernel you mentioned, using `dev` after
free_netdev() has been called, should be addressed.

>
> So in order to resolve this, I separated the release from the freeing
> stages. This allows the DPDK application to keep the
> rte_kni_handle_request() thread running while rte_kni_release() is
> called so that it can handle the interface state callback, then kill
> that thread so that it cannot touch any 'struct rte_kni' resources,
> then free the struct rte_kni resources.
>
>
> thanks
> dan
>
> On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yi...@intel.com> wrote:
>
>>> When the kernel network interface is removed with unregister_netdev(),
>>> if the interface is up, it will generate a callback to mark the
>>> interface down, which calls kni_net_release(). kni_net_release() will
>>> block waiting for the DPDK application to call rte_kni_handle_request()
>>> to handle the callback, but it also needs the thread in the KNI driver
>>> (either the per-dev thread for multi-thread or the per-driver thread)
>>> to call kni_net_poll_resp() in order to wake the thread sleeping in
>>> kni_net_release (actually kni_net_process_request()).
>>>
>>> So now, KNI interfaces should be removed as such:
>>>
>>> 1) The user calls rte_kni_release(). This only unregisters the
>>> netdev in the kernel, but touches nothing else. This allows all the
>>> threads to run which are necessary to handle the callback into the
>>> DPDK application to mark the interface down.
>>>
>>> 2) The user stops the thread running rte_kni_handle_request().
>>> After rte_kni_release() has been called, there will be no more
>>> callbacks for that interface so it is not necessary. It cannot be
>>> running at the same time that rte_kni_free() frees all of the FIFOs
>>> and DPDK memory for that KNI interface.
>>>
>>> 3) The user calls rte_kni_free(). This performs the RTE_KNI_IOCTL_FREE
>>> ioctl which calls kni_ioctl_free(). This function removes the struct
>>> kni_dev from the list of interfaces to poll (and kills the per-dev
>>> kthread, if configured for multi-thread), then frees the memory in
>>> the FIFOs.
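
For reference, the ordering in the three steps quoted above can be modelled
in plain pthreads. This is only a rough sketch, not DPDK or KNI driver code:
every mock_* name and kni_teardown_demo() are made up for illustration,
mock_release() merely imitates a call that blocks until the "interface down"
request has been serviced, and the real driver would time out instead of
waiting forever.

```c
/* Sketch only: models release -> stop handler -> free with pthreads.
 * All mock_* names are hypothetical; nothing here calls DPDK or the KNI driver. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct mock_kni {
    atomic_bool req_pending;  /* "kernel" posted an interface-down request */
    atomic_bool stop_handler; /* asks the handler thread to exit */
    atomic_int  handled;      /* number of requests serviced */
    int        *fifo;         /* stands in for the FIFOs and DPDK memory */
};

/* Stand-in for the thread that periodically calls rte_kni_handle_request(). */
static void *mock_handler_thread(void *arg)
{
    struct mock_kni *kni = arg;

    while (!atomic_load(&kni->stop_handler)) {
        if (atomic_load(&kni->req_pending)) {
            kni->fifo[0]++;                      /* touches per-interface memory */
            atomic_fetch_add(&kni->handled, 1);
            atomic_store(&kni->req_pending, false);
        }
        sched_yield();
    }
    return NULL;
}

/* Step 1: unregister only. Blocks until another thread answers the request,
 * which is why the caller cannot also be the handler thread. */
static void mock_release(struct mock_kni *kni)
{
    atomic_store(&kni->req_pending, true);
    while (atomic_load(&kni->req_pending))
        sched_yield();
}

/* Step 3: free the per-interface memory. Only safe once the handler is gone. */
static void mock_free(struct mock_kni *kni)
{
    free(kni->fifo);
    kni->fifo = NULL;
}

/* Runs the whole sequence; returns the number of requests handled, or -1. */
int kni_teardown_demo(void)
{
    struct mock_kni kni;
    pthread_t tid;

    atomic_init(&kni.req_pending, false);
    atomic_init(&kni.stop_handler, false);
    atomic_init(&kni.handled, 0);
    kni.fifo = calloc(1, sizeof(*kni.fifo));
    if (kni.fifo == NULL)
        return -1;
    if (pthread_create(&tid, NULL, mock_handler_thread, &kni) != 0) {
        free(kni.fifo);
        return -1;
    }

    mock_release(&kni);                     /* 1) handler still running */
    atomic_store(&kni.stop_handler, true);  /* 2) stop the handler thread */
    pthread_join(tid, NULL);
    mock_free(&kni);                        /* 3) nothing can touch fifo now */

    return atomic_load(&kni.handled);
}
```

Calling kni_teardown_demo() from a test main() returns 1 in this model: the
single interface-down request is serviced while the handler thread is still
alive, and the memory is freed only after pthread_join() returns. Swapping
steps 2 and 3 would let the handler touch freed memory, and running step 1
on the handler thread itself would never complete.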