Hi All, Sorry to butt in on this, but I fixed this same issue about 3 years ago in my application, but I was never able to get the changes integrated and eventually just gave up trying.
The rule with KNI is: 1) The app should have a separate control thread per rte_kni which just spins calling rte_kni_handle_request(). This ensures that other threads calling rte_kni_XXX functions will always get a response. 2) In order to deal with lockups and timeouts when closing the device, I sent patches which separated the closing process into two steps: rte_kni_release() which would unregister the underlying netdev, then rte_kni_free() which would free the KNI portions of the KNI device. When rte_kni_release() is called the kernel netdev is unregistered and a response is sent back to the application, the control thread calling rte_kni_handle_request() is still running, so the application will still get a response back from the kernel and not lock up, the application then kills the control thread so that rte_kni_handle_request() is not called again, then the application calls rte_kni_free() which frees all of the FIFOs and closes the device. If anyone is interested the patches are probably still floating around patchwork. If not you can check them out here: https://github.com/danielgora/dpdk.git thanks- dan On Mon, Mar 1, 2021 at 5:10 AM Igor Ryzhov <iryz...@nfware.com> wrote: > > Stephen, > > No, I don't have a better proposal, but I think it is not correct to change > the behavior of KNI (making link down without a real response). > Even though we know that communicating with userspace under rtnl_lock is a > bad idea, it works as it is for many years already. > > Elad, > > I agree with you that KNI should be removed from the main tree if it is not > possible to fix this __dev_close_many issue. > There were discussions about this multiple times already, but no one is > working on this AFAIK. > Last time the discussion was a month ago: > https://www.mail-archive.com/dev@dpdk.org/msg196033.html > > Igor > > On Fri, Feb 26, 2021 at 8:43 PM Elad Nachman <ela...@gmail.com> wrote: > > > The way the kernel handles its locks and lists for the dev close many > > path, there is no way you can go around this with rtnl unlocked : > > " > > > > There is a race condition in __dev_close_many() processing the > > close_list while the application terminates. > > It looks like if two vEth devices are terminating, > > and one releases the rtnl lock, the other takes it, > > updating the close_list in an unstable state, > > causing the close_list to become a circular linked list, > > hence list_for_each_entry() will endlessly loop inside > > __dev_close_many() . > > > > " > > And I don't expect David Miller will bend the kernel networking for DPDK > > or KNI. > > > > But - Stephen - if you can personally convince David to accept a > > kernel patch which will separate the close_list locking mechanism to a > > separate (RCU?) lock, then I can introduce first a patch to the kernel > > which will add a lock for the close_list, this way rtnl can be > > unlocked for the if down case. > > > > After that kernel patch, your original patch + relocation of the sync > > mutex locking will do the job . > >lockups > > Otherwise, rtnl has to be kept locked all of the way for the if down > > case in order to prevent corruption causing a circular linked list out > > of the close_list, causing a hang in the kernel. > >lockups > > Currently, the rtnl lock is the only thing keeping the close_list from > > corruption. > > > > If you doubt rtnl cannot be unlocked for dev close path, you can > > consult David for his opinion, as I think it is critical to understand > > what the kernel can or cannot do, or expects to be done before we can > > unlock its locks as we wish inside rte_kni.ko . > > > > Otherwise, if we are still in disagreement on how to patch this set of > > problems, I think the responsible way around it is to completely > > remove kni from the main dpdk tree and move it to dpdk-kmods > > repository. > > > > I know BSD style open-source does not carry legal responsibility from > > the developers, but I think when a bunch of developers know a piece of > > code is highly buggy, they should not leave it for countless new users > > to bounce their head desperately against, if they cannot agree on a > > correct way to solve the bunch of problems, of which I think we all > > agree exist (we just do not agree on the proper solution or patch)... > > > > That's my two cents, > > > > Elad. > > > > On Fri, Feb 26, 2021 at 5:49 PM Stephen Hemminger > > <step...@networkplumber.org> wrote: > > > > > > On Fri, 26 Feb 2021 00:01:01 +0300 > > > Igor Ryzhov <iryz...@nfware.com> wrote: > > > > > > > Hi Elad, > > > > > > > > Thanks for the patch, but this is still NACK from me. > > > > > > > > The only real advantage of KNI over other exceptional-path techniques > > > > like virtio-user is the ability to configure DPDK-managed interfaces > > > > directly > > > > from the kernel using well-known utils like iproute2. A very important > > part > > > > of this is getting responses from the DPDK app and knowing the actual > > > > result of command execution. > > > > If you're making async requests to the application and you don't know > > > > the result, then what's the point of using KNI at all? > > > > > > > > Igor > > > > > > Do you have a better proposal that keeps the request result but does not > > > call userspace with lock held. > > > > > > PS: I also have strong dislike of KNI, as designed it would have been > > rejected > > > by Linux kernel developers. A better solution would be userspace > > version of > > > something like devlink devices. But doing control operations by proxy is > > > a locking nightmare. > >