The way the kernel handles its locks and lists for the dev close many path, there is no way you can go around this with rtnl unlocked : "
There is a race condition in __dev_close_many() processing the close_list while the application terminates. It looks like if two vEth devices are terminating, and one releases the rtnl lock, the other takes it, updating the close_list in an unstable state, causing the close_list to become a circular linked list, hence list_for_each_entry() will endlessly loop inside __dev_close_many() . " And I don't expect David Miller will bend the kernel networking for DPDK or KNI. But - Stephen - if you can personally convince David to accept a kernel patch which will separate the close_list locking mechanism to a separate (RCU?) lock, then I can introduce first a patch to the kernel which will add a lock for the close_list, this way rtnl can be unlocked for the if down case. After that kernel patch, your original patch + relocation of the sync mutex locking will do the job . Otherwise, rtnl has to be kept locked all of the way for the if down case in order to prevent corruption causing a circular linked list out of the close_list, causing a hang in the kernel. Currently, the rtnl lock is the only thing keeping the close_list from corruption. If you doubt rtnl cannot be unlocked for dev close path, you can consult David for his opinion, as I think it is critical to understand what the kernel can or cannot do, or expects to be done before we can unlock its locks as we wish inside rte_kni.ko . Otherwise, if we are still in disagreement on how to patch this set of problems, I think the responsible way around it is to completely remove kni from the main dpdk tree and move it to dpdk-kmods repository. I know BSD style open-source does not carry legal responsibility from the developers, but I think when a bunch of developers know a piece of code is highly buggy, they should not leave it for countless new users to bounce their head desperately against, if they cannot agree on a correct way to solve the bunch of problems, of which I think we all agree exist (we just do not agree on the proper solution or patch)... That's my two cents, Elad. On Fri, Feb 26, 2021 at 5:49 PM Stephen Hemminger <step...@networkplumber.org> wrote: > > On Fri, 26 Feb 2021 00:01:01 +0300 > Igor Ryzhov <iryz...@nfware.com> wrote: > > > Hi Elad, > > > > Thanks for the patch, but this is still NACK from me. > > > > The only real advantage of KNI over other exceptional-path techniques > > like virtio-user is the ability to configure DPDK-managed interfaces > > directly > > from the kernel using well-known utils like iproute2. A very important part > > of this is getting responses from the DPDK app and knowing the actual > > result of command execution. > > If you're making async requests to the application and you don't know > > the result, then what's the point of using KNI at all? > > > > Igor > > Do you have a better proposal that keeps the request result but does not > call userspace with lock held. > > PS: I also have strong dislike of KNI, as designed it would have been rejected > by Linux kernel developers. A better solution would be userspace version of > something like devlink devices. But doing control operations by proxy is > a locking nightmare.