On Fri, May 02, 2025 at 10:56:58AM +0200, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:step...@networkplumber.org]
> > Sent: Thursday, 1 May 2025 17.07
> >
> > There were recent discussions about drivers creating control threads.
> > The number of drivers that use rte_thread_create_internal_control
> > keeps growing, but it got me looking at whether this could be done
> > better.
> >
> > Rather than having multiple control threads which have potential
> > conflicts, why not add a new API that has one control thread and uses
> > epoll? The current multi-process control thread could use epoll as
> > well. Epoll scales much better and avoids any possibility of lock
> > scheduling/priority problems.
> >
> > Some ideas:
> >   - single control thread started (where the current MP thread is
> >     started)
> >   - have control_register_fd and control_unregister_fd
> >   - leave rte_control_thread API for legacy uses
> >
> > Model this after the well-used libevent library https://libevent.org
> >
> > Open questions:
> >   - names are hard, using event as name leads to possible confusion
> >     with eventdev
> >   - do we need to support:
> >       - multiple control threads doing epoll?
> >       - priorities
> >       - timers?
> >       - signals?
> >       - manual activation?
> >       - one off events?
> >   - could alarm thread just be a control event
> >
> >   - should also have stats and info calls
> >
> >   - it would be good to NOT support as many features as libevent,
> >     since so many options lead to bugs.
>
> I think we need both:
>
> 1. Multi threading.
> Multiple control threads are required for preemptive scheduling between
> latency sensitive tasks and long-running tasks (that violate the latency
> requirements of the former).
> For improved support of multi threading between driver control threads and
> other threads (DPDK control threads and other, non-DPDK, processes on the
> same host), we should expand the current control thread APIs, e.g. by
> expanding the DPDK threads API with more than the current two priorities
> ("Normal" and "Real-Time Critical").
> E.g. if polling ethdev counters takes 1 ms, I don't want to add 1 ms jitter
> to my other control plane tasks, because they all have to share one control
> thread only.
> I want the O/S scheduler to handle that for me. And yes, it means that I
> need to consider locking, critical sections, and all those potential
> problems coming with multithreading.
>
> 2. Event passing.
> Some threads rely on using epoll as dispatcher, some threads use different
> designs.
> Dataplane threads normally use polling (or eventdev, or Service Cores, or
> ...), i.e. non-preemptive scheduling of tiny processing tasks, but may
> switch to epoll for power saving during low traffic.
> In low traffic periods, drivers may raise an RX interrupt to wake up a
> sleeping application to start polling. DPDK currently uses an epoll based
> design for passing this "wakeup" event (and other events, e.g. "link status
> change").
>
> (Disclaimer: Decades have passed since I wrote Windows applications, using
> the Win32 API, so the following might be complete nonsense...)
> If the "epoll" design pattern is not popular on Windows, we should not
> force it upon Windows developers. We should instead offer something
> compatible with the Windows "message pump" standard design pattern.
> I think it would be better to adapt some DPDK APIs to the host O/S than
> forcing the APIs of one O/S onto another O/S, if it doesn't fit.
>
> Here's an idea related to "epoll": We could expose DPDK's internal file
> descriptors for the application developer to use her own preferred epoll
> library, e.g. libevent. Rather this than requiring using some crippled DPDK
> epoll library.
>

+1 for this suggestion. Let's just provide the low-level info needed to
allow the app to work its own solution.

/Bruce
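
For illustration only, here is a minimal Linux sketch of what "expose the
file descriptors and let the app work its own solution" could look like from
the application side. It assumes DPDK hands the application a plain readable
file descriptor; an eventfd stands in for that descriptor here. All app_*
names and the on_wakeup handler are hypothetical and are not DPDK APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical application-side callback type; not a DPDK API. */
typedef void (*app_fd_cb)(int fd, void *arg);

/* One watched descriptor plus the handler to run when it is readable. */
struct app_fd_handler {
	int fd;
	app_fd_cb cb;
	void *arg;
};

/* Add a descriptor to the application's own epoll set.
 * The handler struct must outlive the registration. */
static int app_watch_fd(int epfd, struct app_fd_handler *h)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.ptr = h };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, h->fd, &ev);
}

/* Remove a descriptor from the epoll set. */
static int app_unwatch_fd(int epfd, int fd)
{
	return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

/* Wait once and dispatch ready handlers.
 * Returns the number of handlers invoked, or -1 on epoll_wait() error. */
static int app_dispatch_once(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	int n = epoll_wait(epfd, evs, 8, timeout_ms);

	for (int i = 0; i < n; i++) {
		struct app_fd_handler *h = evs[i].data.ptr;

		assert(h != NULL && h->cb != NULL);
		h->cb(h->fd, h->arg);
	}
	return n;
}

/* Example handler: drain the eventfd counter and add it to a running
 * total passed in through arg. */
static void on_wakeup(int fd, void *arg)
{
	uint64_t val;

	if (read(fd, &val, sizeof(val)) == (ssize_t)sizeof(val))
		*(uint64_t *)arg += val;
}
```

The application would call app_watch_fd() once per exposed descriptor and
run app_dispatch_once() from whichever thread, and at whatever priority, it
prefers. The same descriptor could just as well be handed to libevent or any
other event loop, which is the point of exposing only the low-level info.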