> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Thursday, 1 May 2025 17.07
> 
> There were recent discussions about drivers creating control threads.
> The list of drivers that use rte_thread_create_internal_control keeps
> growing, but it got me looking at whether this could be done better.
> 
> Rather than having multiple control threads with potential conflicts,
> why not add a new API that has one control thread and uses epoll?
> The current multi-process control thread could use epoll as well.
> Epoll scales much better and avoids any possibility of lock
> scheduling/priority problems.
> 
> Some ideas:
>    - single control thread started (where the current MP thread is started)
>    - have control_register_fd and control_unregister_fd
>    - leave rte_control_thread API for legacy uses
> 
> Model this after the widely used libevent library: https://libevent.org
> 
> Open questions:
>    - names are hard; using "event" as the name leads to possible
>      confusion with eventdev
>    - do we need to support:
>         - multiple control threads doing epoll?
>         - priorities?
>         - timers?
>         - signals?
>         - manual activation?
>         - one-off events?
>    - could the alarm thread just be a control event?
> 
>    - should also have stats and info calls
> 
>    - it would be good to NOT support as many features as libevent,
>      since so many options lead to bugs.
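
To make the discussion concrete, the registration API proposed above might
look roughly like this (only the function names come from the proposal; the
signatures and the callback model are my guesses):

    /* Hypothetical sketch, not an existing DPDK API. */
    typedef void (*rte_control_fd_cb)(int fd, void *arg);

    /* Register fd with the single control thread's epoll loop;
     * cb is called from that thread whenever fd becomes readable. */
    int control_register_fd(int fd, rte_control_fd_cb cb, void *arg);

    /* Stop monitoring fd and drop its callback. */
    int control_unregister_fd(int fd);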

I think we need both:

1. Multithreading.
Multiple control threads are required for preemptive scheduling between
latency-sensitive tasks and long-running tasks (which would otherwise violate
the latency requirements of the former).
To improve multithreading between driver control threads and other threads
(DPDK control threads as well as other, non-DPDK, processes on the same host),
we should expand the current control thread APIs, e.g. by adding more
priorities to the DPDK threads API than the current two ("Normal" and
"Real-Time Critical").
E.g. if polling ethdev counters takes 1 ms, I don't want to add 1 ms of jitter
to my other control plane tasks because they all have to share a single
control thread.
I want the O/S scheduler to handle that for me. And yes, it means that I need
to consider locking, critical sections, and all the other potential problems
that come with multithreading.
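
As a minimal sketch of that separation, assuming the existing
rte_thread_create_internal_control() and rte_thread_set_priority() APIs
roughly as I recall them (the extension discussed above would simply add more
priority levels to choose from):

    #include <rte_thread.h>
    #include <rte_ethdev.h>
    #include <rte_cycles.h>

    /* Long-running task: poll ethdev counters in its own control thread,
     * so the O/S scheduler keeps it from adding jitter to other control
     * plane tasks. */
    static uint32_t
    stats_poll_thread(void *arg)
    {
        uint16_t port_id = *(uint16_t *)arg;
        struct rte_eth_stats stats;

        for (;;) {
            rte_eth_stats_get(port_id, &stats); /* may take ~1 ms */
            rte_delay_us_sleep(1000 * 1000);    /* poll once per second */
        }
        return 0;
    }

    static int
    start_stats_thread(uint16_t *port_id)
    {
        rte_thread_t tid;
        int ret;

        ret = rte_thread_create_internal_control(&tid, "stats-poll",
                                                 stats_poll_thread, port_id);
        if (ret != 0)
            return ret;
        /* Today only "Normal" and "Real-Time Critical" exist. */
        return rte_thread_set_priority(tid, RTE_THREAD_PRIORITY_NORMAL);
    }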

2. Event passing.
Some threads rely on epoll as their dispatcher; other threads use different
designs.
Dataplane threads normally use polling (or eventdev, or Service Cores, or ...),
i.e. non-preemptive scheduling of tiny processing tasks, but may switch to
epoll for power saving during low traffic.
In low-traffic periods, drivers may raise an RX interrupt to wake up a sleeping
application so it resumes polling. DPDK currently uses an epoll-based design
for passing this "wakeup" event (and other events, e.g. "link status change").
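
A rough sketch of that wakeup path, using the existing RX interrupt and
rte_epoll APIs as I recall them (roughly what the l3fwd-power example does):

    #include <rte_ethdev.h>
    #include <rte_epoll.h>

    /* Low-traffic path: arm the RX interrupt, sleep in rte_epoll_wait(),
     * and return to busy polling once the "wakeup" event arrives. */
    static void
    sleep_until_rx(uint16_t port_id, uint16_t queue_id)
    {
        struct rte_epoll_event event;

        /* Map the queue's interrupt into this thread's epoll instance. */
        rte_eth_dev_rx_intr_ctl_q(port_id, queue_id, RTE_EPOLL_PER_THREAD,
                                  RTE_INTR_EVENT_ADD, NULL);

        rte_eth_dev_rx_intr_enable(port_id, queue_id);
        rte_epoll_wait(RTE_EPOLL_PER_THREAD, &event, 1, -1 /* no timeout */);
        rte_eth_dev_rx_intr_disable(port_id, queue_id);
        /* ... resume rte_eth_rx_burst() polling here ... */
    }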

(Disclaimer: Decades have passed since I wrote Windows applications, using the 
Win32 API, so the following might be complete nonsense...)
If the "epoll" design pattern is not popular on Windows, we should not force it 
upon Windows developers. We should instead offer something compatible with the 
Windows "message pump" standard design pattern.
I think it would be better to adapt some DPDK APIs to the host O/S than to 
force the APIs of one O/S onto another, if they don't fit.

Here's an idea related to "epoll": We could expose DPDK's internal file 
descriptors, so the application developer can use her own preferred event 
library, e.g. libevent, rather than being required to use some crippled DPDK 
epoll library.
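
A sketch of what that could look like from the application side, assuming a
getter such as rte_eth_dev_rx_intr_ctl_q_get_fd() (name from memory) to
expose the fd, and dispatching it with plain libevent:

    #include <event2/event.h>
    #include <rte_ethdev.h>

    /* Run by libevent when the DPDK-exposed fd becomes readable. */
    static void
    rx_wakeup_cb(evutil_socket_t fd, short what, void *arg)
    {
        uint16_t *port_id = arg;
        /* ... e.g. tell the dataplane thread for *port_id to resume polling ... */
    }

    static void
    dispatch_with_libevent(uint16_t *port_id, uint16_t queue_id)
    {
        int fd = rte_eth_dev_rx_intr_ctl_q_get_fd(*port_id, queue_id);
        struct event_base *base = event_base_new();
        struct event *ev = event_new(base, fd, EV_READ | EV_PERSIST,
                                     rx_wakeup_cb, port_id);

        event_add(ev, NULL);
        event_base_dispatch(base); /* the application's own main loop */
    }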

At a high level...
The application developer should be free to use whatever design pattern she 
prefers. We should not require using epoll as the application's main 
dispatcher, thereby preventing application developers from using other design 
patterns.
Remember: DPDK is only a library (with a lot of features). It is not a complete 
framework requiring a specific application design. Let's keep it that way.

PS: I strongly prefer "epoll" events over "signals" for passing events to the 
application. Thanks to whoever made that decision. ;-)
