On Thu, 1 Apr 2021 00:12:54 +0300 Dmitry Kozlyuk <dmitry.kozl...@gmail.com> wrote:
> 2021-03-30 14:11 (UTC-0700), Stephen Hemminger:
> > On Mon, 29 Mar 2021 15:40:39 -0700
> > Narcisa Ana Maria Vasile <navas...@linux.microsoft.com> wrote:
> >
> > > From: Narcisa Vasile <navas...@microsoft.com>
> > >
> > > Allow the user to choose the thread priority through an EAL
> > > command line argument.
> > >
> > > The user can select the thread priority to be either 'normal'
> > > or 'critical':
> > >  --thread-prio normal
> > >  --thread-prio realtime
> > >
> > > Signed-off-by: Narcisa Vasile <navas...@microsoft.com>
> >
> > The discussion internally was that this was intended to resolve issues on
> > Windows.
> > So it makes sense for Windows, but it is not something that we want to have
> > on Linux.
> > Could you make this Windows only, and add update the documentation please.
> >
> > I just don't want Linux users discovering it, trying it, then reporting
> > more bugs.
>
> Can you share more details of that discussion?
> Is realtime-critical needed not for busy-polling apps (which indeed cause
> starvation), but for interrupt-driven ones to process packets ASAP?
>
> If it's true, then maybe NetUIO can instead give priority boost to these
> threads when notifying them about interrupts (Omar? DmitryM?). This can be
> configurable via devargs. One downside is that every kernel driver has to
> support it, currently Mellanox bifurcated driver and NetUIO. But they will
> need some interrupt-related IOCTLs anyway.

A DPDK application typically has cores dedicated to polling for packets.
The temptation is to give the threads on those cores a real-time scheduling
policy (SCHED_FIFO or SCHED_RR). The problem is that threads under those
policies run in preference to required kernel functions. So the
polling-for-packets threads will starve out the Linux kernel's RCU processing
and the softirq completion of I/O. This starvation will lead to memory leaks
(no RCU cleanup) and potential deadlocks (disk I/O never completing).

It is possible to use real-time priority on Linux, but it requires a lot of
tuning to make sure that the kernel never runs work queues, interrupts, or
softirqs on those cores: lots of changes to /proc, the kernel command line,
and sysfs tunables. That is feasible on embedded systems but not for
general-purpose applications.

This problem already shows up today, but only if the DPDK application writer
explicitly calls sched_setscheduler() or similar on those threads. At that
point the user has started to manipulate threads, and we have to assume they
know the consequences and are ready to deal with them.

On Windows, the situation is different, so yes, this is necessary.
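
For illustration, the kind of explicit call an application writer might make
on a polling thread could look roughly like the sketch below. This is not code
from DPDK or from the patch under review; the helper names policy_name() and
try_set_fifo() are hypothetical, and the sketch assumes Linux, where passing
pid 0 to sched_setscheduler() applies the policy to the calling thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: printable name for a scheduling policy. */
static const char *policy_name(int policy)
{
	switch (policy) {
	case SCHED_OTHER: return "SCHED_OTHER";
	case SCHED_FIFO:  return "SCHED_FIFO";
	case SCHED_RR:    return "SCHED_RR";
	default:          return "unknown";
	}
}

/*
 * Hypothetical helper: ask the kernel to run the calling thread under
 * SCHED_FIFO at the lowest real-time priority.
 * Returns 0 on success or the errno value on failure (typically EPERM
 * when the process lacks CAP_SYS_NICE or a suitable RLIMIT_RTPRIO).
 */
static int try_set_fifo(void)
{
	struct sched_param param;
	int prio = sched_get_priority_min(SCHED_FIFO);

	if (prio < 0)
		return errno;

	memset(&param, 0, sizeof(param));
	param.sched_priority = prio;

	/* pid 0 means "the caller"; on Linux that is the calling thread. */
	if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
		return errno;

	return 0;
}
```

If try_set_fifo() succeeds on a busy-polling thread of an untuned system, that
is exactly the situation described above: the thread is never preempted by
normal-priority kernel work on that core, so the tuning burden falls on
whoever made the call.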