Hi,

I tend to run with a WinDbg kernel debugger (KDNET) connected to my debug target machines. It quite often reports deadlock detection when we have such "real-time" threads never yielding on a core. If we hog core-0 in particular, dwm.exe never gets a look in, so the desktop stops being drawn too.

John.

> -----Original Message-----
> From: dev <dev-boun...@dpdk.org> On Behalf Of Tal Shnaiderman
> Sent: 09 December 2020 14:16
> To: Dmitry Kozlyuk <dmitry.kozl...@gmail.com>; Dmitry Malloy
> (MESHCHANINOV) <dmit...@microsoft.com>; Narcisa Ana Maria Vasile
> <narcisa.vas...@microsoft.com>
> Cc: Eilon Greenstein <eil...@nvidia.com>; Omar Cardona
> <ocard...@microsoft.com>; Rani Sharoni <ran...@nvidia.com>; Odi Assli
> <o...@nvidia.com>; Harini Ramakrishnan
> <harini.ramakrish...@microsoft.com>; NBU-Contact-Thomas Monjalon
> <tho...@monjalon.net>; dev@dpdk.org
> Subject: [dpdk-dev] Windows DPDK real-time priority threads causing thread
> starvation
>
> Hi,
>
> During our verification tests on Windows DPDK we've noticed that DPDK
> polling threads, which run in REALTIME_PRIORITY_CLASS are causing
> starvation to other threads from the OS which need to change affinity and
> run in lower priority.
>
> While running an application for a while we see the OS thread waits for 2:30
> minutes and raises a bugcheck, see below example of such flow:
>
> 1) DPDK thread running on core-0 in real-time high priority(24) polling mode.
> 2) The thread is blocking the system function NtSetSystemInformation
> (ExpUpdateTimerConfiguration) in another thread from
> switching to core-0 via KeSetSystemGroupAffinityThread since the calling
> thread is priority 15.
> 3) NtSetSystemInformation exclusively acquired system-wide lock
> (ExpTimeRefreshLock) hence
> it blocks other threads (e.g. calling NtQuerySystemInformation).
>
> We've seen this behavior only while running on Windows 2019 VMs, maybe
> on native machines OS scheduling of such flow is done differently?
>
> Below is usage explanation from the documentation of SetPriorityClass [1]:
>
> - REALTIME_PRIORITY_CLASS
> Process that has the highest possible priority. The threads of the process
> preempt the threads of all other processes, including operating system
> processes performing important tasks. For example, a real-time process that
> executes for more than a very brief interval can cause disk caches not to
> flush or cause the mouse to be unresponsive.
>
> So I assume using this kind of thread for a long period as we do can cause
> unstable behavior.
>
> How do you think we can resolve this? Are there such cases in Linux?
>
> [1] - https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setpriorityclass
>
> Thanks,
>
> Tal.
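
To illustrate the starvation described in steps 1) to 3) of the quoted mail, here is a toy model in Python. It is not DPDK or Windows code. It assumes a single core and a strict-priority scheduler in which the highest-priority ready thread always runs; the real Windows scheduler is more involved. The priorities 24 and 15 come from the report; `Thread`, `run_core`, `sleep_every` and the thread names are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int         # Windows-style level, higher value wins
    sleep_every: int = 0  # 0 = never blocks; N = sleeps one tick after N ticks of running


def run_core(threads, ticks):
    """Simulate one core for `ticks` time slices; return ticks run per thread.

    Assumption: strict priority, no boosting, no other cores.
    """
    ran = {t.name: 0 for t in threads}
    streak = {t.name: 0 for t in threads}
    for _ in range(ticks):
        ready = []
        for t in threads:
            if t.sleep_every and streak[t.name] >= t.sleep_every:
                # The thread blocks for this tick and is ready again on the next one.
                streak[t.name] = 0
            else:
                ready.append(t)
        if not ready:
            continue  # core is idle for this tick
        winner = max(ready, key=lambda t: t.priority)
        ran[winner.name] += 1
        streak[winner.name] += 1
    return ran


if __name__ == "__main__":
    # Poller at priority 24 that never blocks, OS thread at priority 15.
    print(run_core([Thread("poller", 24), Thread("os_thread", 15)], 1000))
    # Same poller, but sleeping for one tick after every 99 ticks of polling.
    print(run_core([Thread("poller", 24, sleep_every=99), Thread("os_thread", 15)], 1000))
```

In this model the priority-15 thread gets no ticks at all while the poller never blocks, and it only gets a tick when the poller actually sleeps. Whether an occasional sleep is acceptable for a DPDK polling thread is a separate question that this sketch does not answer.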