[dpdk-dev] [EXTERNAL] Re: Windows DPDK real-time priority threads causing thread starvation

Dmitry Malloy (MESHCHANINOV) dmitrym at microsoft.com
Wed Dec 9 17:12:10 CET 2020


There is a configuration in Windows similar to Linux isolcpus, where the scheduler tries not to run anything on such cores, and the implementation is being enhanced for the next Windows release with user-mode vmswitch feedback.

I'll dig out the details.

Dmitry

-----Original Message-----
From: Stephen Hemminger <stephen at networkplumber.org> 
Sent: Wednesday, December 9, 2020 8:09 AM
To: Tal Shnaiderman <talshn at nvidia.com>
Cc: Dmitry Kozlyuk <dmitry.kozliuk at gmail.com>; Dmitry Malloy (MESHCHANINOV) <dmitrym at microsoft.com>; Narcisa Ana Maria Vasile <Narcisa.Vasile at microsoft.com>; Eilon Greenstein <eilong at nvidia.com>; Omar Cardona <ocardona at microsoft.com>; Rani Sharoni <ranish at nvidia.com>; Odi Assli <odia at nvidia.com>; Harini Ramakrishnan <Harini.Ramakrishnan at microsoft.com>; thomas <thomas at monjalon.net>; dev at dpdk.org
Subject: [EXTERNAL] Re: [dpdk-dev] Windows DPDK real-time priority threads causing thread starvation

On Wed, 9 Dec 2020 14:15:30 +0000
Tal Shnaiderman <talshn at nvidia.com> wrote:

> Hi,
> 
> During our verification tests on Windows DPDK we've noticed that DPDK polling threads, which run in REALTIME_PRIORITY_CLASS, are starving other OS threads that need to change affinity and run at lower priority.
> 
> While running an application for a while, we see an OS thread wait for 2:30 minutes and then raise a bugcheck. Below is an example of such a flow:
> 
> 1) DPDK thread running on core-0 in real-time high priority (24) polling mode.
> 2) The thread is blocking the system function NtSetSystemInformation (ExpUpdateTimerConfiguration) in another thread from
>    switching to core-0 via KeSetSystemGroupAffinityThread since the calling thread is priority 15.
> 3) NtSetSystemInformation exclusively acquired system-wide lock (ExpTimeRefreshLock) hence
>    it blocks other threads (e.g. calling NtQuerySystemInformation).
> 
> We've seen this behavior only while running on Windows 2019 VMs. Maybe on native machines the OS schedules such a flow differently?
> 
> Below is usage explanation from the documentation of SetPriorityClass [1]:
> 
> - REALTIME_PRIORITY_CLASS
> Process that has the highest possible priority. The threads of the process preempt the threads of all other processes, including operating system processes performing important tasks. For example, a real-time process that executes for more than a very brief interval can cause disk caches not to flush or cause the mouse to be unresponsive. 
> 
> So I assume using this kind of thread for a long period as we do can cause unstable behavior.
> 
> How do you think we can resolve this? Are there such cases in Linux?
> 
> [1] - https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setpriorityclass
> 
> Thanks,
> 
> Tal.

This is not unique to Windows; Linux has the same thing when using SCHED_FIFO.
Setting REALTIME is not a magic "go fast" flag; it tells the scheduler to "run this thread at higher priority than the kernel". Setting real-time priority is not compatible with applications doing 100% polling.
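
A toy model can make the starvation concrete. The sketch below is not Windows or Linux scheduler code: it assumes a single core, strict priority selection on every tick, and a hypothetical blocks_every field that says how often a thread gives up the core. All names (sim_thread, pick_next, simulate) are invented for illustration; only the priorities 24 and 15 come from the report quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a thread on one core; not a real OS structure. */
struct sim_thread {
    int  priority;     /* larger value wins, as in the Windows 0-31 range */
    int  blocks_every; /* sleep for one tick after running this many ticks;
                          0 means never (pure polling) */
    int  run_streak;   /* ticks run since the last sleep */
    int  sleeping;     /* nonzero while the thread is not runnable */
    long ticks_run;    /* total ticks the thread was given */
};

/* Strict priority: return the index of the highest-priority runnable
 * thread, or -1 if every thread is sleeping. */
static int pick_next(const struct sim_thread *t, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (t[i].sleeping)
            continue;
        if (best < 0 || t[i].priority > t[best].priority)
            best = (int)i;
    }
    return best;
}

/* Advance the model by `ticks` scheduling ticks. */
void simulate(struct sim_thread *t, size_t n, long ticks)
{
    assert(t != NULL && n > 0);

    for (long tick = 0; tick < ticks; tick++) {
        int cur = pick_next(t, n);

        /* A sleep lasts one tick: whoever was skipped above wakes up. */
        for (size_t i = 0; i < n; i++)
            t[i].sleeping = 0;

        if (cur < 0)
            continue;

        t[cur].ticks_run++;
        t[cur].run_streak++;
        if (t[cur].blocks_every > 0 &&
            t[cur].run_streak >= t[cur].blocks_every) {
            t[cur].sleeping = 1;   /* give up the core for the next tick */
            t[cur].run_streak = 0;
        }
    }
}
```

With one thread at priority 24 that never blocks and one at priority 15, the priority-15 thread gets zero ticks no matter how long the model runs. Setting blocks_every to 9 on the priority-24 thread lets the other thread run one tick in ten.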

If you have to use REALTIME, then the application must change to a sleep/wakeup type of architecture, not pure polling.
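
One possible shape for such a sleep/wakeup loop is sketched below. It is an assumption-laden illustration, not DPDK code: poll_fn and wait_fn are hypothetical callbacks, fake_queue is a made-up traffic source so the loop can run without a NIC, and max_polls exists only to bound the example. In a real application the poll step would be an Rx burst call such as rte_eth_rx_burst(), and the wait step a timed sleep or an interrupt wait where the platform supports one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callbacks: poll returns how many packets it received,
 * wait blocks until more work may be available. */
typedef unsigned (*poll_fn)(void *ctx);
typedef void (*wait_fn)(void *ctx);

struct loop_stats {
    uint64_t polls;
    uint64_t packets;
    uint64_t waits;
};

/* Poll while traffic flows; after idle_limit empty polls in a row,
 * block in wait() so lower-priority threads can use the core. */
void rx_loop(poll_fn poll, wait_fn wait, void *ctx,
             unsigned idle_limit, uint64_t max_polls,
             struct loop_stats *st)
{
    unsigned idle = 0;

    assert(poll && wait && st && idle_limit > 0);

    while (st->polls < max_polls) {
        unsigned n = poll(ctx);

        st->polls++;
        if (n > 0) {
            st->packets += n;
            idle = 0;
            continue;
        }
        if (++idle >= idle_limit) {
            wait(ctx);
            st->waits++;
            idle = 0;
        }
    }
}

/* Fake traffic source so the loop can be exercised without a NIC. */
#define FAKE_BURST 32u

struct fake_queue {
    unsigned pending; /* packets waiting to be received */
    unsigned refill;  /* packets that "arrive" during each wait */
};

unsigned fake_poll(void *ctx)
{
    struct fake_queue *q = ctx;
    unsigned n = q->pending < FAKE_BURST ? q->pending : FAKE_BURST;

    q->pending -= n;
    return n;
}

void fake_wait(void *ctx)
{
    struct fake_queue *q = ctx;

    q->pending += q->refill; /* stands in for a sleep or interrupt wait */
}
```

The idle_limit threshold is the design knob here: a small value gives the core back quickly at the cost of wakeup latency on the next packet, while a large value behaves more and more like pure polling.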

A typical DPDK-style application is incompatible with SCHED_FIFO/SCHED_RR.

