Large interruptions for EAL thread running on isol core

Stephen Hemminger stephen at networkplumber.org
Tue Jun 28 17:19:18 CEST 2022


On Tue, 28 Jun 2022 09:25:50 +0200
Carsten Andrich <carsten.andrich at tu-ilmenau.de> wrote:

> On 24.06.22 17:01, Stephen Hemminger wrote:
> > On Thu, 23 Jun 2022 21:03:49 +0200
> > Carsten Andrich <carsten.andrich at tu-ilmenau.de> wrote:
> >  
> >>   2. Use real-time priority (SCHED_FIFO w/ priority 99) for the DPDK
> >>      threads and
> >>      echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> >>      to disable the runtime limit. With the runtime limit in place, the
> >>      SCHED_FIFO performance will be significantly worse than SCHED_OTHER.  
> > This can cause major issues if the application is a normal DPDK application (one that never makes system calls).
> > If an interrupt or other event happens on your isolated CPU, the work that it would
> > do in softirq context is never performed. SCHED_FIFO has higher priority than kernel threads.
> > This can lead to mystery lockups in other applications (reads not completing, network timeouts, etc).
> 
> Thanks for pointing that out. Do you know of any official kernel 
> documentation that could shed some light on that? I haven't had any 
> serious issues like the ones you list, but maybe I've been lucky. My 
> DPDK applications typically run on fairly minimal systems used 
> exclusively for DPDK tasks, which require minimal latency/jitter. Minor 
> side-effects from using SCHED_FIFO are tolerable in my case, if it 
> improves performance.

Do some looking around and you will find good documentation like:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/tuning_guide/real_time_throttling


This characteristic of real-time threads means that it is quite easy to
write an application which monopolizes 100% of a given CPU. At first
glance this sounds like it might be a good idea, but in reality it
causes lots of headaches for the operating system. The OS is
responsible for managing both system-wide and per-CPU resources and
must periodically examine data structures describing these resources
and perform housekeeping activities with them. If a core is monopolized
by a SCHED_FIFO thread, it cannot perform the housekeeping tasks and
eventually the entire system becomes unstable, potentially causing a
crash.
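
To see how much CPU time the throttle discussed above actually leaves for
housekeeping, the two sysctls involved can be read and compared. The
following is only a rough Python sketch, not something from the guide or
the thread: the helper names (non_rt_share, read_setting, describe) are
made up for illustration, and it assumes the usual Linux semantics where
sched_rt_runtime_us is the real-time budget per sched_rt_period_us and a
value of -1 disables the limit.

```python
from pathlib import Path

# Location of the scheduler sysctls on Linux.
PROC_SCHED = Path("/proc/sys/kernel")


def non_rt_share(runtime_us: int, period_us: int) -> float:
    """Fraction of each period that real-time tasks cannot consume.

    A negative runtime (the "echo -1" case) means throttling is off,
    so nothing is reserved for non-real-time work.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        return 0.0
    return max(0.0, 1.0 - runtime_us / period_us)


def read_setting(name: str) -> int:
    """Read one integer sysctl, e.g. 'sched_rt_runtime_us'."""
    return int((PROC_SCHED / name).read_text().strip())


def describe(runtime_us: int, period_us: int) -> str:
    """Human-readable summary of the current throttle setting."""
    if runtime_us < 0:
        return "RT throttling disabled: a SCHED_FIFO thread can use 100% of a CPU"
    share = non_rt_share(runtime_us, period_us)
    return f"{share:.1%} of every {period_us} us is kept free of RT tasks"


if __name__ == "__main__":
    runtime = read_setting("sched_rt_runtime_us")
    period = read_setting("sched_rt_period_us")
    print(describe(runtime, period))
```

With the common defaults of 950000 and 1000000 the reserved share works
out to about 5%. After writing -1 to sched_rt_runtime_us it is zero,
which is the situation the paragraph above warns about.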

