Large interruptions for EAL thread running on isol core
Stephen Hemminger
stephen at networkplumber.org
Tue Jun 28 17:19:18 CEST 2022
On Tue, 28 Jun 2022 09:25:50 +0200
Carsten Andrich <carsten.andrich at tu-ilmenau.de> wrote:
> On 24.06.22 17:01, Stephen Hemminger wrote:
> > On Thu, 23 Jun 2022 21:03:49 +0200
> > Carsten Andrich <carsten.andrich at tu-ilmenau.de> wrote:
> >
> >> 2. Use real-time priority (SCHED_FIFO w/ priority 99) for the DPDK
> >> threads and
> >> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> >> to disable the runtime limit. With the runtime limit in place, the
> >> SCHED_FIFO performance will be significantly worse than SCHED_OTHER.
> > This can cause major issues if the application is a normal DPDK application (one that never makes system calls).
> > If an interrupt or other event happens on your isolated CPU, the work that it would
> > do in softirq is never performed, because FIFO has higher priority than kernel threads.
> > This can lead to mystery lockups in other applications (reads not completing, network timeouts, etc.).
>
> Thanks for pointing that out. Do you know of any official kernel
> documentation that could shed some light on that? I haven't had any
> serious issues like the ones you list, but maybe I've been lucky. My
> DPDK applications typically run on fairly minimal systems used
> exclusively for DPDK tasks, which require minimal latency/jitter. Minor
> side-effects from using SCHED_FIFO are tolerable in my case, if it
> improves performance.
If you look around, you will find good documentation, such as the Red Hat
real-time tuning guide section on real-time throttling:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/tuning_guide/real_time_throttling
It explains:
This characteristic of real-time threads means that it is quite easy to
write an application which monopolizes 100% of a given CPU. At first
glance this sounds like it might be a good idea, but in reality it
causes lots of headaches for the operating system. The OS is
responsible for managing both system-wide and per-CPU resources and
must periodically examine data structures describing these resources
and perform housekeeping activities with them. If a core is monopolized
by a SCHED_FIFO thread, it cannot perform the housekeeping tasks and
eventually the entire system becomes unstable, potentially causing a
crash.
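
To make the throttling limit discussed above concrete, here is a minimal Python sketch that reads the two kernel knobs involved and computes the share of each period that real-time tasks may consume. The function names `rt_cpu_share` and `read_rt_throttling` are made up for this illustration. It assumes a Linux system exposing sched_rt_runtime_us and sched_rt_period_us under /proc/sys/kernel. With the usual kernel defaults (950000 and 1000000) the share is 95%, which leaves 5% for housekeeping. A runtime of -1, as set by the echo command quoted earlier, removes the limit.

```python
from pathlib import Path

# Assumed location of the scheduler knobs on a Linux system.
PROC_DIR = Path("/proc/sys/kernel")


def rt_cpu_share(runtime_us: int, period_us: int) -> float:
    """Return the fraction of each period that real-time tasks may use."""
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    if runtime_us < 0:
        # A runtime of -1 disables real-time throttling entirely.
        return 1.0
    return min(runtime_us / period_us, 1.0)


def read_rt_throttling(proc_dir: Path = PROC_DIR) -> tuple[int, int]:
    """Read (runtime_us, period_us) from the given directory."""
    runtime_us = int((proc_dir / "sched_rt_runtime_us").read_text())
    period_us = int((proc_dir / "sched_rt_period_us").read_text())
    return runtime_us, period_us


if __name__ == "__main__":
    try:
        runtime_us, period_us = read_rt_throttling()
    except OSError as exc:
        # Not on Linux, or the files are not readable.
        print(f"could not read throttling settings: {exc}")
    else:
        share = rt_cpu_share(runtime_us, period_us)
        print(f"runtime={runtime_us} us, period={period_us} us, "
              f"real-time share={share:.0%}")
        if share >= 1.0:
            print("throttling is disabled: a busy SCHED_FIFO thread "
                  "can starve kernel housekeeping on its core")
```

Running this before and after changing sched_rt_runtime_us is a quick way to confirm which regime a test system is actually in.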