[dpdk-users] Linux forcibly descheduling isolated thread on isolated cpu running DPDK rx under load

Richard Nutman Richard.Nutman at s-a-m.com
Fri Apr 20 12:14:18 CEST 2018

Hi Terry,

Following on from what Stephen mentioned, when you hit an AVX2 instruction there is a warm-up latency while the CPU powers on the upper half of the 256-bit lanes.
It's normally around 10 usecs, so it possibly doesn't account for everything you're seeing.


Also, with RT threads that never yield, you should add nosoftlockup to your bootline to prevent the kernel assuming your thread has locked up.

Some things to look into:
1. Are you using no_hz (tickless) mode, e.g. nohz_full, on the kernel bootline?
2. Have you disabled RCU callbacks on your CPUs with rcu_nocbs on the kernel bootline?
3. Have you manually balanced IRQs to move them off your isolated CPUs?
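
A quick way to check the points above on a running box is to look at /proc/cmdline and the per-IRQ affinity lists. What follows is only a rough sketch, not a definitive tool. It assumes "no_hz mode" means the nohz_full= parameter, that CPU 2 is the isolated core (taken from Terry's description), and that the CPU lists are plain numbers and ranges. The function names are made up for illustration.

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '2,4-6' into a set of ints.

    Simplification: only plain numbers and 'a-b' ranges are understood;
    any other token (flags, strides, keywords) is silently skipped.
    """
    cpus = set()
    for part in text.strip().split(","):
        m = re.fullmatch(r"(\d+)(?:-(\d+))?", part.strip())
        if not m:
            continue
        first = int(m.group(1))
        last = int(m.group(2)) if m.group(2) else first
        cpus.update(range(first, last + 1))
    return cpus


def parse_cmdline(cmdline):
    """Map each boot parameter to its value ('' for bare flags)."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def check_bootline(cmdline, cpu):
    """Return a list of findings; an empty list means nothing was flagged."""
    params = parse_cmdline(cmdline)
    findings = []
    if "nosoftlockup" not in params:
        findings.append("nosoftlockup is not set")
    # nohz_full is an assumed reading of "no_hz mode" in the message above.
    for name in ("nohz_full", "rcu_nocbs"):
        if name not in params:
            findings.append(f"{name} is not set")
        elif cpu not in parse_cpu_list(params[name]):
            findings.append(f"{name}={params[name]} does not include cpu {cpu}")
    return findings


def irqs_on_cpu(cpu, pattern="/proc/irq/*/smp_affinity_list"):
    """Return the IRQ numbers whose affinity list still includes `cpu`."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                allowed = parse_cpu_list(f.read())
        except OSError:
            continue  # IRQ vanished or is not readable; skip it
        if cpu in allowed:
            hits.append(os.path.basename(os.path.dirname(path)))
    return hits
```

On the target machine you would call check_bootline(open("/proc/cmdline").read(), 2) and irqs_on_cpu(2). Any IRQ listed by the second call is still allowed to fire on the isolated core.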

The clear_page_erms suggests it could be memory housekeeping such as zone reclaim or transparent hugepages. Have you disabled these?
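
To see the current state of those two settings, something along these lines should work. It is a minimal sketch that assumes the usual sysfs/procfs locations (/sys/kernel/mm/transparent_hugepage/ and /proc/sys/vm/zone_reclaim_mode). The helper names are hypothetical, and it only reports values without changing anything.

```python
import re


def active_choice(text):
    """Return the bracketed option from a line like 'always [madvise] never'."""
    m = re.search(r"\[([^\]]+)\]", text)
    return m.group(1) if m else None


def read_first_line(path):
    """Return the first line of `path`, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.readline().strip()
    except OSError:
        return None


def memory_housekeeping_report():
    """Collect the THP and zone reclaim settings (None where unavailable)."""
    thp = read_first_line("/sys/kernel/mm/transparent_hugepage/enabled")
    defrag = read_first_line("/sys/kernel/mm/transparent_hugepage/defrag")
    zone = read_first_line("/proc/sys/vm/zone_reclaim_mode")
    return {
        "thp_enabled": active_choice(thp) if thp else None,
        "thp_defrag": active_choice(defrag) if defrag else None,
        # 0 means zone reclaim is off
        "zone_reclaim_mode": int(zone) if zone and zone.isdigit() else None,
    }
```

If thp_enabled comes back as anything other than "never", or zone_reclaim_mode is non-zero, those would be the first things I'd try turning off.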


> -----Original Message-----
> From: Tim Shearer [mailto:TShearer at advaoptical.com]
> Sent: 20 April 2018 03:00
> To: users at dpdk.org; terry.montague.1980 at btinternet.com
> Subject: Re: [dpdk-users] Linux forcibly descheduling isolated thread
> on isolated cpu running DPDK rx under load
> Hi Terry,
> Without digging into this too much, it looks like the kernel is context
> switching out to do a clear_page call, so I wonder if one of your other
> threads is doing something memory related that's triggering this
> behaviour.
> Tim
> ________________________________
> From: users <users-bounces at dpdk.org> on behalf of
> terry.montague.1980 at btinternet.com <terry.montague.1980 at btinternet.com>
> Sent: Thursday, April 19, 2018 11:43:32 AM
> To: users at dpdk.org
> Subject: [dpdk-users] Linux forcibly descheduling isolated thread on
> isolated cpu running DPDK rx under load
> Hi there,
> I wondered if anyone had come across this particular problem regarding
> linux scheduling, or rather what appears to be a forced descheduling
> effect.
> I'm running on standard vanilla Ubuntu 17.10 using kernel
> 4.13.0-36-generic.
> Local Timer interrupts are therefore enabled....
> I'm running a dual CPU Xeon E5-2623v4 system. I have cpu 2 on the first
> NUMA node (CPU 0) isolated for DPDK receive. I have an Intel X550 card
> attached to NUMA 0.
> What I'm doing is running my DPDK receive thread on the isolated core
> (2) and changing the scheduling for this thread to SCHED_FIFO and
> priority 98.
> Most of the time this works really well. However, I'm running this DPDK
> thread inside a larger application - there are probably 40 threads
> inside this process at default priority.
> What I'm seeing is, when the application is under load, the DPDK
> receive thread is forcibly descheduled (observed with pidstat -p <PID>
> -w and seeing the non-voluntary counts spike) and the core appears to
> go idle, sometimes for up to 1400 us.
> This is obviously a problem....
> Running "perf" to sample activity on this isolated core only, I see the
> following entries.
>    0.90%  swapper        [kernel.kallsyms]    [k] cpu_idle_poll
>    0.60%  lcore-slave-2  [kernel.kallsyms]    [k] clear_page_erms
> i.e. it has gone idle and 1.5% of the processing time has gone
> elsewhere, which ties in pretty well with my ~1400 us deschedule
> observation.
> In normal operation I do not see this effect.
> I've checked the code - it appears to go idle in the middle of some
> AVX2 data processing code - there are no system calls taken, it just
> goes idle.
> Does anyone have any ideas ?
> Many thanks
> Terry
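
For reference, the non-voluntary count that pidstat -w shows can also be read per thread from /proc/<pid>/task/<tid>/status, which makes it easier to watch just the rx thread. This is a rough sketch only: the function names are made up, and you would need to supply the PID and the thread ID of the DPDK receive thread yourself.

```python
import time


def parse_ctxt_switches(status_text):
    """Extract the two context-switch counters from a /proc status file."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value)
    return counts


def read_thread_switches(pid, tid):
    """Read the counters for one thread of a process."""
    with open(f"/proc/{pid}/task/{tid}/status") as f:
        return parse_ctxt_switches(f.read())


def watch(pid, tid, interval=1.0, rounds=10):
    """Print how many non-voluntary switches occurred in each interval."""
    prev = read_thread_switches(pid, tid)["nonvoluntary_ctxt_switches"]
    for _ in range(rounds):
        time.sleep(interval)
        cur = read_thread_switches(pid, tid)["nonvoluntary_ctxt_switches"]
        print(f"non-voluntary switches in last {interval}s: {cur - prev}")
        prev = cur
```

Running watch(pid, tid) while applying load and changing one setting at a time should show which change makes the spikes go away.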