[dpdk-dev] [PATCH] eal: fix ctrl thread affinity with --lcores
Johan Källström
johan.kallstrom at ericsson.com
Tue Jul 30 18:32:02 CEST 2019
Hi, for the online check I referred to the check of "default_set" via the initial thread affinity.
I see that pthread_getaffinity_np returns a mask that has already been ANDed; I was under the impression that pthread_getaffinity_np would return the same mask as was set using pthread_setaffinity_np.
Looking at the implementation, I see that this has been done on this line (https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L5242) for the last decade. I don’t know how this is implemented on FreeBSD or Windows.
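To illustrate the behaviour described above, here is a minimal Python model (not kernel code). It assumes the reported mask is simply the requested mask ANDed with the CPUs that are currently usable; the function name and the usable_cpus parameter are made up for this sketch. The input mirrors the first example run further down (taskset -c 19,79 with cpu 79 offline).

```python
def reported_affinity(requested, usable_cpus):
    """Model of the mask a getaffinity call reports: requested AND usable."""
    return set(requested) & set(usable_cpus)


# Assumption for the sketch: cpus 0-78 are usable, cpu 79 is offline.
usable = set(range(79))

# The mask that was set contained 19 and 79; only 19 is reported back.
print(sorted(reported_affinity({19, 79}, usable)))
```

Under that assumption the offline cpu never shows up in the value read back, which is consistent with "default_set: 19" in the first run.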
Below are some example runs without the online cpu check, run inside the exclusive cpuset 1-3,19,79 with cpu 79 offline.
I added a print statement after each consecutive calculation, just to verify what the different steps produce.
Nice that you were able to reproduce the bug; the fix looks good otherwise :)
= Example runs
echo 0 > /sys/bus/cpu/devices/cpu79/online
== 1. Ctrl threads via fallback
app# LD_LIBRARY_PATH=$PWD/../lib:$LD_LIBRARY_PATH taskset -c 19,79 ./testpmd --master-lcore 0 --lcores "(0,19)@(19,1,2,3)"
EAL: Detected 79 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: default_set: 19
EAL: cset_online: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78
EAL: cset_non_busy: 0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
EAL: cpuset:
EAL: cpuset fallback: 1,2,3,19
...
^Z
app# grep -HE '^(Cpus_allowed_list|Name):' /proc/48803/task/*/status
/proc/48803/task/48803/status:Name: testpmd
/proc/48803/task/48803/status:Cpus_allowed_list: 1-3,19
/proc/48803/task/48804/status:Name: eal-intr-thread
/proc/48803/task/48804/status:Cpus_allowed_list: 1-3,19
/proc/48803/task/48805/status:Name: rte_mp_handle
/proc/48803/task/48805/status:Cpus_allowed_list: 1-3,19
/proc/48803/task/48806/status:Name: lcore-slave-19
/proc/48803/task/48806/status:Cpus_allowed_list: 1-3,19
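For checking output like the above by script rather than by eye, a small helper can expand the list notation used in Cpus_allowed_list. This is only a sketch; parse_cpu_list is a hypothetical name, and it handles just the plain "a-b,c" form seen in this mail.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '1-3,19' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


lcore_cpus = parse_cpu_list("19,1,2,3")    # cpus given after '@' in --lcores
ctrl_allowed = parse_cpu_list("1-3,19")    # Cpus_allowed_list of eal-intr-thread

# In this run the control threads ended up on the same cpus as the lcores.
print(ctrl_allowed == lcore_cpus)
```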
== 2. Ctrl threads via default_set
app# LD_LIBRARY_PATH=$PWD/../lib:$LD_LIBRARY_PATH taskset -c 3,79 ./testpmd --master-lcore 0 --lcores "(0,19)@(19,1,2)"
EAL: Detected 79 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: default_set: 3
EAL: cset_online: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78
EAL: cset_non_busy: 0,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
EAL: cpuset: 3
EAL: cpuset fallback: 3
...
^Z
app# grep -HE '^(Cpus_allowed_list|Name):' /proc/54032/task/*/status
/proc/54032/task/54032/status:Name: testpmd
/proc/54032/task/54032/status:Cpus_allowed_list: 1-2,19
/proc/54032/task/54033/status:Name: eal-intr-thread
/proc/54032/task/54033/status:Cpus_allowed_list: 3
/proc/54032/task/54034/status:Name: rte_mp_handle
/proc/54032/task/54034/status:Cpus_allowed_list: 3
/proc/54032/task/54035/status:Name: lcore-slave-19
/proc/54032/task/54035/status:Cpus_allowed_list: 1-2,19
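The steps printed in the two runs can be summarised with a short Python model. It is a sketch inferred from the printed values, not the EAL code: ctrl_thread_cpus and MAX_CPUS are hypothetical names, and the 128-cpu range is an assumption taken from the 0-127 span shown for cset_non_busy.

```python
MAX_CPUS = 128  # assumption: matches the 0-127 range printed for cset_non_busy


def ctrl_thread_cpus(default_set, lcore_affinity, master_lcore):
    """Model of the printed steps: non-busy cpus AND default_set, else fallback."""
    # cpus used by any lcore count as busy
    busy = set().union(*lcore_affinity.values())
    cset_non_busy = set(range(MAX_CPUS)) - busy
    cpuset = set(default_set) & cset_non_busy
    if not cpuset:
        # nothing left: fall back to the master lcore's thread affinity
        cpuset = set(lcore_affinity[master_lcore])
    return cpuset


# Run 1: taskset -c 19,79 and --lcores "(0,19)@(19,1,2,3)"
run1 = ctrl_thread_cpus({19}, {0: {19, 1, 2, 3}, 19: {19, 1, 2, 3}}, master_lcore=0)
# Run 2: taskset -c 3,79 and --lcores "(0,19)@(19,1,2)"
run2 = ctrl_thread_cpus({3}, {0: {19, 1, 2}, 19: {19, 1, 2}}, master_lcore=0)
print(sorted(run1), sorted(run2))
```

With these inputs the model gives 1,2,3,19 for the first run (the fallback path) and 3 for the second, matching the Cpus_allowed_list values of eal-intr-thread and rte_mp_handle above.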
BR
Johan
-----Original Message-----
From: David Marchand [mailto:david.marchand at redhat.com]
Sent: July 30, 2019 15:48
To: Johan Källström <johan.kallstrom at ericsson.com>
Cc: dev at dpdk.org; anatoly.burakov at intel.com; olivier.matz at 6wind.com; stable at dpdk.org
Subject: Re: [PATCH] eal: fix ctrl thread affinity with --lcores
On Tue, Jul 30, 2019 at 1:38 PM Johan Källström <johan.kallstrom at ericsson.com> wrote:
> The CPU failsafe is nice to have as you could set the thread affinity to offline cpus.
Created a "dpdk" cpuset and put cpus 4-7 into it (my system is mono numa with 8 cpus)
# cd /sys/fs/cgroup/cpuset/
# mkdir dpdk
# cd dpdk
# echo 4-7 > cpuset.cpus
# echo 0 > cpuset.mems
Disabled cpu 5.
# echo 0 > /sys/bus/cpu/devices/cpu5/online
Put my shell that starts testpmd in this dpdk cpuset
# echo 4439 > tasks
EAL refuses an offline core when parsing the thread affinities and this did not change.
$ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' --log-level *:debug --no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 6 as core 2 on socket 0
EAL: Detected lcore 7 as core 3 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 7 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: core 5 unavailable
EAL: invalid parameter for --lcores
What did I miss?
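The refusal above can be pictured as a simple set difference between the cpus requested in --lcores and the cpus EAL detected. This is a rough Python sketch, not the EAL option parser; unavailable_cpus is a hypothetical helper, and the detected set is copied from the "Detected lcore" lines above.

```python
def unavailable_cpus(requested, detected):
    """Return the requested cpus that were not detected, in ascending order."""
    return sorted(set(requested) - set(detected))


# From the log above: lcores 0-4, 6 and 7 were detected (cpu 5 is offline).
detected = {0, 1, 2, 3, 4, 6, 7}

# cpus requested after '@' in --lcores '(0,7)@(7,4,5)'
print(unavailable_cpus({7, 4, 5}, detected))
```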
>
> Maybe also add the example I gave you to trigger the bug?
> https://bugs.dpdk.org/show_bug.cgi?id=322#c12
I managed to reproduce your error with the setup above (without relying on the cset tool that is not available on rhel afaics), I can add it to the commitlog yes.
> This also shows how to set the default_affinity mask and proves that the calculation will result in threads inside the cpuset on Linux.
>
> /Johan
>
> On tis, 2019-07-30 at 11:35 +0200, David Marchand wrote:
> > When using -l/-c options, each lcore is mapped to a physical cpu in a
> > 1:1 fashion.
> > On the contrary, when using --lcores, each lcore has its own cpuset
>
> Use "thread affinity" instead of cpuset when we talk about setting the thread affinity.
>
> I know that the term cpuset is used in the data structure, but it is not a cpuset as described by 'man cpuset' (on Linux). This comment can be seen as cosmetic, but I think it would be good to have clear definitions to minimize confusion.
Indeed, using cpuset is inappropriate.
I will update the commitlog and the comment.
--
David Marchand