[Bug 1137] CPU affinity set incorrectly when lcore_id 0 is not the master-lcore
Stephen Hemminger
stephen at networkplumber.org
Wed Nov 30 20:10:44 CET 2022
On Wed, 30 Nov 2022 18:41:16 +0000
bugzilla at dpdk.org wrote:
> https://bugs.dpdk.org/show_bug.cgi?id=1137
>
> Bug ID: 1137
> Summary: CPU affinity set incorrectly when lcore_id 0 is not
> the master-lcore
> Product: DPDK
> Version: 22.11
> Hardware: All
> OS: Linux
> Status: UNCONFIRMED
> Severity: normal
> Priority: Normal
> Component: core
> Assignee: dev at dpdk.org
> Reporter: ltroup at cisco.com
> Target Milestone: ---
>
> Created attachment 233
> --> https://bugs.dpdk.org/attachment.cgi?id=233&action=edit
> Logs showing incorrect CPU set assignment
>
> When a range of CPUs is used (e.g. 0-3) and the master-lcore is set to a
> non-zero value, the CPU affinity for lcore-id 0 is set incorrectly, because its
> cpuset is overwritten during control-thread creation.
>
> CPU arguments passed are '-c f --master-lcore 3', to indicate that CPUs 0-3
> should be used, with the master on CPU 3. In particular, DPDK itself is
> initialized from CPU 3.
>
> When the control threads (eal-intr-thread, rte_mp_handle) are created, they are
> initialized from CPU3 - so inherit the cpuset containing just this CPU. When
> calling __rte_thread_init(), ctrl_thread_init() passes the result of
> rte_lcore_id() - but this is not yet initialized for this thread - so is set to
> 0.
>
> This means that internally, the lcore_id for the control-thread is set to 0 -
> and in particular, the call to thread_update_affinity() overwrites the cpuset
> for lcore_id=0 with the cpuset of CPU3:
>
>     memmove(&lcore_config[lcore_id].cpuset, cpusetp,
>             sizeof(rte_cpuset_t));
>
> This all occurs before the main __rte_thread_init() call for each slave thread,
> so the slave thread associated with lcore_id 0, which should be running on
> CPU0, instead has its affinity incorrectly set to CPU3.
>
> RTE logs are attached showing this behavior (and including some additional logs
> added locally to print the lcore-id and cpusets being passed).
>
> The fix for this should be to make ctrl_thread_init() more similar to
> rte_thread_register(), so that it calls eal_lcore_non_eal_allocate() to assign
> an lcore-id, then passes this to __rte_thread_init(). I have tested a fix for
> this locally to confirm.
>
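For anyone following along, the overwrite described above can be modelled
outside of DPDK. What follows is only a rough sketch, not the EAL code: the
table, the bitmask cpuset, the in_use flag and every function name in it are
made up for illustration, and the "allocating" variant only approximates the
idea of assigning the control thread its own lcore id before recording its
affinity, as the report proposes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_LCORE 8      /* hypothetical table size for this model */
#define NUM_EAL_LCORES 4 /* lcores 0-3, as selected by '-c f' */

/* Hypothetical stand-in for lcore_config[]: one CPU bitmask per lcore. */
struct lcore_cfg {
    uint64_t cpuset; /* bit n set => thread may run on CPU n */
    int in_use;      /* slot already claimed by some thread */
};

static struct lcore_cfg lcore_config[MAX_LCORE];

/* Reset the table: lcore i is meant to run on CPU i only. */
static void setup_lcores(void)
{
    memset(lcore_config, 0, sizeof(lcore_config));
    for (unsigned i = 0; i < NUM_EAL_LCORES; i++) {
        lcore_config[i].cpuset = UINT64_C(1) << i;
        lcore_config[i].in_use = 1;
    }
}

/* Same kind of copy as the memmove quoted in the report. */
static void update_affinity(unsigned lcore_id, uint64_t cpuset)
{
    assert(lcore_id < MAX_LCORE);
    memmove(&lcore_config[lcore_id].cpuset, &cpuset, sizeof(cpuset));
}

/* Buggy path: the control thread's lcore id is not set yet and reads as 0,
 * so the cpuset inherited from the master clobbers slot 0. */
static unsigned ctrl_thread_init_buggy(uint64_t inherited_cpuset)
{
    unsigned lcore_id = 0;
    update_affinity(lcore_id, inherited_cpuset);
    return lcore_id;
}

/* Proposed direction: claim a free slot first, then record the affinity
 * there. Returns MAX_LCORE if no slot is free. */
static unsigned ctrl_thread_init_allocating(uint64_t inherited_cpuset)
{
    for (unsigned id = 0; id < MAX_LCORE; id++) {
        if (!lcore_config[id].in_use) {
            lcore_config[id].in_use = 1;
            update_affinity(id, inherited_cpuset);
            return id;
        }
    }
    return MAX_LCORE;
}
```

With the master on CPU 3, calling ctrl_thread_init_buggy(1 << 3) after
setup_lcores() leaves slot 0 holding the CPU 3 mask, which is the symptom in
the attached logs. The allocating variant leaves slot 0 untouched and records
the mask in the first free slot instead.
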
Side note: using CPU 0 with DPDK is not recommended for any real application.
It is impossible to fully isolate CPU 0 and therefore you will get poor performance
and mystery latency spikes.