[PATCH] malloc: enhance NUMA affinity heuristic

zhoumin zhoumin at loongson.cn
Tue Dec 27 10:00:19 CET 2022


Hi David,


First of all, I sincerely apologize for the late reply.

I have checked this issue carefully and have some useful findings.

On Wed, Dec 21, 2022 at 22:57, David Marchand wrote:
> Hello Min,
>
> On Wed, Dec 21, 2022 at 11:49 AM David Marchand
> <david.marchand at redhat.com> wrote:
>> Trying to allocate memory on the first detected numa node has less
>> chance to find some memory actually available rather than on the main
>> lcore numa node (especially when the DPDK application is started only
>> on one numa node).
>>
>> Signed-off-by: David Marchand <david.marchand at redhat.com>
> I see a failure in the loongarch CI.
>
> Running binary with
> argv[]:'/home/zhoumin/dpdk/build/app/test/dpdk-test'
> '--file-prefix=eal_flags_c_opt_autotest' '--proc-type=secondary'
> '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7'
> Error - process did not run ok with valid corelist value
> Test Failed
>
> The logs don't give the full picture (though it is not LoongArch CI fault).
>
> I tried to read back on past mail exchanges about the loongarch
> server, but I did not find the info.
> I suspect cores 5 to 7 belong to different numa nodes, can you confirm?

Cores 5 to 7 belong to the same NUMA node (node1) on the 
Loongson-3C5000LL CPU on which the LoongArch DPDK CI runs.

>
> I'll post a new revision to account for this case.
>

The LoongArch DPDK CI runs all the DPDK unit tests on cores 0-7 by 
adding the arg '-l 0-7' to the meson test args. In the above test 
case, the arg '--lcores' '0-1,2@(5-7),(3-5)@(0,2),(0,6),7' binds 
lcores 0 and 6 to the cpuset {0,6}, i.e. each of them may run on 
core 0 or core 6. The EAL logs make this clear when I set the EAL 
log level to debug, as follows:
EAL: Main lcore 0 is ready (tid=fff3ee18f0;cpuset=[0,6])
EAL: lcore 1 is ready (tid=fff2de4cf0;cpuset=[1])
EAL: lcore 2 is ready (tid=fff25e0cf0;cpuset=[5,6,7])
EAL: lcore 5 is ready (tid=fff0dd4cf0;cpuset=[0,2])
EAL: lcore 4 is ready (tid=fff15d8cf0;cpuset=[0,2])
EAL: lcore 3 is ready (tid=fff1ddccf0;cpuset=[0,2])
EAL: lcore 7 is ready (tid=ffdb7f8cf0;cpuset=[7])
EAL: lcore 6 is ready (tid=ffdbffccf0;cpuset=[0,6])
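
For anyone who wants to check this mapping programmatically, below is 
a minimal sketch that dumps each lcore's cpuset and the socket id EAL 
derived from it. rte_lcore_cpuset() and rte_lcore_to_socket_id() are 
existing EAL APIs; the test program around them is only illustrative:

#include <stdio.h>
#include <rte_eal.h>
#include <rte_lcore.h>

int
main(int argc, char **argv)
{
	unsigned int lcore_id;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/* Walk every enabled lcore, including the main one. */
	RTE_LCORE_FOREACH(lcore_id) {
		rte_cpuset_t set = rte_lcore_cpuset(lcore_id);
		unsigned int cpu;

		/* A socket id of -1 means EAL could not pick one socket. */
		printf("lcore %u -> socket %d, cpuset:", lcore_id,
			(int)rte_lcore_to_socket_id(lcore_id));
		for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
			if (CPU_ISSET(cpu, &set))
				printf(" %u", cpu);
		printf("\n");
	}

	rte_eal_cleanup();
	return 0;
}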

However, cores 0 and 6 belong to different NUMA nodes on the 
Loongson-3C5000LL CPU: core 0 belongs to NUMA node 0 and core 6 to 
NUMA node 1, as lscpu shows:
$ lscpu
Architecture:        loongarch64
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           8
NUMA node(s):        8
...
NUMA node0 CPU(s):   0-3
NUMA node1 CPU(s):   4-7
NUMA node2 CPU(s):   8-11
NUMA node3 CPU(s):   12-15
NUMA node4 CPU(s):   16-19
NUMA node5 CPU(s):   20-23
NUMA node6 CPU(s):   24-27
NUMA node7 CPU(s):   28-31
...

So the socket_id for lcores 0 and 6 is set to -1, as can be seen 
from thread_update_affinity(). Meanwhile, I printed out the 
socket_id for lcores 0 to RTE_MAX_LCORE - 1, as follows:
lcore_config[*].socket_id: -1 0 1 0 0 0 -1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 
5 5 6 6 6 6 7 7 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0
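
For reference, that -1 comes from the cpuset-to-socket resolution 
whose result thread_update_affinity() stores per thread: when a 
cpuset spans more than one socket, no single socket can be chosen. 
Below is a simplified sketch of that logic, modeled on EAL's internal 
eal_cpuset_socket_id(); the helper is illustrative, not the exact 
upstream code:

/* Resolve a cpuset to a socket id; a cpuset spanning two sockets,
 * such as [0,6] here, resolves to SOCKET_ID_ANY (-1). */
static int
cpuset_to_socket_id(rte_cpuset_t *cpusetp)
{
	int socket_id = SOCKET_ID_ANY;
	unsigned int cpu;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		int sid;

		if (!CPU_ISSET(cpu, cpusetp))
			continue;
		/* eal_cpu_socket_id() is EAL's per-CPU socket lookup. */
		sid = eal_cpu_socket_id(cpu);
		if (socket_id == SOCKET_ID_ANY)
			socket_id = sid;
		else if (socket_id != sid)
			return SOCKET_ID_ANY; /* cores on different sockets */
	}
	return socket_id;
}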

In this test case, the modified malloc_get_numa_socket() will 
return -1, which causes the memory allocation to fail.
Is it acceptable in DPDK for an lcore's socket_id to be -1?
If it is, maybe we can check the main lcore's socket_id before 
using it, such as:
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d7c410b786..3ee19aee15 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
                         return socket_id;
         }

+       socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
+       if (socket_id != (unsigned int)SOCKET_ID_ANY)
+               return socket_id;
+
         return rte_socket_id_by_idx(0);
  }
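
With that check in place, the whole function would read roughly as 
below. This is reconstructed from the diff context and the current 
lib/eal/common/malloc_heap.c, so treat it as a sketch rather than 
the exact upstream code:

static unsigned int
malloc_get_numa_socket(void)
{
	const struct internal_config *conf = eal_get_internal_configuration();
	unsigned int socket_id = rte_socket_id();
	unsigned int idx;

	if (socket_id != (unsigned int)SOCKET_ID_ANY)
		return socket_id;

	/* For control threads, return the first socket with memory. */
	for (idx = 0; idx < rte_socket_count(); idx++) {
		socket_id = rte_socket_id_by_idx(idx);
		if (conf->socket_mem[socket_id] != 0)
			return socket_id;
	}

	/* Proposed fallback: prefer the main lcore's socket when known. */
	socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
	if (socket_id != (unsigned int)SOCKET_ID_ANY)
		return socket_id;

	return rte_socket_id_by_idx(0);
}

In this test the main lcore's socket_id is itself -1, so the function 
would still fall through to rte_socket_id_by_idx(0) instead of 
returning -1.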


