[dpdk-dev] [PATCH] mem: fix allocation failure on non-NUMA kernel

Burakov, Anatoly anatoly.burakov at intel.com
Thu Sep 17 16:07:12 CEST 2020


On 17-Sep-20 2:05 PM, Nick Connolly wrote:
> Hi Anatoly,
> 
> Thanks.  My recollection is that all of the NUMA configuration flags 
> were set to 'n'.
> 
> Regards,
> Nick
> 
> On 17/09/2020 13:57, Burakov, Anatoly wrote:
>> On 17-Sep-20 1:29 PM, Nick Connolly wrote:
>>> Hi Anatoly,
>>>
>>> Thanks for the response.  You are asking a good question - here's 
>>> what I know:
>>>
>>> The issue arose on a single socket system, running WSL2 (full Linux 
>>> kernel running as a lightweight VM under Windows).
>>> The default kernel in this environment is built with CONFIG_NUMA=n 
>>> which means get_mempolicy() returns an error.
>>> This causes the check to ensure that the allocated memory is 
>>> associated with the correct socket to fail.
>>>
>>> The change is to skip the allocation check if check_numa() indicates 
>>> that NUMA-aware memory is not supported.
>>>
>>> Researching the meaning of CONFIG_NUMA, I found 
>>> https://cateee.net/lkddb/web-lkddb/NUMA.html which says:
>>>> Enable NUMA (Non-Uniform Memory Access) support.
>>>> The kernel will try to allocate memory used by a CPU on the local 
>>>> memory controller of the CPU and add some more NUMA awareness to the 
>>>> kernel.
>>>
>>> Clearly CONFIG_NUMA enables memory awareness, but there's no 
>>> indication in the description whether information about the NUMA 
>>> physical architecture is 'hidden', or whether it is still exposed 
>>> through /sys/devices/system/node* (which is used by the rte 
>>> initialisation code to determine how many sockets there are). 
>>> Unfortunately, I don't have ready access to a multi-socket Linux 
>>> system that I can test this out on, so I took the conservative 
>>> approach that it may be possible to have CONFIG_NUMA disabled, but 
>>> the kernel still reports more than one node, and coded the change to 
>>> generate a debug message if this occurs.
>>>
>>> Do you know whether CONFIG_NUMA turns off all knowledge about the 
>>> hardware architecture?  If it does, then I agree that the test for 
>>> rte_socket_count() serves no purpose and should be removed.
>>>
>>
>> I have a system with a custom-compiled kernel; I can recompile it 
>> without this flag and test this. I'll report back with results :)
>>
> 

With CONFIG_NUMA set to 'n':

[root@xxx ~]# find /sys -name "node*"
/sys/kernel/software_nodes/node0
[root@xxx ~]#

So no NUMA node entries are exposed under /sys/devices/system at all. 
This is confirmed by running DPDK on that machine - I can see all cores 
from all sockets, but they all appear on socket 0. So, yes, the 
rte_socket_count() check isn't necessary :)

-- 
Thanks,
Anatoly

