[dpdk-dev] [PATCH] mem: fix allocation failure on non-NUMA kernel
Nick Connolly
nick.connolly at mayadata.io
Thu Sep 17 16:19:18 CEST 2020
Sure.
On 17/09/2020 15:18, Burakov, Anatoly wrote:
> On 17-Sep-20 3:08 PM, Nick Connolly wrote:
>> Excellent - thanks - I'll amend the patch.
>>
>> On 17/09/2020 15:07, Burakov, Anatoly wrote:
>>> On 17-Sep-20 2:05 PM, Nick Connolly wrote:
>>>> Hi Anatoly,
>>>>
>>>> Thanks. My recollection is that all of the NUMA configuration
>>>> flags were set to 'n'.
>>>>
>>>> Regards,
>>>> Nick
>>>>
>>>> On 17/09/2020 13:57, Burakov, Anatoly wrote:
>>>>> On 17-Sep-20 1:29 PM, Nick Connolly wrote:
>>>>>> Hi Anatoly,
>>>>>>
>>>>>> Thanks for the response. You are asking a good question - here's
>>>>>> what I know:
>>>>>>
>>>>>> The issue arose on a single-socket system running WSL2 (a full
>>>>>> Linux kernel running as a lightweight VM under Windows).
>>>>>> The default kernel in this environment is built with
>>>>>> CONFIG_NUMA=n which means get_mempolicy() returns an error.
>>>>>> This causes the check to ensure that the allocated memory is
>>>>>> associated with the correct socket to fail.
>>>>>>
>>>>>> The change is to skip the allocation check if check_numa()
>>>>>> indicates that NUMA-aware memory is not supported.
>>>>>>
>>>>>> Researching the meaning of CONFIG_NUMA, I found
>>>>>> https://cateee.net/lkddb/web-lkddb/NUMA.html which says:
>>>>>>> Enable NUMA (Non-Uniform Memory Access) support.
>>>>>>> The kernel will try to allocate memory used by a CPU on the
>>>>>>> local memory controller of the CPU and add some more NUMA
>>>>>>> awareness to the kernel.
>>>>>>
>>>>>> Clearly CONFIG_NUMA enables memory awareness, but there's no
>>>>>> indication in the description whether information about the NUMA
>>>>>> physical architecture is 'hidden', or whether it is still exposed
>>>>>> through /sys/devices/system/node* (which is used by the rte
>>>>>> initialisation code to determine how many sockets there are).
>>>>>> Unfortunately, I don't have ready access to a multi-socket Linux
>>>>>> system that I can test this out on, so I took the conservative
>>>>>> approach that it may be possible to have CONFIG_NUMA disabled,
>>>>>> but the kernel still report more than one node, and coded the
>>>>>> change to generate a debug message if this occurs.
>>>>>>
>>>>>> Do you know whether CONFIG_NUMA turns off all knowledge about the
>>>>>> hardware architecture? If it does, then I agree that the test
>>>>>> for rte_socket_count() serves no purpose and should be removed.
>>>>>>
>>>>>
>>>>> I have a system with a custom-compiled kernel; I can recompile it
>>>>> without this flag and test this. I'll report back with results :)
>>>>>
>>>>
>>>
>>> With CONFIG_NUMA set to 'n':
>>>
>>> [root@xxx ~]# find /sys -name "node*"
>>> /sys/kernel/software_nodes/node0
>>> [root@xxx ~]#
>>>
>>> This is confirmed by running DPDK on that machine - I can see all
>>> cores from all sockets, but they all appear on socket 0. So,
>>> yes, that check isn't necessary :)
>>>
>>
>
> I would also add a comment explaining why we're checking for NUMA
> support at run time when NUMA support is defined at compile time.
>
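For anyone following the thread, here is a minimal sketch of the behaviour discussed above: probe the kernel for NUMA support at run time, and skip the socket check when it is absent. It is not the patch itself. The names kernel_has_numa() and check_page_socket() are hypothetical, and the probe assumes (as described above) that a CONFIG_NUMA=n kernel fails get_mempolicy() with ENOSYS.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical run-time probe (not DPDK's check_numa()).
 * Assumption: a kernel built with CONFIG_NUMA=n fails the
 * get_mempolicy() system call with ENOSYS. Any other outcome is
 * treated as "NUMA supported".
 */
bool kernel_has_numa(void)
{
#ifdef SYS_get_mempolicy
	errno = 0;
	if (syscall(SYS_get_mempolicy, NULL, NULL, 0UL, NULL, 0UL) < 0 &&
			errno == ENOSYS)
		return false;
	return true;
#else
	/* The system call number is unknown on this platform. */
	return false;
#endif
}

/*
 * Hypothetical allocation check. Returns 0 if the page may be used,
 * -1 if it landed on the wrong node. Without kernel NUMA support all
 * memory appears on socket 0 and the node of a page cannot be
 * queried, so there is nothing to verify.
 */
int check_page_socket(bool numa_supported, int requested_socket,
		int actual_node)
{
	assert(requested_socket >= 0);

	if (!numa_supported)
		return 0; /* skip the check on a non-NUMA kernel */

	return actual_node == requested_socket ? 0 : -1;
}
```

A caller would evaluate kernel_has_numa() once and pass the result to check_page_socket() after each allocation. The probe has to run at run time because a binary built with NUMA support can still end up on a kernel built without it.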