[dpdk-dev] [PATCH] mem: fix allocation failure on non-NUMA kernel
Nick Connolly
nick.connolly at mayadata.io
Wed Aug 5 16:53:12 CEST 2020
On 05/08/2020 15:36, Nicolas Dichtel wrote:
> Le 05/08/2020 à 16:20, Nick Connolly a écrit :
> [snip]
>>>> Fixes: 2a96c88be83e ("mem: ease init in a docker container")
>>> I'm wondering if the bug existed before this commit.
>>>
>>> Before this commit, it was:
>>> move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
>>> if (cur_socket_id != socket_id) {
>>> /* error */
>>>
>>> Isn't it possible to hit this error case if CONFIG_NUMA is unset in the kernel?
>> I've just run the previous code to test this out, and you are right that
>> move_pages does indeed return -1 with errno set to ENOSYS. However, nothing
>> checks this, so execution carries on and compares cur_socket_id (which will be
>> unchanged from its zero initialization) with socket_id (which is presumably
>> also zero), thus allowing the allocation to succeed!
> I came to the same conclusion, but I didn't check whether socket_id could be != 0.
>
>>> [snip]
>>>> +	if (check_numa()) {
>>>> +		ret = get_mempolicy(&cur_socket_id, NULL, 0, addr,
>>>> +				MPOL_F_NODE | MPOL_F_ADDR);
>>>> +		if (ret < 0) {
>>>> +			RTE_LOG(DEBUG, EAL, "%s(): get_mempolicy: %s\n",
>>>> +				__func__, strerror(errno));
>>>> +			goto mapped;
>>>> +		} else if (cur_socket_id != socket_id) {
>>>> +			RTE_LOG(DEBUG, EAL,
>>>> +				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
>>>> +				__func__, socket_id, cur_socket_id);
>>>> +			goto mapped;
>>>> +		}
>>>> +	} else {
>>>> +		if (rte_socket_count() > 1)
>>>> +			RTE_LOG(DEBUG, EAL, "%s(): not checking socket for allocation (wanted %d)\n",
>>>> +				__func__, socket_id);
>>> nit: maybe a higher log level like WARNING?
>> Open to guidance here - my concern was that this is going to be generated for
>> every call to alloc_seg() and I'm not sure what the frequency will be, so I'm
>> cautious about flooding the log with warnings under 'normal running'. Are the
>> implications of running on a multi-socket system with NUMA support disabled in
>> the kernel purely performance related for DPDK, or is there a functional
>> correctness issue as well?
> Is it really a 'normal running' to have CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES in
> dpdk and not CONFIG_NUMA in the kernel?
I'm not a DPDK expert, but I think it needs to be treated as 'normal
running', for the following reasons:
1. The existing code in eal_memalloc_alloc_seg_bulk() is designed to
work even if check_numa() indicates that NUMA support is not enabled:
#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
	if (check_numa()) {
		oldmask = numa_allocate_nodemask();
		prepare_numa(&oldpolicy, oldmask, socket);
		have_numa = true;
	}
#endif
2. The DPDK application could be built with
CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES and the resulting binary then run on
different systems, with and without kernel NUMA support.
Regards,
Nick