Mempool bigger than 1 page causes segmentation fault

Dmitry Kozlyuk dmitry.kozliuk at gmail.com
Wed Jul 27 14:30:55 CEST 2022


2022-07-27 14:59 (UTC+0300), MOD:
> Hi All,
> 
> My team and I have encountered a problem where allocation of a mempool
> larger than 1GB (== 1 Hugepage) fails.
> We are in a multi-process environment, and the `rte_mempool_create`
> happens in the secondary process.
> 
> Sometimes the allocation succeeds but after some successes (for me
> specifically, two) the following occurs:
> the secondary process segfaults on `malloc_elem_can_hold`, inside a stack
> starting from `rte_mempool_create`.
> 
> Restarting the secondary process does not work as it is stuck on `EAL:
> Probing VFIO support`, and restarting
> the main process is the only option.
> 
> Has anyone had this problem, or knows any possible solution?
> Thanks!

Please tell us the DPDK version and attach the stack trace.

If possible, try rebuilding DPDK with RTE_MALLOC_DEBUG defined,
and if your DPDK version supports it, with AddressSanitizer enabled.
A segfault in a function that traverses the malloc element list
suggests the heap may be corrupted, but that's only a guess.

Restarting the secondary process after a segfault is hardly a viable idea
because at this point the shared memory may already be corrupted,
and some lock may have been taken and never released
(which is a possible reason it gets stuck, BTW).

