[Bug 1030] rte_malloc() and rte_free() get stuck when used with signal handler
Stephen Hemminger
stephen at networkplumber.org
Wed Oct 5 19:30:59 CEST 2022
On Mon, 13 Jun 2022 14:48:45 +0500
Sarosh Arif <sarosh.arif at emumba.com> wrote:
> Thank you for help, I'll do it this way.
>
> On Sat, Jun 11, 2022 at 9:25 PM Mattias Rönnblom <hofors at lysator.liu.se> wrote:
> >
> > On 2022-06-10 08:04, Sarosh Arif wrote:
> > > On Thu, Jun 9, 2022 at 8:26 PM Stephen Hemminger
> > > <stephen at networkplumber.org> wrote:
> > >>
> > >> On Thu, 09 Jun 2022 12:47:43 +0000
> > >> bugzilla at dpdk.org wrote:
> > >>
> > >>> https://bugs.dpdk.org/show_bug.cgi?id=1030
> > >>>
> > >>> Bug ID: 1030
> > >>> Summary: rte_malloc() and rte_free() get stuck when used with
> > >>> signal handler
> > >>> Product: DPDK
> > >>> Version: 22.03
> > >>> Hardware: All
> > >>> OS: Linux
> > >>> Status: UNCONFIRMED
> > >>> Severity: normal
> > >>> Priority: Normal
> > >>> Component: core
> > >>> Assignee: dev at dpdk.org
> > >>> Reporter: sarosh.arif at emumba.com
> > >>> Target Milestone: ---
> > >>>
> > >>> Created attachment 205
> > >>> --> https://bugs.dpdk.org/attachment.cgi?id=205&action=edit
> > >>> calls rte_malloc and rte_free in the handler and main code
> > >>>
> > >>> I have a dpdk based application which uses rte_malloc() and rte_free()
> > > >>> frequently in its main code. The general method to close the application is
> > > >>> through sending SIGINT. The application has a signal handler written for cleanup
> > >>> purposes before closing the application. The handler also uses rte_free() to
> > >>> release some of the memory during cleanup. The application gets stuck in a
> > >>> deadlock.
> > >>>
> > >>>
> > > >>> Upon investigation I found out that both rte_free() and rte_malloc() use
> > > >>> rte_spinlock_lock() to take a lock on the heap. If the application receives
> > > >>> SIGINT while this lock is held, it enters the handler without releasing
> > > >>> the lock. Since the handler itself calls rte_free(), which tries to acquire
> > > >>> the same lock, it gets stuck.
> > >>>
> > >>>
> > >>> I have attached a sample application to reproduce this problem.
> > >>>
> > >>>
> > >>> Steps to reproduce this problem:
> > >>>
> > >>> 1. compile the code provided in attachment with any version of dpdk
> > >>> 2. run the compiled binary
> > >>> 3. press ctrl+c till the prints stop
> > >>>
> > >>> Actual Results:
> > >>> The application gets stuck in either rte_free() or rte_malloc()
> > >>>
> > >>> Expected Results:
> > >>> Application should allocate and free the memory without getting stuck
> > >>>
> > >>
> > > >> rte_malloc() and rte_free() are not async-signal-safe
> > >>
> > > Oh, I did not know that. This should be mentioned in the documentation.
> >
> > Is there anything except <rte_atomic.h> that is/should be async-signal-safe?
> >
> > >> but then again regular glibc is not either.
> > > Memory allocated with glibc malloc() is freed by itself upon closing
> > > the application. My application runs as a secondary process, and it
> > > needs to use rte_malloc() specifically because the memory should be
> > > shared between the two processes. If I don't free it upon closure it
> > > would just be leaked. Is there any other solution for it?
> >
> > The standard solution is for the signal handler to notify the main
> > thread in some appropriate, async-signal-safe way; the main thread then
> > goes on to cleanly terminate the application.
> >
> > A write() to an fd, or an atomic store to a flag are two options.
A patch describing what is signal-safe is pending (why is it not merged?).
https://patchwork.dpdk.org/project/dpdk/patch/20220711230448.557715-1-stephen@networkplumber.org/
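For illustration, the "atomic store to a flag" option suggested above could
look roughly like the sketch below. It is not taken from the bug attachment
or from the pending patch. The names stop_requested, on_sigint(),
install_sigint_handler() and run_until_stopped() are hypothetical, and plain
malloc()/free() stand in for rte_malloc()/rte_free() so that the sketch builds
without DPDK. The handler only sets a flag; all freeing happens in the main
loop's context, so the heap lock is never taken from inside the handler.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Set by the handler, polled by the main loop. volatile sig_atomic_t is
 * the type that may be written safely from a signal handler. */
static volatile sig_atomic_t stop_requested;

/* The handler only records the request: no allocator calls, no locks. */
static void on_sigint(int signum)
{
    (void)signum;
    stop_requested = 1;
}

/* Install the handler with sigaction(). Returns 0 on success, -1 on error. */
static int install_sigint_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical main loop. Returns the number of iterations completed
 * before the stop request was seen or max_iterations was reached. */
static unsigned long run_until_stopped(unsigned long max_iterations)
{
    unsigned long done = 0;

    while (!stop_requested && done < max_iterations) {
        void *buf = malloc(64);   /* rte_malloc() in the real application */
        if (buf == NULL)
            break;
        free(buf);                /* rte_free() in the real application */
        done++;
    }

    /* Final cleanup belongs here, in normal (non-handler) context, where
     * no allocator call can have been interrupted while holding the lock. */
    return done;
}
```

An application would call install_sigint_handler() once at startup, run its
loop through something like run_until_stopped(), and release its shared
memory after the loop returns rather than from the handler.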