[dpdk-dev] [PATCH v2] eal/service: fix exit by resetting service lcores

Burakov, Anatoly anatoly.burakov at intel.com
Mon Apr 6 12:30:46 CEST 2020


On 13-Mar-20 10:04 AM, David Marchand wrote:
> On Wed, Mar 11, 2020 at 3:39 PM Harry van Haaren
> <harry.van.haaren at intel.com> wrote:
>>
>> This commit releases all service cores from their role,
>> returning them to ROLE_RTE on rte_service_finalize().
>>
>> This may fix an issue relating to the service cores causing
> 
> s/may fix/fixes/
> 
>> a race-condition on eal_cleanup(), where the service core
>> could still be executing while the main thread has already
>> free-d the service memory, leading to a segfault.
>>
>> Fixes: 21698354c832 ("service: introduce service cores concept")
> 
> Replaced with:
> Fixes: da23f0aa87d8 ("service: fix memory leak with new function")
> 
>> Cc: stable at dpdk.org
>>
>> Reported-by: David Marchand <david.marchand at redhat.com>
>> Reported-by: Aaron Conole <aconole at redhat.com>
>> Signed-off-by: David Marchand <david.marchand at redhat.com>
>> Signed-off-by: Harry van Haaren <harry.van.haaren at intel.com>
>> Acked-by: Aaron Conole <aconole at redhat.com>
> 
> Applied, thanks.
> 
> 

This patch breaks a couple of apps (or rather the apps were broken to 
begin with, but the brokenness has been exposed with this patch).

A "good" way to handle a SIGINT is to catch it, set some kind of global 
exit flag, and exit the signal handler, so that all of the threads see 
the exit flag, stop spinning, and exit the main loop and proceed to 
gracefully shutdown. That's what majority of our apps do.

A bad way to handle SIGINT is to call rte_exit() inside the signal 
handler, without setting any global exit flags. Since rte_exit() now 
waits for all of the threads to stop, the exit will never actually 
happen because threads can't stop without an exit signal, and no exit 
signal was provided by the signal handler.

Affected apps:

* l3fwd-power (i'm preparing a patch)
* ip_reassembly (see main.c:988) - +Konstantin

There are also a bunch of apps that simply call exit(0) and do unclean 
shutdown without DPDK cleanup, and also apps i have no idea what they're 
doing (call kill() on themselves in the SIGINT handler? l3fwd-cat does 
that, so do a bunch of others), but this is probably a bigger problem 
that should be addressed separately.

-- 
Thanks,
Anatoly


More information about the dev mailing list