[dpdk-dev] [PATCH 3/4] eal: don't crash if alarm set fails

Thomas Monjalon thomas at monjalon.net
Thu Oct 25 01:51:34 CEST 2018


18/09/2018 12:16, Burakov, Anatoly:
> On 18-Sep-18 10:43 AM, Thomas Monjalon wrote:
> > 26/07/2018 11:41, Burakov, Anatoly:
> >> On 25-Jul-18 7:20 PM, Stephen Hemminger wrote:
> >>> There is no need to call rte_exit and crash the application here;
> >>> better to let the application handle the error itself.
> >>>
> >>> Remove the gratuitous profanity which would be visible if
> >>> the rte_exit was still there.
> >>>
> >>> Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com>
> >>> ---
> >>> --- a/lib/librte_eal/common/eal_common_proc.c
> >>> +++ b/lib/librte_eal/common/eal_common_proc.c
> >>> @@ -841,14 +841,12 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
> >>>    
> >>>    	param->user_reply.nb_sent++;
> >>>    
> >>> -	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> >>> -			      async_reply_handle, pending_req) < 0) {
> >>> +	ret = rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
> >>> +				async_reply_handle, pending_req);
> >>> +	if (ret < 0)
> >>>    		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
> >>>    			dst, req->name);
> >>> -		rte_panic("Fix the above shit to properly free all memory\n");
> >>
> >> Profanity aside, i think the message was trying to tell me something -
> >> namely, that if alarm_set fails, we're risking to leak this memory if
> >> reply from the peer never comes, and we're risking leaving the
> >> application hanging because the timeout never triggers. I'm not sure if
> >> leaving this "to the user" is the right choice, because there is no way
> >> for the user to free IPC-internal memory if it leaks.
> >>
> >> So i think the proper way to handle this would've been to set the alarm
> >> first, then, if it fails, don't sent the message in the first place.
> > 
> > What should be done here? OK to remove rte_panic for now?
> > 
> 
> As i said, the above fix is wrong because it leaks memory (however 
> unlikely it may be).
> 
> The alarm set call should be moved to before we do send_msg() call (and 
> goto fail; on failure). That way, even if alarm triggers too early (i.e. 
> immediately), the requests tailq will still be locked until we complete 
> our request sends - so we appropriately free memory on response, on 
> timeout or in our failure handler if alarm set has failed.

Someone to fix it, please?





More information about the dev mailing list