[dpdk-dev] [PATCH v12 6/7] eal: add failure handle mechanism for hot-unplug

Burakov, Anatoly anatoly.burakov at intel.com
Tue Oct 2 17:53:46 CEST 2018


On 02-Oct-18 1:35 PM, Jeff Guo wrote:
> The mechanism registers the sigbus handler after the device event
> monitor is enabled. When a sigbus event is captured, it checks the
> failure address and handles the memory failure of the corresponding
> device by invoking the hot-unplug handler. This can prevent the
> application from crashing when a device is hot-unplugged.
> 
> With this patch, users can call the newly added APIs below to
> enable/disable the device hotplug handling mechanism. Note that these
> functions only implement the hot-unplug handler; other hotplug handlers,
> such as one for hotplug binding, could be added in the future if needed:
>    - rte_dev_hotplug_handle_enable
>    - rte_dev_hotplug_handle_disable
> 
> Signed-off-by: Jeff Guo <jia.guo at intel.com>
> ---

<snip>

> +static void sigbus_handler(int signum, siginfo_t *info,
> +				void *ctx __rte_unused)
> +{
> +	int ret;
> +
> +	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
> +		(int)pthread_self(), info->si_addr);
> +
> +	rte_spinlock_lock(&failure_handle_lock);
> +	ret = rte_bus_sigbus_handler(info->si_addr);
> +	rte_spinlock_unlock(&failure_handle_lock);
> +	if (ret == -1) {
> +		rte_exit(EXIT_FAILURE,
> +			 "Failed to handle SIGBUS for hot-unplug, "
> +			 "(rte_errno: %s)!", strerror(rte_errno));

Do we really want to exit the application when handling the SIGBUS fails?

> +	} else if (ret == 1) {
> +		if (sigbus_action_old.sa_handler)
> +			(*(sigbus_action_old.sa_handler))(signum);
> +		else
> +			rte_exit(EXIT_FAILURE,
> +				 "Failed to handle generic SIGBUS!");
> +	}
> +
> +	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hot-unplug!\n");

Again, does all of this need to be at INFO log level? IMO it should be DEBUG.

> +}
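
For reference on the fallback branch quoted above (the ret == 1 case), here is a minimal sketch of how a replacement handler might hand a SIGBUS it does not own back to whatever disposition was installed before it. This is an illustration under stated assumptions, not the patch's code: every name below (old_action, app_handler, device_owns_address, chained_handler, install_chained_handler, demo) is hypothetical, and device_owns_address() is only a stand-in for rte_bus_sigbus_handler() that always reports "not a device address". The point of the sketch is that a saved struct sigaction can hold a three-argument sa_sigaction, SIG_IGN, or SIG_DFL, so a plain non-NULL check on sa_handler may not be enough.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none of them are DPDK APIs. */
static struct sigaction old_action;     /* disposition saved at install time */
static volatile sig_atomic_t old_calls; /* bumped by the stand-in old handler */

/* Stand-in for a handler the application installed before ours. */
static void app_handler(int signum)
{
	(void)signum;
	old_calls++;
}

/* Stand-in for rte_bus_sigbus_handler(): 1 means "not a device address". */
static int device_owns_address(void *addr)
{
	(void)addr;
	return 1;
}

static void chained_handler(int signum, siginfo_t *info, void *ctx)
{
	if (device_owns_address(info->si_addr) != 1)
		return; /* would have been handled as a hot-unplug fault */

	if (old_action.sa_flags & SA_SIGINFO) {
		/* Previous handler used the three-argument form. */
		if (old_action.sa_sigaction != NULL)
			old_action.sa_sigaction(signum, info, ctx);
	} else if (old_action.sa_handler == SIG_IGN) {
		return; /* previous disposition was "ignore": nothing to call */
	} else if (old_action.sa_handler == SIG_DFL) {
		/* Restore the default action and re-raise; the signal is
		 * delivered with the default action once we return. */
		sigaction(signum, &old_action, NULL);
		raise(signum);
	} else {
		old_action.sa_handler(signum);
	}
}

/* Install chained_handler for SIGBUS, remembering the old disposition. */
static int install_chained_handler(void)
{
	struct sigaction action;

	memset(&action, 0, sizeof(action));
	sigemptyset(&action.sa_mask);
	action.sa_flags = SA_SIGINFO;
	action.sa_sigaction = chained_handler;
	return sigaction(SIGBUS, &action, &old_action);
}

/* Installs app_handler, then ours on top, then sends one SIGBUS.
 * Returns how many times the older handler ended up being called. */
static int demo(void)
{
	struct sigaction prev;

	memset(&prev, 0, sizeof(prev));
	sigemptyset(&prev.sa_mask);
	prev.sa_handler = app_handler;
	assert(sigaction(SIGBUS, &prev, NULL) == 0);
	assert(install_chained_handler() == 0);
	raise(SIGBUS);
	return (int)old_calls;
}
```

Calling demo() from a test program sends a single SIGBUS with raise(), so si_addr carries no meaningful fault address there; the sketch only exercises the chaining path, not real hot-unplug detection.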
> +
> +static int cmp_dev_name(const struct rte_device *dev,
> +	const void *_name)
> +{
> +	const char *name = _name;
> +
> +	return strcmp(dev->name, name);
> +}
> +
>   static int

<snip>

>   
>   int __rte_experimental
> @@ -220,5 +320,67 @@ rte_dev_event_monitor_stop(void)
>   	close(intr_handle.fd);
>   	intr_handle.fd = -1;
>   	monitor_started = false;
> +
>   	return 0;

This looks like an unintended change.

>   }
> +
> +int __rte_experimental
> +rte_dev_sigbus_handler_register(void)
> +{
> +	sigset_t mask;
> +	struct sigaction action;
> +

<snip>

> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -281,6 +281,8 @@ EXPERIMENTAL {
>   	rte_dev_event_callback_unregister;
>   	rte_dev_event_monitor_start;
>   	rte_dev_event_monitor_stop;
> +	rte_dev_hotplug_handle_enable;
> +	rte_dev_hotplug_handle_disable;

Nitpick: disable should be listed above enable, since D comes before E in the alphabet :)

>   	rte_dev_iterator_init;
>   	rte_dev_iterator_next;
>   	rte_devargs_add;
> 


-- 
Thanks,
Anatoly

