[dpdk-dev] [PATCH v12 6/7] eal: add failure handle mechanism for hot-unplug

Jeff Guo jia.guo at intel.com
Thu Oct 4 05:12:22 CEST 2018


On 10/2/2018 11:53 PM, Burakov, Anatoly wrote:
> On 02-Oct-18 1:35 PM, Jeff Guo wrote:
>> The mechanism can initially register the sigbus handler after the device
>> event monitor is enabled. When a sigbus event is captured, it will check
>> the failure address and accordingly handle the memory failure of the
>> corresponding device by invoking the hot-unplug handler. It can prevent
>> the application from crashing when a device is hot-unplugged.
>>
>> With this patch, users can call the newly added APIs below to enable/disable
>> the device hotplug handling mechanism. Note that these functions implement
>> only the hot-unplug handler; other hotplug handlers, such as a handler for
>> hotplug binding, could be added in the future if needed:
>>    - rte_dev_hotplug_handle_enable
>>    - rte_dev_hotplug_handle_disable
>>
>> Signed-off-by: Jeff Guo <jia.guo at intel.com>
>> ---
>
> <snip>
>
>> +static void sigbus_handler(int signum, siginfo_t *info,
>> +                void *ctx __rte_unused)
>> +{
>> +    int ret;
>> +
>> +    RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
>> +        (int)pthread_self(), info->si_addr);
>> +
>> +    rte_spinlock_lock(&failure_handle_lock);
>> +    ret = rte_bus_sigbus_handler(info->si_addr);
>> +    rte_spinlock_unlock(&failure_handle_lock);
>> +    if (ret == -1) {
>> +        rte_exit(EXIT_FAILURE,
>> +             "Failed to handle SIGBUS for hot-unplug, "
>> +             "(rte_errno: %s)!", strerror(rte_errno));
>
> Do we really want to exit the application on sigbus handle failure?
>

Yes, we definitely want that, since it is a failure of the process. I agree
with Konstantin's reply in the other mail.
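
For readers following the thread, here is a minimal standalone sketch of the
return-value convention used by the quoted handler (0 = fault handled,
1 = not a device address, so forward to the previous action, -1 = error, so
terminate). It is not the patch code: every name in it is hypothetical,
classify_fault() merely stands in for rte_bus_sigbus_handler(), and the demo
uses SIGUSR1 instead of SIGBUS so that it can be raised safely. Forwarding
through sa_sigaction when the previous action was registered with SA_SIGINFO
is an assumption about what a fuller handler might want.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none are taken from the patch. */
enum { FAULT_HANDLED = 0, FAULT_NOT_OURS = 1, FAULT_ERROR = -1 };

static struct sigaction prev_action;     /* action installed before ours */
static volatile sig_atomic_t prev_calls; /* times the previous handler ran */

/* Stand-in for rte_bus_sigbus_handler(): claims no address as a device's. */
static int classify_fault(const void *addr)
{
    (void)addr;
    return FAULT_NOT_OURS;
}

/* Forward to the previous action, honouring how it was registered. */
static int chain_previous(int signum, siginfo_t *info, void *ctx)
{
    if (prev_action.sa_flags & SA_SIGINFO) {
        if (prev_action.sa_sigaction == NULL)
            return -1;
        prev_action.sa_sigaction(signum, info, ctx);
        return 0;
    }
    if (prev_action.sa_handler == SIG_DFL || prev_action.sa_handler == SIG_IGN)
        return -1; /* nothing callable to forward to */
    prev_action.sa_handler(signum);
    return 0;
}

static void fault_handler(int signum, siginfo_t *info, void *ctx)
{
    /* si_addr is only meaningful for real faults; the stub ignores it. */
    int ret = classify_fault(info->si_addr);

    if (ret == FAULT_HANDLED)
        return;
    if (ret == FAULT_NOT_OURS && chain_previous(signum, info, ctx) == 0)
        return;
    abort(); /* error, or no previous handler: terminate the process */
}

static void previous_handler(int signum)
{
    (void)signum;
    prev_calls++;
}

/* Installs a "previous" handler, then ours on top, and raises the signal.
 * Returns how many times the previous handler ran, or -1 on setup failure. */
static int run_demo(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = previous_handler;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGUSR1, &sa, &prev_action) != 0)
        return -1;

    raise(SIGUSR1);
    return (int)prev_calls;
}
```

Calling run_demo() takes the "not ours" path, so the fault is passed on to the
handler that was installed first. abort() is used in the sketch because it is
async-signal-safe.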


>> +    } else if (ret == 1) {
>> +        if (sigbus_action_old.sa_handler)
>> +            (*(sigbus_action_old.sa_handler))(signum);
>> +        else
>> +            rte_exit(EXIT_FAILURE,
>> +                 "Failed to handle generic SIGBUS!");
>> +    }
>> +
>> +    RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hot-unplug!\n");
>
> Again, does this all need to be with INFO log level? IMO it should be 
> DEBUG.
>

I am fine with that.


>> +}
>> +
>> +static int cmp_dev_name(const struct rte_device *dev,
>> +    const void *_name)
>> +{
>> +    const char *name = _name;
>> +
>> +    return strcmp(dev->name, name);
>> +}
>> +
>>   static int
>
> <snip>
>
>>     int __rte_experimental
>> @@ -220,5 +320,67 @@ rte_dev_event_monitor_stop(void)
>>       close(intr_handle.fd);
>>       intr_handle.fd = -1;
>>       monitor_started = false;
>> +
>>       return 0;
>
> This looks like unintended change.
>

No, I intended this change, to make it consistent with the formatting elsewhere.


>>   }
>> +
>> +int __rte_experimental
>> +rte_dev_sigbus_handler_register(void)
>> +{
>> +    sigset_t mask;
>> +    struct sigaction action;
>> +
>
> <snip>
>
>> --- a/lib/librte_eal/rte_eal_version.map
>> +++ b/lib/librte_eal/rte_eal_version.map
>> @@ -281,6 +281,8 @@ EXPERIMENTAL {
>>       rte_dev_event_callback_unregister;
>>       rte_dev_event_monitor_start;
>>       rte_dev_event_monitor_stop;
>> +    rte_dev_hotplug_handle_enable;
>> +    rte_dev_hotplug_handle_disable;
>
> Nitpicking - disable should be above enable, as E follows D in 
> alphabet :)
>

Yes, after rechecking the alphabet, it is definitely as you said. :)


>>       rte_dev_iterator_init;
>>       rte_dev_iterator_next;
>>       rte_devargs_add;
>>
>
>

