[dpdk-dev] [PATCH V4 3/9] bus: introduce sigbus handler

Jeff Guo jia.guo at intel.com
Wed Jul 11 04:15:11 CEST 2018



On 7/11/2018 5:55 AM, Stephen Hemminger wrote:
> On Fri, 29 Jun 2018 18:30:42 +0800
> Jeff Guo <jia.guo at intel.com> wrote:
>
>> When a device is hot-unplugged while the data path is still reading from
>> or writing to it, a sigbus error occurs. This error needs to be handled,
>> so a handler is needed to capture the signal and handle it accordingly.
>>
>> Handling a sigbus error is bus-specific behavior, so this patch introduces
>> a bus op so that each kind of bus can implement its own logic.
>>
>> Signed-off-by: Jeff Guo <jia.guo at intel.com>
>> ---
>> v4->v3:
>> split patches to be small and clear.
>> ---
>>   lib/librte_eal/common/include/rte_bus.h | 16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
>> index 3642aeb..231bd3d 100644
>> --- a/lib/librte_eal/common/include/rte_bus.h
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -181,6 +181,20 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr);
>>   typedef int (*rte_bus_hotplug_handler_t)(struct rte_device *dev);
>>   
>>   /**
>> + * Implement a bus-specific sigbus handler, which is responsible for
>> + * handling a sigbus error that is either a generic memory error or a
>> + * memory error caused by hot unplug.
>> + * @param failure_addr
>> + *	Pointer to the fault address of the sigbus error.
>> + *
>> + * @return
>> + *	0 if the sigbus was handled successfully.
>> + *	1 if the sigbus was not handled.
>> + *	-1 if handling the sigbus failed.
>> + */
>> +typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
>> +
>> +/**
>>    * Bus scan policies
>>    */
>>   enum rte_bus_scan_mode {
>> @@ -226,6 +240,8 @@ struct rte_bus {
>>   	rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
>>   	rte_bus_hotplug_handler_t hotplug_handler;
>>   						/**< handle hot plug on bus */
>> +	rte_bus_sigbus_handler_t sigbus_handler; /**< handle sigbus error */
>> +
>>   };
>>   
>>   /**
> One issue with handling sigbus is that you are going to trap program errors
> as well as hotplug. How can you distinguish between removed device and a
> buggy userspace program (or worse compromised program)?
That is a problem I have considered in this mechanism, and it is handled 
in another patch. The approach is to first check whether the faulting 
address belongs to an MMIO device resource.
If it does, the new sigbus handler for hotplug is used. If it does not, 
the fault comes from a buggy user space program, and the generic sigbus 
handler handles it.
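
To make the check concrete, here is a minimal standalone sketch of that
dispatch. It is not the code from the other patch. The region table and
the names mmio_region, fake_bar, addr_in_region, example_bus_sigbus_handler
and dispatch_sigbus are all hypothetical. Only the 0 / 1 / -1 return
convention and the failure_addr parameter come from the
rte_bus_sigbus_handler_t typedef above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped MMIO resource of a device. */
struct mmio_region {
	const void *base;
	size_t len;
};

/* Hypothetical: a plain buffer plays the role of a mapped device BAR. */
char fake_bar[4096];

static const struct mmio_region regions[] = {
	{ fake_bar, sizeof(fake_bar) },
};

/* Return nonzero if addr lies inside [base, base + len). */
int addr_in_region(const struct mmio_region *r, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t b = (uintptr_t)r->base;

	return a >= b && a - b < r->len;
}

/*
 * Same shape as rte_bus_sigbus_handler_t:
 *   0  the fault was in a device resource and was handled
 *   1  the fault is not in any resource of this bus
 *  -1  handling failed (not produced by this sketch)
 */
int example_bus_sigbus_handler(const void *failure_addr)
{
	size_t i;

	for (i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		if (addr_in_region(&regions[i], failure_addr)) {
			/* A real bus would start hot-unplug recovery here. */
			return 0;
		}
	}
	return 1;
}

enum sigbus_action {
	SIGBUS_HANDLED_AS_HOTPLUG,
	SIGBUS_USE_GENERIC_HANDLER,
};

/*
 * Decide who owns the fault. Anything other than 0 from the bus handler
 * falls back to the generic handler; treating -1 this way is an
 * assumption of this sketch.
 */
enum sigbus_action dispatch_sigbus(const void *failure_addr)
{
	if (example_bus_sigbus_handler(failure_addr) == 0)
		return SIGBUS_HANDLED_AS_HOTPLUG;
	return SIGBUS_USE_GENERIC_HANDLER;
}
```

In a real signal handler installed with sigaction() and SA_SIGINFO, the
fault address would come from the si_addr field of siginfo_t, and the
generic path would typically fall back to whatever SIGBUS action was
installed before.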

