[dpdk-dev] [PATCH v7 1/2] eal: add uevent monitor for hot plug

Matan Azrad matan at mellanox.com
Mon Jan 8 09:14:20 CET 2018


Hi Guo
I answered for the 2 threads here. 

 From: Guo, Jia, Monday, January 8, 2018 7:27 AM
> On 1/3/2018 1:02 AM, Matan Azrad wrote:
> > Hi Jeff
> >
> > Maybe I'm touching in previous discussions but please see some
> comments\questions.
> >
> > From: Jeff Guo:
> >> This patch aim to add a general uevent mechanism in eal device layer,
> >> to enable all linux kernel object hot plug monitoring, so user could use
> these
> >> APIs to monitor and read out the device status info that sent from the
> kernel
> >> side, then corresponding to handle it, such as detach or attach the
> >> device, and even benefit to use it to do smoothly fail safe work.
> >>
> >> 1) About uevent monitoring:
> >> a: add one epolling to poll the netlink socket, to monitor the uevent of
> >>     the device, add device_state in struct of rte_device, to identify the
> >>     device state machine.
> >> b: add enum of rte_eal_dev_event_type and struct of rte_eal_uevent.
> >> c: add below API in rte eal device common layer.
> >>     rte_eal_dev_monitor_enable
> >>     rte_dev_callback_register
> >>     rte_dev_callback_unregister
> >>     _rte_dev_callback_process
> >>     rte_dev_monitor_start
> >>     rte_dev_monitor_stop
> >>
> >> 2) About failure handler, use pci uio for example,
> >>     add pci_remap_device in bus layer and below function to process it:
> >>     rte_pci_remap_device
> >>     pci_uio_remap_resource
> >>     pci_map_private_resource
> >>     add rte_pci_dev_bind_driver to bind pci device with explicit driver.
> >>
> >> Signed-off-by: Jeff Guo <jia.guo at intel.com>
<snip>
> >> +		user_cb->cb_arg = cb_arg;
> >> +		user_cb->event = event;
> >> +		if (event == RTE_EAL_DEV_EVENT_ADD)
> >> +			dev_add_cb = user_cb;
> > Only one dpdk entity can register to ADD callback?
> >
> > I suggest to add option to register all devices maybe by using dummy
> device which will include all the "ALL_DEVICES"  callbacks per event.
> > All means past, present and future devices, by this way 1 callback can be
> called for all the devices and more than one dpdk entity could register to  an
> ADD\NEW event.
> > What's about NEW instead of ADD?
> >
> > I also suggest to add the device pointer as a parameter to the
> callback(which will be managed by EAL).
> if you talk about dev_add_cb, the add means device add not cb add, if
> you talk about dev event type, the ADD type is consistent with the type
> form kernel side, anyway could be find a better.

I'm talking about next:
1. dev_add_cb can hold only 1 callback, why? Can't 2 callbacks to be registered to RTE_EAL_DEV_EVENT_ADD event? (actually there is memory leak in this case)
2. Suggestion to register same callback to "all" devices by 1 call.
3. Suggestion to add parameter for the callback functions - the device pointer. 
4. Suggestion to change name from RTE_EAL_DEV_EVENT_ADD to RTE_EAL_DEV_EVENT_NEW.
5. Clue how to implement 1,2 by dummy device.


> but for 1 callback for all device, it is make sense , i will think about that.
> >> +		else
> >> +			TAILQ_INSERT_TAIL(&(device->uev_cbs), user_cb,
> >> next);
> >> +	}
> >> +
> >> +	rte_spinlock_unlock(&rte_dev_cb_lock);
> >> +	return 0;
> >> +}
<snip>
> >> +int
> >> +_rte_dev_callback_process(struct rte_device *device,
> >> +			enum rte_eal_dev_event_type event,
> >> +			void *cb_arg, void *ret_param)
> >> +{
> >> +	struct rte_eal_dev_callback dev_cb;
> >> +	struct rte_eal_dev_callback *cb_lst;
> >> +	int rc = 0;
> >> +
> >> +	rte_spinlock_lock(&rte_dev_cb_lock);
> >> +	if (event == RTE_EAL_DEV_EVENT_ADD) {
> >> +		if (cb_arg != NULL)
> >> +			dev_add_cb->cb_arg = cb_arg;
> >> +
> >> +		if (ret_param != NULL)
> >> +			dev_add_cb->ret_param = ret_param;
> >> +
> >> +		rte_spinlock_unlock(&rte_dev_cb_lock);
> > Can't someone free it when it running?
> > I suggest to  keep the lock locked.
> > Callbacks are not allowed to use this mechanism to prevent deadlock.
> seems it would bring some deadlock here, let's check it more.

A deadlock should occur only when a callback tries to use this mechanism - I think it is OK, you just need to document it for the user. 

> >> +		rc = dev_add_cb->cb_fn(dev_add_cb->event,
> >> +				dev_add_cb->cb_arg, dev_add_cb-
> >>> ret_param);
> >> +		rte_spinlock_lock(&rte_dev_cb_lock);
> >> +	} else {
> >> +		TAILQ_FOREACH(cb_lst, &(device->uev_cbs), next) {
> >> +			if (cb_lst->cb_fn == NULL || cb_lst->event != event)
> >> +				continue;
> >> +			dev_cb = *cb_lst;
> >> +			cb_lst->active = 1;
> >> +			if (cb_arg != NULL)
> >> +				dev_cb.cb_arg = cb_arg;
> >> +			if (ret_param != NULL)
> >> +				dev_cb.ret_param = ret_param;
> >> +
> >> +			rte_spinlock_unlock(&rte_dev_cb_lock);
> > The current active flag doesn't do it  thread safe here, I suggest to keep the
> lock locked.
> > Scenario:
> > 	1. Thread A see active = 0 in unregister function.
> > 	2. Context switch.
> > 	3. Thread B start the callback.
> > 	4. Context switch.
> > 	5. Thread A free it.
> > 	6. Context switch.
> > 	7. Seg fault in Thread B.
> the same as above.
The same as above, and I think the active flag doesn't solve the race and you must solve it for the both cases.
I suggest just to keep the lock locked and document the optional deadlock by the callback code.

<snip> 
> >> +			rc = dev_cb.cb_fn(dev_cb.event,
> >> +					dev_cb.cb_arg, dev_cb.ret_param);
> >> +			rte_spinlock_lock(&rte_dev_cb_lock);
> >> +			cb_lst->active = 0;
> >> +		}
> >> +	}
> >> +	rte_spinlock_unlock(&rte_dev_cb_lock);
> >> +	return rc;
> >> +}
> >> 	return(_rte_dev_callback_process(dev,
> >> +					  RTE_EAL_DEV_EVENT_REMOVE,
> >> +					  NULL, NULL));
> > What is the reason to keep this device in EAL device list after the removal?
> > I suggest to remove it (driver remove, bus remove and EAL remove) after
> the callbacks running.
> > By this way EAL can initiate all device removals.
> agree, device should be remove from the eal device list after the removal.

I suggest using rte_eal_hotplug_remove().

<Bring from the second thread>
> it will do device  removal from the device list by the eal device detach 
>function in the call backs running. does it fulfill your concerns.

I mean the removal\probe should be initiated by the EAL and not by the users callbacks.

> >> +			} else if (uevent.type == RTE_EAL_DEV_EVENT_ADD)
> >> {
> >> +				if (dev == NULL) {
> >> +					/**
> >> +					 * bind the driver to the device
> >> +					 * before user's add processing
> >> +					 */
> >> +					bus->bind_driver(
> >> +						uevent.devname,
> >> +						"igb_uio");
> >> +
> > Similar comments here:
> > EAL can initiate all device probe operations by adding the device and
> probing it here before the callback running.
> > Then, also the device pointer can be passed to the callbacks.
> pass a device pointer could be bring some more change, let's think about
> more.

Yes, I know, it will help to the user especially in ADD(NEW) and REMOVE events.

Here you can use rte_eal_hotplug_add().

> >> 	return(_rte_dev_callback_process(NULL,
> >> +					  RTE_EAL_DEV_EVENT_ADD,
> >> +					  uevent.devname, NULL));
> >> +				}
> >> +			}
> >> +		}
> >> +	}
> >> +	return 0;
> >> +}
<snip>
> >> +int
> >> +rte_dev_monitor_start(void)
> >> +{
> > Maybe add option to run it also by new EAL command line parameter?
> good idea.
> >> +	int ret;
> >> +
> >> +	if (!no_request_thread)
> >> +		return 0;

Look, also here there is race, no_request_thread doesn't solve it.
Maybe the EAL parameter should be the only way to run it(just don't expose this API), I think the default should be TRUE.

> >> +	no_request_thread = false;
> >> +
> >> +	/* create the host thread to wait/handle the uevent from kernel */
> >> +	ret = pthread_create(&uev_monitor_thread, NULL,
> >> +		dev_uev_monitoring, NULL);
> > What is the reason to open new thread for hotplug?
> > Why not to use the current dpdk host thread by the alarm mechanism?
> appropriate if you could talk about what you mean the disadvantage of
> new thread here and the advantage of alarm mechanism at the case?

One more thread can complicate things - the user will need to synchronize his alarm\interrupt callbacks code(host thread) with his hotplug callbacks code(hotplug thread).  

> >> +	return ret;
> >> +}
> >> +
> >> +int
> >> +rte_dev_monitor_stop(void)
> >> +{
> >> +	udev_exit = true;
> >> +	no_request_thread = true;
> >> +	return 0;
> >> +}
<snip>



More information about the dev mailing list