[dpdk-dev] [PATCH v16 2/6] eal: enable hotplug on multi-process

Thomas Monjalon thomas at monjalon.net
Mon Oct 15 10:43:39 CEST 2018


Hi Qi,

I wanted to push this old series, but I still have some questions
and comments. Please let's fix them quickly.

Note: I did not review the IPC mechanism and rollback. I trust you :)

28/09/2018 06:23, Qi Zhang:
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -67,6 +67,12 @@ New Features
>    SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
>    vdev_netvsc, tap, and failsafe drivers combination.
>  
> +* **Support device multi-process hotplug.**
> +
> +  Hotplug and hot-unplug for devices will now be supported in multiprocessing
> +  scenario. Any ethdev devices created in the primary process will be regarded
> +  as shared and will be available for all DPDK processes. Synchronization
> +  between processes will be done using DPDK IPC.
>  

A blank line is missing here.

>  API Changes
>  -----------
> @@ -91,6 +97,11 @@ API Changes
>    flag the MAC can be properly configured in any case. This is particularly
>    important for bonding.
>  
> +* eal: scope of rte_eal_hotplug_add and rte_eal_hotplug_remove is extended.
> +
> +  In primary-secondary process model, ``rte_eal_hotplug_add`` will guarantee
> +  that device be attached on all processes, while ``rte_eal_hotplug_remove``
> +  will guarantee device be detached on all processes.
>  

Here too, double blank line before next heading.

> --- a/lib/librte_eal/common/eal_common_dev.c
> +++ b/lib/librte_eal/common/eal_common_dev.c
> -int
> -rte_eal_hotplug_add(const char *busname, const char *devname,
> -                    const char *drvargs)
> +/* help funciton to build devargs, caller should free the memory */

"help funciton" -> "helper function"

> +static char *
> +build_devargs(const char *busname, const char *devname,
> +	      const char *drvargs)
>  {
>  	char *devargs = NULL;
>  	int size, length = -1;
> @@ -140,19 +143,33 @@ rte_eal_hotplug_add(const char *busname, const char *devname,
>  		if (length >= size)
>  			devargs = malloc(length + 1);
>  		if (devargs == NULL)
> -			return -ENOMEM;
> +			break;
>  	} while (size == 0);

It is an old code, please rebase on master.

> -int __rte_experimental
> -rte_dev_remove(struct rte_device *dev)
> +/* remove device at local process. */
> +int
> +local_dev_remove(struct rte_device *dev)
>  {
>  	struct rte_bus *bus;
>  	int ret;
> @@ -248,7 +268,194 @@ rte_dev_remove(struct rte_device *dev)
>  	if (ret)
>  		RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n",
>  			dev->name);
> -	rte_devargs_remove(dev->devargs);
> +	else
> +		rte_devargs_remove(dev->devargs);

It looks you are fixing a bug here. Good catch!

> +int __rte_experimental
> +rte_dev_probe(const char *devargs)
> +{
> +	struct eal_dev_mp_req req;
> +	struct rte_device *dev;
> +	int ret;
> +
> +	memset(&req, 0, sizeof(req));
> +	req.t = EAL_DEV_REQ_TYPE_ATTACH;
> +	strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN);
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		/**
> +		 * If in secondary process, just send IPC request to
> +		 * primary process.
> +		 */
> +		ret = eal_dev_hotplug_request_to_primary(&req);
> +		if (ret) {
> +			RTE_LOG(ERR, EAL,
> +				"Failed to send hotplug request to primary\n");
> +			return -ENOMSG;
> +		}
> +		if (req.result)
> +			RTE_LOG(ERR, EAL,
> +				"Failed to hotplug add device\n");
> +		return req.result;
> +	}
> +
> +	/* attach a shared device from primary start from here: */
> +
> +	/* primary attach the new device itself. */
> +	ret = local_dev_probe(devargs, &dev);
> +
> +	if (ret) {
> +		RTE_LOG(ERR, EAL,
> +			"Failed to attach device on primary process\n");
> +
> +		/**
> +		 * it is possible that secondary process failed to attached a
> +		 * device that primary process have during initialization,
> +		 * so for -EEXIST case, we still need to sync with secondary
> +		 * process.
> +		 */
> +		if (ret != -EEXIST)
> +			return ret;
> +	}
> +
> +	/* primary send attach sync request to secondary. */
> +	ret = eal_dev_hotplug_request_to_secondary(&req);
> +
> +	/* if any commnunication error, we need to rollback. */

typo: communication

> +	if (ret) {
> +		RTE_LOG(ERR, EAL,
> +			"Failed to send hotplug add request to secondary\n");
> +		ret = -ENOMSG;
> +		goto rollback;
> +	}
> +
> +	/**
> +	 * if any secondary failed to attach, we need to consider if rollback
> +	 * is necessary.
> +	 */
> +	if (req.result) {
> +		RTE_LOG(ERR, EAL,
> +			"Failed to attach device on secondary process\n");
> +		ret = req.result;
> +
> +		/* for -EEXIST, we don't need to rollback. */
> +		if (ret == -EEXIST)
> +			return ret;
> +		goto rollback;
> +	}
> +
> +	return 0;
> +
> +rollback:
> +	req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
> +
> +	/* primary send rollback request to secondary. */
> +	if (eal_dev_hotplug_request_to_secondary(&req))

For all occurences of "if (function())", the coding style is requesting
an explicit check of the return value: if (function() != 0)

> +		RTE_LOG(WARNING, EAL,
> +			"Failed to rollback device attach on secondary."
> +			"Devices in secondary may not sync with primary\n");
> +
> +	/* primary rollback itself. */
> +	if (local_dev_remove(dev))
> +		RTE_LOG(WARNING, EAL,
> +			"Failed to rollback device attach on primary."
> +			"Devices in secondary may not sync with primary\n");
> +
> +	return ret;
> +}
> +
> +int __rte_experimental
> +rte_dev_remove(struct rte_device *dev)
> +{
> +	struct eal_dev_mp_req req;
> +	char *devargs;
> +	int ret;
> +
> +	devargs = build_devargs(dev->devargs->bus->name, dev->name, "");
> +	if (devargs == NULL)
> +		return -ENOMEM;
> +
> +	memset(&req, 0, sizeof(req));
> +	req.t = EAL_DEV_REQ_TYPE_DETACH;
> +	strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN);
> +	free(devargs);
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		/**
> +		 * If in secondary process, just send IPC request to
> +		 * primary process.
> +		 */
> +		ret = eal_dev_hotplug_request_to_primary(&req);
> +		if (ret) {
> +			RTE_LOG(ERR, EAL,
> +				"Failed to send hotplug request to primary\n");
> +			return -ENOMSG;
> +		}
> +		if (req.result)
> +			RTE_LOG(ERR, EAL,
> +				"Failed to hotplug remove device\n");
> +		return req.result;
> +	}
> +
> +	/* detach a device from primary start from here: */
> +
> +	/* primary send detach sync request to secondary */
> +	ret = eal_dev_hotplug_request_to_secondary(&req);
> +
> +	/**
> +	 * if communication error, we need to rollback, because it is possible
> +	 * part of the secondary processes still detached it successfully.
> +	 */
> +	if (ret) {

ret is not a boolean, please do explicit check != 0.

> +		RTE_LOG(ERR, EAL,
> +			"Failed to send device detach request to secondary\n");
> +		ret = -ENOMSG;
> +		goto rollback;
> +	}
> +
> +	/**
> +	 * if any secondary failed to detach, we need to consider if rollback
> +	 * is necessary.
> +	 */
> +	if (req.result) {

result is not a boolean, please do explicit check != 0.

> +		RTE_LOG(ERR, EAL,
> +			"Failed to detach device on secondary process\n");
> +		ret = req.result;
> +		/**
> +		 * if -ENOENT, we don't need to rollback, since devices is
> +		 * already detached on secondary process.
> +		 */
> +		if (ret != -ENOENT)
> +			goto rollback;
> +	}
> +
> +	/* primary detach the device itself. */
> +	ret = local_dev_remove(dev);
> +
> +	/* if primary failed, still need to consider if rollback is necessary */
> +	if (ret) {
> +		RTE_LOG(ERR, EAL,
> +			"Failed to detach device on primary process\n");
> +		/* if -ENOENT, we don't need to rollback */
> +		if (ret == -ENOENT)
> +			return ret;
> +		goto rollback;
> +	}
> +
> +	return 0;
> +
> +rollback:
> +	req.t = EAL_DEV_REQ_TYPE_DETACH_ROLLBACK;
> +
> +	/* primary send rollback request to secondary. */
> +	if (eal_dev_hotplug_request_to_secondary(&req))
> +		RTE_LOG(WARNING, EAL,
> +			"Failed to rollback device detach on secondary."
> +			"Devices in secondary may not sync with primary\n");
> +
>  	return ret;
>  }

> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> +/**
> + * Register all mp action callbacks for hotplug.
> + *
> + * @return
> + *   0 on success, negative on error.
> + */
> +int rte_dev_hotplug_mp_init(void);

This function is called by the init, so it should not be private.
The app should be free to build its own init routine.

> --- /dev/null
> +++ b/lib/librte_eal/common/hotplug_mp.h
> +#include <rte_dev.h>
> +#include <rte_bus.h>

I think EAL headers should be included with quotes.

> +/**
> + * this is a synchronous wrapper for secondary process send

Missing uppercase at the beggining of sentence.

> + * request to primary process, this is invoked when an attach
> + * or detach request issued from primary process.

Missing "is" before "issued": request is issued.

> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -190,6 +190,9 @@ int rte_eal_dev_detach(struct rte_device *dev);
>  
>  /**
>   * Hotplug add a given device to a specific bus.
> + * In multi-process, this function will inform all other processes
> + * to hotplug add the same device. Any failure on other process rollback
> + * the action.

Better to leave a blank line before this comment.
Small reword:

 * Hotplug add a given device to a specific bus.
 *
 * In multi-process, it will request other processes to add the same device.
 * A failure, in any process, will rollback the action.

You should add the same comment for rte_dev_probe().

> @@ -219,6 +222,9 @@ int __rte_experimental rte_dev_probe(const char *devargs);
>  
>  /**
>   * Hotplug remove a given device from a specific bus.
> + * In multi-process, this function will inform all other processes
> + * to hotplug remove the same device. Any failure on other process
> + * will rollback the action.

Same reword:

 * Hotplug remove a given device from a specific bus.
 *
 * In multi-process, it will request other processes to remove the same device.
 * A failure, in any process, will rollback the action.

[...]
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> +	/* register mp action callbacks for hotplug */
> +	if (rte_dev_hotplug_mp_init() < 0) {

In the comment, better to say "multi-process" instead of the cryptic "mp".






More information about the dev mailing list