[dpdk-dev] [PATCH v6 3/3] net/failsafe: fix hotplug races

Gaëtan Rivet gaetan.rivet at 6wind.com
Mon Feb 12 19:33:25 CET 2018


Hi Matan,

On Sun, Feb 11, 2018 at 05:24:32PM +0000, Matan Azrad wrote:
> Fail-safe uses periodic alarm mechanism, running from the host thread,
> to manage the hot-plug events of its sub-devices. This management
> requires a lot of sub-devices PMDs operations (stop,close,start,etc).
> 
> While the hot-plug alarm runs in the host thread, the application may
> call fail-safe operations which directly trigger the sub-devices PMDs
> operations too, This call may occur from any thread decided by the
> application (probably the master thread).
> 
> So, more than one operation can execute to a sub-device in same time
> what can cause a lot of races in the sub-PMDs.
> 
> Moreover, some control operations update the fail-safe internal
> databases which can be used by the alarm mechanism in the same
> time, what also can cause to races and crashes.
> 
> Fail-safe is the owner of its sub-devices and must to synchronize their
> use according to the ETHDEV ownership rules.
> 
> Synchronize hot-plug management by a new lock mechanism uses a mutex to
> atomically defend each critical section in the fail-safe hot-plug
> mechanism and control operations to prevent any races between them.
> 
> Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Matan Azrad <matan at mellanox.com>
> ---
>  drivers/net/failsafe/Makefile           |   1 +
>  drivers/net/failsafe/failsafe.c         |  35 ++++++++
>  drivers/net/failsafe/failsafe_ether.c   |   6 +-
>  drivers/net/failsafe/failsafe_flow.c    |  20 ++++-
>  drivers/net/failsafe/failsafe_ops.c     | 148 ++++++++++++++++++++++++++------
>  drivers/net/failsafe/failsafe_private.h |  62 +++++++++++--
>  6 files changed, 239 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
> index d1ae899..bd2f019 100644
> --- a/drivers/net/failsafe/Makefile
> +++ b/drivers/net/failsafe/Makefile
> @@ -68,5 +68,6 @@ CFLAGS += -pedantic
>  LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
>  LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
>  LDLIBS += -lrte_bus_vdev
> +LDLIBS += -lpthread
>  
>  include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> index 7b2cdbb..c499bfb 100644
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -113,17 +113,46 @@
>  			break;
>  	/* if we have non-probed device */
>  	if (i != PRIV(dev)->subs_tail) {
> +		if (fs_lock(dev, 1) != 0)
> +			goto reinstall;

You have left a few operations unlocked further down the call stack.
With these discrepancies fixed, the RECURSIVE attribute could be
removed, and this lock as well.

>  		ret = failsafe_eth_dev_state_sync(dev);
> +		fs_unlock(dev, 1);

Compared to the first version of these changes, I much prefer having a
wrapper for locking. However, I dislike having the arguably unnecessary
additional argument (alarm_lock).

I guess you added this for debugging purpose, but in the end either the
design is simple and clear, and you have a proper model, or you don't,
and that's an issue.

And having the RECURSIVE attribute "just in case", is not reassuring.

>  		if (ret)
>  			ERROR("Unable to synchronize sub_device state");
>  	}
>  	failsafe_dev_remove(dev);
> +reinstall:
>  	ret = failsafe_hotplug_alarm_install(dev);
>  	if (ret)
>  		ERROR("Unable to set up next alarm");
>  }
>  
>  static int
> +fs_mutex_init(struct fs_priv *priv)
> +{
> +	int ret;
> +	pthread_mutexattr_t attr;
> +
> +	ret = pthread_mutexattr_init(&attr);
> +	if (ret) {
> +		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> +		return ret;
> +	}
> +	/* Allow mutex relocks for the thread holding the mutex. */
> +	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
> +	if (ret) {

Just to emphasize, I think this should be removed.

Please explain why you thought it was necessary.

> +		ERROR("Cannot set mutex type - %s", strerror(ret));
> +		return ret;
> +	}
> +	ret = pthread_mutex_init(&priv->hotplug_mutex, &attr);
> +	if (ret) {
> +		ERROR("Cannot initiate mutex - %s", strerror(ret));
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +static int
>  fs_eth_dev_create(struct rte_vdev_device *vdev)
>  {
>  	struct rte_eth_dev *dev;
> @@ -176,6 +205,9 @@
>  	ret = failsafe_eal_init(dev);
>  	if (ret)
>  		goto free_args;
> +	ret = fs_mutex_init(priv);
> +	if (ret)
> +		goto free_args;
>  	ret = failsafe_hotplug_alarm_install(dev);
>  	if (ret) {
>  		ERROR("Could not set up plug-in event detection");
> @@ -250,6 +282,9 @@
>  		ERROR("Error while uninitializing sub-EAL");
>  	failsafe_args_free(dev);
>  	fs_sub_device_free(dev);
> +	ret = pthread_mutex_destroy(&PRIV(dev)->hotplug_mutex);
> +	if (ret)
> +		ERROR("Error while destroying hotplug mutex");
>  	rte_free(PRIV(dev));
>  	rte_eth_dev_release_port(dev);
>  	return ret;
> diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
> index d820faf..8672819 100644
> --- a/drivers/net/failsafe/failsafe_ether.c
> +++ b/drivers/net/failsafe/failsafe_ether.c

Locking fs_eth_dev_conf_apply should allow to remove the lock in
fs_hotplug_alarm, as long as we make sure only public rte_ether API is
used in fs_eth_dev_conf_apply and its callee.

> @@ -328,8 +328,11 @@
>  
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
>  		if (sdev->remove && fs_rxtx_clean(sdev)) {
> +			if (fs_lock(dev, 1) != 0)
> +				return;
>  			fs_dev_stats_save(sdev);
>  			fs_dev_remove(sdev);
> +			fs_unlock(dev, 1);
>  		}
>  }
>  
> @@ -428,7 +431,7 @@
>  				void *cb_arg, void *out __rte_unused)
>  {
>  	struct sub_device *sdev = cb_arg;
> -

This line should remain.

> +	fs_lock(sdev->fs_dev, 0);
>  	/* Switch as soon as possible tx_dev. */
>  	fs_switch_dev(sdev->fs_dev, sdev);
>  	/* Use safe bursts in any case. */
> @@ -438,6 +441,7 @@
>  	 * the callback at the source of the current thread context.
>  	 */
>  	sdev->remove = 1;
> +	fs_unlock(sdev->fs_dev, 0);
>  	return 0;
>  }
>  

<snip>

> diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> index f3be152..ef1f63b 100644
> --- a/drivers/net/failsafe/failsafe_private.h
> +++ b/drivers/net/failsafe/failsafe_private.h
> @@ -7,6 +7,7 @@
>  #define _RTE_ETH_FAILSAFE_PRIVATE_H_
>  
>  #include <sys/queue.h>
> +#include <pthread.h>
>  
>  #include <rte_atomic.h>
>  #include <rte_dev.h>
> @@ -161,6 +162,9 @@ struct fs_priv {
>  	 * appropriate failsafe Rx queue.
>  	 */
>  	struct rx_proxy rxp;
> +	pthread_mutex_t hotplug_mutex;
> +	/* Hot-plug mutex is locked by the alarm mechanism. */
> +	volatile unsigned int alarm_lock:1;

Without the RECURSIVE attribute, I believe this becomes unnecessary.

>  	unsigned int pending_alarm:1; /* An alarm is pending */
>  	/* flow isolation state */
>  	int flow_isolated:1;
> @@ -255,12 +259,6 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
>  	     s != NULL;					\
>  	     s = fs_find_next((dev), i + 1, state, &i))
>  
> -/**
> - * Iterator construct over fail-safe sub-devices:
> - * s:   (struct sub_device *), iterator
> - * i:   (uint8_t), increment
> - * dev: (struct rte_eth_dev *), fail-safe ethdev
> - */

Editing mistake I think here.

>  #define FOREACH_SUBDEV(s, i, dev)			\
>  	FOREACH_SUBDEV_STATE(s, i, dev, DEV_UNDEFINED)
>  
> @@ -347,6 +345,58 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
>  }
>  
>  /*
> + * Lock hot-plug mutex.
> + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism.
> + */
> +static inline int
> +fs_lock(struct rte_eth_dev *dev, unsigned int is_alarm)

The "is_alarm" should be removed without RECURSIVE.

> +{
> +	int ret;
> +
> +	if (is_alarm) {
> +		ret = pthread_mutex_trylock(&PRIV(dev)->hotplug_mutex);
> +		if (ret) {
> +			DEBUG("Hot-plug mutex lock trying failed(%s), will try"
> +			      " again later...", strerror(ret));
> +			return ret;
> +		}
> +		PRIV(dev)->alarm_lock = 1;
> +	} else {
> +		ret = pthread_mutex_lock(&PRIV(dev)->hotplug_mutex);
> +		if (ret) {
> +			ERROR("Cannot lock mutex(%s)", strerror(ret));
> +			return ret;
> +		}
> +	}
> +	DEBUG("Hot-plug mutex was locked by thread %lu%s", pthread_self(),
> +	      PRIV(dev)->alarm_lock ? " by the hot-plug alarm" : "");
> +	return ret;
> +}
> +
> +/*
> + * Unlock hot-plug mutex.
> + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism.
> + */
> +static inline void
> +fs_unlock(struct rte_eth_dev *dev, unsigned int is_alarm)

ditto

> +{
> +	int ret;
> +	unsigned int prev_alarm_lock = PRIV(dev)->alarm_lock;
> +
> +	if (is_alarm) {
> +		RTE_ASSERT(PRIV(dev)->alarm_lock == 1);
> +		PRIV(dev)->alarm_lock = 0;
> +	}
> +	ret = pthread_mutex_unlock(&PRIV(dev)->hotplug_mutex);
> +	if (ret)
> +		ERROR("Cannot unlock hot-plug mutex(%s)", strerror(ret));
> +	else
> +		DEBUG("Hot-plug mutex was unlocked by thread %lu%s",
> +		      pthread_self(),
> +		      prev_alarm_lock ? " by the hot-plug alarm" : "");
> +}

I know that using a RECURSIVE lock allows you having an implementation
quicker.

So this choice of implementation is only done due to the impending
release, not because it is the right one. I think it should work, and I
heard that it was heavily tested internally.

So I guess this patch can go in with the few other nits (removed blank line,
removed macro doc), as long as it is reworked soon after.

On this matter, I do not think that blindly testing implementations that
all either copied each other or weren't too complicated does the trick
regarding concurrency issues.

You were thinking about an example app for your ownership library, in order
to validate its implementation. I think this could work nicely as a
torture instrument for this patchset as well, with some care.

Regards,
-- 
Gaëtan Rivet
6WIND


More information about the dev mailing list