[dpdk-dev] [PATCH v2] net/mlx5: fix memory regions release deadlock

Raslan Darawsheh rasland at mellanox.com
Wed Feb 5 10:44:41 CET 2020


Hi,

> -----Original Message-----
> From: dev <dev-bounces at dpdk.org> On Behalf Of Michael Baum
> Sent: Tuesday, February 4, 2020 3:36 PM
> To: dev at dpdk.org
> Cc: Matan Azrad <matan at mellanox.com>; Slava Ovsiienko
> <viacheslavo at mellanox.com>; stable at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] net/mlx5: fix memory regions release
> deadlock
> 
> The mlx5 PMD maintains a list of devices for which the memory
> operation callback routines must be invoked to keep the device MRs (an MR
> is the entity backing the hardware DMA transactions) consistent with the
> mapped memory.
> Each device context in the list is protected with a dedicated per-device
> lock, which might be taken inside the callback routine.
> 
> When a device is closing, the PMD frees all MRs by calling
> mlx5_mr_release(), which might call rte_free() while holding the device
> lock. If this rte_free() call triggers freeing of an entire memory
> segment, it in turn invokes the callback routine, and the attempt to take
> the lock inside the callback causes a deadlock.
> 
> The patch proposes to remove the device from the callback list first
> and then call mlx5_mr_release() and free the remaining device MRs
> explicitly.
> 
> Fixes: 0e3d0525b2f2 ("net/mlx5: fix memory event callback list")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Michael Baum <michaelba at mellanox.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> Acked-by: Matan Azrad <matan at mellanox.com>
> ---
> 
> v2:
> rephrase commit message.
> 
> 
>  drivers/net/mlx5/mlx5.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index f80e403..759491f 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -679,12 +679,12 @@ struct mlx5_flow_id_pool *
>  	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
>  	if (--sh->refcnt)
>  		goto exit;
> -	/* Release created Memory Regions. */
> -	mlx5_mr_release(sh);
>  	/* Remove from memory callback device list. */
>  	rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
>  	LIST_REMOVE(sh, mem_event_cb);
>  	rte_rwlock_write_unlock(&mlx5_shared_data->mem_event_rwlock);
> +	/* Release created Memory Regions. */
> +	mlx5_mr_release(sh);
>  	/* Remove context from the global device list. */
>  	LIST_REMOVE(sh, next);
>  	/*
> --
> 1.8.3.1

Fixed typo in commit msg,

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
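
For readers following the thread, the ordering issue described in the commit message can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: struct dev_ctx, mem_event_cb(), free_mr(), mr_release() and the two close_*() helpers are hypothetical stand-ins, not the mlx5 or DPDK API, and the lock is modelled as a boolean that counts a would-be self-deadlock instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared device context; not the mlx5 struct. */
struct dev_ctx {
	bool lock_held;  /* models the non-recursive per-device lock */
	bool in_cb_list; /* models membership in the memory callback list */
	int mr_count;    /* number of MRs still allocated */
	int deadlocks;   /* counts would-be self-deadlocks instead of hanging */
};

/* Models the memory callback: it takes the lock of each listed device. */
static void mem_event_cb(struct dev_ctx *ctx)
{
	if (!ctx->in_cb_list)
		return; /* device already unlinked: the callback skips it */
	if (ctx->lock_held) {
		ctx->deadlocks++; /* a real lock would block forever here */
		return;
	}
	ctx->lock_held = true;
	/* ... MR bookkeeping would happen here ... */
	ctx->lock_held = false;
}

/* Models the free path: freeing the last MR releases a whole segment,
 * which is what triggers the memory callback. */
static void free_mr(struct dev_ctx *ctx)
{
	assert(ctx->mr_count > 0);
	if (--ctx->mr_count == 0)
		mem_event_cb(ctx);
}

/* Models MR release: frees every MR while holding the device lock. */
static void mr_release(struct dev_ctx *ctx)
{
	ctx->lock_held = true;
	while (ctx->mr_count > 0)
		free_mr(ctx);
	ctx->lock_held = false;
}

/* Order before the patch: release MRs, then unlink from the list. */
static int close_release_first(struct dev_ctx *ctx)
{
	mr_release(ctx);
	ctx->in_cb_list = false;
	return ctx->deadlocks;
}

/* Order after the patch: unlink from the list, then release MRs. */
static int close_unlink_first(struct dev_ctx *ctx)
{
	ctx->in_cb_list = false;
	mr_release(ctx);
	return ctx->deadlocks;
}
```

In this model, close_release_first() reports one would-be deadlock because the callback fires while the device is still listed and its lock is held, whereas close_unlink_first() reports none because the callback no longer sees the device.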

