[dpdk-dev] [PATCH v2] net/mlx5: fix memory regions release deadlock
Raslan Darawsheh
rasland at mellanox.com
Wed Feb 5 10:44:41 CET 2020
Hi,
> -----Original Message-----
> From: dev <dev-bounces at dpdk.org> On Behalf Of Michael Baum
> Sent: Tuesday, February 4, 2020 3:36 PM
> To: dev at dpdk.org
> Cc: Matan Azrad <matan at mellanox.com>; Slava Ovsiienko
> <viacheslavo at mellanox.com>; stable at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] net/mlx5: fix memory regions release
> deadlock
>
> The mlx5 PMD maintains a list of the devices for which the memory
> operation callback routines must be invoked to keep the device MRs (an MR
> is the entity backing the hardware DMA transactions) consistent with the
> mapped memory.
> Each device context in the list is protected by a dedicated per-device
> lock, which might be taken inside the callback routine.
>
> When a device is closing, the PMD frees all MRs by calling
> mlx5_mr_release(), which might call rte_free() while holding the device
> lock. If this rte_free() call triggers the freeing of an entire memory
> segment, it in turn invokes the callback routine, and the attempt to take
> the lock inside the callback causes the deadlock.
>
> The patch proposes to remove the device from the callback list first
> and then call mlx5_mr_release() and free the remaining device MRs
> explicitly.
>
> Fixes: 0e3d0525b2f2 ("net/mlx5: fix memory event callback list")
> Cc: stable at dpdk.org
>
> Signed-off-by: Michael Baum <michaelba at mellanox.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> Acked-by: Matan Azrad <matan at mellanox.com>
> ---
>
> v2:
> rephrase commit message.
>
>
> drivers/net/mlx5/mlx5.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index f80e403..759491f 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -679,12 +679,12 @@ struct mlx5_flow_id_pool *
> MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
> if (--sh->refcnt)
> goto exit;
> - /* Release created Memory Regions. */
> - mlx5_mr_release(sh);
> /* Remove from memory callback device list. */
> rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
> LIST_REMOVE(sh, mem_event_cb);
> rte_rwlock_write_unlock(&mlx5_shared_data->mem_event_rwlock);
> + /* Release created Memory Regions. */
> + mlx5_mr_release(sh);
> /* Remove context from the global device list. */
> LIST_REMOVE(sh, next);
> /*
> --
> 1.8.3.1
Fixed typo in commit msg,
Patch applied to next-net-mlx,
Kindest regards,
Raslan Darawsheh
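
For readers of the archive, the ordering problem described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the mlx5 code: struct dev_ctx, mem_event_cb(), mr_release(), close_old_order() and close_new_order() are hypothetical names, a plain pthread mutex stands in for the per-device lock, a boolean stands in for membership in the callback list (the list rwlock is left out), and the callback uses a trylock so that the sketch counts the would-be deadlock instead of hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared device context; not the mlx5 struct. */
struct dev_ctx {
	pthread_mutex_t lock;   /* per-device lock, also taken by the callback */
	bool on_cb_list;        /* true while the device is on the callback list */
	int blocked_callbacks;  /* callbacks that would have blocked on 'lock' */
};

static void dev_ctx_init(struct dev_ctx *d)
{
	int rc = pthread_mutex_init(&d->lock, NULL);

	assert(rc == 0);
	(void)rc;
	d->on_cb_list = true;
	d->blocked_callbacks = 0;
}

/* Models the memory event callback: it only visits listed devices. */
static void mem_event_cb(struct dev_ctx *d)
{
	if (!d->on_cb_list)
		return;
	/* trylock (not lock) so the sketch records the hazard and returns. */
	if (pthread_mutex_trylock(&d->lock) != 0) {
		d->blocked_callbacks++;
		return;
	}
	pthread_mutex_unlock(&d->lock);
}

/* Models the MR release: memory is freed while the device lock is held. */
static void mr_release(struct dev_ctx *d)
{
	pthread_mutex_lock(&d->lock);
	/* Stands in for a free that releases a whole segment and fires the callback. */
	mem_event_cb(d);
	pthread_mutex_unlock(&d->lock);
}

/* Order before the patch: release MRs, then leave the callback list. */
static int close_old_order(struct dev_ctx *d)
{
	mr_release(d);
	d->on_cb_list = false;
	return d->blocked_callbacks;
}

/* Order after the patch: leave the callback list, then release MRs. */
static int close_new_order(struct dev_ctx *d)
{
	d->on_cb_list = false;
	mr_release(d);
	return d->blocked_callbacks;
}
```

With a blocking lock in the callback, the blocked attempt counted by close_old_order() would be a self-deadlock, while close_new_order() never reaches the device lock from the callback because the device has already left the list.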