[PATCH] mlx5: fix race at mlx5_dev_close

Stephen Hemminger stephen at networkplumber.org
Mon Oct 7 19:54:07 CEST 2024


On Thu, 11 Apr 2024 14:17:40 +0800
hepeng <hepeng.0320 at bytedance.com> wrote:

> From: "hepeng.0320" <hepeng.0320 at bytedance.com>
> 
> mlx5_dev_close currently sets priv->sh->port[priv->dev_port -
> 1].nl_ih_port_id to RTE_MAX_ETHPORTS to prevent mlx5_dev_interrupt_nl_cb
> from using the port's dev_private, because rte_eth_dev_close later
> frees dev_private and sets the pointer to NULL.
> 
> However, since mlx5_dev_interrupt_nl_cb is running in another thread,
> I think the race still exists. So perhaps an easy fix is to wait for
> 1ms to avoid this race.
> 
> Signed-off-by: hepeng.0320 <hepeng.0320 at bytedance.com>

Not the best way to handle this. Adding a one second delay on shutdown
hurts some availability scenarios. Looks like mlx5 needs a more coordinated
shutdown to be safe; adding big delays is not the correct fix.
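
For illustration only, a coordinated shutdown could look roughly like the
sketch below. This is not mlx5 code: struct port_slot and the port_slot_*
functions are hypothetical names, and a plain pthread mutex stands in for
whatever synchronization the driver would actually use. The point is that
the handler only touches dev_private while holding a lock, and close clears
the pointer under the same lock, so no sleep is needed.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-port state shared with the handler. */
struct port_slot {
	pthread_mutex_t lock;  /* serializes the handler against close */
	void *dev_private;     /* NULL once the port has been closed */
	unsigned int events;   /* events handled while the port was open */
};

void port_slot_init(struct port_slot *s, void *priv)
{
	pthread_mutex_init(&s->lock, NULL);
	s->dev_private = priv;
	s->events = 0;
}

/* Runs in the interrupt thread. Returns true if the event was handled. */
bool port_slot_handle_event(struct port_slot *s)
{
	bool handled = false;

	pthread_mutex_lock(&s->lock);
	if (s->dev_private != NULL) {
		/* dev_private cannot be freed while the lock is held. */
		s->events++;
		handled = true;
	}
	pthread_mutex_unlock(&s->lock);
	return handled;
}

/* Runs in the control thread that closes the port. */
void port_slot_close(struct port_slot *s)
{
	void *priv;

	pthread_mutex_lock(&s->lock);
	priv = s->dev_private;
	s->dev_private = NULL; /* later handler calls see NULL and bail out */
	pthread_mutex_unlock(&s->lock);

	/* Any handler that held the lock has finished with priv by now. */
	free(priv);
}
```

With this shape, close waits only as long as an in-flight handler actually
runs, instead of a fixed delay that is either too long or not long enough.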

