[PATCH] net/mlx5: fix state corruption in dev start error path
Raslan Darawsheh
rasland at nvidia.com
Mon Nov 17 15:01:09 CET 2025
Hi,
On 13/11/2025 9:37 PM, Maayan Kashani wrote:
> When mlx5_dev_start() fails partway through initialization, the error
> cleanup code unconditionally calls cleanup functions for all steps,
> including those that were never successfully initialized. This causes
> state corruption leading to incorrect behavior on subsequent start
> attempts.
>
> The issue manifests as:
> 1. First start attempt fails with -ENOMEM (expected)
> 2. Second start attempt returns -EINVAL instead of -ENOMEM
> 3. With flow isolated mode, second attempt incorrectly succeeds,
> leading to segfault in rte_eth_rx_burst()
>
> Root cause: The single error label cleanup path calls functions like
> mlx5_traffic_disable() and mlx5_flow_stop_default() even when their
> corresponding initialization functions (mlx5_traffic_enable() and
> mlx5_flow_start_default()) were never called due to earlier failure.
>
> For example, when mlx5_rxq_start() fails:
> - mlx5_traffic_enable() at line 1403 never executes
> - mlx5_flow_start_default() at line 1420 never executes
> - But cleanup unconditionally calls:
>   * mlx5_traffic_disable() - destroys control flows list
>   * mlx5_flow_stop_default() - corrupts flow metadata state
>
> This corrupts the device state, causing subsequent start attempts to
> fail with different errors or, in isolated mode, to incorrectly succeed
> with an improperly initialized device.
>
> Fix by replacing the single error label with cascading error labels
> (Linux kernel style). Each label cleans up only its corresponding step,
> then falls through to clean up earlier steps.
> This ensures only successfully initialized steps are cleaned up,
> maintaining device state consistency across failed start attempts.
>
> Bugzilla ID: 1419
> Fixes: 8db7e3b69822 ("net/mlx5: change operations for non-cached flows")
> Cc: stable at dpdk.org
>
> Signed-off-by: Maayan Kashani <mkashani at nvidia.com>
> Acked-by: Dariusz Sosnowski <dsosnowski at nvidia.com>
> ---
Patch applied to next-net-mlx.

Kindest regards,
Raslan Darawsheh
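
The cascading error labels described in the quoted commit message can be sketched as follows. This is only a minimal illustration of the pattern, not the actual mlx5_dev_start() code: struct demo_dev, enum demo_step, demo_step_start(), demo_step_stop() and demo_dev_start() are hypothetical names invented for this example, and the three steps are stand-ins for the real initialization sequence. The bad_cleanups counter is likewise an assumption, added only to make it visible when a cleanup runs for a step that was never started.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state: one flag per initialization step. */
struct demo_dev {
	bool txq_started;
	bool rxq_started;
	bool traffic_enabled;
	int bad_cleanups; /* cleanups run for a step that never started */
};

/* Hypothetical fault injection: which step should fail, if any. */
enum demo_step { STEP_NONE, STEP_TXQ, STEP_RXQ, STEP_TRAFFIC };

/* Start one step; on injected failure leave the flag untouched. */
static int
demo_step_start(bool *flag, bool fail)
{
	if (fail)
		return -ENOMEM;
	*flag = true;
	return 0;
}

/* Stop one step; record it if the step was never started. */
static void
demo_step_stop(struct demo_dev *dev, bool *flag)
{
	if (!*flag)
		dev->bad_cleanups++;
	*flag = false;
}

/*
 * Each failure jumps to the label that undoes only the steps which
 * already succeeded, then falls through to the earlier labels.
 */
int
demo_dev_start(struct demo_dev *dev, enum demo_step fail_at)
{
	int ret;

	ret = demo_step_start(&dev->txq_started, fail_at == STEP_TXQ);
	if (ret)
		goto error;
	ret = demo_step_start(&dev->rxq_started, fail_at == STEP_RXQ);
	if (ret)
		goto txq_stop;
	ret = demo_step_start(&dev->traffic_enabled, fail_at == STEP_TRAFFIC);
	if (ret)
		goto rxq_stop;
	return 0;

rxq_stop:
	demo_step_stop(dev, &dev->rxq_started);
txq_stop:
	demo_step_stop(dev, &dev->txq_started);
error:
	return ret;
}
```

In this sketch, a failure injected at STEP_RXQ undoes only the TX queue step, so a second call with the same failure returns -ENOMEM again and bad_cleanups stays at zero. With a single shared error label, every demo_step_stop() call would run regardless of how far the start sequence got.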