|SUCCESS| net/mlx5: fix uninitialized skip count
dpdklab at iol.unh.edu
Fri May 15 16:55:00 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/164007
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-15
git sha: Not provided in patch file
Author: Kiran Vedere <kiranv at nvidia.com>
Subject: [PATCH] net/mlx5: fix uninitialized skip count
This patch fixes an issue where the skip_cnt variable could remain
uninitialized when mlx5_rx_err_handle() returns MLX5_ERROR_CQE_MASK
in the HW_OWN case. Adding that indeterminate value to rq_ci is
undefined behavior.
> diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
> index 185bfd4fff..09cd3f1ffb 100644
> --- a/drivers/net/mlx5/mlx5_rx.c
> +++ b/drivers/net/mlx5/mlx5_rx.c
> @@ -1051,7 +1051,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> int len = 0; /* keep its value across iterations. */
>
> while (pkts_n) {
> - uint16_t skip_cnt;
> + uint16_t skip_cnt = 0;
In mlx5_rx_burst(), when mlx5_rx_poll_len() returns
MLX5_ERROR_CQE_MASK and skip_cnt is still 0, does the code handle that
correctly? The subsequent logic does "rq_ci += skip_cnt", so is a zero
increment the correct behavior when mlx5_rx_err_handle() did not set
skip_cnt in the HW_OWN case?
Looking at the code flow, when len == MLX5_ERROR_CQE_MASK and skip_cnt
remains 0, rq_ci does not advance. Is this the intended fix? The commit
message states that skip_cnt "can cause rq_ci to jump by an undefined
value", but with this change rq_ci does not move at all
(skip_cnt == 0). Should skip_cnt be set to 1 instead, so that the
problematic CQE is actually skipped?
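To make the question concrete, here is a minimal standalone sketch of the
pattern under discussion. It is not the mlx5 driver code: every name
prefixed with demo_ or DEMO_ is hypothetical, and the handler's behavior
(writing the skip count only on a software-error path and leaving it
untouched for HW_OWN) is an assumption taken from the description above,
not from the driver source.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the mlx5 driver definitions. */
#define DEMO_ERROR_CQE_MASK 0x40000000

enum demo_cqe_state { DEMO_CQE_HW_OWN, DEMO_CQE_SW_ERR };

/*
 * Hypothetical error handler: writes *skip_cnt only on the SW_ERR path.
 * On the HW_OWN path the caller's value is left untouched (assumption
 * based on the review text).
 */
static int
demo_err_handle(enum demo_cqe_state st, uint16_t *skip_cnt)
{
	if (st == DEMO_CQE_SW_ERR)
		*skip_cnt = 1;
	return DEMO_ERROR_CQE_MASK;
}

/*
 * Advance the ring index as the review describes: rq_ci += skip_cnt.
 * 'init' is the initializer under discussion (the patch uses 0, the
 * review asks whether it should be 1).
 */
static uint32_t
demo_advance(uint32_t rq_ci, enum demo_cqe_state st, uint16_t init)
{
	uint16_t skip_cnt = init;

	if (demo_err_handle(st, &skip_cnt) == DEMO_ERROR_CQE_MASK)
		rq_ci += skip_cnt;
	return rq_ci;
}
```

In this sketch, demo_advance(10, DEMO_CQE_HW_OWN, 0) returns 10 (the index
stays put), while demo_advance(10, DEMO_CQE_HW_OWN, 1) returns 11. Which of
the two is right for the real driver depends on whether a CQE still owned
by hardware should be polled again or skipped, which is exactly what the
question above asks the author to confirm.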
> unsigned int idx = rq_ci & wqe_mask;
> volatile struct mlx5_wqe_data_seg *wqe =
> &((volatile struct mlx5_wqe_data_seg *)rxq->wqes)[idx];
> @@ -1497,7 +1497,7 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> uint16_t strd_cnt;
> uint16_t strd_idx;
> uint32_t byte_cnt;
> - uint16_t skip_cnt;
> + uint16_t skip_cnt = 0;
The same question applies to mlx5_rx_burst_mprq(). When the MPRQ path
hits the HW_OWN case and the return value is MLX5_ERROR_CQE_MASK, does
leaving skip_cnt at 0 handle the error condition properly?