|WARNING| [v9,1/1] net/mana: add device reset support

dpdklab at iol.unh.edu dpdklab at iol.unh.edu
Fri Jun 12 14:56:51 CEST 2026
Previous message (by thread): |WARNING| app/testpmd: add padding mode to txonly engine
Next message (by thread): |WARNING| [v5, 11/11] test/bpf: add tests for error handling contracts
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Test-Label: ai-code-review
Test-Status: WARNING
https://dpdk.org/patch/165382

_AI Code Review_

=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-12

# DPDK Patch Review

## Patch: net/mana: add device reset support

### Summary
This patch adds automatic recovery from hardware service reset events to the MANA PMD. The implementation uses ethdev recovery events, a two-phase reset flow (enter/exit), per-queue atomic burst state for data path synchronization, and a control thread for blocking operations.

---

## Errors

### 1. Resource leak on early return in mana_tx_burst

**File:** `drivers/net/mana/tx.c`  
**Line:** ~217 (in mana_tx_burst after CAS check)

```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(
        &txq->burst_state, &expected, 1,
        rte_memory_order_acquire,
        rte_memory_order_relaxed) || !db_page)) {
    if (!expected) /* CAS succeeded but db_page NULL -- undo */
        rte_atomic_fetch_and_explicit(&txq->burst_state,
                          ~(uint32_t)1,
                          rte_memory_order_release);
    return 0;
}
```

The comment says "CAS succeeded but db_page NULL -- undo", but `!expected` means the CAS **failed** (another thread set the flag first). If the CAS succeeded (`expected == 0` on entry, now `1`), but `db_page` is NULL, the code does NOT undo the burst_state bit. This leaves bit 0 set and the queue permanently stuck -- no future burst can enter because the CAS will never succeed again (expected is always 1).

**Fix:** The condition should be `if (expected == 0)` (CAS succeeded) not `if (!expected)` (CAS failed):

```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(
        &txq->burst_state, &expected, 1,
        rte_memory_order_acquire,
        rte_memory_order_relaxed) || !db_page)) {
    if (expected == 0) /* CAS succeeded but db_page NULL -- undo */
        rte_atomic_fetch_and_explicit(&txq->burst_state,
                          ~(uint32_t)1,
                          rte_memory_order_release);
    return 0;
}
```

Or invert the logic to test the CAS result directly:
```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(...))) {
    return 0; /* CAS failed, already blocked or in-burst */
}
if (unlikely(!db_page)) {
    /* Undo the CAS we just did */
    rte_atomic_fetch_and_explicit(&txq->burst_state, ~(uint32_t)1,
                                  rte_memory_order_release);
    return 0;
}
```

---

### 2. Missing error check on pthread_create_internal_control in mana_reset_thread

**File:** `drivers/net/mana/mana.c`  
**Line:** ~1599 (in mana_reset_enter)

```c
ret = rte_thread_create_internal_control(&priv->reset_thread,
                     "mana-reset",
                     mana_reset_thread, priv);
if (ret) {
    DRV_LOG(ERR, "Failed to create reset thread ret %d", ret);
    rte_atomic_store_explicit(&priv->dev_state,
                  MANA_DEV_RESET_FAILED,
                  rte_memory_order_release);
    goto reset_failed;
}
```

After setting `dev_state` to `MANA_DEV_RESET_FAILED` and jumping to `reset_failed`, the code unlocks `reset_ops_lock` but does NOT clear the per-queue `burst_state` blocked bits. The data path remains permanently blocked -- no future burst can succeed because the `burst_state` `MANA_BURST_BLOCKED` bit is still set.

**Fix:** Call `mana_clear_burst_state(dev)` before unlocking:

```c
if (ret) {
    DRV_LOG(ERR, "Failed to create reset thread ret %d", ret);
    rte_atomic_store_explicit(&priv->dev_state,
                  MANA_DEV_RESET_FAILED,
                  rte_memory_order_release);
    goto reset_failed;
}
...

reset_failed:
    mana_clear_burst_state(dev);
    pthread_mutex_unlock(&priv->reset_ops_lock);
```

(Note: The existing `reset_failed` label does call `mana_clear_burst_state(dev)` in the current patch, so verify the control flow is correct in all error paths from `mana_reset_enter`.)

---

### 3. Incorrect state comparison allows double-join of reset thread

**File:** `drivers/net/mana/mana.c`  
**Line:** ~961 (in mana_join_reset_thread)

```c
if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
    pthread_mutex_unlock(&priv->reset_ops_lock);
    return 0;
}
```

This is in `mana_dev_stop_lock` after calling `mana_join_reset_thread()`. If the device is not in `MANA_DEV_ACTIVE` state, it returns without doing anything -- but the reset thread may still be running (state is `MANA_DEV_RESET_EXIT` waiting for the timer). The next call to `dev_stop` or `dev_close` will call `mana_join_reset_thread()` again, which races on the `reset_thread_active` CAS. The thread can only be joined once.

**Fix:** The check should be in the **caller** before calling `mana_dev_stop`, not after the join. If the device is not active, `mana_join_reset_thread()` still needs to be called to ensure the thread is joined or detached. Remove the early return in `mana_dev_stop_lock` and always call the underlying `mana_dev_stop()` after the join.

```c
static int
mana_dev_stop_lock(struct rte_eth_dev *dev)
{
    struct mana_priv *priv = dev->data->dev_private;
    int ret;

    mana_join_reset_thread(priv);

    pthread_mutex_lock(&priv->reset_ops_lock);

    /* Always call mana_dev_stop -- it handles state transitions itself */
    ret = mana_dev_stop(dev);
    pthread_mutex_unlock(&priv->reset_ops_lock);
    return ret;
}
```

Then in `mana_dev_stop`, add a state check at the start:

```c
static int
mana_dev_stop(struct rte_eth_dev *dev)
{
    struct mana_priv *priv = dev->data->dev_private;
    enum mana_device_state state;

    state = rte_atomic_load_explicit(&priv->dev_state,
                     rte_memory_order_acquire);
    if (state != MANA_DEV_ACTIVE && state != MANA_DEV_RESET_FAILED)
        return 0; /* Already stopped or in reset */

    /* ... existing code ... */
}
```

---

### 4. Missing reset_thread_active flag set in probe path

**File:** `drivers/net/mana/mana.c`  
**Line:** ~2326 (after mana_intr_install)

The reset path sets `reset_thread_active` to `true` after spawning the thread, but on initial probe there is no initialization of this flag. It defaults to zero (false), which is correct. However, if a previous probe failed and left the flag in an indeterminate state (e.g., a probe that partially succeeded then failed in a later step), the flag may be stale.

**Fix:** Initialize `reset_thread_active` to `false` explicitly in the `!is_reset` path:

```c
if (!is_reset) {
    /* ... existing allocations ... */
    rte_atomic_store_explicit(&priv->reset_thread_active,
                  false, rte_memory_order_release);
    /* ... */
}
```

---

### 5. Error path in mana_reset_exit_delay does not free MR btrees on failure

**File:** `drivers/net/mana/mana.c`  
**Lines:** ~1726-1765 (in mana_reset_exit_delay)

When `mana_mr_btree_init` fails for a queue, the code jumps to `mr_init_failed_txq` or `mr_init_failed_rxq`, which frees btrees for indices `< i`. However, if `mana_mp_req_on_rxtx` or `mana_dev_start` fails **after** all btrees are initialized, the code jumps to `mr_init_failed_all`, which sets `i = priv->num_queues` and falls through to `mr_init_failed_rxq` -- but by then, all btrees are already initialized and the loop frees them correctly. This is correct.

**No error** -- the logic is correct. (Withdrawn upon re-inspection.)

---

### 6. Race on dev_state in mana_intr_handler double-check

**File:** `drivers/net/mana/mana.c`  
**Line:** ~1860 (in mana_intr_handler after lock)

```c
/* Re-check after lock to avoid racing with
 * mana_pci_remove_event_cb which may have
 * set RESET_FAILED while we waited.
 */
if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
    pthread_mutex_unlock(&priv->reset_ops_lock);
    break;
}
```

This check prevents a second reset from starting if the state is no longer `ACTIVE`, but it does NOT prevent the first reset (the one that set `RESET_FAILED`) from emitting a duplicate `RECOVERY_FAILED` event. The event was already emitted by the `mana_reset_thread` when it failed, but if `mana_reset_enter` sets `dev_state = RESET_FAILED` and jumps to the `reset_failed` label (e.g., if thread creation fails), the code outside the `if` block in `mana_intr_handler` checks the state again at line ~1872 and emits another `RECOVERY_FAILED` event.

**Fix:** Move the state check and event emission inside the `pthread_mutex_lock` critical section:

```c
pthread_mutex_lock(&priv->reset_ops_lock);

if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
    pthread_mutex_unlock(&priv->reset_ops_lock);
    break;
}

mana_reset_enter(priv);
/* Lock is released by mana_reset_enter */

/* Check if mana_reset_enter failed */
if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) == MANA_DEV_RESET_FAILED) {
    DRV_LOG(INFO, "Sending RTE_ETH_EVENT_RECOVERY_FAILED for port %u",
        priv->port_id);
    rte_eth_dev_callback_process(dev,
        RTE_ETH_EVENT_RECOVERY_FAILED, NULL);
}
```

But `mana_reset_enter` already releases the lock, so the post-check needs to be outside. The current code is structured correctly for this. The issue is that **both** `mana_reset_enter` (via its `reset_failed` label) **and** the caller `mana_intr_handler` emit the event. Only one should.

**Better fix:** Remove the event emission from the `reset_failed` label in `mana_reset_enter`, and always emit it in the caller (`mana_intr_handler`):

```c
reset_failed:
    mana_clear_burst_state(dev);
    pthread_mutex_unlock(&priv->reset_ops_lock);
    /* Do NOT emit event here -- let the caller do it */
}
```

Then in `mana_intr_handler`, after `mana_reset_enter()` returns:

```c
mana_reset_enter(priv);

if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) == MANA_DEV_RESET_FAILED) {
    DRV_LOG(INFO, "Sending RTE_ETH_EVENT_RECOVERY_FAILED for port %u",
        priv->port_id);
    rte_eth_dev_callback_process(dev,
        RTE_ETH_EVENT_RECOVERY_FAILED, NULL);
}
```

---

### 7. Missing NULL check on priv->ib_ctx in mana_reset_exit

**File:** `drivers/net/mana/mana.c`  
**Line:** ~1697 (in mana_reset_exit_delay)

```c
ret = ibv_close_device(priv->ib_ctx);
if (ret) {
    DRV_LOG(ERR, "Failed to close ibv device %d", ret);
    rte_atomic_store_explicit(&priv->dev_state, MANA_DEV_RESET_FAILED,
                 rte_memory_order_release);
    goto out;
}
priv->ib_ctx = NULL;
```

If `priv->ib_ctx` is already NULL (e.g., the reset enter phase called `mana_dev_close` which set it to NULL at line ~318), this calls `ibv_close_device(NULL)`, which is undefined behavior.

**Fix:** Check for NULL before calling:

```c
if (priv->ib_ctx) {
    ret = ibv_close_device(priv->ib_ctx);
    if (ret) {
        DRV_LOG(ERR, "Failed to close ibv device %d", ret);
        rte_atomic_store_explicit(&priv->dev_state, MANA_DEV_RESET_FAILED,
                     rte_memory_order_release);
        goto out;
    }
    priv->ib_ctx = NULL;
}
```

---

### 8. mana_dev_close_lock does not propagate error from mana_dev_close

**File:** `drivers/net/mana/mana.c`  
**Line:** ~1044 (in mana_dev_close_lock)

```c
ret = mana_dev_close(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```

If `mana_dev_close` returns an error, the state is left as-is (might be `RESET_FAILED`), but the function returns the error to the caller. The state should be transitioned back to `ACTIVE` or logged before returning the error, so the caller knows the close failed.

**Fix:** On error, log and transition state:

```c
ret = mana_dev_close(dev);
if (ret) {
    DRV_LOG(ERR, "mana_dev_close failed with ret %d, "
        "state left as-is", ret);
    /* Optionally: transition to ACTIVE or RESET_FAILED */
}
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```

Or always transition to `ACTIVE` after close, regardless of error (since the device is no longer usable):

```c
if (rte_atomic_load_explicit(&priv->dev_state,
    rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
    mana_clear_burst_state(dev);
    rte_atomic_store_explicit(&priv->dev_state,
        MANA_DEV_ACTIVE, rte_memory_order_release);
}

ret = mana_dev_close(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```

---

## Warnings

### 1. Missing documentation for ERR_RECOVERING event emission before lock

**File:**
Previous message (by thread): |WARNING| app/testpmd: add padding mode to txonly engine
Next message (by thread): |WARNING| [v5, 11/11] test/bpf: add tests for error handling contracts
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the test-report mailing list