|WARNING| [v9,1/1] net/mana: add device reset support
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Fri Jun 12 14:56:51 CEST 2026
Test-Label: ai-code-review
Test-Status: WARNING
https://dpdk.org/patch/165382
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-12
# DPDK Patch Review
## Patch: net/mana: add device reset support
### Summary
This patch adds automatic recovery from hardware service reset events to the MANA PMD. The implementation uses ethdev recovery events, a two-phase reset flow (enter/exit), per-queue atomic burst state for data path synchronization, and a control thread for blocking operations.
---
## Errors
### 1. Resource leak on early return in mana_tx_burst
**File:** `drivers/net/mana/tx.c`
**Line:** ~217 (in mana_tx_burst after CAS check)
```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(
&txq->burst_state, &expected, 1,
rte_memory_order_acquire,
rte_memory_order_relaxed) || !db_page)) {
if (!expected) /* CAS succeeded but db_page NULL -- undo */
rte_atomic_fetch_and_explicit(&txq->burst_state,
~(uint32_t)1,
rte_memory_order_release);
return 0;
}
```
The comment says "CAS succeeded but db_page NULL -- undo", but `!expected` means the CAS **failed** (another thread set the flag first). If the CAS succeeded (`expected == 0` on entry, now `1`), but `db_page` is NULL, the code does NOT undo the burst_state bit. This leaves bit 0 set and the queue permanently stuck -- no future burst can enter because the CAS will never succeed again (expected is always 1).
**Fix:** The condition should be `if (expected == 0)` (CAS succeeded) not `if (!expected)` (CAS failed):
```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(
&txq->burst_state, &expected, 1,
rte_memory_order_acquire,
rte_memory_order_relaxed) || !db_page)) {
if (expected == 0) /* CAS succeeded but db_page NULL -- undo */
rte_atomic_fetch_and_explicit(&txq->burst_state,
~(uint32_t)1,
rte_memory_order_release);
return 0;
}
```
Or invert the logic to test the CAS result directly:
```c
if (unlikely(!rte_atomic_compare_exchange_strong_explicit(...))) {
return 0; /* CAS failed, already blocked or in-burst */
}
if (unlikely(!db_page)) {
/* Undo the CAS we just did */
rte_atomic_fetch_and_explicit(&txq->burst_state, ~(uint32_t)1,
rte_memory_order_release);
return 0;
}
```
---
### 2. Missing error check on pthread_create_internal_control in mana_reset_thread
**File:** `drivers/net/mana/mana.c`
**Line:** ~1599 (in mana_reset_enter)
```c
ret = rte_thread_create_internal_control(&priv->reset_thread,
"mana-reset",
mana_reset_thread, priv);
if (ret) {
DRV_LOG(ERR, "Failed to create reset thread ret %d", ret);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto reset_failed;
}
```
After setting `dev_state` to `MANA_DEV_RESET_FAILED` and jumping to `reset_failed`, the code unlocks `reset_ops_lock` but does NOT clear the per-queue `burst_state` blocked bits. The data path remains permanently blocked -- no future burst can succeed because the `burst_state` `MANA_BURST_BLOCKED` bit is still set.
**Fix:** Call `mana_clear_burst_state(dev)` before unlocking:
```c
if (ret) {
DRV_LOG(ERR, "Failed to create reset thread ret %d", ret);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto reset_failed;
}
...
reset_failed:
mana_clear_burst_state(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
```
(Note: The existing `reset_failed` label does call `mana_clear_burst_state(dev)` in the current patch, so verify the control flow is correct in all error paths from `mana_reset_enter`.)
---
### 3. Incorrect state comparison allows double-join of reset thread
**File:** `drivers/net/mana/mana.c`
**Line:** ~961 (in mana_join_reset_thread)
```c
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
pthread_mutex_unlock(&priv->reset_ops_lock);
return 0;
}
```
This is in `mana_dev_stop_lock` after calling `mana_join_reset_thread()`. If the device is not in `MANA_DEV_ACTIVE` state, it returns without doing anything -- but the reset thread may still be running (state is `MANA_DEV_RESET_EXIT` waiting for the timer). The next call to `dev_stop` or `dev_close` will call `mana_join_reset_thread()` again, which races on the `reset_thread_active` CAS. The thread can only be joined once.
**Fix:** The check should be in the **caller** before calling `mana_dev_stop`, not after the join. If the device is not active, `mana_join_reset_thread()` still needs to be called to ensure the thread is joined or detached. Remove the early return in `mana_dev_stop_lock` and always call the underlying `mana_dev_stop()` after the join.
```c
static int
mana_dev_stop_lock(struct rte_eth_dev *dev)
{
struct mana_priv *priv = dev->data->dev_private;
int ret;
mana_join_reset_thread(priv);
pthread_mutex_lock(&priv->reset_ops_lock);
/* Always call mana_dev_stop -- it handles state transitions itself */
ret = mana_dev_stop(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
}
```
Then in `mana_dev_stop`, add a state check at the start:
```c
static int
mana_dev_stop(struct rte_eth_dev *dev)
{
struct mana_priv *priv = dev->data->dev_private;
enum mana_device_state state;
state = rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire);
if (state != MANA_DEV_ACTIVE && state != MANA_DEV_RESET_FAILED)
return 0; /* Already stopped or in reset */
/* ... existing code ... */
}
```
---
### 4. Missing reset_thread_active flag set in probe path
**File:** `drivers/net/mana/mana.c`
**Line:** ~2326 (after mana_intr_install)
The reset path sets `reset_thread_active` to `true` after spawning the thread, but on initial probe there is no initialization of this flag. It defaults to zero (false), which is correct. However, if a previous probe failed and left the flag in an indeterminate state (e.g., a probe that partially succeeded then failed in a later step), the flag may be stale.
**Fix:** Initialize `reset_thread_active` to `false` explicitly in the `!is_reset` path:
```c
if (!is_reset) {
/* ... existing allocations ... */
rte_atomic_store_explicit(&priv->reset_thread_active,
false, rte_memory_order_release);
/* ... */
}
```
---
### 5. Error path in mana_reset_exit_delay does not free MR btrees on failure
**File:** `drivers/net/mana/mana.c`
**Lines:** ~1726-1765 (in mana_reset_exit_delay)
When `mana_mr_btree_init` fails for a queue, the code jumps to `mr_init_failed_txq` or `mr_init_failed_rxq`, which frees btrees for indices `< i`. However, if `mana_mp_req_on_rxtx` or `mana_dev_start` fails **after** all btrees are initialized, the code jumps to `mr_init_failed_all`, which sets `i = priv->num_queues` and falls through to `mr_init_failed_rxq` -- but by then, all btrees are already initialized and the loop frees them correctly. This is correct.
**No error** -- the logic is correct. (Withdrawn upon re-inspection.)
---
### 6. Race on dev_state in mana_intr_handler double-check
**File:** `drivers/net/mana/mana.c`
**Line:** ~1860 (in mana_intr_handler after lock)
```c
/* Re-check after lock to avoid racing with
* mana_pci_remove_event_cb which may have
* set RESET_FAILED while we waited.
*/
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
pthread_mutex_unlock(&priv->reset_ops_lock);
break;
}
```
This check prevents a second reset from starting if the state is no longer `ACTIVE`, but it does NOT prevent the first reset (the one that set `RESET_FAILED`) from emitting a duplicate `RECOVERY_FAILED` event. The event was already emitted by the `mana_reset_thread` when it failed, but if `mana_reset_enter` sets `dev_state = RESET_FAILED` and jumps to the `reset_failed` label (e.g., if thread creation fails), the code outside the `if` block in `mana_intr_handler` checks the state again at line ~1872 and emits another `RECOVERY_FAILED` event.
**Fix:** Move the state check and event emission inside the `pthread_mutex_lock` critical section:
```c
pthread_mutex_lock(&priv->reset_ops_lock);
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
pthread_mutex_unlock(&priv->reset_ops_lock);
break;
}
mana_reset_enter(priv);
/* Lock is released by mana_reset_enter */
/* Check if mana_reset_enter failed */
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) == MANA_DEV_RESET_FAILED) {
DRV_LOG(INFO, "Sending RTE_ETH_EVENT_RECOVERY_FAILED for port %u",
priv->port_id);
rte_eth_dev_callback_process(dev,
RTE_ETH_EVENT_RECOVERY_FAILED, NULL);
}
```
But `mana_reset_enter` already releases the lock, so the post-check needs to be outside. The current code is structured correctly for this. The issue is that **both** `mana_reset_enter` (via its `reset_failed` label) **and** the caller `mana_intr_handler` emit the event. Only one should.
**Better fix:** Remove the event emission from the `reset_failed` label in `mana_reset_enter`, and always emit it in the caller (`mana_intr_handler`):
```c
reset_failed:
mana_clear_burst_state(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
/* Do NOT emit event here -- let the caller do it */
}
```
Then in `mana_intr_handler`, after `mana_reset_enter()` returns:
```c
mana_reset_enter(priv);
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) == MANA_DEV_RESET_FAILED) {
DRV_LOG(INFO, "Sending RTE_ETH_EVENT_RECOVERY_FAILED for port %u",
priv->port_id);
rte_eth_dev_callback_process(dev,
RTE_ETH_EVENT_RECOVERY_FAILED, NULL);
}
```
---
### 7. Missing NULL check on priv->ib_ctx in mana_reset_exit
**File:** `drivers/net/mana/mana.c`
**Line:** ~1697 (in mana_reset_exit_delay)
```c
ret = ibv_close_device(priv->ib_ctx);
if (ret) {
DRV_LOG(ERR, "Failed to close ibv device %d", ret);
rte_atomic_store_explicit(&priv->dev_state, MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto out;
}
priv->ib_ctx = NULL;
```
If `priv->ib_ctx` is already NULL (e.g., the reset enter phase called `mana_dev_close` which set it to NULL at line ~318), this calls `ibv_close_device(NULL)`, which is undefined behavior.
**Fix:** Check for NULL before calling:
```c
if (priv->ib_ctx) {
ret = ibv_close_device(priv->ib_ctx);
if (ret) {
DRV_LOG(ERR, "Failed to close ibv device %d", ret);
rte_atomic_store_explicit(&priv->dev_state, MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto out;
}
priv->ib_ctx = NULL;
}
```
---
### 8. mana_dev_close_lock does not propagate error from mana_dev_close
**File:** `drivers/net/mana/mana.c`
**Line:** ~1044 (in mana_dev_close_lock)
```c
ret = mana_dev_close(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```
If `mana_dev_close` returns an error, the state is left as-is (might be `RESET_FAILED`), but the function returns the error to the caller. The state should be transitioned back to `ACTIVE` or logged before returning the error, so the caller knows the close failed.
**Fix:** On error, log and transition state:
```c
ret = mana_dev_close(dev);
if (ret) {
DRV_LOG(ERR, "mana_dev_close failed with ret %d, "
"state left as-is", ret);
/* Optionally: transition to ACTIVE or RESET_FAILED */
}
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```
Or always transition to `ACTIVE` after close, regardless of error (since the device is no longer usable):
```c
if (rte_atomic_load_explicit(&priv->dev_state,
rte_memory_order_acquire) != MANA_DEV_ACTIVE) {
mana_clear_burst_state(dev);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_ACTIVE, rte_memory_order_release);
}
ret = mana_dev_close(dev);
pthread_mutex_unlock(&priv->reset_ops_lock);
return ret;
```
---
## Warnings
### 1. Missing documentation for ERR_RECOVERING event emission before lock
**File:**
More information about the test-report
mailing list