|WARNING| [v8,1/1] net/mana: add device reset support
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Wed Jun 10 20:26:05 CEST 2026
Test-Label: ai-code-review
Test-Status: WARNING
https://dpdk.org/patch/165257
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-10
# DPDK Patch Review: net/mana Device Reset Support
## Summary
This patch adds automatic recovery from hardware service reset events to the MANA PMD. The implementation includes a two-phase reset flow (enter/teardown/exit), per-queue atomic burst state management, process-shared synchronization primitives, multi-process IPC for secondary doorbell remapping, and ethdev recovery event emission.
---
## Correctness Issues
### Errors
1. **Use-after-free risk in `mana_reset_exit_delay` when `mana_pci_probe` fails**
**drivers/net/mana/mana.c:1651-1652**
```c
ret = mana_pci_probe(NULL, pci_dev);
if (ret) {
DRV_LOG(ERR, "Failed to probe mana pci dev ret %d", ret);
rte_atomic_store_explicit(&priv->dev_state, MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto out;
}
```
If `mana_pci_probe` fails partway through (after allocating some resources but before completing), subsequent accesses to `priv` or `dev->data->rx_queues[i]` in the MR initialization loop may be invalid. The probe failure path in `mana_probe_port` does not free `priv` on reset (`if (!is_reset)`), so `priv` remains valid, but queue structures may be partially initialized or freed by the failed probe.
**Suggested fix:** After `mana_pci_probe` fails, verify that queues are in a safe state before attempting to free MR btrees. Consider setting `priv->num_queues = 0` in the probe failure path when `is_reset == true`, or add a validation check here.
2. **Resource leak: `priv->intr_handle` not freed if `rte_dev_event_callback_register` fails**
**drivers/net/mana/mana.c:1964-1970**
```c
ret = rte_dev_event_callback_register(eth_dev->device->name,
mana_pci_remove_event_cb, priv);
if (ret)
DRV_LOG(WARNING, "Failed to register PCI remove event callback");
eth_dev->intr_handle = priv->intr_handle;
return 0;
```
If `rte_dev_event_callback_register` fails, the function returns success but `priv->intr_handle` is not freed. On subsequent `mana_intr_uninstall` or `mana_reset_exit`, the callback unregister will fail with `-ENOENT` (callback was never registered), but the `intr_handle` itself is leaked.
**Suggested fix:** Change the warning to an error and call `rte_intr_instance_free` + `goto free_intr` on failure, or document that this is acceptable (callback registration failure is non-fatal).
3. **Potential double-join of reset thread in rare race**
**drivers/net/mana/mana.c:988-998** (`mana_join_reset_thread`)
```c
if (rte_atomic_compare_exchange_strong_explicit(
&priv->reset_thread_active, &expected, false,
rte_memory_order_acq_rel,
rte_memory_order_acquire)) {
pthread_mutex_lock(&priv->reset_cond_mutex);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_ACTIVE, rte_memory_order_release);
pthread_cond_signal(&priv->reset_cond);
pthread_mutex_unlock(&priv->reset_cond_mutex);
rte_thread_join(priv->reset_thread, NULL);
}
```
Between the CAS clearing `reset_thread_active` and the `rte_thread_join`, if another thread calls `mana_join_reset_thread` concurrently, the second thread will not enter the `if` block (CAS fails) but the reset thread may not have fully exited yet. This is safe only if no code path can call `mana_join_reset_thread` concurrently on the same `priv`. However, the patch does not document this requirement.
**Analysis:** All call sites (`mana_dev_stop_lock`, `mana_dev_close_lock`, `mana_dev_uninit`) are either under `reset_ops_lock` or serialized by EAL port teardown. The pattern is safe but not obviously so.
**Suggested fix:** Add a comment above `mana_join_reset_thread` stating that callers must ensure serialization (e.g., via `reset_ops_lock` or EAL port lifecycle serialization).
4. **Missing error check: `rte_thread_create_internal_control` failure leaves state inconsistent**
**drivers/net/mana/mana.c:1540-1552**
```c
ret = rte_thread_create_internal_control(&priv->reset_thread,
"mana-reset",
mana_reset_thread, priv);
if (ret) {
DRV_LOG(ERR, "Failed to create reset thread ret %d", ret);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_RESET_FAILED,
rte_memory_order_release);
goto reset_failed;
}
rte_atomic_store_explicit(&priv->reset_thread_active,
true, rte_memory_order_release);
```
If thread creation fails, `dev_state` is set to `RESET_FAILED` and the function jumps to `reset_failed`, which clears `burst_state` and emits `RECOVERY_FAILED`. However, `reset_thread_active` is never set to `true`, so any subsequent call to `mana_join_reset_thread` will not join a nonexistent thread. This is safe but subtle.
**Correctness verdict:** Not a bug--`reset_thread_active` remains `false` if thread creation fails, so no join is attempted. However, the code would be clearer if `reset_thread_active` were explicitly set to `false` on failure (redundant store, but documents intent).
**Suggested fix:** Add `rte_atomic_store_explicit(&priv->reset_thread_active, false, ...)` before `goto reset_failed` for clarity (optional, not a correctness issue).
### Warnings
5. **Potential deadlock: `mana_intr_uninstall` inside reset under lock may hang on unregister**
**drivers/net/mana/mana.c:1799-1808** (`mana_reset_exit`)
```c
if (priv->intr_handle) {
ret = rte_intr_callback_unregister(priv->intr_handle,
mana_intr_handler, priv);
if (ret < 0)
DRV_LOG(ERR, "Failed to unregister intr callback ret %d",
ret);
```
The comment states "Since mana_reset_exit is always called from mana_reset_thread (a non-interrupt thread), the interrupt source is inactive and rte_intr_callback_unregister succeeds directly." However, if an IBV async event arrives concurrently, `mana_intr_handler` could be executing and attempting to acquire `reset_ops_lock`, while `mana_reset_exit` holds the lock and calls `rte_intr_callback_unregister`, which may block waiting for the handler to complete.
**Analysis:** The reset path sets `dev_state` to `RESET_ENTER` or `RESET_EXIT` before reaching here, so `mana_intr_handler` will not re-enter the reset logic (the `if (dev_state == MANA_DEV_ACTIVE)` check prevents it). The handler will ack the event and return without acquiring the lock. This is safe, but the comment is misleading--the interrupt source is NOT inactive; it just won't trigger reset logic.
**Suggested fix:** Clarify the comment to state that the interrupt handler checks `dev_state` and exits early if not `ACTIVE`, so unregister is safe despite concurrent events.
6. **Missing bounds check on loop index in `mana_reset_exit_delay` MR init failure path**
**drivers/net/mana/mana.c:1696-1706**
```c
mr_init_failed_all:
i = priv->num_queues;
goto mr_init_failed_rxq;
mr_init_failed_txq:
/* RXQ btree at index i was initialized, free it */
mana_mr_btree_free(&((struct mana_rxq *)
dev->data->rx_queues[i])->mr_btree);
mr_init_failed_rxq:
/* Free all fully initialized btrees for indices < i */
for (int j = 0; j < i; j++) {
```
If `mr_init_failed_all` is reached, `i` is set to `priv->num_queues`, then `mr_init_failed_txq` accesses `dev->data->rx_queues[i]`, which is out of bounds (valid indices are `0` to `num_queues - 1`).
**Suggested fix:** In the `mr_init_failed_all` case, skip the `mr_init_failed_txq` logic (it does not apply). Change to:
```c
mr_init_failed_all:
i = priv->num_queues;
goto mr_init_failed_rxq; /* skip mr_init_failed_txq */
```
Or restructure to avoid the out-of-bounds access.
7. **Queue state check in `mana_stop_rx_queues` / `mana_stop_tx_queues` returns 0 instead of proceeding**
**drivers/net/mana/rx.c:178-180**, **drivers/net/mana/tx.c:19-20**
```c
for (i = 0; i < priv->num_queues; i++)
if (dev->data->rx_queue_state[i] == RTE_ETH_QUEUE_STATE_STOPPED)
return 0;
```
If ANY queue is already stopped, the function returns success without stopping the remaining queues. This is likely incorrect--the intent is probably to skip queues that are already stopped, not to abort the entire loop.
**Suggested fix:** Change to `continue;` instead of `return 0;` to skip already-stopped queues, or check that ALL queues are stopped before skipping.
8. **PCI remove callback may access freed `priv` after `mana_dev_close`**
**drivers/net/mana/mana.c:1377-1393** (`mana_pci_remove_event_cb`)
```c
pthread_mutex_lock(&priv->reset_cond_mutex);
rte_atomic_store_explicit(&priv->dev_state,
MANA_DEV_RESET_FAILED, rte_memory_order_release);
pthread_cond_signal(&priv->reset_cond);
pthread_mutex_unlock(&priv->reset_cond_mutex);
pthread_mutex_lock(&priv->reset_ops_lock);
pthread_mutex_unlock(&priv->reset_ops_lock);
```
If the PCI device is hot-removed during reset, the reset thread may call `mana_dev_close` (which can fail but does not free `priv`), then exit. However, the callback was registered in `mana_intr_install` with `priv` as the `cb_arg`, and it is unregistered in `mana_intr_uninstall` (called from `mana_dev_close` or `mana_reset_exit`). If the callback fires after `priv` is freed by `mana_dev_uninit`, this is a use-after-free.
**Analysis:** The callback is unregistered in `mana_intr_uninstall` before `priv` can be freed. The only path to free `priv` is `mana_pci_remove` - `mana_dev_uninit` - `rte_free(priv)`, and `mana_dev_uninit` calls `mana_join_reset_thread` then `mana_dev_close` (which calls `mana_intr_uninstall`). So the callback is unregistered before `priv` is freed. However, `mana_dev_free_resources` (which destroys mutexes) is called AFTER `mana_dev_close`, so the callback could still fire and access destroyed mutexes if EAL does not guarantee callback unregister is synchronous.
**Suggested fix:** Verify that `rte_dev_event_callback_unregister` is synchronous (waits for in-flight callbacks) or add a comment documenting the ordering requirement. The comment in `mana_intr_uninstall` mentions not retrying on `-EAGAIN` to avoid deadlock, which suggests the unregister may block--need to ensure it does not return while the callback is still runnable.
---
## Style and Process Issues
### Warnings
9. **Variable initialization in `mana_intr_handler` is unnecessary**
**drivers/net/mana/mana.c:1738**
```c
struct ibv_async_event event = { 0 };
```
The variable is immediately overwritten by `ibv_get_async_event(&event)` in the loop, so zero-initialization is dead code. (However, on loop exit the event may be uninitialized if `ibv_get_async_event` fails, but the code does not access it after the loop.)
**Suggested fix:** Remove `= { 0 }` initialization.
10. **Non-const function pointer struct `mana_dev_ops` and `mana_dev_secondary_ops`**
**drivers/net/mana/mana.c:1146-1169**
```c
static const struct eth_dev_ops mana_dev_ops = { ... };
static const struct eth_dev_ops mana_dev_secondary_ops = { ... };
```
These are already declared `const`. No issue here. (Guideline check: verified.)
11. **Hardcoded timeout in `mana_mp_req_on_rxtx`**
**drivers/net/mana/mp.c:370**
```c
struct timespec ts = {.tv_sec = MANA_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
```
The macro `MANA_MP_REQ_TIMEOUT_SEC` is defined elsewhere (not shown in patch), but the comment in `mana.c` mentions "multi-second IPC timeouts." If this is intended to be tunable, it should be a devarg or documented.
**Suggested fix:** Document the timeout in the NIC guide or add a comment explaining why 5 seconds (or whatever the value is) is appropriate.
12. **RST documentation uses bullet lists where definition lists would be clearer**
**doc/guides/nics/mana.rst:99-111**
```rst
The driver emits the following ethdev recovery events
to notify upper layers (e.g. netvsc) of the reset lifecycle:
``RTE_ETH_EVENT_ERR_RECOVERING``
Reset has started.
``RTE_ETH_EVENT_RECOVERY_SUCCESS``
Device has recovered successfully.
``RTE_ETH_EVENT_RECOVERY_FAILED``
Recovery failed.
```
This is already a definition list (correct RST syntax). No issue here.
13. **Missing space in log messages**
**drivers/net/mana/mana.c:968-969**
```c
DRV_LOG(ERR, "Device reset in progress, "
"mana_dev_tx_queue_release not called");
```
Multi-line strings are concatenated without space, but the continuation starts a new sentence, so this is acceptable. (C concatenates adjacent string literals; the space before the second string is preserved.)
---
## API and Documentation
### No Issues Found
14. **New feature is properly documented in release notes and NIC guide.**
- Release notes updated in `doc/guides/rel_notes/release_26_07.rst
More information about the test-report
mailing list