[PATCH v7 0/8] net/nfb: rework to real multiport
Stephen Hemminger
stephen at networkplumber.org
Thu Feb 12 19:35:54 CET 2026
On Wed, 4 Feb 2026 13:31:29 +0100
spinler at cesnet.cz wrote:
> From: Martin Spinler <spinler at cesnet.cz>
>
> This series implements real multiport for better user experience.
>
> The existing driver creates one ethdev/port for one PCI device.
> As the CESNET-NDK based cards aren't capable to represent each
> Ethernet port by own PCI device, new driver implementation
> processes real port configuration from firmware/card and switches
> from rte_eth_dev_pci_generic_probe to multiple rte_eth_dev_create calls.
>
> ---
> Depends-on: patch-37245 ("doc/nfb: update release notes for nfb driver")
>
> v7:
> * Rebased to patch-37245
> * Removed timestamp update info from release notes
> * Added missing ctype include for isdigit()
> * Updated style: removed extra brackets
>
> Martin Spinler (8):
> net/nfb: prepare for indirect queue mapping scheme
> net/nfb: create one ethdev per ethernet port
> net/nfb: add vdev as alternative device probe method
> net/nfb: add device argument "port" to limit used ports
> net/nfb: init only MACs associated with device
> net/nfb: add compatible cards to driver PCI ID table
> net/nfb: report firmware version
> doc/nfb: cleanup and update guide
>
> doc/guides/nics/features/nfb.ini | 4 +
> doc/guides/nics/nfb.rst | 187 +++++------
> doc/guides/rel_notes/release_26_03.rst | 6 +
> drivers/net/nfb/meson.build | 1 +
> drivers/net/nfb/nfb.h | 56 +++-
> drivers/net/nfb/nfb_ethdev.c | 445 +++++++++++++++++++------
> drivers/net/nfb/nfb_rx.c | 30 +-
> drivers/net/nfb/nfb_rx.h | 9 +-
> drivers/net/nfb/nfb_tx.c | 27 +-
> drivers/net/nfb/nfb_tx.h | 7 +-
> drivers/net/nfb/nfb_vdev.c | 107 ++++++
> 11 files changed, 622 insertions(+), 257 deletions(-)
> create mode 100644 drivers/net/nfb/nfb_vdev.c
>
Improved the AI review process and it saw this.
I think you should fix and retest before merge to next-net.
Here's the review. The most significant finding is in patch 2/8:
Double TAILQ_REMOVE — nfb_eth_common_remove() explicitly removes the
entry from the list, then calls rte_eth_dev_destroy() which invokes
nfb_eth_dev_uninit(), which also does TAILQ_REMOVE on the same entry.
This will corrupt the linked list. The fix is to remove the
TAILQ_REMOVE from nfb_eth_common_remove() and let uninit handle it,
since uninit is also called from the dev_close path.
Other notable findings include unbounded loop counters when populating
queue maps and MAC arrays (patches 2 and 5 — trusting firmware
consistency without defensive bounds checks), strtoul with base 0
allowing accidental octal parsing of port numbers (patch 4), and a
"DYNANIC" typo in the documentation (patch 8).
The overall architecture of the series is solid — the indirect queue
mapping, per-interface ethdev creation, and TAILQ-based device tracking
are well-designed patterns for multi-port NICs.
Full report is:
# Deep Dive Review: NFB Multi-Port Patch Series (v7, 8 patches)
**Series**: `[PATCH v7 1/8]` through `[PATCH v7 8/8]`
**Author**: Martin Spinler `<spinler at cesnet.cz>`
**Delegate**: Stephen Hemminger
---
## Series Overview
This patch series transforms the NFB PMD from a "one ethdev per PCI device" model to "one ethdev per physical Ethernet port." This is a significant architectural change requiring:
1. An indirect queue mapping layer (DPDK queue index → firmware queue ID)
2. Multiple `rte_eth_dev_create()` calls per PCI probe instead of `rte_eth_dev_pci_generic_probe()`
3. A TAILQ-based device list to track all created ethdevs for proper removal
4. A vdev probe path for non-PCI (simulated/virtual) devices
5. Per-port MAC initialization instead of global
6. New PCI IDs and documentation updates
The design is generally sound — the libnfb framework's `nc_ifc_map_info` provides the firmware-side port/queue topology, and this series maps it cleanly into DPDK's model.
---
## Patch-by-Patch Analysis
### Patch 1/8: `net/nfb: prepare for indirect queue mapping scheme`
**What it does**: Adds `queue_map_rx` / `queue_map_tx` arrays to `struct pmd_priv`, allocated as a single `rte_calloc` block. The RX/TX queue setup functions now look up the firmware queue ID via these maps instead of using the DPDK queue index directly.
**Correctness findings**:
**Error: Potential out-of-bounds access on `queue_map_rx` / `queue_map_tx`**
In `nfb_eth_rx_queue_setup()`:
```c
qid = priv->queue_map_rx[rx_queue_id];
```
There is no bounds check that `rx_queue_id < max_rx_queues` before indexing into the map array. The same applies to `nfb_eth_tx_queue_setup()` with `queue_map_tx[tx_queue_id]`. While ethdev may enforce queue count limits configured via `dev_info`, a defensive check would prevent corruption if the upper layer has a bug.
**Confidence**: ~60%. Ethdev's `rte_eth_rx_queue_setup` does validate `rx_queue_id < dev->data->nb_rx_queues`, but `nb_rx_queues` is set by `rte_eth_dev_configure()` which should match `max_rx_queues`. If a mismatch ever occurs (e.g., bug in configure path), this would silently corrupt memory.
**Info: Clean refactoring of error paths**
The conversion from:
```c
if (ret == 0)
dev->data->rx_queues[rx_queue_id] = rxq;
else
rte_free(rxq);
```
to the `goto err_queue_init` pattern is cleaner and more consistent with DPDK conventions.
---
### Patch 2/8: `net/nfb: create one ethdev per ethernet port`
This is the core architectural patch. Key changes:
- `nfb_eth_dev_init()` signature changes to accept `void *init_data` (for `rte_eth_dev_create()`)
- `nfb_eth_common_probe()` opens the NFB device, gets interface map, iterates interfaces
- TAILQ list tracks all created ethdevs for removal
- `nfb_eth_common_remove()` iterates the TAILQ to destroy matching ethdevs
**Correctness findings**:
**Error: Double TAILQ_REMOVE in `nfb_eth_common_remove()`**
```c
RTE_TAILQ_FOREACH_SAFE(entry, &nfb_eth_dev_list, eth_dev_list, temp) {
if (dev == entry->eth_dev->device) {
TAILQ_REMOVE(&nfb_eth_dev_list, entry, eth_dev_list);
rte_eth_dev_destroy(entry->eth_dev, nfb_eth_dev_uninit);
}
}
```
`rte_eth_dev_destroy()` calls `nfb_eth_dev_uninit()`, which also does:
```c
TAILQ_REMOVE(&nfb_eth_dev_list, internals, eth_dev_list);
```
This is a **double TAILQ_REMOVE** on the same entry. The first `TAILQ_REMOVE` in `nfb_eth_common_remove()` unlinks the entry, then `nfb_eth_dev_uninit()` attempts to remove the already-unlinked entry again. Depending on the TAILQ implementation, this could corrupt the list or cause undefined behavior.
**Confidence**: ~85%. The `nfb_eth_dev_uninit()` always does the TAILQ_REMOVE (it was added in this same patch), and `nfb_eth_common_remove()` also does it before calling destroy. One of these must be removed. The cleanest fix is to remove the TAILQ_REMOVE from `nfb_eth_common_remove()` and let `nfb_eth_dev_uninit()` handle it exclusively — that way `dev_close` → `uninit` also works correctly.
**Error: `nfb_eth_common_probe()` opens NFB device twice**
`nfb_eth_common_probe()` opens the NFB device with `nfb_open()` to get the interface map, then closes it. But then `nfb_eth_dev_init()` (called via `rte_eth_dev_create()`) opens the **same device again** per-interface:
```c
internals->nfb = nfb_open(params->probe_params->path);
```
Each ethdev gets its own `nfb_open()` handle, which is probably intentional (each port needs its own handle for independent NDP queue operations). But the probe-level open is done solely to read the interface map and is then closed. This is not a bug per se, but worth confirming the design intent: **N+1 opens** for N ports seems intentional.
**Confidence**: 50% — this is likely intentional but should be confirmed.
**Warning: `__rte_internal` in header file**
```c
__rte_internal
int nfb_eth_common_probe(struct nfb_probe_params *params);
__rte_internal
int nfb_eth_common_remove(struct rte_device *dev);
```
These are declared in `nfb.h` (a private driver header), which is fine for cross-file use within the driver. However, per AGENTS.md guidelines, `__rte_internal` should be alone on its own line before the function — and it is here, so this is correct. The `RTE_EXPORT_INTERNAL_SYMBOL` macros are present in the .c file. No issue.
**Warning: `nfb_eth_dev_create_for_ifc()` queue map population may write beyond allocated bounds**
In patch 2, the queue map population loop:
```c
cnt = 0;
for (i = 0; i < mi->rxq_cnt; i++) {
if (mi->rxq[i].ifc == ifc->id)
priv->queue_map_rx[cnt++] = mi->rxq[i].id;
}
```
The array `queue_map_rx` was allocated with size `max_rx_queues` (which is `ifc->rxq_cnt`). If the firmware map reports more RX queues for this interface than `ifc->rxq_cnt`, `cnt` could exceed the allocation. This depends on the libnfb library's consistency guarantees between `ifc->rxq_cnt` and the actual count of `mi->rxq[i].ifc == ifc->id` matches.
**Confidence**: ~55%. If `nc_ifc_map_info_create_ordinary()` guarantees consistency, this is safe. If not, a bounds check on `cnt < max_rx_queues` would be prudent.
**Info: Static TAILQ without synchronization**
The `nfb_eth_dev_list` TAILQ is a static global with no locking. This is safe as long as probe/remove are only called from the EAL main thread (which is the normal DPDK model), but worth noting.
---
### Patch 3/8: `net/nfb: add vdev as alternative device probe method`
Adds a vdev driver (`net_vdev_nfb`) for non-PCI devices (simulated/virtual).
**Correctness findings**:
**Warning: `nfb_default_dev_path()` used without NULL check**
```c
params.path = nfb_default_dev_path();
```
If `nfb_default_dev_path()` returns NULL (e.g., no NFB device present), this will be passed to `nfb_open()` which will likely segfault. The "dev" argument parsing later may override it, but if no `dev=` argument is given, the default path is used.
**Confidence**: ~65%. Depends on `nfb_default_dev_path()` semantics. If it always returns a valid string (even for non-existent devices), this is fine. If it can return NULL, a check is needed.
**Info: Mixed `free()` and `rte_kvargs_free()` cleanup**
The error paths correctly use `free()` for `strdup()`'d memory and `rte_kvargs_free()` for kvargs. This is correct.
---
### Patch 4/8: `net/nfb: add device argument "port" to limit used ports`
Adds ability to select specific ports via `port=N` arguments.
**Correctness findings**:
**Error: `strtoul` with base 0 allows octal/hex input — potential user confusion**
```c
port = strtoul(value, &end, 0);
```
Base 0 means `port=010` is parsed as octal (8), not decimal 10. For a port number argument, base 10 would be more predictable:
```c
port = strtoul(value, &end, 10);
```
**Confidence**: 70%. Not a crash bug, but a usability issue that could cause silent misconfiguration.
**Info: The `nfb_eth_dev_create_for_ifc_by_port()` validation is good** — it checks for empty string, non-digit start, and out-of-range values.
---
### Patch 5/8: `net/nfb: init only MACs associated with device`
Replaces the global "open all MACs" approach with per-interface MAC initialization.
**Correctness findings**:
**Warning: Uses `calloc()` instead of `rte_calloc()` for rxmac/txmac arrays**
```c
intl->rxmac = calloc(ifc->eth_cnt, sizeof(*intl->rxmac));
```
These are pointer arrays (not DMA-accessible), so `calloc()` is actually appropriate per DPDK guidelines (don't use `rte_malloc` for non-DMA control structures). However, `nfb_eth_dev_uninit()` is freeing the `internals` struct with `rte_free()` while these sub-allocations use plain `free()`. This mixing is fine — they're separate allocations — but worth noting for consistency.
**Error: `nfb_nc_eth_init()` doesn't validate `cnt` against allocated array size**
```c
intl->rxmac = calloc(ifc->eth_cnt, sizeof(*intl->rxmac));
// ...
rxm = 0;
for (i = 0; i < mi->eth_cnt; i++) {
if (mi->eth[i].ifc != ifc->id)
continue;
intl->rxmac[rxm] = nc_rxmac_open(intl->nfb, mi->eth[i].node_rxmac);
if (intl->rxmac[rxm])
rxm++;
// ...
}
```
The array is allocated with size `ifc->eth_cnt`, but the loop counts matches of `mi->eth[i].ifc == ifc->id` without bounding `rxm` against `ifc->eth_cnt`. If the firmware map has more eth entries for this interface than `ifc->eth_cnt` indicates, this overflows the allocation.
**Confidence**: ~55%. Same trust-the-firmware question as in patch 2's queue map population.
**Info: Early return for `ifc->eth_cnt == 0` is good defensive coding.**
---
### Patch 6/8: `net/nfb: add compatible cards to driver PCI ID table`
Adds PCI IDs for NFB-200G2QL secondary endpoint, FB2CGHH, COMBO-400G1, and generic CESNET-NDK.
No correctness issues. Straightforward PCI ID table additions.
---
### Patch 7/8: `net/nfb: report firmware version`
Implements `fw_version_get` ethdev op.
**Correctness findings**:
**Info: Good NULL handling for `proj_name` / `proj_vers`** — both are defaulted to empty string if NULL.
**Info: The return value semantics are correct** — returns 0 on success, the required buffer size on truncation, and negative on error, matching the `fw_version_get` API contract.
---
### Patch 8/8: `doc/nfb: cleanup and update guide`
Documentation rewrite and release notes update.
**Warning: Typo in documentation**
```rst
Ready-to-use demo firmwares can be found on the `DYNANIC page <https://dyna-nic.com/try-now/>`_.
```
"DYNANIC" should be "DYNANIC" or more likely "DyNANIC" or whatever the correct product name is — but the URL suggests it might be "dyna-nic", so the display text appears misspelled. Worth verifying the correct branding.
**Warning: Feature matrix additions should match implemented features**
The patch adds to `features/nfb.ini`:
```ini
Unicast MAC filter = Y
Timestamp offload = P
FW version = Y
Multiprocess aware = Y
```
"Unicast MAC filter" and "Timestamp offload" were presumably already supported but undocumented. "FW version" is added in patch 7. "Multiprocess aware" — the series adds `rte_eal_process_type()` checks in some places but the full multiprocess story should be verified.
---
## Cross-Cutting Concerns
### Thread Safety of Static TAILQ
The `nfb_eth_dev_list` TAILQ has no locking. DPDK's device management is typically single-threaded (EAL main thread), but if hotplug events could race with probe/remove, this could be problematic. This is consistent with how other DPDK drivers handle similar lists, so it's acceptable.
### Memory Allocation Strategy
The series uses an appropriate mix of `rte_calloc`/`rte_zmalloc` for data in the DPDK-managed dev_private area and plain `calloc`/`malloc` for driver-internal control structures (MAC handle arrays). This follows DPDK best practices.
### Error Path Completeness
Error paths are generally well-handled across the series. The `goto` cleanup chains in `nfb_eth_dev_init()` are consistent and cover the allocation sequence.
---
## Summary of Findings
| Severity | Patch | Finding |
|----------|-------|---------|
| **Error** | 2/8 | Double `TAILQ_REMOVE` — `nfb_eth_common_remove()` removes from list, then `nfb_eth_dev_uninit()` (called via `rte_eth_dev_destroy()`) removes again |
| **Warning** | 2/8 | Queue map population loop lacks bounds check against allocated size |
| **Warning** | 5/8 | `nfb_nc_eth_init()` rxmac/txmac population loop lacks bounds check against `ifc->eth_cnt` |
| **Warning** | 4/8 | `strtoul` with base 0 allows unintended octal/hex parsing of port numbers |
| **Warning** | 3/8 | `nfb_default_dev_path()` return value not checked for NULL before use |
| **Warning** | 8/8 | "DYNANIC" appears to be a typo in documentation |
| **Info** | 1/8 | No bounds check on `rx_queue_id` / `tx_queue_id` before indexing queue maps |
The **double TAILQ_REMOVE** in patch 2 is the most significant finding and will cause list corruption when removing devices. The fix is straightforward: remove the `TAILQ_REMOVE` call from `nfb_eth_common_remove()` and let `nfb_eth_dev_uninit()` handle list removal exclusively.
More information about the dev
mailing list