|WARNING| [17/17] common/cnxk: add auth key len check in inbound SA
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Thu Jun 11 16:53:39 CEST 2026
Test-Label: ai-code-review
Test-Status: WARNING
https://dpdk.org/patch/165287
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-06-11
# DPDK Patch Review: series.patch
This review covers 17 patches from a Marvell CNXK driver patch series submitted on June 11, 2026.
---
## Patch 01/17: net/cnxk: update mbuf next field for multi segment
**Summary:** Resets `mbuf->next` and `mbuf->nb_segs` for multi-segment mbufs on CN9K platform to comply with `rte_mbuf_raw_free_bulk()` requirements.
### Errors
1. **Incorrect guard for `mbuf->next = NULL` assignment** (cn9k_tx.h:672, 707, 985)
The patch conditionally sets `mbuf->next = NULL` only when `!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)`. This is backwards. The flag name suggests "no-off" (no reset), so the assignment should happen when the flag is **NOT set**. However, the code logic appears inverted: you're skipping the reset when NOFF is **clear** (the flag is absent).
**Expected behavior:**
- If `MBUF_NOFF_F` flag is set (meaning "do not reset"), skip the assignment.
- If flag is clear, perform the reset.
**Current code:**
```c
if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F))
m->next = NULL;
```
This reads as "if NOFF is NOT set, then set next to NULL" which is correct. But the placement after the initial `m->next` read suggests potential use-after-free. Let me re-check:
The first check at line 668 reads:
```c
if (likely(!m_next))
goto done;
```
This is fine--if no next segment, skip the loop.
Then line 671:
```c
if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)) {
m->next = NULL;
m->nb_segs = 1;
}
```
This sets `m->next = NULL` for the **head** mbuf before entering the loop. This is correct for single-segment mbufs. But then `m = m_next;` (line 673) moves to the next segment. So the head's `next` is cleared before the loop, which is fine.
Inside the loop (line 707, 985), `m->next = NULL` is set for each segment if the flag is clear. This is also correct.
**Conclusion:** The logic is actually correct. The flag `NIX_TX_OFFLOAD_MBUF_NOFF_F` means "do NOT clear next" (no-off = no reset). The code skips the clear when the flag is set. This is consistent.
**Retracted.** This is correct.
2. **`rte_io_wmb()` called unconditionally at cn9k_tx.h:1089**
```c
if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F))
rte_io_wmb();
```
The write barrier is only needed if the mbuf reset was performed. This is correct--if the reset is skipped, no barrier is needed. However, the barrier appears **after** the segment loop and before the LMTLINE writes. The placement is correct for ensuring mbuf field writes are visible before DMA descriptor submission.
**Retracted.** This is correct.
### Warnings
None.
### Info
The patch removes unconditional `mbuf->next = NULL` assignments in the Rx path (cn9k_rx.h:412, 831-834) and makes the Tx path mbuf reset conditional on the `MBUF_NOFF_F` flag. This is consistent with the stated goal of `rte_mbuf_raw_free_bulk()` compliance but introduces a new offload flag whose definition is not visible in this patch. Ensure the flag is documented and that the fast-free offload (`RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE`) does not conflict with this reset behavior.
---
## Patch 02/17: common/cnxk: add API of SA valid for cn20k platform
**Summary:** Adds `cnxk_ow_ipsec_inb_sa_valid()` and `cnxk_ow_ipsec_outb_sa_valid()` internal APIs for CN20K IPsec SA validation.
### Errors
None.
### Warnings
1. **Redundant `!!` operator** (cnxk_security.c:612, 619)
```c
return !!sa->w2.s.valid;
```
The `valid` field is already a 1-bit bitfield (0 or 1). The double negation `!!` is unnecessary. Bitfields of width 1 are unsigned and have values 0 or 1. The explicit cast to `bool` return type handles this automatically.
**Suggested fix:**
```c
return sa->w2.s.valid;
```
### Info
The functions are marked `RTE_EXPORT_INTERNAL_SYMBOL` which is correct for internal API. No documentation is provided but internal APIs do not require Doxygen in `.c` files.
---
## Patch 03/17: common/cnxk: additional NIX SQ ctx fields prints
**Summary:** Adds debug prints for CN20K NIX SQ context fields (`update_sq_count`, `sq_count_iova`, and the value at `sq_count_iova`).
### Errors
None.
### Warnings
1. **Potential NULL pointer dereference** (roc_nix_debug.c:596)
```c
sq_cnt_ptr = (int64_t *)(uintptr_t)(ctx->sq_count_iova << 3);
if (sq_cnt_ptr && ctx->update_sq_count)
nix_dump(file, "sq_count value \t\t0x%" PRIx64 "",
plt_atomic_load_explicit(sq_cnt_ptr, plt_memory_order_relaxed));
```
The `sq_count_iova` field is a 53-bit IOVA (physical address), not a host virtual address. Dereferencing it directly via `(int64_t *)(uintptr_t)` is only safe if:
- The IOVA is identity-mapped (true on some platforms), or
- The pointer has been mapped via `rte_mem_iova2virt()` or similar.
If `sq_count_iova` is a device-only IOVA, this is a segfault.
**Suggested fix:**
Use `rte_mem_iova2virt()` to translate the IOVA to a VA, and validate the result:
```c
if (ctx->update_sq_count && ctx->sq_count_iova) {
sq_cnt_ptr = rte_mem_iova2virt(ctx->sq_count_iova << 3);
if (sq_cnt_ptr != RTE_BAD_IOVA)
nix_dump(file, "sq_count value \t\t0x%" PRIx64 "",
plt_atomic_load_explicit(sq_cnt_ptr, plt_memory_order_relaxed));
}
```
### Info
The atomic load uses `plt_memory_order_relaxed` which is appropriate for a debug read (no ordering needed).
---
## Patch 04/17: common/cnxk: update NIX irq handler
**Summary:** Moves queue context dump and register print before interrupt clear in the NIX IRQ handler.
### Errors
None.
### Warnings
1. **Log message includes queue number but not in all cases** (roc_nix_irq.c:171)
The error message now includes the queue number:
```c
plt_err("Failed execute irq get queue=%d off=0x%x", q, off);
```
But the original message at line 171 used `off` which is a register offset, not a queue number. The `q` parameter is the queue index passed to `nix_lf_q_irq_get_and_clear()`. This is correct if `q` is always the queue number, but the function is also called with `rq` and `sq` which are modulo `nix->qints`. The queue number should be the original loop variable, not `q % qints`.
**Suggested fix:**
Pass the actual queue number to the function or reconstruct it from `qintx` and `q`:
```c
plt_err("Failed execute irq get qintx=%d q=%d off=0x%x", qintx, q, off);
```
### Info
Moving the dump before the interrupt clear ensures register state is captured before it changes. This is the correct order for debugging.
---
## Patch 05/17: common/cnxk: configure LSO mask for single segments
**Summary:** Sets `alt_ssf_set` and `alt_ssf_mask` for single-segment LSO flags.
### Errors
None.
### Warnings
None.
### Info
The patch sets:
```c
alt_flags.s.alt_ssf_set = 0;
alt_flags.s.alt_ssf_mask = 0xFFFF;
```
This means "do not set any bits, but mask all 16 bits" for single-segment LSO. The mask of `0xFFFF` preserves existing flags. This is consistent with the multi-segment `alt_msf_*` fields set on line 245-246.
---
## Patch 06/17: net/cnxk: reserve memory for lookup mem at probe
**Summary:** Calls `cnxk_nix_fastpath_lookup_mem_get()` during probe to allocate shared lookup memory.
### Errors
1. **Error path missing `rc` assignment** (cnxk_ethdev.c:2224)
```c
if (!cnxk_nix_fastpath_lookup_mem_get()) {
plt_err("Failed to reserve lookup memory rc=%d", rc);
goto dev_fini;
}
```
The function `cnxk_nix_fastpath_lookup_mem_get()` returns a pointer (memory address or NULL). The error message prints `rc` but `rc` is uninitialized at this point (last assignment was line 2221 for the meta pool callback, which succeeded). This prints garbage.
**Suggested fix:**
```c
if (!cnxk_nix_fastpath_lookup_mem_get()) {
plt_err("Failed to reserve lookup memory");
rc = -ENOMEM;
goto dev_fini;
}
```
### Warnings
None.
---
## Patch 07/17: drivers: add support for devargs skip size
**Summary:** Adds `skip_size` devargs to configure the number of bytes to skip before the Ethernet header in packet parsing.
### Errors
None.
### Warnings
1. **Memzone leak on driver unload** (roc_npc.c:425, 476)
The memzone `SKIP_SIZE_PKIND_MEMZONE` is allocated in `roc_npc_init()` (line 425):
```c
if (!plt_memzone_lookup(SKIP_SIZE_PKIND_MEMZONE)) {
mz = plt_memzone_reserve_cache_align(SKIP_SIZE_PKIND_MEMZONE, ...);
if (mz != NULL)
memset(mz->addr, 0, ...);
}
```
And freed in `roc_npc_fini()` (line 473):
```c
mz = plt_memzone_lookup(SKIP_SIZE_PKIND_MEMZONE);
if (mz)
plt_memzone_free(mz);
```
However, if multiple devices call `roc_npc_init()`, only the first allocates the memzone (due to the `!plt_memzone_lookup()` check). When the first device calls `roc_npc_fini()`, it frees the memzone, but other devices may still be using it. This is a use-after-free.
**Suggested fix:**
Use reference counting or allocate per-device memzones. Alternatively, make the memzone global and free it only on the last device cleanup.
2. **Potential array out-of-bounds in `skip_size_pkind_get()`** (roc_nix_ops.c:514)
```c
if (cfg->count >= NPC_SKIP_SIZE_PKIND_MAX) {
plt_err("skip_size PKIND limit (%d) reached", NPC_SKIP_SIZE_PKIND_MAX);
return -ENOSPC;
}
i = cfg->count;
cfg->entries[i].skip_size = skip_size;
cfg->entries[i].pkind = NPC_RX_SKIP_SIZE_PKIND + i;
*pkind = cfg->entries[i].pkind;
cfg->count++;
```
If `cfg->count == NPC_SKIP_SIZE_PKIND_MAX` (4), the check passes (`>=` not `>`), then `i = 4` which is out of bounds for `cfg->entries[4]` (array size is 4, valid indices 0-3).
**Suggested fix:**
Change `>=` to `>=` is correct, but the increment `cfg->count++` must happen **before** the check. Actually, the check `cfg->count >= NPC_SKIP_SIZE_PKIND_MAX` should be **before** the array access. Let me re-read:
The check is:
```c
if (cfg->count >= NPC_SKIP_SIZE_PKIND_MAX)
return -ENOSPC;
```
This is correct: if count is 4 (max), return error. Then `i = cfg->count` (which is 0-3) is safe. **Retracted.** The logic is correct.
### Info
The devargs `skip_size_info` accepts a hex value `0x0` to `0xff` (line 426). The doc says "number of bytes to skip" which matches the implementation.
---
## Patch 08/17: net/cnxk: update inbound SA pkind for skip size
**Summary:** Updates the inbound SA `pkind` field using `roc_npc_skip_size_pkind_get()` during session create/update for CN10K and CN20K.
### Errors
None.
### Warnings
1. **Inconsistent return value handling** (cn10k_ethdev_sec.c:856, cn20k_ethdev_sec.c:868)
```c
rc = roc_npc_skip_size_pkind_get(&dev->npc);
if (rc >= 0)
inb_sa_dptr->w0.s.pkind = rc;
```
The function `roc_npc_skip_size_pkind_get()` returns `-1` on error (no skip_size configured) or a pkind value `>= 0`. The code only updates the SA if `rc >= 0`. This is correct, but if an error occurs (e.g., invalid skip_size), the SA retains its default pkind. This may be intentional (skip_size is optional), but should be documented.
**Suggested fix:**
Add a comment:
```c
/* Optional: update pkind if skip_size is configured */
rc = roc_npc_skip_size_pkind_get(&dev->npc);
if (rc >= 0)
inb_sa_dptr->w0.s.pkind = rc;
```
---
## Patch 09/17: common/cnxk: fix cnxk xstats names
**Summary:** Fixes `roc_nix_xstats_names_get()` to prevent buffer overruns when the caller provides a smaller array than required.
### Errors
None.
### Warnings
None.
### Info
The fix follows the standard xstats pattern: if `xstats_names == NULL` or `count >= limit`, return the required count without writing. This matches the ethdev API contract.
---
## Patch 10/17: common/cnxk: fix event type for soft expiry
**Summary:** Initializes `tmp[0]` to `~0ULL` before calling `work_cb` for inline soft expiry events.
### Errors
More information about the test-report
mailing list