|SUCCESS| [RFC,3/3] fib: drop redundant tbl8 reservation counter
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Thu May 7 13:13:38 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/163783
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-07
I'll review this patch series focusing on correctness, coding style, and guideline compliance.
commit 51c96687deba79270c867ae37d01f51adfca6622
Author: Maxime Leroy <maxime at leroys.fr>
fib6: fix tbl8 reservation drift in trie
This series fixes a tbl8 reservation drift bug in the FIB6 trie
implementation where rsvd_tbl8s could wrap to UINT32_MAX and cause
spurious -ENOSPC failures.
> diff --git a/lib/fib/trie.c b/lib/fib/trie.c
> index fa5d9ec6b0..52f25d499c 100644
> --- a/lib/fib/trie.c
> +++ b/lib/fib/trie.c
> @@ -161,12 +161,6 @@ tbl8_alloc(struct rte_trie_tbl *dp, uint64_t nh)
> uint8_t *tbl8_ptr;
>
> tbl8_idx = tbl8_get(dp);
> -
> - /* If there are no tbl8 groups try to reclaim one. */
> - if (unlikely(tbl8_idx == -ENOSPC && dp->dq &&
> - !rte_rcu_qsbr_dq_reclaim(dp->dq, 1, NULL, NULL, NULL)))
> - tbl8_idx = tbl8_get(dp);
> -
> if (tbl8_idx < 0)
> return tbl8_idx;
> tbl8_ptr = get_tbl_p_by_idx(dp->tbl8,
This removal of retry-on-reclaim from tbl8_alloc() appears correct given
the pre-check now handles reclamation. However, does this change the
behavior when depth_diff underestimates the actual allocation need? The
old retry could recover from unexpected allocation failures, but that
safety net is now removed from the allocator itself.
> @@ -603,8 +597,15 @@ trie_modify(struct rte_fib6 *fib, const struct rte_ipv6_addr *ip,
> return 0;
> }
>
> - if ((depth > 24) && (dp->rsvd_tbl8s + depth_diff > dp->number_tbl8s))
> - return -ENOSPC;
> + if ((depth > 24) &&
> + (dp->tbl8_pool_pos + depth_diff > dp->number_tbl8s)) {
> + /* Reclaim deferred tbl8s before failing. */
> + if (dp->dq != NULL)
> + rte_rcu_qsbr_dq_reclaim(dp->dq, depth_diff,
> + NULL, NULL, NULL);
> + if (dp->tbl8_pool_pos + depth_diff > dp->number_tbl8s)
> + return -ENOSPC;
> + }
In trie_modify(), when rte_rcu_qsbr_dq_reclaim() is called with
depth_diff, does this code check whether the reclaim actually freed
enough tbl8 entries? The second check repeats the same condition, but
dp->tbl8_pool_pos is only updated by tbl8_get()/tbl8_put(). Does
rte_rcu_qsbr_dq_reclaim() guarantee that freed entries are immediately
reflected in tbl8_pool_pos, or could this still fail spuriously if the
DQ reclaim doesn't process depth_diff entries?
> @@ -614,15 +615,13 @@ trie_modify(struct rte_fib6 *fib, const struct rte_ipv6_addr *ip,
> if (parent != NULL) {
> rte_rib6_get_nh(parent, &par_nh);
> if (par_nh == next_hop)
> - goto successfully_added;
> + return 0;
> }
> ret = modify_dp(dp, rib, &ip_masked, depth, next_hop);
> if (ret != 0) {
> rte_rib6_remove(rib, &ip_masked, depth);
> return ret;
> }
> -successfully_added:
> - dp->rsvd_tbl8s += depth_diff;
> return 0;
The early return when par_nh == next_hop now bypasses modify_dp()
entirely. Does this preserve correctness? Previously, the goto still
incremented rsvd_tbl8s. Now neither rsvd_tbl8s nor modify_dp() is
executed. Could this leave dataplane state inconsistent if the parent
nexthop matches but the dataplane tables were not updated?
> @@ -641,8 +640,6 @@ trie_modify(struct rte_fib6 *fib, const struct rte_ipv6_addr *ip,
> if (ret != 0)
> return ret;
> rte_rib6_remove(rib, ip, depth);
> -
> - dp->rsvd_tbl8s -= depth_diff;
> return 0;
[ ... ]
> diff --git a/lib/fib/trie.h b/lib/fib/trie.h
> index c34cc2c057..b42a28f84e 100644
> --- a/lib/fib/trie.h
> +++ b/lib/fib/trie.h
> @@ -31,8 +31,7 @@
>
> struct rte_trie_tbl {
> uint32_t number_tbl8s; /**< Total number of tbl8s */
> - uint32_t rsvd_tbl8s; /**< Number of reserved tbl8s */
> - uint32_t cur_tbl8s; /**< Current cumber of tbl8s */
> + uint32_t cur_tbl8s; /**< Current number of tbl8s */
The typo fix in the comment is good. Does tbl8_pool_pos now serve the
same purpose as the removed rsvd_tbl8s, or is it tracking a different
metric? The commit message says tbl8_pool_pos is maintained by
tbl8_get()/tbl8_put(), but I don't see tbl8_pool_pos added to the
struct in this patch.
---
commit 309bfce729c90a9d78afe0ed8cd6ce7e2d62598e
Author: Maxime Leroy <maxime at leroys.fr>
test/fib6: add reproducer for tbl8 reservation drift
[ ... ]
> diff --git a/app/test/test_fib6.c b/app/test/test_fib6.c
> index fffb590dbf..85a6c0abc3 100644
> --- a/app/test/test_fib6.c
> +++ b/app/test/test_fib6.c
> @@ -25,6 +25,7 @@ static int32_t test_get_invalid(void);
> static int32_t test_lookup(void);
> static int32_t test_invalid_rcu(void);
> static int32_t test_fib_rcu_sync_rw(void);
> +static int32_t test_drift(void);
[ ... ]
> +static int32_t
> +test_drift(void)
> +{
> + struct rte_fib6_conf config = { 0 };
> + struct rte_fib6 *fib;
> + struct rte_ipv6_addr parent =
> + RTE_IPV6(0xfcde, 0, 0, 0, 0, 0, 0, 0);
> + struct rte_ipv6_addr child[3] = {
> + RTE_IPV6(0xfcde, 0, 0x6000, 0, 0, 0, 0, 0),
> + RTE_IPV6(0xfcde, 1, 0x6000, 0, 0, 0, 0, 0),
> + RTE_IPV6(0xfcde, 2, 0x6000, 0, 0, 0, 0, 0),
> + };
Does this initialization style comply with C99 designated initializers
as preferred in the coding style guide? The RTE_IPV6 macro expansion may
be acceptable, but should be verified for consistency with existing test
patterns.
> + ret = rte_fib6_add(fib, &parent, 28, 0xa);
> + RTE_TEST_ASSERT(ret == 0, "ADD /28 failed (ret=%d)\n", ret);
Does the test properly clean up the FIB on assertion failure? The
RTE_TEST_ASSERT macro may return immediately without calling
rte_fib6_free(), potentially leaking the FIB structure across test runs.
> + ret = rte_fib6_add(fib, &parent, 28, 0xe);
> + RTE_TEST_ASSERT(ret == 0,
> + "Fresh ADD /28 spuriously failed (ret=%d)\n", ret);
> +
> + ret = rte_fib6_delete(fib, &parent, 28);
> + RTE_TEST_ASSERT(ret == 0, "Final DEL /28 failed (ret=%d)\n", ret);
> +
> + rte_fib6_free(fib);
> + return TEST_SUCCESS;
> +}
The test correctly frees the FIB at the end, but all intermediate
RTE_TEST_ASSERT invocations bypass this cleanup.
---
commit f0f08a796e1554f9cfa58bd94ad5deb7e732098d
Author: Maxime Leroy <maxime at leroys.fr>
fib: drop redundant tbl8 reservation counter
[ ... ]
> diff --git a/lib/fib/dir24_8.c b/lib/fib/dir24_8.c
> index 489d2ef427..80215b93b0 100644
> --- a/lib/fib/dir24_8.c
> +++ b/lib/fib/dir24_8.c
> @@ -207,12 +207,6 @@ tbl8_alloc(struct dir24_8_tbl *dp, uint64_t nh)
> uint8_t *tbl8_ptr;
>
> tbl8_idx = tbl8_get_idx(dp);
> -
> - /* If there are no tbl8 groups try to reclaim one. */
> - if (unlikely(tbl8_idx == -ENOSPC && dp->dq &&
> - !rte_rcu_qsbr_dq_reclaim(dp->dq, 1, NULL, NULL, NULL)))
> - tbl8_idx = tbl8_get_idx(dp);
Does this removal match the trie path's change? The dir24_8 allocator
previously had a one-entry retry, but the new pre-check in dir24_8_modify()
also uses depth_diff=1 for reclaim. Is this consistent?
> @@ -504,9 +498,14 @@ dir24_8_modify(struct rte_fib *fib, uint32_t ip, uint8_t depth,
> tmp = rte_rib_get_nxt(rib, ip, 24, NULL,
> RTE_RIB_GET_NXT_COVER);
> if ((tmp == NULL) &&
> - (dp->rsvd_tbl8s >= dp->number_tbl8s))
> - return -ENOSPC;
> -
> + (dp->cur_tbl8s >= dp->number_tbl8s)) {
> + /* Reclaim deferred tbl8s before failing. */
> + if (dp->dq != NULL)
> + rte_rcu_qsbr_dq_reclaim(dp->dq, 1,
> + NULL, NULL, NULL);
> + if (dp->cur_tbl8s >= dp->number_tbl8s)
> + return -ENOSPC;
> + }
In dir24_8_modify(), the pre-check now uses cur_tbl8s instead of
rsvd_tbl8s. Does cur_tbl8s accurately reflect allocated tbl8s at this
point? The comment in the commit says rsvd_tbl8s mirrored cur_tbl8s, but
does the pre-check timing ensure cur_tbl8s is updated before this check?
> @@ -516,16 +515,13 @@ dir24_8_modify(struct rte_fib *fib, uint32_t ip, uint8_t depth,
> if (parent != NULL) {
> rte_rib_get_nh(parent, &par_nh);
> if (par_nh == next_hop)
> - goto successfully_added;
> + return 0;
> }
> ret = modify_fib(dp, rib, ip, depth, next_hop);
> if (ret != 0) {
> rte_rib_remove(rib, ip, depth);
> return ret;
> }
> -successfully_added:
> - if ((depth > 24) && (tmp == NULL))
> - dp->rsvd_tbl8s++;
> return 0;
Same question as in trie: does the early return when par_nh == next_hop
skip modify_fib() correctly, or should modify_fib() be called to ensure
dataplane consistency even when the nexthop matches the parent?
More information about the test-report
mailing list