|SUCCESS| [7/7] app/test/test_pmd_perf: skip if no device available

dpdklab at iol.unh.edu dpdklab at iol.unh.edu
Sat May 30 03:04:03 CEST 2026


Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/164619

_AI Code Review_

=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-30

# DPDK Patch Review

Author: Stephen Hemminger <stephen at networkplumber.org>

This review covers a patch series titled "app/test/reciprocal_division: make it a fast test" and related test improvements.

## PATCH 1/7: app/test/reciprocal_division: make it a fast test

> +#define MAX_ITERATIONS	(1ULL << 20)
> +#define DIVISORS_RANDOM 64

Does the new MAX_ITERATIONS provide adequate test coverage when reduced from 1ULL << 32? While making this a fast test is valuable, testing only 2^20 iterations might miss edge cases that occurred in the fuller sweep.

> +	for (unsigned int di = 0; di < n_div; di++) {
> +		uint32_t d;
> +		if (di < RTE_DIM(edge_div_u32))
> +			d = edge_div_u32[di]; /* Make sure and test the edge cases */
> +		else
> +			d = rte_rand_max(UINT32_MAX - 1) + 1;

Does rte_rand_max(UINT32_MAX - 1) correctly generate the full range of divisors, given potential issues with random number generation at boundary values?

> +		for (unsigned int k = 0; k < MAX_ITERATIONS; k++) {
> +			uint32_t q = rte_rand_max(qmax);
> +			uint32_t val = q * d;           /* fits in u32 */
> +
> +			/* Check around the value.
> +			 * Under and overflow of 32 bit value are fine here.
> +			 */
> +			if (test_u32_divide(val - 1, d, r) < 0 ||
> +			    test_u32_divide(val, d, r) < 0 ||
> +			    test_u32_divide(val + 1, d, r) < 0)
> +				return -1;

When d equals 1, qmax becomes 0x100000000, which cannot fit in uint32_t. Does rte_rand_max handle this edge case correctly when passed a value beyond UINT32_MAX?

> +static int
> +test_reciprocal_u64_small(void)
> +{
> +	/* 64-bit division with a 32-bit-range divisor */
> +	uint64_t divisor_u64 = (rte_rand() >> 32) | 1;

Does the expression (rte_rand() >> 32) | 1 generate divisors in the intended 32-bit range, or should this use a different method to ensure proper range coverage?

> +REGISTER_FAST_TEST(reciprocal_division_autotest, NOHUGE_OK, ASAN_OK, test_reciprocal);

The old test registered as REGISTER_PERF_TEST. Does changing to REGISTER_FAST_TEST properly reflect the test's purpose in CI versus the separate perf test?

## PATCH 2/7: app/test/reciprocal_division_perf: reduce test time

> -#define MAX_ITERATIONS	(1ULL << 32)
> -#define DIVIDE_ITER	(1ULL << 28)
> +#define MAX_ITERATIONS	(1ULL << 24)
> +#define DIVIDE_ITER	(1ULL << 10)

Does reducing DIVIDE_ITER from 2^28 to 2^10 maintain meaningful performance measurement stability, given the reduction by a factor of 262,144?

## PATCH 3/7: app/test/mempool_perf: size mempool by tested cores

> +		ret = TEST_SKIPPED;
>  		goto err;

Does this test properly clean up allocated resources before returning TEST_SKIPPED in all failure paths?

> -	mp_nocache = rte_mempool_create("perf_test_nocache", MEMPOOL_SIZE,
> +	unsigned int mempool_size = cores *
> +		(MAX_KEEP + RTE_MEMPOOL_CACHE_MAX_SIZE * 2) - 1;

Does this mempool_size calculation avoid overflow when cores is large?

> -	if (do_all_mempool_perf_tests(1) < 0)
> -		goto err;
> +	ret = do_all_mempool_perf_tests(1);
> +	if (ret != 0)
> +		return ret;

Does this code properly handle the difference between TEST_SKIPPED and TEST_FAILED return values from do_all_mempool_perf_tests()?

## PATCH 4/7: app/test/mempool_perf: drop constant-values replay

[ ... ]

> -				/* replay test with constant values */
> -				if (n_get_bulk == n_put_bulk) {
> -					use_constant_values = 1;
> -					ret = launch_cores(mp, cores);
> -					if (ret < 0)
> -						return -1;
> -				}

The commit message states the use_constant_values infrastructure is retained for local testing. Can the unused code paths be triggered accidentally, or should they be removed entirely?

## PATCH 5/7: app/test/mempool_perf: scale down for high core counts

> -#define TIME_S 1
> +#define TIME_MS 200

Does TIME_MS maintain sufficient measurement accuracy across different CPU speeds and cache configurations when reduced from 1000ms to 200ms?

> -	while (time_diff/hz < TIME_S) {
> +	while (time_diff < hz * TIME_MS / 1000) {

Does the multiplication hz * TIME_MS risk overflow on systems with very high timer frequencies?

## PATCH 6/7: app/test/test_rcu_qsbr_perf: call quiescent more often

> -		/* Update quiescent state counter */
> -		rte_rcu_qsbr_quiescent(temp, thread_id);
> +			rte_rcu_qsbr_quiescent(temp, thread_id);

Does calling rte_rcu_qsbr_quiescent() inside the inner loop instead of after all iterations properly represent realistic RCU usage patterns?


More information about the test-report mailing list