|SUCCESS| [v2] dts: add retry loop to trex traffic generation
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Mon May 18 22:25:24 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/164088
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-18
I'll review this DPDK patch for correctness bugs, coding style issues, API compliance, and guideline violations.
git sha: 20260518175424.253600-1-abailey at iol.unh.edu
Author: Andrew Bailey <abailey at iol.unh.edu>
Subject: dts: add retry loop to trex traffic generation
This patch adds a retry loop to handle intermittent link down issues in
TRex traffic generation by retrying the traffic start command up to 5
times with a 250ms delay between attempts.
> diff --git a/dts/framework/testbed_model/traffic_generator/trex.py b/dts/framework/testbed_model/traffic_generator/trex.py
> index 22cd20dea9..daf07355f2 100644
> --- a/dts/framework/testbed_model/traffic_generator/trex.py
> +++ b/dts/framework/testbed_model/traffic_generator/trex.py
> @@ -13,6 +13,7 @@
>
> from framework.config.node import OS, NodeConfiguration
> from framework.config.test_run import TrexTrafficGeneratorConfig
> +from framework.exception import SSHTimeoutError
> from framework.parser import TextParser
> from framework.remote_session.blocking_app import BlockingApp
> from framework.remote_session.python_shell import PythonShell
> @@ -220,7 +221,9 @@ def _create_packet_stream(self, packet: Packet) -> None:
> ]
> self._shell.send_command("\n".join(packet_stream))
>
> - def _send_traffic_and_get_stats(self, duration: float, send_mpps: float | None = None) -> str:
> + def _send_traffic_and_get_stats(
> + self, duration: float, send_mpps: float | None = None, retry_attempts: int = 5
> + ) -> str:
> """Send traffic and get TG Rx stats.
>
> Sends traffic from the TRex client's ports for the given duration.
> @@ -230,15 +233,31 @@ def _send_traffic_and_get_stats(self, duration: float, send_mpps: float | None =
> Args:
> duration: The traffic generation duration.
> send_mpps: The millions of packets per second for TRex to send from each port.
> + retry_attempts: The number of times to retry this command on failure.
> +
> + Raises:
> + SSHTimeoutError: If TRex fails to send traffic in the allotted attempts.
> """
> - if send_mpps:
> - self._shell.send_command(f"""{self.stl_client_name}.start(ports=[0, 1],
> - mult = '{send_mpps}mpps',
> - duration = {duration})""")
> - else:
> - self._shell.send_command(f"""{self.stl_client_name}.start(ports=[0, 1],
> - mult = '100%',
> - duration = {duration})""")
> + result = "link is DOWN"
> + attempt = 0
> +
> + while "link is DOWN" in result and attempt <= retry_attempts:
Does this code handle all failure cases correctly? The loop continues
while the string "link is DOWN" appears in result, but result is
initialized to this exact string. This means the loop will always execute
at least once, which is intended. However, does this correctly handle the
boundary condition?
At attempt == retry_attempts, the code checks and raises an exception, but
then continues to send the command and increments attempt to
retry_attempts + 1. Does this allow one extra attempt beyond the intended
retry_attempts value?
> + if attempt > 0:
> + self._logger.info(
> + f"Generate traffic command failed (attempt {attempt} out of {retry_attempts})"
> + )
> + time.sleep(0.25)
> + elif attempt == retry_attempts:
> + raise SSHTimeoutError("Failed to generate traffic on TREX traffic generator")
Does this elif condition execute at the correct time? When attempt equals
retry_attempts, the code raises an exception. But since this is inside the
while loop that checks attempt <= retry_attempts, and the if/elif are
mutually exclusive with attempt > 0, can attempt ever equal retry_attempts
here on the first iteration when attempt is 0?
Consider the flow: if attempt starts at 0 and equals retry_attempts (both
are 5 in the default case), neither the if (attempt > 0) nor the elif
(attempt == retry_attempts when attempt is 0) would be true initially.
Does this logic work as intended?
> + if send_mpps:
> + result = self._shell.send_command(f"""{self.stl_client_name}.start(ports=[0, 1],
> + mult = '{send_mpps}mpps',
> + duration = {duration})""")
> + else:
> + result = self._shell.send_command(f"""{self.stl_client_name}.start(ports=[0, 1],
> + mult = '100%',
> + duration = {duration})""")
> + attempt += 1
>
> time.sleep(duration)
Does the exception message accurately describe the failure? The code
raises SSHTimeoutError, but is this really a timeout error, or is it more
accurately a link status error? The actual failure is that the link
remains DOWN after retries, not that an SSH operation timed out.
More information about the test-report
mailing list