<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 9/18/23 17:44, Adam Hassick wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAC-YWqgjU5ja3h4V0ewU5Fh8-mX=DSCjjzgwK_H7-n+8kvpdOw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>Hi Andrew and Konstantin,<br>
<br>
</div>
Thank you for adding the tester-dial feature. This
opens up the possibility for us to do CI-integrated
testing in the future.<br>
<br>
</div>
Our Mellanox pass rate is similar to yours (about
2400 passing, 4400 failing); however, our Intel pass
rates are far worse.<br>
</div>
<div>I will try running tests on the XL710 with the
trc-tags argument set and see if it improves the pass
rate.<br>
</div>
Another thing I noticed in the results you uploaded is
that the results are tagged with vfio-pci and not i40e.</div>
<div>Though in the environment dump, the driver on the
test machine and the DUT are set to use the i40e driver.
Is this important at all?<br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I think there is a misunderstanding here. There are two kinds of driver
in the configuration: the net driver and the so-called DPDK driver.<br>
The net driver is the Linux kernel network device driver used on the Tester
side.<br>
The DPDK driver is the Linux kernel driver the device is bound to in order to use it
with DPDK. So, it is NOT a driver inside DPDK (drivers/net/*).<br>
In the case of a bifurcated driver (like mlx5_core), it is the same in
both cases.<br>
In the non-bifurcated case, the DPDK driver is some UIO driver (vfio-pci,
uio-pci-generic or igb_uio).<br>
Some expectations depend on the UIO driver used. For example, uio-pci-generic
does not support many interrupts (used by usecases/rx_intr test
cases).<br>
That's why we care about the corresponding TRC tag.<br>
<br>
TE_ENV_*_DPDK_DRIVER variables should be vfio-pci in the Intel 710
case, or uio-pci-generic if IOMMU is turned off on the corresponding
machines and the Linux distro does not support VFIO no-IOMMU mode.<br>
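<br>
As a rough illustration of the rule above, here is a minimal Python sketch. The function name and its boolean parameters are hypothetical and are not part of TE or ts-rigs; it only shows which value one would expect to put into the TE_ENV_*_DPDK_DRIVER variables.<br>
<pre>
```python
def expected_dpdk_driver(net_driver, bifurcated, iommu_enabled,
                         vfio_noiommu_supported):
    """Return the kernel driver a device should be bound to for use with DPDK.

    All parameters are illustrative assumptions, not TE configuration names.
    """
    if bifurcated:
        # Bifurcated drivers (e.g. mlx5_core) serve both the kernel and DPDK.
        return net_driver
    if iommu_enabled or vfio_noiommu_supported:
        return "vfio-pci"
    # Fallback when IOMMU is off and VFIO no-IOMMU mode is unavailable.
    return "uio-pci-generic"


print(expected_dpdk_driver("i40e", False, True, False))       # vfio-pci
print(expected_dpdk_driver("mlx5_core", True, True, False))   # mlx5_core
print(expected_dpdk_driver("i40e", False, False, False))      # uio-pci-generic
```
</pre>
With an XL710 (net driver i40e) and IOMMU enabled this yields vfio-pci, which matches the tag seen in the uploaded results.<br>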
<br>
Andrew.<br>
<br>
<blockquote type="cite"
cite="mid:CAC-YWqgjU5ja3h4V0ewU5Fh8-mX=DSCjjzgwK_H7-n+8kvpdOw@mail.gmail.com">
<div dir="ltr">
<div>
<div>
<div>There isn't anything preventing us from pushing our
results up to the existing Bublik instance running at <a
href="http://ts-factory.io" moz-do-not-send="true">ts-factory.io</a>
that I can think of at the moment.<br>
</div>
<div>We will have to work out how to submit our results to
your Bublik instance in a controlled and secure manner in
that case.<br>
</div>
<div>As far as I know we won't need access controls for the
results themselves. I'll discuss this with Patrick and
will let you know once we confirm that it's fine.</div>
</div>
<div><br>
</div>
Thanks,<br>
</div>
Adam<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Sep 18, 2023 at
2:26 AM Andrew Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
moz-do-not-send="true" class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div>On 9/18/23 09:23, Konstantin Ushakov wrote:<br>
</div>
<blockquote type="cite">
<div style="font-family:sans-serif">
<div style="white-space:normal">
<p dir="auto">Hi Andrew,</p>
<p dir="auto">Should we always auto-assign the tags, or
do you not do it because it slows down TE startup (by a
few seconds)?</p>
</div>
</div>
</blockquote>
<br>
Tags are auto-assigned, but I guess the result differs in Adam's
case since the NIC is a bit different. The test below will help to
understand whether it is the root cause of the very different
expectations. If the pass rate is close to mine, I'll
simply update the TRC database to share expectations between my
NIC and the NIC used by Adam.<br>
<br>
<blockquote type="cite">
<div style="font-family:sans-serif">
<div style="white-space:normal">
<p dir="auto">Hi Adam,</p>
<p dir="auto">I second the question from
Andrew - happy to help you with the triage so that
we get to the same baseline. Do you have a good way
for us to share the logs? For example, uploading to
ts-factory if we add a strict permission system so that
they are not published, or any other way.</p>
<p dir="auto">Thanks, <br>
Konstantin</p>
<br>
<p dir="auto">On 18 Sep 2023, at 9:15, Andrew
Rybchenko wrote:</p>
</div>
<blockquote style="margin:0px 0px
5px;padding-left:5px;border-left:2px solid
rgb(119,119,119);color:rgb(119,119,119)">
<div
id="m_-792332968217640304733A56D0A-0ED3-47F6-99B4-35C92E41C2DA">
<div>Hi Adam,<br>
<br>
I've uploaded fresh testing results to <a
href="http://ts-factory.io" target="_blank"
moz-do-not-send="true">ts-factory.io</a> [1] to
be on the same page.<br>
<br>
I think I know why your results and mine on Intel
710 series NICs differ so much. The testing results
expectations database (dpdk-ethdev-ts/trc/*) is
filled in in terms of TRC tags, i.e. expectations
depend on the TRC tags discovered by helper scripts
when testing is started. These tags identify
various aspects of what is tested. Ideally,
expectations should be written in terms of the root
cause of the expected behaviour. If it is a driver
expectation, the driver tag should be used. If it is a
HW limitation, tags with PCI IDs should be used.
However, it is not always easy to classify it
correctly if you're not involved in driver
development. So, in our case, expectations for
Intel 710 series NICs are filled in in terms of PCI IDs. I
guess the PCI ID differs in your case and that's why
expectations filled in for my NIC do not apply to
your runs.<br>
<br>
Just try to add the following option when you run
on your Intel 710 series NIC in order to mimic mine and see
if it helps to achieve a better pass rate.<br>
--trc-tag=pci-8086-1572<br>
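<br>
To illustrate why a differing PCI ID tag matters, here is a simplified Python sketch. It does not reproduce the real TRC database format or its matching logic; the function, the expectation list, the result names and the placeholder tag pci-0000-0000 are all made up for illustration.<br>
<pre>
```python
def expected_result(expectations, run_tags, default="PASSED"):
    """Return the result of the first expectation whose tag the run carries."""
    for tag, result in expectations:
        if tag in run_tags:
            return result
    return default


# Hypothetical expectation written in terms of a PCI ID tag.
expectations = [("pci-8086-1572", "FAILED")]

# A run on a NIC with another PCI ID (placeholder tag) misses the expectation...
other_nic = {"pci-0000-0000", "vfio-pci"}
print(expected_result(expectations, other_nic))                      # PASSED

# ...unless the tag is added by hand, as with --trc-tag=pci-8086-1572.
print(expected_result(expectations, other_nic | {"pci-8086-1572"}))  # FAILED
```
</pre>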
<br>
BTW, the fresh TE tag <span>v1.21.0 has an improved
algorithm for choosing tests for the --tester-dial
option. It should have better coverage now.</span><br>
<br>
Andrew.<br>
<br>
[1] <a
href="https://ts-factory.io/bublik/v2/runs?startDate=2023-09-16&finishDate=2023-09-16&runData=&runDataExpr=&page=1"
target="_blank" moz-do-not-send="true">https://ts-factory.io/bublik/v2/runs?startDate=2023-09-16&finishDate=2023-09-16&runData=&runDataExpr=&page=1</a><br>
<br>
On 9/13/23 18:45, Andrew Rybchenko wrote:<br>
</div>
<blockquote type="cite">
<div>Hi Adam,<br>
<br>
I've pushed a new TE tag, v1.20.0, which supports a
new command-line option --tester-dial=NUM, where
NUM is from 0 to 100. It allows choosing the
percentage of tests to run. If you want a stable
set, you should pass --tester-random-seed=0 (or
another integer). It is the first sketch and we
have plans to improve it, but feedback would be
welcome.<br>
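<br>
For readers who want a feel for what a percentage dial with a fixed seed means, here is a minimal Python sketch. It is not the algorithm used by TE; the function and parameter names are hypothetical, and it only shows that the same seed reproduces the same subset.<br>
<pre>
```python
import random


def select_tests(tests, dial, seed=None):
    """Return about `dial` percent of `tests`, keeping their original order."""
    if dial not in range(0, 101):
        raise ValueError("dial must be an integer from 0 to 100")
    rng = random.Random(seed)  # a fixed seed gives the same subset every time
    count = round(len(tests) * dial / 100)
    chosen = set(rng.sample(range(len(tests)), count))
    return [test for index, test in enumerate(tests) if index in chosen]


tests = ["test_%02d" % n for n in range(20)]
subset = select_tests(tests, dial=25, seed=0)
print(len(subset))                                      # 5
print(subset == select_tests(tests, dial=25, seed=0))   # True
```
</pre>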
<br>
> Is it needed on the tester?<br>
<br>
It is hard to say if it is strictly required for
simple tests. However, it is better to update
Tester as well, since performance tests run DPDK
on Tester as well.<br>
<br>
> Are there any other manual setup steps for
these devices that I might be missing?<br>
<br>
I don't remember anything else.<br>
<br>
I think it is better to get down to details and
take a look at logs. I'm ready to help with it
and explain what's happening there. Maybe it
will help to understand if it is a problem with
setup/configuration.<br>
<br>
Text logs are not very convenient. Ideally, logs
should be imported to bublik; however, manual
runs do not provide all required artifacts right
now (Jenkins jobs generate all required
artifacts).<br>
Another option is the 'tmp_raw_log' file (it should be
packed to make it smaller), which could be
converted to various log formats.<br>
Would it be OK for you if I import your logs to
bublik at <a href="http://ts-factory.io"
target="_blank" moz-do-not-send="true">ts-factory.io</a>?
Or is it a problem that it is publicly
available?<br>
Would it help if we add authentication and
access control there?<br>
<br>
Andrew.<br>
<br>
On 9/8/23 17:57, Adam Hassick wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>Hi Andrew,<br>
<br>
</div>
I have a couple questions about needed
setup of the NICs for the ethdev test
suite.<br>
<br>
</div>
Our MCX5s and XL710s are failing the
checkup tests. The pass rate appears to
be much worse on the XL710s (40 of 73
tests failed, 3 passed unexpectedly).<br>
<br>
</div>
For the XL710s, I've updated the driver
and NVM versions to match the minimum
supported versions in the compatibility
matrix found on the DPDK documentation.
This did not change the failure rate much.<br>
</div>
For the MCX5s, I've installed the latest LTS
version of the OFED bifurcated driver on the
DUT. Is it needed on the tester?<br>
<br>
</div>
Are there any other manual setup steps for
these devices that I might be missing?<br>
<div>
<div>
<div>
<div>
<div>
<div>
<div><br>
</div>
<div>Thanks,<br>
</div>
<div>Adam<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Sep
6, 2023 at 11:00 AM Adam Hassick <<a
href="mailto:ahassick@iol.unh.edu"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">ahassick@iol.unh.edu</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>Hi Andrew,<br>
<br>
</div>
<div>Yes, I copied the X710
configs to set up XL710 configs.
I changed the environment
variable names from the X710
suffix to XL710 suffix in the
script, and forgot to change
them in the corresponding
environment file.<br>
</div>
</div>
That fixed the issue.<br>
<br>
</div>
I got the checkup tests working on the
XL710 now. Most of them are failing,
which leads me to believe this is an
issue with our testbed. Based on the
DPDK documentation for i40e, the
firmware and driver versions are much
older than what DPDK 22.11 LTS and
main prefer, so I'll try updating
those.<br>
<br>
</div>
For now I'm working on getting the XL710
checkup tests passing, and will pick up
getting the E810 configured properly
next. I'll let you know if I run into
any more issues in relation to the test
engine.<br>
<br>
</div>
<div>Thanks,<br>
</div>
<div>Adam<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed,
Sep 6, 2023 at 7:36 AM Andrew Rybchenko
<<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>Hi Adam,<br>
<br>
On 9/5/23 18:01, Adam Hassick wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>Hi Andrew,<br>
<br>
</div>
The compilation warning
issue is now resolved.
Again, thank you guys
for fixing this for us.
I can run the tests on
the Mellanox CX5s again,
however I'm running into
a couple new issues with
running the prologues on
the Intel cards.<br>
<br>
</div>
When running testing on
the Intel XL710s, I see
this error appear in the
log:<br>
<br>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">ERROR
prologue Environment
LIB 14:16:13.650<br>
Too few networks in
available configuration
(0) in comparison with
required (1)<br>
</blockquote>
<br>
</div>
This seems like a trivial
configuration error, perhaps
this is something I need to
set up in ts-rigs. I briefly
searched through the
examples there and didn't
see any mention of how to
set up a network.<br>
</div>
<div>I will attach this log
just in case you need more
information.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Unfortunately, the logs are insufficient to
understand it. I've pushed a new TE tag,
v1.19.0, which adds a log message with
TE_* environment variables.<br>
Most likely something is wrong with the
variables which are used as conditions
when available networks are defined in
ts-conf/cs/inc.net_cfg_pci_fns.yml:<br>
TE_PCI_INSTANCE_IUT_TST1<br>
TE_PCI_INSTANCE_IUT_TST1a<br>
TE_PCI_INSTANCE_TST1a_IUT<br>
TE_PCI_INSTANCE_TST1_IUT<br>
My guess is that you changed the naming a
bit, but a script like
ts-rigs-sample/scripts/iut.h1-x710 is
not included or not updated.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>There is a different error
when running on the Intel
E810s. It appears to me like
it starts DPDK, does some
configuration inside DPDK and
on the device, and then fails
to bring the device back up.
Since this error seems very
non-trivial, I will also
attach this log.<br>
</div>
</div>
</div>
</blockquote>
<br>
This one is a bit simpler. A few lines
after the first ERROR in the log I see the
following:<br>
WARN RCF DPDK 13:06:00.144<br>
ice_program_hw_rx_queue(): currently
package doesn't support RXDID (22)<br>
ice_rx_queue_start(): fail to program
RX queue 0<br>
ice_dev_start(): fail to start Rx
queue 0<br>
Device with port_id=0 already stopped<br>
<br>
It is stdout/stderr from the test agent
which runs DPDK. The same logs in plain
format are available in the ta.DPDK file.<br>
I'm not an expert here, but I vaguely
remember that E810 requires correct
firmware and DDP to be loaded.<br>
There is some information in
dpdk/doc/guides/nics/ice.rst.<br>
<br>
You can try to add
--dev-args=safe-mode-support=1
command-line option described there.<br>
<br>
Hope it helps,<br>
Andrew.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div><br>
</div>
Thanks,<br>
</div>
Adam<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On
Fri, Sep 1, 2023 at 3:59 AM
Andrew Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>Hi Adam,<br>
<br>
On 8/31/23 22:38, Adam
Hassick wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi Andrew,<br>
</div>
<div><br>
I have one additional
question as well: Does
the test engine support
running tests on two
ARMv8 test agents?</div>
<div><br>
</div>
<div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">1.
We'll sort out
warnings this week.
Thanks for heads up.<br>
</blockquote>
<div><br>
</div>
<div>Great. Let me know
when that's fixed.</div>
</div>
</div>
</blockquote>
<br>
Done. We also fixed a number
of warnings in TE.<br>
Also we fixed root test
package name to be consistent
with the repository name.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<div>Support for old LTS
branches was dropped
some time ago, but in
the future it is
definitely possible to
keep it for new LTS
branches. I think
22.11 is supported,
but I'm not sure about
older LTS releases.</div>
</blockquote>
<div><br>
</div>
<div>Good to know.<br>
<div> <br>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">2.
You can add
command-line option
--sanity to run
tests marked with
TEST_HARNESS_SANITY
requirement (see
dpdk-ethdev-ts/scripts/run.sh
and grep
TEST_HARNESS_SANITY
dpdk-ethdev-ts to
see which tests are
marked). Yes, there
is a space for
terminology
improvement here.
We'll do it.<br>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
Done. Now it is called
--checkup.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex"><br>
Also it takes a lot
of time because of
failures and tests
which wait for some
timeout.<br>
</blockquote>
</div>
<div><br>
</div>
<div>That makes sense to
me. We'll use the time
to complete tests on
virtio or the Intel
devices as a reference
for how long the tests
really take to
complete.<br>
</div>
<div>We will explore the
possibility of
periodically running
the sanity tests for
patches.<br>
</div>
</div>
</div>
</blockquote>
<br>
I'll double-check and let you
know how long entire TS runs
on Intel X710, E810, Mellanox
CX5 and virtio net. Just to
ensure that time observed in
your case looks the same.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div> <br>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">The
test harness can
provide coverage
reports based on
gcov, but I'm not
sure what you mean
by a "dial" to
control test
coverage. The provided
reports are intended
for humans to
analyze.<br>
</blockquote>
</div>
<div><br>
</div>
<div>The general idea is
to have some kind of
parameter on the test
suite, which could be
an integer ranging
from zero to ten, that
controls how many
tests are run based on
how important the test
is.<br>
<br>
</div>
<div>Similar to how some
command line
interfaces provide a
verbosity level
parameter (some number
of "-v" arguments) to
control the importance
of the information in
the log.<br>
</div>
The verbosity level zero
only prints very
important log messages,
while ten prints
everything.<br>
</div>
<div><br>
In much the same manner
as above, this "dial"
parameter controls what
tests are run and with
what parameters based on
how important those
tests and test parameter
combinations are.<br>
Coverage Level zero
tells the suite to run a
very basic set of
important tests, with
minimal
parameterization. This
mode would take only
~5-10 minutes to run.<br>
In contrast, Coverage
Level ten includes all
the edge cases, every
combination of test
parameters, everything
the test suite can do,
which takes the normal
several hours to run.<br>
The values 1 - 9 are
between those two
extremes, allowing the
user to get a gradient
of test coverage in the
results and to limit the
running time.<br>
<br>
</div>
Then we could, for
example, run the "run.sh"
with a level of 2 or 3 for
incoming patches that need
quick results, and with a
level of 10 for the less
often run periodic tests
performed on main or LTS
branches.<br>
</div>
</blockquote>
<br>
Understood now. Thanks a lot
for the idea. We'll discuss it
and come back.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>3. Yes,
a great many
tests on
Mellanox CX5
NICs report
unexpected
testing results.
Unfortunately, it
is time-consuming
to fill in the
expectations
database since
it is necessary
to analyze
testing results
and classify
whether each is a bug or
just an acceptable
aspect of behaviour.<br>
<br>
Bublik allows
comparing the results
of two runs. It
is useful for
humans, but still
not good for
automation.<br>
<br>
I have a local
patch for the mlx5
driver which
reports the maximum
Tx ring size. It
makes the pass rate
higher. It is a
problem for the test
harness that
mlx5 does not
report limits
right now.<br>
<br>
The pass rate on
Intel X710 is
about 92% on my
test rig. The pass
rate on virtio
net is 99% right
now and could easily
be brought to 100%
(just one thing
to fix in
expectations).<br>
<br>
I think logs
storage setup is
essential for
logs analysis.
Of course, you
can request HTML
logs when you
run tests
(--log-html=html)
or generate
after run using
dpdk-ethdev-ts/scripts/html-log.sh and open index.html in a browser, but
logs storage
makes it more
convenient.<br>
</div>
</div>
</blockquote>
<div><br>
We are interested in
setting up Bublik,
potentially as an
externally-facing
component, once we
have our process of
running the test
suite stabilized.</div>
<div>Once we are able
to run the test
suite again, I'll
see what the pass
rate is on our other
hardware.<br>
Good to know that it
isn't an issue with
our dev testbed
causing the high
fail rate.</div>
</div>
<div>
<div><br>
</div>
<div>For Intel
hardware, we have an
XL710 and an Intel
E810-C in our
development testbed.
Although they are
slightly different
devices, ideally the
pass rate will be
identical or
similar. I have yet
to set up a VM pair
for virtio, but we
will soon.<br>
</div>
<div><br>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">Latest
version of
test-environment
has examples of
our CGI scripts
which we use for
log storage (see
tools/log_server/README.md).<br>
<br>
Also all bits for
Jenkins setup are
available. See
dpdk-ethdev-ts/jenkins/README.md
and examples of
jenkins files in
ts-rigs-sample.<br>
</blockquote>
</div>
<div><br>
</div>
<div>Jenkins
integration, setting
up production rig
configurations, and
permanent log
storage will be our
next steps once I am
able to run the
tests again.<br>
</div>
<div>Unless there is
an easy way to have
meson not pass
"-Werror" into GCC.
Then I would be able
to run the test
suite.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Hopefully it is resolved now.<br>
<br>
I thought a bit more about
your use case for Jenkins. I'm
not 100% sure that the existing
pipelines are convenient for
your use case.<br>
Feel free to ask questions
when you are on it.<br>
<br>
Thanks,<br>
Andrew.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div><br>
</div>
<div>Thanks,<br>
</div>
<div>Adam<br>
</div>
<div><br>
</div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
<div>
<div><br>
On 8/29/23
17:02, Adam
Hassick wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>
<div>
<div>Hi
Andrew,<br>
<br>
</div>
That fix seems
to have
resolved the
issue, thanks
for the quick
turnaround
time on that
patch.<br>
</div>
<div>Now that
we have the
RCF timeout
issue
resolved,
there are a
few other
questions and
issues that we
have about the
tests
themselves.</div>
<br>
</div>
<div>1. The
test suite
fails to build
with a couple
warnings.<br>
</div>
<div><br>
</div>
<div>Below is
the stderr log
from
compilation:<br>
</div>
<br>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">FAILED:
<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a><br>
cc
-Ilib/76b5a35@@ts_dpdk_pmd@sta
-Ilib
-I../../lib
-I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall
-Winvalid-pch
-Werror -g
-D_GNU_SOURCE
-O0 -ggdb
-Wall -W -fPIC
-MD -MQ '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a>'
-MF '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d</a>'
-o '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a>'
-c
../../lib/dpdk_pmd_ts.c<br>
../../lib/dpdk_pmd_ts.c: In function
‘test_create_traffic_generator_params’:<br>
../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and
no format
arguments
[-Werror=format-security]<br>
5577 | rc
=
te_kvpair_add(result,
buf, mode);<br>
| ^~<br>
cc1: all
warnings being
treated as
errors<br>
ninja: build
stopped:
subcommand
failed.<br>
ninja:
Entering
directory `.'<br>
FAILED: <a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a><br>
cc
-Ilib/76b5a35@@ts_dpdk_pmd@sta
-Ilib
-I../../lib
-I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall
-Winvalid-pch
-Werror -g
-D_GNU_SOURCE
-O0 -ggdb
-Wall -W -fPIC
-MD -MQ '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a>'
-MF '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d</a>'
-o '<a
href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a>'
-c
../../lib/dpdk_pmd_ts.c<br>
../../lib/dpdk_pmd_ts.c: In function
‘test_create_traffic_generator_params’:<br>
../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and
no format
arguments
[-Werror=format-security]<br>
5577 | rc
=
te_kvpair_add(result,
buf, mode);<br>
| ^~<br>
cc1: all
warnings being
treated as
errors<br>
</blockquote>
<div>
<div>
<div><br>
</div>
<div>This
error wasn't
occurring last
week, which
was the last
time I ran the
tests.<br>
</div>
<div>The TE
host and the
DUT have GCC
v9.4.0
installed, and
the tester has
GCC v11.4.0
installed, if
this
information is
helpful.<br>
</div>
<div><br>
</div>
<div>2. On the
Mellanox CX5s,
there are over
6,000 tests
run, which
collectively
take around 9
hours. Is it
possible, and
would it make
sense, to
lower the test
coverage and
have the test
suite run
faster?<br>
<br>
</div>
<div>For some
context, we
run immediate
testing on
incoming
patches for
DPDK main and
development
branches, as
well as
periodic test
runs on the
main, stable,
and LTS
branches.<br>
</div>
<div>For us to
consider
including this
test suite as
part of our
immediate
testing on
patches, we
would have to
reduce the
test coverage
to the most
important
tests.<br>
This is
primarily to
reduce the
testing time
to, for
example, less
than 30
minutes.
Testing on
patches can't
take too long
because the
lab can
receive
numerous
patches each
day, which
each require
individual
testing runs.<br>
<br>
</div>
<div>At what
frequency we
run these
tests, and on
what, still
needs to be
discussed with
the DPDK
community, but
it would be
nice to know
if the test
suite had a
"dial" to
control the
testing
coverage.<br>
</div>
<div><br>
</div>
<div>3. We see
a lot of test
failures on
our Mellanox
CX5 NICs.
Around 2,300
of ~6,600
tests passed.
Is there
anything we
can do to
diagnose these
test failures?<br>
</div>
<div><br>
</div>
<div>Thanks,<br>
</div>
<div>Adam<br>
</div>
<div><br>
</div>
</div>
</div>
</div>
<br>
<div
class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Tue, Aug 29, 2023 at 8:07 AM Andrew Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>Hi Adam,<br>
<br>
I've pushed
the fix in
main branch
and a new tag
v1.18.1. It
should solve
the problem
with IPv6
address from
DNS.<br>
<br>
Andrew.<br>
<br>
On 8/29/23
00:05, Andrew
Rybchenko
wrote:<br>
</div>
<blockquote
type="cite">
<div>Hi Adam,<br>
<br>
> Does the
test engine
prefer to use
IPv6 over IPv4
for initiating
the RCF
connection to
the test bed
hosts? And if
so, is there a
way to force
it to use
IPv4?<br>
<br>
Brilliant
idea. If DNS
returns both
IPv4 and IPv6
addresses in
your case, I
guess it is
the root cause
of the
problem.<br>
Of course, it
is a TE problem
since I see
really weird
code in
lib/comm_net_engine/comm_net_engine.c
line 135.<br>
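<br>
The idea of preferring IPv4 when DNS returns both address families can be sketched in Python as below. This is not the code in comm_net_engine.c and not the actual fix; the function name is hypothetical and the addresses are documentation-range examples.<br>
<pre>
```python
import socket


def prefer_ipv4(addrinfo):
    """Order getaddrinfo()-style entries so that IPv4 ones come first."""
    # sorted() is stable, so entries keep their relative order per family.
    return sorted(addrinfo, key=lambda entry: entry[0] != socket.AF_INET)


# Entries shaped like the tuples returned by socket.getaddrinfo().
sample = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 23571, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 23571)),
]
print(prefer_ipv4(sample)[0][4][0])  # 192.0.2.1
```
</pre>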
<br>
I've pushed a
fix to the
branch
user/arybchik/fix_ipv4_only
in the
ts-factory/test-environment
repository.
Please try it.<br>
<br>
It is a late-night
fix with
minimal
testing and no
review. I'll
pass it
through the review
process
tomorrow and
hopefully it
will be
released in
one or two days.<br>
<br>
Andrew.<br>
<br>
On 8/28/23
18:02, Adam
Hassick wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>
<div>
<div>Hi
Andrew,<br>
<br>
</div>
We have yet to
notice a
distinct
pattern with
the failures.
Sometimes, the
RCF will start
and connect
without issue
a few times in
a row before
failing to
connect again.
Once the issue
begins to
occur, neither
rebooting all
of the hosts
(test engine
VM, tester,
IUT) or
deleting all
of the build
directories
(suites,
agents, inst)
and rebooting
the hosts
afterward
resolves the
issue. When it
begins working
again seems
very arbitrary
to us.<br>
<br>
</div>
<div>I do
usually try to
terminate the
test engine
with Ctrl+C,
but when it
hangs while
trying to
start RCF,
that does not
work.<br>
</div>
<div><br>
</div>
<div>Does the
test engine
prefer to use
IPv6 over IPv4
for initiating
the RCF
connection to
the test bed
hosts? And if
so, is there a
way to force
it to use
IPv4?<br>
<br>
</div>
<div> - Adam<br>
</div>
</div>
</div>
<br>
<div
class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Fri, Aug 25, 2023 at 1:35 PM Andrew Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>> I'll
double-check
test engine on
Ubuntu 20.04
and Ubuntu
22.04.<br>
<br>
Done. It works
fine for me
without any
issues.<br>
<br>
Have you
noticed any
pattern in when
it works or
does not work?<br>
Maybe it is a
problem of an
unclean state
after
termination?<br>
Does it work
fine the first
time after the
DUTs reboot?<br>
How do you
terminate
testing? It
should be done
using Ctrl+C
in the terminal
where you
execute the run.sh
command.<br>
In this case
it should
shut down
gracefully and
close all test
agents and
engine
applications.<br>
<br>
(I'm trying to
understand why
you've seen
many test
agent
processes. It
should not
happen.)<br>
<br>
Andrew.<br>
<br>
On 8/25/23
17:41, Andrew
Rybchenko
wrote:<br>
</div>
<blockquote
type="cite">
<div>On
8/25/23 17:06,
Adam Hassick
wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>
<div>Hi
Andrew,<br>
<br>
</div>
Two of our
systems (the
Test Engine
runner and the
DUT host) are
running Ubuntu
20.04 LTS,
however this
morning I
noticed that
the tester
system (the
one having
issues) is
running Ubuntu
22.04 LTS.<br>
</div>
<div>This
could be the
source of the
problem. I
encountered a
dependency
issue trying
to run the
Test Engine on
22.04 LTS, so
I downgraded
the system.
Since the
tester is also
the host
having
connection
issues, I will
try
downgrading
that system to
20.04, and see
if that
changes
anything.<br>
</div>
</div>
</blockquote>
<br>
Unlikely, but
who knows. We
run tests
(DUTs) on
Ubuntu 20.04,
Ubuntu 22.04,
Ubuntu 22.10,
Ubuntu 23.04,
Debian 11 and
Fedora 38
every night.<br>
Right now
Debian 11 is
used for test
engine in
nightly
regressions.<br>
<br>
I'll
double-check
test engine on
Ubuntu 20.04
and Ubuntu
22.04.<br>
<br>
<blockquote
type="cite">
<div dir="ltr">
<div>I did try
passing in the
"--vg-rcf"
argument to
the run.sh
script of the
test suite
after
installing
valgrind, but
there was no
additional
output that I
saw.<br>
</div>
</div>
</blockquote>
<br>
Sorry, I
should have said:
valgrind
output should
be in
valgrind.te_rcf
(in the directory
where you run the
test engine).<br>
<br>
<blockquote
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I will
try pulling in
the changes
you've pushed
up, and will
see if that
fixes
anything.<br>
<br>
</div>
<div>Thanks,<br>
</div>
<div>Adam<br>
</div>
</div>
<br>
<div
class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Fri, Aug 25, 2023 at 9:57 AM Andrew Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a>>
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div>Hello
Adam,<br>
<br>
On 8/24/23
23:54, Andrew
Rybchenko
wrote:<br>
</div>
<blockquote
type="cite">I'd
like to try to
repeat the
problem
locally. Which
Linux distro
is running on
test engine
and agents?<br>
<br>
In fact I know
one problem
with Debian 12
and Fedora 38
and we have<br>
patch in
review to fix
it, however,
the behaviour
is different
in<br>
this case, so it is unlikely to be the same problem.<br>
</blockquote>
<br>
I've just
published a
new tag which
fixes known
test engine
side problems
on Debian 12
and Fedora 38.<br>
<br>
<blockquote
type="cite"><br>
One more idea
is to install
valgrind on
the test
engine host
and<br>
run with
option
--vg-rcf to
check if
something
weird is
happening.<br>
<br>
What I don't
understand
right now is
why I see just
one failed
attempt<br>
to connect in
your log.txt
and then
Logger
shutdown after
9 minutes.<br>
<br>
Andrew.<br>
<br>
On 8/24/23
23:29, Adam
Hassick wrote:<br>
<blockquote
type="cite"> >
Is there any
firewall in
the network or
on test hosts
which could
block incoming
TCP connection
to the port
23571
from the host
where you run
test engine?<br>
<br>
Our test
engine host
and the
testbed are on
the same
subnet. The
connection
does work
sometimes.<br>
<br>
> If
behaviour the
same on the
next try and
you see that
test agent is
kept running,
could you
check using<br>
><br>
> #
netstat -tnlp<br>
><br>
> that
Test Agent is
listening on
the port and
try to
establish TCP
connection
from test
agent using<br>
><br>
> $ telnet
iol-dts-tester.dpdklab.iol.unh.edu 23571<br>
><br>
> and
check if TCP
connection
could be
established.<br>
<br>
I was able to
replicate the
same behavior
again, where
it hangs while
RCF is trying
to start.<br>
Running this
command, I see
this in the
output:<br>
<br>
tcp 0 0 0.0.0.0:23571 0.0.0.0:* LISTEN 18599/ta<br>
<br>
So it seems
like it is
listening on
the correct
port.<br>
Additionally,
I was able to
connect to the
Tester machine
from our Test
Engine host
using telnet.
It printed the
PID of the
process once
the connection
was opened.<br>
<br>
I tried
running the
"ta"
application
manually on
the command
line, and it
didn't print
anything at
all.<br>
Maybe the
issue is
something on
the Test
Engine side.<br>
<br>
On Thu, Aug
24, 2023 at
2:35 PM Andrew
Rybchenko <<a
href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a> <a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank" moz-do-not-send="true"><mailto:andrew.rybchenko@oktetlabs.ru></a>>
wrote:<br>
<br>
Hi Adam,<br>
<br>
> On
the tester
host (which
appears to be
the Peer
agent), there<br>
are four
processes that
I see running,
which look
like the test<br>
agent
processes.<br>
<br>
Before the next try I'd recommend killing these processes.<br>
<br>
Is there any firewall in the network or on the test hosts which could<br>
block incoming TCP connections to port 23571 from the host<br>
where you run the test engine?<br>
<br>
If the behaviour is the same on the next try and you see that the test agent is<br>
kept
running, could
you check
using<br>
<br>
# netstat
-tnlp<br>
<br>
that Test
Agent is
listening on
the port and
try to
establish TCP<br>
connection
from test
agent using<br>
<br>
$ telnet iol-dts-tester.dpdklab.iol.unh.edu 23571<br>
<br>
and check
if TCP
connection
could be
established.<br>
<br>
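A minimal sketch of the same reachability check in Python, for cases where telnet is not installed. The helper name can_connect and the 5 second default are assumptions made for illustration; only the host name and port 23571 come from this thread.<br>
<pre>
```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```
</pre>
Calling can_connect("iol-dts-tester.dpdklab.iol.unh.edu", 23571) from the test engine host returns True only if the TCP handshake completes. It says nothing about whether RCF itself is healthy.<br>
<br>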
Another idea is to log in to the Tester as root, as testing does, get<br>
the TA start command from the log and try it by hand without -n and<br>
with the extra escaping removed.<br>
<br>
# sudo
PATH=${PATH}:/tmp/linux_x86_root_76872_1692885663_1<br>
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}/tmp/linux_x86_root_76872_1692885663_1
/tmp/linux_x86_root_76872_1692885663_1/ta Peer 23571
host=iol-dts-tester.dpdklab.iol.unh.edu:port=23571:user=root:key=/opt/tsf/keys/id_ed25519:ssh_port=22:copy_timeout=15:kill_timeout=15:sudo=:shell=<br>
<br>
Hopefully
in this case
test agent
directory
remains in the
/tmp and<br>
you don't
need to copy
it as testing
does.<br>
Maybe the output could shed some light on what's going on.<br>
<br>
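The last argument of the command above is a colon-separated list of key=value settings. A small Python sketch for pulling out individual values such as copy_timeout is shown here; the splitting rule is an assumption read off the string itself and is not the Test Engine's own parser, and parse_confstr is a hypothetical helper name.<br>
<pre>
```python
def parse_confstr(confstr):
    """Split a colon-separated list of key=value pairs into a dict."""
    params = {}
    for item in confstr.split(":"):
        if not item:
            continue
        # partition keeps an empty value for keys such as "sudo=".
        key, _, value = item.partition("=")
        params[key] = value
    return params


# The string is copied from the TA start command quoted in this message.
confstr = ("host=iol-dts-tester.dpdklab.iol.unh.edu:port=23571:user=root:"
           "key=/opt/tsf/keys/id_ed25519:ssh_port=22:copy_timeout=15:"
           "kill_timeout=15:sudo=:shell=")
params = parse_confstr(confstr)
print(params["copy_timeout"], params["kill_timeout"])
```
</pre>
This naive split would break if a value itself contained a colon, which is not the case in the string above.<br>
<br>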
Andrew.<br>
<br>
On 8/24/23
17:30, Adam
Hassick wrote:<br>
<blockquote
type="cite">
Hi Andrew,<br>
<br>
This is
the output
that I see in
the terminal
when this
failure<br>
occurs,
after the test
agent binaries
build and the
test engine<br>
starts:<br>
<br>
Platform
default build
- pass<br>
Simple RCF
consistency
check
succeeded<br>
--->>>
Starting
Logger...done<br>
--->>>
Starting
RCF...rcf_net_engine_connect():
Connection
timed<br>
out <a
href="http://iol-dts-tester.dpdklab.iol.unh.edu:23571"
target="_blank" moz-do-not-send="true">iol-dts-tester.dpdklab.iol.unh.edu:23571</a><br>
<br>
Then, it
hangs here
until I kill
the "te_rcf"
and "te_tee"<br>
processes.
I let it hang
for around 9
minutes.<br>
<br>
On the
tester host
(which appears
to be the Peer
agent), there
are<br>
four
processes that
I see running,
which look
like the test
agent<br>
processes.<br>
<br>
ta.Peer is
an empty file.
I've attached
the log.txt
from this run.<br>
<br>
- Adam<br>
<br>
On Thu,
Aug 24, 2023
at 4:22 AM
Andrew
Rybchenko<br>
<<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a><br>
<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank" moz-do-not-send="true"><mailto:andrew.rybchenko@oktetlabs.ru></a>>
wrote:<br>
<br>
Hi
Adam,<br>
<br>
Yes,
TE_RCFUNIX_TIMEOUT
is in seconds.
I've
double-checked<br>
that
it goes to
'copy_timeout'
in
ts-conf/rcf.conf.<br>
The description in
doc/sphinx/pages/group_te_engine_rcf.rst<br>
says
that
copy_timeout
is in seconds
and
implementation
in<br>
lib/rcfunix/rcfunix.c
passes the
value to
select()
tv_sec.<br>
Theoretically
select() could
be interrupted
by signal, but
I<br>
think
it is unlikely
here.<br>
<br>
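A minimal sketch of a select() wait with a timeout given in seconds, written in Python rather than taken from lib/rcfunix/rcfunix.c. The helper name wait_readable and the 0.2 second value are assumptions for illustration. One difference from C: there, select() can return early with EINTR when a signal arrives, whereas Python 3.5 and later retries the call automatically.<br>
<pre>
```python
import select
import socket
import time


def wait_readable(sock, timeout_sec):
    """Wait up to timeout_sec seconds for sock to become readable."""
    start = time.monotonic()
    # The fourth argument is the select() timeout, expressed in seconds.
    readable, _, _ = select.select([sock], [], [], timeout_sec)
    return bool(readable), time.monotonic() - start


# A connected pair with no data in flight, so the wait runs to the timeout.
left, right = socket.socketpair()
ready, elapsed = wait_readable(left, 0.2)
print(ready, round(elapsed, 1))
left.close()
right.close()
```
</pre>
With nothing written to the pair, the call reports that the socket is not readable once the timeout has passed.<br>
<br>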
I'm not sure that I understand what you mean by RCF<br>
connection timeout. Does it happen on TE startup when RCF<br>
starts test agents? If so, TE_RCFUNIX_TIMEOUT could help. Or<br>
does it happen when tests are in progress, e.g. in the middle<br>
of a test? If so, TE_RCFUNIX_TIMEOUT is unrelated and most<br>
likely either the host with the test agent dies or the test agent itself<br>
crashes. It would be easier for me to classify it if you share<br>
text
log (log.txt,
full or just
corresponding
fragment with<br>
some
context). Also
content of
ta.DPDK or
ta.Peer file<br>
depending on
which agent
has problems
could shed
some light.<br>
Corresponding
files contain
stdout/stderr
of test
agents.<br>
<br>
Andrew.<br>
<br>
On
8/23/23 17:45,
Adam Hassick
wrote:<br>
<blockquote
type="cite">
Hi Andrew,<br>
<br>
I've
set up a test
rig repository
here, and have
created<br>
configurations
for our
development
testbed based
off of the<br>
examples.<br>
We've
been able to
get the test
suite to run
manually on<br>
Mellanox CX5
devices once.<br>
However, we
are running
into an issue
where, when
RCF starts,<br>
the
RCF connection
times out very
frequently. We
aren't sure<br>
why
this is the
case.<br>
It
works
sometimes, but
most of the
time when we
try to run<br>
the
test engine,
it encounters
this issue.<br>
I've
tried changing
the RCF port
by setting<br>
"TE_RCF_PORT=<some
port
number>"
and rebooting
the testbed<br>
machines.
Neither seems
to fix the
issue.<br>
<br>
It
also seems
like the
timeout takes
far longer
than 60<br>
seconds, even
when running
"export
TE_RCFUNIX_TIMEOUT=60"<br>
before
I try to run
the test
suite.<br>
I
assume the
unit for this
variable is
seconds?<br>
<br>
Thanks,<br>
Adam<br>
<br>
On
Mon, Aug 21,
2023 at
10:19 AM Adam
Hassick<br>
<<a
href="mailto:ahassick@iol.unh.edu" target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">ahassick@iol.unh.edu</a> <a
href="mailto:ahassick@iol.unh.edu"
target="_blank" moz-do-not-send="true"><mailto:ahassick@iol.unh.edu></a>>
wrote:<br>
<br>
Hi
Andrew,<br>
<br>
Thanks, I've
cloned the
example
repository and
will start<br>
setting up a
configuration
for our
development
testbed<br>
today. I'll
let you know
if I run into
any
difficulties<br>
or
have any
questions.<br>
<br>
-
Adam<br>
<br>
On
Sun, Aug 20,
2023 at
4:40 AM Andrew
Rybchenko<br>
<<a
href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a><br>
<a
href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank"
moz-do-not-send="true"><mailto:andrew.rybchenko@oktetlabs.ru></a>>
wrote:<br>
<br>
Hi Adam,<br>
<br>
I've published<br>
<a href="https://github.com/ts-factory/ts-rigs-sample"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/ts-factory/ts-rigs-sample</a>.<br>
Hopefully it will help to define your test rigs and<br>
successfully run some tests manually. Feel free to<br>
ask any questions and I'll answer here and try to<br>
update documentation.<br>
<br>
Meanwhile I'll prepare missing bits for steps (2) and<br>
(3).<br>
Hopefully everything is in place for step (4), but we<br>
need to make steps (2) and (3) first.<br>
<br>
Andrew.<br>
<br>
On 8/18/23 21:40, Andrew Rybchenko wrote:<br>
<blockquote
type="cite">
Hi Adam,<br>
<br>
> I've conferred with the rest of the team, and we<br>
think it would be best to move forward with mainly<br>
option B.<br>
<br>
OK, I'll provide the sample on Monday for you. It is<br>
almost ready right now, but I need to double-check<br>
it before publishing.<br>
<br>
Regards,<br>
Andrew.<br>
<br>
On 8/17/23 20:03, Adam Hassick wrote:<br>
<blockquote
type="cite">
Hi Andrew,<br>
<br>
I'm adding the CI mailing list to this<br>
conversation. Others in the community might find<br>
this conversation valuable.<br>
<br>
We do want to run testing on a regular basis. The<br>
Jenkins integration will be very useful for us, as<br>
most of our CI is orchestrated by Jenkins.<br>
I've conferred with the rest of the team, and we<br>
think it would be best to move forward with mainly<br>
option B.<br>
If you would like to know anything about our<br>
testbeds that would help you with creating an<br>
example ts-rigs repo, I'd be happy to answer any<br>
questions you have.<br>
<br>
We have multiple test rigs (we call these<br>
"DUT-tester pairs") that we run our existing<br>
hardware testing on, with differing network<br>
hardware and CPU architecture. I figured this might<br>
be an important detail.<br>
<br>
Thanks,<br>
Adam<br>
<br>
On Thu, Aug 17, 2023 at 11:44 AM Andrew Rybchenko<br>
<<a href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">andrew.rybchenko@oktetlabs.ru</a><br>
<a href="mailto:andrew.rybchenko@oktetlabs.ru"
target="_blank"
moz-do-not-send="true"><mailto:andrew.rybchenko@oktetlabs.ru></a>>
wrote:<br>
<br>
Greetings Adam,<br>
<br>
I'm happy to hear that you're trying to bring<br>
it up.<br>
<br>
As I understand, the final goal is to run it on<br>
a regular basis. So, we need to set it up properly<br>
from the very beginning.<br>
Bringing up all features consists of 4 steps:<br>
<br>
1. Create site-specific repository (we call it<br>
ts-rigs) which contains information about test<br>
rigs and other site-specific information like<br>
where to send mails, where to store logs etc.<br>
It is required for manual execution as well,<br>
since test rigs description is essential. I'll<br>
return to the topic below.<br>
<br>
2. Set up logs storage for automated runs.<br>
Basically it is disk space plus an apache2 web<br>
server with a few CGI scripts which help a lot to<br>
save disk space.<br>
<br>
3. Set up the Bublik web application, which provides<br>
a web interface to view testing results. Same as<br>
<a href="https://ts-factory.io/bublik"
target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">https://ts-factory.io/bublik</a><br>
<br>
4. Set up Jenkins to run tests regularly,<br>
save logs in log storage (2) and import them to<br>
bublik (3).<br>
<br>
We spent the last few months on our homework to make<br>
it simpler to bring up automated execution<br>
using Jenkins -<br>
<a href="https://github.com/ts-factory/te-jenkins"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/ts-factory/te-jenkins</a><br>
Corresponding bits in dpdk-ethdev-ts will be<br>
available tomorrow.<br>
<br>
Let's return to the step (1).<br>
<br>
Unfortunately there is no publicly available<br>
example of the ts-rigs repository since<br>
sensitive site-specific information is located<br>
there. But I'm ready to help you to create it<br>
for UNH. I see two options here:<br>
<br>
(A) I'll ask questions and based on your<br>
answers will create the first draft with my<br>
comments.<br>
<br>
(B) I'll make a template/example ts-rigs repo,<br>
publish it and you'll create UNH ts-rigs based<br>
on it.<br>
<br>
Of course, I'll help to debug and finally bring<br>
it up in any case.<br>
<br>
(A) is a bit simpler for me and you, but (B) is<br>
a bit more generic and will help other<br>
potential users to bring it up.<br>
We can combine (A)+(B). I.e. start from (A).<br>
What do you think?<br>
<br>
Thanks,<br>
Andrew.<br>
<br>
On 8/17/23 15:18, Konstantin Ushakov wrote:<br>
<blockquote
type="cite">
Greetings
Adam,<br>
<br>
<br>
Thanks for contacting us. I'm copying Andrew, who<br>
would be happy to help.<br>
<br>
Thanks,<br>
Konstantin<br>
<br>
<blockquote
type="cite">
On 16 Aug
2023, at
21:50, Adam
Hassick<br>
<a href="mailto:ahassick@iol.unh.edu"
target="_blank"
moz-do-not-send="true"><ahassick@iol.unh.edu></a><br>
<a href="mailto:ahassick@iol.unh.edu"
target="_blank"
moz-do-not-send="true"><mailto:ahassick@iol.unh.edu></a> wrote:<br>
<br>
<br>
Greetings Konstantin,<br>
<br>
I am in the process of setting up the DPDK<br>
Poll Mode Driver test suite as an addition to<br>
our testing coverage for DPDK at the UNH lab.<br>
<br>
I have some questions about how to set the<br>
test suite arguments.<br>
<br>
I have been able to configure the Test Engine<br>
to connect to the hosts in the testbed. The<br>
RCF, Configurator, and Tester all begin to<br>
run, however the prelude of the test suite<br>
fails to run.<br>
<br>
<a
href="https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters"
target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters</a>
<br>
<br>
The documentation mentions that there are<br>
several test parameters for the test suite,<br>
like for the IUT test link MAC, etc. These<br>
seem like they would need to be set somewhere<br>
to run many of the tests.<br>
<br>
I see in the Test Engine documentation, there<br>
are instructions on how to create new<br>
parameters for test suites in the Tester<br>
configuration, but there is nothing in the<br>
user guide or in the Tester guide for how to<br>
set the arguments for the parameters when<br>
running the test suite that I can find. I'm<br>
not sure if I need to write my own Tester<br>
config, or if I should be setting these in<br>
some other way.<br>
<br>
How should these values be set?<br>
<br>
I'm also not sure what environment<br>
variables/arguments are strictly necessary or<br>
which are optional.<br>
<br>
Regards,<br>
Adam<br>
<br>
-- *Adam Hassick*<br>
Senior Developer<br>
UNH InterOperability Lab<br>
<a href="mailto:ahassick@iol.unh.edu"
target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">ahassick@iol.unh.edu</a><br>
<a href="mailto:ahassick@iol.unh.edu"
target="_blank"
moz-do-not-send="true"><mailto:ahassick@iol.unh.edu></a><br>
<a href="http://iol.unh.edu" target="_blank"
moz-do-not-send="true">iol.unh.edu</a>
<a
href="https://www.iol.unh.edu/"
target="_blank" moz-do-not-send="true"><https://www.iol.unh.edu/></a><br>
+1 (603) 475-8248<br>
</blockquote>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
<br>
<br>
<br>
<br>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br
clear="all">
<br>
</blockquote>
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br
clear="all">
<br>
</blockquote>
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</blockquote>
</div>
<br clear="all">
<br>
</blockquote>
<br>
</blockquote>
<br>
</div>
</blockquote>
<div style="white-space:normal"> </div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</blockquote>
<br>
</body>
</html>