<div dir="ltr"><div><div><div><div><div><div><div>Hi Andrew,<br><br></div>Thank you for taking a look at our log. Netplan was attempting to run DHCP on our test links, and additionally I discovered that the NIC firmware was transmitting LLDP packets, causing tests to fail in the same way. Now that these problems have been fixed, our pass rate on the XL710 is approximately 91%. Now that our test results are in line with yours, we can begin looking into setting up the production environment.</div></div><br></div>First, is it possible to run the test agent on ARM hosts? Our ARM testbeds have the best topology for running this test suite, with separate tester and DUT servers.<br><br></div>We are testing this test suite on two x86 development servers using the test suite's recommended server topology. In contrast, our existing x86 production testbeds which run DTS have a single server topology. This single server has both the tester NIC and the device under test NIC installed, with NUMA node separation between TRex and DPDK. We're going to test running the two test agent processes on the single-server testbeds if we cannot run this on ARM. Is there any reason you can think of that would prevent this setup from working?<br><br></div><div>Once we figure out where this can live in production, then we will begin setting up log storage, Jenkins integration, and Bublik.<br></div><div><br></div>Thanks,<br></div>Adam<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 5, 2023 at 6:25 AM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru">andrew.rybchenko@oktetlabs.ru</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <div>Hi Adam,<br>
      <br>
      > Do these default to vfio-pci?<br>
      <br>
      Yes, vfio-pci is the default.<br>
      However, it does not work in the case of Mellanox, which uses a
      bifurcated driver. It should be mlx5_core for Mellanox NICs.<br>
      <br>
      > Here is the text log from a run on our Intel XL710 NICs, with
      the expected result profile set to the X710.<br>
      <br>
      It is hard to analyze all tests using text logs, but I definitely
      see one common problem. Tests receive unexpected packets and fail
      because of it.<br>
      Tests are written very strictly in this respect, and this has
      paid off in the past when HW had bugs.<br>
      Are the DUT and tester connected back-to-back on the tested interfaces, or
      via a switch?<br>
      If via a switch, is it possible to isolate it from everything else?<br>
      If back-to-back, it could be some embedded SW/FW which generates
      these packets.<br>
      I definitely see unexpected DHCP packets.<br>
      <br>
      > We haven't set up the Jenkins integration yet, however if
      this is required to import the logs then we will prioritize that.<br>
      <br>
      Unfortunately manual runs do not generate all artifacts required
      to import logs. However, we have almost solved it right now.
      Hopefully we'll finalize it in a day or two. I'll let you know
      when these changes are available.<br>
      <br>
      Regards,<br>
      Andrew.<br>
      <br>
      On 10/4/23 16:48, Adam Hassick wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">
        <div>
          <div>
            <div>
              <div>Hi Andrew,<br>
                <br>
              </div>
              Ok, that makes sense. I don't see TE_ENV_H1/H2_DPDK_DRIVER
              set anywhere in the default configurations for the Intel
              X710. Do these default to vfio-pci?<br>
            </div>
            <div>We have IOMMU enabled on our development testbed, and
              should be able to bind vfio-pci.<br>
            </div>
            <div>Here is the text log from a run on our Intel XL710
              NICs, with the expected result profile set to the X710. We
              haven't set up the Jenkins integration yet, however if
              this is required to import the logs then we will
              prioritize that.<br>
            </div>
            <div class="gmail_chip gmail_drive_chip" style="width:396px;height:18px;max-height:18px;background-color:rgb(245,245,245);padding:5px;color:rgb(34,34,34);font-family:arial;font-style:normal;font-weight:bold;font-size:13px;border:1px solid rgb(221,221,221);line-height:1"><a href="https://drive.google.com/file/d/10N5JfxFMP7lNXDBgJeN_z-NL2JNUY7nJ/view?usp=drive_web" style="display:inline-block;max-width:366px;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;text-decoration:none;padding:1px 0px;border:medium" aria-label="log.txt.tar.gz" target="_blank"><img style="vertical-align: bottom; border: medium;" src="https://ssl.gstatic.com/docs/doclist/images/icon_10_generic_list.png"> <span dir="ltr" style="color:rgb(17,85,204);text-decoration:none;vertical-align:bottom">log.txt.tar.gz</span></a><img style="opacity: 0.55; float: right; display: none;"></div>
            <br>
          </div>
          Thanks,<br>
        </div>
        Adam<br>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Mon, Sep 18, 2023 at
          11:04 AM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div>
            <div>On 9/18/23 17:44, Adam Hassick wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">
                <div>
                  <div>
                    <div>
                      <div>
                        <div>
                          <div>
                            <div>Hi Andrew and Konstantin,<br>
                              <br>
                            </div>
                            Thank you for adding the tester-dial
                            feature, this opens up the possibility for
                            us to do CI integrated testing in the
                            future.<br>
                            <br>
                          </div>
                          Our Mellanox pass rate is similar to yours
                          (about 2400 passing, 4400 failing), however
                          our Intel pass rates are far worse.<br>
                        </div>
                        <div>I will try running tests on the XL710 with
                          the trc-tags argument set and see if it
                          improves the pass rate.<br>
                        </div>
                        Another thing I noticed in the results you
                        uploaded is that the results are tagged with
                        vfio-pci and not i40e.</div>
                      <div>Though in the environment dump, the driver on
                        the test machine and the DUT are set to use the
                        i40e driver. Is this important at all?<br>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </blockquote>
            <br>
            I think there is a misunderstanding here. There are two kinds
            of driver in the configuration: the net driver and the so-called DPDK
            driver.<br>
            The net driver is the Linux kernel network device driver used on
            the Tester side.<br>
            The DPDK driver is the Linux kernel driver a device is bound to in
            order to use it with DPDK. So, it is NOT a driver inside DPDK
            (drivers/net/*).<br>
            In the case of a bifurcated driver (like mlx5_core) it is the
            same in both cases.<br>
            In the non-bifurcated case the DPDK driver is some UIO
            driver (vfio-pci, uio-pci-generic or igb_uio).<br>
            Some expectations depend on the UIO driver used. For example,
            uio-pci-generic does not support many interrupts (used by
            usecases/rx_intr test cases).<br>
            That's why we care about the corresponding TRC tag.<br>
            <br>
            TE_ENV_*_DPDK_DRIVER variables should be vfio-pci in the Intel
            710 case, or uio-pci-generic if IOMMU is turned off on the
            corresponding machines and the Linux distro does not support
            VFIO no-IOMMU mode.<br>
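            <br>
            The rule of thumb above can be sketched as follows. This is
            only an illustration, not TE code: the function name
            choose_dpdk_driver and its arguments are hypothetical, and
            whether IOMMU and VFIO no-IOMMU mode are available has to be
            determined separately on each machine.<br>
<pre>
```python
def choose_dpdk_driver(net_driver, bifurcated, iommu_enabled, vfio_noiommu=False):
    """Return the kernel driver a NIC should be bound to for use with DPDK.

    Hypothetical helper (not part of TE) that encodes the rule described above.
    """
    if bifurcated:
        # A bifurcated driver (e.g. mlx5_core) is both the net driver
        # and the DPDK driver, so nothing has to be rebound.
        return net_driver
    if iommu_enabled or vfio_noiommu:
        # vfio-pci needs an IOMMU, or VFIO no-IOMMU mode in the kernel.
        return "vfio-pci"
    # Fallback when vfio-pci cannot be used at all.
    return "uio-pci-generic"


print(choose_dpdk_driver("i40e", bifurcated=False, iommu_enabled=True))
print(choose_dpdk_driver("mlx5_core", bifurcated=True, iommu_enabled=True))
```
</pre>
            The returned name is what would go into the
            TE_ENV_*_DPDK_DRIVER variables.<br>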
            <br>
            Andrew.<br>
            <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div>
                  <div>
                    <div>There isn't anything preventing us from pushing
                      our results up to the existing Bublik instance
                      running at <a href="http://ts-factory.io" target="_blank">ts-factory.io</a>
                      that I can think of at the moment.<br>
                    </div>
                    <div>We will have to work out how to submit our
                      results to your Bublik instance in a controlled
                      and secure manner in that case.<br>
                    </div>
                    <div>As far as I know we won't need access controls
                      for the results themselves. I'll discuss this with
                      Patrick and will let you know once we confirm that
                      it's fine.</div>
                  </div>
                  <div><br>
                  </div>
                  Thanks,<br>
                </div>
                Adam<br>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Mon, Sep 18, 2023
                  at 2:26 AM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  <div>
                    <div>On 9/18/23 09:23, Konstantin Ushakov wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div style="font-family:sans-serif">
                        <div style="white-space:normal">
                          <p dir="auto">Hi Andrew,</p>
                          <p dir="auto">Should we always auto-assign the
                            tags, or do you not do it because it slows down
                            (by some seconds) the TE startup?</p>
                        </div>
                      </div>
                    </blockquote>
                    <br>
                    Tags are auto-assigned, but I guess it differs in
                    Adam's case since the NIC is a bit different. The test
                    below will help to understand if it is the root cause of
                    the very different expectations. If the pass rate is
                    close to mine, I'll simply update the TRC database to
                    share expectations between my NIC and the NIC used by
                    Adam.<br>
                    <br>
                    <blockquote type="cite">
                      <div style="font-family:sans-serif">
                        <div style="white-space:normal">
                          <p dir="auto">Hi Adam,</p>
                          <p dir="auto">I think I second the question
                            from Andrew - happy to help you with the
                            triage so that we get to the same baseline.
                            Do you have a good way for us to share the
                            logs? E.g. upload them to ts-factory if we
                            add a strict permissions system so that they
                            are not published, or any other way.</p>
                          <p dir="auto">Thanks, <br>
                            Konstantin</p>
                          <br>
                          <p dir="auto">On 18 Sep 2023, at 9:15, Andrew
                            Rybchenko wrote:</p>
                        </div>
                        <blockquote style="margin:0px 0px 5px;padding-left:5px;border-left:2px solid rgb(119,119,119);color:rgb(119,119,119)">
                          <div id="m_3634070300823225067m_-4480488732519633353m_-792332968217640304733A56D0A-0ED3-47F6-99B4-35C92E41C2DA">
                            <div>Hi Adam,<br>
                              <br>
                              I've uploaded fresh testing results to <a href="http://ts-factory.io" target="_blank">ts-factory.io</a>
                              [1] to be on the same page.<br>
                              <br>
                              I think I know why your results and mine
                              on Intel 710 series NICs differ so much.
                              The testing results expectations database
                              (dpdk-ethdev-ts/trc/*) is filled in in
                              terms of TRC tags, i.e. expectations
                              depend on the TRC tags discovered by helper
                              scripts when testing is started. These
                              tags identify various aspects of what is
                              tested. Ideally expectations should be
                              written in terms of the root cause of the
                              expected behaviour. If it is a driver
                              expectation, the driver tag should be used.
                              If it is a HW limitation, tags with PCI IDs
                              should be used. However, it is not always
                              easy to classify it correctly if you're
                              not involved in driver development. So, in
                              our case expectations for the Intel 710
                              are filled in in terms of PCI IDs. I guess
                              the PCI ID differs in your case and that's why
                              expectations filled in for my NIC do not
                              apply to your runs.<br>
                              <br>
                              Just try adding the following option when
                              you run on your Intel 710 in order to
                              mimic mine, and see if it helps to achieve
                              a better pass rate.<br>
                              --trc-tag=pci-8086-1572<br>
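                              <br>
                              To compare PCI IDs between setups, a tag of
                              this form can be derived from the vendor and
                              device IDs that Linux exposes in sysfs (the
                              vendor and device files under
                              /sys/bus/pci/devices/, one directory per PCI
                              address). The sketch below is an illustration
                              only: pci_trc_tag is a hypothetical helper,
                              and the pci-VVVV-DDDD layout is inferred from
                              the single example above rather than taken
                              from the TE helper scripts.<br>
<pre>
```python
def pci_trc_tag(vendor_id, device_id):
    """Format a tag such as 'pci-8086-1572' from PCI vendor/device IDs.

    Accepts ints or hex strings as found in sysfs (e.g. '0x8086').
    Hypothetical helper; the tag layout is an assumption based on one example.
    """
    def as_int(value):
        # sysfs files contain text like '0x8086' followed by a newline.
        return value if isinstance(value, int) else int(value.strip(), 16)

    return f"pci-{as_int(vendor_id):04x}-{as_int(device_id):04x}"


print(pci_trc_tag("0x8086", "0x1572"))
```
</pre>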
                              <br>
                              BTW, the fresh TE tag <span>v1.21.0 has an
                                improved algorithm for choosing tests for
                                the --tester-dial option. It should have
                                better coverage now.</span><br>
                              <br>
                              Andrew.<br>
                              <br>
                              [1] <a href="https://ts-factory.io/bublik/v2/runs?startDate=2023-09-16&finishDate=2023-09-16&runData=&runDataExpr=&page=1" target="_blank">https://ts-factory.io/bublik/v2/runs?startDate=2023-09-16&finishDate=2023-09-16&runData=&runDataExpr=&page=1</a><br>
                              <br>
                              On 9/13/23 18:45, Andrew Rybchenko wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div>Hi Adam,<br>
                                <br>
                                I've pushed a new TE tag v1.20.0 which
                                supports a new command-line option
                                --tester-dial=NUM where NUM is from 0 to
                                100. It allows choosing the percentage of
                                tests to run. If you want a stable set,
                                you should pass --tester-random-seed=0
                                (or another integer). It is the first
                                sketch and we have plans to improve it,
                                but feedback would be welcome.<br>
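                                <br>
                                The idea of a dial plus a fixed seed can be
                                illustrated with a small sketch. It is not
                                TE's actual selection algorithm, which may
                                differ; pick_tests and its parameters are
                                hypothetical names, and it only shows why
                                a fixed seed yields the same subset on
                                every run.<br>
<pre>
```python
import random


def pick_tests(tests, dial, seed=None):
    """Pick roughly `dial` percent of `tests`, keeping their original order.

    Illustration only, not TE's algorithm. A fixed `seed` makes the
    chosen subset reproducible between runs.
    """
    if dial not in range(101):
        raise ValueError("dial must be an integer from 0 to 100")
    count = round(len(tests) * dial / 100)
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(tests)), count))
    return [test for index, test in enumerate(tests) if index in chosen]


all_tests = [f"test_{i}" for i in range(20)]
# The same seed gives the same half of the tests every time.
print(pick_tests(all_tests, 50, seed=0) == pick_tests(all_tests, 50, seed=0))
```
</pre>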
                                <br>
                                > Is it needed on the tester?<br>
                                <br>
                                It is hard to say if it is strictly
                                required for simple tests. However, it
                                is better to update Tester as well,
                                since performance tests run DPDK on
                                Tester as well.<br>
                                <br>
                                > Are there any other manual setup
                                steps for these devices that I might be
                                missing?<br>
                                <br>
                                I don't remember anything else.<br>
                                <br>
                                I think it is better to get down to
                                details and take a look at the logs. I'm
                                ready to help with it and explain what's
                                happening there. Maybe it will help to
                                understand if it is a problem with
                                setup/configuration.<br>
                                <br>
                                Text logs are not very convenient.
                                Ideally logs should be imported to
                                bublik, however, manual runs do not
                                provide all required artifacts right now
                                (Jenkins jobs generate all required
                                artifacts).<br>
                                Another option is the 'tmp_raw_log' file
                                (should be packed to make it smaller)
                                which could be converted to various log
                                formats.<br>
                                Would it be OK for you if I import your
                                logs to bublik at <a href="http://ts-factory.io" target="_blank">ts-factory.io</a>?
                                Or is it a problem that it is publicly
                                available?<br>
                                Would it help if we add authentication
                                and access control there?<br>
                                <br>
                                Andrew.<br>
                                <br>
                                On 9/8/23 17:57, Adam Hassick wrote:<br>
                              </div>
                              <blockquote type="cite">
                                <div dir="ltr">
                                  <div>
                                    <div>
                                      <div>
                                        <div>
                                          <div>Hi Andrew,<br>
                                            <br>
                                          </div>
                                          I have a couple of questions
                                          about the required setup of the NICs
                                          for the ethdev test suite.<br>
                                          <br>
                                        </div>
                                        Our MCX5s and XL710s are failing
                                        the checkup tests. The pass rate
                                        appears to be much worse on the
                                        XL710s (40 of 73 tests failed, 3
                                        passed unexpectedly).<br>
                                        <br>
                                      </div>
                                      For the XL710s, I've updated the
                                      driver and NVM versions to match
                                      the minimum supported versions in
                                      the compatibility matrix found on
                                      the DPDK documentation. This did
                                      not change the failure rate much.<br>
                                    </div>
                                    For the MCX5s, I've installed the
                                    latest LTS version of the OFED
                                    bifurcated driver on the DUT. Is it
                                    needed on the tester?<br>
                                    <br>
                                  </div>
                                  Are there any other manual setup steps
                                  for these devices that I might be
                                  missing?<br>
                                  <div>
                                    <div>
                                      <div>
                                        <div>
                                          <div>
                                            <div>
                                              <div><br>
                                              </div>
                                              <div>Thanks,<br>
                                              </div>
                                              <div>Adam<br>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                                <br>
                                <div class="gmail_quote">
                                  <div dir="ltr" class="gmail_attr">On
                                    Wed, Sep 6, 2023 at 11:00 AM Adam
                                    Hassick <<a href="mailto:ahassick@iol.unh.edu" target="_blank">ahassick@iol.unh.edu</a>>
                                    wrote:<br>
                                  </div>
                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                    <div dir="ltr">
                                      <div>
                                        <div>
                                          <div>
                                            <div>
                                              <div>Hi Andrew,<br>
                                                <br>
                                              </div>
                                              <div>Yes, I copied the
                                                X710 configs to set up
                                                XL710 configs. I changed
                                                the environment variable
                                                names from the X710
                                                suffix to XL710 suffix
                                                in the script, and
                                                forgot to change them in
                                                the corresponding
                                                environment file.<br>
                                              </div>
                                            </div>
                                            That fixed the issue.<br>
                                            <br>
                                          </div>
                                          I got the checkup tests
                                          working on the XL710 now. Most
                                          of them are failing, which
                                          leads me to believe this is an
                                          issue with our testbed. Based
                                          on the DPDK documentation for
                                          i40e, the firmware and driver
                                          versions are much older than
                                          what DPDK 22.11 LTS and main
                                          prefer, so I'll try updating
                                          those.<br>
                                          <br>
                                        </div>
                                        For now I'm working on getting
                                        the XL710 checkup tests passing,
                                        and will pick up getting the
                                        E810 configured properly next.
                                        I'll let you know if I run into
                                        any more issues in relation to
                                        the test engine.<br>
                                        <br>
                                      </div>
                                      <div>Thanks,<br>
                                      </div>
                                      <div>Adam<br>
                                      </div>
                                    </div>
                                    <br>
                                    <div class="gmail_quote">
                                      <div dir="ltr" class="gmail_attr">On
                                        Wed, Sep 6, 2023 at 7:36 AM
                                        Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                        wrote:<br>
                                      </div>
                                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                        <div>
                                          <div>Hi Adam,<br>
                                            <br>
                                            On 9/5/23 18:01, Adam
                                            Hassick wrote:<br>
                                          </div>
                                          <blockquote type="cite">
                                            <div dir="ltr">
                                              <div>
                                                <div>
                                                  <div>
                                                    <div>
                                                      <div>
                                                        <div>Hi Andrew,<br>
                                                          <br>
                                                        </div>
                                                        The compilation
                                                        warning issue is
                                                        now resolved.
                                                        Again, thank you
                                                        guys for fixing
                                                        this for us. I
                                                        can run the
                                                        tests on the
                                                        Mellanox CX5s
                                                        again, however
                                                        I'm running into
                                                        a couple new
                                                        issues with
                                                        running the
                                                        prologues on the
                                                        Intel cards.<br>
                                                        <br>
                                                      </div>
                                                      When running
                                                      testing on the
                                                      Intel XL710s, I
                                                      see this error
                                                      appear in the log:<br>
                                                      <br>
                                                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">ERROR
                                                         prologue
                                                         Environment LIB
                                                         14:16:13.650<br>
                                                        Too few networks
                                                        in available
                                                        configuration
                                                        (0) in
                                                        comparison with
                                                        required (1)<br>
                                                      </blockquote>
                                                      <br>
                                                    </div>
                                                    This seems like a
                                                    trivial
                                                    configuration error,
                                                    perhaps this is
                                                    something I need to
                                                    set up in ts-rigs. I
                                                    briefly searched
                                                    through the examples
                                                    there and didn't see
                                                    any mention of how
                                                    to set up a network.<br>
                                                  </div>
                                                  <div>I will attach
                                                    this log just in
                                                    case you need more
                                                    information.<br>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
                                          Unfortunately the logs are
                                          insufficient to understand it.
                                          I've pushed a new TE tag
                                          v1.19.0 which adds a log message
                                          with TE_* environment
                                          variables.<br>
                                          Most likely something is wrong
                                          with the variables which are used
                                          as conditions when available
                                          networks are defined in
                                          ts-conf/cs/inc.net_cfg_pci_fns.yml:<br>
                                          TE_PCI_INSTANCE_IUT_TST1<br>
                                          TE_PCI_INSTANCE_IUT_TST1a<br>
                                          TE_PCI_INSTANCE_TST1a_IUT<br>
                                          TE_PCI_INSTANCE_TST1_IUT<br>
                                          My guess is that you changed the
                                          naming a bit, but a script like
                                          ts-rigs-sample/scripts/iut.h1-x710
                                          is not included or not updated.<br>
                                          <br>
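                                          <div>A quick way to see which of
                                          the four variables listed above
                                          are actually set is a small
                                          diagnostic script. The following
                                          is only a sketch: the function
                                          name report_pci_instance_vars is
                                          hypothetical, and it assumes the
                                          script is run from an environment
                                          where the rig configuration has
                                          already been applied (how the
                                          variables get exported is not
                                          shown here).</div>
<pre>
```python
import os

# Variable names copied from the message above; nothing else is assumed
# about their expected values.
PCI_INSTANCE_VARS = [
    "TE_PCI_INSTANCE_IUT_TST1",
    "TE_PCI_INSTANCE_IUT_TST1a",
    "TE_PCI_INSTANCE_TST1a_IUT",
    "TE_PCI_INSTANCE_TST1_IUT",
]


def report_pci_instance_vars(environ=None):
    """Map each variable name to its value, or to None when unset or empty.

    Hypothetical helper for illustration only; it is not part of TE.
    """
    env = os.environ if environ is None else environ
    return {name: (env.get(name) or None) for name in PCI_INSTANCE_VARS}


if __name__ == "__main__":
    for name, value in report_pci_instance_vars().items():
        shown = value if value is not None else "(not set)"
        print(f"{name} = {shown}")
```
</pre>
                                          <div>Any variable reported as not
                                          set is a candidate for the naming
                                          mismatch described above.</div>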
                                          <blockquote type="cite">
                                            <div dir="ltr">
                                              <div>
                                                <div>There is a
                                                  different error when
                                                  running on the Intel
                                                  E810s. It appears to
                                                  me like it starts
                                                  DPDK, does some
                                                  configuration inside
                                                  DPDK and on the
                                                  device, and then fails
                                                  to bring the device
                                                  back up. Since this
                                                  error seems very
                                                  non-trivial, I will
                                                  also attach this log.<br>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
                                          This one is a bit simpler. A few
                                          lines after the first ERROR in
                                          the log, I see the following:<br>
                                          WARN  RCF  DPDK  13:06:00.144<br>
                                          ice_program_hw_rx_queue():
                                          currently package doesn't
                                          support RXDID (22)<br>
                                          ice_rx_queue_start(): fail to
                                          program RX queue 0<br>
                                          ice_dev_start(): fail to start
                                          Rx queue 0<br>
                                          Device with port_id=0 already
                                          stopped<br>
                                          <br>
                                          This is stdout/stderr from the
                                          test agent which runs DPDK. The
                                          same logs in plain format are
                                          available in the ta.DPDK file.<br>
                                          I'm not an expert here, but I
                                          vaguely remember that the E810
                                          requires the correct firmware and
                                          DDP package to be loaded.<br>
                                          There is some information in
                                          dpdk/doc/guides/nics/ice.rst.<br>
                                          <br>
                                          You can try to add
                                          --dev-args=safe-mode-support=1
                                          command-line option described
                                          there.<br>
                                          <br>
                                          Hope it helps,<br>
                                          Andrew.<br>
                                          <br>
                                          <blockquote type="cite">
                                            <div dir="ltr">
                                              <div>
                                                <div><br>
                                                </div>
                                                Thanks,<br>
                                              </div>
                                              Adam<br>
                                            </div>
                                            <br>
                                            <div class="gmail_quote">
                                              <div dir="ltr" class="gmail_attr">On
                                                Fri, Sep 1, 2023 at
                                                3:59 AM Andrew Rybchenko
                                                <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                wrote:<br>
                                              </div>
                                              <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                <div>
                                                  <div>Hi Adam,<br>
                                                    <br>
                                                    On 8/31/23 22:38,
                                                    Adam Hassick wrote:<br>
                                                  </div>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div>Hi Andrew,<br>
                                                      </div>
                                                      <div><br>
                                                        I have one
                                                        additional
                                                        question as
                                                        well: Does the
                                                        test engine
                                                        support running
                                                        tests on two
                                                        ARMv8 test
                                                        agents?</div>
                                                      <div><br>
                                                      </div>
                                                      <div>
                                                        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">1.
                                                          We'll sort out
                                                          warnings this
                                                          week. Thanks
                                                          for heads up.<br>
                                                        </blockquote>
                                                        <div><br>
                                                        </div>
                                                        <div>Great. Let
                                                          me know when
                                                          that's fixed.</div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                  Done. We also fixed a
                                                  number of warnings in
                                                  TE.<br>
                                                  We also fixed the root
                                                  test package name to
                                                  be consistent with the
                                                  repository name.<br>
                                                  <br>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                        <div>Support for
                                                          old LTS
                                                          branches was
                                                          dropped some
                                                          time ago, but
                                                          in the future
                                                          it is
                                                          definitely
                                                          possible to
                                                          keep it for
                                                          new LTS
                                                          branches. I
                                                          think 22.11 is
                                                          supported, but
                                                          I'm not sure
                                                          about older
                                                          LTS releases.</div>
                                                      </blockquote>
                                                      <div><br>
                                                      </div>
                                                      <div>Good to know.<br>
                                                        <div> <br>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">2.
                                                          You can add
                                                          command-line
                                                          option
                                                          --sanity to
                                                          run tests
                                                          marked with
                                                          TEST_HARNESS_SANITY
                                                          requirement
                                                          (see
                                                          dpdk-ethdev-ts/scripts/run.sh
                                                          and grep
                                                          TEST_HARNESS_SANITY
                                                          dpdk-ethdev-ts
                                                          to see which
                                                          tests are
                                                          marked). Yes,
                                                          there is room
                                                          for
                                                          terminology
                                                          improvement
                                                          here. We'll do
                                                          it.<br>
                                                          </blockquote>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                  Done. Now it is called
                                                  --checkup.<br>
                                                  <br>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div>
                                                        <div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
                                                          Also it takes
                                                          a lot of time
                                                          because of
                                                          failures and
                                                          tests which
                                                          wait for some
                                                          timeout.<br>
                                                          </blockquote>
                                                        </div>
                                                        <div><br>
                                                        </div>
                                                        <div>That makes
                                                          sense to me.
                                                          We'll use the
                                                          time to
                                                          complete tests
                                                          on virtio or
                                                          the Intel
                                                          devices as a
                                                          reference for
                                                          how long the
                                                          tests really
                                                          take to
                                                          complete.<br>
                                                        </div>
                                                        <div>We will
                                                          explore the
                                                          possibility of
                                                          periodically
                                                          running the
                                                          sanity tests
                                                          for patches.<br>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                  I'll double-check and
                                                  let you know how long
                                                  the entire TS takes to
                                                  run on Intel X710, E810,
                                                  Mellanox CX5 and
                                                  virtio net, just to
                                                  make sure that the time
                                                  observed in your case
                                                  is similar.<br>
                                                  <br>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div>
                                                        <div> <br>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The
                                                          test harness
                                                          can provide
                                                          coverage
                                                          reports based
                                                          on gcov, but
                                                          I'm not sure
                                                          what you mean
                                                          by a "dial" to
                                                          control test
                                                          coverage.
                                                          The reports
                                                          provided are
                                                          meant for a
                                                          human to
                                                          analyze.<br>
                                                          </blockquote>
                                                        </div>
                                                        <div><br>
                                                        </div>
                                                        <div>The general
                                                          idea is to
                                                          have some kind
                                                          of parameter
                                                          on the test
                                                          suite, which
                                                          could be an
                                                          integer
                                                          ranging from
                                                          zero to ten,
                                                          that controls
                                                          how many tests
                                                          are run based
                                                          on how
                                                          important the
                                                          test is.<br>
                                                          <br>
                                                        </div>
                                                        <div>Similar to
                                                          how some
                                                          command line
                                                          interfaces
                                                          provide a
                                                          verbosity
                                                          level
                                                          parameter
                                                          (some number
                                                          of "-v"
                                                          arguments) to
                                                          control the
                                                          importance of
                                                          the
                                                          information in
                                                          the log.<br>
                                                        </div>
                                                        The verbosity
                                                        level zero only
                                                        prints very
                                                        important log
                                                        messages, while
                                                        ten prints
                                                        everything.<br>
                                                      </div>
                                                      <div><br>
                                                        In much the same
                                                        manner as above,
                                                        this "dial"
                                                        parameter
                                                        controls what
                                                        tests are run
                                                        and with what
                                                        parameters based
                                                        on how important
                                                        those tests and
                                                        test parameter
                                                        combinations
                                                        are.<br>
                                                        Coverage Level
                                                        zero tells the
                                                        suite to run a
                                                        very basic set
                                                        of important
                                                        tests, with
                                                        minimal
                                                        parameterization.
                                                        This mode would
                                                        take only ~5-10
                                                        minutes to run.<br>
                                                        In contrast,
                                                        Coverage Level
                                                        ten includes all
                                                        the edge cases,
                                                        every
                                                        combination of
                                                        test parameters,
                                                        everything the
                                                        test suite can
                                                        do, which takes
                                                        the normal
                                                        several hours to
                                                        run.<br>
                                                        The values 1 - 9
                                                        are between
                                                        those two
                                                        extremes,
                                                        allowing the
                                                        user to get a
                                                        gradient of test
                                                        coverage in the
                                                        results and to
                                                        limit the
                                                        running time.<br>
                                                        <br>
                                                      </div>
                                                      Then we could, for
                                                      example, run the
                                                      "run.sh" with a
                                                      level of 2 or 3
                                                      for incoming
                                                      patches that need
                                                      quick results, and
                                                      with a level of 10
                                                      for the less often
                                                      run periodic tests
                                                      performed on main
                                                      or LTS branches.<br>
                                                    </div>
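                                                    <div>The coverage
                                                    level idea described
                                                    above can be
                                                    illustrated with a
                                                    short sketch. This is
                                                    not how the test
                                                    suite or run.sh
                                                    works today; the
                                                    names CaseEntry,
                                                    min_level and
                                                    select_tests, and the
                                                    example test names,
                                                    are all hypothetical.
                                                    It only assumes that
                                                    each test could be
                                                    tagged with the
                                                    lowest level (0 to
                                                    10) at which it
                                                    should run.</div>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseEntry:
    """Hypothetical record for one test (or one parameter combination)."""
    name: str
    min_level: int  # lowest coverage level (0..10) at which this entry runs


def select_tests(tests, level):
    """Return the entries that should run at the requested coverage level.

    Level 0 keeps only the most important entries; level 10 keeps everything.
    """
    if level not in range(0, 11):
        raise ValueError("coverage level must be an integer from 0 to 10")
    # An entry runs when its min_level is anywhere from 0 up to the level.
    return [t for t in tests if t.min_level in range(0, level + 1)]


if __name__ == "__main__":
    # Illustrative entries only; these are not real test names.
    suite = [
        CaseEntry("basic_link_up", 0),
        CaseEntry("rx_tx_single_queue", 2),
        CaseEntry("all_offload_combinations", 10),
    ]
    for level in (0, 3, 10):
        names = [t.name for t in select_tests(suite, level)]
        print(level, names)
```
</pre>
                                                    <div>With this
                                                    shape, a patch run
                                                    would pass a low
                                                    level and a periodic
                                                    run on main or LTS
                                                    would pass 10, as in
                                                    the example above.</div>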
                                                  </blockquote>
                                                  <br>
                                                  Understood now. Thanks
                                                  a lot for the idea.
                                                  We'll discuss it and
                                                  come back.<br>
                                                  <br>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div>
                                                        <div>
                                                          <div> </div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                          <div>
                                                          <div>3. Yes,
                                                          a great many
                                                          tests on
                                                          Mellanox CX5
                                                          NICs report
                                                          unexpected
                                                          results.
                                                          Unfortunately,
                                                          it is
                                                          time-consuming
                                                          to fill in the
                                                          expectations
                                                          database, since
                                                          it is
                                                          necessary to
                                                          analyze the
                                                          test results
                                                          and classify
                                                          each one as a
                                                          bug or just an
                                                          acceptable
                                                          aspect of
                                                          behaviour.<br>
                                                          <br>
                                                          Bublik allows
                                                          comparing the
                                                          results of two
                                                          runs. That is
                                                          useful for a
                                                          human, but
                                                          still not good
                                                          for
                                                          automation.<br>
                                                          <br>
                                                          I have a local
                                                          patch for the
                                                          mlx5 driver
                                                          which reports
                                                          the maximum Tx
                                                          ring size. It
                                                          makes the pass
                                                          rate higher.
                                                          It is a
                                                          problem for
                                                          the test
                                                          harness that
                                                          mlx5 does not
                                                          report limits
                                                          right now.<br>
                                                          <br>
                                                          The pass rate
                                                          on the Intel
                                                          X710 is about
                                                          92% on my test
                                                          rig. The pass
                                                          rate on virtio
                                                          net is 99%
                                                          right now and
                                                          could easily
                                                          be brought to
                                                          100% (just one
                                                          thing to fix in
                                                          expectations).<br>
                                                          <br>
                                                          I think log
                                                          storage setup
                                                          is essential
                                                          for log
                                                          analysis. Of
                                                          course, you
                                                          can request
                                                          HTML logs when
                                                          you run tests
                                                          (--log-html=html)
                                                          or generate
                                                          them after the
                                                          run using
                                                          dpdk-ethdev-ts/scripts/html-log.sh
                                                          and open
                                                          index.html in
                                                          a browser, but
                                                          log storage
                                                          makes it more
                                                          convenient.<br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div><br>
                                                          We are
                                                          interested in
                                                          setting up
                                                          Bublik,
                                                          potentially as
                                                          an
                                                          externally-facing
                                                          component,
                                                          once we have
                                                          our process of
                                                          running the
                                                          test suite
                                                          stabilized.</div>
                                                          <div>Once we
                                                          are able to
                                                          run the test
                                                          suite again,
                                                          I'll see what
                                                          the pass rate
                                                          is on our
                                                          other
                                                          hardware.<br>
                                                          Good to know
                                                          that it isn't
                                                          an issue with
                                                          our dev
                                                          testbed
                                                          causing the
                                                          high fail
                                                          rate.</div>
                                                        </div>
                                                        <div>
                                                          <div><br>
                                                          </div>
                                                          <div>For Intel
                                                          hardware, we
                                                          have an XL710
                                                          and an Intel
                                                          E810-C in our
                                                          development
                                                          testbed.
                                                          Although they
                                                          are slightly
                                                          different
                                                          devices,
                                                          ideally the
                                                          pass rate will
                                                          be identical
                                                          or similar. I
                                                          have yet to
                                                          set up a VM
                                                          pair for
                                                          virtio, but we
                                                          will soon.<br>
                                                          </div>
                                                          <div><br>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The latest
                                                          version of
                                                          test-environment
                                                          has examples
                                                          of the CGI
                                                          scripts that
                                                          we use for log
                                                          storage (see
                                                          tools/log_server/README.md).<br>
                                                          <br>
                                                          Also, all the
                                                          bits for Jenkins
                                                          setup are
                                                          available. See
dpdk-ethdev-ts/jenkins/README.md and the example Jenkins files in
                                                          ts-rigs-sample.<br>
                                                          </blockquote>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>Jenkins
                                                          integration,
                                                          setting up
                                                          production rig
configurations, and permanent log storage will be our next steps once I
                                                          am able to run
                                                          the tests
                                                          again.<br>
                                                          </div>
                                                          <div>That is,
                                                          unless there is
                                                          an easy way to
                                                          stop meson from
                                                          passing "-Werror"
                                                          to GCC, in which
                                                          case I would be
                                                          able to run
                                                          the test
                                                          suite now.<br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                  Hopefully it is
                                                  resolved now.<br>
                                                  <br>
                                                  I thought a bit more
                                                  about your use case for
                                                  Jenkins. I'm not 100%
                                                  sure that the existing
                                                  pipelines are
                                                  convenient for your
                                                  use case.<br>
                                                  Feel free to ask
                                                  questions when you are
                                                  working on it.<br>
                                                  <br>
                                                  Thanks,<br>
                                                  Andrew.<br>
                                                  <br>
                                                  <blockquote type="cite">
                                                    <div dir="ltr">
                                                      <div>
                                                        <div>
                                                          <div><br>
                                                          </div>
                                                          <div>Thanks,<br>
                                                          </div>
                                                          <div>Adam<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div> </div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                          <div>
                                                          <div><br>
                                                          On 8/29/23
                                                          17:02, Adam
                                                          Hassick wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>Hi
                                                          Andrew,<br>
                                                          <br>
                                                          </div>
                                                          That fix seems
                                                          to have
                                                          resolved the
                                                          issue, thanks
                                                          for the quick
                                                          turnaround
                                                          time on that
                                                          patch.<br>
                                                          </div>
                                                          <div>Now that
                                                          we have the
                                                          RCF timeout
                                                          issue
                                                          resolved,
                                                          there are a
                                                          few other
                                                          questions and
                                                          issues that we
                                                          have about the
                                                          tests
                                                          themselves.</div>
                                                          <br>
                                                          </div>
                                                          <div>1. The
                                                          test suite
                                                          fails to build
                                                          with a couple
                                                          of warnings.<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>Below is
                                                          the stderr log
                                                          from
                                                          compilation:<br>
                                                          </div>
                                                          <br>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">FAILED:
                                                          <a href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o" target="_blank">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a><br>
                                                          cc
                                                          -Ilib/76b5a35@@ts_dpdk_pmd@sta
                                                          -Ilib
                                                          -I../../lib
                                                          -I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall
                                                          -Winvalid-pch
                                                          -Werror -g
                                                          -D_GNU_SOURCE
                                                          -O0 -ggdb
                                                          -Wall -W -fPIC
                                                          -MD -MQ 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o'
                                                          -MF 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d'
                                                          -o 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o'
                                                          -c
                                                          ../../lib/dpdk_pmd_ts.c<br>
../../lib/dpdk_pmd_ts.c: In function
                                                          ‘test_create_traffic_generator_params’:<br>
../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and
                                                          no format
                                                          arguments
[-Werror=format-security]<br>
                                                          5577 |     rc
                                                          =
                                                          te_kvpair_add(result,
                                                          buf, mode);<br>
                                                          |     ^~<br>
                                                          cc1: all
                                                          warnings being
                                                          treated as
                                                          errors<br>
                                                          ninja: build
                                                          stopped:
                                                          subcommand
                                                          failed.<br>
                                                          ninja:
                                                          Entering
                                                          directory `.'<br>
                                                          FAILED: <a href="mailto:lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o" target="_blank">lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o</a><br>
                                                          cc
                                                          -Ilib/76b5a35@@ts_dpdk_pmd@sta
                                                          -Ilib
                                                          -I../../lib
                                                          -I/opt/tsf/dpdk-ethdev-ts/ts/inst/default/include
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall
                                                          -Winvalid-pch
                                                          -Werror -g
                                                          -D_GNU_SOURCE
                                                          -O0 -ggdb
                                                          -Wall -W -fPIC
                                                          -MD -MQ 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o'
                                                          -MF 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o.d'
                                                          -o 'lib/76b5a35@@ts_dpdk_pmd@sta/dpdk_pmd_ts.c.o'
                                                          -c
                                                          ../../lib/dpdk_pmd_ts.c<br>
../../lib/dpdk_pmd_ts.c: In function
                                                          ‘test_create_traffic_generator_params’:<br>
../../lib/dpdk_pmd_ts.c:5577:5: error: format not a string literal and
                                                          no format
                                                          arguments
[-Werror=format-security]<br>
                                                          5577 |     rc
                                                          =
                                                          te_kvpair_add(result,
                                                          buf, mode);<br>
                                                          |     ^~<br>
                                                          cc1: all
                                                          warnings being
                                                          treated as
                                                          errors<br>
                                                          </blockquote>
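                                                          <div>For reference, the diagnostic above is GCC's -Wformat-security check: a non-literal string is passed in the format position of a printf-style function with no further arguments. The usual remedy is to pass a literal "%s" and supply the runtime string as its argument. The sketch below illustrates the pattern only. kv_add() and store_mode() are hypothetical stand-ins and are not the real te_kvpair_add(); the argument order is assumed from the shape of the call in the log, and the format attribute is a GCC/Clang extension.</div>

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical printf-style helper that writes "key=value" into dst.
 * It is NOT the real te_kvpair_add(); it only mirrors the
 * (destination, key, format, ...) shape implied by the diagnostic.
 */
static int
kv_add(char *dst, size_t dst_len, const char *key, const char *fmt, ...)
    __attribute__((format(printf, 4, 5)));

static int
kv_add(char *dst, size_t dst_len, const char *key, const char *fmt, ...)
{
    char value[128];
    va_list ap;
    int n;

    assert(dst != NULL && key != NULL && fmt != NULL);

    /* Expand the caller's format into a temporary value buffer. */
    va_start(ap, fmt);
    n = vsnprintf(value, sizeof(value), fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= sizeof(value))
        return -1;

    n = snprintf(dst, dst_len, "%s=%s", key, value);
    return (n < 0 || (size_t)n >= dst_len) ? -1 : 0;
}

/* Store a runtime string as the value without using it as a format. */
static int
store_mode(char *dst, size_t dst_len, const char *key, const char *mode)
{
    /*
     * Calling kv_add(dst, dst_len, key, mode) here is the pattern that
     * -Wformat-security reports. A literal "%s" keeps any '%' characters
     * in mode from being interpreted as conversion specifiers.
     */
    return kv_add(dst, dst_len, key, "%s", mode);
}
```

                                                          <div>With this pattern a value such as "100%d" is stored verbatim instead of being interpreted as a format string.</div>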
                                                          <div>
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          <div>This
                                                          error wasn't
                                                          occurring last
                                                          week, which
                                                          was the last
                                                          time I ran the
                                                          tests.<br>
                                                          </div>
                                                          <div>The TE
                                                          host and the
                                                          DUT have GCC
                                                          v9.4.0
                                                          installed, and
                                                          the tester has
                                                          GCC v11.4.0
                                                          installed, if
                                                          this
                                                          information is
                                                          helpful.<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>2. On the
                                                          Mellanox CX5s,
                                                          there are over
                                                          6,000 tests
                                                          run, which
                                                          collectively
                                                          take around 9
                                                          hours. Is it
                                                          possible, and
                                                          would it make
                                                          sense, to
                                                          lower the test
                                                          coverage and
                                                          have the test
                                                          suite run
                                                          faster?<br>
                                                          <br>
                                                          </div>
                                                          <div>For some
                                                          context, we
                                                          run immediate
                                                          testing on
                                                          incoming
                                                          patches for
                                                          DPDK main and
                                                          development
                                                          branches, as
                                                          well as
                                                          periodic test
                                                          runs on the
                                                          main, stable,
                                                          and LTS
                                                          branches.<br>
                                                          </div>
                                                          <div>For us to
                                                          consider
                                                          including this
                                                          test suite as
                                                          part of our
                                                          immediate
                                                          testing on
                                                          patches, we
                                                          would have to
                                                          reduce the
                                                          test coverage
                                                          to the most
                                                          important
                                                          tests.<br>
                                                          This is
                                                          primarily to
                                                          reduce the
                                                          testing time
                                                          to, for
                                                          example, less
                                                          than 30
                                                          minutes.
                                                          Testing on
                                                          patches can't
                                                          take too long
                                                          because the
                                                          lab can
                                                          receive
                                                          numerous
                                                          patches each
                                                          day, which
                                                          each require
                                                          individual
                                                          testing runs.<br>
                                                          <br>
                                                          </div>
                                                          <div>At what
                                                          frequency we
                                                          run these
                                                          tests, and on
                                                          what, still
                                                          needs to be
                                                          discussed with
                                                          the DPDK
                                                          community, but
                                                          it would be
                                                          nice to know
                                                          if the test
                                                          suite had a
                                                          "dial" to
                                                          control the
                                                          testing
                                                          coverage.<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>3. We see
                                                          a lot of test
                                                          failures on
                                                          our Mellanox
                                                          CX5 NICs.
                                                          Around 2,300
                                                          of ~6,600
                                                          tests passed.
                                                          Is there
                                                          anything we
                                                          can do to
                                                          diagnose these
                                                          test failures?<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>Thanks,<br>
                                                          </div>
                                                          <div>Adam<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <br>
                                                          <div class="gmail_quote">
                                                          <div dir="ltr" class="gmail_attr">On Tue, Aug 29, 2023 at 8:07 AM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                          <div>
                                                          <div>Hi Adam,<br>
                                                          <br>
                                                          I've pushed
                                                          the fix to the
                                                          main branch
                                                          along with a new
                                                          tag, v1.18.1. It
                                                          should solve
                                                          the problem
                                                          with the IPv6
                                                          address from
                                                          DNS.<br>
                                                          <br>
                                                          Andrew.<br>
                                                          <br>
                                                          On 8/29/23
                                                          00:05, Andrew
                                                          Rybchenko
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div>Hi Adam,<br>
                                                          <br>
                                                          > Does the
                                                          test engine
                                                          prefer to use
                                                          IPv6 over IPv4
                                                          for initiating
                                                          the RCF
                                                          connection to
                                                          the test bed
                                                          hosts? And if
                                                          so, is there a
                                                          way to force
                                                          it to use
                                                          IPv4?<br>
                                                          <br>
                                                          Brilliant
                                                          idea. If DNS
                                                          returns both
                                                          IPv4 and IPv6
                                                          addresses in
                                                          your case, I
                                                          guess it is
                                                          the root cause
                                                          of the
                                                          problem.<br>
                                                          Of course, it
                                                          is a TE problem,
                                                          since I see
                                                          really weird
                                                          code in
                                                          lib/comm_net_engine/comm_net_engine.c
                                                          line 135.<br>
                                                          <br>
                                                          I've pushed
                                                          a fix to the
                                                          branch
                                                          user/arybchik/fix_ipv4_only
                                                          in the
                                                          ts-factory/test-environment
                                                          repository.
                                                          Please try it.<br>
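                                                          <br>
                                                          For illustration only (this is not the code from the branch above): a minimal Python sketch of the situation described here, where DNS returns both IPv4 and IPv6 addresses for one host name. It lists what the resolver returns per address family and shows how a lookup can be restricted to IPv4. The function names are hypothetical and chosen for this example.<br>
                                                          <pre>
```python
import socket


def addresses_by_family(host, port):
    """Group the addresses the resolver returns for host by address family."""
    found = {"IPv4": [], "IPv6": []}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        if family == socket.AF_INET:
            key = "IPv4"
        elif family == socket.AF_INET6:
            key = "IPv6"
        else:
            continue  # ignore any other address family
        if sockaddr[0] not in found[key]:
            found[key].append(sockaddr[0])
    return found


def resolve_ipv4_only(host, port):
    """Return only IPv4 (address, port) pairs, skipping any IPv6 records."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # dict.fromkeys removes duplicates while keeping resolver order
    return list(dict.fromkeys(info[4] for info in infos))
```
                                                          </pre>
                                                          Calling addresses_by_family() with a test bed host name and the RCF port shows whether the name resolves to both families; resolve_ipv4_only() then yields only the IPv4 endpoints to connect to.<br>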
                                                          <br>
                                                          It is a
                                                          late-night fix with
                                                          minimal
                                                          testing and no
                                                          review. I'll
                                                          pass it
                                                          through the review
                                                          process
                                                          tomorrow and<br>
                                                          hopefully it
                                                          will be
                                                          released in
                                                          one or two days.<br>
                                                          <br>
                                                          Andrew.<br>
                                                          <br>
                                                          On 8/28/23
                                                          18:02, Adam
                                                          Hassick wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>Hi
                                                          Andrew,<br>
                                                          <br>
                                                          </div>
                                                          We have yet to
                                                          notice a
                                                          distinct
                                                          pattern with
                                                          the failures.
                                                          Sometimes, the
                                                          RCF will start
                                                          and connect
                                                          without issue
                                                          a few times in
                                                          a row before
                                                          failing to
                                                          connect again.
                                                          Once the issue
                                                          begins to
                                                          occur, neither
                                                          rebooting all
                                                          of the hosts
                                                          (test engine
                                                          VM, tester,
                                                          IUT) nor
                                                          deleting all
                                                          of the build
                                                          directories
                                                          (suites,
                                                          agents, inst)
                                                          and rebooting
                                                          the hosts
                                                          afterward
                                                          resolves the
                                                          issue. The point
                                                          at which it
                                                          begins working
                                                          again seems
                                                          arbitrary
                                                          to us.<br>
                                                          <br>
                                                          </div>
                                                          <div>I do
                                                          usually try to
                                                          terminate the
                                                          test engine
                                                          with Ctrl+C,
                                                          but when it
                                                          hangs while
                                                          trying to
                                                          start RCF,
                                                          that does not
                                                          work.<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>Does the
                                                          test engine
                                                          prefer to use
                                                          IPv6 over IPv4
                                                          for initiating
                                                          the RCF
                                                          connection to
                                                          the test bed
                                                          hosts? And if
                                                          so, is there a
                                                          way to force
                                                          it to use
                                                          IPv4?<br>
                                                          <br>
                                                          </div>
                                                          <div> - Adam<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <br>
                                                          <div class="gmail_quote">
                                                          <div dir="ltr" class="gmail_attr">On Fri, Aug 25, 2023 at 1:35 PM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                          <div>
                                                          <div>> I'll
                                                          double-check
                                                          test engine on
                                                          Ubuntu 20.04
                                                          and Ubuntu
                                                          22.04.<br>
                                                          <br>
                                                          Done. It works
                                                          fine for me
                                                          without any
                                                          issues.<br>
                                                          <br>
                                                          Have you
                                                          noticed any
                                                          pattern when
                                                          it works or
                                                          does not work?<br>
                                                          Maybe it is a
                                                          problem of an
                                                          unclean state
                                                          after
                                                          termination?<br>
                                                          Does it work
                                                          fine the first
                                                          time after
                                                          DUTs reboot?<br>
                                                          How do you
                                                          terminate
                                                          testing? It
                                                          should be done
                                                          using Ctrl+C
                                                          in terminal
                                                          where you
                                                          execute run.sh
                                                          command.<br>
                                                           In this case
                                                          it should
                                                          shut down
                                                          gracefully and
                                                          close all test
                                                          agents and
                                                          engine
                                                          applications.<br>
                                                          <br>
                                                          (I'm trying to
                                                          understand why
                                                          you've seen
                                                          many test
                                                          agent
                                                          processes. It
                                                          should not
                                                          happen.)<br>
                                                          <br>
                                                          Andrew.<br>
                                                          <br>
                                                          On 8/25/23
                                                          17:41, Andrew
                                                          Rybchenko
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div>On
                                                          8/25/23 17:06,
                                                          Adam Hassick
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>Hi
                                                          Andrew,<br>
                                                          <br>
                                                          </div>
                                                          Two of our
                                                          systems (the
                                                          Test Engine
                                                          runner and the
                                                          DUT host) are
                                                          running Ubuntu
                                                          20.04 LTS,
                                                          however this
                                                          morning I
                                                          noticed that
                                                          the tester
                                                          system (the
                                                          one having
                                                          issues) is
                                                          running Ubuntu
                                                          22.04 LTS.<br>
                                                          </div>
                                                          <div>This
                                                          could be the
                                                          source of the
                                                          problem. I
                                                          encountered a
                                                          dependency
                                                          issue trying
                                                          to run the
                                                          Test Engine on
                                                          22.04 LTS, so
                                                          I downgraded
                                                          the system.
                                                          Since the
                                                          tester is also
                                                          the host
                                                          having
                                                          connection
                                                          issues, I will
                                                          try
                                                          downgrading
                                                          that system to
                                                          20.04, and see
                                                          if that
                                                          changes
                                                          anything.<br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          Unlikely, but
                                                          who knows. We
                                                          run tests
                                                          (DUTs) on
                                                          Ubuntu 20.04,
                                                          Ubuntu 22.04,
                                                          Ubuntu 22.10,
                                                          Ubuntu 23.04,
                                                          Debian 11 and
                                                          Fedora 38
                                                          every night.<br>
                                                          Right now
                                                          Debian 11 is
                                                          used for test
                                                          engine in
                                                          nightly
                                                          regressions.<br>
                                                          <br>
                                                          I'll
                                                          double-check
                                                          test engine on
                                                          Ubuntu 20.04
                                                          and Ubuntu
                                                          22.04.<br>
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>I did try
                                                          passing in the
                                                          "--vg-rcf"
                                                          argument to
                                                          the run.sh
                                                          script of the
                                                          test suite
                                                          after
                                                          installing
                                                          valgrind, but
                                                          there was no
                                                          additional
                                                          output that I
                                                          saw.<br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          Sorry, I
                                                          should have
                                                          mentioned that the
                                                          valgrind
                                                          output should
                                                          be in
                                                          valgrind.te_rcf
                                                          (in the directory
                                                          where you run
                                                          the test engine).<br>
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div><br>
                                                          </div>
                                                          <div>I will
                                                          try pulling in
                                                          the changes
                                                          you've pushed
                                                          up, and will
                                                          see if that
                                                          fixes
                                                          anything.<br>
                                                          <br>
                                                          </div>
                                                          <div>Thanks,<br>
                                                          </div>
                                                          <div>Adam<br>
                                                          </div>
                                                          </div>
                                                          <br>
                                                          <div class="gmail_quote">
                                                          <div dir="ltr" class="gmail_attr">On Fri, Aug 25, 2023 at 9:57 AM Andrew Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                                                          <div>
                                                          <div>Hello
                                                          Adam,<br>
                                                          <br>
                                                          On 8/24/23
                                                          23:54, Andrew
                                                          Rybchenko
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">I'd
                                                          like to try to
                                                          repeat the
                                                          problem
                                                          locally. Which
                                                          Linux distro
                                                          is running on
                                                          test engine
                                                          and agents?<br>
                                                          <br>
                                                          In fact, I know
                                                          of one problem
                                                          with Debian 12
                                                          and Fedora 38,
                                                          and we have<br>
                                                          a patch in
                                                          review to fix
                                                          it. However,
                                                          the behaviour
                                                          is different
                                                          in<br>
                                                          this case, so
                                                          it is unlikely
                                                          to be the same
                                                          problem.<br>
                                                          </blockquote>
                                                          <br>
                                                          I've just
                                                          published a
                                                          new tag which
                                                          fixes known
                                                          test engine
                                                          side problems
                                                          on Debian 12
                                                          and Fedora 38.<br>
                                                          <br>
                                                          <blockquote type="cite"><br>
                                                          One more idea
                                                          is to install
                                                          valgrind on
                                                          the test
                                                          engine host
                                                          and<br>
                                                          run with
                                                          option
                                                          --vg-rcf to
                                                          check if
                                                          something
                                                          weird is
                                                          happening.<br>
                                                          <br>
                                                          What I don't
                                                          understand
                                                          right now is
                                                          why I see just
                                                          one failed
                                                          attempt<br>
                                                          to connect in
                                                          your log.txt
                                                          and then
                                                          Logger
                                                          shutdown after
                                                          9 minutes.<br>
                                                          <br>
                                                          Andrew.<br>
                                                          <br>
                                                          On 8/24/23
                                                          23:29, Adam
                                                          Hassick wrote:<br>
                                                          <blockquote type="cite"> >
                                                          Is there any
                                                          firewall in
                                                          the network or
                                                          on test hosts
                                                          which could
                                                          block incoming
                                                          TCP connection
                                                          to the port
                                                          23571
                                                          from the host
                                                          where you run
                                                          test engine?<br>
                                                          <br>
                                                          Our test
                                                          engine host
                                                          and the
                                                          testbed are on
                                                          the same
                                                          subnet. The
                                                          connection
                                                          does work
                                                          sometimes.<br>
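                                                          <br>
                                                          For illustration only: a minimal Python sketch, assuming the intermittent failures depend on which resolved address is tried, that attempts a TCP connection to every address the resolver returns for a host and reports the outcome per address. The function name probe_tcp and the default timeout are hypothetical choices for this example; 23571 is the port mentioned above.<br>
                                                          <pre>
```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect to each resolved address and record the outcome."""
    results = []
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "ok"
        except OSError as exc:
            # Covers refused connections, timeouts and unsupported families
            outcome = "failed: {}".format(exc)
        results.append((sockaddr[0], outcome))
    return results
```
                                                          </pre>
                                                          Running probe_tcp("iol-dts-tester.dpdklab.iol.unh.edu", 23571) from the test engine host while the test agent is running would show whether only the IPv6 or only the IPv4 address fails to connect.<br>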
                                                          <br>
                                                           > If the
                                                          behaviour is the
                                                          same on the
                                                          next try and
                                                          you see that the
                                                          test agent is
                                                          kept running,
                                                          could you
                                                          check using<br>
                                                           ><br>
                                                           > #
                                                          netstat -tnlp<br>
                                                           ><br>
                                                           > that the
                                                          Test Agent is
                                                          listening on
                                                          the port, and
                                                          try to
                                                          establish a TCP
                                                          connection
                                                          to the test
                                                          agent using<br>
                                                           ><br>
                                                           > $ telnet
                                                          <a href="http://iol-dts-tester.dpdklab.iol.unh.edu" target="_blank">iol-dts-tester.dpdklab.iol.unh.edu</a>
                                                          23571<br>
                                                           ><br>
                                                           > and
                                                          check if TCP
                                                          connection
                                                          could be
                                                          established.<br>
                                                          <br>
                                                          I was able to
                                                          replicate the
                                                          same behavior
                                                          again, where
                                                          it hangs while
                                                          RCF is trying
                                                          to start.<br>
                                                          Running this
                                                          command, I see
                                                          this in the
                                                          output:<br>
                                                          <br>
                                                          tcp        0      0 0.0.0.0:23571           0.0.0.0:*               LISTEN      18599/ta<br>
                                                          <br>
                                                          So it seems
                                                          like it is
                                                          listening on
                                                          the correct
                                                          port.<br>
                                                          Additionally,
                                                          I was able to
                                                          connect to the
                                                          Tester machine
                                                          from our Test
                                                          Engine host
                                                          using telnet.
                                                          It printed the
                                                          PID of the
                                                          process once
                                                          the connection
                                                          was opened.<br>
                                                          <br>
                                                          I tried
                                                          running the
                                                          "ta"
                                                          application
                                                          manually on
                                                          the command
                                                          line, and it
                                                          didn't print
                                                          anything at
                                                          all.<br>
                                                          Maybe the
                                                          issue is
                                                          something on
                                                          the Test
                                                          Engine side.<br>
                                                          <br>
                                                          On Thu, Aug
                                                          24, 2023 at
                                                          2:35 PM Andrew
                                                          Rybchenko <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          <br>
                                                              Hi Adam,<br>
                                                          <br>
                                                               > On
                                                          the tester
                                                          host (which
                                                          appears to be
                                                          the Peer
                                                          agent), there<br>
                                                              are four
                                                          processes that
                                                          I see running,
                                                          which look
                                                          like the test<br>
                                                              agent
                                                          processes.<br>
                                                          <br>
                                                              Before the
                                                          next try I'd
                                                          recommend
                                                          killing these
                                                          processes.<br>
                                                          <br>
                                                              Is there
                                                          any firewall
                                                          in the network
                                                          or on the test
                                                          hosts which
                                                          could<br>
                                                              block
                                                          incoming TCP
                                                          connections to
                                                          port 23571
                                                          from the host<br>
                                                              where you
                                                          run the test
                                                          engine?<br>
                                                          <br>
                                                              If the
                                                          behaviour is the
                                                          same on the
                                                          next try and
                                                          you see that the
                                                          test agent is<br>
                                                              still
                                                          running, could
                                                          you check
                                                          using<br>
                                                          <br>
                                                              # netstat
                                                          -tnlp<br>
                                                          <br>
                                                              that the Test
                                                          Agent is
                                                          listening on
                                                          the port, and
                                                          try to
                                                          establish a TCP<br>
                                                              connection
                                                          to the test
                                                          agent using<br>
                                                          <br>
                                                              $ telnet <a href="http://iol-dts-tester.dpdklab.iol.unh.edu" target="_blank">iol-dts-tester.dpdklab.iol.unh.edu</a>
                                                          23571<br>
                                                          <br>
                                                              and check
                                                          if TCP
                                                          connection
                                                          could be
                                                          established.<br>
                                                          <br>
                                                              Another
                                                          idea is to
                                                          log in to the
                                                          Tester as root,
                                                          as the testing
                                                          does, take<br>
                                                              the TA start
                                                          command from
                                                          the log, and
                                                          run it by
                                                          hand without
                                                          -n and<br>
                                                              with the
                                                          extra escaping
                                                          removed.<br>
                                                          <br>
                                                              # sudo
                                                          PATH=${PATH}:/tmp/linux_x86_root_76872_1692885663_1<br>
                                                             
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}/tmp/linux_x86_root_76872_1692885663_1
/tmp/linux_x86_root_76872_1692885663_1/ta Peer 23571
host=iol-dts-tester.dpdklab.iol.unh.edu:port=23571:user=root:key=/opt/tsf/keys/id_ed25519:ssh_port=22:copy_timeout=15:kill_timeout=15:sudo=:shell=<br>
                                                          <br>
                                                              Hopefully,
                                                          in this case the
                                                          test agent
                                                          directory
                                                          remains in
                                                          /tmp and<br>
                                                              you don't
                                                          need to copy
                                                          it as the testing
                                                          does.<br>
                                                              Maybe the
                                                          output could
                                                          shed some
                                                          light on
                                                          what's going
                                                          on.<br>
                                                          <br>
                                                              Andrew.<br>
                                                          <br>
                                                              On 8/24/23
                                                          17:30, Adam
                                                          Hassick wrote:<br>
                                                          <blockquote type="cite">   
                                                          Hi Andrew,<br>
                                                          <br>
                                                              This is
                                                          the output
                                                          that I see in
                                                          the terminal
                                                          when this
                                                          failure<br>
                                                              occurs,
                                                          after the test
                                                          agent binaries
                                                          build and the
                                                          test engine<br>
                                                              starts:<br>
                                                          <br>
                                                              Platform
                                                          default build
                                                          - pass<br>
                                                              Simple RCF
                                                          consistency
                                                          check
                                                          succeeded<br>
                                                             
                                                          --->>>
                                                          Starting
                                                          Logger...done<br>
                                                             
                                                          --->>>
                                                          Starting
                                                          RCF...rcf_net_engine_connect():
                                                          Connection
                                                          timed<br>
                                                              out <a href="http://iol-dts-tester.dpdklab.iol.unh.edu:23571" target="_blank">iol-dts-tester.dpdklab.iol.unh.edu:23571</a><br>
                                                          <br>
                                                              Then, it
                                                          hangs here
                                                          until I kill
                                                          the "te_rcf"
                                                          and "te_tee"<br>
                                                              processes.
                                                          I let it hang
                                                          for around 9
                                                          minutes.<br>
                                                          <br>
                                                              On the
                                                          tester host
                                                          (which appears
                                                          to be the Peer
                                                          agent), there
                                                          are<br>
                                                              four
                                                          processes that
                                                          I see running,
                                                          which look
                                                          like the test
                                                          agent<br>
                                                              processes.<br>
                                                          <br>
                                                              ta.Peer is
                                                          an empty file.
                                                          I've attached
                                                          the log.txt
                                                          from this run.<br>
                                                          <br>
                                                               - Adam<br>
                                                          <br>
                                                              On Thu,
                                                          Aug 24, 2023
                                                          at 4:22 AM
                                                          Andrew
                                                          Rybchenko<br>
                                                              <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          <br>
                                                                  Hi
                                                          Adam,<br>
                                                          <br>
                                                                  Yes,
                                                          TE_RCFUNIX_TIMEOUT
                                                          is in seconds.
                                                          I've
                                                          double-checked<br>
                                                                  that
                                                          it goes to
                                                          'copy_timeout'
                                                          in
                                                          ts-conf/rcf.conf.<br>
                                                                 
                                                          The description
                                                          in
                                                          doc/sphinx/pages/group_te_engine_rcf.rst<br>
                                                                  says
                                                          that
                                                          copy_timeout
                                                          is in seconds,
                                                          and the
                                                          implementation
                                                          in<br>
                                                                 
                                                          lib/rcfunix/rcfunix.c
                                                          passes the
                                                          value to
                                                          select() as
                                                          tv_sec.<br>
                                                                 
                                                          Theoretically,
                                                          select() could
                                                          be interrupted
                                                          by a signal, but
                                                          I<br>
                                                                  think
                                                          that is unlikely
                                                          here.<br>
                                                          <br>
                                                                  I'm
                                                          not sure that
                                                          I understand
                                                          what you
                                                          mean by RCF<br>
                                                                 
                                                          connection
                                                          timeout. Does
                                                          it happen on
                                                          TE startup,
                                                          when RCF<br>
                                                                  starts
                                                          the test agents?
                                                          If so,
                                                          TE_RCFUNIX_TIMEOUT
                                                          could help. Or<br>
                                                                  does
                                                          it happen when
                                                          tests are in
                                                          progress, e.g.
                                                          in the middle<br>
                                                                  of a
                                                          test? If so,
                                                          TE_RCFUNIX_TIMEOUT
                                                          is unrelated
                                                          and, most<br>
                                                                  likely,
                                                          either the host
                                                          with the test
                                                          agent dies or
                                                          the test agent
                                                          itself<br>
                                                                 
                                                          crashes. It
                                                          would be
                                                          easier for me
                                                          to classify it
                                                          if you share<br>
                                                                  the text
                                                          log (log.txt,
                                                          full or just
                                                          the corresponding
                                                          fragment with<br>
                                                                  some
                                                          context). Also,
                                                          the content of
                                                          the ta.DPDK or
                                                          ta.Peer file,<br>
                                                                 
                                                          depending on
                                                          which agent
                                                          has problems,
                                                          could shed
                                                          some light.<br>
                                                                 
                                                          These
                                                          files contain
                                                          the stdout/stderr
                                                          of the test
                                                          agents.<br>
                                                          <br>
                                                                 
                                                          Andrew.<br>
                                                          <br>
                                                                  On
                                                          8/23/23 17:45,
                                                          Adam Hassick
                                                          wrote:<br>
                                                          <blockquote type="cite">       
                                                          Hi Andrew,<br>
                                                          <br>
                                                                  I've
                                                          set up a test
                                                          rig repository
                                                          here, and have
                                                          created<br>
                                                                 
                                                          configurations
                                                          for our
                                                          development
                                                          testbed based
                                                          off of the<br>
                                                                 
                                                          examples.<br>
                                                                  We've
                                                          been able to
                                                          get the test
                                                          suite to run
                                                          manually on<br>
                                                                 
                                                          Mellanox CX5
                                                          devices once.<br>
                                                                 
                                                          However, we
                                                          are running
                                                          into an issue
                                                          where, when
                                                          RCF starts,<br>
                                                                  the
                                                          RCF connection
                                                          times out very
                                                          frequently. We
                                                          aren't sure<br>
                                                                  why
                                                          this is the
                                                          case.<br>
                                                                  It
                                                          works
                                                          sometimes, but
                                                          most of the
                                                          time when we
                                                          try to run<br>
                                                                  the
                                                          test engine,
                                                          it encounters
                                                          this issue.<br>
                                                                  I've
                                                          tried changing
                                                          the RCF port
                                                          by setting<br>
                                                                 
                                                          "TE_RCF_PORT=<some
                                                          port
                                                          number>"
                                                          and rebooting
                                                          the testbed<br>
                                                                 
                                                          machines.
                                                          Neither seems
                                                          to fix the
                                                          issue.<br>
                                                          <br>
                                                                  It
                                                          also seems
                                                          like the
                                                          timeout takes
                                                          far longer
                                                          than 60<br>
                                                                 
                                                          seconds, even
                                                          when running
                                                          "export
                                                          TE_RCFUNIX_TIMEOUT=60"<br>
                                                                  before
                                                          I try to run
                                                          the test
                                                          suite.<br>
                                                                  I
                                                          assume the
                                                          unit for this
                                                          variable is
                                                          seconds?<br>
                                                          <br>
                                                                 
                                                          Thanks,<br>
                                                                  Adam<br>
                                                          <br>
                                                                  On
                                                          Mon, Aug 21,
                                                          2023 at
                                                          10:19 AM Adam
                                                          Hassick<br>
                                                                  <<a href="mailto:ahassick@iol.unh.edu" target="_blank">ahassick@iol.unh.edu</a>>
                                                          wrote:<br>
                                                          <br>
                                                                      Hi
                                                          Andrew,<br>
                                                          <br>
                                                                     
                                                          Thanks, I've
                                                          cloned the
                                                          example
                                                          repository and
                                                          will start<br>
                                                                     
                                                          setting up a
                                                          configuration
                                                          for our
                                                          development
                                                          testbed<br>
                                                                     
                                                          today. I'll
                                                          let you know
                                                          if I run into
                                                          any
                                                          difficulties<br>
                                                                      or
                                                          have any
                                                          questions.<br>
                                                          <br>
                                                                       -
                                                          Adam<br>
                                                          <br>
                                                                      On
                                                          Sun, Aug 20,
                                                          2023 at
                                                          4:40 AM Andrew
                                                          Rybchenko<br>
                                                                      <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          <br>
                Hi Adam,<br>
                                                          <br>
                I've published<br>
                <a href="https://github.com/ts-factory/ts-rigs-sample" target="_blank">https://github.com/ts-factory/ts-rigs-sample</a>.<br>
                Hopefully it will help to define your test rigs and<br>
                successfully run some tests manually. Feel free to<br>
                ask any questions and I'll answer here and try to<br>
                update documentation.<br>
                                                          <br>
                Meanwhile, I'll prepare the missing bits for steps (2) and<br>
                (3).<br>
                Hopefully everything is in place for step (4), but we<br>
                need to complete steps (2) and (3) first.<br>
                                                          <br>
                Andrew.<br>
                                                          <br>
                On 8/18/23 21:40, Andrew Rybchenko wrote:<br>
                                                          <blockquote type="cite">               
                                                          Hi Adam,<br>
                                                          <br>
                > I've conferred with the rest of the team, and we<br>
                think it would be best to move forward with mainly<br>
                option B.<br>
                                                          <br>
                OK, I'll provide the sample on Monday for you. It is<br>
                almost ready right now, but I need to double-check<br>
                it before publishing.<br>
                                                          <br>
                Regards,<br>
                Andrew.<br>
                                                          <br>
                On 8/17/23 20:03, Adam Hassick wrote:<br>
                                                          <blockquote type="cite">               
                                                          Hi Andrew,<br>
                                                          <br>
                I'm adding the CI mailing list to this<br>
                conversation. Others in the community might find<br>
                this conversation valuable.<br>
                                                          <br>
                We do want to run testing on a regular basis. The<br>
                Jenkins integration will be very useful for us, as<br>
                most of our CI is orchestrated by Jenkins.<br>
                I've conferred with the rest of the team, and we<br>
                think it would be best to move forward with mainly<br>
                option B.<br>
                If you would like to know anything about our<br>
                testbeds that would help you with creating an<br>
                example ts-rigs repo, I'd be happy to answer any<br>
                questions you have.<br>
                                                          <br>
                We have multiple test rigs (we call these<br>
                "DUT-tester pairs") that we run our existing<br>
                hardware testing on, with differing network<br>
                hardware and CPU architecture. I figured this might<br>
                be an important detail.<br>
                                                          <br>
                Thanks,<br>
                Adam<br>
                                                          <br>
                On Thu, Aug 17, 2023 at 11:44 AM Andrew Rybchenko<br>
                <<a href="mailto:andrew.rybchenko@oktetlabs.ru" target="_blank">andrew.rybchenko@oktetlabs.ru</a>>
                                                          wrote:<br>
                                                          <br>
                    Greetings Adam,<br>
                                                          <br>
                    I'm happy to hear that you're trying to bring<br>
                    it up.<br>
                                                          <br>
                    As I understand, the final goal is to run it on<br>
                    a regular basis. So, we need to set it up properly<br>
                    from the very beginning.<br>
                    Bringing up all features consists of 4 steps:<br>
                                                          <br>
                    1. Create a site-specific repository (we call it<br>
                    ts-rigs) which contains information about test<br>
                    rigs and other site-specific information like<br>
                    where to send mails, where to store logs, etc.<br>
                    It is required for manual execution as well,<br>
                    since the test rig description is essential. I'll<br>
                    return to the topic below.<br>
                                                          <br>
                    2. Set up log storage for automated runs.<br>
                    Basically it is disk space plus an apache2 web<br>
                    server with a few CGI scripts which help a lot to<br>
                    save disk space.<br>
                                                          <br>
                    3. Set up the Bublik web application, which provides<br>
                    a web interface to view testing results. Same as<br>
                    <a href="https://ts-factory.io/bublik" target="_blank">https://ts-factory.io/bublik</a><br>
                                                          <br>
                    4. Set up Jenkins to run tests regularly,<br>
                    save logs in log storage (2) and import them into<br>
                    Bublik (3).<br>
                                                          <br>
                    We spent the last few months on our homework to make<br>
                    it simpler to bring up automated execution<br>
                    using Jenkins -<br>
                    <a href="https://github.com/ts-factory/te-jenkins" target="_blank">https://github.com/ts-factory/te-jenkins</a><br>
                    The corresponding bits in dpdk-ethdev-ts will be<br>
                    available tomorrow.<br>
                                                          <br>
                    Let's return to step (1).<br>
                                                          <br>
                    Unfortunately, there is no publicly available<br>
                    example of the ts-rigs repository, since<br>
                    sensitive site-specific information is located<br>
                    there. But I'm ready to help you create it<br>
                    for UNH. I see two options here:<br>
                                                          <br>
                    (A) I'll ask questions and, based on your<br>
                    answers, create the first draft with my<br>
                    comments.<br>
                                                          <br>
                    (B) I'll make a template/example ts-rigs repo,<br>
                    publish it and you'll create UNH ts-rigs based<br>
                    on it.<br>
                                                          <br>
                    Of course, I'll help to debug and finally bring<br>
                    it up in any case.<br>
                                                          <br>
                    (A) is a bit simpler for me and you, but (B) is<br>
                    a bit more generic and will help other<br>
                    potential users to bring it up.<br>
                    We can combine (A)+(B). I.e. start from (A).<br>
                    What do you think?<br>
                                                          <br>
                    Thanks,<br>
                    Andrew.<br>
                                                          <br>
                    On 8/17/23 15:18, Konstantin Ushakov wrote:<br>
                                                          <blockquote type="cite">                   
                                                          Greetings
                                                          Adam,<br>
                                                          <br>
                                                          <br>
                    Thanks for contacting us. I'm copying Andrew, who<br>
                    would be happy to help.<br>
                                                          <br>
                    Thanks,<br>
                    Konstantin<br>
                                                          <br>
                                                          <blockquote type="cite">                   
                                                          On 16 Aug
                                                          2023, at
                                                          21:50, Adam
                                                          Hassick<br>
                    <<a href="mailto:ahassick@iol.unh.edu" target="_blank">ahassick@iol.unh.edu</a>> wrote:<br>
                                                          <br>
                    <br>
                    Greetings Konstantin,<br>
                                                          <br>
                    I am in the process of setting up the DPDK<br>
                    Poll Mode Driver test suite as an addition to<br>
                    our testing coverage for DPDK at the UNH lab.<br>
                                                          <br>
                    I have some questions about how to set the<br>
                    test suite arguments.<br>
                                                          <br>
                    I have been able to configure the Test Engine<br>
                    to connect to the hosts in the testbed. The<br>
                    RCF, Configurator, and Tester all begin to<br>
                    run; however, the prelude of the test suite<br>
                    fails to run.<br>
                                                          <br>
                    <a href="https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters" target="_blank">https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters</a><br>
                                                          <br>
                    The documentation mentions that there are<br>
                    several test parameters for the test suite,<br>
                    like for the IUT test link MAC, etc. These<br>
                    seem like they would need to be set somewhere<br>
                    to run many of the tests.<br>
                                                          <br>
                    The Test Engine documentation has<br>
                    instructions on how to create new<br>
                    parameters for test suites in the Tester<br>
                    configuration, but I can't find anything in the<br>
                    user guide or in the Tester guide on how to<br>
                    set the arguments for those parameters when<br>
                    running the test suite. I'm<br>
                    not sure if I need to write my own Tester<br>
                    config, or if I should be setting these in<br>
                    some other way.<br>
                                                          <br>
                    How should these values be set?<br>
                                                          <br>
                    I'm also not sure which environment<br>
                    variables/arguments are strictly necessary and<br>
                    which are optional.<br>
                                                          <br>
                    Regards,<br>
                    Adam<br>
                                                          <br>
                    --                     *Adam Hassick*<br>
                    Senior Developer<br>
                    UNH InterOperability Lab<br>
                    <a href="mailto:ahassick@iol.unh.edu" target="_blank">ahassick@iol.unh.edu</a><br>
                    <a href="https://www.iol.unh.edu/" target="_blank">iol.unh.edu</a><br>
                    +1 (603) 475-8248<br>
                                                          </blockquote>
                                                          </blockquote>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br clear="all">
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br clear="all">
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                  <br>
                                                </div>
                                              </blockquote>
                                            </div>
                                          </blockquote>
                                          <br>
                                        </div>
                                      </blockquote>
                                    </div>
                                  </blockquote>
                                </div>
                              </blockquote>
                              <br>
                            </blockquote>
                            <br>
                          </div>
                        </blockquote>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </blockquote>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div><br clear="all"><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div><b><span style="background-color:rgb(255,255,255)"><span style="color:rgb(102,102,102)">Adam Hassick</span></span></b><br></div><span style="color:rgb(102,102,102)"></span></div><div><span style="color:rgb(102,102,102)">Senior Developer</span></div><div><span style="color:rgb(102,102,102)"><span style="color:rgb(11,83,148)"><span style="background-color:rgb(255,255,255)">UNH InterOperability Lab</span></span></span><span style="color:rgb(102,102,102)"></span></div><div><span style="color:rgb(102,102,102)"><a href="mailto:ahassick@iol.unh.edu" target="_blank">ahassick@iol.unh.edu</a><br></span></div><div><span style="color:rgb(102,102,102)"><a href="https://www.iol.unh.edu/" target="_blank">iol.unh.edu</a><br></span></div>+1 (603) 475-8248<br></div></div>