[dpdk-ci] [dpdklab] Re: Intel performance test is failing

Ma, LihongX lihongx.ma at intel.com
Fri May 8 10:02:15 CEST 2020


Thanks, Brandon. I will wait for your reply.

Regards,
Ma,lihong

From: Brandon Lo [mailto:blo at iol.unh.edu]
Sent: Wednesday, May 6, 2020 9:05 PM
To: Ma, LihongX <lihongx.ma at intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen at intel.com>; David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>; O'Driscoll, Tim <tim.odriscoll at intel.com>; Lin, Xueqin <xueqin.lin at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Lihong,

Just a further update: we have noticed that there is another internal script that is used to calculate baselines that will pull a newer baseline if it is found.
We are looking to solve the issues that we are having with baselines and will get back to you.

Thanks for your patience,
Brandon

On Thu, Apr 30, 2020 at 10:19 AM Brandon Lo <blo at iol.unh.edu> wrote:
Hi Lihong,

The expected value was reset by one of our internal scripts.
I believe that I have resolved this issue for the future by ensuring that the baseline that you sent me will not be overwritten automatically.

I will continue to monitor this expected throughput in case of any issues.

Thanks for your patience,
Brandon

On Wed, Apr 29, 2020 at 1:30 AM Ma, LihongX <lihongx.ma at intel.com> wrote:
Hi, Brandon

I checked the new FVL result and found that the expected value has still not changed.
The log shows that the expected value is still:
[cid:image001.png at 01D62552.0A1802A0]

Could you please double-check it? Is there any difference between FVL and NNT?

Regards,
Ma,lihong

From: Brandon Lo [mailto:blo at iol.unh.edu]
Sent: Wednesday, April 29, 2020 12:52 AM
To: Ma, LihongX <lihongx.ma at intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen at intel.com>; David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>; O'Driscoll, Tim <tim.odriscoll at intel.com>; Lin, Xueqin <xueqin.lin at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Lihong,

Sorry about that, I have reset the baseline to the values you sent in the previous email.
I'll look to rerun tests that have failed due to the incorrect baseline.

Thanks for letting me know,
Brandon

On Mon, Apr 27, 2020 at 11:39 PM Ma, LihongX <lihongx.ma at intel.com> wrote:
Hi, Brandon
I found that the NNT baseline has changed as expected, but the FVL baseline is still the same as before.
Could you please check it and change the baseline as expected?

Regards,
Ma,lihong

From: Brandon Lo [mailto:blo at iol.unh.edu]
Sent: Friday, April 3, 2020 2:39 AM
To: Ma, LihongX <lihongx.ma at intel.com>
Cc: Chen, Zhaoyan <zhaoyan.chen at intel.com>; David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>; O'Driscoll, Tim <tim.odriscoll at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Lihong,

I have changed the baselines to reflect the new expected values.
The performance tests should work as expected and pass.

We will email again in the future if we come across any problems.
Feel free to email us as well if you would like to make any other changes.

Thank you for all your help

On Wed, Apr 1, 2020 at 2:00 AM Ma, LihongX <lihongx.ma at intel.com> wrote:
Hi, Brandon
Thanks for your recommendations; I have made the changes.
The throughput value of nic_single_core is proportional to the CPU frequency,
so I recommend changing the baseline according to our report system.

On our 2.50 GHz system, the baseline values are as below:
NNT:
pkt_size    trd/rxd    expected_value
64          512        52.562
64          2048       41.439

FVL:
pkt_size    trd/rxd    expected_value
64          512        59.608
64          2048       47.73


The UNH testbed is a 2.1 GHz CPU server, so the expected values should be:
NNT:
pkt_size    trd/rxd    expected_value
64          512        52.562 / 2.5 * 2.1 = 44.152
64          2048       41.439 / 2.5 * 2.1 = 34.809

FVL:
pkt_size    trd/rxd    expected_value
64          512        59.608 / 2.5 * 2.1 = 50.071
64          2048       47.73 / 2.5 * 2.1 = 40.093
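
The frequency scaling used for these expected values can be reproduced with a short sketch. It relies only on the assumption stated in this message, namely that nic_single_core throughput is proportional to CPU frequency. The function name scale_baseline and its parameter names are hypothetical and are not part of DTS or the lab's CI.

```python
def scale_baseline(reference_value: float, reference_ghz: float, target_ghz: float) -> float:
    """Scale an expected throughput value linearly with CPU frequency.

    Assumption (from the message): throughput is proportional to CPU frequency.
    The result is rounded to three decimals, matching the values quoted above.
    """
    return round(reference_value / reference_ghz * target_ghz, 3)


# Baselines measured on the 2.50 GHz system, keyed by (NIC, pkt_size, trd/rxd).
reference_baselines = {
    ("NNT", 64, 512): 52.562,
    ("NNT", 64, 2048): 41.439,
    ("FVL", 64, 512): 59.608,
    ("FVL", 64, 2048): 47.73,
}

# Expected values for the 2.1 GHz testbed.
scaled_baselines = {
    key: scale_baseline(value, reference_ghz=2.5, target_ghz=2.1)
    for key, value in reference_baselines.items()
}

for (nic, pkt_size, descriptors), value in scaled_baselines.items():
    print(f"{nic} pkt_size={pkt_size} trd/rxd={descriptors} expected_value={value}")
```

The scaled numbers agree with the hand-calculated ones (44.152, 34.809, 50.071, 40.093).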




Regards,
Ma,lihong

From: Brandon Lo [mailto:blo at iol.unh.edu]
Sent: Tuesday, March 31, 2020 9:42 PM
To: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Ma, LihongX <lihongx.ma at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>; O'Driscoll, Tim <tim.odriscoll at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

To make changes to either Intel machine, please reboot using the command "reboot_to_rw" as root to reboot the machine into read/write mode.
This command will also disable any testing on the machine.

To re-enable the machine, please run "reboot_to_ro" as root, and it will save all of the changes that you've made and re-enable testing on the machine.
I recommend rebooting using either "reboot_to_rw" or "reboot_to_ro" instead of the normal "reboot" while you're making changes.

After you're done, please let me know. I'll have to manually run a test and update the baseline using our internal CI.

Thank you

On Mon, Mar 30, 2020 at 12:43 AM Chen, Zhaoyan <zhaoyan.chen at intel.com> wrote:
Hi, Brandon,

Please let me know how to make changes on this machine that gets reset (ip/access...) and how to disable it.

After that, please help to change the baseline.


Regards,
Zhaoyan Chen

From: Brandon Lo <blo at iol.unh.edu>
Sent: Thursday, March 26, 2020 11:39 PM
To: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Ma, LihongX <lihongx.ma at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>; O'Driscoll, Tim <tim.odriscoll at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

Currently, we have a system in place that resets any changes made while testing is enabled for a machine.
If you would like, I can disable testing and allow you to make permanent changes.

I can also reset the baseline of Intel 10G test performance once you make these changes.
Please let me know if you would like to make permanent changes on the Intel 10G so I can disable it for you.

Thanks

On Wed, Mar 25, 2020 at 12:59 AM Chen, Zhaoyan <zhaoyan.chen at intel.com> wrote:
Thanks, Brandon.

That’s good. We have made changes on the 10G testbed.

I monitored several execution results and found that the 10G results always show a gap of -0.9% to -1.x% against the expected number, so with the +/-1% margin we could sometimes see failures. I suggest adjusting the expected number. I don’t know where the expected number comes from; as far as I know, it is a dynamic number that depends on the baseline. Please help to clarify, thanks.
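
The concern above can be illustrated with a small sketch. It assumes the pass/fail check compares the relative gap between measured and expected throughput against a symmetric 1% tolerance. The thread does not confirm how the lab's CI actually computes this, and the names percent_gap and within_tolerance, as well as the sample numbers, are hypothetical.

```python
def percent_gap(measured: float, expected: float) -> float:
    """Relative gap of measured vs. expected, in percent (negative means slower)."""
    return (measured - expected) / expected * 100.0


def within_tolerance(measured: float, expected: float, tolerance_pct: float = 1.0) -> bool:
    """True if the absolute gap does not exceed the tolerance (assumed to be 1%)."""
    return abs(percent_gap(measured, expected)) <= tolerance_pct


# Hypothetical numbers, not taken from the lab's results.
expected = 100.0
for measured in (99.1, 98.8):
    gap = round(percent_gap(measured, expected), 2)
    verdict = "pass" if within_tolerance(measured, expected) else "fail"
    print(f"measured={measured} gap={gap}% -> {verdict}")
```

Under these assumptions, a run that sits 0.9% below the expected number passes while one that sits 1.2% below fails, so results hovering around -1% would flip between the two from run to run.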


Thanks.

Regards,
Zhaoyan Chen

From: Brandon Lo <blo at iol.unh.edu>
Sent: Tuesday, March 24, 2020 9:31 PM
To: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Ma, LihongX <lihongx.ma at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

I have enabled the 10G Intel machine for testing.
If you would like to make any more changes, please let me know so I can perform the necessary steps to prepare the machine for changes.
Please feel free to let me know if you need anything.

Thank you

On Sun, Mar 22, 2020 at 9:58 PM Chen, Zhaoyan <zhaoyan.chen at intel.com> wrote:
Hi, Brandon,

For 10G, please enable it. Our code is at the original path, /opt/test-harness/dts.

For 40G, please keep it running and see if there are any issues. In any case, we have modified the DTS code at /opt/test-harness/dts-new-suite. If we hit the same problem, then use this new DTS instead.

Thanks a lot

Regards,
Zhaoyan Chen

From: Brandon Lo <blo at iol.unh.edu>
Sent: Saturday, March 21, 2020 1:49 AM
To: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Ma, LihongX <lihongx.ma at intel.com>; Zhang, XuemingX <xuemingx.zhang at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

Currently, the 40G machine is stable enough to be put on the production dashboard to run tests which may cause Trex to be killed.
Should I disable the 40G Intel machine for you to make changes?

Also, just for confirmation: on the 10G machine, is the folder that you are using for the testing located in /opt/test-harness/dts-2020-3-4, or are you still using the one in the standard /opt/test-harness/dts folder?

If everything is ok, I will enable the 10G machine for production testing.

Thank you very much

On Thu, Mar 19, 2020 at 9:36 PM Chen, Zhaoyan <zhaoyan.chen at intel.com> wrote:
Brandon,

We have worked out a workaround on the Intel testbeds, NNT (10G) and FVL (40G). Could you please help to recover them?

However, we hit some problems on the FVL (40G) testbed. Could you please check them before recovering it?

  *   Sometimes the 1G hugepages are changed to 2M hugepages automatically, and we have to restart the system.
  *   While debugging on the testbed, we found that Trex was killed by someone (or some app).
Please help to check whether any other program is running on the testbed.

Thanks a lot.



Regards,
Zhaoyan Chen

From: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Sent: Wednesday, March 18, 2020 9:04 PM
To: Brandon Lo <blo at iol.unh.edu>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>; Chen, Zhaoyan <zhaoyan.chen at intel.com>
Subject: RE: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Brandon, we have almost finished a workaround.

You could probably recover Intel’s testbed tomorrow. I will let you know soon.



Regards,
Zhaoyan Chen

From: Brandon Lo <blo at iol.unh.edu>
Sent: Wednesday, March 18, 2020 3:34 AM
To: Chen, Zhaoyan <zhaoyan.chen at intel.com>
Cc: David Marchand <david.marchand at redhat.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

Have you finished making changes on the Intel machine?
I will turn on the machine on March 3rd for testing if you do not have any issues with it.
Please let me know if you need anything else.

Thanks

On Tue, Mar 10, 2020 at 10:13 PM Chen, Zhaoyan <zhaoyan.chen at intel.com> wrote:
Hi, Brandon,

Yes, it’s a weird issue. It is also mixed up with our DTS upgrade and Trex upgrade.
So we are reviewing our DTS script, the different Trex versions, and the CI calling procedure.

Anyway, we are focusing on this task at the moment and will let you know of any update.

Thanks.

Regards,
Zhaoyan Chen

From: Brandon Lo <blo at iol.unh.edu>
Sent: Tuesday, March 10, 2020 10:46 PM
To: David Marchand <david.marchand at redhat.com>
Cc: Chen, Zhaoyan <zhaoyan.chen at intel.com>; dpdklab at iol.unh.edu; Lincoln Lavoie <lylavoie at iol.unh.edu>; Thomas Monjalon <thomas at monjalon.net>; ci at dpdk.org; Tu, Lijuan <lijuan.tu at intel.com>; Xu, Qian Q <qian.q.xu at intel.com>
Subject: Re: [dpdklab] Re: [dpdk-ci] Intel performance test is failing

Hi Zhaoyan,

What is the current status of the Intel 82599ES?
Were there any configuration changes made to fix performance issues?

Thanks

On Tue, Mar 10, 2020 at 9:11 AM Brandon Lo <blo at iol.unh.edu> wrote:
Hi David,

This was just a weird issue with the packet generator not cleaning itself after a test fast enough before another test.
I'll rerun the tests that were affected and keep an eye out to see if it's stable enough to be put back online.

Thanks

On Tue, Mar 10, 2020 at 5:33 AM David Marchand <david.marchand at redhat.com> wrote:
On Tue, Mar 3, 2020 at 3:14 PM Brandon Lo <blo at iol.unh.edu> wrote:
>
> Hi David and Zhaoyan,
>
>
> Yes, those results are related to the Intel machine; I have disabled testing for the Intel testbed.
>
> The 82599ES machine is now available for ssh and modifications.

Any news about this?

I received a failure on a patch of mine (changing macros in an ARM header).
https://lab.dpdk.org/results/dashboard/patchsets/9900/

But this time, it is with the 40G Intel nic test.

--
David Marchand


--

Brandon Lo

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

blo at iol.unh.edu

www.iol.unh.edu


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mails.dpdk.org/archives/ci/attachments/20200508/5b779626/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 4767 bytes
Desc: image001.png
URL: <http://mails.dpdk.org/archives/ci/attachments/20200508/5b779626/attachment-0001.png>

