[dpdk-stable] patch 'test/ring: reduce duration of performance tests' has been queued to stable release 20.11.1
luca.boccassi at gmail.com
Fri Feb 5 12:18:10 CET 2021
Hi,
FYI, your patch has been queued to stable release 20.11.1
Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 02/07/21. So please
shout if anyone has objections.
Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.
Queued patches are on a temporary branch at:
https://github.com/bluca/dpdk-stable
This queued commit can be viewed at:
https://github.com/bluca/dpdk-stable/commit/5b7e8b9dc1f42592745d59b5bb15c9bfcd62b0de
Thanks.
Luca Boccassi
---
From 5b7e8b9dc1f42592745d59b5bb15c9bfcd62b0de Mon Sep 17 00:00:00 2001
From: Feifei Wang <feifei.wang2 at arm.com>
Date: Fri, 29 Jan 2021 13:59:03 +0800
Subject: [PATCH] test/ring: reduce duration of performance tests
[ upstream commit d310d64271459624c2c06995827e7e9b419f39cb ]
When ring performance is tested with multiple lcores mapped to the same
physical core, e.g. --lcores '(0-3)@10', "enqueue_dequeue_bulk_helper" takes
a very long time to finish.
This is because the iteration count is too high and enqueue and dequeue are
extremely inefficient with this kind of core mapping. The following test
results show the effect:
x86-Intel(R) Xeon(R) Gold 6240:
$sudo ./app/test/dpdk-test --lcores '(0-1)@25'
Testing using two hyperthreads(bulk (size: 8):)
iter_shift:         3      5      7      9      11     13     *15    17     19     21     23
run time:           7s     7s     7s     8s     9s     16s    47s    170s   660s   >0.5h  >1h
legacy APIs: SP/SC: 37     11     6      40525  40525  40209  40367  40407  40541  NoData NoData
legacy APIs: MP/MC: 56     14     11     50657  40526  40526  40526  40625  40585  NoData NoData
aarch64-n1sdp:
$sudo ./app/test/dpdk-test --lcores '(0-1)@1'
Testing using two hyperthreads(bulk (size: 8):)
iter_shift:         3      5      7      9      11     13     *15    17     19     21     23
run time:           8s     8s     8s     9s     9s     14s    34s    111s   418s   25min  >1h
legacy APIs: SP/SC: 0.4    0.2    0.1    488    488    488    488    488    489    489    NoData
legacy APIs: MP/MC: 0.4    0.3    0.2    488    488    488    488    490    489    489    NoData
The run time grows with the number of iterations. With the current value
(iter_shift = 23), the test takes more than 1 hour to finish. To fix this,
"iter_shift" should be reduced while still leaving enough iterations to keep
the test results stable.
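As a quick check of the arithmetic, the sketch below mirrors the
"iterations = 1 << iter_shift" expression used in
"enqueue_dequeue_bulk_helper". The helper names iterations_for_shift and
reduction_factor are hypothetical and are not part of the DPDK test code.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "iterations = 1 << iter_shift". */
uint32_t iterations_for_shift(unsigned int iter_shift)
{
	return UINT32_C(1) << iter_shift;	/* valid for iter_shift < 32 */
}

/* How many times fewer iterations new_shift runs compared to old_shift. */
uint32_t reduction_factor(unsigned int old_shift, unsigned int new_shift)
{
	return iterations_for_shift(old_shift) / iterations_for_shift(new_shift);
}
```

Under these assumptions, iterations_for_shift(23) is 8388608 and
iterations_for_shift(15) is 32768, so the loop runs 256 times fewer
iterations at the starred value.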
To choose a suitable value, we also tested with the "-l" EAL argument:
x86-Intel(R) Xeon(R) Gold 6240:
$sudo ./app/test/dpdk-test -l 25-26
Testing using two NUMA nodes(bulk (size: 8):)
iter_shift:         3      5      7      9      11     13     *15    17     19     21     23
run time:           6s     6s     6s     6s     6s     6s     6s     7s     8s     11s    27s
legacy APIs: SP/SC: 47     20     13     22     54     83     91     73     81     75     95
legacy APIs: MP/MC: 44     18     18     240    245    270    250    249    252    250    253
aarch64-n1sdp:
$sudo ./app/test/dpdk-test -l 1-2
Testing using two physical cores(bulk (size: 8):)
iter_shift:         3      5      7      9      11     13     *15    17     19     21     23
run time:           8s     8s     8s     8s     8s     8s     8s     9s     9s     11s    23s
legacy APIs: SP/SC: 0.7    0.4    1.2    1.8    2.0    2.0    2.0    2.0    2.0    2.0    2.0
legacy APIs: MP/MC: 0.3    0.4    1.3    1.9    2.9    2.9    2.9    2.9    2.9    2.9    2.9
According to the test data above, with "iter_shift" set to 15 the test run
time drops below 1 minute and the results remain stable on both x86 and
aarch64 servers.
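To illustrate why the iteration count drives both the run time and the
stability of the average, here is a minimal sketch of a bulk
enqueue/dequeue timing loop. It assumes a toy single-threaded ring and the
standard C clock() timer; toy_ring, toy_enqueue_bulk, toy_dequeue_bulk and
toy_bench_bulk are hypothetical names for illustration only. It is not the
rte_ring API and not the code in test_ring_perf.c, which runs across lcores.

```c
#include <stdint.h>
#include <time.h>

#define TOY_RING_SIZE 1024u	/* power of two, so index masking works */
#define TOY_MAX_BURST 64u

/* Toy single-threaded ring; a stand-in for illustration, not rte_ring. */
struct toy_ring {
	uint32_t head;		/* free-running count of enqueued objects */
	uint32_t tail;		/* free-running count of dequeued objects */
	void *slots[TOY_RING_SIZE];
};

/* Enqueue exactly n objects or none; returns n on success, 0 if no room. */
unsigned int toy_enqueue_bulk(struct toy_ring *r, void *const *objs,
	unsigned int n)
{
	if (TOY_RING_SIZE - (r->head - r->tail) < n)
		return 0;
	for (unsigned int i = 0; i < n; i++)
		r->slots[(r->head + i) & (TOY_RING_SIZE - 1)] = objs[i];
	r->head += n;
	return n;
}

/* Dequeue exactly n objects or none; returns n on success, 0 if too few. */
unsigned int toy_dequeue_bulk(struct toy_ring *r, void **objs, unsigned int n)
{
	if (r->head - r->tail < n)
		return 0;
	for (unsigned int i = 0; i < n; i++)
		objs[i] = r->slots[(r->tail + i) & (TOY_RING_SIZE - 1)];
	r->tail += n;
	return n;
}

/*
 * Run 1 << iter_shift enqueue/dequeue rounds of bsize objects each and
 * return the average CPU time per object in nanoseconds, or -1.0 if
 * bsize is out of range.
 */
double toy_bench_bulk(struct toy_ring *r, unsigned int iter_shift,
	unsigned int bsize)
{
	const unsigned int iterations = 1u << iter_shift;
	void *burst[TOY_MAX_BURST] = {0};

	if (bsize == 0 || bsize > TOY_MAX_BURST)
		return -1.0;

	const clock_t start = clock();
	for (unsigned int i = 0; i < iterations; i++) {
		toy_enqueue_bulk(r, burst, bsize);
		toy_dequeue_bulk(r, burst, bsize);
	}
	const clock_t end = clock();

	const double ns = (double)(end - start) / CLOCKS_PER_SEC * 1e9;
	return ns / ((double)iterations * bsize);
}
```

In this sketch the total work scales with 1 << iter_shift while the reported
figure is divided by the same count, so a smaller shift shortens the run and
averages over fewer samples.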
Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
Signed-off-by: Feifei Wang <feifei.wang2 at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
---
app/test/test_ring_perf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index e63e25a867..fd82e20412 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
 	struct thread_params *p)
 {
 	int ret;
-	const unsigned int iter_shift = 23;
+	const unsigned int iter_shift = 15;
 	const unsigned int iterations = 1 << iter_shift;
 	struct rte_ring *r = p->r;
 	unsigned int bsize = p->size;
--
2.29.2
---
Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- - 2021-02-05 11:18:38.267059334 +0000
+++ 0204-test-ring-reduce-duration-of-performance-tests.patch 2021-02-05 11:18:29.162697890 +0000
@@ -1 +1 @@
-From d310d64271459624c2c06995827e7e9b419f39cb Mon Sep 17 00:00:00 2001
+From 5b7e8b9dc1f42592745d59b5bb15c9bfcd62b0de Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit d310d64271459624c2c06995827e7e9b419f39cb ]
+
@@ -56 +57,0 @@
-Cc: stable at dpdk.org