[dpdk-dev] [PATCH 3/3] ring: use element APIs to implement legacy APIs
Feifei Wang
Feifei.Wang2 at arm.com
Wed Jul 8 03:30:46 CEST 2020
Hi, David
> -----Original Message-----
> From: David Christensen <drc at linux.vnet.ibm.com>
> Sent: July 8, 2020 4:07
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Feifei Wang
> <Feifei.Wang2 at arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>
> Cc: dev at dpdk.org; nd <nd at arm.com>; Ruifeng Wang
> <Ruifeng.Wang at arm.com>
> Subject: Re: [PATCH 3/3] ring: use element APIs to implement legacy APIs
>
>
>
> On 7/7/20 7:04 AM, Ananyev, Konstantin wrote:
> >
> > Hi Feifei,
> >
> >> Hi, Konstantin, David
> >>
> >> I'm Feifei Wang from Arm. Sorry to make the following request:
> >> Would you please run some ring performance tests of this patch on your
> >> platforms when you have time?
> >> I would like to know whether this patch has a significant impact on
> >> platforms other than Arm.
> >
> > I ran a few tests on an SKX box and so far haven't noticed any real perf difference.
> > Konstantin
> >
>
Thanks very much for presenting these test results.
Feifei
> Full performance results for IBM POWER9 system below. I ran the tests
> twice for each version and the results were consistent.
>
>                                         without this patch   with this patch
> Testing burst enq/deq
> legacy APIs: SP/SC: burst (size: 8):    43.63                43.63
> legacy APIs: SP/SC: burst (size: 32):   50.07                50.04
> legacy APIs: MP/MC: burst (size: 8):    58.43                58.42
> legacy APIs: MP/MC: burst (size: 32):   65.52                65.51
> Testing bulk enq/deq
> legacy APIs: SP/SC: bulk (size: 8):     43.61                43.61
> legacy APIs: SP/SC: bulk (size: 32):    50.05                50.02
> legacy APIs: MP/MC: bulk (size: 8):     58.43                58.43
> legacy APIs: MP/MC: bulk (size: 32):    65.50                65.49
>
> HW:
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 128
> On-line CPU(s) list: 0-127
> Thread(s) per core: 4
> Core(s) per socket: 16
> Socket(s): 2
> NUMA node(s): 6
> Model: 2.3 (pvr 004e 1203)
> Model name: POWER9, altivec supported
> CPU max MHz: 3800.0000
> CPU min MHz: 2300.0000
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 512K
> L3 cache: 10240K
> NUMA node0 CPU(s): 0-63
> NUMA node8 CPU(s): 64-127
>
> OS: RHEL 8.2
>
> GCC: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
>
> DPDK: 20.08.0-rc0 (a8550b773)
>
>
>
> Unpatched
> =========
> $ sudo app/test/dpdk-test -l 68,69
> EAL: Detected 128 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> EAL: No available hugepages reported in hugepages-2048kB
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.0 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.1 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.1 (socket 8)
> EAL: using IOMMU type 7 (sPAPR)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.1 (socket 8)
> APP: HPET is not enabled, using TSC as default timer
> RTE>>ring_perf_autotest
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 42.01
> legacy APIs: MP/MC: single: 56.27
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 43.63
> legacy APIs: SP/SC: burst (size: 32): 50.07
> legacy APIs: MP/MC: burst (size: 8): 58.43
> legacy APIs: MP/MC: burst (size: 32): 65.52
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 43.61
> legacy APIs: SP/SC: bulk (size: 32): 50.05
> legacy APIs: MP/MC: bulk (size: 8): 58.43
> legacy APIs: MP/MC: bulk (size: 32): 65.50
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.16
> legacy APIs: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> legacy APIs: SP/SC: bulk (size: 8): 12.44
> legacy APIs: MP/MC: bulk (size: 8): 16.19
> legacy APIs: SP/SC: bulk (size: 32): 3.10
> legacy APIs: MP/MC: bulk (size: 32): 3.64
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 362382
> Core [69] count = 362516
> Total count (size: 8): 724898
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 361565
> Core [69] count = 361852
> Total count (size: 32): 723417
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 42.81
> elem APIs: element size 16B: MP/MC: single: 56.78
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: burst (size: 32): 59.27
> elem APIs: element size 16B: MP/MC: burst (size: 8): 60.68
> elem APIs: element size 16B: MP/MC: burst (size: 32): 75.00
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 45.05
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 59.23
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.64
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 75.11
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.15
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.55
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.22
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 3.86
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 374327
> Core [69] count = 374433
> Total count (size: 8): 748760
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 324111
> Core [69] count = 320038
> Total count (size: 32): 644149
> Test OK
>
> Patched
> =======
> $ sudo app/test/dpdk-test -l 68,69
> EAL: Detected 128 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> EAL: No available hugepages reported in hugepages-2048kB
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.0 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.1 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.1 (socket 8)
> EAL: using IOMMU type 7 (sPAPR)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.1 (socket 8)
> APP: HPET is not enabled, using TSC as default timer
> RTE>>ring_perf_autotest
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 42.00
> legacy APIs: MP/MC: single: 56.27
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 43.63
> legacy APIs: SP/SC: burst (size: 32): 50.04
> legacy APIs: MP/MC: burst (size: 8): 58.42
> legacy APIs: MP/MC: burst (size: 32): 65.51
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 43.61
> legacy APIs: SP/SC: bulk (size: 32): 50.02
> legacy APIs: MP/MC: bulk (size: 8): 58.43
> legacy APIs: MP/MC: bulk (size: 32): 65.49
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.16
> legacy APIs: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> legacy APIs: SP/SC: bulk (size: 8): 12.43
> legacy APIs: MP/MC: bulk (size: 8): 16.17
> legacy APIs: SP/SC: bulk (size: 32): 3.10
> legacy APIs: MP/MC: bulk (size: 32): 3.65
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 363208
> Core [69] count = 363334
> Total count (size: 8): 726542
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 361592
> Core [69] count = 361690
> Total count (size: 32): 723282
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 42.78
> elem APIs: element size 16B: MP/MC: single: 56.75
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: burst (size: 32): 59.27
> elem APIs: element size 16B: MP/MC: burst (size: 8): 60.66
> elem APIs: element size 16B: MP/MC: burst (size: 32): 75.03
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 59.33
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.65
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 75.04
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.14
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.56
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.22
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 3.86
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 372618
> Core [69] count = 372415
> Total count (size: 8): 745033
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 318784
> Core [69] count = 316066
> Total count (size: 32): 634850
> Test OK
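
For anyone who wants to quantify the difference rather than compare the columns by eye, here is a minimal sketch that computes the relative change for each legacy API row. It was not part of the test run above. The numbers are copied from David's summary table; the names `results` and `relative_change_pct` are only illustrative.

```python
# Values copied from the summary table in this mail:
# (without this patch, with this patch), as printed by ring_perf_autotest.
results = {
    "SP/SC: burst (size: 8)": (43.63, 43.63),
    "SP/SC: burst (size: 32)": (50.07, 50.04),
    "MP/MC: burst (size: 8)": (58.43, 58.42),
    "MP/MC: burst (size: 32)": (65.52, 65.51),
    "SP/SC: bulk (size: 8)": (43.61, 43.61),
    "SP/SC: bulk (size: 32)": (50.05, 50.02),
    "MP/MC: bulk (size: 8)": (58.43, 58.43),
    "MP/MC: bulk (size: 32)": (65.50, 65.49),
}


def relative_change_pct(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Hypothetical helper structure: one percentage per table row.
changes = {name: relative_change_pct(b, a) for name, (b, a) in results.items()}

for name, pct in changes.items():
    print(f"legacy APIs: {name:<24} {pct:+.3f}%")

# Largest deviation in either direction across all rows.
worst = max(abs(pct) for pct in changes.values())
print(f"largest absolute change: {worst:.3f}%")
```

On these numbers every row changes by well under 0.1%, which matches the observation that the two runs are consistent.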