Hi, David
> -----Original Message-----
> From: David Christensen <d...@linux.vnet.ibm.com>
> Sent: July 8, 2020 4:07
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Feifei Wang
> <feifei.wa...@arm.com>; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>
> Cc: dev@dpdk.org; nd <n...@arm.com>; Ruifeng Wang
> <ruifeng.w...@arm.com>
> Subject: Re: [PATCH 3/3] ring: use element APIs to implement legacy APIs
>
>
>
> On 7/7/20 7:04 AM, Ananyev, Konstantin wrote:
> >
> > Hi Feifei,
> >
> >> Hi, Konstantin, David
> >>
> >> I'm Feifei Wang from Arm. Sorry to make the following request:
> >> Would you please do some ring performance tests of this patch in your platforms at the time you are free?
> >> And I want to know whether this patch has a significant impact on other platforms except ARM.
> >
> > I run few tests on SKX box and so far didn’t notice any real perf
> > difference.
> > Konstantin
> >
>
Thanks very much for presenting these test results.
Feifei
> Full performance results for IBM POWER9 system below. I ran the tests
> twice for each version and the results were consistent.
>
>                                           without this patch   with this patch
> Testing burst enq/deq
> legacy APIs: SP/SC: burst (size: 8):            43.63              43.63
> legacy APIs: SP/SC: burst (size: 32):           50.07              50.04
> legacy APIs: MP/MC: burst (size: 8):            58.43              58.42
> legacy APIs: MP/MC: burst (size: 32):           65.52              65.51
> Testing bulk enq/deq
> legacy APIs: SP/SC: bulk (size: 8):             43.61              43.61
> legacy APIs: SP/SC: bulk (size: 32):            50.05              50.02
> legacy APIs: MP/MC: bulk (size: 8):             58.43              58.43
> legacy APIs: MP/MC: bulk (size: 32):            65.50              65.49
>
> HW:
> Architecture:        ppc64le
> Byte Order:          Little Endian
> CPU(s):              128
> On-line CPU(s) list: 0-127
> Thread(s) per core:  4
> Core(s) per socket:  16
> Socket(s):           2
> NUMA node(s):        6
> Model:               2.3 (pvr 004e 1203)
> Model name:          POWER9, altivec supported
> CPU max MHz:         3800.0000
> CPU min MHz:         2300.0000
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            512K
> L3 cache:            10240K
> NUMA node0 CPU(s):   0-63
> NUMA node8 CPU(s):   64-127
>
> OS: RHEL 8.2
>
> GCC: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
>
> DPDK: 20.08.0-rc0 (a8550b773)
>
>
>
> Unpatched
> ===========
> sudo app/test/dpdk-test -l 68,69
> EAL: Detected 128 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> EAL: No available hugepages reported in hugepages-2048kB
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.0 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.1 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.1 (socket 8)
> EAL: using IOMMU type 7 (sPAPR)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.1 (socket 8)
> APP: HPET is not enabled, using TSC as default timer
> RTE>>ring_perf_autotest
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 42.01
> legacy APIs: MP/MC: single: 56.27
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 43.63
> legacy APIs: SP/SC: burst (size: 32): 50.07
> legacy APIs: MP/MC: burst (size: 8): 58.43
> legacy APIs: MP/MC: burst (size: 32): 65.52
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 43.61
> legacy APIs: SP/SC: bulk (size: 32): 50.05
> legacy APIs: MP/MC: bulk (size: 8): 58.43
> legacy APIs: MP/MC: bulk (size: 32): 65.50
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.16
> legacy APIs: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> legacy APIs: SP/SC: bulk (size: 8): 12.44
> legacy APIs: MP/MC: bulk (size: 8): 16.19
> legacy APIs: SP/SC: bulk (size: 32): 3.10
> legacy APIs: MP/MC: bulk (size: 32): 3.64
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 362382
> Core [69] count = 362516
> Total count (size: 8): 724898
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 361565
> Core [69] count = 361852
> Total count (size: 32): 723417
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 42.81
> elem APIs: element size 16B: MP/MC: single: 56.78
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: burst (size: 32): 59.27
> elem APIs: element size 16B: MP/MC: burst (size: 8): 60.68
> elem APIs: element size 16B: MP/MC: burst (size: 32): 75.00
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 45.05
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 59.23
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.64
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 75.11
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.15
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.55
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.22
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 3.86
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 374327
> Core [69] count = 374433
> Total count (size: 8): 748760
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 324111
> Core [69] count = 320038
> Total count (size: 32): 644149
> Test OK
>
> Patched
> =======
> $ sudo app/test/dpdk-test -l 68,69
> EAL: Detected 128 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> EAL: No available hugepages reported in hugepages-2048kB
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.0 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0000:01:00.1 (socket 0)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_mlx5 (15b3:1019) device: 0030:01:00.1 (socket 8)
> EAL: using IOMMU type 7 (sPAPR)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.0 (socket 8)
> EAL: Probe PCI driver: net_i40e (8086:1583) device: 0034:01:00.1 (socket 8)
> APP: HPET is not enabled, using TSC as default timer
> RTE>>ring_perf_autotest
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 42.00
> legacy APIs: MP/MC: single: 56.27
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 43.63
> legacy APIs: SP/SC: burst (size: 32): 50.04
> legacy APIs: MP/MC: burst (size: 8): 58.42
> legacy APIs: MP/MC: burst (size: 32): 65.51
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 43.61
> legacy APIs: SP/SC: bulk (size: 32): 50.02
> legacy APIs: MP/MC: bulk (size: 8): 58.43
> legacy APIs: MP/MC: bulk (size: 32): 65.49
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.16
> legacy APIs: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> legacy APIs: SP/SC: bulk (size: 8): 12.43
> legacy APIs: MP/MC: bulk (size: 8): 16.17
> legacy APIs: SP/SC: bulk (size: 32): 3.10
> legacy APIs: MP/MC: bulk (size: 32): 3.65
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 363208
> Core [69] count = 363334
> Total count (size: 8): 726542
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 361592
> Core [69] count = 361690
> Total count (size: 32): 723282
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 42.78
> elem APIs: element size 16B: MP/MC: single: 56.75
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: burst (size: 32): 59.27
> elem APIs: element size 16B: MP/MC: burst (size: 8): 60.66
> elem APIs: element size 16B: MP/MC: burst (size: 32): 75.03
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 45.04
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 59.33
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.65
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 75.04
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16
>
> ### Testing using two hyperthreads ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.14
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.56
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.22
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 3.86
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [68] count = 372618
> Core [69] count = 372415
> Total count (size: 8): 745033
>
> Bulk enq/dequeue count on size 32
> Core [68] count = 318784
> Core [69] count = 316066
> Total count (size: 32): 634850
> Test OK
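
For anyone who wants to put a number on the difference, the legacy-API summary table quoted above can be reduced to a per-test relative change. The following is only a minimal sketch: the values are transcribed by hand from that table, and the names `results`, `relative_change`, and `summarize` are illustrative, not part of DPDK or its test suite.

```python
# Values copied from the "without this patch" / "with this patch" summary
# table quoted in this thread (POWER9, legacy APIs only).
# Each entry maps a test name to (unpatched, patched).
results = {
    "SP/SC burst (size: 8)": (43.63, 43.63),
    "SP/SC burst (size: 32)": (50.07, 50.04),
    "MP/MC burst (size: 8)": (58.43, 58.42),
    "MP/MC burst (size: 32)": (65.52, 65.51),
    "SP/SC bulk (size: 8)": (43.61, 43.61),
    "SP/SC bulk (size: 32)": (50.05, 50.02),
    "MP/MC bulk (size: 8)": (58.43, 58.43),
    "MP/MC bulk (size: 32)": (65.50, 65.49),
}


def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


def summarize(table):
    """Map each test name to its relative change in percent."""
    return {name: relative_change(b, a) for name, (b, a) in table.items()}


if __name__ == "__main__":
    changes = summarize(results)
    for name, pct in changes.items():
        print(f"{name:<24} {pct:+.3f}%")
    # Largest change in either direction across all legacy-API tests.
    largest = max(changes.values(), key=abs)
    print(f"largest change: {largest:+.3f}%")
```

On these numbers every legacy-API case changes by less than 0.1% in either direction, which is consistent with the observation that the two runs per version gave the same results.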