Hi Jerin
On 10/25/2017 9:26 PM, Jerin Jacob Wrote:
-----Original Message-----
Date: Tue, 24 Oct 2017 10:04:26 +0800
From: Jia He <hejia...@gmail.com>
To: Jerin Jacob <jerin.ja...@caviumnetworks.com>
Cc: "Ananyev, Konstantin" <konstantin.anan...@intel.com>, "Zhao, Bing"
<iloveth...@163.com>, Olivier MATZ <olivier.m...@6wind.com>,
"dev@dpdk.org" <dev@dpdk.org>, "jia...@hxt-semitech.com"
<jia...@hxt-semitech.com>, "jie2....@hxt-semitech.com"
<jie2....@hxt-semitech.com>, "bing.z...@hxt-semitech.com"
<bing.z...@hxt-semitech.com>, "Richardson, Bruce"
<bruce.richard...@intel.com>
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod
loading when doing enqueue/dequeue
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
Thunderbird/52.4.0
Hi Jerin
Hi Jia,
example:
./build/app/test -c 0xff -n 4
ring_perf_autotest
Seem in our arm64 server, the ring_perf_autotest will be finished in a few
seconds:
Yes. It just need a few seconds.
Anything wrong about configuration or environment setup?
By default, arm64+dpdk will be using el0 counter to measure the cycles. I
think, in your SoC, it will be running at 50MHz or 100MHz.So, You can
follow the below scheme to get accurate cycle measurement scheme:
See: http://dpdk.org/doc/guides/prog_guide/profile_app.html
check: 44.2.2. High-resolution cycle counter
Thank you for the suggestions.
But I tried your provided ko module to enable the accurate cycle
measurement in user space, the
test result is as below:
root@nfv-demo01:~/dpdk/build/build/test/test# lsmod |grep pmu
pmu_el0_cycle_counter 262144 0
[old codes, without any patches]
============================================
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 0
MP/MC single enq/dequeue: 0
SP/SC burst enq/dequeue (size: 8): 0
MP/MC burst enq/dequeue (size: 8): 0
SP/SC burst enq/dequeue (size: 32): 0
MP/MC burst enq/dequeue (size: 32): 0
### Testing empty dequeue ###
SC empty dequeue: 0.00
MC empty dequeue: 0.00
### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 0.00
MP/MC bulk enq/dequeue (size: 8): 0.00
SP/SC bulk enq/dequeue (size: 32): 0.00
MP/MC bulk enq/dequeue (size: 32): 0.00
### Testing using two hyperthreads ###
SP/SC bulk enq/dequeue (size: 8): 0.00
MP/MC bulk enq/dequeue (size: 8): 0.00
SP/SC bulk enq/dequeue (size: 32): 0.00
MP/MC bulk enq/dequeue (size: 32): 0.00
Test OK
[with full rte_smp_rmb barrier patch]
======================================
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 0
MP/MC single enq/dequeue: 0
SP/SC burst enq/dequeue (size: 8): 0
MP/MC burst enq/dequeue (size: 8): 0
SP/SC burst enq/dequeue (size: 32): 0
MP/MC burst enq/dequeue (size: 32): 0
### Testing empty dequeue ###
SC empty dequeue: 0.00
MC empty dequeue: 0.00
### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 0.00
MP/MC bulk enq/dequeue (size: 8): 0.00
SP/SC bulk enq/dequeue (size: 32): 0.00
MP/MC bulk enq/dequeue (size: 32): 0.00
### Testing using two hyperthreads ###
SP/SC bulk enq/dequeue (size: 8): 0.00
MP/MC bulk enq/dequeue (size: 8): 0.00
SP/SC bulk enq/dequeue (size: 32): 0.00
MP/MC bulk enq/dequeue (size: 32): 0.00
Test OK
RTE>>
No difference,all time is 0 ?
If I rmmod pmu_el0_cycle_counter and revise the ./build/.config to
comment the config line
#CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y
Then the time is bigger than 0
root@ubuntu:/home/hj/dpdk/build/build/test/test# ./test -c 0xff -n 4
EAL: Detected 44 lcore(s)
EAL: Probing VFIO support...
APP: HPET is not enabled, using TSC as default timer
RTE>>per_lcore_autotest
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 0
MP/MC single enq/dequeue: 2
SP/SC burst enq/dequeue (size: 8): 0
If you follow the above link, The value '0' will be replaced with more meaning
full data.
MP/MC burst enq/dequeue (size: 8): 0
SP/SC burst enq/dequeue (size: 32): 0
MP/MC burst enq/dequeue (size: 32): 0
### Testing empty dequeue ###
SC empty dequeue: 0.02
MC empty dequeue: 0.04
### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 0.12
MP/MC bulk enq/dequeue (size: 8): 0.31
SP/SC bulk enq/dequeue (size: 32): 0.05
MP/MC bulk enq/dequeue (size: 32): 0.09
### Testing using two hyperthreads ###
SP/SC bulk enq/dequeue (size: 8): 0.12
MP/MC bulk enq/dequeue (size: 8): 0.39
SP/SC bulk enq/dequeue (size: 32): 0.04
MP/MC bulk enq/dequeue (size: 32): 0.12
### Testing using two physical cores ###
SP/SC bulk enq/dequeue (size: 8): 0.37
MP/MC bulk enq/dequeue (size: 8): 0.92
SP/SC bulk enq/dequeue (size: 32): 0.12
MP/MC bulk enq/dequeue (size: 32): 0.26
Test OK
RTE>>
Cheers,
Jia
By default, arm64+dpdk will be using el0 counter to measure the cycles. I
think, in your SoC, it will be running at 50MHz or 100MHz.So, You can
follow the below scheme to get accurate cycle measurement scheme:
See: http://dpdk.org/doc/guides/prog_guide/profile_app.html
check: 44.2.2. High-resolution cycle counter
--
Cheers,
Jia