Hi Jerin
Do you think the next step is for me to implement the load_acquire
half barrier as FreeBSD does,
or to find another performance test case to compare the performance impact?
Thanks for any suggestions.
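For reference, here is a minimal sketch of what I mean by a load_acquire half barrier on the producer side, assuming the GCC/Clang __atomic builtins. The struct and function names (sketch_ring, sketch_free_entries) are hypothetical and simplified; this is neither the rte_ring code nor the FreeBSD code, only an illustration of the ordering idea:

```c
#include <stdint.h>

/* Hypothetical, simplified ring header; not DPDK's struct rte_ring. */
struct sketch_ring {
    uint32_t capacity;   /* usable number of slots */
    uint32_t prod_head;  /* next slot a producer will claim */
    uint32_t cons_tail;  /* last slot fully released by consumers */
};

/*
 * Return the number of free slots a producer sees and report the
 * producer head it sampled through old_head.
 *
 * The acquire load of prod_head keeps the later load of cons_tail
 * from being reordered before it, which a weakly ordered CPU such
 * as arm64 is otherwise allowed to do.
 */
static inline uint32_t
sketch_free_entries(const struct sketch_ring *r, uint32_t *old_head)
{
    uint32_t cons_tail;

    *old_head = __atomic_load_n(&r->prod_head, __ATOMIC_ACQUIRE);
    cons_tail = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);

    /* Unsigned arithmetic keeps the result correct across index wrap. */
    return r->capacity + cons_tail - *old_head;
}
```

With capacity 7, prod_head 3 and cons_tail 1 this reports 5 free slots. Whether the acquire loads cost less than a full read barrier on our SoC is exactly what I would like to measure.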
Cheers,
Jia
On 10/25/2017 9:26 PM, Jerin Jacob Wrote:
-----Original Message-----
Date: Tue, 24 Oct 2017 10:04:26 +0800
From: Jia He <hejia...@gmail.com>
To: Jerin Jacob <jerin.ja...@caviumnetworks.com>
Cc: "Ananyev, Konstantin" <konstantin.anan...@intel.com>, "Zhao, Bing"
<iloveth...@163.com>, Olivier MATZ <olivier.m...@6wind.com>,
"dev@dpdk.org" <dev@dpdk.org>, "jia...@hxt-semitech.com"
<jia...@hxt-semitech.com>, "jie2....@hxt-semitech.com"
<jie2....@hxt-semitech.com>, "bing.z...@hxt-semitech.com"
<bing.z...@hxt-semitech.com>, "Richardson, Bruce"
<bruce.richard...@intel.com>
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod
loading when doing enqueue/dequeue
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
Thunderbird/52.4.0
Hi Jerin
Hi Jia,
example:
./build/app/test -c 0xff -n 4
ring_perf_autotest
It seems that on our arm64 server, ring_perf_autotest finishes in a few
seconds:
Yes. It just needs a few seconds.
Is anything wrong with the configuration or environment setup?
By default, DPDK on arm64 uses the EL0 counter to measure cycles. I
think that on your SoC it runs at 50 MHz or 100 MHz, so you can
follow the scheme below to get accurate cycle measurements:
See: http://dpdk.org/doc/guides/prog_guide/profile_app.html
check: 44.2.2. High-resolution cycle counter
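To illustrate why a slow counter can report 0, here is a small arithmetic sketch. The helper name ticks_for_duration and the 5 ns / 2 GHz figures are assumptions chosen for illustration, not measurements from any SoC:

```c
#include <stdint.h>

/*
 * Number of whole counter ticks that elapse during duration_ns on a
 * counter running at freq_hz. Integer division truncates, which is
 * the effect being illustrated. The product stays within 64 bits for
 * the magnitudes used here (nanoseconds times a few GHz).
 */
static inline uint64_t
ticks_for_duration(uint64_t duration_ns, uint64_t freq_hz)
{
    return duration_ns * freq_hz / 1000000000ULL;
}
```

At 100 MHz one tick lasts 10 ns, so ticks_for_duration(5, 100000000ULL) is 0 for an assumed 5 ns operation, while a 2 GHz counter gives ticks_for_duration(5, 2000000000ULL) equal to 10.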
root@ubuntu:/home/hj/dpdk/build/build/test/test# ./test -c 0xff -n 4
EAL: Detected 44 lcore(s)
EAL: Probing VFIO support...
APP: HPET is not enabled, using TSC as default timer
RTE>>per_lcore_autotest
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 0
MP/MC single enq/dequeue: 2
SP/SC burst enq/dequeue (size: 8): 0
If you follow the above link, the value '0' will be replaced with more
meaningful data.
MP/MC burst enq/dequeue (size: 8): 0
SP/SC burst enq/dequeue (size: 32): 0
MP/MC burst enq/dequeue (size: 32): 0
### Testing empty dequeue ###
SC empty dequeue: 0.02
MC empty dequeue: 0.04
### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 0.12
MP/MC bulk enq/dequeue (size: 8): 0.31
SP/SC bulk enq/dequeue (size: 32): 0.05
MP/MC bulk enq/dequeue (size: 32): 0.09
### Testing using two hyperthreads ###
SP/SC bulk enq/dequeue (size: 8): 0.12
MP/MC bulk enq/dequeue (size: 8): 0.39
SP/SC bulk enq/dequeue (size: 32): 0.04
MP/MC bulk enq/dequeue (size: 32): 0.12
### Testing using two physical cores ###
SP/SC bulk enq/dequeue (size: 8): 0.37
MP/MC bulk enq/dequeue (size: 8): 0.92
SP/SC bulk enq/dequeue (size: 32): 0.12
MP/MC bulk enq/dequeue (size: 32): 0.26
Test OK
RTE>>
Cheers,
Jia