Hi
On 10/13/2017 3:33 PM, Jianbo Liu Wrote:
The 10/13/2017 07:19, Jerin Jacob wrote:
-----Original Message-----
Date: Fri, 13 Oct 2017 09:16:31 +0800
From: Jia He <hejia...@gmail.com>
To: Jerin Jacob <jerin.ja...@caviumnetworks.com>, "Ananyev, Konstantin"
<konstantin.anan...@intel.com>
Cc: Olivier MATZ <olivier.m...@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>,
"jia...@hxt-semitech.com" <jia...@hxt-semitech.com>,
"jie2....@hxt-semitech.com" <jie2....@hxt-semitech.com>,
"bing.z...@hxt-semitech.com" <bing.z...@hxt-semitech.com>
Subject: Re: [PATCH] ring: guarantee ordering of cons/prod loading when
doing enqueue/dequeue
Hi
On 10/13/2017 9:02 AM, Jia He Wrote:
Hi Jerin
On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
-----Original Message-----
Date: Thu, 12 Oct 2017 17:05:50 +0000
[...]
On the same lines,
Jia He, jie2.liu, bing.zhao,
Is this patch based on code review, or did you see this issue on any
of the
arm/ppc targets? This change will have a performance impact on arm64.
Sorry, I missed one important piece of information:
Our platform is an aarch64 server with 46 cpus.
Is this an OOO (out-of-order execution) aarch64 CPU implementation?
When we reduced the number of CPUs involved, the bug occurred less frequently.
Yes, the mb barrier impacts performance, but correctness is more important,
isn't it ;-)
Yes.
Maybe we can find some other lightweight barrier here?
Yes. Regarding the lightweight barrier, arm64 has native support for acquire
and release
semantics, which are exposed through gcc as architecture-agnostic
functions.
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
http://preshing.com/20130922/acquire-and-release-fences/
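To make the acquire/release idea concrete, here is a minimal sketch using the gcc __atomic builtins from the first link. The names payload, ready, publish and try_consume are hypothetical and exist only for this illustration; this is not code from DPDK or ODP.

```c
#include <stdint.h>

/* Hypothetical shared state, for illustration only. */
static uint32_t payload;
static uint32_t ready;

/* Writer side: fill in the data, then publish it. */
static void publish(uint32_t value)
{
    payload = value;
    /* Release store: the write to payload above cannot be
     * reordered after this store to ready. */
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/* Reader side: returns 1 and fills *out once the data is published. */
static int try_consume(uint32_t *out)
{
    /* Acquire load: the read of payload below cannot be
     * reordered before this load of ready. */
    if (__atomic_load_n(&ready, __ATOMIC_ACQUIRE) == 0)
        return 0;
    *out = payload;
    return 1;
}
```

On arm64 gcc can typically lower these to load-acquire/store-release instructions instead of a full barrier, which is why they may be cheaper than rte_smp_mb()-style fences; the actual gain would have to be measured.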
Good to know,
1) How much overhead does this patch add on your platform? Relative
numbers are enough.
2) As a prototype, does changing to acquire and release semantics
reduce the overhead on your platform?
+1, can you try what ODP does in the link mentioned below?
Sure, pls see the result:
root@server:~/odp/test/linux-generic/ring# ./ring_main
HW time counter freq: 20000000 hz
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishm.c:880:_odp_ishm_reserve():No huge pages, fall back to normal
pages. check: /proc/sys/vm/nr_hugepages.
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
PKTIO: initialized loop interface.
PKTIO: initialized ipc interface.
PKTIO: initialized socket mmap, use export
ODP_PKTIO_DISABLE_SOCKET_MMAP=1 to disable.
PKTIO: initialized socket mmsg,use export
ODP_PKTIO_DISABLE_SOCKET_MMSG=1 to disable.
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
ODP API version: 1.15.0
ODP implementation name: "odp-linux"
ODP implementation version: "odp-linux" 1.15.0-0 (v1.15.0) 1.15.0.0
CUnit - A unit testing framework for C - Version 2.1-3
http://cunit.sourceforge.net/
Suite: ring basic
Test: ring_test_basic_create
...pktio/ring.c:177:_ring_create():Requested size is invalid, must be
power of 2,and do not exceed the size limit 268435455
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate memory
passed
Test: ring_test_basic_burst ...passed
Test: ring_test_basic_bulk ...passed
Test: ring_test_basic_watermark
...passed_ishmphy.c:152:_odp_ishmphy_map():mmap failed:Cannot allocate
memory
Suite: ring stress
Test: ring_test_stress_1_1_producer_consumer ...passed
Test: ring_test_stress_1_N_producer_consumer ...passed
Test: ring_test_stress_N_1_producer_consumer ...passed
Test: ring_test_stress_N_M_producer_consumer ...
<the test case has hung here for half an hour>
Cheers,
Jia
Reference: a FreeBSD ring/DPDK-style ring implementation using acquire
and release semantics:
https://github.com/Linaro/odp/blob/master/platform/linux-generic/pktio/ring.c
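As a rough illustration of the approach (not a copy of the ODP or DPDK code), a multi-producer/multi-consumer ring with acquire/release ordering might look like the sketch below. All names (sketch_ring, sketch_enqueue, sketch_dequeue, RING_SIZE) are hypothetical. The point relevant to this thread is that the head index is loaded with acquire semantics, so the following load of the opposite tail cannot be reordered before it.

```c
#include <stdint.h>

#define RING_SIZE 8u                 /* assumed: must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

/* Hypothetical ring layout; zero-initialise before use. */
struct sketch_ring {
    uint32_t prod_head, prod_tail;
    uint32_t cons_head, cons_tail;
    void *slots[RING_SIZE];
};

/* Enqueue one object; returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t head, cons_tail;

    do {
        /* Acquire: the cons_tail load below stays after this load. */
        head = __atomic_load_n(&r->prod_head, __ATOMIC_ACQUIRE);
        cons_tail = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);
        if (head - cons_tail >= RING_SIZE)
            return -1;
        /* Try to reserve slot 'head' for this producer. */
    } while (!__atomic_compare_exchange_n(&r->prod_head, &head, head + 1,
                                          0, __ATOMIC_ACQ_REL,
                                          __ATOMIC_RELAXED));

    r->slots[head & RING_MASK] = obj;

    /* Wait until producers that reserved earlier slots have published. */
    while (__atomic_load_n(&r->prod_tail, __ATOMIC_ACQUIRE) != head)
        ;
    /* Release: the slot write above is visible before the new tail. */
    __atomic_store_n(&r->prod_tail, head + 1, __ATOMIC_RELEASE);
    return 0;
}

/* Dequeue one object; returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t head, prod_tail;

    do {
        /* Acquire: the prod_tail load below stays after this load. */
        head = __atomic_load_n(&r->cons_head, __ATOMIC_ACQUIRE);
        prod_tail = __atomic_load_n(&r->prod_tail, __ATOMIC_ACQUIRE);
        if (prod_tail == head)
            return -1;
        /* Try to reserve slot 'head' for this consumer. */
    } while (!__atomic_compare_exchange_n(&r->cons_head, &head, head + 1,
                                          0, __ATOMIC_ACQ_REL,
                                          __ATOMIC_RELAXED));

    *obj = r->slots[head & RING_MASK];

    /* Wait until consumers that reserved earlier slots have finished. */
    while (__atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE) != head)
        ;
    /* Release: the slot read above completes before the slot is freed. */
    __atomic_store_n(&r->cons_tail, head + 1, __ATOMIC_RELEASE);
    return 0;
}
```

The memory orders here are chosen conservatively for the sketch; whether weaker orders are sufficient, and how this compares in cost with the rmb in the patch, would need to be checked on the target platform.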
I will also spend some cycles on this.
Cheers,
Jia
With mbuf_autotest, rte_panic is invoked within seconds:
PANIC in test_refcnt_iter():
(lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
1: [./test(rte_dump_stack+0x38) [0x58d868]]
Aborted (core dumped)
Cheers,
Jia
Konstantin