On 10/20/2017 4:02 AM, Ananyev, Konstantin Wrote:
-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
Sent: Thursday, October 19, 2017 3:15 PM
To: Zhao, Bing <iloveth...@163.com>; Jia He <hejia...@gmail.com>; Jerin Jacob
<jerin.ja...@caviumnetworks.com>
Cc: Olivier MATZ <olivier.m...@6wind.com>; dev@dpdk.org;
jia...@hxt-semitech.com; jie2....@hxt-semitech.com; bing.zhao@hxt-
semitech.com
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading
when doing enqueue/dequeue
Hi,
On 2017/10/19 18:02, Ananyev, Konstantin wrote:
Hi Jia,
Hi
On 10/13/2017 9:02 AM, Jia He Wrote:
Hi Jerin
On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
-----Original Message-----
Date: Thu, 12 Oct 2017 17:05:50 +0000
[...]
On the same lines,
Jia He, jie2.liu, bing.zhao,
Is this patch based on code review, or did you see this issue on an actual
arm/ppc target? arm64 will take a performance hit from this change.
Sorry, I missed one important piece of information:
our platform is an aarch64 server with 46 cpus.
If we reduced the number of cpus involved, the bug occurred less frequently.
Yes, the memory barrier impacts performance, but correctness is more
important, isn't it ;-)
Maybe we can find some other lightweight barrier here?
Cheers,
Jia
With mbuf_autotest, rte_panic is triggered within seconds:
PANIC in test_refcnt_iter():
(lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
1: [./test(rte_dump_stack+0x38) [0x58d868]]
Aborted (core dumped)
So is it only reproducible with the mbuf refcnt test?
Could it be reproduced with some 'pure' ring test
(no mempools/mbufs, no refcnt, etc.)?
The reason I am asking: that test also does mbuf refcnt updates
(that is what the test was created for), and it too contains some
optimizations to avoid excessive atomic updates.
BTW, if the problem is not reproducible without mbuf refcnt,
may I suggest extending the test to:
- add a check that the enqueue() operation was successful
- walk through the pool and check/printf the refcnt of each mbuf.
Hopefully that will give us some extra information about what is going
wrong here.
Konstantin
Currently the issue has only been found in this case on the ARM
platform; I am not sure how it behaves on x86_64.
I understand that it is only reproducible on arm so far.
What I am asking: with dpdk, is there any other way to reproduce it (on
arm) besides running mbuf_autotest?
Something really simple that does not use mbuf/mempool etc.?
Just dequeue/enqueue from multiple threads and check data integrity at
the end?
If not, what makes you think the problem is precisely in the rte_ring
code? Why not in rte_mbuf, say?
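The "really simple" reproducer being asked for could look roughly like the
sketch below: several producer threads enqueue tagged values, several
consumer threads dequeue them, and at the end the sum of everything dequeued
is checked against the sum of everything enqueued. This is only an
illustration of the harness shape; a mutex-protected toy ring stands in for
rte_ring here (all names are mine), whereas the real test would call
rte_ring_enqueue()/rte_ring_dequeue() on a lock-free ring:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define RING_SZ   1024u   /* power of two, so index wraparound stays safe */
#define N_ITEMS   10000u
#define N_WORKERS 4

static uint64_t ring[RING_SZ];
static unsigned int head, tail;   /* free-running indices, as in rte_ring */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t deq_sum;          /* sum of everything dequeued */

static int enq(uint64_t v)
{
    int ok = 0;
    pthread_mutex_lock(&lock);
    if (head - tail < RING_SZ) {  /* room left? (unsigned math wraps safely) */
        ring[head++ % RING_SZ] = v;
        ok = 1;
    }
    pthread_mutex_unlock(&lock);
    return ok;
}

static int deq(uint64_t *v)
{
    int ok = 0;
    pthread_mutex_lock(&lock);
    if (head != tail) {           /* anything in flight? */
        *v = ring[tail++ % RING_SZ];
        ok = 1;
    }
    pthread_mutex_unlock(&lock);
    return ok;
}

static void *producer(void *arg)
{
    /* each worker enqueues a disjoint range: base .. base+N_ITEMS-1 */
    uint64_t base = (uint64_t)(uintptr_t)arg * N_ITEMS;
    for (unsigned int i = 0; i < N_ITEMS; i++)
        while (!enq(base + i))    /* retry until there is room */
            ;
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    for (unsigned int i = 0; i < N_ITEMS; i++) {
        uint64_t v;
        while (!deq(&v))          /* retry until something arrives */
            ;
        pthread_mutex_lock(&lock);
        deq_sum += v;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Run N_WORKERS producers against N_WORKERS consumers and return the sum of
 * all dequeued values. The values 0 .. N_WORKERS*N_ITEMS-1 are each enqueued
 * exactly once, so the expected total is T*(T-1)/2 with T = 40000. */
uint64_t run_test(void)
{
    pthread_t p[N_WORKERS], c[N_WORKERS];
    for (long i = 0; i < N_WORKERS; i++) {
        pthread_create(&p[i], NULL, producer, (void *)(uintptr_t)i);
        pthread_create(&c[i], NULL, consumer, NULL);
    }
    for (int i = 0; i < N_WORKERS; i++) {
        pthread_join(p[i], NULL);
        pthread_join(c[i], NULL);
    }
    return deq_sum;
}
```

A mismatch between the expected and the dequeued sum would indicate lost or
duplicated entries, which is exactly the integrity check suggested above.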
Actually, I reread your original mail and finally got your point.
If I understand you correctly, the problem with read reordering is that
after we read prod.tail but before we read cons.head,
both cons.head and prod.tail might be updated,
but the prod.tail change might go unnoticed by us.
As an example:
time 0 (cons.head == 0, prod.tail == 0):
prod_tail = r->prod.tail; /* due to read reordering */
/* prod_tail == 0 */
time 1 (cons.head ==5, prod.tail == 5):
*old_head = r->cons.head;
/* cons.head == 5 */
*entries = (prod_tail - *old_head);
/* *entries == (0 - 5) == 0xFFFFFFFB */
And we will move our cons.head forward, even though there are no filled entries
in the ring.
Is that right?
Yes
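The failure mode in the timeline above reduces to plain unsigned arithmetic.
A hypothetical helper (the name is mine, not DPDK's) mirroring the
subtraction in __rte_ring_move_cons_head():

```c
#include <assert.h>
#include <stdint.h>

/* The free-running 32-bit ring indices are compared with plain unsigned
 * subtraction; the result wraps modulo 2^32. */
static uint32_t ring_entries(uint32_t prod_tail, uint32_t cons_head)
{
    return prod_tail - cons_head;
}
```

With the stale read from the example (prod_tail observed as 0 while
cons.head is already 5), the helper returns 0xFFFFFFFB, so almost any
request passes the "enough entries" check on an empty ring. Note that a
legitimate index wrap (prod_tail just past 2^32) still yields the correct
distance, which is why the subtraction alone is fine absent reordering.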
As a side note, I wonder why we have here:
*entries = (prod_tail - *old_head);
instead of:
*entries = r->size + prod_tail - *old_head;
?
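For what it's worth, the `+ r->size` variant would not close the reordering
hole either. A toy comparison (helper names are mine, not DPDK code) of the
two forms under the stale read from the example:

```c
#include <assert.h>
#include <stdint.h>

/* entries as computed today */
static uint32_t entries_plain(uint32_t prod_tail, uint32_t old_head)
{
    return prod_tail - old_head;
}

/* entries with r->size added, as suggested above */
static uint32_t entries_plus_size(uint32_t size, uint32_t prod_tail,
                                  uint32_t old_head)
{
    return size + prod_tail - old_head;
}
```

With a ring of size 8, a stale prod_tail of 0 and cons.head already at 5,
the plain form yields 0xFFFFFFFB and the size-adjusted form yields 3; either
way the consumer is told entries are available on an empty ring and moves
its head forward.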
Yes, I agree with you on this code line.
But reordering will still mess things up even after this change (I have
tested; the result is the same as before).
I thought *entries was a gate preventing the consumer from moving
forward faster than the producer.
In some cases it is correct for prod_tail to be smaller than *old_head,
because of the circular queue;
in other cases it is incorrect, because of read/read reordering.
AFAICT, the root cause here is that the producer tail and consumer head
depend on each other,
so a memory barrier is necessary.
Cheers,
Jia
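The constraint being argued for here is that the cons.head load must not be
passed by the prod.tail load. A minimal sketch of it, using C11 atomics
instead of DPDK's own rte_smp_rmb() macro (struct and function names are
illustrative only, not the actual rte_ring code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct toy_ring {
    _Atomic uint32_t prod_tail;
    _Atomic uint32_t cons_head;
};

static uint32_t readable_entries(struct toy_ring *r)
{
    uint32_t old_head = atomic_load_explicit(&r->cons_head,
                                             memory_order_relaxed);
    /* Read barrier: on a weakly ordered CPU (e.g. aarch64) this stops the
     * prod_tail load below from being hoisted above the cons_head load. */
    atomic_thread_fence(memory_order_acquire);
    uint32_t prod_tail = atomic_load_explicit(&r->prod_tail,
                                              memory_order_relaxed);
    return prod_tail - old_head;
}

/* Single-threaded convenience wrapper for demonstration. */
static uint32_t demo(uint32_t prod_tail, uint32_t cons_head)
{
    struct toy_ring r;
    atomic_init(&r.prod_tail, prod_tail);
    atomic_init(&r.cons_head, cons_head);
    return readable_entries(&r);
}
```

On x86_64 loads are not reordered with other loads, which would explain why
the problem has only been observed on arm so far; there the fence compiles
to nothing, while on aarch64 it emits a real barrier.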
Konstantin
In another mail of this thread, we made a simple test based on this,
captured some information, and pasted it there (I pasted the patch
there :-)).
Are you talking about that one:
http://dpdk.org/dev/patchwork/patch/30405/
?
It still uses test/test/test_mbuf.c...,
but anyway, I don't really understand how mbuf_autotest is supposed
to work with these changes:
@@ -730,7 +739,7 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
rte_ring_enqueue(refcnt_mbuf_ring, m);
}
}
- rte_pktmbuf_free(m);
+ // rte_pktmbuf_free(m);
}
@@ -741,6 +750,12 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
while (!rte_ring_empty(refcnt_mbuf_ring))
;
+ if (NULL != m) {
+ if (1 != rte_mbuf_refcnt_read(m))
+ printf("m ref is %u\n", rte_mbuf_refcnt_read(m));
+ rte_pktmbuf_free(m);
+ }
+
/* check that all mbufs are back into mempool by now */
for (wn = 0; wn != REFCNT_MAX_TIMEOUT; wn++) {
if ((i = rte_mempool_avail_count(refcnt_pool)) == n) {
That means all your mbufs (except the last one) will still be allocated,
so the test would fail - as it should, I think.
And it seems that Juhamatti & Jacob found some reverting action several
months ago.
Didn't get that one either.
Konstantin