> -----Original Message-----
> From: Gavin Hu (Arm Technology China)
> Sent: September 26, 2018 17:30
> To: Gavin Hu (Arm Technology China) <gavin...@arm.com>; dev@dpdk.org
> Cc: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Steve Capper
> <steve.cap...@arm.com>; Ola Liljedahl <ola.liljed...@arm.com>;
> jerin.ja...@caviumnetworks.com; nd <n...@arm.com>; sta...@dpdk.org; Justin
> He <justin...@arm.com>
> Subject: RE: [PATCH v3 3/3] ring: move the atomic load of head above the loop
>
> +Justin He for review
>
> > -----Original Message-----
> > From: Gavin Hu <gavin...@arm.com>
> > Sent: Monday, September 17, 2018 4:17 PM
> > To: dev@dpdk.org
> > Cc: Gavin Hu (Arm Technology China) <gavin...@arm.com>; Honnappa
> > Nagarahalli <honnappa.nagaraha...@arm.com>; Steve Capper
> > <steve.cap...@arm.com>; Ola Liljedahl <ola.liljed...@arm.com>;
> > jerin.ja...@caviumnetworks.com; nd <n...@arm.com>; sta...@dpdk.org
> > Subject: [PATCH v3 3/3] ring: move the atomic load of head above the loop
> >
> > In __rte_ring_move_prod_head and __rte_ring_move_cons_head, move the
> > __atomic_load_n up and out of the do {} while loop. When the
> > compare-exchange fails, it already updates *old_head, so loading it
> > again on every iteration is costly and unnecessary.
> >
> > This reduces latency slightly, by about 1~5%.
> >
> > Test results with the patch (two cores):
> > SP/SC bulk enq/dequeue (size: 8): 5.64
> > MP/MC bulk enq/dequeue (size: 8): 9.58
> > SP/SC bulk enq/dequeue (size: 32): 1.98
> > MP/MC bulk enq/dequeue (size: 32): 2.30
> >
> > Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Gavin Hu <gavin...@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > Reviewed-by: Steve Capper <steve.cap...@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
> > ---
> > lib/librte_ring/rte_ring_c11_mem.h | 10 ++++------
> > 1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h
> > index 0eae3b3..95cc508 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -61,13 +61,11 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> > unsigned int max = n;
> > int success;
> >
> > +*old_head = __atomic_load_n(&r->prod.head, __ATOMIC_ACQUIRE);
> > do {
> > /* Reset n to the initial burst count */
> > n = max;
> >
> > -*old_head = __atomic_load_n(&r->prod.head,
> > -__ATOMIC_ACQUIRE);
> > -
> > /* load-acquire synchronize with store-release of ht->tail
> > * in update_tail.
> > */
> > @@ -93,6 +91,7 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> > if (is_sp)
> > r->prod.head = *new_head, success = 1;
> > else
> > +/* on failure, *old_head is updated */
> > success = __atomic_compare_exchange_n(&r->prod.head,
> > old_head, *new_head,
> > 0, __ATOMIC_ACQUIRE,
> > @@ -134,13 +133,11 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> > int success;
> >
> > /* move cons.head atomically */
> > +*old_head = __atomic_load_n(&r->cons.head, __ATOMIC_ACQUIRE);
> > do {
> > /* Restore n as it may change every loop */
> > n = max;
> >
> > -*old_head = __atomic_load_n(&r->cons.head,
> > -__ATOMIC_ACQUIRE);
> > -
> > /* this load-acquire synchronize with store-release of ht->tail
> > * in update_tail.
> > */
> > @@ -165,6 +162,7 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> > if (is_sc)
> > r->cons.head = *new_head, success = 1;
> > else
> > +/* on failure, *old_head will be updated */
> > success = __atomic_compare_exchange_n(&r->cons.head,
> > old_head,
> > *new_head,
> > 0,
> > __ATOMIC_ACQUIRE,
> > --
> > 2.7.4
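
A minimal sketch of why the reload inside the loop is redundant: the GCC
builtin __atomic_compare_exchange_n writes the current value of the target
back into its "expected" argument when the comparison fails, so the next
iteration already starts from a fresh value. The names demo_head,
demo_stale_cas and demo_move_head below are hypothetical and only illustrate
the pattern. This is not the rte_ring code, and it leaves out the free-space
check against the tail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for r->prod.head; not DPDK's struct rte_ring. */
static uint32_t demo_head;

/* Single-threaded illustration: a CAS with a stale expected value fails
 * and refreshes "expected" with the value currently stored in the target. */
static uint32_t demo_stale_cas(void)
{
	uint32_t head = 5;
	uint32_t expected = 3; /* deliberately stale */
	int ok = __atomic_compare_exchange_n(&head, &expected, 4,
			0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);

	assert(!ok);     /* comparison failed, head was not modified */
	return expected; /* now holds the value the builtin observed in head */
}

/* Advance demo_head by n with one load hoisted above the retry loop.
 * Returns the head value from which the n slots were claimed. */
static uint32_t demo_move_head(uint32_t n)
{
	uint32_t old_head = __atomic_load_n(&demo_head, __ATOMIC_ACQUIRE);
	uint32_t new_head;
	int success;

	do {
		new_head = old_head + n;
		/* on failure, old_head is updated by the builtin itself */
		success = __atomic_compare_exchange_n(&demo_head,
				&old_head, new_head,
				0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	} while (!success);

	return old_head;
}
```

Calling demo_stale_cas() returns 5, the refreshed value. With demo_head at 0,
demo_move_head(8) returns 0 and leaves demo_head at 8.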
Reviewed-by: Jia He <justin...@arm.com>
Cheers,
Justin (Jia He)