-----Original Message----- > Date: Fri, 21 Sep 2018 05:48:44 +0000 > From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > To: Jerin Jacob <jerin.ja...@caviumnetworks.com> > CC: "Phil Yang (Arm Technology China)" <phil.y...@arm.com>, "dev@dpdk.org" > <dev@dpdk.org>, nd <n...@arm.com>, "kkokkilaga...@caviumnetworks.com" > <kkokkilaga...@caviumnetworks.com>, "Gavin Hu (Arm Technology China)" > <gavin...@arm.com>, "ferruh.yi...@intel.com" <ferruh.yi...@intel.com> > Subject: RE: [PATCH v2 2/3] kni: fix kni fifo synchronization > > > > > > -----Original Message----- > > > > > Date: Wed, 19 Sep 2018 21:42:39 +0800 > > > > > From: Phil Yang <phil.y...@arm.com> > > > > > To: dev@dpdk.org > > > > > CC: n...@arm.com, jerin.ja...@caviumnetworks.com, > > > > > kkokkilaga...@caviumnetworks.com, honnappa.nagaraha...@arm.com, > > > > > gavin...@arm.com > > > > > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization > > > > > X-Mailer: git-send-email 2.7.4 > > > > > > > > > > > > > + Ferruh Yigit <ferruh.yi...@intel.com> > > > > > > > > > > > > > > With existing code in kni_fifo_put, rx_q values are not being > > > > > updated before updating fifo_write. While reading rx_q in > > > > > kni_net_rx_normal, This is causing the sync issue on other core. > > > > > The same situation happens in kni_fifo_get as well. > > > > > > > > > > So syncing the values by adding C11 atomic memory barriers to make > > > > > sure the values being synced before updating fifo_write and fifo_read. > > > > > > > > > > Fixes: 3fc5ca2 ("kni: initial import") > > > > > Signed-off-by: Phil Yang <phil.y...@arm.com> > > > > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > > > Reviewed-by: Gavin Hu <gavin...@arm.com> > > > > > --- > > > > > .../linuxapp/eal/include/exec-env/rte_kni_common.h | 5 ++++ > > > > > lib/librte_kni/rte_kni_fifo.h | 30 > > > > > +++++++++++++++++++++- > > > > > 2 files changed, 34 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git > > > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h > > > > > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h > > > > > index cfa9448..1fd713b 100644 > > > > > --- > > > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h > > > > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common. > > > > > +++ h > > > > > @@ -54,8 +54,13 @@ struct rte_kni_request { > > > > > * Writing should never overwrite the read position > > > > > */ > > > > > struct rte_kni_fifo { > > > > > +#ifndef RTE_USE_C11_MEM_MODEL > > > > > volatile unsigned write; /**< Next position to be > > > > > written*/ > > > > > volatile unsigned read; /**< Next position to be read */ > > > > > +#else > > > > > + unsigned write; /**< Next position to be > > > > > written*/ > > > > > + unsigned read; /**< Next position to be read */ > > > > > +#endif > > > > > unsigned len; /**< Circular buffer length */ > > > > > unsigned elem_size; /**< Pointer size - for 32/64 > > > > > bit OS */ > > > > > void *volatile buffer[]; /**< The buffer contains mbuf > > > > > pointers */ > > > > > diff --git a/lib/librte_kni/rte_kni_fifo.h > > > > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644 > > > > > --- a/lib/librte_kni/rte_kni_fifo.h > > > > > +++ b/lib/librte_kni/rte_kni_fifo.h > > > > > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void > > > > > **data, unsigned num) { > > > > > unsigned i = 0; > > > > > unsigned fifo_write = fifo->write; > > > > > - unsigned fifo_read = fifo->read; > > > > > unsigned new_write = fifo_write; > > > > > +#ifdef RTE_USE_C11_MEM_MODEL > > > > > + unsigned fifo_read = __atomic_load_n(&fifo->read, > > > > > + > > > > > +__ATOMIC_ACQUIRE); #else > > > > > + unsigned fifo_read = fifo->read; #endif > > > > > > > > Correct. > > > > > > My apologies, did not follow your comment here. Do you want us to correct > > anything here? '#endif' is not appearing on the correct line in the email, > > but it > > shows up fine on the patch work. > > > > No. What I meant is, code is correct. > > > > > > > > > > > > > > > > > > > > > > > for (i = 0; i < num; i++) { > > > > > new_write = (new_write + 1) & (fifo->len - 1); @@ > > > > > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void > > > > > **data, > > > > unsigned num) > > > > > fifo->buffer[fifo_write] = data[i]; > > > > > fifo_write = new_write; > > > > > } > > > > > +#ifdef RTE_USE_C11_MEM_MODEL > > > > > + __atomic_store_n(&fifo->write, fifo_write, > > > > > +__ATOMIC_RELEASE); #else > > > > > + rte_smp_wmb(); > > > > > fifo->write = fifo_write; > > > > > +#endif > > > > > > > > Correct. > > > > > return i; > > > > > } > > > > > > > > > > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void > > > > > **data, unsigned num) { > > > > > unsigned i = 0; > > > > > unsigned new_read = fifo->read; > > > > > +#ifdef RTE_USE_C11_MEM_MODEL > > > > > + unsigned fifo_write = __atomic_load_n(&fifo->write, > > > > > +__ATOMIC_ACQUIRE); #else > > > > > unsigned fifo_write = fifo->write; > > > > > +#endif > > > > > > > > Correct. > > > > > > > > > + > > > > > for (i = 0; i < num; i++) { > > > > > if (new_read == fifo_write) > > > > > break; > > > > > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void > > > > > **data, > > > > unsigned num) > > > > > data[i] = fifo->buffer[new_read]; > > > > > new_read = (new_read + 1) & (fifo->len - 1); > > > > > } > > > > > +#ifdef RTE_USE_C11_MEM_MODEL > > > > > + __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE); > > > > > +#else > > > > > + rte_smp_wmb(); > > > > > fifo->read = new_read; > > > > > +#endif > > > > > > > > Correct. > > > > > > > > > return i; > > > > > } > > > > > > > > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void > > > > > **data, unsigned num) static inline uint32_t > > > > > kni_fifo_count(struct rte_kni_fifo *fifo) { > > > > > +#ifdef RTE_USE_C11_MEM_MODEL > > > > > + unsigned fifo_write = __atomic_load_n(&fifo->write, > > > > > + __ATOMIC_ACQUIRE); > > > > > + unsigned fifo_read = __atomic_load_n(&fifo->read, > > > > > + > > > > > +__ATOMIC_ACQUIRE); > > > > > > > > Isn't too heavy to have two __ATOMIC_ACQUIREs? a simple > > > > rte_smp_rmb() would be enough here. Right? > > > > or > > > > Do we need __ATOMIC_ACQUIRE for fifo_write case? > > > > > > > We also had some amount of debate internally on this: > > > 1) We do not want to use rte_smp_rmb() as we want to keep the memory > > models separated (for ex: while using C11, use C11 everywhere). It is also > > not > > sufficient, please see 3) below. > > > > But Nothing technically wrong in using rte_smp_rmb() here in terms > > functionally and code generated by the compiler. > > rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal. > 'LDAR' is a better option which is generated when C11 atomics are used.
Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ? > > > > > > 2) This API can get called from writer or reader, so both the loads > > > have to be __ATOMIC_ACQUIRE > > > 3) Other option is to use __ATOMIC_RELAXED. That would allow any > > loads/stores around of this API to get reordered, especially since this is > > an > > inline function. This would put burden on the application to manage the > > ordering depending on its usage. It will also require the application to > > understand the implementation of this API. > > > > __ATOMIC_RELAXED may be fine too for _count() case as it may not very > > important to get the exact count for the exact very moment, Application can > > retry. > > > > I am in favor of performance effective implementation. > > The requirement on the correctness of the count depends on the usage of this > function. I see the following usage: > > In the file kni_net.c, function: kni_net_tx: > > if (kni_fifo_free_count(kni->tx_q) == 0 || > kni_fifo_count(kni->alloc_q) == 0) { > /** > * If no free entry in tx_q or no entry in alloc_q, > * drops skb and goes out. > */ > goto drop; > } > > There is no retry here, the packet is dropped. OK. Then pick an implementation which is an optimal this case. I think, then rte_smp_rmb() makes sense here as a) no #ifdef clutter b) it is optimal compared to 2 x LDAR > > > > > > > > > > > > > > Other than that, I prefer to avoid ifdef clutter by introducing two > > > > separate file just like ring C11 implementation. > > > > > > > > I don't have strong opinion on this this part, I let KNI MAINTAINER > > > > to decide on how to accommodate this change. > > > > > > I prefer to change this as well, I am open for suggestions. > > > Introducing two separate files would be too much for this library. A > > > better > > way would be to have something similar to 'smp_store_release' provided by > > the kernel. i.e. create #defines for loads/stores. Hide the clutter behind > > the > > #defines. > > > > No Strong opinion on this, leaving to KNI Maintainer. > Will wait on this before re-spinning the patch > > > > > This patch needs to split by two, > > a) Fixes for non C11 implementation(i.e new addition to rte_smp_wmb()) > > b) add support for C11 implementation. > Agree > > > > > > > > > > > > > > > + return (fifo->len + fifo_write - fifo_read) & (fifo->len - > > > > > +1); #else > > > > > return (fifo->len + fifo->write - fifo->read) & (fifo->len > > > > > - 1); > > > > > +#endif > > > > > } > > > > > -- > > > > > 2.7.4 > > > > >