Tuesday, January 15, 2019 10:29 AM, Ilya Maximets: > Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: add platform memory > ordering feature support > > On 15.01.2019 9:33, Shahaf Shuler wrote: > > Thursday, January 10, 2019 10:37 PM, Shahaf Shuler: > >> Subject: RE: [dpdk-dev] [PATCH v2] net/virtio: add platform memory > >> ordering feature support > >> > >> Wednesday, January 9, 2019 5:50 PM, Michael S. Tsirkin: > >>> alejandro.luc...@netronome.com; Daniel Marcovitch > >>> <dani...@mellanox.com> > >>> Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: add platform memory > >>> ordering feature support > >>> > >>> On Wed, Jan 09, 2019 at 05:34:38PM +0300, Ilya Maximets wrote: > >>>> virtio_mb() is really heavy. I'd like to avoid it somehow, but I > >>>> don't know how to do this yet. > >>> > >>> Linux driver doesn't avoid it either. > >> > >> I understand v3 was merged but still would like to continue the > >> discuss and make sure all is clear and agreed. > >> > >> Form patch [1] description it is very clear why we need the > >> rte_smp_mb() barrier. > >> However I am not sure why this barrier is interoperate into rte_mb in > >> case of vDPA. In vDPA case, both read of the user ring and write of > >> the avail index are for local cached memory. > >> The only write which is to uncachable memory (device memory) is the > >> notify itself. > >> > >> As I mentioned, there is a need to have a store fence before doing > >> the notify, but from different reasons. So vDPA use case and need Is > >> a bit different than what presented in [1]. > > > > Any answer? > > It is pity if we add redundant barriers which will degrade the driver > performance. > > Sorry for late responses. Have a lot of work with OVS right now. > > Regarding your question. > Current code looks like this: > > 1. Update ring. > 2. virtio_wmb() > 3. Update idx. > 4. virtio_mb() > 5. read flags. > 6. notify. > > virtio_mb() is here to avoid reordering of steps 3 and 5. > i.e. we need a full barrier to ensure the order between store (idx update) > and load (reading the flags). Otherwise we could miss the notification. > We can't avoid the barrier here, because even x86 does not guarantee the > ordering of the local load with earlier local store.
This is clear. You need the rte_smp_mb() here. My question is why you need the rte_mb() in case of vDPA? As you said, all accesses are local. Pasting you commit code: /* * Per virtio_ring.h in Linux. * For virtio_pci on SMP, we don't need to order with respect to MMIO * accesses through relaxed memory I/O windows, so smp_mb() et al are * sufficient. * * For using virtio to talk to real devices (eg. vDPA) we do need real * barriers. */ static inline void virtio_mb(uint8_t weak_barriers) { if (weak_barriers) rte_smp_mb(); else rte_mb(); } > > > > >> > >> [1] > >> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpa > >> > tches.dpdk.org%2Fpatch%2F49545%2F&data=02%7C01%7Cshahafs%40 > mellan > >> > ox.com%7C01907f1b2e0e4002cb7508d67ac38a98%7Ca652971c7d2e4d9ba6a4 > d1492 > >> > 56f461b%7C0%7C0%7C636831377591864200&sdata=TSpc%2Fzyq2aq0N3 > %2Bh4o > >> ro4std8ut%2FQU6%2BOeMDeeaQdsM%3D&reserved=0 > >> > >>> > >>> -- > >>> MST > > > >