On 5 January 2017 at 14:24, Jerin Jacob <jerin.ja...@caviumnetworks.com> wrote:
> On Thu, Jan 05, 2017 at 01:31:44PM +0800, Jianbo Liu wrote:
>> On 4 January 2017 at 18:01, Jerin Jacob <jerin.ja...@caviumnetworks.com>
>> wrote:
>> > On Tue, Jan 03, 2017 at 03:48:32PM +0800, Jianbo Liu wrote:
>> >> On 27 December 2016 at 17:49, Jerin Jacob
>> >> <jerin.ja...@caviumnetworks.com> wrote:
>> >> > CC: Jianbo Liu <jianbo....@linaro.org>
>> >> > Signed-off-by: Jerin Jacob <jerin.ja...@caviumnetworks.com>
>> >> > ---
>> >> >  lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++++++
>> >> >  1 file changed, 6 insertions(+)
>> >> >
>> >> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
>> >> > b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
>> >> > index 78ebea2..ef0efc7 100644
>> >> > --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
>> >> > +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h
>> >> > @@ -88,6 +88,12 @@ static inline void rte_rmb(void)
>> >> >
>> >> >  #define rte_smp_rmb() dmb(ishld)
>> >> >
>> >> > +#define rte_io_mb() rte_mb()
>> >> > +
>> >> > +#define rte_io_wmb() rte_wmb()
>> >> > +
>> >> > +#define rte_io_rmb() rte_rmb()
>> >> > +
>> >>
>> >> I think it's better to use outer shareable dmb for io barrier, instead of
>> >> dsb.
>> >
>> > It is difficult to generalize. AFAIK, from the IO barrier perspective
>> > dsb would be the right candidate. But just for the DMA barrier between IO,
>> > maybe an outer shareable dmb is enough. In terms of performance implication,
>> > the fastpath code (doorbell write) has been changed to a relaxed write in all
>> > the drivers in this patchset, and rte_io_* will only be used by
>> > rte_[read/write]8/16/32/64, which will be in the slow path.
>> > So, IMO, it is better to stick with dsb, and it is safe from the complete
>> > IO barrier perspective.
>>
>> If so, why not use *mb() directly?
>
> Adding David Marchand, EAL Maintainer.
>
> Instead of rte_io_?.
> I thought, IO specific constraints can be abstracted
> here in rte_io_*. Apart from arm, other archs like "arc" have similar
> constraints. IMHO, no harm in keeping that abstraction.
>
> Thoughts ?
>
> http://lxr.free-electrons.com/ident?i=__iormb
>
>> >
>> > At least on ThunderX, I couldn't see any performance difference between
>> > using dsb(st) and dmb(oshst) for the dma write barrier before the doorbell
>> > register write in the fastpath. In case there are platforms which have such
>> > a performance difference, we could maybe add rte_dma_wmb() and rte_dma_rmb()
>> > in future, like the Linux kernel's dma_wmb() and dma_rmb(). (But I couldn't
>> > see all the drivers using it, though.)
>> >
>>
>> But there is no io_*mb() in the kernel, so you want to be different?
>
> It is there for arm, arm64, arc architectures in the Linux kernel. Please check
> the writel implementation for arm64
>
> http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L143
>

Yes, I know. But I'm afraid it will be mixed up with dma_*mb by someone else.
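
To make the io_*mb versus dma_*mb distinction concrete, here is a minimal sketch of the two usages contrasted in this thread. It is not the code from the patchset: every sketch_* name is hypothetical, the "doorbell" is an ordinary variable standing in for an MMIO register, and the choice of dsb(st) for the IO barrier and dmb(oshst) for the DMA barrier just follows the options discussed above. On non-arm64 hosts the barriers fall back to a full fence from the GCC/Clang builtin so the sketch still builds.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the barriers discussed in the thread.
 * Assumption: arm64 emits the real instructions; any other target
 * uses a full fence so the example still compiles and runs. */
#if defined(__aarch64__)
#define sketch_io_wmb()  __asm__ volatile("dsb st" ::: "memory")
#define sketch_dma_wmb() __asm__ volatile("dmb oshst" ::: "memory")
#else
#define sketch_io_wmb()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define sketch_dma_wmb() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Relaxed write: no barrier; the caller is responsible for ordering. */
static inline void
sketch_write32_relaxed(uint32_t value, volatile uint32_t *addr)
{
	*addr = value;
}

/* Ordered write (slow path): IO write barrier, then the store. */
static inline void
sketch_write32(uint32_t value, volatile uint32_t *addr)
{
	sketch_io_wmb();
	sketch_write32_relaxed(value, addr);
}

/* Toy queue: desc[] is what a device would read via DMA, and
 * doorbell stands in for a device register. */
struct sketch_queue {
	uint32_t desc[4];
	volatile uint32_t doorbell;
};

/* Fast path: fill the descriptor, issue one DMA write barrier,
 * then ring the doorbell with a relaxed write. */
static inline void
sketch_ring_doorbell(struct sketch_queue *q, uint32_t slot, uint32_t desc)
{
	slot &= 3; /* keep the index inside desc[] */
	q->desc[slot] = desc;
	sketch_dma_wmb();
	sketch_write32_relaxed(slot, &q->doorbell);
}
```

In this sketch the slow-path helper always pays for the heavier barrier, while the fast path picks its own barrier before a relaxed write, which is where a separate dma_*mb-style name could be mixed up with io_*mb.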