On Wed, Nov 30, 2016 at 05:52:02PM +0000, Ananyev, Konstantin wrote: > Hi Jerin,
Hi Konstantin, > > > > > On Tue, Nov 22, 2016 at 01:46:54PM +0000, Bruce Richardson wrote: > > > On Tue, Nov 22, 2016 at 03:46:38AM +0530, Jerin Jacob wrote: > > > > On Sun, Nov 20, 2016 at 11:21:43PM +0000, Ananyev, Konstantin wrote: > > > > Hi > > > > > > > > > > > > i40e_asq_send_command: rd32 & wr32 under ThunderX gives > > > > > > unpredictable > > > > > > results. To solve this include rte memory > > > > > > barriers > > > > > > > > > > > > Signed-off-by: Satha Rao <skoteshwar at caviumnetworks.com> > > > > > > --- > > > > > > drivers/net/i40e/base/i40e_osdep.h | 14 ++++++++++++++ > > > > > > 1 file changed, 14 insertions(+) > > > > > > > > > > > > diff --git a/drivers/net/i40e/base/i40e_osdep.h > > > > > > b/drivers/net/i40e/base/i40e_osdep.h > > > > > > index 38e7ba5..ffa3160 100644 > > > > > > --- a/drivers/net/i40e/base/i40e_osdep.h > > > > > > +++ b/drivers/net/i40e/base/i40e_osdep.h > > > > > > @@ -158,7 +158,13 @@ do { > > > > > > \ > > > > > > ((volatile uint32_t *)((char *)(a)->hw_addr + (reg))) > > > > > > static inline uint32_t i40e_read_addr(volatile void *addr) > > > > > > { > > > > > > +#if defined(RTE_ARCH_ARM64) > > > > > > + uint32_t val = rte_le_to_cpu_32(I40E_PCI_REG(addr)); > > > > > > + rte_rmb(); > > > > > > + return val; > > > > > > > > > > If you really need an rmb/wmb with MMIO read/writes on ARM, > > > > > I think you can avoid #ifdefs here and use rte_smp_rmb/rte_smp_wmb. > > > > > BTW, I suppose if you need it for i40e, you would need it for other > > > > > devices too. > > > > > > > > Yes. ARM would need for all devices(typically, the devices on external > > > > PCI bus). > > > > I guess rte_smp_rmb may not be the correct abstraction. So we need more > > > > of > > > > rte_rmb() as we need only non smp variant on IO side. I guess then it > > > > make sense to > > > > create new abstraction in eal with following variants so that each arch > > > > gets opportunity to make what it makes sense that specific platform > > > > > > > > rte_readb_relaxed > > > > rte_readw_relaxed > > > > rte_readl_relaxed > > > > rte_readq_relaxed > > > > rte_writeb_relaxed > > > > rte_writew_relaxed > > > > rte_writel_relaxed > > > > rte_writeq_relaxed > > > > rte_readb > > > > rte_readw > > > > rte_readl > > > > rte_readq > > > > rte_writeb > > > > rte_writew > > > > rte_writel > > > > rte_writeq > > > > > > > > Thoughts ? > > > > > > > > > > That seems like a lot of API calls! > > > Perhaps you can clarify - why would the rte_smp_rmb() not work for you? > > > > Currently arm64 mapped DMB as rte_smp_rmb() for smp case. > > > > Ideally for io barrier and non smp case, we need to map it as DSB and it is > > bit heavier than DMB > > Ok, so you need some new macro, like rte_io_(r|w)mb or so, that would expand > into dmb > for ARM, correct? The io barrier expands to dsb. http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L110 > > > > > The linux kernel arm64 mappings > > http://lxr.free-electrons.com/source/arch/arm64/include/asm/io.h#L142 > > > > DMB vs DSB > > https://community.arm.com/thread/3833 > > > > The relaxed one are without any barriers.(the use case like accessing on > > chip peripherals may need only relaxed versions) > > > > Thoughts on new rte EAL abstraction? > > Looks like a lot of macros but if you guys think that would help - NP with > that :) I don't have strong opinion here. If there is concern on a lot of macros then, I can introduce only "rte_io_(r|w)mb" instead of read[b|w|l|q]/write[b|w|l|q]/relaxed. let me know? > Again, in that case we probably can get rid of driver specific pci reg > read/write defines. Yes. But, That's going to have a lot of change :-( If there is no objection then I will introduce "read[b|w|l|q]/write[b|w|l|q]/relaxed" and then change all external pcie drivers with new macros. > > Konstantin > > > > > > > > > /Bruce