On Tue, Aug 20, 2013 at 09:32:13AM -0500, Matt Sealey wrote: > On Mon, Aug 19, 2013 at 11:59 AM, Ezequiel Garcia > <ezequiel.gar...@free-electrons.com> wrote: > > On Mon, Aug 12, 2013 at 07:29:42PM +0100, Will Deacon wrote: > > > >> This means that you don't have ordering guarantees between the two accesses > >> outside of the CPU, potentially giving you: > >> > >> spin_lock(&__io_lock); > >> spin_unlock(&__io_lock); > >> writel((readl(reg) & ~clear) | set, reg); > >> > >> which is probably not what you want. > >> > >> I suggest adding an iowmb after the writel if you really need this ordering > >> to be enforced (but this may have a significant performance impact, > >> depending on your SoC). > > > > I don't want to argue with you, given I have zero knowledge about this > > ordering issue. However let me ask you a question. > > > > In arch/arm/include/asm/spinlock.h I'm seeing this comment: > > > > ""ARMv6 ticket-based spin-locking. > > A memory barrier is required after we get a lock, and before we > > release it, because V6 CPUs are assumed to have weakly ordered > > memory."" > > > > and also: > > > > static inline void arch_spin_unlock(arch_spinlock_t *lock) > > { > > smp_mb(); > > lock->tickets.owner++; > > dsb_sev(); > > } > > > > So, knowing this atomic API should work for every ARMv{N}, and not being > > very > > sure what the call to dsb_sev() does. Would you care to explain how the > > above > > is *not* enough to guarantee a memory barrier before the spin unlocking? > > arch_spin_[un]lock as an API is not guaranteed to use a barrier before > or after doing anything, even if this particular implementation does. > > dsb_sev() is an SMP helper which does a synchronization barrier and > then sends events to other CPUs which informs them of the unlock. If > the other CPUs were in WFE state waiting on that spinlock, they can > now thunder in and attempt to lock it themselves. It's not really > relevant to the discussion as arch_spin_unlock is not guaranteed to > perform a barrier before returning. > > On some other architecture there may be ISA additions which make > locking barrier-less, or on a specific implementation of an ARM > architecture SoC whereby there may be a bank of "hardware spinlocks" > available. > > So, in this sense, you shouldn't rely on implementation-specific > behaviors of a function. If you need to be sure C follows B follows A, > insert a barrier yourself. Don't expect A to barrier for you just > because you saw some source code that does it today, as tomorrow it > may be different. It's not an optimization, just a potential source of > new bugs if the implementation changes underneath you later. >
Of course. I agree completely. Thanks a lot, -- Ezequiel GarcĂa, Free Electrons Embedded Linux, Kernel and Android Engineering http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/