On Tue, Aug 28, 2007 at 03:56:28PM -0500, Brent Casavant wrote:
> On Fri, 24 Aug 2007, Nick Piggin wrote:
>
> > And all platforms other than sn2 don't appear to reorder IOs after
> > they leave the CPU, so only sn2 needs to do the mmiowb thing before
> > spin_unlock.
>
> I'm sure all of the following is already known to most readers, but
> I thought the paragraph above might potentially cause confusion as
> to the nature of the problem mmiowb() is solving on SN2. So for
> the record...
>
> SN2 does not reorder IOs issued from a single CPU (that would be
> insane). Neither does it reorder IOs once they've reached the IO
> fabric (equally insane). From an individual CPU's perspective, all
> IOs that it issues to a device will arrive at that device in program
> order.

This is why I think mmiowb() is not like a Linux memory barrier.

And I presume that the device would see IOs and regular stores from a
CPU in program order, given the correct wmb()s? (but maybe I'm wrong...
more below).

> (In this entire message, all IOs are assumed to be memory-mapped.)
>
> The problem mmiowb() helps solve on SN2 is the ordering of IOs issued
> from multiple CPUs to a single device. That ordering is undefined, as
> IO transactions are not ordered across CPUs. That is, if CPU A issues
> an IO at time T, and CPU B at time T+1, CPU B's IO may arrive at the
> IO fabric before CPU A's IO, particularly if CPU B happens to be closer
> than CPU A to the target IO bridge on the NUMA network.
>
> The simplistic method to solve this is a lock around the section
> issuing IOs, thereby ensuring serialization of access to the IO
> device. However, as SN2 does not enforce an ordering between normal
> memory transactions and memory-mapped IO transactions, you cannot
> be sure that an IO transaction will arrive at the IO fabric "on the
> correct side" of the unlock memory transaction using this scheme.

Hmm. So what if you had the following code executed by a single CPU:

	writel(data, ioaddr);
	wmb();
	*mem = 10;

Will the device see the io write before the store to mem?

> Enter mmiowb().
>
> mmiowb() causes SN2 to drain the pending IOs from the current CPU's
> node. Once the IOs are drained the CPU can safely unlock a normal
> memory based lock without fear of the unlock's memory write passing
> any outstanding IOs from that CPU.

mmiowb() needs to have the disclaimer that it's probably wrong if called
outside a lock, and it's probably wrong if called between two io writes
(need a regular wmb() in that case). I think some drivers are getting
this wrong.
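
For illustration, here is a minimal userspace sketch of the "IO writes, then mmiowb(), then unlock" pattern described above. Everything in it is an assumption made for the example: struct fake_dev and fake_dev_send() are hypothetical, and spin_lock(), spin_unlock(), writel() and mmiowb() are stand-in stubs that only record the order in which they are called. They model neither real locking nor SN2 hardware; the point is only where mmiowb() sits relative to the unlock.

```c
#include <assert.h>

/* Event codes recorded by the stubs below, in call order. */
enum event { EV_LOCK, EV_WRITEL, EV_MMIOWB, EV_UNLOCK };

#define MAX_EVENTS 16
static enum event events[MAX_EVENTS];
static int nr_events;

static void record(enum event ev)
{
	assert(nr_events < MAX_EVENTS);
	events[nr_events++] = ev;
}

/*
 * ASSUMPTION: userspace stand-ins for the kernel primitives. They only
 * log the call and, for writel(), perform a plain store.
 */
typedef struct { int locked; } spinlock_t;

static void spin_lock(spinlock_t *lock)
{
	lock->locked = 1;
	record(EV_LOCK);
}

static void spin_unlock(spinlock_t *lock)
{
	lock->locked = 0;
	record(EV_UNLOCK);
}

static void writel(unsigned int val, volatile unsigned int *addr)
{
	*addr = val;
	record(EV_WRITEL);
}

static void mmiowb(void)
{
	record(EV_MMIOWB);
}

/* Hypothetical device: a lock plus two memory-mapped registers. */
struct fake_dev {
	spinlock_t lock;
	volatile unsigned int regs[2];
};

/*
 * Issue two IO writes under the lock. The mmiowb() comes after the last
 * IO write and immediately before the unlock, so the IOs are drained
 * before another CPU can take the lock and issue its own.
 */
static void fake_dev_send(struct fake_dev *dev, unsigned int cmd,
			  unsigned int arg)
{
	spin_lock(&dev->lock);
	writel(arg, &dev->regs[0]);
	writel(cmd, &dev->regs[1]);
	mmiowb();
	spin_unlock(&dev->lock);
}
```

Calling fake_dev_send(&dev, cmd, arg) on a zero-initialized struct fake_dev records lock, writel, writel, mmiowb, unlock, in that order. Note that the sketch places no mmiowb() between the two writel() calls, in line with the point above that a regular wmb() would be the thing to reach for there.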