On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote: > > > On Thu, 16 Aug 2007, Paul E. McKenney wrote: > > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote: > > > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote: > > > > > > > > The compiler can also reorder non-volatile accesses. For an example > > > > patch that cares about this, please see: > > > > > > > > http://lkml.org/lkml/2007/8/7/280 > > > > > > > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and > > > > rcu_read_unlock() to ensure that accesses aren't reordered with respect > > > > to interrupt handlers and NMIs/SMIs running on that same CPU. > > > > > > Good, finally we have some code to discuss (even though it's > > > not actually in the kernel yet). > > > > There was some earlier in this thread as well. > > Hmm, I never quite got what all this interrupt/NMI/SMI handling and > RCU business you mentioned earlier was all about, but now that you've > pointed to the actual code and issues with it ...
Glad to help... > > > First of all, I think this illustrates that what you want > > > here has nothing to do with atomic ops. The ORDERED_WRT_IRQ > > > macro occurs a lot more times in your patch than atomic > > > reads/sets. So *assuming* that it was necessary at all, > > > then having an ordered variant of the atomic_read/atomic_set > > > ops could do just as well. > > > > Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler > > to maintain ordering, then I could just use them instead of having to > > create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a > > different patch.) > > +#define WHATEVER(x) (*(volatile typeof(x) *)&(x)) > > I suppose one could want volatile access semantics for stuff that's > a bit-field too, no? One could, but this is not supported in general. So if you want that, you need to use the usual bit-mask tricks and (for setting) atomic operations. > Also, this gives *zero* "re-ordering" guarantees that your code wants > as you've explained it below) -- neither w.r.t. CPU re-ordering (which > probably you don't care about) *nor* w.r.t. compiler re-ordering > (which you definitely _do_ care about). You are correct about CPU re-ordering (and about the fact that this example doesn't care about it), but not about compiler re-ordering. The compiler is prohibited from moving a volatile access across a sequence point. One example of a sequence point is a statement boundary. Because all of the volatile accesses in this code are separated by statement boundaries, a conforming compiler is prohibited from reordering them. > > > However, I still don't know which atomic_read/atomic_set in > > > your patch would be broken if there were no volatile. Could > > > you please point them out? > > > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with > > atomic_read() and atomic_set(). Starting with __rcu_read_lock(): > > > > o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++" > > was ordered by the compiler after > > "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then > > suppose an NMI/SMI happened after the rcu_read_lock_nesting but > > before the rcu_flipctr. > > > > Then if there was an rcu_read_lock() in the SMI/NMI > > handler (which is perfectly legal), the nested rcu_read_lock() > > would believe that it could take the then-clause of the > > enclosing "if" statement. But because the rcu_flipctr per-CPU > > variable had not yet been incremented, an RCU updater would > > be within its rights to assume that there were no RCU reads > > in progress, thus possibly yanking a data structure out from > > under the reader in the SMI/NMI function. > > > > Fatal outcome. Note that only one CPU is involved here > > because these are all either per-CPU or per-task variables. > > Ok, so you don't care about CPU re-ordering. Still, I should let you know > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you > want is a full compiler optimization barrier(). No. See above. > [ Your code probably works now, and emits correct code, but that's > just because of gcc did what it did. Nothing in any standard, > or in any documented behaviour of gcc, or anything about the real > (or expected) semantics of "volatile" is protecting the code here. ] Really? Why doesn't the prohibition against moving volatile accesses across sequence points take care of this? > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1" > > was ordered by the compiler to follow the > > "ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI > > happened between the two, then an __rcu_read_lock() in the NMI/SMI > > would incorrectly take the "else" clause of the enclosing "if" > > statement. If some other CPU flipped the rcu_ctrlblk.completed > > in the meantime, then the __rcu_read_lock() would (correctly) > > write the new value into rcu_flipctr_idx. > > > > Well and good so far. But the problem arises in > > __rcu_read_unlock(), which then decrements the wrong counter. > > Depending on exactly how subsequent events played out, this could > > result in either prematurely ending grace periods or never-ending > > grace periods, both of which are fatal outcomes. > > > > And the following are not needed in the current version of the > > patch, but will be in a future version that either avoids disabling > > irqs or that dispenses with the smp_read_barrier_depends() that I > > have 99% convinced myself is unneeded: > > > > o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting); > > > > o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1; > > > > Furthermore, in that future version, irq handlers can cause the same > > mischief that SMI/NMI handlers can in this version. > > > > Next, looking at __rcu_read_unlock(): > > > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1" > > was reordered by the compiler to follow the > > "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--", > > then if an NMI/SMI containing an rcu_read_lock() occurs between > > the two, this nested rcu_read_lock() would incorrectly believe > > that it was protected by an enclosing RCU read-side critical > > section as described in the first reversal discussed for > > __rcu_read_lock() above. Again, fatal outcome. > > > > This is what we have now. It is not hard to imagine situations that > > interact with -both- interrupt handlers -and- other CPUs, as described > > earlier. > > It's not about interrupt/SMI/NMI handlers at all! What you clearly want, > simply put, is that a certain stream of C statements must be emitted > by the compiler _as they are_ with no re-ordering optimizations! You must > *definitely* use barrier(), IMHO. Almost. I don't care about most of the operations, only about the loads and stores marked volatile. Again, although the compiler is free to reorder volatile accesses that occur -within- a single statement, it is prohibited by the standard from moving volatile accesses from one statement to another. Therefore, this code can legitimately use volatile. Or am I missing something subtle? Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html