On Fri, 2012-10-26 at 09:15 -0700, Peter LaDow wrote: > On Tue, Oct 23, 2012 at 9:32 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: > > Could you try following patch ? > > So, I applied your patch. And so far, it seems to have fixed the > issue. I've had my systems running for 48 hours, and no lockup in > iptables. Usually, I could get a lockup to occur within 12 to 24 > hours, and running this long tells me this patch may have things > fixed. > > Now, I have a couple of questions. > > First, I'm not sure how this actually fixes things. It does seem > there was a race before. But it isn't clear to me how this patch > eliminates the race. I fear that this patch only reduces the window > in which a race could occur, but doesn't eliminate it completely. >
... > Second, perhaps use seqlock_t instead of a seqcount_t for xt_recseq? > This eliminates the problem. But given that it has been using > seqcount_t for a long time, and is still using it, and nobody else has > had this issue appear, makes me wonder if this problem isn't something > unique to the RT patches. Perhaps only use seqlock_t in built for > PREEMPT_RT_FULL? > > Third, recent RT patches (such as 3.6.2-rt4) have added > preempt_disable_rt() and preempt_enable_rt() calls inside of > read_seqcount_begin() and write_seqcount_end() respectively. The call > to local_bh_disable/local_bh_enable doesn't do anything to > disable/enable preemption (in 3.6.2-rt4), but it does in 3.03.36-rt58. > But in 3.0.36-rt58 it only did so if not in an atomic context. And > it doesn't appear to be an atomic context since local_bh_disable > increments preempt_count by SOFTIRQ_OFFSET. And it appeared that > building with SMP enabled, even though it did call > preempt_disable_rt(), indirectly, through local_bh_disable(), but the > lockup still occurred. > > Finally, after more testing, if this patch proves a solution to the > problem, we could apply it locally. But what kind of testing would be > required as part of a submission back to the general kernel/RT folk? > The patch is easy enough to generate, but if it can't be proven that > this patch actually fixes anything, I fail to see how it would be > useful. > > Thanks, > Pete LaDow Upstream kernel is fine, there is no race, as long as : local_bh_disable() disables BH and preemption. Apparently RT changes this, so RT needs to change the code. cmpxchg() has strong guarantees (and is also slower than upstream code), and seems a reasonable way to fix RT/iptables -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/