On Fri, Oct 26, 2012 at 2:05 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: > Do you know what is per cpu data in linux kernel ?
I sorta did. But since your response, I did more reading, and now I see what you mean. But I don't think this is a per cpu issue. More below. > Because its not needed. Really I dont know why you want that. > > Once you are sure a thread cannot be interrupted by a softirq, and > cannot migrate to another cpu, access to percpu data doesnt need other > synchronization at all. Because there are multiple entry points on the same CPU. In net/ipv4/netfilter/ip_tables, there are two entries to xt_write_recseq_begin(). The first is in ipt_do_table and the other is in get_counters. Where we are seeing the lockup is with a getsockopt syscall leading to do_counters. The other path is through ipt_do_table, which is installed as a hook. I'm not sure from what context the hooks are called, but it is certainly from a different context than the syscall. > Following sequence is safe : > > addend = (__this_cpu_read(xt_recseq.sequence) + 1) & 1; > /* > * This is kind of a write_seqcount_begin(), but addend is 0 or 1 > * We dont check addend value to avoid a test and conditional jump, > * since addend is most likely 1 > */ > __this_cpu_add(xt_recseq.sequence, addend); If this were safe, we wouldn't be seeing this lockup and your patch wouldn't be needed. So it seems that your patch doesn't really address the issue that we are not "sure a thread cannot be interrupted by a softirq, and cannot migrate to another cpu". Well, we know it cannot migrate to another CPU, because there isn't another CPU. So apparently, it can be interrupted by a softirq. So local_bh_disable isn't doing anything useful in the RT patches with regard to this. As I mentioned earlier, I think perhaps what your patch did was ensure an atomic update of the sequence counter. But it does nothing to synchronize two writers. If they were already synchronized (such as via the call to local_bh_disable), then we wouldn't see sequence counter corruption, no? Pete -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/