On Tue, Jan 29, 2019 at 09:59:03AM +0100, Peter Zijlstra wrote: > On Mon, Jan 28, 2019 at 01:37:12PM -0800, Alexei Starovoitov wrote: > > On Mon, Jan 28, 2019 at 09:43:10AM +0100, Peter Zijlstra wrote: > > > > Isn't that still broken? AFAIU networking progs can happen in task > > > context (TX) and SoftIRQ context (RX), which can nest. > > > > Sure. sendmsg side of networking can be interrupted by napi receive. > > Both can have bpf progs attached at different points, but napi won't run > > when bpf prog is running, because bpf prog disables preemption. > > Disabling preemption is not sufficient, it needs to have done > local_bh_disable(), which isn't unlikely given this is all networking > code.
right. excellent point. Since bh is not disabled nesting of the sender side socket filter is possible with receive side socket filter. Theoretically the same socket filter prog can be nested. I believe it's fine with the current state of things including parrallel access to bpf maps (my patch to add preempt_disable around socket filter is still needed). But this point regarding local_bh_disable changes the plans for bpf_spin_lock. We'd need local_bh_disable around socket filters and around syscall access to map with 'struct bpf_spin_lock' or two new helpers bpf_spin_lock_irqsave and corresponding verifier support. I guess we'll argue about best option when that time comes. For now there will be no bpf_spin_lock in socket filters and I need to add local_bh_disable for syscall access to maps with bpf_spin_lock to avoid deadlock. I'll send a set of fixes for lockdep false positives and respin bpf_spin_lock set.