Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator

2020-08-18 Thread Alexei Starovoitov
On Tue, Aug 18, 2020 at 05:30:37PM -0700, Yonghong Song wrote:
> On 8/18/20 5:05 PM, Alexei Starovoitov wrote:
> > On Tue, Aug 18, 2020 at 03:23:09PM -0700, Yonghong Song wrote:
> > > We did not use cond_resched() since for some iterators, e.g.,
> > > netlink iterator, where rcu read_l

Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator

2020-08-18 Thread Yonghong Song
On 8/18/20 5:05 PM, Alexei Starovoitov wrote:
> On Tue, Aug 18, 2020 at 03:23:09PM -0700, Yonghong Song wrote:
> > We did not use cond_resched() since for some iterators, e.g.,
> > netlink iterator, where rcu read_lock critical section spans between
> > consecutive seq_ops->next(), which makes impossible

Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator

2020-08-18 Thread Alexei Starovoitov
On Tue, Aug 18, 2020 at 03:23:09PM -0700, Yonghong Song wrote:
> We did not use cond_resched() since for some iterators, e.g.,
> netlink iterator, where rcu read_lock critical section spans between
> consecutive seq_ops->next(), which makes impossible to do cond_resched()
> in the key while loop

[PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator

2020-08-18 Thread Yonghong Song
In our production system, we observed rcu stalls when 'bpftool prog' is running.
  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 	7-: (20999 ticks this GP) idle=302/1/0x4000 softirq=1508852/1508852 fqs=4913
  	(t=21031 jiffies g=2534773 q=179750)
  NMI backtrace for