[Patch net] vsock: fix recursive ->recvmsg calls

2024-08-11 Thread Cong Wang
From: Cong Wang After a vsock socket has been added to a BPF sockmap, its prot->recvmsg has been replaced with vsock_bpf_recvmsg(). Thus the following recursion could happen: vsock_bpf_recvmsg() -> __vsock_recvmsg() -> vsock_connectible_recvmsg() -> p
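The recursion described in this message can be modeled in user space. The sketch below is not the patch itself (the call chain is cut off in this listing); it only assumes that an overriding handler falls back through the same function pointer that now points at itself. All names (`dispatch_recvmsg`, `override_recvmsg_buggy`, `override_recvmsg_fixed`, `MAX_DEPTH`) are hypothetical, and the depth guard exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8  /* illustrative guard so the demo terminates */

struct sock;
typedef int (*recvmsg_fn)(struct sock *sk);

/* Hypothetical stand-in for a protocol ops table. */
struct proto {
    recvmsg_fn recvmsg;
};

struct sock {
    const struct proto *prot;
    int depth;  /* number of handler entries so far */
};

/* Original handler: does the real work and never re-dispatches. */
int base_recvmsg(struct sock *sk)
{
    sk->depth++;
    return 0;
}

/* Generic entry point: always goes through sk->prot->recvmsg. */
int dispatch_recvmsg(struct sock *sk)
{
    assert(sk != NULL && sk->prot != NULL);
    if (sk->depth >= MAX_DEPTH)
        return -1;  /* in a real kernel this would be unbounded recursion */
    return sk->prot->recvmsg(sk);
}

/* Buggy override: falls back via the dispatcher, which points back here. */
int override_recvmsg_buggy(struct sock *sk)
{
    sk->depth++;
    return dispatch_recvmsg(sk);
}

/* Fixed override: falls back to the original handler directly. */
int override_recvmsg_fixed(struct sock *sk)
{
    sk->depth++;
    return base_recvmsg(sk);
}
```

With the buggy override installed, the dispatcher keeps re-entering the override until the guard trips; with the fixed one, the call chain is override, then base handler, then return.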

Re: [PATCH net 2/2] net/sched: sch_frag: fix stack OOB read while fragmenting IPv4 packets

2021-04-20 Thread Cong Wang
packet fragment > support.") > Cc: # 5.11 > Reported-by: Shuang Li > Signed-off-by: Davide Caratti Acked-by: Cong Wang Thanks.

Re: [PATCH net 2/2] net/sched: sch_frag: fix stack OOB read while fragmenting IPv4 packets

2021-04-20 Thread Cong Wang
On Tue, Apr 20, 2021 at 1:59 AM Davide Caratti wrote: > > hello Cong, thanks for looking at this! > > On Mon, 2021-04-19 at 11:46 -0700, Cong Wang wrote: > > On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti wrote: > > > diff --git a/net/sched/sch_frag.c b/net/s

Re: [PATCH v4] net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule

2021-04-19 Thread Cong Wang
rio_schedule to > prevent this condition. > > Reported as bug on syzkaller: > https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c > > Reported-by: syzbot+d50710fd0873a9c6b...@syzkaller.appspotmail.com > Signed-off-by: Du Cheng Acked-by: Cong Wang Thanks.
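As a rough illustration of the condition named in the subject line, the sketch below rejects a schedule whose cycle time comes out as zero. It assumes the cycle time is derived by summing per-entry intervals; the function name `compute_cycle_time` and its signature are hypothetical and are not taken from sch_taprio.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: sum the entry intervals and refuse a zero result,
 * since a zero cycle time would later be used as a divisor or period.
 * Returns 0 on success and -EINVAL when the schedule is unusable.
 */
int compute_cycle_time(const uint32_t *intervals, size_t n, int64_t *cycle_time)
{
    int64_t cycle = 0;
    size_t i;

    assert(cycle_time != NULL);

    for (i = 0; i < n; i++)
        cycle += intervals[i];

    if (cycle == 0)
        return -EINVAL;  /* cycle_time == 0 is rejected up front */

    *cycle_time = cycle;
    return 0;
}
```

Validating at parse time keeps the invalid value from reaching any code that assumes a non-zero period.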

Re: [PATCH net 2/2] net/sched: sch_frag: fix stack OOB read while fragmenting IPv4 packets

2021-04-19 Thread Cong Wang
On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti wrote: > diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c > index e1e77d3fb6c0..8c06381391d6 100644 > --- a/net/sched/sch_frag.c > +++ b/net/sched/sch_frag.c > @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff > *skb

[Patch bpf-next v2 9/9] selftests/bpf: add test cases for redirection between udp and unix

2021-04-19 Thread Cong Wang
From: Cong Wang Add two test cases to ensure redirection between udp and unix work bidirectionally. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 165 ++ 1 file

[Patch bpf-next v2 8/9] selftests/bpf: add a test case for unix sockmap

2021-04-19 Thread Cong Wang
From: Cong Wang Add a test case to ensure redirection between two AF_UNIX datagram sockets work. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 92 +++ 1 file changed

[Patch bpf-next v2 7/9] selftests/bpf: factor out add_to_sockmap()

2021-04-19 Thread Cong Wang
From: Cong Wang Factor out a common helper add_to_sockmap() which adds two sockets into a sockmap. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 59 +++ 1 file changed

[Patch bpf-next v2 6/9] selftests/bpf: factor out udp_socketpair()

2021-04-19 Thread Cong Wang
From: Cong Wang Factor out a common helper udp_socketpair() which creates a pair of connected UDP sockets. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 76 ++- 1 file

[Patch bpf-next v2 5/9] sock_map: update sock type checks for AF_UNIX

2021-04-19 Thread Cong Wang
From: Cong Wang Now AF_UNIX datagram supports sockmap and redirection, we can update the sock type checks for them accordingly. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/core/sock_map.c | 8 1 file changed, 8

[Patch bpf-next v2 2/9] af_unix: implement ->psock_update_sk_prot()

2021-04-19 Thread Cong Wang
From: Cong Wang unix_proto is special, it is very different from INET proto, which even does not have a ->close(). We have to add a dummy one to satisfy sockmap. And now we can implement unix_bpf_update_proto() to update sk_prot. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki

[Patch bpf-next v2 4/9] af_unix: implement unix_dgram_bpf_recvmsg()

2021-04-19 Thread Cong Wang
From: Cong Wang We have to implement unix_dgram_bpf_recvmsg() to replace the original ->recvmsg() to retrieve skmsg from ingress_msg. AF_UNIX is again special here because the lack of sk_prot->recvmsg(). I simply add a special case inside unix_dgram_recvmsg() to call sk->sk_prot

[Patch bpf-next v2 1/9] af_unix: implement ->read_sock() for sockmap

2021-04-19 Thread Cong Wang
From: Cong Wang Implement ->read_sock() for AF_UNIX datagram socket, it is pretty much similar to udp_read_sock(). Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/unix/af_unix.c | 37 + 1 f

[Patch bpf-next v2 3/9] af_unix: set TCP_ESTABLISHED for datagram sockets too

2021-04-19 Thread Cong Wang
From: Cong Wang Currently only unix stream socket sets TCP_ESTABLISHED, datagram socket can set this too when they connect to its peer socket. At least __ip4_datagram_connect() does the same. This will be used by the next patch to determine whether an AF_UNIX datagram socket can be redirected

[Patch bpf-next v2 0/9] sockmap: add sockmap support to Unix datagram socket

2021-04-19 Thread Cong Wang
From: Cong Wang This is the last patchset of the original large patchset. In the previous patchset, a new BPF sockmap program BPF_SK_SKB_VERDICT was introduced and UDP began to support it too. In this patchset, we add BPF_SK_SKB_VERDICT support to Unix datagram socket, so that we can finally

Re: [PATCH v2] net: fix a concurrency bug in l2tp_tunnel_register()

2021-04-16 Thread Cong Wang
On Thu, Apr 15, 2021 at 7:18 AM Sishuai Gong wrote: > diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c > index 203890e378cb..879f1264ec3c 100644 > --- a/net/l2tp/l2tp_core.c > +++ b/net/l2tp/l2tp_core.c > @@ -1478,6 +1478,9 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, > struct

Re: [PATCH v3] net: sched: tapr: prevent cycle_time == 0 in parse_taprio_schedule

2021-04-16 Thread Cong Wang
On Thu, Apr 15, 2021 at 4:17 PM Du Cheng wrote: > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > index 8287894541e3..abd6b176383c 100644 > --- a/net/sched/sch_taprio.c > +++ b/net/sched/sch_taprio.c > @@ -901,6 +901,10 @@ static int parse_taprio_schedule(struct taprio_sched *q, >

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-15 Thread Cong Wang
On Wed, Apr 14, 2021 at 9:25 PM Alexei Starovoitov wrote: > > As I said earlier: > " > If prog refers such hmap as above during prog free the kernel does > for_each_map_elem {if (elem->opaque) del_timer().} > " This goes back to our previous discussion. Forcing timer deletions on prog exit is not

Re: [PATCH] net: fix a concurrency bug in l2tp_tunnel_register()

2021-04-14 Thread Cong Wang
On Wed, Apr 14, 2021 at 2:14 PM Sishuai Gong wrote: > > l2tp_tunnel_register() registers a tunnel without fully > initializing its attribute. This can allow another kernel thread > running l2tp_xmit_core() to access the uninitialized data and > then cause a kernel NULL pointer dereference error, a

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-14 Thread Cong Wang
On Mon, Apr 12, 2021 at 4:01 PM Alexei Starovoitov wrote: > > On Mon, Apr 05, 2021 at 05:36:27PM -0700, Cong Wang wrote: > > On Fri, Apr 2, 2021 at 4:45 PM Alexei Starovoitov > > wrote: > > > > > > On Fri, Apr 02, 2021 at 02:24:51PM -0700, Cong Wang wrote: >

Re: A concurrency bug between l2tp_tunnel_register() and l2tp_xmit_core()

2021-04-14 Thread Cong Wang
On Tue, Apr 13, 2021 at 3:10 PM Gong, Sishuai wrote: > > Hi, > > We found a concurrency bug in linux 5.12-rc3 and we are able to reproduce it > under x86. This bug happens when two l2tp functions l2tp_tunnel_register() > and l2tp_xmit_core() are running in parallel. In general, > l2tp_tunnel_re

Re: [Patch net] smc: disallow TCP_ULP in smc_setsockopt()

2021-04-12 Thread Cong Wang
On Sun, Apr 11, 2021 at 11:52 PM Karsten Graul wrote: > > > > On 10/04/2021 20:17, Cong Wang wrote: > > From: Cong Wang > > > > syzbot is able to setup kTLS on an SMC socket, which coincidentally > > uses sk_user_data too, later, kTLS treats it as psock s

[Patch net] smc: disallow TCP_ULP in smc_setsockopt()

2021-04-10 Thread Cong Wang
From: Cong Wang syzbot is able to setup kTLS on an SMC socket, which coincidentally uses sk_user_data too, later, kTLS treats it as psock so triggers a refcnt warning. The cause is that smc_setsockopt() simply calls TCP setsockopt(). I do not think it makes sense to setup kTLS on top of SMC, so

Re: [syzbot] WARNING: refcount bug in sk_psock_get

2021-04-10 Thread Cong Wang
On Fri, Apr 9, 2021 at 12:45 PM John Fastabend wrote: > > syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit:9c54130c Add linux-next specific files for 20210406 > > git tree: linux-next > > console output: https://syzkaller.appspot.com/x/log.txt?x=1

Re: [Patch bpf-next] sock_map: fix a potential use-after-free in sock_map_close()

2021-04-08 Thread Cong Wang
On Thu, Apr 8, 2021 at 5:26 PM John Fastabend wrote: > > Cong Wang wrote: > > From: Cong Wang > > > > The last refcnt of the psock can be gone right after > > sock_map_remove_links(), so sk_psock_stop() could trigger a UAF. > > The reason why I placed sk_pso

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-08 Thread Cong Wang
On Tue, Apr 6, 2021 at 4:36 PM Song Liu wrote: > I am not sure whether this makes sense. I feel there is still some > misunderstanding. It will be helpful if you can share more information > about the overall design. > > BTW: this could be a good topic for the BPF office hour. See more details > h

Re: [PATCH net v2 0/3] Action initalization fixes

2021-04-08 Thread Cong Wang
of > the loop in tcf_action_init() which is properly fixed by the following > patch. I still hate the init_res[] array, but I have no easy and better way to fix it either, so: Acked-by: Cong Wang For the long term, we probably want to split the action ->init() into two: ->init() and ->

Re: [PATCH net v2 2/3] net: sched: fix action overwrite reference counting

2021-04-08 Thread Cong Wang
On Thu, Apr 8, 2021 at 4:59 AM Jamal Hadi Salim wrote: > > On 2021-04-07 7:50 p.m., Cong Wang wrote: > > On Wed, Apr 7, 2021 at 8:36 AM Vlad Buslov wrote: > >> > >> Action init code increments reference counter when it changes an action. > >> This is the d

Re: [PATCH net v2 2/3] net: sched: fix action overwrite reference counting

2021-04-08 Thread Cong Wang
On Thu, Apr 8, 2021 at 12:50 AM Vlad Buslov wrote: > > > On Thu 08 Apr 2021 at 02:50, Cong Wang wrote: > > In my last comments, I actually meant whether we can avoid this > > 'init_res[]' array. Since here you want to check whether an action > > returned by t

Re: [External] linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Cong Wang .
On Wed, Apr 7, 2021 at 8:11 PM Stephen Rothwell wrote: > > Hi all, > > Today's linux-next merge of the net-next tree got a conflict in: > > net/core/skmsg.c > > between commit: > > 144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting") > > from the bpf tree and commit: > > e3526bb

Re: [External] linux-next: manual merge of the net-next tree with the bpf tree

2021-04-07 Thread Cong Wang .
On Wed, Apr 7, 2021 at 8:02 PM Stephen Rothwell wrote: > > Hi all, > > Today's linux-next merge of the net-next tree got a conflict in: > > include/linux/skmsg.h > > between commit: > > 1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset") > > from the bpf tree and commit: > > 8a59f9d1

[Patch bpf-next] sock_map: fix a potential use-after-free in sock_map_close()

2021-04-07 Thread Cong Wang
From: Cong Wang The last refcnt of the psock can be gone right after sock_map_remove_links(), so sk_psock_stop() could trigger a UAF. The reason why I placed sk_psock_stop() there is to avoid RCU read critical section, and more importantly, some callee of sock_map_remove_links() is supposed to

Re: [PATCH net v2 2/3] net: sched: fix action overwrite reference counting

2021-04-07 Thread Cong Wang
On Wed, Apr 7, 2021 at 8:36 AM Vlad Buslov wrote: > > Action init code increments reference counter when it changes an action. > This is the desired behavior for cls API which needs to obtain action > reference for every classifier that points to action. However, act API just > needs to change the

Re: [PATCH net-next 0/2] Additional tests for action API

2021-04-07 Thread Cong Wang
On Wed, Apr 7, 2021 at 8:46 AM Vlad Buslov wrote: > > Add two new tests for action create/change code. Acked-by: Cong Wang Thanks.

Re: WARNING net/core/stream.c:208 when running test_sockmap

2021-04-07 Thread Cong Wang
On Wed, Apr 7, 2021 at 2:22 PM Jiri Olsa wrote: > > hi, > I'm getting couple of WARNINGs below when running > test_sockmap on latest bpf-next/master, like: > > # while :; do ./test_sockmap ; done > > The warning is at: > WARN_ON(sk->sk_forward_alloc); > > so looks like some socket allocation m

[Patch bpf-next] skmsg: pass psock pointer to ->psock_update_sk_prot()

2021-04-06 Thread Cong Wang
From: Cong Wang Using sk_psock() to retrieve psock pointer from sock requires RCU read lock, but we already get psock pointer before calling ->psock_update_sk_prot() in both cases, so we can just pass it without bothering sk_psock(). Reported-and-tested-by: syzbot+320a3bc8d80f47

Re: [syzbot] KASAN: use-after-free Write in sk_psock_stop

2021-04-06 Thread Cong Wang
On Tue, Apr 6, 2021 at 6:01 AM syzbot wrote: > == > BUG: KASAN: use-after-free in __lock_acquire+0x3e6f/0x54c0 > kernel/locking/lockdep.c:4770 > Read of size 8 at addr 888024f66238 by task syz-executor.1/14202 > > CPU: 0 PID: 142

Re: [Patch bpf-next v8 10/16] sock: introduce sk->sk_prot->psock_update_sk_prot()

2021-04-06 Thread Cong Wang
On Mon, Apr 5, 2021 at 1:25 AM Eric Dumazet wrote: > > > > On 3/31/21 4:32 AM, Cong Wang wrote: > > From: Cong Wang > > > > Currently sockmap calls into each protocol to update the struct > > proto and replace it. This certainly won't work when the pro

Re: [syzbot] WARNING: suspicious RCU usage in tcp_bpf_update_proto

2021-04-06 Thread Cong Wang
On Tue, Apr 6, 2021 at 2:44 AM syzbot wrote: > > Hello, > > syzbot found the following issue on: > > HEAD commit:514e1150 net: x25: Queue received packets in the drivers i.. > git tree: net-next > console output: https://syzkaller.appspot.com/x/log.txt?x=112a8831d0 > kernel config:

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-06 Thread Cong Wang
On Mon, Apr 5, 2021 at 11:18 PM Song Liu wrote: > > > > > On Apr 5, 2021, at 6:24 PM, Cong Wang wrote: > > > > On Mon, Apr 5, 2021 at 6:08 PM Song Liu wrote: > >> > >> > >> > >>> On Apr 5, 2021, at 4:49 PM, Cong Wang

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-05 Thread Cong Wang
On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote: > > I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the > coming days. If it works, then we can consider proceeding with it, > otherwise I am all for reverting the whole NOLOCK stuff. > > [1] > https://lore.kernel.org/linux-ca

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-05 Thread Cong Wang
On Mon, Apr 5, 2021 at 6:08 PM Song Liu wrote: > > > > > On Apr 5, 2021, at 4:49 PM, Cong Wang wrote: > > > > On Fri, Apr 2, 2021 at 4:31 PM Song Liu wrote: > >> > >> > >> > >>> On Apr 2, 2021, at 1:57 PM, Cong Wang wrote: >

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-05 Thread Cong Wang
On Fri, Apr 2, 2021 at 4:45 PM Alexei Starovoitov wrote: > > On Fri, Apr 02, 2021 at 02:24:51PM -0700, Cong Wang wrote: > > > > where the key is the timer ID and the value is the timer expire > > > > timer. > > > > > > The timer ID is unnece

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-05 Thread Cong Wang
On Fri, Apr 2, 2021 at 4:31 PM Song Liu wrote: > > > > > On Apr 2, 2021, at 1:57 PM, Cong Wang wrote: > > > > Ideally I even prefer to create timers in kernel-space too, but as I already > > explained, this seems impossible to me. > > Would hrtimer (includ

Re: [PATCH RFC 2/4] net: sched: fix err handler in tcf_action_init()

2021-04-05 Thread Cong Wang
On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov wrote: > So, the following happens in reproduction provided in commit message > when executing "tc actions add action simple sdata \"1\" index 1 > action simple sdata \"2\" index 2" command: > > 1. tcf_action_init() is called with batch of two actions of

[Patch bpf-next] udp_bpf: remove some pointless comments

2021-04-02 Thread Cong Wang
From: Cong Wang These comments in udp_bpf_update_proto() are copied from the original TCP code and apparently do not apply to UDP. Just remove them. Reported-by: Jakub Sitnicki Cc: John Fastabend Cc: Daniel Borkmann Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/ipv4/udp_bpf.c | 2

Re: [Patch bpf-next v8 10/16] sock: introduce sk->sk_prot->psock_update_sk_prot()

2021-04-02 Thread Cong Wang
On Fri, Apr 2, 2021 at 3:16 AM Jakub Sitnicki wrote: > > -struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock) > > +int udp_bpf_update_proto(struct sock *sk, bool restore) > > { > > int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6; > > + struct sk_

Re: [Patch bpf-next v8 11/16] udp: implement ->read_sock() for sockmap

2021-04-02 Thread Cong Wang
On Wed, Mar 31, 2021 at 11:01 PM John Fastabend wrote: > This 'else if' is always true if above is false right? Would be > simpler and clearer IMO as, > >if (used <= 0) { > if (!copied) > copied = used; >

Re: [PATCH RFC 2/4] net: sched: fix err handler in tcf_action_init()

2021-04-02 Thread Cong Wang
On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov wrote: > > With recent changes that separated action module load from action > initialization tcf_action_init() function error handling code was modified > to manually release the loaded modules if loading/initialization of any > further action in same b

Re: [PATCH RFC 1/4] net: sched: fix action overwrite reference counting

2021-04-02 Thread Cong Wang
On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov wrote: > > Action init code increments reference counter when it changes an action. > This is the desired behavior for cls API which needs to obtain action > reference for every classifier that points to action. However, act API just > needs to change th

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-02 Thread Cong Wang
On Fri, Apr 2, 2021 at 12:28 PM Alexei Starovoitov wrote: > > On Wed, Mar 31, 2021 at 09:26:35PM -0700, Cong Wang wrote: > > > This patch introduces a bpf timer map and a syscall to create bpf timer > > from user-space. > > That will severely limit timer api usabilit

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-02 Thread Cong Wang
On Fri, Apr 2, 2021 at 12:45 PM Song Liu wrote: > > > > > On Apr 2, 2021, at 12:08 PM, Cong Wang wrote: > > > > On Fri, Apr 2, 2021 at 10:57 AM Song Liu wrote: > >> > >> > >> > >>> On Apr 2, 2021, at 10:34 AM, Cong Wang

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-02 Thread Cong Wang
On Fri, Apr 2, 2021 at 10:57 AM Song Liu wrote: > > > > > On Apr 2, 2021, at 10:34 AM, Cong Wang wrote: > > > > On Thu, Apr 1, 2021 at 1:17 PM Song Liu wrote: > >> > >> > >> > >>> On Apr 1, 2021, at 10:28 AM, Cong Wang w

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-02 Thread Cong Wang
On Thu, Apr 1, 2021 at 1:17 PM Song Liu wrote: > > > > > On Apr 1, 2021, at 10:28 AM, Cong Wang wrote: > > > > On Wed, Mar 31, 2021 at 11:38 PM Song Liu wrote: > >> > >> > >> > >>> On Mar 31, 2021, at 9:26 PM, Cong Wang wrote

Re: [RFC Patch bpf-next] bpf: introduce bpf timer

2021-04-01 Thread Cong Wang
On Wed, Mar 31, 2021 at 11:38 PM Song Liu wrote: > > > > > On Mar 31, 2021, at 9:26 PM, Cong Wang wrote: > > > > From: Cong Wang > > > > (This patch is still in early stage and obviously incomplete. I am sending > > it out to get some high-lev

[RFC Patch bpf-next] bpf: introduce bpf timer

2021-03-31 Thread Cong Wang
From: Cong Wang (This patch is still in early stage and obviously incomplete. I am sending it out to get some high-level feedbacks. Please kindly ignore any coding details for now and focus on the design.) This patch introduces a bpf timer map and a syscall to create bpf timer from user-space

Re: [PATCH v2 1/1] net: sched: bump refcount for new action in ACT replace mode

2021-03-30 Thread Cong Wang
On Mon, Mar 29, 2021 at 3:55 PM Kumar Kartikeya Dwivedi wrote: > diff --git a/net/sched/act_api.c b/net/sched/act_api.c > index b919826939e0..43cceb924976 100644 > --- a/net/sched/act_api.c > +++ b/net/sched/act_api.c > @@ -1042,6 +1042,9 @@ struct tc_action *tcf_action_init_1(struct net *net, >

[Patch bpf-next v8 16/16] selftests/bpf: add a test case for loading BPF_SK_SKB_VERDICT

2021-03-30 Thread Cong Wang
From: Cong Wang This adds a test case to ensure BPF_SK_SKB_VERDICT and BPF_SK_STREAM_VERDICT will never be attached at the same time. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_basic.c | 40

[Patch bpf-next v8 15/16] selftests/bpf: add a test case for udp sockmap

2021-03-30 Thread Cong Wang
From: Cong Wang Add a test case to ensure redirection between two UDP sockets work. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 136 ++ .../selftests/bpf/progs

[Patch bpf-next v8 14/16] sock_map: update sock type checks for UDP

2021-03-30 Thread Cong Wang
From: Cong Wang Now UDP supports sockmap and redirection, we can safely update the sock type checks for it accordingly. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/core/sock_map.c | 5 - 1 file changed, 4 insertions(+), 1

[Patch bpf-next v8 10/16] sock: introduce sk->sk_prot->psock_update_sk_prot()

2021-03-30 Thread Cong Wang
From: Cong Wang Currently sockmap calls into each protocol to update the struct proto and replace it. This certainly won't work when the protocol is implemented as a module, for example, AF_UNIX. Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each protocol can implement

[Patch bpf-next v8 13/16] udp: implement udp_bpf_recvmsg() for sockmap

2021-03-30 Thread Cong Wang
From: Cong Wang We have to implement udp_bpf_recvmsg() to replace the ->recvmsg() to retrieve skmsg from ingress_msg. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/ipv4/udp_bpf.c |

[Patch bpf-next v8 12/16] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data()

2021-03-30 Thread Cong Wang
From: Cong Wang Although these two functions are only used by TCP, they are not specific to TCP at all, both operate on skmsg and ingress_msg, so fit in net/core/skmsg.c very well. And we will need them for non-TCP, so rename and move them to skmsg.c and export them to modules. Cc: John

[Patch bpf-next v8 11/16] udp: implement ->read_sock() for sockmap

2021-03-30 Thread Cong Wang
From: Cong Wang This is similar to tcp_read_sock(), except we do not need to worry about connections, we just need to retrieve skb from UDP receive queue. Note, the return value of ->read_sock() is unused in sk_psock_verdict_data_ready(), and UDP still does not support splice() due to lack

[Patch bpf-next v8 09/16] sock_map: introduce BPF_SK_SKB_VERDICT

2021-03-30 Thread Cong Wang
From: Cong Wang Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to attach

[Patch bpf-next v8 08/16] sock_map: kill sock_map_link_no_progs()

2021-03-30 Thread Cong Wang
From: Cong Wang Now we can fold sock_map_link_no_progs() into sock_map_link() and get rid of sock_map_link_no_progs(). Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Cc: John Fastabend Signed-off-by: Cong Wang --- net/core/sock_map.c | 55

[Patch bpf-next v8 07/16] sock_map: simplify sock_map_link() a bit

2021-03-30 Thread Cong Wang
From: Cong Wang sock_map_link() passes down map progs, but it is confusing to see both map progs and psock progs. Make the map progs more obvious by retrieving it directly with sock_map_progs() inside sock_map_link(). Now it is aligned with sock_map_link_no_progs() too. Cc: Daniel Borkmann Cc

[Patch bpf-next v8 05/16] skmsg: use rcu work for destroying psock

2021-03-30 Thread Cong Wang
From: Cong Wang The RCU callback sk_psock_destroy() only queues work psock->gc, so we can just switch to rcu work to simplify the code. Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Acked-by: John Fastabend Signed-off-by: Cong Wang --- include/linux/skmsg.h | 5 + net/c

[Patch bpf-next v8 06/16] skmsg: use GFP_KERNEL in sk_psock_create_ingress_msg()

2021-03-30 Thread Cong Wang
From: Cong Wang This function is only called in process context. Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Acked-by: John Fastabend Signed-off-by: Cong Wang --- net/core/skmsg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/skmsg.c b/net/core

[Patch bpf-next v8 03/16] net: introduce skb_send_sock() for sock_map

2021-03-30 Thread Cong Wang
From: Cong Wang We only have skb_send_sock_locked() which requires callers to use lock_sock(). Introduce a variant skb_send_sock() which locks on its own, callers do not need to lock it any more. This will save us from adding a ->sendmsg_locked for each protocol. To reuse the code, p
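The relationship between a "_locked" function and a variant that locks on its own can be sketched as a thin wrapper. This is a single-threaded user-space model, not the kernel code: `struct model_sock`, `model_send_locked()` and `model_send()` are hypothetical names, and a boolean stands in for the socket lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical socket model: a flag stands in for the socket lock. */
struct model_sock {
    bool locked;
    int bytes_sent;
};

void model_lock_sock(struct model_sock *sk)
{
    assert(!sk->locked);  /* no recursive locking in this model */
    sk->locked = true;
}

void model_release_sock(struct model_sock *sk)
{
    assert(sk->locked);
    sk->locked = false;
}

/* "_locked" variant: the caller must already hold the lock. */
int model_send_locked(struct model_sock *sk, int len)
{
    if (!sk->locked)
        return -1;  /* contract violated: lock not held */
    sk->bytes_sent += len;
    return len;
}

/* Self-locking variant: takes the lock, reuses the locked path, releases. */
int model_send(struct model_sock *sk, int len)
{
    int ret;

    model_lock_sock(sk);
    ret = model_send_locked(sk, len);
    model_release_sock(sk);
    return ret;
}
```

Callers that do not already hold the lock use the wrapper, so the locking requirement lives in one place instead of being repeated for each protocol.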

[Patch bpf-next v8 02/16] skmsg: introduce a spinlock to protect ingress_msg

2021-03-30 Thread Cong Wang
From: Cong Wang Currently we rely on lock_sock to protect ingress_msg, it is too big for this, we can actually just use a spinlock to protect this list like protecting other skb queues. __tcp_bpf_recvmsg() is still special because of peeking, it still has to use lock_sock. Cc: Daniel Borkmann
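A list guarded by its own small lock, rather than by a coarse per-socket lock, might look like the following user-space sketch. It uses a C11 `atomic_flag` as a toy spinlock; `struct msg_queue`, `queue_push()` and `queue_pop()` are hypothetical names and do not mirror the skmsg implementation. The message itself notes that peeking in __tcp_bpf_recvmsg() still needs lock_sock, a case this sketch does not model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct msg {
    int id;
    struct msg *next;
};

/* Hypothetical queue with a dedicated spinlock protecting only the list. */
struct msg_queue {
    atomic_flag lock;
    struct msg *head;
    struct msg *tail;
};

void queue_init(struct msg_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->head = NULL;
    q->tail = NULL;
}

static void queue_lock(struct msg_queue *q)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void queue_unlock(struct msg_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Append a message at the tail while holding only the queue lock. */
void queue_push(struct msg_queue *q, struct msg *m)
{
    assert(m != NULL);
    m->next = NULL;
    queue_lock(q);
    if (q->tail)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
    queue_unlock(q);
}

/* Remove and return the head message, or NULL if the queue is empty. */
struct msg *queue_pop(struct msg_queue *q)
{
    struct msg *m;

    queue_lock(q);
    m = q->head;
    if (m) {
        q->head = m->next;
        if (!q->head)
            q->tail = NULL;
    }
    queue_unlock(q);
    return m;
}
```

The critical sections cover only the pointer updates, which is the point of using a narrow lock for the list.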

[Patch bpf-next v8 04/16] skmsg: avoid lock_sock() in sk_psock_backlog()

2021-03-30 Thread Cong Wang
From: Cong Wang We do not have to lock the sock to avoid losing sk_socket, instead we can purge all the ingress queues when we close the socket. Sending or receiving packets after orphaning socket makes no sense. We do purge these queues when psock refcnt reaches zero but here we want to purge

[Patch bpf-next v8 01/16] skmsg: lock ingress_skb when purging

2021-03-30 Thread Cong Wang
From: Cong Wang Currently we purge the ingress_skb queue only when psock refcnt goes down to 0, so locking the queue is not necessary, but in order to be called during ->close, we have to lock it here. Cc: John Fastabend Cc: Daniel Borkmann Cc: Lorenz Bauer Acked-by: Jakub Sitnicki Sig

[Patch bpf-next v8 00/16] sockmap: introduce BPF_SK_SKB_VERDICT and support UDP

2021-03-30 Thread Cong Wang
From: Cong Wang We have thousands of services connected to a daemon on every host via AF_UNIX dgram sockets, after they are moved into VM, we have to add a proxy to forward these communications from VM to host, because rewriting thousands of them is not practical. This proxy uses an AF_UNIX

Re: [PATCH v2 bpf-next 00/14] bpf: Support calling kernel function

2021-03-30 Thread Cong Wang
On Tue, Mar 30, 2021 at 7:36 AM Alexei Starovoitov wrote: > > On Tue, Mar 30, 2021 at 2:43 AM Lorenz Bauer wrote: > > > > On Thu, 25 Mar 2021 at 01:52, Martin KaFai Lau wrote: > > > > > > This series adds support to allow bpf program calling kernel function. > > > > I think there are more build

Re: [Patch bpf-next v7 09/13] udp: implement ->read_sock() for sockmap

2021-03-29 Thread Cong Wang
On Mon, Mar 29, 2021 at 11:23 PM John Fastabend wrote: > > Cong Wang wrote: > > On Mon, Mar 29, 2021 at 1:54 PM John Fastabend > > wrote: > > > > > > Cong Wang wrote: > > > > From: Cong Wang > > > > > > > > This is

Re: [Patch bpf-next v7 12/13] sock_map: update sock type checks for UDP

2021-03-29 Thread Cong Wang
On Mon, Mar 29, 2021 at 4:10 PM John Fastabend wrote: > I think its a bit odd for TCP_ESTABLISHED to work with !tcp, but > thats not your invention so LGTM. It has been there for many years, so why it is suddenly a problem with my patchset? More importantly, why don't you change it by yourself as

Re: [Patch bpf-next v7 09/13] udp: implement ->read_sock() for sockmap

2021-03-29 Thread Cong Wang
On Mon, Mar 29, 2021 at 1:54 PM John Fastabend wrote: > > Cong Wang wrote: > > From: Cong Wang > > > > This is similar to tcp_read_sock(), except we do not need > > to worry about connections, we just need to retrieve skb > > from UDP receive queue. > >

Re: [Patch bpf-next v7 07/13] sock_map: introduce BPF_SK_SKB_VERDICT

2021-03-29 Thread Cong Wang
On Mon, Mar 29, 2021 at 1:10 PM John Fastabend wrote: > > Cong Wang wrote: > > From: Cong Wang > > > > Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is > > confusing and more importantly we still want to distinguish them > > from user-space. So

Re: [PATCH v2 bpf-next 00/14] bpf: Support calling kernel function

2021-03-29 Thread Cong Wang
On Sun, Mar 28, 2021 at 6:24 PM Martin KaFai Lau wrote: > Could you also check the CONFIG_DYNAMIC_FTRACE and also try 'y' if it > is not set? On my side, with pahole==1.17, changing CONFIG_DYNAMIC_FTRACE makes no difference. With pahole==1.20, CONFIG_DYNAMIC_FTRACE=y makes it gone, but CONFIG_DYN

Re: [Patch bpf-next v7 00/13] sockmap: introduce BPF_SK_SKB_VERDICT and support UDP

2021-03-29 Thread Cong Wang
On Mon, Mar 29, 2021 at 8:03 AM John Fastabend wrote: > > Alexei Starovoitov wrote: > > On Sun, Mar 28, 2021 at 01:20:00PM -0700, Cong Wang wrote: > > > From: Cong Wang > > > > > > We have thousands of services connected to a daemon on every host > >

[Patch bpf-next v7 13/13] selftests/bpf: add a test case for udp sockmap

2021-03-28 Thread Cong Wang
From: Cong Wang Add a test case to ensure redirection between two UDP sockets work. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 136 ++ .../selftests/bpf/progs

[Patch bpf-next v7 11/13] udp: implement udp_bpf_recvmsg() for sockmap

2021-03-28 Thread Cong Wang
From: Cong Wang We have to implement udp_bpf_recvmsg() to replace the ->recvmsg() to retrieve skmsg from ingress_msg. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/ipv4/udp_bpf.c |

[Patch bpf-next v7 12/13] sock_map: update sock type checks for UDP

2021-03-28 Thread Cong Wang
From: Cong Wang Now UDP supports sockmap and redirection, we can safely update the sock type checks for it accordingly. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/core/sock_map.c | 5 - 1 file changed, 4 insertions(+), 1

[Patch bpf-next v7 07/13] sock_map: introduce BPF_SK_SKB_VERDICT

2021-03-28 Thread Cong Wang
From: Cong Wang Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to set

[Patch bpf-next v7 10/13] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data()

2021-03-28 Thread Cong Wang
From: Cong Wang Although these two functions are only used by TCP, they are not specific to TCP at all, both operate on skmsg and ingress_msg, so fit in net/core/skmsg.c very well. And we will need them for non-TCP, so rename and move them to skmsg.c and export them to modules. Cc: John

[Patch bpf-next v7 08/13] sock: introduce sk->sk_prot->psock_update_sk_prot()

2021-03-28 Thread Cong Wang
From: Cong Wang Currently sockmap calls into each protocol to update the struct proto and replace it. This certainly won't work when the protocol is implemented as a module, for example, AF_UNIX. Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each protocol can implement

[Patch bpf-next v7 09/13] udp: implement ->read_sock() for sockmap

2021-03-28 Thread Cong Wang
From: Cong Wang This is similar to tcp_read_sock(), except we do not need to worry about connections, we just need to retrieve skb from UDP receive queue. Note, the return value of ->read_sock() is unused in sk_psock_verdict_data_ready(). Cc: John Fastabend Cc: Daniel Borkmann Cc: Ja

[Patch bpf-next v7 06/13] skmsg: use GFP_KERNEL in sk_psock_create_ingress_msg()

2021-03-28 Thread Cong Wang
From: Cong Wang This function is only called in process context. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/core/skmsg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/skmsg.c b/net/core/skmsg.c

[Patch bpf-next v7 04/13] skmsg: avoid lock_sock() in sk_psock_backlog()

2021-03-28 Thread Cong Wang
From: Cong Wang We do not have to lock the sock to avoid losing sk_socket, instead we can purge all the ingress queues when we close the socket. Sending or receiving packets after orphaning socket makes no sense. We do purge these queues when psock refcnt reaches zero but here we want to purge

[Patch bpf-next v7 05/13] skmsg: use rcu work for destroying psock

2021-03-28 Thread Cong Wang
From: Cong Wang The RCU callback sk_psock_destroy() only queues work psock->gc, so we can just switch to rcu work to simplify the code. Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Acked-by: John Fastabend Signed-off-by: Cong Wang --- include/linux/skmsg.h | 5 + net/c

[Patch bpf-next v7 02/13] skmsg: introduce a spinlock to protect ingress_msg

2021-03-28 Thread Cong Wang
From: Cong Wang Currently we rely on lock_sock to protect ingress_msg, it is too big for this, we can actually just use a spinlock to protect this list like protecting other skb queues. __tcp_bpf_recvmsg() is still special because of peeking, it still has to use lock_sock. Cc: John Fastabend

[Patch bpf-next v7 03/13] net: introduce skb_send_sock() for sock_map

2021-03-28 Thread Cong Wang
From: Cong Wang We only have skb_send_sock_locked() which requires callers to use lock_sock(). Introduce a variant skb_send_sock() which locks on its own, callers do not need to lock it any more. This will save us from adding a ->sendmsg_locked for each protocol. To reuse the code, p

[Patch bpf-next v7 00/13] sockmap: introduce BPF_SK_SKB_VERDICT and support UDP

2021-03-28 Thread Cong Wang
From: Cong Wang We have thousands of services connected to a daemon on every host via AF_UNIX dgram sockets, after they are moved into VM, we have to add a proxy to forward these communications from VM to host, because rewriting thousands of them is not practical. This proxy uses an AF_UNIX

[Patch bpf-next v7 01/13] skmsg: lock ingress_skb when purging

2021-03-28 Thread Cong Wang
From: Cong Wang Currently we purge the ingress_skb queue only when psock refcnt goes down to 0, so locking the queue is not necessary, but in order to be called during ->close, we have to lock it here. Cc: John Fastabend Cc: Daniel Borkmann Cc: Lorenz Bauer Acked-by: Jakub Sitnicki Sig

Re: [PATCH v2 bpf-next 00/14] bpf: Support calling kernel function

2021-03-28 Thread Cong Wang
On Sat, Mar 27, 2021 at 3:54 PM Alexei Starovoitov wrote: > > On Sat, Mar 27, 2021 at 3:08 PM Cong Wang wrote: > > BTFIDS vmlinux > > FAILED unresolved symbol cubictcp_state > > make: *** [Makefile:1199: vmlinux] Error 255 > > > > I suspect it is related to

Re: [PATCH v2 bpf-next 00/14] bpf: Support calling kernel function

2021-03-27 Thread Cong Wang
On Sat, Mar 27, 2021 at 2:28 PM Alexei Starovoitov wrote: > > On Sat, Mar 27, 2021 at 2:25 PM Cong Wang wrote: > > > > Hi, > > > > On Wed, Mar 24, 2021 at 8:40 PM Martin KaFai Lau wrote: > > > Martin KaFai Lau (14): > > > bpf: Simplify fr

Re: [PATCH v2 bpf-next 00/14] bpf: Support calling kernel function

2021-03-27 Thread Cong Wang
Hi, On Wed, Mar 24, 2021 at 8:40 PM Martin KaFai Lau wrote: > Martin KaFai Lau (14): > bpf: Simplify freeing logic in linfo and jited_linfo > bpf: Refactor btf_check_func_arg_match > bpf: Support bpf program calling kernel function > bpf: Support kernel function call in x86-32 > tcp: Re

Re: [Patch bpf-next v6 04/12] skmsg: avoid lock_sock() in sk_psock_backlog()

2021-03-26 Thread Cong Wang
On Thu, Mar 25, 2021 at 7:10 PM John Fastabend wrote: > Hi Cong, > > I'm trying to understand if the workqueue logic will somehow prevent the > following, > > CPU0 CPU1 > > work dequeue > sk_psock_backlog() > ... do backlog > ... also maybe sleep > >

Re: [bpf PATCH 2/2] bpf, sockmap: fix incorrect fwd_alloc accounting

2021-03-25 Thread Cong Wang
On Wed, Mar 24, 2021 at 7:46 PM John Fastabend wrote: > > Cong Wang wrote: > > On Wed, Mar 24, 2021 at 2:00 PM John Fastabend > > wrote: > > > > > > Incorrect accounting fwd_alloc can result in a warning when the socket > > > is torn down, > >
