From: Cong Wang
After a vsock socket has been added to a BPF sockmap, its prot->recvmsg
has been replaced with vsock_bpf_recvmsg(). Thus the following
recursion could happen:
vsock_bpf_recvmsg()
-> __vsock_recvmsg()
-> vsock_connectible_recvmsg()
-> p
packet fragment
> support.")
> Cc: # 5.11
> Reported-by: Shuang Li
> Signed-off-by: Davide Caratti
Acked-by: Cong Wang
Thanks.
On Tue, Apr 20, 2021 at 1:59 AM Davide Caratti wrote:
>
> hello Cong, thanks for looking at this!
>
> On Mon, 2021-04-19 at 11:46 -0700, Cong Wang wrote:
> > On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti wrote:
> > > diff --git a/net/sched/sch_frag.c b/net/s
rio_schedule to
> prevent this condition.
>
> Reported as bug on syzkaller:
> https://syzkaller.appspot.com/bug?extid=d50710fd0873a9c6b40c
>
> Reported-by: syzbot+d50710fd0873a9c6b...@syzkaller.appspotmail.com
> Signed-off-by: Du Cheng
Acked-by: Cong Wang
Thanks.
On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti wrote:
> diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c
> index e1e77d3fb6c0..8c06381391d6 100644
> --- a/net/sched/sch_frag.c
> +++ b/net/sched/sch_frag.c
> @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff
> *skb
From: Cong Wang
Add two test cases to ensure that redirection between UDP and Unix
sockets works in both directions.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 165 ++
1 file
From: Cong Wang
Add a test case to ensure that redirection between two AF_UNIX
datagram sockets works.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 92 +++
1 file changed
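For comparison, here is a minimal userspace sketch of the copy that redirection between two AF_UNIX datagram sockets is meant to replace. It is not taken from the selftest; forward_one_datagram() and the 2048-byte buffer are assumptions made for illustration.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: copy one datagram from
 * `from` to `to` through a userspace buffer. Returns the number of
 * bytes forwarded, or -1 on error. */
static ssize_t forward_one_datagram(int from, int to)
{
	char buf[2048];
	ssize_t n = recv(from, buf, sizeof(buf), 0);

	if (n < 0)
		return -1;
	return send(to, buf, (size_t)n, 0);
}
```

A caller could create two pairs with socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) and forward from the receiving end of one pair to the sending end of the other.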
From: Cong Wang
Factor out a common helper add_to_sockmap() which adds two
sockets into a sockmap.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 59 +++
1 file changed
From: Cong Wang
Factor out a common helper udp_socketpair() which creates
a pair of connected UDP sockets.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 76 ++-
1 file
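The patch body is not shown in this listing, so the following is only a sketch of what a helper that creates a pair of connected UDP sockets might look like in plain userspace C. The name udp_socketpair_sketch() and the choice of IPv4 loopback are assumptions, not the selftest's actual code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create two UDP sockets bound to loopback and
 * connect each one to the other. Returns 0 on success, -1 on failure
 * (any descriptor opened so far is closed on failure). */
static int udp_socketpair_sketch(int fds[2])
{
	struct sockaddr_in addr[2];
	socklen_t len;
	int i;

	fds[0] = fds[1] = -1;
	for (i = 0; i < 2; i++) {
		fds[i] = socket(AF_INET, SOCK_DGRAM, 0);
		if (fds[i] < 0)
			goto err;
		memset(&addr[i], 0, sizeof(addr[i]));
		addr[i].sin_family = AF_INET;
		addr[i].sin_addr.s_addr = htonl(INADDR_LOOPBACK);
		addr[i].sin_port = 0; /* let the kernel pick a free port */
		if (bind(fds[i], (struct sockaddr *)&addr[i], sizeof(addr[i])) < 0)
			goto err;
		/* Read back the port the kernel actually assigned. */
		len = sizeof(addr[i]);
		if (getsockname(fds[i], (struct sockaddr *)&addr[i], &len) < 0)
			goto err;
	}
	if (connect(fds[0], (struct sockaddr *)&addr[1], sizeof(addr[1])) < 0)
		goto err;
	if (connect(fds[1], (struct sockaddr *)&addr[0], sizeof(addr[0])) < 0)
		goto err;
	return 0;
err:
	for (i = 0; i < 2; i++)
		if (fds[i] >= 0)
			close(fds[i]);
	return -1;
}
```

Once both sockets are connected, plain send() and recv() work on either end without passing an address.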
From: Cong Wang
Now that AF_UNIX datagram sockets support sockmap and redirection,
we can update the sock type checks for them accordingly.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/core/sock_map.c | 8
1 file changed, 8
From: Cong Wang
unix_proto is special: unlike the INET protos, it does not
even have a ->close(). We have to add a dummy
one to satisfy sockmap.
And now we can implement unix_bpf_update_proto() to update
sk_prot.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
From: Cong Wang
We have to implement unix_dgram_bpf_recvmsg() to replace the
original ->recvmsg() to retrieve skmsg from ingress_msg.
AF_UNIX is again special here because of the lack of
sk_prot->recvmsg(). I simply add a special case inside
unix_dgram_recvmsg() to call sk->sk_prot
From: Cong Wang
Implement ->read_sock() for AF_UNIX datagram sockets; it is
very similar to udp_read_sock().
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/unix/af_unix.c | 37 +
1 f
From: Cong Wang
Currently only Unix stream sockets set TCP_ESTABLISHED;
datagram sockets can set it too when they connect to their
peer socket. At least __ip4_datagram_connect() does the same.
This will be used by the next patch to determine whether an
AF_UNIX datagram socket can be redirected
From: Cong Wang
This is the last patchset of the original large patchset. In the
previous patchset, a new BPF sockmap program BPF_SK_SKB_VERDICT
was introduced and UDP began to support it too. In this patchset,
we add BPF_SK_SKB_VERDICT support to Unix datagram socket, so that
we can finally
On Thu, Apr 15, 2021 at 7:18 AM Sishuai Gong wrote:
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 203890e378cb..879f1264ec3c 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1478,6 +1478,9 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel,
> struct
On Thu, Apr 15, 2021 at 4:17 PM Du Cheng wrote:
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 8287894541e3..abd6b176383c 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -901,6 +901,10 @@ static int parse_taprio_schedule(struct taprio_sched *q,
>
On Wed, Apr 14, 2021 at 9:25 PM Alexei Starovoitov
wrote:
>
> As I said earlier:
> "
> If prog refers such hmap as above during prog free the kernel does
> for_each_map_elem {if (elem->opaque) del_timer().}
> "
This goes back to our previous discussion. Forcing timer deletions on
prog exit is not
On Wed, Apr 14, 2021 at 2:14 PM Sishuai Gong wrote:
>
> l2tp_tunnel_register() registers a tunnel without fully
> initializing its attribute. This can allow another kernel thread
> running l2tp_xmit_core() to access the uninitialized data and
> then cause a kernel NULL pointer dereference error, a
On Mon, Apr 12, 2021 at 4:01 PM Alexei Starovoitov
wrote:
>
> On Mon, Apr 05, 2021 at 05:36:27PM -0700, Cong Wang wrote:
> > On Fri, Apr 2, 2021 at 4:45 PM Alexei Starovoitov
> > wrote:
> > >
> > > On Fri, Apr 02, 2021 at 02:24:51PM -0700, Cong Wang wrote:
>
On Tue, Apr 13, 2021 at 3:10 PM Gong, Sishuai wrote:
>
> Hi,
>
> We found a concurrency bug in linux 5.12-rc3 and we are able to reproduce it
> under x86. This bug happens when two l2tp functions l2tp_tunnel_register()
> and l2tp_xmit_core() are running in parallel. In general,
> l2tp_tunnel_re
On Sun, Apr 11, 2021 at 11:52 PM Karsten Graul wrote:
>
>
>
> On 10/04/2021 20:17, Cong Wang wrote:
> > From: Cong Wang
> >
> > syzbot is able to setup kTLS on an SMC socket, which coincidentally
> > uses sk_user_data too, later, kTLS treats it as psock s
From: Cong Wang
syzbot is able to set up kTLS on an SMC socket, which coincidentally
uses sk_user_data too. Later, kTLS treats it as a psock, which triggers a
refcnt warning. The cause is that smc_setsockopt() simply calls
TCP setsockopt(). I do not think it makes sense to set up kTLS on
top of SMC, so
On Fri, Apr 9, 2021 at 12:45 PM John Fastabend wrote:
>
> syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 9c54130c Add linux-next specific files for 20210406
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1
On Thu, Apr 8, 2021 at 5:26 PM John Fastabend wrote:
>
> Cong Wang wrote:
> > From: Cong Wang
> >
> > The last refcnt of the psock can be gone right after
> > sock_map_remove_links(), so sk_psock_stop() could trigger a UAF.
> > The reason why I placed sk_pso
On Tue, Apr 6, 2021 at 4:36 PM Song Liu wrote:
> I am not sure whether this makes sense. I feel there is still some
> misunderstanding. It will be helpful if you can share more information
> about the overall design.
>
> BTW: this could be a good topic for the BPF office hour. See more details
> h
of
> the loop in tcf_action_init() which is properly fixed by the following
> patch.
I still hate the init_res[] array, but I have no easy and better way to fix
it either, so:
Acked-by: Cong Wang
For the long term, we probably want to split the action ->init() into
two: ->init() and ->
On Thu, Apr 8, 2021 at 4:59 AM Jamal Hadi Salim wrote:
>
> On 2021-04-07 7:50 p.m., Cong Wang wrote:
> > On Wed, Apr 7, 2021 at 8:36 AM Vlad Buslov wrote:
> >>
> >> Action init code increments reference counter when it changes an action.
> >> This is the d
On Thu, Apr 8, 2021 at 12:50 AM Vlad Buslov wrote:
>
>
> On Thu 08 Apr 2021 at 02:50, Cong Wang wrote:
> > In my last comments, I actually meant whether we can avoid this
> > 'init_res[]' array. Since here you want to check whether an action
> > returned by t
On Wed, Apr 7, 2021 at 8:11 PM Stephen Rothwell wrote:
>
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
> net/core/skmsg.c
>
> between commit:
>
> 144748eb0c44 ("bpf, sockmap: Fix incorrect fwd_alloc accounting")
>
> from the bpf tree and commit:
>
> e3526bb
On Wed, Apr 7, 2021 at 8:02 PM Stephen Rothwell wrote:
>
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
> include/linux/skmsg.h
>
> between commit:
>
> 1c84b33101c8 ("bpf, sockmap: Fix sk->prot unhash op reset")
>
> from the bpf tree and commit:
>
> 8a59f9d1
From: Cong Wang
The last refcnt of the psock can be gone right after
sock_map_remove_links(), so sk_psock_stop() could trigger a UAF.
The reason why I placed sk_psock_stop() there is to avoid RCU read
critical section, and more importantly, some callee of
sock_map_remove_links() is supposed to
On Wed, Apr 7, 2021 at 8:36 AM Vlad Buslov wrote:
>
> Action init code increments reference counter when it changes an action.
> This is the desired behavior for cls API which needs to obtain action
> reference for every classifier that points to action. However, act API just
> needs to change the
On Wed, Apr 7, 2021 at 8:46 AM Vlad Buslov wrote:
>
> Add two new tests for action create/change code.
Acked-by: Cong Wang
Thanks.
On Wed, Apr 7, 2021 at 2:22 PM Jiri Olsa wrote:
>
> hi,
> I'm getting couple of WARNINGs below when running
> test_sockmap on latest bpf-next/master, like:
>
> # while :; do ./test_sockmap ; done
>
> The warning is at:
> WARN_ON(sk->sk_forward_alloc);
>
> so looks like some socket allocation m
From: Cong Wang
Using sk_psock() to retrieve psock pointer from sock requires
RCU read lock, but we already get psock pointer before calling
->psock_update_sk_prot() in both cases, so we can just pass it
without bothering sk_psock().
Reported-and-tested-by: syzbot+320a3bc8d80f47
On Tue, Apr 6, 2021 at 6:01 AM syzbot
wrote:
> ==
> BUG: KASAN: use-after-free in __lock_acquire+0x3e6f/0x54c0
> kernel/locking/lockdep.c:4770
> Read of size 8 at addr 888024f66238 by task syz-executor.1/14202
>
> CPU: 0 PID: 142
On Mon, Apr 5, 2021 at 1:25 AM Eric Dumazet wrote:
>
>
>
> On 3/31/21 4:32 AM, Cong Wang wrote:
> > From: Cong Wang
> >
> > Currently sockmap calls into each protocol to update the struct
> > proto and replace it. This certainly won't work when the pro
On Tue, Apr 6, 2021 at 2:44 AM syzbot
wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 514e1150 net: x25: Queue received packets in the drivers i..
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=112a8831d0
> kernel config:
On Mon, Apr 5, 2021 at 11:18 PM Song Liu wrote:
>
>
>
> > On Apr 5, 2021, at 6:24 PM, Cong Wang wrote:
> >
> > On Mon, Apr 5, 2021 at 6:08 PM Song Liu wrote:
> >>
> >>
> >>
> >>> On Apr 5, 2021, at 4:49 PM, Cong Wang
On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
>
> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
> coming days. If it works, then we can consider proceeding with it,
> otherwise I am all for reverting the whole NOLOCK stuff.
>
> [1]
> https://lore.kernel.org/linux-ca
On Mon, Apr 5, 2021 at 6:08 PM Song Liu wrote:
>
>
>
> > On Apr 5, 2021, at 4:49 PM, Cong Wang wrote:
> >
> > On Fri, Apr 2, 2021 at 4:31 PM Song Liu wrote:
> >>
> >>
> >>
> >>> On Apr 2, 2021, at 1:57 PM, Cong Wang wrote:
>
On Fri, Apr 2, 2021 at 4:45 PM Alexei Starovoitov
wrote:
>
> On Fri, Apr 02, 2021 at 02:24:51PM -0700, Cong Wang wrote:
> > > > where the key is the timer ID and the value is the timer expire
> > > > timer.
> > >
> > > The timer ID is unnece
On Fri, Apr 2, 2021 at 4:31 PM Song Liu wrote:
>
>
>
> > On Apr 2, 2021, at 1:57 PM, Cong Wang wrote:
> >
> > Ideally I even prefer to create timers in kernel-space too, but as I already
> > explained, this seems impossible to me.
>
> Would hrtimer (includ
On Sat, Apr 3, 2021 at 3:01 AM Vlad Buslov wrote:
> So, the following happens in reproduction provided in commit message
> when executing "tc actions add action simple sdata \"1\" index 1
> action simple sdata \"2\" index 2" command:
>
> 1. tcf_action_init() is called with batch of two actions of
From: Cong Wang
These comments in udp_bpf_update_proto() are copied from the
original TCP code and apparently do not apply to UDP. Just
remove them.
Reported-by: Jakub Sitnicki
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/ipv4/udp_bpf.c | 2
On Fri, Apr 2, 2021 at 3:16 AM Jakub Sitnicki wrote:
> > -struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock)
> > +int udp_bpf_update_proto(struct sock *sk, bool restore)
> > {
> > int family = sk->sk_family == AF_INET ? UDP_BPF_IPV4 : UDP_BPF_IPV6;
> > + struct sk_
On Wed, Mar 31, 2021 at 11:01 PM John Fastabend
wrote:
> This 'else if' is always true if the above is false, right? It would be
> simpler and clearer IMO as,
>
>         if (used <= 0) {
>                 if (!copied)
>                         copied = used;
>
On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov wrote:
>
> With recent changes that separated action module load from action
> initialization tcf_action_init() function error handling code was modified
> to manually release the loaded modules if loading/initialization of any
> further action in same b
On Wed, Mar 31, 2021 at 9:41 AM Vlad Buslov wrote:
>
> Action init code increments reference counter when it changes an action.
> This is the desired behavior for cls API which needs to obtain action
> reference for every classifier that points to action. However, act API just
> needs to change th
On Fri, Apr 2, 2021 at 12:28 PM Alexei Starovoitov
wrote:
>
> On Wed, Mar 31, 2021 at 09:26:35PM -0700, Cong Wang wrote:
>
> > This patch introduces a bpf timer map and a syscall to create bpf timer
> > from user-space.
>
> That will severely limit timer api usabilit
On Fri, Apr 2, 2021 at 12:45 PM Song Liu wrote:
>
>
>
> > On Apr 2, 2021, at 12:08 PM, Cong Wang wrote:
> >
> > On Fri, Apr 2, 2021 at 10:57 AM Song Liu wrote:
> >>
> >>
> >>
> >>> On Apr 2, 2021, at 10:34 AM, Cong Wang
On Fri, Apr 2, 2021 at 10:57 AM Song Liu wrote:
>
>
>
> > On Apr 2, 2021, at 10:34 AM, Cong Wang wrote:
> >
> > On Thu, Apr 1, 2021 at 1:17 PM Song Liu wrote:
> >>
> >>
> >>
> >>> On Apr 1, 2021, at 10:28 AM, Cong Wang w
On Thu, Apr 1, 2021 at 1:17 PM Song Liu wrote:
>
>
>
> > On Apr 1, 2021, at 10:28 AM, Cong Wang wrote:
> >
> > On Wed, Mar 31, 2021 at 11:38 PM Song Liu wrote:
> >>
> >>
> >>
> >>> On Mar 31, 2021, at 9:26 PM, Cong Wang wrote
On Wed, Mar 31, 2021 at 11:38 PM Song Liu wrote:
>
>
>
> > On Mar 31, 2021, at 9:26 PM, Cong Wang wrote:
> >
> > From: Cong Wang
> >
> > (This patch is still in early stage and obviously incomplete. I am sending
> > it out to get some high-lev
From: Cong Wang
(This patch is still at an early stage and obviously incomplete. I am sending
it out to get some high-level feedback. Please kindly ignore any coding
details for now and focus on the design.)
This patch introduces a bpf timer map and a syscall to create bpf timer
from user-space
On Mon, Mar 29, 2021 at 3:55 PM Kumar Kartikeya Dwivedi
wrote:
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index b919826939e0..43cceb924976 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -1042,6 +1042,9 @@ struct tc_action *tcf_action_init_1(struct net *net,
>
From: Cong Wang
This adds a test case to ensure BPF_SK_SKB_VERDICT and
BPF_SK_STREAM_VERDICT will never be attached at the same time.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_basic.c | 40
From: Cong Wang
Add a test case to ensure that redirection between two UDP sockets works.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 136 ++
.../selftests/bpf/progs
From: Cong Wang
Now that UDP supports sockmap and redirection, we can safely update
the sock type checks for it accordingly.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/core/sock_map.c | 5 -
1 file changed, 4 insertions(+), 1
From: Cong Wang
Currently sockmap calls into each protocol to update the struct
proto and replace it. This certainly won't work when the protocol
is implemented as a module, for example, AF_UNIX.
Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each
protocol can implement
From: Cong Wang
We have to implement udp_bpf_recvmsg() to replace the ->recvmsg()
to retrieve skmsg from ingress_msg.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/ipv4/udp_bpf.c |
From: Cong Wang
Although these two functions are only used by TCP, they are not
specific to TCP at all: both operate on skmsg and ingress_msg,
so they fit in net/core/skmsg.c very well.
And we will need them for non-TCP, so rename and move them to
skmsg.c and export them to modules.
Cc: John
From: Cong Wang
This is similar to tcp_read_sock(), except we do not need
to worry about connections, we just need to retrieve skb
from UDP receive queue.
Note, the return value of ->read_sock() is unused in
sk_psock_verdict_data_ready(), and UDP still does not
support splice() due to lack
From: Cong Wang
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to attach
From: Cong Wang
Now we can fold sock_map_link_no_progs() into sock_map_link()
and get rid of sock_map_link_no_progs().
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Cc: John Fastabend
Signed-off-by: Cong Wang
---
net/core/sock_map.c | 55
From: Cong Wang
sock_map_link() passes down map progs, but it is confusing
to see both map progs and psock progs. Make the map progs
more obvious by retrieving it directly with sock_map_progs()
inside sock_map_link(). Now it is aligned with
sock_map_link_no_progs() too.
Cc: Daniel Borkmann
Cc
From: Cong Wang
The RCU callback sk_psock_destroy() only queues work psock->gc,
so we can just switch to rcu work to simplify the code.
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Acked-by: John Fastabend
Signed-off-by: Cong Wang
---
include/linux/skmsg.h | 5 +
net/c
From: Cong Wang
This function is only called in process context.
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Acked-by: John Fastabend
Signed-off-by: Cong Wang
---
net/core/skmsg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/skmsg.c b/net/core
From: Cong Wang
We only have skb_send_sock_locked(), which requires callers
to use lock_sock(). Introduce a variant, skb_send_sock(),
which locks on its own, so callers do not need to lock it
any more. This will save us from adding a ->sendmsg_locked
for each protocol.
To reuse the code, p
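The locked/unlocked split described above is a common pattern. Here is a minimal userspace analogy using a pthread mutex; struct chan, chan_send_locked() and chan_send() are hypothetical names for illustration and are not the kernel functions.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an object protected by its own lock. */
struct chan {
	pthread_mutex_t lock;
	size_t bytes_sent;
};

/* Caller must already hold ch->lock (analogous to a *_locked() variant). */
static size_t chan_send_locked(struct chan *ch, size_t len)
{
	ch->bytes_sent += len;
	return len;
}

/* Takes the lock itself, so callers need no locking of their own. */
static size_t chan_send(struct chan *ch, size_t len)
{
	size_t ret;

	pthread_mutex_lock(&ch->lock);
	ret = chan_send_locked(ch, len);
	pthread_mutex_unlock(&ch->lock);
	return ret;
}
```

Keeping the real work in the _locked function lets both entry points share one implementation, which appears to be the kind of reuse the commit message has in mind.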
From: Cong Wang
Currently we rely on lock_sock to protect ingress_msg, which
is too heavy for this. We can just use a spinlock
to protect this list, as is done for other skb queues.
__tcp_bpf_recvmsg() is still special because of peeking;
it still has to use lock_sock.
Cc: Daniel Borkmann
From: Cong Wang
We do not have to lock the sock to avoid losing sk_socket;
instead we can purge all the ingress queues when we close
the socket. Sending or receiving packets after orphaning
the socket makes no sense.
We do purge these queues when psock refcnt reaches zero but
here we want to purge
From: Cong Wang
Currently we purge the ingress_skb queue only when psock
refcnt goes down to 0, so locking the queue is not necessary,
but in order to be called during ->close, we have to lock it
here.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Lorenz Bauer
Acked-by: Jakub Sitnicki
Sig
From: Cong Wang
We have thousands of services connected to a daemon on every host
via AF_UNIX dgram sockets. After they are moved into a VM, we have to
add a proxy to forward these communications from VM to host, because
rewriting thousands of them is not practical. This proxy uses an
AF_UNIX
On Tue, Mar 30, 2021 at 7:36 AM Alexei Starovoitov
wrote:
>
> On Tue, Mar 30, 2021 at 2:43 AM Lorenz Bauer wrote:
> >
> > On Thu, 25 Mar 2021 at 01:52, Martin KaFai Lau wrote:
> > >
> > > This series adds support to allow bpf program calling kernel function.
> >
> > I think there are more build
On Mon, Mar 29, 2021 at 11:23 PM John Fastabend
wrote:
>
> Cong Wang wrote:
> > On Mon, Mar 29, 2021 at 1:54 PM John Fastabend
> > wrote:
> > >
> > > Cong Wang wrote:
> > > > From: Cong Wang
> > > >
> > > > This is
On Mon, Mar 29, 2021 at 4:10 PM John Fastabend wrote:
> I think its a bit odd for TCP_ESTABLISHED to work with !tcp, but
> thats not your invention so LGTM.
It has been there for many years, so why is it suddenly a problem with
my patchset? More importantly, why don't you change it yourself
as
On Mon, Mar 29, 2021 at 1:54 PM John Fastabend wrote:
>
> Cong Wang wrote:
> > From: Cong Wang
> >
> > This is similar to tcp_read_sock(), except we do not need
> > to worry about connections, we just need to retrieve skb
> > from UDP receive queue.
> >
On Mon, Mar 29, 2021 at 1:10 PM John Fastabend wrote:
>
> Cong Wang wrote:
> > From: Cong Wang
> >
> > Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
> > confusing and more importantly we still want to distinguish them
> > from user-space. So
On Sun, Mar 28, 2021 at 6:24 PM Martin KaFai Lau wrote:
> Could you also check the CONFIG_DYNAMIC_FTRACE and also try 'y' if it
> is not set?
On my side, with pahole==1.17, changing CONFIG_DYNAMIC_FTRACE
makes no difference. With pahole==1.20, CONFIG_DYNAMIC_FTRACE=y
makes it go away, but CONFIG_DYN
On Mon, Mar 29, 2021 at 8:03 AM John Fastabend wrote:
>
> Alexei Starovoitov wrote:
> > On Sun, Mar 28, 2021 at 01:20:00PM -0700, Cong Wang wrote:
> > > From: Cong Wang
> > >
> > > We have thousands of services connected to a daemon on every host
> >
From: Cong Wang
Add a test case to ensure that redirection between two UDP sockets works.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 136 ++
.../selftests/bpf/progs
From: Cong Wang
We have to implement udp_bpf_recvmsg() to replace the ->recvmsg()
to retrieve skmsg from ingress_msg.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/ipv4/udp_bpf.c |
From: Cong Wang
Now that UDP supports sockmap and redirection, we can safely update
the sock type checks for it accordingly.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/core/sock_map.c | 5 -
1 file changed, 4 insertions(+), 1
From: Cong Wang
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to set
From: Cong Wang
Although these two functions are only used by TCP, they are not
specific to TCP at all: both operate on skmsg and ingress_msg,
so they fit in net/core/skmsg.c very well.
And we will need them for non-TCP, so rename and move them to
skmsg.c and export them to modules.
Cc: John
From: Cong Wang
Currently sockmap calls into each protocol to update the struct
proto and replace it. This certainly won't work when the protocol
is implemented as a module, for example, AF_UNIX.
Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each
protocol can implement
From: Cong Wang
This is similar to tcp_read_sock(), except we do not need
to worry about connections, we just need to retrieve skb
from UDP receive queue.
Note, the return value of ->read_sock() is unused in
sk_psock_verdict_data_ready().
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Ja
From: Cong Wang
This function is only called in process context.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Signed-off-by: Cong Wang
---
net/core/skmsg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
From: Cong Wang
We do not have to lock the sock to avoid losing sk_socket;
instead we can purge all the ingress queues when we close
the socket. Sending or receiving packets after orphaning
the socket makes no sense.
We do purge these queues when psock refcnt reaches zero but
here we want to purge
From: Cong Wang
The RCU callback sk_psock_destroy() only queues work psock->gc,
so we can just switch to rcu work to simplify the code.
Cc: Daniel Borkmann
Cc: Jakub Sitnicki
Cc: Lorenz Bauer
Acked-by: John Fastabend
Signed-off-by: Cong Wang
---
include/linux/skmsg.h | 5 +
net/c
From: Cong Wang
Currently we rely on lock_sock to protect ingress_msg, which
is too heavy for this. We can just use a spinlock
to protect this list, as is done for other skb queues.
__tcp_bpf_recvmsg() is still special because of peeking;
it still has to use lock_sock.
Cc: John Fastabend
From: Cong Wang
We only have skb_send_sock_locked(), which requires callers
to use lock_sock(). Introduce a variant, skb_send_sock(),
which locks on its own, so callers do not need to lock it
any more. This will save us from adding a ->sendmsg_locked
for each protocol.
To reuse the code, p
From: Cong Wang
We have thousands of services connected to a daemon on every host
via AF_UNIX dgram sockets. After they are moved into a VM, we have to
add a proxy to forward these communications from VM to host, because
rewriting thousands of them is not practical. This proxy uses an
AF_UNIX
From: Cong Wang
Currently we purge the ingress_skb queue only when psock
refcnt goes down to 0, so locking the queue is not necessary,
but in order to be called during ->close, we have to lock it
here.
Cc: John Fastabend
Cc: Daniel Borkmann
Cc: Lorenz Bauer
Acked-by: Jakub Sitnicki
Sig
On Sat, Mar 27, 2021 at 3:54 PM Alexei Starovoitov
wrote:
>
> On Sat, Mar 27, 2021 at 3:08 PM Cong Wang wrote:
> > BTFIDS vmlinux
> > FAILED unresolved symbol cubictcp_state
> > make: *** [Makefile:1199: vmlinux] Error 255
> >
> > I suspect it is related to
On Sat, Mar 27, 2021 at 2:28 PM Alexei Starovoitov
wrote:
>
> On Sat, Mar 27, 2021 at 2:25 PM Cong Wang wrote:
> >
> > Hi,
> >
> > On Wed, Mar 24, 2021 at 8:40 PM Martin KaFai Lau wrote:
> > > Martin KaFai Lau (14):
> > > bpf: Simplify fr
Hi,
On Wed, Mar 24, 2021 at 8:40 PM Martin KaFai Lau wrote:
> Martin KaFai Lau (14):
> bpf: Simplify freeing logic in linfo and jited_linfo
> bpf: Refactor btf_check_func_arg_match
> bpf: Support bpf program calling kernel function
> bpf: Support kernel function call in x86-32
> tcp: Re
On Thu, Mar 25, 2021 at 7:10 PM John Fastabend wrote:
> Hi Cong,
>
> I'm trying to understand if the workqueue logic will somehow prevent the
> following,
>
> CPU0 CPU1
>
> work dequeue
> sk_psock_backlog()
> ... do backlog
> ... also maybe sleep
>
>
On Wed, Mar 24, 2021 at 7:46 PM John Fastabend wrote:
>
> Cong Wang wrote:
> > On Wed, Mar 24, 2021 at 2:00 PM John Fastabend
> > wrote:
> > >
> > > Incorrect accounting fwd_alloc can result in a warning when the socket
> > > is torn down,
> >