On 06/05/2018 01:08 AM, John Fastabend wrote: > On 06/04/2018 12:59 PM, Daniel Borkmann wrote: >> On 06/04/2018 05:21 PM, John Fastabend wrote: >>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead >>> of tcpv6_prot. >>> >>> Previously we overwrote the sk->prot field with tcp_prot even in the >>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot >>> are used. Further, only allow ESTABLISHED connections to join the >>> map per note in TLS ULP, >>> >>> /* The TLS ulp is currently supported only for TCP sockets >>> * in ESTABLISHED state. >>> * Supporting sockets in LISTEN state will require us >>> * to modify the accept implementation to clone rather then >>> * share the ulp context. >>> */ >>> >>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as >>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously >>> crashing case here. >>> >>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") >>> Reported-by: syzbot+5c063698bdbfac19f...@syzkaller.appspotmail.com >>> Signed-off-by: John Fastabend <john.fastab...@gmail.com> >>> Signed-off-by: Wei Wang <wei...@google.com> >> >> Applied to bpf-next, thanks everyone! > > Thanks Daniel, this has the unfortunate side-effect though of > making it hard to add sockets transitioning from LISTEN into > ESTABLISHED states to a sockmap. Before this patch we could add > sockets from the sock_ops event BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB > to a sock{map|hash}. However, this is before a socket is in established > state so risked crashing and wasn't valid at all per this thread. So > I believe its correct to block this action, seeing it will crash a > system in many (most!) cases. > > That said we still would like to support pushing sockets into a > sock{map|hash} in this case. I thought about adding a new hook but > we already have a few sock op hooks in the TCP stack so its too bad we > don't have one that fires after the ESTABLISHED state has transitioned. > Right now I'm looking into seeing if the following would have any > issues, > > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -2206,9 +2206,6 @@ void tcp_set_state(struct sock *sk, int state) > BUILD_BUG_ON((int)BPF_TCP_NEW_SYN_RECV != (int)TCP_NEW_SYN_RECV); > BUILD_BUG_ON((int)BPF_TCP_MAX_STATES != (int)TCP_MAX_STATES); > > - if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) > - tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); > - > switch (state) { > case TCP_ESTABLISHED: > if (oldstate != TCP_ESTABLISHED) > @@ -2234,6 +2231,9 @@ void tcp_set_state(struct sock *sk, int state) > */ > inet_sk_state_store(sk, state); > > + if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) > + tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); > + > #ifdef STATE_TRACE > SOCK_DEBUG(sk, "TCP sk=%p, State %s -> %s\n", sk, > statename[oldstate], statename[state]); > #endif > > This would change the call hook slightly, moving it to after the state > change. However unless the unhash is some how visible from the bpf program > I don't think it should impact existing BPF programs.
Hmm, the current fix also breaks compilation when IPv6 is compiled out, so I had to take it out for now. :-( I think this needs similar workaround as in kTLS case in tls_init(). Given this and your above seen side-effect, lets respin all with a clean fix. tree: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master head: 4e1f687e302835e45e2f296392f21cfeb5671303 commit: 4e1f687e302835e45e2f296392f21cfeb5671303 [3/3] bpf: sockmap, fix crash when ipv6 sock is added config: i386-randconfig-a0-06041847 (attached as .config) compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4 reproduce: git checkout 4e1f687e302835e45e2f296392f21cfeb5671303 # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): kernel/bpf/sockmap.o: In function `bpf_tcp_ulp_register': >> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot' >> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot' vim +1127 kernel/bpf/sockmap.c 1122 1123 static int bpf_tcp_ulp_register(void) 1124 { 1125 tcp_bpf_proto = tcp_prot; 1126 tcp_bpf_proto.close = bpf_tcp_close; > 1127 tcpv6_bpf_proto = tcpv6_prot; 1128 tcpv6_bpf_proto.close = bpf_tcp_close; 1129 /* Once BPF TX ULP is registered it is never unregistered. It 1130 * will be in the ULP list for the lifetime of the system. Doing 1131 * duplicate registers is not a problem. 1132 */ 1133 return tcp_register_ulp(&bpf_tcp_ulp_ops); 1134 } 1135 Thanks, Daniel