On Thu, 09 May 2019 21:57:49 -0700, John Fastabend wrote:
> It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE
> state via tcp_disconnect() without calling into close callback. This
> would allow a kTLS enabled socket to exist outside of ESTABLISHED
> state which is not supported.
>
> Solve this the same way we solved the sock{map|hash} case by adding
> an unhash hook to remove tear down the TLS state.
>
> In the process we also make the close hook more robust. We add a put
> call into the close path, also in the unhash path, to remove the
> reference to ulp data after free. Its no longer valid and may confuse
> things later if the socket (re)enters kTLS code paths. Second we add
> an 'if(ctx)' check to ensure the ctx is still valid and not released
> from a previous unhash/close path.
>
> Fixes: d91c3e17f75f2 ("net/tls: Only attach to sockets in ESTABLISHED state")
> Reported-by: Eric Dumazet <eduma...@google.com>
> Signed-off-by: John Fastabend <john.fastab...@gmail.com>

Looks like David Beckett managed to trigger another nasty on the
release path :/

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.2.0-rc1-00139-g14629453a6d3 #21
RIP: 0010:tcp_peek_len+0x10/0x60
RSP: 0018:ffffc02e41c54b98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9cf924c4e030 RCX: 0000000000000051
RDX: 0000000000000000 RSI: 000000000000000c RDI: ffff9cf97128f480
RBP: ffff9cf9365e0300 R08: ffff9cf94fe7d2c0 R09: 0000000000000000
R10: 000000000000036b R11: ffff9cf939735e00 R12: ffff9cf91ad9ae40
R13: ffff9cf924c4e000 R14: ffff9cf9a8fcbaae R15: 0000000000000020
FS: 0000000000000000(0000) GS:ffff9cf9af7c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 000000013920a003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 strp_data_ready+0x48/0x90
 tls_data_ready+0x22/0xd0 [tls]
 tcp_rcv_established+0x569/0x620
 tcp_v4_do_rcv+0x127/0x1e0
 tcp_v4_rcv+0xad7/0xbf0
 ip_protocol_deliver_rcu+0x2c/0x1c0
 ip_local_deliver_finish+0x41/0x50
 ip_local_deliver+0x6b/0xe0
 ? ip_protocol_deliver_rcu+0x1c0/0x1c0
 ip_rcv+0x52/0xd0
 ? ip_rcv_finish_core.isra.20+0x380/0x380
 __netif_receive_skb_one_core+0x7e/0x90
 netif_receive_skb_internal+0x42/0xf0
 napi_gro_receive+0xed/0x150
 nfp_net_poll+0x7a2/0xd30 [nfp]
 ? kmem_cache_free_bulk+0x286/0x310
 net_rx_action+0x149/0x3b0
 __do_softirq+0xe3/0x30a
 ? handle_irq_event_percpu+0x6a/0x80
 irq_exit+0xe8/0xf0
 do_IRQ+0x85/0xd0
 common_interrupt+0xf/0xf
 </IRQ>
RIP: 0010:cpuidle_enter_state+0xbc/0x450

If I read this right strparser calls sock->ops->peek_len(sock), but
the sock->sk is already NULL.
I'm guessing this is because inet_release() does:

	sock->sk = NULL;
	sk->sk_prot->close(sk, timeout);

And I don't really see a way for ktls to know that sock->sk is about to
be cleared, and therefore no way to stop strparser. Or for strparser
to always do the check itself, given tcp_peek_len() will do another
dereference of sock->sk :S

That's mostly a guess, it takes me half an hour of ktls connections
running to repro.

Any advice would be appreciated.. Can we move the sock->sk assignment
after close?..

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 5183a2daba64..aff93e7cdb31 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -428,8 +428,8 @@ int inet_release(struct socket *sock)
 		if (sock_flag(sk, SOCK_LINGER) &&
 		    !(current->flags & PF_EXITING))
 			timeout = sk->sk_lingertime;
-		sock->sk = NULL;
 		sk->sk_prot->close(sk, timeout);
+		sock->sk = NULL;
 	}
 	return 0;
 }

I don't see IPv6 clearing this pointer, perhaps we don't have to?
We tested it and it seems to work, but this is pre-git code, so it's
hard to tell what the reason to clear was :)
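
To make the suspected ordering problem concrete, here is a small userspace sketch. It is not kernel code: every mock_* name and run_scenario() are made up for illustration, and the "late packet" delivered inside mock_close() only stands in for a softirq calling data_ready before the ULP has stopped the parser. It assumes the guess above is right, i.e. that the window sits between clearing sock->sk and the close callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins only; these are NOT the kernel structures. */
struct mock_sock;

struct mock_socket {
	struct mock_sock *sk;		/* like sock->sk */
};

struct mock_sock {
	struct mock_socket *socket;	/* back-pointer, like sk->sk_socket */
	bool parser_running;		/* stands in for an active strparser */
	int null_sk_seen;		/* times the callback found socket->sk == NULL */
};

/* Stand-in for data_ready -> peek_len, which needs socket->sk to be valid. */
static void mock_data_ready(struct mock_sock *sk)
{
	if (!sk->parser_running)
		return;
	if (sk->socket->sk == NULL)
		sk->null_sk_seen++;	/* the real kernel would dereference NULL here */
}

/* Stand-in for the protocol close: one packet arrives before the parser stops. */
static void mock_close(struct mock_sock *sk)
{
	mock_data_ready(sk);		/* hypothetical late packet */
	sk->parser_running = false;
}

/* Mirrors the two orderings discussed: clear sock->sk before or after close. */
static int mock_release(struct mock_socket *sock, bool clear_before_close)
{
	struct mock_sock *sk = sock->sk;

	assert(sk != NULL);
	if (clear_before_close)
		sock->sk = NULL;
	mock_close(sk);
	sock->sk = NULL;
	return sk->null_sk_seen;
}

/* Runs one release and reports how often the callback saw a NULL sk. */
static int run_scenario(bool clear_before_close)
{
	struct mock_socket sock;
	struct mock_sock sk = {
		.socket = &sock,
		.parser_running = true,
		.null_sk_seen = 0,
	};

	sock.sk = &sk;
	return mock_release(&sock, clear_before_close);
}
```

Calling run_scenario(true) models the current ordering, where the callback finds sock->sk already cleared; run_scenario(false) models the proposed ordering, where the pointer stays valid until close has stopped the parser. The mock says nothing about why the pointer was cleared first in the original code.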