vmci_transport_recv_dgram_cb() is a callback, and it accesses the socket struct without holding the socket lock. The sk may therefore already have been released by the time we use it, which can cause a NULL pointer dereference later in the receive path. Here is the call trace:

[ 389.486319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 389.494148] PGD 0 P4D 0
[ 389.496687] Oops: 0000 [#1] SMP PTI
[ 389.500170] Modules linked in: vhost_net vmw_vsock_vmci_transport tun vsock vhost vmw_vmci tap iptable_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_s
[ 389.510984] Failed to add new resource (handle=0x2:0x2711), error: -22
[ 389.543309] Failed to add new resource (handle=0x2:0x2711), error: -22
[ 389.570936] ttm drm crc32c_intel mptsas scsi_transport_sas serio_raw ata_piix mptscsih libata i2c_core mptbase bnx2 dm_mirror dm_region_hash dm_log dm_mod
[ 389.597899] CPU: 3 PID: 113 Comm: kworker/3:2 Tainted: G I 4.17.0-rc6.latest+ #25
[ 389.606673] Hardware name: Dell Inc. PowerEdge R710/0XDX06, BIOS 6.1.0 10/18/2011
[ 389.614158] Workqueue: events dg_delayed_dispatch [vmw_vmci]
[ 389.619820] RIP: 0010:selinux_socket_sock_rcv_skb+0x46/0x270
[ 389.625475] RSP: 0018:ffffbcb5416b7ce0 EFLAGS: 00010293
[ 389.630698] RAX: 0000000000000000 RBX: 0000000000000028 RCX: 0000000000000007
[ 389.637825] RDX: 0000000000000000 RSI: ffff94a29feec500 RDI: ffffbcb5416b7d18
[ 389.644953] RBP: ffff94a29bd9a640 R08: 0000000000000001 R09: ffff94a187c03080
[ 389.652080] R10: ffffbcb5416b7d80 R11: 0000000000000000 R12: ffffbcb5416b7d18
[ 389.659206] R13: ffff94a29feec500 R14: ffff94a2afda5e00 R15: 0ffff94a2afda5e0
[ 389.666336] FS: 0000000000000000(0000) GS:ffff94a2afd80000(0000) knlGS:0000000000000000
[ 389.674419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 389.680160] CR2: 0000000000000010 CR3: 000000004320a003 CR4: 00000000000206e0
[ 389.687283] Call Trace:
[ 389.689738] ? __alloc_skb+0xa0/0x230
[ 389.693407] security_sock_rcv_skb+0x32/0x60
[ 389.697679] ? __alloc_skb+0xa0/0x230
[ 389.701343] sk_filter_trim_cap+0x4e/0x1f0
[ 389.705442] __sk_receive_skb+0x32/0x290
[ 389.709372] vmci_transport_recv_dgram_cb+0xa7/0xd0 [vmw_vsock_vmci_transport]
[ 389.716593] dg_delayed_dispatch+0x22/0x50 [vmw_vmci]
[ 389.721648] process_one_work+0x1f2/0x4a0
[ 389.725662] worker_thread+0x38/0x4c0
[ 389.729329] ? process_one_work+0x4a0/0x4a0
[ 389.733512] kthread+0x12f/0x150
[ 389.736743] ? kthread_create_worker_on_cpu+0x90/0x90
[ 389.741796] ret_from_fork+0x35/0x40
[ 389.745370] Code: 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 e8 42 15 db ff 0f b7 5d 10 48 8b 85 70 02 00 00 4c 8d 64 24 38 b9 07 00 00 00 4c 89 e7 <44> 8b 70 10 31 c0 41 89 df 41 83 e7 f7
[ 389.764342] RIP: selinux_socket_sock_rcv_skb+0x46/0x270 RSP: ffffbcb5416b7ce0
[ 389.771467] CR2: 0000000000000010
[ 389.774784] ---[ end trace e83d65291a15ae6a ]---

Fix it by checking sk state before using it.

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Hangbin Liu <liuhang...@gmail.com>
---
 net/vmw_vsock/vmci_transport.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a7a73ff..0d26040 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -612,6 +612,13 @@ static int vmci_transport_recv_dgram_cb(void *data, struct vmci_datagram *dg)
 	if (!vmci_transport_allow_dgram(vsk, dg->src.context))
 		return VMCI_ERROR_NO_ACCESS;
 
+	bh_lock_sock(sk);
+	if (sk->sk_state == TCP_CLOSE) {
+		bh_unlock_sock(sk);
+		return VMCI_ERROR_DATAGRAM_FAILED;
+	}
+	bh_unlock_sock(sk);
+
 	size = VMCI_DG_SIZE(dg);
 
 	/* Attach the packet to the socket's receive queue as an sk_buff. */
-- 
1.8.3.1