On 11 January 2016 at 16:23, Pravin B Shelar <pshe...@nicira.com> wrote:
> STT unregisters nf-hook when there are no other STT devices
> left in the namespace. On some kernel versions the nf-unreg API
> take RTNL lock, but it is already taken in the tunnel device
> destroy code path which results in deadlock. To fix the issue
> I moved the unreg call into net-exit.
>
> Bug: #1582410
> Reported-by: Joe Stringer <j...@ovn.org>
> Signed-off-by: Pravin B Shelar <pshe...@nicira.com>

Running the kernel module testsuite, I'm getting some complaints and
crashes from test 25 (conntrack - IPv6 FTP).

On one Ubuntu host running kernel 3.13, the test failed. When I
attempted to remove the kernel module, I found this in dmesg:

[2796049.160967] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[2796049.162370] IP: [<ffffffff81656ff7>] nf_unregister_hook+0x27/0x80
[2796049.163450] PGD 0
[2796049.163872] Oops: 0002 [#1] SMP
[2796049.164512] Modules linked in: openvswitch(OX-) nf_defrag_ipv6 vxlan gre ip_tunnel nf_defrag_ipv4 nf_conntrack_netlink nfnetlink nf_conntrack 8021q garp stp mrp llc veth btrfs raid6_pq xor ufs msdos xfs libcrc32c netconsole configfs dm_crypt ppdev vmw_balloon vmw_vmci parport_pc parport vmxnet3 vmw_pvscsi floppy [last unloaded: nf_conntrack_ipv4]
[2796049.170608] CPU: 0 PID: 22897 Comm: rmmod Tainted: G OX 3.13.0-68-generic #111-Ubuntu
[2796049.172063] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[2796049.173813] task: ffff880003204800 ti: ffff880005df0000 task.ti: ffff880005df0000
[2796049.175052] RIP: 0010:[<ffffffff81656ff7>] [<ffffffff81656ff7>] nf_unregister_hook+0x27/0x80
[2796049.176495] RSP: 0018:ffff880005df1eb8 EFLAGS: 00010246
[2796049.177391] RAX: 0000000000000000 RBX: ffffffffa031c3c0 RCX: 00000000c0000100
[2796049.178632] RDX: 0000000000000000 RSI: ffff880003204800 RDI: ffffffff81cde560
[2796049.179963] RBP: ffff880005df1ec0 R08: ffff880005df0000 R09: 0000000000000000
[2796049.181258] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa031c520
[2796049.182565] R13: 0000000000000800 R14: 0000000000000000 R15: 00007f5a25e141a0
[2796049.183872] FS: 00007f5a24d7b740(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
[2796049.185321] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[2796049.186349] CR2: 0000000000000008 CR3: 000000003b971000 CR4: 00000000000007f0
[2796049.187707] Stack:
[2796049.188150] 0000000000000000 ffff880005df1ed0 ffffffffa030f852 ffff880005df1ee0
[2796049.189865] ffffffffa0302b5e ffff880005df1ef0 ffffffffa02fa520 ffff880005df1f78
[2796049.191346] ffffffff810e08d2 00000000c87dacc0 ffffffffa031c520 0000000000000800
[2796049.192829] Call Trace:
[2796049.193356] [<ffffffffa030f852>] ovs_stt_cleanup_module+0x22/0x40 [openvswitch]
[2796049.194697] [<ffffffffa0302b5e>] ovs_vport_exit+0xe/0x40 [openvswitch]
[2796049.195900] [<ffffffffa02fa520>] dp_cleanup+0x60/0x90 [openvswitch]
[2796049.197035] [<ffffffff810e08d2>] SyS_delete_module+0x162/0x200
[2796049.198098] [<ffffffff81013ed7>] ? do_notify_resume+0x97/0xb0
[2796049.199145] [<ffffffff81734cdd>] system_call_fastpath+0x1a/0x1f
[2796049.200213] Code: 1f 44 00 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb 48 c7 c7 60 e5 cd 81 e8 07 39 0d 00 48 8b 43 08 48 8b 13 48 c7 c7 60 e5 cd 81 <48> 89 42 08 48 89 10 48 b8 00 02 20 00 00 00 ad de 48 89 43 08
[2796049.205628] RIP [<ffffffff81656ff7>] nf_unregister_hook+0x27/0x80
[2796049.206755] RSP <ffff880005df1eb8>
[2796049.207421] CR2: 0000000000000008
[2796049.208366] ---[ end trace 62f941e5b9590b3e ]---

The module wasn't removed; subsequent attempts to remove it ended with
errors like this:

# rmmod openvswitch
rmmod: ERROR: ../libkmod/libkmod-module.c:769 kmod_module_remove_module() could not remove 'openvswitch': Device or resource busy
rmmod: ERROR: could not remove module openvswitch: Device or resource busy

On a RHEL7 host, I ran just the failing test 25. The test passed, but
when I attempted to remove the kernel module the host crashed:

[276998.955187] BUG: unable to handle kernel NULL pointer dereference at (null)
[276998.955243] IP: [<ffffffff8130c379>] __list_del_entry+0x29/0xd0
[276998.955291] PGD 6c3b4067 PUD 6c0b9067 PMD 0
[276998.955320] Oops: 0000 [#1] SMP
[276998.955341] Modules linked in: openvswitch(OE-) vxlan ip6_udp_tunnel udp_tunnel 8021q garp stp mrp llc nf_conntrack_netlink nfnetlink veth gre nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack netconsole xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi vmwgfx drm_kms_helper ttm crct10dif_pclmul crct10dif_common crc32c_intel serio_raw drm ata_piix vmxnet3 libata vmw_pvscsi i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: vport_geneve]
[276998.955646] CPU: 6 PID: 4713 Comm: modprobe Tainted: G OE ------------ 3.10.0-327.el7.x86_64 #1
[276998.955680] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
[276998.955716] task: ffff8800bb5b0b80 ti: ffff88006f138000 task.ti: ffff88006f138000
[276998.955744] RIP: 0010:[<ffffffff8130c379>] [<ffffffff8130c379>] __list_del_entry+0x29/0xd0
[276998.955779] RSP: 0018:ffff88006f13bea8 EFLAGS: 00010207
[276998.955799] RAX: 0000000000000000 RBX: ffffffffa03bc6c0 RCX: dead000000200200
[276998.955825] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa03bc6c0
[276998.955850] RBP: ffff88006f13bea8 R08: ffff88006f13be58 R09: 0000000000000000
[276998.955875] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffa03bc8c0
[276998.955900] R13: 0000000000000800 R14: 0000000001141c98 R15: 0000000000000000
[276998.955950] FS: 00007f1eff0e0740(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[276998.955978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[276998.956000] CR2: 0000000000000000 CR3: 00000000b6733000 CR4: 00000000000407e0
[276998.956050] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[276998.956092] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[276998.956119] Stack:
[276998.956131] ffff88006f13bec0 ffffffff8155c411 fffffffffffffff5 ffff88006f13bed0
[276998.956176] ffffffffa03afa42 ffff88006f13bee0 ffffffffa03a2ffe ffff88006f13bef0
[276998.956217] ffffffffa039a800 ffff88006f13bf78 ffffffff810eb23b 0000000000000000
[276998.956258] Call Trace:
[276998.956285] [<ffffffff8155c411>] nf_unregister_hook+0x21/0x70
[276998.956314] [<ffffffffa03afa42>] ovs_stt_cleanup_module+0x22/0x40 [openvswitch]
[276998.956344] [<ffffffffa03a2ffe>] ovs_vport_exit+0xe/0x40 [openvswitch]
[276998.956371] [<ffffffffa039a800>] dp_cleanup+0x60/0x90 [openvswitch]
[276998.956405] [<ffffffff810eb23b>] SyS_delete_module+0x16b/0x2d0
[276998.956439] [<ffffffff81014b12>] ? do_notify_resume+0x92/0xb0
[276998.956473] [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
[276998.956496] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
[276998.956822] RIP [<ffffffff8130c379>] __list_del_entry+0x29/0xd0
[276998.956846] RSP <ffff88006f13bea8>
[276998.956862] CR2: 0000000000000000