On 11 January 2016 at 16:23, Pravin B Shelar <pshe...@nicira.com> wrote:
> STT unregisters its nf-hook when there are no other STT devices
> left in the namespace. On some kernel versions the nf-unreg API
> takes the RTNL lock, but that lock is already held in the tunnel
> device destroy code path, which results in a deadlock. To fix the
> issue, I moved the unreg call into net-exit.
>
> Bug: #1582410
> Reported-by: Joe Stringer <j...@ovn.org>
> Signed-off-by: Pravin B Shelar <pshe...@nicira.com>

Running the kernel module testsuite, I'm getting some complaints and
crashes from test 25 (conntrack - IPv6 FTP). On one Ubuntu host running
kernel 3.13, the test failed. I then attempted to remove the kernel
module and found this in dmesg:

[2796049.160967] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000008
[2796049.162370] IP: [<ffffffff81656ff7>] nf_unregister_hook+0x27/0x80
[2796049.163450] PGD 0
[2796049.163872] Oops: 0002 [#1] SMP
[2796049.164512] Modules linked in: openvswitch(OX-) nf_defrag_ipv6
vxlan gre ip_tunnel nf_defrag_ipv4 nf_conntrack_netlink nfnetlink
nf_conntrack 8021q garp stp mrp llc veth btrfs raid6_pq xor ufs msdos
xfs libcrc32c netconsole configfs dm_crypt ppdev vmw_balloon vmw_vmci
parport_pc parport vmxnet3 vmw_pvscsi floppy [last unloaded:
nf_conntrack_ipv4]
[2796049.170608] CPU: 0 PID: 22897 Comm: rmmod Tainted: G           OX
3.13.0-68-generic #111-Ubuntu
[2796049.172063] Hardware name: VMware, Inc. VMware Virtual
Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[2796049.173813] task: ffff880003204800 ti: ffff880005df0000 task.ti:
ffff880005df0000
[2796049.175052] RIP: 0010:[<ffffffff81656ff7>]  [<ffffffff81656ff7>]
nf_unregister_hook+0x27/0x80
[2796049.176495] RSP: 0018:ffff880005df1eb8  EFLAGS: 00010246
[2796049.177391] RAX: 0000000000000000 RBX: ffffffffa031c3c0 RCX:
00000000c0000100
[2796049.178632] RDX: 0000000000000000 RSI: ffff880003204800 RDI:
ffffffff81cde560
[2796049.179963] RBP: ffff880005df1ec0 R08: ffff880005df0000 R09:
0000000000000000
[2796049.181258] R10: 0000000000000000 R11: 0000000000000000 R12:
ffffffffa031c520
[2796049.182565] R13: 0000000000000800 R14: 0000000000000000 R15:
00007f5a25e141a0
[2796049.183872] FS:  00007f5a24d7b740(0000) GS:ffff88003f600000(0000)
knlGS:0000000000000000
[2796049.185321] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[2796049.186349] CR2: 0000000000000008 CR3: 000000003b971000 CR4:
00000000000007f0
[2796049.187707] Stack:
[2796049.188150]  0000000000000000 ffff880005df1ed0 ffffffffa030f852
ffff880005df1ee0
[2796049.189865]  ffffffffa0302b5e ffff880005df1ef0 ffffffffa02fa520
ffff880005df1f78
[2796049.191346]  ffffffff810e08d2 00000000c87dacc0 ffffffffa031c520
0000000000000800
[2796049.192829] Call Trace:
[2796049.193356]  [<ffffffffa030f852>]
ovs_stt_cleanup_module+0x22/0x40 [openvswitch]
[2796049.194697]  [<ffffffffa0302b5e>] ovs_vport_exit+0xe/0x40 [openvswitch]
[2796049.195900]  [<ffffffffa02fa520>] dp_cleanup+0x60/0x90 [openvswitch]
[2796049.197035]  [<ffffffff810e08d2>] SyS_delete_module+0x162/0x200
[2796049.198098]  [<ffffffff81013ed7>] ? do_notify_resume+0x97/0xb0
[2796049.199145]  [<ffffffff81734cdd>] system_call_fastpath+0x1a/0x1f
[2796049.200213] Code: 1f 44 00 00 66 66 66 66 90 55 48 89 e5 53 48 89
fb 48 c7 c7 60 e5 cd 81 e8 07 39 0d 00 48 8b 43 08 48 8b 13 48 c7 c7
60 e5 cd 81 <48> 89 42 08 48 89 10 48 b8 00 02 20 00 00 00 ad de 48 89
43 08
[2796049.205628] RIP  [<ffffffff81656ff7>] nf_unregister_hook+0x27/0x80
[2796049.206755]  RSP <ffff880005df1eb8>
[2796049.207421] CR2: 0000000000000008
[2796049.208366] ---[ end trace 62f941e5b9590b3e ]---

The module wasn't removed; subsequent attempts to remove it ended with
errors like this:

# rmmod openvswitch
rmmod: ERROR: ../libkmod/libkmod-module.c:769
kmod_module_remove_module() could not remove 'openvswitch': Device or
resource busy
rmmod: ERROR: could not remove module openvswitch: Device or resource busy

On a RHEL7 host, I ran just test 25 on its own. The test passed, but
when I attempted to remove the kernel module, the host crashed:

[276998.955187] BUG: unable to handle kernel NULL pointer dereference
at           (null)
[276998.955243] IP: [<ffffffff8130c379>] __list_del_entry+0x29/0xd0
[276998.955291] PGD 6c3b4067 PUD 6c0b9067 PMD 0
[276998.955320] Oops: 0000 [#1] SMP
[276998.955341] Modules linked in: openvswitch(OE-) vxlan
ip6_udp_tunnel udp_tunnel 8021q garp stp mrp llc nf_conntrack_netlink
nfnetlink veth gre nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack
netconsole xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif ata_generic
pata_acpi vmwgfx drm_kms_helper ttm crct10dif_pclmul crct10dif_common
crc32c_intel serio_raw drm ata_piix vmxnet3 libata vmw_pvscsi i2c_core
floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
vport_geneve]
[276998.955646] CPU: 6 PID: 4713 Comm: modprobe Tainted: G
OE  ------------   3.10.0-327.el7.x86_64 #1
[276998.955680] Hardware name: VMware, Inc. VMware Virtual
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
[276998.955716] task: ffff8800bb5b0b80 ti: ffff88006f138000 task.ti:
ffff88006f138000
[276998.955744] RIP: 0010:[<ffffffff8130c379>]  [<ffffffff8130c379>]
__list_del_entry+0x29/0xd0
[276998.955779] RSP: 0018:ffff88006f13bea8  EFLAGS: 00010207
[276998.955799] RAX: 0000000000000000 RBX: ffffffffa03bc6c0 RCX:
dead000000200200
[276998.955825] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffffffffa03bc6c0
[276998.955850] RBP: ffff88006f13bea8 R08: ffff88006f13be58 R09:
0000000000000000
[276998.955875] R10: 0000000000000000 R11: 0000000000000001 R12:
ffffffffa03bc8c0
[276998.955900] R13: 0000000000000800 R14: 0000000001141c98 R15:
0000000000000000
[276998.955950] FS:  00007f1eff0e0740(0000) GS:ffff88013fd80000(0000)
knlGS:0000000000000000
[276998.955978] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[276998.956000] CR2: 0000000000000000 CR3: 00000000b6733000 CR4:
00000000000407e0
[276998.956050] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[276998.956092] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[276998.956119] Stack:
[276998.956131]  ffff88006f13bec0 ffffffff8155c411 fffffffffffffff5
ffff88006f13bed0
[276998.956176]  ffffffffa03afa42 ffff88006f13bee0 ffffffffa03a2ffe
ffff88006f13bef0
[276998.956217]  ffffffffa039a800 ffff88006f13bf78 ffffffff810eb23b
0000000000000000
[276998.956258] Call Trace:
[276998.956285]  [<ffffffff8155c411>] nf_unregister_hook+0x21/0x70
[276998.956314]  [<ffffffffa03afa42>] ovs_stt_cleanup_module+0x22/0x40
[openvswitch]
[276998.956344]  [<ffffffffa03a2ffe>] ovs_vport_exit+0xe/0x40 [openvswitch]
[276998.956371]  [<ffffffffa039a800>] dp_cleanup+0x60/0x90 [openvswitch]
[276998.956405]  [<ffffffff810eb23b>] SyS_delete_module+0x16b/0x2d0
[276998.956439]  [<ffffffff81014b12>] ? do_notify_resume+0x92/0xb0
[276998.956473]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
[276998.956496] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
42 08
[276998.956822] RIP  [<ffffffff8130c379>] __list_del_entry+0x29/0xd0
[276998.956846]  RSP <ffff88006f13bea8>
[276998.956862] CR2: 0000000000000000
