On Tue, Feb 25, 2014 at 12:56 PM, Sean Swehla <sean.swe...@gmail.com> wrote: > Hello, > > I'm currently hitting a null pointer dereference and kernel panic that seems > to be in ovs. The problem is sporadic. I have one production machine that's > hit it four times in the past 24hrs, and one lab machine that I can't get to > hit it at all. > > We rebuilt openvswitch with debugging symbols turned on, and traced the null > pointer dereference to datapath/linux/flow.c:814 . Do you have any advice on > how to trace this back to a root cause (or, ideally, a fix) ? I've scoured > Google for related issues but come up short. (I'll happily accept that my > google-fu is lacking, though.) > > I would greatly appreciate any guidance you could offer. Here's some more > information about my system, for context.
This looks suspiciously similar to the problem addressed by this commit. It doesn't apply to OVS 1.10, so your kernel might have the issue mentioned here. commit a98ea90e1e86fc7f2bbdc6afd4ad9dc78a06e889 Author: Pravin B Shelar <pshe...@nicira.com> Date: Fri Jul 26 13:52:24 2013 -0700 datapath: list: Fix double fetch of pointer in hlist_entry_safe() Following patch backports commit f65846a1800ef8c48d (list: Fix double fetch of pointer in hlist_entry_safe()) from upstream kernel. This patch fixes following panic. Thanks to Jesse for helping to debug this issue. BUG: unable to handle kernel NULL pointer dereference at 0000000000000118 [129608.216422] IP: [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [ [129608.216918] PGD 11500a067 PUD 120851067 PMD 0 [129608.216994] Oops: 0000 [#1] SMP [129608.217049] CPU 0 [129608.218697] [129608.218726] Pid: 0, comm: swapper/0 Tainted: G O 3.2.39-server-nn21 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform [129608.219288] RIP: 0010:[<ffffffffa02436da>] [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 [openvswitch] [129608.219454] RSP: 0018:ffff88013fc03b60 EFLAGS: 00010282 [129608.219536] RAX: 0000000000000020 RBX: ffff880123087100 RCX: ffff88012098e000 [129608.219719] RDX: ffff8800b3b0ca30 RSI: 000000000000010a RDI: ffff88011df8c000 [129608.220121] RBP: ffff88013fc03c30 R08: 0000000000000001 R09: 0000000020069825 [129608.220287] R10: 0000000000000020 R11: 0000000000000001 R12: ffff880036e1c6c0 [129608.220451] R13: ffff88013fc03b98 R14: 0000000000000024 R15: ffffffffffffffe0 [129608.220618] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000 [129608.220794] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [129608.220911] CR2: 0000000000000118 CR3: 00000001190c9000 CR4: 00000000000406f0 [129608.221122] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [129608.221320] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [129608.221488] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0d020) [129608.221669] Stack: [129608.221725] 0000000000000044 0000000000000020 ffff88013fc03c60 0000000000000000 [129608.221906] 0000000000000000 0000000000000000 0000000000000000 f014232200000002 [129608.222069] 1973f0142322015a 0000000600080000 1973f0140579f414 000000002f1dc7ec [129608.222211] Call Trace: [129608.222264] <IRQ> [129608.222316] [<ffffffffa02445bd>] ovs_flow_lookup+0x5d/0x70 [openvswitch] [129608.222411] [<ffffffffa0242550>] ovs_dp_process_received_packet+0x70/0x [129608.222541] [<ffffffff8104d6ec>] ? resched_task+0x2c/0x80 [129608.222644] [<ffffffffa0249b20>] ? netdev_create+0x120/0x120 [openvswitch] [129608.222743] [<ffffffffa02483f8>] ovs_vport_receive+0x38/0x40 [openvswitch] [129608.222838] [<ffffffffa0249bc3>] netdev_frame_hook+0xa3/0xf0 [openvswitch] [129608.222933] [<ffffffffa0249b20>] ? netdev_create+0x120/0x120 [openvswitch] [129608.223029] [<ffffffff81539318>] __netif_receive_skb+0x1c8/0x620 [129608.223114] [<ffffffff81325740>] ? map_single+0x60/0x60 [129608.223192] [<ffffffff81539b71>] process_backlog+0xb1/0x190 [129608.223274] [<ffffffff8153ae64>] net_rx_action+0x134/0x290 [129608.223355] [<ffffffff8106e1a8>] __do_softirq+0xa8/0x210 [129608.223433] [<ffffffff816458de>] ? _raw_spin_lock+0xe/0x20 [129608.223513] [<ffffffff8164fcac>] call_softirq+0x1c/0x30 [129608.223590] [<ffffffff81016215>] do_softirq+0x65/0xa0 [129608.223665] [<ffffffff8106e58e>] irq_exit+0x8e/0xb0 [129608.223738] [<ffffffff81650573>] do_IRQ+0x63/0xe0 [129608.223808] [<ffffffff81645d6e>] common_interrupt+0x6e/0x6e [129608.223887] <EOI> [129608.223933] [<ffffffff8103ced6>] ? native_safe_halt+0x6/0x10 [129608.224014] [<ffffffff8101c6a3>] default_idle+0x53/0x1d0 [129608.224092] [<ffffffff81013236>] cpu_idle+0xd6/0x120 [129608.224167] [<ffffffff8161785e>] rest_init+0x72/0x74 [129608.224252] [<ffffffff81cfcbaa>] start_kernel+0x3b5/0x3c2 [129608.224331] [<ffffffff81cfc347>] x86_64_start_reservations+0x132/0x136 [129608.224421] [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140 [129608.224506] [<ffffffff81cfc44d>] x86_64_start_kernel+0x102/0x111 [129608.224589] Code: 25 48 63 53 28 48 8d 42 01 48 c1 e0 04 49 01 c7 49 8b 07 48 85 c0 74 61 4d 8b 3f 48 c1 e2 04 48 83 c2 10 49 29 d7 4d 85 ff 74 26 <4d> 39 a7 38 01 00 00 75 cd 48 8b 95 38 ff ff ff 4c 89 ee 49 8d [129608.224949] RIP [<ffffffffa02436da>] ovs_masked_flow_lookup+0xda/0x140 Original commit msg: list: Fix double fetch of pointer in hlist_entry_safe() The current version of hlist_entry_safe() fetches the pointer twice, once to test for NULL and the other to compute the offset back to the enclosing structure. This is OK for normal lock-based use because in that case, the pointer cannot change. However, when the pointer is protected by RCU (as in "rcu_dereference(p)"), then the pointer can change at any time. This use case can result in the following sequence of events: 1. CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected pointer as sees that it is non-NULL. 2. CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0 just fetched a pointer to. Because this is the last entry in the list, the pointer fetched by CPU 0 is now NULL. 3. CPU 0 refetches the pointer, obtains NULL, and then gets a NULL-pointer crash. This commit therefore applies gcc's "({ })" statement expression to create a temporary variable so that the specified pointer is fetched only once, avoiding the above sequence of events. Please note that it is the caller's responsibility to use rcu_dereference() as needed. This allows RCU-protected uses to work correctly without imposing any additional overhead on the non-RCU case. Many thanks to Eric Dumazet for spotting root cause! Reported-by: CAI Qian <caiq...@redhat.com> Reported-by: Eric Dumazet <eric.duma...@gmail.com> Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com> Tested-by: Li Zefan <lize...@huawei.com> Bug #17099 Signed-off-by: Pravin B Shelar <pshe...@nicira.com> Acked-by: Jesse Gross <je...@nicira.com> _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev