Hi Joseph,

Is the following patch for this issue?

```
commit 3bb8b653c86f6b1d2cc05aa1744fed4b18f99485
Author: Joseph Qi <joseph...@huawei.com>
Date:   Mon Sep 19 14:44:33 2016 -0700

    ocfs2: fix double unlock in case retry after free truncate log

    If ocfs2_reserve_cluster_bitmap_bits() fails with ENOSPC, it will try to
    free truncate log and then retry. Since ocfs2_try_to_free_truncate_log
    will lock/unlock global bitmap inode, we have to unlock it before
    calling this function. But when retry reserve and it fails with no
    /* reserve -> deserve, i think */ global bitmap inode lock taken, it
    will unlock again in error handling branch and BUG.

    This issue also exists if no need retry and then ocfs2_inode_lock fails.
    So fix it.

    Fixes: 2070ad1aebff ("ocfs2: retry on ENOSPC if sufficient space in truncate log")
    Link: http://lkml.kernel.org/r/57d91939.6030...@huawei.com
    Signed-off-by: Joseph Qi <joseph...@huawei.com>
    Signed-off-by: Jiufei Xue <xuejiu...@huawei.com>
    Cc: Mark Fasheh <mfas...@suse.de>
    Cc: Joel Becker <jl...@evilplan.org>
    Cc: Junxiao Bi <junxiao...@oracle.com>
    Signed-off-by: Andrew Morton <a...@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
```

If so, Gerhard, try to backport this fix.

Eric

On 10/26/2016 05:29 AM, Gerhard Mack wrote:
> Hello,
>
> I had a server reboot on me and I'm at a loss as to what caused this
> crash. Please keep in mind this server is mission critical and my
> options for testing are rather limited.
>
> Anyone have any ideas?
> Gerhard
>
>
> Oct 25 15:38:38 172.28.23.18 kernel: [ 180.900950] o2net: Connected to
> node monmailcl01 (num 1) at 10.45.0.11:7777
> Oct 25 15:38:39 172.28.23.18 kernel: [ 181.455469] o2dlm: Node 1 joins
> domain 85372A5B9E7C4C2C95F1E9922D5A83AF ( 1 2 ) 2 nodes
> Oct 25 15:38:40 172.28.23.18 kernel: [ 182.972901] o2dlm: Node 1 joins
> domain 490180441A5248339D36ECD96514427C ( 1 2 ) 2 nodes
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.410379] ------------[ cut
> here ]------------
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.410452] kernel BUG at
> fs/ocfs2/dlmglue.c:780!
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.410515] invalid opcode: 0000
> [#1] SMP
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.410576] Modules linked in:
> xt_multiport iptable_filter ocfs2 quota_tree xt_tcpudp iptable_mangle
> xt_mark
> ip_tables x_tables ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
> ocfs2_nodemanager ocfs2_stackglue ib_iser rdma_cm iw_cm ib_cm ib_core
> configfs iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi bonding ext4 crc16 jbd2
> mbcache coretemp kvm_intel kvm snd_pcm irqbypass snd_timer snd soundcore
> pcspkr
> iTCO_wdt iTCO_vendor_support dcdbas evdev shpchp serio_raw i2c_i801
> i2c_core acpi_cpufreq lpc_ich mfd_core tpm_tis tpm i5100_edac button
> edac_core
> processor loop autofs4 xfs crc32c_generic libcrc32c raid1 md_mod sg
> sd_mod hid_generic usbhid hid ahci libahci libata e1000e scsi_mod
> uhci_hcd ehci_pci
> ehci_hcd usbcore ptp psmouse pps_core usb_common r8169 mii
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] CPU: 3 PID: 3563
> Comm: imap Not tainted 4.7.6 #8
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] Hardware name:
> Dell CS24-SC /CS24-SC , BIOS S45_3A20
> 01/21/2009
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] task:
> ffff8800bb35cd00 ti: ffff8800bb2d8000 task.ti: ffff8800bb2d8000
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RIP:
> 0010:[<ffffffffa0535365>] [<ffffffffa0535365>]
> __ocfs2_cluster_unlock.isra.34+0x4a/0x92 [ocfs2]
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RSP:
> 0018:ffff8800bb2dbbe0 EFLAGS: 00010046
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RAX:
> 0000000000000246 RBX: ffff8800bbbd7a18 RCX: 000000000005a25c
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RDX:
> 0000000000000000 RSI: ffff8800bbbd7a18 RDI: ffff8800bbbd7a84
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RBP:
> ffff8800bbbd7a84 R08: ffff8800bb2d8000 R09: 0000000000000001
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] R10:
> ffff8800bb2dbbd8 R11: 000000000000000b R12: ffff88041782b000
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] R13:
> 0000000000000246 R14: 0000000000000003 R15: 0000000000000003
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] FS:
> 00007fe9a96c2700(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] CS: 0010 DS: 0000
> ES: 0000 CR0: 0000000080050033
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] CR2:
> 000056169b47e000 CR3: 00000000bb112000 CR4: 00000000000406e0
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] Stack:
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] ffff88042d757c00
> 0000000000000000 ffff88042a0e1b40 ffff8800ba8194d8
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] 0000000000000000
> ffffffffa0528ce0 ffff88042a0e1b78 ffff8800ba8194d8
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] 0000000000000000
> ffff88042a0e1b40 ffff88042a0e1b40 ffff8800ba8194d8
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] Call Trace:
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffffa0528ce0>]
> ? ocfs2_dentry_attach_lock+0x2c2/0x3f2 [ocfs2]
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffffa0548a8d>]
> ? ocfs2_lookup+0x17c/0x268 [ocfs2]
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff81140925>]
> ? lookup_slow+0xcf/0x104
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff811422fa>]
> ? walk_component+0x69/0x12b
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff81142890>]
> ? path_lookupat+0x7d/0xfe
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff81143f8c>]
> ? filename_lookup+0x78/0xf5
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff8112a9f9>]
> ? kmem_cache_alloc+0x99/0x124
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff8113c544>]
> ? vfs_fstatat+0x46/0x83
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff8113c544>]
> ? vfs_fstatat+0x46/0x83
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff8113c5ca>]
> ? SYSC_newstat+0x10/0x27
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] [<ffffffff813f831b>]
> ? entry_SYSCALL_64_fastpath+0x13/0x8f
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] Code: db 75 02 0f 0b
> 41 83 fe 03 49 89 c5 74 16 41 83 fe 05
> 75 20 8b 53 5c 85 d2 75 02 0f 0b ff ca 89 53 5c eb 12 8b 53 58 85 d2 75
> 02 <0f> 0b ff ca 89 53 58 eb 02 0f 0b f6 43 30 04 74 24 8a 43 62 3c
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RIP
> [<ffffffffa0535365>] __ocfs2_cluster_unlock.isra.34+0x4a/0x92 [ocfs2]
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] RSP <ffff8800bb2dbbe0>
> Oct 25 15:40:04 172.28.23.18 kernel: [ 266.414339] ---[ end trace
> 4eaf20faca7a8f81 ]---
>
>
> The server hard rebooted after this..
>
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users
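
For anyone following the thread, the double-unlock pattern described in the commit message quoted above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel code and not the actual patch: `demo_lock`, `demo_lock_take`, `demo_lock_release` and `demo_reserve_with_retry` are hypothetical names, and the "reserve" step is simulated so that it reports ENOSPC once and then succeeds. The point is that the error path only releases the lock when a local flag says it is actually held.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a cluster lock; holders counts active holds. */
struct demo_lock {
    int holders;
};

/* Take the lock, or fail with -EIO when asked to simulate a lock error. */
static int demo_lock_take(struct demo_lock *l, int simulate_failure)
{
    if (simulate_failure)
        return -EIO;
    l->holders++;
    return 0;
}

/* Release the lock. Releasing with no hold is treated as fatal in this sketch. */
static void demo_lock_release(struct demo_lock *l)
{
    assert(l->holders > 0);
    l->holders--;
}

/*
 * Simulated "reserve, and retry once on ENOSPC" sequence.
 * On success the lock stays held and the caller must release it.
 * On failure the lock is released only if this function still holds it.
 */
static int demo_reserve_with_retry(struct demo_lock *l, int lock_fails_on_retry)
{
    int locked = 0;   /* does this call currently hold the lock? */
    int retried = 0;
    int status;

retry:
    status = demo_lock_take(l, retried && lock_fails_on_retry);
    if (status < 0)
        goto bail;    /* the lock was NOT taken on this path */
    locked = 1;

    /* Simulated reserve: no space on the first attempt, success on the retry. */
    status = retried ? 0 : -ENOSPC;
    if (status == -ENOSPC) {
        /* Drop the lock before the (omitted) space-freeing step, which would take it itself. */
        demo_lock_release(l);
        locked = 0;
        retried = 1;
        goto retry;
    }
    return 0;

bail:
    /* An unconditional release here would be the double unlock. */
    if (locked)
        demo_lock_release(l);
    return status;
}
```

If the `if (locked)` guard in the error path is removed, the retry case with a failing lock attempt trips the assertion in `demo_lock_release`, which is the same shape of failure the commit message describes.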