Hi,

We use Debian on a number of machines in our storage infrastructure, and we have recently been seeing a number of "hangs". We primarily notice these when nfsd processes lock up and the hung task detector then goes wild. We finally managed to get a trace last night; it is pasted below.
We did not see this crash under the 2.6.39 backport; however, that kernel spontaneously rebooted at ~200 days uptime (about 3/4 of our infrastructure rebooted within a few weeks, which was not a good time for our ops teams). I would be grateful if anyone who can help me narrow this down would jump in with requests for further information or with advice.

[11309697.466397] ------------[ cut here ]------------
[11309697.466556] WARNING: at /build/buildd-linux_3.2.23-1~bpo60+2-amd64-oLufer/linux-3.2.23/fs/jbd2/journal.c:507 __jbd2_log_start_commit+0x7e/0x8c [jbd2]()
[11309697.466660] Hardware name: X8DT6
[11309697.466728] JBD2: bad log_start_commit: 2205591757 2205591757 14613566 0
[11309697.466798] Modules linked in: netconsole autofs4 8021q garp bridge stp nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding tcp_htcp ext4 jbd2 crc16 configfs loop ohci_hcd tpm_tis tpm i7core_edac i2c_i801 snd_pcm snd_timer snd soundcore edac_core i2c_core ioatdma tpm_bios snd_page_alloc coretemp crc32c_intel psmouse pcspkr joydev evdev acpi_cpufreq mperf processor serio_raw button thermal_sys ext3 jbd mbcache usbhid hid sd_mod ses enclosure crc_t10dif uhci_hcd ahci libahci libata igb ehci_hcd e1000e usbcore dca megaraid_sas usb_common scsi_mod [last unloaded: netconsole]
[11309697.470190] Pid: 62, comm: kswapd0 Not tainted 3.2.0-0.bpo.3-amd64 #1
[11309697.470261] Call Trace:
[11309697.470329] [<ffffffff810498a8>] ? warn_slowpath_common+0x78/0x8c
[11309697.470399] [<ffffffff8104995a>] ? warn_slowpath_fmt+0x45/0x4a
[11309697.470471] [<ffffffffa01cabad>] ? __jbd2_log_start_commit+0x7e/0x8c [jbd2]
[11309697.470558] [<ffffffffa01cac83>] ? jbd2_log_start_commit+0x21/0x2f [jbd2]
[11309697.470634] [<ffffffffa02dee7a>] ? ext4_evict_inode+0x86/0x2d1 [ext4]
[11309697.470707] [<ffffffff81119626>] ? evict+0x9a/0x14e
[11309697.470775] [<ffffffff811198b4>] ? dispose_list+0x35/0x3f
[11309697.470844] [<ffffffff81119b87>] ? prune_icache_sb+0x2c9/0x2d8
[11309697.470915] [<ffffffff811081b0>] ? prune_super+0xd6/0x147
[11309697.470987] [<ffffffff810cb9e2>] ? shrink_slab+0x1a3/0x266
[11309697.471056] [<ffffffff810cd937>] ? balance_pgdat+0x335/0x625
[11309697.471126] [<ffffffff810cdf31>] ? kswapd+0x30a/0x325
[11309697.471196] [<ffffffff81063815>] ? wake_up_bit+0x20/0x20
[11309697.471265] [<ffffffff810cdc27>] ? balance_pgdat+0x625/0x625
[11309697.471334] [<ffffffff810cdc27>] ? balance_pgdat+0x625/0x625
[11309697.471403] [<ffffffff810633d9>] ? kthread+0x7a/0x82
[11309697.471472] [<ffffffff8136d3f4>] ? kernel_thread_helper+0x4/0x10
[11309697.471543] [<ffffffff8106335f>] ? kthread_worker_fn+0x147/0x147
[11309697.471613] [<ffffffff8136d3f0>] ? gs_change+0x13/0x13
[11309697.471680] ---[ end trace 56d2be5ea52d0917 ]---

--
George Barnett
gbarn...@atlassian.com