Hello,

I've got a server which has now crashed a few times in a similar fashion. I even tried moving to new hardware, with the same result (though on the new hardware it seems to be happening more frequently), so this looks like some interaction between Nagios and the kernel causing a soft lockup. Any ideas on how to resolve this would be appreciated.

Unfortunately this is the only log I have of the event. The first event didn't produce any output like this, and I haven't got a record of the logs from the previous hardware, as I thought they may have been isolated incidents.
The previous hardware was running Lenny rather than Squeeze, so this doesn't seem to be isolated to one version of anything in particular. Let me know if there is any more information which would be of use.

There are quite a few bits of software running on here: RTG, Cricket, Nagios, smokeping and rancid.

Debian 6.0.3
Linux zzz-zzz 2.6.32-5-686-bigmem #1 SMP Wed Jan 11 13:17:56 UTC 2012 i686 GNU/Linux

Jan 22 22:40:40 zzz-zzz kernel: [176617.648985] BUG: soft lockup - CPU#13 stuck for 61s! [nagios3:2070]
Jan 22 22:40:40 zzz-zzz kernel: [176617.649040] Modules linked in: netconsole configfs joydev usbhid hid xt_multiport iptable_filter ip_tables x_tables 8021q garp stp loop snd_pcm snd_timer snd soundcore snd_page_alloc ioatdma pcspkr evdev cdc_ether usbnet button processor serio_raw dca mii shpchp pci_hotplug i2c_i801 i2c_core ext4 mbcache jbd2 crc16 raid10 md_mod sd_mod crc_t10dif ata_generic uhci_hcd megaraid_sas ata_piix ehci_hcd libata usbcore scsi_mod nls_base thermal bnx2 thermal_sys [last unloaded: netconsole]
Jan 22 22:40:40 zzz-zzz kernel: [176617.649078]
Jan 22 22:40:40 zzz-zzz kernel: [176617.649082] Pid: 2070, comm: nagios3 Not tainted (2.6.32-5-686-bigmem #1) System x3550 M3 -[7944D2M]-
Jan 22 22:40:40 zzz-zzz kernel: [176617.649085] EIP: 0060:[<c10249bb>] EFLAGS: 00000202 CPU: 13
Jan 22 22:40:40 zzz-zzz kernel: [176617.649094] EIP is at native_flush_tlb_others+0x85/0xa6
Jan 22 22:40:40 zzz-zzz kernel: [176617.649096] EAX: 00000282 EBX: c14661ac ECX: c10200d8 EDX: 00000020
Jan 22 22:40:40 zzz-zzz kernel: [176617.649099] ESI: 00000005 EDI: 00000140 EBP: c14661a0 ESP: ee4c9a3c
Jan 22 22:40:40 zzz-zzz kernel: [176617.649101] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jan 22 22:40:40 zzz-zzz kernel: [176617.649104] CR0: 8005003b CR2: b758a376 CR3: 2eb7e000 CR4: 000006f0
Jan 22 22:40:40 zzz-zzz kernel: [176617.649106] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Jan 22 22:40:40 zzz-zzz kernel: [176617.649108] DR6: ffff0ff0 DR7: 00000400
Jan 22 22:40:40 zzz-zzz kernel: [176617.649110] Call Trace:
Jan 22 22:40:40 zzz-zzz kernel: [176617.649116] [<c1024aa3>] ? flush_tlb_page+0x5d/0x65
Jan 22 22:40:40 zzz-zzz kernel: [176617.649120] [<c1023e90>] ? ptep_set_access_flags+0x59/0x63
Jan 22 22:40:40 zzz-zzz kernel: [176617.649125] [<c10a1040>] ? do_wp_page+0x3b9/0x7dd
Jan 22 22:40:40 zzz-zzz kernel: [176617.649131] [<c1031770>] ? finish_task_switch+0x76/0x95
Jan 22 22:40:40 zzz-zzz kernel: [176617.649135] [<c10b61a0>] ? kmem_cache_free+0x78/0xaf
Jan 22 22:40:40 zzz-zzz kernel: [176617.649138] [<c1031770>] ? finish_task_switch+0x76/0x95
Jan 22 22:40:40 zzz-zzz kernel: [1766
Jan 23 07:13:24 zzz-zzz syslog-ng[1807]: syslog-ng starting up; version='3.1.3'

Cheers,
Blair