On Sat, 07 May 2016 10:57:18 +0000, Mark Fletcher <mark2...@gmail.com> wrote:
[...]

> So if you eliminate processor microcode, or other firmware issues,
> then I'd look next at your graphics hardware and your desktop
> environment. Contrary to your instincts, mine are that this most
> likely _is_ a hardware issue of some kind -- not necessarily that
> there is anything wrong with the hardware per se, but maybe not fully
> supported or not configured correctly.

Thank you both for these excellent pointers. As I feared, it will be a
long and tedious debugging project. As you suggest, I definitely won't
dismiss hardware support and configuration issues, since it is a fairly
new processor (i7-6700 @ 3.4 GHz). The graphics card (GT-720), on
nouveau or the proprietary Nvidia drivers, may also be suspect, as I've
always had issues with Nvidia cards in my older systems (I had no choice
with this new system).

Here's a more encompassing excerpt of the lockup in my first message, in
case something catches your attention:

---<--------------------cut here---------------start------------------->---
May 5 22:48:19 otaria kernel: [ 6706.415657] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
May 5 22:48:19 otaria kernel: [ 6706.415659] Modules linked in: md4(E) nls_utf8(E) cifs(E) dns_resolver(E) fscache(E) rfcomm(E) xt_multiport(E) iptable_filter(E) ip_tables(E) x_tables(E) pci_stub(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) bnep(E) dm_mod(E) snd_hda_codec_hdmi(E) arc4(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) hmac(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) iwlmvm(E) glue_helper(E) ablk_helper(E) cryptd(E) mac80211(E) nouveau(E) i915(E) pcspkr(E) serio_raw(E) joydev(E) evdev(E) usblp(E) btusb(E) btrtl(E) mxm_wmi(E) snd_hda_intel(E) ttm(E) snd_hda_codec(E) drm_kms_helper(E) snd_hda_core(E) iwlwifi(E) snd_hwdep(E) drm(E) snd_pcm(E) snd_timer(E) snd(E) cfg80211(E) i2c_algo_bit(E) soundcore(E) i2c_i801(E) mei_me(E) shpchp(E) sg(E) mei(E) wmi(E) hci_uart(E) btbcm(E) btqca(E) btintel(E) 8250_fintek(E) bluetooth(E) battery(E) intel_lpss_acpi(E) rfkill(E) intel_lpss(E) mfd_core(E) video(E) acpi_als(E) kfifo_buf(E) tpm_tis(E) industrialio(E) tpm(E) acpi_pad(E) button(E) processor(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) uas(E) usb_storage(E) sr_mod(E) cdrom(E) sd_mod(E) hid_generic(E) usbhid(E) crc32c_intel(E) psmouse(E) ahci(E) e1000e(E) libahci(E) ptp(E) xhci_pci(E) pps_core(E) xhci_hcd(E) libata(E) scsi_mod(E) usbcore(E) usb_common(E) fan(E) thermal(E) i2c_hid(E) hid(E) fjes(E)
May 5 22:48:19 otaria kernel: [ 6706.415698] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G OE 4.5.0-1-amd64 #1 Debian 4.5.1-1
May 5 22:48:19 otaria kernel: [ 6706.415698] Hardware name: LENOVO 10FWCTO1WW/SKYBAY, BIOS FWKT38A 01/28/2016
May 5 22:48:19 otaria kernel: [ 6706.415699] task: ffff8808417700c0 ti: ffff880841120000 task.ti: ffff880841120000
May 5 22:48:19 otaria kernel: [ 6706.415700] RIP: 0010:[<ffffffff81482718>] [<ffffffff81482718>] cpuidle_enter_state+0x118/0x2c0
May 5 22:48:19 otaria kernel: [ 6706.415703] RSP: 0018:ffff880841123eb8 EFLAGS: 00000246
May 5 22:48:19 otaria kernel: [ 6706.415704] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000018
May 5 22:48:19 otaria kernel: [ 6706.415705] RDX: 001c288ad3fa948e RSI: 00000000004b1da7 RDI: 0000000000000000
May 5 22:48:19 otaria kernel: [ 6706.415705] RBP: 00000616b0534df8 R08: 0000000000000018 R09: ffff880865452ab4
May 5 22:48:19 otaria kernel: [ 6706.415706] R10: 000000000000209a R11: 00000000000008a5 R12: ffffe8ffffc492d0
May 5 22:48:19 otaria kernel: [ 6706.415707] R13: ffffffff81ab5898 R14: 00000616b0375ce0 R15: ffffffff81ab5640
May 5 22:48:19 otaria kernel: [ 6706.415707] FS: 0000000000000000(0000) GS:ffff880865440000(0000) knlGS:0000000000000000
May 5 22:48:19 otaria kernel: [ 6706.415708] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 5 22:48:19 otaria kernel: [ 6706.415709] CR2: 00007f9f20800000 CR3: 0000000001a0b000 CR4: 00000000003406e0
May 5 22:48:19 otaria kernel: [ 6706.415709] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 5 22:48:19 otaria kernel: [ 6706.415710] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 5 22:48:19 otaria kernel: [ 6706.415710] Stack:
May 5 22:48:19 otaria kernel: [ 6706.415711] 0000000005ba63ce ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May 5 22:48:19 otaria kernel: [ 6706.415712] ffff880841120000 ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May 5 22:48:19 otaria kernel: [ 6706.415713] 2f95298c05ba63ce 6db3faa90427c13c 0000000000000000 0000000000000000
May 5 22:48:19 otaria kernel: [ 6706.415714] Call Trace:
May 5 22:48:19 otaria kernel: [ 6706.415717] [<ffffffff810b8c67>] ? cpu_startup_entry+0x287/0x340
May 5 22:48:19 otaria kernel: [ 6706.415719] [<ffffffff8104d3fa>] ? start_secondary+0x15a/0x190
May 5 22:48:19 otaria kernel: [ 6706.415720] Code: 55 48 89 c3 e8 1a 3a c6 ff 48 89 c5 0f 1f 44 00 00 31 ff e8 eb 61 c3 ff 8b 44 24 04 85 c0 0f 85 29 01 00 00 fb 66 0f 1f 44 00 00 <4c> 29 f5 48 ba cf f7 53 e3 a5 9b c4 20 48 89 e8 48 c1 fd 3f 48
May 5 22:48:19 otaria kernel: [ 6715.124385] INFO: rcu_sched detected stalls on CPUs/tasks:
May 5 22:48:19 otaria kernel: [ 6715.124389] 1-...: (2 GPs behind) idle=631/1/0 softirq=384388/384388 fqs=5007
May 5 22:48:19 otaria kernel: [ 6715.124390] (detected by 5, t=5252 jiffies, g=356388, c=356387, q=59432)
May 5 22:48:19 otaria kernel: [ 6715.124392] Task dump for CPU 1:
May 5 22:48:19 otaria kernel: [ 6715.124393] swapper/1 R running task 0 0 1 0x00000008
May 5 22:48:19 otaria kernel: [ 6715.124395] 0000000000000246 ffff880841123eb8 0000000000000018 ffffffff81482705
May 5 22:48:19 otaria kernel: [ 6715.124397] 0000000005ba63ce ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May 5 22:48:19 otaria kernel: [ 6715.124398] ffff880841120000 ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May 5 22:48:19 otaria kernel: [ 6715.124399] Call Trace:
May 5 22:48:19 otaria kernel: [ 6715.124418] [<ffffffff81482705>] ? cpuidle_enter_state+0x105/0x2c0
May 5 22:48:19 otaria kernel: [ 6715.124420] [<ffffffff810b8c67>] ? cpu_startup_entry+0x287/0x340
May 5 22:48:19 otaria kernel: [ 6715.124422] [<ffffffff8104d3fa>] ? start_secondary+0x15a/0x190
May 5 22:49:22 otaria kernel: [ 6778.145577] INFO: rcu_sched detected stalls on CPUs/tasks:
May 5 22:49:22 otaria kernel: [ 6778.145581] 1-...: (2 GPs behind) idle=631/1/0 softirq=384388/384388 fqs=19961
May 5 22:49:22 otaria kernel: [ 6778.145582] (detected by 6, t=21007 jiffies, g=356388, c=356387, q=272524)
May 5 22:49:22 otaria kernel: [ 6778.145584] Task dump for CPU 1:
May 5 22:49:22 otaria kernel: [ 6778.145585] swapper/1 R running task 0 0 1 0x00000008
May 5 22:49:22 otaria kernel: [ 6778.145587] 0000000000000246 ffff880841123eb8 0000000000000018 ffffffff81482705
May 5 22:49:22 otaria kernel: [ 6778.145601] 0000000005ba63ce ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May 5 22:49:22 otaria kernel: [ 6778.145602] ffff880841120000 ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May 5 22:49:22 otaria kernel: [ 6778.145604] Call Trace:
May 5 22:49:22 otaria kernel: [ 6778.145607] [<ffffffff81482705>] ? cpuidle_enter_state+0x105/0x2c0
May 5 22:49:22 otaria kernel: [ 6778.145622] [<ffffffff810b8c67>] ? cpu_startup_entry+0x287/0x340
May 5 22:49:22 otaria kernel: [ 6778.145624] [<ffffffff8104d3fa>] ? start_secondary+0x15a/0x190
---<--------------------cut here---------------end--------------------->---

Yes, all hard lockups look the same, and they all seem to be on CPU 1.
I'm not sure about the soft lockups.

-- 
Seb
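
P.S. To check whether every hard lockup really lands on CPU 1, and to
see where the soft lockups fall, the kernel log could be tallied per CPU
with something like the sketch below. The function name tally_lockups
is made up for illustration. The hard-lockup pattern follows the
"Watchdog detected hard LOCKUP on cpu N" wording in the excerpt; the
soft-lockup pattern assumes the usual "BUG: soft lockup - CPU#N stuck
for ..." wording, which does not appear in this excerpt.

```python
import re
from collections import Counter

# Hard-lockup wording as it appears in the excerpt.
HARD = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")
# Soft-lockup wording is an assumption (not present in the excerpt).
SOFT = re.compile(r"soft lockup - CPU#(\d+)")


def tally_lockups(lines):
    """Return (hard, soft) Counters mapping CPU number to report count."""
    hard, soft = Counter(), Counter()
    for line in lines:
        match = HARD.search(line)
        if match:
            hard[int(match.group(1))] += 1
            continue
        match = SOFT.search(line)
        if match:
            soft[int(match.group(1))] += 1
    return hard, soft
```

Any iterable of log lines works as input, for example an open file
object for wherever the kernel messages are stored on the system.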