On Sat, 07 May 2016 10:57:18 +0000,
Mark Fletcher <mark2...@gmail.com> wrote:
[...]

> So if you eliminate processor microcode, or other firmware issues,
> then I'd look next at your graphics hardware and your desktop
> environment. Contrary to your instincts, mine are that this most
> likely _is_ a hardware issue of some kind -- not necessarily that
> there is anything wrong with the hardware per se, but maybe not fully
> supported or not configured correctly.

Thank you both for these excellent pointers.  As I feared, it will be a
long and tedious debugging project.  As you suggest, I won't dismiss
hardware support or configuration issues, since it is a fairly new
processor (i7-6700 @ 3.4 GHz).  The graphics card (GT-720), with either
the nouveau or the proprietary Nvidia driver, may also be suspect, as
I've always had issues with Nvidia cards in my older systems (I had no
choice of card with this new system).

Here's a fuller excerpt of the lockup from my first message, in case
something catches your attention:

---<--------------------cut here---------------start------------------->---
May  5 22:48:19 otaria kernel: [ 6706.415657] NMI watchdog: Watchdog detected 
hard LOCKUP on cpu 1
May  5 22:48:19 otaria kernel: [ 6706.415659] Modules linked in: md4(E) 
nls_utf8(E) cifs(E) dns_resolver(E) fscache(E) rfcomm(E) xt_multiport(E) 
iptable_filter(E) ip_tables(E) x_tables(E) pci_stub(E) vboxpci(OE) 
vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) bnep(E) dm_mod(E) 
snd_hda_codec_hdmi(E) arc4(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) 
intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) 
kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) 
ghash_clmulni_intel(E) hmac(E) drbg(E) ansi_cprng(E) aesni_intel(E) 
aes_x86_64(E) lrw(E) gf128mul(E) iwlmvm(E) glue_helper(E) ablk_helper(E) 
cryptd(E) mac80211(E) nouveau(E) i915(E) pcspkr(E) serio_raw(E) joydev(E) 
evdev(E) usblp(E) btusb(E) btrtl(E) mxm_wmi(E) snd_hda_intel(E) ttm(E) 
snd_hda_codec(E) drm_kms_helper(E) snd_hda_core(E) iwlwifi(E) snd_hwdep(E) 
drm(E) snd_pcm(E) snd_timer(E) snd(E) cfg80211(E) i2c_algo_bit(E) soundcore(E) 
i2c_i801(E) mei_me(E) shpchp(E) sg(E) mei(E) wmi(E) hci_uart(E) btbcm(E) 
btqca(E) btintel(E) 8250_fintek(E) bluetooth(E) battery(E) intel_lpss_acpi(E) 
rfkill(E) intel_lpss(E) mfd_core(E) video(E) acpi_als(E) kfifo_buf(E) 
tpm_tis(E) industrialio(E) tpm(E) acpi_pad(E) button(E) processor(E) 
parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) ext4(E) crc16(E) mbcache(E) 
jbd2(E) crc32c_generic(E) uas(E) usb_storage(E) sr_mod(E) cdrom(E) sd_mod(E) 
hid_generic(E) usbhid(E) crc32c_intel(E) psmouse(E) ahci(E) e1000e(E) 
libahci(E) ptp(E) xhci_pci(E) pps_core(E) xhci_hcd(E) libata(E) scsi_mod(E) 
usbcore(E) usb_common(E) fan(E) thermal(E) i2c_hid(E) hid(E) fjes(E)
May  5 22:48:19 otaria kernel: [ 6706.415698] CPU: 1 PID: 0 Comm: swapper/1 
Tainted: G           OE   4.5.0-1-amd64 #1 Debian 4.5.1-1
May  5 22:48:19 otaria kernel: [ 6706.415698] Hardware name: LENOVO 
10FWCTO1WW/SKYBAY, BIOS FWKT38A   01/28/2016
May  5 22:48:19 otaria kernel: [ 6706.415699] task: ffff8808417700c0 ti: 
ffff880841120000 task.ti: ffff880841120000
May  5 22:48:19 otaria kernel: [ 6706.415700] RIP: 0010:[<ffffffff81482718>]  
[<ffffffff81482718>] cpuidle_enter_state+0x118/0x2c0
May  5 22:48:19 otaria kernel: [ 6706.415703] RSP: 0018:ffff880841123eb8  
EFLAGS: 00000246
May  5 22:48:19 otaria kernel: [ 6706.415704] RAX: 0000000000000000 RBX: 
0000000000000006 RCX: 0000000000000018
May  5 22:48:19 otaria kernel: [ 6706.415705] RDX: 001c288ad3fa948e RSI: 
00000000004b1da7 RDI: 0000000000000000
May  5 22:48:19 otaria kernel: [ 6706.415705] RBP: 00000616b0534df8 R08: 
0000000000000018 R09: ffff880865452ab4
May  5 22:48:19 otaria kernel: [ 6706.415706] R10: 000000000000209a R11: 
00000000000008a5 R12: ffffe8ffffc492d0
May  5 22:48:19 otaria kernel: [ 6706.415707] R13: ffffffff81ab5898 R14: 
00000616b0375ce0 R15: ffffffff81ab5640
May  5 22:48:19 otaria kernel: [ 6706.415707] FS:  0000000000000000(0000) 
GS:ffff880865440000(0000) knlGS:0000000000000000
May  5 22:48:19 otaria kernel: [ 6706.415708] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
May  5 22:48:19 otaria kernel: [ 6706.415709] CR2: 00007f9f20800000 CR3: 
0000000001a0b000 CR4: 00000000003406e0
May  5 22:48:19 otaria kernel: [ 6706.415709] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000
May  5 22:48:19 otaria kernel: [ 6706.415710] DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400
May  5 22:48:19 otaria kernel: [ 6706.415710] Stack:
May  5 22:48:19 otaria kernel: [ 6706.415711]  0000000005ba63ce 
ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May  5 22:48:19 otaria kernel: [ 6706.415712]  ffff880841120000 
ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May  5 22:48:19 otaria kernel: [ 6706.415713]  2f95298c05ba63ce 
6db3faa90427c13c 0000000000000000 0000000000000000
May  5 22:48:19 otaria kernel: [ 6706.415714] Call Trace:
May  5 22:48:19 otaria kernel: [ 6706.415717]  [<ffffffff810b8c67>] ? 
cpu_startup_entry+0x287/0x340
May  5 22:48:19 otaria kernel: [ 6706.415719]  [<ffffffff8104d3fa>] ? 
start_secondary+0x15a/0x190
May  5 22:48:19 otaria kernel: [ 6706.415720] Code: 55 48 89 c3 e8 1a 3a c6 ff 
48 89 c5 0f 1f 44 00 00 31 ff e8 eb 61 c3 ff 8b 44 24 04 85 c0 0f 85 29 01 00 
00 fb 66 0f 1f 44 00 00 <4c> 29 f5 48 ba cf f7 53 e3 a5 9b c4 20 48 89 e8 48 c1 
fd 3f 48 
May  5 22:48:19 otaria kernel: [ 6715.124385] INFO: rcu_sched detected stalls 
on CPUs/tasks:
May  5 22:48:19 otaria kernel: [ 6715.124389]   1-...: (2 GPs behind) 
idle=631/1/0 softirq=384388/384388 fqs=5007 
May  5 22:48:19 otaria kernel: [ 6715.124390]   (detected by 5, t=5252 jiffies, 
g=356388, c=356387, q=59432)
May  5 22:48:19 otaria kernel: [ 6715.124392] Task dump for CPU 1:
May  5 22:48:19 otaria kernel: [ 6715.124393] swapper/1       R  running task   
     0     0      1 0x00000008
May  5 22:48:19 otaria kernel: [ 6715.124395]  0000000000000246 
ffff880841123eb8 0000000000000018 ffffffff81482705
May  5 22:48:19 otaria kernel: [ 6715.124397]  0000000005ba63ce 
ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May  5 22:48:19 otaria kernel: [ 6715.124398]  ffff880841120000 
ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May  5 22:48:19 otaria kernel: [ 6715.124399] Call Trace:
May  5 22:48:19 otaria kernel: [ 6715.124418]  [<ffffffff81482705>] ? 
cpuidle_enter_state+0x105/0x2c0
May  5 22:48:19 otaria kernel: [ 6715.124420]  [<ffffffff810b8c67>] ? 
cpu_startup_entry+0x287/0x340
May  5 22:48:19 otaria kernel: [ 6715.124422]  [<ffffffff8104d3fa>] ? 
start_secondary+0x15a/0x190
May  5 22:49:22 otaria kernel: [ 6778.145577] INFO: rcu_sched detected stalls 
on CPUs/tasks:
May  5 22:49:22 otaria kernel: [ 6778.145581]   1-...: (2 GPs behind) 
idle=631/1/0 softirq=384388/384388 fqs=19961 
May  5 22:49:22 otaria kernel: [ 6778.145582]   (detected by 6, t=21007 
jiffies, g=356388, c=356387, q=272524)
May  5 22:49:22 otaria kernel: [ 6778.145584] Task dump for CPU 1:
May  5 22:49:22 otaria kernel: [ 6778.145585] swapper/1       R  running task   
     0     0      1 0x00000008
May  5 22:49:22 otaria kernel: [ 6778.145587]  0000000000000246 
ffff880841123eb8 0000000000000018 ffffffff81482705
May  5 22:49:22 otaria kernel: [ 6778.145601]  0000000005ba63ce 
ffffe8ffffc492d0 ffff880841124000 ffff880841120000
May  5 22:49:22 otaria kernel: [ 6778.145602]  ffff880841120000 
ffff880841124000 ffffffff81ab5640 ffffffff810b8c67
May  5 22:49:22 otaria kernel: [ 6778.145604] Call Trace:
May  5 22:49:22 otaria kernel: [ 6778.145607]  [<ffffffff81482705>] ? 
cpuidle_enter_state+0x105/0x2c0
May  5 22:49:22 otaria kernel: [ 6778.145622]  [<ffffffff810b8c67>] ? 
cpu_startup_entry+0x287/0x340
May  5 22:49:22 otaria kernel: [ 6778.145624]  [<ffffffff8104d3fa>] ? 
start_secondary+0x15a/0x190
---<--------------------cut here---------------end--------------------->---

Yes, all hard lockups look the same, and they all seem to be on CPU 1.
I'm not sure about the soft lockups.
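
To check more rigorously whether every hard lockup lands on CPU 1, a
small tally over the kernel log could help.  What follows is only a
sketch: it assumes the log lines are unwrapped, as they are in the log
file itself rather than as wrapped in this message, and it matches only
the NMI watchdog line shown in the excerpt above.  The function name
count_hard_lockups and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the NMI watchdog report as it appears in the excerpt above.
HARD_LOCKUP = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")


def count_hard_lockups(lines):
    """Return a Counter mapping CPU number -> number of hard-lockup reports."""
    counts = Counter()
    for line in lines:
        match = HARD_LOCKUP.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Illustrative input only: two lines modelled on the excerpt above.
sample = [
    "May  5 22:48:19 otaria kernel: [ 6706.415657] NMI watchdog: "
    "Watchdog detected hard LOCKUP on cpu 1",
    "May  5 22:48:19 otaria kernel: [ 6706.415710] Stack:",
]

for cpu, n in sorted(count_hard_lockups(sample).items()):
    print(f"cpu {cpu}: {n} hard lockup report(s)")
```

Any iterable of lines works, so an open log file can be passed in place
of the sample list.  The same approach could be extended to the soft
lockups once the wording of those messages is known.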

-- 
Seb
