Hello Junien,

After your last crash, which was similar to the previous ones, one thing caught my attention: for the first time, an RCU stall on one CPU was detected by another CPU. This made me think that the problem is not related only to the SMP logic, as I had believed, but that the stall is also occurring somewhere else.

----
[ 5792.466770] INFO: rcu_sched detected stalls on CPUs/tasks: { 7} (detected by 15, t=15003 jiffies, g=182379, c=182378, q=0)
----

And this stall happened before the async I/O callbacks started to be suppressed:

----
[ 5793.190218] block nbd6: Attempted send on closed socket
[ 5793.190221] blk_update_request: 1154 callbacks suppressed
[ 5793.190223] blk_update_request: I/O error, dev nbd6, sector 125828992
[ 5793.190226] buffer_io_error: 1151 callbacks suppressed
[ 5793.190227] Buffer I/O error on dev nbd6, logical block 125828992, async page read
[ 5793.190235] block nbd6: Attempted send on closed socket
[ 5793.190237] blk_update_request: I/O error, dev nbd6, sector 125828993
[ 5793.190238] Buffer I/O error on dev nbd6, logical block 125828993, async page read
[ 5793.190242] block nbd6: Attempted send on closed socket
[ 5793.190243] blk_update_request: I/O error, dev nbd6, sector 125828994
[ 5793.190245] Buffer I/O error on dev nbd6, logical block 125828994, async page read
[ 5793.190248] block nbd6: Attempted send on closed socket
----

Digging upstream (from 3.13 to HEAD), I could see that there has not been a huge number of fixes to nbd.c:

----
$ git log --pretty=oneline v3.13..HEAD -- drivers/block/nbd.c | wc -l
31
----

Among them I identified an improvement to nbd timeout handling:

----
commit 7e2893a16d3e71035a38122a77bc55848a29f0e4
Author: Markus Pargmann <m...@pengutronix.de>
Date:   Mon Aug 17 08:20:00 2015 +0200

    nbd: Fix timeout detection
----

This fix is pretty recent (4.3) and it fits this case: a 3.18 kernel facing the same issue. Later I found out that Debian had a similar bug:

----
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=770479
https://lists.debian.org/debian-kernel/2015/05/msg00054.html
----

for kernel 3.16, complaining about messages like this:

----
[ 5793.190242] block nbd6: Attempted send on closed socket
----

and about the lack of a proper timeout for nbd connections (now based on a timeout after I/O submission).

So...

The backport should be easy* and I'll probably make a PPA containing a 3.18 kernel (plus this patch) available for you tomorrow.

* 2 out of 12 hunks FAILED -- saving rejects to file drivers/block/nbd.c.rej
* Debian has a 3.16 version already

Thank you

Rafael Tinoco

** Bug watch added: Debian Bug tracker #770479
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=770479

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1505564

Title: Soft lockup with "block nbdX: Attempted send on closed socket" spam

Status in linux package in Ubuntu: In Progress

Bug description:

Hi,

Some of our nova compute hosts regularly freeze, sometimes for a few hours, with kern.log getting spammed with:

block nbdX: Attempted send on closed socket

and a few "CPU soft lockup" messages (see attached log).

This clears up when the queue gets cleared, e.g.:

block nbdX: queue cleared

These are trusty hosts with kernel version 3.19.0-30-generic.

Note that the timestamps in kern.log appear to be wrong; it looks like the messages are being held and then all delivered at once when the kernel unfreezes.

Attaching apport files from 2 hosts below.
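
For anyone triaging similar logs, here is a minimal sketch (not part of the original report) of how the messages quoted above could be tallied per nbd device. It assumes the kernel messages have exactly the wording shown in this thread; the function name summarize and the three regular expressions are illustrative choices, not an existing tool.

```python
import re
from collections import Counter

# Patterns for the kernel messages quoted in this report (wording assumed
# to match exactly; adjust if your kernel prints them differently).
SEND_RE = re.compile(r"block (nbd\d+): Attempted send on closed socket")
CLEARED_RE = re.compile(r"block (nbd\d+): queue cleared")
SUPPRESSED_RE = re.compile(r"(\w+): (\d+) callbacks suppressed")


def summarize(lines):
    """Tally nbd closed-socket errors, queue-cleared events and suppressed callbacks.

    Returns three Counters: closed-socket errors per device, queue-cleared
    events per device, and suppressed callbacks per reporting function.
    """
    sends = Counter()
    cleared = Counter()
    suppressed = Counter()
    for line in lines:
        m = SEND_RE.search(line)
        if m:
            sends[m.group(1)] += 1
            continue
        m = CLEARED_RE.search(line)
        if m:
            cleared[m.group(1)] += 1
            continue
        m = SUPPRESSED_RE.search(line)
        if m:
            # The kernel reports how many messages it dropped, so add that
            # number rather than counting the line once.
            suppressed[m.group(1)] += int(m.group(2))
    return sends, cleared, suppressed
```

Any iterable of log lines works as input, for example an open kern.log file object. Since the suppressed-callback counts stand for messages that were never printed, the closed-socket tally is only a lower bound on how often the condition was hit.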

---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 9 13:14 seq
 crw-rw---- 1 root audio 116, 33 Oct 9 13:14 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.15
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL385 G7
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-30-generic root=UUID=1bd039df-c419-4cb7-b1ad-fe004d55ccd4 ro console=tty0 console=ttyS1,38400 nosplash
ProcVersionSignature: Ubuntu 3.19.0-30.34~14.04.1-generic 3.19.8-ckt6
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-30-generic N/A
 linux-backports-modules-3.19.0-30-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.19.0-30-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 12/08/2012
dmi.bios.vendor: HP
dmi.bios.version: A18
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrA18:bd12/08/2012:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL385 G7
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 9 12:37 seq
 crw-rw---- 1 root audio 116, 33 Oct 9 12:37 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.15
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL360p Gen8
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-30-generic root=UUID=f3c6cae8-09dc-4607-8675-c9123ea9c9fd ro console=tty0 console=ttyS1,38400 nosplash
ProcVersionSignature: Ubuntu 3.19.0-30.34~14.04.1-generic 3.19.8-ckt6
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-30-generic N/A
 linux-backports-modules-3.19.0-30-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
StagingDrivers: visorchannel visorutil
Tags: trusty uec-images staging
Uname: Linux 3.19.0-30-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 11/14/2013
dmi.bios.vendor: HP
dmi.bios.version: P71
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP71:bd11/14/2013:svnHP:pnProLiantDL360pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL360p Gen8
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 28 13:07 seq
 crw-rw---- 1 root audio 116, 33 Oct 28 13:07 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.16
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL360p Gen8
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-29-generic root=UUID=46ac2c1e-5f16-45bd-b383-e952f78fd142 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M
ProcVersionSignature: Ubuntu 3.13.0-29.53-generic 3.13.11.2
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-29-generic N/A
 linux-backports-modules-3.13.0-29-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.13.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 11/14/2013
dmi.bios.vendor: HP
dmi.bios.version: P71
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP71:bd11/14/2013:svnHP:pnProLiantDL360pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL360p Gen8
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 28 16:31 seq
 crw-rw---- 1 root audio 116, 33 Oct 28 16:31 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL385 G7
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=29988d72-90fe-4329-a0c1-0e3bfb88beab ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M
ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.13.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
WifiSyslog:
 Oct 29 07:58:52 druk kernel: [55713.346292] nr_pdflush_threads exported in /proc is scheduled for removal
 Oct 29 07:58:52 druk kernel: [55713.346523] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case. If you have one, please send an email to linux...@kvack.org.
_MarkForUpload: True
dmi.bios.date: 02/02/2014
dmi.bios.vendor: HP
dmi.bios.version: A18
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL385 G7
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 28 19:56 seq
 crw-rw---- 1 root audio 116, 33 Oct 28 19:56 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL360p Gen8
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-29-generic root=UUID=f3c6cae8-09dc-4607-8675-c9123ea9c9fd ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M
ProcVersionSignature: Ubuntu 3.13.0-29.53-generic 3.13.11.2
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-29-generic N/A
 linux-backports-modules-3.13.0-29-generic N/A
 linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.13.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
WifiSyslog:
 Oct 29 07:41:34 orlo kernel: [42337.380135] qbrce3298b1-cb: port 2(tapce3298b1-cb) entered disabled state
 Oct 29 07:41:34 orlo kernel: [42337.380543] device tapce3298b1-cb left promiscuous mode
 Oct 29 07:41:34 orlo kernel: [42337.380580] qbrce3298b1-cb: port 2(tapce3298b1-cb) entered disabled state
 Oct 29 07:41:35 orlo kernel: [42338.223036] type=1400 audit(1446104495.424:137): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-80a3c754-9f62-4011-baea-b3a8f37d3746" pid=11554 comm="apparmor_parser"
 Oct 29 07:41:35 orlo kernel: [42338.308192] qbrce3298b1-cb: port 1(qvbce3298b1-cb) entered disabled state
_MarkForUpload: True
dmi.bios.date: 11/14/2013
dmi.bios.vendor: HP
dmi.bios.version: P71
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP71:bd11/14/2013:svnHP:pnProLiantDL360pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL360p Gen8
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 29 21:10 seq
 crw-rw---- 1 root audio 116, 33 Oct 29 21:10 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL360p Gen8
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-31-generic root=UUID=f3c6cae8-09dc-4607-8675-c9123ea9c9fd ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M
ProcVersionSignature: Ubuntu 3.19.0-31.36~14.04.1-generic 3.19.8-ckt7
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-31-generic N/A
 linux-backports-modules-3.19.0-31-generic N/A
 linux-firmware 1.127.16
RfKill: Error: [Errno 2] No such file or directory
StagingDrivers: visorchannel visorutil
Tags: trusty uec-images staging
Uname: Linux 3.19.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 11/14/2013
dmi.bios.vendor: HP
dmi.bios.version: P71
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP71:bd11/14/2013:svnHP:pnProLiantDL360pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL360p Gen8
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 30 12:31 seq
 crw-rw---- 1 root audio 116, 33 Oct 30 12:31 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL360p Gen8
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-31-generic root=UUID=46ac2c1e-5f16-45bd-b383-e952f78fd142 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M
ProcVersionSignature: Ubuntu 3.19.0-31.36~14.04.1-generic 3.19.8-ckt7
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-31-generic N/A
 linux-backports-modules-3.19.0-31-generic N/A
 linux-firmware 1.127.16
RfKill: Error: [Errno 2] No such file or directory
StagingDrivers: visorchannel visorutil
Tags: trusty uec-images staging
Uname: Linux 3.19.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 11/14/2013
dmi.bios.vendor: HP
dmi.bios.version: P71
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP71:bd11/14/2013:svnHP:pnProLiantDL360pGen8:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL360p Gen8
dmi.sys.vendor: HP
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 17 12:47 seq
 crw-rw---- 1 root audio 116, 33 Nov 17 12:47 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL385 G7
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-33-generic root=UUID=29988d72-90fe-4329-a0c1-0e3bfb88beab ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M nox2apic intremap=off
ProcVersionSignature: Ubuntu 3.19.0-33.38~14.04.1-generic 3.19.8-ckt7
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-33-generic N/A
 linux-backports-modules-3.19.0-33-generic N/A
 linux-firmware 1.127.18
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.19.0-33-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 02/02/2014
dmi.bios.vendor: HP
dmi.bios.version: A18
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL385 G7
dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505564/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp