Public bug reported:

Hi,

We're seeing this often on our HP Moonshot ARM64 nova-compute nodes
where qemu-nbd processes would lock up. At the same time, there's also a
bunch of kernel spew as follows:

| [605282.018238] block nbd3: Attempted send on closed socket
| [605282.018242] block nbd3: Attempted send on closed socket
| [605282.018245] block nbd3: Attempted send on closed socket
| [605282.018249] block nbd3: Attempted send on closed socket

swirlix01:

| hloeung@swirlix01:~$ uname -a
| Linux swirlix01 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:15:46 UTC 2015 aarch64 aarch64 aarch64 GNU/Linux
| hloeung@swirlix01:~$ ps afx | grep qe\\mu-nbd
| 27782 ? Ssl 0:00 /usr/bin/qemu-nbd -c /dev/nbd10 /var/lib/nova/instances/ba50751e-56d7-4bc4-8742-1193fe7a138e/disk
| hloeung@swirlix01:~$ sudo cat /proc/$(ps afx | grep qe\\mu-nbd | awk '{ print $1 }')/stack
| [<ffffffc0000875b0>] __switch_to+0x74/0x8c
| [<ffffffc000125dac>] futex_wait_queue_me+0xf4/0x184
| [<ffffffc0001268b4>] futex_wait+0x154/0x24c
| [<ffffffc000128638>] do_futex+0x1a0/0x9ec
| [<ffffffc000128f1c>] SyS_futex+0x98/0x1cc
| [<ffffffc00008642c>] el0_svc_naked+0x20/0x28
| [<ffffffffffffffff>] 0xffffffffffffffff

swirlix08:

| hloeung@swirlix08:~$ uname -a
| Linux swirlix08 3.19.0-31-generic #36~14.04.1-Ubuntu SMP Thu Oct 8 10:50:10 UTC 2015 aarch64 aarch64 aarch64 GNU/Linux
| hloeung@swirlix08:~$ ps afx | grep qe\\mu-nbd
| 31976 ? Ssl 0:00 /usr/bin/qemu-nbd -c /dev/nbd6 /var/lib/nova/instances/92ceb061-2ea4-4212-be20-ab0ded6eb3cd/disk
| hloeung@swirlix08:~$ sudo cat /proc/$(ps afx | grep qe\\mu-nbd | awk '{ print $1 }')/stack
| [<ffffffc0000875b0>] __switch_to+0x74/0x8c
| [<ffffffc000125d6c>] futex_wait_queue_me+0xf4/0x184
| [<ffffffc000126874>] futex_wait+0x154/0x24c
| [<ffffffc0001285f8>] do_futex+0x1a0/0x9ec
| [<ffffffc000128edc>] SyS_futex+0x98/0x1cc
| [<ffffffc00008642c>] el0_svc_naked+0x20/0x28
| [<ffffffffffffffff>] 0xffffffffffffffff

swirlix11:

| hloeung@swirlix11:~$ uname -a
| Linux swirlix11 3.19.0-31-generic #36~14.04.1-Ubuntu SMP Thu Oct 8 10:50:10 UTC 2015 aarch64 aarch64 aarch64 GNU/Linux
| hloeung@swirlix11:~$ ps afx | grep qe\\mu-nbd
| 18149 ? Ssl 0:00 /usr/bin/qemu-nbd -c /dev/nbd3 /var/lib/nova/instances/84cac137-c1e4-46ac-894a-efcd55ef7e05/disk
| hloeung@swirlix11:~$ sudo cat /proc/$(ps afx | grep qe\\mu-nbd | awk '{ print $1 }'/stack
| hloeung@swirlix11:~$ sudo cat /proc/$(ps afx | grep qe\\mu-nbd | awk '{ print $1 }')/stack
| [<ffffffc0000875b0>] __switch_to+0x74/0x8c
| [<ffffffc000125d6c>] futex_wait_queue_me+0xf4/0x184
| [<ffffffc000126874>] futex_wait+0x154/0x24c
| [<ffffffc0001285f8>] do_futex+0x1a0/0x9ec
| [<ffffffc000128edc>] SyS_futex+0x98/0x1cc
| [<ffffffc00008642c>] el0_svc_naked+0x20/0x28
| [<ffffffffffffffff>] 0xffffffffffffffff
| hloeung@swirlix11:~$ sudo strace -f -p 18149
| Process 18149 attached with 3 threads
| [pid 18150] rt_sigtimedwait([BUS ALRM IO], NULL, NULL, 8 <unfinished ...>
| [pid 18149] futex(0x7f749ec230, FUTEX_WAIT, 18152, NULL
| ... (hangs here) ...
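
For anyone wanting to gather more data on an affected node: /proc/<pid>/stack only shows the kernel stack of the main thread, while strace above reports 3 threads, and the ps | grep | awk one-liner breaks if more than one qemu-nbd is running. A rough sketch of a helper that walks /proc/<pid>/task/<tid>/stack for every matching process follows. The function name dump_thread_stacks and the PROC_ROOT override are made up for illustration (PROC_ROOT only exists so the function can be pointed at a fake tree for testing); this has not been run on the affected hosts.

```shell
#!/bin/sh
# Sketch only: print the kernel stack of every thread of every process
# whose command name (as reported in /proc/<pid>/comm) equals $1.
# PROC_ROOT is a hypothetical override, not a standard variable; it
# defaults to /proc.
dump_thread_stacks() {
    name="$1"
    proc_root="${PROC_ROOT:-/proc}"

    for comm_file in "$proc_root"/[0-9]*/comm; do
        # Skip processes that exited between the glob and the read.
        [ -r "$comm_file" ] || continue
        [ "$(cat "$comm_file")" = "$name" ] || continue

        pid_dir="${comm_file%/comm}"
        for task_dir in "$pid_dir"/task/[0-9]*; do
            [ -d "$task_dir" ] || continue
            echo "== pid ${pid_dir##*/} tid ${task_dir##*/} =="
            # Reading the stack file normally requires root.
            cat "$task_dir/stack" 2>/dev/null \
                || echo "(stack not readable; try as root)"
        done
    done
}
```

After sourcing the file, running `dump_thread_stacks qemu-nbd` as root should print one stack per thread, which would show what the threads other than 18149 are blocked on.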

We're using the QEMU package backported from Vivid as per LP:1457639

| hloeung@swirlix11:~$ apt-cache policy qemu-utils
| qemu-utils:
|   Installed: 1:2.2+dfsg-5expubuntu9.5+bug1457639~ubuntu14.04.1
|   Candidate: 1:2.2+dfsg-5expubuntu9.5+bug1457639~ubuntu14.04.1
|   Version table:
|  *** 1:2.2+dfsg-5expubuntu9.5+bug1457639~ubuntu14.04.1 0
|         500 http://ppa.launchpad.net/canonical-is-sa/arm64-infra-workarounds/ubuntu/ trusty/main arm64 Packages

I'm also not sure if this is related to LP:1505564, which is for
amd64/x86_64.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Affects: qemu (Ubuntu)
     Importance: Undecided
         Status: New

** Also affects: qemu (Ubuntu)
   Importance: Undecided
       Status: New

** Description changed:

+ I'm also not sure if this is related to LP:1505564, which is for
+ amd64/x86_64.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1512185

Title:
  qemu-nbd on ARM64 deadlock? Stuck in rt_sigtimedwait([BUS ALRM IO],
  ..) and futex(0x7f749ec230, FUTEX_WAIT, ...)

Status in linux package in Ubuntu:
  Incomplete
Status in qemu package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1512185/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp