------- Comment From [email protected] 2016-08-22 13:11 EDT-------
(In reply to comment #16)
> Test kernel at http://people.canonical.com/~rtg/eeh-lp1602724/ with upstream
> commit c21377f8366c95440d533edbe47d070f662c62ef ('nvme: Suspend all queues
> before deletion') applied.

This test kernel is not OK: it stalls the workqueue (wq):

[  540.097661] INFO: rcu_sched detected stalls on CPUs/tasks:
[  540.103320]  1-...: (1 GPs behind) idle=d35/140000000000000/0 softirq=2385/2386 fqs=65
[  540.103411]  (detected by 11, t=5472 jiffies, g=1335, c=1334, q=793)
[  540.103492] Task dump for CPU 1:
[  540.103539] kworker/u32:1   D 0000000000000000     0   101      0 0x00000800
[  540.103656] Call Trace:
[  540.103692] [c00000017bc539c0] [c00000017bc53a00] 0xc00000017bc53a00 (unreliable)
[  540.103805] [c00000017bc53a00] [d000000001614480] nvme_suspend_queue+0x30/0x150 [nvme]
[  540.103914] [c00000017bc53a30] [d000000001616850] nvme_dev_disable+0x110/0x440 [nvme]
[  540.104022] [c00000017bc53b10] [d000000001617e60] nvme_reset_work+0xe0/0x1120 [nvme]
[  540.104132] [c00000017bc53c50] [c0000000000dd630] process_one_work+0x1e0/0x5a0
[  540.104239] [c00000017bc53ce0] [c0000000000ddb84] worker_thread+0x194/0x680
[  540.104331] [c00000017bc53d80] [c0000000000e6680] kthread+0x110/0x130
[  540.104424] [c00000017bc53e30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
[  604.094501] INFO: rcu_sched detected stalls on CPUs/tasks:
[  604.094699]  1-...: (1 GPs behind) idle=d35/140000000000000/0 softirq=2385/2386 fqs=82
[  604.094700]  (detected by 5, t=21472 jiffies, g=1335, c=1334, q=1283)
[  604.094705] Task dump for CPU 1:

Can you provide the backported patch for verification?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1602724

Title:
  Ubuntu 16.04 - Full EEH Recovery Support for NVMe devices

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress

Bug description:
  == Comment: #0 - Heitor Ricardo Alves de Siqueira <[email protected]> - 2016-07-12 12:54:27 ==
  The current nvme driver in the Ubuntu 16.04 kernel does not handle error recovery; it is missing some patches from the upstream nvme driver.

  We would like to ask Canonical to cherry-pick the following patches for the 16.04 kernel, if possible:
      * 9396dec916c0 ("nvme: use a work item to submit async event requests")
      * 79f2b358c9ba ("nvme: don't poll the CQ from the kthread")
      * 2d55cd5f511d ("nvme: replace the kthread with a per-device watchdog timer")
      * 9bf2b972afea ("NVMe: Fix reset/remove race")
      * c875a7093f04 ("nvme: Avoid reset work on watchdog timer function during error recovery")
      * a5229050b69c ("NVMe: Always use MSI/MSI-x interrupts")

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602724/+subscriptions

