Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-07 Thread Singh, Balbir
On Thu, 2019-10-03 at 15:13 -0400, Tyler Ramer wrote:
> Always shutdown the controller when nvme_remove_dead_controller is
> reached.
>
> It's possible for nvme_remove_dead_controller to be called as part of a
> failed reset, when there is a bad NVME_CSTS. The controller won't
> be coming back on

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-07 Thread Keith Busch
On Mon, Oct 07, 2019 at 11:13:12AM -0400, Tyler Ramer wrote:
> > Setting the shutdown to true is
> > usually just to get the queues flushed, but the nvme_kill_queues() that
> > we call accomplishes the same thing.
>
> The intention of this patch was to clean up another location where
> nvme_dev_di

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-07 Thread Tyler Ramer
> Setting the shutdown to true is
> usually just to get the queues flushed, but the nvme_kill_queues() that
> we call accomplishes the same thing.
The intention of this patch was to clean up another location where nvme_dev_disable() is called with shutdown == false, but the device is being removed

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-06 Thread Keith Busch
On Fri, Oct 04, 2019 at 11:36:42AM -0400, Tyler Ramer wrote:
> Here's a failure we had which represents the issue the patch is
> intended to solve:
>
> Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
> Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
>

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-05 Thread Tyler Ramer
> What is the bad CSTS bit? CSTS.RDY?
The reset will be triggered by the result of nvme_should_reset():
static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
{

	/* If true, indicates loss of adapter communication, possibly by a
	 * NVMe Subsystem res

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-04 Thread Singh, Balbir
On Fri, 2019-10-04 at 11:36 -0400, Tyler Ramer wrote:
> Here's a failure we had which represents the issue the patch is
> intended to solve:
>
> Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
> Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
> reset:

Re: [PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-04 Thread Tyler Ramer
Here's a failure we had which represents the issue the patch is intended to solve:
Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
Aug 26 15:02:10 testhost kernel: nvme n
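
For anyone decoding the CSTS=0x3 value in the log above, here is a minimal userspace sketch. It assumes the CSTS bit layout from the NVMe base specification (RDY in bit 0, CFS in bit 1, SHST in bits 3:2, NSSRO in bit 4). The macro, struct, and function names are made up for this sketch; they are not the kernel's definitions and this is not how the driver itself is written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CSTS bit layout (per the NVMe base specification).
 * These names are local to this sketch, not the kernel's macros. */
#define CSTS_RDY        (1u << 0)  /* controller ready */
#define CSTS_CFS        (1u << 1)  /* controller fatal status */
#define CSTS_SHST_MASK  (3u << 2)  /* shutdown status field, bits 3:2 */
#define CSTS_SHST_SHIFT 2
#define CSTS_NSSRO      (1u << 4)  /* NVM subsystem reset occurred */

/* Hypothetical helper type holding the decoded fields. */
struct csts_fields {
	bool rdy;
	bool cfs;
	unsigned int shst;  /* 0 = normal, 1 = shutdown in progress, 2 = shutdown complete */
	bool nssro;
};

/* Split a raw 32-bit CSTS value into its individual fields. */
static struct csts_fields decode_csts(uint32_t csts)
{
	struct csts_fields f = {
		.rdy   = (csts & CSTS_RDY) != 0,
		.cfs   = (csts & CSTS_CFS) != 0,
		.shst  = (csts & CSTS_SHST_MASK) >> CSTS_SHST_SHIFT,
		.nssro = (csts & CSTS_NSSRO) != 0,
	};
	return f;
}
```

Under that layout, decode_csts(0x3) yields RDY and CFS both set with SHST and NSSRO clear, meaning the controller reports a fatal status while still flagged ready.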

[PATCH] nvme-pci: Shutdown when removing dead controller

2019-10-03 Thread Tyler Ramer
Always shutdown the controller when nvme_remove_dead_controller is reached. It's possible for nvme_remove_dead_controller to be called as part of a failed reset, when there is a bad NVME_CSTS. The controller won't be coming back online, so we should shut it down rather than just disabling. Signe