Hi Mathias,

On 2/27/2019 1:01 PM, Mathias Nyman wrote:
> Hi
> 
> On 26.2.2019 19.55, Shah, Nehal-bakulchandra wrote:
>> Hi
>>
>> On one of our customer platforms, we are getting the following errors:
>>
>> [65136.606651] xhci_hcd 0000:00:10.0: Command timeout
>> [65136.606690] xhci_hcd 0000:00:10.0: Abort command ring
>> [65150.739738] xhci_hcd 0000:00:10.0: Abort failed to stop command ring: -110
>> [65150.740115] xhci_hcd 0000:00:10.0: // Halt the HC
>> [65150.785382] xhci_hcd 0000:00:10.0: Host halt failed, -110
>> [65150.785419] xhci_hcd 0000:00:10.0: xHCI host controller not responding, assume dead
>> [65150.785874] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 1, ep index 0
>> [65150.785882] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 1, ep index 2
>> [65150.785911] xhci_hcd 0000:00:10.0: xHCI dying, ignoring interrupt. Shouldn't IRQs be disabled?
>> [65150.785921] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 2, ep index 0
>> [65150.785927] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 2, ep index 2
>> [65150.785937] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 3, ep index 0
>> [65150.785943] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 3, ep index 2
>> [65150.785971] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 3, ep index 3
>> [65150.785978] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 3, ep index 6
>> [65150.785987] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 4, ep index 0
>> [65150.785993] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 4, ep index 2
>> [65150.786003] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 4, ep index 4
>> [65150.786012] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 5, ep index 0
>> [65150.786018] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 5, ep index 2
>> [65150.786027] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 6, ep index 0
>> [65150.786033] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 6, ep index 2
>> [65150.786039] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 6, ep index 3
>> [65150.786046] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 7, ep index 0
>> [65150.786052] xhci_hcd 0000:00:10.0: Killing URBs for slot ID 8, ep index 0
>> [65150.786059] xhci_hcd 0000:00:10.0: HC died; cleaning up
>> [65150.786597] xhci_hcd 0000:00:10.0: Timeout while waiting for setup device command
>>
>>
>> As I understand it, the abort command times out because CRR (Command Ring
>> Running) is not being cleared, and the driver then assumes the controller
>> has died. After this the host is left in a completely unusable state. So
>> what can the recovery mechanism be? The comment in the xhci_abort_cmd_ring
>> function says: "In the future we should distinguish between -ENODEV and
>> -ETIMEDOUT and try to recover a -ETIMEDOUT with a host controller reset."
> 
> What kernel version is this issue seen on?
> I recall there being some race issue in this area some time ago.
> 
>>
>> Would it be a good idea to reset the controller, or is there any other
>> suggestion for recovery? Currently the only way out is to reboot the system.
> 
> Yes, I think it would be a good idea to try to reset the host in the
> -ETIMEDOUT case.
> So far the most common case has been that the host controller was actually
> removed (PCI hotplug) when first a command timed out and then aborting the
> command ring timed out, so just tearing down the host has so far been enough.

> Now we just need to implement this :)
Thanks for your input. I will look into implementing an xHCI host controller
reset for the -ETIMEDOUT case.
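
As a rough sketch of the decision discussed above (plain user-space C, not
driver code): the -110 in the log is -ETIMEDOUT on Linux, so the idea is to
attempt a reset in that case and keep the existing teardown for -ENODEV. The
enum and function names below are made up for illustration and are not part
of the xhci driver; how the reset itself would be issued is left open.

```c
#include <errno.h>

/* Hypothetical names for illustration only; not the xhci driver's API. */
enum abort_recovery {
	ABORT_OK,		/* command ring stopped, carry on */
	ABORT_TRY_RESET,	/* -ETIMEDOUT: host present but stuck */
	ABORT_TEAR_DOWN,	/* -ENODEV or any other error: host is gone */
};

/* Map the return value of the command ring abort to a recovery action. */
static enum abort_recovery pick_abort_recovery(int abort_ret)
{
	if (abort_ret == 0)
		return ABORT_OK;
	if (abort_ret == -ETIMEDOUT)
		return ABORT_TRY_RESET;
	return ABORT_TEAR_DOWN;
}

/*
 * Returns 1 if the host should be declared dead, 0 otherwise.
 * reset_ret is the (assumed) result of a host controller reset attempt
 * and is only consulted when the abort itself timed out.
 */
static int host_is_dead(int abort_ret, int reset_ret)
{
	switch (pick_abort_recovery(abort_ret)) {
	case ABORT_OK:
		return 0;
	case ABORT_TRY_RESET:
		return reset_ret != 0;
	default:
		return 1;
	}
}
```

With this split, the PCI hotplug removal case still goes straight to teardown,
and only a timed-out abort gets the extra reset attempt before the host is
declared dead.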
 
> -Mathias
> 

Regards
Nehal Shah
