Hi
On 24.02.2017 17:01, s...@free.fr wrote:
Hello,
I have a BUG on USB xhci.
The trace here :
[11518.982950] xhci_hcd 0000:07:00.0: Stopped the command ring failed, maybe the host is dead
[11519.027106] xhci_hcd 0000:07:00.0: Host halt failed, -110
[11519.027108] xhci_hcd 0000:07:00.0: Abort command ring failed
[11519.027215] xhci_hcd 0000:07:00.0: HC died; cleaning up
[11519.027230] xhci_hcd 0000:07:00.0: Timeout while waiting for setup device command
[11519.442303] usb 3-1: device not accepting address 15, error -108
[11519.442324] usb usb3-port1: couldn't allocate usb_device
After this error happens, I have to reboot Linux. Without a reboot, the USB port
doesn't work for any device.
We're waiting for the device to respond to a setup device command. It doesn't
respond, so we have to cancel the command (stop the command ring, skip the
command, and restart the command ring).
We first fail to stop the command ring, then we fail to halt the entire
host controller.
The situation.
uname -a :
Linux shal 4.10.0-8-generic #10-Ubuntu SMP Mon Feb 13 14:04:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
4.10 contains changes in exactly this area, to prevent a race that might
restart the command ring before we check whether it stopped.
Do you have an older kernel available to check whether it's a regression in 4.10?
Part of lspci:
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
07:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
Do you have a host from another vendor to try this on?
The log shows that the host controller becomes really unresponsive after we try
to abort the command ring.
# lsusb
Bus 002 Device 004: ID 0582:0044 Roland Corp. EDIROL UA-1000
Bus 002 Device 003: ID 046d:c52e Logitech, Inc. MK260 Wireless Combo Receiver
Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Note that I have booted with the GRUB option:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash usbcore.old_scheme_first=1"
I work with an old Android smartphone in fastboot mode. The smartphone is
connected with a long USB cable (5m).
In fastboot mode (and only in this mode), the device is not reachable.
There is an error like this:
usb 3-1: device not accepting address 12, error -71
So I added "usbcore.old_scheme_first=1" to the kernel command line, and then I
can reach the device in fastboot mode.
But when I perform some operations on the smartphone, the device sometimes hangs.
Does the host always hang after a command times out? I.e., is there ever a
timeout message
"xhci_hcd 0000:07:00.0: Timeout while ..." without the host-dying messages:
xhci_hcd 0000:07:00.0: Stopped the command ring failed, maybe the host is dead
xhci_hcd 0000:07:00.0: Host halt failed, -110
xhci_hcd 0000:07:00.0: Abort command ring failed
xhci_hcd 0000:07:00.0: HC died; cleaning up
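One rough way to answer that from a saved log is to count the two kinds of
message. This is only a sketch: summarize_xhci_log is a made-up helper name,
and the patterns are taken from the messages quoted above, so they may need
adjusting for other kernel versions.

```shell
#!/bin/sh
# Count xhci command timeouts and host-controller deaths in a kernel log.
# Reads the log on stdin. summarize_xhci_log is a hypothetical helper,
# not an existing tool.
summarize_xhci_log() {
    awk '
        /xhci_hcd .*Timeout while/ { timeouts++ }
        /xhci_hcd .*HC died/       { died++ }
        END { printf "timeouts=%d hc_died=%d\n", timeouts, died }
    '
}
```

It can be fed with "dmesg | summarize_xhci_log". More timeouts than
"HC died" lines would suggest that at least one timeout was survived, but
the counts alone do not prove it; the surrounding log lines still need a look.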
In this case, my USB port hangs too, and it is impossible to connect any device
to it (smartphone or USB key, for example).
I have to reboot Linux to get the USB port working again.
Note that during the operation, the entire Linux system freezes for a few seconds.
My questions:
- Is there a method to prevent my USB port from hanging?
You could check whether the EHCI USB controller works:
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller
- If not, is there a method to get a working USB port without rebooting?
Try reloading xhci; that might do the trick, unless the controller is really stuck.
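One way to reload only the affected controller is to unbind and rebind it
through sysfs. The sketch below assumes the PCI driver is registered as
xhci_hcd (as the log prefix suggests) and must run as root; rebind_xhci and
its optional second argument are invented for illustration, the latter only so
the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Unbind and rebind one PCI device from the xhci_hcd driver through sysfs.
# rebind_xhci is a hypothetical helper; the second argument exists only so
# the function can be pointed at a scratch directory instead of /sys.
rebind_xhci() {
    dev="$1"                      # PCI address, e.g. 0000:07:00.0 from the log
    sysfs="${2:-/sys}"
    drv="$sysfs/bus/pci/drivers/xhci_hcd"

    printf '%s' "$dev" > "$drv/unbind" || return 1
    sleep 1                       # arbitrary pause before rebinding
    printf '%s' "$dev" > "$drv/bind"
}
```

For the controller in this report that would be "rebind_xhci 0000:07:00.0".
If the driver is built as a module, "modprobe -r xhci_pci" followed by
"modprobe xhci_pci" is another option, but that affects every xHCI controller
in the system.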
Thanks
More traces:
[11466.611552] usb 3-1: USB disconnect, device number 11
sudden disconnect
[11468.957608] usb 3-1: new high-speed USB device number 12 using xhci_hcd
[11470.878811] usb 3-1: Device not responding to setup address.
[11486.881738] usb 3-1: Device not responding to setup address.
So there are already a couple of transaction errors when trying to address the
device.
[11487.088447] usb 3-1: device not accepting address 12, error -71
[11487.532378] usb 3-1: new high-speed USB device number 14 using xhci_hcd
[11487.559735] usb 3-1: unable to get BOS descriptor
[11487.564929] usb 3-1: New USB device found, idVendor=18d1, idProduct=d00d
[11487.564932] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[11487.564934] usb 3-1: Product: Android
[11487.564935] usb 3-1: Manufacturer: Google
[11489.585534] usb 3-1: USB disconnect, device number 14
sudden disconnect
[11491.748090] usb 3-1: new high-speed USB device number 15 using xhci_hcd
[11518.982950] xhci_hcd 0000:07:00.0: Stopped the command ring failed, maybe the host is dead
[11519.027106] xhci_hcd 0000:07:00.0: Host halt failed, -110
[11519.027108] xhci_hcd 0000:07:00.0: Abort command ring failed
[11519.027215] xhci_hcd 0000:07:00.0: HC died; cleaning up
[11519.027230] xhci_hcd 0000:07:00.0: Timeout while waiting for setup device command
[11519.442303] usb 3-1: device not accepting address 15, error -108
[11519.442324] usb usb3-port1: couldn't allocate usb_device
The connection looks really unreliable.
Enabling xhci debugging might reveal something:
echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
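To keep the log manageable, the debug output can be switched off again once
the failure has been reproduced. A small sketch, assuming debugfs is mounted
at /sys/kernel/debug and the kernel has dynamic debug enabled; set_xhci_debug
and its optional second argument are made up here, the latter only so the
function can be tried on a scratch file.

```shell
#!/bin/sh
# Toggle xhci_hcd dynamic debug messages. set_xhci_debug is a hypothetical
# helper; the optional second argument only lets it write to a scratch file
# instead of the real dynamic debug control file.
set_xhci_debug() {
    flag="$1"      # "=p" to enable printing, "-p" to disable it again
    ctl="${2:-/sys/kernel/debug/dynamic_debug/control}"
    printf 'module xhci_hcd %s' "$flag" > "$ctl"
}
```

A possible sequence, run as root: "set_xhci_debug =p", reproduce the hang,
save the output of dmesg to a file, then "set_xhci_debug -p".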
-Mathias