Hi all, I have an issue with an external hard drive that I'm at my wit's end with. I'll try to keep it short:
My system is connected to an external SATA HD via USB 3 (used for backups). For 6+ months, this setup has worked flawlessly. About a week ago, I disconnected the external drive (a Seagate GoFlex docking station + disk combo, but I've since switched to another enclosure), put another drive in the dock, and reconnected. Ever since, my system has been possessed.

At random times, the external drive will disconnect for no discernible reason. It can happen in the middle of a write or after the disk has been idle or sleeping for hours. It may happen within minutes or days of the device first being connected. The only relevant thing I can find in the logs is a laconic "usb 3-1: USB disconnect, device number 3".

Once the system is in this state, things are thoroughly messed up. For starters, the disk will not reconnect (no errors or messages in dmesg) if I unplug it and plug it back in. Even rebooting the host will not bring it back! Also, anything assuming the existence of certain USB devices is borked. lsusb just hangs, forever. I can't kill -9 it. Heck, sometimes I can't even rmmod xhci_hcd (same thing - just hangs, unkillable.)

Weirdly enough, USB 2 devices still work in the USB 3 port. In fact, this is where this tale becomes entirely bizarre. The only way I've found so far to bring USB 3 back to life is this workaround: I plug a USB 2 device into the USB 3 port (I have a Lexar memory card reader I use for this purpose, but presumably, any USB 2 device would do), unplug it, plug the disk back in, and voila! It connects. Until it disconnects again, and the insane rain dance begins anew.

At this point, I have:

* tried 3 different HDDs (from 3 different manufacturers), so it's probably not related to the disk.
* tried 2 different external enclosures/docks, so it's probably not related to the usb-sata adapter.
* tried 2 different USB cables with those docks, so it's probably not the cable.
* swapped the motherboard (a supermicro A1SAi-2750F) with an identical new one, so probably not an electrical or mechanical issue with the board.
* disabled USB autosuspend (options usbcore autosuspend=-1 and autosuspend_delay_ms=-1)
* upgraded from Wheezy (kernel 3.2.0-4-amd64) to Jessie (3.16.0-4-amd64).

Could this be a kernel bug introduced somewhere around 3.2.0-4, and still present in 3.16.0-4? Or is the USB 3 controller on my board just buggy (and if so, any idea why this has not manifested itself until recently?) Any workarounds I can try (other than using USB 2)?

Thanks in advance!
- Dave.

Relevant system & log info:

*uname -a*
Linux deepthought 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux

*lsusb*
Bus 001 Device 005: ID 0557:2419 ATEN International Co., Ltd
Bus 001 Device 004: ID 0557:7000 ATEN International Co., Ltd Hub
Bus 001 Device 002: ID 8087:07db Intel Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 174c:55aa ASMedia Technology Inc. ASMedia 2105 SATA bridge
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 003: ID 0764:0601 Cyber Power System, Inc.
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

(The only things plugged into USB are the external disk and a UPS. The ATEN devices, I believe, are a keyboard and mouse emulated by the IPMI.)
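
For anyone who wants to compare against their own machine, here is a minimal Python sketch that picks out which devices sit behind the USB 3 root hub in an lsusb listing like the one above. It assumes the plain `lsusb` line format shown here. The function names (`parse_lsusb`, `superspeed_buses`, `devices_behind_usb3`) are made up for illustration, and treating ID 1d6b:0003 as the marker of a USB 3 root hub is an assumption taken from this listing, not something checked against other systems.

```python
import re

# One line of plain `lsusb` output, e.g.
# "Bus 003 Device 002: ID 174c:55aa ASMedia Technology Inc. ASMedia 2105 SATA bridge"
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d{3}) Device (?P<dev>\d{3}): "
    r"ID (?P<vid>[0-9a-fA-F]{4}):(?P<pid>[0-9a-fA-F]{4})\s*(?P<name>.*)"
)


def parse_lsusb(text):
    """Return one dict per device line found in `text`; other lines are skipped."""
    devices = []
    for line in text.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match is None:
            continue
        device = match.groupdict()
        device["bus"] = int(device["bus"])
        device["dev"] = int(device["dev"])
        device["name"] = device["name"].strip()
        devices.append(device)
    return devices


def superspeed_buses(devices):
    """Bus numbers whose root hub reports ID 1d6b:0003 (assumed: USB 3 root hub)."""
    return {d["bus"] for d in devices if (d["vid"], d["pid"]) == ("1d6b", "0003")}


def devices_behind_usb3(devices):
    """Devices on a USB 3 bus, excluding the root hubs themselves (vendor 1d6b)."""
    buses = superspeed_buses(devices)
    return [d for d in devices if d["bus"] in buses and d["vid"] != "1d6b"]


# The listing from this post, copied verbatim.
SAMPLE = """\
Bus 001 Device 005: ID 0557:2419 ATEN International Co., Ltd
Bus 001 Device 004: ID 0557:7000 ATEN International Co., Ltd Hub
Bus 001 Device 002: ID 8087:07db Intel Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 174c:55aa ASMedia Technology Inc. ASMedia 2105 SATA bridge
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 003: ID 0764:0601 Cyber Power System, Inc.
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""

if __name__ == "__main__":
    for d in devices_behind_usb3(parse_lsusb(SAMPLE)):
        print(d["bus"], d["dev"], d["vid"] + ":" + d["pid"], d["name"])
```

Since lsusb itself hangs once the controller is wedged, this is only useful on output captured while things still work.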

*lspci*
00:00.0 Host bridge: Intel Corporation Atom processor C2000 SoC Transaction Router (rev 02)
00:01.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 1 (rev 02)
00:02.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 2 (rev 02)
00:03.0 PCI bridge: Intel Corporation Atom processor C2000 PCIe Root Port 3 (rev 02)
00:0e.0 Host bridge: Intel Corporation Atom processor C2000 RAS (rev 02)
00:0f.0 IOMMU: Intel Corporation Atom processor C2000 RCEC (rev 02)
00:13.0 System peripheral: Intel Corporation Atom processor C2000 SMBus 2.0 (rev 02)
00:14.0 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.1 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.2 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:14.3 Ethernet controller: Intel Corporation Ethernet Connection I354 (rev 03)
00:16.0 USB controller: Intel Corporation Atom processor C2000 USB Enhanced Host Controller (rev 02)
00:17.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA2 Controller (rev 02)
00:18.0 SATA controller: Intel Corporation Atom processor C2000 AHCI SATA3 Controller (rev 02)
00:1f.0 ISA bridge: Intel Corporation Atom processor C2000 PCU (rev 02)
00:1f.3 SMBus: Intel Corporation Atom processor C2000 PCU SMBus (rev 02)
01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)

*USB controller (lspci -v):*
03:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03) (prog-if 30 [XHCI])
    Subsystem: Super Micro Computer Inc Device 0813
    Flags: bus master, fast devsel, latency 0, IRQ 17
    Memory at df100000 (64-bit, non-prefetchable) [size=8K]
    Capabilities: [50] Power Management version 3
    Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
    Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
    Capabilities: [a0] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [150] Latency Tolerance Reporting
    Kernel driver in use: xhci_hcd

*/var/log/messages entry when connecting the disk:*
Jul 17 22:21:09 deepthought kernel: [ 7301.806555] usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd
Jul 17 22:21:09 deepthought kernel: [ 7301.827884] usb 3-1: New USB device found, idVendor=174c, idProduct=55aa
Jul 17 22:21:09 deepthought kernel: [ 7301.827889] usb 3-1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
Jul 17 22:21:09 deepthought kernel: [ 7301.827892] usb 3-1: Product: ASMT1153e
Jul 17 22:21:09 deepthought kernel: [ 7301.827895] usb 3-1: Manufacturer: asmedia
Jul 17 22:21:09 deepthought kernel: [ 7301.827897] usb 3-1: SerialNumber: 123456789298
Jul 17 22:21:09 deepthought kernel: [ 7301.829738] usb-storage 3-1:1.0: USB Mass Storage device detected
Jul 17 22:21:09 deepthought kernel: [ 7301.829985] usb-storage 3-1:1.0: Quirks match for vid 174c pid 55aa: 400000
Jul 17 22:21:09 deepthought kernel: [ 7301.830090] scsi7 : usb-storage 3-1:1.0
Jul 17 22:21:10 deepthought kernel: [ 7302.831353] scsi 7:0:0:0: Direct-Access asmedia ASMT1153e 0 PQ: 0 ANSI: 6
Jul 17 22:21:10 deepthought kernel: [ 7302.831797] sd 7:0:0:0: Attached scsi generic sg4 type 0
Jul 17 22:21:10 deepthought kernel: [ 7302.835837] sd 7:0:0:0: [sde] Spinning up disk...
Jul 17 22:21:24 deepthought kernel: [ 7303.839266] .............ready
Jul 17 22:21:24 deepthought kernel: [ 7315.900124] sd 7:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
Jul 17 22:21:24 deepthought kernel: [ 7315.901309] sd 7:0:0:0: [sde] Write Protect is off
Jul 17 22:21:24 deepthought kernel: [ 7315.902367] sd 7:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 17 22:21:24 deepthought kernel: [ 7315.944094] sde: unknown partition table
Jul 17 22:21:24 deepthought kernel: [ 7315.946724] sd 7:0:0:0: [sde] Attached SCSI disk

*/var/log/syslog entry when the disk disconnects:*
Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: USB disconnect, device number 3
Jul 19 19:15:13 deepthought kernel: [169072.321166] sd 7:0:0:0: [sde] Synchronizing SCSI cache
Jul 19 19:15:13 deepthought kernel: [169072.321302] sd 7:0:0:0: [sde]
Jul 19 19:15:13 deepthought kernel: [169072.321306] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

*messages (likely) related to lsusb:*
Jul 19 19:41:34 deepthought kernel: [170654.220433] khubd D ffff88046b6bda48 0 116 2 0x00000000
Jul 19 19:41:34 deepthought kernel: [170654.220439] ffff88046b6bd5f0 0000000000000046 0000000000012f00 ffff88046b373fd8
Jul 19 19:41:34 deepthought kernel: [170654.220443] 0000000000012f00 ffff88046b6bd5f0 ffff88024efda148 ffff88046b373d00
Jul 19 19:41:34 deepthought kernel: [170654.220446] ffff88024efda140 ffff88046b6bd5f0 0000000000000282 ffff88011b428060
Jul 19 19:41:34 deepthought kernel: [170654.220450] Call Trace:
Jul 19 19:41:34 deepthought kernel: [170654.220460] [<ffffffff8150d259>] ? schedule_timeout+0x229/0x2a0
Jul 19 19:41:34 deepthought kernel: [170654.220465] [<ffffffff81072cb6>] ? lock_timer_base.isra.35+0x26/0x50
Jul 19 19:41:34 deepthought kernel: [170654.220469] [<ffffffff8107253a>] ? internal_add_timer+0x2a/0x70
Jul 19 19:41:34 deepthought kernel: [170654.220473] [<ffffffff81074777>] ? mod_timer+0x127/0x1e0
Jul 19 19:41:34 deepthought kernel: [170654.220476] [<ffffffff8150e768>] ? wait_for_completion+0xa8/0x120
Jul 19 19:41:34 deepthought kernel: [170654.220481] [<ffffffff81096920>] ? wake_up_state+0x10/0x10
Jul 19 19:41:34 deepthought kernel: [170654.220488] [<ffffffffa06dae5c>] ? xhci_alloc_dev+0xac/0x250 [xhci_hcd]
Jul 19 19:41:34 deepthought kernel: [170654.220511] [<ffffffffa000778b>] ? usb_alloc_dev+0x6b/0x2f0 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220520] [<ffffffffa000e581>] ? hub_thread+0xcb1/0x1740 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220524] [<ffffffff810a7a70>] ? prepare_to_wait_event+0xf0/0xf0
Jul 19 19:41:34 deepthought kernel: [170654.220533] [<ffffffffa000d8d0>] ? hub_port_debounce+0x130/0x130 [usbcore]
Jul 19 19:41:34 deepthought kernel: [170654.220538] [<ffffffff81087fad>] ? kthread+0xbd/0xe0
Jul 19 19:41:34 deepthought kernel: [170654.220542] [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Jul 19 19:41:34 deepthought kernel: [170654.220546] [<ffffffff81511518>] ? ret_from_fork+0x58/0x90
Jul 19 19:41:34 deepthought kernel: [170654.220550] [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180

*BIOS*
Legacy USB support: Enabled.
XHCI Handoff: Enabled.
EHCI Handoff: Disabled.
USB Mass Storage Driver Support: Enabled.
Port 60/64 Emulation: Enabled.
USB Transfer Timeout: 20 sec.
Device Reset Timeout: 20 sec.
Device Power-Up Delay: Auto.
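
In case it helps with lining up connects and disconnects over a longer stretch of syslog, here is a rough Python sketch that pulls just those two kinds of kernel messages out of log lines. It assumes the syslog layout shown in the excerpts above ("Mon DD HH:MM:SS host kernel: [uptime] message"). The names `usb_events`, `KERNEL_LINE`, `CONNECT` and `DISCONNECT` are hypothetical, and the two message patterns are modelled only on the lines quoted in this post, so other kernel versions may word them differently.

```python
import re

# Syslog-style kernel line, e.g.
# "Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: USB disconnect, device number 3"
KERNEL_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] (?P<msg>.*)$"
)
# Message patterns modelled on the two lines quoted in this post (an assumption).
CONNECT = re.compile(
    r"^usb (?P<port>[\d.-]+): new \S+ USB device number (?P<num>\d+)"
)
DISCONNECT = re.compile(
    r"^usb (?P<port>[\d.-]+): USB disconnect, device number (?P<num>\d+)"
)


def usb_events(lines):
    """Return connect/disconnect events found in `lines`, in input order."""
    events = []
    for line in lines:
        match = KERNEL_LINE.match(line.strip())
        if match is None:
            continue  # not a kernel line in the expected layout
        for kind, pattern in (("connect", CONNECT), ("disconnect", DISCONNECT)):
            hit = pattern.match(match.group("msg"))
            if hit:
                events.append({
                    "kind": kind,
                    "stamp": match.group("stamp"),
                    "uptime": float(match.group("uptime")),
                    "port": hit.group("port"),
                    "device_number": int(hit.group("num")),
                })
    return events


# Three lines copied from the excerpts above.
SAMPLE = [
    "Jul 17 22:21:09 deepthought kernel: [ 7301.806555] usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd",
    "Jul 17 22:21:09 deepthought kernel: [ 7301.827884] usb 3-1: New USB device found, idVendor=174c, idProduct=55aa",
    "Jul 19 19:15:13 deepthought kernel: [169072.319365] usb 3-1: USB disconnect, device number 3",
]

if __name__ == "__main__":
    for event in usb_events(SAMPLE):
        print(event["stamp"], event["kind"], event["port"], event["device_number"])
```

To run it over a whole log, something like `usb_events(open("/var/log/syslog"))` should work, given read access to the file.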