Made it a valid patch file (for Debian stable): amd64:/usr/src/linux-3.16.35# cd /usr/src/linux-3.16.35 amd64:/usr/src/linux-3.16.35# patch -p1 --dry-run </tmp/hid-core.c-3.16-to-4.5.diff checking file drivers/hid/usbhid/hid-core.c
See attached hid-core.c-3.16-to-4.5.diff I don't know how it will work on Ubuntu. Maybe they differ too much. ** Attachment removed: "Difference between Debian 3.16 and 4.5 hid-core.c" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1557172/+attachment/4673143/+files/hid-core.c-3.16-to-4.5.diff ** Patch added: "Difference between Debian 3.16 and 4.5 hid-core.c" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1557172/+attachment/4673144/+files/hid-core.c-3.16-to-4.5.diff -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1557172 Title: khubd/usbhid deadlock(?) creates processes in state D Status in linux package in Ubuntu: Incomplete Bug description: (Note that the attached logs are not a machine exhibiting the issue) I've observed this issue specifically on Dell's PowerEdge R410 with its DRAC. I manage quite a few other machines with DRACs and haven't seen this issue so it may be limited to the R410. I've seen cases of this both with the precise and trusty kernel. Basically, what happens is that things that read any info about the DRAC's usbhid device will block in an uninterruptible state. A quick way to do this is with "cat /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-2/manufacturer". This means that commands like lspci will block as well. The thing is that it doesn't happen every time, but once it happens the first time all future reads on the device will block in state D too. This is also associated with the following from the kernel (first from a precise machine): [197852.595820] usb 5-2: reset full-speed USB device number 2 using uhci_hcd [197857.716899] usb 5-2: device descriptor read/64, error -71 [197857.944966] usb 5-2: device descriptor read/64, error -71 [197858.165020] usb 5-2: reset full-speed USB device number 2 using uhci_hcd [197858.285090] usb 5-2: device descriptor read/64, error -71 [197858.513108] usb 5-2: device descriptor read/64, error -71 [197858.733154] usb 5-2: reset full-speed USB device number 2 using uhci_hcd [197859.145214] usb 5-2: device not accepting address 2, error -71 [197859.261234] usb 5-2: reset full-speed USB device number 2 using uhci_hcd [197859.669322] usb 5-2: device not accepting address 2, error -71 [197859.675357] usb 5-2: USB disconnect, device number 2 [198047.530714] INFO: task khubd:203 blocked for more than 120 seconds. [198047.537147] Not tainted 3.13.0-74-generic #118~precise1-Ubuntu [198047.543691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [198047.551792] khubd D ffff880803fe18a0 0 203 2 0x00000000 [198047.551803] ffff8810036b1c18 0000000000000046 ffff88102fc53180 ffff8810036b1fd8 [198047.551809] 0000000000013180 0000000000013180 ffff881003f7c800 ffff880804634800 [198047.551814] ffff8810036b1c38 ffff8810020908e8 ffff8810020908ec 00000000ffffffff [198047.551819] Call Trace: [198047.551828] [<ffffffff81764b59>] schedule+0x29/0x70 [198047.551833] [<ffffffff81764e7e>] schedule_preempt_disabled+0xe/0x10 [198047.551837] [<ffffffff81766cb4>] __mutex_lock_slowpath+0x114/0x1b0 [198047.551845] [<ffffffff8156e66e>] ? usb_alloc_urb+0x1e/0x50 [198047.551848] [<ffffffff81766d73>] mutex_lock+0x23/0x37 [198047.551852] [<ffffffff8156597a>] usb_disconnect+0x5a/0x210 [198047.551857] [<ffffffff8156ff6a>] ? usb_control_msg+0xea/0x110 [198047.551860] [<ffffffff8156833f>] hub_port_connect_change+0xcf/0x9c0 [198047.551864] [<ffffffff81562715>] ? hub_port_status+0xd5/0x120 [198047.551868] [<ffffffff81569094>] hub_events+0x464/0x8c0 [198047.551871] [<ffffffff8176472e>] ? __schedule+0x38e/0x700 [198047.551875] [<ffffffff81569525>] hub_thread+0x35/0x150 [198047.551881] [<ffffffff810b0410>] ? __wake_up_sync+0x20/0x20 [198047.551885] [<ffffffff815694f0>] ? hub_events+0x8c0/0x8c0 [198047.551890] [<ffffffff8108ff09>] kthread+0xc9/0xe0 [198047.551893] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [198047.551898] [<ffffffff81771768>] ret_from_fork+0x58/0x90 [198047.551902] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [198047.551908] INFO: task kworker/9:1:246 blocked for more than 120 seconds. [198047.558892] Not tainted 3.13.0-74-generic #118~precise1-Ubuntu [198047.565491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [198047.573524] kworker/9:1 D ffff880803fe18a0 0 246 2 0x00000000 [198047.573547] Workqueue: events hid_reset [usbhid] [198047.573550] ffff8808011f9788 0000000000000046 ffff88080fc93180 ffff8808011f9fd8 [198047.573555] 0000000000013180 0000000000013180 ffff881003f94800 ffff8808011f0000 [198047.573559] ffff8808011f9798 ffff8808011f98d8 7fffffffffffffff 7fffffffffffffff [198047.573564] Call Trace: [198047.573579] [<ffffffff81764b59>] schedule+0x29/0x70 [198047.573582] [<ffffffff81763e25>] schedule_timeout+0x1e5/0x250 [198047.573587] [<ffffffff810a4b1d>] ? wake_affine+0x16d/0x2d0 [198047.573591] [<ffffffff81765df7>] wait_for_completion+0xa7/0x160 [198047.573596] [<ffffffff8109fe00>] ? try_to_wake_up+0x210/0x210 [198047.573602] [<ffffffff81089509>] flush_work+0x29/0x40 [198047.573605] [<ffffffff81085170>] ? start_worker+0x40/0x40 [198047.573609] [<ffffffff8108962c>] __cancel_work_timer+0x9c/0x1b0 [198047.573623] [<ffffffff81089770>] cancel_work_sync+0x10/0x20 [198047.573632] [<ffffffffa00a6a19>] hid_cancel_delayed_stuff+0x29/0x30 [usbhid] [198047.573640] [<ffffffffa00a88cc>] usbhid_close+0xcc/0x110 [usbhid] [198047.573650] [<ffffffffa0036772>] hidinput_close+0x22/0x30 [hid] [198047.573659] [<ffffffff815b3bf1>] input_close_device+0x61/0x90 [198047.573663] [<ffffffff815b9b7f>] evdev_cleanup+0xbf/0xd0 [198047.573670] [<ffffffff815b9bc6>] evdev_disconnect+0x36/0x70 [198047.573674] [<ffffffff815b5b75>] __input_unregister_device+0xc5/0x1a0 [198047.573678] [<ffffffff815b5ca5>] input_unregister_device+0x55/0x80 [198047.573687] [<ffffffffa0036105>] hidinput_disconnect+0x95/0xc0 [hid] [198047.573695] [<ffffffffa0033548>] hid_disconnect+0x68/0x70 [hid] [198047.573708] [<ffffffffa0033625>] hid_device_remove+0xd5/0xf0 [hid] [198047.573720] [<ffffffff814b7c7f>] __device_release_driver+0x7f/0xf0 [198047.573724] [<ffffffff814b7d1c>] device_release_driver+0x2c/0x40 [198047.573728] [<ffffffff814b76c4>] bus_remove_device+0x104/0x170 [198047.573736] [<ffffffff814b46e8>] device_del+0x118/0x1a0 [198047.573744] [<ffffffffa0032af0>] hid_destroy_device+0x30/0x70 [hid] [198047.573754] [<ffffffffa00a6a4b>] usbhid_disconnect+0x2b/0x50 [usbhid] [198047.573758] [<ffffffff81573380>] usb_unbind_interface+0x60/0x1b0 [198047.573764] [<ffffffff814b7c7f>] __device_release_driver+0x7f/0xf0 [198047.573768] [<ffffffff814b7d1c>] device_release_driver+0x2c/0x40 [198047.573772] [<ffffffff81573560>] usb_driver_release_interface+0x90/0xa0 [198047.573780] [<ffffffffa00a8c51>] ? hid_post_reset+0x51/0x200 [usbhid] [198047.573785] [<ffffffff815735a7>] usb_forced_unbind_intf+0x37/0x60 [198047.573788] [<ffffffff81573629>] unbind_marked_interfaces.isra.8+0x59/0x80 [198047.573792] [<ffffffff815737dd>] usb_unbind_and_rebind_marked_interfaces+0x1d/0x30 [198047.573796] [<ffffffff81567c1b>] usb_reset_device+0x15b/0x1d0 [198047.573804] [<ffffffffa00a7e2f>] hid_reset+0x16f/0x1e0 [usbhid] [198047.573808] [<ffffffff81087c5f>] process_one_work+0x17f/0x4c0 [198047.573811] [<ffffffff81088dbb>] worker_thread+0x11b/0x3d0 [198047.573815] [<ffffffff81088ca0>] ? manage_workers.isra.21+0x190/0x190 [198047.573819] [<ffffffff8108ff09>] kthread+0xc9/0xe0 [198047.573823] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [198047.573827] [<ffffffff81771768>] ret_from_fork+0x58/0x90 [198047.573830] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 And these from a trusty machine (which includes info from one of the stuck processes too): [1367979.659164] SysRq : Show Blocked State [1367979.663114] task PC stack pid father [1367979.669300] khubd D ffff88100054f160 0 203 2 0x00000000 [1367979.676574] ffff88100352fc18 0000000000000046 ffff88080fc93180 ffff88100352ffd8 [1367979.684216] 0000000000013180 0000000000013180 ffff8808049f4800 ffff8808043f4800 [1367979.691855] ffff88100352fc38 ffff881002e100e8 ffff881002e100ec 00000000ffffffff [1367979.699495] Call Trace: [1367979.702134] [<ffffffff81764b69>] schedule+0x29/0x70 [1367979.707288] [<ffffffff81764e8e>] schedule_preempt_disabled+0xe/0x10 [1367979.713831] [<ffffffff81766cc4>] __mutex_lock_slowpath+0x114/0x1b0 [1367979.720288] [<ffffffff8156e67e>] ? usb_alloc_urb+0x1e/0x50 [1367979.726049] [<ffffffff81766d83>] mutex_lock+0x23/0x37 [1367979.731376] [<ffffffff8156598a>] usb_disconnect+0x5a/0x210 [1367979.737138] [<ffffffff8156ff7a>] ? usb_control_msg+0xea/0x110 [1367979.743160] [<ffffffff8156834f>] hub_port_connect_change+0xcf/0x9c0 [1367979.749703] [<ffffffff81562725>] ? hub_port_status+0xd5/0x120 [1367979.755724] [<ffffffff815690a4>] hub_events+0x464/0x8c0 [1367979.761223] [<ffffffff8176473e>] ? __schedule+0x38e/0x700 [1367979.766897] [<ffffffff81569535>] hub_thread+0x35/0x150 [1367979.772313] [<ffffffff810b0410>] ? __wake_up_sync+0x20/0x20 [1367979.778160] [<ffffffff81569500>] ? hub_events+0x8c0/0x8c0 [1367979.784942] [<ffffffff8108ff09>] kthread+0xc9/0xe0 [1367979.790008] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [1367979.796378] [<ffffffff81771768>] ret_from_fork+0x58/0x90 [1367979.801965] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [1367979.808345] kworker/8:1 D ffff88100054f160 0 329 2 0x00000000 [1367979.815632] Workqueue: events hid_reset [usbhid] [1367979.820445] ffff881002625788 0000000000000046 ffff88102fc93180 ffff881002625fd8 [1367979.828088] 0000000000013180 0000000000013180 ffff8808049f3000 ffff881002639800 [1367979.835726] ffff881002625798 ffff8810026258d8 7fffffffffffffff 7fffffffffffffff [1367979.843367] Call Trace: [1367979.845999] [<ffffffff81764b69>] schedule+0x29/0x70 [1367979.851151] [<ffffffff81763e35>] schedule_timeout+0x1e5/0x250 [1367979.857174] [<ffffffff810aa9cf>] ? enqueue_entity+0x2bf/0x760 [1367979.863196] [<ffffffff81765e07>] wait_for_completion+0xa7/0x160 [1367979.869392] [<ffffffff8109fe00>] ? try_to_wake_up+0x210/0x210 [1367979.875414] [<ffffffff81089509>] flush_work+0x29/0x40 [1367979.880741] [<ffffffff81085170>] ? start_worker+0x40/0x40 [1367979.886415] [<ffffffff8108962c>] __cancel_work_timer+0x9c/0x1b0 [1367979.892609] [<ffffffff81089770>] cancel_work_sync+0x10/0x20 [1367979.898461] [<ffffffffa0072a19>] hid_cancel_delayed_stuff+0x29/0x30 [usbhid] [1367979.905801] [<ffffffffa00748cc>] usbhid_close+0xcc/0x110 [usbhid] [1367979.912176] [<ffffffffa0042772>] hidinput_close+0x22/0x30 [hid] [1367979.918374] [<ffffffff815b3c01>] input_close_device+0x61/0x90 [1367979.924397] [<ffffffff815b9b8f>] evdev_cleanup+0xbf/0xd0 [1367979.929983] [<ffffffff815b9bd6>] evdev_disconnect+0x36/0x70 [1367979.935832] [<ffffffff815b5b85>] __input_unregister_device+0xc5/0x1a0 [1367979.942547] [<ffffffff815b5cb5>] input_unregister_device+0x55/0x80 [1367979.949007] [<ffffffffa0042105>] hidinput_disconnect+0x95/0xc0 [hid] [1367979.955638] [<ffffffffa003f548>] hid_disconnect+0x68/0x70 [hid] [1367979.961837] [<ffffffffa003f625>] hid_device_remove+0xd5/0xf0 [hid] [1367979.968298] [<ffffffff814b7c8f>] __device_release_driver+0x7f/0xf0 [1367979.974752] [<ffffffff814b7d2c>] device_release_driver+0x2c/0x40 [1367979.981095] [<ffffffff814b76d4>] bus_remove_device+0x104/0x170 [1367979.987202] [<ffffffff814b46f8>] device_del+0x118/0x1a0 [1367979.992707] [<ffffffffa003eaf0>] hid_destroy_device+0x30/0x70 [hid] [1367979.999253] [<ffffffffa0072a4b>] usbhid_disconnect+0x2b/0x50 [usbhid] [1367980.005968] [<ffffffff81573390>] usb_unbind_interface+0x60/0x1b0 [1367980.012252] [<ffffffff814b7c8f>] __device_release_driver+0x7f/0xf0 [1367980.018710] [<ffffffff814b7d2c>] device_release_driver+0x2c/0x40 [1367980.024992] [<ffffffff81573570>] usb_driver_release_interface+0x90/0xa0 [1367980.031885] [<ffffffffa0074c51>] ? hid_post_reset+0x51/0x200 [usbhid] [1367980.038603] [<ffffffff815735b7>] usb_forced_unbind_intf+0x37/0x60 [1367980.044971] [<ffffffff81573639>] unbind_marked_interfaces.isra.8+0x59/0x80 [1367980.052121] [<ffffffff815737ed>] usb_unbind_and_rebind_marked_interfaces+0x1d/0x30 [1367980.059981] [<ffffffff81567c2b>] usb_reset_device+0x15b/0x1d0 [1367980.066004] [<ffffffffa0073e2f>] hid_reset+0x16f/0x1e0 [usbhid] [1367980.072199] [<ffffffff81087c5f>] process_one_work+0x17f/0x4c0 [1367980.078221] [<ffffffff81088dbb>] worker_thread+0x11b/0x3d0 [1367980.083982] [<ffffffff81088ca0>] ? manage_workers.isra.21+0x190/0x190 [1367980.090699] [<ffffffff8108ff09>] kthread+0xc9/0xe0 [1367980.095765] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [1367980.102132] [<ffffffff81771768>] ret_from_fork+0x58/0x90 [1367980.107719] [<ffffffff8108fe40>] ? flush_kthread_worker+0xb0/0xb0 [1367980.114098] python2.7 D ffff88100054f160 0 18303 1 0x00000004 [1367980.121369] ffff881001497d08 0000000000000082 ffff88080fc53180 ffff881001497fd8 [1367980.129007] 0000000000013180 0000000000013180 ffff881003f86000 ffff881001688000 [1367980.136648] 0000881000000024 ffff881002e100e8 ffff881002e100ec 00000000ffffffff [1367980.144286] Call Trace: [1367980.146921] [<ffffffff81764b69>] schedule+0x29/0x70 [1367980.152073] [<ffffffff81764e8e>] schedule_preempt_disabled+0xe/0x10 [1367980.158615] [<ffffffff81766cc4>] __mutex_lock_slowpath+0x114/0x1b0 [1367980.165069] [<ffffffff81766d83>] mutex_lock+0x23/0x37 [1367980.170397] [<ffffffff81576a8b>] manufacturer_show+0x2b/0x60 [1367980.176331] [<ffffffff814b4890>] dev_attr_show+0x20/0x60 [1367980.181918] [<ffffffff81766d76>] ? mutex_lock+0x16/0x37 [1367980.187418] [<ffffffff811eec6b>] ? seq_buf_alloc+0x1b/0x50 [1367980.193180] [<ffffffff81242875>] sysfs_seq_show+0x145/0x260 [1367980.199027] [<ffffffff811ef3fb>] seq_read+0xfb/0x3c0 [1367980.204267] [<ffffffff811cbc8b>] vfs_read+0xab/0x180 [1367980.209506] [<ffffffff811cbf12>] SyS_read+0x52/0xa0 [1367980.214657] [<ffffffff8177181d>] system_call_fastpath+0x1a/0x1f lsusb for the device in question: $ sudo lsusb -v -d 0624:0248 Bus 005 Device 002: ID 0624:0248 Avocent Corp. Virtual Hub Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x0624 Avocent Corp. idProduct 0x0248 Virtual Hub bcdDevice 0.00 iManufacturer 1 Avocent iProduct 2 USB Composite Device-0 iSerial 3 20080519 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 59 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xc0 Self Powered MaxPower 2mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 1 Boot Interface Subclass bInterfaceProtocol 1 Keyboard iInterface 4 Keyboard HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.00 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 65 Report Descriptors: ** UNAVAILABLE ** Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 8 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 1 Boot Interface Subclass bInterfaceProtocol 2 Mouse iInterface 5 Mouse HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.00 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 63 Report Descriptors: ** UNAVAILABLE ** Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 8 Device Status: 0x0000 (Bus Powered) And now for my purely speculative analysis of what might be going on after digging through the code... There are two devices interacting here, the usbhid that is the keyboard for the DRAC and the hub that it's connected to. As a part of the usbhid initialization (usbhid_probe) it creates a tasklet for hid_reset (I have no idea why, but perhaps it needs to reset the underlying hid/usb devices to start working) which unloads and reloads all the parts of the hidusb driver (usb, hid, input, evdev). As a part of the unload for the usbhid it calls hid_cancel_delayed_stuff (don't let it fool you, it actually works with a usbhid device, not a hid device) iff it is no longer opened, which normally it should already be opened, but... I haven't been able to confirm this, but I think that if the hub closes it can close the devices associated with it, which gives us a fun race condition where if we start the probe for the usbhid device and then the underlying hub gets removed it will have no instances open, which causes usbhid to call hid_cancel_delayed_stuff, from its own cancel tasklet. Unsurprisingly when it does this it ends up waiting on itself forever. I'm currently working around the issue by adding a udev rule to prevent the device from being used by setting authorized=0 for it. This obviously prevents the DRAC from actually working, though. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1557172/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp