Hello Roger and Mathias,

Running with slub_debug=FZPU and removing an XHCI host controller via
sysfs, I've hit a use-after-free that I've bisected to:

  8c24d6d7b09deee3036ddc4f2b81b53b28c8f877 is the first bad commit
  commit 8c24d6d7b09deee3036ddc4f2b81b53b28c8f877
  Author: Roger Quadros <rog...@ti.com>
  Date:   Mon Sep 21 17:46:14 2015 +0300

      usb: xhci: stop everything on the first call to xhci_stop

      xhci_stop will be called twice, once for the shared hcd and again
      for the primary hcd. We stop the XHCI controller in any case so
      clean up everything on the first call else we can timeout waiting
      for pending requests to complete.

      Cc: <sta...@vger.kernel.org>
      Signed-off-by: Roger Quadros <rog...@ti.com>
      Signed-off-by: Mathias Nyman <mathias.ny...@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

I can reproduce the following list_del corruption warning every time,
simply by removing the device:

  % lspci -D | grep -i xhci
  0000:65:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)

  % echo 1 > $(find /sys/devices -name '0000:65:14.0')/remove

  ------------[ cut here ]------------
  WARNING: CPU: 22 PID: 13964 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
  list_del corruption. prev->next should be ffff881032144350, but was 6b6b6b6b6b6b6b6b
  [ ... modules snip ... ]
  CPU: 22 PID: 13964 Comm: bash Not tainted 4.4.0-rc5+ #27
  Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.1:61 09/10/2015
   0000000000000000 00000000dfa07299 ffff88103091b898 ffffffff8131d770
   ffff88103091b8e0 ffff88103091b8d0 ffffffff8107ef56 ffff88103205d2d0
   ffff8810320c2698 ffff881032144350 0000000000000000 0000000000000204
  Call Trace:
   [<ffffffff8131d770>] dump_stack+0x44/0x64
   [<ffffffff8107ef56>] warn_slowpath_common+0x86/0xc0
   [<ffffffff8107efec>] warn_slowpath_fmt+0x5c/0x80
   [<ffffffff811dac6c>] ? __slab_free+0x1bc/0x240
   [<ffffffff81339511>] __list_del_entry+0xa1/0xd0
   [<ffffffff814b3a49>] xhci_urb_dequeue+0xd9/0x380
   [<ffffffff8147fb7d>] unlink1+0x2d/0x110
   [<ffffffff81481d75>] usb_hcd_flush_endpoint+0xf5/0x190
   [<ffffffff81484c79>] usb_disable_endpoint+0x59/0x90
   [<ffffffff81484cf5>] usb_disable_interface+0x45/0x60
   [<ffffffff81487558>] usb_unbind_interface+0x1b8/0x260
   [<ffffffff81444176>] __device_release_driver+0x96/0x130
   [<ffffffff81444233>] device_release_driver+0x23/0x30
   [<ffffffff81442fa1>] bus_remove_device+0x101/0x170
   [<ffffffff8143f3a9>] device_del+0x139/0x260
   [<ffffffff8148bc3f>] ? usb_remove_ep_devs+0x1f/0x30
   [<ffffffff81484db6>] usb_disable_device+0xa6/0x280
   [<ffffffff8147a9f4>] usb_disconnect+0x94/0x270
   [<ffffffff8147ab54>] usb_disconnect+0x1f4/0x270
   [<ffffffff8147fd32>] usb_remove_hcd+0xd2/0x240
   [<ffffffff81491f0f>] usb_hcd_pci_remove+0x6f/0x140
   [<ffffffff814c6e9e>] xhci_pci_remove+0x4e/0x70
   [<ffffffff8135be99>] pci_device_remove+0x39/0xc0
   [<ffffffff81444176>] __device_release_driver+0x96/0x130
   [<ffffffff81444233>] device_release_driver+0x23/0x30
   [<ffffffff813549fc>] pci_stop_bus_device+0x8c/0xa0
   [<ffffffff81354b1a>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
   [<ffffffff8135d9fc>] remove_store+0x7c/0x90
   [<ffffffff8143e5c8>] dev_attr_store+0x18/0x30
   [<ffffffff81275e3a>] sysfs_kf_write+0x3a/0x50
   [<ffffffff812754c0>] kernfs_fop_write+0x120/0x170
   [<ffffffff811f9d67>] __vfs_write+0x37/0x100
   [<ffffffff812ab343>] ? selinux_file_permission+0xc3/0x110
   [<ffffffff812a2e9d>] ? security_file_permission+0x3d/0xc0
   [<ffffffff810c65bf>] ? percpu_down_read+0x1f/0x50
   [<ffffffff811fa442>] vfs_write+0xa2/0x1a0
   [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
   [<ffffffff811fb205>] SyS_write+0x55/0xc0
   [<ffffffff81666dee>] entry_SYSCALL_64_fastpath+0x12/0x71
  ---[ end trace 02b6650c4e01b29e ]---

I added some instrumentation to xhci_urb_dequeue:

-	if (!list_empty(&td->td_list))
-		list_del_init(&td->td_list);
+	if (!list_empty(&td->td_list)) {
+		pr_err("%s(%p, %p, ...) list_del_init(%p)\n next=%p(n=%p, p=%p) prev=%p(n=%p, p=%p)\n",
+			__func__, hcd, urb, &td->td_list,
+			td->td_list.prev, td->td_list.prev->prev, td->td_list.prev->next,
+			td->td_list.next, td->td_list.next->prev, td->td_list.next->next);
+	}

to prove the list corruption complaint is from the td_list:

  xhci_hcd 0000:65:14.0: remove, state 4
  usb usb4: USB disconnect, device number 1
  xhci_hcd 0000:65:14.0: USB bus 4 deregistered
  xhci_hcd 0000:65:14.0: remove, state 1
  usb usb3: USB disconnect, device number 1
  usb 3-1: USB disconnect, device number 2
  xhci_urb_dequeue(ffff8810365b0000, ffff882032987cf0, ...) list_del_init(ffff882032885588)
   next=ffff881037742b00(n=6b6b6b6b6b6b6b6b, p=6b6b6b6b6b6b6b6b) prev=ffff881037742b00(n=6b6b6b6b6b6b6b6b, p=6b6b6b6b6b6b6b6b)

If I revert 8c24d6d7b09d "usb: xhci: stop everything on the first call
to xhci_stop", the warning goes away.

Let me know if any additional instrumentation or information would help
track down this issue.

Thanks,

-- Joe
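
P.S. For anyone reading along who hasn't stared at this kind of warning
before: 0x6b is the byte SLUB writes over freed objects when poisoning
is enabled (the P in slub_debug), and the list debug code refuses to
unlink an entry whose neighbours no longer point back at it. Below is a
minimal userspace sketch of that interaction. It is not the xhci code:
fake_ring, fake_td, ptr_bits() and simulate_dequeue() are hypothetical
names made up for illustration, and the idea that the td was the only
entry on a list whose head had already been freed is only my reading of
the instrumentation output (next and prev are the same address, and
both of its pointers are poison).

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Byte SLUB writes over freed objects when poisoning is enabled. */
#define POISON_FREE 0x6b

/* Hypothetical stand-ins: an object that owns a list head, and an entry on it. */
struct fake_ring { struct list_head td_list; };
struct fake_td   { struct list_head td_list; };

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Read a stored pointer as raw bits so a poisoned value can be compared and printed. */
static uintptr_t ptr_bits(struct list_head *const *slot)
{
	uintptr_t bits;

	memcpy(&bits, slot, sizeof(bits));
	return bits;
}

/*
 * Unlink with sanity checks in the spirit of the list debug code:
 * both neighbours must still point back at the entry being removed.
 * Returns false (and leaves the list alone) if they do not.
 */
static bool list_del_init_checked(struct list_head *entry)
{
	struct list_head *prev = entry->prev, *next = entry->next;
	uintptr_t prev_next = ptr_bits(&prev->next);
	uintptr_t next_prev = ptr_bits(&next->prev);

	if (prev_next != (uintptr_t)entry) {
		printf("list_del corruption. prev->next should be %p, but was %" PRIxPTR "\n",
		       (void *)entry, prev_next);
		return false;
	}
	if (next_prev != (uintptr_t)entry) {
		printf("list_del corruption. next->prev should be %p, but was %" PRIxPTR "\n",
		       (void *)entry, next_prev);
		return false;
	}
	next->prev = prev;
	prev->next = next;
	list_init(entry);
	return true;
}

/*
 * Queue one td on the ring, optionally "free" the ring first by
 * poisoning it, then try to dequeue the td.
 * Returns true if the unlink passed the checks.
 */
bool simulate_dequeue(bool free_ring_first)
{
	struct fake_ring ring;
	struct fake_td td;

	list_init(&ring.td_list);
	list_add_tail(&td.td_list, &ring.td_list);

	if (free_ring_first)
		memset(&ring, POISON_FREE, sizeof(ring)); /* what a poisoned free leaves behind */

	return list_del_init_checked(&td.td_list);
}
```

simulate_dequeue(false) unlinks cleanly; simulate_dequeue(true) fails
the prev->next check and prints the same style of complaint as the
warning above, because the td still points at memory that has been
freed and poisoned underneath it.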