Hello Greg, I'm adding Tejun to the list of addressees.
On 12/10/2016 03:04 PM, Greg Kroah-Hartman wrote: > On Sat, Dec 10, 2016 at 02:38:41PM +0200, Vladimir Zapolskiy wrote: >> Hello Greg, >> >> On 12/10/2016 09:32 AM, Greg Kroah-Hartman wrote: >>> On Sat, Dec 10, 2016 at 02:15:19AM +0200, Vladimir Zapolskiy wrote: >>>> If CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled a number of false >>>> positives are reported for ATA controller drivers, because ATA port >>>> probes are done asynchronously, and the same problem may also touch >>>> other asynchronously probed drivers. >>>> >>>> To reduce the rate of false reports on boot call async_synchronize_full() >>>> before attempting to remove a driver, the same is done in delete_module() >>>> syscall for all possible drivers and in __device_release_driver() function >>>> for asynchronously probed drivers. >>> >>> __device_release_driver() already calls this function, why call it >>> again? >>> >> >> __device_release_driver() is not called on test removal of drivers, if >> CONFIG_DEBUG_TEST_DRIVER_REMOVE option is enabled. >> >> This opens a possibility to races like one I've discovered: >> >> https://www.spinics.net/lists/linux-ide/msg53473.html >> >> Next, async_synchronize_full() from __device_release_driver() is not >> called in case of removal of ATA controller drivers, because >> driver_allows_async_probing(drv) return value is false. > > Hm, how does this not also get hit if you unbind/bind/unbind/bind/etc. > from userspace as well? I don't think this is a > CONFIG_DEBUG_TEST_DRIVER_REMOVE issue, but just that this option finds > the problem corner cases as you are finding out :) > and you are right, I managed to reproduce exactly the same race as before running the unmodified kernel built from Torvald's branch head: root@imx6q-sabresd:~# zcat /proc/config.gz | grep TEST_DRIVER_REMOVE # CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set root@imx6q-sabresd:~# uname -a Linux imx6q-sabresd 4.9.0-rc8-00134-g0451698 #23 SMP Sun Dec 11 03:37:30 EET 2016 armv7l GNU/Linux root@imx6q-sabresd:~# echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/unbind ; echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/bind; echo 2200000.sata > /sys/bus/platform/drivers/ahci-imx/unbind ahci-imx 2200000.sata: fsl,transmit-level-mV not specified, using 00000024 ahci-imx 2200000.sata: fsl,transmit-boost-mdB not specified, using 00000480 ahci-imx 2200000.sata: fsl,transmit-atten-16ths not specified, using 00002000 ahci-imx 2200000.sata: fsl,receive-eq-mdB not specified, using 05000000 ahci-imx 2200000.sata: SSS flag set, parallel bus scan disabled ahci-imx 2200000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode ahci-imx 2200000.sata: flags: ncq sntf stag pm led clo only pmp pio slum part ccc apst scsi host0: ahci-imx ata3: SATA max UDMA/133 mmio [mem 0x02200000-0x02203fff] port 0x100 irq 72 ata3: SATA link down (SStatus 0 SControl 300) ahci-imx 2200000.sata: no device found, disabling link. ahci-imx 2200000.sata: pass .hotplug=1 to enable hotplug ------------[ cut here ]------------ WARNING: CPU: 2 PID: 375 at drivers/ata/libata-core.c:6482 ata_port_detach+0x130/0x140 Modules linked in: ahci_imx dw_hdmi_ahb_audio evbug CPU: 2 PID: 375 Comm: sh Not tainted 4.9.0-rc8-00134-g0451698 #23 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<>] (dump_backtrace) from [<>] (show_stack+0x20/0x24) [<>] (show_stack) from [<>] (dump_stack+0xb4/0xe8) [<>] (dump_stack) from [<>] (__warn+0xe4/0x110) [<>] (__warn) from [<>] (warn_slowpath_null+0x30/0x38) [<>] (warn_slowpath_null) from [<>] (ata_port_detach+0x130/0x140) [<>] (ata_port_detach) from [<>] (ata_platform_remove_one+0x34/0x4c) [<>] (ata_platform_remove_one) from [<>] (platform_drv_remove+0x34/0x4c) [<>] (platform_drv_remove) from [<>] (__device_release_driver+0x98/0x134) [<>] (__device_release_driver) from [<>] (device_release_driver+0x30/0x3c) [<>] (device_release_driver) from [<>] (unbind_store+0x88/0x108) [<>] (unbind_store) from [<>] (drv_attr_store+0x30/0x3c) [<>] (drv_attr_store) from [<>] (sysfs_kf_write+0x58/0x5c) [<>] (sysfs_kf_write) from [<>] (kernfs_fop_write+0x108/0x21c) [<>] (kernfs_fop_write) from [<>] (__vfs_write+0x3c/0x128) [<>] (__vfs_write) from [<>] (vfs_write+0xb0/0x178) [<>] (vfs_write) from [<>] (SyS_write+0x4c/0xa0) [<>] (SyS_write) from [<>] (ret_fast_syscall+0x0/0x1c) ---[ end trace 457071853738c95e ]--- Tejun, will you look at the problem again? -- With best wishes, Vladimir