Hi

On 16.07.2020 07:30, Guenter Roeck wrote:
> On 7/15/20 10:08 PM, Saravana Kannan wrote:
>> Marek and Guenter reported that commit 287905e68dd2 ("driver core:
>> Expose device link details in sysfs") caused sleeping/scheduling while
>> atomic warnings.
>>
>> BUG: sleeping function called from invalid context at
>> kernel/locking/mutex.c:935
>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 12, name: kworker/0:1
>> 2 locks held by kworker/0:1/12:
>> #0: ee8074a8 ((wq_completion)rcu_gp){+.+.}-{0:0}, at:
>> process_one_work+0x174/0x7dc
>> #1: ee921f20 ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
>> process_one_work+0x174/0x7dc
>> Preemption disabled at:
>> [<c01b10f0>] srcu_invoke_callbacks+0xc0/0x154
>> ----- 8< ----- SNIP
>> [<c064590c>] (device_del) from [<c0645c9c>] (device_unregister+0x24/0x64)
>> [<c0645c9c>] (device_unregister) from [<c01b10fc>]
>> (srcu_invoke_callbacks+0xcc/0x154)
>> [<c01b10fc>] (srcu_invoke_callbacks) from [<c01493c4>]
>> (process_one_work+0x234/0x7dc)
>> [<c01493c4>] (process_one_work) from [<c01499b0>] (worker_thread+0x44/0x51c)
>> [<c01499b0>] (worker_thread) from [<c0150bf4>] (kthread+0x158/0x1a0)
>> [<c0150bf4>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xee921fb0 to 0xee921ff8)
>>
>> This was caused by the device link device being released in the context
>> of srcu_invoke_callbacks(). There is no need to wait till the RCU
>> callback to release the device link device. So release the device
>> earlier and revert the RCU callback code to what it was before
>> commit 287905e68dd2 ("driver core: Expose device link details in sysfs")
>>
>> Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
>> Reported-by: Marek Szyprowski <m.szyprow...@samsung.com>
>> Reported-by: Guenter Roeck <li...@roeck-us.net>
>> Signed-off-by: Saravana Kannan <sarava...@google.com>
>> ---
>> Marek and Guenter,
>>
>> I haven't had a chance to test this yet. Can one of you please test it
>> and confirm it fixes the issue?
>>
> With this patch applied, the original warning is gone, but I get lots
> of other warnings.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device 'regulators:regulator@0:50038000.ethernet' does not have a release()
> function, it is broken and must be fixed.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device '53f9c000.gpio:50038000.ethernet' does not have a release() function,
> it is broken and must be fixed.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device '50030000.tscadc:50030400.tcq' does not have a release() function, it
> is broken and must be fixed.

I confirm that I also get such warnings for every platform device in the
system with this patch applied to linux next-20200715:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0x98
Device '10023c40.power-domain:13620000.sysmmu' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc5-next-20200715-00002-g0f637964c4b0 #1270
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c011184c>] (unwind_backtrace) from [<c010d250>] (show_stack+0x10/0x14)
[<c010d250>] (show_stack) from [<c051b8fc>] (dump_stack+0xbc/0xe8)
[<c051b8fc>] (dump_stack) from [<c0126ed8>] (__warn+0xf0/0x108)
[<c0126ed8>] (__warn) from [<c0126f64>] (warn_slowpath_fmt+0x74/0xb8)
[<c0126f64>] (warn_slowpath_fmt) from [<c064a2a0>] (device_release+0x94/0x98)
[<c064a2a0>] (device_release) from [<c0522178>] (kobject_put+0x104/0x288)
[<c0522178>] (kobject_put) from [<c064b45c>] (__device_link_del+0x38/0xac)
[<c064b45c>] (__device_link_del) from [<c064c1f0>] (device_links_driver_bound+0x260/0x26c)
[<c064c1f0>] (device_links_driver_bound) from [<c0650af0>] (driver_bound+0x5c/0x110)
[<c0650af0>] (driver_bound) from [<c0651038>] (really_probe+0x2d4/0x4fc)
[<c0651038>] (really_probe) from [<c06513c8>] (driver_probe_device+0x78/0x1fc)
[<c06513c8>] (driver_probe_device) from [<c064ee00>] (bus_for_each_drv+0x74/0xb8)
[<c064ee00>] (bus_for_each_drv) from [<c0650cc4>] (__device_attach+0xd4/0x16c)
[<c0650cc4>] (__device_attach) from [<c064fdc4>] (bus_probe_device+0x88/0x90)
[<c064fdc4>] (bus_probe_device) from [<c064c604>] (fw_devlink_resume+0xa0/0x134)
[<c064c604>] (fw_devlink_resume) from [<c102bfd4>] (of_platform_default_populate_init+0xa8/0xc0)
[<c102bfd4>] (of_platform_default_populate_init) from [<c0102378>] (do_one_initcall+0x8c/0x424)
[<c0102378>] (do_one_initcall) from [<c1001158>] (kernel_init_freeable+0x190/0x204)
[<c1001158>] (kernel_init_freeable) from [<c0ac05d0>] (kernel_init+0x8/0x118)
[<c0ac05d0>] (kernel_init) from [<c0100114>] (ret_from_fork+0x14/0x20)
Exception stack(0xef0dffb0 to 0xef0dfff8)
ffa0:                                     00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 40543
hardirqs last enabled at (40551): [<c019d624>] console_unlock+0x430/0x6cc
hardirqs last disabled at (40568): [<c019d348>] console_unlock+0x154/0x6cc
softirqs last enabled at (40584): [<c010174c>] __do_softirq+0x50c/0x608
softirqs last disabled at (40595): [<c0130218>] irq_exit+0x168/0x16c
---[ end trace 1d4780a89f63483a ]---

> and so on. I don't know if this is caused by this patch or by
> some other patch in -next.

This is caused by patch 287905e68dd2 ("driver core: Expose device link
details in sysfs"). If you revert it, the warning will go away.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland