On 23.01.2018 19:26, Stephen Hemminger wrote: > On Mon, 22 Jan 2018 12:41:41 +0300 > Kirill Tkhai <ktk...@virtuozzo.com> wrote: > >> Commit be3fc413da9e "net: use synchronize_rcu_expedited()" introducing >> synchronize_net() says: >> >> >When we hold RTNL mutex, we would like to spend some cpu cycles but not >> >block too long other processes waiting for this mutex. >> >We also want to setup/dismantle network features as fast as possible at >> >boot/shutdown time. >> >This patch makes synchronize_net() call the expedited version if RTNL is >> >locked. >> >> At the time of the commit (May 23 2011) there was no possible to differ, >> who is the actual owner of the mutex. Only the fact that it's locked >> by someone at the moment. So (I guess) this is the only reason the generic >> primitive mutex_is_locked() was used. >> >> But now mutex owner is available outside the locking subsystem and >> __mutex_owner() may be used instead (there is an example in >> audit_log_start()). >> So, let's make expensive synchronize_rcu_expedited() be used only >> when a caller really owns rtnl_mutex(). >> >> There are several possibilities to fix that. The first one is >> to fix synchronize_net(), the second is to change rtnl_is_locked(). >> >> I prefer the second, as it seems it's more intuitive for people >> to think that rtnl_is_locked() is about current process, not >> about the fact mutex is locked in general. Grep over kernel >> sources just proves this fact: >> >> drivers/staging/rtl8723bs/os_dep/osdep_service.c:297 >> drivers/staging/rtl8723bs/os_dep/osdep_service.c:316 >> >> if (!rtnl_is_locked()) >> ret = register_netdev(pnetdev); >> else >> ret = register_netdevice(pnetdev); > > Conditional locking like this especially in the registration path > is wrong and broken. The driver should just be removed for this. > >> drivers/staging/wilc1000/linux_mon.c:310 >> >> if (rtnl_is_locked()) { >> rtnl_unlock(); >> rollback_lock = true; >> } > > > This driver like most of staging is sh*t.
Yeah, but the history of in-tree drivers shows they are not the only drivers had the same mistake. My work tree is 3.10 and buggy caif_serial.c is there. >> Side effect of this patch is three BUGs in above examples >> become fixed. >> >> Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com> >> --- >> net/core/rtnetlink.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c >> index 16d644a4f974..a5ddf373ffa9 100644 >> --- a/net/core/rtnetlink.c >> +++ b/net/core/rtnetlink.c >> @@ -117,7 +117,7 @@ EXPORT_SYMBOL(rtnl_trylock); >> >> int rtnl_is_locked(void) >> { >> - return mutex_is_locked(&rtnl_mutex); >> + return __mutex_owner(&rtnl_mutex) == current; >> } >> EXPORT_SYMBOL(rtnl_is_locked); > > No. You are breaking the concept of is_locked. It means the mutex > is locked anywhere, not just in the calling process. For example, > rtnl_is_locked is used to break AB BA lock deadlock in sysfs. Then, it could be moved to synchronize_net(): diff --git a/net/core/dev.c b/net/core/dev.c index 94435cd09072..2dfcbde8f2fa 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -8372,7 +8372,7 @@ EXPORT_SYMBOL(free_netdev); void synchronize_net(void) { might_sleep(); - if (rtnl_is_locked()) + if (__mutex_owner(&rtnl_mutex) == current) synchronize_rcu_expedited(); else synchronize_rcu(); The same effect. Kirill