On Thu, May 21, 2015 at 11:53:21PM +0200, Denys Vlasenko wrote: > On 05/21/2015 05:26 PM, Paul E. McKenney wrote: > > On Thu, May 21, 2015 at 05:09:27PM +0200, Denys Vlasenko wrote: > >> On 05/21/2015 03:45 PM, Steven Rostedt wrote: > >>> On Thu, 21 May 2015 12:04:07 +0200 > >>> Denys Vlasenko <dvlas...@redhat.com> wrote: > >>> > >>>> DEBUG_LOCK_ALLOC=y is not a production setting, but it is > >>>> not very unusual either. Many developers routinely > >>>> use kernels built with it enabled. > >>>> > >>>> Apart from being selected by hand, it is also auto-selected by > >>>> PROVE_LOCKING "Lock debugging: prove locking correctness" and > >>>> LOCK_STAT "Lock usage statistics" config options. > >>>> LOCK STAT is necessary for "perf lock" to work. > >>>> > >>>> I wouldn't spend too much time optimizing it, but this particular > >>>> function has a very large cost in code size: when it is deinlined, > >>>> code size decreases by 830,000 bytes: > >>>> > >>>> text data bss dec hex filename > >>>> 85674192 22294776 20627456 128596424 7aa39c8 vmlinux.before > >>>> 84837612 22294424 20627456 127759492 79d7484 vmlinux > >>>> > >>>> (with this config: http://busybox.net/~vda/kernel_config) > >>>> > >>>> Signed-off-by: Denys Vlasenko <dvlas...@redhat.com> > >>>> CC: "Paul E. McKenney" <paul...@linux.vnet.ibm.com> > >>>> CC: Josh Triplett <j...@joshtriplett.org> > >>>> CC: Steven Rostedt <rost...@goodmis.org> > >>>> CC: Mathieu Desnoyers <mathieu.desnoy...@efficios.com> > >>>> CC: Lai Jiangshan <la...@cn.fujitsu.com> > >>>> CC: linux-kernel@vger.kernel.org > >>>> CC: Tejun Heo <t...@kernel.org> > >>>> CC: Oleg Nesterov <o...@redhat.com> > >>>> --- > >>>> include/linux/rcupdate.h | 40 ++----------------------------------- > >>>> kernel/rcu/update.c | 52 > >>>> ++++++++++++++++++++++++++++++++++++++++++++++++ > >>>> 2 files changed, 54 insertions(+), 38 deletions(-) > >>>> > >>>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > >>>> index 7809749..6024a65 100644 > >>>> --- a/include/linux/rcupdate.h > >>>> +++ b/include/linux/rcupdate.h > >>>> @@ -439,46 +439,10 @@ int rcu_read_lock_bh_held(void); > >>>> * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an > >>>> * RCU-sched read-side critical section. In absence of > >>>> * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched > >>>> read-side > >>>> - * critical section unless it can prove otherwise. Note that disabling > >>>> - * of preemption (including disabling irqs) counts as an RCU-sched > >>>> - * read-side critical section. This is useful for debug checks in > >>>> functions > >>>> - * that required that they be called within an RCU-sched read-side > >>>> - * critical section. > >>>> - * > >>>> - * Check debug_lockdep_rcu_enabled() to prevent false positives during > >>>> boot > >>>> - * and while lockdep is disabled. > >>>> - * > >>>> - * Note that if the CPU is in the idle loop from an RCU point of > >>>> - * view (ie: that we are in the section between rcu_idle_enter() and > >>>> - * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the > >>>> CPU > >>>> - * did an rcu_read_lock(). The reason for this is that RCU ignores CPUs > >>>> - * that are in such a section, considering these as in extended > >>>> quiescent > >>>> - * state, so such a CPU is effectively never in an RCU read-side > >>>> critical > >>>> - * section regardless of what RCU primitives it invokes. This state of > >>>> - * affairs is required --- we need to keep an RCU-free window in idle > >>>> - * where the CPU may possibly enter into low power mode. This way we can > >>>> - * notice an extended quiescent state to other CPUs that started a grace > >>>> - * period. Otherwise we would delay any grace period as long as we run > >>>> in > >>>> - * the idle task. > >>>> - * > >>>> - * Similarly, we avoid claiming an SRCU read lock held if the current > >>>> - * CPU is offline. > >>>> + * critical section unless it can prove otherwise. > >>>> */ > >>>> #ifdef CONFIG_PREEMPT_COUNT > >>>> -static inline int rcu_read_lock_sched_held(void) > >>>> -{ > >>>> - int lockdep_opinion = 0; > >>>> - > >>>> - if (!debug_lockdep_rcu_enabled()) > >>>> - return 1; > >>>> - if (!rcu_is_watching()) > >>>> - return 0; > >>>> - if (!rcu_lockdep_current_cpu_online()) > >>>> - return 0; > >>>> - if (debug_locks) > >>>> - lockdep_opinion = lock_is_held(&rcu_sched_lock_map); > >>>> - return lockdep_opinion || preempt_count() != 0 || > >>>> irqs_disabled(); > >>>> -} > >>>> +int rcu_read_lock_sched_held(void); > >>>> #else /* #ifdef CONFIG_PREEMPT_COUNT */ > >>>> static inline int rcu_read_lock_sched_held(void) > >>>> { > >>>> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c > >>>> index e0d31a3..e02218f 100644 > >>>> --- a/kernel/rcu/update.c > >>>> +++ b/kernel/rcu/update.c > >>>> @@ -62,6 +62,58 @@ MODULE_ALIAS("rcupdate"); > >>>> > >>>> module_param(rcu_expedited, int, 0); > >>>> > >>>> +#ifdef CONFIG_DEBUG_LOCK_ALLOC > >>>> +/** > >>>> + * rcu_read_lock_sched_held() - might we be in RCU-sched read-side > >>>> critical section? > >>>> + * > >>>> + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an > >>>> + * RCU-sched read-side critical section. In absence of > >>>> + * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched > >>>> read-side > >>>> + * critical section unless it can prove otherwise. Note that disabling > >>>> + * of preemption (including disabling irqs) counts as an RCU-sched > >>>> + * read-side critical section. This is useful for debug checks in > >>>> functions > >>>> + * that required that they be called within an RCU-sched read-side > >>>> + * critical section. > >>>> + * > >>>> + * Check debug_lockdep_rcu_enabled() to prevent false positives during > >>>> boot > >>>> + * and while lockdep is disabled. > >>>> + * > >>>> + * Note that if the CPU is in the idle loop from an RCU point of > >>>> + * view (ie: that we are in the section between rcu_idle_enter() and > >>>> + * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the > >>>> CPU > >>>> + * did an rcu_read_lock(). The reason for this is that RCU ignores CPUs > >>>> + * that are in such a section, considering these as in extended > >>>> quiescent > >>>> + * state, so such a CPU is effectively never in an RCU read-side > >>>> critical > >>>> + * section regardless of what RCU primitives it invokes. This state of > >>>> + * affairs is required --- we need to keep an RCU-free window in idle > >>>> + * where the CPU may possibly enter into low power mode. This way we can > >>>> + * notice an extended quiescent state to other CPUs that started a grace > >>>> + * period. Otherwise we would delay any grace period as long as we run > >>>> in > >>>> + * the idle task. > >>>> + * > >>>> + * Similarly, we avoid claiming an SRCU read lock held if the current > >>>> + * CPU is offline. > >>>> + */ > >>>> +#ifdef CONFIG_PREEMPT_COUNT > >>>> +int rcu_read_lock_sched_held(void) > >>>> +{ > >>>> + int lockdep_opinion = 0; > >>>> + > >>>> + if (!debug_lockdep_rcu_enabled()) > >>>> + return 1; > >>>> + if (!rcu_is_watching()) > >>>> + return 0; > >>>> + if (!rcu_lockdep_current_cpu_online()) > >>>> + return 0; > >>>> + if (debug_locks) > >>>> + lockdep_opinion = lock_is_held(&rcu_sched_lock_map); > >>>> + return lockdep_opinion || preempt_count() != 0 || > >>>> irqs_disabled(); > >>>> +} > >>>> +EXPORT_SYMBOL(rcu_read_lock_sched_held); > >>>> +#else > >>>> +/* !CONFIG_PREEMPT_COUNT - the function is inlined to always return 1 */ > >>> > >>> Nuke the #else. It's not needed and this is a common enough practice to > >>> have the static inline foo() { } when disabled that we do not need to > >>> comment about it here. > >> > >> Sending patch v2 in a few minutes. > >> > >>>> +#endif > >>> > >>> Hmm, you added two #ifdef, and one #endif. How does this even compile?? > >> > >> Er... it... doesn't. > >> > >> There was "#if defined CONFIG_DEBUG_LOCK_ALLOC && defined > >> CONFIG_PREEMPT_COUNT" > >> but then I split it into two #ifs to have a nice explanatory empty > >> #else clause. And I did not test it after that edit > >> ("what could possibly go wrong?"). Sorry. > >> > >> Patch v2 I'm sending _is_ tested. > > > > Boot/run as well as build? > > I did not even dare to try booting 128 megabyte allyesconfig monstrosity, > before you asked. > > It took me 4 hours of rebuilds to figure out that CONFIG_CMDLINE_BOOL needs > to be > disabled for my qemu boot to succeed :) :)
Just make sure you retain the .config file to allow you to more easily test future changes! ;-) > So, yes, patched allyesconfig kernel seems to boot... it takes 250 seconds in > qemu. Steve, are you OK with Denys's most recent patch? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/